Alignment Research Engineer Accelerator
ARENA is an ML engineering programme that provides the skills, tools, and environment for upskilling in technical AI safety.
Programme
Participants will work through ML programming exercises in pairs over 4 to 5 weeks, under the guidance of teaching assistants, to develop their skills. Participants will be part of a group of talented engineers and researchers working out of the LISA office in London, allowing them to connect and exchange ideas with AI safety researchers. In the final week, participants will complete a capstone project that dives deeper into research topics covered during the course.
Curriculum
Week 0: Fundamentals
We first cover the basics of deep learning, including core machine learning terminology, what neural networks are, and how to train them.
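For a flavour of the level this week starts at, here is a minimal sketch (illustrative only, not course material) of a training loop in PyTorch, fitting a small network to toy data:

```python
import torch
import torch.nn as nn

# Toy data: learn y = 3x + 1 with a little noise (illustrative only).
x = torch.randn(256, 1)
y = 3 * x + 1 + 0.1 * torch.randn(256, 1)

# A small one-hidden-layer network.
model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

# The basic training loop: forward pass, compute loss, backward pass, update.
for step in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(f"final loss: {loss.item():.4f}")
```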
Week 1: Transformers & Mechanistic Interpretability
You will learn how to build and train your own transformers, and study the mechanistic interpretability of transformers, a field which has been advanced by Anthropic’s Transformer Circuits Thread.
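As a taste of what building a transformer from scratch involves, here is a hedged sketch (our own illustration, not the course's implementation) of a single causal self-attention head:

```python
import torch
import torch.nn.functional as F

def attention_head(x, W_Q, W_K, W_V):
    """One self-attention head with a causal mask.

    x: (seq_len, d_model); W_Q, W_K, W_V: (d_model, d_head)."""
    Q, K, V = x @ W_Q, x @ W_K, x @ W_V
    scores = Q @ K.T / K.shape[-1] ** 0.5              # (seq, seq) attention scores
    mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
    scores = scores.masked_fill(mask, float("-inf"))   # causal: no attending to later tokens
    return F.softmax(scores, dim=-1) @ V               # weighted sum of values

# Example with random weights: 10-token sequence, d_model=16, d_head=4.
d_model, d_head, seq = 16, 4, 10
x = torch.randn(seq, d_model)
out = attention_head(x, *(torch.randn(d_model, d_head) for _ in range(3)))
print(out.shape)  # torch.Size([10, 4])
```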
Week 2: Reinforcement Learning
You will learn the fundamentals of RL and work with OpenAI’s Gym environments to run your own experiments. You will also learn about Reinforcement Learning from Human Feedback (RLHF) and apply it to the transformers you built.
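The basic experiment loop in Gym looks something like the sketch below (a random-policy rollout in CartPole; note that the exact `reset`/`step` signatures depend on your Gym version, and newer setups use the `gymnasium` package instead):

```python
import gym  # newer setups use `import gymnasium as gym`; APIs differ slightly

# A random-policy rollout in CartPole; the course exercises go much further.
env = gym.make("CartPole-v1")
obs, info = env.reset()  # gym >= 0.26 / gymnasium API; older gym returns obs only
total_reward = 0.0

done = False
while not done:
    action = env.action_space.sample()  # random action as a trivial baseline
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated

print(f"episode return: {total_reward}")
env.close()
```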
Week 3: LLM Evaluations
You will learn how to evaluate LLMs. We'll take you through the process of building a multiple-choice benchmark from scratch and using it to evaluate current models. We'll then move on to study LM agents: how to build them and how to evaluate them.
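The skeleton of a multiple-choice eval is simple; here is an illustrative sketch (the `benchmark` items and `model_answer` callable are hypothetical stand-ins, where in practice `model_answer` would wrap a call to a real LLM):

```python
# A toy three-question benchmark; real benchmarks hold hundreds of items.
benchmark = [
    {"question": "2 + 2 = ?", "choices": ["3", "4", "5"], "answer": "B"},
    {"question": "Capital of France?", "choices": ["Paris", "Rome", "Bonn"], "answer": "A"},
    {"question": "H2O is?", "choices": ["salt", "sugar", "water"], "answer": "C"},
]

def format_prompt(item):
    """Render one item as a lettered multiple-choice prompt."""
    letters = "ABCDE"
    lines = [item["question"]] + [
        f"{letters[i]}. {choice}" for i, choice in enumerate(item["choices"])
    ]
    return "\n".join(lines) + "\nAnswer:"

def evaluate(model_answer, benchmark):
    """Score a model: `model_answer` maps a prompt string to a single letter."""
    correct = sum(model_answer(format_prompt(item)) == item["answer"]
                  for item in benchmark)
    return correct / len(benchmark)

# A stand-in "model" that always answers A, for demonstration.
print(evaluate(lambda prompt: "A", benchmark))  # 0.333...
```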
Week 4: Capstone
We will conclude the programme with capstone projects, where you dig deeper into a topic related to the course, applying the skills and knowledge you will have accumulated over the previous four weeks.