ARENA Curriculum

Using our curriculum to self-study or upskill in AI safety independently? We’ve got you covered.

You can find the GitHub repository containing our materials here.



Chapter 0 – Fundamentals

In this chapter, you’ll learn about some coding best practices, become familiar with the PyTorch library, and build & train your own neural networks (CNNs and ResNets).

Note – this chapter is mainly for getting everyone up to the same level, so that the rest of the programme can build on a shared foundation.

Chapter 1 – Transformers and Mechanistic Interpretability

The transformer is an important neural network architecture used for language modelling, and it has made headlines with the introduction of models like ChatGPT.

In this chapter, you will learn all about transformers, and build and train your own. You’ll also learn about Mechanistic Interpretability of transformers, a field which has been advanced by Anthropic’s transformer circuits sequence and work by Neel Nanda.

Chapter 2 – Reinforcement Learning

Reinforcement learning (RL) is an important field of machine learning. It works by teaching agents to take actions in an environment to maximise their accumulated reward.
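To make the idea of "maximising accumulated reward" concrete, here is a minimal sketch of an agent learning in a toy environment. It is not taken from the ARENA materials: the environment is a hypothetical three-armed bandit with fixed payouts, and the function name `run_bandit`, the payouts, and the epsilon-greedy strategy are all assumptions chosen for illustration.

```python
import random


def run_bandit(arm_rewards, steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy bandit with fixed (deterministic) payouts.

    Returns the accumulated reward and the agent's final value estimates.
    All names and numbers here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_arms = len(arm_rewards)
    # Optimistic initial estimates encourage the agent to try every arm once.
    estimates = [1.0] * n_arms
    counts = [0] * n_arms
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a random action.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick the action with the highest estimated value.
            action = max(range(n_arms), key=lambda a: estimates[a])

        # The "environment" responds with a reward for the chosen action.
        reward = arm_rewards[action]
        total_reward += reward

        # Incremental sample-average update of the value estimate.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return total_reward, estimates


if __name__ == "__main__":
    total, estimates = run_bandit([0.1, 0.5, 0.9])
    print("accumulated reward:", round(total, 1))
    print("estimated values:", [round(e, 2) for e in estimates])
```

The agent starts out knowing nothing about the payouts, yet it ends up choosing the best arm most of the time, purely from the rewards it observes. The environments used in the chapter itself are richer than this, but the loop of act, observe reward, update has the same shape.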

In this chapter, you’ll learn about some of the fundamentals of RL. You’ll work with environments from OpenAI’s Gym library to run your own experiments. You’ll also learn about Reinforcement Learning from Human Feedback (RLHF), and apply it to the transformers you trained in the previous chapter.


Chapter 3 – LLM Evaluations

Here, you’ll learn how to evaluate LLMs. We’ll take you through the process of building a multiple-choice benchmark from scratch and using this to evaluate current models.
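As a rough illustration of what a multiple-choice evaluation involves, the sketch below scores a "model" against a tiny benchmark. It is not the ARENA implementation: the questions, the `evaluate` function, and the stand-in model `always_a` are hypothetical, and a real evaluation would call an actual LLM instead of a stub.

```python
from typing import Callable, Dict, List

# A toy benchmark: each item has a question, lettered choices, and an answer key.
# These items are made up for illustration only.
BENCHMARK: List[Dict] = [
    {"question": "What is 2 + 2?",
     "choices": {"A": "3", "B": "4", "C": "5"}, "answer": "B"},
    {"question": "What is 3 * 3?",
     "choices": {"A": "9", "B": "6", "C": "12"}, "answer": "A"},
    {"question": "What is 10 - 7?",
     "choices": {"A": "2", "B": "4", "C": "3"}, "answer": "C"},
]


def format_prompt(item: Dict) -> str:
    """Render one benchmark item as a prompt ending with an answer slot."""
    lines = [item["question"]]
    lines += [f"{letter}. {text}" for letter, text in item["choices"].items()]
    lines.append("Answer:")
    return "\n".join(lines)


def evaluate(model: Callable[[str], str], benchmark: List[Dict]) -> float:
    """Return the fraction of items where the model's letter matches the key."""
    correct = 0
    for item in benchmark:
        prediction = model(format_prompt(item)).strip().upper()
        if prediction == item["answer"]:
            correct += 1
    return correct / len(benchmark)


def always_a(prompt: str) -> str:
    """Stand-in 'model' that ignores the prompt and always answers A."""
    return "A"


if __name__ == "__main__":
    print("accuracy of always-A baseline:", evaluate(always_a, BENCHMARK))
```

A trivial baseline like `always_a` is worth running first: if a real model does not beat it, either the benchmark or the answer parsing probably needs another look.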

We’ll then move on to study LLM agents: how to build them, and how to evaluate them.


ARENA Capstone Projects

If you take part in one of our in-person programmes, your time with us will conclude with a capstone project.

In this project, you’ll have the chance to dig into a topic that you found interesting during the course, applying the skills and knowledge you’ve accumulated over the last four weeks with us.

Soon, you’ll be able to access a bank of previous participants’ Capstone Projects here.


Extending ARENA’s Impact: Beyond Self-Study

We’re committed to supporting others who share our mission of developing AI safety talent worldwide. For this reason, we have some provisions in place for others to use and share our materials.

Teaching AI Safety with ARENA Materials: We support educators, community groups, and universities using our curriculum to teach AI safety. We welcome the efforts of mission-aligned people and organisations to help us grow technical AI safety capacity.

Resources for Educators: We can provide additional support for those teaching our materials, such as advice on structuring your course. Our content is adaptable for different contexts and time frames.

Partnership Guidelines: To maintain quality and clarity across AI safety education, we’ve established some simple guidelines for using our materials. These include policies about giving us credit, distinct branding, and requesting permission for group use.

Learn more about using our materials.