---
layout: sp25
title: Advanced Large Language Model Agents
permalink: /sp25
redirect_from:
---
- To sign up for the course, please fill in this form.
- For course discussion and questions, please join our LLM Agents Discord.
- This course is built upon the fundamentals from the Fall 2024 LLM Agents MOOC.
| Instructor | (Guest) Co-instructor | (Guest) Co-instructor |
|---|---|---|
| Dawn Song | Xinyun Chen | Kaiyu Yang |
| Professor, UC Berkeley | Research Scientist, Google DeepMind | Research Scientist, Meta FAIR |
Large language model (LLM) agents have become an important frontier in AI. However, they still fall short of critical skills, such as complex reasoning and planning, needed for solving hard problems and enabling end-to-end applications in real-world scenarios. Building on our previous course, this course dives deeper into advanced topics in LLM agents, focusing on reasoning, AI for mathematics, code generation, and program verification. We begin by introducing advanced inference-time and post-training techniques for building LLM agents that can search and plan. Then, we focus on two application domains: mathematics and programming. We study how LLMs can be used to prove mathematical theorems, as well as to generate and reason about computer programs. Specifically, we will cover the following topics:
- Inference-time techniques for reasoning
- Post-training methods for reasoning
- Search and planning
- Agentic workflows, tool use, and function calling
- LLMs for code generation and verification
- LLMs for mathematics: data curation, continual pretraining, and finetuning
- LLM agents for theorem proving and autoformalization
| Date | Guest Lecture (4:00PM-6:00PM PST) | Supplemental Readings |
|---|---|---|
| Jan 27th | Inference-Time Techniques for LLM Reasoning<br>Xinyun Chen, Google DeepMind<br>Livestream, Intro Slides, Quiz 1 | Large Language Models as Optimizers<br>Large Language Models Cannot Self-Correct Reasoning Yet<br>Teaching Large Language Models to Self-Debug |
| Feb 3rd | Learning to Reason with LLMs<br>Jason Weston, Meta | Direct Preference Optimization: Your Language Model is Secretly a Reward Model<br>Iterative Reasoning Preference Optimization<br>Chain-of-Verification Reduces Hallucination in Large Language Models |
| Feb 10th | TBA | |
| Feb 17th | No Class - Presidents' Day | |
| Feb 24th | Reasoning and Planning in Large Language Models<br>Hanna Hajishirzi, University of Washington | |
| Mar 3rd | Code Generation: Foundation Models<br>Charles Sutton, Google DeepMind | |
| Mar 10th | Coding Agents/Web Agents<br>Ruslan Salakhutdinov, CMU/Meta | |
| Mar 17th | Multimodal Agents<br>Caiming Xiong, Salesforce AI Research | |
| Mar 24th | No Class - Spring Recess | |
| Mar 31st | Math LLMs: Data Curation, Pretraining, Finetuning, and Tool-Integrated Reasoning<br>Thomas Hubert, Google DeepMind | |
| Apr 7th | Language Models for Autoformalization and Theorem Proving<br>Kaiyu Yang, Meta FAIR | |
| Apr 14th | Advanced Topics in Theorem Proving<br>Sean Welleck, CMU | |
| Apr 21st | Program Verification & Generating Verified Code<br>Swarat Chaudhuri, UT Austin | |
| Apr 28th | Agent Safety & Security<br>Dawn Song, UC Berkeley | Coming Soon! |