Week 3: Imitation Learning
LeRobot, behavior cloning, ACT, diffusion policy, SmolVLA, LoRA, and OpenVLA.
LeRobot v0.5 setup, dataset exploration, BC baseline
Glossary primer (12 min). LeRobot: Hugging Face's flagship robot learning library. v0.5.0 released Q1 2026. Trains and evaluates ACT, Diffus...
ACT (Action Chunking Transformer) on ALOHA insertion
Glossary primer (10 min). ACT (Action Chunking Transformer): Stanford 2023. CVAE-based transformer that predicts the next K actions per call...
Diffusion Policy on PushT
Glossary primer (10 min). Diffusion Policy: Columbia/Stanford 2023. Uses a denoising diffusion model to generate actions conditioned on obse...
SmolVLA fine-tuning on LIBERO-Spatial
Glossary primer (12 min). SmolVLA: Hugging Face's compact (about 450M-parameter) Vision-Language-Action model, released 2025. Designed for fine tu...
VQ-BeT and architecture-comparison day
Glossary primer (10 min). VQ-BeT (Vector-Quantized Behavior Transformer): New York University 2024. Tokenizes continuous actions via a VQ-VAE, then...
OpenVLA-OFT inference + LoRA on a custom small dataset
Glossary primer (12 min). OpenVLA: Stanford / TRI 2024. 7B-parameter VLA built on Llama 2 7B + DINOv2 + SigLIP. Open weights. OpenVLA-OFT: ...
Week 3 capstone-track reflection + fresh-clone
Glossary primer (5 min). No new terms. Today is reflection and reproducibility. Hour 1: Capstone Track D pre-design (40 min). Track D of the We...
What you will know by end of Week 3
- Read the week's source papers without drowning in undefined terms.
- Run the week's core software stack from a fresh clone.
- Explain the week's systems in terms of data, control, and learning loops.