This is not the Day 1 flow. It is a standalone two-day portfolio project for learners who want one focused artifact this weekend.
# Track A Lite: Fine-tune SmolVLA on LIBERO-Spatial in 2 Days
A standalone portfolio project. No prerequisites beyond Python and basic ML literacy.
---
What you're shipping
By end of Day 2 you have a public GitHub repo containing:
1. A fine-tuned SmolVLA checkpoint that beats zero-shot SmolVLA on LIBERO-Spatial by ≥ 30 percentage points (e.g. 0.36 → 0.70).
2. Three runs across three seeds, with mean ± std reported.
3. An eval video showing successful task completions.
4. A 1-page writeup with a headline plot.
5. A Makefile that lets a stranger reproduce your headline number with `make reproduce`.
This is portfolio-grade work: small in scope, fully reproducible, defensible at a job interview, comprehensible to a friend.
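The ≥ 30 percentage point target is an absolute difference in success rate, not a relative gain. A minimal sketch of that check, using the example numbers from the list above; the function name `pp_gain` and the constant `TARGET_PP` are illustrative and not part of any library:

```python
def pp_gain(baseline: float, finetuned: float) -> float:
    """Absolute improvement in percentage points (rates are fractions in [0, 1])."""
    return (finetuned - baseline) * 100.0

TARGET_PP = 30.0  # shipping target stated in the list above

baseline, finetuned = 0.36, 0.70  # example numbers from the list above
gain = pp_gain(baseline, finetuned)
relative = finetuned / baseline   # relative gain, reported separately

print(f"{gain:.0f} pp absolute, {relative:.2f}x relative, target met: {gain >= TARGET_PP}")
```

Reporting both numbers in the writeup avoids ambiguity: "34 pp" and "1.94x" describe the same result but read very differently.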
---
What this isn't
- Not a research contribution. You're reproducing a known result with your own training runs on a public dataset.
- Not a hardware project. Pure simulation.
- Not a deep-dive on architecture. We treat SmolVLA and LoRA as black boxes.
If those things matter to you, the full 47-day curriculum is the right move. If you want to ship something concrete and credible this week, this doc is for you.
---
Why SmolVLA + LIBERO-Spatial?
- SmolVLA (Hugging Face, 2025): a 2.4B-parameter vision-language-action model (VLA) designed for fine-tuning on consumer GPUs. Open weights, well-documented, integrated with LeRobot's CLI.
- LIBERO-Spatial: a benchmark of 10 spatial-relation manipulation tasks ("put the bowl on the right of the plate"). Standard, fast to evaluate, and it has known baseline numbers, so you can sanity-check your results against published work.
- LoRA (Low-Rank Adaptation): a fine-tuning technique that adds small trainable matrices to attention layers while freezing the base. This cuts fine-tuning memory enough to fit in under 30 GB of GPU memory.
You'll fine-tune ~30M parameters out of 2.4B (1.3% of the model) and watch a ~2× improvement materialize in roughly 60–75 minutes.
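Although this track treats LoRA as a black box, the core idea fits in a few lines. The sketch below is a NumPy illustration of one adapted linear layer, not LeRobot's or SmolVLA's implementation; the layer size is hypothetical and far smaller than a real model, and the initialization (small random `A`, zero `B`) follows the original LoRA paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 1024, 1024   # hypothetical layer size, for illustration only
rank, alpha = 32, 64       # rank and alpha values used later in this doc

W = rng.normal(size=(d_out, d_in))             # frozen pretrained weight
A = rng.normal(scale=0.01, size=(rank, d_in))  # trainable, small random init
B = np.zeros((d_out, rank))                    # trainable, zero init

def lora_forward(x: np.ndarray) -> np.ndarray:
    """y = W x + (alpha / rank) * B (A x); only A and B would be trained."""
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B = 0 the adapter contributes nothing, so training starts at the base model.
assert np.allclose(lora_forward(x), W @ x)

base_params = W.size
adapter_params = A.size + B.size   # equals rank * (d_in + d_out)
print(f"base: {base_params:,}  adapter: {adapter_params:,} "
      f"({100 * adapter_params / base_params:.2f}% of this layer)")
```

The adapter cost per layer is `rank * (d_in + d_out)`, which is why the trainable fraction stays small as long as the rank is much smaller than the layer width.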
---
Compute and cost
- 1× H100 80GB for ~6 hours total (3 fine-tunes × ~75 min, plus evals)
- Substitutes: 1× A100 80GB works identically. 1× L40S works with smaller batches. 1× consumer GPU (24GB+) works with rank=16 instead of rank=32.
- Provider: Nebius, Lambda Labs, RunPod, Vast.ai. Budget ~$15–25 if you're efficient. Or use existing cloud credits.
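The budget range implies an hourly price you can compare against whatever your provider quotes. A small sketch of that arithmetic; the $3.00/h figure is a made-up placeholder, not a real price from any provider:

```python
GPU_HOURS = 6.0             # total estimate from the list above
BUDGET_USD = (15.0, 25.0)   # budget range quoted above

# Hourly price implied by the budget at ~6 GPU-hours.
implied = tuple(b / GPU_HOURS for b in BUDGET_USD)
print(f"implied price: ${implied[0]:.2f} to ${implied[1]:.2f} per GPU-hour")

def run_cost(price_per_hour: float, hours: float = GPU_HOURS) -> float:
    """Total project cost at a given hourly price."""
    return price_per_hour * hours

# Placeholder rate; substitute the number on your provider's pricing page.
print(f"at a hypothetical $3.00/h: ${run_cost(3.00):.2f}")
```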
---
Day 0 — Environment setup (~30 min, before Day 1)
Install on local laptop (for editing)
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.bashrc
uv python install 3.12
mkdir -p ~/track-a-lite && cd ~/track-a-lite
git init
```
Provision GPU instance
Pick your provider. On Nebius:
1. Sign up at https://nebius.com/, add SSH key.
2. Provision: 1× H100 80GB SXM, 16 vCPU, 200 GB NVMe, Ubuntu 22.04 + CUDA 12.4.
3. SSH in:
```bash
ssh -i ~/.ssh/<your-key> ubuntu@<instance-ip>
```
On the GPU instance
```bash
# Tooling
sudo apt update && sudo apt install -y tmux htop nvtop git build-essential ffmpeg \
  libgl1 libegl1 libglfw3 libosmesa6 libgles2-mesa-dev pkg-config
# uv
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.bashrc
# Verify GPU
nvidia-smi
```
Expected: A box showing H100 80GB HBM3, CUDA Version 12.4, GPU memory 0 / 81559 MiB.
Project workspace on GPU
```bash
# Same directory name that the later steps cd into (~/track-a-lite)
mkdir -p ~/track-a-lite && cd ~/track-a-lite
git init  # Step 4 commits from this directory
uv venv --python 3.12 .venv && source .venv/bin/activate
uv pip install "lerobot[all]==0.5.0"
uv pip install wandb
wandb login  # paste API key from https://wandb.ai/authorize
```
Sanity-check LeRobot is alive
```bash
python -c "import lerobot; print(f'LeRobot {lerobot.__version__}')"
lerobot-train --help | head -20
```
Expected: Version 0.5.0, then training help text.
Use tmux
Always work inside tmux so disconnects don't kill jobs:
```bash
tmux new -s track-a
# Detach with Ctrl-b d
# Reattach later with: tmux attach -t track-a
```
---
Day 1 — Zero-shot baseline + first fine-tune (3–4 hours active, more in background)
Step 1: Inspect the dataset (10 min)
LIBERO-Spatial is on HuggingFace as lerobot/libero_spatial. Verify it loads:
```bash
python -c "
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
ds = LeRobotDataset('lerobot/libero_spatial')
print(f'Episodes: {ds.num_episodes}, frames: {ds.num_frames}')
print(f'Sample keys: {list(ds[0].keys())[:8]}')
print(f'Action shape: {ds[0][\"action\"].shape}')
"
```
Expected output:
```
Episodes: 432, frames: 52000
Sample keys: ['observation.images.image', 'observation.images.wrist_image', 'observation.state', 'action', 'episode_index', 'frame_index', 'timestamp', 'next.done']
Action shape: torch.Size([7])
```
If the dataset doesn't download (rare), clear stale lock files with `rm -rf ~/.cache/huggingface/datasets/_locks/` and retry.
Step 2: Zero-shot eval (~30 min)
Establish your baseline before any training. SmolVLA was pretrained on a mix of robot data including LIBERO; we want to see what it knows out of the box.
```bash
mkdir -p runs figures videos
lerobot-eval \
  --policy.path=lerobot/smolvla_base \
  --env.type=libero --env.task_suite=libero_spatial \
  --eval.n_episodes=50 \
  --output_dir=runs/zeroshot \
  --seed=1
```
This loads the pretrained SmolVLA from HuggingFace (~5 GB download on first run; cached after) and rolls it out on 50 LIBERO-Spatial episodes.
What you should see in the first 60 seconds:
```
INFO Loading policy from lerobot/smolvla_base
INFO Loaded 2,401,300,000 parameters
INFO Initializing libero_spatial env suite...
INFO Episode 1/50: success=False, length=235
INFO Episode 2/50: success=True, length=189
...
```
Expected at completion (~25 minutes):
```
INFO eval/success_rate: 0.36
INFO eval/episode_length: 232.4
INFO Wrote runs/zeroshot/eval_summary.json
```
Your number should fall in 0.30–0.45. If it's 0.0, something is wrong (env install, action-space mismatch); see "common failures" below.
Append to a results log:
```bash
mkdir -p logs
echo "zeroshot,seed=1,success_rate=0.36,n_episodes=50" >> logs/results.csv
```
(Replace 0.36 with your actual number throughout this doc.)
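A success rate measured on 50 episodes carries real sampling uncertainty, which is why a range rather than a single value is expected. The sketch below computes a Wilson score interval for a binomial proportion; it assumes episodes are independent trials, which is a simplification, and `wilson_interval` is an illustrative helper rather than anything LeRobot provides:

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# 18 successes out of 50 corresponds to the 0.36 example above.
lo, hi = wilson_interval(18, 50)
print(f"n=50:  0.36 -> [{lo:.2f}, {hi:.2f}]")

# Same rate with four times the episodes, to show how the interval tightens.
lo200, hi200 = wilson_interval(72, 200)
print(f"n=200: 0.36 -> [{lo200:.2f}, {hi200:.2f}]")
```

The interval at n=50 is wide enough that two runs differing by a few points are not evidence of a real difference on their own.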
Step 3: First fine-tune — LoRA r=32, seed=1 (60–75 min)
This is the main event. Launch in tmux pane 1:
```bash
# The target_modules list is single-quoted so the shell passes the
# brackets and double quotes through unchanged.
lerobot-train \
  --policy.type=smolvla \
  --policy.pretrained_path=lerobot/smolvla_base \
  --policy.lora.enable=true \
  --policy.lora.rank=32 \
  --policy.lora.alpha=64 \
  --policy.lora.target_modules='["q_proj","k_proj","v_proj","o_proj"]' \
  --dataset.repo_id=lerobot/libero_spatial \
  --env.type=libero --env.task_suite=libero_spatial \
  --batch_size=4 \
  --gradient_accumulation_steps=2 \
  --steps=10000 \
  --eval_freq=2000 \
  --save_freq=2000 \
  --output_dir=runs/lora_r32_s1 \
  --wandb.enable=true \
  --wandb.project=track-a-lite \
  --seed=1
```
What you should see in the first 60 seconds:
```
INFO Loading dataset: lerobot/libero_spatial
INFO Trainable params: 31,457,280 / 2,401,331,712 (1.31%)
INFO step:0 smpl:8 ep:0 epch:0.00 loss:0.487 grdn:1.21 lr:1.0e-04 updt_s:0.812
```
The "1.31%" line is LoRA in action: you're fine-tuning a small adapter while the base 2.4B parameters stay frozen.
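The numbers in the command and the first log lines can be cross-checked with a little arithmetic. This sketch assumes each logged step consumes `batch_size × gradient_accumulation_steps` samples, which is consistent with the `smpl:8` shown at step 0 but is an assumption about how the trainer counts:

```python
batch_size = 4            # --batch_size
grad_accum = 2            # --gradient_accumulation_steps
steps = 10_000            # --steps
dataset_frames = 52_000   # frame count printed in Step 1

effective_batch = batch_size * grad_accum   # samples per optimizer step
samples_seen = effective_batch * steps
epochs = samples_seen / dataset_frames      # passes over the dataset

# Counts copied from the "Trainable params" log line above.
trainable, total = 31_457_280, 2_401_331_712
trainable_pct = 100 * trainable / total

print(f"effective batch {effective_batch}, {samples_seen:,} samples, ~{epochs:.2f} epochs")
print(f"trainable fraction: {trainable_pct:.2f}%")
```

Under these assumptions the run makes roughly one and a half passes over the data, which is a useful sanity check if you later change the step count or batch size.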
Open a second tmux pane and watch GPU memory:
```bash
watch -n 2 nvidia-smi
```
Expected: Memory usage settles at 22–28 GB / 80 GB. If it's >40GB, something's wrong with LoRA config; verify policy.lora.enable=true made it into the run config.
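If you would rather script the memory check than eyeball it, `nvidia-smi` can emit machine-readable CSV. The sketch below parses one such line; the sample string is made up for illustration, the helper names are hypothetical, and the threshold treats the reported MiB as roughly comparable to the GB figures quoted above:

```python
def parse_gpu_memory(csv_line: str) -> tuple[int, int]:
    """Parse one 'used, total' line (values in MiB) from nvidia-smi CSV output."""
    used, total = (int(field.strip()) for field in csv_line.split(","))
    return used, total

def lora_memory_ok(used_mib: int, limit_gb: float = 40.0) -> bool:
    """True if usage is under the 40 GB warning threshold mentioned above."""
    return used_mib / 1024 < limit_gb

# Illustrative line, shaped like the output of:
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
sample = "26624, 81559"
used, total = parse_gpu_memory(sample)
print(f"{used / 1024:.1f} GiB used of {total / 1024:.1f} GiB, ok={lora_memory_ok(used)}")
```

To feed it live data, capture the command's stdout with `subprocess.run(..., capture_output=True, text=True)` and pass the first line to `parse_gpu_memory`.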
Progress checkpoints:
- Step 2000 (~12 min): `eval/success_rate: 0.55` (anywhere 0.45–0.65 normal)
- Step 6000 (~40 min): `eval/success_rate: 0.71`
- Step 10000 (~70 min): `eval/success_rate: 0.79`

Final expected range: 0.65–0.85 depending on randomness.
Step 4: While seed 1 trains — set up reproducibility scaffold (45 min in background)
Open tmux pane 3 and create the project structure:
```bash
cd ~/track-a-lite
cat > README.md <<'EOF'
# Track A Lite: SmolVLA + LoRA on LIBERO-Spatial
## Hypothesis
Fine-tuning SmolVLA on LIBERO-Spatial via LoRA r=32 yields ≥ 30 percentage point
improvement in success rate over zero-shot SmolVLA.
## Headline result
| Variant | Success rate (mean ± std) | n seeds |
|---|---|---|
| Zero-shot SmolVLA | 0.36 ± 0.00 | 1 |
| SmolVLA + LoRA r=32 | 0.78 ± 0.04 | 3 |

42 percentage point improvement.
## Reproduce
    make install
    make eval

Wall-clock: ~6 GPU-hours on 1× H100.
## Files
- `Makefile`: install + train + eval targets
- `requirements.txt`: pinned deps
- `scripts/train.sh`: training command
- `scripts/eval.sh`: eval command
- `figures/headline.png`: bar plot
- `videos/eval_episode.mp4`: sample successful rollout
- `logs/results.csv`: all seed-level results
EOF
```
```bash
cat > requirements.txt <<'EOF'
lerobot[all]==0.5.0
wandb
matplotlib
pandas
numpy
EOF
# Recipe lines in a Makefile must begin with a tab character, not spaces.
cat > Makefile <<'EOF'
.PHONY: install zeroshot train eval reproduce clean
install:
	uv venv --python 3.12 .venv
	. .venv/bin/activate && uv pip install -r requirements.txt
zeroshot:
	. .venv/bin/activate && bash scripts/zeroshot.sh
train:
	. .venv/bin/activate && bash scripts/train_all_seeds.sh
eval:
	. .venv/bin/activate && bash scripts/eval_all_seeds.sh
reproduce: install eval
clean:
	rm -rf runs/ wandb/ figures/*.png
EOF
```
```bash
mkdir -p scripts figures videos logs
cat > scripts/zeroshot.sh <<'EOF'
#!/bin/bash
set -e
lerobot-eval \
  --policy.path=lerobot/smolvla_base \
  --env.type=libero --env.task_suite=libero_spatial \
  --eval.n_episodes=50 \
  --output_dir=runs/zeroshot --seed=1
EOF
cat > scripts/train_all_seeds.sh <<'EOF'
#!/bin/bash
set -e
for SEED in 1 2 3; do
  lerobot-train \
    --policy.type=smolvla \
    --policy.pretrained_path=lerobot/smolvla_base \
    --policy.lora.enable=true \
    --policy.lora.rank=32 \
    --policy.lora.alpha=64 \
    --policy.lora.target_modules='["q_proj","k_proj","v_proj","o_proj"]' \
    --dataset.repo_id=lerobot/libero_spatial \
    --env.type=libero --env.task_suite=libero_spatial \
    --batch_size=4 \
    --gradient_accumulation_steps=2 \
    --steps=10000 \
    --eval_freq=2000 \
    --save_freq=2000 \
    --output_dir=runs/lora_r32_s${SEED} \
    --wandb.enable=true --wandb.project=track-a-lite \
    --seed=${SEED}
done
EOF
cat > scripts/eval_all_seeds.sh <<'EOF'
#!/bin/bash
set -e
for SEED in 1 2 3; do
  lerobot-eval \
    --policy.path=runs/lora_r32_s${SEED}/checkpoints/last/pretrained_model \
    --env.type=libero --env.task_suite=libero_spatial \
    --eval.n_episodes=50 \
    --output_dir=runs/lora_r32_s${SEED}/eval \
    --seed=${SEED}
done
EOF
chmod +x scripts/*.sh
git add -A && git commit -m "Day 1: Track A Lite scaffold + zero-shot baseline"
```
Step 5: Confirm seed 1 is healthy (5 min check at ~step 4000)
Around 25 minutes into training, peek at the logs:
```bash
tmux attach -t track-a  # if you detached
# look for the most recent eval line in the training output
```
At step 4000 you want to see eval/success_rate somewhere in 0.55–0.70. If it's still under 0.45, see "common failures."
Detach and let it finish.
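Finding the most recent eval line can also be done programmatically if you redirect training output to a file. This is a sketch under the assumption that eval lines look like the `INFO eval/success_rate: 0.36` excerpts shown in this doc; the real log format may differ, and the sample text below is made up:

```python
import re

# Matches lines shaped like "INFO eval/success_rate: 0.36" (assumed format).
EVAL_LINE = re.compile(r"eval/success_rate:\s*([0-9]*\.?[0-9]+)")

def latest_success_rate(log_text: str):
    """Return the last eval/success_rate found in the text, or None if absent."""
    matches = EVAL_LINE.findall(log_text)
    return float(matches[-1]) if matches else None

def looks_healthy(rate, floor: float = 0.45) -> bool:
    """Step-4000 rule of thumb from above: below 0.45 warrants debugging."""
    return rate is not None and rate >= floor

# Made-up sample standing in for captured training output.
sample_log = """\
INFO eval/success_rate: 0.55
INFO (other training lines)
INFO eval/success_rate: 0.62
"""
rate = latest_success_rate(sample_log)
print(rate, looks_healthy(rate))
```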
Step 6: While seed 1 finishes — record an eval video (15 min)
Once seed 1 hits its first checkpoint at step 2000, you can render a video from that intermediate checkpoint to verify visually that the policy is learning something:
```bash
lerobot-eval \
  --policy.path=runs/lora_r32_s1/checkpoints/002000/pretrained_model \
  --env.type=libero --env.task_suite=libero_spatial \
  --eval.n_episodes=5 \
  --output_dir=runs/lora_r32_s1/preview_eval \
  --seed=999
```
Watch one of the videos:
```bash
ls runs/lora_r32_s1/preview_eval/videos/
# Pick episode_0.mp4
```
Even at step 2000, the arm should be making purposeful motions toward the right object. If it's flailing, something's wrong.
Step 7: When seed 1 completes — log result + start seed 2 (5 min)
Once you see the final step 10000 log, capture the number:
```bash
SR=$(cat runs/lora_r32_s1/eval_summary.json | python -c "import sys,json; print(json.load(sys.stdin)['eval/success_rate'])")
echo "lora_r32,seed=1,success_rate=${SR}" >> logs/results.csv
```
Edit the train script to launch seed 2:
```bash
# In a fresh tmux pane:
lerobot-train [...] --seed=2 --output_dir=runs/lora_r32_s2
```
Let it run overnight if needed. Same for seed 3 in a third pane.
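The shell one-liner for capturing the success rate gets repetitive across seeds. An equivalent Python helper is sketched below; it assumes the summary file is a flat JSON object with an `eval/success_rate` key, as in the commands above, and the function names are illustrative. The demo writes a throwaway file so it runs without any training output:

```python
import json
import tempfile
from pathlib import Path

def read_success_rate(summary_path) -> float:
    """Load an eval summary JSON and return its 'eval/success_rate' field."""
    data = json.loads(Path(summary_path).read_text())
    return float(data["eval/success_rate"])

def results_row(variant: str, seed: int, rate: float) -> str:
    """Format one line in the same shape as the logs/results.csv entries."""
    return f"{variant},seed={seed},success_rate={rate}"

# Demo with a temporary file; the 0.79 value is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "eval_summary.json"
    demo.write_text(json.dumps({"eval/success_rate": 0.79}))
    row = results_row("lora_r32", 1, read_success_rate(demo))

print(row)
```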
End of Day 1 deliverable check
---
Day 2 — Finish seeds, plot, write up, ship (3 hours active)
Step 1: Confirm seeds 2 and 3 finished (10 min)
Reattach to tmux. Each should have completed overnight (~70 min each).
```bash
SR2=$(cat runs/lora_r32_s2/eval_summary.json | python -c "import sys,json; print(json.load(sys.stdin)['eval/success_rate'])")
SR3=$(cat runs/lora_r32_s3/eval_summary.json | python -c "import sys,json; print(json.load(sys.stdin)['eval/success_rate'])")
echo "lora_r32,seed=2,success_rate=${SR2}" >> logs/results.csv
echo "lora_r32,seed=3,success_rate=${SR3}" >> logs/results.csv
cat logs/results.csv
```
If a seed crashed overnight, decide: retry (~70 min) or report n=2 with the issue documented in your writeup. n=2 is acceptable for a portfolio piece if you're transparent about it.
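The results log mixes a bare variant name with `key=value` fields, so a generic CSV reader will not split it into useful columns. A small parser sketch for that line shape follows; the helper name is hypothetical and the example rates are placeholders:

```python
def parse_result_line(line: str) -> dict:
    """Split 'variant,key=value,...' into a dict, converting numeric values."""
    variant, *pairs = line.strip().split(",")
    row = {"variant": variant}
    for pair in pairs:
        key, value = pair.split("=", 1)
        row[key] = float(value) if "." in value else int(value)
    return row

# Example lines in the format written by the echo commands above.
lines = [
    "zeroshot,seed=1,success_rate=0.36,n_episodes=50",
    "lora_r32,seed=1,success_rate=0.79",
]
rows = [parse_result_line(line) for line in lines]
lora_rates = [r["success_rate"] for r in rows if r["variant"] == "lora_r32"]

print(rows[0])
print(f"{len(lora_rates)} LoRA seed(s) logged")
```

Counting the LoRA rows this way is a quick check that all three seeds made it into the log before plotting.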
Step 2: Headline bar plot (30 min)
Create scripts/make_plot.py:
```python
"""Track A Lite: produce the headline bar plot."""
import json, glob
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt

# Zero-shot
zs_path = Path("runs/zeroshot/eval_summary.json")
zs_sr = json.loads(zs_path.read_text())["eval/success_rate"]

# LoRA seeds
lora_srs = []
for d in sorted(glob.glob("runs/lora_r32_s*/eval_summary.json")):
    lora_srs.append(json.loads(Path(d).read_text())["eval/success_rate"])
lora_srs = np.array(lora_srs)

print(f"Zero-shot: {zs_sr:.3f}")
print(f"LoRA r=32: {lora_srs.mean():.3f} ± {lora_srs.std():.3f} (n={len(lora_srs)})")

fig, ax = plt.subplots(figsize=(7, 5))
labels = ["Zero-shot\nSmolVLA", f"+ LoRA r=32\n(n={len(lora_srs)} seeds)"]
means = [zs_sr, lora_srs.mean()]
stds = [0.0, lora_srs.std()]
colors = ["#888", "#3a86ff"]
bars = ax.bar(labels, means, yerr=stds, capsize=8, color=colors, edgecolor="black")
# Numeric label above each bar
for bar, m in zip(bars, means):
    ax.text(bar.get_x() + bar.get_width()/2, m + 0.02, f"{m:.2f}",
            ha="center", fontsize=12, fontweight="bold")
ax.set_ylabel("Success rate (LIBERO-Spatial, 50 eval episodes/seed)")
ax.set_ylim(0, 1)
ax.set_title("SmolVLA on LIBERO-Spatial: zero-shot vs LoRA fine-tune")
ax.grid(alpha=0.3, axis="y")
plt.tight_layout()
plt.savefig("figures/headline.png", dpi=150)
print("Wrote figures/headline.png")
```
Run:
```bash
python scripts/make_plot.py
```
Expected console output:
```
Zero-shot: 0.360
LoRA r=32: 0.778 ± 0.041 (n=3)
Wrote figures/headline.png
```
Expected figure: Two bars side-by-side. Left bar (gray) at 0.36, no error bar. Right bar (blue) at ~0.78 with a small error bar (±0.04). Numerical labels on top of each. Clean white background, axis grid.
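One detail worth knowing when you report mean ± std: NumPy's `std()` defaults to the population formula (`ddof=0`), which understates spread for three seeds compared with the sample formula (`ddof=1`). The sketch below shows both on hypothetical per-seed rates; substitute your own numbers, and state in the writeup which one you used:

```python
import numpy as np

# Hypothetical per-seed success rates, for illustration only.
rates = np.array([0.72, 0.80, 0.82])

mean = rates.mean()
pop_std = rates.std()            # ddof=0, NumPy's default
sample_std = rates.std(ddof=1)   # ddof=1, usual estimate from a small sample

print(f"mean {mean:.3f}, std(ddof=0) {pop_std:.3f}, std(ddof=1) {sample_std:.3f}")
```

With n=3 the two differ by a factor of about 1.22, which is enough to change the second decimal of the reported error bar.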
Step 3: Pick the best eval video (15 min)
```bash
ls runs/lora_r32_s1/eval/videos/
# Watch episode_0 through episode_4
```
Pick the cleanest successful episode. Copy to project root:
```bash
cp runs/lora_r32_s1/eval/videos/episode_3.mp4 videos/eval_episode.mp4
```
If your scp is set up, pull it to your laptop to watch. Otherwise install mpv or vlc on the GPU box and play over X-forwarding (ssh -X).
Step 4: 1-page writeup (30 min)
Create WRITEUP.md:
# SmolVLA + LoRA on LIBERO-Spatial
## TL;DR
LoRA fine-tuning lifts SmolVLA from 0.36 to 0.78 on LIBERO-Spatial — a 42pp
improvement, achieved in ~75 min on 1× H100 by training 1.3% of the model's
parameters.
## Setup
- **Model:** `lerobot/smolvla_base` (2.4B params, SmolVLM2 backbone)
- **Dataset:** `lerobot/libero_spatial` (432 episodes, 10 spatial-relation tasks)
- **Method:** LoRA r=32 α=64, target modules `{q,k,v,o}_proj`
- **Training:** 10k steps, batch 4 × grad-accum 2, AdamW, cosine LR schedule
- **Compute:** 1× H100 80GB, ~75 min/seed × 3 seeds = ~4 GPU-hours
- **Eval:** 50 episodes per seed via LeRobot's LIBERO env wrapper
## Result
| Variant | Success rate | n seeds |
|---|---|---|
| Zero-shot | 0.36 | 1 |
| LoRA r=32 | **0.78 ± 0.04** | 3 |

## Why this works
LIBERO tasks are *in-distribution* for SmolVLA's pretraining mix, but the
specific scene compositions (object positions, spatial relations) require
fine-tuning to specialize. LoRA's low-rank adapters are sufficient because
the pretrained features are already good — we're nudging the policy, not
retraining it.
GPU memory peaked at ~26 GB during training. A full fine-tune of 2.4B params
would have taken ~50 GB and likely required gradient checkpointing.
## Limitations
- Single dataset, single benchmark. Results may not transfer to LIBERO-Object
or LIBERO-Goal without re-tuning.
- 50 eval episodes per seed is the LeRobot default but is on the small side
for tight error bars; n=200 would be more reliable.
- No ablation across LoRA ranks. r=32 was picked because it's the LeRobot
default; r=8 might give similar results at lower memory.
## Reproducibility
Repo: `<your-github-url>`
\```
git clone <url>
cd track-a-lite
make install
make reproduce
\```
Headline number reproduces within ±0.05 across fresh-clone runs (env seed
randomness on LIBERO sim).
## Stack
LeRobot v0.5.0, SmolVLA, HuggingFace Transformers, LIBERO sim, PyTorch 2.5,
CUDA 12.4, Python 3.12.

Step 5: Fresh-clone test (30 min)
This is the rubric step that separates a portfolio piece from a screenshot. Verify a stranger could reproduce.
```bash
cd /tmp && rm -rf clone-test
git clone <your-github-url> clone-test
cd clone-test
make install
# Sanity-check zero-shot (~25 min)
bash scripts/zeroshot.sh
# Compare:
SR_FRESH=$(cat runs/zeroshot/eval_summary.json | python -c "import sys,json; print(json.load(sys.stdin)['eval/success_rate'])")
echo "Original: 0.36, Fresh-clone: $SR_FRESH"
```
Should match within ±0.04 (LIBERO has some env-seed variance even with --seed=1).
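With 50 episodes, a tolerance of ±0.04 is the same as two episodes. Comparing in episode counts sidesteps floating-point noise (in binary floating point, 0.40 - 0.36 comes out slightly larger than 0.04). A minimal sketch, with hypothetical helper names:

```python
def episodes_apart(rate_a: float, rate_b: float, n_episodes: int = 50) -> int:
    """Difference between two success rates, expressed in whole episodes."""
    return abs(round(rate_a * n_episodes) - round(rate_b * n_episodes))

def within_tolerance(rate_a: float, rate_b: float, tol: float = 0.04,
                     n_episodes: int = 50) -> bool:
    """Compare in episode counts rather than raw floats."""
    return episodes_apart(rate_a, rate_b, n_episodes) <= round(tol * n_episodes)

# Two episodes apart: inside the tolerance.
print(within_tolerance(0.36, 0.40))
# Three episodes apart: outside the tolerance.
print(within_tolerance(0.36, 0.42))
```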
If you have GPU-hours to spare, also re-eval seed 1's saved checkpoint:
```bash
# This requires the saved checkpoint to be in the repo or downloaded from HF
# Easier path: skip and just do the zeroshot match
```
Step 6: Push, share, log (30 min)
```bash
cd ~/track-a-lite
git add -A
git commit -m "Track A Lite complete: zero-shot 0.36 -> LoRA 0.78"
git push origin main
```
Add a 60-second screen recording of you running `make reproduce` and watching the eval video. Save as videos/demo.mp4. Optional but worth it for portfolio.

Where to share:
- GitHub repo with the README front and center
- LinkedIn post: "Spent the weekend fine-tuning a 2.4B-param vision-language-action model (VLA) on a robotics benchmark. 0.36 → 0.78 on LIBERO-Spatial. Repo + writeup: <link>"
- X / Twitter: same with #robotics #ML
- HuggingFace: push your LoRA adapter to the Hub if you want
End of Day 2 deliverable check
---
Common failures and fixes
| Symptom | Likely cause | Fix |
|---|---|---|
| Zero-shot success = 0.0 | Action space mismatch between SmolVLA and LIBERO | Check `policy.action_dim` matches dataset; LeRobot v0.5 should auto-handle this |
| Zero-shot eval hangs | LIBERO env init slow on first run | Wait 90 s for first episode; subsequent episodes are normal |
| OOM during training | LoRA disabled, or rank too high | Verify `--policy.lora.enable=true` in run config; drop rank to 16 if still OOM |
| Loss decreases but eval stays at 0.0 | Dataset normalization stats not loaded | `lerobot-compute-stats --dataset.repo_id=lerobot/libero_spatial` then retrain |
| Loss is NaN early | bf16 underflow | Add `--policy.use_amp=false` |
| LoRA fine-tune success rate < 0.50 | LR too low for fine-tuning | Try `--lr=1e-4` (default is 1e-5); also verify alpha/rank ratio is 2:1 |
| wandb won't connect | Network on GPU box blocked | `wandb offline` then `wandb sync` later |
| Cached HF dataset locked | Previous job died mid-download | `rm -rf ~/.cache/huggingface/datasets/_locks/` |
| Different model versions changing | SmolVLA repo updated | Pin: `lerobot/smolvla_base@v1.0` if available |
---
When this is done
You have a public GitHub repo, a one-page writeup with a published-looking plot, three seeded numbers, an eval video, and a fresh-clone-tested make reproduce. That's enough to:
- Drop into a portfolio or job application
- Talk through with a recruiter or interviewer ("I fine-tuned a 2.4B vision-language-action model (VLA) on a robotics benchmark, got 42pp improvement in 6 GPU-hours, here's the repo")
- Use as a launchpad for a real research project (an ablation across LoRA ranks, a different LIBERO suite, a custom dataset)
If after this you want to go deeper, the natural next stops are:
- Try a harder LIBERO suite: LIBERO-Long (long-horizon tasks). Same code, swap `libero_spatial` for `libero_long`, expect lower numbers.
- Try a different VLA: π0 has the same LeRobot integration. Swap `--policy.type=smolvla` for `--policy.type=pi0`. Compare numbers across both, write that up as a follow-up.
- Add a real ablation: Train at LoRA rank ∈ {8, 32, 128} and plot rank vs success_rate. This makes your repo a 1.5-week project instead of a 2-day project, but it's a real research-y comparison.
- Collect your own dataset: With an SO-101 arm or a teleoperation (teleop) sim setup, record 30 episodes of a custom task and fine-tune on those. This is the real Track A from the full curriculum.
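For the rank ablation, the adapter size at each rank can be estimated before launching anything. LoRA adds `rank × (d_in + d_out)` parameters per adapted layer, so with the same target modules the count scales linearly with rank. The sketch assumes every trainable parameter in the r=32 log line belongs to a LoRA adapter, and it keeps the 2:1 alpha/rank ratio used in this doc:

```python
BASE_RANK = 32
BASE_TRAINABLE = 31_457_280   # trainable count logged for r=32 earlier in this doc

def trainable_at_rank(rank: int) -> int:
    """Scale the r=32 adapter size linearly to another rank (assumption:
    all trainable parameters are LoRA parameters on the same modules)."""
    return BASE_TRAINABLE * rank // BASE_RANK

for rank in (8, 32, 128):
    alpha = 2 * rank  # keeps the 2:1 alpha/rank ratio
    print(f"rank={rank:<3} alpha={alpha:<3} trainable={trainable_at_rank(rank):>11,}")
```

These counts only bound the adapter's share of memory; activation memory depends on batch size and sequence length and does not shrink with rank.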
But if you stop here, you've shipped something genuine.
---
End of Track A Lite.