Day 32

Helix (Figure)

This is a valid v1.0 placeholder page for the later curriculum arc. Full interactive lab treatment ships after Week 1 dogfooding.

LECTURE & READING

Glossary primer (8 min)

  • Helix: Figure AI's whole-upper-body humanoid vision-language-action model (VLA, a model that takes images and language as input and outputs robot actions), announced in February 2025. Per Figure's announcement, it pairs a ~7B-parameter vision-language backbone with an ~80M-parameter motor policy. Optimized for high-frequency, real-time inference (upper-body action outputs at ~200 Hz).
  • Whole-upper-body: Controls 35 DoFs simultaneously: head, torso, both arms, both hands. Most VLAs control a single arm.
  • System 1 / System 2: Helix has a two-tier architecture: a slower vision-language module (System 2, about 8 Hz) and a fast motor controller (System 1, 200 Hz).
  • Closed-source: Helix weights and training code are proprietary; public knowledge comes from Figure's blog posts and demos.

Real-world analogy

Helix is the "specialized racecar" of VLAs: not the most general, but optimized for one thing — running a humanoid in real time.

Hour 1 — Reading

  • Figure's Helix announcement blog (~20 min): https://www.figure.ai/news/helix
  • Figure's follow-up "Helix Generalization" video / writeup (~20 min)
  • 2024 humanoid teleoperation papers for context: HumanPlus (Stanford), OmniH2O (CMU)

Hour 2 — Concepts only (closed-source)

Since you can't run Helix, study the architecture conceptually. Write docs/day32_helix_notes.md:

# Helix architectural notes

## System 1 (fast motor)
- 200 Hz output rate
- ~80M params
- Input: proprioception + System 2 latent (updated at 8 Hz)
- Output: joint targets for 35 DoFs

## System 2 (slow VLM)
- 8 Hz output rate
- ~7B params (Figure's announcement describes a 7B-parameter open-weight VLM backbone)
- Input: image + language instruction
- Output: latent vector (transferred to System 1)

## Why two systems?
Running the full VLM at 200 Hz is not feasible: each forward pass would have to finish
inside the 5 ms control period (1/200 s), and it does not, even on an H100.
Decoupling lets the slow brain plan while the fast brain reflexively executes.
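
The rate arithmetic behind this argument can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and the 60 ms forward-pass latency is a hypothetical number chosen only to show the comparison, not a measurement of Helix or any other model.

```python
def control_budget(fast_hz=200.0, slow_hz=8.0):
    """Return (fast period ms, slow period ms, fast ticks per slow update)."""
    fast_period_ms = 1000.0 / fast_hz
    slow_period_ms = 1000.0 / slow_hz
    ticks_per_update = fast_hz / slow_hz
    return fast_period_ms, slow_period_ms, ticks_per_update


def meets_deadline(forward_ms, rate_hz):
    """True if one forward pass fits inside one control period at rate_hz."""
    return forward_ms <= 1000.0 / rate_hz


fast_ms, slow_ms, ticks = control_budget()
print(f"System 1 budget: {fast_ms:.1f} ms per tick")        # 5.0 ms
print(f"System 2 budget: {slow_ms:.1f} ms per update")      # 125.0 ms
print(f"System 1 ticks per System 2 update: {ticks:.0f}")   # 25

# Hypothetical latency, for illustration only (not a measured figure).
example_forward_ms = 60.0
print(meets_deadline(example_forward_ms, 200.0))  # False
print(meets_deadline(example_forward_ms, 8.0))    # True
```

A model that misses the 5 ms budget by a wide margin can still fit comfortably inside the 125 ms budget, which is the whole case for the split.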

## What this looks like at deploy
- Operator says "fold the laundry on the table"
- System 2 fires: image + prompt → latent
- System 1 ticks at 200 Hz: latent + proprio → joint targets
- Whenever System 2 finishes (every 125 ms), it updates the latent
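
The sequence above can be traced with a deterministic, single-threaded toy loop. This sketch assumes System 2 finishes exactly every 25 fast ticks, which is an idealization (real inference time jitters), and `simulate` and `latent_version` are illustrative names rather than anything from Helix.

```python
def simulate(duration_s=0.5, fast_hz=200, slow_hz=8):
    """Step a toy two-rate loop and record which latent each fast tick used."""
    ticks_per_update = fast_hz // slow_hz   # 25 with the default rates
    n_ticks = int(duration_s * fast_hz)
    latent_version = 0                      # version 0 = initial latent
    log = []
    for tick in range(n_ticks):
        # System 2 delivers a new latent every `ticks_per_update` fast ticks.
        if tick > 0 and tick % ticks_per_update == 0:
            latent_version += 1
        # System 1 acts on the latest latent; staleness is the time since
        # that latent last changed.
        staleness_ms = (tick % ticks_per_update) * 1000.0 / fast_hz
        log.append((tick, latent_version, staleness_ms))
    return log


log = simulate()
print(len(log))                    # 100 fast ticks in 0.5 s
print(log[24])                     # (24, 0, 120.0)
print(log[25])                     # (25, 1, 0.0)
print(max(s for _, _, s in log))   # 120.0
```

In this idealized trace System 1 never acts on a latent older than 120 ms, so the fast loop has to be robust to a plan that lags the scene by that much.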

## Comparable architectures
- Brain stem (200 Hz reflex) + cortex (8 Hz planning)
- π0.5's hierarchical setup (high-level skill picker + low-level executor) is similar

LAB

Hour 3 — Lab: build a System 1 / System 2 simulator (60 min)

What you're building. A toy implementation of the System 1 / System 2 split using your π0 from Day 30 as System 2 and an MLP as System 1. This is a conceptual demo, not a real Helix replication.

Step 1 — Wrap π0 inference at 8 Hz (20 min)

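# src/day32_two_tier.py
import threading
import time

import torch


class TwoTierController:
    """Toy System 1 / System 2 split: a slow VLA loop feeds a fast motor loop."""

    def __init__(self, vla, motor, latent_dim=256):
        self.vla = vla      # System 2: π0 wrapper, assumed to expose encode()
        self.motor = motor  # System 1: MLP over concatenated [proprio, latent]
        self.latent = torch.zeros(latent_dim)
        self.proprio = None
        self.stop_event = threading.Event()  # set() to end both loops

    def vla_loop(self, image_stream, instruction, period=0.125):  # 8 Hz
        while not self.stop_event.is_set():
            start = time.monotonic()
            img = image_stream.get()
            with torch.no_grad():
                # Rebinding the attribute publishes the new latent in one step.
                self.latent = self.vla.encode(img, instruction)  # 256-d
            # Sleep only for the rest of the period so inference time does
            # not push the loop below its target rate.
            time.sleep(max(0.0, period - (time.monotonic() - start)))

    def motor_loop(self, proprio_stream, action_callback, period=0.005):  # 200 Hz
        while not self.stop_event.is_set():
            start = time.monotonic()
            p = proprio_stream.get()
            latent = self.latent  # read once so the tick uses a single latent
            with torch.no_grad():
                action = self.motor(torch.cat([p, latent]))
            action_callback(action)
            time.sleep(max(0.0, period - (time.monotonic() - start)))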
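
To exercise the two loops without a robot, one option is to run each in a daemon thread and feed them from `queue.Queue` objects. The sketch below is an assumption about how the harness could look, not the curriculum's committed driver: `run_two_tier` and `EchoController` are hypothetical names, and `EchoController` is a model-free stand-in that only mimics the loop signatures so the harness can be tried on its own.

```python
import queue
import threading
import time


def run_two_tier(controller, instruction, make_image, make_proprio, seconds=1.0):
    """Run both loops in daemon threads for `seconds`; return the actions seen."""
    image_q = queue.Queue(maxsize=1)
    proprio_q = queue.Queue(maxsize=1)
    actions = []

    threads = [
        threading.Thread(target=controller.vla_loop,
                         args=(image_q, instruction), daemon=True),
        threading.Thread(target=controller.motor_loop,
                         args=(proprio_q, actions.append), daemon=True),
    ]
    for t in threads:
        t.start()

    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        # Keep one fresh sample queued per stream; skip if the loop is busy.
        for q, make in ((image_q, make_image), (proprio_q, make_proprio)):
            try:
                q.put_nowait(make())
            except queue.Full:
                pass
        time.sleep(0.001)
    return list(actions)


class EchoController:
    """Hypothetical stand-in with the same loop signatures and no model."""

    def vla_loop(self, image_stream, instruction):
        while True:
            image_stream.get()
            time.sleep(0.125)  # 8 Hz

    def motor_loop(self, proprio_stream, action_callback):
        while True:
            action_callback(proprio_stream.get())
            time.sleep(0.005)  # 200 Hz


actions = run_two_tier(EchoController(), "fold the laundry",
                       make_image=lambda: "img", make_proprio=lambda: 0.0,
                       seconds=0.3)
print(len(actions) > 0)
```

The daemon flag means the threads end with the process, and the size-1 queues keep each loop working on recent samples instead of a growing backlog.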

(The MLP would be trained to map (proprio, latent) → action via supervised data; we skip the actual training and just verify the pattern works.)
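
For the shape check, an untrained MLP is enough. The following is a minimal sketch under stated assumptions: `PROPRIO_DIM = 70` is a hypothetical choice (for example, position plus velocity for 35 joints), the hidden width of 512 is arbitrary, and `make_motor_mlp` is an illustrative name. Only the 256-d latent and the 35-DoF output come from the text above.

```python
import torch
from torch import nn

PROPRIO_DIM = 70   # hypothetical: e.g. position + velocity for 35 joints
LATENT_DIM = 256   # matches the 256-d latent used by the controller
ACTION_DIM = 35    # one output per upper-body DoF


def make_motor_mlp(hidden=512):
    """Small MLP mapping concatenated [proprio, latent] to an action vector."""
    return nn.Sequential(
        nn.Linear(PROPRIO_DIM + LATENT_DIM, hidden),
        nn.ReLU(),
        nn.Linear(hidden, hidden),
        nn.ReLU(),
        nn.Linear(hidden, ACTION_DIM),
    )


motor = make_motor_mlp().eval()
proprio = torch.zeros(PROPRIO_DIM)
latent = torch.zeros(LATENT_DIM)
with torch.no_grad():
    # Same call pattern as the motor loop: one 1-D vector in, one action out.
    action = motor(torch.cat([proprio, latent]))
print(action.shape)  # torch.Size([35])
```

Because the weights are random, the outputs carry no meaning; the point is only that the concatenated input and the 35-d output line up with what the motor loop expects.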

Full source continues in the committed curriculum files. The v1.0 page exposes the day flow and lab surface without inventing content.
