Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching
Zhen Wu, Xiaoyu Huang, Lujie Yang, Yuanhang Zhang, Koushil Sreenath, Xi Chen, Pieter Abbeel, Rocky Duan, Angjoo Kanazawa, Carmelo Sferrazza, Guanya Shi, C. Karen Liu
THE PROBLEM
Before PHP, humanoid locomotion research had achieved stable walking on varied terrain, but parkour (dynamic, adaptive, human-like movement) remained out of reach. Prior work fell into two camps: (1) end-to-end reinforcement learning (RL) agents trained in simulation, which transfer poorly to real robots and struggle to compose multiple skills, and (2) hand-crafted motion controllers, which work for specific tasks but lack expressiveness and don't adapt on the fly. The core limitation: robots lacked both the motion expressiveness of humans and the perceptual awareness to decide in real time which skill to execute. A robot might nail climbing one obstacle, but couldn't decide from depth sensor input whether to climb the next one or step over it. Existing motion capture retargeting also ignored the long-horizon composition problem: you could animate one motion, but chaining motions smoothly while preserving human fluidity was unsolved.
HOW IT WORKS
Motion Matching: Compose Atomic Human Skills into Fluid Trajectories
RL Expert Policies: Train Robots to Actually Track Human Motions
Policy Distillation with DAgger: Collapse Multiple Experts into One Depth-Based Policy
Perception-Driven Decision-Making: Autonomous Skill Selection
KEY RESULTS
vs. 96% of the G1's 1.3m height; prior humanoid systems rarely exceeded 0.3-0.5m
This is the flashiest result: the robot climbs an obstacle almost as tall as it is. For context, most humanoid robots from prior work could step over 0.2-0.3m obstacles; at 1.25m, this is roughly four to six times that. Climbing 1.25m requires explosive leg power, precise balance at the peak, and a coordinated descent, all executed fluidly.
vs. Prior motion-matching or RL work typically handled single-skill execution or pre-planned sequences, not adaptive multi-skill chains
The paper demonstrates the robot running a course with multiple obstacles, autonomously selecting skills and adapting when obstacles are moved in real time. This is harder than climbing one wall: the policy must compose skills, handle state transitions, and recover from perception errors. Adapting in real time, rather than re-optimizing a pre-planned sequence, is evidence that the system generalizes beyond its training data.
vs. Prior methods typically specialized in 1-2 skills per approach
The framework isn't a one-trick solution. It demonstrates cat vaults, speed vaults, platform climbs, rolling down from heights, crawling under obstacles, and more. This variety comes from the motion-matching + RL + DAgger pipeline, which scales to multiple skills without a multiplicative increase in training complexity for the student policy.
vs. No multi-expert switching or expensive state estimation; lighter than running separate RL policies
The distillation to a single depth-based policy is pragmatic. The robot doesn't need ground-truth state or external tracking, just an onboard depth sensor and one neural network. This is deployable on real hardware without a lab full of cameras.
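The distillation idea can be sketched with a toy DAgger loop. This is an illustration under strong assumptions, not the paper's training code: the 1-D dynamics, the two experts and their gains, the `depth_bin` stand-in for a depth encoder, and the per-bin linear student are all hypothetical.

```python
# Toy DAgger loop. Dynamics, experts, and student are hypothetical stand-ins.

def make_experts():
    # One privileged expert per skill; each sees the exact tracking error x.
    return {
        "step_over": lambda x: -0.5 * x,
        "climb": lambda x: -0.9 * x,
    }


def skill_for(obstacle_height):
    # Privileged skill label, used only to pick the expert during training.
    return "climb" if obstacle_height > 0.5 else "step_over"


def depth_bin(obstacle_height):
    # Stand-in for a depth encoder: the student only sees a coarse height bin.
    return round(obstacle_height, 1)


class Student:
    """Single policy: one learned feedback gain per depth bin."""

    def __init__(self):
        self.gain = {}

    def act(self, obs_bin, x):
        return self.gain.get(obs_bin, 0.0) * x

    def fit(self, dataset):
        # Least-squares fit of action = k * x, separately for each depth bin.
        sums = {}
        for obs_bin, x, action in dataset:
            sxa, sxx = sums.get(obs_bin, (0.0, 0.0))
            sums[obs_bin] = (sxa + x * action, sxx + x * x)
        self.gain = {b: sxa / sxx for b, (sxa, sxx) in sums.items() if sxx > 0}


def dagger(heights, x0=1.0, horizon=5, iterations=3):
    experts = make_experts()
    student = Student()
    dataset = []  # aggregated across all iterations
    for it in range(iterations):
        for h in heights:
            expert = experts[skill_for(h)]
            x = x0
            for _ in range(horizon):
                label = expert(x)  # the expert labels every visited state
                dataset.append((depth_bin(h), x, label))
                # Iteration 0 follows the expert; later ones follow the student.
                action = label if it == 0 else student.act(depth_bin(h), x)
                x = x + action
        student.fit(dataset)  # retrain on everything collected so far
    return student


student = dagger(heights=[0.2, 0.3, 1.0, 1.2])
print(student.gain)
```

The point of the loop is that states are visited under the student's own behavior but labeled by the experts, so the single student ends up reproducing each expert's behavior from its limited observation alone.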
WHY DEVELOPERS SHOULD CARE
For software developers building robotics systems, PHP demonstrates three critical lessons. First, motion matching is an underrated tool for robot control. Instead of training everything from raw pixels with RL (which requires massive simulation and often fails in the real world), you can leverage human motion as a prior. Treat control as a search problem: find the best-matching human motion, then learn to execute it. This can cut training time and improve motion quality. Second, skill composition through modular RL experts that distill into a single student policy is a practical architecture. You don't need to train one monolithic policy; break the problem into pieces (one expert per skill), then compress that knowledge into a lightweight policy that runs on the robot. DAgger is the glue: it lets you transfer expert knowledge to a policy that runs on different, more limited sensors. Third, perception-driven behavior is non-negotiable for real-world robotics. The robot isn't executing a fixed plan; it perceives obstacles in real time and adapts its skill selection. This is what enables the closed-loop obstacle displacement demos: the robot isn't brittle to perturbations. If you're building robot software, think about how to combine (1) pre-trained motion priors, (2) learnable skill experts, and (3) lightweight perception policies that make real-time decisions. PHP shows this scales to complex, dynamic tasks like parkour.
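As a rough illustration of treating control as a search problem, here is a minimal motion-matching sketch. It is not the paper's implementation: the clip names, the two-number feature vector (forward speed, root height), the weights, and the toy data are all hypothetical. Real systems typically match on richer features such as the current pose and a desired future trajectory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    clip: str        # hypothetical clip name, e.g. "walk" or "vault"
    index: int       # frame index inside that clip
    features: tuple  # hypothetical features: (forward speed m/s, root height m)


def build_database(clips):
    """Flatten {clip_name: [feature_tuple, ...]} into a flat list of frames."""
    return [
        Frame(name, i, feats)
        for name, frames in clips.items()
        for i, feats in enumerate(frames)
    ]


def match(database, query, weights):
    """Return the frame with the smallest weighted squared distance to query."""
    def cost(frame):
        return sum(
            w * (f - q) ** 2
            for w, f, q in zip(weights, frame.features, query)
        )
    return min(database, key=cost)


# Toy motion library (made-up numbers, not data from the paper).
CLIPS = {
    "walk": [(1.0, 0.75), (1.0, 0.76)],
    "vault": [(2.5, 0.80), (2.5, 1.10), (2.0, 0.80)],
    "crawl": [(0.5, 0.35), (0.5, 0.35)],
}
DATABASE = build_database(CLIPS)

# Query: moving fast at standing height, so a vault entry frame fits best.
best = match(DATABASE, query=(2.4, 0.80), weights=(1.0, 1.0))
print(best.clip, best.index)  # prints: vault 0
```

Re-running `match` every few frames with an updated query is what lets short clips be chained into a longer trajectory; a learned tracking policy would then be responsible for executing the selected motion on the robot.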
LIMITATIONS
PHP's limitations are real and worth acknowledging. First, motion matching requires high-quality human motion data: the system only captures skills present in the training dataset. If humans don't parkour in a certain way, the robot won't either. Second, the distillation pipeline (expert RL → DAgger → student policy) is complex and requires careful dataset collection; it's not as simple as end-to-end training. Third, the system relies on onboard depth sensing, which has limited range and can struggle with reflective surfaces or fast-moving obstacles. Fourth, the discrete velocity command interface is limiting: the operator must still actively command the robot, so it is not fully autonomous in deciding when to parkour. Fifth, evaluation is limited to one robot (Unitree G1) and relatively controlled obstacle courses; generalization to wildly different morphologies or unstructured outdoor terrain is unproven. Finally, the paper doesn't deeply analyze failure modes: when does the policy fail to climb or vault? What are the geometric or kinematic boundaries of the approach?
WHAT COMES NEXT
The obvious next frontier is full autonomy: instead of a human sending velocity commands, the robot plans where it wants to go, perceives the obstacle course, and self-navigates. This requires adding high-level planning (graph search over obstacle configurations) on top of the perception-driven skill selection. A second direction is sim-to-real generalization: can you train motion matching and RL policies in simulation (where data is effectively unlimited) and transfer to new robot hardware without retraining? The distillation pipeline hints at this, but it's not fully demonstrated. Third is generalization across humanoid morphologies: does PHP work on Boston Dynamics Atlas, Tesla Optimus, or other humanoids with different proportions and actuators? If yes, it becomes a general framework; if no, it needs per-robot tuning. Fourth is handling longer obstacle courses with hundreds of obstacles and more unpredictable geometry. Fifth is integrating higher-level reasoning: not just skill selection, but obstacle avoidance planning, energy-efficient route selection, and semantic understanding of the environment. The dream: a humanoid robot that explores unknown terrain, perceives obstacles, plans a parkour route, and executes it autonomously, all in real time. PHP is a big step toward that.
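To make the high-level planning idea concrete, here is a speculative sketch of graph search over an obstacle course using Dijkstra's algorithm. Nothing here comes from the paper: the waypoint names, the skills attached to each edge, and the edge costs are hypothetical placeholders for whatever a real planner would estimate from perception.

```python
import heapq


def plan_route(graph, start, goal):
    """Dijkstra over {node: [(neighbor, skill, cost), ...]}.

    Returns (total_cost, [skills in order]) or None if the goal is unreachable.
    """
    frontier = [(0.0, start, [])]
    best = {start: 0.0}
    while frontier:
        cost, node, skills = heapq.heappop(frontier)
        if node == goal:
            return cost, skills
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, skill, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, neighbor, skills + [skill]))
    return None


# Hypothetical course: each edge is "use this skill to reach that waypoint".
COURSE = {
    "start": [("box", "run", 1.0), ("gap", "run", 2.0)],
    "box": [("ledge", "vault", 2.0), ("ledge", "climb", 3.5)],
    "gap": [("ledge", "crawl", 3.0)],
    "ledge": [("goal", "roll", 1.0)],
}

print(plan_route(COURSE, "start", "goal"))  # prints: (4.0, ['run', 'vault', 'roll'])
```

In a setup like this, the planner would only choose the skill sequence and route; the perception-driven policy would still handle execution and local adaptation.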