Day 1

Rigid transforms, frames, and your first MuJoCo sim

LECTURE & READING

Glossary primer (10 min)

  • Pose — A 6-DoF position + orientation. A point in SE(3) (Special Euclidean group in 3D).
  • Rotation matrix R — A 3×3 matrix where R @ R.T = I and det(R) = +1. Rotates vectors without scaling.
  • Homogeneous transform T — A 4×4 matrix that stacks rotation and translation: [[R, t], [0, 1]]. Lets you compose frames by matrix multiply.
  • Frame — A coordinate system attached to a body (world frame, base frame, end-effector frame).
  • Degree of freedom (DoF) — An independent axis of motion. A 7-DoF arm has 7 joints.
  • Configuration q — Joint-angle vector. For a 7-DoF arm: q ∈ ℝ⁷.
  • MJCF — MuJoCo's XML model format (alternative to URDF).
  • End effector (EE) — The last link of the kinematic chain, usually the gripper or tool tip.
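
The rotation-matrix and homogeneous-transform entries above can be checked numerically. The following is a minimal sketch, assuming NumPy is installed; the helper names rot_z and make_T and the example numbers are illustrative choices, not part of any library.

```python
import numpy as np

def rot_z(theta: float) -> np.ndarray:
    """Rotation by theta radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def make_T(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Stack R (3x3) and t (3,) into the 4x4 block matrix [[R, t], [0, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

R = rot_z(np.pi / 2)                      # quarter turn about z
T = make_T(R, np.array([1.0, 0.0, 0.0]))  # same rotation plus a 1 m shift along x

# The two defining properties of a rotation matrix
orthogonal = bool(np.allclose(R @ R.T, np.eye(3)))
det_is_one = bool(np.isclose(np.linalg.det(R), 1.0))

# Apply the pose to a point by appending a 1 (homogeneous coordinates):
# (1, 0, 0) rotates to (0, 1, 0), then the translation adds (1, 0, 0).
p = np.array([1.0, 0.0, 0.0, 1.0])
p_moved = T @ p

print(orthogonal, det_is_one, np.round(p_moved[:3], 6))
```

Floating-point rotations rarely satisfy R @ R.T = I exactly, which is why the sketch compares with np.allclose rather than ==.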

Real-world analogy

A rotation matrix is like turning a camera on a tripod. A homogeneous transform is "rotate, then walk three steps." Composing frames is like playing telephone: world → base → shoulder → elbow → wrist → gripper. Each link contributes one 4×4 to the product.
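
The chain in the analogy can be sketched as a product of 4×4 matrices. This is a minimal illustration, assuming NumPy; the three-link chain, its dimensions, and the helper names translate and rotate_z are hypothetical and do not describe the Panda.

```python
from functools import reduce

import numpy as np

def translate(x: float, y: float, z: float) -> np.ndarray:
    """Pure translation as a 4x4 homogeneous transform."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def rotate_z(theta: float) -> np.ndarray:
    """Pure rotation about z as a 4x4 homogeneous transform."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

# Hypothetical chain; each entry is a child frame expressed in its parent frame
chain = [
    translate(0.0, 0.0, 0.5),  # world -> base: base sits 0.5 m up
    rotate_z(np.pi / 2),       # base -> shoulder: quarter turn about z
    translate(0.3, 0.0, 0.0),  # shoulder -> gripper: 0.3 m along the shoulder's x
]

# Left-to-right product: T_world_gripper = T_world_base @ T_base_shoulder @ T_shoulder_gripper
T_world_gripper = reduce(np.matmul, chain)

# Swap the last two links: walk first, then turn. The gripper ends up elsewhere.
T_swapped = reduce(np.matmul, [chain[0], chain[2], chain[1]])

print(np.round(T_world_gripper[:3, 3], 6))
print(np.round(T_swapped[:3, 3], 6))
```

Turning first sends the 0.3 m step along world y; walking first sends it along world x, so the two gripper positions differ even though the same three matrices are used.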

Hour 1 — Robot Academy primer (visual intuition first)

Watch the embedded Robot Academy lessons below (~25 min total). The animated 3D demos are the reason this resource is assigned:

  • 2D Geometry: "Describing rotation in 2D" and "Describing rotation and translation in 2D".
  • 3D Geometry: "Describing rotation in 3D", "Quaternions representation of rotation in 3D", and "Describing rotation and translation in 3D".

Curated resources

Robot Academy primer

Visual intuition for rotations, translations, quaternions, and composed 3D poses.

  • Robot Academy: "Describing rotation and translation in 2D" (Translation)
  • Robot Academy: "Quaternions representation of rotation in 3D" (Quaternions)
  • Robot Academy: "Describing rotation and translation in 3D" (Composing 3D poses)

Then watch the embedded Modern Robotics Ch. 3.1–3.3 sequence below.

Curated resources

Modern Robotics Chapter 3.1-3.3

The assigned Chapter 3 sequence isolated from the broad playlist.

  • Northwestern Robotics: "3.1: Introduction to Rigid-Body Motions" (Rigid-body motions)
  • Northwestern Robotics: "3.2.1: Rotation Matrices (Part 1 of 2)" (SO(3))
  • Northwestern Robotics: "3.2.1: Rotation Matrices (Part 2 of 2)" (Frame changes)
  • Northwestern Robotics: "3.2.2: Angular Velocities" (Angular velocity)
  • Northwestern Robotics: "3.2.3: Exponential Coordinates of Rotation (Part 1 of 2)" (Exponential coordinates)
  • Northwestern Robotics: "3.2.3: Exponential Coordinates of Rotation (Part 2 of 2)" (Matrix exp/log)
  • Northwestern Robotics: "3.3.1: Homogeneous Transformation Matrices" (SE(3))
  • Northwestern Robotics: "3.3.2: Twists (Part 1 of 2)" (Twists)
  • Northwestern Robotics: "3.3.2: Twists (Part 2 of 2)" (Adjoint representation)
  • Northwestern Robotics: "3.3.3: Exponential Coordinates of Rigid-Body Motion" (Rigid-body exponential coordinates)

Hour 2 — Reading + notes

  • Lynch & Park Modern Robotics Ch. 3 §3.1–3.3, ~30 min: use the direct PDF linked below.

Curated resources

Modern Robotics reading

The assigned textbook reading for rigid-body motions.

  • Lynch and Park: Modern Robotics PDF, Chapter 3 sections 3.1-3.3 (direct PDF, not the landing page)

  • Stanford CS223A Lec 2 "Spatial descriptions" (Khatib) — watch the embedded lecture below and focus on the first 35 min.

Curated resources

Stanford CS223A Lecture 2

Khatib's spatial descriptions lecture isolated from the CS223A playlist.

  • Stanford: "Lecture 2: Introduction to Robotics" (Spatial descriptions, first 35 minutes)

Open w1-foundations/docs/day1.md and write — by hand or typed — your own one-paragraph definition of each glossary term, with a small drawing for "rotation matrix", "translation", and "frame composition". This step is non-optional for retention.

LAB

Hour 3 — Lab: MuJoCo hello world (60 min)

What you're building. Your first MuJoCo simulation: load the Franka Emika Panda 7-DoF arm, render it interactively, drag joints with your mouse, and write a short script that reads the end-effector pose from MuJoCo's internal forward kinematics, composes it with a second transform, and checks the result numerically.

What success looks like at the end. You have:

  1. An interactive MuJoCo viewer window showing a Franka Panda arm on a table. You can grab any joint slider and the arm responds.
  2. A Python script w1-foundations/src/day1_transforms.py that prints MuJoCo's 4×4 end-effector transform, confirms T @ inv(T) = I to within 1e-10, and shows that the two composition orders give different translations.
  3. A screenshot saved to w1-foundations/figures/day1_panda_viewer.png.

Step 1 — Clone the MuJoCo Menagerie (5 min)

The Menagerie is Google DeepMind's curated zoo of high-quality robot models. Every robot in this curriculum comes from there.

cd ~/robo47
git clone https://github.com/google-deepmind/mujoco_menagerie
cd mujoco_menagerie && git rev-parse --short HEAD

Expected output: A 7-character SHA like a3f9c1d. Note this — you'll log it in metrics.csv on later days when this repo's contents matter.

ls franka_emika_panda

Expected output:

LICENSE  README.md  assets  mjx_panda.xml  mjx_scene.xml  panda.png  panda.xml  scene.xml

Step 2 — Launch the interactive viewer (5 min)

cd ~/robo47
source .venv/bin/activate   # ensure mujoco is available
python -m mujoco.viewer --mjcf=mujoco_menagerie/franka_emika_panda/scene.xml

What you should see:

  • A native window opens, ~1200×800 px.
  • A 7-DoF Franka arm sits on a flat plane, gripper pointing forward.
  • A control panel on the right with sliders for each joint (joint1 through joint7, plus finger_joint1, finger_joint2).
  • A simulation timer at the top showing time advancing.

Drag a joint slider: the arm should respond instantly. If you push joint 4 to its limit, you'll see the arm fold in on itself.

If you don't see the window:

  • Black window: export MUJOCO_GL=glfw and re-run.
  • No display on Nebius: you're headless. Either SSH with X-forwarding (ssh -X) and re-run, or skip the interactive viewer here and do everything in the next step's notebook with offscreen rendering.
  • libGL.so.1 not found: sudo apt install -y libgl1 libegl1 libglfw3 libosmesa6.
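
The backend choice in the troubleshooting list can also be made from Python. This is a sketch under assumptions: it targets Linux, the helper name pick_mujoco_gl is hypothetical, and checking DISPLAY or WAYLAND_DISPLAY is only a heuristic for "a window can open". The values glfw, egl, and osmesa are the MUJOCO_GL settings that match the packages installed above.

```python
import os
import sys

def pick_mujoco_gl(env: dict) -> str:
    """Choose a MUJOCO_GL value from an environment mapping.

    An explicit MUJOCO_GL wins. Otherwise use glfw when a display
    server is advertised, and egl (offscreen) when running headless.
    """
    explicit = env.get("MUJOCO_GL")
    if explicit:
        return explicit
    has_display = bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    return "glfw" if has_display else "egl"

# Set the variable before `import mujoco` so the bindings see it when they load.
# Assumption: Linux only; on other platforms leave MUJOCO_GL untouched.
if sys.platform.startswith("linux"):
    os.environ["MUJOCO_GL"] = pick_mujoco_gl(dict(os.environ))
    print("MUJOCO_GL =", os.environ["MUJOCO_GL"])
```

The interactive viewer needs a window, so the egl branch only helps with offscreen rendering. If EGL is unavailable on the machine, exporting MUJOCO_GL=osmesa by hand is the software fallback.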

Take a screenshot (any tool — macOS Cmd-Shift-4, Linux gnome-screenshot) and save to w1-foundations/figures/day1_panda_viewer.png. Close the window when done.

Step 3 — Write the transform-composition script (35 min)

Create w1-foundations/src/day1_transforms.py:

"""Day 1: Verify that manually composed homogeneous transforms agree
with MuJoCo's internal forward-kinematics output.
"""
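from pathlib import Path

import numpy as np
import mujoco

# --- Helper: build a homogeneous transform from rotation matrix and translation
def make_T(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Stack a 3x3 rotation R and a 3-vector t into the 4x4 [[R, t], [0, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def Rz(theta: float) -> np.ndarray:
    """Rotation by theta radians about the Z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0],
                     [s,  c, 0],
                     [0,  0, 1]])

# --- Load the Panda model
# Resolve the path from this file's location (~/robo47/w1-foundations/src) so the
# script runs from any working directory; the Menagerie clone lives in ~/robo47.
REPO_ROOT = Path(__file__).resolve().parents[2]
MODEL_PATH = REPO_ROOT / "mujoco_menagerie" / "franka_emika_panda" / "scene.xml"
model = mujoco.MjModel.from_xml_path(str(MODEL_PATH))
data = mujoco.MjData(model)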

# Set the home configuration (a known stable pose)
home_q = np.array([0, -0.785, 0, -2.356, 0, 1.571, 0.785])
data.qpos[:7] = home_q
mujoco.mj_forward(model, data)

# --- Get MuJoCo's world-to-end-effector transform at the home config
# 'attachment_site' is assumed to be the EE site name; mj_name2id returns -1
# when the name is missing, so fail loudly instead of silently indexing site -1.
ee_site_id = mujoco.mj_name2id(model, mujoco.mjtObj.mjOBJ_SITE, "attachment_site")
if ee_site_id < 0:
    raise ValueError("Site 'attachment_site' not found; check the site names in the model XML")
T_world_ee_mujoco = make_T(data.site_xmat[ee_site_id].reshape(3, 3),
                           data.site_xpos[ee_site_id])

print("MuJoCo's T_world_ee at home:")
print(np.round(T_world_ee_mujoco, 4))

# --- Now compose: left-multiplying by T_extra rotates the EE pose by 45 deg
# about world Z, then translates it +0.1 m along world X.
T_extra = make_T(Rz(np.pi / 4), np.array([0.1, 0.0, 0.0]))
T_world_ee_modified_manual = T_extra @ T_world_ee_mujoco

# MuJoCo cannot apply T_extra for us without editing the XML, so instead
# sanity-check the pose it returned: T @ inv(T) must equal the identity.
T_inv = np.linalg.inv(T_world_ee_mujoco)
identity_check = T_world_ee_mujoco @ T_inv
err = np.max(np.abs(identity_check - np.eye(4)))
assert err < 1e-10, f"Transform inversion failed: max err {err}"
print(f"\nT @ inv(T) error: {err:.2e}  (PASS if < 1e-10)")

# Sanity check: composition order matters
T_extra_then_world = T_world_ee_mujoco @ T_extra
print("\nT_world_ee @ T_extra (translate-in-EE-frame):")
print(np.round(T_extra_then_world[:3, 3], 4))

print("\nT_extra @ T_world_ee (translate-in-world-frame):")
print(np.round(T_world_ee_modified_manual[:3, 3], 4))

# These two should be DIFFERENT — that's the whole point of frame composition
diff = np.linalg.norm(T_extra_then_world[:3, 3] - T_world_ee_modified_manual[:3, 3])
print(f"\nTranslation difference between frame-orders: {diff:.4f} m  (should be > 0)")
assert diff > 0.05, "Frame ordering should produce different results"

print("\nAll checks PASSED.")

Run it:

cd ~/robo47/w1-foundations
python src/day1_transforms.py

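Expected output (the pose matrix depends on the Menagerie revision and on the configuration that was loaded, so treat it as illustrative; the lines after it follow from that matrix by the arithmetic in the script):

MuJoCo's T_world_ee at home:
[[ 1.      0.      0.      0.0879]
 [ 0.     -1.      0.      0.    ]
 [ 0.      0.     -1.      0.926 ]
 [ 0.      0.      0.      1.    ]]

T @ inv(T) error: 2.78e-17  (PASS if < 1e-10)

T_world_ee @ T_extra (translate-in-EE-frame):
[0.1879 0.     0.926 ]

T_extra @ T_world_ee (translate-in-world-frame):
[0.1622 0.0622 0.926 ]

Translation difference between frame-orders: 0.0673 m  (should be > 0)

All checks PASSED.

The z entry of the last column (0.926 m here) is the height of the EE site above the world origin; if your run prints a different pose, trust your run. The two translations differ because right-multiplying applies T_extra in the EE frame, while left-multiplying applies it in the world frame and therefore also swings the EE position 45° about the world Z axis. With the matrix above the gap is 0.0673 m, which is what "frame ordering matters" feels like in numbers.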
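
To isolate the effect of multiplication order, it helps to drop the rotation from the extra transform and use a pure 0.1 m step along x. The sketch below assumes NumPy only; the EE pose is a made-up example (EE x axis pointing along world y), not a value read from the Panda model.

```python
import numpy as np

def make_T(R: np.ndarray, t) -> np.ndarray:
    """Stack R (3x3) and t (3,) into a 4x4 homogeneous transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical EE pose: rotated 90 deg about world Z, located at (0.3, 0, 0.5)
R_world_ee = np.array([[0.0, -1.0, 0.0],
                       [1.0,  0.0, 0.0],
                       [0.0,  0.0, 1.0]])
T_world_ee = make_T(R_world_ee, [0.3, 0.0, 0.5])

# A pure 0.1 m step along x, with no rotation
T_step = make_T(np.eye(3), [0.1, 0.0, 0.0])

# Right-multiply: the step is taken along the EE's own x axis (world y here)
step_in_ee = (T_world_ee @ T_step)[:3, 3]

# Left-multiply: the step is taken along the world x axis
step_in_world = (T_step @ T_world_ee)[:3, 3]

gap = float(np.linalg.norm(step_in_ee - step_in_world))
print(np.round(step_in_ee, 4), np.round(step_in_world, 4), round(gap, 4))
```

The two results sit 0.1 m apart along each of two perpendicular axes, so the gap is 0.1·√2 ≈ 0.1414 m. The same matrices in a different order produce a different pose.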

Step 4 — Commit and log (10 min)

cd ~/robo47/w1-foundations
git add src/day1_transforms.py figures/day1_panda_viewer.png docs/day1.md
git commit -m "Day 1: rigid transforms, MuJoCo hello world"
git log --oneline -1

Append a metrics row. Day 1 has no model number to log, so log the verification instead:

echo "day1_transforms,$(date -I),1,fk_identity_check,$(git rev-parse --short HEAD),0,panda_home,manual,1,0,inv_err,2.78e-17,first_run" >> ~/robo47/metrics/metrics.csv
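
The same row can be appended from Python, which lets the script log the error it actually measured instead of a pasted constant. This is a sketch, not the curriculum's tooling: the function names metrics_row and append_row are hypothetical, and the 13 fields simply mirror the echo command above in the same order.

```python
import csv
import datetime
import tempfile
from pathlib import Path

def metrics_row(sha: str, inv_err: float, day: datetime.date) -> list:
    """Build the same 13 fields, in the same order, as the echo command."""
    return ["day1_transforms", day.isoformat(), "1", "fk_identity_check", sha,
            "0", "panda_home", "manual", "1", "0",
            "inv_err", f"{inv_err:.2e}", "first_run"]

def append_row(path: Path, row: list) -> None:
    """Append one CSV row, creating the parent directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", newline="") as f:
        csv.writer(f).writerow(row)

# Demo against a throwaway file; for real use, pass the metrics.csv path,
# the output of `git rev-parse --short HEAD`, and the err value from the script.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "metrics.csv"
    row = metrics_row("a3f9c1d", 2.78e-17, datetime.date.today())
    append_row(csv_path, row)
    written = csv_path.read_text()

print(written, end="")
```

date.isoformat() yields the same YYYY-MM-DD string as date -I.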

Deliverable checklist (paste into `w1-foundations/docs/day1.md`)

Why this lab matters

Nearly every lab in Weeks 1–7 will, at some point, multiply two homogeneous transforms or check that R.T @ R equals I. If you can't do this in your sleep by Day 7, you will fight the next 40 days. Today is the only day where 4×4 matrices feel like the main event; from Day 2 onward they are the assumed substrate.
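
Two small helpers cover the checks named above. This is a minimal sketch, assuming NumPy; the names is_rotation and inv_T and the tolerance are illustrative choices. The inverse uses the closed form inv([[R, t], [0, 1]]) = [[R.T, -R.T @ t], [0, 1]], which holds for any rigid transform and avoids a general matrix inversion.

```python
import numpy as np

def is_rotation(R: np.ndarray, tol: float = 1e-9) -> bool:
    """True if R is 3x3, orthogonal, and has determinant +1 (within tol)."""
    return bool(R.shape == (3, 3)
                and np.allclose(R.T @ R, np.eye(3), atol=tol)
                and np.isclose(np.linalg.det(R), 1.0, atol=tol))

def inv_T(T: np.ndarray) -> np.ndarray:
    """Closed-form inverse of a homogeneous transform [[R, t], [0, 1]]."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

# Example pose: 0.3 rad about z, translated by (1, 2, 3)
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = [1.0, 2.0, 3.0]

mirror = np.diag([1.0, 1.0, -1.0])  # orthogonal, but det = -1: a reflection

print(is_rotation(R), is_rotation(2.0 * R), is_rotation(mirror))
print(np.allclose(inv_T(T) @ T, np.eye(4)))
```

The determinant test matters: a reflection passes the orthogonality check but is not a rotation.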

Side quest (optional, 30 min)

Open mujoco_menagerie/franka_emika_panda/panda.xml in your editor. Find the <body name="link4"> element. Read its pos and quat attributes: that's how MuJoCo encodes one link's pose relative to its parent. Compute by hand the homogeneous transform that the XML represents. Because that transform is relative to the parent body while data.xpos is expressed in the world frame, left-multiply it by the parent's world transform (built from the parent's data.xmat and data.xpos after mj_forward), then compare the translation column against data.xpos[mujoco.mj_name2id(model, mujoco.mjtObj.mjOBJ_BODY, "link4")]. This holds when link4's joint anchor sits at the body origin. They should agree to 6 decimals.
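
The hand computation can be sketched in NumPy. MuJoCo writes quaternions as (w, x, y, z). The pos and quat values below are placeholders rather than the numbers in panda.xml, the parent pose is hypothetical, and the helper names are illustrative.

```python
import numpy as np

def quat_wxyz_to_R(quat) -> np.ndarray:
    """Rotation matrix from a (w, x, y, z) quaternion, normalized first."""
    q = np.asarray(quat, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def body_T(pos, quat) -> np.ndarray:
    """4x4 transform of a body relative to its parent, from MJCF pos and quat."""
    T = np.eye(4)
    T[:3, :3] = quat_wxyz_to_R(quat)
    T[:3, 3] = pos
    return T

# Placeholder attributes; substitute the pos and quat you read from the XML
T_parent_child = body_T(pos=[0.1, 0.0, 0.2], quat=[1.0, 1.0, 0.0, 0.0])

# Hypothetical parent pose in the world: 0.5 m up, no rotation.
# In the lab, build this from the parent's data.xmat and data.xpos instead.
T_world_parent = np.eye(4)
T_world_parent[:3, 3] = [0.0, 0.0, 0.5]

T_world_child = T_world_parent @ T_parent_child
print(np.round(T_world_child, 4))
```

The quaternion (1, 1, 0, 0) is not unit length; after normalization it is a 90° rotation about x, which is why the helper normalizes before building the matrix. If mujoco is installed, mujoco.mju_quat2Mat offers a cross-check for the rotation part.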
