Study sandbox · last verified 2026-05-08
Physical AI in 2026 — research, projects, and a learning roadmap.
Ten cross-cutting research docs and eighteen hands-on projects, organized around the four-loop data flywheel: collect → curate → label → eval.
Research
All 10 docs →
Cross-cutting synthesis of the field: data is the bottleneck, the data engine is the product, the modular/E2E pendulum is dissolving.
Who collects what at what scale: Tesla, Waymo, Mobileye, Wayve, Waabi. Fleet logs vs customer-shadow vs sim vs world models.
The VLA recipe — RT-X, OpenVLA, π0/π0.5, Helix, Gemini Robotics. Open X-Embodiment, demo collection economics.
The competitor map. Applied Intuition Simian, Nvidia Cosmos + Isaac, CARLA, Waymax, the death of pure-rendering shops.
The most important doc for the role. Data-engine philosophy, FiftyOne/SAM2/OpenSCENARIO tools, foundation models as labelers.
Cosmos, GAIA, GR00T-Dreams, Wayve PRISM-1. World models as data engine and as eval substrate.
The 8-phase project arc
All 18 projects →
- Phase A: Data fluency (2 projects)
- Phase B: Labeling fundamentals (3 projects)
- Phase C: Production hygiene (1 project)
- Phase D: Simulation and world models (4 projects)
- Phase E: Robotics adjacency (2 projects)
- Phase F: Behavior, sim agents, closed-loop (3 projects)
- Phase G: Active learning + capstone (2 projects)
- Phase H: Strategy (1 project)
Compressed-time critical chain
If you must compress 18 weeks into 6, this is the minimum-viable order. One thing not to skip: project 18 — the strategy memo.
Four loops
Every project is tagged by which part of the data flywheel it touches.
- COLLECT: Fleet logs, customer-shadow, simulation, world-model generation. Weeks–months.
- CURATE: Triage, embedding mining, scenario taxonomy, dedup, slicing. Hours–days.
- LABEL: Auto-label, human verify, distill, pretrain, fine-tune. Days–weeks.
- EVAL: Open-loop, closed-loop sim, scenario coverage, safety case. Continuous.
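One way to picture the tagging scheme is a small Python sketch. It assumes the timescales listed above are each loop's typical cadence, and the `Project` record, the `projects_touching` helper, and the three demo entries are all hypothetical: the real project-to-loop tags are not listed on this page.

```python
from dataclasses import dataclass
from enum import Enum


class Loop(Enum):
    """The four flywheel loops; values are the timescales given on this page."""
    COLLECT = "weeks-months"
    CURATE = "hours-days"
    LABEL = "days-weeks"
    EVAL = "continuous"


@dataclass(frozen=True)
class Project:
    """Hypothetical record for one hands-on project."""
    number: int
    phase: str          # "A" through "H"
    loops: frozenset    # set of Loop members the project touches


def projects_touching(projects, loop):
    """Return the numbers of all projects tagged with the given loop."""
    return [p.number for p in projects if loop in p.loops]


# Illustrative tags only; these are NOT the page's actual assignments.
demo = [
    Project(1, "A", frozenset({Loop.COLLECT, Loop.CURATE})),
    Project(2, "A", frozenset({Loop.CURATE})),
    Project(3, "B", frozenset({Loop.LABEL})),
]

curate_projects = projects_touching(demo, Loop.CURATE)
```

Using a set of loops per project, rather than a single loop, lets one project count toward several parts of the flywheel.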