WorkBenchMark: Benchmarking Robotic Assembly with LEGO
A shared yardstick for robotic assembly, built on LEGO Duplo: 400 tasks across four difficulty tiers, an open-vocabulary Assembly-by-Disassembly baseline, and a finding that structured planning beats a modern vision-language-action model at every tier, with the gap widening as complexity grows.