Humanoid Robot Demo vs Reality Tracker (2026)

This tracker follows 59 humanoid robot demos from sixteen leading programs, from November 2017 to April 2026. Each row scores what the demo claimed, how strong the claim was, whether later evidence arrived, and how close the robot got to real deployment. Rows run from most recent to oldest.
All three scores (DCQ, ESS, DRS) are reported on a 0 to 10 scale, explained in the methodology below. For the underlying market context, see our humanoid robotics market report.
| Demo date | Company | Claim shown | Capability | Environment shown | Demo Claim Quality (DCQ) (out of 10) | Evidence found | What was actually proven | Evidence date | Claim-to-evidence lag | Evidence type | Evidence Strength Score (ESS) (out of 10) | Deployment Reality Score (DRS) (out of 10) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mar 2026 | Figure | Figure 03 (Helix 02) cleans a full living room, sprays and wipes surfaces, scoops toys, and presses the TV remote | Task sequencing | Lab | 10 | Partial | Helix 02 extended from kitchen to living room, but still inside Figure's controlled environment. No customer home, no reproduction, no uptime numbers disclosed. | Mar 2026 | ~0 months | Vendor update with new material | 5 | 3 |
| Jan 2026 | Boston Dynamics | Product Atlas (electric) is revealed at CES 2026 with live stage walking, waving, and interaction on the Hyundai stage | Deployment / integration | Live stage demo | 7 | Partial | Product and customer commitment validated live, with Boston Dynamics acknowledging the stage demo was remote-piloted. Full Hyundai scale-out is a 2028 target, not 2026. | Jan 2026 | ~0 months | Third-party journalism | 8 | 5 |
| Jan 2026 | Figure | Figure 03 (Helix 02) autonomously unloads and reloads a dishwasher in a continuous 4-minute run across 61 loco-manipulation actions | Bimanual manipulation | Staged kitchen | 10 | Partial | Helix 02 was extended to further rooms, but the dishwasher task has no customer-home reproduction. Trained on motion capture and simulation, not real home data. | Mar 2026 | ~2 months | Vendor update with new material | 5 | 3 |
| Jan 2026 | AgiBot | AgiBot makes its U.S. debut at CES 2026 with live demos across its A2, X2, G2, and D1 platforms | Deployment / integration | Live stage demo | 7 | Partial | Live navigation and interaction confirmed by press, but show-floor demos carry no throughput or customer-site metrics. | Jan 2026 | ~0 months | Third-party journalism | 5 | 5 |
| Jan 2026 | XPeng | An XPeng Iron attempting a catwalk walk at Shenzhen MixC mall trips and falls, requiring three staff to carry it away | Locomotion | Uncontrolled public environment | 10 | Yes | The November 2025 catwalk robustness claim collapsed in unscripted public conditions two months later. CEO addressed it on Weibo; next-day run had Iron strapped to a support frame. | Jan 2026 | ~0 months | Independent video footage in uncontrolled conditions | 10 | 0 |
| Nov 2025 | Agility | Agility announces that Digit has moved more than 100,000 totes at the GXO Flowery Branch facility | Deployment / endurance | Live production line | 10 | Yes | Endurance and throughput validated at 100,000 totes, with expansion to Mercado Libre Texas weeks later. One customer, one workflow, but the only six-figure humanoid throughput publicly disclosed. | Nov 2025 | ~0 months | Operational metric disclosure | 10 | 10 |
| Nov 2025 | UBTech | UBTech announces Walker S2 mass production with hundreds of units shipping and CN¥800M in disclosed orders | Deployment | Multiple factories | 7 | Partial | Mass delivery into BYD, Geely, FAW-Volkswagen, Audi FAW, BAIC, Foxconn, and SF Express was confirmed, but only by UBTech itself. No plant has disclosed shift-level throughput. | Nov 2025 | ~0 months | Operational metric disclosure | 8 | 8 |
| Nov 2025 | XPeng | XPeng Iron walks with an uncanny human gait at XPeng AI Day 2025, with 82 DoF and a solid-state battery pack | Locomotion / design | Live stage demo | 7 | Partial | The leg teardown confirmed a real robot, but the catwalk gait failed in a public mall two months later. Cutting a leg open to prove authenticity is itself a trust signal. | Jan 2026 | ~2 months | Independent video footage in uncontrolled conditions | 5 | 3 |
| Nov 2025 | AgiBot | AgiBot A2 completes a 106.286 km walk from Suzhou to Shanghai, adjudicated as a Guinness World Record | Endurance / locomotion | Uncontrolled public environment | 10 | Yes | Walk validated by Guinness adjudicator and independent U.S. press on a public route with hot-swap batteries. The only fully unambiguous endurance validation in the dataset. | Nov 2025 | ~0 months | Independent video footage + regulatory/certifying filing | 10 | 8 |
| Oct 2025 | 1X | 1X opens $20,000 NEO Home Robot pre-orders with scheduled chores and explicit human assistance for unknown tasks | Deployment | Home (staged + forthcoming) | 3 | Yes | WSJ hands-on confirmed every task required a VR-headset operator, with timings disclosed per chore. 1X is unusually transparent at launch, strengthening disclosure even as autonomy stays weak. | Oct 2025 | ~0 months | Third-party journalism | 10 | 3 |
| Sep 2025 | Boston Dynamics | Atlas (electric) begins a first field test at Hyundai Savannah, autonomously identifying and sorting roof-rack parts | Manipulation / machine tending | Real customer site | 10 | Yes | CBS 60 Minutes and NBC validated on-site autonomous parts-sorting at the Hyundai plant. The customer is the parent company, so independence is partial; scale-out is a 2028 target. | Jan 2026 | ~4 months | Third-party journalism | 8 | 5 |
| Aug 2025 | Unitree | Unitree H1 sprints past a speed-measurement device on an athletics track at a measured 10.1 m/s | Locomotion | Uncontrolled public environment | 10 | Yes | Speed record validated with external measurement on a public track, extending the prior Guinness result. No productive task or customer workflow attached. | Aug 2025 | ~0 months | Independent video footage in uncontrolled conditions | 8 | 5 |
| Jul 2025 | UBTech | Walker S2 autonomously walks to a station, extracts its depleted battery, and installs a fresh module in about 3 minutes | Endurance / runtime | Staged warehouse zone | 10 | Yes | Hot-swap moved from demo to shipped feature on hundreds of S2 units. The capability commoditised fast: AgiBot A2 used the same mechanism on its Guinness walk. | Nov 2025 | ~4 months | Operational metric disclosure | 8 | 8 |
| Apr 2025 | Agility | Digit performs palletizing and AMR load/unload demonstrations on the ProMat trade-show floor | Manipulation | Trade-show staging | 10 | Yes | The trade-show task set translated nearly one-to-one into the live GXO operation that crossed 100,000 totes. A rare trade-show-to-deployment match in the dataset. | Nov 2025 | ~7 months | Operational metric disclosure | 10 | 10 |
| Mar 2025 | Boston Dynamics | Atlas (electric) performs part sequencing: picks, carries, and places auto parts into dark cubbies at varying heights | Manipulation | Lab | 10 | Yes | The lab capability was reproduced in the Hyundai Savannah pilot within six months, on a closely related task. A clean lab-to-pilot bridge, though still inside Hyundai ownership. | Sep 2025 | ~6 months | Named customer deployment | 8 | 5 |
| Mar 2025 | Unitree | Unitree G1 performs a standing side-flip, kick-up from face-up, combat sweep-kick, and recovers from a rear kick while walking | Locomotion | Lab | 10 | Partial | Capability family was reproduced across the ecosystem, including on EngineAI PM01, without a customer workflow. G1 remains the lowest-priced research platform in the global humanoid market. | Feb 2026 | ~11 months | Independent video footage in uncontrolled conditions | 5 | 5 |
| Mar 2025 | AgiBot | AgiBot announces the GO-1 foundation model running on the Yuanzheng A2 platform for general manipulation | Autonomy / planning | Lab | 7 | Partial | Over 1,000 A2 units entered commercial service, and the platform backed the 106 km Guinness walk. Model-level claim validated by shipping, not by independent benchmark. | Nov 2025 | ~8 months | Named customer deployment | 8 | 8 |
| Feb 2025 | Figure | Two Figure 02 robots coordinate to put away novel grocery items, each driven by one shared Helix neural network | Bimanual coordination | Lab | 10 | Partial | The Helix architecture continued into Helix 02, but the grocery-sort task was never reproduced in an external home. Novel-object generalization was asserted, never independently benchmarked. | Feb 2026 | ~12 months | Vendor update with new material | 5 | 3 |
| Feb 2025 | 1X | 1X NEO Gamma launches as a sleeker redesign of NEO Beta with updated soft-clad exterior and voice interface | Design / locomotion | Staged home | 3 | Yes | Platform continuity validated at the NEO Home Robot preorder, though WSJ later showed it is teleop-first. Form-factor update only, no new capability claim. | Oct 2025 | ~8 months | Third-party journalism | 10 | 0 |
| Feb 2025 | EngineAI | EngineAI PM01 performs a standing front-flip and walks around the Shenzhen Tourist Information Center | Locomotion | Uncontrolled public environment | 10 | Yes | Reproduced and extended into kick-recovery and rise-from-prone routines on the same platform. Public-environment walking is stronger than lab walking, but no commercial deployment has followed. | Feb 2026 | ~12 months | Vendor update with new material | 5 | 3 |
| Dec 2024 | Boston Dynamics | Electric Atlas performs a holiday-themed backflip and dance routine in the lab | Locomotion | Lab | 7 | Yes | Electric Atlas appeared live on the CES 2026 stage, extending the lab capability to a public setting. Backflip reproducibility shows motor capability, not productive work; no throughput has been disclosed. | Jan 2026 | ~13 months | Independent video footage in uncontrolled conditions | 5 | 5 |
| Nov 2024 | Figure | Figure claims Figure 02 makes up to 1,000 sheet-metal placements per day at roughly 4x speed and 7x accuracy versus the earlier pilot | Manipulation | Simulated production setting | 10 | Yes | Real deployment totalled 90,000 parts across 1,250 hours over 11 months, narrowing the "1,000/day" framing. Speed claim was real, but general-purpose marketing exceeded what BMW substantiated. | Nov 2025 | ~12 months | Operational metric disclosure | 10 | 8 |
| Nov 2024 | 1X | 1X NEO Beta prepares a steak on a chef's YouTube segment, with scripted dialogue and teleoperated movement | Bimanual coordination | Staged kitchen | 3 | Yes | WSJ's later hands-on confirmed teleoperation is NEO's default mode, fitting the scripted nature of the steak demo. Better understood as a performance than a capability. | Oct 2025 | ~11 months | Third-party journalism | 8 | 0 |
| Oct 2024 | Tesla | Dozens of Tesla Optimus (Gen 2) units walk among guests, pour drinks, play charades, and hold live conversations at the We, Robot event | Verbal interaction / perception | Uncontrolled public environment | 3 | Yes | Bloomberg and a Tesla VP confirmed the bots were human-assisted throughout the event. Canonical case of evidence arriving within 72 hours and sharply narrowing the onstage impression. | Oct 2024 | ~0 months | Third-party journalism | 8 | 0 |
| Oct 2024 | Apptronik | Apptronik announces a Google DeepMind partnership for Gemini Robotics integration on Apollo | Autonomy / planning | Lab | 3 | Partial | Gemini Robotics confirmed in Mercedes Berlin; Apollo remains in data-collection and teleop mode. Two AI partners in a year signals a mobile stack, not a moat. | Mar 2025 | ~5 months | Vendor update with new material | 5 | 3 |
| Sep 2024 | 1X | Thirty 1X EVE units autonomously pick and place items into boxes, bins, and trays, open doors, and plug into charging stations | Manipulation | Staged warehouse zone | 10 | Partial | WSJ's review of 1X's later home robot showed teleoperation dependency, narrowing the fleet's autonomy framing. No external customer has publicly validated the 30-EVE autonomy claim. | Oct 2025 | ~13 months | Third-party journalism | 5 | 3 |
| Sep 2024 | Fourier Intelligence | Fourier GR-2 performs a floor-to-stand maneuver after sim-to-real reinforcement learning, at a reported 89% success rate | Locomotion | Lab | 7 | Partial | Fourier continued to iterate through GR-3, but the 89% claim was never re-measured independently. The customer base remains research-heavy, with no industrial deployment numbers disclosed. | Oct 2025 | ~13 months | Vendor update with new material | 5 | 3 |
| Aug 2024 | Figure | Figure 02 fits sheet-metal parts into precise fixtures on BMW's Spartanburg body-shop line during a two-week pilot | Manipulation / machine tending | Real customer production plant | 10 | Yes | Over 11 months, 1,250 hours and 90,000 parts placed at 99%+ accuracy across 30,000 BMW X3s. A rare row with disclosed runtime and counts, though the target cell was highly automated. | Nov 2025 | ~15 months | Operational metric disclosure | 10 | 8 |
| Aug 2024 | LimX Dynamics | LimX CL-1 picks up 18-lb juice bins, deep-squats with the load, places them on upper and lower shelves, and replans when a man moves a bin | Bimanual coordination | Staged warehouse zone | 10 | Partial | Capability continued in further LimX videos, but the replanning claim was never reproduced at a customer site. Replanning under human interference remains unverified outside LimX's studio. | Jan 2025 | ~5 months | Vendor update with new material | 5 | 3 |
| Aug 2024 | 1X | 1X unveils NEO Beta, a bipedal, soft-clad robot walking around a staged home with a human interacting | Locomotion | Staged home | 7 | Partial | WSJ confirmed walking but showed almost nothing else in the home is autonomous, narrowing the home-ready framing. 1X publicly employs 12+ full-time teleoperators for training data. | Oct 2025 | ~14 months | Third-party journalism | 8 | 3 |
| Aug 2024 | AgiBot | AgiBot Yuanzheng A2 launches with a signature demo of threading a needle to showcase fine-grained dexterity | Manipulation | Lab | 10 | Partial | Platform scaled to 1,000+ commercial units and completed a Guinness walk, but the needle task was never re-demoed. Platform validation strong; marquee launch task never reappeared. | Nov 2025 | ~15 months | Named customer deployment + third-party journalism | 8 | 8 |
| Jun 2024 | Agility | A small fleet of Agility Digit robots begins moving totes from AMRs onto conveyors in a live GXO warehouse under a multi-year contract | Deployment | Live production line | 10 | Yes | Operational throughput at scale validated across 17 months on a narrow, stable tote-transfer workflow. The strongest validation in the dataset, disclosed by the customer, not just the vendor. | Nov 2025 | ~17 months | Operational metric disclosure | 10 | 10 |
| Jun 2024 | Apptronik | Apptronik Apollo is evaluated for tote and box transport and barcode scanning in a joint GXO lab-style setup | Manipulation / perception | Lab | 7 | Partial | GXO continues testing Apollo, but no operational metric has been disclosed for direct comparison with Digit at the same customer. Side-by-side throughput comparisons are unavailable. | Feb 2026 | ~20 months | Vendor update with new material | 5 | 3 |
| Apr 2024 | Boston Dynamics | The new electric Atlas rises from a supine position by rotating its hips 180 degrees, revealing its full range of motion | Locomotion / design | Lab | 10 | Yes | Electric Atlas reached a Hyundai Savannah pilot for autonomous parts-sorting, with the joint motion reproduced on 60 Minutes. Product deployment at scale remains a 2028 target. | Jan 2026 | ~21 months | Named customer deployment | 10 | 5 |
| Apr 2024 | Sanctuary AI | Sanctuary announces Phoenix Gen 7 with a claim of automating new tasks in under 24 hours of learning | Autonomy / planning | Lab | 3 | Partial | Iteration validated through Gen 8, but the 24-hour task-learning claim was never independently benchmarked. CEO Rose and CTO Gildert both departed in 2024, signalling leadership turbulence. | Jan 2025 | ~9 months | Vendor update with new material | 5 | 3 |
| Mar 2024 | Figure | Figure 01 converses in real time, identifies objects on a counter, hands over an apple, and places dishes in a drying rack | Verbal interaction + manipulation | Lab | 10 | No | The OpenAI-backed verbal-plus-manipulation combo was never reproduced at any customer site or follow-up. Figure ended its OpenAI collaboration within 12 months and built Helix internally. | no evidence | no evidence | n/a | 0 | 0 |
| Mar 2024 | Unitree | Unitree H1 Evolution V3.0 speed-walks at 3.3 m/s, dances, climbs stairs, and performs a standing jump as high as an adjacent man | Locomotion | Lab | 10 | Yes | Speed class extended to 10.1 m/s on a public track, surpassing the Guinness verification. H1 is widely deployed as a university research platform via its open SDK. | Aug 2025 | ~17 months | Independent video footage in uncontrolled conditions | 8 | 5 |
| Mar 2024 | Apptronik | Apptronik Apollo begins kitting assembly parts and inspecting components at Mercedes-Benz's Hungary plant under a commercial agreement | Task sequencing / manipulation | Real customer plant | 7 | Partial | Mercedes expanded to Berlin-Marienfelde with MANUS teleop gloves, describing Apollo as "exploring use cases." No shift-level throughput has been published; Mercedes frames it as "intra-logistics support," not production. | Mar 2025 | ~12 months | Third-party journalism | 8 | 5 |
| Feb 2024 | Figure | Figure 01 autonomously picks a bin off a stack and deposits it onto a conveyor belt, running at 16.7% of human speed | Manipulation | Staged warehouse zone | 10 | Yes | Related pick-and-place validated at BMW at 99% accuracy over 11 months, at cycle times slower than implied. Disclosing speed versus human baseline was unusually honest. | Nov 2025 | ~21 months | Operational metric disclosure | 10 | 8 |
| Feb 2024 | 1X | 1X EVE navigates an office, opens doors, tidies shelves, and interacts with people using a single vision NN at 10Hz | Task sequencing | Lab | 10 | Partial | The same capability family appeared in the 30-EVE fleet video, but without external autonomy validation. 1X later shifted its focus from EVE to the bipedal NEO, and EVE was discontinued. | Sep 2024 | ~7 months | Vendor update with new material | 5 | 3 |
| Feb 2024 | UBTech | UBTech Walker S performs door-lock inspection, seat-belt checks, and headlight-cover inspection on the NIO assembly line | Manipulation / task sequencing | Real customer site | 7 | Yes | Walker S-series reached BYD, Geely, FAW-Volkswagen, Audi FAW, BAIC, Foxconn, and SF Express. The customer list is UBTech-provided; task scope per customer has not been independently verified. | Nov 2025 | ~21 months | Operational metric disclosure + named customer deployment | 8 | 8 |
| Jan 2024 | Figure | Figure 01 loads a coffee pod and brews a cup after a verbal command, reportedly trained by watching ten hours of video | Task sequencing | Lab | 7 | No | The coffee task was never reproduced at any external site, and Figure's roadmap moved to Helix. Filmed in-house at undisclosed speed, with no published reliability metrics. | no evidence | no evidence | n/a | 0 | 0 |
| Dec 2023 | Tesla | Tesla Optimus Gen 2 walks 30% faster, performs deep squats, handles an egg without cracking it, and two units dance | Locomotion + dexterity | Lab | 7 | Partial | Subsequent Optimus appearances were confirmed as teleop-assisted, narrowing the Gen 2 autonomy framing. No external observer has reproduced the egg-grip or squat claim. | Oct 2024 | ~10 months | Third-party journalism | 8 | 3 |
| Dec 2023 | Unitree | Unitree H1 walks and maintains balance after being kicked by a handler | Locomotion | Lab | 7 | Yes | Follow-on V3.0 Evolution reached 3.3 m/s and showed balance at university labs operating H1 independently. H1 is widely shipped to research institutions, so third-party reproduction is robust. | Mar 2024 | ~3 months | Vendor update with new material | 5 | 5 |
| Dec 2023 | LimX Dynamics | LimX CL-1 climbs stairs, walks down a 15-degree slope, and transitions between indoor and outdoor surfaces using real-time terrain perception | Locomotion | Lab | 10 | Partial | Subsequent CL-1 demos extended into load handling, but terrain perception has not been verified at a customer site. Sustained stair climbing under load remains unproven outside LimX. | Aug 2024 | ~8 months | Vendor update with new material | 5 | 3 |
| Dec 2023 | Agility | Agility Digit runs a proof-of-concept tote transfer from autonomous mobile robots onto a conveyor inside the SPANX zone at GXO | Manipulation | Live customer zone | 10 | Yes | The proof-of-concept converted into the first humanoid Robots-as-a-Service contract in the industry, narrowly scoped to one workflow. The best-documented humanoid pilot in the field. | Jun 2024 | ~6 months | Multi-year RaaS contract | 10 | 10 |
| Nov 2023 | Sanctuary AI | Sanctuary Phoenix Gen 6 completes 110 retail tasks across picking, tagging, and folding over a week at a Mark's store in Canada | Manipulation / task sequencing | Real customer site | 7 | Partial | Gen 7 claimed 24-hour task learning, but no third-party retail deployment emerged beyond Mark's. Sanctuary pivoted to a wheeled base in Gen 8, abandoning the bipedal retail thesis. | Apr 2024 | ~5 months | Vendor update with new material | 5 | 3 |
| Oct 2023 | Figure | Figure 01 walks untethered in a lab, demonstrating dynamic bipedal locomotion for the first time | Locomotion | Lab | 7 | Yes | Figure walking was validated at BMW Spartanburg, anchoring the locomotion claim. Stair climbing and unstructured walking in uncontrolled environments remain unproven at scale. | Aug 2024 | ~10 months | Named customer deployment | 8 | 5 |
| Oct 2023 | Agility | Agility Digit picks and moves empty totes inside Amazon's BFI1 robotics test facility near Seattle | Manipulation / task sequencing | Real customer facility | 7 | Yes | Tote-handling was validated at commercial scale at GXO, though the Amazon trial itself never publicly scaled. The Amazon BFI1 line faded from coverage; GXO became the showcase. | Nov 2025 | ~25 months | Operational metric disclosure | 10 | 8 |
| Sep 2023 | Tesla | Tesla Optimus sorts coloured blocks by colour and holds a yoga pose to show balance and perception | Manipulation / perception | Lab | 7 | Partial | The capability generalised into Optimus Gen 2, but no external customer has reproduced autonomous block-sorting. Analysts flagged the clip as likely teleoperated; Tesla never disputed the framing. | Dec 2023 | ~3 months | Vendor update with new material | 5 | 3 |
| Aug 2023 | Apptronik | Apptronik Apollo is unveiled with hot-swappable batteries, a 25 kg payload, and staged demos of walking, trailer unload, and palletizing | Design / manipulation | Lab + staged warehouse | 7 | Yes | Apollo moved from unveiling to a Mercedes-Benz pilot within seven months, though autonomous palletizing remains unproven externally. Apptronik cites NASA Valkyrie heritage as a credibility anchor. | Mar 2024 | ~7 months | Named customer deployment | 8 | 3 |
| Jul 2023 | Fourier Intelligence | Fourier reveals the 1.65 m GR-1 biped at WAIC Shanghai, walking on stage with a claim of a 50 kg carrying capacity | Locomotion / manipulation | Conference booth | 3 | Partial | Over 100 GR-1 units shipped to research customers; GR-2 continued iterating, but clinical deployment remains unproven. The 2023 mass-production target was publicly met, rare for this category. | Sep 2024 | ~14 months | Vendor update with new material | 5 | 3 |
| Sep 2022 | Agility | Agility Cassie completes the 100m Guinness World Record for a bipedal robot in 24.73 seconds | Locomotion | Uncontrolled public environment | 10 | Yes | Guinness-verified locomotion record, published by the developing university with full specifications. A locomotion stunt that never translated into deployment, but anchored Agility's later Digit credibility. | Sep 2022 | ~0 months | Academic paper / regulatory filing | 10 | 0 |
| Sep 2022 | Tesla | The Tesla Bumble-C / Optimus prototype walks stiffly across the AI Day 2 stage untethered and waves at the audience | Locomotion | Live stage demo | 7 | Partial | Subsequent Optimus prototypes walked in-lab, but no Optimus has been shown walking at an external customer site. Independent AI analysts publicly dismissed the AI Day 2 presentation at the time. | Dec 2023 | ~15 months | Vendor update with new material | 5 | 3 |
| Aug 2022 | Xiaomi | Xiaomi CyberOne walks onto the keynote stage, greets the CEO, and hands over a flower | Locomotion | Live stage demo | 3 | No | No operational follow-through emerged; Xiaomi never produced CyberOne in meaningful quantity. One of the clearest cases of keynote theatre without downstream product in the dataset. | no evidence | no evidence | n/a | 0 | 0 |
| Dec 2021 | Engineered Arts | Engineered Arts Ameca performs a viral facial-expression "waking up" sequence in a lab teaser video | Perception / verbal interaction | Lab | 7 | Yes | Ameca made its CES 2022 public debut with live audience interaction, and has been purchased by museums and science centers. Easiest capability class to validate publicly. | Jan 2022 | ~1 month | Independent video footage in uncontrolled conditions | 8 | 8 |
| Aug 2021 | Boston Dynamics | Two Boston Dynamics Atlas (hydraulic) robots run a parkour course: banked panels, broad jump, stairs, balance beam, vault, synchronized backflips | Locomotion | Lab obstacle course | 10 | Partial | Electric Atlas reproduced backflips and dance routines, but the synchronized two-robot parkour was never repeated. The hydraulic Atlas was retired in 2024 with no customer workflow ever attached. | Dec 2024 | ~40 months | Vendor update with new material | 5 | 0 |
| Oct 2018 | Boston Dynamics | Boston Dynamics Atlas (hydraulic) hops over a log and up a series of blocks on an outdoor parkour-style course | Locomotion | Staged outdoor course | 7 | Yes | Dynamic locomotion progression was validated in the lab, with zero productive commercial use over the following years. A research platform, never sold to a paying customer. | Aug 2021 | ~34 months | Vendor update with new material | 5 | 0 |
| Nov 2017 | Boston Dynamics | Boston Dynamics Atlas (hydraulic) performs a standing backflip and jumps onto stacked boxes | Locomotion | Lab | 7 | Yes | Backflip was reproduced across hydraulic and electric Atlas variants, establishing it as a repeatable motor capability. A locomotion stunt that validated reliably but never translated into revenue. | Aug 2021 | ~45 months | Vendor update with new material | 5 | 0 |
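
The claim-to-evidence lag column is consistent with a plain difference in calendar months between the demo date and the evidence date. The tracker does not spell out its formula, so the following is a minimal sketch under that assumption; `parse_month` and `lag_months` are hypothetical helper names, and the dates are copied from three rows of the table.

```python
# Month abbreviations as they appear in the tracker's date columns.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def parse_month(label: str) -> tuple[int, int]:
    """Turn a label such as 'Aug 2021' into (year, month_number)."""
    mon, year = label.split()
    return int(year), MONTHS.index(mon) + 1


def lag_months(demo: str, evidence: str) -> int:
    """Whole calendar months from demo date to evidence date (assumed convention)."""
    demo_year, demo_month = parse_month(demo)
    ev_year, ev_month = parse_month(evidence)
    return (ev_year - demo_year) * 12 + (ev_month - demo_month)


# Dates taken from three tracker rows.
lag_atlas_parkour = lag_months("Aug 2021", "Dec 2024")  # hydraulic Atlas parkour
lag_digit_amazon = lag_months("Oct 2023", "Nov 2025")   # Digit at Amazon BFI1
lag_ameca = lag_months("Dec 2021", "Jan 2022")          # Ameca teaser video
```

Under this convention, evidence arriving in the same month as the demo gives a lag of 0, which lines up with the rows reported as "~0 months".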

Insights
We reviewed 59 humanoid robot demos from sixteen leading programs, from November 2017 to April 2026, and scored each one on the gap between the original claim and the evidence that actually arrived. Here is what stood out.
- The most predictive feature separating real progress from narrative is the gap between a vendor's demo and a deploying customer's disclosed operational metric. Almost every ESS 10 row comes from a customer publishing numbers unprompted, not a vendor retrospective.
- One customer (GXO) produces more independent validation than the rest of the industry combined. Agility's 100,000-tote disclosure is the only case where a deploying customer's own press carried the operational metric. Elsewhere the number comes from the vendor.
- The "home humanoid" category has zero DRS ≥ 8 rows. Figure 03's dishwasher, 1X's NEO, and Tesla's household aspirations all fail evidence matching or carry teleop disclosures. Home is the widest demo-to-reality gap in the field today.
- Bimanual coordination has zero ESS 10 entries. Helix grocery, LimX juice bins, NEO steak, and Figure 03 dishwasher all lack customer-site confirmation. Locomotion validates fastest but almost never operationally: backflips and sprints get reproduced; none appears in any deployment row.
- AgiBot A2's 106 km Suzhou-to-Shanghai walk is the only fully unambiguous endurance validation in the dataset: Guinness-adjudicated, multi-day, public route, CBS News on site. Endurance is under-represented in demos but over-represented in credibility.
- Figure's 01-to-03 migration abandons rather than extends every demo claim made before April 2024. The coffee task, OpenAI conversation, and 16.7% bin-pick did not carry into BMW. A canonical demo corpus not reused in deployment, unique to Figure.
- Apptronik-Mercedes is a textbook case of corporate validation substituting for operational validation. Mercedes joined a funding round and praised the partnership, yet Apollo's work is explicit teleop training framed as "exploring use cases." No BMW-style post-mortem.
- The XPeng Iron fall at Shenzhen MixC mall (January 2026) is the dataset's most valuable negative validation. The November 2025 claim of a catwalk-smooth gait collapsed two months later under public conditions, documented both independently and by the CEO himself.
- Across the more than eight years covered, the demo-to-reality gap has neither widened nor narrowed; it has shifted category. In 2019-2022 the gap was locomotion versus usefulness. By 2024 it was manipulation autonomy. By late 2025 narrow-task manipulation was solved at two or three sites, and the gap had moved to the home. Forecasts treating home as the last barrier likely underestimate what comes after.

The methodology behind the Humanoid Robot Demo vs Reality Tracker
This is a live research tool, updated as evidence arrives. It tracks how far humanoid demos actually turn into operational evidence over time.
Each demo needs a public source, a visible capability, and a recognised program. We excluded 21 of 84 initial demos for low confidence (under 50% certainty of what was visibly shown), and dropped four more as operationally redundant with richer entries, leaving 59 rows.
Every row captures one demo claim, not a whole video. We coded the visible claim first from the original source, before looking for follow-up evidence, preventing hindsight bias.
Each demo is scored on three axes, all reported on a 0 to 10 scale. Demo Claim Quality (DCQ) rates demonstration clarity, from 0 (hype) to 10 (specific, measurable, reproducible). Evidence Strength Score (ESS) rates the first credible later proof, from 0 (none) to 10 (audited third-party KPI disclosure). Deployment Reality Score (DRS) rates progress toward commercial operation, from 0 (research) to 10 (audited, recurring, paid). Scores are coded internally on a 0-3 scale (DCQ) or a 0-4 scale (ESS, DRS) and then rescaled to 0-10, so reported values fall on discrete steps.
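
The discrete steps can be illustrated with a short sketch. It assumes a linear rescale to 0-10 with halves rounded up, which reproduces the values seen in the table (3, 7, 10 for DCQ; 3, 5, 8, 10 for ESS and DRS). The rounding rule is an inference from those values, not something the methodology states, and `rescale` is a hypothetical helper name.

```python
import math

# Internal maxima as described in the methodology:
# DCQ is coded 0-3; ESS and DRS are coded 0-4.
INTERNAL_MAX = {"DCQ": 3, "ESS": 4, "DRS": 4}


def rescale(axis: str, raw: int) -> int:
    """Map an internal integer score onto the reported 0-10 scale."""
    top = INTERNAL_MAX[axis]
    if not 0 <= raw <= top:
        raise ValueError(f"{axis} raw score must be between 0 and {top}")
    # Assumed rule: round half up, so 2.5 -> 3 and 7.5 -> 8.
    # Python's built-in round() uses banker's rounding and would give 2 for 2.5.
    return math.floor(raw * 10 / top + 0.5)


dcq_steps = [rescale("DCQ", r) for r in range(4)]
ess_steps = [rescale("ESS", r) for r in range(5)]
drs_steps = [rescale("DRS", r) for r in range(5)]
```

Under these assumptions a DCQ of 7 corresponds to an internal 2 of 3, and an ESS or DRS of 8 corresponds to an internal 3 of 4.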
The DCQ-to-DRS gap is the main object of analysis. When later evidence only partly matched the claim, we coded it Partial.
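
As a rough illustration of the DCQ-to-DRS gap, the sketch below treats it as a simple difference, DCQ minus DRS. That definition is an assumption, since the methodology does not give a formula, and `Row` and `claim_gap` are hypothetical names. The scores are copied from three rows of the tracker.

```python
from typing import NamedTuple


class Row(NamedTuple):
    demo: str
    dcq: int  # Demo Claim Quality, 0-10
    drs: int  # Deployment Reality Score, 0-10


# Scores copied from three tracker rows.
rows = [
    Row("Figure 03 living-room clean (Mar 2026)", 10, 3),
    Row("Agility Digit 100,000 totes (Nov 2025)", 10, 10),
    Row("Tesla We, Robot event (Oct 2024)", 3, 0),
]


def claim_gap(row: Row) -> int:
    """Assumed gap definition: claim quality minus deployment reality."""
    return row.dcq - row.drs


gaps = {row.demo: claim_gap(row) for row in rows}
widest = max(rows, key=claim_gap)  # row with the largest claim-to-deployment gap
```

A gap of 0 means deployment caught up with the claim, while a large positive gap marks a clear, specific demo that has not yet reached commercial operation.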
Environment and autonomy are core interpretive variables: a lab capability under teleoperation is not the same as one on a live customer floor under autonomous control. The "What was actually proven" column shows where evidence stops. See our humanoid robotics market report for more.

Who is the author of this content?
NEW MARKET PITCH TEAM
We track new markets so founders and investors can move faster. We build living “market pitch” documents for emerging markets: from AI to synthetic biology and new proteins. Instead of digging through outdated PDFs, random blog posts, and hallucinated LLM answers, our clients get a clean, visual, always-updated view of what’s really happening. We map the key players, deals, regulations, metrics and signals that matter so you can decide faster whether a market is worth your time.