Humanoid Real-Work Evidence Tracker

This tracker captures every publicly documented humanoid robot deployment we could verify between October 2023 and April 2026, with each row graded on source credibility and whether the robot is actually performing recurring productive work.
Rows are ordered from most recent to oldest, with entries dated only by year placed approximately. If you want to dig deeper into this market, you can check out our humanoid robotics market report.
| Date | Robot (Company) | Customer | What has been proven | Capability | Environment | Stage | Sources | Evidence Credibility | Real-work Verdict | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| Mar 2026 | Xiaomi humanoid (Xiaomi) | Xiaomi EV (Beijing, internal) | Two robots autonomously installed self-tapping nuts at a 76-second cycle time, achieving 90.2% success across a 3-hour run on the assembly line. | Install small fasteners | Factory (assembly line) | Pilot | CnEVPost, CNBC | Medium | Partial | The CEO publicly described the robots as "interns" still in training. The trial duration was short and confined to an internal deployment. |
| Feb 2026 | Digit (Agility Robotics) | Toyota Motor Manufacturing Canada | A year-long pilot with 3 units was converted into a commercial agreement for 7 units. The task is unloading totes from automated tuggers at the RAV4 plant. | Unload totes from tugger | Factory (operational) | Pilot → Real work | The Robot Report, TechCrunch, Agility Robotics | High | Yes | The fleet remains small and robots are still segregated from humans. Full operational scale awaits Digit's next safety capability for human-proximity work. |
| Jan 2026 | Atlas electric (Boston Dynamics) | Hyundai (Metaplant Georgia) | First factory-floor task deployment shown to 60 Minutes, with full fleets committed for 2026 delivery. Most scale targets sit in 2027 and 2028. | Parts handling (auto) | Factory (pilot/early operational) | Pilot | The Robot Report, Boston Dynamics, Supply Chain 24/7 | Medium | Partial | The 60 Minutes correspondent noted this is not yet mass deployment. Operational scale on the factory floor is targeted for 2027 and 2028 rather than now. |
| Dec 2025 | Walker S2 (UBTech) | Airbus | An early-concept testing agreement for aircraft assembly, explicitly framed by both parties as exploratory rather than productive work. | Aerospace assembly (concept) | Factory (not yet operational) | Pilot | MERICS | Low | No | Both parties explicitly labelled this as "early concept testing" rather than production. We include it as context adjacent to the exclusion threshold. |
| Dec 2025 | Digit (Agility Robotics) | Mercado Libre (San Antonio, TX) | Commercial agreement signed, with deployment planned for 2026 at a fulfilment centre. No live throughput data has been disclosed yet. | Transfer totes between stations | Warehouse (pre-operational) | Pilot | Agility Robotics, Digital Commerce 360 | Medium | Partial | The deal is at agreement stage with no throughput data available yet. Operational evidence should emerge during 2026 as the centre ramps up. |
| Nov 2025 | Figure 02 (Figure AI) | BMW Group | Two robots ran 10-hour shifts for 11 months, logging 1,250+ hours. They loaded 90,000 parts and contributed to 30,000 X3 vehicles on the line. | Place sheet-metal in fixture | Factory (operational assembly line) | Real work | Figure AI, Assembly Magazine, Repairer Driven News | High | Yes | Fortune previously accused Figure of inflating the robots' role at BMW. The retirement announcement also doubles as marketing for Figure 03. |
| Nov 2025 | Digit (Agility Robotics) | GXO / Spanx | Over 100,000 totes were moved in live fulfilment across roughly 12 months of continuous operation. The milestone is self-reported but publicly endorsed by GXO. | Transfer totes between stations | Warehouse (operational) | Scaled | Agility Robotics, Robotics & Automation News | High | Yes | The milestone is self-reported by Agility, but GXO has publicly endorsed the numbers. This is the clearest scaled real-work case in the dataset. |
| Nov 2025 | Walker S2 (UBTech) | Multiple (BYD, Geely, FAW-VW, Audi-FAW, Foxconn, SF Express) | First mass production batch of several hundred units began delivery, with cumulative orders exceeding 800 million yuan across multiple industrial customers. | Industrial multi-purpose | Mixed | Pilot → Scaled | PR Newswire, AI Business, MERICS | Medium | Partial | The orders are real but per-site real-work performance is harder to verify. MERICS cites UBTech itself admitting Walker S2 runs at roughly half human efficiency. |
| Aug 2025 | HMND 01 (Humanoid, UK) | Schaeffler (Germany) | A near-production trial at a Schaeffler German plant using a pre-alpha unit. Limited third-party reporting accompanies the announcement. | Factory assist | Factory | Pilot | Humanoid, The Robot Report | Low | Partial | Most reporting on this trial is company-originated with limited third-party verification. Schaeffler's broader Humanoid partnership adds some independent grounding. |
| 2025 | Walker S2 (UBTech) | Texas Instruments | Units were purchased for semiconductor plants with deployment reported publicly. Specific productive-task data on-site remains unavailable. | Semiconductor plant assist | Factory | Pilot | MERICS | Low | Partial | The purchase itself is real and documented. On-site productive-task data is not available, so the real-work verdict is only partial. |
| Early 2025 | Digit (Agility Robotics) | Schaeffler (Cheraw, SC) | Digit moves 25-pound baskets of bearing components from a stamping press. It runs two 4-hour shifts daily and replaced one worker who moved to inspection. | Carry component baskets | Factory (operational, in plexiglass cage) | Real work | Wall Street Journal, Humanoids Daily, Schaeffler Tomorrow | High | Yes | A plexiglass enclosure is required by current safety regulations. An Agility contractor supervises the robot on-site during shifts. |
| Early 2025 | Digit (Agility Robotics) | Amazon (operational site) | Limited reporting on the Amazon and Agility relationship since 2023. No confirmed live-site operational data has emerged publicly. | Empty-tote movement | Warehouse | Pilot | The Robot Report | Low | No | We include this only as contextual pilot evidence within the dataset. There is insufficient evidence of recurring productive work to warrant a higher verdict. |
| Mar 2025 | Walker S1 (UBTech) | Zeekr (Geely, Ningbo) | "Dozens" of Walker S units were deployed across final assembly, instrumentation, and QC in a 5G-enabled factory. Unit counts and shift structure are not externally audited. | Collaborative sorting/handling | Factory (operational) | Pilot → Real work | IoT World Today, UBTech | Low | Partial | This is the strongest Chinese automotive case in the dataset. Unit counts and shift structure remain unaudited by independent third parties. |
| Feb 2025 | Apollo (Apptronik) | Jabil | A pilot on the electronics manufacturing floor was announced alongside Jabil taking on contract manufacturing duties for Apollo itself. | Parts handling (electronics) | Factory (pilot cell) | Pilot | TechCrunch | Medium | Partial | This pilot is at an early stage with no throughput metrics disclosed. The dual manufacturing role adds commercial commitment beyond a pure evaluation. |
| Oct 2024 | Walker S1 (UBTech) | BYD (Changsha) | Walker S1 is being trained at a BYD factory on handling tasks. Most of the reporting originates from UBTech itself. | Parts handling + AMR coordination | Factory (training/semi-operational) | Pilot | EyeShenzhen, UBTech | Low | Partial | Most sourcing is company-originated without external audit. The "training" framing itself limits how strongly we can claim real-work status. |
| Aug 2024 | Figure 02 (Figure AI) | BMW Group | The robot placed sheet-metal parts into welding fixtures over several weeks. Accuracy was cited as high during this initial trial. | Place sheet-metal in fixture | Factory (body shop, semi-operational) | Pilot | BMW Group, Manufacturing Dive | Medium | Partial | This was the initial trial that seeded the later 11-month deployment. BMW confirmed there was no permanent on-site deployment at the end of this phase. |
| Jun 2024 | Digit (Agility Robotics) | GXO / Spanx | A multi-year RaaS agreement was signed after the 2023 POC. Robots transfer totes from AMRs to conveyors in a live fulfilment setting. | Transfer totes between stations | Warehouse (operational) | Real work | GXO, The Robot Report | High | Yes | This was the first formal commercial humanoid deployment under a RaaS contract. Robots remain segregated from humans during operation. |
| Jun 2024 | Apollo (Apptronik) | GXO | Early-stage POC initially run in a lab environment, with planned target deployment in a US distribution centre. GXO describes it as pre-deployment R&D. | Warehouse parts handling | Lab → warehouse | Pilot | GXO, The Robot Report | Medium | Partial | GXO explicitly describes this as pre-deployment R&D rather than productive work. The pilot is one of several GXO is running across vendors. |
| Apr 2024 | Phoenix (Sanctuary AI) | Magna International | Pilot deployment for manipulation tasks was announced, with operation described as teleoperated under "pilot mode". | Dexterous small-part handling | Factory (pilot cell) | Pilot | The Logic, TechCrunch | Medium | Partial | Magna statements confirm operation is primarily teleoperated rather than autonomous. We downgrade credibility accordingly given the autonomy ambiguity. |
| Mar 2024 | Apollo (Apptronik) | Mercedes-Benz (Berlin-Marienfelde, Kecskemét) | A single-digit number of units are testing intra-logistics tasks. These include parts delivery and tote transport on semi-operational lines. | Deliver parts/totes | Factory (semi-operational) | Pilot | PR Newswire, WardsAuto, TechCrunch | Medium | Partial | The deployment remained at pilot stage as of early 2025 per Apptronik's own statements. Press coverage of this programme has consistently exceeded its operational scale. |
| 2024 | Walker S1 (UBTech) | Audi-FAW (Changchun) | Deployed for air-conditioning leak detection and visual quality control inspection. The specific task description is unusually detailed for this category. | Visual/leak QC inspection | Factory | Pilot | People's Daily Online, Audi Club North America | Low | Partial | The specific task is well-described compared to other Walker S deployments. Independent audit of productivity and uptime remains limited. |
| 2024-2025 | Walker S Lite (UBTech) | FAW-Volkswagen (Qingdao) | Walker S Lite units were deployed for quality-control inspection tasks on the production line. The programme remains at pilot stage. | Visual QC inspection | Factory | Pilot | UBTech, People's Daily Online | Low | Partial | The reporting is company-originated with limited third-party confirmation. The pilot label has persisted without a visible transition to productive scale. |
| Oct 2023 | Digit (Agility Robotics) | Amazon (R&D facility, Seattle area) | Tested on empty-tote recycling at Amazon's robotics R&D site. The environment was explicitly a research facility rather than a live fulfilment centre. | Move empty totes | Lab-like facility | Pilot | Agility Robotics, IEEE Spectrum | Medium | Partial | The R&D site framing rules out any real-work claim in live operations. Partnership status between Amazon and Agility has remained unclear since. |
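
To put a few of the headline figures in the table on a common footing, the sketch below converts reported totals into rough average rates. It uses only numbers quoted in the rows above; the function names are ours, "roughly 12 months" is assumed to mean 365 days, and the Xiaomi estimate assumes back-to-back cycles with no idle time. Treat the results as back-of-envelope averages, not vendor-reported rates.

```python
def average_rate(total_units: float, total_hours: float) -> float:
    """Average units handled per hour over the reported period."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return total_units / total_hours


def successful_cycles(run_hours: float, cycle_seconds: float, success_rate: float) -> float:
    """Expected successful cycles for one robot running cycles back to back."""
    cycles = run_hours * 3600 / cycle_seconds
    return cycles * success_rate


# Figure 02 at BMW: 90,000 parts across 1,250 logged hours
# (the table does not say whether hours are per robot or combined).
bmw_parts_per_hour = average_rate(90_000, 1_250)

# Digit at GXO / Spanx: 100,000 totes over roughly 12 months (assumed 365 days).
gxo_totes_per_day = 100_000 / 365

# Xiaomi: 3-hour run at a 76-second cycle with 90.2% success, per robot,
# assuming no idle time between cycles (our assumption, not a reported fact).
xiaomi_good_installs = successful_cycles(3, 76, 0.902)

print(f"BMW: ~{bmw_parts_per_hour:.0f} parts per logged hour")
print(f"GXO: ~{gxo_totes_per_day:.0f} totes per calendar day")
print(f"Xiaomi: ~{xiaomi_good_installs:.0f} successful installs per robot")
```

Averages like these hide shift structure and downtime, which is one reason the tracker asks for duration and throughput together before granting a Yes verdict.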

Humanoid Capability Tracker
This table gives a current view of what humanoid robots can and cannot actually do in real operational environments, based on real-life deployments rather than demos.
Each capability is graded across four stages of proof, from isolated demo up to scaled deployment. You can find more detail in our humanoid robotics market report.
| Capability | Demo proof | Pilot proof | Real-work proof | Scaled proof | Best example | Confidence | Caveat |
|---|---|---|---|---|---|---|---|
| Transfer totes between stations | Yes | Yes | Yes | Partial | Digit at GXO and Spanx moved over 100,000 totes across 12 months in live fulfilment. | High | Robots remain segregated from humans on the floor. Each installation handles a single narrowly specialised task. |
| Load sheet-metal into fixture | Yes | Yes | Yes | No | Figure 02 at BMW ran 1,250+ hours, loaded 90,000 parts and achieved over 99% accuracy. | High | This was an extended structured pilot rather than open-ended production. BMW declined to convert it into a permanent on-site deployment. |
| Carry component baskets and material | Yes | Yes | Yes | No | Digit at Schaeffler Cheraw moves 25-pound baskets on two 4-hour split shifts every day. | High | A plexiglass enclosure is mandated by current safety standards. An Agility contractor supervises the robot on-site throughout shifts. |
| Unload totes from tugger for line feed | Yes | Yes | Partial | No | Digit at Toyota Canada moved from a year-long 3-unit pilot to a 7-unit commercial rollout. | Medium | The fleet remains small at seven units. Full operational data on the commercial phase is still pending public disclosure. |
| Install small fasteners at cycle time | Yes | Yes | Partial | No | Xiaomi humanoids achieved 90.2% success at a 76-second cycle over a 3-hour run. | Medium | This was a short-horizon test rather than a full shift. The CEO himself framed the robots as "interns" still in training. |
| Deliver parts kits in intra-logistics | Yes | Yes | Partial | No | Apollo at Mercedes has run an ongoing intra-logistics pilot across two European plants. | Medium | The programme has remained at pilot stage per Apptronik's statements to TechCrunch. Press coverage has run well ahead of deployed operational scale. |
| Visual QC and leak inspection | Yes | Yes | Partial | No | Walker S1 at Audi-FAW performs air-conditioning leak detection on the assembly line. | Low | Evidence is primarily company-reported with thin external audit. The pilot label has persisted without independent productivity benchmarks. |
| Dexterous small-part manipulation via teleop | Yes | Yes | No | No | Sanctuary Phoenix at Magna handled small parts under teleoperation in a pilot cell. | Low | Operation runs primarily under teleoperation rather than autonomy. Autonomous validation of the same capability has not been publicly demonstrated. |
| Multi-hour shift endurance | Yes | Yes | Yes | Partial | Figure 02 at BMW, Digit at Schaeffler and Digit at GXO each run multi-hour daily shifts. | Medium | "Multi-hour" here means 4 to 10 hours with charging breaks rather than true 24/7 uptime. No current Western deployment runs continuously around the clock. |
| Factory-aisle navigation among humans | Yes | Partial | No | No | No deployment fully validates this today, since all current real-work cases segregate robots from workers. | Low | The binding constraint is regulatory machine-guarding rather than AI capability. Agility's next-generation Digit safety certification is the key 2026 milestone. |
| Multi-robot coordination on factory tasks | Yes | Yes | Partial | No | UBTech Walker S1 at Zeekr reportedly runs across dozens of workstations simultaneously. | Low | The scale claims originate from UBTech and its customer rather than independent audit. Coordination quality has not been externally benchmarked. |
| Autonomous battery swap for 24/7 uptime | Yes | Yes | No | No | UBTech Walker S2 demonstrates an autonomous hot-swap battery capability on a live production line. | Medium | The capability has been demonstrated but continuous productive use remains unverified. No customer has published round-the-clock utilisation metrics. |
| Electronics assembly precision work | Yes | Yes | No | No | Apptronik Apollo is running a pilot on the Jabil electronics manufacturing floor. | Low | The programme is very early with no throughput metrics disclosed publicly. Precision benchmarks have not yet been validated against production tolerances. |
| Home chores such as laundry and tidying | Yes | Partial | No | No | 1X NEO early-access units and Figure 03 partner pilots are the only visible consumer programmes. | Low | No verifiable productive-task metrics exist for any deployed residential robot. Teleoperation reliance is understood to be high in current home programmes. |

Insights
We built this tracker around a single question: where, today, is a humanoid robot plausibly performing useful productive work, under conditions we can verify? After filtering demos, marketing events and announcement-only deals from the evidence base, here is what stood out.
- Only one humanoid robot has independently verified multi-customer productive deployment today: Agility Robotics' Digit. Credible on-site evidence exists at GXO's Spanx warehouse, Schaeffler's Cheraw plant, and Toyota Canada's RAV4 factory.
- Across humanoid robots doing real productive work today, two task families dominate the evidence base. These are placing sheet-metal parts into welding fixtures, and transferring plastic totes between warehouse stations. No other capability has cleared the same bar despite thousands of public demos.
- Humanoid robot announcements rarely disclose all four pieces of evidence together: customer, task, duration, and throughput. Figure 02 at BMW, Digit at GXO, and Digit at Schaeffler are the clearest cases supplying all four.
- Every documented US humanoid factory deployment today operates inside a cage or segregated zone, per federal safety rules. The marketing imagery of robots walking freely among workers does not match the regulated factory-floor reality in 2026.
- Every humanoid robot working in a factory today still needs a human supervisor on-site. The Wall Street Journal showed an Agility contractor watching Digit at Schaeffler, so real unit economics include supervisor salaries.
- Safety regulation, not AI capability, is the binding constraint on humanoid real work in factories today. Every meaningful humanoid deployment is functionally a fixed-cell industrial robot that happens to walk on two legs.
- For humanoid real-work scaling in 2026 and 2027, the milestone to watch is safety certification rather than AI or hardware progress. Whether Agility's next-generation Digit clears US human-proximity standards will decide if deployments scale or stagnate.
- In humanoid robot hardware, wrists and forearms are the current frontier, not legs. Figure's most-cited lesson from 1,250 hours at BMW was forearm failure in manipulation tasks. Locomotion is largely solved in humanoids today, but repetitive manipulation endurance is not.
- Matching factory line tempo is emerging as the new credibility benchmark for humanoid robots. Xiaomi's 76-second cycle, Figure's 84-second cycle at BMW, and UBTech's swarm focus all converge on production tempo.
- Single-robot backflip demos matter less than fleet demos that meet factory line tempo. Unitree's acrobatic videos and AgiBot's flying kicks are technically impressive but operationally irrelevant to productive deployment.
- Humanoid robots are marketed as general-purpose machines that can learn any task. But in reality, each deployed humanoid today handles only one narrow task per installation. General capability exists in demo videos, not in customer production environments.
- The automotive industry is effectively the only customer for humanoid real work today. BMW, Mercedes, Hyundai, Toyota, Audi, BYD, Geely, FAW, and Xiaomi EV dominate the entire deployed base.
- Real humanoid work in logistics today reduces essentially to one company doing one task. That is Agility's Digit transferring plastic totes at GXO's Spanx warehouse in the US. No humanoid is yet doing varied e-commerce picking at scale anywhere.
- Retail humanoid deployments have effectively disappeared from the serious evidence base. Sanctuary's early retail trials at Canadian stores Mark's and Sport Chek have not been repeated by any credible peer since.
- Figure 02's 11-month BMW deployment ended in retirement rather than scaling up, contrary to media framing. It was a structured learning exercise that transitioned Figure toward its Figure 03 hardware redesign, not permanent production.
- Mercedes-Benz's actual 2025 Apollo humanoid deployment was a single-digit number of robots across two plants. Press coverage of this programme vastly exceeded its operational scale, which is typical across the humanoid sector.
- The gap between humanoid announcements and actual deployments is widening rather than shrinking. Hyundai-Atlas scale targets sit in 2027 and 2028 and Schaeffler's 100-plant rollout in 2030, while current deployed fleets total in the low hundreds globally.
- Teleoperation, meaning humans remotely controlling robots, is more common in humanoid deployments than publicly acknowledged. 1X hires "1X Experts" to remotely operate NEO robots, and Sanctuary runs its Magna deployment in teleoperated "pilot mode".
- Chinese humanoid volume claims are probably real, but productivity claims are likely exaggerated. UBTech itself has admitted, as cited by MERICS, that Walker S2 runs at roughly half human efficiency.
- The humanoid evidence base is biased toward visible Western failures and invisible Chinese failures. Fortune investigated Figure's BMW claims in 2025, but no equivalent English-language scrutiny of UBTech or AgiBot exists.
- Agility Robotics rents its Digit humanoid for $10 to $25 per robot-hour, with a long-term target of $2 to $3. This Robot-as-a-Service pricing sits around a $20 US entry-level factory wage, so humanoids are at wage-equivalence rather than dominance.
- The narrative that humanoid robot costs are collapsing sits uneasily with actual deployment economics. Unitree's $5,900 and $13,500 robots are research toys, while robots doing productive work cost an order of magnitude more.
- Battery architecture is quietly more strategic than AI for factory humanoid ROI. UBTech's Walker S2 uses hot-swappable batteries targeting 24/7 operation, while most Western humanoids still need dock-and-charge cycles.
- Humanoid makers partnering with established contract manufacturers are shipping more units than those building in-house. Apptronik partners with Jabil while vertically integrated programmes at Tesla and Figure have consistently missed volume milestones.
- The "customer becomes investor" pattern correlates strongly with the only real humanoid deployments today. Amazon, Magna, Schaeffler, Mercedes, and Hyundai all took equity in their humanoid partner before meaningful pilot work started.
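
The wage-equivalence point in the Robot-as-a-Service pricing insight above can be checked with simple arithmetic. The sketch below is a minimal illustration using the quoted figures ($10 to $25 per robot-hour today, a $2 to $3 long-term target, and a $20 entry-level wage). The `cost_ratio` helper and the $30 supervisor rate are our own placeholders, not sourced figures, and the comparison ignores benefits, integration costs and uptime.

```python
def cost_ratio(robot_hourly: float, wage_hourly: float,
               supervisor_hourly: float = 0.0,
               robots_per_supervisor: int = 1) -> float:
    """Effective robot cost per hour divided by the human wage it is compared with."""
    if wage_hourly <= 0 or robots_per_supervisor < 1:
        raise ValueError("wage must be positive and at least one robot per supervisor")
    # Spread the supervisor's hourly cost across the robots they watch.
    effective = robot_hourly + supervisor_hourly / robots_per_supervisor
    return effective / wage_hourly


WAGE = 20.0  # US entry-level factory wage quoted above, in $/hour

for label, price in [("current low", 10.0), ("current high", 25.0),
                     ("target low", 2.0), ("target high", 3.0)]:
    print(f"{label}: {cost_ratio(price, WAGE):.2f}x the wage")

# Hypothetical: one supervisor at $30/hour (placeholder, not a sourced figure)
# watching a single robot rented at $10/hour.
print(f"with supervisor: {cost_ratio(10.0, WAGE, supervisor_hourly=30.0):.2f}x the wage")
```

Under these assumptions the current price band straddles the wage, while adding a dedicated supervisor pushes even the low end well above it, which is why supervisor ratios matter for the unit economics.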
The methodology behind this Real-Work Evidence Tracker
We built this tracker as an evidence screen rather than a media archive. Every candidate was tested against three screens: source quality (company plus customer plus credible third party), operational specificity (concrete task, location and duration), and real-work relevance (actual productive output rather than isolated capability demonstration).
We excluded standard video demos, marketing montages, concept videos, stage demonstrations and social media clips regardless of virality. We excluded pure partnership announcements without an observed task. We also excluded Tesla Optimus factory claims because the company's own Q4 2025 earnings framing placed current units in a data-collection role rather than productive work. Reasonable readers could disagree, but we preferred consistency. When in doubt, we excluded.
We preferred Partial verdicts over Yes verdicts in borderline cases, and we downgraded credibility aggressively where customer confirmation was missing, autonomy was unclear, or the setting was described as "training" rather than "production". The Figure 02 BMW entry was our hardest single judgment, coded as Yes because the operational output was substantial and independently referenced by BMW.
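
For readers who want to apply a similar screen to their own candidates, here is one hypothetical way to encode the three screens and the downgrade rules described above. The actual grading in this tracker was editorial judgment, so the `Candidate` fields, the scoring and the thresholds below are illustrative assumptions and will not reproduce every row.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Source quality: who has confirmed the deployment.
    company_source: bool
    customer_source: bool
    third_party_source: bool
    # Operational specificity: concrete task, location and duration.
    has_task: bool
    has_location: bool
    has_duration: bool
    # Real-work relevance and downgrade triggers.
    productive_output: bool
    autonomy_clear: bool = True
    framed_as_training: bool = False


def credibility(c: Candidate) -> str:
    """Hypothetical grading: count source types, then apply one downgrade."""
    score = sum([c.company_source, c.customer_source, c.third_party_source])
    if not c.customer_source or not c.autonomy_clear or c.framed_as_training:
        score -= 1  # downgrade where confirmation or autonomy is in doubt
    if score >= 3:
        return "High"
    return "Medium" if score == 2 else "Low"


def verdict(c: Candidate) -> str:
    """Hypothetical verdict: 'Yes' only when every screen passes cleanly."""
    if not c.has_task:
        return "Excluded"  # partnership announcement without an observed task
    if not c.productive_output:
        return "No"
    specific = c.has_location and c.has_duration
    if specific and credibility(c) == "High":
        return "Yes"
    return "Partial"  # borderline cases default to Partial


# Illustrative inputs loosely modelled on two rows; the flags are our reading.
well_documented = Candidate(True, True, True, True, True, True, True)
training_phase = Candidate(True, False, False, True, True, False, True,
                           framed_as_training=True)

print(credibility(well_documented), verdict(well_documented))
print(credibility(training_phase), verdict(training_phase))
```

Defaulting to Partial mirrors the stated preference for caution: a candidate has to clear every screen before it earns a Yes.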
The most important limitation is regional sourcing asymmetry. Chinese humanoid activity at Geely, Zeekr, BYD, Foxconn, AgiBot and Xiaomi is at least as significant in scale terms as Western deployments, but the verification layer is thinner. Independent on-site audits and standardised throughput metrics are comparatively rare, so we likely underrepresent real-world Chinese activity.
This tracker will age quickly. Toyota Canada, Mercado Libre and Atlas at Hyundai are on the cusp of shifting from pilot to real work within 6 to 12 months. You can find more analysis and the full market context in our humanoid robotics market report.

Who is the author of this content?
NEW MARKET PITCH TEAM
We track new markets so founders and investors can move faster
We build living “market pitch” documents for emerging markets: from AI to synthetic biology and new proteins. Instead of digging through outdated PDFs, random blog posts, and hallucinated LLM answers, our clients get a clean, visual, always-updated view of what’s really happening. We map the key players, deals, regulations, metrics and signals that matter so you can decide faster whether a market is worth your time. Want to know more? Check out our about page.