What's Actually on the Floor in Erlangen, Leipzig, and Spartanburg
Three press releases in three months told the industrial-AI story more clearly than any analyst report. Erlangen, Leipzig, and Spartanburg have settled whether the systems function. Watch the dock, not the demo.
Three press releases in three months told the story of industrial AI more clearly than any analyst report I have read in 2026. Siemens and NVIDIA committed to a fully AI-driven Electronics Factory in Erlangen, beginning this year (Siemens press, January 2026). BMW completed its initial humanoid test deployment at Plant Leipzig in December 2025 and is staging the pilot proper from April 2026 into summer (BMW Group news). And in Spartanburg, South Carolina, Figure 02 closed an 11-month run loading sheet-metal parts at greater than 99% placement accuracy across 1,250 operational hours (IIoT World coverage of the data).
That is not a marketing arc. That is a delivery curve. If you sit on a board that has been studying a manufacturing AI program since 2023 and you still do not have a production deployment, the calendar has stopped being your friend.
I want to walk through what actually shipped, what is still POC theatre, and where the money will and will not survive contact with a real factory.
the three plants, in plain terms
Erlangen is the cleanest test of how far the design-engineer-build-operate cycle can compress when the simulation layer and the execution layer share an operating system. The expanded Siemens and NVIDIA partnership puts the Industrial AI Operating System into the same plant that builds the controllers it instruments: an internal pilot disguised as an external one (NVIDIA newsroom). Read the language carefully. The companies say "fully AI-driven, adaptive manufacturing sites, starting in 2026." Vendor language. I would assume the schedule slips by fifteen percent and discount the breadth of "adaptive" by another twenty percent before I quoted it to a CFO. But the underlying claim, that the digital-twin tooling is now mature enough to drive line decisions in something close to real time, has empirical support beyond the Siemens release. PepsiCo has been running the Siemens-NVIDIA twin stack across selected U.S. plants and reports identifying up to 90% of issues before physical implementation (Siemens news on Digital Twin Composer, January 2026). The 90% number is theirs. It is a useful upper bound, not a planning figure.
Leipzig is the more conservative bet, and that is the point. BMW chose a humanoid platform, AEON from Hexagon Robotics, for high-voltage battery assembly and component handling, with a deliberate testing window before pilot start (Assembly Magazine writeup). The piece worth noticing is that BMW has now run two distinct humanoid programs, Figure 02 in Spartanburg and AEON in Leipzig, and produced quantitative data from at least one of them. Loaded parts, hours, accuracy. Not a video. Not a sizzle reel. A delivery record. That is the threshold automotive procurement actually sets, and it has been cleared.
Spartanburg is the hardest evidence we have. Over 90,000 sheet-metal parts loaded. Better than 99% placement. 1,250 hours of operation. The IIoT World writeup is candid about ROI — the proof point is small relative to the plant's overall output, but it is the first publicly documented, production-scale humanoid run in automotive. Procurement people read this kind of artifact differently than executives do. They read it as evidence that the supplier can deliver a contractually defensible uptime number, which is what an SLA actually contains.
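The Spartanburg figures are easier to judge once they are turned into a rate. The sketch below does only that arithmetic on the three reported numbers. Because the source says "over" 90,000 parts and "better than" 99%, the results are rough bounds rather than measurements, and the function names are mine, not BMW's or Figure's.

```python
# Figures as reported in the coverage; "over" and "better than" make them
# bounds, not exact values.
PARTS_LOADED = 90_000        # "over 90,000 sheet-metal parts"
OPERATING_HOURS = 1_250      # "1,250 hours of operation"
PLACEMENT_ACCURACY = 0.99    # "better than 99% placement"


def parts_per_hour(parts: int, hours: float) -> float:
    """Average loading rate across the whole run."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return parts / hours


def max_misplacements(parts: int, accuracy_floor: float) -> int:
    """Ceiling on misplaced parts implied by a minimum accuracy."""
    if not 0.0 <= accuracy_floor <= 1.0:
        raise ValueError("accuracy_floor must be between 0 and 1")
    return round(parts * (1.0 - accuracy_floor))


rate = parts_per_hour(PARTS_LOADED, OPERATING_HOURS)
misses = max_misplacements(PARTS_LOADED, PLACEMENT_ACCURACY)
print(f"{rate:.0f} parts per operating hour, about {3600 / rate:.0f} s per part")
print(f"roughly {misses} misplacements at most at the 99% floor")
```

A per-hour rate and a ceiling on misses are the two numbers a procurement team can write into an SLA, which is why the raw totals matter more than the video.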
what is being installed under the headlines
The thing that does not show up in any of the headline coverage is the substrate. The OPC Foundation announced this year that it is converting its 430-plus Companion Specifications into formats optimised for RAG, the Model Context Protocol, and AI-assisted engineering, making OPC UA the semantic layer between agentic systems and the machines they need to read and command (Automation.com). That is the boring sentence that matters. Without a normalised, machine-readable semantic layer, every agent on the floor is an integration project disguised as a product. With one, you start to get reusable patterns, and reusable patterns are what compound into ROI.
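To make "normalised" concrete, here is a minimal sketch of the kind of check a semantic-layer audit starts with. It uses no OPC UA library. It assumes you have already exported tag names and engineering units into a plain dictionary, and the `Site.Cell.Equipment.Signal` naming convention, the tag names, and the units are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: Site.Cell.Equipment.Signal,
# for example "SPA.Cell03.Press1.MotorTemp".
TAG_PATTERN = re.compile(r"^[A-Z]{3}\.Cell\d{2}\.[A-Za-z0-9]+\.[A-Za-z0-9]+$")


def audit_tags(tags: dict[str, str]) -> dict[str, list[str]]:
    """Check exported tags (name -> engineering unit) for two common problems:
    names that break the convention, and signals whose unit differs by cell."""
    findings: dict[str, list[str]] = {"bad_name": [], "unit_conflict": []}
    units_by_signal: dict[str, set[str]] = defaultdict(set)
    for name, unit in tags.items():
        if not TAG_PATTERN.match(name):
            findings["bad_name"].append(name)
            continue
        signal = name.rsplit(".", 1)[1]
        units_by_signal[signal].add(unit)
    for signal, units in sorted(units_by_signal.items()):
        if len(units) > 1:
            findings["unit_conflict"].append(f"{signal}: {sorted(units)}")
    return findings


# Illustrative export, not real plant data.
sample_tags = {
    "SPA.Cell01.Press1.MotorTemp": "degC",
    "SPA.Cell02.Press1.MotorTemp": "degF",
    "spa_cell3_press_temp": "degC",
    "SPA.Cell01.Press1.CycleCount": "count",
}
findings = audit_tags(sample_tags)
for category, items in findings.items():
    print(category, items)
```

An agent reading `MotorTemp` from two cells in two different units is exactly the integration project disguised as a product. The check is trivial, and the point is that someone has to run it before the agents arrive.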
The second substrate is the Mega blueprint NVIDIA published for Omniverse, which Accenture and Schaeffler are using to simulate humanoid fleets — Agility Robotics' Digit, in their published demos — inside factory and warehouse twins before any hardware moves (NVIDIA blog). The economics here are the part nobody quite says out loud. A single humanoid in a real plant runs at low double-digit dollars an hour all-in once you load the cost of the cell, the supervisor, and the downtime. Simulated hours run at fractions of a cent. You move the learning loop into the twin and you trade hardware time for compute time, which is the trade hyperscaler economics is actually built for.
The third substrate is the industrial-copilot layer. Siemens shipped nine copilots at CES 2026 spanning design, production, and operations (HPCwire summary). Schneider Electric paired EcoStruxure with Microsoft Azure AI and launched what it is branding "agentic manufacturing" at Hannover Messe in April (TechFinitive coverage). Rockwell stitched NVIDIA's Nemotron models into its edge stack (ARC Advisory writeup). Three competitors, three near-identical bets, on a single architectural premise: the copilot belongs at the edge, sitting on top of OPC UA, calling out to cloud only for what cannot be done locally.
the failure data nobody quotes on the trade-show stage
If you only read the press releases you would think this is a smooth, rising curve. The macro data says otherwise. Gartner in April 2026 reported that AI projects across infrastructure and operations are stalling before they reach meaningful ROI (Gartner press release, 7 April 2026). MIT's Project NANDA work from mid-2025, still being cited heavily a year later, put the share of generative-AI deployments showing zero measurable P&L impact at 95%. The RAND survey from late 2025 placed overall enterprise AI failure at 80.3%. Manufacturing-specific adoption has indeed climbed from 70% to 77% in roughly eighteen months, but in the same surveys, integration burns 58% of project resources.
Read those numbers against the BMW data. BMW shipped a humanoid pilot with a measurable uptime figure inside the failure window everyone else is sitting in. That is not luck. It is the product of a tightly bounded scope (one task, sheet metal, well-defined geometry), a clear partner relationship (Figure, then Hexagon), and a procurement function that demanded numbers rather than narratives. Most of the floors I look at do not have any of those three, and that is why their POCs die.
The honest measurement problem applies on the factory floor as cleanly as it does in knowledge work. Vendors will quote you 95–99% defect-detection accuracy from vision systems, and the 2025 ROI of AI in Manufacturing report says 54% of organisations are now using AI agents for quality control (Edge AI and Vision Alliance, May 2026). I take those numbers seriously but not at face value. The 95–99% is measured on the workloads the vendor optimised. Run the same model on the SKU mix you actually have on your line and the number changes. Sometimes by a lot. The discipline that survives contact with industrial reality, and the one I push hardest on every Real AI engagement, is to instrument the baseline before you install the system. Without that baseline, every percentage point of improvement is an argument, not a result.
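The SKU-mix point can be sketched in a few lines. This is an illustration under my own assumptions, not a vendor's evaluation method: it scores logged inspections per SKU, then weights each SKU by its share of your actual line volume. The SKU names, the records, and the mix are made up.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Inspection(NamedTuple):
    sku: str
    predicted_defect: bool
    actual_defect: bool


def accuracy_by_sku(records: Iterable[Inspection]) -> dict[str, float]:
    """Share of inspections where the model agreed with ground truth, per SKU."""
    correct: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for r in records:
        total[r.sku] += 1
        correct[r.sku] += int(r.predicted_defect == r.actual_defect)
    return {sku: correct[sku] / total[sku] for sku in total}


def mix_weighted_accuracy(per_sku: dict[str, float], mix: dict[str, float]) -> float:
    """Accuracy expected on the line, weighting each SKU by its volume share."""
    missing = set(mix) - set(per_sku)
    if missing:
        raise ValueError(f"no baseline data for: {sorted(missing)}")
    total_share = sum(mix.values())
    if total_share <= 0:
        raise ValueError("mix shares must sum to a positive number")
    return sum(per_sku[sku] * share for sku, share in mix.items()) / total_share


# Illustrative records and mix, not real plant data.
records = [
    Inspection("bracket-A", False, False),
    Inspection("bracket-A", True, True),
    Inspection("panel-B", False, True),   # missed defect
    Inspection("panel-B", True, True),
]
per_sku = accuracy_by_sku(records)
line_mix = {"bracket-A": 0.25, "panel-B": 0.75}
line_accuracy = mix_weighted_accuracy(per_sku, line_mix)
print(per_sku, line_accuracy)
```

In this toy data the model is perfect on one SKU and wrong half the time on the other, so the line-level figure depends entirely on the mix. Run the same calculation on the manual process before installation and you have a baseline to compare against.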
where I would put the money this quarter
Three things, in priority order.
One: an OPC UA semantic-layer audit. Whatever else you do, the agent layer is going to land on top of this. If your tags are inconsistent across cells, your agents will inherit the chaos. The OPC Foundation's RAG-and-MCP push is going to make this much cheaper to fix, and soon, but you want a clean layer ready when it does.
Two: one — and only one — humanoid or vision pilot bounded the way BMW bounded Spartanburg. Single task. Hard accuracy threshold. Hours-of-operation reporting. Quarterly review. If the supplier will not commit to those terms in writing, the supplier is not yet shippable.
Three: a written shutdown criterion. Not a launch criterion — those are easy. A shutdown one. The number, the date, and the named executive who has authority to cut funding if the criterion is not met. Industrial AI is now mature enough that the projects which fail no longer fail because the technology is missing. They fail because nobody wrote down what success looks like, and the program continued past the point at which it should have been killed.
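A shutdown criterion is small enough to write down as data. The sketch below is one hypothetical way to record it, with the number, the date, and the named owner as fields. The field names, the 99% threshold, the review date, and the owner title are placeholders I chose for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ShutdownCriterion:
    metric: str         # what is measured
    threshold: float    # minimum acceptable value
    review_date: date   # when the decision is made
    owner: str          # named executive with authority to cut funding

    def should_shut_down(self, measured: float, today: date) -> bool:
        """True once the review date has arrived and the metric is below threshold."""
        return today >= self.review_date and measured < self.threshold


# Placeholder values for illustration only.
criterion = ShutdownCriterion(
    metric="placement_accuracy",
    threshold=0.99,
    review_date=date(2026, 9, 30),
    owner="VP Manufacturing (hypothetical)",
)

if criterion.should_shut_down(measured=0.97, today=date(2026, 10, 1)):
    print(f"Criterion missed on {criterion.metric}; decision sits with {criterion.owner}")
```

The class is frozen on purpose: once the criterion is agreed, nobody should be able to edit the threshold or the date quietly. Changing it means writing a new one and having the owner sign it.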
I would bet against any program in 2026 that cannot put those three things on a single slide. And I would fund any program that can.
The factory floor has become the most boring and most consequential part of the AI conversation this year. Boring because the work is no longer about whether the systems can be made to function — Erlangen, Leipzig, and Spartanburg have settled that. Consequential because the dollar volumes that will flow through manufacturing AI in the next five years will dwarf the chatbot economy that has consumed the press for two. Watch the dock, not the demo.
Tarry Singh is the founder and CEO of Real AI, an enterprise AI advisory and deployment firm working with global enterprises on production agent systems, model risk, and AI sovereignty strategy. He also leads Earthscan, an energy AI startup, and is a founding contributor to the EU-funded HCAIM and PANORAIMA programmes for responsible AI education across European universities. He writes at tarrysingh.com.