Exclusive: Xiaoyu AI’s Qiao Zhongliang – Forging a Practical, Ground-Level Path for Embodied Intelligence in the Factory, Beyond All the Hype

Reporter | Wu Yangyu

Editor | Wen Shuqi

In a way, Qiao Zhongliang’s entrepreneurial destiny was sparked by a toy cube his kid bought.

He saw the words “Made in Vietnam” on it and felt confused—wasn’t it always “Made in China”?

After digging in, he realized that with rising labor costs and capital chasing cheaper options, some manufacturing was moving out of China. This internet veteran, who had witnessed China’s manufacturing dominance, felt a structural shift happening—and he thought he had both the opportunity and the vision to do something about it.

After 12 years inside Xiaomi’s system, Qiao still wanted to step outside and build something of his own. After plenty of research and soul-searching, he founded Xiaoyu AI in 2023, focusing on “one brain, many forms” technology for industrial settings—basically a universal brain model that can control different types of robots.

More specifically, his team trained a native multimodal robotics foundation model that lets robots “see” workpieces as humans do and execute tasks accurately, achieving zero-shot generalization for new parts. Recently, the company closed a B+ round worth hundreds of millions of yuan, with its first commercial use case—intelligent welding—already in mass production and delivery.

After choosing his lane, Qiao visited over 30 factories and farms, from auto assembly lines to cotton-harvesting operations. He ultimately picked welding as his entry point: the largest single job market he could identify, with an estimated 5 million-plus welders globally, and a precisely defined target.

Welding is a brutal job: 50°C heat, dust, and noise. Young people don’t want to do it, and experienced welders are aging out. In Qiao’s view, using AI and robots to replace those jobs isn’t just about cutting costs and improving efficiency; it’s also a moral imperative.

But let’s be honest: this business story, which he calls “both right and righteous,” didn’t exactly feel “sexy” during the past two years of crazy hype around embodied intelligence.

“Yeah, we’re not part of that mainstream narrative,” Qiao admits. Despite the inevitable noise, his startup holds a strategy meeting every six months to double-check its direction. Three years in, they haven’t received a single opposing vote in a full-team vote.

Qiao says his strategic thinking and resilience were both forged at Xiaomi. As an engineer, he was deeply involved in the birth of MIUI, rotated through 11 different roles, ran an e-commerce platform, led AI imaging, oversaw quality testing, and, before leaving, spent five years delivering a cross-device universal operating system for Xiaomi.

Looking back, it was like a decade-long pre-training of his brain’s “foundation model”—he developed an instinct for user needs, the ability to make quick decisions under pressure, and a methodology for solving complex problems.

That might be the common thread among entrepreneurs who came out of China’s golden decade of the internet.

Now Qiao is using that foundation to build something new. After nailing welding, the team plans to move into grinding or spraying and eventually establish a standardized set of end-to-end components.

His vision for that capability is a “productivity distribution platform,” where Xiaoyu AI provides the underlying intelligent components, and partners inject industry-specific know-how to cover more industrial scenarios—think Xiaomi’s ecosystem model.

In the interview below, you’ll see how a mid-career internet veteran overcomes path dependence, reads the future from a “Made in Vietnam” toy, and tries to carve out a realistic alternative path for China’s embodied intelligence in industrial settings, far from the hype.

Below is the full interview, lightly edited for clarity:

“Probably Xiaomi’s Most Rotated Engineer: Every Move Was a Big Opportunity for Me”

Q: Let’s talk about the whole startup journey. When did you first get the idea?

Qiao Zhongliang: Actually, I started thinking about it in 2022. Xiaomi was tracking language models (GPT) internally, and there was a debate: is it just overfitting, or real generalization? I leaned toward real generalization.

Q: Believing it was real generalization—what did that mean to you?

Qiao Zhongliang:You just felt like a new era was coming.

AI wasn’t useless before—AlphaGo beat Lee Sedol. But that was a narrow model with huge application costs. When a general model’s generalization gets strong, it opens up so many scenarios. That’s when you feel a revolution is coming.

Of course, that’s the trigger. But there was also a deeper motivation—what did I really want to do? I’d been at Xiaomi for 12 years, witnessed the birth of a great company, but I still had that entrepreneurial itch.

Q: That itch never went away? You lived through a golden 12-year entrepreneurial ride—why wasn’t that enough?

Qiao Zhongliang:It never was. Working for someone else is different from building your own thing. When you’re inside a system, you’re still part of it. When you go out on your own, you start building your own value system.

Q: So you had ideas you hadn’t put into practice?

Qiao Zhongliang: Yeah, I wanted bigger innovation and a bigger sense of achievement.

Q: But Xiaomi was at a critical moment then—just deciding to build cars and chips. Didn’t that feel like a new playground?

Qiao Zhongliang:Honestly, I’d been through those kinds of moments many times at Xiaomi.

Xiaomi was growing fast, and I was probably the most rotated engineer—11 different roles. Each one was like the kind of opportunity you’re describing.

Q: What roles did you rotate through?

Qiao Zhongliang:I was the fourth MIUI engineer. I worked with Li Wanqiang on Xiaomi’s e-commerce, then built a mobile e-commerce platform during the mobile internet wave. At that time, Xiaomi was the third-largest e-commerce platform in China—hard to believe, right? All driven by hit-product traffic.

After e-commerce, I moved to system-level anti-harassment—calls and SMS spam were terrible during the mobile internet boom. Then I took over AI imaging, quality testing during Xiaomi’s “Quality Year,” and internet monetization tools.

Later, Xiaomi evolved from “Phone + AIoT” to “Phone × AIoT”: serial production became parallel. Lei Jun proposed “develop once, deploy on multiple devices,” and I built a universal operating system for heterogeneous hardware.

That was tough. I spent five full years on it and delivered the result to Lei Jun before leaving Xiaomi.

Q: But in terms of scale, can those opportunities compare to chips and cars?

Qiao Zhongliang:When I was working on MIUI, you know how many business units were under me? I can tell you this: even if I’d gone to work on cars, it wouldn’t have pushed me out of my comfort zone.

Q: Between the entrepreneurial urge and actually starting, what led you to your current field?

Qiao Zhongliang: Starting in 2022, I looked at commercial space and autonomous driving. For various practical reasons, both seemed challenging. Then a catalyst came: my kid bought a toy made in Vietnam. That gave me a weird signal: wasn’t it always “Made in China”? Why is it “Made in Vietnam”?

I’d been in the internet world and didn’t have that visceral feeling. I followed the thread and saw manufacturing outflow. Capital is profit-driven; it goes where costs are lower. That’s a structural trend. I thought, in this window of change, could we build great productivity tools so Chinese manufacturing competes not on low costs but on intelligence? That question pointed me in a direction.

Q: What kind of toy was that?

Qiao Zhongliang:Something like a kids’ Rubik’s cube.

Q: You went directly from “Made in Vietnam” on it to all of that?

Qiao Zhongliang: It felt really weird, because it used to always be “Made in China.” And when you’re already asking yourself “what do I want to do,” you start thinking a lot.

Q: Why did you end up focusing on a universal brain for industrial embodied intelligence?

Qiao Zhongliang:When I was looking at autonomous driving applications, I was already thinking about robots. Also, Xiaomi had built CyberDog and CyberOne, and I’d worked on robot systems, so my sensitivity to this stuff was high.

When I worked on robots, I wondered: could we reuse a mature supply chain? Could we find an open-source system like Android?

After lots of research, I concluded the industry’s infrastructure is very immature—kind of like the EV industry before 2015, when the three-electric system (battery, motor, controller) wasn’t fully ready.

Theoretically, robots will be widely used in the future. From an industry perspective, specialization will happen: some do applications, some models, some systems, some hardware, some core components.

For an industry to grow, you have to solve its pain points. I happened to have built universal operating systems, so it was natural to apply that skill here.

Q: But you could have chosen consumer scenarios. Why industrial?

Qiao Zhongliang:Systemic revolutions usually start in production before daily life. Steam engines went into textile mills first, then came railways. Computers went into businesses first, then smartphones. Getting robots into homes isn’t just a tech problem—there are also commercial paths and social ethics.

Second, there’s a moral angle. Some AI applications face resistance because they replace jobs people still want to do. But in factories, welders, grinders, and sprayers work in 50°C summers and −10°C winters, surrounded by dust and noise year-round.

The morality here is: first, don’t compete with humans for work they want; tech should serve people. Second, humans really shouldn’t be doing those jobs. Replacing them with machines isn’t stealing jobs—it’s liberation.

Q: Based on that logic, you chose a universal brain for industrial embodied intelligence. But you came from mobile software. Could you transfer that technical know-how?

Qiao Zhongliang:Over 15 years of dealing with hundreds of millions of users every day, I developed an instinct for users—intuition about needs and product definition. You have to make decisions under huge pressure.

Plus, I built a problem-solving methodology and the ability to execute.

So for me, the technical barrier wasn’t the issue. The key is whether you can identify user needs and translate them into technical solutions.

Finding the “Driver” in Industry, Analogous to Cars, and Making It a “Hit Product”

Q: After you started, how did you develop insight into B2B scenarios?

Qiao Zhongliang: The bottlenecks in decision-making are domain understanding and methodology. I had the methodology from my past but lacked the domain understanding, so I had to learn. I visited over 30 companies, from auto factories to farms, including bean-picking and cotton-harvesting operations. I filled that gap before making judgments.

Q: How did you pick welding?

Qiao Zhongliang:When I narrowed down from broad scanning, I wanted a good entry point that could scale. It had to fit two criteria: a big market and a good technology match. I looked for non-contact work, not too flexible, not too demanding on force control—ideally solvable with 3D spatial path planning.

Welding was the best fit I could find among large scenarios. It’s also the largest single market I could identify.

Q: What do you mean by “largest single market”?

Qiao Zhongliang:Like in the automotive world, you can’t find another role like “driver.”

Q: So welders are to industry what drivers are to cars?

Qiao Zhongliang:In terms of scale of a single job type. There are over 2 million welders in China, and globally, I estimate over 5 million. That means if you build a robot, you have the potential to sell hundreds of thousands or even a million units worldwide. When a job is highly homogeneous, people don’t want to do it, and customers can easily calculate the ROI—those three conditions together make a great starting point.

Q: What’s the replacement ratio and penetration rate here?

Qiao Zhongliang: In non-standard industries, the replacement ratio is basically 1:1. If China’s market penetration hits 60%, that’s 1.2 million robots (2 million × 60%). If we replace them over five years, that’s 200,000+ per year in China and 500,000+ globally. To thrive in an industrial scenario, you need over 50% market share. So I could sell 200,000+ units a year, or even at half that, 100,000. Selling over 100,000 units a year is essential to staying in the game long-term.
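
As a rough check on the arithmetic in this answer, the sketch below recomputes the figures in Python. The 2 million and 5 million welder counts, the 60% penetration, the five-year window, and the 1:1 replacement ratio come from the interview. Applying the same 60% penetration to the global figure and reading “half” as a 50% market share are assumptions made here for illustration, and the function name `annual_units` is hypothetical.

```python
def annual_units(workers: int, penetration_pct: int, years: int) -> int:
    """Robots per year at a 1:1 replacement ratio, spread evenly over `years`."""
    total_robots = workers * penetration_pct // 100
    return total_robots // years


# Figures quoted in the interview.
CHINA_WELDERS = 2_000_000
GLOBAL_WELDERS = 5_000_000  # Qiao's own estimate
PENETRATION_PCT = 60
YEARS = 5

china_total = CHINA_WELDERS * PENETRATION_PCT // 100
china_per_year = annual_units(CHINA_WELDERS, PENETRATION_PCT, YEARS)
# Assumption: the same 60% penetration applies to the global welder count.
global_per_year = annual_units(GLOBAL_WELDERS, PENETRATION_PCT, YEARS)
# Assumption: "half" means a 50% share of China's annual volume.
china_at_half_share = china_per_year // 2

print(china_total, china_per_year, global_per_year, china_at_half_share)
```

Under these assumptions the annual totals come to 240,000 units in China and 600,000 globally, consistent with the “200,000+” and “500,000+” quoted, and a 50% share of the China volume is about 120,000 units a year, in line with the 100,000-unit threshold he mentions.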

Q: In your view, who’s most likely to come in and do something similar?

Qiao Zhongliang:First, this is definitely not an easy road. The smartest talent in the market might not want to enter this space—it’s slow, the story is hard to tell, and you actually have to go into factories and get your hands dirty. That alone filters out half the people.

The second wave of competition could come from existing automation players in the industry. They’re strong on process and automation, but today you need flexibility in the “brain”—their capital structure and talent pool may not support the competition. So I see us entering a kind of “no man’s land.”

Of course, competition won’t be absent forever. More accurately, we’re in a window. The faster we build barriers, the harder it’ll be for later entrants.

Q: This “no man’s land”—slow and hard to sell—how did you pitch it?

Qiao Zhongliang:Early on, I had a huge challenge explaining it to many people—including investors and suppliers.

Q: When you decided to start, who was your first investor?

Qiao Zhongliang:Lei Jun.

Q: Did Lei Jun get it?

Qiao Zhongliang:Lei is a super genius—he understood in a few sentences. He’s very optimistic about robots. He supported my decision to start a company, but he warned me that starting up is tough, and the most critical thing is people. If you get the right people, a lot of things fall into place. If not, it’s trouble.

Q: There are other robot companies building universal brains for embodied intelligence, but for consumer scenarios. Why are you delivering a full robot product at this stage, instead of just a software/hardware solution like them?

Qiao Zhongliang:An industry without a super app can’t really take off. Before the iPhone, there was no mature smartphone supply chain. Before Tesla, no mature EV supply chain. Someone has to build the super app first, to pull the whole ecosystem together.

Q: After entering, what’s the next step?

Qiao Zhongliang:Our strategy has a few phases. First, build a hit product—that lets us consolidate capabilities: supply chain, controllers, operating systems, model frameworks, interaction services, and brand. Then we can offer those capabilities at reasonable prices to other solution providers, speeding up industry adoption.

Phase one: set an example ourselves. Phase two: capabilities flow down, we start outputting. Phase three: build a real ecosystem.

Q: Why does the “hit product” theory also apply to industrial scenarios?

Qiao Zhongliang:Industry has had hit products for a long time—like the Ford Model T. Xiaomi just formalized the theory. A hit product is an extreme test under fierce competition: can you use all your capabilities to break through on one product and create a massive market? The hit product mindset isn’t limited to certain fields—it’s really about extreme user experience.

Q: One prerequisite in Xiaomi’s hit product theory is the “ant market”—where large brands can’t dominate and small brands struggle to survive. Are there many “ant markets” in industrial settings?

Qiao Zhongliang:There are tons of ant markets in industry. Look at welding—there are over 200 welding equipment makers in China. Each earns maybe 100–200 million yuan, not bad, but they’re all “ants.” Look at robotic arm manufacturers—dozens of names you can think of. Aren’t those ant markets?

Q: In such a market, how much share counts as a “hit product”?

Qiao Zhongliang:Over 50%.

Q: What’s your basis for that?

Qiao Zhongliang:First, experience. In B2B, you either have a monopoly and live well, or you’re all ants—none making money, squeezed by upstream, living hard. If you’re not the absolute leader, your market position is awkward.

Second, logic. Industrial equipment is a productivity tool, bought on ROI calculations. If one machine gives a 40% annual return and another gives 30%, you always pick the 40% one. Such efficiency differences cause markets to consolidate toward the leader, as on the internet. So in industry, specific problems are usually solved by giants, like the “Big Four” industrial robot makers, or da Vinci in robotic surgery.

Building an “Ecosystem” for Industrial Embodied Intelligence: Knowing What to Do and What Not to Do Is Key

Q: Let’s talk about your future business vision. Once you nail welding and grab a big share, you plan to expand to other scenarios (like grinding or spraying) and provide capabilities to partners. That sounds a lot like Xiaomi’s ecosystem model. Why not just handle all scenarios yourself? Why let partners use your solutions, funding, and even marketing?

Qiao Zhongliang:Meituan couldn’t do what Ctrip does, and JD couldn’t do what Pinduoduo does. The differences between industries are huge. Beyond the foundational capabilities, each scenario has deep domain know-how.

Welding, grinding, spraying—their skills are wildly different. Spraying is about even coverage; welding is about precise path control. If I did everything myself, I’d either need tons of business units or make tons of enemies.

In business, you want to make as many friends as possible and work together. That’s what I call a “united front.” It’s about commercial efficiency and ecosystem positioning. Knowing what to do and what not to do is crucial.

Q: To clarify, does your offering include domain know-how for those scenarios, or just general capabilities?

Qiao Zhongliang: The core offering has three parts:

1. “Brain” and “Controller”: the universal foundation model and universal controller. Physically, it’s a device with chips, storage, and communication interfaces.

2. “Hand” and “Eye”: competitive cameras and hands, refined and proven in our own scenarios.

3. Supply chain and service system: we have scale for centralized purchasing to lower costs. We can also provide standardized delivery services and joint marketing.

What do partners do? They bring domain-specific data, know-how, and on-the-ground delivery and service.

Q: How do you ensure partners’ data and know-how are up to standard?

Qiao Zhongliang: We rely on physics simulators and acceptance criteria. It’s like Apple’s App Store: Apple doesn’t develop every app, but it has standards. To get into the ecosystem, you have to meet them.

Q: Is that your final vision for Xiaoyu AI?

Qiao Zhongliang: A final vision is too far off. But I believe that eventually, humans will completely exit physical manufacturing. Ordering, scheduling, producing: the entire production chain will be handled by intelligent agents guiding robots in a closed loop. Humans will set rules and oversee output, no longer doing manual labor.

Q: Besides welding, which scenarios must you keep in-house?

Qiao Zhongliang:We need to build “universal capabilities.” I already have pure vision-driven 3D world understanding. But I’m still missing “force control” and “flexible manipulation.” To give my model multidimensional physical common sense, I need to personally handle scenarios that build those core capabilities.

Q: When you started, embodied intelligence wasn’t hot yet, and the VLA model architecture didn’t exist. How did you design your technical path back then?

Qiao Zhongliang:Initially, the path was 3D reconstruction. 3D reconstruction either uses mathematical methods like 3D Gaussian or model-based methods. We chose model-based, using 3D information to supervise a 2D model, eventually reconstructing the physical world.

We prioritized solving perception first, then planning and control step by step. Think of it as a technical route similar to later-stage autonomous driving.

Q: So you started with autonomous driving technical cognition, which you thought could apply to industrial scenarios. Did the later explosion of large language models help accelerate things?

Qiao Zhongliang:Huge help. First, models have evolved to end-to-end, and thanks to the generalization of foundation models, work that might have taken a team of ten a year can now produce similar results in a month.

Second, thanks to massive resources invested in model layers, open-source models and data are abundant. Where we used to need tens of millions or hundreds of millions of 2D frames, now we can make a good 3D model with millions of frames from an open-source multimodal model.

Q: What tech bottlenecks remain?

Qiao Zhongliang:For welding, the tech and product bottlenecks are gone. Now it’s execution and customer validation, then scaling. The challenge there is building a solid delivery and service system.

After that, the bigger challenge is standardizing the end-to-end components from the first scenario, then replicating that for the second, third, and fourth.

Q: After welding, what’s the next scenario?

Qiao Zhongliang: Grinding or spraying. Spraying is the more certain of the two.

Q: Will you build your own production line as part of your supply chain?

Qiao Zhongliang:We’re still in early scaling, focusing on quick validation. Whether to build our own line later is a business decision.

Q: Is the supply chain mature enough now?

Qiao Zhongliang:It wasn’t when we started. After three years of hands-on work, it’s maturing for the welding scenario.

Q: You mentioned enterprises can recoup their investment in a single device within 10 months. Can that cycle shorten?

Qiao Zhongliang: Yes, for two reasons. First, algorithm improvements boost efficiency: more work per unit time. Second, costs come down. Early on, I used top-tier suppliers, which is expensive. As scale grows or algorithms reduce hardware dependency, we can widen our selection and lower costs.
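
The two levers in this answer can be illustrated with a simple payback calculation. The sketch below is not Xiaoyu AI’s pricing model: the unit cost and monthly value are hypothetical numbers, chosen only so that the baseline matches the 10-month payback mentioned in the question, and the 20% improvements are likewise illustrative.

```python
def payback_months(unit_cost: float, monthly_value: float) -> float:
    """Months needed for the value a device generates to cover its cost."""
    return unit_cost / monthly_value


# Hypothetical baseline, picked only to reproduce a 10-month payback.
UNIT_COST = 300_000.0      # yuan per device, assumed
MONTHLY_VALUE = 30_000.0   # yuan of value generated per month, assumed

baseline = payback_months(UNIT_COST, MONTHLY_VALUE)
# Lever 1: better algorithms, 20% more work per unit time (assumed).
faster = payback_months(UNIT_COST, MONTHLY_VALUE * 1.2)
# Lever 2: cheaper hardware, 20% lower unit cost (assumed), on top of lever 1.
faster_and_cheaper = payback_months(UNIT_COST * 0.8, MONTHLY_VALUE * 1.2)

print(round(baseline, 1), round(faster, 1), round(faster_and_cheaper, 1))
```

With these made-up inputs, either lever alone shortens the cycle and the two compound: payback falls from 10 months to roughly 8.3, then to roughly 6.7.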

Q: What’s your sales forecast for the next two years?

Qiao Zhongliang:This year we expect 1,500–2,000 units. If we break the 1,000-unit mark this year, next year might see 10x growth, heading toward 10,000 units.

“We’re Not in the Mainstream Narrative, But This Path Is Both Right and Righteous”

Q: You come from the internet. What path dependencies did you have to overcome with this startup?

Qiao Zhongliang: The biggest one is the obsession with “speed.” The internet is all about fast iteration. But hardware is strategic: once you define a product and sell it, you can’t easily change it. You have to think way more upfront. But I’m finding that in the AI era, that inertia is changing again, because AI is just so fast.

Another big correction is about talent. I used to think young people are better—new ideas, give them chances. In the AI era, I’ve flipped that. It’s the “second spring for old hands.” AI handles most execution; the critical remaining thing is: can you judge whether the AI output is good? That takes experience—seeing enough right and wrong.

Our top five token consumers (a proxy for how heavily someone uses AI) are all senior engineers with over a decade of experience. Give an old pro AI tools, and their productivity is terrifying.

Q: Early in your startup, embodied intelligence wasn’t hot in VC circles. Were you worried about funding?

Qiao Zhongliang:Absolutely. The scariest thing in a startup is running out of money. As long as you have money, you still have a chance.

Q: How much did you estimate you’d need to get through the first phase?

Qiao Zhongliang:I initially thought about two years and 200 million yuan to break even on cash flow. That was the internet guy’s fast-iteration mindset answering. I’ve revised that.

Q: What’s the revision?

Qiao Zhongliang: From R&D to mature application, I’d say at least five years. And building a truly general product takes billion-level investment.

Q: So overall, in these three years, you haven’t felt too financially strained?

Qiao Zhongliang:Not really. I’m not too anxious about fundraising—my background and what I’m doing have support from many people: Lei Jun, Professor Wang Tianmiao, Li Wanqiang, Cheng Wei, and many industrial investors.

But yeah, we’re not part of that mainstream narrative.

Q: So over the past two years, the huge hype in embodied intelligence—has it affected you?

Qiao Zhongliang:Entrepreneurship comes with noise and distraction—maybe from existing shareholders’ expectations, or from team members looking at the external environment.

We hold a strategy meeting every six months. We ask the same question: are we on the right track? Are we doing it the right way? Every time, the answer is yes.

Q: What dimensions does that question cover? Is the universal brain idea right? Or the welding entry point?

Qiao Zhongliang:If my goal is to build a universal brain, is my current strategic entry point and approach correct? We ask almost every six months. Three and a half years in, we haven’t changed course.

Q: Have there been any opposing voices that made sense, or is everyone pretty aligned?

Qiao Zhongliang:Every time we vote by a show of hands. Without exception, everyone raises their hand.

In my 15-year career, I’ve had my strategic thinking and resilience tempered. So when I started this company, I wrote a 20,000-word draft on my strategic judgments and approach, with very meticulous reasoning.

Q: What directions did it cover?

Qiao Zhongliang:Industry, market, users, competition, ourselves—and what kind of company we want to become, how to become it, and the core bottlenecks.

Q: But you said external noise does affect you. How do you respond?

Qiao Zhongliang:I firmly believe our path is right—both correct and righteous. I do sometimes ask: why are those projects valued so high? Eventually I found some answers. Some people are doing serious work; others are telling stories. Both types get capital support. Capital markets reward stories at certain windows, but they’ll eventually return to products and data.

Q: Do investors understand your thinking?

Qiao Zhongliang:Some really don’t. They advised me: with your background, even making a smart hardware gadget could get you a $1 billion valuation. My reasoning is simple: am I doing this for fame or money? Neither, really.

Ultimately, I want to build something truly successful, something that can deliver.

Q: You worry about it becoming a castle in the air?

Qiao Zhongliang:Yeah, I’ve seen that happen too many times.

Q: As a company not in the mainstream narrative, how do you attract top talent?

Qiao Zhongliang: Talent competition is white-hot right now. The pool of experienced hires is basically empty for startups, and even if you find someone, it’s tough. I use a “horse racing” strategy: take the salary budget for second-tier experienced hires and use it to recruit the best fresh graduates. But more fundamentally, the work itself has to be credible and real. That’s what makes people want to join.

Q: Where do you think embodied intelligence is in its development cycle?

Qiao Zhongliang:Like new energy vehicles before 2015. Back then, Tesla had just launched three models, the three-electric system and operating system were in place, assisted driving was onboard, and the basic supply chain was forming. But robots haven’t reached that state yet.

The wild card is AI. In terms of development cycles, I think industrial applications will develop fast—3–5 years to an explosion period. Service sectors may take 5–10 years. Home scenarios at least 10 years.

Q: You said you learned three things at Xiaomi: “hit product,” “mass line,” and “united front.” Are they just nice-to-haves for your current venture, or decisive?

Qiao Zhongliang:Those are methodologies. What I feel most now is the “muscle memory” forged under extreme pressure—like sensitivity to users, how to solve tough problems, and repeating it over and over.

Q: So at Xiaomi, you completed the pre-training of your own foundation model?

Qiao Zhongliang:Exactly. You hit the nail on the head.
