Author: Paul Veradittakit, Partner at Pantera Capital; Translation: Jinse Finance xiaozou
summary:
VLA innovation and economies of scale are driving the creation of affordable, efficient, and versatile humanoid robots.
As warehouse robots expand into the consumer robot market, robot safety, financing and evaluation mechanisms deserve further exploration.
Crypto will advance the robotics industry by providing economic guarantees for robot safety and by optimizing docking infrastructure, latency, and data collection.
ChatGPT completely rewrote human expectations of artificial intelligence. When large language models began to interact with the external software world, many people thought that AI agents were the ultimate form. But if you look back at classic science fiction movies such as "Star Wars", "Blade Runner" or "RoboCop", you will find that what humans really dream of is artificial intelligence that interacts with the physical world in the form of robots.
Pantera Capital believes that the "ChatGPT moment" for robotics is coming. We will first analyze how breakthroughs in artificial intelligence have changed the industry landscape over the past few years, and then explore how improvements in battery technology, latency, and data collection will shape the future, as well as the role crypto can play. Finally, we will explain why we believe that robot safety, financing, evaluation, and education are the verticals that deserve attention.
1. Elements of change
( 1 ) Breakthroughs in artificial intelligence
Advances in the field of multimodal large language models are giving robots the "brains" they need to perform complex tasks. Robots perceive their environment primarily through vision and hearing.
Traditional computer vision models (such as convolutional neural networks) are good at object detection and classification, but they have difficulty converting visual information into purposeful action instructions. Large language models perform well at understanding and generating text, but they are limited in their ability to perceive the physical world.
With Vision-Language-Action (VLA) models, robots can integrate visual perception, language understanding, and physical action in a unified computing framework. In February 2025, Figure AI released Helix, a general-purpose control model for humanoid robots. This VLA model sets a new benchmark for the industry with its zero-shot generalization capabilities and its dual System 1/System 2 architecture. Zero-shot generalization allows robots to adapt to new scenarios, new objects, and new instructions without being retrained for each task. The System 1/System 2 architecture separates high-order reasoning from lightweight reasoning, giving a commercial humanoid robot both human-like thinking and real-time accuracy.
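One way to picture a System 1/System 2 split is as two loops running at different rates: a slow loop that updates the goal and a fast loop that turns the latest goal into motor commands. The sketch below is purely illustrative. The rates, the class name, and the placeholder planner and policy are all assumptions made for this example and do not describe Figure AI's implementation.

```python
from dataclasses import dataclass


@dataclass
class DualRateController:
    """Toy System 1 / System 2 split: a slow planner feeds a fast policy."""
    fast_hz: int = 200   # assumed rate of the lightweight control loop
    slow_hz: int = 10    # assumed rate of the high-order reasoning loop
    goal: str = "idle"
    slow_calls: int = 0
    fast_calls: int = 0

    def slow_reason(self, instruction: str, tick: int) -> str:
        # Placeholder for the heavy vision-language step (hypothetical).
        self.slow_calls += 1
        return f"{instruction}@{tick}"

    def fast_act(self, goal: str) -> str:
        # Placeholder for the lightweight reactive step (hypothetical).
        self.fast_calls += 1
        return f"motor_command_for({goal})"

    def run(self, instruction: str, seconds: float) -> list[str]:
        commands = []
        ticks_per_plan = self.fast_hz // self.slow_hz
        for tick in range(int(seconds * self.fast_hz)):
            # The slow loop refreshes the goal only every few fast ticks.
            if tick % ticks_per_plan == 0:
                self.goal = self.slow_reason(instruction, tick)
            # The fast loop always acts on the most recent goal.
            commands.append(self.fast_act(self.goal))
        return commands


controller = DualRateController()
commands = controller.run("pick up the cup", seconds=1.0)
print(controller.slow_calls, controller.fast_calls)
```

The point of the design is that the expensive reasoning step never blocks the control loop: the fast policy keeps issuing commands against the last goal it was given.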
( 2 ) Economical robots become a reality
Technologies that change the world all have one thing in common: accessibility. Smartphones, personal computers, and 3D printing all reached prices that the middle class could afford. When robots like the Unitree G1 cost less than a Honda Accord sedan or the minimum annual income of $34,000 in the United States, it is easy to imagine a world where manual labor and daily tasks are largely performed by robots.
( 3 ) From warehousing to consumer market
Robotics is expanding from warehouse solutions into the consumer sector. The world is designed for humans: humans can do all the work of specialized robots, but specialized robots cannot do all the work of humans. Robotics companies are moving beyond building factory-specific robots and toward developing more general-purpose humanoid robots. As a result, the frontier of robotics will not be confined to warehouses but will also permeate everyday life.
Cost is one of the main bottlenecks of scalability. The indicator we are most concerned about is the comprehensive cost per hour, which is calculated as the sum of the opportunity cost of training and charging time, the cost of task execution and the cost of purchasing the robot, divided by the total operating time of the robot. This cost must be lower than the average wage level of the relevant industry to be competitive.
To fully penetrate the warehousing sector, the comprehensive cost of robots per hour must be less than $31.39. In the largest consumer market, private education and health services, the cost must be controlled below $35.18. Currently, robots are moving towards becoming cheaper, more efficient and more versatile.
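The comprehensive cost per hour described above can be written as a small calculation. The sketch below follows the definition in the text (opportunity cost of training and charging time, plus task execution cost, plus purchase cost, divided by total operating time) and compares the result with the two wage benchmarks quoted. The field names and every figure in the example robot are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass

# Wage benchmarks quoted in the article (USD per hour).
WAREHOUSING_WAGE = 31.39
EDUCATION_HEALTH_WAGE = 35.18


@dataclass
class RobotCosts:
    purchase_cost: float              # USD paid for the robot
    task_execution_cost: float        # USD spent running tasks over its life
    downtime_opportunity_cost: float  # USD lost while training and charging
    operating_hours: float            # total hours the robot actually works


def hourly_cost(costs: RobotCosts) -> float:
    """Comprehensive cost per hour: all costs divided by operating time."""
    if costs.operating_hours <= 0:
        raise ValueError("operating_hours must be positive")
    total = (costs.purchase_cost
             + costs.task_execution_cost
             + costs.downtime_opportunity_cost)
    return total / costs.operating_hours


def is_competitive(costs: RobotCosts, wage: float) -> bool:
    """A robot is competitive when its hourly cost is below the wage."""
    return hourly_cost(costs) < wage


# Hypothetical robot: none of these figures come from the article.
example = RobotCosts(
    purchase_cost=16_000,
    task_execution_cost=20_000,
    downtime_opportunity_cost=9_000,
    operating_hours=5_000,
)
print(f"${hourly_cost(example):.2f} per hour")
print("competitive in warehousing:", is_competitive(example, WAREHOUSING_WAGE))
```

Because the purchase cost is spread over operating hours, the same robot becomes more competitive the longer it stays in service and the less time it spends charging.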
2. The next breakthrough in robotics
( 1 ) Battery optimization
Battery technology has always been a bottleneck for user-friendly robots. Early electric vehicles such as the BMW i3 struggled to gain wide adoption because the battery technology of the time meant short range, high cost, and low practicality. Robots face the same dilemma. Boston Dynamics' Spot robot runs for only 90 minutes on a single charge, and the Unitree G1 for about 2 hours. Users are obviously unwilling to recharge a robot manually every two hours, so autonomous charging and docking infrastructure have become a key development direction. At present, there are two main approaches to robot charging: battery swapping and direct charging.
Battery swapping enables continuous operation by quickly replacing depleted battery packs, which minimizes downtime and suits field or factory scenarios. The swap can be done manually or automatically.
Inductive charging delivers power wirelessly. A full charge takes longer, but the process is easy to automate end to end.
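The trade-off between the two charging modes can be made concrete by looking at availability, meaning the share of wall-clock time a robot spends working. The sketch below uses the roughly 2-hour runtime quoted for the Unitree G1. The swap time and the inductive charging time are hypothetical placeholders, since the article gives no figures for either.

```python
def availability(run_minutes: float, downtime_minutes: float) -> float:
    """Fraction of wall-clock time the robot spends working."""
    return run_minutes / (run_minutes + downtime_minutes)


RUN_MINUTES = 120  # about 2 hours per charge, as quoted for the Unitree G1

# Hypothetical downtime per cycle; neither figure comes from the article.
SWAP_MINUTES = 5
INDUCTIVE_MINUTES = 90

swap = availability(RUN_MINUTES, SWAP_MINUTES)
inductive = availability(RUN_MINUTES, INDUCTIVE_MINUTES)
print(f"battery swap: {swap:.0%}, inductive charging: {inductive:.0%}")
```

Under these assumed numbers, swapping keeps the robot working far more of the time, while inductive charging gives up availability in exchange for a process that needs no human in the loop.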
( 2 ) Latency optimization
Low-latency operations can be divided into two categories: environmental perception and remote control. Perception refers to the robot's spatial cognition of the environment, while remote control specifically refers to the real-time control of a human operator.
According to Citrini Research, robot perception systems start with cheap sensors, but the technological moat lies in the integration of software, low-power computing, and control loops with millisecond-level precision. Once the robot has localized itself in space, a lightweight neural network labels obstacles, pallets, or humans. These scene labels are fed into the planning system, which immediately generates motor commands for the feet, wheels, or robotic arms. A perception latency of less than 50 milliseconds is comparable to human reflex speed; any delay beyond this threshold makes the robot move clumsily. Therefore, 90% of decisions need to be made locally through a single vision-language-action network.
Fully autonomous robots need to ensure that the latency of high-performance VLA models is less than 50 milliseconds; remote-controlled robots require that the signal latency between the operator and the robot does not exceed 50 milliseconds. The importance of the VLA model is particularly prominent here - if the visual and text inputs are processed by different models and then input into a large language model, the overall latency will far exceed the 50 millisecond threshold.
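The 50 millisecond threshold can be treated as a budget that every stage of the pipeline draws from. The sketch below sums per-stage latencies for a unified VLA pipeline and for a chain of separate models and checks each against the budget. The stage names and timings are hypothetical and serve only to illustrate why chaining models tends to overrun the limit.

```python
LATENCY_BUDGET_MS = 50.0  # threshold quoted in the article


def total_latency_ms(stages: dict[str, float]) -> float:
    """End-to-end latency is the sum of the sequential stage latencies."""
    return sum(stages.values())


def within_budget(stages: dict[str, float],
                  budget_ms: float = LATENCY_BUDGET_MS) -> bool:
    return total_latency_ms(stages) < budget_ms


# Hypothetical timings in milliseconds; not measurements of any real system.
unified_vla = {
    "camera_capture": 8.0,
    "vla_inference": 30.0,
    "motor_command": 5.0,
}
chained_models = {
    "camera_capture": 8.0,
    "vision_model": 25.0,
    "text_encoder": 10.0,
    "llm_inference": 120.0,
    "motor_command": 5.0,
}

for name, stages in [("unified VLA", unified_vla), ("chained", chained_models)]:
    print(name, total_latency_ms(stages), "ms, ok:", within_budget(stages))
```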
( 3 ) Data collection optimization
There are three main sources of data: real-world video data, synthetic data, and remote control data. The core bottleneck for real-world video and synthetic data is the gap between the robot's physical behavior and the video or simulation model. Real-world video data lacks physical details such as force feedback, joint motion errors, and material deformation, while simulation data lacks unpredictable variables such as sensor failures and friction coefficients.
The most promising data collection method is remote control, where a human operator remotely controls the robot to perform tasks. However, labor costs are the main limiting factor for remote control data collection.
Customized hardware development is also providing new solutions for high-quality data collection. Mecka combines mainstream methods with customized hardware to collect multi-dimensional human motion data, which is processed and converted into data sets suitable for robot neural network training. Combined with a fast iteration cycle, it provides massive amounts of high-quality data for AI robot training. Together, these technical pipelines shorten the conversion path from raw data to deployable robots.
3. Key areas of exploration
( 1 ) Integration of crypto and robotics
Crypto can incentivize parties that do not trust one another to improve the efficiency of robot networks. Based on the key areas mentioned above, we believe that crypto can improve efficiency in three aspects: docking infrastructure, latency optimization, and data collection.
The Decentralized Physical Infrastructure Network (DePIN) is expected to revolutionize charging infrastructure. When humanoid robots run around the world like cars, charging stations need to be as accessible as gas stations. Centralized networks require huge upfront investments, while DePIN spreads the costs among node operators, allowing charging facilities to expand rapidly to more areas.
DePIN can also optimize remote control latency by leveraging distributed infrastructure. By aggregating the computing resources of geographically dispersed edge nodes, remote control commands can be processed by a local or nearest available node, minimizing data transmission distance and significantly reducing communication latency. However, current DePIN projects mainly focus on decentralized storage, content distribution, and bandwidth sharing. Although some projects demonstrate the advantages of edge computing in streaming media or the Internet of Things, they have not yet extended to robotics or remote control.
Remote control is the most promising way to collect data, but hiring professionals to collect it is extremely costly for a centralized entity. DePIN addresses this problem by using crypto tokens to incentivize third parties to provide remote control data. The Reborn project is building a global network of remote operators, converting their contributions into tokenized digital assets and forming a permissionless, decentralized system in which participants can not only earn rewards but also take part in governance and contribute to training AGI robots.
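Routing commands to the nearest node can be sketched with a great-circle distance and a rough propagation estimate. The example below picks the closest edge node to a robot and estimates the round-trip propagation delay, assuming signals travel through optical fiber at roughly 200,000 km per second. The node list and coordinates are hypothetical, and the estimate is only a lower bound because it ignores routing detours, queuing, and processing time.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0
FIBER_SPEED_KM_PER_MS = 200.0  # roughly two thirds of the speed of light


@dataclass(frozen=True)
class EdgeNode:
    name: str
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip delay from propagation alone."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS


def nearest_node(lat: float, lon: float, nodes: list[EdgeNode]) -> EdgeNode:
    return min(nodes, key=lambda n: haversine_km(lat, lon, n.lat, n.lon))


# Hypothetical node operators and a robot located in Paris.
nodes = [
    EdgeNode("frankfurt", 50.11, 8.68),
    EdgeNode("new_york", 40.71, -74.01),
    EdgeNode("tokyo", 35.68, 139.69),
]
robot_lat, robot_lon = 48.86, 2.35

best = nearest_node(robot_lat, robot_lon, nodes)
best_rtt = round_trip_ms(haversine_km(robot_lat, robot_lon, best.lat, best.lon))
print(f"{best.name}: at least {best_rtt:.1f} ms round trip")
```

In this setup the nearby node leaves most of a 50 millisecond budget available for processing, while a transatlantic node would exceed it on propagation delay alone.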
( 2 ) Safety is always the core concern
The ultimate goal of robotics is to achieve full autonomy, but as the Terminator series of movies warns, the last thing humans want to see is autonomy turning robots into offensive weapons. The safety of large language models has attracted attention, and when these models have the ability to take physical actions, robot safety becomes a key prerequisite for social acceptance.
Economic security is one of the pillars of the prosperity of the robot ecosystem. OpenMind, a company in this field, is building FABRIC, a decentralized machine coordination layer that uses cryptographic proofs to achieve device identity authentication, physical presence verification, and resource acquisition. Unlike simple task market management, FABRIC enables robots to independently prove identity information, geographic location, and behavior records without relying on centralized intermediaries.
Behavioral constraints and identity authentication are enforced through on-chain mechanisms, ensuring that anyone can audit compliance. Robots that meet safety standards, quality requirements, and regional regulations will be rewarded, while violators will face penalties or disqualification, thus establishing accountability and trust mechanisms in autonomous machine networks.
Third-party re-staking networks (such as Symbiotic) can also provide equivalent security guarantees. Although the penalty parameter system still needs to be improved, the relevant technology has entered the practical stage. We expect that industry security guidelines will be formed soon, and the penalty parameters will be modeled according to these guidelines.
Example implementation:
A robotics company joins the Symbiotic network.
It sets verifiable penalty parameters (e.g. "applies a contact force exceeding 2,500 newtons to a human").
Stakers post a deposit to guarantee that the robot adheres to those parameters.
In the event of a violation, the deposit is used to compensate the victims.
This model not only incentivizes companies to put security first, but also promotes consumer acceptance through the insurance mechanism of the pledge fund pool.
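The staking flow outlined above can be sketched as a small in-memory model. The 2,500 newton limit is the example parameter from the text. The class, the pro rata slashing rule, and the 50% slash fraction are assumptions made for illustration; this is not Symbiotic's API or its actual penalty logic.

```python
from dataclasses import dataclass, field

MAX_CONTACT_FORCE_N = 2500.0  # example penalty parameter from the article


@dataclass
class StakePool:
    """Toy pool of deposits backing one robot's safety guarantee."""
    deposits: dict[str, float] = field(default_factory=dict)
    compensation_paid: float = 0.0

    def stake(self, staker: str, amount: float) -> None:
        if amount <= 0:
            raise ValueError("stake must be positive")
        self.deposits[staker] = self.deposits.get(staker, 0.0) + amount

    @property
    def total(self) -> float:
        return sum(self.deposits.values())

    def report_contact(self, force_newtons: float,
                       slash_fraction: float = 0.5) -> float:
        """Slash every staker pro rata on a violation; return the payout.

        slash_fraction is a hypothetical parameter chosen for this sketch.
        """
        if force_newtons <= MAX_CONTACT_FORCE_N:
            return 0.0
        payout = 0.0
        for staker, amount in self.deposits.items():
            cut = amount * slash_fraction
            self.deposits[staker] = amount - cut
            payout += cut
        self.compensation_paid += payout
        return payout


pool = StakePool()
pool.stake("alice", 600.0)
pool.stake("bob", 400.0)
safe_payout = pool.report_contact(1200.0)       # below the limit, no slashing
violation_payout = pool.report_contact(3000.0)  # above the limit, stakers slashed
print(safe_payout, violation_payout, pool.total)
```

A real deployment would also need a trusted way to measure and attest the contact force, which is the part the text describes as still needing agreed industry guidelines.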
The Symbiotic team’s insights into the field of robotics are:
The Symbiotic Universal Staking Framework aims to extend the concept of staking to all areas that require economic security endorsement, whether through shared or independent models. Its application scenarios range from insurance to robotics and require specific design for specific cases. For example, a robotics network can be built entirely based on the Symbiotic framework, allowing stakeholders to provide economic guarantees for the integrity of the network.
4. Filling the gaps in the robotics technology stack
OpenAI has promoted the popularization of AI, but the foundation for the ChatGPT moment had already been laid. Cloud services freed models from their dependence on local computing power, Hugging Face made open-source models widely available, and Kaggle provided an experimental platform for AI engineers. Together, these incremental breakthroughs made AI broadly accessible.
Unlike AI, robotics is difficult to enter when funding is limited. To make robots as widespread, the barrier to development needs to be lowered until it is as convenient as AI application development. We believe there is room for improvement in three areas: financing mechanisms, evaluation systems, and the education ecosystem.
Financing is a pain point in the field of robotics. To develop a computer program, you only need a computer and cloud computing resources, but to build a fully functional robot, you must purchase hardware such as motors, sensors, and batteries, and the cost can easily exceed $100,000. This hardware attribute makes robot development less flexible and more expensive than AI.
The evaluation infrastructure for robots in real-world scenarios is still in its infancy. A clear loss function system has been established in the AI field, and testing can be completely virtualized. However, excellent virtual strategies cannot be directly converted into effective solutions in the real world. Robots need evaluation facilities to test autonomous strategies in diverse real-world environments in order to achieve iterative optimization.
When these infrastructures mature, talent will flood in, and humanoid robots will repeat the explosion curve of Web2. Crypto robotics company OpenMind is moving in this direction - its open source project OM1 ("Android for Robots") transforms raw hardware into an economically aware, upgradeable intelligent agent. Vision, language, and motion planning modules can be plug-and-play like mobile phone apps, and all reasoning steps are presented in plain English, allowing operators to audit or adjust behavior without touching the firmware. This natural language reasoning capability allows a new generation of talent to enter the field of robotics seamlessly, taking a key step towards an open platform that will ignite the robotics revolution, just as the open source movement has accelerated AI.
Talent density determines the trajectory of the industry. A structured, inclusive education system is crucial to the talent pipeline in robotics. OpenMind's listing on Nasdaq marks the beginning of a new era in which intelligent machines participate in both financial innovation and physical education. OpenMind and Robostore jointly announced that they will launch the first general education course based on the Unitree G1 humanoid robot in K-12 public schools in the United States. The course design is platform-independent and can be adapted to various robot forms, giving students hands-on experience. This positive signal strengthens our judgment: within the next few years, robotics education resources will be as rich as those in the AI field.
5. Future Outlook
Innovations in the Vision-Language-Action (VLA) model and economies of scale have led to affordable, efficient, and versatile humanoid robots. As warehouse robots expand into the consumer market, safety, financing models, and evaluation systems become key areas of exploration. We firmly believe that crypto will drive the development of robots through three paths: providing economic guarantees for safety, optimizing charging infrastructure, and improving latency performance and data collection pipelines.