Humanoid robots, also known as anthropomorphic robots, possess human-like perception, decision-making, behavior, and interaction capabilities. They have human-like appearances, sensory systems, intelligent thinking methods, control systems, and decision-making abilities, ultimately exhibiting "human-like behavior."
- Humanoid robots span engineering and control science, integrating research achievements from fields such as electronics, mechanics, automation control, and computer science. Humanoid functions cannot be achieved simply by purchasing and assembling components.
- Humanoid robots are classified by height into large humanoid robots and medium-small humanoid robots.
Research on humanoid robots began in Japan and has now entered a high-dynamic motion development stage. The development history of humanoid robots can be divided into three stages:
- The first stage: The early development phase represented by the humanoid robot from Waseda University.
- The second stage: The system integration development phase represented by Honda's humanoid robot.
- The third stage: The high-dynamic motion development phase represented by Boston Dynamics' humanoid robot.
Japan was the first to initiate research on humanoid robots, achieving bipedal walking.
- In 1971, Professor Kato from Waseda University introduced the hydraulic system-based bipedal robots WL-3 and WL-5, achieving a stride length of 15 cm and a cycle time of 45 seconds for static walking.
- Subsequently, the motor-driven WL-9R and WL-10DR were designed, achieving dynamic walking with an ankle joint torque control, reducing the single-step cycle to 1.3 seconds.
- In 2006, Professor Kato's student, Atsuo Takanishi, introduced the humanoid robot WABIAN-2R (with 41 degrees of freedom), achieving a walking speed of 1.8 km/h and adapting to various ground surfaces.
Honda's ASIMO represented the most advanced technology of its time.
- In 1996, Japan's Honda developed the first humanoid robot P1, followed by P2, which could walk on ordinary roads, and subsequently P3.
- On November 12, 2000, ASIMO, the most representative motor-controlled bipedal robot, was released, standing 120 cm tall, weighing 52 kg, and walking at speeds of 0 to 1.6 km/h.
- The third-generation ASIMO was released in 2011. It reaches a running speed of up to 9 km/h, can climb stairs, kick a soccer ball with one leg, and hop on one leg, and has a continuously adjustable stride. With 57 degrees of freedom, it is suitable for service robot applications in fixed environments.
Cassie reflects a new drive design, enriching the drive technology route.
- In 1997, Grizzle and others from the University of Michigan developed the underactuated bipedal robot RABBIT, which can achieve dynamic walking without feet.
- Based on RABBIT, a series of underactuated walking robots such as MABEL, MARLO, and ATRIAS were developed, achieving three-dimensional underactuated walking.
- In 2017, the robot Cassie was released, priced at about $70,000, with its drive motors positioned high, incorporating springs in the legs to achieve efficient gait while being able to stand still.
- In 2022, Digit was launched based on Cassie, featuring robust walking and running gaits, capable of climbing stairs and autonomous navigation perception, applicable for package handling.
Source: CNKI, Zhejiang University.
The HRP series robots can achieve stable walking and collaborate with humans.
- In 1998, the National Institute of Advanced Industrial Science and Technology in Japan began leading the HRP series project, aimed at developing "humanoid robot systems that can coordinate and coexist with humans in human work and living environments, capable of completing complex tasks."
- HRP-2 and HRP-3 can walk stably and perform various dexterous movements (such as Japanese dance), collaborate with humans to lift objects, overcome obstacles, pick up objects from the ground, protect themselves when falling, and stand up again.
Source: Company website, Zhejiang University, CITIC Construction Investment.
Atlas uses a self-designed hydraulic drive system, with the world's leading motion capabilities.
- Boston Dynamics developed the hydraulic-driven quadruped robot BigDog under the funding of the Defense Advanced Research Projects Agency (DARPA).
- In October 2009, Boston Dynamics released PETMAN, a robot built to test experimental protective clothing for the U.S. military. It has strong self-balancing capability and motion performance, and can adjust its gait to maintain balance under external disturbances.
- Since its release in 2013, Atlas has undergone three major iterations, with its all-terrain adaptability representing the current highest level.
IIT launched WALK-MAN, influential in Europe.
- IIT's WALK-MAN firefighting robot incorporates force control to form torque-controlled hand joints, sacrificing some rigidity of the robot.
- In 2008, IIT manufactured the open-source humanoid robot iCub for research on perceptual learning and human-robot interaction, featuring excellent human-robot interaction capabilities. It is designed to resemble a three-and-a-half-year-old child, standing 1 meter tall, with 53 degrees of freedom, capable of walking and balancing on one leg.
- In 2012, the bipedal robot COMAN was developed, with series elastic actuators (SEA) used for all joints in the sagittal plane.
Swiss research institutions utilize passive flexibility to further enhance jumping and terrain adaptability.
- In 2011, the Robotics and Intelligent Systems Laboratory at ETH Zurich developed the single-leg robot ScarlETH based on SEA joints, utilizing the robot's passive flexibility to achieve high-energy-efficient jumping and terrain adaptability.
- Building on this, the motor-driven quadruped robots StarlETH and ANYmal were developed.
HUBO won first place in the DRC competition, promoting research and development in Asia.
- The bipedal robot HUBO from KAIST won first place in the 2015 DRC competition with its hybrid movement method of wheels and feet.
- With the help of Rainbow Robotics, HUBO2 became the world's first commercial humanoid robot platform. It has been purchased by leading research institutions (MIT, Google, etc.) as a research platform.
- The "HUBO2" robot can walk at a speed of 1.4 km/h with straight knees and run at a speed of 3.6 km/h.
The University of Tokyo launched a new version of Schaft, reducing costs and energy consumption.
- In 2013, the humanoid robot team Schaft, which was acquired by Google, won the DRC 2013 competition. Its robot stands 1480 mm tall, weighs 95 kg, has an arm span of 1309 mm, and can walk and climb stairs.
- In 2016, a new low-cost, low-energy humanoid robot was released, capable of carrying 66 kg.
Small humanoid robot development is in full swing, enriching and expanding application scenarios.
- France's Aldebaran Robotics launched the representative NAO robot, which has sold over 10,000 units. The company has consistently pursued commercialization, in marked contrast to Boston Dynamics and ASIMO, and was later acquired by Japan's SoftBank. It subsequently launched the Pepper and Romeo robots.
- Among small bipedal robots under 50 cm in height, Korea's Robotis company is well-known for its Darwin-OP robot, which can walk stably and recognize colors.
- Korea's Hitec launched Robonova-1, and China's Leju (Shenzhen) Robotics launched the "Aelos" robot.
Domestic research on humanoid robots started relatively late, mainly led by universities and research institutions.
- Tsinghua University, Zhejiang University, Shanghai Jiao Tong University, Beijing Institute of Technology, and the Chinese Academy of Sciences have also successively conducted research on humanoid robots.
- The National University of Defense Technology started early, developing the "Pioneer" in 2000 and the Blackman in 2003. The latter stands 1.55 m tall, weighs 63.5 kg, has 36 degrees of freedom, and reaches a maximum walking speed of 1 km/h. The university has also conducted in-depth research on robot turning and walking on uneven surfaces.
- In 2002, Tsinghua developed the THBIP-I robot, standing 1.7 m tall, weighing 130 kg, capable of stable walking and climbing stairs.
- In 2002, Beijing Institute of Technology launched the "BHR-1," which achieved independent walking without external cables for the first time; in 2005, BHR-2 achieved breakthroughs in stable walking and complex motion planning.
Domestic humanoid robot research started relatively late, mainly led by universities and military research.
An average adult typically has 206 bones and nearly 230 joints, forming 244 degrees of freedom controlled by 630 muscles.
- Creating a precise model of the human body would be extremely complex. Hanavan proposed a simplified human model, usually divided into 15 segments: the head, upper and lower torso, upper arms, forearms, hands, thighs, lower legs, and feet.
- Humanoid robots are highly flexible, strong nonlinear dynamic systems, typically analyzed using a combination of multi-body dynamics systems and numerical simulations for dynamics and kinematics analysis.
- Robot motion analysis includes dynamics analysis and kinematics analysis, with kinematics divided into forward kinematics and inverse kinematics.
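The forward/inverse kinematics split can be illustrated with a minimal sketch of a planar two-link leg (thigh and shank, with hip and knee pitch joints). The link lengths, angle conventions, and function names are illustrative assumptions and are not taken from any robot described here; a real humanoid leg has more joints and usually needs a numerical solver.

```python
import math


def forward_kinematics(l1, l2, q1, q2):
    """Foot position (x, y) from hip angle q1 and knee angle q2 (radians)."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(l1, l2, x, y, knee_sign=1.0):
    """Joint angles (q1, q2) that place the foot at (x, y).

    knee_sign selects one of the two mirror-image solutions.
    Raises ValueError when the target lies outside the reachable workspace.
    """
    d2 = x * x + y * y
    # Law of cosines gives the cosine of the knee angle.
    c2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0 + 1e-12:
        raise ValueError("target out of reach")
    c2 = max(-1.0, min(1.0, c2))  # guard against rounding error
    q2 = knee_sign * math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2


# Round trip: joint angles -> foot position -> joint angles.
foot = forward_kinematics(0.4, 0.4, 0.3, 0.8)
angles = inverse_kinematics(0.4, 0.4, foot[0], foot[1])
```

Forward kinematics has a single answer, while inverse kinematics generally has several (here two, selected by `knee_sign`) or none, which is one reason inverse kinematics is the harder half of the problem.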
Humanoid robots not only possess some human-like shapes, such as upper limbs and heads, but should also have human-like lower limb structures and bipedal walking capabilities.
- In the design process of bionic mechanisms, the degrees of freedom are first determined based on target specifications, followed by the types and numbers of joints, typically consisting of multiple single-degree-of-freedom rotational joints.
- Sensors are typically used to simulate human perception of the environment, such as machine vision, pressure sensors, touch sensors, directional microphones, and sonar rangefinders.
- The NAO robot has a total of 25 drive motors, 2 cameras, 9 touch sensors, 4 directional microphones, 8 pressure sensors, 2 sets of infrared emitters and receivers, and sonar rangefinders.
Joint drive route one: Hydraulic drive offers high power and high burst power.
- Advantages: High output power, no need for a reducer, high force, high burst power, and strong resistance to mechanical shocks and damage.
- Disadvantages: Hydraulic systems are prone to oil leaks, bulky, noisy, and power-hungry, and they require a hydraulic source.
Joint drive route two: Motor drive is the most traditional, with a simple structure and wide application.
- Advantages: Simple structure, precise position servo.
- Disadvantages: Poor torque servo, high transmission loss, and lower burst power than hydraulic drives.
Joint drive route two: Motor drive with a compliant (elastic) element improves energy storage and recycling capability.
- Advantages: High torque precision, passive flexibility, and the ability to achieve energy storage cycles.
- Disadvantages: Poor position servo and limited response bandwidth.
Joint drive route two: Direct motor drive scheme achieves high position accuracy and fast response.
- Advantages: High torque precision, high position accuracy, and fast response.
- Disadvantages: Motors need to be custom-made, and motor size is large.
Joint drive route three: Pneumatic drive is lightweight and low-cost, but control precision is not high.
- Advantages: Pneumatic artificial muscles are lightweight, low-cost, easy to maintain, and have a larger power-to-volume ratio and power-to-weight ratio compared to cylinders.
- Disadvantages: Control precision is not high, work efficiency is relatively low, and work speed stability is poor.
Each of the three drive methods has its characteristics; motor drive is the most traditional, while hydraulic drive is the most expensive.
- Motor drive is the most traditional method; its technology is evolving rapidly and it is widely applied worldwide. Hydraulic drive is complex, and hydraulic valves are difficult to build, which makes the system very expensive, but it provides the best robot motion performance. Pneumatic drive performance falls between hydraulic drive and direct motor drive, and it is currently used relatively little.
Balance control directly affects walking performance, and companies typically develop core control algorithms independently.
- The core issues of robot state estimation include: sensor selection and layout, sensor data calibration, modeling of the robot body, and multi-sensor data fusion.
- In the design selection of controllers, control strategies are typically chosen based on the robot's own state and model, followed by executing control commands. The design of the controller is the most critical part of robot design.
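One of the simplest forms of multi-sensor data fusion for state estimation is a complementary filter that blends a gyroscope (accurate over short intervals, but drifting) with an accelerometer (noisy, but drift-free) to estimate torso pitch. The sketch below is an assumption-laden illustration: the axis convention, the blending factor `alpha`, and the function names are hypothetical, and production humanoids typically use more elaborate estimators.

```python
import math


def accel_pitch(ax, az):
    """Pitch (rad) implied by the gravity direction seen by the accelerometer.

    Assumes the robot is not accelerating, so the reading is gravity only.
    """
    return math.atan2(ax, az)


def complementary_filter(pitch, gyro_rate, ax, az, dt, alpha=0.98):
    """One filter step: trust the integrated gyro short-term, the accelerometer long-term."""
    gyro_estimate = pitch + gyro_rate * dt
    return alpha * gyro_estimate + (1.0 - alpha) * accel_pitch(ax, az)


# Stationary robot tilted by 0.1 rad, starting from a wrong estimate of 0.
g = 9.81
true_pitch = 0.1
pitch = 0.0
for _ in range(500):
    pitch = complementary_filter(
        pitch, 0.0, g * math.sin(true_pitch), g * math.cos(true_pitch), dt=0.01
    )
```

A larger `alpha` follows the gyroscope more closely and rejects accelerometer noise better, at the cost of correcting drift more slowly.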
To achieve good human-robot interaction performance, algorithms, AI technology, and sensors are essential.
- In motion planning and interaction design based on environmental perception, a good understanding and recognition of the environment are needed, calculating feasible areas, reasonably selecting contact points (e.g., bipedal, two hands, or using hands and feet), as well as selecting stride lengths and optimizing models.
Humanoid robot batteries: Estimating the basic parameters of battery packs from limited performance indicators.
- Boston Dynamics' Atlas robot has a maximum power of 5 kW and an overall weight of 80 kg. Its 48 V lithium-ion battery pack weighs 5-10 kg, with an estimated mass energy density of 200-250 Wh/kg and an estimated volume energy density of 500 Wh/L. This implies a discharge rate of 2C-5C, a volume of 2-5 L, a mass power density of 0.5-1 kW/kg, and a volume power density of 1-2.5 kW/L.
- Based on the performance ranges of mass energy density and mass power density, we estimate that the battery pack used by Atlas is similar to high-performance power battery packs.
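The derived figures follow from the stated inputs by simple arithmetic, which the sketch below reproduces for the two ends of the range. The function and variable names are illustrative; the inputs are the estimates quoted above, not measured data.

```python
def pack_estimates(peak_power_w, mass_kg, wh_per_kg, wh_per_l):
    """Derive pack energy, volume, C-rate, and power densities from basic inputs."""
    energy_wh = mass_kg * wh_per_kg
    volume_l = energy_wh / wh_per_l
    return {
        "energy_wh": energy_wh,
        "volume_l": volume_l,
        # C-rate: peak power divided by stored energy (1C empties the pack in one hour).
        "c_rate": peak_power_w / energy_wh,
        "kw_per_kg": peak_power_w / 1000.0 / mass_kg,
        "kw_per_l": peak_power_w / 1000.0 / volume_l,
    }


# Lightest, least energy-dense pack versus heaviest, most energy-dense pack.
light = pack_estimates(5000.0, 5.0, 200.0, 500.0)
heavy = pack_estimates(5000.0, 10.0, 250.0, 500.0)
```

The light pack gives 1 kWh, 2 L, 5C, 1 kW/kg, and 2.5 kW/L; the heavy pack gives 2.5 kWh, 5 L, 2C, 0.5 kW/kg, and 1 kW/L, matching the ranges in the text.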
Latest developments in power batteries: CTP3.0 "Qilin Battery" is on the horizon.
- According to CATL's official website, the "Qilin Battery" using CTP3.0 technology can achieve a mass energy density of 255 Wh/kg (ternary) or 160 Wh/kg (lithium iron phosphate), a volume utilization rate of 72%, 4C fast charging, a 5-minute hot start, and no thermal propagation, among other performance indicators.
Looking ahead: What are the demand directions for battery materials in humanoid robots?
- Humanoid robots do not place high requirements on discharge rate or cycle life, but they do place high requirements on mass and volume energy density, and may also require fast charging.
- Therefore, batteries and battery materials with high energy density, preferably also considering fast charging capabilities, are the demand direction for humanoid robot batteries.
- High-nickel and mid-nickel high-voltage ternary cathodes, which are layered oxide cathodes, are currently the preferred choice, and lithium-rich manganese-based cathodes may also find a place in the future.
- Lithium supplementation (pre-lithiation) means introducing high-lithium-content substances into the battery material system so that they release lithium ions and electrons effectively, compensating for the loss of active lithium.
- Whether in the anode or cathode, pre-lithiation can improve the actual energy density of the battery, even though lithium consumption still exists.
If solid electrolytes can be made lightweight, thin, strong, and highly stable, battery energy density will improve significantly.
- Humanoid robot batteries have relatively low requirements for cycle life but may have high safety requirements, making them a potential high-quality application scenario for high-energy-density solid-state batteries.
Components and Materials Exclusive to Humanoid Robots
High-burst-power motors, high-performance chips, precision reducers, high-precision sensors, long-lasting batteries, and other core components will build a more stable and high-performance humanoid robot hardware system.
Artificial intelligence empowers humanoid robot design.
AI for Design of Humanoid Robots
Based on artificial intelligence technologies such as neural networks, graph grammar, and evolutionary algorithms, humanoid robot modules such as legs, arms, and trunks can be automatically constructed according to scene and task requirements, achieving coordinated optimization of form and control.
Motion Intelligence of Humanoid Robots
- Walking on Complex Terrains: Humanoid robots are expected to adapt to complex terrains and narrow environments built for humans, such as slopes, steps, and thresholds, achieving stable, adaptive, and disturbance-resistant walking.
- Dual-Arm Cooperative Operation: Even with an unstable lower body, humanoid robots are expected to complete high-performance operation tasks with two collaborating arms, using human tools and equipment.
- Compensation for Hardware with Software: When the hardware performance of humanoid robots is subpar and sensory information is lacking, this technology systematically seeks out and fully exploits environmental and information constraints to compensate for hardware limitations, achieving high-level task execution.
Multimodal Large Model for Humanoid Robots
- By integrating multimodal information such as voice, images, text, sensory signals, and 3D point clouds, humanoid robots will have stronger multimodal understanding, generation, and correlation capabilities, enhancing their generalization ability in complex scene tasks.
Large-Scale Dataset for Humanoid Robots
- Based on simulation synthesis or data collected from physical robots, constructing large-scale, standardized humanoid robot datasets will help improve the design, simulation training, and algorithm transfer capabilities of humanoid robots.
Humanoid Robots Inspired by Human Anatomy and Neural Mechanisms
Different from most existing humanoid robot research methods that simulate human functions from the outside in, this approach simulates human musculoskeletal systems and neural mechanisms from the inside out, exploring the essential mechanisms by which humans achieve high dexterity, high compliance, and high intelligent behavior. As a new avenue in humanoid robot research, it is expected to build a more efficient and stable system closer to humans.
Open Source Community for Humanoid Robots
- This community will gather experts and scholars in the field of humanoid robots worldwide, promoting technical discussions, information exchange, and multi-party cooperation, facilitating deep integration and collaborative development of the industry chain.
‘Manufactory’ of Humanoid Robots
- This will connect the software environment with robot body design, control, and intelligent algorithm development based on analytical technology and large models. It will enable rapid, custom design of high-quality, intelligent humanoid robot systems to meet performance requirements, with hardware validated through software-hardware consistency, and will support the development of new components.
Applications of Humanoid Robots
Humanoid robots possess versatility and intelligence, seamlessly using human tools, which will ensure their application scenarios continue to expand and deepen, profoundly transforming human production and lifestyle, leading society into a new stage of intelligent development, and bringing disruptive changes to various industries.
In the industrial sector, they will widely participate in dangerous production tasks, greatly improving production efficiency and safety; in specialized fields, they will become an important force in executing scientific exploration, disaster relief, security inspections, and other tasks in extreme environments; in the livelihood sector, they will fully integrate into people's lives, from providing housekeeping services to participating in medical assistance, becoming an indispensable presence.
The development history of humanoid robots: When dreams come true, commercialization is imminent.
Multimodal large models endow robots with generalization capabilities, and the dawn of embodied intelligence is emerging.
- General large models bring revolutionary potential to embodied intelligence. The hardware of humanoid robots determines the flexibility of movement, with components mostly migrating from applications in other industries, and cost pain points can be resolved through large-scale production in the industry chain; while software algorithms act as the "brain" of the robot, determining the upper limits of robot applications and being the main bottleneck for the commercialization of robots. Previously, robots relied on inherent program settings to perform tasks, making it difficult to have algorithms that are universally applicable across various scenarios, limiting the practical applications of robots. In recent years, the development of general large models such as LLM, VLM, and VNM has endowed the robot body with powerful generalization capabilities, allowing robots to be applicable in more complex scenarios, enabling non-professionals to operate without programming, accelerating the commercialization process of humanoid robots. Robots with "embodied intelligence" are no longer mechanically completing single tasks but can autonomously plan, decide, act, and execute based on perceived tasks and environments, incorporating language interaction, intelligent decision-making, autonomous learning, and multimodal perception.
Tesla leads the way, and tech giants accelerate entry to drive industrial innovation.
- Tech giants are accelerating their entry into the industry to drive innovation. 1) Tesla: On September 30, 2022, Tesla launched the humanoid robot Optimus prototype, and in 2023, Musk stated that Tesla's long-term value will come from AI and robotics; 2) OpenAI: In March 2023, OpenAI invested in the Norwegian humanoid robot company 1X Technologies; in May 2024, OpenAI announced that it had restarted its robotics team two months earlier; 3) Samsung: In January 2023, Samsung invested 59 billion won in the Korean robot manufacturer Rainbow Robotics; 4) NVIDIA: In May 2023, Jensen Huang stated that the next wave of AI will be embodied intelligence; in February 2024, NVIDIA established a research department for general embodied intelligence; in March 2024, NVIDIA released the humanoid robot large model Project GR00T; in June 2024, Jensen Huang emphasized that "the next wave of AI is physical AI, and the era of robots has arrived"; 5) Figure AI: Established in 2022, Figure AI received a total of $675 million in February 2024 from tech investors including NVIDIA, Microsoft, OpenAI, and Intel.
Tesla Optimus progresses beyond expectations, and the industry begins a new round of "arms race."
- Tesla Optimus is iterating rapidly, leading a new wave of technological revolution. Musk proposed the humanoid robot concept Tesla Bot at AI Day 2021, and rapid development and iteration followed. In February 2022, the development platform was completed, and on September 30, 2022, the Optimus prototype was officially launched at AI Day, capable of simple actions such as walking, carrying, and watering plants. In December 2023, Optimus-Gen2 was launched, with notable improvements over the first generation in perception, brain, and control capabilities. Tesla's humanoid robot can form a complete industrial closed loop, and its commercialization is worth looking forward to: Optimus reuses autonomous driving technologies, enabling a rapid evolution from concept machine to intelligent, flexible robot. The production and sale of Tesla cars also provide initial scenarios for commercialization, and the company's industry chain advantages make cost reduction possible, with a long-term mass production price target of $20,000 per unit.
Humanoid robots will first land in factories and will be applied in commercial services and family companionship in the future.
- Humanoid robots will gradually move from factories to homes, from B2B to B2C. From the strategic planning of mainstream robot manufacturers, humanoid robots will first be applied in the industrial manufacturing sector, and after accumulating maturity, will expand to commercial services, family companionship, and other scenarios. This is mainly because factory manufacturing scenarios are relatively simple, and the demand for robots to replace humans is more urgent, while commercial and family scenarios are complex, requiring high hardware and software standards for humanoid robots.
- The "Guiding Opinions on the Innovative Development of Humanoid Robots" points out three major demonstration scenarios: special services, manufacturing, and people's livelihood, envisioning deep integration with the real economy by 2027. The application of humanoid robots in China will proceed in two steps: the first phase aims for applications in special services, manufacturing, and people's livelihood by 2025; the second phase aims for accelerated large-scale development of the industry by 2027, with richer application scenarios and related products deeply integrated into the real economy, becoming an important new engine for economic growth, making the future of humanoid robots promising.
Breakdown of Tesla's humanoid robot: 14 rotary joints + 14 linear joints + 12 hand joints
Breakdown of Tesla's humanoid robot: rotary joints, linear joints, and hand joints.
- Rotary joints: Mainly composed of "actuator + torque sensor + encoder + frameless torque motor + harmonic reducer + bearing + mechanical clutch," similar to collaborative robot joint modules. Input sensors send data to the actuator, which controls the motor; the harmonic reducer amplifies the output torque, and output sensors provide position feedback used to optimize the algorithms.
- Linear joints: Mainly composed of "actuator + torque sensor + encoder + frameless torque motor + screw + bearing," where the actuator drives the frameless torque motor to rotate, and the screw converts the rotational motion into linear motion.
- Hand joints: Mainly composed of "actuator + encoder + sensor + coreless (hollow-cup) motor + planetary gearbox + worm gear," with adaptive grasping and a non-backdrivable transmission, able to carry 20 pounds, use tools, and accurately grasp parts.
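The two transmission principles described above (a reducer amplifying torque, and a screw converting rotation into linear force) can be sketched with idealized textbook relations. The efficiency values, reduction ratio, and lead used below are hypothetical placeholders, not Tesla specifications, and the function names are illustrative.

```python
import math


def rotary_output_torque(motor_torque_nm, reduction_ratio, efficiency):
    """Output torque of a geared rotary joint: T_out = T_motor * ratio * efficiency."""
    return motor_torque_nm * reduction_ratio * efficiency


def screw_thrust(motor_torque_nm, lead_m, efficiency):
    """Linear force from a screw drive: F = 2 * pi * efficiency * T / lead.

    lead_m is the axial travel per screw revolution, in metres.
    """
    return 2.0 * math.pi * efficiency * motor_torque_nm / lead_m


def torque_for_thrust(force_n, lead_m, efficiency):
    """Motor torque needed to produce a given thrust (inverse of screw_thrust)."""
    return force_n * lead_m / (2.0 * math.pi * efficiency)


# Hypothetical numbers: 1 N·m motor, 100:1 reducer at 80% efficiency.
joint_torque = rotary_output_torque(1.0, 100.0, 0.8)

# Hypothetical screw: 2 mm lead at 80% efficiency, 1000 N target thrust.
needed_torque = torque_for_thrust(1000.0, 0.002, 0.8)
```

Because thrust is inversely proportional to lead, a smaller lead lets a small motor produce a large linear force, at the cost of linear speed.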
Estimated costs of humanoid robots and potential supplier overview (using Tesla Bot and related domestic parts as examples)
Estimated cost distribution of each link/component of Tesla's humanoid robot (using domestic parts prices as examples)
Frameless torque motors: High efficiency, compact structure, easy maintenance, used for humanoid robot linear and rotary joints.
- Frameless torque motors are a special type of permanent magnet brushless synchronous motor. They have no shaft, bearings, housing, feedback device, or end caps, and consist only of a stator and a rotor. The rotor is the inner component, a rotating steel ring assembly with permanent magnets that mounts directly on the machine shaft; the stator is the outer component, made of steel laminations wrapped with copper windings that generate the electromagnetic force, and it fits tightly inside the machine housing.
- Frameless torque motors have the advantages of high efficiency, compact structure, and easy maintenance. 1) High efficiency: Integrating the motor directly into the rotating shaft assembly reduces overall system inertia and therefore the torque required for acceleration and deceleration, allowing better control of motion and settling time, increasing system bandwidth, and improving machine efficiency; 2) Compact structure: Higher torque density reduces footprint and weight; 3) Easy maintenance: There are fewer mechanical components, and none that wear easily or require maintenance.
Precision reducers include RV reducers, harmonic reducers, and planetary reducers. Reducers are transmission components composed of multiple gears that use gear meshing to change motor speed, torque, and load capacity, and can also enable precise control. There are many types and models of reducers, which can be divided into general transmission reducers and precision reducers based on control precision. General transmission reducers have low control precision and meet the basic power transmission needs of mechanical equipment. Precision reducers have small backlash, high precision, and long service life, and are more reliable and stable; they are applied in high-end fields such as robots and CNC machine tools, and include RV reducers, harmonic reducers, and planetary reducers.
- Humanoid robot rotary joints will use harmonic reducers, while hand joints or some low-precision body joints may use planetary reducers. RV reducers are larger in size and have limited applications in humanoid robots. Harmonic reducers are small, have a large reduction ratio, and high precision, and will be used for the humanoid robot's body rotary joints; planetary reducers are small, lightweight, have high transmission efficiency, and long service life, but have lower precision than harmonic reducers, and will be used for humanoid robot hand joints or body joints with lower precision requirements.
Tesla's humanoid robot includes a total of 14 linear actuators in three categories, distributed in the arms and legs.
- Tesla Optimus has 14 linear actuators of three types, with output force and weight of 500 N/0.36 kg, 3900 N/0.93 kg, and 8000 N/2.20 kg; they are located in the upper arms (2×1), forearms (2×2), thighs (2×2), and lower legs (2×2).
- The cost of screws is currently high, but there is potential for future reduction. Linear actuators consist of "actuator + frameless torque motor + screw + torque sensor + encoder + bearing," with the screw being an important component. According to our estimates, the current cost of screws accounts for about 23.4% of the total cost of Tesla's humanoid robot, with an expected final cost share of 13.9%. In terms of types, screws used in humanoid robots can be divided into trapezoidal screws and roller screws, with trapezoidal screws used for forearms and roller screws used for higher load-bearing requirements in upper arms, thighs, and lower legs.
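As a quick consistency check on the linear actuator figures, the sketch below tallies the actuator count and computes force per kilogram for the three types. It reads the per-location counts as two sides times one or two actuators each, an assumption that reproduces the stated total of 14. The labels `small`, `medium`, and `large` are illustrative, and the source does not say which type sits at which location.

```python
# (output force in N, mass in kg) for the three linear actuator types.
actuators = {
    "small": (500.0, 0.36),
    "medium": (3900.0, 0.93),
    "large": (8000.0, 2.20),
}

# (sides, actuators per side) for each location; assumed reading of the counts.
placement = {
    "upper arms": (2, 1),
    "forearms": (2, 2),
    "thighs": (2, 2),
    "lower legs": (2, 2),
}

total_actuators = sum(sides * per_side for sides, per_side in placement.values())

# Output force per unit mass (N/kg), a rough measure of force density.
force_density = {
    name: force / mass for name, (force, mass) in actuators.items()
}
```

On these numbers the mid-size actuator has the highest force density (about 4,190 N/kg), ahead of the largest (about 3,640 N/kg) and the smallest (about 1,390 N/kg).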
Compared to ball screws, roller screws have higher load capacity, longer lifespan, greater speed and acceleration, and smaller lead, making them more suitable for humanoid robots. Screws are transmission accessories that convert rotary motion into linear motion, which can be classified into sliding screws, rolling screws, and hydrostatic screws based on friction characteristics, with rolling screws further divided into ball screws and planetary roller screws. The distinction lies in that the load transfer unit of planetary roller screws is a threaded roller, which is a typical line contact; while the load transfer unit of ball screws is a ball, which is point contact. Compared to ball screws, planetary roller screws have more contact points, allowing them to bear higher static and dynamic loads, with static loads three times that of ball screws and lifespans fifteen times longer; they also have stronger rigidity and impact resistance, allowing for greater speed and acceleration; and a wider pitch design range, enabling smaller lead designs.
Screws: Standard roller screws are suitable for high load and high-speed scenarios and are widely used.#
Planetary roller screws can be classified into five main types based on their structural composition and the relative motion relationships of components: standard, reverse, circulating, bearing ring, and differential types. Standard roller screws are suitable for harsh environments, high loads, and high speeds, mainly applied in precision machine tools, robots, and military equipment, and are currently the primary application type.
Screws: High-precision manufacturing relies on cutting, with turning, milling, and grinding as the core processes.
The core components of roller screws, including the screw, rollers, and nuts, are precision threaded components with small pitches, and the processing steps are essentially the same. Traditional processing methods can be divided into two main categories: cutting and rolling.
✓ Cutting: Uses the center holes at both ends as the machining reference and completes the part through heat treatment, turning, grinding, and other operations, 10 to 20 steps in total. Manufacturing precision can reach P1 level, sufficient for both positioning and transmission functions.
✓ Rolling: A method of processing threads by using forming rolling molds to create plastic deformation in the workpiece, with high automation in the mold opening process, low cost, and high efficiency in mass production, but lower manufacturing precision, generally around P7 level, only achieving transmission functions.
The rough processing of roller screws has diverse technical routes, while precision processing still requires grinding machines. The cutting process of roller screws can be roughly divided into rough machining, preparatory heat treatment (annealing), rough processing, final heat treatment (quenching), precision processing, and assembly inspection. Rough processing includes turning, milling, and grinding three process routes (which can be used alone or in combination), while precision processing is grinding. New processing techniques such as "turning instead of grinding" and "swirling milling" can theoretically replace grinding and improve processing efficiency, but the technology is still maturing, and precision processing still requires grinding technology and grinding machines.
Dexterous Hands: Hollow cup motors/brushless slotted motors are the core power sources.
Dexterous hand motors mainly use hollow cup motors or brushless slotted motors. Micro-special motors are characterized by small size, high power density, and low noise, so they suit the compact space and load capacity requirements of humanoid robot dexterous hands better than traditional motors. Hollow cup motors and brushless slotted motors are currently the mainstream solutions for dexterous hands.
Dexterous Hands: The core barriers of hollow cup motors lie in coil design, winding, and equipment.
The three core barriers of hollow cup motors are coil design, coil winding, and automation equipment. The rotor of a brushless hollow cup motor consists of a ring-shaped magnet, a rotating shaft, and its fixing components, while the stator is formed by bonding a ring-shaped silicon steel sheet to a hollow cup coil; the core process is the design and manufacturing of the coil. Common winding patterns for hollow cup motors include straight winding, saddle winding, and oblique winding, and production methods are divided into manual winding, semi-automated winding (the winding type), and automated one-time forming. Foreign manufacturers mainly use one-time forming, which is highly automated and can process wire diameters of 0.08-0.2 mm and coils for motors below 400 W. Domestic production mainly relies on the winding type, which depends on manual labor and has low production efficiency and a limited range of wire diameters; one-time forming winding equipment still needs a breakthrough.
- The hollow cup market is steadily growing, and humanoid robots open new spaces. Hollow cup motors are mainly applied in high-precision, high-speed response, and compact efficient scenarios, such as aerospace, instrumentation, industrial robots, and medical fields. According to QYResearch data, the global hollow cup motor market size was approximately $810 million in 2023, expected to grow to $1.19 billion by 2028, with a CAGR of 8% from 2023 to 2028. According to MarketResearch data, in 2021, the market sizes of hollow cup motors in China and Europe accounted for 34.8% and 25.85%, respectively.
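As a quick consistency check on the market figures above, the compound annual growth rate implied by the two endpoints can be recomputed. This is a minimal sketch; the function name is illustrative, and the inputs are the QYResearch figures quoted in the text (in millions of US dollars).

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# $810 million in 2023 growing to $1.19 billion by 2028.
rate = cagr(810.0, 1190.0, 2028 - 2023)
print(f"Implied CAGR 2023-2028: {rate:.1%}")
```

The result is consistent with the roughly 8% CAGR stated in the text.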
Sensors are the medium through which robots perceive the world, divided into internal and external sensors. Sensors convert the physical quantities perceived by robots regarding internal and external environments into electrical outputs. Depending on the detection objects, they can be divided into internal sensors and external sensors. Internal sensors measure the robot's own state, such as position, speed, and acceleration; external sensors measure external environments related to the robot's operations, such as vision, hearing, touch, and smell.
Chart: Robot sensor schematic
Sensor classification and main functions:
Internal sensors
- Photoelectric encoder: measures motor angle and speed; used for odometry.
- Inertial measurement unit: measures the posture of a mobile robot.
- Accelerometer: measures acceleration.
External sensors
- Vision sensors: used for object recognition, navigation, and mapping tasks; include cameras, lidar, infrared sensors, etc.
- Hearing sensors: receive sound signals to recognize and understand language; include microphones and speakers, etc.
- Touch sensors: perceive contact force and contact area information between the robot and external objects; include force sensors and pressure sensors, etc.
- Smell sensors: perceive odor information in the surrounding environment; used for environmental monitoring, hygiene inspection, etc.
Chart: Robot sensors can be divided into internal and external sensors.
Torque sensors are important components for robotic arms to perceive force. Torque sensors, also known as torque transducers, can detect torsional torque on various rotating or non-rotating mechanical components, converting the physical changes in torque into precise electrical signals, with advantages of high precision, fast frequency response, good reliability, and long lifespan. Torque sensors are one of the key components of robotic arms, providing real-time force and torque information to assist robotic arms in completing precise and intelligent operational tasks.
- In humanoid robots, six-dimensional force/torque sensors are mainly used at the wrists and ankles, where compliance control is required. By measurement dimension, force/torque sensors can be divided into one-dimensional, three-dimensional, and six-dimensional types. Six-dimensional force/torque sensors, also known as six-axis force/torque sensors or F/T sensors, measure the forces along the X, Y, and Z axes (Fx, Fy, Fz) and the torques about those axes (Mx, My, Mz). In humanoid robots, six-dimensional sensors are likely to be used at the wrists and ankles, while the other body joints will use one-dimensional joint torque sensors.
Source: ATI, Tesla AI Day, Kunwei Technology, AVIC Securities Research Institute
4.1 Force
The development and production of six-dimensional force/torque sensors are challenging, but costs are expected to keep falling once production scales. Compared with one-dimensional force sensors, multi-dimensional force/torque sensors must keep each channel's response to its measured force component monotonic and consistent. They must also deal with inter-dimensional interference (crosstalk) caused by structural machining and process errors, with dynamic and static calibration, and with the decoupling algorithms and circuit implementation needed for the vector calculation. These requirements place high demands on equipment and materials, making R&D and manufacturing much harder than for one-dimensional force sensors. The main raw materials for strain-type force sensors are metals, chips, and strain gauges. For example, in 2023 direct materials accounted for 74% of the cost of Keli Sensors' main products. A six-dimensional force/torque sensor needs several times as many strain gauges as a one-dimensional force sensor, and with the added production difficulty its cost is much higher than that of a one-dimensional torque sensor. According to Baidu's procurement data, the unit price of the ATI FC-NANO17 six-dimensional force/torque sensor is 20,000 yuan. We believe that as domestic R&D and production capabilities in strain gauges and the related industry chain improve and downstream demand opens up, there is significant potential for cost reduction in six-dimensional force/torque sensors.
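The decoupling step mentioned above can be sketched as a linear map from the six raw channel signals to the six force/torque components. This is a simplified illustration only: the calibration matrix and signal values below are made up, and real sensors obtain the matrix by calibrating against known loads and may need nonlinear or temperature corrections.

```python
from typing import List

AXES = ("Fx", "Fy", "Fz", "Mx", "My", "Mz")

def decouple(calibration: List[List[float]], signals: List[float]) -> List[float]:
    """Apply a linear decoupling matrix: wrench = calibration * signals."""
    return [sum(c * s for c, s in zip(row, signals)) for row in calibration]

# Hypothetical 6x6 calibration matrix: unit gains on the diagonal plus one
# cross-coupling correction that removes 5% of the Fz channel from Fx.
calibration = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
calibration[0][2] = -0.05

# Made-up raw signals: a pure Fz load that leaks 5% into the Fx channel.
signals = [0.5, 0.0, 10.0, 0.0, 0.0, 0.0]
wrench = decouple(calibration, signals)
for name, value in zip(AXES, wrench):
    print(f"{name}: {value:+.3f}")
```

In this toy case the apparent Fx reading is entirely crosstalk from Fz, and the off-diagonal term cancels it. A one-dimensional sensor needs no such matrix, which is part of why the six-dimensional device is harder to build and calibrate.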
Encoders are high-precision sensors used for detecting rotational positions, with a total encoder value of approximately 8,550 yuan per humanoid robot. An encoder is a motion-control sensor that detects the mechanical position of an object and its changes using photoelectric, electromagnetic, capacitive, or inductive principles, converts this information into electrical signals that can be transmitted and stored, and feeds them back to the motion control devices. Encoders are used in the rotary joints (14×2), linear joints (14×1), and hand joints (12×1) of Tesla's humanoid robot, with a total value of approximately 8,550 yuan.
- Encoders can be classified into optical, magnetic, and capacitive types based on their working principles. 1) Optical encoders have high precision, good stability, and strong anti-interference capabilities, suitable for high-precision and high-speed measurements, but are relatively expensive and easily affected by the environment; 2) Magnetic encoders use magnetic code disks instead of slotted optical code disks, making them more durable, resistant to vibration and shock, suitable for measurements in harsh environments, but with relatively lower resolution and precision; 3) Capacitive encoders have high reliability, high precision, and long lifespan, suitable for battery-powered applications.
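The resolution differences mentioned above can be made concrete with a short sketch that converts encoder counts to a joint angle. The bit depths and count values are illustrative assumptions and do not describe any specific encoder in this report.

```python
def resolution_degrees(bits: int) -> float:
    """Smallest angle step of an absolute encoder with the given bit depth."""
    return 360.0 / (2 ** bits)

def counts_to_degrees(counts: int, counts_per_rev: int) -> float:
    """Convert a raw count into an angle in [0, 360) degrees."""
    return (counts % counts_per_rev) * 360.0 / counts_per_rev

# Hypothetical bit depths, from a coarse to a fine encoder.
for bits in (12, 14, 17):
    print(f"{bits}-bit encoder: {resolution_degrees(bits):.4f} degrees per count")

# A reading of 1024 counts on a 4096-count encoder is a quarter turn.
print(counts_to_degrees(1024, 4096))
```

Each extra bit halves the smallest measurable angle step, which is the sense in which one encoder type offers higher resolution than another.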
Tesla Optimus's pure vision solution reuses the underlying technology of autonomous driving; its core is massive data, self-developed chips, and algorithm training. Optimus is equipped with the same FSD computer and Autopilot-related neural network technology as Tesla cars, but its application scenarios are more fine-grained than those of cars and require more data accumulation and algorithm training. The humanoid robot progress video that Tesla released in September 2023 showed Optimus accurately determining object positions and rejecting interference using only vision and joint position encoders. Its "end-to-end" neural network runs locally and outputs control commands directly from the input images, without an internet connection or manual operation. Tesla's pure vision solution can perceive depth, speed, and acceleration information and significantly reduces hardware costs compared with the usual lidar fusion solutions, while "algorithms + computing power + data" build a high competitive barrier.
Touch sensors are important components for robots to interact with the external environment, giving robots a sense of touch. Touch is a form of perception through the skin that humans use to sense the external environment. Robot touch mainly perceives physical quantities such as temperature, humidity, pressure, and vibration when in contact with the external environment, as well as the softness and hardness of target object materials, shapes, and sizes, thus achieving precise positioning of objects and executing various operational tasks.
Applications of sensors.
- Specific scheme comparison of dexterous hands: Dexterous hand = fingers (drive + transmission + sensors) * degrees of freedom + shell.
With the continuous advancement of industrial automation and artificial intelligence technology, robots are gradually transforming from single repetitive task executors to intelligent agents capable of executing complex and variable tasks. In this transformation process, dexterous hands, as important tools for robots to interact with the external environment, are becoming increasingly significant. The design inspiration for dexterous hands comes from the complex structure and functions of the human hand, enabling robots to perform diverse tasks such as grasping, manipulating, and even sensing, greatly expanding the application range and operational capabilities of robots.
The composition of dexterous hands is the foundation for achieving their multifunctionality. A typical dexterous hand system usually consists of the following key components:
(1) Drive system: Responsible for providing power, allowing fingers to perform various movements. The drive system includes motors, pneumatic, and hydraulic types.
(2) Transmission system: Converts the power generated by the drive system into the movement of finger joints. The transmission system includes screws, gears, linkages, ropes, and tendons.
(3) Sensor system: Includes touch, force, and position sensors, used to perceive the contact state and force between the hand and external objects, as well as the position and motion state of the hand itself.
(4) Control system: Precisely controls the drive and transmission systems through algorithms and software to achieve predetermined hand movements and task execution.
This article will analyze the technical schemes, future development directions, competitive landscape, and value of dexterous hand components such as drives, transmissions, and sensors, in conjunction with Tesla's dexterous hand patents.
1.1 Analysis of dexterous hand technology routes: Electric drive + composite transmission + force and touch sensing as the main direction.
1.1.1 Number of degrees of freedom: There is a trend of increasing degrees of freedom.
The human hand has a total of 24 degrees of freedom. According to "Robot Dexterous Hands - Modeling, Planning, and Simulation," the human hand's 24 degrees of freedom include 5 degrees for the thumb, 4 degrees for each of the other four fingers, and an additional 3 degrees for wrist abduction, wrist flexion, and palm curvature.
The more degrees of freedom, the greater the design difficulty, with one of the challenges being how to place numerous actuators while keeping the dexterous hand's size close to that of a human hand. Currently, the known dexterous hand with the most degrees of freedom is the Shadow Hand, which has 24 degrees of freedom. Tesla's first-generation humanoid robot hand has 6 degrees of freedom, while the second generation has 11 degrees of freedom, overall moving towards higher degrees of freedom. Since 2014, at least four dexterous hands have achieved 21 degrees of freedom, employing various transmission methods such as tendons, cables, and gear linkages.
Comprehensive comparisons indicate that motor drive is the most suitable method for mass production of dexterous hands. This is mainly due to advancements in motor design, processing technology, and electronics, which can provide small-sized, high-output micro motors for dexterous hands. Additionally, the ease of obtaining and storing electrical energy provides a foundation for motor applications. Possible motors include hollow cup motors and brushless slotted motors.
The hollow cup motor scheme is highly efficient and suitable for battery-powered dexterous hands that need to run for extended periods. Hollow cup motors use ironless rotors, eliminating the energy losses caused by eddy currents in iron cores, which results in higher efficiency, a smaller moment of inertia, and easier control. According to "Research Progress on Hollow Cup Micro Motors and Coils," hollow cup motors mainly have the following characteristics: (1) Energy-saving characteristics: The energy conversion efficiency is very high, with maximum efficiency generally above 65% and some products reaching over 90% (iron core motors generally do not exceed 75%); (2) Control characteristics: Starting and stopping are quick and the response is rapid; the mechanical time constant is less than 28 milliseconds, with some products achieving under 10 milliseconds (iron core motors generally exceed 100 milliseconds); (3) Fluctuation characteristics: Operation is very stable, with minimal speed fluctuations. As micro motors, hollow cup motors can easily keep speed fluctuations within 2%. Therefore, hollow cup motors are particularly suitable for applications requiring battery power and long operational durations, such as bionic hands, humanoid robots, and handheld electric tools.
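To give a feel for what the mechanical time constants above mean, the sketch below treats the motor's speed response to a voltage step as an idealized first-order system, in which the time constant is the time needed to reach about 63% of the final speed. The first-order model and the 95% threshold are simplifying assumptions; the time constants are the ones quoted in the text.

```python
import math

def speed_fraction(t_ms: float, tau_ms: float) -> float:
    """Fraction of final speed reached at time t for a first-order response."""
    return 1.0 - math.exp(-t_ms / tau_ms)

def settle_time_ms(tau_ms: float, fraction: float = 0.95) -> float:
    """Time needed to reach the given fraction of the final speed."""
    return -tau_ms * math.log(1.0 - fraction)

cases = (
    ("hollow cup, 28 ms", 28.0),
    ("hollow cup, 10 ms", 10.0),
    ("iron core, 100 ms", 100.0),
)
for label, tau in cases:
    print(f"{label}: 95% of final speed after about {settle_time_ms(tau):.0f} ms")
```

Under this model the time to reach 95% of final speed is about three time constants, so a motor with a 10 ms constant responds roughly ten times faster than one with a 100 ms constant.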
The main challenges of hollow cup motors lie in winding design, dynamic balance design, and capital investment. As a result, new entrants with shallow technical accumulation find it difficult to meet the high efficiency requirements of the robotics field.
(1) Winding design: The winding needs to ensure high density and consistent arrangement of coils, enabling the product to have high power and torque density. The diversity of winding forms directly affects production yield, but most technologies are patented by foreign companies, further increasing the difficulty for domestic companies to break through.
(2) Dynamic balance design: Rotor dynamic balance is an extremely important process in motor manufacturing, directly affecting whether the motor's noise and vibration performance meet standards. Differences in rotor dynamic balance are caused by different companies using different magnetic materials, leading to uneven mass distribution of the rotor.
(3) Capital investment: The equipment prices in automated production lines for motors are relatively high, with single winding equipment costing over a million yuan, requiring customized development from equipment manufacturers, placing high capital requirements on hollow cup motor manufacturers.
The brushless slotted motor scheme is a feasible way to reduce costs. The motors used in the fingers can be classified into brushless slotted motors and brushless slotless motors, based on the presence or absence of slots:
- With slots: Most brushless DC motors adopt a slotted design, with coils wound in the slots on the stator;
- Without slots: Hollow cup motors belong to the category of slotless motors; in slotless motors, there are no slot structures on the stator, and coils are wound and formed separately, directly fixed on the surface or inside of the stator.
Because of their small diameter and low torque fluctuations, hollow cup motors currently dominate in robotics. Brushless slotted motors deliver more power than hollow cup motors, but their larger diameter means that in the short term they can only be installed in the thumb, which has more available space. Compared with hollow cup motors, brushless slotted motors exhibit torque fluctuations, which lead to larger fluctuations in speed and torque, and their iron cores prevent them from operating at high speeds. Hollow cup motors can achieve high speeds at small diameters, with the load borne mainly by the palm structure. Brushless slotted motors can therefore be placed in the thumb, where their greater power is useful, but will not be placed in the other fingers in the short term, as those fingers need to be smaller and lighter.
The future development direction of motors will focus on reducing costs and increasing efficiency, mainly through reducing size and weight, such as harmonic magnetic field motors and integrated technology motors.
- Harmonic magnetic field motors: Achieving reduced size and increased power density by changing the internal design structure of the motor.
- Integrated technology motors: Achieving reduced size and increased power density by integrating reducers and other products.
1.1.3 Transmission methods: The layout of screws is becoming the development trend for hand transmission.
Transmission methods mainly include tendons, screws, gears, linkages, etc. Early dexterous hands used gears, linkages, and other transmission mechanisms, but due to size and weight issues, as well as inflexible movements, they were gradually phased out. Tendon transmission, which mimics the tendon structure of animals, is currently widely adopted in dexterous hands. According to various companies' mid-year reports for 2024, listed companies are focusing on developing hand screws, and the layout of screws is becoming the development trend for transmission.
Tendon transmission uses ropes to simulate the tendon structure of the human hand, allowing large actuators to be positioned away from the execution mechanism, reducing the load and inertia at the end, increasing grasping speed, and allowing for flexible arrangements, making it suitable for transmission scenarios with limited space and a high number of degrees of freedom.
Linkage transmission uses multiple linkages in a mixed series-parallel form to transmit motion and torque. The movement and power of the fingers are transmitted by rigid linkages, which allows large objects to be grasped, keeps the structure compact, and enables enveloping grasps. The downsides are that it is difficult to control over long distances, prone to ejecting the grasped object, and limited in grasping space.
Ball screw transmission, according to "Design of Control Systems for Space Five-Finger Dexterous Hands," places the motor and ball screw externally in the arm, with the motor driving the ball screw through a reducer. The rotational motion of the motor shaft is converted into the translational motion of the screw nut, which pulls the tendon, connecting the other end to the finger bones, causing the finger joints to rotate around the joint axis, forming finger bending motion. According to Tesla's public information, Tesla will subsequently install the driving device in the arm rather than inside the fingers.
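The motion chain described above (motor, reducer, ball screw, nut, tendon, finger joint) can be sketched numerically. This is a minimal sketch that assumes an inextensible tendon with no slack wrapped on a circular joint pulley; the gear ratio, lead, pulley radius, and revolution count are hypothetical values, not taken from the cited design or from Tesla.

```python
import math

def nut_travel_mm(motor_revs: float, gear_ratio: float, lead_mm: float) -> float:
    """Nut translation: the screw turns motor_revs / gear_ratio times,
    and each screw turn advances the nut by one lead."""
    return motor_revs / gear_ratio * lead_mm

def joint_angle_deg(tendon_pull_mm: float, pulley_radius_mm: float) -> float:
    """Joint rotation for a tendon wrapped on a circular pulley: theta = s / r."""
    return math.degrees(tendon_pull_mm / pulley_radius_mm)

# Hypothetical values: 40 motor revolutions, 16:1 reducer, 2 mm lead, 5 mm pulley.
travel = nut_travel_mm(motor_revs=40.0, gear_ratio=16.0, lead_mm=2.0)
angle = joint_angle_deg(travel, pulley_radius_mm=5.0)
print(f"nut travel {travel:.1f} mm -> joint rotation {angle:.1f} degrees")
```

Because the nut travel needed for a full finger curl is only a few millimeters in a chain like this, the motor and screw can sit in the forearm while only the tendon runs into the finger.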
The transmission device of dexterous hands can generally be divided into three levels: (1) The first level: located on the motor side, mainly for the reducer, serving to adjust precision; (2) The second level: the most important, responsible for action execution; (3) The third level: connecting the driver and the end of the joint, mainly consisting of tendons and linkages. From market cases, the first-level transmission mainly uses belts, the second-level transmission mainly employs screws or bevel gears, and the third-level transmission generally uses tendons and linkages.
Comprehensive comparisons show that various schemes have their strengths, but due to the high load-bearing requirements in factory labor scenarios, screws may become the mainstream transmission scheme in factory settings.
1.2 Breakdown of Tesla's gen1 patent: Actuators and gearboxes are core components.
According to Tesla's hand patent, the hand uses approximately 14 core components. Sorted by value, actuators, gearboxes, and Hall effect sensors rank high:
- Actuators: Hollow cup motors (we estimate the domestic production cost after mass production to be 1,000 yuan each, totaling 13×1,000=13,000 yuan) or brushless slotted motors (we estimate the domestic production cost after mass production to be 160 yuan each, totaling 13×160=2,080 yuan).
- Planetary gearboxes: 304a-304f (gearbox = 1 gear + 1 worm), we estimate the value of a single degree of freedom to be about 100 yuan, totaling 1,300 yuan for 13 degrees of freedom.
- Finger joint components: proximal 402, distal 408, 420; these are housing parts, and we estimate the value to be small.
- Ends of the finger joints: 410, 412; these are housing parts, and we estimate the value to be small.
- Axles (including pins, axles, etc.): 406, 414, we estimate the value to be small.
- Cables: 416, 418, 512, we estimate the value to be small.
- Channel structures: 424, 426, we estimate the value to be small.
- Torsion springs: distal 436, proximal 434, we estimate the value to be small.
- Spring brackets, pins, we estimate the value to be small.
- Others: fingernails, tendons, automatic tensioners, manual tensioners, flange sleeve bearings, we estimate the value to be small.
- Pipes: 514, 516, we estimate the value to be small.
- Worm gears: 704 (including pulley 706), already calculated in the gearbox.
- Gears: 702, already calculated in the gearbox.
- Hall effect sensors: composed of sensor 802 and magnetic field source 804 (the processor 114 determines the position or rotation angle of the finger joint components through the magnetic field measured by the Hall effect sensor), with a mature industry chain, and the value is relatively small.
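The value estimates in the list above can be summed in a short sketch. It uses only the per-unit figures and the count of 13 given in the actuator and gearbox items; components described as having small value are left out, so the totals cover the core drive components only. The dictionary keys are illustrative labels.

```python
UNITS = 13  # count used in the actuator and gearbox estimates above

GEARBOX_YUAN = 100  # estimated value per degree of freedom

# Estimated post-mass-production domestic cost per actuator, in yuan.
actuator_yuan = {
    "hollow_cup": 1000,
    "brushless": 160,
}

totals = {
    name: UNITS * (unit_cost + GEARBOX_YUAN)
    for name, unit_cost in actuator_yuan.items()
}
for name, total in totals.items():
    print(f"{name}: {total} yuan for actuators plus gearboxes")
```

On these estimates the actuator choice dominates the bill of materials: the gearboxes contribute the same 1,300 yuan in both cases, while the actuators differ by more than 10,000 yuan.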
According to Tesla's GEN1 dexterous hand patent, the basic working mechanism of the fingers is achieved through the collaborative action of the cable drive system and actuators. When the actuator is activated, it pulls the cable, which is guided through the channel structure of the fingers, driving the joints of the fingers to rotate around the pivot. The cable maintains appropriate tension through the gearbox, ensuring smooth joint movement. Additionally, the torsion springs provide extra rebound force, allowing the fingers to naturally return to their initial state after operation. This cable-driven design not only reduces complex mechanical components but also enhances the flexibility and durability of the fingers. With precise positioning provided by Hall effect sensors, the fingers can adjust their movements in real-time, achieving highly refined operational tasks.
The steps for breaking down Tesla's dexterous hand patent are as follows:
The first step, according to Tesla's dexterous hand patent, shows that its single hand has 5 fingers, each containing two phalanges (proximal phalanx 206 + distal phalanx 208), with each finger secured to the palm by fasteners. This step adds finger joint components.
The second step, according to Tesla's dexterous hand patent, indicates that its single hand has 6 degrees of freedom, thus containing 6 actuators and gearboxes. The actuators are mainly placed in the palm (due to the increase in degrees of freedom in the third generation, the palm capacity is insufficient, so the actuators are loaded into the larger capacity arm). This step adds actuators and gearboxes, with 6 for each hand.
The third step, according to Tesla's dexterous hand patent, shows that each finger has 2 pivot structures and 2 torsion springs. The two pivots are located at the proximal end (406) and distal end (414) of the finger, typically composed of pins or bearings, allowing the finger to rotate freely within a specific angular range. The distal torsion spring (436) and proximal torsion spring (434) are located at the distal and proximal joints of the finger, providing additional stability and assisting the finger in returning to its initial state. This step adds pivot structures and torsion springs.
The fourth step, according to Tesla's dexterous hand patent, indicates that each finger has 2 cables and 2 channel structures, with one end of the cable connected to the actuator and the other end routed through complex channel structures (424, 426) along the inside of the finger. When the actuator moves, the cable can bend the dexterous hand. The channel structure provides a guiding path for the cable's movement, ensuring that the cable can move freely without tangling when the finger bends. This step adds cables and channel structures.
The fifth step, according to Tesla's dexterous hand patent, indicates that the dexterous hand has 6 gearboxes, with each actuator controlling the movement of the cable through a gearbox. The gearbox typically consists of a worm gear and worm, with the pulley in the gearbox connected to the cable, ensuring that the tension of the cable remains within a stable range. This step adds gearboxes (including gears, worm gears, worms, and pulleys, etc.).
The sixth step, according to Tesla's dexterous hand patent, indicates that each finger is also equipped with a Hall effect sensor to monitor the rotation angles and positions of each joint of the finger. The Hall effect sensor is connected to the processor, and when the finger rotates, it determines the exact position of the finger by measuring changes in the magnetic field, providing real-time feedback. This step adds Hall effect sensors.
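The Hall-effect position sensing in the sixth step can be illustrated with a minimal sketch. The text only states that the processor derives the joint position from the measured magnetic field; the approach below, which assumes a two-axis Hall sensor reading two orthogonal field components of a magnet that rotates with the joint, is one common technique and is an assumption here, not the method described in the patent. All names and values are hypothetical.

```python
import math
from typing import Tuple

def simulated_field(angle_deg: float, amplitude: float = 1.0) -> Tuple[float, float]:
    """Idealized field components seen by a 2-axis sensor for a magnet
    rotated by angle_deg (no noise, offset, or gain mismatch)."""
    a = math.radians(angle_deg)
    return amplitude * math.cos(a), amplitude * math.sin(a)

def joint_angle_from_field(bx: float, by: float) -> float:
    """Recover the joint angle in degrees from the two field components."""
    return math.degrees(math.atan2(by, bx))

for true_angle in (0.0, 30.0, 75.0):
    bx, by = simulated_field(true_angle, amplitude=0.8)
    estimate = joint_angle_from_field(bx, by)
    print(f"true {true_angle:5.1f} deg -> estimated {estimate:5.1f} deg")
```

Taking the ratio of the two components makes the estimate independent of the field amplitude in this idealized model, which is useful because magnet strength varies with temperature and distance.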