October 10, 2023 in Large World Models

What’s Next after LLMs? Unlocking Robotics’ Full Potential with Large World Models

Nag Murty

https://doi.org/10.1287/LYTX.2023.04.07

The world of robotics is advancing at an exciting pace. Everywhere we look, we see robots playing a significant role: from manufacturing units to the very desk we sit at, from e-commerce warehouses to our kitchens. Despite this increasing prevalence of robots, the scale and impact of robotic automation are far smaller than they could be.

What is restraining the potential of this industry? To understand this, we need to delve deeper into the prevalent business models and dominant technological approaches and how these can evolve.

The Constraints of Classical Robotics and RaaS

The classic approach to robotics is based on a set of predefined rules, highly accurate sensors and detailed environmental maps. These systems, although competent within the narrow parameters for which they are designed, struggle to adapt when those parameters change. This lack of adaptability limits their effectiveness in complex, dynamic, real-world environments. What we need are learning systems – robots that improve and adapt over time – mastering their tasks the way humans do.

The current business models add another layer of constraint. The prevalent Robotics-as-a-Service (RaaS) model views robots as products or services, pushing companies to focus on getting paid for robots that will deliver output against service-level agreements (SLAs).

This short-sighted focus shifts value away from creating increasingly efficient, intelligent systems and toward selling as many units as possible, which prizes contracted revenue over real cash flows. It also assumes that a big bang of scale will follow years of research and development investment, much as it does when a drug is commercialized. This approach has led to high upfront investments in first-generation sensors and an obsession with meeting SLAs, often prematurely.

For example, despite consuming vast amounts of capital, autonomous mobile robot companies and self-driving car ventures have yet to scale effectively or unlock commensurate value. The RaaS model, with its emphasis on revenue, doesn’t allow the necessary space for these robotics systems to evolve into learning systems that continually improve.

Understanding Large Language Models and Their Analog, Large World Models

Large language models (LLMs) are artificial intelligence (AI) models that have developed a strong capacity to comprehend and manipulate language. A quintessential example is OpenAI’s GPT-3, which, given a piece of text, can generate human-like written responses. These models have become immensely powerful tools, making significant contributions in fields such as text analysis, language translation and automated customer service, to name a few.

The development of LLMs didn’t happen overnight; it required three key ingredients:

Quality data curation: Building LLMs required massive amounts of text data, which was laboriously scraped from the internet. This data formed the foundation on which the LLMs learned and built their understanding of human language.
Fine-tuning on domain-specific data sets: Although the base model is trained on a general corpus of text, fine-tuning the model on domain-specific data allows it to excel in specific tasks or industries, thereby enhancing its versatility and usefulness.
Reinforcement learning from human feedback (RLHF): By employing human feedback in a reinforcement learning framework, these models were further improved. Companies such as Scale AI played a significant role in this process, offering human-in-the-loop systems to provide feedback and training to the models.

Each of these steps required significant investments in time, money and expertise. Yet, they were achievable with language data, mainly because such data can be scraped, bought or generated relatively cheaply, with few barriers to access.

In the world of robotics, large world models (LWMs) play a role analogous to LLMs in language processing. LWMs are developed to understand and manipulate the physical world. Developing LWMs requires steps analogous to those used for LLMs:

Quality data curation: In the context of LWMs, data pertains to the information captured by robots from the real world through sensors. This could include video feeds, temperature readings, distance measurements and more.
Fine-tuning on domain-specific data sets: Like LLMs, LWMs need to be fine-tuned on specific domains, such as a factory floor or warehouse. This helps the models perform well in specific operational environments.
RLHF: Human feedback is crucial for improving LWMs. As robots execute tasks in the real world, human supervisors can monitor their performance, provide feedback and thus help improve the models.
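
As a rough illustration of the three steps above, the following Python sketch shows how a robot operation might log sensor readings together with supervisor feedback, so that the same records can later be filtered by domain for fine-tuning. Every name here (`Episode`, `FeedbackLog`, the domain labels, the 0-to-1 rating scale) is a hypothetical assumption made for this example and is not drawn from any existing LWM toolkit.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Episode:
    """One task attempt recorded by a robot (hypothetical schema)."""
    domain: str                           # e.g. "warehouse" or "factory_floor"
    sensor_readings: Dict[str, float]     # e.g. {"distance_m": 1.2}
    human_rating: Optional[float] = None  # supervisor score in [0, 1], if reviewed


@dataclass
class FeedbackLog:
    """Collects episodes during normal operations (data curation)."""
    episodes: List[Episode] = field(default_factory=list)

    def record(self, episode: Episode) -> None:
        self.episodes.append(episode)

    def for_domain(self, domain: str) -> List[Episode]:
        # Domain-specific subset: the kind of data a fine-tuning step would use.
        return [e for e in self.episodes if e.domain == domain]

    def mean_rating(self, domain: str) -> Optional[float]:
        # Average human feedback: a simple stand-in for an RLHF reward signal.
        ratings = [e.human_rating for e in self.for_domain(domain)
                   if e.human_rating is not None]
        return mean(ratings) if ratings else None


log = FeedbackLog()
log.record(Episode("warehouse", {"distance_m": 1.2}, human_rating=0.9))
log.record(Episode("warehouse", {"distance_m": 0.4}, human_rating=0.5))
log.record(Episode("factory_floor", {"temp_c": 35.0}))  # not yet reviewed
```

The point of the sketch is only that curation, domain filtering and human feedback can share one record format captured during day-to-day work; a real system would need far richer sensor data and an actual training loop.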

The creation of LLMs, as seen in the case of OpenAI, was supported by tens of billions of dollars in investments. However, the financial approach to building LWMs need not mirror this exactly. The tasks of data curation, fine-tuning and RLHF for LWMs must occur within the framework of a real-world, outcome-delivering, cash-generating operation. This can drastically reduce the upfront capital required, because the operation itself can fund the gradual development and refinement of LWMs.

For LWMs to make an impact, we will need to rethink the business model for robotics. The journey to create LWMs, although challenging, is both feasible and vital to unlocking the full potential of robotics. The development of LWMs can follow a pathway similar to that of LLMs, albeit with important differences that reflect the uniqueness of physical-world operations. The insights from the evolution of LLMs and the successful use of robots in enterprises such as Amazon can guide this journey, bringing us closer to a future in which robotics becomes an integral, adaptable and ever-improving part of our lives.

As we look toward building the future of robotics, we need to rethink the prevalent business models and technological approaches. The following are some recommendations to consider:

Focus on revenue as a means to data capture: Robotics companies need to shift their focus from revenue generation as an end in itself to data capture, because the data that feeds into LWMs drives their learning and performance improvement.
Take a long-term view: Building a learning system, especially in the realm of robotics, is a marathon, not a sprint. It requires patience, consistent effort and the ability to withstand the pressure for immediate results. The goal should be to create an adaptable, efficient system that can deliver value over the long term.
Strategic acquisitions of service-heavy businesses: One way to gain access to large amounts of valuable data quickly is through strategic acquisitions of service-heavy businesses. These businesses have already done much of the groundwork in data collection and can be a rich source of input for LWMs.
Promote coevolution of humans and robots: Robots need not replace humans; instead, they can work alongside humans, each learning from the other. This coevolution can lead to more effective, efficient operations while reducing disruption.
Bake RLHF into operational workflows: As we have seen with LLMs, RLHF plays a critical role in refining and improving models. The same approach should be adopted for LWMs, with human feedback integrated into the learning process to drive continuous improvement.
Sandboxing is the way to scale profitably: Use sandboxing or simulation environments to test and refine models before deploying them in the real world. This helps ensure that the models are robust and can handle a variety of scenarios effectively before scaling.
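
The sandboxing recommendation can be pictured as a simple gate: a model is scored against simulated scenarios and is cleared for real-world use only if it passes a threshold. The Python sketch below is a toy version of that idea under stated assumptions. The scenario format (a distance to an obstacle), the 1-meter safe distance, the 95% threshold and all function names are hypothetical choices for illustration, not part of any real simulator.

```python
from typing import Callable, List

# A scenario is a distance to an obstacle in meters; a policy returns True
# if the simulated robot decides to stop. Both are toy stand-ins.
Scenario = float
Policy = Callable[[Scenario], bool]


def make_scenarios() -> List[Scenario]:
    # Deterministic grid of 100 distances from 0.0 m to 4.95 m.
    return [i / 20 for i in range(100)]


def success_rate(policy: Policy, scenarios: List[Scenario],
                 safe_distance_m: float = 1.0) -> float:
    """Fraction of scenarios in which the policy stops exactly when it should."""
    correct = sum(policy(d) == (d < safe_distance_m) for d in scenarios)
    return correct / len(scenarios)


def ready_to_deploy(policy: Policy, scenarios: List[Scenario],
                    threshold: float = 0.95) -> bool:
    # Gate: only policies that clear the threshold in simulation move on.
    return success_rate(policy, scenarios) >= threshold


def cautious(distance_m: float) -> bool:
    return distance_m < 1.0   # stops when closer than 1 m


def reckless(distance_m: float) -> bool:
    return False              # never stops


scenarios = make_scenarios()
print(ready_to_deploy(cautious, scenarios))
print(ready_to_deploy(reckless, scenarios))
```

In this toy setup the cautious policy clears the gate and the reckless one does not. In practice the scenarios would come from a physics simulator or replayed operational data rather than a fixed grid.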

Building the robotics companies of the future will require us to step away from conventional thinking and embrace new models and strategies. By focusing on the development of LWMs and leveraging the valuable lessons learned from the journey of LLMs, we can usher in an era where robotics becomes an integral, indispensable and continually improving part of our world.

The lessons we can learn from companies like Amazon show that it is possible to build profitable, scalable operations while driving the development of robots. The robotics companies of the future will be those that understand and leverage these lessons to their advantage, and in doing so, help unlock the full potential of robotics.

Nag Murty is the founder and CEO of Electric Sheep.
