Large Action Models
Day 37 / 366
People have been working hard on ways to use LLMs and generative AI to bring real value to consumers. While many are taking the traditional route of bolting LLMs onto existing devices and apps, a few are rethinking how users might actually want to interact with this technology.
For instance, take the new Rabbit R1, shown in the image above. It’s a personal AI assistant device that you operate through natural language instead of traditional keypads and buttons.
Rabbit R1 uses something called Large Action Models, or LAMs. LAMs take LLMs one step further. While an LLM can only generate text from its input, a LAM works like an agent: it uses an LLM to understand the task you describe, breaks it down into steps, and then carries those steps out. Where LLMs are good for answering queries, LAMs can actually perform tasks to achieve a goal.
For instance, with the Rabbit R1 you can ask it by voice to book a cab for you, and it will work out which steps it needs to perform in the Uber app to make that booking.
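To make the idea concrete, here is a minimal sketch of the planner/executor split a LAM implies: the LLM turns a spoken request into a list of steps, and a separate executor carries them out. None of this is Rabbit's actual code; `llm_complete`, `execute_ui_action`, and the canned plan are hypothetical placeholders for a real LLM call and a real UI driver.

```python
import json

def llm_complete(prompt: str) -> str:
    # Stand-in for a real LLM call (OpenAI, a local model, etc.).
    # Here it returns a canned plan so the sketch runs end to end.
    return json.dumps([
        {"action": "open_app", "target": "Uber"},
        {"action": "set_field", "target": "pickup", "value": "Home"},
        {"action": "set_field", "target": "destination", "value": "Airport"},
        {"action": "tap", "target": "Confirm Ride"},
    ])

def execute_ui_action(step: dict) -> None:
    # Stand-in for whatever actually drives the app's UI on the user's behalf.
    extra = f" = {step['value']}" if "value" in step else ""
    print(f"{step['action']:>10} -> {step['target']}{extra}")

def run_lam(request: str) -> None:
    # The LLM part: understand the request and break it into ordered steps.
    prompt = ("Break this request into a JSON list of UI steps, "
              "each with an 'action' and a 'target':\n" + request)
    plan = json.loads(llm_complete(prompt))
    # The "action" part: carry the steps out, one by one.
    for step in plan:
        execute_ui_action(step)

run_lam("Book me a cab from home to the airport")
```

The key design point is that the language model never touches the app directly; it only produces a structured plan, and a separate layer is responsible for acting on it.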
How is this different from Siri? Siri has to be programmed in advance to handle specific tasks, and those tasks are limited to apps that expose APIs for Siri to integrate with. That means Siri can perform a task, but in a way quite different from how a human would.
LAM agents, on the other hand, can figure out how to navigate an existing user interface the way a human would. You can even train your Rabbit R1 to perform a custom task by showing it once how it's done.
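That "show it once" idea is essentially learning by demonstration. A toy way to picture it (not Rabbit's actual mechanism, just an illustration) is a record-and-replay loop: the device logs the UI actions you perform during the demonstration, then plays the same sequence back on command.

```python
from dataclasses import dataclass, field

@dataclass
class RecordedTask:
    name: str
    steps: list = field(default_factory=list)

    def record(self, action: str, target: str) -> None:
        # During the demonstration, every UI action the user performs is logged.
        self.steps.append((action, target))

    def replay(self) -> None:
        # Later, the same sequence is played back against the app's UI.
        for action, target in self.steps:
            print(f"replaying: {action} -> {target}")

# The user shows the device once how to order their usual coffee...
task = RecordedTask("order_my_usual_coffee")
task.record("open_app", "CoffeeShop")
task.record("tap", "Reorder last order")
task.record("tap", "Pay")

# ...and from then on a single voice command can trigger the whole routine.
task.replay()
```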