Technology

Why The Rabbit r1 Is A Big Deal

Karan Kamble

Jan 25, 2024, 02:58 PM | Updated Aug 13, 2024, 03:56 PM IST


(The Rabbit r1 Device - file photo)

‘There’s an app for that’ was once a remark that spoke to the rich capabilities of mobile applications, often to the surprise of the person it was addressed to.

A new technology company on the block is now aiming to upend this prevailing dependence on the app-based, operating-system model that reigns supreme on smartphones.

The company Rabbit unveiled a new kind of device called the “r1” — not quite a phone, not very different from one — a dozen days ago.

The good-looking orange gadget, dressed in retro style for a hit of nostalgia, is powered by artificial intelligence (AI). It can ‘speak’ with you in natural language, in contrast to the annoyingly specific way one has to address a digital assistant on a smartphone or a smart speaker, only for it to misunderstand the message anyway. The r1 is more like ChatGPT than Siri.

But it’s not merely a talking GPT.

The r1 goes a step further and gets things done based on your interactions with the built-in AI model. It can perform actions far beyond the typical smartphone fare of telling the date, playing a song, or setting an alarm.

Because of this unique ‘action’ approach, the r1’s AI model is dubbed the “large action model” (LAM), as against the “large language model” (LLM) behind ChatGPT and other similar AI assistants like Microsoft’s Bing and Google’s Bard.

The difference can be roughly captured thus: while an LLM speaks with you and answers your queries, a LAM gets things done for you. (The r1 uses an LLM too, that of San Francisco-based startup Perplexity.)
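To make that distinction concrete, here is a minimal, purely illustrative Python sketch. None of the names or structures below come from Rabbit; they are hypothetical stand-ins meant only to show that an LLM returns words, while a LAM returns, and would carry out, concrete steps.

```python
# Hypothetical sketch only: these names and structures are not from Rabbit.
from dataclasses import dataclass


@dataclass
class Action:
    """One concrete step taken in a real app on the user's behalf."""
    app: str
    operation: str
    arguments: dict


def llm_respond(query: str) -> str:
    """An LLM answers in words: it tells you how you could get something done."""
    return f"To handle '{query}', open a ride-hailing app and enter your destination."


def lam_respond(query: str) -> list[Action]:
    """A LAM returns (and would execute) the steps themselves."""
    return [Action(app="ride-hailing", operation="request_ride",
                   arguments={"destination": "home"})]


if __name__ == "__main__":
    print(llm_respond("book me a ride home"))  # advice in text
    print(lam_respond("book me a ride home"))  # steps to carry out
```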

Rabbit considers the LAM the “cornerstone” of its operating system, rabbit OS, and describes it as a “new type of foundation model that understands human intentions on computers”, one that gets the job done accordingly.

The major challenge it overcomes is the messiness of human communication. Humans aren’t always clear: they often provide incomplete information, make requests that don’t fully capture their needs, and change what they want without notice.

Such input is hard for the typical human-machine interface to handle. That is why Apple’s Siri or Amazon’s Alexa sometimes gives a strange response, performs the wrong action, or, more often than not, simply says it doesn’t understand you, whether on a smartphone or a “smart speaker.” Google Assistant is comparatively better, but even with natural language models taking hold, it is still well behind where it ought to be by now.

Rabbit OS promises to understand our all-too-human voice input, make sense of what is really being asked, and translate it into actionable steps or responses that suit the interaction.

For this purpose, rabbit OS uses its long-term memory of you, watching how you do things. "...we take advantage of neuro-symbolic programming to directly learn user interactions with applications, bypassing the need to translate natural language user requests into rigid APIs," Rabbit says, explaining its research.

LAM then calls on the particular apps or services you use regularly and performs the desired task. “LAM has seen most interfaces of consumer applications on the internet and is more capable as we feed it with more data of actions taken on them,” Rabbit says.

Here’s where it’s different: in going about its work, it doesn’t ask users to download apps, install plugins, or write code the way we are typically asked to. Users simply enter the “rabbit hole” web portal, select and log into the apps they want connected to the device via a secure cloud, and then speak to the device naturally to get things done across those apps.

In the current smartphone paradigm, we have to use a variety of apps to perform different tasks. So we switch from one app to another to yet another to book a ride, order food or groceries, play music, or reserve a table at the local restaurant. Apps already demand time and attention, and they are designed to ask for ever more of it over time, ultimately keeping you glued to your smartphone rather than freeing you from it.

The promise of the r1 is that you simply tell it what needs to be done, such as booking a ride home, and it follows through, even if that involves a bunch of apps and tasks. For instance, the company advertised a surprisingly multilayered prompt: “Order me an Uber and find me a good podcast to pass the time… oh, and tell everybody that I might be late.” Such a request cuts across apps and services.
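As a rough illustration of what acting on such a prompt could involve, here is a hedged sketch of how one spoken request might be broken into steps across several services. The service names and the plan structure are assumptions made for the example; Rabbit has not disclosed how the LAM represents its plans.

```python
# Illustrative only: the services and the hard-coded plan below are assumptions,
# not Rabbit's actual design. The point is the shape of the output: one request
# fanning out into actions on several apps.
from dataclasses import dataclass


@dataclass
class Step:
    service: str   # e.g. a ride-hailing, podcast, or messaging service
    intent: str    # what the user wants done there
    details: dict  # parameters inferred from the request


def plan(request: str) -> list[Step]:
    """Turn one natural-language request into a cross-app plan.

    In practice the LAM would infer this; here the mapping is hard-coded
    purely to show what a multi-app plan looks like.
    """
    return [
        Step("uber", "request_ride", {"destination": "home"}),
        Step("podcasts", "recommend_and_play", {"mood": "pass the time"}),
        Step("messages", "broadcast", {"text": "I might be late"}),
    ]


if __name__ == "__main__":
    request = "Order me an Uber and find me a good podcast... and tell everybody I might be late"
    for step in plan(request):
        print(step)
```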

But, you might wonder, surely the LAM doesn’t know how to do whatever is asked of it. That’s true, but here comes another useful feature: it can be trained to do new things.

Using the “experimental teach mode,” anyone can train the model without any technical knowledge or experience. This creates LAM-powered “rabbits,” the label for custom tasks or routines.

Jesse Lyu, the founder and chief executive officer of Rabbit, demonstrated in his keynote how the r1 could be taught to perform a task it doesn’t know how to do — generating an image of a puppy from AI art generator Midjourney via the social platform Discord.

After recording all the steps involved in the process, Lyu tells rabbit OS how to use this ‘rabbit’ so that it can generate any image in this way in the future. He returns to his r1 and commands it to similarly generate an image of a bunny: “Now let’s use Midjourney as I told you to generate a picture of a bunny in pixel art style.” It does.

To sum up, “Record your actions, explain them with your voice, and play them to rabbit OS. LAM will learn the nuances and create a rabbit that can be applied to various scenarios,” as Rabbit says on its website. And if you create a rabbit that could be useful to many others, you even have the option to monetise and distribute it on the upcoming rabbit store.
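To picture what a recorded ‘rabbit’ might amount to, here is a speculative sketch of the record-explain-replay idea from the demo. Rabbit has not published how rabbits are represented internally, so every structure and step below is hypothetical.

```python
# Hypothetical sketch of teach mode's record-explain-replay loop; Rabbit's real
# internal representation of a "rabbit" is not public.
from dataclasses import dataclass, field


@dataclass
class RecordedStep:
    description: str  # what the user did, e.g. "open Discord"
    template: str     # the recorded action, with placeholders for variables


@dataclass
class Rabbit:
    name: str
    voice_explanation: str  # the user's spoken description of the routine
    steps: list[RecordedStep] = field(default_factory=list)

    def replay(self, **variables: str) -> list[str]:
        """Re-run the recorded routine with new values filled into the placeholders."""
        return [step.template.format(**variables) for step in self.steps]


if __name__ == "__main__":
    midjourney = Rabbit(
        name="generate_midjourney_image",
        voice_explanation="Open Discord, go to Midjourney, and run /imagine with my prompt.",
        steps=[
            RecordedStep("open Discord", "open app: discord"),
            RecordedStep("open the Midjourney server", "navigate: midjourney server"),
            RecordedStep("run the imagine command", "type: /imagine {subject} in pixel art style"),
        ],
    )
    # The keynote's follow-up request, replayed with a new subject:
    for action in midjourney.replay(subject="a bunny"):
        print(action)
```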

According to Rabbit, LAM's ability to solve complex problems spanning multiple apps will grow over time — covering even those areas which otherwise require professional skills.

“With LAM fast evolving, my r1 will eventually help me do things that can never be achieved on an app-based phone,” Lyu said in his presentation, arguing the need for this new orange brick in addition to the more expensive one, a smartphone, already sitting in your pocket.

He clarifies that the r1 is not a smartphone replacement; instead, it represents a different generation of devices.

Karan Kamble writes on science and technology. He occasionally wears the hat of a video anchor for Swarajya's online video programmes.

