Running an LLM on Google Chrome
Day 185 / 366
I have been trying to write this blog for the past week, ever since I found out that Google Chrome will soon ship with Gemini Nano, Google’s own LLM. In short, this means you can build websites with GenAI capabilities that run right in the user’s browser. No need to call any APIs or pay any token fees.
I saw a post on Twitter where someone had shared a screenshot of this working on a dev build of Chrome. Ever since then, I had been trying to set it up on my laptop so that I could review it. I failed consistently, until today.
How to get access
You need to fill out this form in order to get access to this feature —
It usually takes about 24 hours to get access. Once you have it, you will need to install the dev version of Google Chrome from this link —
Setting up dev Google Chrome
Once you have downloaded Google Chrome for developers, you need to turn on a few flags in order to enable the on-device LLM feature.
- Open a new tab in Chrome and go to chrome://flags/#optimization-guide-on-device-model
- Select Enabled BypassPerfRequirement
- This bypasses performance checks that might otherwise block Gemini Nano from being downloaded to your device.
- Go to chrome://flags/#prompt-api-for-gemini-nano
- Select Enabled
- Relaunch Chrome.
Once you are done with this, you need to download the Gemini Nano model:
- Go to chrome://components/
- Search for Optimization Guide On Device Model and click on Check for update. This will start the model download.
- You might need to refresh the page a few times before the status updates.
After some time, depending on how fast your internet connection is, the model should be ready to use.
Testing it out
To test this, open a new tab and then open DevTools. Type the following code into the console:
await window.ai.canCreateTextSession();
If you see the output “readily”, you are good to go!
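If the model is still downloading, the call may report a different status. Here is a minimal, more defensive sketch for the console; the “after-download” and “no” statuses are my assumption about these early dev builds and may change:

// A defensive availability check, runnable in the DevTools console.
// Assumption: besides "readily", canCreateTextSession() can report
// other statuses such as "after-download" or "no" in these dev builds.
if (!window.ai) {
  console.log("window.ai is missing -- check your Chrome build and flags.");
} else {
  const status = await window.ai.canCreateTextSession();
  if (status === "readily") {
    console.log("Gemini Nano is ready to use.");
  } else {
    console.log("Model not ready yet, status: " + status);
  }
}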
Now, this is how you ask the LLM a question:
const session = await window.ai.createTextSession();
const stream = session.promptStreaming("Write me an extra-long poem");
for await (const chunk of stream) {
  console.log(chunk);
}
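If you do not need the response streamed, you can reuse the same session for a one-shot call. This assumes the session object in these dev builds also exposes a prompt() method that resolves with the complete response (an assumption on my part, so treat it as such):

// Assumption: the session also has a one-shot prompt() method that
// resolves with the full response as a single string.
const answer = await session.prompt("Write me a two-line poem instead");
console.log(answer);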
That is it. You just got an AI model running locally in your browser!