AI worms
Day 70 / 366
People who used the internet in the early 2000s would remember what computer worms were. These were a kind of virus that spread from computer to computer, mainly via emails. They were the reason why we were scared to click on an email that came from an unknown account. Well, today I found out that people have also started to create similar viruses for targeting Generative AI apps.
With Gen AI being integrated into most applications, we will soon have an ecosystem of Gen AI abstract layers talking to each other, with the hopes of improving the user experience. For instance, there are already AI-powered email clients that will allow you to use AI to reply to emails as well as use AI to summarize incoming emails.
This is where the AI worm concept was used to demonstrate a vulnerability. It starts with a malicious email that has hidden within it an AI prompt. And this is a type of prompt known as an “Adversial Self-Replicating Prompt”. When passed to an LLM, it can be used to change its output to whatever you want.
How it works
The AI email assistant used RAG to go through your existing email to get additional info to create a reply for an incoming email. A malicious email will get into this database and essentially poison it. When RAG is used, the bad prompt from the email will make it to the info that is being passed to the LLM, and so it will control its output and therefore also the automated reply to the email.
That’s not all. When the reply reaches the email folder of another user, their RAG database gets poisoned as well, and this is how the worm can spread to multiple users.
When the internet was new, it had a lot of vulnerabilities like this, and it took a lot of time for things to be as secure as they are today. Something similar would have to happen for AI as well.