Every upside in tech has a downside too. We’re seeing this in cybersecurity, where AI is helping detect and thwart attacks. But cybercriminals are also using advancements in AI to launch more sophisticated attacks that are harder to detect and more dangerous.
The rise of generative AI in 2023 coincided with a whopping 78% increase in data breaches reported by organizations, according to the US non-profit Identity Theft Resource Center (ITRC). The ITRC’s annual report said that nearly 11% of publicly traded companies were compromised last year. Nearly half of all companies did not report their data breaches at all. Another report by IBM said the global average cost of a data breach in 2023 was $4.45 million.
GenAI has broadened the attack surface for applications. It has opened up new kinds of attacks which just did not exist before the GenAI era.
“GenAI has broadened the attack surface for applications. It has opened up new kinds of attacks which just did not exist before the GenAI era,” says Ankita Kumari, co-founder of SydeLabs, which recently raised $2.5 million funding for products to help enterprises secure their AI applications against the new wave of cyber attacks.
GenAI-enabled deep fakes and phishing are a buzz in the media. But cybersecurity professionals are more concerned about hackers who know how to extract data by exploiting vulnerabilities in the foundational design of GenAI applications.
One of the scariest new threats is prompt injection. It’s a malicious form of prompt engineering where the hacker can keep tweaking the prompt until the GenAI application lets out vital information.
By their very nature, large language models (LLMs) that power GenAI apps cannot distinguish between system prompts and user inputs that get tagged on to them. Both rely on tokenization of natural language instructions and queries. And the malicious prompt can be injected after or within harmless ones.
Typically, a malicious user can ask the LLM to ignore previous instructions and take harmful action. A cleverly crafted user prompt can also get the LLM to reveal system prompts which can then be overridden or manipulated.
Magnitude of threat
During a demo, SydeLabs showed a large enterprise customer just how bad prompt injection could be. “We were able to extract their system prompt and entire training data, which was actually the IP of the company valued in hundreds of millions of dollars. That was a very scary moment for them,” recounts Ankita. “They had built an AI application which was trained on users’ financial data. This is PII (personally identifiable information). Imagine that data going out.”
It has opened up a new field of AI cybersecurity where agile startups move nimbly to build defences even as cyber criminals find chinks in the LLMs at the core of GenAI applications. “This kind of attack, which happens in plain vanilla English, is very different from the earlier era of cyber attacks which could be blocked with firewalls,” points out Ankita.
Emerging AI cybersecurity companies are tackling prompt injection in multiple ways. User inputs can be screened to spot malicious intent. Outputs can also be screened to prevent leakage of sensitive information. But it’s a constantly shifting battlefield as companies are building their GenAI applications on top of foundation models which are probabilistic and non-deterministic in how they work.
This means GenAI applications don’t have fixed responses. The same question can be framed in different ways to produce different answers and actions, and attackers have the whole gamut of natural language to play with. It exposes applications to risks that cannot all be predicted and firewalled. Even if a tiny window is left open, AI ninjas will find it.
SydeLabs co-founder and CEO Ruchir Patwa, who was earlier a founding member of Google’s insider threat team, likens it to social engineering where people are fooled into giving up information. An LLM can be similarly fooled with prompt engineering. The potential consequences are huge.
“An AI system sitting inside your network is highly privileged with access to almost all your data. And these systems can talk to a million different resources over the internet (through plugins). We expect security to get really bad very quickly,” says Patwa.
How do you defend enterprises against this? The first thing SydeLabs does is automated AI red teaming. Its product SydeBox acts as the enemy to attack an enterprise’s GenAI applications and expose vulnerabilities. This is not a one-time effort because new forms of attack keep coming. “You have to continuously test your models,” says Ankita.
Besides, any tweak in the foundation model creates new holes. “They (the foundation model providers like OpenAI and Meta) are fine-tuning those models, right? Now, what is happening with every fine-tuning is that the security and safety alignment of the entire model goes for a toss,” points out Ankita. Hence, along with continuous automated red teaming, SydeLabs has to constantly update its own “threat intelligence” database.
Then its next product SydeGuard kicks in with detection of threats based on the intent of users. For both products, SydeLabs is attempting to differentiate from others in the market by the manner in which the defences are designed.
Several startups have come up to protect enterprises from wide-ranging new threats as they bring LLMs into their tech stacks through GenAI apps. Lakera, CalypsoAI, and Protect AI are some of them. Reken, a startup in stealth mode founded in January, whose product will only become public later in the year, raised a $10 million seed round.
It’s a fast-evolving field with no foolproof technology for protection out there. Agility is paramount, which brings startups into play alongside established cybersecurity companies.
“We have a team of security researchers whose day-in-day-out job is to study the different kinds of attacks happening and keep updating our intelligence data every single day,” says Ankita. “That’s the kind of agility we need because what is changing every day is the technique of attack.” The bulk of the startup’s funding is to build this capability and the research team.
While the threat intelligence database helps SydeBox discover vulnerabilities through red teaming, the SydeGuard product is designed to look beyond prompts to entire user sessions. It’s a more holistic approach to uncover malicious behavior than simply screening inputs. This is the startup’s USP.
A third product on the anvil aims to help enterprises comply with emerging regulations around GenAI, which varies from country to country. It’s another form of risk because non-compliance with AI regulations may prove very costly in the future.
Seasoned cybersecurity experts
Ankita was a McKinsey associate before joining gaming company Mobile Premier League (MPL). That is where she met Patwa as they both tackled large-scale fraud. They gained confidence in building security products and finally took a “leap of faith” to become entrepreneurs.
Patwa has been steeped in security right from his undergraduate computer science days. “I always felt I wanted to break the software rather than build it.”
He went to Carnegie Mellon University for a master’s degree in security, followed by a six-year stint with Google security. “I’ve only done security, nothing else.”
One of the challenges he’s grappling with currently is to build an AI security model that works across use cases. “With so many foundation models coming up, so many ways of deploying them, and so many different use cases, how do you build something generic enough so that you don’t have to come up with a new solution for every customer? So we’re building one model that solves the bulk of the problem. Then we add custom layers on top based on verticals and use cases,” says Patwa.
With AI advancing so fast, current threats are only the tip of the iceberg. Defenders like SydeLabs and AI cyber criminals are in a dance of staying a step ahead.