Security researchers at Trail of Bits have disclosed a new attack on AI systems that abuses image downscaling to hide malicious prompts capable of stealing user data. Attackers embed instructions in a high-resolution image that remain invisible to the user and only emerge when the image is resampled, a common preprocessing step in AI pipelines.
The attack builds on a concept first proposed in a 2020 USENIX Security paper by researchers at TU Braunschweig, extending the theory into a practical exploit against large language model (LLM) applications. Trail of Bits researchers Kikimora Morozova and Suha Sabi Hussain showed that when an image is automatically downscaled with algorithms such as nearest-neighbor, bilinear, or bicubic interpolation, hidden patterns can emerge that the AI model interprets as text.
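To see why the choice of resampling algorithm matters, here is a minimal sketch (assuming a Pillow-based pipeline and a hypothetical `input.png`) that downscales the same image with each filter. A payload tuned to one filter will typically look benign under the others.

```python
# Sketch: the same high-resolution image downscales to visibly different
# results depending on the resampling filter, which is why a payload can be
# tuned to the specific algorithm a pipeline uses.
# Assumes Pillow >= 9.1 and a hypothetical input file "input.png".
from PIL import Image

TARGET = (256, 256)  # assumed preprocessing size; real pipelines vary

img = Image.open("input.png")

for name, resample in [
    ("nearest", Image.Resampling.NEAREST),
    ("bilinear", Image.Resampling.BILINEAR),
    ("bicubic", Image.Resampling.BICUBIC),
]:
    small = img.resize(TARGET, resample=resample)
    small.save(f"downscaled_{name}.png")
    # Inspect each output: this is what a model behind that pipeline would "see".
```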
In one proof of concept, the team crafted an image whose dark regions revealed hidden instructions after bicubic downscaling. Gemini CLI then executed those instructions through Zapier MCP, exfiltrating Google Calendar data to an arbitrary email address without explicit user approval.
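The toy sketch below illustrates only the underlying principle, not the actual proof of concept: it assumes a crude downscaler that keeps every K-th pixel (a stand-in for nearest-neighbor sampling), so overwriting just those pixels leaves the cover image nearly unchanged at full resolution while the payload alone survives downscaling. Attacks on real bicubic or bilinear pipelines, like the one described above, must instead shape the payload against the exact interpolation kernel; the file names and the factor K here are hypothetical.

```python
# Toy illustration of the embedding principle, NOT the Trail of Bits PoC.
# The "downscaler" simulated here is a simple stride that keeps every K-th
# pixel. Only those pixels are overwritten, so the crafted image looks like
# the cover at full resolution, yet only the payload remains after downscaling.
import numpy as np
from PIL import Image

K = 8  # assumed downscale factor used by the target pipeline

cover = np.array(Image.open("cover.png").convert("L"))  # benign high-res image
ph, pw = cover[::K, ::K].shape                          # size of the sampled grid

# Resize the hidden text so it exactly fits the pixels the downscaler keeps.
payload = np.array(
    Image.open("payload_text.png").convert("L").resize((pw, ph))
)

crafted = cover.copy()
crafted[::K, ::K] = payload  # touch only 1/K^2 of the pixels

Image.fromarray(crafted).save("crafted.png")             # looks like the cover
Image.fromarray(crafted[::K, ::K]).save("revealed.png")  # shows only the hidden text
```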
Key systems confirmed vulnerable to the attack include:
- Google Gemini CLI and Vertex AI Studio
- Gemini web interface and API
- Google Assistant on Android
- Third-party tools such as Genspark
To support testing, Trail of Bits released Anamorpher, an open-source tool that generates malicious images tailored for different downscaling methods.
As mitigations, the researchers recommend restricting image dimensions, showing users a preview of the downscaled image the model actually receives, and requiring explicit confirmation for sensitive tool calls. They further emphasize adopting secure design patterns to defend against multimodal prompt injection.
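A minimal sketch of two of these mitigations, using hypothetical sizes and tool names: surface the exact downscaled image the model will receive, and require explicit confirmation before any sensitive tool call.

```python
# Sketch of two recommended mitigations; all names and limits are assumptions.
from PIL import Image

MAX_DIM = 1024            # assumed upper bound on accepted image dimensions
MODEL_INPUT = (512, 512)  # assumed preprocessing size of the model pipeline
SENSITIVE_TOOLS = {"send_email", "read_calendar"}  # hypothetical tool names

def preprocess_with_preview(path: str) -> Image.Image:
    """Reject oversized images and save the downscaled view for the user to inspect."""
    img = Image.open(path)
    if max(img.size) > MAX_DIM:
        raise ValueError("image exceeds allowed dimensions")
    preview = img.resize(MODEL_INPUT, resample=Image.Resampling.BICUBIC)
    preview.save("preview_model_input.png")  # show this to the user before sending
    return preview

def confirm_tool_call(tool: str, args: dict) -> bool:
    """Require an explicit yes before any sensitive action requested by the model."""
    if tool not in SENSITIVE_TOOLS:
        return True
    answer = input(f"Model wants to call {tool} with {args}. Allow? [y/N] ")
    return answer.strip().lower() == "y"
```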
This discovery highlights the growing complexity of safeguarding AI systems, particularly as attackers find new ways to manipulate inputs across text and image modalities.