Microscopic pixel-level changes, imperceptible to the human eye, are enough to bypass the safeguards of some artificial intelligence systems, according to new research from Florida International University. Hadi Amini, associate professor at FIU’s Knight Foundation School of Computing and Information Sciences, and graduate assistant Md Jueal Mia found that an altered image, even a picture of a panda bear, can trick AI into generating harmful or policy-violating outputs. The findings, presented at the International Conference on Machine Learning and Applications, show that AI models “don’t see images the same way humans do,” Amini said; instead, the models interpret images as patterns of numbers and pixels. “In order to protect AI systems from attacks, we try to break them ourselves, identify potential vulnerabilities and design defense mechanisms,” Amini said, framing the work as a proactive effort to strengthen future AI security.
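To make the "patterns of numbers" point concrete, here is a minimal Python sketch, not taken from the research, of how an image looks to a program: a grid of integers from 0 to 255. The flat grey placeholder image, the helper names `perturb` and `max_pixel_change`, and the budget of 2 intensity levels are all assumptions chosen for illustration. The noise here is random, so it only shows how small such a change is; it is not an attack.

```python
import numpy as np

def perturb(image, epsilon, seed=0):
    """Return a copy of `image` with each pixel value shifted by at most `epsilon`."""
    rng = np.random.default_rng(seed)
    # Random integer noise in [-epsilon, epsilon] (illustrative, not loss-guided).
    noise = rng.integers(-epsilon, epsilon + 1, size=image.shape)
    shifted = image.astype(np.int16) + noise
    # Keep values inside the valid 8-bit range before converting back.
    return np.clip(shifted, 0, 255).astype(np.uint8)

def max_pixel_change(a, b):
    """Largest absolute per-pixel difference between two 8-bit images."""
    return int(np.max(np.abs(a.astype(np.int16) - b.astype(np.int16))))

# Hypothetical 224x224 RGB image: a flat grey placeholder, not real data.
image = np.full((224, 224, 3), 128, dtype=np.uint8)
altered = perturb(image, epsilon=2)

print(image[0, 0], altered[0, 0])        # one pixel before and after
print(max_pixel_change(image, altered))  # never more than epsilon
```

A shift of two levels out of 255 is far below what most viewers would notice, yet every changed number is visible to a model that reads the array directly.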

Pixel-Level Perturbations Bypass AI Safeguards

Rather than building elaborate adversarial attacks, the research exploits the way AI systems process visual information at a fundamental level. To do so, the team developed JaiLIP (Jailbreaking with Loss-guided Image Perturbation), an algorithm designed to determine the optimal degree of pixel-level manipulation needed to bypass AI safeguards. Tests on BLIP-2, a multimodal AI model, showed a significant increase in the likelihood of harmful or unsafe responses when the system was presented with altered images. In one example, a JaiLIP-modified image of a stoplight tricked the model into providing detailed instructions on how to disregard traffic signals without incurring a penalty.
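The article does not describe JaiLIP's loss function or update rule, so the sketch below is not the researchers' algorithm. It shows only the general idea of a loss-guided, bounded perturbation, using a swapped-in textbook technique (signed-gradient steps projected back into a small box) on a toy logistic scorer rather than BLIP-2. The weights `w`, bias `b`, the budget `epsilon`, the step size, and the function name `loss_guided_perturbation` are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_guided_perturbation(x, w, b, epsilon=0.01, step=0.002, iters=20):
    """Nudge x, within +/- epsilon per value, to lower a toy model's loss.

    Toy loss: -log(sigmoid(w @ x + b)). This is a stand-in for illustration,
    not the loss used by JaiLIP.
    """
    x_adv = x.copy()
    for _ in range(iters):
        p = sigmoid(w @ x_adv + b)
        # Gradient of -log(p) with respect to the input is -(1 - p) * w.
        grad = -(1.0 - p) * w
        # Take a small signed step that decreases the loss.
        x_adv = x_adv - step * np.sign(grad)
        # Project back so no value moves more than epsilon from the original.
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
        # Keep values in the valid normalized pixel range.
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv

# Hypothetical toy setup: 64 normalized "pixel" values and random weights.
rng = np.random.default_rng(0)
w = rng.normal(size=64)
b = -1.0
x = rng.uniform(0.2, 0.8, size=64)

x_adv = loss_guided_perturbation(x, w, b)
print(np.max(np.abs(x_adv - x)))                     # stays within epsilon
print(sigmoid(w @ x + b), sigmoid(w @ x_adv + b))    # toy score before and after
```

The design point is the projection step: the loss decides the direction of each change, while the clipping keeps the total change small enough to stay hard to see.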

The researchers found that using JaiLIP images nearly doubled the number of harmful responses generated by the AI models tested, extending the risk beyond simple requests for illegal activities. Amini emphasizes that small businesses and companies integrating AI must be aware of these vulnerabilities and deploy sufficient guardrails to protect the safety and integrity of their AI tools. The challenge lies in ensuring that AI can recognize threats hidden in plain sight, even when humans cannot.


JaiLIP Algorithm Increases Harmful AI Response Rates

The Florida International University researchers probe the defenses of artificial intelligence systems with a counterintuitive strategy: attacking the systems on purpose in order to strengthen future security. The aim is to identify vulnerabilities before malicious actors can exploit them. The team’s work shows that even microscopic pixel-level changes are sufficient to circumvent AI safeguards, highlighting the fragility of current security measures. Amini recommends that organizations limit sensitive data input, restrict system access, and thoroughly evaluate built-in security features before deploying AI tools.


Source: https://news.fiu.edu/2026/fiu-researchers-reveal-how-altered-images-can-bypass-ai-safeguards
