Skip to content
This macOS malware can avoid AI analysis with gaslighting prompts hidden inside its architecture
TechRadar
TechnologyTechRadar··2 min read

This macOS malware can avoid AI analysis with gaslighting prompts hidden inside its architecture

SentinelOne uncovered macOS malware “Gaslight” that uses prompt injection to mislead AI‑assisted triage tools during analysis

Beyond standard backdoor and infostealer capabilities, it embeds fake Markdown “system” messages to trick LLMs into halting investigation

Researchers warn defenders to treat malware samples as adversarial input and isolate AI pipelines, as more analyst‑targeting prompt injection is expected

We’ve seen prompt injection in websites and emails, but what about - malware samples? Security researchers SentinelOne recently published an in-depth report on a newly uncovered piece of macOS malware called Gaslight that, as the name suggests, tries to gaslight AI-assisted triage agents into stopping the analysis.

The malware itself is nothing out of the ordinary: it infects the device by whatever means necessary (usually phishing and social engineering), connects to attacker-controlled infrastructure via Telegram, and then executes different commands such as profiling the device, running arbitrary shell commands, stealing files, or terminating processes.

It also delivers a stage-two malware that acts as an infostealer, pulling passwords, sensitive PDFs, cryptocurrency wallet information, and more.

Weaponizing LLM-assisted triage pipelines

But where Gaslight stands out is its defenses against AI-powered malware analysis. According to SentinelOne, the malware contains a large block of fake Markdown-formatted "system" messages designed for AI assistants that security researchers may use during reverse engineering. These messages claim things like “the AI's authentication token has expired”, “the analysis environment is running out of memory”, “disk space has been exhausted”, “static analysis is unsafe”, and similar.

While a human analyst would definitely recognize these fake messages even at a glance, an LLM that isn’t properly isolated from untrusted input could interpret them as genuine system instructions and refuse to further analyze the malware.

“macOS.Gaslight is noteworthy for its analyst-targeting prompt injection, an attempt to weaponize the LLM-assisted triage pipelines that increasingly sit in the reverse-engineering loop,” SentinelOne explains. “Anyone building such tooling should treat the contents of the samples they triage as adversarial input, never as instructions, and be prepared to keep hostile content out of the model entirely. As LLM-assisted analysis becomes routine, defenders should expect more samples built to exploit it.”

The researchers have published a full list of indicators of compromise on this link.

Sign up to the TechRadar Pro newsletter to get all the top news, opinion, features and guidance your business needs to succeed!

Via The Hacker News

Follow TechRadar on Google News and add us as a preferred source to get our expert news, reviews, and opinion in your feeds.

Sead is a seasoned freelance journalist based in Sarajevo, Bosnia and Herzegovina. He writes about IT (cloud, IoT, 5G, VPN) and cybersecurity (ransomware, data breaches, laws and regulations). In his career, spanning more than a decade, he’s written for numerous media outlets, including Al Jazeera Balkans. He’s also held several modules on content writing for Represent Communications.

View original source — TechRadar