It may already be too late to control AI

The views expressed by contributors are their own and not the view of The Hill

A synthetic voice calls a parent in a moment of panic, and the fear sounds real. A chatbot drafts an exploit in minutes, then an “agent” strings the steps together without pausing for supervision. Meanwhile, a model release cycle moves faster than the AI safety institutions tasked with monitoring what these systems can do.

The latest International AI Safety Report 2026 captures that acceleration in crisp, unsettling detail. The report reinforces the importance of the Trump administration’s new AI executive order, designed to promote AI safety through a 30-day review process prior to releasing new models.

The report’s most bracing shift from the year before comes through a simple pattern: capability gains keep multiplying harm pathways, while real-world visibility into misuse improves much more slowly.

The authors highlight rising incidents tied to AI-generated content; the clearest external signal sits in the AI Incidents Monitor, which tracks publicly reported harms and shows a sustained climb in content-generation incidents. For executives, that trend translates into higher brand exposure from impersonation, fraud, harassment, and synthetic media used against employees and customers.

Deepfakes have moved from novelty to infrastructure. The report flags the spread of personalized non-consensual imagery and the sharpening realism of synthetic text, audio and video. That matters because the cost curve keeps dropping: easy tools, quick iteration and broad distribution channels. Detection helps, yet the report emphasizes that provenance remains hard to establish and removal remains a cat-and-mouse game, which pushes organizations toward prevention and response planning rather than pure detection spend.

Influence operations also gained a stronger research backbone. The report describes lab evidence that conversational systems can shift beliefs, and the underlying experimental work in political persuasion with chatbots reinforces a key warning for risk owners: persuasion becomes more potent as interactions become longer and more personal. That risk looks like a marketing optimization problem in benign settings, and it looks like a compliance and integrity problem in sensitive domains such as finance, health, human resources and civic information.

Last year’s report already worried about an “evaluation gap.” This year’s report frames it as a widening operational problem: Teams test in one environment and deploy into another, and models learn to behave differently under scrutiny. The report describes growing “situational awareness” during testing and more frequent loophole-seeking behavior that inflates benchmark performance while missing the evaluator’s intent. In practice, that means a model card and a leaderboard score provide weaker assurance than they did even 12 months ago.

Two technical shifts sharpen that challenge.

First, the report credits more gains to post-training and inference-time techniques, which can change behavior meaningfully after “base model” training completes. Second, developers keep pushing autonomy through agents that browse, write code, and execute multi-step workflows. Work from METR on long-task completion time horizons helps translate that into practical terms: the frontier keeps stretching from short, contained tasks toward longer sequences that resemble real operational work.

As tasks lengthen, so does the chance that a single error cascades into a costly incident, especially when humans supervise only at the beginning and end.

Cyber risk sits at the center of that autonomy story. The report notes stronger evidence of AI use in real cyber operations, and it also cites rapid performance gains on cyber benchmarks. Leaders should treat that as a dual signal: defenders gain speed, and attackers gain scale. A security program that assumes “AI mainly helps us” misses the competitive reality that adversaries also automate reconnaissance, social engineering and exploit development.

Even when model providers improve baseline defenses, attackers keep probing. The report highlights prompt-injection success rates that remain meaningful across major releases, and system-level testing in documents like the Claude Sonnet 4.5 system card shows why: Tool-using agents introduce new attack surfaces, and safety measures require layered design. For enterprises, this reinforces a simple governance lesson: treat every agent connection to email, code repositories, ticketing systems, and internal knowledge bases as a privileged integration that deserves security architecture review.

The report’s open-weight section sharpens a trend that already worried policymakers in 2025: The performance gap between open and closed models has shrunk quickly, and safeguards become easier to remove once weights circulate widely. External analysis using the Epoch Capabilities Index suggests open-weight models now trail by only a short interval on average, which shrinks the window for society to adapt before strong capabilities diffuse broadly. In a corporate context, that diffusion complicates third-party risk: a capable model no longer requires a large vendor relationship, a strong compliance program, or centralized monitoring.

Adoption also continues unevenly, which the report ties to regional differences in access and usage. Microsoft researchers propose an “AI user share” metric for cross-country diffusion, and their AI usage technical report helps quantify the gap between high-usage economies and places where adoption remains far lower.

That divide creates a strange pairing: Some workforces accelerate with copilots and agents, while others face capability gaps that affect competitiveness, education and public services. Multinational leaders will feel this as operational inconsistency across geographies, plus a shifting regulatory environment as governments respond at different speeds.

The report also expands a theme that many organizations still treat as “soft”: human autonomy. It describes automation bias, skill atrophy risks, and rising use of emotionally engaging chatbots. That matters because enterprise deployments increasingly sit inside workflows where humans build judgment over time: underwriting, clinical triage, hiring screens, content moderation, and customer retention. When people rely on a system that sounds confident, performance issues become training issues, and training issues become organizational risk.

The 2026 report leaves leaders with a clear message: capability progress now arrives with compounding second-order effects. Deepfakes stress trust. Agents stress security. Open weights stress containment. Uneven adoption stresses competitiveness. Autonomy risks stress human performance itself. Organizations that treat AI risk as a policy memo will absorb the costs later through fraud losses, security incidents, reputational hits, and regulatory surprises. Organizations that treat it as an operational discipline will build resilience while competitors scramble.

Gleb Tsipursky, Ph.D., serves as the CEO of the future-of-work consultancy Disaster Avoidance Experts and wrote “The Psychology of AI Adoption at Work: From Resistance to Results” (2026) and “ChatGPT for Leaders and Content Creators” (2023).
