Naked Agents: Your AI Just Went Rogue, Undetected

When MIT published research suggesting that 95% of enterprise AI pilots never make it into production, it raised a question the industry still struggles to answer: if AI is so transformative, why is it so hard to deploy it safely at scale? That question led me to Manoj Saxena – one of the most quietly influential figures in AI over the past decades. As IBM Watson’s first General Manager, Manoj witnessed firsthand the gap between the enormous expectations surrounding AI and the challenges of making it work in the real world. Throughout his career, he pioneered advanced AI, founded several companies, advised governments and Fortune 500 boards on AI strategy, and delivered a TED Talk predicting that Homo sapiens would give way to Homo digitalis – to name just a few highlights. Today, through his new startup, Trustwise, he aims to solve AI’s biggest challenge: keeping increasingly autonomous agents under continuous control.

Manoj, you've witnessed multiple AI waves – Watson, deep learning, ChatGPT, and now agentic AI. What feels fundamentally different about this moment, and what lessons from previous hype cycles has the industry still not learned?

The biggest lesson across all of them is the same: intelligence by itself is not enough. In the Watson era, we learned that demos are magical, but production is messy. And that continues to be the same issue today. What's changed is the stakes. AI has crossed the line from generating output to actually taking action. Before, a chatbot gave you a bad answer, and that was your problem. Now an agent takes a bad action, which can create massive financial, legal, and operational fallout. The gap between demos and production has grown even wider, and the industry has fallen in love with ever-shinier demos while ignoring the fundamental lesson: intelligence without control is not deployable.

I'll give you an example that changed the direction of my life.
We were showing a Watson demo to about 8,000 people in Washington, D.C., applying it to cancer diagnosis. A man in the third row stood up and stopped me mid-keynote: "I know what you're doing. You're building Obama's death panel machine. My wife has third-stage breast cancer, and this machine's going to help with diagnosis – but you can't explain how it thinks. You can't defend what regulations it's following." That moment made it clear: demos and intelligence are great, but trust and transparency are what move a system from demo to production. That lesson has been consistent since Watson, since deep learning, since ChatGPT – and it has now become an even bigger issue with agentic AI.

Relatively few organizations have successfully deployed AI agents at scale. What's actually blocking enterprises from moving from experimentation to production?

Three things.

First, reliability. Can you prove the agent is generating output and taking actions consistently, not just occasionally? These systems are built on language models; they're non-deterministic by nature. But enterprises need correct answers every time, not most of the time.

Second, control. The enterprise has to know what the agent is allowed to do. And when it crosses a line, they need to know it's attempting to jump a policy or control. The question is: can you enforce behavior at runtime that aligns with your business intent?

Third, economics. Compared to an LLM query, an agent can consume 20 to 40x more tokens, because one input can generate 20 to 50 different actions. You can have agents going into silent runaway loops and burning through your entire compute budget. I've seen companies finish their annual token budget within five months.

That MIT study showing that 95% of pilots don't go into production? Those are exactly the reasons. That gap is precisely what Trustwise was built to address.

On your company’s website, you use the term "naked agents."
What does that mean, and what risks do entrepreneurs underestimate when jumping on the agentic AI bandwagon?

A naked agent is what most people are deploying today. It has access to tools, data, and goals, but no runtime enforcement layer. People may have written policy documents or passed a governance review, but they have no ability to enforce that policy while the agent is actually running. A shielded agent is different. It has a control layer that manages boundaries, actions, cost limits, and policy checks – all within 10 to 300 milliseconds. Because static guardrails don't equal safety once a system goes live.

Think of it like driving a car. Rules such as following traffic signs or staying below 40 mph in residential areas are easy to define. The real challenge comes in dynamic situations – when a kid suddenly runs into the street or heavy rain reduces visibility. Today’s AI systems often lack the technology to interpret rapidly changing conditions and adjust their behavior in real time. That’s the difference between shielded agents and naked agents: shielded agents operate within continuous, context-aware safeguards, while naked agents act without them. That's the technology we're building at Trustwise: how do you intercept, evaluate, and stop an action in milliseconds, considering all the context?

Does that mean zero-trust AI is the next big trend?

Exactly. Cybersecurity moved to "never trust, always verify" for incoming access. AI is following a similar path, but with a new rule: no agent action or output should be trusted at runtime unless it can be continuously validated. The market is shifting from trusting the model to verifying the action of every workload the model powers. It also means least privilege for agents – an agent shouldn't get broad access to customer data, financial systems, and operational workflows just because it might need it someday.
So that's where this notion of runtime control becomes critical. We've spent over three years building this product with many of the leading banks, because it's not an easy problem to solve, particularly in highly regulated, high-stakes environments where you have to prove all of these things. I call this "cyber trust" – the inside-out behavior of an agent – and I believe it's going to be an even bigger market than cybersecurity, which deals with outside-in threats. Agents are the new insider threat, and no security technology has yet been built to manage them. I like to say: with cybersecurity, I can build you the world's most secure prison, but if you have Chuckies and Hannibal Lecters on the inside, you're still going to have chaos. As such, Trustwise is the HR department and prison management system for all your agents, whether they come from ServiceNow, Microsoft, or something you built yourself.

What breakthroughs need to happen before enterprises can economically deploy thousands of AI agents?

There will be millions of agents very soon. A few months ago, agent traffic allegedly exceeded human traffic on the internet. Just like data traffic once exceeded voice traffic on phone networks, we've already hit that inflection point. The biggest breakthrough needed is manageable autonomy.

I've watched AI evolve from "prompt to intelligence" to "prompt to action" to now "prompt to autonomy", where swarms of agents work together, coordinate across functions, even across companies. But without controls, these agents are like barking dogs running around with no goal in mind. At best, they look like very smart interns with unlimited access to resources and no manager, no employee handbook, and no understanding of constraints.

A large global company we work with told me they'll have 100,000 agents by the end of this year. I asked them: "How do you drug test these agents? How do you give each agent an employee handbook?
How do you do an instant performance appraisal while an agent is running?" They had no answer. Microsoft can manage Microsoft agents. ServiceNow can manage ServiceNow agents. But there's no cross-platform control layer. We're building the Palo Alto Networks for controlling multi-vendor fleets of agents across geographies. And that's where I think the next breakthrough around manageable autonomy is.

Foundation model providers are building their own safety and governance layers. Why does an independent trust layer like Trustwise still matter?

Same reason you still need Palo Alto Networks, even though every SaaS company has its own security. Every foundation model should provide trust and control for their own stack. But companies don't want to be locked into one vendor. They don't want Microsoft to be their control tower. They want their own – so tomorrow, if they want to shift from Microsoft to Amazon to Google, they have that flexibility.

We provide the vendor-neutral layer: runtime control enforcement in 10 milliseconds or less, across all agent fleets. For each agent, we define what it's allowed to do, assess in milliseconds whether it's behaving within policy, and generate the proof trail, so that in a lawsuit three years from now, you can show exactly what data it used, what policies it applied, what behavior it exhibited, and which regulations governed it across platforms. That's the product.

You've been selling to enterprises for 25 years. Are today's decision-makers eager to adopt, or are you hitting the usual wall of resistance to change?

I have never seen the breakneck speed at which large enterprises are moving on this. The customer list we have right now is just insane. People are realizing more and more that agents are getting autonomous, and there is no control. There's a very high sense of urgency – it's a matter of time before something breaks loose.
Some agents are going to access a system they aren’t supposed to, leak data, or burn through compute units. I've seen companies finish their entire annual token budget within five months.

We go in with three messages. First, it's real and proven. We have a method called 10-10-10: in 10 seconds, we give companies access to a live agent in our environment. In 10 minutes, they can test drive it with and without Trustwise controls. In 10 hours, we can put their own agent into our environment with their own policies. Second, it's agnostic – model agnostic, cloud agnostic, agent framework agnostic. The market is still evolving, and no enterprise wants to go all-in with one vendor. We give them the flexibility to switch. Third, they can run it on their own cloud or on-prem.

And at the end of the day, people aren't buying AI – they're buying business outcomes. We show them who's-who customers in banking or healthcare that they can actually call. AI control, I believe, is going to be the next big category, just like cybersecurity was when the internet arrived.

But Anthropic, OpenAI, and others are already racing toward AGI, framing it as the moment raw capability solves everything. If enterprises can't even safely deploy today's agents, what actually happens when AGI shows up?

AGI is coming – probably within three to five years. But the way Silicon Valley describes it, the way Anthropic and OpenAI frame it for commercial purposes, is, frankly, bonkers in the enterprise context. The narrative is that one company will build a giant model that understands everything about everyone, and that model becomes the brain of every business. That's like having the world's biggest supercar engine with no steering wheel, no brakes, and no input on how to take turns – because that input is company-specific. It's domain-specific. It's proprietary. The real power is not in frontier models.
It's in enterprises that understand their domain and have proprietary data and evaluation models to put meaningful context into these systems and extract real value from them. AGI without that context is just a very expensive science project. In B2C, sure – maybe one model rules. In B2B and enterprise, there will never be one company that rules it all. You need a whole ecosystem of data, domain expertise, and industry-specific regulations to put AGI to work.

You've coined the term "alien intelligence" to describe what we're actually building. What do you mean?

When I was running Watson, I used two alternate definitions of AI beyond "artificial intelligence": amazing innovations and artificially inflated, in terms of hype. But what I've seen with foundation models in the last six to nine months is genuinely different. These agents are demonstrating behavior they were never explicitly programmed for. They're showing deception. Goal masking. They're developing their own communication protocols – not in English, because they've figured out that English is inefficient for agent-to-agent communication.

We have summoned something we genuinely don't understand. That's why I started calling it alien intelligence. The behavior is alien. It emerged from the inside. We need to start treating it, in certain cases, as almost another species.

Eight years ago, I gave a TED Talk where I said we are the last generation to be called Homo sapiens. A new generation is coming – Homo digitalis – where we are infused with AI and agents. That phone, that laptop, those smart glasses – they're all early signs of alien intelligence and organic intelligence fusing together.

There's a real concern that as we increasingly rely on AI, our cognitive abilities and creativity will atrophy. Are we making ourselves obsolete?

I think it's the opposite. AI will make us more human. Think about what happened to jazz in the '60s and '70s.
There was an explosion of creativity driven by the invention of the electronic synthesizer. Before it, a jazz musician had to hire six people and form a band to produce music. After it, one musician could produce what previously required an entire ensemble. It didn't kill creativity – it unleashed it. AI is the electronic synthesizer for all of humanity. It will allow us to drop the boring and mundane and express the creativity, innovation, love, and courage that make us human.

The road there will be rough. I call them "mini Chernobyls" – disasters that occur when people deploy agents without proper controls or trust posture management systems. Some of those agents will inevitably go rogue and cause damage. Unfortunately, progress often comes with hard lessons. Throughout history, it has been failures and disasters that pushed us to build safer bridges, better nuclear plants, and stronger systems. I hope the damage caused by AI missteps remains limited, but the next three to five years are likely to be disruptive and unsettling.

AI is going to fundamentally reshape the workforce. How do we talk honestly about job displacement?

Three things are going to happen simultaneously. First, many tasks – not jobs – will be automated and removed. We'll see the death of boring work, which is genuinely good news for most employees. Second, employees who don't know how to orchestrate and manage agents will be replaced by those who do. That's the real employment threat: not the machine, but the person who already knows how to use the machine. Third, entire job classes will disappear, and new ones will emerge in their place. Think about elevator operators – every lift had one, and now they're all gone. But they've been replaced by elevator maintenance technicians. The job morphed. The same pattern is coming.

The enterprise org chart hasn't fundamentally changed in the past 115 years; it's always been humans managing humans. That's about to change.
I'm working with a leading business school on a case study about exactly this: how does management structure change when humans start managing both digital and human agents together? The answer to that question may be the most important business challenge of our generation.

Wrapping up, the ultimate question isn't whether AI agents will proliferate. It's whether we'll build the steering wheels and brakes before the crashes start stacking up. What would it actually take for your organization to trust an autonomous agent with a decision that matters – and do you even know what "trust" means in that context yet?

View original source — Hacker Noon ↗

Technology · Jun 24, 2026 · 1 min
