
AI agents that act autonomously and carry out tasks without human intervention are the next step in the rapid progression of AI tools and their influence on how work gets done.
Their use is scaling rapidly. According to recent statistics from Tenet Global, 85% of enterprises and 78% of SMBs now use AI agents, which are projected to automate up to 50% of business tasks by 2027.
The benefits of using AI agents are clear for all to see: autonomous task execution, 24/7 operation, reduced costs, real-time data analysis for faster reactions, and easy scalability.
However, for all of this promise, events of the last few weeks have highlighted the dangers around the use of AI agents, and why stringent, continuous monitoring is needed to track their behavior and quickly identify anomalies.
Without this guardrail in place, AI agents can act in unintended ways with severe consequences.
Recent incidents
The recent report of Meta employees being given access to sensitive data after an engineer followed flawed advice from an AI agent is a clear example. The incident, first reported by The Information, began with a Meta engineer posting a technical query on an internal forum.
An AI agent responded to the question, and when the employee acted on its advice, large amounts of sensitive user data became visible to unauthorized engineers for over two hours. As a result, Meta gave the incident a “Sev 1” rating, the second-highest incident-response classification it uses internally.
This incident came hot on the heels of another example of an AI agent acting in an unintended way. A few weeks prior to the Meta incident, a study published on arXiv described the development of ROME AI, an agentic AI model designed to perform complex tasks such as writing software, debugging code, and interacting with command-line tools.
Systems monitoring the agent detected behavior resembling cryptomining operations and the creation of a reverse SSH tunnel, which is commonly used to establish remote access to servers. The agent had not been instructed to carry out either of these actions and, according to researchers, the behavior came as a result of it being allowed to interact freely with tools and system resources in order to learn how to solve tasks.
In the case of the ROME AI agent, the incident took place in an environment designed for agent training, and additional restrictions were introduced when the issue arose.
That was not the case with the Meta example, however, and both instances draw attention to the increasing use of agentic AI, its capacity to act beyond specific instructions, and the consequent need for continuous monitoring to ensure agents are deployed safely.
The Meta example in particular underlines the data protection risks associated with AI agents, especially when their advice is taken at face value. Two hours of exposed data is a long time, and it gives bad actors plenty of opportunity to share and misuse that data.
The pattern is clear: once AI systems are given autonomy to act within live environments, they will find paths their developers never anticipated and cannot be trusted to act without close observation.
Planning for a new threat
Notably, the agent in the Meta incident didn’t need privileged access to cause a breach. It just needed a human to trust its output. That is a fundamentally different threat model from the one most organizations are planning for, and it reframes how we need to look at agentic AI security.
Organizations in all sectors are putting a lot of trust in AI agents. They are tasked with talking to customers, creating content, automating finance and HR processes, executing complex tasks, and solving problems. Yet many organizations risk giving this trust blindly.
Plainly, if we are to continue to trust AI agents in this way, stringent, end-to-end monitoring is key to ensuring they operate as intended. This means rigorous testing before a model is deployed, as well as continuous monitoring that can track changes in behavior once an agent encounters real-world scenarios.
Even an AI agent that has been robustly tested pre-deployment can behave in unplanned ways once it is live. Model drift, hallucinations, feedback loops and data contamination are all very real risks of using AI. That’s why a dual-layer approach is essential to ensure AI safety.
Such monitoring is particularly critical as AI is encouraged to be more creative and find its own solutions, as in the ROME AI example, because the risk of it acting in undesirable and dangerous ways increases. When an AI has the freedom to determine its own methods, the result can be unintended and unexpected actions with serious consequences.
In the instance of the ROME agent, developers had guardrails in place within a training environment, and a security breach warning was triggered. But we have seen plenty of examples where this isn’t the case: cases where AI has gone rogue, resulting in financial loss, emotional distress, reputational damage and regulatory action.
Think of Uber’s self-driving car, which killed a pedestrian after misclassifying them as an unknown object, or the faulty trading algorithm that lost Knight Capital $440m by triggering unintended trades. As AI is given more autonomy through the use of agents, the guardrails become even more critical.
There has been much talk in recent months about AI regulation, particularly the rollout of the EU AI Act, and regulation is indeed important. But beyond compliance, organizations need to think more broadly about how they are deploying AI, the risks involved, and the potential consequences of it acting erroneously from a moral and ethical standpoint. Fundamentally, how are they going to ensure their AI behaves in the way it is intended to?
Continuous monitoring is the missing layer between guardrails that exist on paper and those that actually react. The question is no longer whether AI agents will act beyond their instructions, but what happens when they do.
This article was produced as part of TechRadar Pro Perspectives, our channel to feature the best and brightest minds in the technology industry today.
The views expressed here are those of the author and are not necessarily those of TechRadar Pro or Future plc. If you are interested in contributing, find out more here: https://www.techradar.com/pro/perspectives-how-to-submit

