Thought Leaders
The Coming “Exolution” of AI

Today, standing at the edge of a technological fault line, we are watching the journey from LLMs to agents, and eventually to agentic AI and AGI. That journey isn’t just about bigger models or faster responses. It’s about machines moving from passive assistants to active collaborators, and perhaps, one day, independent thinkers.
Let’s trace this path and explore what it means for work, expertise, and the very role of humans in shaping the intelligence of tomorrow.
The difference between LLMs, agent-based systems, and agentic AI
To better understand the difference, consider an example. If I ask an LLM something like, “I want to travel from Chicago to Austin, drive no more than four hours a day, and stop in scenic places,” a regular LLM will return a static text response based purely on language generation. It will likely just answer the request without performing any deeper analysis.
An agent would first classify the request as travel-related. Then it would determine what data is needed: routes from mapping services, weather information, fuel costs, hotels, restaurants, and so on. After that, the agent would break the request into sub-tasks and route them to specialized modules or LLMs trained on the relevant sources. This is orchestration: coordinating multiple models and tools under a unified logic.
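To make the pattern concrete, here is a minimal sketch of such an orchestration loop in Python. Everything in it is hypothetical: classify_request and plan_subtasks stand in for LLM calls, and the module registry illustrates routing rather than any particular framework’s API.

```python
# Hypothetical orchestration sketch: classify a request, plan sub-tasks,
# and route each one to a specialized module. All names are illustrative.
from dataclasses import dataclass

@dataclass
class SubTask:
    kind: str      # e.g. "routing", "weather", "hotels"
    query: str     # the slice of the user request this module handles

def classify_request(request: str) -> str:
    """Stand-in for an LLM call that labels the request, e.g. 'travel'."""
    return "travel" if "travel" in request.lower() else "general"

def plan_subtasks(request: str) -> list[SubTask]:
    """Stand-in for an LLM call that decomposes the request."""
    return [
        SubTask("routing", "Chicago to Austin, max 4 hours driving per day"),
        SubTask("weather", "forecast along the planned corridor"),
        SubTask("hotels", "scenic overnight stops along the route"),
    ]

# Registry of specialized tools/models, one per sub-task kind.
MODULES = {
    "routing": lambda q: f"[route plan for: {q}]",
    "weather": lambda q: f"[forecast for: {q}]",
    "hotels":  lambda q: f"[hotel options for: {q}]",
}

def run_agent(request: str) -> str:
    if classify_request(request) != "travel":
        return "[fall back to a plain LLM answer]"
    results = [MODULES[t.kind](t.query) for t in plan_subtasks(request)]
    # A final LLM call would normally merge these into one itinerary.
    return "\n".join(results)

print(run_agent("I want to travel from Chicago to Austin..."))
```

In a production system, each lambda would be a real tool call or a specialized model, and a final synthesis step would turn the intermediate results into a single coherent answer.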
Today, most major systems like ChatGPT or Claude from Anthropic are essentially already agents. Although it may appear to the user that they’re interacting with a single model, behind the scenes is a complex architecture involving many models and systems. They can already handle complex queries, but their capabilities are mostly limited to providing information; they don’t take action yet.
A fully autonomous agent is a system that gathers information and can, for instance, independently book a hotel, purchase a ticket, or initiate a payment, provided it has access to the relevant APIs or user data. Such agents are currently in early development stages. At this point, they are more like semi-agents, capable of processing information but not yet performing autonomous actions.
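What separates today’s semi-agents from fully autonomous ones is that last step: executing an action against an external API. Here is a hedged sketch of what it might look like; book_hotel, the endpoint URL, and the payload are all invented for illustration, and the confirmation gate reflects how such actions would likely be supervised today.

```python
# Hypothetical sketch of the step that separates a "semi-agent" from a
# fully autonomous one: actually executing an action via an external API.
# book_hotel() and its endpoint are invented for illustration.
import requests

def book_hotel(hotel_id: str, check_in: str, nights: int, api_key: str) -> dict:
    """Call a (hypothetical) booking API on the user's behalf."""
    resp = requests.post(
        "https://api.example-booking.com/v1/reservations",  # placeholder URL
        headers={"Authorization": f"Bearer {api_key}"},
        json={"hotel_id": hotel_id, "check_in": check_in, "nights": nights},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

def act(plan: dict, api_key: str, require_confirmation: bool = True) -> dict:
    """Today's systems typically stop at the plan; an autonomous agent
    would proceed to the booking call, ideally gated by user consent."""
    if require_confirmation:
        answer = input(f"Book {plan['hotel_id']} for {plan['nights']} nights? [y/N] ")
        if answer.lower() != "y":
            return {"status": "cancelled by user"}
    return book_hotel(plan["hotel_id"], plan["check_in"], plan["nights"], api_key)
```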
An interesting area of discussion in the research community is agentic AI. Unlike a regular agent, whose behavior is scripted by developers, agentic AI is a system that independently decides what tasks to perform, what data it needs, and even how to continue its own training. This goes beyond executing instructions; it involves making autonomous decisions. However, agentic AI remains theoretical at this stage; no such systems exist in practice yet.
AGI – the new horizon. But is it achievable?
Three months ago, Meta invested in Scale AI. The goal was to join forces on the path to building AGI: Artificial General Intelligence, capable of performing any task at a human level or even surpassing it. If today’s AI is a technological revolution, AGI will be a true mega-revolution; sometimes I call it “exolution”, meaning the “exodus” of AI from the shadows. Whoever achieves it first will gain a global strategic advantage.
As for how close we are to actual AGI, that depends heavily on how we define it. I align with Ilya Sutskever’s view: AGI is a system capable of performing any intellectual task that a human can. Not just answering questions, but reasoning, decision-making, generalization, and interpretation across domains. True AGI is universal and not confined to narrow task boundaries.
None of the current models has reached that level. We’re moving in that direction, but true AGI, in the theoretical sense, still doesn’t exist. And perhaps that’s for the best. We’re still in a phase of approximation, and it’s likely we’ll remain there for quite some time.
The foundation of AGI will likely be an agent-based system. It won’t necessarily rely on a single LLM, because just as no single human, no matter how brilliant, can master all domains of knowledge and skills, no single LLM can handle the full spectrum of AGI tasks on its own. What we’ll need is a kind of “collective intelligence”: an architecture capable of coordinating multiple models and components.
AGI is likely to emerge not simply as a human-designed agent, but as a meta-agent: a system that is partially developed by, and continues to evolve with the help of, AI itself. This is important because systems designed entirely by humans may carry inherent limitations. Involving AI in the design process could help overcome these constraints and make the system more adaptive.
AGI probably won’t come from any single breakthrough: not larger LLMs alone, nor smarter agents, nor entirely new architectures, but rather a synthesis of all three. Most likely, it will be something fundamentally new that transcends the categories we currently use.
“Humanity’s Last Exam” and other AGI benchmarks
“Humanity’s Last Exam” (HLE) is one of the more ambitious benchmarks currently being discussed in the context of LLMs, agents, and AGI. Essentially, it’s a test of around 2,500 questions spanning a wide range of academic disciplines – mathematics, physics, biology, chemistry, engineering, computer science, and even chess. The idea is to evaluate whether an AI system can solve problems at a level that reflects genuine human understanding.
Current language models perform very poorly on HLE, often scoring less than 5% accuracy. This stands in stark contrast to other benchmarks like MMLU or GPQA, where models achieve significantly higher scores. The difficulty models have with HLE highlights just how far they still are from truly general intelligence.
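For context, an accuracy figure like that comes from a straightforward evaluation harness: pose each question to the model and score the answers against references. Here is a minimal sketch; model_answer, the dataset format, and the exact-match grading are placeholder assumptions, and real HLE grading is more involved.

```python
# Minimal sketch of a benchmark harness: ask the model each question and
# report exact-match accuracy. model_answer() and the dataset format are
# placeholder assumptions; real HLE grading also handles multi-modal items
# and more careful answer matching.
def model_answer(question: str) -> str:
    """Stand-in for a call to the model under evaluation."""
    return "42"  # a real harness would query the model's API here

def evaluate(dataset: list[dict]) -> float:
    """Fraction of questions answered with an exact string match."""
    correct = sum(
        model_answer(item["question"]).strip() == item["answer"].strip()
        for item in dataset
    )
    return correct / len(dataset)

sample = [
    {"question": "What is 6 x 7?", "answer": "42"},
    {"question": "An HLE-style question the model gets wrong", "answer": "n/a"},
]
print(f"accuracy: {evaluate(sample):.0%}")  # HLE runs report figures like <5%
```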
It’s important to note that high performance on benchmarks with known or narrow datasets doesn’t necessarily indicate the presence of real general intelligence. A model can be fine-tuned or “trained to the test,” which may inflate its apparent abilities. So even a perfect score on HLE wouldn’t mean we’ve reached AGI; it would only mean we’ve passed one particular test.
What moves AGI forward
I fully agree that the core pillars of AGI are data, compute, and talent. The situation with compute is clear. Key players like Meta have tried to produce their own chips, investing billions in in-house chip development. But companies still rely heavily on the chips and computing power of players like Nvidia, which not only supplies the necessary hardware but also understands the importance of scaling up production.
The bigger questions concern data and talent. The internet has effectively run out: by now, there is hardly any human-created text from open sources that hasn’t already been used for training. The total volume of information humanity has produced so far turns out to be surprisingly small. That’s why companies are starting to actively partner with those who can generate high-quality human data.
Full automation or human-in-the-loop?
Another point is the declining demand for manual data annotation. A few years ago, the industry was scaling at full speed, with thousands of annotators onboarded to feed the hunger of AI pipelines. Today, much of that momentum has shifted toward automation. Models have matured, and so has the tooling around them.
Take facial recognition. It used to be one of the main drivers of image annotation volume, but the category is largely solved now. Models like YOLO, SAM, and SAMURAI are rapidly absorbing routine work, compressing weeks of manual effort into minutes, often with astonishing accuracy. We have also implemented many ML-assisted tools in our proprietary platform, Keylabs, and they significantly cut down the routine workflow.
But all these models are limited in how well they generalize; they are suited to automating standard, uniform operations. Complex or unique cases still require human attention.
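The practical compromise is ML-assisted pre-annotation with a human in the loop: let a pretrained model propose boxes and route only the low-confidence ones to an annotator. Here is a sketch of that pattern using the open-source Ultralytics YOLO API; the threshold and review queue are illustrative assumptions, not how Keylabs actually implements it.

```python
# Sketch of ML-assisted pre-annotation with a human-in-the-loop threshold.
# Uses the open-source Ultralytics YOLO API; the confidence cutoff and the
# review queue are illustrative assumptions.
from ultralytics import YOLO

REVIEW_THRESHOLD = 0.6  # below this, a human annotator checks the box

model = YOLO("yolov8n.pt")  # small pretrained detection model

def preannotate(image_path: str) -> tuple[list[dict], list[dict]]:
    """Return (auto_accepted, needs_review) bounding-box annotations."""
    auto, review = [], []
    for result in model(image_path):
        for box in result.boxes:
            ann = {
                "label": model.names[int(box.cls)],
                "bbox": box.xyxy[0].tolist(),   # [x1, y1, x2, y2]
                "confidence": float(box.conf),
            }
            (auto if ann["confidence"] >= REVIEW_THRESHOLD else review).append(ann)
    return auto, review

accepted, flagged = preannotate("street_scene.jpg")
print(f"{len(accepted)} boxes auto-accepted, {len(flagged)} sent to human review")
```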
We’re moving away from the old paradigm where an annotator was simply a detail-oriented person who could recognize an object or emotion. In the new reality, professionals are needed: doctors to annotate medical images, programmers to code, architects to create blueprints, marketers for customer insights, and military experts for defense scenarios.
We’re already seeing real-world cases, such as fighter pilots annotating data for AI and earning $1,000 per hour for their expertise. Such specialists are rare, and their knowledge is critical for training high-performance AI.
The world is changing: more and more people are becoming operators and “trainers” of artificial intelligence. Just the other day, I got a LinkedIn message asking me to check a dataset for an AI app designed for CEOs. In the future, any one of us might receive an offer to work as an annotator: not as someone clicking buttons, but as an expert whose knowledge shapes the intelligence of tomorrow.
We already live in this new reality, a world of data labeling and AI training. Those who recognize it and adapt will gain a significant advantage.