Thought Leaders

The Holy Grail of Computational Power in AI

Despite incredible progress, the capabilities of artificial intelligence are still limited when compared against real-world expectations. We build complex models, run neural networks, and test algorithms, yet progress sometimes stalls in the places we least expect.

The problem often lies not in the algorithms or the data, but in computational power, the resources that allow models to learn and operate at the necessary scale. So what lies behind this barrier? Let’s examine the critical resource without which even the most promising AI projects cannot move beyond the laboratory.

The compute deficit and its consequences

To understand this topic, let’s start with the history of mobile communications. When 3G and later 4G networks appeared, the internet was already nearly global. And when 5G was introduced, many people asked a perfectly reasonable question: “The internet will be faster – but so what?”

In reality, the increase in internet speed doesn’t come down to user convenience. It transforms the entire technological landscape. Use cases emerge that were previously impossible. 5G turned out to be far faster than 4G, and this leap was not gradual, like the jump from 1G to 2G, but exponential. As a result, new applications, devices, and entire classes of technology can appear.

Traffic light cameras, real-time traffic analysis systems, and automated traffic regulation mechanisms – all of this becomes possible thanks to new communication technologies. Police gain new ways to exchange data, and in space, telescopes and satellites can transmit vast amounts of information to Earth. A qualitative leap in a foundational technology drives the development of the entire ecosystem.

The same principle applies to computational power. Imagine humanity’s total computing capacity in hypothetical units. Today, we might have, say, ten such units. With them, we can generate images and videos, write texts, create marketing materials… This is already substantial, but the range of applications remains fairly limited.

Now imagine we had not ten, but a thousand such units. Suddenly, technologies that were previously too expensive become feasible, and startups that were abandoned due to high computational costs start to make economic sense.

Take robotaxis, for example. Today, they mostly rely on relatively weak local computers installed in the vehicle. However, if the video feed were transmitted to a cloud with enormous computational resources, the data could be processed and returned in real time. And this is critical: a car moving at 100 km/h must make decisions in fractions of a second – go straight, turn, brake, or not brake.
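A quick back-of-the-envelope calculation shows why latency is the crux here. The sketch below uses illustrative figures of my own (not from any vendor specification) to compute how far a car travels while waiting for a cloud round trip:

```python
# How far does a car travel while waiting for a cloud round trip?
# All latency figures are illustrative assumptions.

def distance_during_latency(speed_kmh: float, latency_ms: float) -> float:
    """Metres travelled during a given network/compute latency."""
    speed_m_per_s = speed_kmh * 1000 / 3600  # km/h -> m/s
    return speed_m_per_s * (latency_ms / 1000)

# At 100 km/h, even a 50 ms round trip means over a metre of "blind" travel,
# which is why latency budgets matter as much as raw compute.
for latency in (10, 50, 100):
    print(f"{latency:>3} ms -> {distance_during_latency(100, latency):.2f} m")
```

The point of the exercise: cloud-scale compute only helps the robotaxi case if the round trip stays within a tight latency budget.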

That’s when a fully functioning robotaxi industry becomes possible, not just isolated solutions like the ones we see today. Any local computer installed in a car is inherently limited in a way that a connected system isn’t. The faster we can scale it, the faster the world around us will change.

Access to chips and the “golden ticket” in AI

In the context of computational power, the question arises: is access to modern chips becoming the “golden ticket” to entering the AI market? Are large players who sign contracts with chip manufacturers, or produce them themselves, creating a gap between major enterprise companies and everyone else?

Such a gap arises only in one case: if a business model is focused exclusively on selling chips to large clients. In practice, manufacturers like NVIDIA aim to provide cloud solutions for everyone. Their optimized chips are available in the cloud to both OpenAI and independent developers.

Even strategic alliances among companies like Google, Anthropic, Microsoft, OpenAI, Amazon, and NVIDIA are primarily partnerships for shared resource utilization, rather than attempts to close off the market. This model enables the efficient allocation of computational power, thereby accelerating technological development.

If we trace the chain of computational resource usage, it begins with the end user. For example, when you use WhatsApp for video calls and messaging, the company must ensure the service works: storing and processing data, running models for video cleanup, adding effects, and improving image quality.

Maintaining proprietary servers is expensive: they become outdated and require constant upkeep. That’s why cloud solutions – “the cloud” – have emerged. The market is dominated by three players: Google Cloud, AWS, and Microsoft Azure. Other companies cannot compete at this level: the scale of infrastructure required is too vast.

Cloud services are massive data centers with cooling, power supply, and round-the-clock maintenance. They house servers and specialized chips from NVIDIA, AMD, and other manufacturers, enabling large-scale computational processes.

Here we come to the key question I discussed in my previous column about data centers, and want to continue here: what is the main bottleneck in this system? Is it the shortage of electricity, or the difficulty of cooling data centers in regions where the climate makes it especially challenging? In reality, the secret lies in the chips themselves…

The holy grail

Why is NVIDIA today valued at around $5 trillion and counted among the most successful publicly traded companies in the world? The reason is simple: NVIDIA produces the chips used to train AI models and run inference.

Each of these chips consumes enormous amounts of electricity when training large models or processing ever-growing volumes of data. But how efficiently is that energy used? This is where specialized chips come into play; they handle specific tasks far more efficiently than general-purpose GPUs.

AI models differ. OpenAI, for example, has one family of models, Anthropic another. The concepts may be similar, but the mathematical structures and computational processes are different. A single general-purpose chip, when training OpenAI models (like ChatGPT) versus Anthropic models (like Claude), acts as a “one-size-fits-all tool,” consuming, say, 100,000 hours of computation for one model and 150,000 for another. Efficiency varies significantly and is rarely optimal.

Companies solve this problem by producing specialized chips. For example, one chip can be optimized for the ChatGPT architecture and train it in, say, 20 minutes, while another is tailored to Anthropic’s architecture and also completes training in 20 minutes. Energy consumption and training time are reduced multiple times compared to a general-purpose chip.
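To make the trade-off concrete, here is a toy cost model in Python. Every figure below – power draw, chip-hours, electricity price, and the `training_cost` helper itself – is an illustrative assumption of mine, not data from any chip vendor:

```python
# Toy comparison of a general-purpose GPU vs a specialised chip.
# All numbers below are illustrative assumptions, not vendor figures.

def training_cost(hours: float, power_kw: float, price_per_kwh: float) -> float:
    """Electricity cost of a training run: energy (kWh) times price per kWh."""
    return hours * power_kw * price_per_kwh

# Assumed: the general-purpose chip needs 100,000 chip-hours at ~0.7 kW,
# while the specialised chip finishes the same job in far fewer hours
# at a lower power draw.
general = training_cost(hours=100_000, power_kw=0.7, price_per_kwh=0.10)
special = training_cost(hours=20_000, power_kw=0.5, price_per_kwh=0.10)
print(f"general-purpose: ${general:,.0f}")
print(f"specialised:     ${special:,.0f}")
print(f"saving factor:   {general / special:.0f}x")
```

Even with made-up numbers, the structure of the argument is visible: the saving compounds because a specialised chip cuts both the runtime and the power draw at once.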

When these chips are sold to large cloud providers, such as Google, Amazon, or Microsoft, they are offered as standalone products. Users can choose, for instance, a chip optimized for a YOLO model or a simpler, cheaper chip for a Xen architecture. This way, companies gain access to computational resources precisely tailored to their tasks, rather than purchasing general-purpose GPUs. If a user has ten different functions, they can use ten different specialized chips.

The trend is clear: specialized chips are gradually replacing general-purpose ones. Many startups now work with ASICs (Application-Specific Integrated Circuits), chips designed for specific computational tasks. One of the best-known examples is Bitcoin mining: initially, cryptocurrency was mined on NVIDIA GPUs; later, chips were created solely for mining Bitcoin, incapable of performing any other task.

I see this in practice: the same hardware configuration can produce completely different results depending on the task. In my startup Introspector, we study these processes in real projects, and as a strategic advisor to Keymakr, I observe how clients gain efficiency from specialized chips, allowing models to run faster. Projects that previously stalled during training or inference reach stable results with this approach.

However, narrow specialization carries risks. A chip optimized for Anthropic’s architecture won’t work for training OpenAI models, and vice versa. Each new architecture requires a new generation of hardware, creating a risk of large-scale “deprecation”. If Anthropic releases a new architecture tomorrow, all previous-generation chips become inefficient or useless. Producing new chips costs billions of dollars and can take years.

This creates a dilemma: should we make specialized chips that work perfectly in a narrow scenario, or continue producing general-purpose chips that solve all tasks moderately well but don’t require complete replacement when architectures change?

Efficiency in this context is measured by three primary parameters: runtime, electricity consumption, and heat generation. These metrics are directly related: the longer a system runs, the more energy it consumes and the more heat it produces. Reducing one parameter automatically improves the other two.
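That coupling follows directly from basic physics: energy is power times runtime, and almost all of the energy a chip draws is ultimately dissipated as heat. A minimal sketch, with made-up figures, of how the three metrics move together:

```python
# Runtime, energy, and heat are linked by simple physics.
# The power and runtime figures below are illustrative assumptions.

def run_metrics(runtime_h: float, power_kw: float) -> dict:
    """Energy drawn and heat dissipated by a run at constant power."""
    energy_kwh = power_kw * runtime_h
    heat_mj = energy_kwh * 3.6  # 1 kWh = 3.6 MJ; nearly all ends up as heat
    return {"runtime_h": runtime_h, "energy_kwh": energy_kwh, "heat_mj": heat_mj}

# Halving runtime at the same power halves energy and heat in lockstep:
print(run_metrics(runtime_h=10, power_kw=0.5))
print(run_metrics(runtime_h=5, power_kw=0.5))
```

This is why the column treats the three metrics as one lever: cut runtime (or power draw) and the other numbers follow.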

Here lies the “holy grail” of AI performance: if at least one of the fundamental efficiency metrics can be optimized, the other metrics almost automatically improve as well.

Sustainable process

With the growing use of specialized chips, the issue of overproduction risks has become pressing. Currently, the surplus of equipment is already significant, and companies are addressing this issue in various sustainable ways, including the reuse of existing resources.

Recycling equipment has become a key element of sustainable development in high-tech industries. Chips contain substantial amounts of precious and base metals – gold, copper, aluminum, palladium – as well as rare-earth materials used in microchips and transistors. Once equipment becomes obsolete, these valuable resources can be returned to production, reducing the cost of new components while simultaneously lowering the industry’s environmental footprint.

Some specialized factories and companies focus on recycling and extracting precious metals from outdated components. For example, some facilities use hydrometallurgical processes and advanced chemical methods to extract gold and copper with a high degree of purity, allowing these materials to be reused in new chips.

Additionally, companies are implementing closed-loop models, where old equipment is upgraded or integrated into new solutions, thereby reducing the need for primary resource extraction. Such approaches not only help minimize waste but also lower the carbon footprint of production, as traditional mining and metal processing require significant energy.

Sustainable management of the lifecycle of chips and equipment could become an industry standard, where technological progress aligns with environmental responsibility.

Michael Abramov is the founder & CEO of Introspector, bringing over 15 years of software engineering and computer vision AI systems experience to building enterprise-grade labelling tools.

Michael began his career as a software engineer and R&D manager, building scalable data systems and managing cross-functional engineering teams. Until 2025, he served as the CEO of Keymakr, a data labelling service company, where he pioneered human-in-the-loop workflows, advanced QA systems, and bespoke tooling to support large-scale computer vision and autonomy data needs.

He holds a B.Sc. in Computer Science and a background in engineering and creative arts, bringing a multidisciplinary lens to solving hard problems. Michael lives at the intersection of technology innovation, strategic product leadership, and real-world impact, driving forward the next frontier of autonomous systems and intelligent automation.