
Andreas Hellander, CEO and Co-Founder of Scaleout Systems – Interview Series


Andreas Hellander is the CEO and co-founder of Scaleout Systems, a company building infrastructure for edge AI and federated learning that trains models on distributed, sensitive data without centralizing it. His company has worked with NATO and defense primes such as BAE Systems. He holds a PhD in Scientific Computing and an MSc in Biotechnology Engineering, and is an Associate Professor at Uppsala University, where he built one of the top research groups at the Department of Information Technology before founding Scaleout.

You co-founded Scaleout after years of research in distributed computing, cloud infrastructure, and scientific computing at Uppsala University. What was the moment when you realized federated learning and edge AI needed to move beyond academia and become a commercial platform?

Years of research on large-scale distributed systems made one thing increasingly clear: as ML started to show real promise across industries, applying it responsibly required solving the data problem first. For many organizations, the most valuable data simply cannot be centralized, whether that’s for regulatory, practical, or security reasons. Federated learning emerged as a research response to that constraint. Our federated learning software started as a research prototype at Uppsala University and at some point it became clear the timing was right to take it further. The infrastructure to make ML safe and secure for sensitive data didn’t exist in any production-ready form, and we felt we were well positioned to build it. Scaleout was founded to do that.

Scaleout’s partnership with AI Verse combines synthetic battlefield data generation with federated learning at the tactical edge. How do you see this changing the way military AI systems are developed compared to traditional approaches that rely on centralized datasets?

Traditionally, the approach has been to collect operational footage, ship it centrally, train, redeploy. Every stage introduces a bottleneck. We partnered with AI Verse, a NATO-backed synthetic data firm, to remove much of the first obstacle. AI Verse’s GAIA platform generates photorealistic, fully annotated Red, Green, Blue (RGB) and Infrared (IR) imagery on demand, removing the need for field collection, manual labeling, and long wait times. Scaleout removes the training bottleneck at the other end. Once deployed, models improve continuously from live edge data without centralizing anything. The combined effect is that organizations can go from no model to a fielded, improving model without ever touching restricted operational data.

One of the biggest challenges in defense AI is that battlefield conditions evolve faster than training cycles. How can edge-based learning help military systems adapt to new drones, vehicles, and threats without waiting for centralized retraining?

Computer vision for counter-UAS and Intelligence, Surveillance, and Reconnaissance (ISR) today is commonly trained centrally, deployed once, and then goes stale. Centralized retraining is a batch process that involves collecting data, labeling it, retraining the model, testing it, and then redeploying it. That cycle takes weeks or months. Meanwhile detection models degrade as seasons, sensors, environments and adversary tactics change.

Edge-based learning closes the loop at the source. Each site runs real-time detection on live sensor feeds while simultaneously filtering out the most useful frames for training via active learning. A ground node at a test range that encounters a new drone type flags the uncertain detections, annotators review a curated queue rather than terabytes of raw video, local fine-tuning runs on-site, and the improved model is back in production within days. Federated learning then propagates that improvement across all sites.
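The frame-filtering step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Scaleout's implementation: the interview does not say how uncertainty is scored, so the sketch assumes a simple rule where a detection is most uncertain when its confidence is near 0.5. The names `Detection`, `uncertainty`, and `select_for_review` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    frame_id: str
    confidence: float  # detector score in [0, 1]


def uncertainty(confidence: float) -> float:
    """Assumed scoring rule: 1.0 at a score of 0.5, falling to 0.0 at 0 or 1."""
    return 1.0 - abs(confidence - 0.5) * 2.0


def select_for_review(detections: list[Detection], budget: int) -> list[str]:
    """Return up to `budget` frame ids, most uncertain first.

    A frame is ranked by its single most uncertain detection, so one
    ambiguous object is enough to put the frame in the annotation queue.
    """
    per_frame: dict[str, float] = {}
    for det in detections:
        score = uncertainty(det.confidence)
        per_frame[det.frame_id] = max(per_frame.get(det.frame_id, 0.0), score)
    ranked = sorted(per_frame.items(), key=lambda item: item[1], reverse=True)
    return [frame_id for frame_id, _ in ranked[:budget]]


if __name__ == "__main__":
    feed = [
        Detection("a", 0.98),  # confident detection, low training value
        Detection("b", 0.52),  # ambiguous, e.g. an unfamiliar drone type
        Detection("b", 0.90),
        Detection("c", 0.10),
        Detection("d", 0.40),
    ]
    print(select_for_review(feed, budget=2))
```

The `budget` parameter stands in for the curated queue: annotators see a fixed number of high-value frames rather than the full video stream.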

Many AI systems still depend heavily on cloud infrastructure. Why do you believe the future of defense AI will increasingly move toward distributed and disconnected environments rather than centralized cloud architectures?

There are three structural reasons for this. First, classification boundaries mean operational sensor data often can’t leave the site. Second, contested environments mean the network link cannot be assumed. And finally, a model that requires cloud connectivity to function is a single point of failure in exactly the environment it’s most needed. These aren’t preferences; they’re constraints that disqualify cloud-first architectures before they start.

Synthetic data is becoming an important tool for AI development. In defense applications, where accuracy can have life-or-death consequences, what are the strengths and limitations of using synthetic battlefield data to train computer vision models?

Synthetic data offers several important advantages. It can generate scenarios that are impossible or impractical to collect in the field, such as rare threat classes, IR, adverse conditions and exact sensor geometries. It also eliminates manual labeling and operational data risk, and it can be scaled immediately.

At the same time there are limitations. The synthetic-to-real gap is real and varies by scenario. Fine detail that determines threat classification may not transfer unless the simulation fidelity is high. The honest position is that synthetic data is a strong cold-start mechanism and fills gaps field collection structurally cannot, but it doesn’t replace live operational data for final model performance. That’s why the pipeline combines both.

Scaleout has spent years developing federated learning technologies that allow organizations to train AI without moving sensitive data. What lessons from healthcare, industrial AI, and other regulated sectors are now proving valuable in defense deployments?

As ML started to be applied across regulated industries, a common pattern emerged. The data that would most improve models was also the data that couldn’t be moved.

Addressing that required more than high-quality algorithms; it required infrastructure that made distributed learning safe, auditable, and governable. What that work surfaced was the importance of rigorous audit trails, the challenge of data quality and selection when you can’t see the full dataset centrally, and the question of whether model updates themselves might leak information about training data. That last concern led to LeakPro, our open-source privacy auditing framework. The underlying problems of sensitive data, distributed environments, and governance requirements translate directly to defence, even if the specific constraints differ.

NATO and allied nations are increasingly focused on technological sovereignty. Do you see federated learning becoming a strategic capability that allows allied nations to collaborate on AI development without sharing sensitive operational data?

Yes, and this is already happening. The FEDAIR programme under NATO DIANA is a direct test of whether allied nations can jointly improve shared AI capability without exchanging classified sensor data. The architecture’s answer is yes. Each nation trains on its own data, contributes weight updates to a shared aggregation point, and receives an improved global model. No raw data crosses national boundaries.

Sovereignty here means more than data protection. It means staying in full control of the AI lifecycle with the ability to deploy any model to any sensor stream and use the data you own to continuously improve your models. That requires resisting lock-in, including air-gapped capable infrastructure, full model provenance, and vendor-neutral sensor integration. Those properties are structural, not contractual, and that distinction matters in procurement.
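The aggregation step in this answer (each participant trains locally, contributes weights, and receives an improved global model) can be illustrated with a minimal sketch. The interview does not name the aggregation rule, so this assumes plain federated averaging (FedAvg), where each participant's weights are averaged in proportion to its number of training samples. The function name and the flat-list weight representation are hypothetical simplifications.

```python
def federated_average(
    updates: list[tuple[list[float], int]],
) -> list[float]:
    """Sample-weighted average of model weights (FedAvg, assumed here).

    `updates` holds one (weights, num_samples) pair per participant.
    Only weights and sample counts are passed in; raw training data
    never reaches the aggregation point.
    """
    if not updates:
        raise ValueError("no updates to aggregate")
    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(updates[0][0])
    global_weights = [0.0] * dim
    for weights, num_samples in updates:
        if len(weights) != dim:
            raise ValueError("all participants must share one model shape")
        for i, w in enumerate(weights):
            global_weights[i] += w * num_samples / total_samples
    return global_weights


if __name__ == "__main__":
    # Two hypothetical participants with different amounts of local data.
    site_a = ([1.0, 2.0], 100)
    site_b = ([3.0, 4.0], 300)
    print(federated_average([site_a, site_b]))
```

In a real deployment the weights would be tensors for a full network and the exchange would be authenticated and audited; the sketch only shows why no raw data needs to cross a boundary for the global model to improve.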

Counter-drone systems are emerging as one of the most important AI applications in modern warfare. What technical hurdles still need to be overcome before AI-powered counter-drone platforms can operate reliably across diverse and rapidly changing combat environments?

From our viewpoint, the technical challenges in counter-drone AI look significant and are likely underappreciated. The sensor landscape alone is complex. Different sensor modalities produce different data characteristics, and a model trained on one may not transfer cleanly to another. Threat diversity compounds this. The proliferation of low-cost commercial platforms means the population of objects a system needs to detect and classify is expanding faster than most training datasets can follow.

Beyond the pure ML challenges, there’s an architectural question that seems important: who owns and controls the model, and who can update it? Systems that depend on a vendor’s closed model and update cycle inherit that vendor’s constraints. The ability to adapt models to your own operational data, on your own infrastructure, on your own timeline, seems like a precondition for reliable long-term performance rather than an optional feature.

As warfare increasingly involves autonomous systems and AI-assisted decision-making, how should military organizations balance continuous learning in the field with the need for reliability, predictability, and human oversight?

The platform is designed as a sidecar to existing command and control, not a replacement. Detections feed into Tactical Assault Kit (TAK) and standard Command and Control (C2) systems using standardised data formats and protocols, enabling them to be visualised and acted on within existing operational workflows. The AI enhances human decision-making rather than substituting for it.

On the model side, continuous doesn’t mean uncontrolled. Every update is validated against a benchmark suite before promotion, operators approve model versions before fleet deployment and shadow deployment lets a new version run in parallel before replacing production. The audit trail records which version ran where and what it showed. The improvement cycle is systematic and governed, not automatic and unchecked.
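The governed update cycle described above can be sketched as a small state machine. This is a hypothetical illustration, not the platform's actual workflow: the ordering of gates (benchmark check, then shadow deployment, then operator approval before production) and all names such as `ModelVersion`, `Stage`, and `advance` are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    CANDIDATE = "candidate"
    SHADOW = "shadow"          # runs in parallel with production
    PRODUCTION = "production"
    REJECTED = "rejected"


@dataclass
class ModelVersion:
    version: str
    benchmark_scores: dict[str, float]
    operator_approved: bool = False
    stage: Stage = Stage.CANDIDATE


def passes_benchmarks(model: ModelVersion, baseline: dict[str, float]) -> bool:
    """A candidate must match or beat the baseline on every benchmark."""
    return all(
        model.benchmark_scores.get(name, float("-inf")) >= score
        for name, score in baseline.items()
    )


def advance(model: ModelVersion, baseline: dict[str, float],
            audit_log: list[tuple[str, str]]) -> Stage:
    """Move a model one step forward if its gate is satisfied, and log it."""
    if not passes_benchmarks(model, baseline):
        model.stage = Stage.REJECTED
    elif model.stage is Stage.CANDIDATE:
        model.stage = Stage.SHADOW
    elif model.stage is Stage.SHADOW and model.operator_approved:
        model.stage = Stage.PRODUCTION
    # Otherwise the model stays where it is, waiting for approval.
    audit_log.append((model.version, model.stage.value))
    return model.stage


if __name__ == "__main__":
    baseline = {"recall_small_uas": 0.90, "precision": 0.85}
    log: list[tuple[str, str]] = []
    v2 = ModelVersion("v2", {"recall_small_uas": 0.93, "precision": 0.86})
    advance(v2, baseline, log)   # candidate -> shadow
    advance(v2, baseline, log)   # stays in shadow: no operator approval yet
    v2.operator_approved = True
    advance(v2, baseline, log)   # shadow -> production
    print(log)
```

The audit log here is just an in-memory list; the point is that every transition, including a model waiting in shadow, leaves a record of which version was in which state.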

Looking ahead five years, what will distinguish the most advanced military AI architectures from those being deployed today, and which technologies do you believe will have the greatest impact on future defense capabilities?

The current divide is between systems that can adapt in the field and those that can’t. Most deployed systems today are static. In five years the differentiator will be how well adaptation is governed: which version of which model ran on which platform, shaped by which data, under what conditions, and whether that chain can be audited, validated, and trusted by operators and procurement.

The underlying ML will commoditise. The infrastructure for governed, continuous learning at the edge, especially across allied networks where each nation must retain sovereignty over its own data, will not. The nations and programmes that build that infrastructure now will have a compounding advantage: models that keep improving from operational data, under national control, without depending on any vendor’s update cycle or any cloud provider’s availability.

Thank you for the great interview. Readers who wish to learn more should visit Scaleout Systems.

Antoine is a visionary leader and founding partner of Unite.AI, driven by an unwavering passion for shaping and promoting the future of AI and robotics. A serial entrepreneur, he believes that AI will be as disruptive to society as electricity, and is often caught raving about the potential of disruptive technologies and AGI.

As a futurist, he is dedicated to exploring how these innovations will shape our world. In addition, he is the founder of Securities.io, a platform focused on investing in cutting-edge technologies that are redefining the future and reshaping entire sectors.