Ugur Tigli, Chief Technical Officer at MinIO – Interview Series

Ugur Tigli is the Chief Technical Officer at MinIO, the leader in high-performance object storage for AI. As CTO, Ugur helps clients architect and deploy API-driven, cloud-native and scalable enterprise-grade data infrastructure using MinIO.

Can you describe your journey to becoming the CTO of MinIO, and how your experiences have shaped your approach to AI and data infrastructure?

I started my career in infrastructure engineering at Merrill Lynch as a backup and restore administrator, and I continued to take on new challenges across various technical positions. I joined Bank of America through its acquisition of Merrill Lynch, where I was Vice President of Storage Engineering, and my role eventually expanded to include computing and data center engineering.

As part of my job, I also worked with various venture capital firms (VCs) and their portfolio companies to bring in the latest and greatest technology. During one of my meetings with General Catalyst, I was introduced to the idea and the people behind MinIO. It appealed to me because of how they approached data infrastructure, which differed from everyone else on the market. They recognized the importance of the object store and of the standard APIs that applications were beginning to build on. Even in those years, they could see the future of computing and AI before anyone else, before it was even called what it is today. I wanted to be part of executing that vision and building something truly unique. MinIO is now the most broadly deployed object store on the planet.

The way my previous roles and experience shape how I approach new technologies, specifically AI and data infrastructure, is simply an accumulation of the many projects I was involved in over my years of supporting application teams at a highly demanding financial services firm.

From the days of limited network bandwidth, which made Hadoop the newest technology about 15 years ago, to the shift in storage media from hard disk drives (HDDs) to solid state drives (SSDs), these technology changes shaped my current view of the AI ecosystem and data infrastructure.

MinIO is recognized for its high-performance object storage capabilities. How does MinIO specifically cater to the needs of AI-driven enterprises today?

When AB and Garima were conceptualizing MinIO, their first priority was the problem statement — they knew data would continue to grow and that existing storage technologies could not keep up with that growth. The rapid emergence of AI has made their prescient view of the market a reality. Since then, object storage has become foundational for AI infrastructure (the major LLM providers, such as OpenAI and Anthropic, all build on object stores), and the modern data center is built on an object store foundation.

MinIO recently launched a new object storage platform with critical enterprise-grade features to support organizations in their AI initiatives: the MinIO Enterprise Object Store. It is designed for the performance and scale challenges introduced by massive AI workloads, enabling customers to more easily handle billions of objects as well as hundreds of thousands of cryptographic operations per node per second. It includes six new commercial features that target key operational and technical challenges faced by AI workloads: Catalog (solves object storage namespace and metadata search), Firewall (purpose-built for data), Key Management System (handles billions of cryptographic keys), Cache (operates as a caching service), Observability (allows administrators to view all system components across every instance), and the Enterprise Console (a single pane of glass for all of an organization’s MinIO instances).

Handling AI at scale is becoming increasingly crucial. Could you elaborate on why this is the case and how MinIO facilitates these requirements for modern enterprises?

Almost everything organizations build now sits on object storage, and that will only accelerate as those running infrastructure on appliances hit a wall in the age of modern data lakes and AI. Organizations are looking at new infrastructure to manage all of the data coming into their systems and to build data-centric applications on top of it – this requires extraordinary scale and flexibility that only object storage can support. That is where MinIO comes in, and why the company has always stood miles ahead of the competition: it is designed for what AI needs – storing massive volumes of structured and unstructured data and providing performance at scale.

Similar to machine learning (ML) needs in previous generations of AI, data and modern data lakes have been critical to the success of any “predictive” AI. However, with the advancement of “generative” AI, this landscape has expanded to include many other components, such as AI Ops data and document pipelines, foundational models, and vector databases.

All of these additional components use object storage, and most of them directly integrate with MinIO. For example, Milvus, a vector database, uses MinIO, and many modern query engines integrate with MinIO through S3 APIs.
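As a minimal sketch of the kind of S3-style integration described above, the snippet below uses the open-source `minio` Python SDK to write and list objects on an S3-compatible endpoint. The endpoint, credentials, and bucket name are placeholders for illustration, not anything specific to the deployments mentioned in the interview.

```python
# Minimal sketch: talking to a MinIO (or any S3-compatible) endpoint
# with the open-source `minio` Python SDK. The endpoint, credentials,
# and bucket name below are placeholders for illustration only.
from io import BytesIO
from minio import Minio

client = Minio(
    "minio.example.internal:9000",   # placeholder endpoint
    access_key="YOUR_ACCESS_KEY",    # placeholder credentials
    secret_key="YOUR_SECRET_KEY",
    secure=False,                    # set True once TLS is configured
)

bucket = "training-data"             # hypothetical bucket name
if not client.bucket_exists(bucket):
    client.make_bucket(bucket)

# Upload a small object through the S3-compatible API.
payload = b"example record"
client.put_object(bucket, "samples/record-0001.txt", BytesIO(payload), len(payload))

# List what is stored under a prefix, much as a query engine or a
# vector database integration would before reading the data.
for obj in client.list_objects(bucket, prefix="samples/", recursive=True):
    print(obj.object_name, obj.size)
```

The same bucket and prefix layout can then be read by any S3-compatible consumer, which is the point of the standard-API integration described above.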

AI technical debt is a growing concern for many organizations. What strategies does MinIO employ to help clients avoid this issue, especially in terms of utilizing GPUs more efficiently?

A chain is only as strong as its weakest link – and your AI/ML infrastructure is only as fast as its slowest component. If you train machine learning models with GPUs, your weak link may be your storage solution. The result is what I call the “Starving GPU Problem”: your network or storage solution cannot serve training data to your training logic fast enough to fully utilize your GPUs, leaving valuable compute power on the table. The first thing organizations can do to fully leverage their GPUs is to understand the signs of a poor data architecture and how it directly results in the underuse of AI technology. To avoid technical debt, companies must change how they view (and store) data.
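To make the “Starving GPU” idea concrete, here is an illustrative sketch (not MinIO’s reference code) of one common mitigation: overlapping object-store reads with training by using multiple PyTorch DataLoader workers and prefetching. The endpoint, credentials, bucket, and object keys are hypothetical, and the decode step is a placeholder that assumes fixed-size objects.

```python
# Illustrative sketch: keep GPUs fed by overlapping object-store I/O with
# compute, using PyTorch DataLoader workers. Endpoint, bucket, keys, and
# the decode logic are hypothetical placeholders.
import torch
from minio import Minio
from torch.utils.data import Dataset, DataLoader

class ObjectStoreDataset(Dataset):
    """Reads raw training samples directly from an S3-compatible store."""

    def __init__(self, endpoint, bucket, keys):
        self.endpoint, self.bucket, self.keys = endpoint, bucket, keys
        self.client = None  # created lazily so each worker gets its own client

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, idx):
        if self.client is None:
            self.client = Minio(self.endpoint, access_key="YOUR_ACCESS_KEY",
                                secret_key="YOUR_SECRET_KEY", secure=False)
        resp = self.client.get_object(self.bucket, self.keys[idx])
        try:
            data = resp.read()
        finally:
            resp.close()
            resp.release_conn()
        # Placeholder decode: treat the bytes as a tensor. Assumes all
        # objects are the same size so default batch collation works.
        return torch.frombuffer(bytearray(data), dtype=torch.uint8).float()

# Multiple workers plus prefetching overlap network/storage reads with
# GPU compute, which is the practical antidote to starving the GPU.
keys = [f"samples/part-{i:05d}.bin" for i in range(10_000)]  # hypothetical keys
loader = DataLoader(
    ObjectStoreDataset("minio.example.internal:9000", "training-data", keys),
    batch_size=32, num_workers=8, prefetch_factor=4, pin_memory=True,
)
```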

Organizations can set up a storage solution that is in the same data center as their computing infrastructure. Ideally, this would be in the same cluster as your compute. Because MinIO is a software-defined storage solution, it’s capable of the performance needed to feed hungry GPUs – a recent benchmark achieved 325 GiB/s on GETs and 165 GiB/s on PUTs with just 32 nodes of off-the-shelf NVMe SSDs.
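For a sense of how read throughput can be checked in practice, the following rough, single-client sketch times a series of GETs against an S3-compatible endpoint. It is nothing like the multi-node benchmark cited above; the endpoint, bucket, and object keys are placeholders.

```python
# Rough single-client sketch of measuring GET throughput against an
# S3-compatible endpoint. Endpoint, bucket, and keys are placeholders.
import time
from minio import Minio

client = Minio("minio.example.internal:9000",
               access_key="YOUR_ACCESS_KEY",
               secret_key="YOUR_SECRET_KEY", secure=False)

bucket = "benchmark"                                   # hypothetical bucket
keys = [f"objects/obj-{i:04d}" for i in range(100)]    # hypothetical objects

total_bytes = 0
start = time.perf_counter()
for key in keys:
    resp = client.get_object(bucket, key)
    try:
        total_bytes += len(resp.read())
    finally:
        resp.close()
        resp.release_conn()
elapsed = time.perf_counter() - start

print(f"read {total_bytes / 2**30:.2f} GiB in {elapsed:.1f}s "
      f"({total_bytes / 2**30 / elapsed:.2f} GiB/s from one client)")
```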

You have a rich background in creating high-performance data infrastructures for global financial institutions. How do these experiences inform your work at MinIO, especially in architecting solutions for diverse industry needs?

I helped build the first private cloud for Bank of America, an initiative that saved billions of dollars by providing, internally and at a lower cost, the features and functionality available in public clouds. That major initiative, along with the many other diverse application requirements I worked on at BofA Merrill Lynch, has shaped my work at MinIO as it relates to architecting solutions for our customers today.

For example, I learned the hard way when I worked with a team that built Hadoop clusters that used only the data storage components of each server while leaving the CPUs underutilized or nearly idle. Simple lessons like this led me toward the disaggregated compute and storage designs of today’s modern data infrastructure when helping our customers and partners; these are technically better and lower-cost solutions that pair today’s high-bandwidth network technologies and high-performance object stores like MinIO with any query or processing engine.

 The hybrid cloud presents unique challenges and complexities. Could you discuss these in detail and explain how MinIO's hybrid “burst” to the cloud model helps control cloud costs effectively?

Going multicloud should not lead to ballooning IT budgets and an inability to hit milestones — it should help manage costs and accelerate an organization’s roadmap. Something to consider is cloud repatriation: the reality is that shifting operations from the cloud to on-premises infrastructure can lead to substantial cost savings, depending on the case, and you should always look at the cloud as an operating model, not a destination. For example, organizations spin up GPU instances but then spend time preprocessing data to fit it into the GPU. This wastes precious time and money – organizations need to optimize by choosing cloud-native and, more importantly, cloud-portable technologies that can unlock the power of multicloud without significant costs. Applying cloud-first operating model principles and adhering to that framework provides the agility to adapt to changing operational requirements.

Kubernetes-native solutions are pivotal for modern infrastructure. How does MinIO's integration with Kubernetes enhance its scalability and flexibility for AI data infrastructure?

MinIO is Kubernetes-native by design and S3 compatible from inception. Developers can quickly deploy persistent object storage for all of their cloud-native applications. The combination of MinIO and Kubernetes provides a powerful platform that allows applications to scale across any multi-cloud and hybrid cloud infrastructure and still be centrally managed and secured, avoiding public cloud lock-in.

With Kubernetes as its engine, MinIO is able to run anywhere Kubernetes does – which, in the modern, cloud-native/AI world, is essentially everywhere.

Looking ahead, what are the future developments or enhancements users can expect from MinIO in the context of AI data infrastructure?

Our recent partnerships and product launches are a signal to the market that we’re not slowing down anytime soon, and we’ll continue pushing where it makes sense for our customers. For example, we recently partnered with Carahsoft to make MinIO’s software-defined object storage portfolio available to the Government, Defense, Intelligence and Education sectors. This enables Public Sector organizations to build data infrastructure at any scale, ranging from expansive modern data lakes to mission-specific data storage solutions at the autonomous edge. Together, we are bringing these cutting-edge, unique solutions to Public Sector customers, empowering them to address data infrastructure challenges easily and efficiently. This partnership comes at a time when there’s an increased push toward making the public sector AI-ready, with recent OMB requirements stating that all federal agencies need a Chief AI Officer (among other things). Overall, the partnership helps strengthen the industry’s AI posture and gives the public sector the valuable tools necessary to succeed.

Additionally, MinIO is very well positioned for the future. AI data infrastructure is still in its infancy, and many areas of it will become clearer in the next couple of years. For example, most enterprises will want to use their proprietary data and documents with foundational models and Retrieval Augmented Generation (RAG). Further integration with this deployment pattern will be easy for MinIO because all of these architectural choices and deployment patterns have one thing in common: all that data is already stored on MinIO.
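As a sketch of the RAG-style ingestion pattern described above, the snippet below reads proprietary documents already sitting in an object store, chunks them, and hands the pieces to an embedding model and a vector index. The `embed_text` callable and `vector_index.upsert` call are hypothetical placeholders for whichever embedding model and vector store an organization chooses; the endpoint and credentials are placeholders as well.

```python
# Sketch of RAG-style ingestion: documents in an object store are read,
# chunked, embedded, and indexed. `embed_text` and `vector_index.upsert`
# are hypothetical stand-ins for a real model and vector store.
from minio import Minio

client = Minio("minio.example.internal:9000",
               access_key="YOUR_ACCESS_KEY",
               secret_key="YOUR_SECRET_KEY", secure=False)

def chunk(text: str, size: int = 1000, overlap: int = 100):
    """Yield overlapping character windows; a simple stand-in for a real splitter."""
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield text[start:start + size]

def ingest(bucket: str, prefix: str, embed_text, vector_index):
    for obj in client.list_objects(bucket, prefix=prefix, recursive=True):
        resp = client.get_object(bucket, obj.object_name)
        try:
            text = resp.read().decode("utf-8", errors="ignore")
        finally:
            resp.close()
            resp.release_conn()
        for i, piece in enumerate(chunk(text)):
            vector_index.upsert(                  # hypothetical vector-store call
                id=f"{obj.object_name}#{i}",
                vector=embed_text(piece),         # hypothetical embedding call
                metadata={"source": obj.object_name},
            )
```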

Finally, for technology leaders looking to build or enhance their data infrastructure for AI, what advice would you offer based on your experience and insights at MinIO?

To make any AI initiative successful, there are three key elements you must get right: the right data, the right infrastructure, and the right applications. It really starts with understanding what you need – don’t go out and buy expensive GPUs just because you’re afraid you’ll miss the AI boat. I strongly believe that enterprise AI strategies will fail in 2024 if organizations focus only on the models themselves and not on data. Thinking model-down instead of data-up is a critical mistake – you have to start with the data. Build a proper data infrastructure first. Then think about your models. As organizations move towards an AI-first architecture, it is imperative that your data infrastructure enables your data – not constrains it.

Thank you for the great interview; readers who wish to learn more should visit MinIO.

A founding partner of unite.AI & a member of the Forbes Technology Council, Antoine is a futurist who is passionate about the future of AI & robotics.

He is also the Founder of Securities.io, a website that focuses on investing in disruptive technology.