Choosing Storage to Support AI/ML Initiatives
By Candida Valois, Field CTO, Americas, Scality
Adoption of ML and AI continues to increase quickly, which isn’t surprising given the business insights and industry transformation that their many use cases promise. PwC predicts that by 2030, AI could contribute almost $16 trillion to the global economy, including a boost of up to 26% in GDP for local economies.
These technologies require vast amounts of unstructured data to operate, and that data often comes in the form of videos, images, text and voice. Workloads of these types require a new approach to data storage; the old ways won’t suffice. Applications now need faster access to massive amounts of data – data that is created everywhere: in the cloud, at the edge and on-premises. These intensive workloads require low latency, support for different types and sizes of payloads, and the ability to scale linearly.
What’s needed is a fresh approach to data delivery, one that is application-centric rather than location- or technology-centric. With the large-scale adoption of AI/ML and analytics, enterprise IT leaders need a significant shift in the way they think about data management and storage.
Handling all file sizes
When it comes to AI/ML workloads, organizations need a storage solution that can handle both small and large files. In some cases, you may need to deal with just a few tens of terabytes, while in others, there are many petabytes. Not all solutions are built for huge files, just as not all can handle very small ones. The trick is finding one that handles both flexibly.
Scalability is essential
To ensure accuracy and speed, organizations require massive data sets, because that’s what AI/ML algorithms need to properly train their underlying models. Organizations want to grow in both capacity and performance but are often hampered by traditional storage solutions that cannot scale linearly. AI/ML workloads require a storage solution that can keep scaling as the data grows.
Standard file and block storage solutions typically max out at a few hundred terabytes; beyond that, they can’t scale. Object storage can scale elastically and seamlessly based on demand. What sets it apart from traditional storage is its completely flat address space: with no directory hierarchy to navigate, users don’t run into the same limits.
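To make the idea of a flat space concrete, here is a minimal Python sketch. It is an illustration only, not any vendor’s implementation: the `FlatObjectStore` class and its method names are hypothetical, and the store is just an in-memory dictionary. Each object is addressed by a single key, and anything that looks like a folder is simply a shared key prefix.

```python
class FlatObjectStore:
    """Hypothetical in-memory model of a flat object namespace."""

    def __init__(self):
        # One dictionary holds every object; there is no directory tree.
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        # A lookup is a single key access, however many "/" the key contains.
        return self._objects[key]

    def list_prefix(self, prefix: str) -> list[str]:
        # "Folders" are emulated by filtering keys on a common prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("training/images/cat-001.jpg", b"\xff\xd8...")
store.put("training/images/dog-001.jpg", b"\xff\xd8...")
store.put("training/audio/clip-001.wav", b"RIFF...")

print(store.list_prefix("training/images/"))
```

Because the slashes are only part of the key string, no directory has to be traversed or kept consistent as the number of objects grows.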
Meeting performance requirements
Capacity scaling is important, but it isn’t enough. Organizations also need performance to scale linearly. Unfortunately, with many traditional storage solutions, scaling capacity comes at the expense of performance: as capacity grows, performance tends to plateau or decline.
The standard storage paradigm organizes files into a hierarchy of directories and subdirectories. This architecture works quite well when data capacity is small, but as capacity grows, performance eventually suffers because of system bottlenecks and the limits of file lookup tables. Object storage, by contrast, provides a flat namespace, so you can scale to petabytes and beyond simply by adding nodes. Because each added node contributes throughput as well as capacity, performance scales along with capacity.
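The scale-out behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how any particular product places data: the function names `node_for_key` and `objects_per_node` are hypothetical, and placement is a plain hash of the key modulo the node count. It shows that in a flat namespace, the share of objects each node must serve shrinks as nodes are added.

```python
import hashlib
from collections import Counter


def node_for_key(key: str, node_count: int) -> int:
    """Map an object key to a node index using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def objects_per_node(keys: list[str], node_count: int) -> Counter:
    """Count how many keys land on each node."""
    return Counter(node_for_key(k, node_count) for k in keys)


# Hypothetical data set of 12,000 object keys.
keys = [f"dataset/sample-{i:06d}.bin" for i in range(12_000)]

for nodes in (3, 6, 12):
    load = objects_per_node(keys, nodes)
    print(f"{nodes} nodes: busiest node holds {max(load.values())} objects")
```

Simple modulo placement like this would move most keys whenever the node count changes, so production systems generally rely on more elaborate schemes such as consistent hashing. The sketch is only meant to show why spreading a flat key space across more nodes spreads the load as well.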
Storage that can support AI/ML projects
Organizations must adopt a new way of looking at storage as AI and ML rise in popularity. This new approach must empower them to establish, run and scale their AI/ML initiatives properly. Some of the enterprise-grade object storage software available today is built specifically to support AI/ML training. Enterprises can begin their initiatives on a small scale, starting with one server, then scale out as needed for both capacity and performance. These projects also need strong performance for their analytics applications, and fast object storage delivers it. In addition, object storage provides complete data lifecycle management across multiple clouds and enables flexibility from the edge to the core.
Enterprises need to process data efficiently, and object storage supports this by letting applications easily access data on-premises and across multiple clouds. Its low latency, scalability and flexibility make object storage a strong ally for AI/ML initiatives.