By Victor Thu, vice president of customer success and operations, Datatron.
A Gartner survey in late 2020 found that 75% of respondents planned to continue or start new AI initiatives in the coming year. At the same time, Gartner analysts found that one of the most significant struggles in moving AI initiatives into production is the inability of those organizations to connect the investments back to business value.
What’s more, it’s widely estimated that the majority of AI/ML projects will fail, which can make it even harder to get buy-in from the top on these investments. This is where MLOps (Machine Learning Operations) can play a key role.
The current ML landscape
Machine learning offers profound possibilities for organizations, but the reality is that getting to those possibilities can be expensive and time-consuming. So, while interest in implementing ML is high, actual production implementation remains low. The main hurdle of bringing solutions into production isn’t the quality of the models, but rather the lack of infrastructure in place to allow companies to do so.
The development lifecycle for machine learning is fundamentally different from the lifecycle of traditional software development. Over the last 20 years, people have, for the most part, figured out what it takes for traditional software to go from development to production. They understand the compute, middleware, networking, storage and other elements needed to ensure the app is running well.
Unfortunately, most organizations try to apply the same software development lifecycle (SDLC) to the machine learning development lifecycle (MLLC), even though ML is a significant paradigm shift. Infrastructure allocations are unique. The languages and frameworks are different.
Machine learning models can be created relatively quickly, in a matter of weeks, but getting them into production can take anywhere from six to nine months because of siloed processes, disconnects between teams, and the manual translation and scripting of ML models into existing applications.
It’s also difficult to monitor and govern machine learning models once they’ve made it into production. There’s no guarantee ML models created in the lab will run the way they’re intended in production. And there are several different factors that could be behind that.
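One of the factors commonly behind that gap is a shift between the data a model was trained on and the data it sees in production. As a minimal sketch of what monitoring for this could look like, the snippet below computes a population stability index (PSI) for a single feature. The function name, the ten-bin layout and the 0.2 alert threshold are illustrative assumptions (0.2 is a common rule of thumb), not something prescribed by this article or by any particular MLOps product.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """Compare a production sample (actual) with a training sample (expected).

    Bin edges are derived from the training sample; production values that
    fall outside that range are counted in the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 if every training value is identical.
    width = ((hi - lo) / bins) or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids taking log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: training data versus two production samples.
training = [i / 100 for i in range(100)]
same_distribution = list(training)
shifted = [0.5 + i / 200 for i in range(100)]

DRIFT_THRESHOLD = 0.2  # assumed rule-of-thumb alert level

for label, sample in (("same", same_distribution), ("shifted", shifted)):
    psi = population_stability_index(training, sample)
    status = "drift suspected" if psi > DRIFT_THRESHOLD else "stable"
    print(f"{label}: {status}")
```

In practice a check like this would run on a schedule against each important input feature and the model's output scores, with alerts routed to whoever owns the model.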
The benefits of MLOps
A lot can go wrong when deploying machine learning models in production. When IT/DevOps teams attempt to operationalize machine learning models, they need to manually script and automate the different processes. The models are updated often, and each time a model is updated, the entire process has to be repeated.
As an organization accumulates more models, and more iterations of each model, keeping track of them becomes a huge issue. Often, the tools in use don’t address the problem of codebases and frameworks that are disconnected from one another, which wastes time and resources. Most teams today also struggle with tracking and versioning as they update their models.
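To make the tracking and versioning problem concrete, here is a minimal sketch of a model registry that assigns an incrementing version to each update and records which framework produced it, along with a hash of the artifact. The class names, fields and the example model name are hypothetical and kept in memory for illustration only; real MLOps tooling would persist this information and attach far more metadata.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    framework: str
    artifact_sha256: str  # fingerprint of the serialized model


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry; a real one would use durable storage."""

    _versions: Dict[str, List[ModelVersion]] = field(default_factory=dict)

    def register(self, name: str, framework: str, artifact: bytes) -> ModelVersion:
        history = self._versions.setdefault(name, [])
        entry = ModelVersion(
            name=name,
            version=len(history) + 1,  # versions start at 1 and only increase
            framework=framework,
            artifact_sha256=hashlib.sha256(artifact).hexdigest(),
        )
        history.append(entry)
        return entry

    def latest(self, name: str) -> ModelVersion:
        return self._versions[name][-1]

    def history(self, name: str) -> List[ModelVersion]:
        return list(self._versions.get(name, []))


# Example: the same model is retrained and re-registered in a different framework.
registry = ModelRegistry()
first = registry.register("churn-model", "scikit-learn", b"serialized-model-v1")
second = registry.register("churn-model", "pytorch", b"serialized-model-v2")
print(registry.latest("churn-model").version)
```

Keeping the framework and artifact hash next to each version number is one way to answer "which exact model is running in production?" without relying on file names or memory.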
MLOps helps bridge the divide between data science and operations in managing the production ML lifecycle, essentially applying DevOps principles to ML delivery. That enables faster time to market for ML-based solutions, a more rapid rate of experimentation, and assurance of quality and reliability.
Using traditional SDLC models, you might be able to get one or two ML models done a year, at great pain and with extreme inefficiency. But with MLOps, you can scale, so you can address multiple problems. You can use these models to help better target prospective clients, find more relevant customers or find and improve inefficiencies. You’re able to roll out improvements much faster, ultimately improving productivity and profit.
The elements of MLOps success
MLOps isn’t a silver bullet. You still need the proper groundwork and knowledge of best practices for it to work. To succeed with MLOps, you need to focus on two primary things. The first is understanding the different roles. You need to ensure you have the right, diverse set of skills and employees in place; don’t treat data scientists and machine learning engineers as one and the same. Both are necessary, and you need a mix of the two.
The second is to avoid trying to do it all yourself. MLOps is labor-intensive and, built from scratch, requires large teams of ML engineers. It’s important to think through what you need and to look at the tools available to help you simplify the approach and reduce the number of dedicated people needed.
Going forward with confidence
Industry analysts estimate that close to half of enterprise AI projects are destined to fail. There are several reasons for such failure, including an organization’s culture. But a primary reason is the lack of appropriate technology to support the project. MLOps is a highly useful tool for helping organizations achieve success in their AI/ML projects, resulting in competitive business advantage.