Enterprises are rapidly accelerating their artificial intelligence (AI) initiatives. A study by Algorithmia showed that 76 percent of CIOs are prioritizing and increasing their IT budgets to focus more on AI and machine learning (ML) solutions. Organizations are also recognizing the importance of data, and most are embracing the fact that 80 percent of enterprise data is unstructured in nature.
The volume of unstructured data in the enterprise stack is growing at an alarming rate. The unit of measurement has shifted from terabytes to petabytes. As a result, IT professionals, CDOs, and CIOs must deal with new challenges in order to meet an increasing demand for usable data and actionable insights. Despite AI's enormous potential to transform any industry, only 15 percent of AI solutions deployed by the end of 2022 will be successful, and fewer of those will generate a positive ROI.
The biggest issue is that most enterprise AI solutions don’t see the light of day due to misaligned expectations. There continue to be misconceptions about the possibilities of AI, and projects continue to be conceived on hype-driven models. Most products or models are far removed from the reality of day-to-day enterprise operations. Other factors driving lower success rates include cost overruns, lack of AI Centers of Excellence (CoE), inexperienced talent, unavailability of data, and outdated policies, to name a few.
Planning Paves the Way for Enterprise AI Success
Unstructured data is data that lacks a predefined data model and includes everything from text-heavy documents and websites to images, video files, chatbots, audio streams, and social media posts. With the increasing amount of unstructured data in enterprise architecture, it is critical to have an efficient and incremental plan that aligns with the objectives of all corporate stakeholders. Typical objectives at an organizational level can include: process automation, fraud detection, improving the customer experience, improving safety, increasing sales, and so on. While some of these objectives can be achieved rather effectively, due to the structured nature of the data, planning around unstructured data can be challenging.
Typically, planning starts with identifying areas of opportunity within an organization. While there can be a grand AI vision at the executive management level, it is critical to identify an area that has high impact, low risk, and continuous growth in data. A good example of such a use case would be the function of loan processing in the banking and finance industry. Loan origination to servicing is riddled with manual processes where information is entered by hand into systems in a repetitive manner. Due diligence of loan applications involves a significant amount of document submission, which poses several risks. However, AI can be applied in several areas of the workflow, including document processing and fraud detection. This is also an area where there is continuous year-over-year growth of data.
Other critical steps to consider during this planning phase include defining measurable success criteria, formulating a cohesive data strategy, continuous training and feedback, and gauging the user experience, scalability, and infrastructure.
Defining Measurable Success Criteria (and Avoiding the Cart Before the Horse Moment!)
Google’s early success is often attributed to the company instituting Objectives and Key Results (OKRs). While this approach can be applied to any business or personal goal, applying it to your AI strategy could yield promising results. Unstructured data, however, is an evolving problem that the industry at large is still attempting to solve. Given the challenges, business leaders should ask various questions to determine the ‘what’ and the ‘why’. For example, if increasing productivity is the key objective, two questions that could be answered are:
- Should I plan to improve throughput by way of automation? or
- Should I plan to solve 80 percent of the problem for 100 percent of all cases submitted?
Answering these questions leads to two different implementation journeys, and it’s important to decide which one would be right for your enterprise.
With unstructured data, another ambiguous measurement area is accuracy. In the example of loan processing, there is so much variability in the documents submitted by customers that it is critical for business and technology leaders to reach consensus on how the accuracy of the AI solution is measured. If productivity is one of the objectives of instituting an AI solution, then it is necessary to identify the other areas that affect productivity. This can be achieved by looking closely at the current as-is process and reimagining it with AI automation. New automation often adds new steps to the process, such as manual exception management, annotation, and training. With these steps in place, it is easier to determine how to measure accuracy.
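To illustrate why stakeholders need to agree on a definition of accuracy, the following is a minimal sketch that scores the same extraction results in two ways: per field and per document. The field names, sample values, and function names are hypothetical and exist only for illustration; a real loan-processing system would define its own fields and matching rules.

```python
from typing import Dict, List

Record = Dict[str, str]


def field_accuracy(predicted: List[Record], truth: List[Record]) -> float:
    """Fraction of individual fields extracted correctly across all documents."""
    total = 0
    correct = 0
    for pred, gold in zip(predicted, truth):
        for field, expected in gold.items():
            total += 1
            if pred.get(field) == expected:
                correct += 1
    return correct / total if total else 0.0


def document_accuracy(predicted: List[Record], truth: List[Record]) -> float:
    """Fraction of documents in which every field was extracted correctly."""
    if not truth:
        return 0.0
    perfect = sum(
        1
        for pred, gold in zip(predicted, truth)
        if all(pred.get(field) == value for field, value in gold.items())
    )
    return perfect / len(truth)


# Hypothetical ground truth and model output for three loan documents.
truth = [
    {"name": "A. Rivera", "income": "52000", "employer": "Acme"},
    {"name": "B. Chen", "income": "61000", "employer": "Globex"},
    {"name": "C. Patel", "income": "48000", "employer": "Initech"},
]
predicted = [
    {"name": "A. Rivera", "income": "52000", "employer": "Acme"},
    {"name": "B. Chen", "income": "67000", "employer": "Globex"},
    {"name": "C. Patel", "income": "48000", "employer": "Initech Inc"},
]

field_acc = field_accuracy(predicted, truth)    # 7 of 9 fields match
doc_acc = document_accuracy(predicted, truth)   # 1 of 3 documents fully match
print(f"field-level: {field_acc:.2f}, document-level: {doc_acc:.2f}")
```

In this sample, the same output scores roughly 78 percent at the field level but only 33 percent at the document level. Which figure matters depends on whether a single wrong field sends the whole document to manual exception handling.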
Data is the Lifeblood of all Enterprises
Unstructured data has a high degree of variability in how information is structured and presented. Enterprises are riddled with information presented in documents, which by nature have complex structures consisting of paragraphs, sentences, and, more importantly, multi-dimensional table structures. In addition to documents, organizations are increasingly investing in chatbots, social media monitoring, and other forms of unstructured data such as news, images, and videos.
Most organizations underestimate how much data they already have available and accessible. Oftentimes the challenge is as simple as overcoming compliance restrictions and sharing data within the organization. Nevertheless, having clean and highly varied data allows for a better assessment of a problem and the design of an optimal solution.
Another important factor to consider is what outcome you expect from this unstructured data. This helps ensure the right amount of ground truth, training, and testing data. Going back to the loan processing example, if the goal of the AI solution is to determine applicants' average daily balances, ground truth and training data can be hyper-focused on bank statements. However, if the goal is to identify fraudulent applicants through submitted bank statements, one will have to access a wider range of documents to obtain the necessary ground truth and training data.
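As a concrete illustration of a narrowly scoped outcome, the sketch below computes an average daily balance from transactions that have already been extracted from a bank statement. It assumes a simple definition (the mean of end-of-day balances over an inclusive date range); the function name, inputs, and sample figures are hypothetical, and a lender may define the metric differently.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


def average_daily_balance(
    opening_balance: float,
    transactions: List[Tuple[date, float]],
    start: date,
    end: date,
) -> float:
    """Mean end-of-day balance over the inclusive period [start, end].

    Each transaction is (posting date, signed amount): deposits are
    positive, withdrawals are negative.
    """
    # Net change per day, so multiple transactions on one day are combined.
    net_by_day: Dict[date, float] = {}
    for day, amount in transactions:
        net_by_day[day] = net_by_day.get(day, 0.0) + amount

    days = (end - start).days + 1
    balance = opening_balance
    total = 0.0
    for offset in range(days):
        day = start + timedelta(days=offset)
        balance += net_by_day.get(day, 0.0)
        total += balance
    return total / days


# Hypothetical four-day statement: a deposit on day 2, a withdrawal on day 4.
adb = average_daily_balance(
    opening_balance=1000.0,
    transactions=[(date(2024, 1, 2), 500.0), (date(2024, 1, 4), -300.0)],
    start=date(2024, 1, 1),
    end=date(2024, 1, 4),
)
print(adb)  # end-of-day balances 1000, 1500, 1500, 1200 -> 1300.0
```

The ground truth for this outcome only needs correct dates and amounts from statements. A fraud-detection outcome has no comparably simple formula, which is why it calls for a wider range of documents.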
Scaling from PoC to Production
Embarking on a measurable Proof of Concept (PoC) ensures that all stakeholders understand the challenges, outcomes, and value proposition of an AI solution. However, a PoC is not the same as a production-ready solution. A PoC enables an organization to identify gaps, stimulates design thinking for a production solution, and streamlines the objectives and key results that should be achieved. To go from PoC to a scalable solution, organizations should plan for complex data scenarios, including constant data changes, unavailability of labeled data, and a high degree of variation in form and format. Equally important are reimagining the workflow, retraining your workforce, and determining the right infrastructure, costs, performance, data architecture, information security, and service-level agreements (SLAs).
It’s absolutely imperative to evaluate the entire workflow and business process in order to gain the best results from any AI solution. Taking a cue from behavioral economics, it is critical to compare the outcome to an existing reference point (also known as “reference dependency”), at which point better efficiencies can be anticipated ahead of production through design thinking and process remapping.
This scenario assumes that both business and technical leaders have agreed on an ML or deep learning approach based on the PoC. Some problem statements may be deterministic, in which case a statistical approach can solve the problem, whereas other challenges might require a combination of ML and neural-network-based approaches to achieve the desired outcomes.
Some AI solutions require the incorporation of Natural Language Processing (NLP). While general language models do serve as a foundational step, most models are not designed to meet the unique needs of every enterprise problem statement and would require fine-tuning. At the same time, most executives are likely to get enthused about huge models like GPT-3, which demand significant computational power and can have a direct influence on a company’s ROI. These models are most likely not a suitable fit for your company.
Your AI-driven PoC is just the beginning of a long process, so keep the following in mind:
- Don’t choose a complex problem to solve at the PoC stage
- Apply design thinking and review your end-to-end process; predict and manage risks early
- Accuracy is not the only measurement; design and plan to build a value-driven solution versus achieving 100 percent accuracy
- Evaluate your AI approach; don’t plan on hype-driven models, rather choose the most optimal approach that is modular in nature
- Manage expectations across all stakeholders to ensure the most successful outcome
- Design your solution and architecture to scale with the growth of your data for the most optimal ROI
Best Practices for AI-Driven Solutions
Today, most businesses are undertaking one or more AI projects. Despite excellent intentions and hard work, many enterprise AI programs fall short of expectations, do not scale, and do not generate the desired ROI. It will take time to integrate artificial intelligence as a core business component. However, some of the best practices followed by successful organizations include:
- Start with AI CoE: Many large corporations, even non-tech ones, have set up AI Centers of Excellence (AI CoE) to maximize the chances of their success. An AI CoE brings together the necessary expertise, resources, and people to allow AI-based transformation initiatives. The primary benefits include:
  - Consolidating AI learning, resources, and talent in a single place
  - Developing a unified AI vision and business strategy
  - Standardization of AI approaches, platforms, and processes
  - Identifying new revenue opportunities for AI and innovation
  - Scaling data science efforts by making AI available to all business functions
- Executive Buy-In: An AI strategy is most successful through a top-down approach. Successfully scaling pilots throughout an organization requires leadership buy-in, the necessary skills and data, and an organizational structure that ensures models remain accurate over time.
- Availability of Data: Most organizations have siloed data for various compliance reasons. However, data is the lifeblood of any AI solution, and provisioning this data is critical. Along with provisioning, classification and cleansing of data are essential. Developing accurate ground truth and training data can make or break an AI solution.
- Architecture: Leveraging AI is a paradigm shift for any organization, which requires new ways of thinking and planning. Designing an optimal technical and operational architecture increases your chances of success. This includes having new functions like ML ops, data ops, iterative training, and annotations, among others.
- Modularity and Flexibility: AI-driven solutions are still in their nascent stages, especially when organizations are dealing with heavy unstructured data. It is critical to design and build a modular and flexible solution that can scale with the business and its growing challenges.
Establishing and embarking on an AI strategy has great potential for most organizations, and the use cases are endless. Machine and deep learning solutions touch every aspect of an organization, from sales and marketing to daily operations. However, like building a rocket or inventing a new gadget, success won’t be achieved all at once. AI-driven solutions should be approached in stages and built on smaller wins over time.