
Ricky Costa, CEO of Quantum Stat – Interview Series


Ricky Costa is the CEO of Quantum Stat, a company that offers business solutions for NLP and AI initiatives.

What initially got you interested in artificial intelligence?

Randomness. I was reading a book on probability when I came across a famous theorem. At the time, I naively wondered whether I could apply it to a natural language problem I was attempting to solve at work. As it turns out, the algorithm already existed, unbeknownst to me: it was called Naïve Bayes, a very famous and simple generative model used in classical machine learning. The theorem was Bayes’ theorem. I felt this coincidence was a clue, and it planted a seed of curiosity to keep learning more.
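
For readers curious what that looks like in code, here is a minimal Naïve Bayes text classifier. This is purely our illustration (the library choice and the toy dataset are not from the interview), but it shows Bayes’ theorem doing the work of classical text classification in a handful of lines:

```python
# A toy Naïve Bayes text classifier (illustrative only; data is made up).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny, made-up training set: sentences labeled by topic.
texts = [
    "the market rallied on strong earnings",
    "shares fell after the profit warning",
    "the team won the championship game",
    "the striker scored twice in the final",
]
labels = ["finance", "finance", "sports", "sports"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Bayes' theorem in action: P(class | words) is proportional to P(words | class) * P(class).
print(model.predict(["earnings beat expectations"]))  # -> ['finance']
```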

 

You’re the CEO of Quantum Stat, a company which offers solutions for Natural Language Processing. How did you find yourself in this position?

When there’s a revolution in a new technology, some companies are more hesitant than others when facing the unknown. I started my company because pursuing the unknown is fun to me. I also felt it was the right time to venture into the field of NLP, given all of the amazing research that has arrived in the past two years. The NLP community now has the capacity to achieve a lot more with a lot less, given the advent of new NLP techniques that require less data to scale performance.

 

For readers who may not be familiar with this field, could you share with us what Natural Language Processing does?

NLP is a subfield of AI and analytics that attempts to understand natural language in text, speech, or multi-modal settings (text plus images/video) and process it to the point where you are driving insight and/or providing a valuable service. Value can arrive from several angles, from information retrieval in a company’s internal file system, to classifying sentiment in the news, to a GPT-2 Twitter bot that helps with your social media marketing (like the one we built a couple of weeks ago).

 

You have a Bachelor of Arts from Hunter College in Experimental Psychology. Do you feel that understanding the human brain and human psychology is an asset when it comes to understanding and expanding the field of Natural Language Processing?

This is contrarian, but unfortunately, no. The analogy between neurons and deep neural networks is simply for illustration and instilling intuition. One can probably learn a lot more from complexity science and engineering. The difficulty with understanding how the brain works is that we are dealing with a complex system. “Intelligence” is an emergent phenomenon arising from the brain’s complexity interacting with its environment, and it is very difficult to pin down. Psychology and the other social sciences, which depend on “reductionism” (top-down), don’t work under this complex paradigm. Here’s the intuition: imagine someone attempting to reduce the Beatles’ song “Let It Be” to the C major scale. There’s nothing about that scale that predicts “Let It Be” will emerge from it. The same follows for someone attempting to reduce behavior to neural activity in the brain.

 

Could you share with us why Big Data is so important when it comes to Deep Learning and more specifically Natural Language Processing?

As it stands, because deep learning models interpolate data, the more data you feed into a model, the fewer edge cases it will encounter when making inferences in the wild. This architecture “incentivizes” feeding large datasets into models in order to increase the accuracy of their output. However, if we want AI models to achieve more intelligent behavior, we need to look beyond how much data we have and more toward how we can improve a model’s ability to reason efficiently, which, intuitively, shouldn’t require lots of data. From a complexity perspective, the cellular automata experiments conducted in the past century by John von Neumann and Stephen Wolfram show that complexity can emerge from simple initial conditions and rules. What those conditions and rules should be with regard to AI is what everyone is hunting for.
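
To make that last point concrete with a small example of our own (not one Costa gives), an elementary cellular automaton such as Wolfram’s Rule 110 generates intricate, hard-to-predict structure from a single active cell and an eight-entry lookup table:

```python
# Elementary cellular automaton (Rule 110): complexity from simple rules.
import numpy as np

RULE = 110
rule_bits = [(RULE >> i) & 1 for i in range(8)]  # output for each 3-cell neighborhood (0-7)

width, steps = 64, 32
state = np.zeros(width, dtype=int)
state[width // 2] = 1  # a single "on" cell as the initial condition

for _ in range(steps):
    print("".join("#" if c else "." for c in state))
    left, right = np.roll(state, 1), np.roll(state, -1)
    neighborhood = (left << 2) | (state << 1) | right  # encode each cell's 3-cell window
    state = np.array([rule_bits[int(n)] for n in neighborhood])
```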

 

You recently launched the ‘Big Bad NLP Database’. What is this database and why does it matter to those in the AI industry?

This database was created so NLP developers have seamless access to all the pertinent datasets in the industry. It indexes datasets, which has the nice secondary effect of letting users query them. Preprocessing data takes the majority of time in the deployment pipeline, and this database attempts to mitigate that problem as much as possible. In addition, it’s a free platform for anyone, whether you are an academic researcher, a practitioner, or an independent AI guru who wants to get up to speed with NLP data.

 

Quantum Stat currently offers end-to-end solutions. What are some of these solutions?

We help companies facilitate their NLP modeling pipeline by offering development at any stage. We can cover a wide range of services, from data cleaning in the preprocessing stage all the way up to model server deployment in production (these services are also highlighted on our homepage). Not all AI projects come to fruition, due to the unknown nature of how your specific data and project architecture will work with a state-of-the-art model. Given this uncertainty, our services give companies a chance to iterate on their project at a fraction of the cost of hiring a full-time ML engineer.

 

What recent advancement in AI do you find the most interesting?

The most important advancement of late is the transformer model; you may have heard of it: BERT, RoBERTa, ALBERT, T5 and so on. These transformer models are very appealing because they allow researchers to achieve state-of-the-art performance with smaller datasets. Prior to transformers, a developer would require a very large dataset to train a model from scratch. Since transformers come pretrained on billions of words, they allow for faster iteration of AI projects, and that’s what we are mostly involved with at the moment.
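
To give a rough sense of what “pretrained” buys a developer, here is a minimal sketch using the open-source Hugging Face Transformers library (our choice of tooling; the interview doesn’t name a specific stack). A model pretrained on billions of words can be applied to a downstream task in a few lines, with no large labeled dataset of your own:

```python
# Applying a pretrained transformer out of the box (illustrative example only).
# Requires: pip install transformers torch
from transformers import pipeline

# Downloads a transformer pretrained on a large corpus and fine-tuned for sentiment.
classifier = pipeline("sentiment-analysis")

print(classifier("The new NLP toolkit made our deployment dramatically easier."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```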

 

Is there anything else that you would like to share about Quantum Stat?

We are working on a new project dealing with financial market sentiment analysis that will be released soon. We have leveraged multiple transformers to give unprecedented insight into how financial news unfolds in real time. Stay tuned!

To learn more, visit Quantum Stat or read our article on the Big Bad NLP Database.


Antoine Tardif is a futurist who is passionate about the future of AI and robotics. He is the CEO of BlockVentures.com and has invested in over 50 AI & blockchain projects. He is also the co-founder of Securities.io, a news website focusing on digital securities, and a founding partner of unite.ai.


Anthony Macciola, Chief Innovation Officer at ABBYY – Interview Series


Anthony is recognized as a thought leader and primary innovator of products, solutions, and technologies for the intelligent capture, RPA, BPM, BI and mobile markets.

ABBYY is an innovator and leader in artificial intelligence (AI) technology, including machine learning and natural language processing, that helps organizations better understand and drive context and outcomes from their data. The company’s goal is to grow and strengthen its leadership position by satisfying the ever-increasing demand for AI-enabled products and solutions.

ABBYY has been developing semantic and AI technologies for many years. Thousands of organizations from over 200 countries and regions have chosen ABBYY solutions that transform documents into business value by capturing information in any format. These solutions help organizations of diverse industries boost revenue, improve processes, mitigate risk, and drive competitive advantage.

What got you initially interested in AI?

I first became interested in AI in the 90s. In my role, we were utilizing support vector machines, neural networks, and machine learning engines to create extraction and classification models. At the time, it wasn’t called AI. However, we were leveraging AI to address problems surrounding data and document-driven processes, problems like effectively and accurately extracting, classifying and digitizing data from documents. From very early on in my career, I’ve known that AI can play a key role in transforming unstructured content into actionable information. Now, AI is no longer seen as a futuristic technology but an essential part of our daily lives – both within the enterprise and as consumers. It has become prolific. At ABBYY, we are leveraging AI to help solve some of today’s most pressing challenges. AI and related technologies, including machine learning, natural language processing, neural networks and OCR, help power our solutions that enable businesses to obtain a better understanding of their processes and the content that fuels them.

 

You’re currently the Chief Innovation Officer at ABBYY. What are some of the responsibilities of this position? 

In my role as Chief Innovation Officer for ABBYY, I’m responsible for our overall vision, strategy, and direction relative to various AI initiatives that leverage machine learning, robotic process automation (RPA), natural language processing and text analytics to identify process and data insights that improve business outcomes.

As CIO, I’m responsible for overseeing the direction of our product innovations as well as identifying outside technologies that are a fit to integrate into our portfolio. I initiated the discussions that led to the acquisition of TimelinePI, now ABBYY Timeline, the only end-to-end Process Intelligence platform in the market. Our new offering enables ABBYY to provide an even more robust and dynamic solution for optimizing the processes a business runs on and the data within those processes. We provide enterprises across diverse industries with solutions to accelerate digital transformation initiatives and unlock new opportunities for providing value to their customers.

I also guide the strategic priorities for the Research & Development and Product Innovation teams. My vision for success with regard to our innovations is guided by the following tenets:

  • Simplification: make everything we do as easy as possible to deploy, consume and maintain.
  • Cloud: leverage the growing demand for our capabilities within a cloud-based SaaS model.
  • Artificial Intelligence: build on our legacy expertise in linguistics and machine learning to ensure we take a leadership role as it relates to content analytics, automation and the application of machine learning within the process automation market.
  • Mobility: ensure we have best-of-breed on-device and zero-footprint mobile capture capabilities.

 

ABBYY uses AI technologies to solve document-related problems for enterprises using intelligent capture. Could you walk us through the different machine learning technologies that are used for these applications?

ABBYY leverages several AI enabling technologies to solve document-related and process-related challenges for businesses. More specifically, we work with computer vision, neural networks, machine learning, natural language processing and cognitive skills. We utilize these technologies in the following ways:

Computer Vision: utilized to extract, analyze, and understand information from images, including scanned documents.

Neural Networks: leveraged within our capture solutions to strengthen the accuracy of our classification and extraction technology. We also utilize advanced neural network techniques within our OCR offerings to enhance the accuracy and tolerance of our OCR technology.

Machine Learning: enables software to “learn” and improve, which increases accuracy and performance. In a workflow involving capturing documents and then processing with RPA, machine learning can learn from several variations of documents.

Natural Language Processing: enables software to read, interpret, and create actionable and structured data around unstructured content, such as contracts, emails, and other free-form communications.

Cognitive Skill: the ability to carry out a given task with determined results within a specific amount of time and cost. Examples within our products include extracting data and classifying a document.
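
To illustrate how these pieces typically fit together in an intelligent-capture pipeline, here is a generic sketch, not ABBYY’s code or API: OCR turns a scanned page into text, and a trained classifier routes the document as an invoice, claim, contract, and so on. The libraries, file name, and toy training data are our own assumptions.

```python
# Generic capture-pipeline sketch: OCR -> text features -> document class.
# Not ABBYY's implementation; libraries, file name, and data are illustrative.
import pytesseract                      # wrapper for the Tesseract OCR engine (must be installed)
from PIL import Image
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: OCR'd text snippets labeled by document type.
train_texts = [
    "invoice number total amount due payment terms net 30",
    "policy holder claim number date of loss adjuster notes",
    "this agreement is entered into by and between the parties",
]
train_labels = ["invoice", "claim", "contract"]

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(train_texts, train_labels)

# Classify a newly scanned page (hypothetical file name).
page_text = pytesseract.image_to_string(Image.open("scanned_page.png"))
print(classifier.predict([page_text]))
```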

 

ABBYY Digital Intelligence solutions help organizations accelerate their digital transformation. How do you define Digital Intelligence, how does it leverage RPA, and how do you go about introducing this to clients?

Digital Intelligence means gaining the valuable, yet often hard to attain, insight into an organization’s operation that enables true business transformation. With access to real-time data about exactly how their processes are currently working and the content that fuels them, Digital Intelligence empowers businesses to make tremendous impact where it matters most: customer experience, competitive advantage, visibility, and compliance.

We are educating our clients as to how Digital Intelligence can accelerate their digital transformation projects by addressing the challenges they have with unstructured and semi-structured data that is locked in documents such as invoices, claims, bills of lading, medical forms, etc. Customers focused on implementing automation projects can leverage Content Intelligence solutions to extract, classify, and validate documents to generate valuable and actionable business insights from their data.

Another component of Digital Intelligence is helping customers solve their process-related challenges. Specifically in relation to using RPA, there is often a lack of visibility into the full end-to-end process, and consequently a failure to consider the human workflow steps in the process and the documents on which they work. By understanding the full process with Process Intelligence, customers can make better decisions about what to automate, how to measure it, and how to monitor the entire process in production.

We introduce this concept to clients via the specific solutions that make up our Digital Intelligence platform. Content Intelligence enables RPA digital workers to turn unstructured content into meaningful information. Process Intelligence provides complete visibility into processes and how they are performing in real time.

 

What are the different types of unstructured data that you can currently work with?

We transform virtually any type of unstructured content, from simple forms to complex and free-form documents. Invoices, mortgage applications, onboarding documents, claim forms, receipts, and waybills are common use cases among our customers. Many organizations utilize our Content Intelligence solutions, such as FlexiCapture, to transform their accounts payable operations, enabling companies to reduce the amount of time and costs associated with tedious and repetitive administrative tasks while also freeing up valuable personnel resources to focus on high-value, mission critical responsibilities.

 

Which types of enterprises benefit most from the solutions offered by ABBYY?

Enterprises of all sizes, industries, and geographic markets can benefit from ABBYY’s Digital Intelligence solutions. In particular, organizations that are very process-oriented and document driven see substantial benefits from our platform. Businesses within the insurance, banking and financial services, logistics, and healthcare sectors experience notable transformation from our solutions.

For financial service institutions, extracting and processing content effectively can enhance application and onboarding operations, and it also enables mobile capabilities, which are becoming increasingly important to remaining competitive. With Content Intelligence, banks are able to easily capture documents submitted by the customer – including utility bills, pay stubs, and W-2 forms – on virtually any device.

In the insurance industry, Digital Intelligence can significantly improve claims processes by identifying, extracting, and classifying data from claim documents then turning this data into information that feeds into other systems, such as RPA.

Digital Intelligence is a cross-industry solution. It enables enterprises of all compositions to improve their processes and generate value from their data, helping businesses increase operational efficiencies and enhance overall profit margins.

 

Can you give some examples of how clients would benefit from the Digital Intelligence solutions that are offered by ABBYY?

Several recent examples come to mind relating to transforming accounts payable and claims. A billion-dollar manufacturer and distributor of medical supplies was experiencing double-digit sales growth year over year. It used ABBYY solutions with RPA to automate its 2,000 invoices per day and achieved significant results in productivity and cost efficiencies. Likewise, an insurance company digitized its 150,000+ annual claims processing. From claim setup to invoice clarity, it achieved more than 5,000 hours of productivity benefits.

Another example is a multi-billion-dollar global logistics company that had a highly manual invoice processing challenge. It had dozens of people processing hundreds of thousands of invoices from 124 different vendors annually. When it first considered RPA for its numerous finance activities, it shied away from invoice processing because of the complexity of semi-structured documents. It used our solutions to extract, classify, and validate invoice data, which included machine learning for ongoing training on invoices. If there was data that could not be matched, invoices went to a staff member for verification, but the points that needed to be checked were clearly highlighted to minimize effort. The invoices were then processed in the ERP system using RPA software bots. As a result, its accounts payable is now completely automated, and it is able to process thousands of invoices in a fraction of the time with significantly fewer errors.

 

What are some of the other interesting machine learning powered applications that are offered by ABBYY?

Machine learning is at the heart of our Content Intelligence solutions. ML fuels how we train our classification and extraction technology. We utilize this technology in our FlexiCapture solution to acquire, process, and validate data from documents – even complex or free form ones – and then feed this data into business applications including BPM and RPA. Leveraging machine learning, we are able to transform content-centric processes in a truly advanced way.

 

Is there anything else that you would like to share about ABBYY?

It goes without saying that these are uncertain and unprecedented times. ABBYY is fully committed to helping businesses navigate these challenging circumstances. It is more important than ever that businesses have what it takes to make timely, intelligent decisions. There is a lot of data coming in and it can be overwhelming. We are committed to making sure organizations are equipped with the technologies they need to deliver outcomes and help customers.

I really enjoyed learning about your work. For anyone who wishes to learn more, please visit ABBYY.



Human Genome Sequencing and Deep Learning Could Lead to a Coronavirus Vaccine – Opinion


The AI community must collaborate with geneticists in finding a treatment for those deemed most at risk of coronavirus. A potential treatment could involve removing a person’s cells, editing the DNA, and then injecting the cells back in, now hopefully armed with a successful immune response. This approach is currently being worked on for some other vaccines.

The first step would be sequencing the entire human genome from a sizeable segment of the human population.

Sequencing Human Genomes

Sequencing the first human genome cost $2.7 billion and took nearly 15 years to complete. The current cost of sequencing an entire human genome has dropped dramatically. As recently as 2015 the cost was $4,000; now it is less than $1,000 per person. This cost could drop a few percentage points more when economies of scale are taken into consideration.

We need to sequence the genome of two different types of patients:

  1. Infected with coronavirus but healthy
  2. Infected with coronavirus but with a poor immune response

It is impossible to predict which data point will be most valuable, but each sequenced genome would provide a dataset. The more data, the more options there are to locate DNA variations that increase the body’s resistance to the disease vector.

Nations are currently losing trillions of dollars to this outbreak, so the cost of $1,000 per human genome is minor in comparison. A minimum of 1,000 volunteers for each segment of the population would arm researchers with significant volumes of big data. Should the trial increase in size by an order of magnitude, the AI would have even more training data, which would increase the odds of success by several orders of magnitude. The more data the better, which is why a target of 10,000 volunteers should be aimed for.

Machine Learning

While multiple functionalities of machine learning would be present, deep learning would be used to find patterns in the data. For instance, there might be an observation that certain DNA variations correspond to high immunity, while others correspond to high mortality. At a minimum, we would learn which segments of the human population are more susceptible and should be quarantined.
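
As a purely hypothetical sketch of what “finding patterns” could mean here (the data below is random and the setup is our own illustration, not a published method), a model could be trained on encoded DNA variants to predict whether a patient mounted a strong or a weak immune response:

```python
# Hypothetical illustration only: predicting immune response from encoded variants.
# The data is randomly generated; real work would use actual sequenced genomes.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_patients, n_variants = 2000, 500

# Each row: 0/1/2 copies of the alternate allele at each variant site (made up).
X = rng.integers(0, 3, size=(n_patients, n_variants))
# Label: 1 = healthy despite infection, 0 = poor immune response (made up).
y = rng.integers(0, 2, size=n_patients)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))  # ~0.5 here, since labels are random
```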

To decipher this data, an artificial neural network (ANN) would be hosted in the cloud, and sequenced human genomes from around the world would be uploaded to it. With time being of the essence, parallel computing would reduce the time required for the ANN to work its magic.

We could even take it one step further and use the output data sorted by the ANN, feeding it into a separate system called a recurrent neural network (RNN). The RNN would use reinforcement learning to identify which gene selected by the initial ANN is most successful in a simulated environment. The reinforcement learning agent would gamify the entire process, using a simulated setting to test which DNA changes are more effective.

A simulated environment is like a virtual game environment, something many AI companies are well positioned to take advantage of based on their previous success in designing AI algorithms that win at esports. This includes companies such as DeepMind and OpenAI.

These companies can use their underlying architectures, optimized for mastering video games, to create a simulated environment, test gene edits, and learn which edits lead to specific desired changes.

Once a gene is identified, another technology is used to make the edits.

CRISPR

Recently, the first-ever study using CRISPR to edit DNA inside the human body was approved. This was to treat a rare type of genetic disorder that affects one of every 100,000 newborns. The condition can be caused by mutations in as many as 14 genes that play a role in the growth and operation of the retina. In this case, CRISPR sets out to carefully target DNA and cause slight, temporary damage to the DNA strand, causing the cell to repair itself. It is this restorative healing process which has the potential to restore eyesight.

While we are still waiting for results on whether this treatment will work, the precedent of having CRISPR approved for trials in the human body is transformational. Potential applications include improving the body’s immune response to specific disease vectors.

Potentially, we can manipulate the body’s natural genetic resistance to a specific disease. The diseases that could be targeted are diverse, but the community should be focusing on the treatment of the new global epidemic, coronavirus, a threat that, if unchecked, could be a death sentence for a large percentage of our population.

Final Thoughts

While there are many potential paths to success, it will require that geneticists, epidemiologists, and machine learning specialists unify. A potential treatment may be as described above, or it may turn out to be unimaginably different; the opportunity lies in sequencing the genomes of a large segment of the population.

Deep learning is the best analysis tool that humans have ever created; we need to at a minimum attempt to use it to create a vaccine.

When we take into consideration what is currently at risk with this current epidemic, these three scientific communities need to come together to work on a cure.



How AI Predicted Coronavirus and Can Prevent Future Pandemics – Opinion


BlueDot AI Prediction

On January 6th, the US Centers for Disease Control and Prevention (CDC) notified the public that a flu-like outbreak was propagating in Wuhan City, in the Hubei Province of China.  Subsequently, the World Health Organization (WHO) released a similar report on January 9th.

While these responses may seem timely, they were slow when compared to an AI company called BlueDot.  BlueDot released a report on December 31st, a full week before the CDC released similar information.

Even more impressive, BlueDot predicted the Zika outbreak in Florida six months before the first case in 2016.

What are some of the datasets that BlueDot analyzes?

  • Disease surveillance, including the scanning of 10,000+ media and public sources in over 60 languages.
  • Demographic data from national censuses and national statistics reports. (Population density is a factor behind virus propagation.)
  • Real-time climate data from NASA, NOAA, etc. (Viruses spread faster in certain environmental conditions.)
  • Insect vectors and animal reservoirs. (Important when a virus can spread from species to species.)

BlueDot currently works with various government agencies, including Global Affairs Canada, the Public Health Agency of Canada, the Canadian Medical Association, and the Singapore Ministry of Health. The BlueDot Insights product sends near real-time infectious disease alerts. Some advantages of this product include:

  • Reducing the risk of exposure for frontline healthcare workers.
  • Global visibility that saves time on infectious disease surveillance.
  • The opportunity to communicate crucial information clearly before it’s too late.
  • The ability to protect populations from infections.

How AI Predictability Could Be Improved

What’s preventing the BlueDot AI and similar AIs from improving? The number one limiting factor is the inability to access the necessary big data in real time.

These types of predictive systems rely on big data feeding into an artificial neural network (ANN), which uses deep learning to search for patterns. The more data that is fed into this ANN, the more accurate the machine learning algorithm becomes.

This essentially means that what is preventing the AI from being able to flag a potential outbreak sooner rather than later is simply a lack of access to the necessary data. In countries like China, which regularly monitor and filter news, these delays are even more pronounced. Censoring each data point can significantly reduce the amount of available data and, worse, can destroy its accuracy, which removes its potential usefulness. Faulty data was also why previous efforts such as Google Flu Trends failed.

In other words, the major problem preventing AI systems from being able to predict an outbreak as early as possible is government interference. Governments like China’s, as well as the current Trump administration, need to remove themselves from any type of data filtering and give the press full access to report on global health issues.

That being stated, reporters can only work with the information that is available to them. Bypassing news reports and accessing sources directly would enable machine learning systems to access data in a timelier and more efficient fashion.

What Needs to be Done

Starting immediately, governments that are truly interested in reducing the cost of healthcare and preventing an outbreak should begin a mandatory review of how their health clinics and hospitals can distribute certain data points in real time to officials, reporters, and AI systems.

Individual private information can be completely stripped from each patient’s record, enabling the patient to remain anonymous while the important data is shared.

A network of hospitals in any city that collects and shares data in real time would be able to offer superior healthcare. For example, a specific hospital could be tracked as showing an increase in patients with flu-like symptoms, from 3 patients at 10:00 AM, to 7 patients at 1:00 PM, to 49 patients by 5:00 PM. This data could be compared against hospitals within the same region for immediate alerts that a certain region is a potential hot zone.
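
A toy version of that alert logic (ours, purely illustrative; the hospital names, counts, and threshold are made up) might compare each hospital’s latest count of flu-like cases against the regional baseline and flag sudden spikes:

```python
# Toy hot-zone flagging: compare each hospital's latest flu-like-symptom count
# against the regional baseline (names, counts, and threshold are made up).
from statistics import median

hourly_counts = {
    "Hospital A": [3, 7, 49],   # counts at 10:00 AM, 1:00 PM, 5:00 PM (the example above)
    "Hospital B": [2, 3, 4],
    "Hospital C": [4, 5, 6],
}

def flag_hot_zones(counts_by_hospital, spike_factor=3.0):
    latest = {h: counts[-1] for h, counts in counts_by_hospital.items()}
    regional_baseline = median(latest.values())
    return [h for h, count in latest.items() if count > spike_factor * regional_baseline]

print(flag_hot_zones(hourly_counts))  # -> ['Hospital A']
```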

Once this information is collected and assembled, the AI system could trigger alerts to all neighboring regions so that the necessary precautions can be taken.

While this would be difficult in certain regions of the world, countries with large AI hubs and smaller population densities such as Canada could institute such an advanced system. Canada has AI hubs in the most populated provinces (Waterloo and Toronto, Ontario, and Montreal, Quebec). The advantages of this inter-hospital and inter-provincial cooperation could be extended to offer Canadians other benefits such as accelerated access to emergency medical care, and reduced healthcare spending. Canada could become a leader in both AI and healthcare, licensing this technology to other jurisdictions.

Most importantly, once a country such as Canada has a system in place, the technology/methodologies can then be cloned and exported to other regions. Eventually, the goal would be to blanket the entire world, to ensure outbreaks are a relic of the past.

This type of data collection by healthcare workers has benefits for multiple applications. There is no reason why, in 2020, a patient should have to register with each hospital individually, or why those same hospitals are not communicating with one another in real time. This lack of communication can result in the loss of data for patients who suffer from dementia or other symptoms which may prevent them from fully communicating the severity of their condition, or even where else they have been treated.

Lessons Learned

We can only hope that governments around the world take advantage of the important lessons that coronavirus is teaching us. Humanity should consider itself lucky that coronavirus has a relatively mild fatality rate compared to some infectious agents of the past, such as the Black Plague, which is estimated to have killed 30% to 60% of Europe’s population.

The next time we might not be so lucky. What we do know so far is that governments are currently ill-equipped to deal with the severity of an outbreak.

BlueDot was conceived in the wake of Toronto’s 2003 SARS outbreak and launched in 2013. The goal was to protect people around the world from infectious diseases with human and artificial intelligence. The AI component has demonstrated a remarkable ability to predict the path of infectious diseases; what remains is the human component. We need new policies in place to enable companies such as BlueDot to excel at what they do best. As people, we need to demand more from our politicians and healthcare providers.
