Wilson Pang is the Chief Technology Officer at Appen, where he leads a group of world-class data scientists, engineers, and product managers who combine the power of technology and humans to solve AI data problems.
In this interview we discuss AI ethics, Appen’s 2020 State of AI and Machine Learning Report and current industry challenges.
What was it that attracted you personally to software engineering and data science?
I received the opportunity to work on data and AI 10 years ago. In the AI world, developers no longer control logic in code; instead, data decides the logic of the AI model, which is fascinating. I started my career as a developer with IBM, building large systems for banks, telecom operators and securities exchange companies. I was excited by the power of software and AI.
I have also been lucky to see firsthand how data and machine learning help grow a business. My team at eBay leveraged AI to increase buyer purchases, adding tens of millions in revenue. At Trip.com, we used AI to increase internal efficiency and significantly reduce operational costs. Now at Appen, we are helping companies from all kinds of industries use AI to drive the success of their business.
Could you describe some of Appen’s AI and data labeling offerings?
We are a training data provider working with over 1 million contractors to collect and label images, text, speech, audio, video, and other data. Those contractors reside in over 130 countries and speak 180 languages and dialects, which gives us the ability to provide high-quality data for AI projects. We also have professional teams who have worked in AI data for more than 20 years and know a great deal about how to get the training data right. Last but not least, we have an industry-leading AI-assisted annotation platform with built-in features to assure quality and productivity. Our customers can choose between our managed service solution, where we partner with their team, and the self-service platform. Our AI-assisted data annotation platform gives customers the ability to manage projects, along with machine learning assisted tools to enhance quality, accuracy, and annotation speed.
With more than 20 years of experience, our services offer world-class training data, the most advanced AI-assisted data annotation platform, and a diverse, global crowd to ensure high-quality data.
What are some of the open source data sets that are available?
Appen has several open source data sets available online, ranging from image annotation to handwriting recognition. These data sets are free to download and can be used to help build and train your AI model. One of the most interesting sets available covers nucleus segmentation from medical images. The data contains more than 21,000 nuclei annotated and validated by medical experts.
What was your biggest personal take away from the 2020 State of AI and Machine Learning Report?
The most interesting finding from the report was the increase in C-Suite involvement. I had been hearing about it, but to see the increase by 31% from last year was a big surprise. AI is becoming a part of core business and not just with tech leaders.
One cause for concern in the report is that only 25% of companies stated that unbiased AI is mission critical. Do you believe that this is due to lack of education on the importance of removing AI bias? What needs to be done to improve these statistics?
Yes, education is the first step to improving that statistic. An AI model built on biased data will deliver biased results and never be fully successful. Business leaders need to learn the importance of having unbiased data and how that leads to a successful deployment.
What are some AI ethics that companies should consider when working with AI and large data sets?
It’s important to look at where the data came from. Is that data being ethically sourced and unbiased? When looking at where it was sourced, you want to know if the people were paid a fair wage and if the data came from a diverse group. We recently released our Crowd Code of Ethics in support of inclusion, diversity, fair pay, and communication for our contributors.
What do you view as the current biggest industry challenge?
Lack of data and data management is the biggest challenge for the industry. Teams have a lot of decisions to make about their data, and many have trouble recognizing what they need. They need to understand what data they have, where it came from and what data they still need. All of that data management is important to building and training an AI model. Lack of data can lead to a biased model, which in turn will not be successful.
Appen has been releasing the State of AI and Machine Learning Reports for many years now. When was the first report launched and what are some of the biggest changes that have been seen since this initial report?
The first report was launched in 2015, and the biggest change we’ve seen is in ownership of AI projects. The first report was primarily answered by data scientists who managed AI for their companies. Today, 71% report C-Suite ownership, which indicates a huge shift in perspective as AI becomes more critical to businesses. Data scientists also faced many challenges, from a lack of resources for building valuable insights from the data to unclear goals and unrealistic expectations. However, one key challenge remains the same: data and data management.
Is there anything else that you would like to share about Appen?
On July 16, we’re excited to host the first virtual round table on launching AI in the real world on Appen.com. The four-part series features industry-leading practitioners sharing personal experiences and insights into their own AI journeys, shedding light so others can accelerate progress toward their AI initiatives. To succeed, companies will have to be prepared to overcome several common challenges around data, ethics, people, and lifecycles. In the first edition, we bring together leading experts to share what it means to them and their organization to participate in creating responsible AI.
Thank you for the interview. Readers may wish to read Appen’s 2020 State of AI and Machine Learning Report or to visit the Appen website.