Back in January, we covered the release of Quantum Stat’s “Big Bad NLP Database,” a database containing hundreds of different datasets for machine learning developers to use. It was a big development for natural language processing (NLP), and the company has now created “The Super Duper NLP Repo.”
The Super Duper NLP Repo
According to Ricky Costa, CEO of Quantum Stat, the need for a new database stemmed from the advancement of the NLP industry. Because of this, Quantum Stat set out to find new solutions and to give developers direct access to code.
Ricky Costa has also given an interview to Unite.AI.
The Super Duper NLP Repo database contains over 100 Colab notebooks that run ML code for different NLP tasks. Colab notebooks make it easier to share models and give developers a way to experiment, since Colab provides free GPU/TPU access on Google’s back-end servers.
The layout of the new database is similar to the previous one and easy to follow. It includes the name of the notebook, the date it was added, a description, model, task, the creator, and a link to open in Colab.
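To make that layout concrete, here is a minimal Python sketch of how one row of the table might be modeled and filtered by task. The class name, field names, helper function, and sample rows are all hypothetical and chosen for illustration; they are not taken from Quantum Stat’s actual schema or data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NotebookEntry:
    """One row of the table. Field names are illustrative assumptions,
    mirroring the columns described in the article."""
    name: str
    date_added: date
    description: str
    model: str
    task: str
    creator: str
    colab_url: str


def filter_by_task(entries, task):
    """Return the entries whose task matches, ignoring case and outer spaces."""
    wanted = task.strip().lower()
    return [e for e in entries if e.task.strip().lower() == wanted]


# Made-up placeholder rows; names, dates, and URLs are not real repo entries.
SAMPLE_ENTRIES = [
    NotebookEntry(
        name="Example sentiment notebook",
        date_added=date(2020, 1, 1),
        description="Placeholder description",
        model="BERT",
        task="Sentiment Analysis",
        creator="example-author",
        colab_url="https://example.com/notebook-1",
    ),
    NotebookEntry(
        name="Example generation notebook",
        date_added=date(2020, 1, 2),
        description="Placeholder description",
        model="GPT-2",
        task="Text Generation",
        creator="example-author",
        colab_url="https://example.com/notebook-2",
    ),
]

if __name__ == "__main__":
    for entry in filter_by_task(SAMPLE_ENTRIES, "text generation"):
        print(entry.name, "->", entry.colab_url)
```

Filtering on the task column is just one plausible way a developer might narrow down the list; the same pattern would work for the model or creator columns.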
According to the company, the notebooks in the database come from both independent and industry AI researchers. Sources represented include TensorFlow, Hugging Face, and DeepPavlov.
With the notebooks, many models can be run, including BERT, T5, CNN, and GPT-2. They cover many tasks, such as classification, text generation, embeddings, dialogue, sentiment analysis, and speech translation.
According to a post by Ricky Costa on Medium, “We’ve continued with the same community mindset to have a destination for developers to contribute their own code to the burgeoning field of NLP. If you have a notebook to share, you can always hit the big red button.”
Natural Language Processing
Natural language processing deals with computers and human languages. Techniques and tools are used to enable computers to process, interpret, and analyze human language, and the field borrows from others such as linguistics, computer science, information engineering, and artificial intelligence.
Human language must first be converted into a numerical representation before a computer can manipulate it. The long-term goal is for a machine to read and understand human language, as well as derive meaning from it.
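As a rough illustration of that conversion step, the Python sketch below splits text into word tokens and maps each token to an integer ID. The function names, the `<unk>` placeholder for unseen words, and the two-sentence corpus are assumptions made for this example; the models in the repo rely on far more sophisticated tokenizers.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(sentences):
    """Give each distinct token an integer ID in order of first appearance.
    ID 0 is reserved for words that were never seen."""
    vocab = {"<unk>": 0}
    for sentence in sentences:
        for token in tokenize(sentence):
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn a string into a list of integer IDs using the vocabulary."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokenize(text)]


# Tiny made-up corpus for illustration.
corpus = ["The model reads text", "The text is short"]
vocab = build_vocab(corpus)

if __name__ == "__main__":
    print(encode("The model reads poetry", vocab))  # prints [1, 2, 3, 0]
```

The word "poetry" never appears in the corpus, so it falls back to the reserved ID 0, which is one common way to handle unknown words.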
NLP is making a lot of progress thanks to access to data and increased computational power. Some of the fields that utilize NLP include healthcare, finance, media, and human resources.
NLP has many other applications, such as chatbots, digital assistants, document organization, sentiment analysis, and talent recruitment. In the case of digital assistants, such as Amazon’s Alexa, NLP is used to interpret voice commands and respond accordingly. The real power here is that users can hand cognitive tasks over to the technology and focus on other areas.
When it comes to sentiment analysis, NLP techniques help make connections between the use of language and people’s reactions and feelings. Companies can use this to find out things like how a product is being received by the users.
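For a sense of the basic idea, here is a deliberately simple Python sketch that scores text by counting words from small positive and negative word lists. The word lists, function names, and thresholds are invented for this example; the notebooks in the repo use trained models rather than hand-written lists like these.

```python
import re

# Hypothetical, hand-picked word lists for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "broken"}


def sentiment_score(text):
    """Count positive words minus negative words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def label(text):
    """Map the score to a coarse positive / negative / neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    reviews = [
        "I love this product, it is great",
        "Terrible battery and a broken screen",
        "It arrived on Tuesday",
    ]
    for review in reviews:
        print(label(review), "|", review)
```

A word-counting approach like this misses negation and sarcasm, which is part of why developers turn to the learned models collected in repositories like this one.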
Quantum Stat’s Super Duper NLP Repo helps bring all of this together in one place. It gives developers an opportunity to explore and experiment with different models, and its layout is informative and easy to follow. Perhaps its biggest strength is that it also provides a platform for independent AI researchers.