Connect with us

AI 101

What is a Decision Tree?

mm

Published

 on

What is a Decision Tree?

A decision tree is a useful machine learning algorithm used for both regression and classification tasks. The name “decision tree” comes from the fact that the algorithm keeps dividing the dataset down into smaller and smaller portions until the data has been divided into single instances, which are then classified. If you were to visualize the results of the algorithm, the way the categories are divided would resemble a tree and many leaves.

That’s a quick definition of a decision tree, but let’s take a deep dive into how decision trees work. Having a better understanding of how decision trees operate, as well as their use cases, will assist you in knowing when to utilize them during your machine learning projects.

General Format of a Decision Tree

A decision tree is a lot like a flowchart. To utilize a flowchart you start at the starting point, or root, of the chart and then based on how you answer the filtering criteria of that starting node you move to one of the next possible nodes. This process is repeated until an ending is reached.

Decision trees operate in essentially the same manner, with every internal node in the tree being some sort of test/filtering criteria. The nodes on the outside, the endpoints of the tree, are the labels for the datapoint in question and they are dubbed “leaves”. The branches that lead from the internal nodes to the next node are features or conjunctions of features. The rules used to classify the datapoints are the paths that run from the root to the leaves.

Steps and Algorithms

Decision trees operate on an algorithmic approach which splits the dataset up into individual data points based on different criteria. These splits are done with different variables, or the different features of the dataset. For example, if the goal is to determine whether or not a dog or cat is being described by the input features, variables the data is split on might be things like “claws” and “barks”.

So what algorithms are used to actually split the data into branches and leaves? There are various methods that can be used to split a tree up, but the most common method of splitting is probably a technique referred to as “recursive binary split”. When carrying out this method of splitting, the process starts at the root and the number of features in the dataset represents the possible number of possible splits. A function is used to determine how much accuracy every possible split will cost, and the split is made using the criteria that sacrifices the least accuracy. This process is carried out recursively and sub-groups are formed using the same general strategy.

In order to determine the cost of the split, a cost function is used. A different cost function is used for regression tasks and classification tasks. The goal of both cost functions is to determine which branches have the most similar response values, or the most homogenous branches. Consider that you want test data of a certain class to follow certain paths and this makes intuitive sense.

In terms of the regression cost function for recursive binary split, the algorithm used to calculate the cost is as follows:

sum(y – prediction)^2

The prediction for a particular group of data points is the mean of the responses of the training data for that group. All the data points are run through the cost function to determine the cost for all the possible splits and the split with the lowest cost is selected.

Regarding the cost function for classification, the function is as follows:

G = sum(pk * (1 – pk))

This is the Gini score, and it is a measurement of the effectiveness of a split, based on how many instances of different classes are in the groups resulting from the split. In other words, it quantifies how mixed the groups are after the split. An optimal split is when all the groups resulting from the split consist only of inputs from one class. If an optimal split has been created the “pk” value will be either 0 or 1 and G will be equal to zero. You might be able to guess that the worst-case split is one where there is a 50-50 representation of the classes in the split, in the case of binary classification. In this case, the “pk” value would be 0.5 and G would also be 0.5.

The splitting process is terminated when all the data points have been turned into leaves and classified. However, you may want to stop the growth of the tree early. Large complex trees are prone to overfitting, but several different methods can be used to combat this. One method of reducing overfitting is to specify a minimum number of data points that will be used to create a leaf. Another method of controlling for overfitting is restricting the tree to a certain maximum depth, which controls how long a path can stretch from the root to a leaf.

Another process involved in the creation of decision trees is pruning. Pruning can help increase the performance of a decision tree by stripping out branches containing features that have little predictive power/little importance for the model. In this way, the complexity of the tree is reduced, it becomes less likely to overfit, and the predictive utility of the model is increased.

When conducting pruning, the process can start at either the top of the tree or the bottom of the tree. However, the easiest method of pruning is to start with the leaves and attempt to drop the node that contains the most common class within that leaf. If the accuracy of the model doesn’t deteriorate when this is done, then the change is preserved. There are other techniques used to carry out pruning, but the method described above – reduced error pruning – is probably the most common method of decision tree pruning.

Considerations For Using Decision Trees

Decision trees are often useful when classification needs to be carried out but computation time is a major constraint. Decision trees can make it clear which features in the chosen datasets wield the most predictive power. Furthermore, unlike many machine learning algorithms where the rules used to classify the data may be hard to interpret, decision trees can render interpretable rules. Decision trees are also able to make use of both categorical and continuous variables which means that less preprocessing is needed, compared to algorithms that can only handle one of these variable types.

Decision trees tend not to perform very well when used to determine the values of continuous attributes. Another limitation of decision trees is that, when doing classification, if there are few training examples but many classes the decision tree tends to be inaccurate.

To Learn More

Recommended Artificial Intelligence CoursesOffered ByDurationDifficulty


Introduction to Artificial Intelligence



IBM

9 Hours

Beginner


Deep Learning for Business


Yonsei University

8 Hours

Beginner


An Introduction to Practical Deep Learning


Intel Software

12 Hours

Intermediate


Machine Learning Foundations


University of Washington

24 Hours

Intermediate
Spread the love

Deep Learning Specialization on Coursera

Blogger and programmer with specialties in machine learning and deep learning topics. Daniel hopes to help others use the power of AI for social good.

AI 101

New AI Powered Tool Enables Video Editing From Themed Text Documents

mm

Published

on

New AI Powered Tool Enables Video Editing From Themed Text Documents

A team of computer science researchers from Tsinghua and Beihand University in China, IDC Herzilya in Israel, and Harvard University have recently created a tool that generates edited videos based on a text description and a repository of video clips.

Massive amounts of video footage are recorded every day by professional videographers, hobbyists, and regular people. Yet editing this video down into a presentation that makes sense is still a costly time investment, often requiring the use of complex editing tools that can manipulate raw footage. The international team of researchers recently developed a tool that takes themed text descriptions and generates videos based on them. The tool is capable of examining video clips in a repository and selecting the clips that correspond with the input text describing the storyline. The goal is that the tool is user-friendly and powerful enough to produce quality videos without the need for extensive video editing skills or expensive video editing software.

While current video editing platforms require knowledge of video editing techniques, the tool created by the researchers lets novice video creates create compositions that tells stories in a more natural, intuitive fashion. “Write-A-Video”, as it is dubbed by its creators, lets users edit videos by just editing the text that accompanies the video. If a user deletes text, adds text, or moves sentences around, these changes will be reflected in the video. Corresponding shots will be cut or added as the user manipulates the text and the final resulting video tailored to the user’s description.

Ariel Shamir, the Dean of the Efi Arazi School of Computer Science at IDC Herzliya explained that the Write-A-Video tool lets the user interact with the video mainly through text, using natural language processing techniques to match video shots based on the provided semantic meaning. An optimization algorithm is then used to assemble the video by cutting and swapping shots. The tool allows users to experiment with different visual styles as well, tweaking how scenes are presented by using specific film idioms that will speed up or slow down the action, or make more/fewer cuts.

The program selects possible shots based on their aesthetic appeal. The program considers how shots are framed, focused, and light in order to determine the aesthetic appeal. The tool  will select shots that are better focused, instead of blurry or unstable, and it will also prioritize shots that are well lit. According to the creators of Write-A-Video, the user can render the generated video at any point and preview it with a voice-over narration that describes the text used to select the clips.

According to the research team, their experiment demonstrated that digital techniques that combine aspects of computer vision and natural language processing can assist users in creative processes like the editing of videos.

“Our work demonstrates the potential of automatic visual-semantic matching in idiom-based computational editing, offering an intelligent way to make video creation more accessible to non-professionals,” explained Shamir to TechXplore.

The researchers tested their tool out on different video repositories combined with themed text documents. User studies and quantitative evaluation was performed to interpret the results of the experiment. The results of the user studies found that non-professionals could sometimes produce high quality edited videos using the tool faster than professionals using frame-based editing software could. As reported by TechXplore, the team will be presenting their work in a few days at the ACM SIGGRAPH Asia conference held in Australia. Other entities are also using AI to augment video editing. Adobe has also been working on its own AI-powered extensions for Premiere Pro, its editing platform. The tool helps people ensure that changes in aspect ratio don’t cut out important pieces of video.

Spread the love

Deep Learning Specialization on Coursera
Continue Reading

AI 101

Structured vs Unstructured Data

mm

Published

on

Structured vs Unstructured Data

Unstructured data is data that isn’t organized in a pre-defined fashion or lacks a specific data model. Meanwhile, structured data is data that has clear, definable relationships between the data points, with a pre-defined model containing it. That’s the short answer on the difference between structured and unstructured data, but let’s take a closer look at the differences between the two types of data.

Structured Data

When it comes to computer science, data structures refer to specific ways of storing and organizing data. Different data structures possess different relationships between data points, but data can also be unstructured. What does it mean to say that data is structured? To make this definition clearer, let’s take a look at some of the various ways of structuring data.

Structured data is often held in tables such as Excel files or SQL databases. In these cases, the rows and columns of the data hold different variables or features, and it is often possible to discern the relationship between data points by checking to see where data rows and columns intersect. Structured data can easily be fit into a relational database, and examples of different features in a structured dataset can include items like names, addresses, dates, weather statistics, credit card numbers, etc. While structured data is most often text data, it is possible to store things like images and audio as structured data as well.

Common sources of structured data include things like data collected from sensors, weblogs, network data, and retail or e-commerce data. Structured data can also be generated by people filling in spreadsheets or databases with data collected from computers and other devices. For instance, data collected through online forms is often immediately fed into a data structure.

Structured data has a long history of being stored in relational databases and SQL. These storage methods are popular because of the ease of reading and writing in these formats, with most platforms and languages being able to interpret these data formats.

In a machine learning context, structured data is easier to train a machine learning system on, because the patterns within the data are more explicit. Certain features can be fed into a machine learning classifier and used to label other data instances based on those selected features. In contrast, training a machine learning system on unstructured data tends to be more difficult, for reasons that will become clear.

Unstructured Data

Unstructured data is data that isn’t organized according to a pre-defined data model or structure. Unstructured data is often called qualitative data because it can’t be analyzed or processed in traditional ways using the regular methods used for structured data.

Because unstructured data doesn’t have any defined relationships between data points, it can’t be organized in relational databases. In contrast, the way unstructured data is stored is typically with a NoSQL database, or a non-relational database. If the structure of the database is of little concern, a data lake, or a large pool of unstructured data, can be used to store the data instead of a NoSQL database.

Unstructured data is difficult to analyze, and making sense of unstructured data often involves examining individual pieces of data to discern potential features and then looking to see if those features occur in other pieces of data within the pool.

The vast majority of data is in unstructured formats, with estimates that unstructured data comprises around 80% of all data. Data mining techniques can be used to help structure data.

In terms of machine learning, certain techniques can help order unstructured data and turn it into structured data. A popular tool for turning unstructured data into structured data is a system called an autoencoder.

Spread the love

Deep Learning Specialization on Coursera
Continue Reading

AI 101

What is Natural Language Processing?

mm

Published

on

What is Natural Language Processing?

Natural Language Processing (NLP) is the study and application of techniques and tools that enable computers to process, analyze, interpret, and reason about human language. NLP is an interdisciplinary field and it combines techniques established in fields like linguistics and computer science. These techniques are used in concert with AI to create chatbots and digital assistants like Google Assistant and Amazon’s Alexa.

Let’s take some time to explore the rationale behind Natural Language Processing, some of the techniques used in NLP, and some common uses cases for NLP.

Why Is Natural Language Processing Important?

In order for computers to interpret human language, they must be converted into a form that a computer can manipulate. However, this isn’t as simple as converting text data into numbers. In order to derive meaning from human language, patterns have to be extracted from the hundreds or thousands of words that make up a text document. This is no easy task. There are few hard and fast rules that can be applied to the interpretation of human language. For instance, the exact same set of words can mean different things depending on the context. Human language is a complex and often ambiguous thing, and a statement can be uttered with sincerity or sarcasm.

Despite this, there are some general guidelines that can be used when interpreting words and characters, such as the character “s” being used to denote that an item is plural. These general guidelines have to be used in concert with each other to extract meaning from the text, to create features that a machine learning algorithm can interpret.

Natural Language Processing involves the application of various algorithms capable of taking unstructured data and converting it into structured data. If these algorithms are applied in the wrong manner, the computer will often fail to derive the correct meaning from the text. This can often be seen in the translation of text between languages, where the precise meaning of the sentence is often lost. While machine translation has improved substantially over the past few years, machine translation errors still occur frequently.

Natural Language Processing Techniques

What is Natural Language Processing?

Photo: Tamur via WikiMedia Commons, Public Domain (https://commons.wikimedia.org/wiki/File:ParseTree.svg)

Many of the techniques that are used in natural language processing can be placed in one of two categories: syntax or semantics. Syntax techniques are those that deal with the ordering of words, while semantic techniques are the techniques that involve the meaning of words.

Syntax NLP Techniques

Examples of syntax include:

  • Lemmatization
  • Morphological Segmentation
  • Part-of-Speech Tagging
  • Parsing
  • Sentence Breaking
  • Stemming
  • Word Segmentation

Lemmatization refers to distilling the different inflections of a word down to a single form. Lemmatization takes things like tenses and plurals and simplifies them, for example, “feet” might become “foot” and “stripes” may become “stripe”.  This simplified word form makes it easier for an algorithm to interpret the words in a document.

Morphological segmentation is the process of dividing words into morphemes or the base units of a word. These units are things like free morphemes (which can stand alone as words) and prefixes or suffixes.

Part-of-speech tagging is simply the process of identifying which part of speech every word in an input document is.

Parsing refers to analyzing all the words in a sentence and correlating them with their formal grammar labels or doing grammatical analysis for all the words.

Sentence breaking, or sentence boundary segmentation, refers to deciding where a sentence begins and ends.

Stemming is the process of reducing words down to the root form of the word. For instance, connected, connection, and connections would all be stemmed to “connect”.

Word Segmentation is the process of dividing large pieces of text down into small units, which can be words or stemmed/lemmatized units.

Semantic NLP Techniques

Semantic NLP techniques include techniques like:

  • Named Entity Recognition
  • Natural Language Generation
  • Word-Sense disambiguation

Named entity recognition involves tagging certain text portions that can be placed into one of a number of different preset groups. Pre-defined categories include things like dates, cities, places, companies, and individuals.

Natural language generation is the process of using databases to transform structured data into natural language. For instance, statistics about the weather, like temperature and wind speed could be summarized with natural language.

Word-sense disambiguation is the process of assigning meaning to words within a text based on the context the words appear in.

Deep Learning Models For Natural Language Processing

Regular multilayer perceptrons are unable to handle the interpretation of sequential data, where the order of the information is important. In order to deal with the importance of order in sequential data, a type of neural network is used that preserves information from previous timesteps in the training.

Recurrent Neural Networks are types of neural networks that loop over data from previous timesteps, taking them into account when calculating the weights of the current timestep. Essentially, RNN’s have three parameters that are used during the forward training pass: a matrix based on the Previous Hidden State, a matrix based on the Current Input, and a matrix that is between the hidden state and the output. Because RNNs can take information from previous timesteps into account, they can extract relevant patterns from text data by taking earlier words in the sentence into account when interpreting the meaning of a word.

Another type of deep learning architecture used to process text data is a Long Short-Term Memory (LSTM) network. LSTM networks are similar to RNNs in structure, but owing to some differences in their architecture they tend to perform better than RNNs. They avoid a specific problem that often occurs when using RNNs called the exploding gradient problem.

These deep neural networks can be either unidirectional or bi-directional. Bi-directional networks are capable of taking not just the words that come prior to the current word into account, but the words that come after it. While this leads to higher accuracy, it is more computationally expensive.

Use Cases For Natural Language Processing

What is Natural Language Processing?

Photo: mohammed_hassan via Pixabay, Pixabay License (https://pixabay.com/illustrations/chatbot-chat-application-artificial-3589528/)

Because Natural Language Processing involves the analysis and manipulation of human languages, it has an incredibly wide range of applications. Possible applications for NLP include chatbots, digital assistants, sentiment analysis, document organization, talent recruitment, and healthcare.

Chatbots and digital assistants like Amazon’s Alexa and Google Assistant are examples of voice recognition and synthesis platforms that use NLP to interpret and respond to vocal commands. These digital assistants help people with a wide variety of tasks, letting them offload some of their cognitive tasks to another device and free up some of their brainpower for other, more important things. Instead of looking up the best route to the bank on a busy morning, we can just have our digital assistant do it.

Sentiment analysis is the use of NLP techniques to study people’s reactions and feelings to a phenomenon, as communicated by their use of language. Capturing the sentiment of a statement, like interpreting whether a review of a product is good or bad, can provide companies with substantial information regarding how their product is being received.

Automatically organizing text documents is another application of NLP. Companies like Google and Yahoo use NLP algorithms to classify email documents, putting them in the appropriate bins such as “social” or “promotions”. They also use these techniques to identify spam and prevent it from reaching your inbox.

Groups have also developed NLP techniques are being used to identify potential job hires, finding them based on relevant skills. Hiring managers are also using NLP techniques to help them sort through lists of applicants.

NLP techniques are also being used to enhance healthcare. NLP can be used to improve the detection of diseases. Health records can be analyzed and symptoms extracted by NLP algorithms, which can then be used to suggest possible diagnoses. One example of this is Amazon’s Comprehend Medical platform, which analyzes health records and extracts diseases and treatments. Healthcare applications of NLP also extend to mental health. There are apps such as WoeBot, which talks users through a variety of anxiety management techniques based in Cognitive Behavioral Therapy.

To Learn More

Recommended Natural Language Processing CoursesOffered ByDurationDifficulty


Introduction to Artificial Intelligence



IBM

9 Hours

Beginner


Natural Language Processing in TensorFlow


Deep Learning AI

9 Hours

Intermediate


An Introduction to Practical Deep Learning


Intel Software

12 Hours

Intermediate


Natural Language Processing


Higher School of Economics

34 Hours

Advanced
Spread the love

Deep Learning Specialization on Coursera
Continue Reading