What is Reinforcement Learning?
Put simply, reinforcement learning is a machine learning technique that involves training an artificial intelligence agent through the repetition of actions and associated rewards. A reinforcement learning agent experiments in an environment, taking actions and being rewarded when the correct actions are taken. Over time, the agent learns to take the actions that will maximize its reward. That’s a quick definition of reinforcement learning, but taking a closer look at the concepts behind reinforcement learning will help you gain a better, more intuitive understanding of it.
The term “reinforcement learning” is adapted from the concept of reinforcement in psychology. For that reason, let’s take a moment to understand the psychological concept of reinforcement. In the psychological sense, the term reinforcement refers to something that increases the likelihood that a particular response/action will occur. This concept of reinforcement is a central idea of the theory of operant conditioning, initially proposed by the psychologist B.F. Skinner. In this context, reinforcement is anything that causes the frequency of a given behavior to increase. If we think about possible reinforcement for humans, these can be things like praise, a raise at work, candy, and fun activities.
In the traditional, psychological sense, there are two types of reinforcement. There’s positive reinforcement and negative reinforcement. Positive reinforcement is the addition of something to increase a behavior, like giving your dog a treat when it is well behaved. Negative reinforcement involves removing a stimulus to elicit a behavior, like shutting off loud noises to coax out a skittish cat.
Positive & Negative Reinforcement
Both positive and negative reinforcement increase the frequency of a behavior; they differ in whether a stimulus is added or removed. In general, positive reinforcement is the most common type of reinforcement used in reinforcement learning, as it helps models maximize their performance on a given task. Not only that, but positive reinforcement leads the model to make more sustainable changes, which can become consistent patterns and persist for long periods of time.
In contrast, while negative reinforcement also makes a behavior more likely to occur, it is used for maintaining a minimum performance standard rather than reaching a model’s maximum performance. Negative reinforcement in reinforcement learning can help ensure that a model is kept away from undesirable actions, but it can’t really make a model explore desired actions.
Training a Reinforcement Agent
Imagine that we are training a reinforcement agent to play a platforming video game where the AI’s goal is to make it to the end of the level by moving right across the screen. The initial state of the game is drawn from the environment, meaning the first frame of the game is analyzed and given to the model. Based on this information, the model must decide on an action.
During the initial phases of training, these actions are random but as the model is reinforced, certain actions will become more common. After the action is taken the environment of the game is updated and a new state or frame is created. If the action taken by the agent produced a desirable result, let’s say in this case that the agent is still alive and hasn’t been hit by an enemy, some reward is given to the agent and it becomes more likely to do the same in the future.
This basic system is constantly looped, happening again and again, and each time the agent tries to learn a little more and maximize its reward.
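The loop described above can be sketched in a few lines of Python. Everything here is a hypothetical toy rather than a real game: `CorridorEnv` is an invented one-dimensional level, the reward of 1 for moving right is an assumption, and the agent simply keeps a preference weight per action so that rewarded actions are sampled more often.

```python
import random

class CorridorEnv:
    """Hypothetical toy level: start at position 0, finish at `length`."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # Action 1 moves right, action 0 moves left (never below 0).
        if action == 1:
            self.position += 1
        else:
            self.position = max(0, self.position - 1)
        reward = 1.0 if action == 1 else 0.0  # assumed reward: progress to the right
        done = self.position >= self.length
        return self.position, reward, done

def train(episodes=50, seed=0):
    rng = random.Random(seed)
    env = CorridorEnv()
    preference = [1.0, 1.0]  # equal weights, so early actions are effectively random
    for _ in range(episodes):
        env.reset()
        done = False
        steps = 0
        while not done and steps < 100:
            action = rng.choices([0, 1], weights=preference)[0]
            _, reward, done = env.step(action)
            preference[action] += reward  # rewarded actions become more likely
            steps += 1
    return preference

preferences = train()
```

After training, the weight for moving right is much larger than the weight for moving left, which is the "certain actions become more common" effect in miniature.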
Episodic vs Continuous Tasks
Reinforcement learning tasks can typically be placed in one of two different categories: episodic tasks and continuous tasks.
Episodic tasks will carry out the learning/training loop and improve their performance until some end criteria are met and the training is terminated. In a game, this might be reaching the end of the level or falling into a hazard like spikes. In contrast, continuous tasks have no termination criteria, essentially continuing to train forever until the engineer chooses to end the training.
Monte Carlo vs Temporal Difference
There are two primary ways of learning, or training, a reinforcement learning agent. In the Monte Carlo approach, rewards are delivered to the agent (its score is updated) only at the end of the training episode. To put that another way, only when the termination condition is hit does the model learn how well it performed. It can then use this information to update its estimates, and when the next training round is started it will respond in accordance with the new information.
The temporal-difference method differs from the Monte Carlo method in that the value estimation, or the score estimation, is updated during the course of the training episode. Once the model advances to the next time step the values are updated.
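The difference between the two approaches can be illustrated with a small sketch. The step size `alpha`, the discount factor `gamma`, and the two-state episode are assumptions added for illustration; the temporal-difference rule shown is the simple one-step variant often called TD(0).

```python
def monte_carlo_update(values, episode, alpha=0.1, gamma=1.0):
    """Update after the episode ends. `episode` is a list of (state, reward) pairs."""
    episode_return = 0.0
    for state, reward in reversed(episode):
        episode_return = reward + gamma * episode_return
        values[state] += alpha * (episode_return - values[state])
    return values

def td0_update(values, state, reward, next_state, done, alpha=0.1, gamma=1.0):
    """Update during the episode, right after a single time step."""
    target = reward if done else reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values

# Hypothetical episode: state A gives no reward, state B gives a reward of 1 and ends.
mc_values = monte_carlo_update({"A": 0.0, "B": 0.0}, [("A", 0.0), ("B", 1.0)])

td_values = {"A": 0.0, "B": 0.0}
td0_update(td_values, "A", 0.0, "B", done=False)   # step 1: A -> B
td0_update(td_values, "B", 1.0, None, done=True)   # step 2: B -> end
```

In this example the Monte Carlo update credits both states with the final reward, while the temporal-difference update only credits state B at first; state A's estimate catches up in later episodes as B's value estimate grows.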
Exploration vs Exploitation
Training a reinforcement learning agent is a balancing act, involving the balancing of two different metrics: exploration and exploitation.
Exploration is the act of collecting more information about the surrounding environment, while exploitation is using the information already known about the environment to earn reward points. If an agent only explores and never exploits the environment, the desired actions will never be carried out. On the other hand, if the agent only exploits and never explores, the agent will only learn to carry out one action and won’t discover other possible strategies of earning rewards. Therefore, balancing exploration and exploitation is critical when creating a reinforcement learning agent.
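One common way to strike this balance is an epsilon-greedy rule, which the text above does not name but which fits the description: with a small probability `epsilon` the agent explores by picking a random action, and otherwise it exploits the action with the highest estimated value. The value estimates below are made up for illustration.

```python
import random

def epsilon_greedy(action_values, epsilon, rng):
    """Pick an action index: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))  # explore: any action
    # exploit: the action with the highest current estimate
    return max(range(len(action_values)), key=lambda a: action_values[a])

rng = random.Random(0)
estimates = [0.1, 0.9, 0.3]  # hypothetical value estimates for three actions

greedy_choice = epsilon_greedy(estimates, 0.0, rng)                       # never explores
random_choices = [epsilon_greedy(estimates, 1.0, rng) for _ in range(200)]  # always explores
```

Setting `epsilon` to 0 gives a purely exploiting agent and setting it to 1 gives a purely exploring one; practical values usually sit in between and are often reduced as training progresses.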
Use Cases For Reinforcement Learning
Reinforcement learning can be used in a wide variety of roles, and it is best suited for applications where tasks require automation.
Automation of tasks to be carried out by industrial robots is one area where reinforcement learning proves useful. Reinforcement learning can also be used for problems like text mining, creating models that are able to summarize long bodies of text. Researchers are also experimenting with using reinforcement learning in the healthcare field, with reinforcement agents handling jobs like the optimization of treatment policies. Reinforcement learning could also be used to customize educational material for students.
Summary of Reinforcement Learning
Reinforcement learning is a powerful method of constructing AI agents that can lead to impressive and sometimes surprising results. Training an agent through reinforcement learning can be complex and difficult, as it takes many training iterations and a delicate balance of the explore/exploit dichotomy. However, if successful, an agent created with reinforcement learning can carry out complex tasks under a wide variety of different environments.
What is an Autoencoder?
If you’ve read about unsupervised learning techniques before, you may have come across the term “autoencoder”. Autoencoders are one of the primary ways that unsupervised learning models are developed. Yet what is an autoencoder exactly?
Briefly, autoencoders operate by taking in data, compressing and encoding the data, and then reconstructing the data from the encoding representation. The model is trained until the loss is minimized and the data is reproduced as closely as possible. Through this process, an autoencoder can learn the important features of the data. While that’s a quick definition of an autoencoder, it would be beneficial to take a closer look at autoencoders and gain a better understanding of how they function. This article will endeavor to demystify autoencoders, explaining the architecture of autoencoders and their applications.
Autoencoders are neural networks. Neural networks are composed of multiple layers, and the defining aspect of an autoencoder is that the input layer contains exactly as much information as the output layer. The reason that the input layer and output layer have the exact same number of units is that an autoencoder aims to replicate the input data. It outputs a copy of the data after analyzing it and reconstructing it in an unsupervised fashion.
The data that moves through an autoencoder isn’t just mapped straight from input to output, meaning that the network doesn’t just copy the input data. There are three components to an autoencoder: an encoding (input) portion that compresses the data, a component that handles the compressed data (or bottleneck), and a decoder (output) portion. When data is fed into an autoencoder, it is encoded and then compressed down to a smaller size. The network is then trained on the encoded/compressed data and it outputs a recreation of that data.
So why would you want to train a network to just reconstruct the data that is given to it? The reason is that the network learns the “essence”, or most important features of the input data. After you have trained the network, a model can be created that can synthesize similar data, with the addition or subtraction of certain target features. For instance, you could train an autoencoder on grainy images and then use the trained model to remove the grain/noise from the image.
Let’s take a look at the architecture of an autoencoder. We’ll discuss the main architecture of an autoencoder here. There are variations on this general architecture that we’ll discuss in the section below.
As previously mentioned an autoencoder can essentially be divided up into three different components: the encoder, a bottleneck, and the decoder.
The encoder portion of the autoencoder is typically a feedforward, densely connected network. The purpose of the encoding layers is to take the input data and compress it into a latent space representation, generating a new representation of the data that has reduced dimensionality.
The code layers, or the bottleneck, deal with the compressed representation of the data. The bottleneck code is carefully designed to determine the most relevant portions of the observed data, or to put that another way, the features of the data that are most important for data reconstruction. The goal here is to determine which aspects of the data need to be preserved and which can be discarded. The bottleneck code needs to balance two different considerations: representation size (how compact the representation is) and variable/feature relevance. Like any other layer in the network, the bottleneck applies an element-wise activation function to the weighted sum of its inputs plus a bias. The output of the bottleneck layer is also sometimes called a latent representation or latent variables.
The decoder layer is what is responsible for taking the compressed data and converting it back into a representation with the same dimensions as the original, unaltered data. The conversion is done with the latent space representation that was created by the encoder.
The most basic architecture of an autoencoder is a feed-forward architecture, with a structure much like that of a multilayer perceptron. Much like regular feed-forward neural networks, the autoencoder is trained through the use of backpropagation.
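To make the encoder, bottleneck, and decoder structure concrete, here is a minimal sketch of a forward pass through a tiny autoencoder. The layer sizes (8, 3, 8), the sigmoid activation, and the helper names `dense` and `init_layer` are assumptions chosen for illustration. The weights are random and untrained, so the output will not yet resemble the input; the point is only how the dimensions shrink and then expand.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for a layer mapping n_in units to n_out units."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
encoder = init_layer(8, 3, rng)   # compress 8 input values into a 3-value code
decoder = init_layer(3, 8, rng)   # expand the code back to 8 values

x = [rng.random() for _ in range(8)]
code = dense(x, *encoder)                # latent (bottleneck) representation
reconstruction = dense(code, *decoder)   # same dimensions as the input
```

Training would then adjust the weights with backpropagation so that `reconstruction` moves closer to `x`.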
Attributes of An Autoencoder
There are various types of autoencoders, but they all have certain properties that unite them.
Autoencoders learn automatically. They don’t require labels, and if given enough data it’s easy to get an autoencoder to reach high performance on a specific kind of input data.
Autoencoders are data-specific. This means that they can only compress data that is highly similar to data that the autoencoder has already been trained on. Autoencoders are also lossy, meaning that the outputs of the model will be degraded in comparison to the input data.
When designing an autoencoder, machine learning engineers need to pay attention to four different model hyperparameters: code size, layer number, nodes per layer, and loss function.
The code size decides how many nodes begin the middle portion of the network, and fewer nodes compress the data more. In a deep autoencoder, while the number of layers can be any number that the engineer deems appropriate, the number of nodes in a layer should decrease as the encoder goes on. Meanwhile, the opposite holds true in the decoder, meaning the number of nodes per layer should increase as the decoder layers approach the final layer. Finally, the loss function of an autoencoder is typically either binary cross-entropy or mean squared error. Binary cross-entropy is appropriate for instances where the input values of the data are in a 0 – 1 range.
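The two loss functions mentioned above can be written out directly. This is a plain-Python sketch for a single example; the small `eps` clipping value is an assumption added to avoid taking the logarithm of zero.

```python
import math

def mean_squared_error(x, x_hat):
    """Average squared difference between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def binary_cross_entropy(x, x_hat, eps=1e-12):
    """Average binary cross-entropy; expects values in the 0 to 1 range."""
    total = 0.0
    for a, b in zip(x, x_hat):
        b = min(max(b, eps), 1.0 - eps)  # keep the prediction away from exactly 0 or 1
        total -= a * math.log(b) + (1.0 - a) * math.log(1.0 - b)
    return total / len(x)

target = [0.0, 1.0]
uncertain = [0.5, 0.5]

mse_uncertain = mean_squared_error(target, uncertain)    # 0.25
bce_uncertain = binary_cross_entropy(target, uncertain)  # ln(2), about 0.693
```

Both losses drop toward zero as the reconstruction approaches the input, which is what training tries to achieve.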
As mentioned above, variations on the classic autoencoder architecture exist. Let’s examine the different autoencoder architectures.
While autoencoders typically have a bottleneck that compresses the data through a reduction of nodes, sparse autoencoders are an alternative to that typical operational format. In a sparse network, the hidden layers maintain the same size as the encoder and decoder layers. Instead, the activations within a given layer are penalized, setting it up so the loss function better captures the statistical features of input data. To put that another way, while the hidden layers of a sparse autoencoder have more units than a traditional autoencoder, only a certain percentage of them are active at any given time. The most impactful activations are preserved and others are ignored, and this constraint helps the network determine just the most salient features of the input data.
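One way to penalize activations is to add an L1 term to the reconstruction loss. The article does not say which penalty is used, so treat the L1 choice and the `sparsity_weight` parameter here as assumptions; a KL-divergence penalty is another common option.

```python
def sparse_loss(reconstruction_error, activations, sparsity_weight=0.01):
    """Reconstruction error plus an L1 penalty on the hidden activations."""
    l1_penalty = sum(abs(a) for a in activations)
    return reconstruction_error + sparsity_weight * l1_penalty

# Same reconstruction error, but the second code has more active units.
mostly_inactive = sparse_loss(0.5, [0.0, 0.0, 2.0], sparsity_weight=0.1)
mostly_active = sparse_loss(0.5, [1.0, 1.0, 2.0], sparsity_weight=0.1)
```

Because the code with more active units is charged a higher loss, training is pushed toward representations where only a few units fire for any given input.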
Contractive autoencoders are designed to be resilient against small variations in the data, maintaining a consistent representation of the data. This is accomplished by applying a penalty to the loss function. This regularization technique is based on the Frobenius norm of the Jacobian matrix of the encoder activations with respect to the input. The effect of this regularization technique is that the model is forced to construct an encoding where similar inputs will have similar encodings.
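For a single sigmoid encoder layer, this penalty has a simple closed form, sketched below. The single-layer sigmoid encoder is an assumption made to keep the example short: with h = sigmoid(Wx + b), the derivative of h_j with respect to x_i is h_j(1 - h_j)W_ji, so the squared Frobenius norm is a sum over hidden units.

```python
import math

def contractive_penalty(weights, biases, x):
    """Squared Frobenius norm of dh/dx for h = sigmoid(Wx + b)."""
    penalty = 0.0
    for row, bias in zip(weights, biases):
        z = sum(w * v for w, v in zip(row, x)) + bias
        h = 1.0 / (1.0 + math.exp(-z))
        # (h * (1 - h))^2 times the squared norm of this unit's weight row
        penalty += (h * (1.0 - h)) ** 2 * sum(w * w for w in row)
    return penalty

weights = [[1.0, 0.0], [0.0, 2.0]]  # hypothetical 2x2 encoder weights
biases = [0.0, 0.0]
penalty = contractive_penalty(weights, biases, [0.0, 0.0])  # 0.3125
```

This value is multiplied by a weighting factor and added to the reconstruction loss, so encodings that change sharply with small input changes are discouraged.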
Convolutional autoencoders encode input data by splitting the data up into subsections and then converting these subsections into simple signals that are summed together to create a new representation of the data. Similar to convolutional neural networks, a convolutional autoencoder specializes in the learning of image data, and it uses a filter that is moved across the entire image section by section. The encodings generated by the encoding layer can be used to reconstruct the image, reflect the image, or modify the image’s geometry. Once the filters have been learned by the network, they can be used on any sufficiently similar input to extract the features of the image.
Denoising autoencoders introduce noise into the input, so that the network receives a corrupted version of the original input data. This corrupted version of the data is used to train the model, but the loss function compares the output values with the original input and not the corrupted input. The goal is that the network will be able to reproduce the original, non-corrupted version of the image. By comparing the corrupted data with the original data, the network learns which features of the data are most important and which features are unimportant/corruptions. In other words, in order for a model to denoise the corrupted images, it has to have extracted the important features of the image data.
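The key detail, that the model sees the corrupted data but is scored against the clean data, can be sketched as follows. Gaussian noise, the clipping to the 0 to 1 range, and the stand-in `model` callables are assumptions for illustration; a real denoising autoencoder would be a trained network.

```python
import random

def add_gaussian_noise(x, stddev, rng):
    """Corrupt each value with Gaussian noise, keeping it in the 0 to 1 range."""
    return [min(1.0, max(0.0, v + rng.gauss(0.0, stddev))) for v in x]

def denoising_loss(model, clean, stddev, rng):
    """Feed the model a noisy input, but compare its output with the clean input."""
    noisy = add_gaussian_noise(clean, stddev, rng)
    reconstruction = model(noisy)
    return sum((c - r) ** 2 for c, r in zip(clean, reconstruction)) / len(clean)

rng = random.Random(0)
clean = [0.2, 0.8, 0.5, 0.1]

# A model that just copies its input is penalized for passing the noise through.
identity_loss = denoising_loss(lambda noisy: noisy, clean, 0.2, rng)
# A (hypothetical) perfect denoiser that recovers the clean data has zero loss.
perfect_loss = denoising_loss(lambda noisy: clean, clean, 0.2, rng)
```

Because simply copying the input is no longer a good strategy, the network has to learn which structure in the data is signal and which is noise.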
Variational autoencoders operate by making assumptions about how the latent variables of the data are distributed. A variational autoencoder produces a probability distribution for the different features of the training images/the latent attributes. When training, the encoder creates latent distributions for the different features of the input images.
Because the model learns the features of images as Gaussian distributions instead of discrete values, it is capable of being used to generate new images. The Gaussian distribution is sampled to create a vector, which is fed into the decoding network, which renders an image based on this vector of samples. Essentially, the model learns common features of the training images and assigns them some probability that they will occur. The probability distribution can then be used to reverse engineer an image, generating new images that resemble the original training images.
When training the network, the encoded data is analyzed and the recognition model outputs two vectors, drawing out the mean and standard deviation of the images. A distribution is created based on these values. This is done for the different latent states. The decoder then takes random samples from the corresponding distribution and uses them to reconstruct the initial inputs to the network.
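The sampling step can be sketched as below. The mean and standard deviation vectors are made-up values standing in for the encoder's output, and writing each sample as mean plus standard deviation times unit Gaussian noise is the usual "reparameterization" form, which is an assumption here since the article does not spell it out.

```python
import random

def sample_latent(means, stddevs, rng):
    """Draw one latent vector: each entry is mean + stddev * unit Gaussian noise."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(means, stddevs)]

rng = random.Random(0)
means = [2.0, -1.0]     # hypothetical encoder output: one mean per latent variable
stddevs = [0.5, 0.1]    # hypothetical encoder output: one stddev per latent variable

z = sample_latent(means, stddevs, rng)   # this vector would be passed to the decoder

# Averaged over many draws, the samples center on the encoder's means.
draws = [sample_latent(means, stddevs, rng) for _ in range(5000)]
average_first = sum(d[0] for d in draws) / len(draws)
```

Sampling different vectors from the same distributions is what lets the decoder produce new outputs that resemble, but do not copy, the training data.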
Autoencoders can be used for a wide variety of applications, but they are typically used for tasks like dimensionality reduction, data denoising, feature extraction, image generation, sequence to sequence prediction, and recommendation systems.
Data denoising is the use of autoencoders to strip grain/noise from images. Similarly, autoencoders can be used to repair other types of image damage, like blurry images or images missing sections. Dimensionality reduction can help high capacity networks learn useful features of images, meaning the autoencoders can be used to augment the training of other types of neural networks. This is also true of using autoencoders for feature extraction, as autoencoders can be used to identify features of other training datasets to train other models.
In terms of image generation, autoencoders can be used to generate fake human images or animated characters, which has applications in designing face recognition systems or automating certain aspects of animation.
Sequence to sequence prediction models can be used to determine the temporal structure of data, meaning that an autoencoder can be used to generate the next event in a sequence. For this reason, an autoencoder could be used to generate videos. Finally, deep autoencoders can be used to create recommendation systems by picking up on patterns relating to user interest, with the encoder analyzing user engagement data and the decoder creating recommendations that fit the established patterns.
What Is Synthetic Data?
Synthetic data is a quickly expanding trend and emerging tool in the field of data science. What is synthetic data exactly? The short answer is that synthetic data is data that isn’t collected from real-world phenomena or events; rather, it’s generated via a computer program. Yet why is synthetic data becoming so important for data science? How is synthetic data created? Let’s explore the answers to these questions.
What is a Synthetic Dataset?
As the term “synthetic” suggests, synthetic datasets are generated through computer programs, instead of being composed through the documentation of real-world events. The primary purpose of a synthetic dataset is to be versatile and robust enough to be useful for the training of machine learning models.
In order to be useful for a machine learning classifier, the synthetic data should have certain properties. While the data can be categorical, binary, or numerical, the length of the dataset should be arbitrary and the data should be randomly generated. The random processes used to generate the data should be controllable and based on various statistical distributions. Random noise may also be placed in the dataset.
If the synthetic data is being used for a classification algorithm, the amount of class separation should be customizable, in order that the classification problem can be made easier or harder according to the problem’s requirements. Meanwhile, for a regression task, non-linear generative processes can be employed to generate the data.
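As a rough illustration of these properties, the sketch below generates a two-class dataset of arbitrary length from Gaussian distributions, with a `separation` parameter that controls how far apart the classes sit and a `noise` parameter that controls their spread. The function name and parameters are hypothetical; libraries such as scikit-learn offer more complete generators.

```python
import random

def make_two_class_data(n_per_class, separation, noise=1.0, seed=0):
    """Generate labeled 2-D points; larger `separation` makes the classes easier to split."""
    rng = random.Random(seed)  # a fixed seed keeps the random process controllable
    samples = []
    for label, center in ((0, -separation / 2.0), (1, separation / 2.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, noise), rng.gauss(center, noise))
            samples.append((point, label))  # the label comes for free at generation time
    rng.shuffle(samples)
    return samples

easy = make_two_class_data(100, separation=10.0)   # classes far apart
hard = make_two_class_data(100, separation=0.5)    # classes heavily overlapping
```

Note that every sample is labeled as it is generated, so no separate annotation step is needed.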
Why Use Synthetic Data?
As machine learning frameworks like TensorFlow and PyTorch become easier to use and pre-designed models for computer vision and natural language processing become more ubiquitous and powerful, the primary problem that data scientists must face is the collection and handling of data. Companies often have difficulty acquiring large amounts of data to train an accurate model within a given time frame. Hand-labeling data is a costly, slow way to acquire data. However, generating and using synthetic data can help data scientists and companies overcome these hurdles and develop reliable machine learning models in a quicker fashion.
There are a number of advantages to using synthetic data. The most obvious way that the use of synthetic data benefits data science is that it reduces the need to capture data from real-world events, and for this reason it becomes possible to generate data and construct a dataset much more quickly than a dataset dependent on real-world events. This means that large volumes of data can be produced in a short timeframe. This is especially true for events that rarely occur, as if an event rarely happens in the wild, more data can be mocked up from some genuine data samples. Beyond that, the data can be automatically labeled as it is generated, drastically reducing the amount of time needed to label data.
Synthetic data can also be useful to gain training data for edge cases, which are instances that may occur infrequently but are critical for the success of your AI. Edge cases are events that are very similar to the primary target of an AI but differ in important ways. For instance, objects that are only partially in view could be considered edge cases when designing an image classifier.
Finally, synthetic datasets can minimize privacy concerns. Attempts to anonymize data can be ineffective, as even if sensitive/identifying variables are removed from the dataset, other variables can act as identifiers when they are combined. This isn’t an issue with synthetic data, as it was never based on a real person, or real event, in the first place.
Use Cases for Synthetic Data
Synthetic data has a wide variety of uses, as it can be applied to just about any machine learning task. Common use cases for synthetic data include self-driving vehicles, security, robotics, fraud protection, and healthcare.
One of the initial use cases for synthetic data was self-driving cars, as synthetic data is used to create training data for cars in conditions where getting real, on-the-road training data is difficult or dangerous. Synthetic data is also useful for the creation of data used to train image recognition systems, like surveillance systems, much more efficiently than manually collecting and labeling a bunch of training data. Robotics systems can be slow to train and develop with traditional data collection and training methods. Synthetic data allows robotics companies to test and engineer robotics systems through simulations. Fraud protection systems can benefit from synthetic data, and new fraud detection methods can be trained and tested with data that is constantly new when synthetic data is used. In the healthcare field, synthetic data can be used to design health classifiers that are accurate, yet preserve people’s privacy, as the data won’t be based on real people.
Synthetic Data Challenges
While the use of synthetic data brings many advantages with it, it also brings many challenges.
When synthetic data is created, it often lacks outliers. Outliers occur in data naturally, and while often dropped from training datasets, their existence may be necessary to train truly reliable machine learning models. Beyond this, the quality of synthetic data can be highly variable. Synthetic data is often generated from input, or seed, data, and therefore the quality of the data can be dependent on the quality of the input data. If the data used to generate the synthetic data is biased, the generated data can perpetuate that bias. Synthetic data also requires some form of output/quality control. It needs to be checked against human-annotated data, or otherwise authentic data in some form.
How Is Synthetic Data Created?
Synthetic data is created programmatically with machine learning techniques. Classical machine learning techniques like decision trees can be used, as can deep learning techniques. The requirements for the synthetic data will influence what type of algorithm is used to generate the data. Decision trees and similar machine learning models let companies create non-classical, multi-modal data distributions, trained on examples of real-world data. Generating data with these algorithms will provide data that is highly correlated with the original training data. For instances where the typical distribution of data is known, a company can generate synthetic data through use of a Monte Carlo method.
Deep learning-based methods of generating synthetic data typically make use of either a variational autoencoder (VAE) or a generative adversarial network (GAN). VAEs are unsupervised machine learning models that make use of encoders and decoders. The encoder portion of a VAE is responsible for compressing the data down into a simpler, compact version of the original dataset, which the decoder then analyzes and uses to generate a representation of the base data. A VAE is trained with the goal of having an optimal relationship between the input data and output, one where both input data and output data are extremely similar.
When it comes to GAN models, they are called “adversarial” networks due to the fact that GANs are actually two networks that compete with each other. The generator is responsible for generating synthetic data, while the second network (the discriminator) operates by comparing the generated data with a real dataset and tries to determine which data is fake. When the discriminator catches fake data, the generator is notified of this and it makes changes to try to get a new batch of data past the discriminator. In turn, the discriminator becomes better and better at detecting fakes. The two networks are trained against each other, with fakes becoming more lifelike all the time.
How Does Image Classification Work?
How can your phone determine what an object is just by taking a photo of it? How do social media websites automatically tag people in photos? This is accomplished through AI-powered image recognition and classification.
The recognition and classification of images is what enables many of the most impressive accomplishments of artificial intelligence. Yet how do computers learn to detect and classify images? In this article, we’ll cover the general methods that computers use to interpret and detect images and then take a look at some of the most popular methods of classifying those images.
Pixel-Level vs. Object-Based Classification
Pixels are the base units of an image, and the analysis of pixels is the primary way that image classification is done. However, classification algorithms can either use just the spectral information within individual pixels to classify an image or examine spatial information (nearby pixels) along with the spectral information. Pixel-based classification methods utilize only spectral information (the intensity of a pixel), while object-based classification methods take into account both pixel spectral information and spatial information.
There are different classification techniques used for pixel-based classification. These include minimum-distance-to-mean, maximum-likelihood, and minimum-Mahalanobis-distance. These methods require that the means and variances of the classes are known, and they all operate by examining the “distance” between class means and the target pixels.
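The simplest of these, minimum-distance-to-mean, can be sketched in a few lines. The class names and mean values below are invented for illustration, and Euclidean distance is assumed as the distance measure.

```python
import math

def minimum_distance_to_mean(pixel, class_means):
    """Assign the pixel to the class whose mean is closest in spectral space."""
    return min(class_means, key=lambda name: math.dist(pixel, class_means[name]))

# Hypothetical per-class mean intensities for three spectral bands.
class_means = {
    "water": (20, 40, 30),
    "vegetation": (60, 120, 50),
    "soil": (140, 110, 90),
}

label = minimum_distance_to_mean((55, 110, 60), class_means)  # "vegetation"
```

The maximum-likelihood and Mahalanobis-distance methods follow the same pattern but also take the class variances into account when measuring distance.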
Pixel-based classification methods are limited by the fact that they can’t use information from other nearby pixels. In contrast, object-based classification methods can include other pixels and therefore they also use spatial information to classify items. Note that “object” just refers to contiguous regions of pixels and not whether or not there is a target object within that region of pixels.
Preprocessing Image Data For Object Detection
The most recent and reliable image classification systems primarily use object-level classification schemes, and for these approaches image data must be prepared in specific ways. The objects/regions need to be selected and preprocessed.
Before an image, and the objects/regions within that image, can be classified the data that comprises that image has to be interpreted by the computer. Images need to be preprocessed and readied for input into the classification algorithm, and this is done through object detection. This is a critical part of readying the data and preparing the images to train the machine learning classifier.
Object detection is done with a variety of methods and techniques. To begin with, whether or not there are multiple objects of interest or a single object of interest impacts how the image preprocessing is handled. If there is just one object of interest, the image undergoes image localization. The pixels that comprise the image have numerical values that are interpreted by the computer and used to display the proper colors and hues. An object known as a bounding box is drawn around the object of interest, which helps the computer know what part of the image is important and what pixel values define the object. If there are multiple objects of interest in the image, a technique called object detection is used to apply these bounding boxes to all the objects within the image.
Another method of preprocessing is image segmentation. Image segmentation functions by dividing the whole image into segments based on similar features. Different regions of the image will have similar pixel values in comparison to other regions of the image, so these pixels are grouped together into image masks that correspond to the shape and boundaries of the relevant objects within the image. Image segmentation helps the computer isolate the features of the image that will help it classify an object, much like bounding boxes do, but they provide much more accurate, pixel-level labels.
After the object detection or image segmentation has been completed, labels are applied to the regions in question. These labels are fed, along with the values of the pixels comprising the object, into the machine learning algorithms that will learn patterns associated with the different labels.
Machine Learning Algorithms
Once the data has been prepared and labeled, the data is fed into a machine learning algorithm, which trains on the data. We’ll cover some of the most common kinds of machine learning image classification algorithms below.
K-Nearest Neighbors is a classification algorithm that examines the closest training examples and looks at their labels to ascertain the most probable label for a given test example. When it comes to image classification using KNN, the feature vectors and labels of the training images are stored and just the feature vector is passed into the algorithm during testing. The training and testing feature vectors are then compared against each other for similarity.
KNN-based classification algorithms are extremely simple and they deal with multiple classes quite easily. However, KNN calculates similarity based on all features equally. This means that it can be prone to misclassification when provided with images where only a subset of the features is important for the classification of the image.
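A minimal sketch of KNN classification is shown below. The two-dimensional feature vectors and the "cat"/"dog" labels are made up, and Euclidean distance with a simple majority vote is an assumption; real image features would have many more dimensions.

```python
import math
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training examples."""
    distances = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Stored training feature vectors and their labels.
train_vectors = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

prediction = knn_predict(train_vectors, train_labels, (0.5, 0.5))  # "cat"
```

Note that every feature contributes equally to `math.dist`, which is exactly the weakness described above when only a few features matter.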
Support Vector Machines are a classification method that places points in space and then draws dividing lines between the points, placing objects in different classes depending on which side of the dividing plane the points fall on. Support Vector Machines are capable of doing nonlinear classification through the use of a technique known as the kernel trick. While SVM classifiers are often very accurate, a substantial drawback to SVM classifiers is that they tend to be limited by both size and speed, with speed suffering as size increases.
Multi-Layer Perceptrons (Neural Nets)
Multi-layer perceptrons, also called neural network models, are machine learning algorithms inspired by the human brain. Multilayer perceptrons are composed of various layers that are joined together with each other, much like neurons in the human brain are linked together. Neural networks make assumptions about how the input features are related to the data’s classes and these assumptions are adjusted over the course of training. Simple neural network models like the multi-layer perceptron are capable of learning non-linear relationships, and as a result, they can be much more accurate than other models. However, MLP models suffer from some notable issues like the presence of non-convex loss functions.
Deep Learning Algorithms (CNNs)
The most commonly used image classification algorithm in recent times is the Convolutional Neural Network (CNN). CNNs are customized versions of neural networks that combine the multilayer neural networks with specialized layers that are capable of extracting the features most important and relevant to the classification of an object. CNNs can automatically discover, generate, and learn features of images. This greatly reduces the need to manually label and segment images to prepare them for machine learning algorithms. Like MLPs, CNNs are trained on non-convex loss functions, but because each filter’s weights are shared across the whole image, they need far fewer parameters than a fully connected network with the same input size, which makes them more practical to train on image data.
Convolutional Neural Networks get their name from the fact that they create “convolutions”. CNNs operate by taking a filter and sliding it over an image. You can think of this as viewing sections of a landscape through a moveable window, concentrating on just the features that are viewable through the window at any one time. The filter contains numerical values which are multiplied with the values of the pixels themselves. The result is a new frame, or matrix, full of numbers that represent the original image. This process is repeated for a chosen number of filters, and then the frames are joined together into a new image that is slightly smaller and less complex than the original image. A technique called pooling is used to select just the most important values within the image, and the goal is for the convolutional layers to eventually extract just the most salient parts of the image that will help the neural network recognize the objects in the image.
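The sliding filter and the pooling step can be sketched in plain Python. At each position the filter values are multiplied with the pixels underneath and the products are summed into one output number. The 5x5 image and the 2x2 vertical-edge filter are invented for illustration; in a real CNN the filter values are learned during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool2d(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

image = [[0, 0, 1, 1, 1] for _ in range(5)]   # dark on the left, bright on the right
edge_filter = [[-1, 1], [-1, 1]]              # responds to dark-to-bright vertical edges

feature_map = convolve2d(image, edge_filter)  # 4x4, each row is [0, 2, 0, 0]
pooled = max_pool2d(feature_map)              # [[2, 0], [2, 0]]
```

The feature map is nonzero only where the edge sits, and pooling shrinks the map while keeping that strongest response.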
Convolutional Neural Networks are comprised of two different parts. The convolutional layers are what extract the features of the image and convert them into a format that the neural network layers can interpret and learn from. The early convolutional layers are responsible for extracting the most basic elements of the image, like simple lines and boundaries. The middle convolutional layers begin to capture more complex shapes, like simple curves and corners. The later, deeper convolutional layers extract the high-level features of the image, which are what is passed into the neural network portion of the CNN, and are what the classifier learns.