
AI 101

What is Machine Learning?




Machine learning is one of the fastest-growing technological fields, but despite how often the words “machine learning” are tossed around, it can be difficult to understand precisely what machine learning is.

Machine learning doesn’t refer to just one thing; it’s an umbrella term that covers many different concepts and techniques. Understanding machine learning means being familiar with different forms of model analysis, variables, and algorithms. Let’s take a close look at machine learning to better understand what it encompasses.

What Is Machine Learning?

While the term machine learning can be applied to many different things, in general, the term refers to enabling a computer to carry out tasks without receiving explicit line-by-line instructions to do so. A machine learning specialist doesn’t have to write out all the steps necessary to solve the problem because the computer is capable of “learning” by analyzing patterns within the data and generalizing these patterns to new data.

Machine learning systems have three basic parts:

  • Inputs
  • Algorithms
  • Outputs

The inputs are the data that is fed into the machine learning system, and the input data can be divided into labels and features. Features are the relevant variables, the variables that will be analyzed to learn patterns and draw conclusions. Meanwhile, the labels are classes/descriptions given to the individual instances of the data.

Features and labels can be used in two different types of machine learning problems: supervised learning and unsupervised learning.

Unsupervised vs. Supervised Learning

In supervised learning, the input data is accompanied by a ground truth. Supervised learning problems have the correct output values as part of the dataset, so the expected classes are known in advance. This makes it possible for the data scientist to check the performance of the algorithm by testing the data on a test dataset and seeing what percentage of items were correctly classified.
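The performance check described above boils down to comparing predicted labels against the ground truth on a held-out test set. A minimal sketch in Python, with made-up labels for illustration:

```python
def accuracy(predicted, actual):
    """Fraction of test items whose predicted class matches the ground truth."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Hypothetical labels for a five-item test set.
ground_truth = ["cat", "dog", "cat", "dog", "cat"]
predictions = ["cat", "dog", "dog", "dog", "cat"]

print(accuracy(predictions, ground_truth))  # 4 of 5 correct -> 0.8
```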

In contrast, unsupervised learning problems do not have ground truth labels attached to them. A machine learning algorithm trained to carry out unsupervised learning tasks must be able to infer the relevant patterns in the data for itself.

Supervised learning algorithms are typically used for classification problems, where one has a large dataset filled with instances that must be sorted into one of many different classes. Another type of supervised learning is a regression task, where the value output by the algorithm is continuous in nature instead of categorical.

Meanwhile, unsupervised learning algorithms are used for tasks like density estimation, clustering, and representation learning. All three tasks require the machine learning model to infer the structure of the data; no predefined classes are given to the model.

Let’s take a brief look at some of the most common algorithms used in both unsupervised learning and supervised learning.

Types of Supervised Learning

Common supervised learning algorithms include:

Support Vector Machines are algorithms that divide a dataset into different classes. Data points are grouped into clusters by drawing lines that separate the classes from one another: points found on one side of the line belong to one class, while points on the other side belong to a different class. Support Vector Machines aim to maximize the distance between the line and the points found on either side of it, and the greater that distance, the more confident the classifier is that a point belongs to one class and not another.

Logistic Regression is an algorithm used in binary classification tasks, where data points need to be classified as belonging to one of two classes. Logistic Regression outputs a probability between 0 and 1 for each data point, and that probability is converted into a label of either 1 or 0 by applying a threshold: if the predicted value is below 0.5, the point is classified as 0, while if it is 0.5 or above it is classified as 1.
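The thresholding step can be sketched in a few lines of Python; in practice the raw score fed to the sigmoid comes from a trained model, so the values below are purely illustrative:

```python
import math

def sigmoid(z):
    """Squash a raw score into the 0-1 range."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(z, threshold=0.5):
    """Label a point 1 if its predicted probability reaches the threshold, else 0."""
    return 1 if sigmoid(z) >= threshold else 0

print(classify(2.0))   # sigmoid(2.0) is about 0.88 -> class 1
print(classify(-1.0))  # sigmoid(-1.0) is about 0.27 -> class 0
```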

Decision Tree algorithms operate by dividing datasets into smaller and smaller fragments. The exact criteria used to divide the data are chosen by the machine learning engineer, but the goal is to ultimately divide the data into single data points, which will then be classified using a key.

A Random Forest algorithm is essentially many single Decision Tree classifiers linked together into a more powerful classifier.

The Naive Bayes Classifier is based on Bayes’ Theorem: it calculates the probability that a data point belongs to a given class from the prior probability of the class and the likelihood of the observed features, and it places data points into classes according to their calculated probabilities. When implementing a Naive Bayes classifier, the predictors are assumed to be independent of one another given the class.
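A minimal sketch of this idea in Python, using made-up weather observations and omitting refinements like Laplace smoothing:

```python
from collections import Counter, defaultdict

def train_nb(samples):
    """samples: list of (feature_tuple, label). Count class priors and
    per-feature value frequencies within each class."""
    priors = Counter(label for _, label in samples)
    likelihoods = defaultdict(Counter)
    for features, label in samples:
        for i, value in enumerate(features):
            likelihoods[(label, i)][value] += 1
    return priors, likelihoods

def predict_nb(priors, likelihoods, features):
    """Score each class as prior * product of per-feature likelihoods,
    treating features as independent given the class (the 'naive' step)."""
    total = sum(priors.values())
    best, best_score = None, -1.0
    for label, count in priors.items():
        score = count / total
        for i, value in enumerate(features):
            score *= likelihoods[(label, i)][value] / count
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical (weather, temperature) -> "play outside?" observations.
data = [(("sunny", "hot"), "no"), (("sunny", "mild"), "no"),
        (("rainy", "mild"), "yes"), (("rainy", "cool"), "yes")]
priors, likes = train_nb(data)
print(predict_nb(priors, likes, ("rainy", "mild")))  # -> "yes"
```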

An Artificial Neural Network, or multi-layer perceptron, is a machine learning algorithm inspired by the structure and function of the human brain. Artificial neural networks get their name from the fact that they are made out of many nodes/neurons linked together. Every neuron manipulates the data with a mathematical function. In artificial neural networks, there are input layers, hidden layers, and output layers.

The hidden layer of the neural network is where the data is actually interpreted and analyzed for patterns. In other words, it is where the algorithm learns. More neurons joined together make more complex networks capable of learning more complex patterns.
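A toy forward pass through such a network can be sketched as follows; the weights are hand-picked for illustration rather than learned:

```python
import math

def layer(inputs, weights, biases):
    """One dense layer: each neuron computes a weighted sum of its inputs
    plus a bias, passed through a sigmoid activation function."""
    return [1 / (1 + math.exp(-(sum(w * x for w, x in zip(ws, inputs)) + b)))
            for ws, b in zip(weights, biases)]

# A tiny 2-input -> 2-hidden -> 1-output network with hand-picked weights.
hidden = layer([0.5, -0.2], weights=[[1.0, -1.0], [0.5, 0.5]], biases=[0.0, 0.1])
output = layer(hidden, weights=[[1.0, -1.0]], biases=[0.0])
print(output)  # a single value between 0 and 1
```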

Types of Unsupervised Learning

Unsupervised Learning algorithms include:

  • K-means clustering
  • Autoencoders
  • Principal Component Analysis

K-means clustering is an unsupervised clustering technique, and it works by separating points of data into clusters or groups based on their features. K-means clustering analyzes the features found in the data points and distinguishes patterns that make the data points in a given cluster more similar to each other than they are to the data points in other clusters. This is accomplished by placing possible centers for the clusters, or centroids, in a graph of the data and reassigning the position of each centroid until a position is found that minimizes the distance between the centroid and the points that belong to that centroid’s cluster. The researcher can specify the desired number of clusters.
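The centroid-reassignment loop described above can be sketched in plain Python; a fixed number of iterations and hand-picked starting centroids keep the example small:

```python
def kmeans(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move each centroid to
    the mean of its assigned points; repeat for a fixed number of iterations."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for x, y in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: (x - centroids[i][0]) ** 2
                                        + (y - centroids[i][1]) ** 2)
            clusters[nearest].append((x, y))
        for i, members in enumerate(clusters):
            if members:  # leave a centroid in place if no points were assigned
                centroids[i] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    return centroids, clusters

# Two obvious groups of points; the starting centroids are deliberately rough.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8, 9.5)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (5.0, 5.0)])
print(centroids)  # each centroid has moved to the middle of one group
```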

Principal Component Analysis is a technique that reduces large numbers of features/variables down into a smaller feature space/fewer features. The “principal components” of the data points are selected for preservation, while the other features are squeezed down into a smaller representation. The relationships between the original data points are preserved, but since the representation is simpler, the data is easier to quantify and describe.
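One common way to compute principal components, sketched here with NumPy, is to take the eigenvectors of the data’s covariance matrix and keep the directions with the largest eigenvalues; the data values are made up:

```python
import numpy as np

def pca(data, n_components):
    """Project data onto its directions of greatest variance (the principal components)."""
    centered = data - data.mean(axis=0)              # PCA works on mean-centered data
    cov = np.cov(centered, rowvar=False)             # feature covariance matrix
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    top = eigenvectors[:, -n_components:]            # keep highest-variance directions
    return centered @ top

# Four 3-D points that mostly vary along a single direction.
data = np.array([[1.0, 1.1, 0.9],
                 [2.0, 2.1, 1.9],
                 [3.0, 2.9, 3.1],
                 [4.0, 4.2, 3.8]])
reduced = pca(data, n_components=1)
print(reduced.shape)  # (4, 1): each 3-D point is now a single coordinate
```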

Autoencoders are versions of neural networks that can be applied to unsupervised learning tasks. Autoencoders are capable of taking unlabeled, free-form data and transforming it into data that a neural network is capable of using, essentially creating their own labeled training data. The goal of an autoencoder is to compress the input data and then rebuild it as accurately as possible, so the network is incentivized to determine which features are the most important and extract them.


AI 101

What is an Autoencoder?




If you’ve read about unsupervised learning techniques before, you may have come across the term “autoencoder”. Autoencoders are one of the primary ways that unsupervised learning models are developed. Yet what is an autoencoder exactly?

Briefly, autoencoders operate by taking in data, compressing and encoding the data, and then reconstructing the data from the encoded representation. The model is trained until the loss is minimized and the data is reproduced as closely as possible. Through this process, an autoencoder can learn the important features of the data. While that’s a quick definition of an autoencoder, it would be beneficial to take a closer look at autoencoders and gain a better understanding of how they function. This article will endeavor to demystify autoencoders, explaining the architecture of autoencoders and their applications.

What is an Autoencoder?

Autoencoders are neural networks. Neural networks are composed of multiple layers, and the defining aspect of an autoencoder is that the input layer contains exactly as much information as the output layer. The reason the input layer and output layer have the exact same number of units is that an autoencoder aims to replicate the input data. It outputs a copy of the data after analyzing it and reconstructing it in an unsupervised fashion.

The data that moves through an autoencoder isn’t just mapped straight from input to output, meaning that the network doesn’t just copy the input data. There are three components to an autoencoder: an encoding (input) portion that compresses the data, a component that handles the compressed data (or bottleneck), and a decoder (output) portion. When data is fed into an autoencoder, it is encoded and then compressed down to a smaller size. The network is then trained on the encoded/compressed data and it outputs a recreation of that data.

So why would you want to train a network to just reconstruct the data that is given to it? The reason is that the network learns the “essence”, or most important features of the input data. After you have trained the network, a model can be created that can synthesize similar data, with the addition or subtraction of certain target features. For instance, you could train an autoencoder on grainy images and then use the trained model to remove the grain/noise from the image.

Autoencoder Architecture

Let’s take a look at the main architecture of an autoencoder. There are variations on this general architecture that we’ll discuss in the section below.

Photo: Michela Massi via Wikimedia Commons

As previously mentioned, an autoencoder can essentially be divided up into three different components: the encoder, a bottleneck, and the decoder.

The encoder portion of the autoencoder is typically a feedforward, densely connected network. The purpose of the encoding layers is to take the input data and compress it into a latent space representation, generating a new representation of the data that has reduced dimensionality.

The code layer, or the bottleneck, deals with the compressed representation of the data. The bottleneck code is carefully designed to determine the most relevant portions of the observed data, or, to put it another way, the features of the data that are most important for data reconstruction. The goal here is to determine which aspects of the data need to be preserved and which can be discarded. The bottleneck code needs to balance two different considerations: representation size (how compact the representation is) and variable/feature relevance. Like the other layers, the bottleneck applies an element-wise activation function to its weighted inputs. The bottleneck layer is also sometimes called a latent representation or latent variables.

The decoder layer is what is responsible for taking the compressed data and converting it back into a representation with the same dimensions as the original, unaltered data. The conversion is done with the latent space representation that was created by the encoder.

The most basic architecture of an autoencoder is a feed-forward architecture, with a structure much like the single-layer perceptrons that make up multilayer perceptrons. Much like regular feed-forward neural networks, the autoencoder is trained through the use of backpropagation.
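A toy linear autoencoder trained with backpropagation can be sketched in NumPy. This is a deliberately minimal illustration (no activation functions, and made-up data that really only varies in two dimensions), not a production implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4-D points that really only vary along 2 directions,
# so a 2-unit bottleneck can reconstruct them almost perfectly.
basis = np.array([[1.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 1.0]])
data = rng.normal(size=(64, 2)) @ basis

enc = rng.normal(scale=0.1, size=(4, 2))  # encoder weights: 4 inputs -> 2-unit bottleneck
dec = rng.normal(scale=0.1, size=(2, 4))  # decoder weights: bottleneck -> 4 outputs

lr = 0.01
for step in range(2000):
    code = data @ enc                         # encode: compress to the bottleneck
    recon = code @ dec                        # decode: attempt to rebuild the input
    err = recon - data                        # reconstruction error
    grad_recon = 2 * err / len(data)          # gradient of the mean squared error
    grad_dec = code.T @ grad_recon            # backpropagate into the decoder...
    grad_enc = data.T @ (grad_recon @ dec.T)  # ...and through it into the encoder
    dec -= lr * grad_dec
    enc -= lr * grad_enc

final_loss = np.mean((data @ enc @ dec - data) ** 2)
print(final_loss)  # far smaller than the variance of the data itself
```

After training, the two bottleneck units have effectively recovered the two directions the data actually varies along.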

Attributes of An Autoencoder

There are various types of autoencoders, but they all have certain properties that unite them.

Autoencoders learn automatically. They don’t require labels, and if given enough data it’s easy to get an autoencoder to reach high performance on a specific kind of input data.

Autoencoders are data-specific. This means that they can only compress data that is highly similar to data that the autoencoder has already been trained on. Autoencoders are also lossy, meaning that the outputs of the model will be degraded in comparison to the input data.

When designing an autoencoder, machine learning engineers need to pay attention to four different model hyperparameters: code size, layer number, nodes per layer, and loss function.

The code size decides how many nodes make up the middle portion of the network, and fewer nodes compress the data more. In a deep autoencoder, while the number of layers can be any number that the engineer deems appropriate, the number of nodes in a layer should decrease as the encoder goes on. Meanwhile, the opposite holds true in the decoder, meaning the number of nodes per layer should increase as the decoder layers approach the final layer. Finally, the loss function of an autoencoder is typically either binary cross-entropy or mean squared error. Binary cross-entropy is appropriate for instances where the input values of the data are in a 0 – 1 range.

Autoencoder Types

As mentioned above, variations on the classic autoencoder architecture exist. Let’s examine the different autoencoder architectures.


Photo: Michela Massi via Wikimedia Commons, CC BY-SA 4.0

While autoencoders typically have a bottleneck that compresses the data through a reduction of nodes, sparse autoencoders are an alternative to that typical operational format. In a sparse network, the hidden layers maintain the same size as the encoder and decoder layers. Instead, the activations within a given layer are penalized, setting it up so the loss function better captures the statistical features of input data. To put that another way, while the hidden layers of a sparse autoencoder have more units than a traditional autoencoder, only a certain percentage of them are active at any given time. The most impactful activations are preserved and others are ignored, and this constraint helps the network determine just the most salient features of the input data.


Contractive autoencoders are designed to be resilient against small variations in the data, maintaining a consistent representation of the data. This is accomplished by applying a penalty to the loss function. This regularization technique is based on the Frobenius norm of the Jacobian matrix of the encoder activations with respect to the input. The effect of this regularization technique is that the model is forced to construct an encoding where similar inputs will have similar encodings.


Convolutional autoencoders encode input data by splitting the data up into subsections and then converting these subsections into simple signals that are summed together to create a new representation of the data. Like convolutional neural networks, a convolutional autoencoder specializes in learning image data, and it uses a filter that is moved across the entire image section by section. The encodings generated by the encoding layer can be used to reconstruct the image, reflect the image, or modify the image’s geometry. Once the filters have been learned by the network, they can be used on any sufficiently similar input to extract the features of the image.


Photo: MAL via Wikimedia Commons, CC BY-SA 3.0

Denoising autoencoders introduce noise into the encoding, resulting in an encoding that is a corrupted version of the original input data. This corrupted version of the data is used to train the model, but the loss function compares the output values with the original input and not the corrupted input. The goal is that the network will be able to reproduce the original, non-corrupted version of the image. By comparing the corrupted data with the original data, the network learns which features of the data are most important and which features are unimportant/corruptions. In other words, in order for a model to denoise the corrupted images, it has to have extracted the important features of the image data.


Variational autoencoders operate by making assumptions about how the latent variables of the data are distributed. A variational autoencoder produces a probability distribution for the different features of the training images/the latent attributes. When training, the encoder creates latent distributions for the different features of the input images.


Because the model learns the features of images as Gaussian distributions instead of discrete values, it is capable of being used to generate new images. The Gaussian distribution is sampled to create a vector, which is fed into the decoding network, which renders an image based on this vector of samples. Essentially, the model learns common features of the training images and assigns them some probability that they will occur. The probability distribution can then be used to reverse engineer an image, generating new images that resemble the original, training images.

When training the network, the encoded data is analyzed and the recognition model outputs two vectors, drawing out the mean and standard deviation of the images. A distribution is created based on these values. This is done for the different latent states. The decoder then takes random samples from the corresponding distribution and uses them to reconstruct the initial inputs to the network.

Autoencoder Applications

Autoencoders can be used for a wide variety of applications, but they are typically used for tasks like dimensionality reduction, data denoising, feature extraction, image generation, sequence to sequence prediction, and recommendation systems.

Data denoising is the use of autoencoders to strip grain/noise from images. Similarly, autoencoders can be used to repair other types of image damage, like blurry images or images missing sections. Dimensionality reduction can help high capacity networks learn useful features of images, meaning the autoencoders can be used to augment the training of other types of neural networks. This is also true of using autoencoders for feature extraction, as autoencoders can be used to identify features of other training datasets to train other models.

In terms of image generation, autoencoders can be used to generate fake human images or animated characters, which has applications in designing face recognition systems or automating certain aspects of animation.

Sequence to sequence prediction models can be used to determine the temporal structure of data, meaning that an autoencoder can be used to generate the next event in a sequence. For this reason, an autoencoder could be used to generate videos. Finally, deep autoencoders can be used to create recommendation systems by picking up on patterns relating to user interest, with the encoder analyzing user engagement data and the decoder creating recommendations that fit the established patterns.


AI 101

What Is Synthetic Data?




What is Synthetic Data?

Synthetic data is a quickly expanding trend and emerging tool in the field of data science. What is synthetic data exactly? The short answer is that synthetic data consists of data that isn’t based on any real-world phenomena or events; rather, it’s generated via a computer program. Yet why is synthetic data becoming so important for data science? How is synthetic data created? Let’s explore the answers to these questions.

What is a Synthetic Dataset?

As the term “synthetic” suggests, synthetic datasets are generated through computer programs, instead of being composed through the documentation of real-world events. The primary purpose of a synthetic dataset is to be versatile and robust enough to be useful for the training of machine learning models.

In order to be useful for a machine learning classifier, the synthetic data should have certain properties. While the data can be categorical, binary, or numerical, the length of the dataset should be arbitrary and the data should be randomly generated. The random processes used to generate the data should be controllable and based on various statistical distributions. Random noise may also be placed in the dataset.

If the synthetic data is being used for a classification algorithm, the amount of class separation should be customizable, in order that the classification problem can be made easier or harder according to the problem’s requirements. Meanwhile, for a regression task, non-linear generative processes can be employed to generate the data.
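A sketch of such a generator in Python, where a single `separation` parameter controls how easy the resulting classification problem is; all names and values are illustrative:

```python
import random

random.seed(42)  # controllable randomness: reruns produce the same dataset

def make_classification_data(n_per_class, separation, noise=1.0):
    """Generate a toy 2-D binary classification dataset. `separation` controls
    how far apart the two class centers are, making the problem easier or harder."""
    data = []
    for label, center in enumerate([(0.0, 0.0), (separation, separation)]):
        for _ in range(n_per_class):
            features = (random.gauss(center[0], noise),
                        random.gauss(center[1], noise))
            data.append((features, label))  # the label comes free with generation
    return data

easy = make_classification_data(100, separation=10.0)  # well-separated classes
hard = make_classification_data(100, separation=0.5)   # heavily overlapping classes
print(len(easy), len(hard))  # 200 200
```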

Why Use Synthetic Data?

As machine learning frameworks like TensorFlow and PyTorch become easier to use, and pre-designed models for computer vision and natural language processing become more ubiquitous and powerful, the primary problem that data scientists must face is the collection and handling of data. Companies often have difficulty acquiring large amounts of data to train an accurate model within a given time frame. Hand-labeling data is a costly, slow way to acquire data. However, generating and using synthetic data can help data scientists and companies overcome these hurdles and develop reliable machine learning models in a quicker fashion.

There are a number of advantages to using synthetic data. The most obvious way that the use of synthetic data benefits data science is that it reduces the need to capture data from real-world events, and for this reason it becomes possible to generate data and construct a dataset much more quickly than with a dataset dependent on real-world events. This means that large volumes of data can be produced in a short timeframe. This is especially true for events that rarely occur: if an event rarely happens in the wild, more data can be mocked up from a few genuine data samples. Beyond that, the data can be automatically labeled as it is generated, drastically reducing the amount of time needed to label data.

Synthetic data can also be useful to gain training data for edge cases, which are instances that may occur infrequently but are critical for the success of your AI. Edge cases are events that are very similar to the primary target of an AI but differ in important ways. For instance, objects that are only partially in view could be considered edge cases when designing an image classifier.

Finally, synthetic datasets can minimize privacy concerns. Attempts to anonymize data can be ineffective, as even if sensitive/identifying variables are removed from the dataset, other variables can act as identifiers when they are combined. This isn’t an issue with synthetic data, as it was never based on a real person, or real event, in the first place.

Use Cases for Synthetic Data

Synthetic data has a wide variety of uses, as it can be applied to just about any machine learning task. Common use cases for synthetic data include self-driving vehicles, security, robotics, fraud protection, and healthcare.

One of the initial use cases for synthetic data was self-driving cars, as synthetic data is used to create training data for cars in conditions where getting real, on-the-road training data is difficult or dangerous. Synthetic data is also useful for the creation of data used to train image recognition systems, like surveillance systems, much more efficiently than manually collecting and labeling a bunch of training data. Robotics systems can be slow to train and develop with traditional data collection and training methods. Synthetic data allows robotics companies to test and engineer robotics systems through simulations. Fraud protection systems can benefit from synthetic data, and new fraud detection methods can be trained and tested with data that is constantly new when synthetic data is used. In the healthcare field, synthetic data can be used to design health classifiers that are accurate, yet preserve people’s privacy, as the data won’t be based on real people.

Synthetic Data Challenges

While the use of synthetic data brings many advantages with it, it also brings many challenges.

When synthetic data is created, it often lacks outliers. Outliers occur in data naturally, and while they are often dropped from training datasets, their existence may be necessary to train truly reliable machine learning models. Beyond this, the quality of synthetic data can be highly variable. Synthetic data is often generated from input, or seed, data, and therefore the quality of the generated data can depend on the quality of the input data. If the data used to generate the synthetic data is biased, the generated data can perpetuate that bias. Synthetic data also requires some form of output/quality control: it needs to be checked against human-annotated data, or otherwise authentic data in some form.

How Is Synthetic Data Created?

Synthetic data is created programmatically with machine learning techniques. Classical machine learning techniques like decision trees can be used, as can deep learning techniques. The requirements for the synthetic data will influence what type of algorithm is used to generate the data. Decision trees and similar machine learning models let companies create non-classical, multi-modal data distributions, trained on examples of real-world data. Generating data with these algorithms will provide data that is highly correlated with the original training data. For instances where the typical distribution of data is known, a company can generate synthetic data through use of a Monte Carlo method.

Deep learning-based methods of generating synthetic data typically make use of either a variational autoencoder (VAE) or a generative adversarial network (GAN). VAEs are unsupervised machine learning models that make use of encoders and decoders. The encoder portion of a VAE is responsible for compressing the data down into a simpler, compact version of the original dataset, which the decoder then analyzes and uses to generate a representation of the base data. A VAE is trained with the goal of having an optimal relationship between the input data and output, one where both input data and output data are extremely similar.

When it comes to GAN models, they are called “adversarial” networks because GANs are actually two networks that compete with each other. The generator is responsible for generating synthetic data, while the second network (the discriminator) operates by comparing the generated data with a real dataset and tries to determine which data is fake. When the discriminator catches fake data, the generator is notified, and it makes changes to try to get its next batch of data past the discriminator. In turn, the discriminator becomes better and better at detecting fakes. The two networks are trained against each other, with fakes becoming more lifelike all the time.


AI 101

How Does Image Classification Work?




How can your phone determine what an object is just by taking a photo of it? How do social media websites automatically tag people in photos? This is accomplished through AI-powered image recognition and classification.

The recognition and classification of images is what enables many of the most impressive accomplishments of artificial intelligence. Yet how do computers learn to detect and classify images? In this article, we’ll cover the general methods that computers use to interpret and detect images and then take a look at some of the most popular methods of classifying those images.

Pixel-Level vs. Object-Based Classification

Image classification techniques can mainly be divided into two different categories: pixel-based classification and object-based classification.

Pixels are the base units of an image, and the analysis of pixels is the primary way that image classification is done. However, classification algorithms can either use just the spectral information within individual pixels to classify an image or examine spatial information (nearby pixels) along with the spectral information. Pixel-based classification methods utilize only spectral information (the intensity of a pixel), while object-based classification methods take into account both pixel spectral information and spatial information.

There are different classification techniques used for pixel-based classification. These include minimum-distance-to-mean, maximum-likelihood, and minimum-Mahalanobis-distance. These methods require that the means and variances of the classes are known, and they all operate by examining the “distance” between class means and the target pixels.
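The minimum-distance-to-mean approach can be sketched in a few lines of Python; the class means below are invented for illustration:

```python
def minimum_distance_classify(pixel, class_means):
    """Assign a pixel to the class whose mean spectral value it is closest to."""
    def distance(mean):
        # Squared Euclidean distance between the pixel and a class mean.
        return sum((p - m) ** 2 for p, m in zip(pixel, mean))
    return min(class_means, key=lambda cls: distance(class_means[cls]))

# Hypothetical class means over (red, green, blue) intensities.
means = {"water": (10, 40, 120), "vegetation": (30, 120, 40), "soil": (120, 90, 60)}
print(minimum_distance_classify((25, 110, 50), means))  # nearest to the vegetation mean
```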

Pixel-based classification methods are limited by the fact that they can’t use information from other nearby pixels. In contrast, object-based classification methods can include other pixels and therefore they also use spatial information to classify items. Note that “object” just refers to contiguous regions of pixels and not whether or not there is a target object within that region of pixels.

Preprocessing Image Data For Object Detection

The most recent and reliable image classification systems primarily use object-level classification schemes, and for these approaches image data must be prepared in specific ways. The objects/regions need to be selected and preprocessed.

Before an image, and the objects/regions within that image, can be classified, the data that comprises that image has to be interpreted by the computer. Images need to be preprocessed and readied for input into the classification algorithm, and this is done through object detection. This is a critical part of readying the data and preparing the images to train the machine learning classifier.

Object detection is done with a variety of methods and techniques. To begin with, whether there are multiple objects of interest or a single object of interest impacts how the image preprocessing is handled. If there is just one object of interest, the image undergoes image localization. The pixels that comprise the image have numerical values that are interpreted by the computer and used to display the proper colors and hues. A box known as a bounding box is drawn around the object of interest, which helps the computer know what part of the image is important and what pixel values define the object. If there are multiple objects of interest in the image, a technique called object detection is used to apply these bounding boxes to all the objects within the image.

Photo: Adrian Rosebrock via Wikimedia Commons, CC BY-SA 4.0
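Computing a bounding box from a detected region can be sketched in plain Python; here the “detection” is just a hand-made binary mask:

```python
def bounding_box(mask):
    """Given a binary mask (list of rows), return the (left, top, right, bottom)
    box that encloses every nonzero pixel."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)

# A 5x4 image where an object of interest occupies a small blob of pixels.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(bounding_box(mask))  # (1, 1, 3, 2)
```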

Another method of preprocessing is image segmentation. Image segmentation functions by dividing the whole image into segments based on similar features. Different regions of the image will have similar pixel values in comparison to other regions of the image, so these pixels are grouped together into image masks that correspond to the shape and boundaries of the relevant objects within the image. Image segmentation helps the computer isolate the features of the image that will help it classify an object, much like bounding boxes do, but they provide much more accurate, pixel-level labels.

After the object detection or image segmentation has been completed, labels are applied to the regions in question. These labels are fed, along with the values of the pixels comprising the object, into the machine learning algorithms that will learn patterns associated with the different labels.

Machine Learning Algorithms

Once the data has been prepared and labeled, the data is fed into a machine learning algorithm, which trains on the data. We’ll cover some of the most common kinds of machine learning image classification algorithms below.

K-Nearest Neighbors

K-Nearest Neighbors is a classification algorithm that examines the closest training examples and looks at their labels to ascertain the most probable label for a given test example. When it comes to image classification using KNN, the feature vectors and labels of the training images are stored and just the feature vector is passed into the algorithm during testing. The training and testing feature vectors are then compared against each other for similarity.
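A minimal KNN sketch in Python, with made-up 2-D feature vectors standing in for stored image features:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label). Find the k training examples
    nearest to the query and return the majority label among them."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors for two classes of images.
train = [((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"), ((0.9, 1.1), "cat"),
         ((4.0, 4.0), "dog"), ((4.2, 3.9), "dog"), ((3.8, 4.1), "dog")]
print(knn_predict(train, (1.1, 1.0)))  # -> "cat"
```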

KNN-based classification algorithms are extremely simple and they deal with multiple classes quite easily. However, KNN calculates similarity based on all features equally. This means that it can be prone to misclassification when provided with images where only a subset of the features is important for the classification of the image.
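The whole workflow fits in a few lines with scikit-learn. The 2-D feature vectors below are made up (standing in for features already extracted from images); the test point's label is decided by majority vote among its three nearest stored neighbors:

```python
from sklearn.neighbors import KNeighborsClassifier

# Made-up training feature vectors and their class labels (0 and 1).
X_train = [[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]]
y_train = [0, 0, 1, 1]

# "Training" just stores the vectors and labels.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# At test time, the new feature vector is compared against the stored
# ones; the majority label among the 3 nearest neighbors wins.
print(knn.predict([[0.2, 0.1]]))  # closest to the class-0 examples
```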

Support Vector Machines

Support Vector Machines are a classification method that plots data points in space and then finds the dividing line, or hyperplane in higher dimensions, that best separates them, assigning objects to different classes depending on which side of the dividing plane the points fall on. Support Vector Machines are capable of nonlinear classification through the use of a technique known as the kernel trick. While SVM classifiers are often very accurate, a substantial drawback is that they tend to be limited by both dataset size and speed, with speed suffering as the dataset grows.
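The kernel trick is easiest to see on XOR-style data, a toy problem chosen here because no single straight line can separate the two classes. With an RBF kernel, scikit-learn's SVC can still draw a nonlinear boundary between them:

```python
from sklearn.svm import SVC

# XOR-style data: opposite corners of the square share a class,
# so no linear divider can separate class 0 from class 1.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# The RBF kernel implicitly maps the points into a higher-dimensional
# space where a separating hyperplane exists (the "kernel trick").
clf = SVC(kernel="rbf", gamma=2.0)
clf.fit(X, y)
print(clf.predict(X))
```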

Multi-Layer Perceptrons (Neural Nets)

Multi-layer perceptrons, also called neural network models, are machine learning algorithms inspired by the human brain. Multi-layer perceptrons are composed of multiple layers of nodes joined to one another, much like neurons in the human brain are linked together. Neural networks make assumptions about how the input features are related to the data's classes, and these assumptions are adjusted over the course of training. Simple neural network models like the multi-layer perceptron are capable of learning non-linear relationships, and as a result, they can be much more accurate than other models. However, MLP models suffer from some notable issues, such as non-convex loss functions, which mean training can settle in a poor local solution rather than the best possible one.
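A small MLP sketch on the same XOR-style toy data shows the non-linear capability: one hidden layer is enough for a relationship no linear model can capture. The layer size and `random_state` are arbitrary choices; pinning the random state matters precisely because the non-convex loss means different initializations can reach different solutions:

```python
from sklearn.neural_network import MLPClassifier

# XOR-style data: not separable by any linear model.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# One hidden layer of 8 nodes; lbfgs works well on tiny datasets.
mlp = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                    max_iter=5000, random_state=0)
mlp.fit(X, y)
print(mlp.predict(X))
```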

Deep Learning Algorithms (CNNs)

Photo: APhex34 via Wikimedia Commons, CC BY-SA 4.0

The most commonly used image classification algorithm in recent times is the Convolutional Neural Network (CNN). CNNs are customized versions of neural networks that combine multilayer neural networks with specialized layers capable of extracting the features most important and relevant to the classification of an object. CNNs can automatically discover, generate, and learn features of images. This greatly reduces the need to manually label and segment images to prepare them for machine learning algorithms. They also have an advantage over plain MLP networks because their convolutional layers share weights across the image, which greatly reduces the number of parameters that must be learned.

Convolutional Neural Networks get their name from the fact that they perform “convolutions”. CNNs operate by taking a filter and sliding it over an image. You can think of this as viewing sections of a landscape through a moveable window, concentrating on just the features that are visible through the window at any one time. The filter contains numerical values which are multiplied with the values of the pixels themselves. The result is a new frame, or matrix, full of numbers that represents the filter's response to the original image. This process is repeated for a chosen number of filters, and the resulting frames are stacked together into a new representation that is slightly smaller and less complex than the original image. A technique called pooling is used to keep just the most important values within each frame, and the goal is for the convolutional layers to eventually extract just the most salient parts of the image that will help the neural network recognize the objects in the image.
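The sliding-window arithmetic can be spelled out by hand. Below is a minimal sketch, assuming a single made-up 4x4 image and a single 2x2 filter (no padding, stride 1), followed by 2x2 max pooling, the operations a CNN framework performs at scale:

```python
import numpy as np

# Toy 4x4 "image" and a single 2x2 filter, both invented for illustration.
image = np.array([
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 2],
    [1, 0, 1, 3],
])
kernel = np.array([
    [1,  0],
    [0, -1],
])

# Slide the filter over the image: at each position, multiply the
# overlapping pixel values by the filter values and sum them.
out = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(image[i:i+2, j:j+2] * kernel)

# 2x2 max pooling (stride 1 here): keep only the strongest filter
# response in each window, shrinking the frame further.
pooled = np.array([[out[i:i+2, j:j+2].max() for j in range(2)]
                   for i in range(2)])
print(out)
print(pooled)
```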

Convolutional Neural Networks are composed of two different parts. The convolutional layers extract the features of the image and convert them into a format that the neural network layers can interpret and learn from. The early convolutional layers are responsible for extracting the most basic elements of the image, like simple lines and boundaries. The middle convolutional layers begin to capture more complex shapes, like simple curves and corners. The later, deeper convolutional layers extract the high-level features of the image, which are what is passed into the neural network portion of the CNN and what the classifier learns from.
