Connect with us

AI 101

What is Natural Language Processing?

mm

Published

 on

Natural Language Processing (NLP) is the study and application of techniques and tools that enable computers to process, analyze, interpret, and reason about human language. NLP is an interdisciplinary field and it combines techniques established in fields like linguistics and computer science. These techniques are used in concert with AI to create chatbots and digital assistants like Google Assistant and Amazon’s Alexa.

Let’s take some time to explore the rationale behind Natural Language Processing, some of the techniques used in NLP, and some common uses cases for NLP.

Why Is Natural Language Processing Important?

In order for computers to interpret human language, they must be converted into a form that a computer can manipulate. However, this isn’t as simple as converting text data into numbers. In order to derive meaning from human language, patterns have to be extracted from the hundreds or thousands of words that make up a text document. This is no easy task. There are few hard and fast rules that can be applied to the interpretation of human language. For instance, the exact same set of words can mean different things depending on the context. Human language is a complex and often ambiguous thing, and a statement can be uttered with sincerity or sarcasm.

Despite this, there are some general guidelines that can be used when interpreting words and characters, such as the character “s” being used to denote that an item is plural. These general guidelines have to be used in concert with each other to extract meaning from the text, to create features that a machine learning algorithm can interpret.

Natural Language Processing involves the application of various algorithms capable of taking unstructured data and converting it into structured data. If these algorithms are applied in the wrong manner, the computer will often fail to derive the correct meaning from the text. This can often be seen in the translation of text between languages, where the precise meaning of the sentence is often lost. While machine translation has improved substantially over the past few years, machine translation errors still occur frequently.

Natural Language Processing Techniques

Photo: Tamur via WikiMedia Commons, Public Domain (https://commons.wikimedia.org/wiki/File:ParseTree.svg)

Many of the techniques that are used in natural language processing can be placed in one of two categories: syntax or semantics. Syntax techniques are those that deal with the ordering of words, while semantic techniques are the techniques that involve the meaning of words.

Syntax NLP Techniques

Examples of syntax include:

  • Lemmatization
  • Morphological Segmentation
  • Part-of-Speech Tagging
  • Parsing
  • Sentence Breaking
  • Stemming
  • Word Segmentation

Lemmatization refers to distilling the different inflections of a word down to a single form. Lemmatization takes things like tenses and plurals and simplifies them, for example, “feet” might become “foot” and “stripes” may become “stripe”.  This simplified word form makes it easier for an algorithm to interpret the words in a document.

Morphological segmentation is the process of dividing words into morphemes or the base units of a word. These units are things like free morphemes (which can stand alone as words) and prefixes or suffixes.

Part-of-speech tagging is simply the process of identifying which part of speech every word in an input document is.

Parsing refers to analyzing all the words in a sentence and correlating them with their formal grammar labels or doing grammatical analysis for all the words.

Sentence breaking, or sentence boundary segmentation, refers to deciding where a sentence begins and ends.

Stemming is the process of reducing words down to the root form of the word. For instance, connected, connection, and connections would all be stemmed to “connect”.

Word Segmentation is the process of dividing large pieces of text down into small units, which can be words or stemmed/lemmatized units.

Semantic NLP Techniques

Semantic NLP techniques include techniques like:

  • Named Entity Recognition
  • Natural Language Generation
  • Word-Sense disambiguation

Named entity recognition involves tagging certain text portions that can be placed into one of a number of different preset groups. Pre-defined categories include things like dates, cities, places, companies, and individuals.

Natural language generation is the process of using databases to transform structured data into natural language. For instance, statistics about the weather, like temperature and wind speed could be summarized with natural language.

Word-sense disambiguation is the process of assigning meaning to words within a text based on the context the words appear in.

Deep Learning Models For Natural Language Processing

Regular multilayer perceptrons are unable to handle the interpretation of sequential data, where the order of the information is important. In order to deal with the importance of order in sequential data, a type of neural network is used that preserves information from previous timesteps in the training.

Recurrent Neural Networks are types of neural networks that loop over data from previous timesteps, taking them into account when calculating the weights of the current timestep. Essentially, RNN’s have three parameters that are used during the forward training pass: a matrix based on the Previous Hidden State, a matrix based on the Current Input, and a matrix that is between the hidden state and the output. Because RNNs can take information from previous timesteps into account, they can extract relevant patterns from text data by taking earlier words in the sentence into account when interpreting the meaning of a word.

Another type of deep learning architecture used to process text data is a Long Short-Term Memory (LSTM) network. LSTM networks are similar to RNNs in structure, but owing to some differences in their architecture they tend to perform better than RNNs. They avoid a specific problem that often occurs when using RNNs called the exploding gradient problem.

These deep neural networks can be either unidirectional or bi-directional. Bi-directional networks are capable of taking not just the words that come prior to the current word into account, but the words that come after it. While this leads to higher accuracy, it is more computationally expensive.

Use Cases For Natural Language Processing

Photo: mohammed_hassan via Pixabay, Pixabay License (https://pixabay.com/illustrations/chatbot-chat-application-artificial-3589528/)

Because Natural Language Processing involves the analysis and manipulation of human languages, it has an incredibly wide range of applications. Possible applications for NLP include chatbots, digital assistants, sentiment analysis, document organization, talent recruitment, and healthcare.

Chatbots and digital assistants like Amazon’s Alexa and Google Assistant are examples of voice recognition and synthesis platforms that use NLP to interpret and respond to vocal commands. These digital assistants help people with a wide variety of tasks, letting them offload some of their cognitive tasks to another device and free up some of their brainpower for other, more important things. Instead of looking up the best route to the bank on a busy morning, we can just have our digital assistant do it.

Sentiment analysis is the use of NLP techniques to study people’s reactions and feelings to a phenomenon, as communicated by their use of language. Capturing the sentiment of a statement, like interpreting whether a review of a product is good or bad, can provide companies with substantial information regarding how their product is being received.

Automatically organizing text documents is another application of NLP. Companies like Google and Yahoo use NLP algorithms to classify email documents, putting them in the appropriate bins such as “social” or “promotions”. They also use these techniques to identify spam and prevent it from reaching your inbox.

Groups have also developed NLP techniques are being used to identify potential job hires, finding them based on relevant skills. Hiring managers are also using NLP techniques to help them sort through lists of applicants.

NLP techniques are also being used to enhance healthcare. NLP can be used to improve the detection of diseases. Health records can be analyzed and symptoms extracted by NLP algorithms, which can then be used to suggest possible diagnoses. One example of this is Amazon’s Comprehend Medical platform, which analyzes health records and extracts diseases and treatments. Healthcare applications of NLP also extend to mental health. There are apps such as WoeBot, which talks users through a variety of anxiety management techniques based in Cognitive Behavioral Therapy.

To Learn More

Recommended Natural Language Processing CoursesOffered ByDurationDifficulty


Introduction to Artificial Intelligence



IBM

9 Hours

Beginner


Natural Language Processing in TensorFlow


Deep Learning AI

9 Hours

Intermediate


An Introduction to Practical Deep Learning


Intel Software

12 Hours

Intermediate


Natural Language Processing


Higher School of Economics

34 Hours

Advanced
Spread the love

Blogger and programmer with specialties in Machine Learning and Deep Learning topics. Daniel hopes to help others use the power of AI for social good.

AI 101

What Are Nanobots? Understanding Nanobot Structure, Operation, and Uses

mm

Published

on

As technology advances, things don’t always become bigger and better, objects also become smaller. In fact, nanotechnology is one of the fastest-growing technological fields, worth over 1 trillion USD, and it’s forecast to grow by approximately 17% over the next half-decade. Nanobots are a major part of the nanotechnology field, but what are they exactly and how do they operate? Let’s take a closer look at nanobots to understand how this transformative technology works and what it’s used for.

What Are Nanobots?

The field of nanotechnology is concerned with the research and development of technology approximately one to 100 nanometres in scale. Therefore, nanorobotics is focused on the creation of robots that are around this size. In practice, it’s difficult to engineer anything as small as one nanometer in scale and the term “nanorobotics” and “nanobot” is frequently applied to devices which are approximately 0.1 – 10 micrometers in size, which is still quite small.

It’s important to note that the term “nanorobot” is sometimes applied to devices which interact with objects at the nanoscale, manipulating nanoscale items. Therefore, even if the device itself is much larger, it may be considered a nanorobotic instrument. This article will focus on nanoscale robots themselves.

Much of the field of nanorobotics and nanobots is still in the theoretical phase, with research focused on solving the problems of construction at such a small scale. However, some prototype nanomachines and nanomotors have been designed and tested.

Most currently existing nanorobotic devices fall into one of four categories: switches, motors, shuttles, and cars.

Nanorobotic switches operate by being prompted to switch from an “off” state to an “on” state. Environmental factors are used to make the machine change shape, a process called conformational change. The environment is altered using processes like chemical reactions, UV light, and temperature, and the nanorobotic switches shift into different forms as a result, able to accomplish specific tasks.

Nanomotors are more complex than simple switches, and they utilize the energy created by the effects of the conformational change in order to move around and affect the molecules in the surrounding environment.

Shuttles are nanorobots that are capable of transporting chemicals like drugs to specific, targeted regions. The goal is to combine shuttles with nanorobot motors so that the shuttles are capable of a greater degree of movement through an environment.

Nanorobotic “cars” are the most advanced nanodevices at the moment, capable of moving independently with prompts from chemical or electromagnetic catalysts. The nanomotors that drive nanorobotic cars need to be controlled in order for the vehicle to be steered, and researchers are experimenting with various methods of nanorobotic control.

Nanorobotics researchers aim to synthesize these different components and technologies into nanomachines that can complete complex tasks, accomplished by swarms of nanobots working together.

Photo: Photo: ” Comparison of the sizes of nanomaterials with those of other common materials.” Sureshup vai Wikimedia Commons, CC BY 3.0 (https://en.wikipedia.org/wiki/File:Comparison_of_nanomaterials_sizes.jpg)

How Are Nanobots Created?

The field of nanorobotics is at the crossroads of many disciplines and the creation of nanobots involves the creation of sensors, actuators and motors. Physical modeling must be done as well, and all of this must be done at nanoscale. As mentioned above, nanomanipulation devices are used to assemble these nano-scale parts and manipulate artificial or biological components, which includes the manipulation of cells and molecules.

Nanorobotics engineers must be able to solve a multitude of problems. They have to address issues regarding sensation, control power, communications, and interactions between both inorganic and organic materials.

The size of a nanobot is roughly comparable to biological cells, and because of this fact future nanobots could be employed in disciplines like medicine and environmental preservation/remediation. Most “nanobots” that exist today are just specific molecules which have been manipulated to accomplish certain tasks. 

Complex nanobots are essentially just simple molecules joined together and manipulated with chemical processes. For instance, some nanobots are comprised of DNA, and they transport molecular cargo.

How Do Nanobots Operate?

Given the still heavily theoretical nature of nanobots, questions about how nanobots operate are answered with predictions rather than statements of fact. It’s likely that the first major uses for nanobots will be in the medical field, moving through the human body and accomplishing tasks like diagnosing diseases, monitoring vitals, and dispensing treatments. These nanobots will need to be able to navigate their way around the human body and move through tissues like blood vessels.

Navigation

In terms of nanobot navigation, there are a variety of techniques that nanobot researchers and engineers are investigating. One method of navigation is the utilization of ultrasonic signals for detection and deployment. A nanobot could emit ultrasonic signals that could be traced to locate the position of the nanobots, and the robots could then be guided to specific areas with the use of a special tool that directs their motion. Magnetic Resonance Imaging (MRI) devices could also be employed to track the position of nanobots, and early experiments with MRIs have demonstrated that the technology can be used to detect and even maneuver nanobots. Other methods of detecting and maneuvring nanobots include the use of X-rays, microwaves and radio-waves. At the moment, our control of these waves at the nano-scale is fairly limited, so new methods of utilizing these waves would have to be invented.

The navigation and detection systems described above are external methods, relying on the use of tools to move the nanobots. With the addition of onboard sensors, the nanobots could be more autonomous. For instance, chemical sensors included onboard nanobots could allow the robot to scan the surrounding environment and follow certain chemical markers to a target region.

Power

When it comes to powering the nanobots, there are also a variety of power solutions being explored by researchers. Solutions for powering nanobots include external power sources and onboard/internal power sources.

Internal power solutions include generators and capacitors. Generators onboard the nanobot could use the electrolytes found within the blood to produce energy, or nanobots could even be powered using the surrounding blood as a chemical catalyst that produces energy when combined with a chemical the nanobot carries with it. Capacitors operate similarly to batteries, storing electrical energy that could be used to propel the nanobot. Other options like tiny nuclear power sources have even been considered.

As far as external power sources go, incredibly small, thin wires could tether the nanobots to an outside power source. Such wires could be made out of miniature fiber optic cables, sending pulses of light down the wires and having the actual electricity be generated within the nanobot.

Other external power solutions include magnetic fields or ultrasonic signals. Nanobots could employ something called a piezoelectric membrane, which is capable of collecting ultrasonic waves and transforming them into electrical power. Magnetic fields can be used to catalyze electrical currents within a closed conducting loop contained onboard the nanobot. As a bonus, the magnetic field could also be used to control the direction of the nanobot.

Locomotion

Addressing the problem of nanobot locomotion requires some inventive solutions. Nanobots that aren’t tethered, or aren’t just free-floating in their environment, need to have some method of moving to their target locations. The propulsion system will need to be powerful and stable, able to propel the nanobot against currents in its surrounding environment, like the flow of the blood. Propulsion solutions under investigation are often inspired by the natural world, with researchers looking at how microscope organisms move through their environment. For instance, microorganisms often use long, whip-like tails called flagella to propel themselves, or they use a number of tiny, hair-like limbs dubbed cilia.

Researchers are also experimenting with giving robots small arm-like appendages that could allow the robot to swim, grip, and crawl. Currently, these appendages are controlled via magnetic fields outside the body, as the magnetic force prompts the robot’s arms to vibrate. An added benefit to this method of locomotion is that the energy for it comes from an outside source. This technology would need to be made even smaller to make it viable for true nanobots.

There are other, more inventive, propulsion strategies also under investigation. For instance, some researchers have proposed using capacitors to engineer an electromagnetic pump that would pull conductive fluids in and shoot it out like a jet, propelling the nanobot forward.

Regardless of the eventual application of nanobots, they must solve the problems described above, handling navigation, locomotion, and power.

What Are Nanobots Used For?

As mentioned, the first uses for nanobots will likely be in the medical field. Nanobots could be used to monitor for damage to the body, and potentially even facilitate the repair of this damage. Future nanobots could deliver medicine directly to the cells that need them. Currently, medicines are delivered orally or intravenously and they spread throughout the body instead of hitting just the target regions, causing side effects. Nanobots equipped with sensors could easily be used to monitor for changes in regions of cells, reporting changes at the first sign of damage or malfunction.

We are still a long way away from these hypothetical applications, but progress is being made all the time. As an example, in 2017 scientists created nanobots that targeted cancer cells and attacked them with a miniaturized drill, killing them. This year, a group of researchers from ITMO University designed a nanobot composed of DNA fragments, capable of destroying pathogenic RNA strands. DNA-based nanobots are also currently capable of transporting molecular cargo, The nanobot is made of three different DNA sections, maneuvering with a DNA “leg” and carrying specific molecules with the use of an “arm”.

Beyond medical applications, research is being done regarding the use of nanobots for the purposes of environmental cleanup and remediation. Nanobots could potentially be used to remove toxic heavy metals and plastics from bodies of water. The nanobots could carry compounds that render toxic substances inert when combined together, or they could be used to degrade plastic waste through similar processes. Research is also being done on the use of nanobots to facilitate the production of extremely small computer chips and processors, essentially using nanobots to produce microscale computer circuits.

Spread the love
Continue Reading

AI 101

What Are Deepfakes?

mm

Published

on

As deepfakes become easier to make and more prolific, more attention is paid to them. Deepfakes have become the focal point of discussions involving AI ethics, misinformation, openness of information and the internet, and regulation. It pays to be informed regarding deepfakes, and to have an intuitive understanding of what deepfakes are. This article will clarify the definition of a deepfake, examine their use cases, discuss how deepfakes can be detected, and examine the implications of deepfakes for society.

What Is A Deepfakes?

Before going on to discuss deepfakes further, it would be helpful to take some time and clarify what “deepfakes” actually are. There is a substantial amount of confusion regarding the term Deepfake, and often the term is misapplied to any falsified media, regardless of whether or not it is a genuine deepfake. In order to qualify as a Deepfake, the faked media in question must be generated with a machine-learning system, specifically a deep neural network.

The key ingredient of deepfakes is machine learning. Machine learning has made it possible for computers to automatically generate video and audio relatively quickly and easily. Deep neural networks are trained on footage of a real person in order for the network to learn how people look and move under the target environmental conditions. The trained network is then used on images of another individual and augmented with additional computer graphics techniques in order to combine the new person with the original footage. An encoder algorithm is used to determine the similarities between the original face and the target face. Once the common features of the faces have been isolated, a second AI algorithm called a decoder is used. The decoder examines the encoded (compressed) images and reconstructs them based off on the features in the original images. Two decoders are used, one on the original subject’s face and the second on the target person’s face. In order for the swap to be made, the decoder trained on images of person X is fed images of person Y. The result is that person Y’s face is reconstruction over Person X’s facial expressions and orientation.

Currently, it still takes a fair amount of time for a deepfake to be made. The creator of the fake has to spend a long time manually adjusting parameters of the model, as suboptimal parameters will lead to noticeable imperfections and image glitches that give away the fake’s true nature.

Although it’s frequently assumed that most deepfakes are made with a type of neural network called a generative adversarial network (GAN), many (perhaps most) deepfakes created these days do not rely on GANs. While GANs did play a prominent role in the creation of early deepfakes,  most deepfake videos are created through alternative methods, according to Siwei Lyu from SUNY Buffalo.

It takes a disproportionately large amount of training data in order to train a GAN, and GANs often take much longer to render an image compared to other image generation techniques. GANs are also better for generating static images than video, as GANs have difficulties maintaining consistencies from frame to frame. It’s much more common to use an encoder and multiple decoders to create deepfakes.

What Are Deepfakes Used For?

Many of the deepfakes found online are pornographic in nature. According to research done by Deeptrace, an AI firm, out of a sample of approximately 15,000 deepfake videos taken in September of 2019, approximately 95% of them were pornographic in nature. A troubling implication of this fact is that as the technology becomes easier to use, incidents of fake revenge porn could rise.

However, not all deep fakes are pornographic in nature. There are more legitimate uses for deepfake technology. Audio deepfake technology could help people broadcast their regular voices after they are damaged or lost due to illness or injury. Deepfakes can also be used for hiding the faces of people who are in sensitive, potentially dangerous situations, while still allowing their lips and expressions to be read. Deepfake technology can potentially be used to improve the dubbing on foreign-language films, aid in the repair of old and damaged media, and even create new styles of art.

Non-Video Deepfakes

While most people think of fake videos when they hear the term “deepfake”, fake videos are by no means the only kind of fake media produced with deepfake technology. Deepfake technology is used to create photo and audio fakes as well. As previously mentioned, GANs are frequently used to generate fake images. It’s thought that there have been many cases of fake LinkedIn and Facebook profiles that have profile images generated with deepfake algorithms.

It’s possible to create audio deepfakes as well. Deep neural networks are trained to produce voice clones/voice skins of different people, including celebrities and politicians. One famous example of an audio Deepfake is when the AI company Dessa made use of an AI model, supported by non-AI algorithms, to recreate the voice of the podcast host Joe Rogan.

How To Spot Deepfakes

As deepfakes become more and more sophisticated, distinguishing them from genuine media will become tougher and tougher. Currently, there are a few telltale signs people can look for to ascertain if a video is potentially a deepfake, like poor lip-syncing, unnatural movement, flickering around the edge of the face, and warping of fine details like hair, teeth, or reflections. Other potential signs of a deepfake include lower-quality parts of the same video, and irregular blinking of the eyes.

While these signs may help one spot a deepfake at the moment, as deepfake technology improves the only option for reliable deepfake detection might be other types of AI trained to distinguish fakes from real media.

Artificial intelligence companies, including many of the large tech companies, are researching methods of detecting deepfakes. Last December, a deepfake detection challenge was started, supported by three tech giants: Amazon, Facebook, and Microsoft. Research teams from around the world worked on methods of detecting deepfakes, competing to develop the best detection methods. Other groups of researchers, like a group of combined researchers from Google and Jigsaw, are working on a type of “face forensics” that can detect videos that have been altered, making their datasets open source and encouraging others to develop deepfake detection methods. The aforementioned Dessa has worked on refining deepfake detection techniques, trying to ensure that the detection models work on deepfake videos found in the wild (out on the internet) rather than just on pre-composed training and testing datasets, like the open-source dataset Google provided.

There are also other strategies that are being investigated to deal with the proliferation of deepfakes. For instance, checking videos for concordance with other sources of information is one strategy. Searches can be done for video of events potentially taken from other angles, or background details of the video (like weather patterns and locations) can be checked for incongruities. Beyond this, a Blockchain online ledger system could register videos when they are initially created, holding their original audio and images so that derivative videos can always be checked for manipulation.

Ultimately, it’s important that reliable methods of detecting deepfakes are created and that these detection methods keep up with the newest advances in deepfake technology. While it is hard to know exactly what the effects of deepfakes will be, if there are not reliable methods of detecting deepfakes (and other forms of fake media), misinformation could potentially run rampant and degrade people’s trust in society and institutions.

Implications of Deepfakes

What are the dangers of allowing deep fake to proliferate unchecked?

One of the biggest problems that deepfakes create currently is nonconsensual pornography, engineered by combining people’s faces with pornographic videos and images. AI ethicists are worried that deepfakes will see more use in the creation of fake revenge porn. Beyond this, deepfakes could be used to bully and damage the reputation of just about anyone, as they could be used to place people into controversial and compromising scenarios.

Companies and cybersecurity specialists have expressed concern about the use of deepfakes to facilitate scams, fraud, and extortion. Allegedly, deepfake audio has been used to convince employees of a company to transfer money to scammers

It’s possible that deepfakes could have harmful effects even beyond those listed above. Deepfakes could potentially erode people’s trust in media generally, and make it difficult for people to distinguish between real news and fake news. If many videos on the web are fake, it becomes easier for governments, companies, and other entities to cast doubt on legitimate controversies and unethical practices.

When it comes to governments, deepfakes may even pose threats to the operation of democracy. Democracy requires that citizens are able to make informed decisions about politicians based on reliable information. Misinformation undermines democratic processes. For example, the president of Gabon, Ali Bongo, appeared in a video attempting to reassure the Gabon citizenry. The president was assumed to be unwell for long a long period of time, and his sudden appearance in a likely fake video kicked off an attempted coup. President Donald Trump claimed that an audio recording of him bragging about grabbing women by the genitals was fake, despite also describing it as “locker room talk”. Prince Andrew also claimed that an image provided by Emily Maitilis’ attorney was fake, though the attorney insisted on its authenticity.

Ultimately, while there are legitimate uses for deepfake technology, there are many potential harms that can arise from the misuse of that technology. For that reason, it’s extremely important that methods to determine the authenticity of media be created and maintained.

Spread the love
Continue Reading

AI 101

What is Federated Learning?

mm

Published

on

The traditional method of training AI models involves setting up servers where models are trained on data, often through the use of a cloud-based computing platform. However, over the past few years an alternative form of model creation has arisen, called federated learning. Federated learning brings machine learning models to the data source, rather than bringing the data to the model. Federated learning links together multiple computational devices into a decentralized system that allows the individual devices that collect data to assist in training the model.

In a federated learning system, the various devices that are part of the learning network each have a copy of the model on the device. The different devices/clients train their own copy of the model using the client’s local data, and then the parameters/weights from the individual models are sent to a master device, or server, that aggregates the parameters and updates the global model. This training process can then be repeated until a desired level of accuracy is attained. In short, the idea behind federated learning is that none of the training data is ever transmitted between devices or between parties, only the updates related to the model are.

Federated learning can be broken down into three different steps or phases. Federated learning typically starts with a generic model that acts as a baseline and is trained on a central server. In the first step, this generic model is sent out to the application’s clients. These local copies are then trained on data generated by the client systems, learning and improving their performance.

In the second step, the clients all send their learned model parameters to the central server. This happens periodically, on a set schedule.

In the third step, the server aggregates the learned parameters when it receives them. After the parameters are aggregated, the central model is updated and shared once more with the clients. The entire process then repeats.

The benefit of having a copy of the model on the various devices is that network latencies are reduced or eliminated. The costs associated with sharing data with the server is eliminated as well. Other benefits of federate learning methods include the fact that federated learning models are privacy preserved, and model responses are personalized for the user of the device.

Examples of federated learning models include recommendation engines, fraud detection models, and medical models. Media recommendation engines, of the type used by Netflix or Amazon, could be trained on data gathered from thousands of users. The client devices would train their own separate models and the central model would learn to make better predictions, even though the individual data points would be unique to the different users. Similarly, fraud detection models used by banks can be trained on patterns of activity from many different devices, and a handful of different banks could collaborate to train a common model. In terms of a medical federated learning model, multiple hospitals could team up to train a common model that could recognize potential tumors through medical scans.

Types of Federated Learning

Federated learning schemas typically fall into one of two different classes: multi-party systems and single-party systems. Single-party federated learning systems are called “single-party” because only a single entity is responsible for overseeing the capture and flow of data across all of the client devices in the learning network. The models that exist on the client devices are trained on data with the same structure, though the data points are typically unique to the various users and devices.

In contrast to single-party systems, multi-party systems are managed by two or more entities. These entities cooperate to train a shared model by utilizing the various devices and datasets they have access to. The parameters and data structures are typically similar across the devices belonging to the multiple entities, but they don’t have to be exactly the same. Instead, pre-processing is done to standardize the inputs of the model. A neutral entity might be employed to aggregate the weights established by the devices unique to the different entities.

Common Technologies and Frameworks for Federated Learning

Popular frameworks used for federated learning include Tensorflow Federated, Federated AI Technology Enabler (FATE), and PySyft. PySyft is an open-source federated learning library based on the deep learning library PyTorch. PySyft is intended to ensure private, secure deep learning across servers and agents using encrypted computation. Meanwhile, Tensorflow Federated is another open-source framework built on Google’s Tensorflow platform. In addition to enabling users to create their own algorithms, Tensorflow Federated allows users to simulate a number of included federated learning algorithms on their own models and data. Finally, FATE is also open-source framework designed by Webank AI, and it’s intended to provide the Federated AI ecosystem with a secure computing framework.

Federated Learning Challenges

As federated learning is still fairly nascent, a number of challenges still have to be negotiated in order for it to achieve its full potential. The training capabilities of edge devices, data labeling and standardization, and model convergence are potential roadblocks for federated learning approaches.

The computational abilities of the edge devices, when it comes to local training, need to be considered when designing federated learning approaches. While most smartphones, tablets, and other IoT compatible devices are capable of training machine learning models, this typically hampers the performance of the device. Compromises will have to be made between model accuracy and device performance.

Labeling and standardizing data is another challenge that federated learning systems must overcome. Supervised learning models require training data that is clearly and consistently labeled, which can be difficult to do across the many client devices that are part of the system. For this reason, it’s important to develop model data pipelines that automatically apply labels in a standardized way based on events and user actions.

Model convergence time is another challenge for federated learning, as federated learning models typically take longer to converge than locally trained models. The number of devices involved in the training adds an element of unpredictability to the model training, as connection issues, irregular updates, and even different application use times can contribute to increased convergence time and decreased reliability. For this reason, federated learning solutions are typically most useful when they provide meaningful advantages over centrally training a model, such as instances where datasets are extremely large and distributed.

Spread the love
Continue Reading