In the past few years, Artificial Intelligence (AI) and Machine Learning (ML) have witnessed a meteoric rise in popularity and applications, not only in the industry but also in academia. However, today's ML and AI models have one major limitation: they require an immense amount of computing and processing power to achieve the desired results and accuracy. This often confines their use to high-capability devices with substantial computing power.
But given the advancements made in embedded system technology, and substantial development in the Internet of Things industry, it is desirable to incorporate ML techniques & concepts into resource-constrained embedded systems for ubiquitous intelligence. The desire to bring ML concepts into embedded & IoT systems is the primary motivating factor behind the development of TinyML, an embedded ML technique that enables ML models & applications on resource-constrained, power-constrained, and cheap devices.
However, implementing ML on resource-constrained devices has not been simple: devices with low computing power present their own challenges in terms of optimization, processing capacity, reliability, maintenance of models, and a lot more.
In this article, we will take a deeper dive into TinyML and learn more about its background, the tools supporting it, and its applications using advanced technologies. So let’s start.
An Introduction to TinyML : Why the World Needs TinyML
Internet of Things or IoT devices aim to leverage edge computing, a computing paradigm that refers to a range of devices & networks near the user to enable seamless and real-time processing of data from millions of sensors & devices interconnected to one another. One of the major advantages of IoT devices is that they require low computing & processing power as they are deployable at the network edge, and hence they have a low memory footprint.
Furthermore, IoT devices rely heavily on edge platforms to collect & transmit data: these edge devices gather sensory data and then transmit it either to a nearby location or to cloud platforms for processing. Edge computing technology stores the data & performs computing on it, and also provides the necessary infrastructure to support distributed computing.
The implementation of edge computing in IoT devices provides:
- Effective security, privacy, and reliability to the end-users.
- Lower delay.
- Higher availability and throughput for applications & services.
Furthermore, because edge devices can deploy a collaborative technique between the sensors and the cloud, data processing can be conducted at the network edge instead of on the cloud platform. This can result in effective data management, data persistence, effective delivery, and content caching. Additionally, for IoT applications that deal with H2M or Human to Machine interaction and modern healthcare, edge computing provides a way to improve the network services significantly.
Recent research in the field of IoT edge computing has demonstrated the potential to implement Machine Learning techniques in several IoT use cases. However, the major issue is that traditional machine learning models often require strong computing & processing power and high memory capacity, which limits the implementation of ML models in IoT devices & applications.
Furthermore, edge computing technology today lacks high transmission capacity and effective power savings, which leads to heterogeneous systems. This is the main reason a harmonious & holistic infrastructure is required, mainly for updating, training, and deploying ML models. The architecture designed for embedded devices poses another challenge, as these architectures depend on hardware & software requirements that vary from device to device. That is the major reason why it’s difficult to build a standard ML architecture for IoT networks.
Also, in the current scenario, the data generated by different devices is sent to cloud platforms for processing because of the computationally intensive nature of network implementations. Furthermore, ML models are often dependent on Deep Learning, Deep Neural Networks, Application Specific Integrated Circuits (ASICs) and Graphics Processing Units (GPUs) for processing the data, and they often have a higher power & memory requirement. Deploying full-fledged ML models on IoT devices is not a viable solution because of the evident lack of computing & processing power, and the limited storage available.
The demand to miniaturize low power embedded devices coupled with optimizing ML models to make them more power & memory efficient has paved the way for TinyML that aims to implement ML models & practices on edge IoT devices & framework. TinyML enables signal processing on IoT devices and provides embedded intelligence, thus eliminating the need to transfer data to cloud platforms for processing. Successful implementation of TinyML on IoT devices can ultimately result in increased privacy, and efficiency while reducing the operating costs. Additionally, what makes TinyML more appealing is that in case of inadequate connectivity, it can provide on-premise analytics.
TinyML : Introduction and Overview
TinyML is a machine learning tool that has the capability to perform on-device analytics for different sensing modalities like audio, vision, and speech. ML models built on the TinyML tool have low power, memory, and computing requirements, which makes them suitable for embedded networks and devices that operate on battery power. Additionally, TinyML’s low requirements make it an ideal fit for deploying ML models on the IoT framework.
In the current scenario, cloud-based ML systems face a few difficulties including security & privacy concerns, high power consumption, dependability, and latency problems, which is why models are pre-installed on hardware-software platforms. Sensors gather data that simulates the physical world, and this data is then processed using a CPU or MPU (microprocessing unit). The MPU caters to the needs of ML analytic support enabled by edge-aware ML networks and architecture. The edge ML architecture communicates with the ML cloud for the transfer of data, and the implementation of TinyML can advance this technology significantly.
It would be safe to say that TinyML is an amalgamation of software, hardware, and algorithms that work in sync with each other to deliver the desired performance. Analog or memory computing might be required to provide a better & effective learning experience for hardware & IoT devices that do not support hardware accelerators. As far as software is concerned, the applications built using TinyML can be deployed & implemented over platforms like Linux or embedded Linux, and over cloud-enabled software. Finally, applications & systems built on TinyML must be supported by new algorithms that produce models small enough to avoid high memory consumption.
To sum things up, applications built using the TinyML tool must optimize ML principles & methods along with designing the software compactly, in the presence of high-quality data. This data then must be flashed through binary files that are generated using models that are trained on machines with much larger capacity, and computing power.
Additionally, systems & applications running on the TinyML tool must provide high accuracy under tight constraints, because compact software is needed to keep power consumption small enough for TinyML deployments. Furthermore, TinyML applications or modules may depend on battery power to support their operations on edge embedded systems.
With that being said, TinyML applications have two fundamental requirements:
- Ability to scale to billions of cheap embedded systems.
- Storing the code on the device RAM with capacity under a few KBs.
Applications of TinyML Using Advanced Technologies
One of the major reasons why TinyML is a hot topic in the AI & ML industry is its range of potential applications, including vision & speech based applications, health diagnosis, data pattern compression & classification, brain-computer interface, edge computing, phenomics, self-driving cars, and more.
Speech Based Applications
Typically, speech based applications rely on conventional communication methods in which all the data is treated as important and is transmitted. However, in recent years, semantic communication has emerged as an alternative to conventional communication, because in semantic communication only the meaning or context of the data is transmitted. Semantic communication can be implemented across speech based applications using TinyML methodologies.
Some of the most popular applications in the speech communications industry today are speech detection, speech recognition, online learning, online teaching, and goal-oriented communication. These applications typically have a higher power consumption, and they also have high data requirements on the host device. To overcome these requirements, a new TinySpeech library has been introduced that allows developers to build a low computational architecture that uses deep convolutional networks with a low storage footprint.
To use TinyML for speech enhancement, developers first addressed the size of the speech enhancement model because it was subject to hardware limitations & constraints. To tackle the issue, structured pruning and integer quantization were applied to an RNN or Recurrent Neural Network speech enhancement model. The results suggested that the size of the model was reduced by almost 12x and the number of operations by almost 3x. Additionally, it's vital that resources are utilized effectively, especially on resource constrained devices that execute voice-recognition applications.
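To make the integer quantization step mentioned above more concrete, here is a minimal sketch of affine 8-bit quantization applied to a handful of weights. It is not the method from the cited work: the function names, the sample weights, and the choice of a per-tensor scale and zero point are all assumptions made for illustration. The point it shows is that each weight is stored in one byte instead of the four bytes of a 32-bit float, at the cost of a small rounding error.

```python
# Illustrative affine int8 quantization of one weight vector.
# All names and values are hypothetical, not taken from the cited work.

def quantize_int8(weights):
    """Map floats to int8 codes using a scale and a zero point."""
    # Include 0.0 in the range so that zero is represented exactly.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any positive scale works
    zero_point = round(-128 - w_min / scale)
    codes = [max(-128, min(127, round(w / scale) + zero_point))
             for w in weights]
    return codes, scale, zero_point


def dequantize_int8(codes, scale, zero_point):
    """Recover approximate float values from the int8 codes."""
    return [(q - zero_point) * scale for q in codes]


weights = [-0.51, -0.12, 0.0, 0.07, 0.33, 0.76]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)

# Worst-case rounding error introduced by quantization.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, round(max_error, 5))
```

Quantization alone gives roughly a 4x reduction in storage for float32 weights; larger reductions such as the one reported above also rely on pruning away parts of the network.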
As a result, to partition the process, a co-design method was proposed for TinyML based voice and speech recognition applications. The developers used a windowing operation to partition the work between software & hardware so that the raw voice data could be pre-processed. The method seemed to work, as the results indicated a decrease in energy consumption on the hardware. Finally, there’s also potential to implement optimized partitioning in software & hardware co-design for better performance in the near future.
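As a rough illustration of what a windowing operation on raw voice data looks like, the sketch below splits a signal into overlapping frames and applies a Hamming window to each one. The frame length (25 ms), hop length (10 ms), sample rate (16 kHz), and the function name are assumptions chosen because they are common in speech front ends; the source does not state the parameters used in the co-design work.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames and apply a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
samples = [math.sin(2 * math.pi * 440 * t / sample_rate)
           for t in range(sample_rate)]

# 25 ms frames (400 samples) with a 10 ms hop (160 samples).
frames = frame_signal(samples, frame_len=400, hop_len=160)
print(len(frames), len(frames[0]))
```

Because every frame has a fixed, small size, this stage is a natural candidate for a hardware block, while later stages that consume the frames can stay in software.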
Furthermore, recent research has proposed the use of a phone-based transducer for speech recognition systems, and the proposal aims to replace LSTM predictors with a Conv1D layer to reduce the computation needs on edge devices. When implemented, the proposal returned positive results: SVD or Singular Value Decomposition compressed the model successfully, whereas the use of WFST or Weighted Finite State Transducers based decoding resulted in more flexibility in model improvement bias.
A lot of prominent applications of speech recognition like virtual or voice assistants, live captioning, and voice commands use ML techniques to work. Popular voice assistants like Siri and the Google Assistant currently ping the cloud platform every time they receive some data, which creates significant concerns related to privacy & data security. TinyML is a viable solution to the issue as it aims to perform speech recognition on devices, and eliminate the need to migrate data to cloud platforms. One of the ways to achieve on-device speech recognition is to use Tiny Transducer, a speech recognition model that uses a DFSMN or Deep Feed-Forward Sequential Memory Block layer coupled with one Conv1D layer instead of the LSTM layers to bring down the computation requirements and the number of network parameters.
Hearing loss is a major health concern across the globe. Humans’ ability to hear sounds generally weakens as they age, and it’s a major problem in countries dealing with an aging population, including China, Japan, and South Korea. Hearing aid devices right now work on the simple principle of amplifying all the input sounds from the surroundings, which makes it difficult for the person to pick out the desired sound, especially in a noisy environment.
TinyML might be a viable solution for this issue, as a TinyLSTM model that uses a speech recognition algorithm for hearing aid devices can help users distinguish between different sounds.
Vision Based Applications
TinyML has the potential to play a crucial role in processing computer vision based datasets because, for faster outputs, these data sets need to be processed on the edge platform itself. Work toward this goal has examined the practical challenges faced while training a model for the OpenMV H7 microcontroller board. The developers also proposed an architecture to detect American Sign Language with the help of an ARM Cortex M7 microcontroller that works with only 496 KB of frame-buffer RAM.
The implementation of TinyML for computer vision based applications on edge platforms required developers to overcome a major challenge: a CNN or Convolutional Neural Network with high training & testing accuracy but a high generalization error. The implementation did not generalize effectively to images within new use cases or to backgrounds with noise. When the developers used the interpolation augmentation method, the model returned an accuracy score of over 98% on test data, and about 75% in generalization.
Furthermore, it was observed that when the developers used the interpolation augmentation method, there was a drop in the model’s accuracy during quantization, but at the same time, there was also a boost in the model’s inference speed and classification generalization. The developers also proposed a method to further boost generalization accuracy by training the model on data obtained from a variety of different sources, and testing its performance to explore the possibility of deploying it on edge platforms like portable smart watches.
Furthermore, additional studies on CNNs indicated that it’s possible to deploy a CNN architecture on devices with limited resources & achieve desirable results. Recently, developers were able to develop a framework for the detection of medical face masks on an ARM Cortex M7 microcontroller with limited resources using TensorFlow Lite with a minimal memory footprint. The model size post quantization was about 138 KB, whereas the inference speed on the target board was about 30 FPS.
Another application of TinyML for computer vision based application is to implement a gesture recognition device that can be clamped to a cane for helping visually impaired people navigate through their daily lives easily. To design it, the developers used the gestures data set, and used the data set to train the ProtoNN model with a classification algorithm. The results obtained from the setup were accurate, the design was low-cost, and it delivered satisfactory results.
Another significant application of TinyML is in the self-driving, and autonomous vehicles industry because of the lack of resources, and on-board computation power. To tackle the issue, developers introduced a closed loop learning method built on the TinyCNN model that proposed an online predictor model that captures the image at the run-time. The major issue that developers faced when implementing TinyML for autonomous driving was that the decision model that was trained to work on offline data may not work equally well when dealing with online data. To fully maximize the applications of autonomous cars and self-driving cars, the model should ideally be able to adapt to the real-time data.
Data Pattern Classification and Compression
One of the biggest challenges for the current TinyML framework is adapting to online training data. To tackle the issue, developers have proposed a method known as TinyOL or TinyML Online Learning that allows training with incremental online learning on microcontroller units, thus allowing the model to update on IoT edge devices. The implementation was achieved using the C++ programming language, and an additional layer was added to the TinyOL architecture.
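The idea of incremental online learning can be sketched as an extra output layer that is updated one sample at a time as data streams in, so no training set has to be stored on the device. The sketch below is an assumption-laden illustration in Python rather than the C++ implementation described above: the class name, the squared-error loss, the learning rate, and the toy data stream are all hypothetical.

```python
class OnlineLinearLayer:
    """A single linear layer trained with one SGD step per incoming sample."""

    def __init__(self, n_inputs, n_outputs, learning_rate=0.05):
        self.weights = [[0.0] * n_inputs for _ in range(n_outputs)]
        self.biases = [0.0] * n_outputs
        self.learning_rate = learning_rate

    def predict(self, features):
        return [sum(w * x for w, x in zip(row, features)) + b
                for row, b in zip(self.weights, self.biases)]

    def update(self, features, target):
        """Apply one gradient step and return the loss before the update."""
        outputs = self.predict(features)
        for k, (out, tgt) in enumerate(zip(outputs, target)):
            error = out - tgt
            for j, x in enumerate(features):
                self.weights[k][j] -= self.learning_rate * error * x
            self.biases[k] -= self.learning_rate * error
        return sum((o - t) ** 2 for o, t in zip(outputs, target))


# Hypothetical stream: feature vectors (for example, outputs of a frozen,
# pre-trained network) paired with the value the new layer should learn.
stream = [([1.0, 0.0], [1.0]), ([0.0, 1.0], [-1.0])] * 200

layer = OnlineLinearLayer(n_inputs=2, n_outputs=1)
losses = [layer.update(x, y) for x, y in stream]
print(losses[0], losses[-1])
```

Each sample is used once and then discarded, which keeps the memory cost constant regardless of how long the stream runs.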
Furthermore, developers also deployed an autoencoder on the Arduino Nano 33 BLE sensor board, and the trained model was able to classify new data patterns. The development work also included designing more efficient & optimized algorithms for the neural networks to support online training on the device.
Research in TinyOL and TinyML has indicated that the number of activation layers has been a major issue for IoT edge devices that have constrained resources. To tackle the issue, developers introduced the new TinyTL or Tiny Transfer Learning model to make the utilization of memory on IoT edge devices much more effective, and to avoid the use of intermediate layers for activation purposes. Additionally, developers also introduced an all new bias module known as the “lite-residual module” to maximize the adaptation capabilities, in turn allowing feature extractors to discover residual feature maps.
When compared with full network fine-tuning, the results were in favor of the TinyTL architecture, as TinyTL reduced the memory overhead by about 6.5 times with moderate accuracy loss. When compared with fine-tuning only the last layer, TinyTL improved the accuracy by 34% at the cost of a small memory overhead.
Furthermore, research on data compression has indicated that data compression algorithms must manage the collected data on the portable device itself, and to achieve this, the developers proposed TAC or Tiny Anomaly Compressor. TAC outperformed both the SDT or Swing Door Trending and the DCT or Discrete Cosine Transform algorithms, achieving a maximum compression rate of over 98% and the best peak signal-to-noise ratio of the three algorithms.
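The two metrics used in that comparison can be computed as in the sketch below. The definitions are assumptions, since the source does not spell them out: compression rate is taken as the percentage of samples that do not need to be stored, and peak signal-to-noise ratio uses the range of the original signal as the peak value. The sample values are made up for illustration.

```python
import math


def compression_rate(n_original, n_kept):
    """Percentage of samples discarded by the compressor (assumed definition)."""
    return 100.0 * (n_original - n_kept) / n_original


def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB, using the signal range as the peak."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # perfect reconstruction
    peak = max(original) - min(original)
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical numbers: 1000 sensor readings reduced to 20 stored points.
rate = compression_rate(n_original=1000, n_kept=20)

# Hypothetical short signal and its reconstruction after decompression.
original = [0.0, 10.0, 20.0, 30.0]
reconstructed = [0.0, 11.0, 19.0, 30.0]
quality_db = psnr(original, reconstructed)
print(rate, round(quality_db, 2))
```

A good compressor pushes both numbers up at once: a high compression rate means little data is stored or transmitted, and a high PSNR means the reconstructed signal stays close to the original.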
The Covid-19 global pandemic opened new doors of opportunity for the implementation of TinyML, as it’s now an essential practice to continuously detect respiratory symptoms related to cough and cold. To ensure uninterrupted monitoring, developers have proposed a CNN model, Tiny RespNet, that operates in a multi-modal setting. The model is deployed on a Xilinx Artix-7 100t FPGA, which allows the device to process the information in parallel with high efficiency and low power consumption. Additionally, the Tiny RespNet model takes patients’ speech, audio recordings, and demographic information as input, and the cough-related symptoms of a patient are classified using three distinct datasets.
Furthermore, developers have also proposed a model capable of running deep learning computations on edge devices, a TinyML model named TinyDL. The TinyDL model can be deployed on edge devices like smartwatches and wearables for health diagnosis, and is also capable of carrying out performance analysis to reduce bandwidth, latency, and energy consumption. To achieve the deployment of TinyDL on handheld devices, an LSTM model was designed and trained specifically for a wearable device, and it was fed collected data as the input. The model has an accuracy score of about 75 to 80%, and it was able to work with off-device data as well. These models running on edge devices showed the potential to resolve the current challenges faced by IoT devices.
Finally, developers have also proposed another application to monitor the health of elderly people by estimating & analyzing their body poses. The model uses the agnostic framework on the device that allows the model to enable validation, and rapid fostering to perform adaptations. The model implemented body pose detection algorithms coupled with facial landmarks to detect spatiotemporal body poses in real time.
One of the major applications of TinyML is in the field of edge computing. With the increase in the use of IoT devices to connect devices across the world, it’s essential to set up edge devices, as they will help in reducing the load on cloud architectures. These edge devices will feature individual data centers that will allow them to carry out high-level computing on the device itself, rather than relying on the cloud architecture. As a result, they will help reduce the dependency on the cloud, reduce latency, enhance user security & privacy, and also reduce bandwidth usage.
Edge devices using TinyML algorithms will help in resolving the current constraints related to power, computing, and memory requirements, as discussed in the image below.
Furthermore, TinyML can also enhance the use and application of Unmanned Aerial Vehicles or UAVs by addressing the current limitations faced by these machines. The use of TinyML can allow developers to implement an energy-efficient device with low latency, and high computing power that can act as a controller for these UAVs.
Brain-Computer Interface or BCI
TinyML has significant applications in the healthcare industry, and it can prove to be highly beneficial in different areas including cancer & tumor detection, health predictions using ECG & EEG signals, and emotional intelligence. The use of TinyML can allow Adaptive Deep Brain Stimulation or aDBS to adapt successfully to clinical needs. The use of TinyML can also allow aDBS to identify disease-related biomarkers & their symptoms using invasive recordings of the brain signals.
Furthermore, the healthcare industry often involves the collection of a large amount of patient data, and this data then needs to be processed to reach specific solutions for the treatment of a patient in the early stages of a disease. As a result, it's vital to build a system that is not only highly effective, but also highly secure. When we combine IoT applications with the TinyML model, a new field is born named H-IoT or Healthcare Internet of Things, and the major applications of H-IoT are diagnosis, monitoring, logistics, spread control, and assistive systems. If we want to develop devices that are capable of detecting & analyzing a patient’s health remotely, it’s essential to develop a system that has global accessibility and low latency.
Finally, TinyML can have widespread applications in the autonomous vehicles industry, as these vehicles can be utilized in different ways including human tracking, military purposes, and industrial applications. These vehicles have a primary requirement of being able to identify objects efficiently when the object is being searched for.
As of now, autonomous driving is a fairly complex task, especially when developing mini or small sized vehicles. Recent developments have shown potential to improve the application of autonomous driving for mini vehicles by using a CNN architecture and deploying the model on the GAP8 MCU.
TinyML is a relatively newer concept in the AI & ML industry, and despite the progress, it's still not as effective as we need it for mass deployment for edge & IoT devices.
The biggest challenge currently faced by TinyML devices is the power consumption of these devices. Ideally, embedded edge & IoT devices are expected to have a battery life that extends over 10 years. For example, in ideal conditions, an IoT device running on a 2 Ah battery is supposed to have a battery life of over 10 years, given that the current draw of the device is about 12 µA. However, for an IoT architecture with a temperature sensor, an MCU, and a WiFi module, the current consumption stands at about 176.4 mA, and at this current draw the battery will last for only about 11 hours instead of the required 10 years.
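The battery figures above follow from dividing capacity by average current draw. The short sketch below reproduces that arithmetic; it is an idealized estimate that ignores self-discharge, voltage drop, and duty cycling, and the function name is only for illustration.

```python
HOURS_PER_YEAR = 24 * 365


def battery_life_hours(capacity_ah, current_a):
    """Idealized battery life: capacity (Ah) divided by average current (A)."""
    return capacity_ah / current_a


# Ideal case from the text: 2 Ah battery, about 12 microamps of current draw.
ideal_hours = battery_life_hours(2.0, 12e-6)
ideal_years = ideal_hours / HOURS_PER_YEAR

# Sensor + MCU + WiFi case from the text: about 176.4 milliamps.
wifi_hours = battery_life_hours(2.0, 176.4e-3)

print(round(ideal_years, 1), "years vs", round(wifi_hours, 1), "hours")
```

At 12 µA the battery lasts roughly 19 years, comfortably above the 10-year target, while at 176.4 mA it lasts a little over 11 hours, a gap of more than four orders of magnitude.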
To maintain an algorithm’s consistency, it's vital to maintain power availability, and given the current scenario, the limited power availability to TinyML devices is a critical challenge. Furthermore, memory limitations are also a significant challenge as deploying models often requires a high amount of memory to work effectively, and accurately.
Hardware constraints make deploying TinyML algorithms on a wide scale difficult because of the heterogeneity of hardware devices. There are thousands of devices, each with its own hardware specifications & requirements, and as a result, a TinyML algorithm currently needs to be tweaked for every individual device, which makes mass deployment a major issue.
Data Set Constraints
One of the major issues with TinyML models is that they do not support the existing data sets. It is a challenge for all edge devices as they collect data using external sensors, and these devices often have power & energy constraints. Therefore, the existing data sets cannot be used to train the TinyML models effectively.
The development of ML techniques has caused a revolution & a shift in perspective in the IoT ecosystem. The integration of ML models in IoT devices will allow these edge devices to make intelligent decisions on their own without any external human input. However, conventional ML models often have high power, memory, and computing requirements, which makes them unfit for deployment on edge devices that are often resource constrained.
As a result, a new branch of AI was dedicated to the use of ML for IoT devices, and it was termed TinyML. TinyML is an ML framework that allows even resource constrained devices to harness the power of AI & ML to ensure higher accuracy, intelligence, and efficiency.
In this article, we have talked about the implementation of TinyML models on resource-constrained IoT devices, and this implementation requires training the models, deploying the models on the hardware, and performing quantization techniques. However, given the current scope, the ML models ready to be deployed on IoT and edge devices have several complexities and constraints, including hardware and framework compatibility issues.