
edge ai (6)

For object detection projects, labeling your images with their corresponding bounding boxes and names is a tedious and time-consuming task, often requiring a human to label each image by hand. The Edge Impulse Studio has already dramatically decreased the amount of time it takes to get from raw images to a fully labeled dataset with the Data Acquisition Labeling Queue feature directly in your web browser. To make this process even faster, the Edge Impulse Studio is getting a new feature: AI-Assisted Labeling.

Automatically label common objects with YOLOv5.


To get started, create a “Classify multiple objects” images project via the Edge Impulse Studio new project wizard or open your existing object detection project. Upload your object detection images to your Edge Impulse project’s training and testing sets. Then, from the Data Acquisition tab, select “Labeling queue.” 


1. Using YOLOv5

By utilizing an existing library of pre-trained object detection models from YOLOv5 (trained with the COCO dataset), common objects in your images can quickly be identified and labeled in seconds without needing to write any code!

To label your objects with YOLOv5 classification, click the Label suggestions dropdown and select “Classify using YOLOv5.” If your object is more specific than what is auto-labeled by YOLOv5, e.g. “coffee” instead of the generic “cup” class, you can modify the auto-labels to the left of your image. These modifications will automatically apply to future images in your labeling queue.



Click Save labels to move on to your next raw image, and see your fully labeled dataset ready for training in minutes!


2. Using your own model

You can also use your own trained model to predict and label your new images. From an existing (trained) Edge Impulse object detection project, upload new unlabeled images from the Data Acquisition tab. Then, from the “Labeling queue”, click the Label suggestions dropdown and select “Classify using <your project name>”:



You can also upload a few samples to a new object detection project, train a model, then upload more samples to the Data Acquisition tab and use the AI-Assisted Labeling feature for the rest of your dataset. Classifying using your own trained model is especially useful for objects that are not in YOLOv5, such as industrial objects, etc.

Click Save labels to move on to your next raw image, and see your fully labeled dataset ready for training in minutes using your own pre-trained model!


3. Using object tracking

If you have objects that are a similar size or common between images, you can also track your objects between frames within the Edge Impulse Labeling Queue, reducing the amount of time needed to re-label and re-draw bounding boxes over your entire dataset.

Draw your bounding boxes and label your images, then, after clicking Save labels, the objects will be tracked from frame to frame:

Track and auto-label your objects between frames.
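The post does not say how the Studio tracks objects internally, so the following is only an illustrative sketch of one common way to carry labels from one frame to the next: match each labeled box to the candidate box it overlaps most, measured by intersection over union (IoU). The function names, the `(x, y, width, height)` box layout and the 0.5 threshold are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def carry_labels(previous: List[Tuple[str, Box]], candidates: List[Box],
                 min_iou: float = 0.5) -> List[Tuple[str, Box]]:
    """Assign each previous label to its best-overlapping unused candidate box."""
    carried = []
    used = set()
    for label, old_box in previous:
        best_index, best_score = None, min_iou
        for index, new_box in enumerate(candidates):
            score = iou(old_box, new_box)
            if index not in used and score >= best_score:
                best_index, best_score = index, score
        if best_index is not None:
            used.add(best_index)
            carried.append((label, candidates[best_index]))
    return carried
```

A box that moved only a few pixels between frames keeps its label, while a box with no good match is simply left for a human to label.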


Now that your object detection project contains a fully labeled dataset, learn how to train and deploy your model to your edge device: check out our tutorial!


Originally posted on the Edge Impulse blog by Jenny Plunkett - Senior Developer Relations Engineer.


The head is surely the most complex group of organs in the human body, but also the most delicate. The assessment and prevention of risks in the workplace remains the first priority approach to avoid accidents or reduce the number of serious injuries to the head. This is why wearing a hard hat in an industrial working environment is often required by law and helps to avoid serious accidents.

This article gives you an overview of how to check that all workers are wearing a helmet, using a machine learning object detection model.

For this project, we used:

  • Edge Impulse Studio to acquire some custom data, visualize the data, train the machine learning model and validate the inference results.
  • Part of this public dataset from Roboflow, from which the images containing the smallest bounding boxes have been removed.
  • Part of Flickr-Faces-HQ (FFHQ) (under a Creative Commons BY 2.0 license) to rebalance the classes in our dataset.
  • Google Colab to convert the public dataset from the YOLOv5 PyTorch format to the Edge Impulse ingestion format.
  • A Raspberry Pi, an NVIDIA Jetson Nano or any Intel-based MacBook to deploy the inference model.

Before we get started, here are some insights into the benefits and drawbacks of using a public dataset versus collecting your own.

Using a public dataset is a good way to start developing your application quickly, validate your idea and check the first results. But the results are often disappointing when you test on your own data and in real conditions. As such, for very specific applications, you might spend much more time trying to tweak an open dataset than you would collecting your own. Also, remember to always make sure that the license suits your needs when using a dataset you found online.

On the other hand, collecting your own dataset can take a lot of time; it is a repetitive and often tedious task. But it lets you collect data that is as close as possible to your real-life application, for example with the same lighting conditions, the same camera or the same angle. Therefore, your accuracy in real conditions will typically be much higher.

Using only custom data can indeed work well in your environment but it might not give the same accuracy in another environment, thus generalization is harder.

The dataset which has been used for this project is a mix of open data, supplemented by custom data.

First iteration, using only the public datasets

At first, we tried to train our model using only a small portion of this public dataset: 176 items in the training set and 57 items in the test set. We kept only images containing a bounding box bigger than 130 pixels, for reasons explained later.
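The import script itself is not shown in this article, so here is a minimal sketch of what the conversion and size filter could look like. It assumes the usual YOLO label layout (one `class cx cy w h` line per box, normalized to the image size) and assumes that an image is dropped when any of its boxes has a side shorter than 130 pixels; the exact rule used by the authors may differ. The function names and dictionary keys are illustrative, and the final file layout expected by Edge Impulse should be checked against its documentation.

```python
from typing import Dict, List

MIN_SIDE_PX = 130  # threshold quoted in the article; how it was applied is assumed


def yolo_to_pixel_boxes(label_text: str, image_width: int, image_height: int,
                        class_names: List[str]) -> List[Dict]:
    """Convert normalized YOLO lines into top-left pixel boxes."""
    boxes = []
    for line in label_text.strip().splitlines():
        class_id, cx, cy, w, h = line.split()
        width = float(w) * image_width
        height = float(h) * image_height
        boxes.append({
            "label": class_names[int(class_id)],
            # YOLO stores the box center; shift by half the size to get the corner.
            "x": round(float(cx) * image_width - width / 2),
            "y": round(float(cy) * image_height - height / 2),
            "width": round(width),
            "height": round(height),
        })
    return boxes


def keep_image(boxes: List[Dict], min_side: int = MIN_SIDE_PX) -> bool:
    """Keep an image only if it has boxes and none of them is too small."""
    return bool(boxes) and all(
        min(box["width"], box["height"]) >= min_side for box in boxes
    )
```

For a 640x480 image, the label line `0 0.5 0.5 0.5 0.5` becomes a 320x240 box whose top-left corner is at (160, 120), which passes the filter.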


If you go through the public dataset, you can see that it is badly short of “head” samples. The dataset is therefore considered imbalanced.
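One quick way to confirm an imbalance like this is to count how many boxes carry each label. The sketch below assumes the labels are held in a dictionary that maps a file name to a list of boxes with a `label` key; that structure and the function names are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def class_counts(boxes_per_image: Dict[str, List[dict]]) -> Counter:
    """Count bounding boxes per label across the whole dataset."""
    counts = Counter()
    for boxes in boxes_per_image.values():
        counts.update(box["label"] for box in boxes)
    return counts


def imbalance_ratio(counts: Counter) -> float:
    """Ratio between the most and least frequent labels (1.0 means balanced)."""
    if not counts:
        raise ValueError("no labels to compare")
    return max(counts.values()) / min(counts.values())
```

A ratio far above 1.0 suggests adding samples for the rarer class, which is what the next step does for “head”.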

Several techniques exist to rebalance a dataset. Here, we add new images from Flickr-Faces-HQ (FFHQ). These images do not have bounding boxes, but drawing them is easy in the Edge Impulse Studio. You can import them directly using the uploader portal. Once your data has been uploaded, just draw boxes around the heads and give each one a label, as below:


Now that the dataset is more balanced, with both images and bounding boxes of hard hats and heads, we can create an impulse, which is a mix of digital signal processing (DSP) blocks and training blocks:


In this particular object detection use case, the DSP block will resize an image to fit the 320x320 pixels needed for the training block and extract meaningful features for the Neural Network. Although the extracted features don’t show a clear separation between the classes, we can start distinguishing some clusters:


To train the model, we selected the Object Detection training block, which fine-tunes a pre-trained object detection model on your data. It gives good performance even with relatively small image datasets. This object detection learning block relies on MobileNetV2 SSD FPN-Lite 320x320.

According to Daniel Situnayake, co-author of the TinyML book and founding TinyML engineer at Edge Impulse, this model “works much better for larger objects - if the object takes up more space in the frame it’s more likely to be correctly classified.” This is one of the reasons why we got rid of the images containing the smallest bounding boxes in our import script.

After training the model, we obtained 61.6% accuracy on the training set and 57% accuracy on the testing set. You might also notice a large accuracy difference between the quantized version and the float32 version. However, the Linux deployment uses the unoptimized version by default, so we focus on the float32 version only in this article.


This accuracy is not satisfactory, and the model tends to have trouble detecting the right objects in real conditions:


Second iteration, adding custom data

For the second iteration of this project, we collected some of our own data. A handy way to collect custom data is to use a mobile phone. You can also perform this step with the same camera you will be using in your factory or on your construction site; this will be even closer to real conditions and should therefore work best for your use case. In our case, we used a white hard hat when collecting data. If your company uses yellow ones, for example, consider collecting your data with those same hard hats.

Once the data has been acquired, go through the labeling process again and retrain your model. 


We obtain a model that is only slightly more accurate when looking at the training performance. In real conditions, however, the model works far better than the previous one.


Finally, to deploy your model on your Raspberry Pi, NVIDIA Jetson Nano or Intel-based MacBook, just follow the instructions provided in the links. The command line interface `edge-impulse-linux-runner` will create a lightweight web interface where you can see the results.


Note that inference runs locally and you do not need an internet connection to detect your objects. Last but not least, the trained models and the inference SDK are open source. You can use them, modify them and integrate them into a broader application that matches your specific needs, such as stopping a machine when a head is detected for more than 10 seconds.
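As a rough illustration of the "head detected for more than 10 seconds" idea, the sketch below keeps track of how long a label has been seen without interruption. It is independent of the Edge Impulse SDK: the class name, the `update` method and the way detections are passed in (a list of label strings plus a timestamp in seconds) are assumptions for this example.

```python
from typing import Iterable, Optional


class DwellTrigger:
    """Fires once a label has been present continuously for longer than a hold time."""

    def __init__(self, label: str = "head", hold_seconds: float = 10.0) -> None:
        self.label = label
        self.hold_seconds = hold_seconds
        self._first_seen: Optional[float] = None

    def update(self, detected_labels: Iterable[str], timestamp: float) -> bool:
        """Feed one inference result; returns True when the machine should stop."""
        if self.label in detected_labels:
            if self._first_seen is None:
                self._first_seen = timestamp  # start of the current streak
            return timestamp - self._first_seen > self.hold_seconds
        self._first_seen = None  # streak broken, start over next time
        return False
```

Calling `update` once per inference result is enough; a single frame without the label resets the timer, which may be too strict for a noisy model and could be relaxed with a short grace period.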

This project has been publicly released. Feel free to have a look at it in Edge Impulse Studio, clone the project and go through every step to get a better understanding:

The essence of this use case is that Edge Impulse makes it possible to develop industry-grade health and safety solutions with very little effort. Such a solution can be embedded in bigger industrial control and automation systems, with a consistent and stringent focus on machine operations linked to H&S-compliant measures. Pre-trained models, which can later be easily retrained in the final industrial context as a “calibration” step, make this a customizable solution for your next project.

Originally posted on the Edge Impulse blog by Louis Moreau - User Success Engineer at Edge Impulse & Mihajlo Raljic - Sales EMEA at Edge Impulse


Only for specific jobs

Just a few decades ago, headsets were meant for use only with specific job functions, primarily B2B. They were used simply as extensions of communication devices, reserved for astronauts, mission control engineers, air traffic controllers, call center agents, firefighters, etc., who all had mission-critical communication to convey while their hands had to deal with something more important than holding a communication device. In the B2C consumer space, you rarely saw anyone wearing headsets in public. The only devices you saw attached to people’s ears were hearing aids.


Tale of two cities: Telephony and music

Most headsets were used for communication purposes, which is also referred to as ‘Telephony’ mode. As with most communications, this requires bi-directional audio. Except for serious audiophiles and audio professionals, headsets were not used for music consumption. Any type of half-duplex audio consumption was referred to as ‘Music’ mode.

Deskphones and speakerphones

Within the enterprise, a deskphone was the primary communication device for a long time. Speakerphones were becoming a common staple in meeting rooms, facilitating active collaboration amongst geographically distributed team members. So, there were ‘handsets’ but no ‘headsets’ quite yet. 


Mobile revolution: Communication and consumption

As the Internet and the browser were taking shape in the early ’90s, deskphones were getting untethered in the form of big and bulky cellular phones. At around the same time, a Body Area Network (BAN) wireless technology called Bluetooth was invented. Its original purpose was simply to replace the cords used for connecting a keyboard and mouse to the personal computer.


As cellular phones were slimming down and becoming more mainstream, scientists figured out how to use Bluetooth radio for short-range full-duplex audio communications as well. Fueled by rapid cell-phone proliferation, along with the need for convenient hands-free communication by enterprise executives and professionals (for whom hands-free communication while being mobile was important), monaural Bluetooth headsets started becoming a loyal companion to cell phones.

While headsets were used with various telephony devices for communications, portable analog music (Sony Walkman, anybody?) started giving way to portable digital music. Cue the iPod era. The portable music players primarily used simple wired speakers on a rope. These early ‘earbuds’ didn’t even have a microphone in them because they were meant solely for audio consumption – not for audio capture. 

The app economy, softphones and SaaS

The mobile revolution transformed simple communication devices into information exchange devices and then, more recently, into mini supercomputers with applications that take care of functions once served by numerous individual devices such as a telephone, camera, calculator, music player, etc. As narrowband networks gave way to broadband networks for both the wired and wireless worlds, ‘communication’ and ‘media consumption’ began to transform in a significant way as well.

Communication: Deskphones or ‘hard’-phones started being replaced by VoIP-based soft-phones. A new market segment called Unified Communications (UC) was born because of this hard- to soft-phone transition. UC has been a key growth driver for the enterprise headset market for the last several years, and it continues to show healthy growth. Enterprises could not part ways with circuit-switched telephony devices completely, but they started adopting packet-switched telephony services called soft-phones. So, UC communication device companies are effectively helping enterprises by being the bridge from ‘old’ to ‘new’ technology. UC has recently evolved into UC&C – where the second ‘C’ represents ‘Collaboration.’ Collaboration using audio and video (like Zoom or Teams calls) got a real shot in the arm because of the COVID-19-induced remote work scenario that has been playing out globally for the last year and a half.

Media consumption: ‘Static’ storage media (audio cassettes, VHS tapes, CDs, DVDs) and their corresponding media players, including portable digital music devices like iPods, were replaced by ‘streaming’ services in a swift fashion. 

Why did this transformation matter to the headset world?

Communication & collaboration by the enterprise users as well as media consumption by consumers collided head-on. Because of this, monaural headsets have almost become irrelevant. Nearly all headsets today are binaural or stereo, and have microphone(s) in them.

This is because the same device needs to serve the purposes of both: consuming half-duplex audio when listening to music, podcasts, or watching movies or webinars, and enabling full-duplex audio for a telephone conversation, a conference call, or video conference.

Fewer form factors… more smarts 

From: Very few companies building manifold headset form factors that catered to the needs of every diverse persona out there.

To: Quite a few companies (obviously, a handful of them a great deal more successful than the others) driving the headset space to effectively just two form factors:

  1. Tiny True Wireless Stereo (TWS) earbuds and
  2. Big binaural occluding cans!


Less hardware… more software

This trend has been in place for quite some time and has affected several industries. Headsets are no exception. Ever more sophisticated semiconductor components and the proliferation of miniaturized microelectromechanical systems (MEMS) components have taken the place of numerous bulkier hardware components.

What do modern headsets primarily do with regards to audio?

  1. Render received audio in the wearer’s ear
  2. Capture spoken audio from the wearer’s mouth
  3. Calculate anti-noise and render it in the wearer’s ear (in noise-cancelling headsets)

Sounds straightforward, right? It is not as simple as it sounds, at least for enterprise-grade professional headsets. Audio is processed in the digital domain in all modern headsets using sophisticated digital signal processing (DSP) techniques. DSP algorithms running on the DSP cores of the processors are the most compute-intensive aspects of these devices. Capture/transmit/record audio DSP is relatively more complicated than render/receive/playback audio DSP. The DSP workload varies depending on the acoustic design (headset boom, number of microphones, speaker/microphone placement), audio performance requirements, and other audio feature requirements.

Intelligence right at the edge!

Headsets are true edge devices. Most headset designs have severe constraints around several factors: cost, size, weight, power, MIPS, memory, etc.

Headsets are right at the horse’s mouth (pun intended) of massive trends and modern use cases like:

  • Wake word detection for virtual personal assistants (VPAs)
  • Keyword detection for device control and various other data/analytics purposes
  • Modern user interface (UI) techniques like voice-as-UI, touch-as-UI, and gestures-as-UI
  • Transmit noise cancellation/suppression (TxNC or TxNS)
  • Adaptive ambient noise cancellation (ANC) mode selection
  • Real-time transcription assistance
  • Ambient noise identification
  • Speech synthesis, speaker identification, speaker authentication, etc.

Most importantly, note that there is immense end customer value for all these capabilities.

Until recently, even if one wanted to, very little could be done to support most of these advanced capabilities right in the headset. Only the features and functionalities that fit within the computational limits of the on-board DSP cores, using traditional DSP techniques, could be supported.

Enter edge compute, AutoML, tinyML, and MLOps revolutions…

Several DSP-only workloads of the past are rapidly transitioning to an efficient hybrid model of DSP+ML workloads. Quite a few ML only capabilities that were not even possible using traditional DSP techniques are becoming possible right now as well. All of this is happening within the same constraints that existed before.

Silicon as well as software innovations are behind such possibilities. Silicon innovations are relatively slow to be adopted into device architectures at the moment, but they will be over time. Software innovations extract more value out of existing silicon architectures while helping converge on more efficient hardware architecture designs for next-generation products.

Thanks to embedded machine learning, tasks and features that were close to impossible are now becoming a reality. Production-grade inference models with tiny program and data memory footprints, in addition to impressive performance, are possible today because of major advancements in AutoML and tinyML techniques. Building these models does not require massive amounts of data either. The ML framework and the automated yet flexible process offered by platforms like those from Edge Impulse make ML model creation simple and efficient compared to traditional methods of building such models.

Microphones and sensors galore

All headsets feature at least one microphone, and many feature several, sometimes up to 16 of them! The field of ML for audio is vast, and it continues to expand. Much of the ML inferencing that used to be possible only in cloud backends or on sophisticated compute-rich endpoints is now fully possible on most resource-constrained embedded IoT silicon.

Microphones themselves are sensors, but many other sensors like accelerometers, capacitive touch, passive infrared (PIR), ultrasonic, radar, and ultra-wideband (UWB) are making their way into headsets to meet and exceed customer expectations. Spatial audio, aka 3D audio, is one such application that utilizes several sensors to give the end-user an immersive audio experience. Sensor fusion is the concept of utilizing data from multiple sensors concurrently to arrive at intelligent decisions. Sensor fusion implementations that use modern ML techniques have been shown to have impressive performance metrics compared to traditional non-ML methods.

Transmit noise suppression (TxNS) has always been the holy grail of all premium enterprise headsets. It is an important aspect of enterprise collaboration. It takes a magical combination of physical acoustic design (which is more art than science) and optimally tuned complex audio DSP algorithms implemented under severe MIPS, memory, latency, and other constraints. In recent years, some groundbreaking work has been done in using recurrent neural network (RNN) techniques to improve TxNS performance to levels never seen before. Because of their complexity and high compute footprint, these techniques have so far been incorporated into devices that have mobile phone platform-like compute capabilities. The challenge of bringing such solutions to resource-constrained embedded systems such as enterprise headsets, while staying within the constraints laid out earlier, remains largely unsolved. Advancements in embedded silicon technology, combined with the tinyML/AutoML software innovations listed above, are helping address this and several other ML challenges.



Modern use cases that enable the hearables to become ‘smart’ are compelling. Cloud-based frameworks and tools necessary to build, iterate, optimize, and maintain high performance small footprint ML models to address these applications are readily available from entities like Edge Impulse. Any hearable entity that doesn’t take full advantage of this staggering advancement in technology will be at a competitive disadvantage.

Originally posted on the Edge Impulse blog by Arun Rajasekaran.


Edge Impulse has joined 1% for the Planet, pledging to donate 1% of our revenue to support nonprofit organizations focused on the environment. To complement this effort, we launched the ElephantEdge competition, aiming to create the world’s best elephant tracking device to protect elephant populations that would otherwise be impacted by poaching. In a similar vein, this blog details how Lacuna Space, Edge Impulse, a microcontroller and LoRaWAN can promote the conservation of endangered species by monitoring bird calls in remote areas.

Over the past years, The Things Network has worked on the democratization of the Internet of Things, building a global, crowdsourced LoRaWAN network carried by thousands of users operating their own gateways worldwide. Thanks to Lacuna Space’s satellite constellation, the network coverage goes one step further. Lacuna Space uses LEO (Low-Earth Orbit) satellites to provide LoRaWAN coverage at any point around the globe. Messages received by satellites are then routed to ground stations and forwarded to LoRaWAN service providers such as TTN. This technology can benefit several industries and applications, such as tracking a vessel not only in harbors but across the oceans, or monitoring endangered species in remote areas. All that with only 25 mW of power (the ISM band limit) to send a message to the satellite. This is truly amazing!
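Radio power limits are often quoted in dBm rather than milliwatts, so it can help to see the 25 mW figure in both units. The helper below is a plain unit conversion; the function name is just an illustration.

```python
import math


def mw_to_dbm(power_mw: float) -> float:
    """Convert a power in milliwatts to dBm (decibels relative to 1 mW)."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    return 10.0 * math.log10(power_mw)


# 25 mW is a little under 14 dBm.
print(round(mw_to_dbm(25.0), 2))
```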

Most of these devices are typically simple, just sending a single temperature value, or other sensor reading, to the satellite - but with machine learning we can track much more: what devices hear, see, or feel. In this blog post we'll take you through the process of deploying a bird sound classification project using an Arduino Nano 33 BLE Sense board and a Lacuna Space LS200 development kit. The inferencing results are then sent to a TTN application.

Note: Access to the Lacuna Space program and dev kit is limited to a closed group at the moment. Get in touch with Lacuna Space for hardware and software access. The technical details to configure your Arduino sketch and TTN application are available in our GitHub repository.


Our bird sound model classifies house sparrow and rose-ringed parakeet species with 92% accuracy. You can clone our public project or make your own classification model by following one of our tutorials, such as Recognize sounds from audio or Continuous Motion Recognition.


Once you have trained your model, head to the Deployment section, select the Arduino library and Build it.


Import the library within the Arduino IDE, and open the microphone continuous example sketch. We made a few modifications to this example sketch to interact with the LS200 dev kit: we added a new UART link and we transmit classification results only if the prediction score is above 0.8.
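The modified sketch itself lives in the GitHub repository and runs on the Arduino. Purely to illustrate the thresholding rule, here is the same decision written in Python: take the highest-scoring class and report it only if its score is above 0.8. The function name and the dictionary input are assumptions, not the names used in the actual sketch.

```python
from typing import Dict, Optional, Tuple


def result_to_send(scores: Dict[str, float],
                   threshold: float = 0.8) -> Optional[Tuple[str, float]]:
    """Return the top (label, score) pair if it clears the threshold, else None."""
    label, score = max(scores.items(), key=lambda item: item[1])
    if score > threshold:
        return label, score
    return None


# Scores of the kind the classifier produces for one audio window.
window = {"housesparrow": 0.91406, "redringedparakeet": 0.05078, "noise": 0.03125}
print(result_to_send(window))
```

Skipping low-confidence windows keeps uplink traffic small, which matters when every message has to reach a satellite.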

Connect with the Lacuna Space dashboard by following the instructions in our application’s GitHub ReadMe. By using a web tracker, you can determine when a Lacuna Space satellite will next pass over your location. You can then receive the signal through your The Things Network application and view the bird call classification results:

       "housesparrow": "0.91406",
       "redringedparakeet": "0.05078",
       "noise": "0.03125",
       "satellite": true,

No Lacuna Space development kit yet? No problem! You can already start building and verifying your ML models on the Arduino Nano 33 BLE Sense or one of our other development kits, test it out with your local LoRaWAN network (by pairing it with a LoRa radio or LoRa module) and switch over to the Lacuna satellites when you get your kit.

Originally posted on the Edge Impulse blog by Aurelien Lequertier - Lead User Success Engineer at Edge Impulse, Jenny Plunkett - User Success Engineer at Edge Impulse, & Raul James - Embedded Software Engineer at Edge Impulse


Edge Products Are Now Managed At The Cloud

Now more than ever, there are billions of edge products in the world. But without proper cloud computing, making the most of electronic devices that run on Linux or any other OS would not be possible.

And so, a question most people keep asking is which Software-as-a-Service platform can most effectively manage edge devices through cloud computing. While edge device management may not be something new, the cloud computing space is not yet fully exploited, which means there is still a lot to do there.

Remote product management is especially necessary in the 21st century and beyond. Because of the increasing number of devices connected to the Internet of Things (IoT), a reliable SaaS platform should help with fixing software glitches from anywhere in the world. From smart homes, stereo speakers and cars to personal computers, any product that is connected to the internet needs real-time protection from hacking threats such as unlawful access to business or personal data.

Data being the most vital asset is constantly at risk, especially if individuals using edge products do not connect to trusted, reliable, and secure edge device management platforms.

Bridges the Gap Between Complicated Software And End Users

Cloud computing is the new frontier through which SaaS platforms help manage edge devices in real time. But something even more noteworthy is the increasing amount of complicated software that now runs edge devices in homes and workplaces.

Edge device management, therefore, ensures everything runs smoothly. From fixing bugs and running debugging commands to real-time software patch deployment, cloud management of edge products bridges the gap between end users and the complicated software that is becoming the norm these days.

Even more importantly, going beyond physical firewall barriers is a major necessity in remote management of edge devices. A reliable Software-as-a-Service platform, therefore, ensures that data on edge devices is not only encrypted against hacking but also accessed only by the right people. Moreover, the deployment of secure routers and access tools is especially critical in cloud computing when managing edge devices. And so, developers behind successful SaaS platforms conduct regular security checks over the cloud, and design and implement solutions for edge products.

Reliable IT Infrastructure Is Necessary

Software-as-a-service platforms that manage edge devices focus on having a reliable IT infrastructure and centralized systems through which they can conduct cloud computing. It is all about remotely managing edge devices with the help of an IT infrastructure that eliminates challenges such as connectivity latency.

Originally posted here


Introducing Profiler, by Auptimizer: Select the best AI model for your target device — no deployment required.

Profiler is a simulator for profiling the performance of Machine Learning (ML) model scripts. Profiler can be used during both the training and inference stages of the development pipeline. It is particularly useful for evaluating script performance and resource requirements for models and scripts being deployed to edge devices. Profiler is part of Auptimizer. You can get Profiler from the Auptimizer GitHub page or via pip install auptimizer.

The cost of training machine learning models in the cloud has dropped dramatically over the past few years. While this drop has pushed model development to the cloud, there are still important reasons for training, adapting, and deploying models to devices. Performance and security are the big two but cost-savings is also an important consideration as the cost of transferring and storing data, and building models for millions of devices tends to add up. Unsurprisingly, machine learning for edge devices or Edge AI as it is more commonly known continues to become mainstream even as cloud compute becomes cheaper.

Developing models for the edge opens up interesting problems for practitioners.

  1. Model selection now involves taking into consideration the resource requirements of these models.
  2. The training-testing cycle becomes longer due to having a device in the loop because the model now needs to be deployed on the device to test its performance. This problem is only magnified when there are multiple target devices.

Currently, there are three ways to shorten the model selection/deployment cycle:

  • The use of device-specific simulators that run on the development machine and preclude the need for deployment to the device. Caveat: Simulators are usually not generalizable across devices.
  • The use of profilers that are native to the target device. Caveat: They need the model to be deployed to the target device for measurement.
  • The use of measures like FLOPS or Multiply-Add (MAC) operations to give approximate measures of resource usage. Caveat: The model itself is only one (sometimes insignificant) part of the entire pipeline (which also includes data loading, augmentation, feature engineering, etc.)
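To make the third option concrete, here is a small sketch of how multiply-accumulate counts are commonly estimated for a 2D convolution layer, ignoring bias and activation costs. The function name and the example layer sizes are assumptions chosen for illustration; they do not come from the Profiler experiments.

```python
def conv2d_macs(out_height: int, out_width: int, in_channels: int,
                out_channels: int, kernel_size: int, groups: int = 1) -> int:
    """Approximate multiply-accumulate count of one 2D convolution layer."""
    macs_per_output = (in_channels // groups) * kernel_size * kernel_size
    return out_height * out_width * out_channels * macs_per_output


# A standard 3x3 convolution, 32 -> 64 channels, on a 112x112 output map.
standard = conv2d_macs(112, 112, 32, 64, 3)

# The depthwise-separable equivalent used by MobileNet-style models:
# a 3x3 depthwise convolution followed by a 1x1 pointwise convolution.
separable = (conv2d_macs(112, 112, 32, 32, 3, groups=32)
             + conv2d_macs(112, 112, 32, 64, 1))

print(standard, separable, round(standard / separable, 2))
```

Counts like these rank layers by arithmetic cost only. As the caveat above says, they say nothing about data loading or preprocessing, which is the gap a script-level profiler is meant to cover.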

In practice, if you want to pick a model that will run efficiently on your target devices but do not have access to a dedicated simulator, you have to test each model by deploying on all of the target devices.

Profiler helps alleviate these issues. Profiler allows you to simulate, on your development machine, how your training or inference script will perform on a target device. With Profiler, you can understand CPU- and memory-usage as well as run-time for your model script on the target device.

How Profiler works

Profiler encapsulates the model script, its requirements, and corresponding data into a Docker container. It uses user-inputs on compute-, memory-, and framework-constraints to build a corresponding Docker image so the script can run independently and without external dependencies. This image can then easily be scaled and ported to ease future development and deployment. As the model script is executed within the container, Profiler tracks and records various resource utilization statistics including Average CPU Utilization, Memory Usage, Network I/O, and Block I/O. The logger also supports setting the Sample Time to control how frequently Profiler samples utilization statistics from the Docker container.
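To give a feel for what periodically sampled statistics turn into, the sketch below summarizes a list of samples into an average CPU utilization and a peak memory figure. It does not use Profiler's API or its log format: the `Sample` type, the field names and the `summarize` function are hypothetical, and the observed duration is only approximated as the number of samples times the sample interval.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Sample:
    cpu_percent: float  # container CPU utilization at sampling time
    memory_mb: float    # container memory usage at sampling time


def summarize(samples: List[Sample], sample_time_s: float) -> Dict[str, float]:
    """Reduce periodic samples to a few headline numbers."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {
        "average_cpu_percent": mean(s.cpu_percent for s in samples),
        "peak_memory_mb": max(s.memory_mb for s in samples),
        # Rough duration: each sample stands for one sampling interval.
        "observed_seconds": sample_time_s * len(samples),
    }
```

A shorter sample time catches brief memory spikes more reliably but produces more log data, which is the trade-off the Sample Time setting controls.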


How Profiler helps

Our results show that Profiler can help users build a good estimate of model runtime and memory usage for many popular image/video recognition models. We conducted over 300 experiments across a variety of models (InceptionV3, SqueezeNet, Resnet18, MobileNetV2–0.25x, -0.5x, -0.75x, -1.0x, 3D-SqueezeNet, 3D-ShuffleNetV2–0.25x, -0.5x, -1.0x, -1.5x, -2.0x, 3D-MobileNetV2–0.25x, -0.5x, -0.75x, -1.0x, -2.0x) on three different devices — LG G6 and Samsung S8 phones, and NVIDIA Jetson Nano. You can find the full set of experimental results and more information on how to conduct similar experiments on your devices here.

The addition of Profiler brings Auptimizer closer to the vision of a tool that helps machine learning scientists and engineers build models for edge devices. The hyperparameter optimization (HPO) capabilities of Auptimizer help speed up model discovery. Profiler helps with choosing the right model for deployment. It is particularly useful in the following two scenarios:

  1. Deciding between models: The ranking of the run-times and memory usages of the model scripts measured using Profiler on the development machine is indicative of their ranking on the target device. For instance, if Model1 is faster than Model2 when measured using Profiler on the development machine, Model1 will be faster than Model2 on the device. This ranking is valid only when the CPUs are running at full utilization.
  2. Predicting model script performance on the device: A simple linear relationship relates the run-times and memory usage measured using Profiler on the development machine with the usage measured using a native profiling tool on the target device. In other words, if a model runs in time x when measured using Profiler, it will run approximately in time (a*x+b) on the target device (where a and b can be discovered by profiling a few models on the device with a native profiling tool). The strength of this relationship depends on the architectural similarity between the models but, in general, the models designed for the same task are architecturally similar as they are composed of the same set of layers. This makes Profiler a useful tool for selecting the best-suited model.
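The coefficients a and b in the second scenario can be estimated with an ordinary least-squares fit over a handful of models measured both ways. The sketch below is one way to do that in plain Python; the function names are illustrative and the example numbers in the usage note are made up, not taken from the experiments.

```python
from typing import Sequence, Tuple


def fit_line(profiler_values: Sequence[float],
             device_values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of device = a * profiler + b."""
    n = len(profiler_values)
    if n < 2 or n != len(device_values):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(profiler_values) / n
    mean_y = sum(device_values) / n
    sxx = sum((x - mean_x) ** 2 for x in profiler_values)
    if sxx == 0:
        raise ValueError("profiler measurements must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(profiler_values, device_values))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict_on_device(a: float, b: float, profiler_value: float) -> float:
    """Estimate the on-device value for a model measured only with Profiler."""
    return a * profiler_value + b
```

For example, if three models take 1, 2 and 3 seconds under Profiler and 5, 8 and 11 seconds on the device, the fit gives a = 3 and b = 2, so a fourth model measured at 4 seconds would be estimated at about 14 seconds on the device.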

Looking forward

Profiler continues to evolve. So far, we have tested its efficacy on select mobile- and edge-platforms for running popular image and video recognition models for inference, but there is much more to explore. Profiler might have limitations for certain models or devices and can potentially result in inconsistencies between Profiler outputs and on-device measurements. Our experiment page provides more information on how to best set up your experiment using Profiler and how to interpret potential inconsistencies in results. The exact use case varies from user to user but we believe that Profiler is relevant to anyone deploying models on devices. We hope that Profiler’s estimation capability can enable leaner and faster model development for resource-constrained devices. We’d love to hear (via GitHub) if you use Profiler during deployment.

Originally posted here

Authors: Samarth Tripathi, Junyao Guo, Vera Serdiukova, Unmesh Kurup, and Mohak Shah — Advanced AI, LG Electronics USA

