Video series
Process This: Edge AI technology topic
Learn about TI edge AI technology through this processor's archived webinar videos.
Process This: Program an Edge AI "Hello World" Application Using Free Online Tools
OK. Let's get started. Good morning, good evening, everyone, depending on your time zone. Thank you for joining this webinar, our second in our monthly Edge AI webinar series. We have more than 100 people joining this webinar. We have two sessions, one for today, and then one more session early next week that will have some Chinese interpretation. We are really excited about your attendance, as we are now making Jacinto AI technology, which is proven in many different automotive and industrial applications, available for broad market applications.
If you are new to AI hands-on code development, especially embedded AI development, you will have all the fundamentals and code examples in the next 45 minutes to do your first Hello World program. That's definitely guaranteed. You will also get to know different software frameworks and tools a lot of developers use and how TI is making it easy to use any of your popular or your favorite machine learning frameworks and get your embedded AI development started.
My name is Srik Gurrapu. I am in the processor/software applications team. And we have three panelists along with me, Manisha Agrawal, Katrina Tuazon from our Jacinto marketing team, and also Mukul Bhatnagar from our Jacinto applications team. They will be checking for any questions that you may have throughout the presentation. So please don't be shy in asking anything during the presentation. They'll be able to answer right away or take actions to follow up.
OK. Let's get started. So if you think about a Hello World program, the tradition started back in 1978. The image that you see on the left side is from Wikipedia. This is the simple C Hello World program that Brian Kernighan wrote to get people started with any programming language. That kind of became the tradition.
If we apply the same thing in the AI world, what we can do is, we can really do some fancy things. You can take an image, for example, as you see over here. And we can be-- we can read and pre-process the image, we can automatically determine what objects are in that image, and we can output the results in a way that is very easy to understand. So the bar is much higher for the Hello World program in the AI world, because we are really doing some smart stuff.
That's a lot of complexity. But at TI what we are trying to do is make these complex things much simpler. And that's what this webinar is all about. So you will see all the steps that you need to take in order to do all these functions and also be able to embed that into an edge device.
Because at the end of the day, if you look at embedded electronic devices, you will always see AI inside. How do you get AI inside? That's really the essence of this webinar.
So this is the agenda that we are going to be following. So we'll do a quick recap from the previous webinar where Manisha introduced Jacinto AI technology for embedded edge AI applications. Then we'll spend some time briefly talking about, what is the application development process? What are these models? How do you go about selecting it, how do you go about optimizing it?
And then we'll apply this principle to our Hello World example-- the same dog cat example that we have seen before, but how do you go about doing it. And we're going to approach this problem in three steps. Number one, run the code on the PC. Using the software tools that are available in the market, how do you first get up to speed on the PC [INAUDIBLE]. And then the step two is now get the same code as-is and run it on the ARM processor inside the Jacinto SoC.
And the third step is, how do you enable the acceleration to do this much faster? And we'll also be spending a few minutes on the demo. And after we finish that basic Hello World, of course, that's when your real journey begins. We'll spend a few minutes on what path you can take in that real application development journey, and then wrap it up with a call to action.
OK. So we talked about a lot of different electronics and equipment saying AI inside. So we see AI all around us. It's [INAUDIBLE] in broad applications across many different markets. And one thing that you see here is that these applications don't need to be new. Even in existing applications, like a factory automation robot, we are now finding new use cases enabled by AI, enabled by the smartness that AI technology offers. It could be last-mile robots. There are many different applications where this would be relevant.
The typical architecture for enabling this intelligence is to use general-purpose compute or GPU accelerators. These were designed for different applications, but that's one way to get the job done.
And in TI, what we are doing is very [INAUDIBLE] hardware. We want to enable AI in lower power and with lower complexity. And we want to enable this with simple, Linux-based programming using popular software frameworks. So you don't have to do a lot of ground work on driver development or Linux development. You can really stick to a very high level and use popular machine learning frameworks and the software tools that are already available.
So that's what the Jacinto architecture that you see on the right-hand side provides. We have a deep learning accelerator that is completely purpose-built for AI applications, and you will see some of the performance improvement when you start leveraging this deep learning accelerator. Then we have multiple imaging and vision accelerators. And the ARM A72 core, which is multicore, is what runs Linux and really enables the software frameworks and ease of use.
And what is this purpose-built accelerator? This is a quick refresh. This is TI's special accelerator. It includes both a C7x DSP and a module called MMA, the Matrix Multiply Accelerator. With the combination of these two, architected primarily for AI, we are really achieving the industry's best [INAUDIBLE] high frames per second per TOPS.
Frames per second is how often you can run this image processing and detect what's in the image. We can achieve that very efficiently with this particular architecture. And on the performance side, we offer up to 8 TOPS in an industry best-in-class power envelope. So that's the core of this Jacinto architecture.
So that's just a recap of what Jacinto AI technology is about. Now let's get into how do we go about developing your first software, and also your application development. So for this, we provide comprehensive software from-- on our platforms. It comes with both the foundational components you see at the bottom-- so all the drivers that I talked about, all the board support packages, and all that stuff-- and then the development environment to keep the development as easy as possible.
So for video-in, video-out, we have a GStreamer framework. We have OpenCV for computer vision. And for deep learning, we offer support for all the popular runtime frameworks. The focus for today's webinar is going to be the middle layer-- really using the deep learning frameworks that are available and how you tweak them to our hardware. But we'll have subsequent webinars covering different aspects of it, digging deeper into each one of these.
So just a quick recap of what this deep learning is all about. We looked at the dog and the cat picture. And how do you automatically determine that there is a dog? How do you automatically determine that there is a cat, and extend that to any different image, extend that to different video clips? So fundamentally, deep learning is all about creating a model that is trained on a known data set.
And there are a lot of popular data sets that you can see over here-- Microsoft COCO, ImageNet, and many others. Essentially, these are millions of data examples. And then the middle box in the red here is really creating that model, training the model to be able to recognize these objects.
And once you create that model, then you can take that and then use it with your own input. And that future inference is happening in one pass. So the left-hand side is model development, and then the right-hand side is using that model to be able to apply on your real-time use cases.
And typically, when you want to do that inference on an electronic end device, that's going to be an embedded edge device. And that's where our Jacinto TDA platform is going to be very applicable. So that is the basic essence of AI-- creating the models and using the model.
Now, how do you add AI into your system? It's as simple as three steps with TI's comprehensive software offering. So the first step is, there are a lot of models out there already using different frameworks. You can do all the development on your PC. And you can train anywhere, develop anywhere, and bring that model to the embedded platform.
And we also offer a proactive service, where we verify popular models and make sure they're running on our hardware. And we provide a tool for you to choose out of 60 models that we have already verified, pick anything that is based-- that meets your accuracy or the speed requirements. And we have all these models on the GitHub. And you can see the link over there.
So the first step, the model, is on the PC; you can do it anywhere. And the second step is, now that you have that model, compile and optimize it for our SoC. How do you get this neural network model to run on this TI SoC, on the C7x DSP and the MMA accelerator that I showed you a couple of slides ago? That's step number two.
And even here, we provide support for industry-standard frameworks. So you would use the same language that you use for developing the model. The same TensorFlow Lite libraries can be used. The same Python program can be used to compile and optimize for the SoC.
And then the number three is deploy the model that is optimized onto your processor. You will be able to run it directly on top of the Linux and get to your product development much faster. So those are the three steps of adding AI into your system.
I talked about model selection, the first step. If you look across the industry-- there is a good reference over here-- there are a lot of models that have been developed. They are free to use in production in your product. And these are all open-source models. And there is basically a trade-off, just like anything in life. That's going to be accuracy versus operations.
And it's just amazing how much processing needs to happen in a model to get decent accuracy. On the x-axis that you see over here, just as an example, 155-- in millions-- that's how many operations need to happen. So the complexity of the Hello World here is quite significant. It's not just writing a printf statement.
But there is a lot of research that's already been done. There is a large library of these neural network models already available for you to use. Now, you may have to do some customization for your data set. But that's going to be an incremental approach rather than doing something from the ground up.
And as I mentioned, we have all those models in the previous slide. And we do a lot of work for you already. As I said, we have this model zoo, where we have already verified 60-plus models that work on our platform. And you can also get to these models from our Edge AI Cloud tool that I will be showing later on in the presentation.
And you can select these models by the type of function that you are doing, whether it's a classification AI function, detection, or a semantic segmentation of the scene. You can choose based on that. You can choose based on the type of runtime that you are using. So multiple ways to get to the model that would work for your application. And these are all available, ready to use. And we are continuously extending the 60, and we plan to get 100-plus models very soon, as we get requirements from all of our customer base.
I just talked about runtime, right? So I just did a quick Google Trends search-- there are so many deep learning frameworks. And you can see the top five over here. Blue is TensorFlow, red is PyTorch, and then Keras, and Theano, and MXNet. And the exciting thing is we support all the popular frameworks. In this webinar, in this example Hello World application, we will focus on the TensorFlow Lite runtime framework. And we'll also use Python, because that is a very easy way to get started with AI development.
So now let's get to saying hello world to the dog and the cat. So we're going to approach this problem in three steps. As I said, this is very complex stuff, but you can really make it simple if you follow these three steps. Your end goal is to be able to do this detection on the fly, in real time, on the embedded edge AI device using deep learning acceleration.
But let's take three steps. First, let's get the program running on the PC first. And then port the same program onto the Jacinto 7 SoC platform. And then enable the deep learning acceleration.
So we talked about this already. For this Hello World example, we have a decision to make. We can develop a model in three ways. You can create a custom model from the ground up-- read up all about neural networks and how they are created, then create and train the model yourself. That's one option we have. Or, number two, use a pre-trained model that we have already verified, we have already compiled, and we know works.
And the third one that's also common is more of an incremental approach: use a pre-trained model, and then apply transfer learning specific to your custom use case. It could be changing some weights in the network, it could be changing a couple of layers at the output of the network. So it can be done in multiple ways, and that's common practice for many applications.
In this webinar, we're going to be using option two. So we're going to use a pre-trained model that's available open source on TensorFlow Hub, which you can pick up directly. And you can learn about what that model is doing and what labels it is trying to detect. All that information is available at this link.
So what is MobileNet? This is one of the networks that we saw in the previous chart of accuracy versus performance. In the middle, it basically shows the whole architecture, starting from the input size-- the input size is 224 by 224 by 3. The 3 is basically RGB, Red, Green, Blue, of the image that's coming in, and 224 by 224 is the input resolution.
And it goes through many different convolutional layers, as you can see over here. There's a lot of pooling mechanisms. And the last stage is the classifier. So you take any image, and the classifier is basically providing you what's in that image, up to 1,000 values.
And you can see how many parameters are in this network-- 4.2 million, or 2.5 million, depending on your optimizations. That's a lot of parameters in the network, all the weights and all the nodes in the structure. It's pretty complex. And the good thing is, these are easy to use. And we can use one now to develop our first application.
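For reference, here is a minimal sketch of how you could inspect the downloaded .tflite file to confirm the shapes described above, once the TensorFlow Lite runtime from the next few slides is installed. The model file name is a placeholder for whatever you download from TensorFlow Hub.

```python
# Sketch: inspect the MobileNet V1 .tflite file's input/output shapes.
# "mobilenet_v1_1.0_224.tflite" is a placeholder file name.
import tflite_runtime.interpreter as tflite

interpreter = tflite.Interpreter(model_path="mobilenet_v1_1.0_224.tflite")
interpreter.allocate_tensors()

print(interpreter.get_input_details()[0]["shape"])   # e.g. [  1 224 224   3]
print(interpreter.get_output_details()[0]["shape"])  # e.g. [   1 1001] -- 1,000 classes plus a background class in the stock file
```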
So what tools to use to create your first Hello World application? There are many different frameworks available. And one framework that we have used in this webinar is very popular Anaconda framework. You can just go to Anaconda.com, and then you can download your free individual license and get started.
It also has Jupyter. Once you install Anaconda, you can also install Jupyter, which is kind of an IDE-- a very nice browser-based Python notebook environment. And the neat thing is, it's not just the code. You can also have very good documentation along with it. And that's what we will be using for this webinar. Even our cloud tool uses the same thing, because it is completely browser-based.
And we talked about this framework. So we're going to be using Google's TensorFlow deep learning runtime. And this is completely open-source. And there are two versions for this. TensorFlow, it is more for cloud and PC type applications. And then TensorFlow Lite, as the name indicates, it's a lightweight version of TensorFlow targeting edge devices. They'll have a much smaller footprint in terms of both the memory and also performance requirements.
And basically, what it does is quantize the weights to make the model optimal for embedded devices. It supports both floating-point and fixed-point formats, and it can also prune away some parameters in the model that have little impact on performance. So it is very commonly used for quite a few embedded applications.
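As a hedged illustration of that idea (the webinar itself downloads an already-converted model, so you don't have to do this yourself), post-training quantization with the standard TensorFlow Lite converter looks roughly like this:

```python
# Minimal sketch: convert a float Keras MobileNet to a quantized TensorFlow Lite
# model. Illustrative only; the webinar uses a pre-converted .tflite file.
import tensorflow as tf

model = tf.keras.applications.MobileNet(weights="imagenet")   # float32 model

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]           # quantize the weights

tflite_model = converter.convert()
with open("mobilenet_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```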
So that's what we're going to be using. So once you install Anaconda from the previous slide, what you need to do next is install the TensorFlow Lite runtime. And you can see the command over here. You can use pip3 to install TensorFlow Lite.
And the other useful software library for any AI application is open-source-- sorry, OpenCV. It's basically a computer vision library. And it provides a common infrastructure for many different computer vision applications, whether you use Python, you use C. And it has a vast library of algorithms to be able to read the image, pre-process it very easily, and then doing a post-processing as well.
So it's highly used in the industry. And you can install this as well from the pip tool, using the command over here. And we can do this in the Python. But similar interfaces will be there in the other languages like C++, [INAUDIBLE].
And the other one-- if you know Python, or have heard about Python, then you would also have heard about NumPy. This is an open-source package for dealing with all these numbers. An image itself is a 2D array, and then when you add in RGB, we looked at 224 by 224 by 3. So manipulating all these matrices becomes much easier and much more intuitive with the NumPy package. This is what you will see in many Python programs. And we can install this as well using this command.
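For reference, the install commands referenced on the slides look roughly like the following. The exact package names are an assumption (tflite-runtime, opencv-python, and numpy are the commonly used PyPI names); check the slide or the library documentation for your setup.

```sh
pip3 install tflite-runtime
pip3 install opencv-python
pip3 install numpy
```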
So once we have the Anaconda and the Jupyter installed, TensorFlow installed, OpenCV installed, NumPy installed, now we have everything. And all of these are available online. And now we can jump to the code.
So once we have all the software libraries installed, our Hello World code essentially has three main segments. So we just imported NumPy, cv2, TensorFlow Lite. So we can import all this into the Python program. So that's what the step number one is doing.
And step number two is, we already selected and downloaded the model from the TensorFlow page. So we are using MobileNet V1-- that's what this is over here. And that model comes with the actual network file and also the labels. The labels are basically the 1,000 outputs at the final stage of the neural network.
So we can make a quick dictionary out of this in the Python, and then to be able to interpret what the network puts out in the results page. And very simple commands to be able to load the TensorFlow Lite deep learning model that is coming from open-source. It's essentially two commands that you see [INAUDIBLE] over here.
So set up the interpreter from TensorFlow Lite-- it's part of the TensorFlow Lite library that we just imported. And then allocate the individual tensors, both input and output, and allocate the memory for the whole network in the system. That's what step number two is doing.
And once you have the model already in the system, now you can take any input-- that's what this first step is saying, you're reading an image file using the cv2-- I mentioned like cv2 makes it very easy-- you read this input file, and then invoke the neural network model, the interpreter. That's what this invoke is doing.
And once you invoke it, all this neural network computation happens-- these millions of operations happen-- and what comes out, the output, will have all the objects that it finds and the score, the confidence level that the object it recognized is the dog or the cat. And then we can basically analyze all that output.
And that's what we get from this code. So we put in just dog cat image exactly as is, and now what we have done is basically say hello to the dog, say hello to the cat. And we are also putting, what is the confidence percentage-- the 94%, 95%.
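For reference, a minimal sketch of what those three segments look like in one script, assuming the quantized MobileNet V1 model variant and placeholder file names (mobilenet_v1_1.0_224.tflite, labels.txt, dog_cat.jpg) for whatever you downloaded:

```python
# Segment 1: import the libraries we installed earlier.
import numpy as np
import cv2
import tflite_runtime.interpreter as tflite

# Segment 2: load the labels and the pre-trained TensorFlow Lite model.
with open("labels.txt") as f:
    labels = {i: line.strip() for i, line in enumerate(f)}

interpreter = tflite.Interpreter(model_path="mobilenet_v1_1.0_224.tflite")
interpreter.allocate_tensors()                      # allocate input/output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Segment 3: read and pre-process the image, invoke the network, post-process.
img = cv2.imread("dog_cat.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)          # OpenCV loads images as BGR
img = cv2.resize(img, (224, 224))                   # MobileNet V1 input size
input_data = np.expand_dims(img, axis=0).astype(np.uint8)  # quantized-model input

interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()                                # run the millions of operations

scores = interpreter.get_tensor(output_details[0]["index"])[0]
for idx in scores.argsort()[-2:][::-1]:             # top-2 results: the dog and the cat
    print(f"Hello, {labels.get(idx, idx)}! score {scores[idx]}")  # raw score for the quantized model
```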
So that simple code-- we'll be walking through it in a bit. But we took the image, went through all three steps, and were able to do this inference, this automatic detection of the two objects in the image. Now you can apply this to any image in the real world.
And you can also do this now, if I want to do this in a very high speed-- now this takes like 0.5 seconds. That's on the PC. Now I want to do this on a video. What does it take? We need to be able to do this inference much faster. It could be 50-- sorry, 50 frames per second or 100. The more the better.
Think of a robot. Now it's trying to do the same thing to avoid obstacles. The faster it can do this inference, the smarter it can get. So that's the goal of the AI. But as we said, we'll go step by step. So this is step number one.
We have now Hello World AI code running on the PC. Obviously, our end goal is to make it much faster. Now let's get to the second step. Take this code as-is and run it on the J7 EVM platform. So now, we just did the first step over here, running it directly from there.
Now, to be able to run it on an embedded device, we talked about the compilation. So we need to compile the model and then generate some kind of artifacts that would be understood by the target hardware, which is a TI SoC that you see here. And once you do that optimization, then you can run the inference the same way that you have done in the previous step, and then generate the result.
And the exciting thing here is, you don't need to buy an EVM from TI to do this step two or step three. We are now excited to provide this complete cloud tool, where you can log into one of our EVMs, and then run the same code and then do a lot of evaluation and benchmarking. So this tool is available. You can see the link. And we are going to be using that tool in this webinar.
So once you log into the tool, you will have four options over here. And we are also indicating what are the different things that you can do, and approximate time that it would take to do that function. For example, comparing different models out of the 60-plus models that we talked about, it just takes less than a minute to be able to do that.
And if you want to determine performance, you can do that in five minutes or so. Or you can do pre-compiled models. You can do a lot of benchmarks. It's less than an hour to do it. And then of course, custom models, you can do any kind of work that you want to do.
So for this step two of our Hello World program, we're going to be using a custom model. And we're going to be opening up the same program that ran on the PC. So this is basically now the same program that you see over here on the left-hand side. That's a snapshot of the Jupyter notebook. I'll show that in live.
Now take the same code, and you can run it on the cloud. And we also provide a simple Python utility to measure the performance of that inference. So you can see this image over here. This shows the inference time is about 350 milliseconds, very much similar to what we saw on the PC. So it's still not good for real-time, but you can see on the computation graph over here, it's all done by the CPU. And the CPU here is the ARM A72 core in the Jacinto SoC.
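A utility like that can be as simple as timing the invoke() call. A minimal sketch, assuming the `interpreter` object from the Hello World script above:

```python
# Sketch: time interpreter.invoke() and report milliseconds and frames per second.
import time

runs = 20
start = time.time()
for _ in range(runs):
    interpreter.invoke()
elapsed = (time.time() - start) / runs

print(f"inference time: {elapsed * 1000:.1f} ms ({1.0 / elapsed:.1f} frames per second)")
```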
So the thing I want to highlight here is-- and I'll be showing this in the demo as well-- is it's exactly the same code that's running on the left side, on the local host, on the PC, and on the right-hand side. And you can see from the browser address over here, this is a cloud tool. So you're basically accessing the EVM from the cloud and then running the same program, using the same TensorFlow Lite framework. That's the beauty of this. And especially with this Jupyter Notebook, your experience will stay exactly the same.
So now it's working. Inference time is like 0.3 seconds. But how can I use this for the real-time video? So as we'll discuss now, what we need to do is, now let's bring in deep learning hardware accelerators. So that's going to be our step three of this program.
So here, we just have to do one extra step. We talked about compiling and optimizing the model. And TensorFlow provides a lot of mechanisms to be able to compile the original model [INAUDIBLE] any hardware accelerator. And this is by means of something called a delegate.
So as you can see in this code example over here, there is this experimental delegates option. It's a way for the TensorFlow Lite interpreter, the runtime engine, to offload some parts of the network-- obviously, all the numerically intensive operations-- to the hardware accelerator.
And then once you do, you compile this model, you run the inference again, and this acceleration happens automatically. And that's really the beauty of this. Now, if I run the same program after compiling, now I'm getting inference time is 3.3 milliseconds.
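As a hedged sketch of what that looks like in the notebook-- the delegate library name (libtidl_tfl_delegate.so) and the option keys here are assumptions based on TI's edge AI SDK examples, so check the SDK user's guide for the exact names in your release:

```python
# Sketch: hand the compiled artifacts to the TIDL delegate so supported layers
# run on the C7x DSP and MMA accelerator instead of the ARM core.
import tflite_runtime.interpreter as tflite

delegate_options = {
    "artifacts_folder": "./artifacts",         # output of the offline compile step
    "tidl_tools_path": "/path/to/tidl_tools",  # assumed option key
}
tidl_delegate = tflite.load_delegate("libtidl_tfl_delegate.so", delegate_options)

interpreter = tflite.Interpreter(
    model_path="mobilenet_v1_1.0_224.tflite",
    experimental_delegates=[tidl_delegate],    # the one extra argument
)
interpreter.allocate_tensors()
# From here, set_tensor() / invoke() / get_tensor() work exactly as before;
# the offload to the accelerator happens transparently.
```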
And I think I have the next slide here, comparing side by side on how-- what kind of results you will get if you run only in the ARM core. And then on the right-hand side, now you're enabling the hardware, compiling the model, and leveraging the C7x and the MMA accelerator that is out there.
So as you can see from the graphs as well, on the right-hand side, the CPU is only involved in the beginning. And then you have the green portion is basically all the computations, and the purple is basically when the data is going out. That's really the beautiful aspect of running in the hardware accelerator. Now the performance became 100 times-- 2.8 frames per second to more than 300 frames per second. Now we can do some useful stuff for real-time use cases.
Let's say we have a camera, a surveillance camera, and you have this shopping mall kind of thing. Now with this high-speed object detection, we can put this video into the same code that we have, the same Hello World program, but now you're detecting objects on the fly. You'll be able to zoom into the people and record whenever a person is there. You can do all kinds of smart things once you have this capability to do higher-performance inference in low-power, optimized for edge device.
So step one, develop the model and run it on the PC. Step two, run it on the ARM, just as-is. And step three is just adding one extra step of compilation. Three steps to embedded edge AI device development. Hello World looked simple in the beginning, but if you look at this video now, you can do really fancy stuff, no matter what application you have in your system. Unlimited possibilities.
So now, we are doing very well on time. So I will spend just a few minutes on the demo and code walkthrough live. For this, I'm going to open up my browser. So I have Anaconda already installed over here-- you can see this over here. And I launched the Jupyter notebook.
And so this is the first step. So we're running the webinar Hello World program on the PC. And as I mentioned, in the Jupyter notebook you can have all kinds of comments, a nice description of what you are doing. And then the first segment was importing all the libraries. And I can run cell by cell over here. Or I can run pretty much all the steps by restarting and running all the cells.
So now it's running over here. The last step is where it's running the inference. And it basically just finished reading the image and then doing the classification. So basically, this is the code that we talked about. So importing all the required libraries and reading the image. And this is really the essence of it, invoking the interpreter, and then after you get the result, and then doing your post-processing.
And then the actual model itself-- if you look it up online, you will see this. That's the TensorFlow Lite model that we are feeding to the TensorFlow Lite interpreter. So that's what this part, setting up the model, is about.
So we ran this. Everything works. And this is the output we are getting. So now, step two is taking the same code and running it on the cloud. And that's basically now over here. So now we are on dev.ti.com. I have already logged in. And you will have three hours per session once you log in.
So now, I have my own program already loaded in my workspace. So I will go to My Workspace. And this is the demo over here. So here I have three notebooks. These Jupyter programs with all the description, they're called notebooks. So we have Hello World PC notebook we just ran on the local PC.
And then you have-- two is the same demo, ARM-only. And the third one is compile and run on the deep learning accelerator. So now, let me open up number two. And over here, so this is basically running it only on the ARM. So you can use exactly the same code. You're importing the same libraries as we have done on the PC.
And this is pre-processing the image, same model I uploaded into my workspace and then loading it, and allocating tensors, same command. It sets up all the memory and everything. And then same input and running the same interpreter.
So now if I run this entire code over here, [INAUDIBLE] basically runs cell by cell. A star means the cell hasn't finished running yet. Let's see where it is right now. So it finished that step. It's actually doing this process right now.
And while that is running, I'm going to put these two [INAUDIBLE] side by side. So this over here and this over here. So it's now finished this step. And now it's doing this. This is 0% over here. Now it just finished running. And we got almost similar results over here. The accuracy is slightly down, 94.97 on the hardware. But it's pretty much the same.
So this is on the local host. That's running on the PC, and this is running on the cloud tool. Side by side, exactly the same code. And that's the beauty of this cloud tool-- same environment. And as I said, now we have more utilities. Now I will be able to measure the inference time. We just talked about 2.8 frames per second. And that's over here.
It's all blue, because we have not yet activated the DSP and the MMA accelerator. That's going to be our third step. So here, again I'll kind of run this whole thing over here. And while it is running, I'll kind of walk you through the same code.
So the only step that is additional in this step three is compiling the model. So all of this is the same. And when you're compiling the model-- so that's what this is all about, and that's where we're spending some time over here-- so we're basically using the same API from the TensorFlow library. But here, we are providing some compile options. And we are also giving the path of where the TIDL-- TIDL stands for TI Deep Learning libraries-- and then where that compiler is-- make this full screen.
And then we have the application, the compiler, that basically imports the TF Lite model and then generates all the artifacts that will be used to leverage the accelerators. And we also have some calibration images to tweak the weights and optimize this model to suit TI's hardware.
So this compilation, typically it happens on your local PC. You can do this on your x86 platform. We have the TIDL libraries, full software environment is already available. So you can do this offline. Because it's all about compiling the model so that way you can use it on your hardware later on.
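For reference, a heavily hedged sketch of that offline compile step-- the import-delegate library name (tidl_model_import_tflite.so) and the option keys are assumptions drawn from TI's edgeai-tidl-tools examples, so consult the TIDL user's guide for the exact names in your SDK version:

```python
# Sketch: compile the .tflite model for the C7x/MMA by running a few calibration
# images through an "import" delegate; compiled artifacts land in artifacts_folder.
import cv2
import numpy as np
import tflite_runtime.interpreter as tflite

compile_options = {
    "tidl_tools_path": "/path/to/tidl_tools",
    "artifacts_folder": "./artifacts",
    "tensor_bits": 8,                          # quantize to 8-bit fixed point
}
import_delegate = tflite.load_delegate(
    "/path/to/tidl_tools/tidl_model_import_tflite.so", compile_options)

interpreter = tflite.Interpreter(
    model_path="mobilenet_v1_1.0_224.tflite",
    experimental_delegates=[import_delegate])
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Each invoke() over a calibration image drives the quantization calibration.
for name in ["calib_0.jpg", "calib_1.jpg", "calib_2.jpg"]:   # placeholder images
    img = cv2.resize(cv2.cvtColor(cv2.imread(name), cv2.COLOR_BGR2RGB), (224, 224))
    interpreter.set_tensor(inp["index"], np.expand_dims(img, 0).astype(np.uint8))
    interpreter.invoke()
```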
So once-- so now this is done. So now the next step is now using the compiled model. But now here, it's basically now the same steps as before. Now we are using that compiled model, which is in the artifacts folder over here. And you can see that in your workspace. So it's over here. And these are all the files that the compiler creates.
So now this is using that model. And it's basically the same command. The only difference now is it's using this experimental delegate option. This option was not there when you're running only on the PC or when you're running only on the ARM core. So just very quickly going over here for that portion of the code over here, so it doesn't have the experimental delegate. We are just executing the TF Lite model as-is.
So that's really what you need to do-- just one extra step of compilation. And then once you compile it, then you enable that when you do the inference. It's as simple as that. And you're doing everything on the Python over here. You're not-- basically, there is no different environment. It's exactly the same environment to be able to do this stuff.
So it's still doing this processing over here. While that is going, I will open up-- and this workspace is also pretty easy to use. Once you create your own programs, they're always there on the cloud. We are saving them for you, so you can always bring them back up whenever you want to tweak the program or something like that.
So I just notice this program is now done. So now let's see. So basically, it's finished everything. So it set up the interpreter and did the inference. And now it's finished. We got kind of similar results. And the accuracy seems to be 87% for the dog compared to 95-plus in the previous example. But that's all because of the quantization and all the benchmarking.
And that's something you can tune in your application. And that's what most of our developers do. So that's going to be quite a bit of stuff. And then we're going to talk about a lot of different tools you have at your disposal to be able to tweak that. And so it finished the inference. And then, from all the statistics we can see over here, exactly the same as the snapshot we have seen before.
So 3.2 milliseconds. So we finished all the three steps. So step one, running on the PC; step two, running on the ARM of the embedded device; and step three is really make it real-time by compiling the model and running inferences from that model.
So the nice thing about this Jupyter notebook is you can kind of save these files along with the results. And [INAUDIBLE] save this part for future. Most of the time in this kind of webinars, the demos don't work as planned. But luckily it all worked fine in this case.
But if it were not working, I could always use the saved files and then show you how the output would look like under normal circumstances. So that's another advantage of the Jupyter notebook.
OK. So that's the demo portion of it. We have seen all the three steps. Now let's go back to the slide set. So we have seen this. So we have compiled a step three program. And then we witnessed more than 300 frames per second object detection in a frame. And then we have seen this video on what we can do with this kind of technology in real AI use cases.
So just a couple of examples. Think about a camera, just like that surveillance example. Now we can do automatic object detection, automatic person detection. We can adaptively zoom into areas of interest and do real-time decision making. So there are a lot of useful functions you can add to the camera, and then you can call it an AI-enabled camera.
That's what we see in the market right now, right? Everything is AI-enabled. Same thing for a robot. It could be last-mile delivery, where you could detect hazards, you could detect collisions and avoid them. And you can really make this last-mile delivery of applications possible. And you can call it AI-enabled robot.
And another example could be a shopping cart. We can put AI-enabled cameras in the shopping cart and completely automate the shopping experience for the user. You can detect fraud, you can provide convenience, just by doing three core AI functions-- classifying images, detecting objects, and segmenting a scene. So embedded AI, AI at low power, really offers unlimited possibilities.
So that's the essence of this webinar. So we started with the Hello World, and then we kind of showed all the three steps of making that Hello World example really useful for real use cases. What can you do from here, for advanced development?
We're going to spend a few minutes on that. So we talked about this already. Now you can go deeper into understanding all the different models that we have made available for our customers. Look at the link over there, look at the Edge AI Cloud. You can really understand the trade-offs between accuracy, performance, and power across the 60-plus models. Play with those models.
And then also get to know the compilation in more detail. Now, we noticed some accuracy loss. With the basic model that we just downloaded, without any optimization, just a straight compile-- what can you do in terms of improving that process, optimizing it to get the accuracy back up? You could probably even increase the accuracy in your application. And then deploy this onto a real SoC as well.
And you can-- we have not only the model zoo, but we have extensive tools available to really jump-start your application development. So model zoo has all the models that are verified to work. MobileNet V1 that we used in this is one example of that. We have also done a lot of compilation for these models. And we can take the compiled model directly as well. You can even bypass the step that I just showed in step three of compiling the model. You can do that as well.
And then, of course, you can do a lot of benchmarking. So that's 1, 2, 3-- basically three tools that we have available for you for development. And one note on compilation: right now, our TIDL compiler supports all these layers. If you create a model from scratch and, let's say, it has a layer that is not yet supported, that will be dealt with automatically-- it could run on the ARM instead. There is a lot of documentation in our user's guide on how to handle unsupported layers for your own custom model.
It's very rare for developers to choose something that's not already out there. But that option is also available for advanced users. Most of the time what we see is: take an existing model that's already verified, and then either do optimization and change the weights, or do some transfer learning where you're just changing a couple of output layers in the model. But all the documentation is over here, and you can get access to it right away.
And then we have the full SDK. We have not just this Hello World demo, but we have many vision analytics demos, using both our image processing accelerators and also deep learning accelerators. We have some demos where you're not only doing detection but also classification and segmentation, all upwards of 50 frames per second, with different models. And you can see all those links over here as well.
And for more services, we have a comprehensive third-party ecosystem as well for a broad range of applications. That's a possibility as well. So we have listed a couple of partners over here. And so with that, we'll come to the last slide. So what is the call to action?
So hopefully, this webinar gave you exposure to the whole landscape of different software tools and software frameworks that you can use to get started with AI application development, and to how you take that from the PC and run it on the SoC with acceleration. We'll be making this example code available to you after we finish the webinar. You can download it and try it with different images and different video clips. And you can do all that using just the basic commands, plus a few more commands you can find in the respective library documentation.
And the next one is: re-imagine what's possible. We just gave you an idea of what you can do with this real-time video analytics. With your own application and your own imagination, there are so many smart things you can add into your end device.
And as you are getting started on your journey to AI development, we are here for you. Not only with comprehensive software offering that we showed in this webinar, but if you have any issues that you run into, you can use our E2E forum and ask specific questions. And then we'll be happy to help.
And as I said, before I wrap up, this is a monthly webinar series. We're going to be planning different topics. And if you have any specific topics that you want us to cover, please definitely let us know. There is a forum post for this webinar. You can comment there. Or you can just put a post into the e2e.ti.com.
We are planning more detailed sessions on the architecture of the deep learning accelerator, on more detailed aspects of custom model compilation and development, and also on GStreamer and OpenCV image processing, and things like that. There are many different things to learn. So we'll keep these regular sessions going to help you with your AI journey for your application.
That concludes this webinar. Thank you very much again. And we are really excited, and I'm sure you are excited as well to make your end system AI-enabled. Thank you.