SPRADB0 May 2023 AM62A3, AM62A3-Q1, AM62A7, AM62A7-Q1, AM67A, AM68A, AM69A

 

  Abstract
  Trademarks
  1 Introduction
    1.1 Intended Audience
    1.2 Host Machine Information
  2 Creating the Dataset
    2.1 Collecting Images
    2.2 Labelling Images
    2.3 Augmenting the Dataset (Optional)
  3 Selecting a Model
  4 Training the Model
    4.1 Input Optimization (Optional)
  5 Compiling the Model
  6 Using the Model
  7 Building the End Application
    7.1 Optimizing the Application With TI’s Gstreamer Plugins
    7.2 Using a Raw MIPI-CSI2 Camera
  8 Summary
  9 References

Creating the Dataset

The first step in creating a neural network, also called a "model", is to create or curate a dataset. Machine learning and deep learning are data-driven approaches to solving problems, so a model is only as good as the data it is trained on. Ideally, a model is trained on data that is specific to the final task the model is designed for.

Public datasets like COCO or ImageNet provide a convenient path to developing and evaluating a deep learning model. Many publicly available datasets exist, and many can be found on sites like paperswithcode.com [5]. However, not all public datasets are usable under their license terms, and some have too few high-quality data points. At the same time, custom datasets are time-consuming to create.
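
One quick way to judge whether a public dataset has enough labeled examples for the classes of interest is to count annotated instances per category. The following is a minimal sketch, assuming the annotations follow the COCO JSON layout (top-level "categories" and "annotations" lists). The category names, the helper function names, and the minimum-instance threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_instances_per_category(coco: dict) -> dict:
    """Return {category name: number of annotated instances} for a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco.get("categories", [])}
    # Start every known category at zero so empty classes still show up.
    counts = Counter({name: 0 for name in id_to_name.values()})
    for ann in coco.get("annotations", []):
        name = id_to_name.get(ann["category_id"], "<unknown category>")
        counts[name] += 1
    return dict(counts)


def underrepresented(counts: dict, min_instances: int) -> list:
    """List categories with fewer than min_instances labeled objects, sorted by name."""
    return sorted(name for name, n in counts.items() if n < min_instances)


# Tiny in-memory example (hypothetical data); a real annotation file
# would be read with the json module instead.
example = {
    "images": [{"id": 1}, {"id": 2}],
    "categories": [{"id": 1, "name": "apple"}, {"id": 2, "name": "banana"}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 1},
        {"image_id": 2, "category_id": 2},
    ],
}

counts = count_instances_per_category(example)
print(counts)
print(underrepresented(counts, min_instances=2))
```

A count like this only measures quantity. Label quality, such as loose or incorrect bounding boxes, still has to be checked by inspecting samples.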

In the context of a retail scanner application, the usable and licensable online datasets were not of high enough quality to use as-is. Many images were crowd-sourced and poorly labeled, such that the trained model performed poorly both on the validation subset of the data and in practice within a well-lit checkout area. Creating a dataset from scratch was necessary.