SPRADB0 May 2023 AM62A3, AM62A3-Q1, AM62A7, AM62A7-Q1, AM67A, AM68A, AM69A

 

  1. Abstract
  2. Trademarks
  3. 1 Introduction
    1. 1.1 Intended Audience
    2. 1.2 Host Machine Information
  4. 2 Creating the Dataset
    1. 2.1 Collecting Images
    2. 2.2 Labelling Images
    3. 2.3 Augmenting the Dataset (Optional)
  5. 3 Selecting a Model
  6. 4 Training the Model
    1. 4.1 Input Optimization (Optional)
  7. 5 Compiling the Model
  8. 6 Using the Model
  9. 7 Building the End Application
    1. 7.1 Optimizing the Application With TI’s Gstreamer Plugins
    2. 7.2 Using a Raw MIPI-CSI2 Camera
  10. 8 Summary
  11. 9 References

Augmenting the Dataset (Optional)

“Augmenting a dataset” means artificially expanding it by adding copies of the existing data with various types of noise and other modifications applied, without increasing the data-labelling burden. When done well, augmentation improves model robustness and helps reduce overfitting. For instance, adding blurring effects can improve robustness when the camera is out of focus; adding rotations to the image can introduce object orientations that were not part of the original dataset.

If a train-test split is being created for the dataset, the split should be done before augmentation so that the testing set is not contaminated. Some augmentations have little overall impact on an image, so if the split is done after augmentation, a testing image can be a near-perfect match for a training image derived from the same original. This would unfairly increase the calculated accuracy of the model.
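The ordering described above can be sketched as follows. This is a minimal illustration rather than the code used for the food-recognition dataset: the function name `split_then_augment`, the 20% test fraction, and the integer image IDs are assumptions made for the example. Only the figure of eight augmented copies per original comes from this document.

```python
import random


def split_then_augment(image_ids, test_fraction=0.2, copies_per_image=8, seed=0):
    """Split the ORIGINAL image IDs first, then plan augmented copies
    for the training originals only (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)

    n_test = max(1, int(len(shuffled) * test_fraction))
    test_ids = shuffled[:n_test]
    train_ids = shuffled[n_test:]

    # Each training item is (source_image_id, copy_index).
    # copy_index None marks the untouched original.
    train_items = []
    for image_id in train_ids:
        train_items.append((image_id, None))
        for k in range(copies_per_image):
            train_items.append((image_id, k))
    return train_items, test_ids


train_items, test_ids = split_then_augment(range(100))
```

Because augmented copies are only ever derived from training originals, no testing image can have a near-duplicate in the training set.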

For the food-recognition model, the augmentations listed in Table 2-1 are used.

Table 2-1. List of Augmentations

| Size/Orientation Transforms | Filters/Localized Effects | Additive/Multiplicative Effects |
|-----------------------------|---------------------------|---------------------------------|
| Perspective transform       | Gaussian blur             | Gaussian noise                  |
| Flip left-right             | Sharpen                   | Add saturation                  |
| Flip up-down                | Motion blur               | Change color temperature        |
| Shear                       | Contrast                  | Multiply hue and saturation     |
| Rotation                    | JPEG compression          |                                 |
|                             | Autocontrast              |                                 |

Not every augmentation was applied to every image; a random subset was applied to each. For each original image, eight additional augmented versions of the image were created. These augmentations were selected based on how likely their effects are to be seen in practice. This is not an exhaustive list of the available and useful augmentations, but it goes beyond the standard set of flip, crop, resize, rotate, blur, and Gaussian noise. The imgaug library in Python was used to apply these augmentations and to update the bounding boxes in the process, which avoids additional labeling.
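To illustrate what "a random subset" and "maintaining bounding boxes" involve, the sketch below uses only the Python standard library instead of imgaug, so it is a stand-in for the idea and not the approach used for this dataset. The function names, the tiny list-of-lists image, and the `brighten` effect (a placeholder for pixel-only effects such as blur or noise) are assumptions made for the example. Boxes use the COCO `[x, y, width, height]` layout.

```python
import random


def flip_lr(image, boxes):
    """Mirror the image horizontally and move each box with it."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    return flipped, [[width - (x + w), y, w, h] for x, y, w, h in boxes]


def flip_ud(image, boxes):
    """Mirror the image vertically and move each box with it."""
    height = len(image)
    flipped = image[::-1]
    return flipped, [[x, height - (y + h), w, h] for x, y, w, h in boxes]


def brighten(image, boxes):
    """Pixel-only effect: geometry is unchanged, so boxes pass through."""
    return [[min(255, p + 40) for p in row] for row in image], boxes


AUGMENTERS = [flip_lr, flip_ud, brighten]


def make_variants(image, boxes, n_variants=8, seed=0):
    """Create n_variants copies, each with a random subset of augmenters."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        k = rng.randint(1, len(AUGMENTERS))   # how many augmenters to apply
        chosen = rng.sample(AUGMENTERS, k)    # which ones, in random order
        aug_image, aug_boxes = image, boxes
        for augment in chosen:
            aug_image, aug_boxes = augment(aug_image, aug_boxes)
        variants.append((aug_image, aug_boxes))
    return variants


# 4x6 grayscale image with one bright 2x2 object at x=1, y=1.
image = [[0] * 6 for _ in range(4)]
for r in (1, 2):
    for c in (1, 2):
        image[r][c] = 200
boxes = [[1, 1, 2, 2]]

variants = make_variants(image, boxes)
```

Geometric transforms must update the box coordinates, while pixel-level effects leave them untouched. A library such as imgaug handles this bookkeeping for more complex transforms like shear, rotation, and perspective changes.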

Given the type and number of augmentations used here, this dataset is considered ‘heavily’ augmented. Heavy augmentation is not recommended for all applications, especially those where small, fine changes in an image can themselves be the pattern of interest, for example, defect detection in machine vision applications.

The data_manipulation.py Python 3 script within this application's repository includes source code for modifying the images, saving them as files, and adding them to the COCO JSON file that describes the data labels and annotations. It also includes other functions for reworking the data and labels, for example, combining multiple label-studio training sessions into one large dataset and converting JSON label files into an intermediate format that is easier to manipulate.
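As a rough picture of what adding an augmented image to a COCO JSON file involves, the sketch below appends one new image entry and its transformed boxes to a COCO-style dictionary. It is not the code in data_manipulation.py: the function name `add_augmented_image`, the file names, and the sample values are hypothetical, and it assumes the augmented image keeps the dimensions of its source.

```python
import copy


def add_augmented_image(coco, source_image_id, new_file_name, new_boxes):
    """Append an augmented image and its boxes to a COCO-style dict.

    new_boxes holds one [x, y, width, height] box per annotation of the
    source image, in the same order (hypothetical helper).
    """
    source = next(img for img in coco["images"] if img["id"] == source_image_id)
    source_anns = [a for a in coco["annotations"] if a["image_id"] == source_image_id]
    if len(source_anns) != len(new_boxes):
        raise ValueError("expected one new box per source annotation")

    new_image_id = max(img["id"] for img in coco["images"]) + 1
    next_ann_id = max((a["id"] for a in coco["annotations"]), default=0) + 1

    # Assumption: same width and height as the source image.
    coco["images"].append(dict(source, id=new_image_id, file_name=new_file_name))

    for ann, box in zip(source_anns, new_boxes):
        new_ann = copy.deepcopy(ann)  # keeps category_id and other fields
        new_ann.update(id=next_ann_id, image_id=new_image_id,
                       bbox=list(box), area=box[2] * box[3])
        coco["annotations"].append(new_ann)
        next_ann_id += 1
    return new_image_id


coco = {
    "images": [{"id": 1, "file_name": "pizza_001.jpg", "width": 640, "height": 480}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1,
                     "bbox": [100, 120, 200, 150], "area": 30000, "iscrowd": 0}],
    "categories": [{"id": 1, "name": "pizza"}],
}

# Box after a left-right flip of a 640-pixel-wide image: x = 640 - (100 + 200).
new_id = add_augmented_image(coco, 1, "pizza_001_aug0.jpg", [[340, 120, 200, 150]])
```

Reusing the original annotation as a template carries the category label over to the augmented copy, which is what removes the need for manual relabeling.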