SPRADB0 Application note

SPRADB0 May 2023 AM62A3, AM62A3-Q1, AM62A7, AM62A7-Q1, AM67A, AM68A, AM69A

2.3 Augmenting the Dataset (Optional)

“Augmenting a dataset” means artificially expanding it by adding copies of the data with various types of noise and modifications, without increasing the data-labeling burden. When done well, augmentation improves model robustness and helps reduce overfitting. For instance, adding blurring effects can improve robustness when the camera is out of focus; adding rotations to the image can introduce object orientations that were not part of the original dataset.

If a train-test split is being created for the dataset, the split should be done before augmentation so as not to contaminate the testing set. Some augmentations change an image only slightly, so an augmented copy can be a near-perfect match for its original. If one of the pair lands in the training set and the other in the testing set, the calculated accuracy of the model is unfairly inflated.
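The ordering described above can be sketched as follows. This is a minimal illustration, not code from this application's repository: the function name `split_then_augment`, the 20% test fraction, and the `_aug<k>` naming scheme are assumptions made for the example, and the augmented copies are represented only by their names.

```python
import random


def split_then_augment(image_ids, test_fraction=0.2, copies_per_image=8, seed=0):
    """Split the originals first, then plan augmented copies for training only.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)

    # 1. Split the ORIGINAL images into test and training subsets.
    n_test = int(round(len(shuffled) * test_fraction))
    test_ids = shuffled[:n_test]
    train_ids = shuffled[n_test:]

    # 2. Derive augmented copies only from training originals, so no test
    #    image has a near-duplicate in the training set.
    augmented = [f"{img}_aug{k}" for img in train_ids for k in range(copies_per_image)]
    return train_ids + augmented, test_ids


originals = [f"img{i:03d}" for i in range(10)]
train_set, test_set = split_then_augment(originals)
print(len(train_set), len(test_set))  # 72 2
```

Because the test images are set aside before any copies are made, every augmented image in the training set traces back to a training original.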

For the food-recognition model, the augmentations listed in Table 2-1 are used.

Table 2-1 List of Augmentations

Size/Orientation Transforms	Filters/Localized Effects	Additive/Multiplicative Effects
Perspective transform	Gaussian blur	Gaussian noise
Flip left-right	Sharpen	Add saturation
Flip up-down	Motion blur	Change color temperature
Shear	Contrast	Multiply hue and saturation
Rotation	JPEG compression	
	Autocontrast	

Not every augmentation was applied to every image; instead, a random subset was applied to each one. For each original image, eight additional augmented versions were created. The augmentations were selected based on how likely their effects are to be seen in practice. This is not an exhaustive list of all the available and useful augmentations, but it includes more than the standard set of flip, crop, resize, rotate, blur, and add Gaussian noise. The imgaug library in Python was used to apply these augmentations and to maintain the bounding boxes in the process, which avoids additional labeling.
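The two ideas in this paragraph, choosing a random subset of augmentations per copy and keeping bounding boxes valid under a geometric transform, can be sketched in plain Python. This sketch does not use imgaug and is not the method used for the food-recognition dataset. The augmentation names mirror Table 2-1, while the subset size range (1 to 4), the helper names, and the COCO-style `[x, y, width, height]` box layout are assumptions for illustration.

```python
import random

# Names mirror Table 2-1; applying the actual image operations is out of scope.
AUGMENTATIONS = [
    "perspective_transform", "flip_left_right", "flip_up_down", "shear", "rotation",
    "gaussian_blur", "sharpen", "motion_blur", "contrast", "jpeg_compression",
    "autocontrast", "gaussian_noise", "add_saturation",
    "change_color_temperature", "multiply_hue_and_saturation",
]
COPIES_PER_IMAGE = 8  # eight augmented versions per original, as stated in the text


def pick_subset(rng, min_count=1, max_count=4):
    """Choose a random subset of augmentations (the size range is an assumption)."""
    count = rng.randint(min_count, max_count)
    return rng.sample(AUGMENTATIONS, count)


def flip_bbox_left_right(bbox, image_width):
    """Mirror an [x, y, width, height] box horizontally within the image."""
    x, y, w, h = bbox
    return [image_width - x - w, y, w, h]


def flip_bbox_up_down(bbox, image_height):
    """Mirror an [x, y, width, height] box vertically within the image."""
    x, y, w, h = bbox
    return [x, image_height - y - h, w, h]


rng = random.Random(42)
plans = [pick_subset(rng) for _ in range(COPIES_PER_IMAGE)]
for index, plan in enumerate(plans):
    print(f"copy {index}: {plan}")

flipped = flip_bbox_left_right([10, 20, 30, 40], image_width=100)
print(flipped)  # [60, 20, 30, 40]
```

Filters and additive effects such as blur or noise leave box coordinates unchanged; only the size and orientation transforms need a matching box update, which a library such as imgaug handles for the user.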

Given the type and number of augmentations used here, this dataset is considered ‘heavily’ augmented. Heavy augmentation is not recommended for all applications, especially those where small, fine changes in an image can themselves be the pattern of interest, for example, defect detection in machine vision applications.

The data_manipulation.py Python 3 script within this application's repository includes source code for modifying the images, saving them as files, and adding them to the COCO JSON file that describes the data labels/annotations. It also includes other functions for reworking the data and labels, for example, combining multiple label-studio training sessions into one large dataset and converting JSON label files into an intermediate format that is easier to manipulate.
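As a rough illustration of what adding an augmented image to a COCO JSON file involves, the sketch below appends one new image entry and its annotations to a COCO-style dictionary. It is not taken from data_manipulation.py: the function name `add_augmented_image` and the sample data are hypothetical, and the sketch assumes that the augmented image keeps the width and height of its source and that the transformed boxes are supplied in the same order as the source annotations.

```python
import copy
import json


def add_augmented_image(coco, source_image_id, new_file_name, new_bboxes):
    """Append an augmented image and its boxes to a COCO-style dict (in place).

    Hypothetical helper; new_bboxes holds one [x, y, width, height] box per
    annotation of the source image, in the same order.
    """
    source = next(img for img in coco["images"] if img["id"] == source_image_id)
    source_anns = [a for a in coco["annotations"] if a["image_id"] == source_image_id]
    if len(source_anns) != len(new_bboxes):
        raise ValueError("expected one new bounding box per source annotation")

    # New image entry: same metadata as the source, new id and file name.
    new_image = copy.deepcopy(source)
    new_image["id"] = max(img["id"] for img in coco["images"]) + 1
    new_image["file_name"] = new_file_name
    coco["images"].append(new_image)

    # New annotations: labels are reused, only ids and geometry change.
    next_ann_id = max((a["id"] for a in coco["annotations"]), default=0) + 1
    for ann, bbox in zip(source_anns, new_bboxes):
        new_ann = copy.deepcopy(ann)
        new_ann.update(id=next_ann_id, image_id=new_image["id"],
                       bbox=list(bbox), area=bbox[2] * bbox[3])
        coco["annotations"].append(new_ann)
        next_ann_id += 1
    return new_image["id"]


coco = {
    "images": [{"id": 1, "file_name": "apple.jpg", "width": 640, "height": 480}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 3,
                     "bbox": [10, 20, 30, 40], "area": 1200, "iscrowd": 0}],
    "categories": [{"id": 3, "name": "apple"}],
}
# Box after a left-right flip of a 640-pixel-wide image: x = 640 - 10 - 30.
new_id = add_augmented_image(coco, 1, "apple_aug0.jpg", [[600, 20, 30, 40]])
serialized = json.dumps(coco, indent=2)  # ready to write back to the label file
```

Reusing the source annotation's `category_id` is what lets augmentation grow the dataset without any additional labeling effort.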