Step 1: Prepare dataset

This document explains the dataset structures that are compatible with NetsPresso. NetsPresso supports tasks such as image classification, object detection, and semantic segmentation.

1. Datasets for image classification

Supported image file types

'jpg', 'png', 'jpeg'

Supported formats

ImageNet format

ImageNet format

Dataset structure

In ImageNet format, each class has its own directory, named after the class. If the directory names are not the class names, a mapping.txt file must be provided.
Example of dataset structure

mapping.txt for class information (optional)

Comments are not allowed: do not include lines starting with '#' or '//'.
Example of mapping.txt

class-1 car
class-2 banana
...
class-n rice cake
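
The mapping.txt rules above can be sketched as a small parser. This is only an illustration, not NetsPresso's own loader: the function name `parse_mapping` is hypothetical, and it assumes (from the example above) that each line holds a directory name followed by a class name, and that the class name may contain spaces.

```python
def parse_mapping(text):
    """Parse mapping.txt content into {directory_name: class_name}.

    Assumption: each line is '<directory> <class name>', split on the
    first run of whitespace, so 'class-n rice cake' maps to 'rice cake'.
    """
    mapping = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(("#", "//")):
            # Comments are not allowed in mapping.txt.
            raise ValueError(f"line {line_no}: comments are not allowed")
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected '<directory> <class name>'")
        directory, class_name = parts
        mapping[directory] = class_name
    return mapping


example = "class-1 car\nclass-2 banana\nclass-n rice cake\n"
print(parse_mapping(example))
```

Running a check like this before uploading catches stray comment lines and lines that are missing a class name.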

2. Datasets for object detection

Supported image file types

'bmp', 'jpg', 'jpeg', 'png', 'tif', 'tiff', 'dng', 'webp', 'mpo'

Supported formats

YOLO format (dataset yaml file needed) — Recommended
COCO format
VOC format

Labeling tools such as CVAT and labelImg support these formats.

If you are using Roboflow, its YOLO v5 PyTorch, COCO, and Pascal VOC export formats are compatible with NetsPresso. (Use Export, then "download zip to computer", to download the dataset.)

YOLO format

Dataset structure

YOLO format has a '.txt' file for each image, with the same file name as the image. If an image contains no objects, no '.txt' file is required for it. Make sure that every '.txt' file has a corresponding image file.
Example of dataset structure

Example of txt file

Example image with 4 objects (3 people, 1 hammer).

0 0.415730 0.494949 0.394864 0.828283
0 0.659711 0.476010 0.205457 0.648990
0 0.848315 0.375000 0.229535 0.674242
42 0.776083 0.356061 0.062600 0.161616

One row per object
Each row represents {class number} {center_x} {center_y} {width} {height}
Box coordinates must be in normalized xywh format (values from 0 to 1). If your boxes are in pixels, divide center_x and width by the image width, and center_y and height by the image height.
- For example, the image above is 623px wide and 396px high, and the first object in its label has center_x 259, center_y 196, width 246, and height 328 in pixels. After normalization, the coordinates are center_x 0.415730, center_y 0.494949, width 0.394864, height 0.828283.
Class numbers are zero-indexed (starting from 0).
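
The normalization described above can be sketched in a few lines of Python. The function name `to_yolo_row` is hypothetical, and the six-decimal formatting simply mirrors the example rows above. It is not a NetsPresso requirement stated in this document.

```python
def to_yolo_row(class_id, center_x, center_y, width, height, img_w, img_h):
    """Build one YOLO label row from a pixel-space box (center x/y, width, height)."""
    values = (
        center_x / img_w,  # center_x is divided by the image width
        center_y / img_h,  # center_y is divided by the image height
        width / img_w,     # width is divided by the image width
        height / img_h,    # height is divided by the image height
    )
    if not all(0.0 <= v <= 1.0 for v in values):
        raise ValueError("normalized values must lie between 0 and 1")
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# First object of the example image (623px wide, 396px high).
print(to_yolo_row(0, 259, 196, 246, 328, 623, 396))
# 0 0.415730 0.494949 0.394864 0.828283
```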

data.yaml for class information

YOLO format requires a YAML file that contains the class names and the number of classes.
The entries of names must be listed in class-number order, so that the position of each name matches the class number used in the dataset.
- names: Names of the class
- nc: The number of classes
Example of data.yaml

names:
- aeroplane
- bicycle
- bird
- boat
- bottle
- bus
- car
- cat
- chair
- cow
- diningtable
- dog
- horse
- motorbike
- person
- pottedplant
- sheep
- sofa
- train
- tvmonitor
nc: 20
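
As a rough sanity check, the relationship between names, nc, and the class numbers in the label files can be verified as below. This sketch takes already-loaded values rather than reading the YAML file itself (a YAML parser would be an extra dependency), and the function name `check_classes` is hypothetical.

```python
def check_classes(names, nc, label_rows):
    """Return a list of problems between data.yaml values and YOLO label rows."""
    problems = []
    if len(names) != nc:
        problems.append(f"nc is {nc} but names has {len(names)} entries")
    for row in label_rows:
        if not row.strip():
            continue  # ignore blank rows
        class_id = int(row.split()[0])  # the first field is the class number
        # Class numbers are zero-indexed, so valid values are 0 .. len(names) - 1.
        if not 0 <= class_id < len(names):
            problems.append(f"class number {class_id} has no entry in names")
    return problems


names = ["aeroplane", "bicycle", "bird"]
rows = ["0 0.5 0.5 0.2 0.2", "2 0.1 0.1 0.1 0.1"]
print(check_classes(names, 3, rows))  # an empty list means no problems were found
```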

COCO format

Dataset structure

Please refer to the official COCO data format for the COCO label format.
COCO format has one '.json' file for all images in the same folder. Each set (train, test, and valid) should have its own '.json' file. Make sure that the file names listed in the '.json' file are the same as the names of the image files.
Example of dataset structure
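
The file-name requirement above can be checked with a short script. This is a minimal sketch: it relies only on the standard COCO layout, where the top-level "images" list holds a "file_name" for each image. It assumes those values are plain file names without sub-directories. The function name and the paths you pass in are hypothetical.

```python
import json
from pathlib import Path


def missing_coco_images(json_path, image_dir):
    """Return file names listed in the COCO '.json' file that are absent from image_dir."""
    with open(json_path, encoding="utf-8") as f:
        coco = json.load(f)
    # Every entry of the "images" list carries the file name of one image.
    listed = {image["file_name"] for image in coco["images"]}
    present = {p.name for p in Path(image_dir).iterdir() if p.is_file()}
    return sorted(listed - present)
```

An empty result means every image named in the '.json' file was found in the folder.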

VOC format

Dataset structure

Please refer to the official VOC data format for the VOC label format.
VOC format has one '.xml' file per image, with the same file name as the image. If an image contains no objects, no '.xml' file is required for it. Make sure that every '.xml' file has a corresponding image file.
Example of dataset structure
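
The rule that every label file needs a matching image can be checked as sketched below. The function name is hypothetical, and the image and label directories are passed in as parameters because this sketch makes no assumption about where NetsPresso expects them. The same check works for YOLO labels by passing '.txt' as the label extension.

```python
from pathlib import Path

# Image file types listed above for object detection.
IMAGE_EXTENSIONS = {".bmp", ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".dng", ".webp", ".mpo"}


def labels_without_images(image_dir, label_dir, label_ext=".xml"):
    """Return label file names whose base name matches no image file."""
    image_stems = {
        p.stem for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS
    }
    return sorted(
        p.name for p in Path(label_dir).iterdir()
        if p.suffix.lower() == label_ext and p.stem not in image_stems
    )
```

Images without a label file are not reported, since an image with no objects does not need one.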

3. Datasets for semantic segmentation

Supported image file types

'jpg', 'png'

Supported formats

UNet (like YOLO)

UNet (like YOLO)

Dataset structure

UNet (like YOLO) format has a single-channel mask image file for each image, with the same file name as the image. The mask file contains an integer value for every pixel, which represents that pixel's segmentation class. The mask file and the image file must have the same image size. Make sure that every mask file has a corresponding image file.
Example of dataset structure

id2label.json for class information

The number of classes in id2label.json should be the same as the number of classes in the mask files.
The json file must be named exactly 'id2label.json'; any typo in the file name will cause an error.
Example of id2label.json

{
  "0": "road",
  "1": "sidewalk",
  "2": "building",
  "3": "wall",
  "4": "fence",
  "5": "pole",
  "6": "traffic light",
  "7": "traffic sign",
  "8": "vegetation",
  "9": "terrain",
  "10": "sky",
  "11": "person",
  "12": "rider",
  "13": "car",
  "14": "truck",
  "15": "bus",
  "16": "train",
  "17": "motorcycle",
  "18": "bicycle"
}

palette.json for color index for each class

Example of palette.json

{
  "0": [128, 64, 128],
  "1": [244, 35, 232],
  "2": [70, 70, 70],
  "3": [102, 102, 156],
  "4": [190, 153, 153],
  "5": [153, 153, 153],
  "6": [250, 170, 30],
  "7": [220, 220, 0],
  "8": [107, 142, 35],
  "9": [152, 251, 152],
  "10": [70, 130, 180],
  "11": [220, 20, 60],
  "12": [255, 0, 0],
  "13": [0, 0, 142],
  "14": [0, 0, 70],
  "15": [0, 60, 100],
  "16": [0, 80, 100],
  "17": [0, 0, 230],
  "18": [119, 11, 32]
}
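
A quick consistency check between the two files can be sketched as follows. It assumes, based only on the examples above, that palette.json uses the same class keys as id2label.json and that each color is a list of three integers between 0 and 255. The function name `check_palette` is hypothetical.

```python
import json


def check_palette(id2label, palette):
    """Return a list of problems between id2label.json and palette.json contents."""
    problems = []
    for class_id in sorted(set(id2label) - set(palette)):
        problems.append(f"class {class_id} has no color in palette.json")
    for class_id in sorted(set(palette) - set(id2label)):
        problems.append(f"class {class_id} has a color but no name in id2label.json")
    for class_id, color in palette.items():
        # Assumption: a color is a list of three integers in the range 0..255.
        valid = (
            isinstance(color, list)
            and len(color) == 3
            and all(isinstance(c, int) and 0 <= c <= 255 for c in color)
        )
        if not valid:
            problems.append(f"class {class_id}: color must be three integers in 0..255")
    return problems


id2label = json.loads('{"0": "road", "1": "sidewalk"}')
palette = json.loads('{"0": [128, 64, 128], "1": [244, 35, 232]}')
print(check_palette(id2label, palette))  # an empty list means no problems were found
```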