Step 1: Prepare dataset
This document explains the dataset structures that are compatible with NetsPresso. NetsPresso supports tasks such as image classification, object detection, and semantic segmentation.
1. Datasets for image classification
Supported image file types
- 'jpg', 'png', 'jpeg'
Supported formats
- ImageNet format
ImageNet format
Dataset structure
- In the ImageNet format, each class has its own directory, named after the class. If the directory names are not the class names, a mapping.txt file must be provided.
- Example of dataset structure
mapping.txt for class information (optional)
- Comments (lines starting with '#' or '//') are not allowed.
- Example of mapping.txt
class-1 car
class-2 banana
...
class-n rice cake
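As a rough illustration of the mapping.txt layout above, the sketch below reads the file contents into a dictionary. It assumes that the first token on each line is the directory name and that the rest of the line is the class name (which may contain spaces, as in "rice cake"); that reading is inferred from the example, and `parse_mapping` is a hypothetical helper, not part of NetsPresso.

```python
def parse_mapping(text):
    """Return {directory name: class name} from the contents of mapping.txt.

    Assumption: "<directory> <class name>" per line, split on the first
    run of whitespace. Comment lines are rejected, as the format requires.
    """
    mapping = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are harmless and can be skipped
        if line.startswith(("#", "//")):
            raise ValueError(f"line {line_no}: comments are not allowed")
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected '<directory> <class name>'")
        directory, class_name = parts
        mapping[directory] = class_name
    return mapping


example = "class-1 car\nclass-2 banana\nclass-n rice cake\n"
print(parse_mapping(example))
```

Running a check like this before uploading makes it easier to catch stray comment lines or a line that is missing its class name.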
2. Datasets for object detection
Supported image file types
- 'bmp', 'jpg', 'jpeg', 'png', 'tif', 'tiff', 'dng', 'webp', 'mpo'
Supported formats
- YOLO format (dataset yaml file needed) — Recommended
- COCO format
- VOC format
Labeling tools such as CVAT and labelimg support these formats.
If you are using Roboflow, its YOLO v5 PyTorch, COCO, and Pascal VOC export formats are compatible with NetsPresso. (To download the dataset, use Export and choose "download zip to computer".)
YOLO format
Dataset structure
- YOLO format has a '.txt' file for each image, with the same file name as the image. If an image contains no objects, no '.txt' file is required for that image. Make sure that every '.txt' file has a corresponding image file.
- Example of dataset structure
Example of txt file
- Example image with 4 objects (3 people, 1 hammer).
0 0.415730 0.494949 0.394864 0.828283
0 0.659711 0.476010 0.205457 0.648990
0 0.848315 0.375000 0.229535 0.674242
42 0.776083 0.356061 0.062600 0.161616
- One row per object
- Each row represents
{class number} {center_x} {center_y} {width} {height}
- Box coordinates must be in a normalized xywh format (from 0 to 1). If your boxes are in pixels, divide center_x and width by image width, and center_y and height by image height.
- For example, the image above has a size of width 623px, height 396px, and the coordinates of the first object in its label are center_x 259, center_y 196, width 246, height 328. After normalization, the coordinates are center_x 0.415730, center_y 0.494949, width 0.394864, height 0.828283.
- Class numbers are zero-indexed (starting from 0).
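The normalization described in the list above can be sketched in a few lines. The function name `to_yolo_row` and its parameters are made up for illustration; the numbers are the ones from the worked example (623 x 396 px image, first object).

```python
def to_yolo_row(class_id, center_x, center_y, width, height, img_w, img_h):
    """Convert a pixel-space box to one normalized YOLO label row."""
    # x-values are divided by the image width, y-values by the image height.
    values = (center_x / img_w, center_y / img_h, width / img_w, height / img_h)
    if not all(0.0 <= v <= 1.0 for v in values):
        raise ValueError("normalized coordinates must lie between 0 and 1")
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


print(to_yolo_row(0, 259, 196, 246, 328, img_w=623, img_h=396))
# -> 0 0.415730 0.494949 0.394864 0.828283
```

The range check is a small safeguard: a value above 1 usually means that width and height were swapped or that the box was measured on a resized copy of the image.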
data.yaml for class information
- The YOLO format requires a YAML file that contains the class names and the number of classes.
- The entries in names must be listed in class-number order, so that the position of each name matches the class number used in the dataset.
- names: Names of the class
- nc: The number of classes
- Example of data.yaml
names:
- aeroplane
- bicycle
- bird
- boat
- bottle
- bus
- car
- cat
- chair
- cow
- diningtable
- dog
- horse
- motorbike
- person
- pottedplant
- sheep
- sofa
- train
- tvmonitor
nc: 20
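One way to sanity-check a data.yaml against the label files is sketched below. It assumes the YAML has already been loaded into a plain dictionary (for instance with PyYAML's `yaml.safe_load`, which is not required by anything in this guide), and `check_class_config` is a hypothetical helper name.

```python
def check_class_config(config, label_rows=()):
    """Return a list of problems found in a loaded data.yaml dictionary.

    config     -- dict with "names" (list of class names) and "nc" (int)
    label_rows -- optional YOLO label rows, e.g. "0 0.41 0.49 0.39 0.82"
    """
    problems = []
    names = config.get("names", [])
    nc = config.get("nc")
    if nc != len(names):
        problems.append(f"nc is {nc} but names has {len(names)} entries")
    for row in label_rows:
        class_id = int(row.split()[0])  # first field is the class number
        if not 0 <= class_id < len(names):
            problems.append(f"class number {class_id} has no entry in names")
    return problems


config = {"names": ["aeroplane", "bicycle", "bird"], "nc": 3}
print(check_class_config(config, ["2 0.5 0.5 0.2 0.2", "3 0.1 0.1 0.1 0.1"]))
```

Because class numbers are zero-indexed, a file with three names accepts class numbers 0 through 2, so the second row in the usage example is reported.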
COCO format
Dataset structure
- Please refer to the official COCO data format documentation for the COCO label format.
- COCO format has one '.json' file for all images in the same folder. Each set (train, test, and valid) should have its own '.json' file. Make sure that the file names listed in the '.json' file are the same as the names of the image files.
- Example of dataset structure
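Separately from the structure example, the file name requirement above can be checked with a short script. The sketch relies on the standard COCO layout, in which the top-level "images" list holds one entry per image with a "file_name" field; it further assumes that "file_name" holds a bare file name rather than a relative path. Both function names are hypothetical.

```python
import json
from pathlib import Path


def missing_coco_images(coco, available_names):
    """Return file names listed in the COCO dict that are not available."""
    available = set(available_names)
    return [
        image["file_name"]
        for image in coco.get("images", [])
        if image["file_name"] not in available
    ]


def check_coco_split(json_path, image_dir):
    """Load one split's '.json' file and compare it with its image folder."""
    coco = json.loads(Path(json_path).read_text(encoding="utf-8"))
    names = [p.name for p in Path(image_dir).iterdir() if p.is_file()]
    return missing_coco_images(coco, names)


coco = {"images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}]}
print(missing_coco_images(coco, ["a.jpg"]))  # -> ['b.jpg']
```

An empty result means that every image referenced by the annotation file was found; the check would be run once per set (train, test, and valid).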
VOC format
Dataset structure
- Please refer to the official VOC data format documentation for the VOC label format.
- VOC format has one '.xml' file per image, with the same file name as the image. If an image contains no objects, no '.xml' file is required for that image. Make sure that every '.xml' file has a corresponding image file.
- Example of dataset structure
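Separately from the structure example, the rule that every '.xml' file needs a matching image can be checked by comparing file names without their extensions. The sketch below works on a plain list of file names, so it does not assume any particular folder layout; `orphan_labels` is a hypothetical helper, and passing `label_suffix=".txt"` would apply the same check to YOLO labels.

```python
from pathlib import Path

# Image file types listed as supported for object detection in this guide.
IMAGE_SUFFIXES = {".bmp", ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".dng", ".webp", ".mpo"}


def orphan_labels(file_names, label_suffix=".xml"):
    """Return label files that have no image with the same base name."""
    image_stems = {
        Path(name).stem
        for name in file_names
        if Path(name).suffix.lower() in IMAGE_SUFFIXES
    }
    return sorted(
        name
        for name in file_names
        if Path(name).suffix.lower() == label_suffix
        and Path(name).stem not in image_stems
    )


print(orphan_labels(["a.jpg", "a.xml", "b.xml", "c.png"]))  # -> ['b.xml']
```

Images without a label file are deliberately not reported, since an image with no objects does not need one.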
3. Datasets for semantic segmentation
Supported image file types
- 'jpg', 'png'
Supported formats
- UNet (like YOLO)
UNet (like YOLO)
Dataset structure
- UNet (like YOLO) format has a single-channel image mask file for each image, with the same file name as the image. The mask file contains an integer value for every pixel, representing that pixel's segmentation class. The mask file and the image file must have the same image size. Make sure that every mask file has a corresponding image file.
- Example of dataset structure
id2label.json for class information
- The number of classes in id2label.json should be the same as the number of classes in the mask files.
- The JSON file must be named exactly 'id2label.json'; any typo in the file name will cause an error.
- Example of id2label.json
{
"0": "road",
"1": "sidewalk",
"2": "building",
"3": "wall",
"4": "fence",
"5": "pole",
"6": "traffic light",
"7": "traffic sign",
"8": "vegetation",
"9": "terrain",
"10": "sky",
"11": "person",
"12": "rider",
"13": "car",
"14": "truck",
"15": "bus",
"16": "train",
"17": "motorcycle",
"18": "bicycle"
}
palette.json for color index for each class
- Example of palette.json
{
"0": [128, 64, 128],
"1": [244, 35, 232],
"2": [70, 70, 70],
"3": [102, 102, 156],
"4": [190, 153, 153],
"5": [153, 153, 153],
"6": [250, 170, 30],
"7": [220, 220, 0],
"8": [107, 142, 35],
"9": [152, 251, 152],
"10": [70, 130, 180],
"11": [220, 20, 60],
"12": [255, 0, 0],
"13": [0, 0, 142],
"14": [0, 0, 70],
"15": [0, 60, 100],
"16": [0, 80, 100],
"17": [0, 0, 230],
"18": [119, 11, 32]
}
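A quick consistency check between the two JSON files might look like the sketch below. It rests on two assumptions that the text above does not state outright: that palette.json should cover the same class ids as id2label.json, and that each palette entry is a list of three integers between 0 and 255 (presumably an RGB color). `check_segmentation_config` is a hypothetical helper.

```python
import json


def check_segmentation_config(id2label, palette):
    """Return a list of problems found in loaded id2label/palette dicts."""
    problems = []
    # Assumption: both files should describe exactly the same class ids.
    if set(id2label) != set(palette):
        problems.append("id2label and palette do not cover the same class ids")
    for class_id, color in palette.items():
        valid = (
            isinstance(color, list)
            and len(color) == 3
            and all(isinstance(c, int) and 0 <= c <= 255 for c in color)
        )
        if not valid:
            problems.append(f"class {class_id}: expected three integers in 0-255")
    return problems


id2label = json.loads('{"0": "road", "1": "sidewalk"}')
palette = json.loads('{"0": [128, 64, 128], "1": [244, 35, 232]}')
print(check_segmentation_config(id2label, palette))  # -> []
```

With real files, the two dictionaries would come from `json.load` on id2label.json and palette.json instead of the inline strings used here.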