Documentation

Features & Scope of support

NetsPresso Compressor

Compresses models for better computational efficiency

Key Features

Automatic Compression

  • Focus on the compression itself instead of spending time implementing complicated methods
  • Compress PyTorch and TensorFlow models immediately

Structured Pruning, Filter Decomposition

  • Structured pruning removes whole filters or channels, which reduces the amount of computation and directly improves the inference speed of AI models
  • Filter decomposition approximates the filters of an AI model with smaller factors while retaining the important information
  • The compressed model can be fine-tuned afterwards to recover accuracy
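
As an illustration of criterion-based structured pruning, the following minimal sketch ranks the filters of one layer by L2 norm and keeps the largest ones. It is not the NetsPresso implementation: the function names, the flat-list filter layout, and the `pruning_ratio` parameter are assumptions made for this example.

```python
import math


def l2_norm(filter_weights):
    """L2 norm of one filter given as a flat list of weights."""
    return math.sqrt(sum(w * w for w in filter_weights))


def select_filters_to_keep(filters, pruning_ratio):
    """Return the indices of the filters that survive pruning, in order.

    filters: list of filters, each a flat list of weights (hypothetical layout).
    pruning_ratio: fraction of filters to remove, in [0, 1).
    """
    num_prune = int(len(filters) * pruning_ratio)
    # Rank filter indices from smallest to largest norm.
    ranked = sorted(range(len(filters)), key=lambda i: l2_norm(filters[i]))
    pruned = set(ranked[:num_prune])
    return [i for i in range(len(filters)) if i not in pruned]


# Toy layer with four filters of three weights each.
layer = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.4, -0.6],
    [0.05, -0.03, 0.02],
]
kept = select_filters_to_keep(layer, pruning_ratio=0.5)
pruned_layer = [layer[i] for i in kept]
print(kept)
```

Because whole filters are removed, the layer that consumes this output must drop the matching input channels as well. That is what makes the pruning "structured" and lets the smaller model run faster without sparse kernels.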

HW-aware Model Profiling

  • Visualize the neural network to inspect its structure and see which layers can be compressed
  • Profile the neural network on the target hardware to decide which layers to compress and by how much

Workflow

Scope of support

Model Compressor

Framework

  • PyTorch (PyTorch version ≥ 1.11)
    • PyTorch-ONNX (ONNX version ≥ 1.10)
    • PyTorch-GraphModule
  • TensorFlow-Keras (TensorFlow version 2.3 to 2.8)
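
A local environment can be checked against these version ranges before uploading a model. The sketch below is only an illustration: `major_minor` and `is_supported` are hypothetical helpers, and it assumes that any patch release of a listed minor version (for example 2.8.4) counts as supported. Versions are compared as integer tuples because a plain string comparison would rank "1.9" above "1.11".

```python
def major_minor(version):
    """Extract (major, minor) as integers from a string such as '1.13.1+cu117'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def is_supported(version, minimum, maximum=None):
    """Check that version lies in [minimum, maximum] at major.minor granularity."""
    current = major_minor(version)
    if current < major_minor(minimum):
        return False
    if maximum is not None and current > major_minor(maximum):
        return False
    return True


# Ranges taken from the framework list above.
print(is_supported("1.9.0", "1.11"))          # PyTorch, too old
print(is_supported("1.13.1+cu117", "1.11"))   # PyTorch, recent enough
print(is_supported("2.9.0", "2.3", "2.8"))    # TensorFlow, too new
```

In practice the version string would come from the installed package, for example `torch.__version__` or `tensorflow.__version__`.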

Compression methods

  • Structured Pruning: Pruning by index, Pruning by criteria (L2 Norm, Geometric Median (GM), Nuclear Norm)
  • Filter Decomposition: Tucker Decomposition, Singular Value Decomposition, CP Decomposition
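
To give a feel for why filter decomposition shrinks a model, the sketch below counts weights before and after two standard factorizations: a rank-r SVD of a fully connected layer, and a Tucker-2 decomposition of a convolution into a 1×1, a k×k, and a 1×1 convolution. The layer sizes and ranks are illustrative assumptions, biases are ignored, and nothing here reflects how NetsPresso chooses ranks.

```python
def dense_params(m, n):
    """Weights in a fully connected layer with an m x n weight matrix."""
    return m * n


def svd_params(m, n, rank):
    """Weights after replacing the m x n matrix with (m x rank) and (rank x n) factors."""
    return rank * (m + n)


def conv_params(c_in, c_out, k):
    """Weights in a k x k convolution from c_in to c_out channels."""
    return c_in * c_out * k * k


def tucker2_params(c_in, c_out, k, r_in, r_out):
    """Weights after Tucker-2: 1x1 (c_in -> r_in), k x k (r_in -> r_out), 1x1 (r_out -> c_out)."""
    return c_in * r_in + r_in * r_out * k * k + r_out * c_out


# Illustrative sizes only.
print(dense_params(512, 256), svd_params(512, 256, rank=64))
print(conv_params(64, 128, 3), tucker2_params(64, 128, 3, r_in=16, r_out=32))
```

Lower ranks give stronger compression but a coarser approximation of the original weights, which is why fine-tuning usually follows decomposition.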