How to use the compressor

NetsPresso Compressor

Compresses models for better computational efficiency

Key Features

Automatic Compression

  • Focus on compression itself instead of spending time implementing complicated methods
  • Compress PyTorch and TensorFlow models immediately

Structured Pruning, Filter Decomposition

  • Structured pruning directly improves the inference speed of AI models by reducing the amount of computation
  • Filter decomposition factorizes a model's filters into smaller ones while preserving the important information
  • The model can be fine-tuned after compression to recover its accuracy
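
To make the link between structured pruning and computation concrete, the sketch below counts multiply-accumulate operations (MACs) for two stacked convolutions before and after half of the first layer's filters are removed. The layer sizes and the helper names (`conv2d_macs`, `two_layer_macs`) are hypothetical and chosen only for illustration; this is not how NetsPresso measures cost.

```python
def conv2d_macs(in_channels, out_channels, kernel_size, out_height, out_width):
    """MACs for one standard 2D convolution (groups=1, bias ignored)."""
    return in_channels * out_channels * kernel_size * kernel_size * out_height * out_width


def two_layer_macs(c_in, c_mid, c_out, kernel_size, height, width):
    """MACs of two stacked convolutions that preserve the spatial size."""
    first = conv2d_macs(c_in, c_mid, kernel_size, height, width)
    second = conv2d_macs(c_mid, c_out, kernel_size, height, width)
    return first + second


# Hypothetical layer sizes, chosen only for illustration.
original = two_layer_macs(c_in=64, c_mid=128, c_out=128, kernel_size=3, height=56, width=56)

# Remove half of the filters in the first convolution: its output channels
# and the second convolution's input channels both shrink from 128 to 64.
pruned = two_layer_macs(c_in=64, c_mid=64, c_out=128, kernel_size=3, height=56, width=56)

reduction = 1 - pruned / original
print(f"original MACs: {original:,}")
print(f"pruned MACs:   {pruned:,}")
print(f"reduction:     {reduction:.1%}")
```

Because removing a filter also removes the matching input channel of the next layer, both convolutions get cheaper, which is why structured pruning reduces computation without needing sparse kernels.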

HW-Aware Model Profiling

  • Visualize the neural network to check its structure and which layers can be compressed
  • Profile the neural network on the target hardware to decide which layers to compress and by how much

Scope of support

Framework

  • PyTorch (PyTorch version ≥ 1.11)
    • PyTorch-ONNX (ONNX version ≥ 1.10)
    • PyTorch-GraphModule
  • TensorFlow-Keras (TensorFlow version 2.3~2.8)
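
Before uploading a model, it can help to check the local framework version against the ranges listed above. The following is a minimal sketch; the `SUPPORTED` table and the `is_supported` helper are hypothetical, and it assumes that "2.3~2.8" includes every 2.8.x patch release.

```python
import re

# Ranges copied from the list above as (minimum, maximum) (major, minor)
# tuples; None means the list states no upper bound.
SUPPORTED = {
    "torch": ((1, 11), None),
    "onnx": ((1, 10), None),
    "tensorflow": ((2, 3), (2, 8)),
}


def major_minor(version):
    """Extract (major, minor) from strings such as '1.13.1+cu117' or '2.8.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(package, version):
    """Return True if `version` of `package` falls inside the listed range."""
    minimum, maximum = SUPPORTED[package]
    current = major_minor(version)
    if current < minimum:
        return False
    return maximum is None or current <= maximum


print(is_supported("torch", "1.13.1+cu117"))
print(is_supported("tensorflow", "2.9.0"))
```

In practice the version string would come from the installed package, for example `torch.__version__` or `tensorflow.__version__`.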

Compression methods

  • Structured Pruning: Pruning by Index, Pruning by Criteria (L2 Norm, GM (Geometric Median), Nuclear Norm)
  • Filter Decomposition: Tucker Decomposition, Singular Value Decomposition, CP Decomposition
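
As a rough illustration of two of the methods listed above, the sketch below selects filters to prune by L2 norm and factorizes a linear layer's weight with a truncated singular value decomposition. It uses NumPy on random weights; the function names are hypothetical, and this is a generic textbook version, not NetsPresso's implementation.

```python
import numpy as np


def filters_to_prune_l2(weight, ratio):
    """Indices of the conv filters with the smallest L2 norm.

    weight: array of shape (out_channels, in_channels, kH, kW).
    ratio:  fraction of filters to remove, between 0 and 1.
    """
    norms = np.linalg.norm(weight.reshape(weight.shape[0], -1), axis=1)
    n_prune = int(weight.shape[0] * ratio)
    return np.sort(np.argsort(norms)[:n_prune])


def svd_decompose(weight, rank):
    """Split an (out_features, in_features) matrix into two rank-`rank` factors."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    first = np.diag(s[:rank]) @ vt[:rank]   # shape (rank, in_features)
    second = u[:, :rank]                    # shape (out_features, rank)
    return first, second


rng = np.random.default_rng(0)

# Pruning by criteria: drop the 50% of filters with the lowest L2 norm.
conv_weight = rng.standard_normal((8, 4, 3, 3))
pruned_idx = filters_to_prune_l2(conv_weight, ratio=0.5)
kept_weight = np.delete(conv_weight, pruned_idx, axis=0)
print("pruned filters:", pruned_idx, "remaining shape:", kept_weight.shape)

# SVD: replace one 128->64 linear layer with a 128->16 and a 16->64 layer.
linear_weight = rng.standard_normal((64, 128))
first, second = svd_decompose(linear_weight, rank=16)
approx = second @ first
rel_error = np.linalg.norm(linear_weight - approx) / np.linalg.norm(linear_weight)
print("parameters:", linear_weight.size, "->", first.size + second.size)
print(f"relative reconstruction error: {rel_error:.3f}")
```

Random weights have no low-rank structure, so the reconstruction error printed here is large; trained weights are often closer to low rank, and fine-tuning after compression is what recovers the remaining accuracy.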

PyNetsPresso

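If you are using PyNetsPresso for the first time or want more detailed information, please visit the PyNetsPresso GitHub repository. The example below declares a compressor and runs automatic compression on a PyTorch GraphModule.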

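# 0. Log in to NetsPresso (placeholder credentials; replace with your own)
from netspresso import NetsPresso

netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")

# 1. Declare compressor
compressor = netspresso.compressor_v2()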

# 2. Run automatic compression
compression_result = compressor.automatic_compression(
    input_shapes=[{"batch": 1, "channel": 3, "dimension": [224, 224]}],
    input_model_path="./examples/sample_models/graphmodule.pt",
    output_dir="./outputs/compressed/pytorch_automatic_compression",
    compression_ratio=0.5,
)
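
The `input_shapes` argument in the call above is a list of dictionaries with `batch`, `channel`, and `dimension` keys. A small helper such as the hypothetical `make_input_shape` below (not part of PyNetsPresso) can build those entries and reduce typos when several models are compressed:

```python
def make_input_shape(batch, channel, *dimension):
    """Build one `input_shapes` entry in the format used in the example call."""
    return {"batch": batch, "channel": channel, "dimension": list(dimension)}


# A single 3-channel 224x224 image, matching the example call.
input_shapes = [make_input_shape(1, 3, 224, 224)]
print(input_shapes)
```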

To learn more about how to use PyNetsPresso, please visit the Recipes page below and follow the step-by-step guides.
PyNetsPresso Recipes