NetsPresso Compressor

Compresses models for better computational efficiency

Key Features

Automatic Compression

Focus on compression itself instead of spending time implementing complicated methods
Compress PyTorch and TensorFlow models immediately

Structured Pruning, Filter Decomposition

Structured pruning removes entire filters (channels) from a layer, which directly improves inference speed by reducing the amount of computation
Filter decomposition approximates a layer's filters with smaller low-rank factors while preserving the important information
The compressed model can be fine-tuned afterward to recover accuracy
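
As a rough illustration of the structured pruning idea above, the sketch below ranks the filters of one layer by L2 norm and keeps the strongest ones. It is not NetsPresso's implementation: the function names, the flat-list weight layout, and the sample values are all assumptions made for this example.

```python
import math

def filter_l2_norms(weights):
    """Return the L2 norm of each filter; each filter is a flat list of floats."""
    return [math.sqrt(sum(w * w for w in f)) for f in weights]

def prune_filters(weights, pruning_ratio):
    """Drop the lowest-norm filters and return (kept_indices, kept_weights)."""
    norms = filter_l2_norms(weights)
    n_prune = int(len(weights) * pruning_ratio)
    n_keep = max(1, len(weights) - n_prune)  # always keep at least one filter
    # Rank filters from largest to smallest norm, then restore the original order.
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return kept, [weights[i] for i in kept]

# Hypothetical layer with 4 filters of 3 weights each (illustrative values only).
layer = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.4, -0.6],
    [0.05, -0.03, 0.02],
]
kept, pruned = prune_filters(layer, pruning_ratio=0.5)
print(kept)  # [0, 2]
```

In a real network, removing an output filter also removes the matching input channel of the following layer, which is where much of the computational saving comes from.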

HW-Aware Model Profiling

Visualize the neural network to check its structure and see which layers can be compressed
Profile the neural network on the target hardware to decide which layers to compress and by how much
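
One way to read a profiling result is to ask which layers account for most of the latency. The sketch below assumes you already have per-layer latencies from the target device. The function name, the threshold, and the numbers are hypothetical and are not produced by NetsPresso.

```python
def rank_layers_by_latency(latencies_ms, min_share=0.25):
    """Return each layer's share of total latency and the layers at or above min_share."""
    total = sum(latencies_ms.values())
    shares = {name: ms / total for name, ms in latencies_ms.items()}
    # Sort by share, largest first, and keep only the heavy layers.
    ordered = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    candidates = [name for name, share in ordered if share >= min_share]
    return shares, candidates

# Hypothetical measurements in milliseconds (illustrative values only).
measured = {"conv1": 1.0, "conv2": 4.0, "conv3": 3.0, "fc": 2.0}
shares, candidates = rank_layers_by_latency(measured)
print(candidates)  # ['conv2', 'conv3']
```

Layers with a large latency share are natural first candidates for a higher compression ratio, though accuracy sensitivity still needs to be checked per layer.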

Scope of support

Framework

PyTorch (version ≥ 1.11)
- PyTorch-ONNX (ONNX version ≥ 1.10)
- PyTorch-GraphModule
TensorFlow-Keras (TensorFlow version 2.3 to 2.8)

Compression methods

Structured Pruning: Pruning by index, Pruning by criteria (L2 Norm, GM (Geometric Median), Nuclear Norm)
Filter Decomposition: Tucker Decomposition, Singular Value Decomposition, CP Decomposition
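
To give a feel for why filter decomposition shrinks a model, the sketch below counts parameters before and after a Tucker-2 style decomposition of a convolution and a truncated SVD of a dense weight matrix. The layer sizes and ranks are hypothetical, biases are ignored, and the function names are made up for this example. How NetsPresso selects ranks is not covered here.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def tucker2_params(c_in, c_out, k, r_in, r_out):
    """Weights after Tucker-2: 1x1 conv, k x k core conv, 1x1 conv."""
    return c_in * r_in + r_in * r_out * k * k + r_out * c_out

def svd_params(m, n, rank):
    """Weights after approximating an m x n matrix as (m x rank) @ (rank x n)."""
    return rank * (m + n)

# Hypothetical 3x3 convolution, 64 -> 128 channels, with ranks 16 and 32.
original = conv_params(64, 128, 3)
decomposed = tucker2_params(64, 128, 3, r_in=16, r_out=32)
print(original, decomposed)  # 73728 9728

# Hypothetical 512 x 1000 dense layer truncated to rank 64.
print(512 * 1000, svd_params(512, 1000, 64))  # 512000 96768
```

Lower ranks give smaller layers but a coarser approximation, which is why fine-tuning after decomposition is usually needed.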

PyNetsPresso

For first-time use or to obtain detailed information about PyNetsPresso, please visit PyNetsPresso GitHub.

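from netspresso import NetsPresso

# Log in first (assumes PyNetsPresso is installed; replace the placeholder credentials)
netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")

# 1. Declare compressor
compressor = netspresso.compressor_v2()

# 2. Run automatic compression
compression_result = compressor.automatic_compression(
    input_shapes=[{"batch": 1, "channel": 3, "dimension": [224, 224]}],
    input_model_path="./examples/sample_models/graphmodule.pt",
    output_dir="./outputs/compressed/pytorch_automatic_compression",
    compression_ratio=0.5,
)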
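
The input_shapes argument in the call above is a list with one dictionary per model input. The small helper below builds that structure from plain integers. The helper itself is hypothetical and not part of PyNetsPresso, and it assumes that "dimension" holds the spatial size as [height, width], as the example suggests.

```python
def make_input_shapes(batch, channel, height, width):
    """Build a single-input input_shapes list in the format used in the example above."""
    # Assumption: "dimension" is [height, width] for an image input.
    return [{"batch": batch, "channel": channel, "dimension": [height, width]}]

shapes = make_input_shapes(1, 3, 224, 224)
print(shapes)  # [{'batch': 1, 'channel': 3, 'dimension': [224, 224]}]
```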

To learn more about how to use PyNetsPresso, please visit the Recipes page below and follow the step-by-step guides.
PyNetsPresso Recipes