How to use the Compressor
NetsPresso Compressor
Compresses models for better computational efficiency
Key Features
Automatic Compression
- Focus only on compression, without wasting time implementing complicated methods
- Compress PyTorch and TensorFlow models immediately
Structured Pruning and Filter Decomposition
- Structured pruning directly improves the inference speed of AI models by reducing the amount of computation
- Filter decomposition factorizes the layers of AI models into smaller components while retaining the important information
- Fine-tuning after compression can restore the accuracy of the model
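For intuition on how structured pruning reduces computation, here is a minimal sketch (illustrative only, not NetsPresso's implementation; the toy layer shape and NumPy usage are assumptions) that removes the convolution filters with the smallest L2 norm:

```python
import numpy as np

# Toy conv layer: 8 output filters of shape (3, 3, 3) -> (out, in, kH, kW)
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 3, 3, 3))

# Score each filter by its L2 norm; small-norm filters contribute least
scores = np.linalg.norm(weights.reshape(8, -1), axis=1)

# Keep the top 50% of filters (compression_ratio = 0.5)
keep = np.sort(np.argsort(scores)[-4:])
pruned = weights[keep]

print(pruned.shape)  # (4, 3, 3, 3): half the filters, half the multiply-adds
```

Because whole filters are removed, the pruned layer is a regular, smaller layer, which is why structured pruning speeds up inference on standard hardware without special sparse kernels.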
HW-Aware Model Profiling
- Visualize the neural network to check the structure and available layers to be compressed
- Profile the neural network on the target hardware to decide which layers to compress and how much
Scope of support
Framework
- PyTorch (PyTorch version ≥ 1.11)
- PyTorch-ONNX (ONNX version ≥ 1.10)
- PyTorch-GraphModule
- TensorFlow-Keras (TensorFlow version 2.3~2.8)
Compression methods
- Structured Pruning: Pruning by index, Pruning by criteria (L2 Norm, GM, NuclearNorm)
- Filter Decomposition: Tucker Decomposition, Singular Value Decomposition, CP Decomposition
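To illustrate the idea behind filter decomposition (a sketch only, not the NetsPresso implementation; the layer size and rank are assumptions), a truncated SVD replaces one weight matrix with two low-rank factors, cutting parameters while keeping the dominant information:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512))  # e.g. a fully connected layer's weights

# Truncated SVD: keep only the top-r singular values and vectors
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 32
A = U[:, :r] * s[:r]   # (256, r)
B = Vt[:r, :]          # (r, 512)

# One big matmul becomes two small ones
original = W.size             # 256 * 512 = 131072 parameters
compressed = A.size + B.size  # 256 * 32 + 32 * 512 = 24576 parameters
print(compressed / original)  # 0.1875

# The rank-r product approximates W along its dominant singular directions
approx = A @ B
```

Tucker and CP decomposition apply the same low-rank principle to the 4-D weight tensors of convolution layers.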
PyNetsPresso
For first-time use or to obtain detailed information about PyNetsPresso, please visit the PyNetsPresso GitHub.
from netspresso import NetsPresso

# 0. Log in to NetsPresso (replace with your own credentials)
netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")

# 1. Declare the compressor
compressor = netspresso.compressor_v2()

# 2. Run automatic compression
compression_result = compressor.automatic_compression(
    input_shapes=[{"batch": 1, "channel": 3, "dimension": [224, 224]}],
    input_model_path="./examples/sample_models/graphmodule.pt",
    output_dir="./outputs/compressed/pytorch_automatic_compression",
    compression_ratio=0.5,
)
To learn more about how to use PyNetsPresso, please visit the Recipes page below and follow the step-by-step guides.
PyNetsPresso Recipes