🚀 New Features

Task Cancellation

  • Running tasks can now be canceled mid-run, so you can stop misconfigured or outdated tasks without waiting for them to finish and immediately launch a new one.

CSV Export for Profiling Results

  • You can now download profiling results in CSV format, making it easy to review layer-wise profiling data with external tools or spreadsheets.

πŸ› οΈ Improvements

Enhanced Graph UI and Interaction

  • The model graph visualization has been refined for better readability and smoother interaction. Users can now more easily navigate between the graph and table views.

🧠 Why these matter

  • Efficiency in Iteration: Canceling tasks saves time by letting you quickly pivot away from unwanted runs and continue experimentation.
  • Better Analysis: CSV exports provide a convenient way to perform detailed profiling analysis outside the platform.
  • Improved Usability: The upgraded graph UI makes it easier to interpret model structures and profiling results, supporting faster and clearer decision-making.

🚀 New Features

API Token Support

  • You can now use API tokens to access PyNetsPresso and Training Studio.

πŸ› οΈ Improvements

Enhancements in Optimization Studio

  • You can now rerun quantization with your previous settings, which streamlines repeated experiments.
  • Improved information clarity and model graph visualization.

Improved Sign-in Flow

  • Added a Hub page to improve post-login navigation and user orientation.

🧠 Why these matter

  • Improved Security: API tokens provide a more secure way to authenticate and access services.
  • Better User Experience: Updates to Optimization Studio help users iterate more efficiently and intuitively, boosting productivity in repeated experiments.

✨ Overview

Optimization Studio is a cloud-based graphical interface designed to help users efficiently optimize AI models for deployment on edge devices.

This tool simplifies the traditionally complex process of model quantization and optimization by offering a no-code, visual workflow. Users can upload their models, configure optimization settings, and generate deployable, hardware-friendly versions of those models, all through a streamlined interface.

Whether you are a machine learning engineer looking for fine-grained control, or a non-technical user aiming to apply optimization with minimal setup, Optimization Studio provides a flexible, intuitive solution.

If you’d like to try Optimization Studio, please visit the link below:
Optimization Studio


💡 What You Can Do with Optimization Studio

  • Upload AI models and automatically apply post-training quantization
  • Reduce model size, latency, and memory usage while maintaining accuracy
  • Generate optimized models compatible with various edge hardware targets
  • Compare performance metrics (latency, memory, model size, etc.) before and after optimization
  • Use auto-recommendation for quantization or configure settings layer by layer

🧠 Why It Matters

Optimizing models for edge deployment typically requires deep technical expertise and time-consuming manual adjustments. Optimization Studio eliminates those barriers by turning optimization into a visual, guided experience, making model deployment faster, more accessible, and less error-prone.

🚀 New Features

JetPack 6.1 Support for NVIDIA Devices

  • Updated internal configuration to support JetPack 6.1 (version 6.1+b123).
  • Software version enum changed from JETPACK_6_0 to JETPACK_6_1 for compatibility with the latest NVIDIA Jetson platform.
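
As a rough illustration, scripts that pin the old enum member would change as sketched below. The import path is an assumption, not a confirmed PyNetsPresso module layout; check your installed SDK.

```python
# Illustrative only: the exact import path of the enum is an assumption.
from netspresso.enums import SoftwareVersion  # assumed module path

# Before: SoftwareVersion.JETPACK_6_0
# After:  SoftwareVersion.JETPACK_6_1 (JetPack 6.1+b123)
software_version = SoftwareVersion.JETPACK_6_1
```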

Package Version Check Functionality

  • Added a new mechanism to check whether users are running the latest PyNetsPresso SDK version.
  • Users will be notified if an update is recommended, improving consistency and reducing compatibility issues.
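
The release note doesn't describe the internal mechanism; as a minimal sketch, a check like the following (comparing the installed version against the latest release on PyPI) achieves the same effect. The package name netspresso is an assumption.

```python
# Minimal sketch of a version check; not the SDK's actual implementation.
from importlib.metadata import version
import requests

def check_latest(package: str = "netspresso") -> None:  # package name assumed
    installed = version(package)
    info = requests.get(f"https://pypi.org/pypi/{package}/json", timeout=5).json()
    latest = info["info"]["version"]
    if installed != latest:
        print(f"Update recommended: {package} {installed} -> {latest}")
```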

🧠 Why these matter

  • Supporting JetPack 6.1 ensures that PyNetsPresso remains compatible with the latest NVIDIA hardware and software stacks.
  • The version check feature helps users stay up to date with critical patches and improvements, reducing the risk of integration failures.

🚀 New Features

Qualcomm AI Hub Integration

  • PyNetsPresso now supports integration with Qualcomm AI Hub.
  • Users can connect and deploy models directly to the hub, enabling seamless collaboration and deployment for Qualcomm-based workflows.
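
For context, the underlying qai-hub client that this integration targets follows Qualcomm's documented flow, roughly as sketched below; this is not PyNetsPresso's wrapper API, and the device name is an example that must match a device available in your Qualcomm AI Hub account.

```python
# Based on Qualcomm's public qai-hub examples; not PyNetsPresso's wrapper API.
import qai_hub as hub

# Compile a local model for a target device registered with Qualcomm AI Hub.
compile_job = hub.submit_compile_job(
    model="model.onnx",                       # local model file
    device=hub.Device("Samsung Galaxy S23"),  # example device name
)
target_model = compile_job.get_target_model()
```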

QAI-Hub SDK Update

  • Upgraded qai-hub version from 0.21.0 to 0.24.0, ensuring compatibility with the latest Qualcomm AI toolchain.

Documentation Updates

  • Added and updated guides for Qualcomm AI Hub usage within PyNetsPresso.
  • Updated documentation dependencies to include TensorFlow 2.8.0 and fixed version-related inconsistencies.

🐞 Bug Fixes

  • Rolled back launcher URI prefix from v3 to v2 to restore compatibility with existing environments.
  • Fixed an issue in the benchmark environment schema that caused incorrect validation during profiling runs.

🧠 Why these matter

  • These updates expand PyNetsPresso’s deployment capabilities to the Qualcomm ecosystem, enabling broader hardware support.
  • The launcher rollback ensures backward compatibility with production environments using v2 APIs.
  • Documentation and dependency fixes ensure a smoother onboarding and build process across diverse environments.

🚀 New Features

Custom & Auto Quantization Support

  • Introduced flexible quantization configuration options:
    • Users can manually define bitwidth, symmetry, per-channel settings, and rounding behavior.
    • Alternatively, automatic quantization selects optimal settings based on calibration data.
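
A hypothetical sketch of such a configuration is shown below; the field names mirror the options listed above but are not the SDK's confirmed names.

```python
# Hypothetical configuration shape; field names are illustrative, not
# PyNetsPresso's confirmed API.
from dataclasses import dataclass

@dataclass
class QuantizationConfig:
    bitwidth: int = 8          # e.g. 8-bit integers
    symmetric: bool = True     # symmetric vs. asymmetric value ranges
    per_channel: bool = True   # per-channel vs. per-tensor scales
    rounding: str = "nearest"  # rounding behavior during quantization

# Manual configuration; automatic mode would instead derive these
# settings from calibration data.
config = QuantizationConfig(bitwidth=8, symmetric=False, per_channel=True)
```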

Quantizer Module Added

  • Added a new quantizer module and corresponding launcher client API, enabling quantization workflows to be run through PyNetsPresso's unified interface.

Benchmark & Conversion Task Cancellation

  • Users can now cancel benchmark and conversion tasks in progress, offering greater control during long-running operations.

Automatic Project Folder Creation

  • When launching tasks, a project folder is now created automatically if one doesn’t exist, organizing output files consistently.

JWT Expiration Handling Improvement

  • A 60-second buffer has been added to JWT token expiration checks to prevent unintentional session timeouts.
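
Conceptually, the check looks like the sketch below; this is illustrative, not the SDK's internal code.

```python
# Illustrative expiry check with a safety buffer; not the SDK's internals.
import time

EXPIRY_BUFFER_SECONDS = 60

def is_token_expired(expires_at: float) -> bool:
    # Treat the token as expired 60 seconds early so a request issued
    # just before the real expiry cannot fail mid-flight.
    return time.time() >= expires_at - EXPIRY_BUFFER_SECONDS
```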

🐞 Bug Fixes

  • Added InternalServerError exception to clearly distinguish server-side failures from user errors.

🧠 Why these matter

  • These updates significantly improve the flexibility and robustness of the optimization pipeline.
  • Quantization is now easier to configure or automate depending on user preference and hardware constraints.
  • Automatic folder creation and cancellation support improve user experience, especially during iterative workflows.
  • Better token handling and error classification reduce friction and simplify debugging.

🚀 New Features

Inference Interface Update

  • Refactored the inference interface for improved clarity and consistency.
  • Provides a cleaner experience when running inference across multiple model formats.

Training Result in Compressor Metadata

  • Added training_result field to compressor metadata, giving users visibility into pre-compression training performance.

Data Type in Benchmark Results

  • Benchmark results now include a data_type field, helping users evaluate performance with clearer context.

🐞 Bug Fixes

  • Added scipy to requirements.txt to resolve missing dependency issues in environments using compression or quantization.
  • Fixed incorrect return_stage_idx parameter in ResNet-50 backbone configuration.
  • Added missing preprocess and postprocess steps for full INT8 pipelines, ensuring accurate inference results.

🧠 Why these matter

  • These updates improve usability and output consistency in inference and benchmarking workflows.
  • Enhancing metadata structures gives users more insight into pipeline stages and improves integration with downstream tools.
  • Fixes to model configuration and INT8 processing ensure correctness and stability in common deployment paths.

🚀 New Features

ONNX & TFLite Inferencer Added

  • Introduced a unified inferencer for ONNX and TFLite models, enabling easy evaluation across formats.
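
Conceptually, a unified inferencer dispatches on the model format. The sketch below uses only the public onnxruntime and TensorFlow Lite APIs and is not PyNetsPresso's actual implementation.

```python
# Conceptual format dispatch; not PyNetsPresso's actual inferencer.
import numpy as np

def run_inference(model_path: str, inputs: np.ndarray):
    if model_path.endswith(".onnx"):
        import onnxruntime as ort
        session = ort.InferenceSession(model_path)
        input_name = session.get_inputs()[0].name
        return session.run(None, {input_name: inputs})[0]
    if model_path.endswith(".tflite"):
        import tensorflow as tf
        interpreter = tf.lite.Interpreter(model_path=model_path)
        interpreter.allocate_tensors()
        in_detail = interpreter.get_input_details()[0]
        out_detail = interpreter.get_output_details()[0]
        interpreter.set_tensor(in_detail["index"], inputs)
        interpreter.invoke()
        return interpreter.get_tensor(out_detail["index"])
    raise ValueError(f"Unsupported model format: {model_path}")
```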

Runtime Config Support for Inference

  • Added configuration options to control runtime behavior during inference, offering more flexibility in deployment.

TensorRT/DRPAI Device & Software Filtering

  • Conversion flows now support filtering by device and software version, improving deployment compatibility.

Netspresso Trainer v1.0.0 Integration

  • Upgraded to netspresso_trainer v1.0.0 with enhanced stability and standardized behavior.

PyNPException for GUI

  • Added PyNPException class to streamline error reporting between SDK and GUI environments.

Optimizer & Scheduler Enum

  • Introduced enums for optimizer and scheduler types, simplifying training config definitions.
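
For illustration, such enums typically look like the sketch below; the member names are assumptions, not the SDK's actual values.

```python
# Illustrative enum definitions; member names are assumptions.
from enum import Enum

class Optimizer(str, Enum):
    ADAM = "adam"
    SGD = "sgd"

class Scheduler(str, Enum):
    COSINE = "cosine"
    STEP = "step"

config = {"optimizer": Optimizer.ADAM, "scheduler": Scheduler.COSINE}
```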

Timeout Handling for Compression

  • Added a 600-second timeout for compressor requests to prevent indefinite hangs.
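
The pattern is straightforward; assuming an HTTP client such as requests, it amounts to the sketch below. The endpoint URL is a placeholder.

```python
# Sketch of a bounded request; the URL is a placeholder, not a real endpoint.
import requests

COMPRESSOR_TIMEOUT_SECONDS = 600

response = requests.post(
    "https://api.example.com/compressor/tasks",  # placeholder
    json={"model_id": "..."},
    timeout=COMPRESSOR_TIMEOUT_SECONDS,  # raises requests.exceptions.Timeout
)                                        # instead of hanging indefinitely
```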

Improved Metadata Logging

  • Trainer and converter metadata now include error logs and model info (framework, input_shapes) for better traceability.

DLC Framework Filtering

  • Excluded the unsupported DLC framework from trainer configuration options.

🐞 Bug Fixes

  • Fixed incorrect task value in inference runtime configs.
  • Corrected typo in training_result key of trainer metadata.
  • Resolved issue with invalid runtime config options not applying properly.
  • Added validate_token check when generating converter metadata.
  • Introduced GatewayTimeoutException and UnexpectedException for clearer backend error handling.
  • Enhanced training error logs with detailed name and message fields.

🧠 Why these matter

  • These updates bring more control and reliability to the inference and deployment processes.
  • By enhancing metadata and configuration handling, users can debug more effectively and automate more confidently.
  • The inclusion of timeout handling, structured exceptions, and training options paves the way for smoother integration in both GUI and SDK environments.

🚀 New Features

Token Expiration Handling Improvements

  • Switched from token reissue to a fresh login when a token expires.
  • Ensures more reliable and predictable authentication flow.

Automatic Token Update Post-Login

  • After a successful login, self.tokens is now automatically updated to prevent stale token usage in subsequent requests.
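
In outline, the fix follows the pattern sketched below; the class and helper names are illustrative, not SDK internals.

```python
# Illustrative pattern; class and helper names are not SDK internals.
class SessionClient:
    def __init__(self) -> None:
        self.tokens = None

    def _request_login(self, email: str, password: str) -> dict:
        # Stub standing in for the real authentication request.
        return {"tokens": {"access_token": "...", "refresh_token": "..."}}

    def login(self, email: str, password: str) -> None:
        response = self._request_login(email, password)
        # Cache the fresh tokens immediately so subsequent requests
        # never reuse a stale access token.
        self.tokens = response["tokens"]
```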

UploadDataset Dataclass

  • Added a new UploadDataset dataclass for clearer and more structured handling of dataset upload metadata.
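
A hypothetical shape for such a dataclass is sketched below; the field names are illustrative, not the SDK's actual fields.

```python
# Hypothetical field set; names are illustrative.
from dataclasses import dataclass

@dataclass
class UploadDataset:
    dataset_path: str   # local path to the dataset archive
    dataset_name: str   # display name used on the server
    task_type: str      # e.g. "classification"
```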

Launcher Task Error Logging

  • Improved error logging for task failures in the launcher, helping users debug configuration and runtime issues more easily.

API Client Info Logging

  • API client now prints host and port information upon initialization to confirm connection settings.

🐞 Bug Fixes

  • Fixed a bug where trainer model architecture configuration was incorrectly parsed.
  • Resolved an issue with classification dataset settings not applying correctly in training.
  • Added missing model_name definition in initialize_from_yaml() to avoid runtime errors during model setup.

🧠 Why these matter

  • These updates improve reliability in authentication and API interaction, particularly in long-running sessions.
  • Users benefit from better visibility into configuration issues and API usage context.
  • Dataset and training configuration flows are now more stable and transparent, reducing the chance of runtime errors.

🚀 New Features

Structured Neuron-level Pruning (SNP)

  • Added a new structured pruning method targeting neuron-level granularity.
  • Allows finer-grained control over model size and latency during optimization.

Upload Progress Bar

  • Introduced a progress bar display when uploading large models or datasets via the SDK.
  • Helps users track upload status more clearly, improving the overall experience.

Model Name Standardization & Deprecation Notices

  • Unified naming convention for preloaded model names.
  • Added deprecation warnings to guide users toward supported configurations.

Environment Variable Configuration

  • Refactored internal configuration to support HOST and PORT via environment variables.
  • Simplifies deployment in containerized or cloud environments.
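
This follows the usual environment-variable pattern; assuming the variable names HOST and PORT from the note above, the defaults here are placeholders.

```python
# Reading connection settings from the environment, with local defaults.
import os

HOST = os.environ.get("HOST", "localhost")
PORT = int(os.environ.get("PORT", "8000"))  # default port is an assumption
```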

🐞 Bug Fixes

  • Updated file-matching logic to support both *best.pt and *best_fx.pt patterns during model search (see the sketch after this list).
  • Fixed attribute reference from model.name to model_name to prevent runtime errors.
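
For the checkpoint-matching fix above, a minimal sketch of the broadened logic follows; the function name and selection rule are illustrative.

```python
# Illustrative checkpoint search covering both filename patterns.
from pathlib import Path
from typing import Optional

def find_best_checkpoint(model_dir: str) -> Optional[Path]:
    candidates = [
        path
        for pattern in ("*best.pt", "*best_fx.pt")
        for path in Path(model_dir).glob(pattern)
    ]
    # Prefer the most recently written checkpoint if both patterns match.
    return max(candidates, default=None, key=lambda p: p.stat().st_mtime)
```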

🧠 Why these matter

  • The addition of Structured Neuron-level Pruning enables more granular and efficient model optimization, especially for edge deployment scenarios.
  • Visual upload progress improves transparency and confidence during long upload operations.
  • Unified naming and configuration approaches reduce user confusion and improve maintainability.
  • Bug fixes ensure more reliable model handling and deployment flow.