πŸš€ New Features

Minimap in Model Graph View

  • A minimap has been added to the model graph view, allowing users to quickly navigate large models and maintain context while exploring detailed areas.

πŸ› οΈ Improvements

Updated Default Colors

  • The default color scheme of the model graph has been updated for better readability.

Latency-Based Coloring

  • Layers are now visually distinguished based on latency. Slower layers appear brighter, while faster layers appear darker, making performance bottlenecks easier to identify.

Enhanced Graph-Table Interaction

  • Interactions between the graph view and table view have been improved, allowing seamless selection and navigation between the two representations.

🧠 Why These Matter

  • Improved Navigation: The minimap helps users efficiently explore large and complex model graphs without losing context.
  • Faster Bottleneck Identification: Latency-based coloring makes it easy to spot slow-performing layers at a glance.
  • Seamless Analysis: Enhanced graph-table interaction allows for a smoother workflow between visual and tabular analysis, supporting more efficient profiling and decision-making.

πŸ› οΈ Improvements

Enhanced Drawer UI for Model Graph

  • The drawer UI in the model graph has been improved to present NODE PROPERTIES and tensor information in a more organized format.
  • Interaction has also been refined, allowing users to stay more focused on the graph itself.

Improved ViT Model Mapping

  • Support for Vision Transformer (ViT) models has been enhanced, ensuring more accurate layer mapping and profiling results.

🧠 Why These Matter

  • Focused Exploration: The improved drawer UI provides a clearer view of NODE PROPERTIES and tensor details, enabling deeper engagement with model graph analysis.
  • Clarity in Understanding: A more structured interface and refined interactions make it easier to interpret model structures and behaviors.
  • Broader Model Coverage: Enhanced ViT support ensures that advanced transformer-based models are more effectively handled in optimization workflows.

πŸš€ New Features

Task Cancellation

  • Running tasks can now be canceled mid-process. This allows you to stop incorrect or outdated tasks without waiting for completion and immediately launch a new one.

CSV Export for Profiling Results

  • You can now download profiling results in CSV format, making it easy to review layer-wise profiling data with external tools or spreadsheets.
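
As a hedged example of the kind of offline review this enables, the exported file can be loaded with pandas; the column names below are assumptions, not the exact export schema:

```python
import pandas as pd

# Load the exported profiling CSV. The column names used here
# ("layer_name", "latency_ms") are illustrative assumptions; check
# the header row of your actual export.
df = pd.read_csv("profiling_results.csv")

# Rank layers by latency to surface bottlenecks.
slowest = df.sort_values("latency_ms", ascending=False).head(10)
print(slowest[["layer_name", "latency_ms"]])
```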

πŸ› οΈ Improvements

Enhanced Graph UI and Interaction

  • The model graph visualization has been refined for better readability and smoother interaction. Users can now more easily navigate between the graph and table views.

🧠 Why These Matter

  • Efficiency in Iteration: Canceling tasks saves time by letting you quickly pivot away from unwanted runs and continue experimentation.
  • Better Analysis: CSV exports provide a convenient way to perform detailed profiling analysis outside the platform.
  • Improved Usability: The upgraded graph UI makes it easier to interpret model structures and profiling results, supporting faster and clearer decision-making.

πŸš€ New Features

API Token Support

  • You can now use API tokens to access PyNetsPresso and Training Studio.
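
A minimal sketch of token-based access, assuming the token is passed to the client constructor (the argument name is an assumption; consult the PyNetsPresso docs for the exact signature):

```python
import os

# Hypothetical sketch: the api_key argument is an assumption about the
# token-based entry point; check the PyNetsPresso docs for the exact
# signature.
from netspresso import NetsPresso

# Read the token from the environment rather than hard-coding it.
token = os.environ["NETSPRESSO_API_TOKEN"]
client = NetsPresso(api_key=token)
```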

πŸ› οΈ Improvements

Enhancements in Optimization Studio

  • Added the ability to rerun quantization using previous settings.
  • Improved information clarity and model graph visualization.

Improved Sign-in Flow

  • Added a Hub page to improve post-login navigation and user orientation.

🧠 Why These Matter

  • Improved Security: API tokens provide a more secure way to authenticate and access services.
  • Better User Experience: Updates to Optimization Studio help users iterate more efficiently and intuitively, boosting productivity in repeated experiments.

✨ Overview

Optimization Studio is a cloud-based graphical interface designed to help users efficiently optimize AI models for deployment on edge devices.

This tool simplifies the traditionally complex process of model quantization and optimization by offering a no-code, visual workflow. Users can upload their models, configure optimization settings, and generate deployable, hardware-friendly versions of those models β€” all through a streamlined interface.

Whether you are a machine learning engineer looking for fine-grained control, or a non-technical user aiming to apply optimization with minimal setup, Optimization Studio provides a flexible, intuitive solution.

If you’d like to try Optimization Studio, please visit the link below:
Optimization Studio


πŸ’‘ What You Can Do with Optimization Studio

  • Upload AI models and automatically apply post-training quantization
  • Reduce model size, latency, and memory usage while maintaining accuracy
  • Generate optimized models compatible with various edge hardware targets
  • Compare performance metrics (latency, memory, model size, etc.) before and after optimization
  • Use auto-recommendation for quantization or configure settings layer by layer

🧠 Why It Matters

Optimizing models for edge deployment typically requires deep technical expertise and time-consuming manual adjustments. Optimization Studio eliminates those barriers by turning optimization into a visual, guided experience β€” making model deployment faster, more accessible, and less error-prone.

πŸš€ New Features

JetPack 6.1 Support for NVIDIA Devices

  • Updated internal configuration to support JetPack 6.1 (version 6.1+b123).
  • Software version enum changed from JETPACK_6_0 to JETPACK_6_1 for compatibility with the latest NVIDIA Jetson platform.
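
For code that pins the software version explicitly, the change amounts to updating the enum value, roughly as below (the import path is an assumption; only the rename itself is from this release):

```python
# Hypothetical sketch: the import path is an assumption; the enum rename
# (JETPACK_6_0 -> JETPACK_6_1) is what this release describes.
from netspresso.enums import SoftwareVersion

# Code that previously passed SoftwareVersion.JETPACK_6_0 should now use:
software_version = SoftwareVersion.JETPACK_6_1
```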

Package Version Check Functionality

  • Added a new mechanism to check whether users are running the latest PyNetsPresso SDK version.
  • Users will be notified if an update is recommended, improving consistency and reducing compatibility issues.
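
Conceptually, the check compares the installed version against the latest published release, along these lines (a generic sketch assuming the package is published on PyPI as netspresso; the SDK performs its own check internally):

```python
import json
from importlib.metadata import version
from urllib.request import urlopen

# Generic sketch of an update check, assuming the package name on PyPI
# is "netspresso" (an assumption; the SDK's internal check may differ).
installed = version("netspresso")
with urlopen("https://pypi.org/pypi/netspresso/json", timeout=5) as resp:
    latest = json.load(resp)["info"]["version"]

if installed != latest:
    print(f"netspresso {installed} is installed; {latest} is available.")
```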

🧠 Why These Matter

  • Supporting JetPack 6.1 ensures that PyNetsPresso remains compatible with the latest NVIDIA hardware and software stacks.
  • The version check feature helps users stay up to date with critical patches and improvements, reducing the risk of integration failures.

πŸš€ New Features

Qualcomm AI Hub Integration

  • PyNetsPresso now supports integration with Qualcomm AI Hub.
  • Users can connect and deploy models directly to the hub, enabling seamless collaboration and deployment for Qualcomm-based workflows.
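
For reference, the underlying qai-hub SDK exposes this flow roughly as below; the device name and input spec are placeholders, and PyNetsPresso wraps this surface, so the integration API may differ:

```python
import qai_hub as hub

# Sketch using the underlying qai-hub SDK directly. The device name and
# input spec are placeholders; PyNetsPresso wraps this flow.
device = hub.Device("Samsung Galaxy S23")
compile_job = hub.submit_compile_job(
    model="model.onnx",
    device=device,
    input_specs={"image": (1, 3, 224, 224)},
)
profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=device,
)
```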

QAI-Hub SDK Update

  • Upgraded qai-hub version from 0.21.0 to 0.24.0, ensuring compatibility with the latest Qualcomm AI toolchain.

Documentation Updates

  • Added and updated guides for Qualcomm AI Hub usage within PyNetsPresso.
  • Updated documentation dependencies to include TensorFlow 2.8.0 and fixed version-related inconsistencies.

🐞 Bug Fixes

  • Rolled back launcher URI prefix from v3 to v2 to restore compatibility with existing environments.
  • Fixed an issue in the benchmark environment schema that caused incorrect validation during profiling runs.

🧠 Why These Matter

  • These updates expand PyNetsPresso’s deployment capabilities to the Qualcomm ecosystem, enabling broader hardware support.
  • The launcher rollback ensures backward compatibility with production environments using v2 APIs.
  • Documentation and dependency fixes ensure a smoother onboarding and build process across diverse environments.

πŸš€ New Features

Custom & Auto Quantization Support

  • Introduced flexible quantization configuration options:
    • Users can manually define bitwidth, symmetry, per-channel settings, and rounding behavior.
    • Alternatively, automatic quantization selects optimal settings based on calibration data.

Quantizer Module Added

  • Added a new quantizer module and corresponding launcher client API, enabling quantization workflows to be run through PyNetsPresso's unified interface.
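
A hedged sketch of what a manual configuration might look like through the new module; every name below is an assumption, since the release states only which knobs are exposed:

```python
# Hypothetical sketch: class, method, and argument names are assumptions.
# The release notes state only that bitwidth, symmetry, per-channel
# settings, and rounding behavior are user-configurable, with automatic
# selection from calibration data as the alternative.
from netspresso import NetsPresso

client = NetsPresso(api_key="YOUR_TOKEN")
quantizer = client.quantizer()

quantized_model = quantizer.quantize(
    input_model_path="model.onnx",
    weight_num_bits=8,        # bitwidth
    symmetric=True,           # symmetry
    per_channel=True,         # per-channel settings
    rounding_mode="nearest",  # rounding behavior
)
```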

Benchmark & Conversion Task Cancellation

  • Users can now cancel benchmark and conversion tasks in progress, offering greater control during long-running operations.

Automatic Project Folder Creation

  • When launching tasks, a project folder is now created automatically if one doesn’t exist, organizing output files consistently.
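
Internally this follows the usual on-demand creation pattern, roughly:

```python
from pathlib import Path

# Illustrative: the SDK does the equivalent of this when a task launches,
# so output files always have a consistent home.
project_dir = Path("projects") / "my_project"
project_dir.mkdir(parents=True, exist_ok=True)
```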

JWT Expiration Handling Improvement

  • A 60-second buffer has been added to JWT token expiration checks to prevent unintentional session timeouts.
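
The buffered check amounts to treating a token as expired slightly before its actual expiry, along these lines (function and constant names are illustrative):

```python
import time

EXPIRATION_BUFFER_SECONDS = 60  # refresh slightly early, per this release

def token_needs_refresh(exp_timestamp: float) -> bool:
    """Treat a token as expired 60 seconds before its actual `exp` claim.

    Illustrative sketch of the buffered check described above; the SDK's
    internal names may differ.
    """
    return time.time() >= exp_timestamp - EXPIRATION_BUFFER_SECONDS
```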

🐞 Bug Fixes

  • Added InternalServerError exception to clearly distinguish server-side failures from user errors.

🧠 Why These Matter

  • These updates significantly improve the flexibility and robustness of the optimization pipeline.
  • Quantization is now easier to configure or automate depending on user preference and hardware constraints.
  • Automatic folder creation and cancellation support improve user experience, especially during iterative workflows.
  • Better token handling and error classification reduce friction and simplify debugging.

πŸš€ New Features

Inference Interface Update

  • Refactored the inference interface for improved clarity and consistency.
  • Provides a cleaner experience when running inference across multiple model formats.

Training Result in Compressor Metadata

  • Added training_result field to compressor metadata, giving users visibility into pre-compression training performance.

Data Type in Benchmark Results

  • Benchmark results now include a data_type field, helping users evaluate performance with clearer context.

🐞 Bug Fixes

  • Added scipy to requirements.txt to resolve missing dependency issues in environments using compression or quantization.
  • Fixed incorrect return_stage_idx parameter in ResNet-50 backbone configuration.
  • Added missing preprocess and postprocess steps for full INT8 pipelines, ensuring accurate inference results.

🧠 Why These Matter

  • These updates improve usability and output consistency in inference and benchmarking workflows.
  • Enhancing metadata structures gives users more insight into pipeline stages and improves integration with downstream tools.
  • Fixes to model configuration and INT8 processing ensure correctness and stability in common deployment paths.

πŸš€ New Features

ONNX & TFLite Inferencer Added

  • Introduced a unified inferencer for ONNX and TFLite models, enabling easy evaluation across formats.
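
To illustrate the idea of a format-agnostic inferencer, here is a conceptual sketch built on the public onnxruntime and TensorFlow Lite APIs; PyNetsPresso's actual implementation may differ:

```python
import numpy as np

def run_inference(model_path: str, input_array: np.ndarray) -> np.ndarray:
    """Dispatch to ONNX Runtime or the TFLite interpreter by file extension.

    Conceptual sketch of a unified inferencer, not PyNetsPresso's
    internal implementation.
    """
    if model_path.endswith(".onnx"):
        import onnxruntime as ort

        session = ort.InferenceSession(model_path)
        input_name = session.get_inputs()[0].name
        return session.run(None, {input_name: input_array})[0]

    if model_path.endswith(".tflite"):
        import tensorflow as tf

        interpreter = tf.lite.Interpreter(model_path=model_path)
        interpreter.allocate_tensors()
        input_details = interpreter.get_input_details()
        output_details = interpreter.get_output_details()
        interpreter.set_tensor(input_details[0]["index"], input_array)
        interpreter.invoke()
        return interpreter.get_tensor(output_details[0]["index"])

    raise ValueError(f"Unsupported model format: {model_path}")
```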

Runtime Config Support for Inference

  • Added configuration options to control runtime behavior during inference, offering more flexibility in deployment.

TensorRT/DRPAI Device & Software Filtering

  • Conversion flows now support device + software version filtering, improving deployment compatibility.

Netspresso Trainer v1.0.0 Integration

  • Upgraded to netspresso_trainer v1.0.0 with enhanced stability and standardized behavior.

PyNPException for GUI

  • Added PyNPException class to streamline error reporting between SDK and GUI environments.

Optimizer & Scheduler Enum

  • Introduced enums for optimizer and scheduler types, simplifying training config definitions.

Timeout Handling for Compression

  • Added a 600-second timeout for compressor requests to prevent indefinite hangs.
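
The effect is equivalent to passing a timeout to the underlying HTTP call, as in this sketch (the URL and payload are placeholders):

```python
import requests

COMPRESSOR_TIMEOUT_SECONDS = 600  # matches the limit described above

# Illustrative sketch: the request raises requests.exceptions.Timeout
# instead of hanging indefinitely if the server stops responding.
response = requests.post(
    "https://api.example.com/compress",  # placeholder URL
    json={"model_id": "my-model"},       # placeholder payload
    timeout=COMPRESSOR_TIMEOUT_SECONDS,
)
```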

Improved Metadata Logging

  • Trainer and converter metadata now include error logs and model info (framework, input_shapes) for better traceability.

DLC Framework Filtering

  • Excluded the unsupported DLC framework from trainer configuration options.

🐞 Bug Fixes

  • Fixed incorrect task value in inference runtime configs.
  • Corrected typo in training_result key of trainer metadata.
  • Resolved issue with invalid runtime config options not applying properly.
  • Added validate_token check when generating converter metadata.
  • Introduced GatewayTimeoutException and UnexpectedException for clearer backend error handling.
  • Enhanced training error logs with detailed name and message fields.

🧠 Why These Matter

  • These updates bring more control and reliability to the inference and deployment processes.
  • By enhancing metadata and configuration handling, users can debug more effectively and automate more confidently.
  • The inclusion of timeout handling, structured exceptions, and training options paves the way for smoother integration in both GUI and SDK environments.