DeepSparse 1.5 Released For Faster AI Inference On CPUs
Neural Magic's DeepSparse AI inference runtime continues to pursue "GPU-class performance on CPUs", and the new DeepSparse 1.5 release delivers even faster CPU inference.
DeepSparse offers leading CPU-based inference performance. I've made great use of it on Intel and AMD CPUs and commonly include it in my CPU benchmarking arsenal. I'm excited to see that DeepSparse 1.5 brings even more performance improvements. The DeepSparse 1.5 release notes call out the following:
- Inference latency for unstructured sparse-quantized CNNs has been improved by up to 2x.
- Inference throughput and latency for dense CNNs has been improved by up to 20%.
- Inference throughput and latency for dense transformers has been improved by up to 30%.
- The following operators are now supported for performance:
  - Neg, Unsqueeze with non-constant inputs
  - MatMulInteger with two non-constant inputs
  - GEMM with constant weights and 4D or 5D inputs
DeepSparse 1.5 also adds an ONNX evaluation pipeline for OpenPifPaf, YOLOv8 segmentation pipelines, support for using hwloc to determine CPU topology (which improves performance inside Kubernetes clusters), and various other enhancements. On the downside, DeepSparse 1.5 still doesn't seem to support Python 3.11.
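For anyone installing via pip, a quick interpreter check can save a failed install. The following is only a minimal sketch: the helper name `python_ok_for_deepsparse_15` and the `FIRST_UNSUPPORTED` constant are made up for illustration, and the 3.11 cut-off is simply the observation above, not a limit taken from Neural Magic's documentation.

```python
import sys

# Assumption based on the remark above: Python 3.11 and newer
# do not appear to be supported by DeepSparse 1.5 yet.
FIRST_UNSUPPORTED = (3, 11)


def python_ok_for_deepsparse_15(version=None):
    """Return True if (major, minor) is below the first unsupported release.

    `version` is any (major, minor, ...) sequence; it defaults to the
    running interpreter's version.
    """
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) < FIRST_UNSUPPORTED


if __name__ == "__main__":
    if python_ok_for_deepsparse_15():
        print("Interpreter looks fine; try: pip install deepsparse")
    else:
        print("Python %d.%d detected; consider a Python 3.10 virtual environment"
              % sys.version_info[:2])
```

The check only guards the upper bound, since that is the only constraint mentioned here; the minimum supported Python version is not covered.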
Downloads (if not using pip) and more details on Neural Magic's DeepSparse 1.5 release are available via GitHub.