Intel Introduces "Throughput Mode" in the Intel Distribution of OpenVINO Toolkit

Enterprise & IT, Mar 7, 2019

The latest release of the Intel Distribution of OpenVINO Toolkit includes a CPU “throughput” mode, which is said to accelerate deep learning inference.

One of the biggest challenges in AI is achieving high-performance deep learning inference that runs at real-world scale on existing infrastructure. The Intel Distribution of OpenVINO toolkit (a developer tool suite whose name stands for Open Visual Inference and Neural Network Optimization) is designed to accelerate such deep learning inference deployments.

Latency, the execution time of a single inference, is critical for real-time services. Typically, approaches to minimizing latency focus on the performance of single inference requests, limiting parallelism to the individual input instance. This often means that real-time inference applications cannot take advantage of the computational efficiencies that batching (combining many input images to achieve optimal throughput) provides, as large batch sizes come with a latency penalty. To address this gap, the latest release of the Intel Distribution of OpenVINO Toolkit includes a CPU “throughput” mode.

According to Intel, this new mode allows efficient parallel execution of multiple inference requests by processing them with the same CNN, greatly improving throughput. In addition to the reuse of filter weights in convolutional operations (also available with batching), the finer execution granularity of the new mode further improves cache utilization. In “throughput” mode, CPU cores are evenly distributed between parallel inference requests, following the general “parallelize the outermost loop first” rule of thumb. The mode also reduces the amount of scheduling and synchronization compared with a latency-oriented approach, in which every CNN operation is parallelized internally across all CPU cores.
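The even distribution of cores between parallel requests can be pictured with a small sketch. This is only an illustration of the idea, not the toolkit's actual scheduler: the function name `partition_cores` and the use of contiguous core IDs are assumptions made for the example.

```python
def partition_cores(num_cores, num_requests):
    """Split core IDs 0..num_cores-1 into num_requests contiguous groups.

    Hypothetical helper for illustration only; it does not call or mirror
    any OpenVINO API. Group sizes differ by at most one core.
    """
    if num_requests < 1 or num_requests > num_cores:
        raise ValueError("need 1 <= num_requests <= num_cores")
    base, extra = divmod(num_cores, num_requests)
    groups = []
    start = 0
    for i in range(num_requests):
        # The first `extra` groups take one additional core each.
        size = base + (1 if i < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


if __name__ == "__main__":
    # Six cores shared by three parallel inference requests.
    for request_id, cores in enumerate(partition_cores(6, 3)):
        print(f"request {request_id}: cores {cores}")
```

Each request then runs its whole network on its own group of cores, so no single operation has to be synchronized across every core in the machine.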

Intel says that the speedup from the new mode is particularly strong on high-end servers, but also significant on other Intel architecture-based systems.

Topology \ Machine    Dual-Socket Intel Xeon Platinum 8180 Processor    Intel Core i7-8700K Processor
mobilenet-v2          2.0x                                              1.2x
densenet-121          2.6x                                              1.2x
yolo-v3               3.0x                                              1.3x
se-resnext50          6.7x                                              1.6x

Together with the general threading refactoring also introduced in the R5 release, this means the toolkit does not require tuning OMP_NUM_THREADS, KMP_AFFINITY and other machine-specific settings to achieve these performance improvements; they are available with the “out of the box” configuration of the Intel Distribution of OpenVINO toolkit.

To simplify benchmarking, the Intel Distribution of OpenVINO toolkit features a dedicated Benchmark App that can be used to vary the number of inference requests running in parallel from the command line. The rule of thumb is to test up to the number of CPU cores in your machine. In addition to the number of inference requests, the batch size can also be varied from the command line to find the throughput sweet spot.
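The search described above can be sketched as a simple sweep over both settings. The sketch below makes several assumptions: `measure` is a hypothetical callback that you would supply yourself (for example, a wrapper that runs the Benchmark App or your own harness and returns frames per second), and the default batch sizes are arbitrary. It does not call any OpenVINO API.

```python
import os
from itertools import product


def candidate_settings(num_cores=None, batch_sizes=(1, 2, 4, 8)):
    """List (num_requests, batch_size) pairs to try.

    Following the rule of thumb, the number of parallel requests runs
    from 1 up to the number of CPU cores. The batch sizes are assumed
    example values, not recommendations from the toolkit.
    """
    if num_cores is None:
        num_cores = os.cpu_count() or 1
    return list(product(range(1, num_cores + 1), batch_sizes))


def find_sweet_spot(measure, num_cores=None, batch_sizes=(1, 2, 4, 8)):
    """Return (num_requests, batch_size, throughput) with the best throughput.

    `measure(num_requests, batch_size)` is a hypothetical user-supplied
    function returning a throughput figure such as frames per second.
    """
    best = None
    for num_requests, batch in candidate_settings(num_cores, batch_sizes):
        throughput = measure(num_requests, batch)
        if best is None or throughput > best[2]:
            best = (num_requests, batch, throughput)
    return best
```

If latency also matters, the same loop could record it alongside throughput and discard settings that exceed a latency budget before picking the best one.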

Tags: deep learning, Artificial Intelligence, Intel