Intel is demonstrating scalable artificial intelligence performance with results from the newly released MLPerf Inference v6.0 benchmarks, showcasing its Xeon 6 processors and Arc Pro B-Series GPUs as solutions for AI inference across workstations, datacenters, and edge systems. The benchmarks show that a four-GPU system built on Intel Arc Pro B70/B65 graphics delivers 128GB of VRAM, enough to run 120-billion-parameter models, with the Arc Pro B70 achieving up to 1.8 times higher inference performance than the Arc Pro B60. Software optimizations within an open, containerized stack are also paying off, delivering up to 1.18 times higher performance on existing Intel Arc Pro B60 hardware compared with its MLPerf v5.1 results. Anil Nanduri, Intel vice president, AI Products and GTM, Intel Data Center Group, said, “The combination of Intel Xeon 6 and Intel’s Arc Pro B-Series GPUs represent our investment to expand customer choice and value, offering real-world solutions that address both LLM models as well as traditional machine learning workloads, with leading performance and incredible value for graphics professionals and AI developers worldwide.”
Intel Xeon 6 and Arc Pro B-Series GPU Performance in MLPerf v6.0
The demand for increasingly complex artificial intelligence models is driving rapid advances in hardware capability, particularly in graphics memory. Expanded memory capacity matters because larger models, while potentially more accurate and nuanced, demand substantially more computational resources, especially during the inference phase, when the model is generating outputs. Intel’s systems, pairing Intel Xeon 6 CPUs with Arc Pro B70/B65 graphics, are designed to serve both traditional machine learning and large language models. These gains are not attributable to the GPU alone; the CPU plays a vital role in overall system efficiency, handling tasks such as memory management and workload distribution. With up to 1.6 times more KV cache capacity, the Arc Pro B70 can handle larger models and longer context windows in multi-GPU setups, and a containerized solution built for Linux environments simplifies adoption.
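To illustrate how a multi-GPU box like this is typically driven, the sketch below shards one large model across four GPUs with tensor parallelism using the open-source vLLM library. vLLM, the model name, and every parameter here are illustrative assumptions; the article does not specify which serving stack or models were used in Intel’s containerized solution.

```python
# Illustrative sketch only: vLLM, the model name, and all settings are
# assumptions; the article does not detail Intel's containerized stack.
from vllm import LLM, SamplingParams

# Shard a single large model across four GPUs via tensor parallelism so the
# combined VRAM (e.g. 4 x 32GB = 128GB) holds the weights plus the KV cache.
llm = LLM(
    model="openai/gpt-oss-120b",   # hypothetical ~120B-parameter model
    tensor_parallel_size=4,        # one shard per GPU in the four-GPU system
    max_model_len=8192,            # longer context windows consume more KV cache
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize the MLPerf Inference benchmark suite."], params)
print(outputs[0].outputs[0].text)
```

In a containerized Linux deployment, a script like this would run inside the container image with the GPUs passed through to it, which is the kind of packaging the announcement alludes to.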
CPU-Accelerated System Performance & Intel’s AMX/AVX-512 Technologies
Intel is taking a holistic approach to artificial intelligence performance, extending beyond GPU throughput to emphasize the critical role of the central processing unit. The recent MLPerf Inference v6.0 submissions highlight the importance of CPU-accelerated system performance in modern AI infrastructure, since the CPU manages essential functions such as memory management and workload distribution. Intel is the only server processor vendor to submit stand-alone CPU results for these benchmarks, underscoring its commitment to advancing AI inference across all platforms. That focus is particularly evident in the Intel Xeon 6 processors, which delivered up to a 1.9 times generational performance gain in MLPerf Inference v5.1. Built-in AI acceleration technologies, including Intel AMX and AVX-512, enable efficient execution of workloads such as large language model inference and classical machine learning without requiring dedicated accelerator hardware. These advances aim to reduce reliance on proprietary AI models and their associated subscription costs, offering a compelling alternative for graphics professionals and AI developers seeking performance and value.
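For readers who want to confirm whether a given Xeon host actually exposes these instruction sets before relying on CPU-only inference, a minimal Linux-side check is sketched below. It simply reads /proc/cpuinfo feature flags and makes no claims about the specific software stack Intel used for its MLPerf submissions.

```python
# Illustrative sketch only: checks Linux CPU feature flags for the AMX and
# AVX-512 capabilities mentioned above; flag names follow /proc/cpuinfo.
def cpu_flags() -> set[str]:
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
print("AVX-512 foundation :", "avx512f" in flags)
print("AVX-512 BF16       :", "avx512_bf16" in flags)
print("AMX tile support   :", "amx_tile" in flags)
print("AMX BF16 support   :", "amx_bf16" in flags)
```

Inference frameworks that build on oneDNN can dispatch bfloat16 and int8 kernels to AMX automatically when these flags are present, which is what makes accelerator-free CPU inference practical on Xeon 6.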
