Growing demands on internet infrastructure call for faster and more efficient ways to direct network traffic, and Domain Name System (DNS) load balancing plays a critical role in this process. Max Schrötter, Sten Heimbrodt, and Bettina Schnor from the University of Potsdam investigate whether modern, programmable network interface cards (NICs) can accelerate DNS load balancing with their new system, XenoFlow. The research shows that offloading processing from the server CPU to the NIC’s programmable dataplane yields 44% lower latency than a comparable eBPF-based load balancer running on the host CPU. By exploring the capabilities and limitations of the BlueField-3 NIC, the team points to a path towards faster, more responsive internet services and highlights the benefits of processing data closer to the network itself.
BlueField-3 DPU Load Balancing Performance Evaluation
This paper presents XenoFlow, a simple Layer 3 (L3) load balancer implemented using the DOCA Flow API on a BlueField-3 Data Processing Unit (DPU). The research investigates the performance capabilities and limitations of the BlueField-3 and the DOCA Flow API for load balancing tasks. Key findings reveal that while the BlueField-3 does not achieve line rate for small packets, XenoFlow shows 44% lower latency than a comparable eBPF implementation running on the host CPU, even under high load. The advantage stems from hardware offloading and from processing packets closer to the network.
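To make the idea of a Layer 3 load balancer concrete, here is a minimal sketch in Python. It is not XenoFlow's implementation and does not use the DOCA Flow API. It assumes a hypothetical pool of backend addresses and chooses a backend by hashing the client's source IP, which is one common stateless strategy; the summary above does not say which selection strategy XenoFlow uses. The names `BACKENDS`, `pick_backend`, and `rewrite_destination` are invented for illustration.

```python
import ipaddress
import zlib

# Hypothetical backend pool (documentation-range addresses, RFC 5737).
BACKENDS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]


def pick_backend(src_ip: str, backends=BACKENDS) -> str:
    """Map a client source IP to one backend, deterministically."""
    if not backends:
        raise ValueError("backend pool is empty")
    # Hash the packed address bytes so IPv4 and IPv6 are handled alike.
    key = ipaddress.ip_address(src_ip).packed
    return backends[zlib.crc32(key) % len(backends)]


def rewrite_destination(packet: dict, backends=BACKENDS) -> dict:
    """Return a copy of the packet whose destination IP is the chosen backend."""
    forwarded = dict(packet)  # leave the caller's packet untouched
    forwarded["dst_ip"] = pick_backend(packet["src_ip"], backends)
    return forwarded
```

Because the choice depends only on the source address, packets from the same client always reach the same backend without any per-connection state. On a DPU, a decision of this kind would be expressed as match/action rules in hardware rather than as per-packet software.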
The paper highlights challenges in working with the DOCA Flow API and the BlueField-3, specifically sparse documentation and limited access to internal API specifications. The research suggests that the BlueField-3’s hybrid architecture holds promise for moving more functionality, such as intrusion detection or firewalls, onto the DPU. XenoFlow provides a foundation for such more complex DPU-based network services.
XenoFlow Achieves Low Latency on BlueField-3
Scientists have demonstrated XenoFlow, a load balancer running on the NVIDIA BlueField-3 SmartNIC. The work investigates offloading network functions from the host CPU to the programmable dataplane of the BlueField-3. Measurements show that although the BlueField-3 does not currently reach line rate with small packets, XenoFlow delivers 44% lower latency than a comparable eBPF-based load balancer running on the host CPU, and it keeps that advantage under high load. The result is a step towards using SmartNICs to accelerate data center applications and improve network performance.
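Simple arithmetic shows why small packets make line rate hard to reach. Every Ethernet frame carries a fixed 20 bytes of on-wire overhead (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap), so the packet rate needed to saturate a link rises steeply as frames shrink. The sketch below computes that rate and the resulting time budget per packet. The 100 Gbit/s link speed in the usage note is an assumption for illustration, not a figure from the paper, and the function names are hypothetical.

```python
ETHERNET_OVERHEAD_BYTES = 20  # preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12)
MIN_FRAME_BYTES = 64          # minimum Ethernet frame size, including header and FCS


def line_rate_pps(link_gbps: float, frame_bytes: int) -> float:
    """Packets per second needed to saturate a link with frames of the given size."""
    if frame_bytes < MIN_FRAME_BYTES:
        raise ValueError("Ethernet frames are at least 64 bytes")
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire


def per_packet_budget_ns(link_gbps: float, frame_bytes: int) -> float:
    """Average time available to process one packet at line rate, in nanoseconds."""
    return 1e9 / line_rate_pps(link_gbps, frame_bytes)
```

On the assumed 100 Gbit/s link, `line_rate_pps(100, 64)` is about 148.8 million packets per second, which leaves roughly 6.72 ns per packet, while `line_rate_pps(100, 1518)` is only about 8.13 million packets per second. A device can therefore keep up comfortably with large frames and still fall short of line rate with minimum-size ones.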
XenoFlow Achieves Lower Latency Load Balancing
This work presents XenoFlow, a load balancer implemented on the BlueField-3 Data Processing Unit, and details an evaluation of the platform’s capabilities. Results show that XenoFlow delivers 44% lower latency than a comparable eBPF-based load balancer running on a standard host CPU, even under substantial load. The research also acknowledges a current limitation of the BlueField-3 platform: it does not reach line rate, and therefore peak throughput, with small packets. Furthermore, the authors note that sparse documentation and limited access to internal API specifications complicate development. Future work should focus on evaluating the BlueField-3’s performance in real-world network scenarios and on integrating additional network functions, such as intrusion detection systems or firewalls, directly into the DPU.
👉 More information
🗞 XenoFlow: How Fast Can a SmartNIC-Based DNS Load Balancer Run?
🧠 ArXiv: https://arxiv.org/abs/2509.21656
