MARVEL Generates Custom RISC-V Extensions for Efficient Deep Neural Network Deployment

Deploying artificial intelligence on tiny, power-sipping IoT devices presents a significant hurdle, often demanding bespoke hardware solutions for each AI model. Researchers led by Ajay Kumar M and Cian O’Mahoney from University College Dublin, along with colleagues including Pedro Kreutz Werle, have introduced a new framework called MARVEL to address this challenge. This automated system generates custom RISC-V processor extensions tailored to classes of deep neural networks, with a focus on convolutional networks, enabling efficient AI deployment on severely resource-constrained devices. Unlike existing tools, MARVEL creates hardware and software that operate without an operating system or complex software dependencies, streamlining the process and reducing overhead. The team demonstrates a two-fold increase in inference speed and a corresponding reduction in energy consumption, achieved with a modest increase in hardware area when implemented on a modern FPGA platform. This advancement promises to unlock the potential of AI in a wider range of embedded applications, from smart sensors to wearable devices, by overcoming the limitations of current deployment pipelines.

Existing accelerator-generation tools struggle to address the extreme resource limitations faced by IoT endpoints operating without an operating system. Consequently, this work investigates methods for automatically generating custom hardware accelerators optimised for both performance and resource utilisation, enabling the deployment of sophisticated AI capabilities on even the most limited IoT platforms.

AI Acceleration on RISC-V for Edge Computing

This research details work on AI-enhanced RISC-V cores, specifically for edge computing applications. The core focus lies in improving the performance and efficiency of deep learning models on resource-constrained edge devices. Researchers have developed a system-level framework, LiteAIR5, for designing, modeling, and evaluating AI-extended RISC-V cores, providing a valuable tool for researchers and developers working on edge AI applications. The work demonstrates performance improvements and enhanced energy efficiency compared to existing solutions.

Automated Hardware Acceleration for Deep Learning

Researchers have developed a new automated framework, MARVEL, that addresses the challenge of deploying deep neural networks on resource-constrained IoT devices. Unlike existing approaches, MARVEL generates custom RISC-V instruction set extensions tailored to specific classes of deep learning models, particularly convolutional neural networks. The system profiles high-level Python-based models and automatically creates both the specialized hardware and the necessary compiler tools for efficient execution, eliminating the need for an operating system. This end-to-end approach significantly improves performance and efficiency, achieving up to a two-fold increase in inference speed and a two-fold reduction in energy consumption when tested on a Zynq UltraScale+ FPGA platform.
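The profiling step described above can be illustrated with a toy sketch: rank a network's layers by multiply-accumulate (MAC) count to see which operations dominate and would therefore merit a custom instruction. MARVEL's actual profiler and APIs are not shown in this article, so all layer shapes and helper names below are illustrative assumptions, not the framework's real interface:

```python
# Toy sketch of model-class-aware profiling: rank CNN layers by
# multiply-accumulate (MAC) count to find hot operations that might
# justify a custom RISC-V instruction. Shapes are illustrative only;
# this is NOT MARVEL's real profiler.

def conv2d_macs(in_ch, out_ch, kernel, out_h, out_w):
    """MACs for a standard 2-D convolution layer."""
    return in_ch * out_ch * kernel * kernel * out_h * out_w

# LeNet-5-like layer list (hypothetical shapes, for illustration).
layers = [
    ("conv1", conv2d_macs(1, 6, 5, 28, 28)),
    ("conv2", conv2d_macs(6, 16, 5, 10, 10)),
    ("fc1", 400 * 120),   # fully connected: in_features * out_features
    ("fc2", 120 * 84),
    ("fc3", 84 * 10),
]

total = sum(macs for _, macs in layers)
for name, macs in sorted(layers, key=lambda x: -x[1]):
    print(f"{name}: {macs} MACs ({100 * macs / total:.1f}% of total)")
```

On a profile like this the two convolution layers account for the large majority of MACs, which is why a framework targeting the convolutional model class would direct its generated extensions at those operations first.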

While other methods focus on optimizing specific aspects, MARVEL uniquely combines hardware and software co-design through automated profiling and extension generation. Evaluations across a range of popular models, including LeNet-5, MobileNet, ResNet, and DenseNet, demonstrate the framework’s versatility and effectiveness. Although the specialized hardware introduces a 28.23% increase in area, the substantial gains in speed and energy efficiency represent a significant advancement for edge AI applications where resources are severely limited.

Automated Hardware for Efficient Edge AI

This research presents MARVEL, an automated framework that generates custom RISC-V processor extensions specifically designed for convolutional neural networks (CNNs) and targeted at resource-constrained IoT devices. The framework bridges the gap between high-level Python-based AI models and low-level bare-metal C implementations, enabling efficient deployment without the need for an operating system. By automating the process of hardware customization, MARVEL achieves up to a two-fold improvement in inference speed and energy efficiency, with a modest 28.23% increase in hardware area when implemented on a Zynq UltraScale+ FPGA platform. The authors plan future work to refine power estimation accuracy, explore alternative baseline RISC-V cores, and expand support for a wider range of deep learning models and quantization levels, further enhancing hardware-software co-design for edge AI.
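As a back-of-envelope check on what a roughly two-fold end-to-end speedup implies, Amdahl's law relates the fraction of runtime spent in accelerated operations to the overall gain. The fractions and per-operation speedup below are illustrative assumptions, not figures reported in the paper:

```python
def overall_speedup(f_accel, s_accel):
    """Amdahl's law: f_accel is the fraction of runtime that is
    accelerated, s_accel is the speedup of that fraction."""
    return 1.0 / ((1.0 - f_accel) + f_accel / s_accel)

# Illustrative: if 80% of inference time were spent in convolutions and
# custom instructions made them 4x faster, the end-to-end gain would be
# 1 / (0.2 + 0.8/4) = 2.5x.
print(f"{overall_speedup(0.8, 4.0):.2f}x")
```

The takeaway is that an overall two-fold gain requires the accelerated operations to dominate the runtime, which is consistent with targeting the convolution-heavy model class.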

👉 More information
🗞 MARVEL: An End-to-End Framework for Generating Model-Class Aware Custom RISC-V Extensions for Lightweight AI
🧠 ArXiv: https://arxiv.org/abs/2508.01800

Quantum News

As the Official Quantum Dog (or hound), my role is to dig out the latest nuggets of quantum goodness. There is so much happening right now in technology, whether AI or the march of robots, but quantum occupies a special space. Quite literally a special space: a Hilbert space, in fact, haha! Here I try to provide some of the news that might be considered breaking in the quantum computing space.

Latest Posts by Quantum News:

IBM Remembers Lou Gerstner, CEO Who Reshaped Company in the 1990s

December 29, 2025
Optical Tweezers Scale to 6,100 Qubits with 99.99% Imaging Survival

December 28, 2025
Rosatom & Moscow State University Develop 72-Qubit Quantum Computer Prototype

December 27, 2025