IBM and ETH Zurich Advance AI with Efficient Analog In-Memory Computing and NMPU

Analog In-Memory Computing (AIMC) is a promising technology for deep-learning inference because it performs computations within the memory itself, improving both energy efficiency and latency. However, AIMC faces challenges from circuit mismatches and memory-device nonidealities, which require digital post-processing to correct. The Near-Memory Digital Processing Unit (NMPU) is proposed to handle this step. Implemented in a 14 nm CMOS technology, the NMPU provides a 1.39× speedup, a 78% smaller area, and competitive power consumption compared with an FP16 baseline. Further research is still needed to fully realize its capabilities and address remaining challenges.

What is Analog In-Memory Computing (AIMC) and what is its importance in deep learning?

Analog In-Memory Computing (AIMC) is an emerging technology gaining traction for its speed and energy efficiency in deep learning inference. Deep learning, a subset of artificial intelligence (AI), involves the use of neural networks with many layers (deep neural networks) to model and understand complex patterns. Inference is the process of making predictions using a trained deep-learning model.

AIMC is a non-von Neumann architecture that performs computational tasks within the memory itself, a paradigm known as In-Memory Computing (IMC). This approach tackles the von Neumann bottleneck, leading to significant improvements in both energy efficiency and latency. The Matrix-Vector Multiplication (MVM) operation, which dominates the computations of deep neural network inference, can be performed by mapping offline-trained network weights onto AIMC tiles, as sketched below.
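To make this concrete, here is a minimal Python/NumPy sketch of an idealized, noise-free AIMC MVM: weights are scaled onto a hypothetical conductance range, inputs become read voltages, and the summed column currents reproduce W @ x. The device values are illustrative assumptions, not figures from the paper.

```python
import numpy as np

# Idealized, noise-free AIMC tile performing y = W @ x.
# (Real tiles encode negative weights with differential device pairs; ignored here.)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))           # offline-trained layer weights
x = rng.normal(size=8)                # input activations

g_max = 25e-6                         # hypothetical max device conductance (siemens)
w_max = np.abs(W).max()
G = W / w_max * g_max                 # map weights onto the conductance range

v_read = 0.2                          # hypothetical read-voltage scale (volts)
i_out = G @ (x * v_read)              # column currents: Ohm's law + current summation

y = i_out * w_max / (g_max * v_read)  # rescale currents back to the weight domain
print(np.allclose(y, W @ x))          # True: the analog MVM reproduces W @ x
```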

What are the Challenges and Solutions in AIMC?

Despite its potential, AIMC faces challenges associated with circuit mismatches and nonidealities of the memory devices. Addressing these issues requires a certain amount of digital post-processing, and this is where the Near-Memory Digital Processing Unit (NMPU) comes in. The NMPU, based on fixed-point arithmetic, is proposed to overcome these limitations.

The NMPU achieves competitive accuracy and higher computing throughput than previous approaches while minimizing the area overhead. It supports standard deep-learning post-processing steps such as the Rectified Linear Unit (ReLU) activation and Batch Normalization.
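As a rough illustration of fixed-point post-processing, the following Python sketch folds batch normalization into a per-column scale and shift applied in integer arithmetic, followed by ReLU. The Q8 precision and the interface are assumptions for illustration, not the paper's actual datapath.

```python
import numpy as np

# Hypothetical fixed-point post-processing: batch normalization folded into a
# per-column scale (Q8 fixed point) and shift, followed by ReLU.

def nmpu_postprocess(adc_codes, scale_q8, shift):
    """scale_q8: per-column gains with 8 fractional bits; shift: integer offsets."""
    y = (adc_codes.astype(np.int64) * scale_q8) >> 8  # integer multiply + rescale
    y = y + shift                                     # batch-norm shift (code units)
    return np.maximum(y, 0)                           # ReLU in the integer domain

adc_codes = np.array([120, -45, 300, -7], dtype=np.int32)
scale_q8 = np.array([128, 307, 230, 512])  # about 0.5, 1.2, 0.9, 2.0 in Q8
shift = np.array([4, -2, 0, 10])
print(nmpu_postprocess(adc_codes, scale_q8, shift))  # [64, 0, 269, 0]
```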

How does the NMPU Work?

The NMPU design was physically implemented in a 14 nm CMOS technology. Using data from an AIMC chip, the authors demonstrate that a simulated AIMC system with the proposed NMPU outperforms existing Floating-Point 16 (FP16) based implementations, providing a 1.39× speedup, a 78% smaller area, and competitive power consumption.

The NMPU also achieves inference accuracies of 86.65% and 65.06%, accuracy drops of just 0.12% and 0.4% relative to the FP16 baseline, when benchmarked with ResNet9 and ResNet32 networks trained on the CIFAR-10 and CIFAR-100 datasets, respectively.

What is the Role of the AIMC Tile?

A typical AIMC tile comprises a crossbar array of memristive devices such as Phase Change Memory (PCM). The synaptic weights are programmed as the conductance values of these devices. The digital input to the tile is converted to voltage or pulse-duration values using Digital-to-Analog Converters (DACs) or pulse-width modulation units, respectively, and the output from the array is converted back to digital values using Analog-to-Digital Converters (ADCs).
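A rough Python model of this dataflow, assumed rather than taken from the authors' implementation, is sketched below: a uniform quantizer stands in for the DAC and ADC stages around an ideal crossbar MVM. Resolutions and full-scale ranges are illustrative assumptions.

```python
import numpy as np

# One AIMC tile read with its digital boundaries: DAC quantizes the inputs,
# the crossbar computes the analog MVM, and per-column ADCs quantize the currents.

def quantize(x, bits, full_scale):
    """Uniform symmetric quantizer standing in for a DAC or ADC."""
    half_levels = (1 << (bits - 1)) - 1
    x = np.clip(x / full_scale, -1.0, 1.0)
    return np.round(x * half_levels) / half_levels * full_scale

def aimc_tile_read(G, x_digital, dac_bits=8, adc_bits=8):
    v = quantize(x_digital, dac_bits, full_scale=1.0)          # DAC stage
    i = G.T @ v                                                # crossbar column currents
    return quantize(i, adc_bits, full_scale=np.abs(i).max())   # ADC stage

rng = np.random.default_rng(1)
G = rng.uniform(0, 25e-6, size=(8, 4))   # PCM conductances: 8 rows x 4 columns
x = rng.uniform(-1, 1, size=8)           # digital input vector
print(aimc_tile_read(G, x))
```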

However, one key challenge is the substantial nonuniformity in the ADCs' conversion behavior, or transfer curves. To address this, AIMC tiles require additional per-column data post-processing through an affine correction procedure.
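Conceptually, each column's digital output is corrected as y = gain × code + offset, with per-column coefficients obtained by calibration. The Python sketch below illustrates this; the least-squares calibration interface is an assumption, not the paper's procedure.

```python
import numpy as np

# Per-column affine correction of raw ADC codes: y = gain * code + offset.

def calibrate(adc_codes, targets):
    """Fit per-column gain/offset from (num_samples, num_columns) arrays."""
    gains, offsets = [], []
    for col in range(adc_codes.shape[1]):
        A = np.stack([adc_codes[:, col], np.ones(adc_codes.shape[0])], axis=1)
        g, b = np.linalg.lstsq(A, targets[:, col], rcond=None)[0]
        gains.append(g)
        offsets.append(b)
    return np.array(gains), np.array(offsets)

def affine_correct(adc_codes, gains, offsets):
    """Apply the calibrated correction column-wise."""
    return adc_codes * gains + offsets

# Example: each column has its own mismatched (noise-free) transfer curve.
rng = np.random.default_rng(2)
true_vals = rng.normal(size=(64, 4))
raw = true_vals * np.array([0.9, 1.1, 1.05, 0.95]) + np.array([0.3, -0.2, 0.1, 0.0])
g, b = calibrate(raw, true_vals)
print(np.allclose(affine_correct(raw, g, b), true_vals))   # True
```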

What is the Future of AIMC and NMPU?

The research conducted by the team from IBM Research Europe and ETH Zurich demonstrates the potential of AIMC and NMPU in deep learning inference. The NMPU’s ability to achieve high computational efficiency and low latency in AIMC systems makes it a promising technology for future AI applications.

However, as with any emerging technology, further research and development are needed to fully realize its potential and address any challenges that may arise. The team’s work is a significant step forward in this direction, contributing to the ongoing advancements in the field of AI and deep learning.

Publication details: “A Precision-Optimized Fixed-Point Near-Memory Digital Processing Unit for Analog In-Memory Computing”
Publication Date: 2024-02-12
Authors: Elena Ferro, Athanasios Vasilopoulos, Corey Lammie, Manuel Le Gallo et al.
Source: arXiv (Cornell University)
DOI: https://doi.org/10.48550/arxiv.2402.07549
