Privacy-Preserving AI Gets Speed Boost with New Mathematical Shortcut for Complex Calculations

Homomorphic encryption (HE) offers a powerful means of performing privacy-preserving machine learning, yet evaluating the softmax function, a critical component of modern transformer architectures, presents a significant computational hurdle. Hanjun Park, Byeong-Seo Min, and Jiheon Woo from the Department of Electrical Engineering at POSTECH, together with Min-Wook Jeong, Jongho Shin, and colleagues from the LG Electronics R&D Center, and Yongwoo Lee from Inha University, address this challenge with a novel reformulation called MGF-softmax. Their research introduces a moment generating function-based approach that replaces the problematic softmax denominator with a moment-based equivalent, substantially reducing multiplicative depth without sacrificing accuracy. This advance is significant because it enables more efficient and accurate inference under HE, achieving performance comparable to high-depth exact methods at a considerably reduced computational cost, as demonstrated through experiments on Vision Transformers and large language models.

Moment generating functions enable efficient privacy-preserving softmax evaluation without revealing individual inputs

Researchers have developed MGF-softmax, a new method for performing privacy-preserving machine learning inference on encrypted data. This work addresses a critical bottleneck in homomorphic encryption, specifically the computationally intensive softmax function used in transformer architectures. Evaluating softmax directly on encrypted data is challenging due to its complex structure, the wide range of values produced by exponential functions, and the need for accurate division during normalization.

MGF-softmax reformulates the softmax function using the moment generating function, replacing the standard denominator with a moment-based equivalent. This reformulation substantially reduces the multiplicative depth required for computation while maintaining the essential properties of the softmax function.
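
To make the reformulation concrete, here is a minimal plaintext sketch in Python/NumPy of what a moment-based denominator can look like. It illustrates the general idea rather than the paper's exact construction: the softmax denominator equals n·M(1), where M is the empirical moment generating function of the scores, and a truncated-moment expansion of M(1) uses only additions and multiplications.

```python
import math
import numpy as np

def softmax_exact(x):
    """Standard softmax: exp(x_i) / sum_j exp(x_j), with max subtraction."""
    e = np.exp(x - x.max())
    return e / e.sum()

def softmax_moment_denominator(x, order=8):
    """Illustrative moment-based denominator (not the paper's exact formula).

    The denominator sum_j exp(x_j) equals n * M(1), where
    M(t) = (1/n) * sum_j exp(t * x_j) is the empirical moment generating
    function of the scores.  Truncating its Taylor series gives
    M(1) ~ sum_{k<=K} m_k / k!  with raw moments m_k = (1/n) * sum_j x_j**k,
    which needs only additions and multiplications -- the operations that
    are cheap under homomorphic encryption.
    """
    n = x.size
    m_at_one = sum(np.mean(x ** k) / math.factorial(k) for k in range(order + 1))
    return np.exp(x) / (n * m_at_one)   # numerator kept exact for clarity

scores = 0.5 * np.random.randn(128)     # attention-like scores for 128 tokens
print(np.abs(softmax_exact(scores) - softmax_moment_denominator(scores)).max())
```

In a real HE pipeline the numerator would also be replaced by a polynomial approximation; the snippet keeps it exact so that only the effect of the denominator substitution is visible.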

Crucially, MGF-softmax asymptotically converges to the exact softmax result as the number of input tokens increases, ensuring accuracy. Extensive experiments utilising both Vision Transformers and large language models demonstrate that MGF-softmax provides an efficient and accurate approximation of softmax during encrypted inference.

The method achieves inference accuracy comparable to high-depth exact methods, but with a significantly lower computational cost achieved through reduced multiplicative depth. By eliminating the need for homomorphic division and explicit maximum subtraction, MGF-softmax circumvents computationally expensive operations.

This reduction in multiplicative depth is vital, minimising the need for bootstrapping and translating into substantial computational savings. Theoretical analysis confirms key properties of MGF-softmax and provides a defined bound on its approximation error relative to the exact softmax function. Experimental results show a significant reduction in level consumption compared to existing softmax approximation techniques, such as that presented by Cho et al. (2024).

Furthermore, MGF-softmax consistently maintains high accuracy across diverse tasks, demonstrating less than a 1% accuracy drop for Vision Transformers (ViT/DeiT) on the ImageNet-1k dataset and comparable performance on large language models such as LLaMA-3.2-1B. This breakthrough paves the way for more practical and efficient privacy-preserving machine learning applications, enabling secure inference without exposing sensitive user data. The innovation promises to accelerate the adoption of machine learning as a service while upholding stringent privacy standards.

Moment generating function based softmax approximation for reduced multiplicative depth offers efficient computation

A novel softmax reformulation, termed MGF-softmax, underpinned the research into efficient homomorphic encryption inference. This method replaces the standard softmax denominator with a moment-based counterpart derived from the moment generating function. By utilising this substitution, the work substantially reduced multiplicative depth, a key factor in computational cost within homomorphic encryption schemes, while maintaining the essential properties of the softmax function.

The MGF-softmax approximation asymptotically converges to the exact softmax as the number of input tokens increases, ensuring accuracy with larger datasets. Experiments were conducted on both Vision Transformers and large language models to assess the performance of MGF-softmax in encrypted inference scenarios.

The study rigorously compared MGF-softmax against high-depth exact methods, demonstrating that it achieves comparable inference accuracy. Crucially, this accuracy was attained with a significantly lower computational burden, directly attributable to the reduced multiplicative depth achieved through the moment-based reformulation.

Performance was evaluated by measuring inference accuracy on standard datasets for both Vision Transformers and large language models. The methodology deliberately avoided techniques that require comparison operations, such as explicit maximum subtraction, since comparisons are not natively supported under homomorphic encryption and must be approximated at considerable cost. Instead, the research focused on replacing the exponential function and the division inherent in the standard softmax calculation with more efficient, polynomial-based equivalents.
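
As a hedged illustration of that strategy, and not the specific polynomial, degree, or input interval used in the paper, a low-degree Chebyshev fit of the exponential over an assumed score range shows how the exponential can be traded for additions and multiplications:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Least-squares Chebyshev fit of exp on an assumed score interval.
# Evaluating a degree-d polynomial homomorphically needs only additions
# and multiplications, so a fit like this can stand in for the
# exponential inside softmax.  Interval and degree are assumptions.
LOW, HIGH, DEGREE = -8.0, 2.0, 12
xs = np.linspace(LOW, HIGH, 2001)
cheb = C.Chebyshev.fit(xs, np.exp(xs), deg=DEGREE, domain=[LOW, HIGH])

max_err = np.abs(cheb(xs) - np.exp(xs)).max()
print(f"degree-{DEGREE} fit on [{LOW}, {HIGH}]: max abs error {max_err:.2e}")
```

The chosen degree controls both the approximation error and the multiplicative depth consumed when the polynomial is evaluated on a ciphertext, which is the trade-off the paper targets.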

This approach circumvents the need for complex numerical stabilisation techniques commonly used in plaintext inference, streamlining the process for encrypted data. The resulting MGF-softmax implementation offers a practical solution for deploying privacy-preserving machine learning models without sacrificing performance or accuracy.

High-accuracy homomorphic inference via efficient softmax approximation enables practical privacy-preserving machine learning

MGF-softmax achieves less than a 1% accuracy drop for Vision Transformers (ViT/DeiT) on ImageNet-1k and for a large language model (LLaMA-3.2-1B) on Clinc150, Banking77, and SST-2. This performance is particularly notable given the challenges inherent in homomorphic inference, where existing softmax replacement methods often exhibit significant degradation.

The research introduces MGF-softmax, a novel reformulation of the softmax function eliminating the need for homomorphic division and maximum subtraction, substantially reducing computational cost. Theoretical analysis establishes key properties of MGF-softmax and provides an asymptotic bound on its approximation error relative to the exact softmax function.

Experimental validation demonstrates that MGF-softmax attains inference accuracy comparable to exact softmax while significantly reducing inference cost under homomorphic encryption. The method substantially reduces multiplicative depth compared to existing softmax approximation approaches, minimizing the need for bootstrapping and translating to significant computational savings.

This depth reduction is critical because it directly determines level consumption, where MGF-softmax outperforms the existing softmax approximation baseline. MGF-softmax is based on the moment generating function, replacing the softmax denominator with a moment-based counterpart. This reformulation preserves key properties of softmax and asymptotically converges to the exact softmax as the number of input tokens increases.

The work focuses on efficient evaluation of softmax, a core component of transformer architectures, within the framework of fully homomorphic encryption. The CKKS scheme is utilized, supporting approximate arithmetic operations over encrypted real numbers with a ciphertext slot count denoted as ‘s’. Supported homomorphic operations include element-wise addition, cyclic rotation, and multiplication, with ciphertext-ciphertext multiplication consuming one multiplicative level.

Each ciphertext is assigned a finite budget of multiplicative levels ‘L’, and exceeding this budget necessitates bootstrapping, a computationally expensive operation. Consequently, algorithmic efficiency is evaluated based on multiplicative depth and the counts of dominant operations, specifically ciphertext-ciphertext multiplication and cyclic rotation. The research addresses challenges including overflow from the exponential function and the lack of native division support in homomorphic encryption schemes.
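
To illustrate how that accounting works, here is a generic CKKS-style bookkeeping sketch, not the paper's specific evaluation schedule, comparing the multiplicative depth of the naive Horner rule with a balanced power-tree evaluation of the same polynomial:

```python
import math

def horner_depth(degree):
    """Multiplicative depth of Horner's rule on an encrypted input:
    each step computes acc = acc * x + c_k, one ciphertext-ciphertext
    multiplication per coefficient, so depth grows linearly in the degree."""
    return degree

def balanced_power_depth(degree):
    """Depth when the powers x^2, ..., x^d are built with a balanced
    product tree (repeated squaring and pairing): about log2(d) levels.
    Plaintext-by-ciphertext scalings in the final linear combination
    are ignored here for simplicity."""
    return degree if degree <= 1 else math.ceil(math.log2(degree))

for d in (7, 15, 31, 63):
    print(f"degree {d:2d}: Horner depth {horner_depth(d):2d}, "
          f"balanced depth {balanced_power_depth(d):2d}")
```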

Moment generating functions enable efficient and accurate softmax approximation for encrypted inference by preserving privacy

MGF-softmax, a novel reformulation of the softmax function for use with homomorphic encryption, substantially reduces computational cost during machine learning inference. The method replaces the traditional softmax denominator with a moment-based counterpart derived from the moment generating function, thereby minimising multiplicative depth while maintaining key softmax properties.

This reformulation allows for accurate approximation of softmax, particularly crucial within transformer architectures where evaluating this function presents significant challenges. Experiments conducted on both Vision Transformers and large language models demonstrate that MGF-softmax achieves inference accuracy comparable to high-depth exact methods, but with a significantly lower computational burden.

Specifically, the approach maintains accuracy within one per cent of plaintext baselines, requiring only seven to ten multiplicative depths, a substantial improvement over existing approximations which can demand up to fifty-two depths to achieve similar results. The authors acknowledge a performance decrease of 8.2 per cent on a specific Vision Transformer configuration with a large number of classes, suggesting potential limitations when dealing with highly complex classification tasks.
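
To see why those depth figures matter in practice, consider a deliberately simplified back-of-the-envelope model; the 25-level budget below is an assumption for illustration, not a parameter reported in the paper. Each bootstrap restores a fixed number of usable multiplicative levels, so a deeper softmax block forces more refreshes per evaluation.

```python
import math

def bootstraps_per_block(block_depth, usable_levels=25):
    """Bootstraps forced by one block of the given multiplicative depth,
    assuming a fresh ciphertext and that each bootstrap restores the full
    (assumed) level budget -- a deliberately simplified model."""
    if block_depth <= usable_levels:
        return 0
    return math.ceil((block_depth - usable_levels) / usable_levels)

for name, depth in [("52-depth softmax approximation", 52),
                    ("MGF-softmax, upper end", 10)]:
    print(f"{name}: {bootstraps_per_block(depth)} bootstrap(s) per evaluation")
```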

This work offers a scalable and efficient solution for secure inference, reconciling high accuracy with reduced computational complexity. By facilitating the deployment of large language models and Vision Transformers without compromising data privacy, MGF-softmax has positive implications for sensitive sectors like healthcare and finance.

Furthermore, the reduction in computational demands addresses environmental concerns associated with energy-intensive secure computing protocols, contributing to more sustainable machine learning practices. Future research could explore optimising MGF-softmax for diverse model architectures and datasets, and investigating its performance in more complex privacy-preserving machine learning scenarios.

👉 More information
🗞 Efficient Softmax Reformulation for Homomorphic Encryption via Moment Generating Function
🧠 ArXiv: https://arxiv.org/abs/2602.01621

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.
