PAL*M Achieves Efficient Attestation for Large Generative Models and Datasets

Researchers are tackling the critical challenge of verifying the trustworthiness of increasingly powerful generative AI models. Prach Chantasantitam, Adam Ilyas Caulfield, and Vasisht Duddu of the University of Waterloo, together with Lachlan J. Gunn of Aalto University and N. Asokan, introduce PAL*M, a novel property attestation framework designed specifically for large generative models such as large language models. This work is significant because existing methods struggle with the scale and complexity of these systems, hindering accountability and compliance with emerging regulations. PAL*M establishes a robust system for proving model and data integrity, using confidential virtual machines and efficient hashing techniques to ensure both security and scalability, paving the way for responsible AI deployment.


The PAL*M framework is crucial for generative model accountability

This breakthrough research tackles a critical gap in current approaches, which struggle to support the complexities of generative models and the sheer scale of modern datasets. PAL*M introduces a unique solution for tracking data integrity: incremental multiset hashing applied to memory-mapped datasets, enabling efficient monitoring even when datasets exceed the capacity of trusted execution environment (TEE) memory. The core innovation of PAL*M lies in its ability to handle large datasets that are accessed randomly, a significant challenge for traditional attestation methods. The researchers employ incremental multiset hash functions to construct representative measurements of datasets held in external storage, ensuring secure integrity tracking at runtime.
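To make the idea concrete, here is a minimal sketch of an incremental multiset hash computed over a memory-mapped file. The additive mod-2^256 accumulator, the fixed-size records, and the index binding are illustrative assumptions for this sketch, not PAL*M's published construction:

```python
import hashlib
import mmap

MOD = 2**256  # accumulator modulus (an assumption for this sketch)

def record_digest(index: int, record: bytes) -> int:
    # Bind each record to its index so that substituting or duplicating
    # records changes the measurement.
    h = hashlib.sha256(index.to_bytes(8, "big") + record).digest()
    return int.from_bytes(h, "big")

class MultisetHash:
    """Additive incremental multiset hash: elements can be added (or
    removed) in any order, and the accumulator can be updated at runtime
    as the dataset is accessed."""
    def __init__(self) -> None:
        self.acc = 0

    def add(self, index: int, record: bytes) -> None:
        self.acc = (self.acc + record_digest(index, record)) % MOD

    def remove(self, index: int, record: bytes) -> None:
        self.acc = (self.acc - record_digest(index, record)) % MOD

def hash_dataset(path: str, record_size: int) -> int:
    """Measure a dataset of fixed-size records via mmap, so the TEE never
    needs the whole file resident in protected memory at once."""
    ms = MultisetHash()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for i in range(len(mm) // record_size):
                ms.add(i, mm[i * record_size:(i + 1) * record_size])
    return ms.acc
```

Because the accumulator is commutative, the same measurement results regardless of the order in which records are touched, which is what makes random access over external storage tractable.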
Furthermore, the framework defines methods for measuring properties of generative model operations, incorporating attestation evidence from GPUs without compromising confidential details. This is achieved through TEE-aware GPUs, which efficiently and securely extend integrity guarantees across heterogeneous computing environments. The study establishes a property attestation protocol demonstrating how these measurements and outputs can be combined to prove that data and models were produced on a PAL*M-equipped CPU-GPU configuration. The framework's implementation with real-world datasets and models confirms that it meets all specified requirements for robust property attestation.
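The shape of such a protocol can be illustrated with a hedged sketch of the evidence a verifier might check. The field names and the verify_cvm_report, verify_gpu_report, and extract_measurements helpers below are hypothetical stand-ins, not PAL*M's actual message format or API:

```python
from dataclasses import dataclass

@dataclass
class AttestationEvidence:
    cvm_report: bytes    # hardware-signed confidential-VM quote (CPU side)
    gpu_report: bytes    # attestation report from a TEE-aware GPU
    dataset_hash: bytes  # multiset-hash measurement of the input data
    model_hash: bytes    # digest of the produced model artifact
    claims: dict         # asserted properties, e.g. {"op": "fine-tune"}

# Hypothetical stand-ins for platform-specific signature verification and
# report parsing (e.g., against the CPU and GPU vendors' roots of trust);
# these are not a real API.
def verify_cvm_report(report: bytes) -> bool: ...
def verify_gpu_report(report: bytes) -> bool: ...
def extract_measurements(report: bytes) -> dict: ...

def accept(ev: AttestationEvidence) -> bool:
    """A verifier accepts only if both hardware reports validate AND the
    measurements bound inside the CVM report match the prover's claims."""
    if not (verify_cvm_report(ev.cvm_report) and verify_gpu_report(ev.gpu_report)):
        return False
    bound = extract_measurements(ev.cvm_report)
    return bound["dataset"] == ev.dataset_hash and bound["model"] == ev.model_hash
```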

Specifically, the research details how PAL*M can verify operations such as fine-tuning, quantization, and even complete LLM chat sessions, all without revealing sensitive data or model parameters to the verifier. This is particularly important in light of emerging regulations, such as the EU's AI Act, which demand proof of model properties related to accuracy, training procedures, and data provenance. The work opens exciting possibilities for building trust and transparency in AI systems. By providing a secure and verifiable record of model properties, PAL*M facilitates accountability to regulations and policies, enabling responsible deployment of generative models across critical domains such as healthcare, finance, and autonomous systems.

Experiments revealed that PAL*M effectively addresses limitations in existing approaches that struggle with generative models and extensive datasets. For dataset attestations, the framework incurs 62% to 70% overhead during hashing operations, primarily due to the initial attribute-distribution and preprocessing proofs, tasks that remain relevant for any future use of the dataset. The team measured that parallelizing dataset lookup iterations across 8 cores significantly improves performance, particularly with the memory-mapped approach, which exhibits superior I/O scaling compared to in-memory methods, as sketched below.
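The order-independence of the multiset hash is what makes this parallelization natural: each core accumulates a partial hash over its own slice of the memory-mapped file, and the partials combine with a single modular sum. A minimal sketch, reusing MultisetHash and MOD from the earlier example (the even chunking scheme is an assumption):

```python
import mmap
from multiprocessing import Pool

def _hash_chunk(args):
    path, record_size, start, stop = args
    ms = MultisetHash()  # from the earlier sketch
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for i in range(start, stop):
                ms.add(i, mm[i * record_size:(i + 1) * record_size])
    return ms.acc

def hash_dataset_parallel(path: str, record_size: int, n_records: int,
                          workers: int = 8) -> int:
    # Split the record index range across workers; each process maps the
    # file independently, so no chunk is ever copied between processes.
    step = -(-n_records // workers)  # ceiling division
    chunks = [(path, record_size, s, min(s + step, n_records))
              for s in range(0, n_records, step)]
    with Pool(workers) as pool:
        partials = pool.map(_hash_chunk, chunks)
    # Commutativity lets the partial accumulators merge in any order.
    return sum(partials) % MOD
```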

Measurements confirm that the memory-mapped approach reduces memory usage from 85-87 GB to just 4 GB, a substantial reduction in resource requirements. Proofs of fine-tuning incur minimal overhead, ≤1.35% across all tested models (Llama-3.1-8B, Gemma-3-4B, and Phi-4-Mini). The team recorded a total time of 268.81 minutes for fine-tuning Llama-3.1-8B using the in-memory approach, rising slightly to 269.15 minutes with the memory-mapped variant. For proof of evaluation on the MMLU benchmark, the in-memory case showed overheads of 3.81-5.06%, while the memory-mapped case exhibited 10.03-11.84%. For proof of inference with a single prompt, measurement overhead is substantial, reaching 64.34% for Llama-3.1-8B, although attestation overhead remains consistent. When applied to an entire inference session, however, with attestation performed only after all interactions, the measurement overhead drops dramatically to 11.03% for Llama-3.1-8B, 3.57% for Gemma-3-4B, and 6.28% for Phi-4-Mini, highlighting the framework's adaptability and efficiency in realistic scenarios.
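One way to read the session-level numbers: if each chat turn is folded into a running measurement and a single attestation report is produced over the final digest, the fixed attestation cost is amortized across the whole interaction. A hedged sketch of such a running measurement (the hash-chain encoding is an assumption, not PAL*M's specified format):

```python
import hashlib

def extend(measurement: bytes, role: str, message: str) -> bytes:
    # Fold one chat turn into the running session measurement.
    return hashlib.sha256(
        measurement + role.encode() + b"\x00" + message.encode()
    ).digest()

d = b"\x00" * 32  # initial session measurement
d = extend(d, "user", "Summarize this contract.")
d = extend(d, "assistant", "The contract covers ...")
# ... further turns are measured as they occur ...
# A single attestation report over `d` then covers the entire session,
# rather than one report per prompt/response pair.
session_measurement = d
```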

PAL*M verifies large model training integrity through comprehensive measurements

This system addresses a critical gap in current approaches, which struggle to accommodate the scale and complexity of these models and their associated datasets. PAL*M establishes a method for verifying characteristics throughout both the training and inference stages, ensuring accountability and adherence to emerging regulations. Crucially, PAL*M employs incremental multiset hashing over memory-mapped datasets, enabling efficient tracking of data integrity without requiring the entire dataset to reside in main memory. The authors also suggest that future research could explore applying PAL*M to other types of generative models beyond large language models, expanding its potential impact. This work represents a significant step towards building trustworthy and accountable AI systems, providing a robust mechanism for verifying the integrity of both data and models used in critical applications.

👉 More information
🗞 PAL*M: Property Attestation for Large Generative Models
🧠 arXiv: https://arxiv.org/abs/2601.16199

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.

Latest Posts by Rohail T.:

Film Decoders Achieve 11.1x Faster Quantum Error Correction on IBM Systems
January 26, 2026

GPU Acceleration Achieves 100x Speedup for Sample-Based Quantum Diagonalization
January 26, 2026

Quantinuum H2-2 Demonstrates Energy-Resolved Transport and 8×7 Lattice Localization
January 26, 2026