Decentralized Federated Learning Enables Privacy-Preserving Computer Vision on Edge Devices

The increasing reliance on machine learning introduces significant privacy concerns, as collaborative model training often requires access to sensitive data. Damian Harenčák, Lukáš Gajdošech, Martin Madaras, and colleagues from Comenius University and Skeletex Research address this challenge with a novel investigation into decentralized, privacy-preserving federated learning for computer vision. Their research moves beyond traditional server-side protections, analysing vulnerabilities to data reconstruction not only from a central server but also from malicious activity amongst participating clients. This work is significant because it explores practical methods, including homomorphic encryption, gradient compression, and gradient noising, to safeguard data during the training of neural networks on edge devices, demonstrating a proof-of-concept implementation on the Jetson TX2 module. By assessing the trade-offs between privacy and accuracy, the team provides valuable insight into building robust and secure federated learning systems.

Client Data Leakage in Federated Learning

Federated learning offers a way to collectively train a single global model without sharing client data: each client shares only the updated parameters of its local model. A central server then aggregates the parameters from all clients and redistributes the aggregated model back to them. Recent findings have shown that even in this scenario, private data can be reconstructed from information about the model parameters. Current efforts to mitigate these risks focus mainly on the server side, assuming that other clients will not act maliciously.
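
The aggregation step can be made concrete with a short sketch. The snippet below is a minimal, illustrative implementation of federated averaging in Python with NumPy; the placeholder local update and the weighting by local dataset size are assumptions for illustration, not the exact protocol used in the paper.

```python
import numpy as np

def local_update(global_weights, lr=0.01):
    """Placeholder for one client's local training pass.

    In a real system this would run SGD on the client's private data;
    here a random gradient stands in so the sketch is self-contained.
    """
    gradients = [np.random.randn(*w.shape) * 0.01 for w in global_weights]
    return [w - lr * g for w, g in zip(global_weights, gradients)]

def federated_average(client_weight_sets, client_sizes):
    """Server-side aggregation: average client parameters, weighted by
    each client's (assumed) local dataset size."""
    total = sum(client_sizes)
    num_layers = len(client_weight_sets[0])
    return [
        sum((n / total) * client_weight_sets[c][layer]
            for c, n in enumerate(client_sizes))
        for layer in range(num_layers)
    ]

# One federated round: clients train locally, only parameters are shared.
global_weights = [np.zeros((4, 4)), np.zeros(4)]
client_sizes = [1200, 800, 500]                      # local samples per client
client_updates = [local_update(global_weights) for _ in client_sizes]
global_weights = federated_average(client_updates, client_sizes)
```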

In this work, we analyse various methods for improving the privacy of client data concerning federated learning systems. The primary objective of this research is to identify vulnerabilities in federated learning that allow for the reconstruction of private client data, even when only model parameters are exchanged. We approach this problem by examining the information leakage inherent in parameter updates and exploring techniques to minimise this leakage. Specifically, the research investigates the effectiveness of differential privacy mechanisms applied to client model updates, alongside the use of secure aggregation protocols.

This work also considers the impact of varying levels of client participation and data heterogeneity on privacy preservation. A key contribution of this study is a detailed analysis of the trade-off between privacy and model utility when employing differential privacy in a federated learning setting. The research quantifies the impact of different privacy budgets (ε, δ) on the accuracy of the global model, providing guidance on selecting appropriate parameters for specific applications. Furthermore, the work introduces a novel secure aggregation scheme designed to enhance privacy by reducing the information available to the central server.
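
To make the differential privacy mechanism concrete, the sketch below applies the standard Gaussian mechanism to a client update: the update is clipped in L2 norm and perturbed with noise calibrated to a privacy budget (ε, δ). The calibration formula and parameter values are textbook defaults assumed for illustration, not the paper's exact configuration.

```python
import numpy as np

def dp_sanitize_update(update, clip_norm=1.0, epsilon=1.0, delta=1e-5):
    """Clip a client update in L2 norm and add Gaussian noise.

    Noise scale follows the standard Gaussian-mechanism calibration
    sigma = clip_norm * sqrt(2 * ln(1.25 / delta)) / epsilon, which is
    assumed here for illustration rather than taken from the paper.
    """
    flat = np.concatenate([p.ravel() for p in update])
    scale = min(1.0, clip_norm / (np.linalg.norm(flat) + 1e-12))  # L2 clipping factor
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return [p * scale + np.random.normal(0.0, sigma, size=p.shape) for p in update]

# Smaller epsilon -> larger sigma -> stronger privacy but lower accuracy.
update = [np.random.randn(4, 4), np.random.randn(4)]
private_update = dp_sanitize_update(update, clip_norm=1.0, epsilon=0.5, delta=1e-5)
```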

This scheme incorporates a randomised response technique to obfuscate individual client contributions. Finally, the research provides empirical evidence demonstrating the vulnerability of existing federated learning systems to privacy attacks, using a benchmark dataset and a realistic simulation environment. The results highlight the need for robust privacy-preserving mechanisms and inform the development of more secure federated learning frameworks. Through rigorous experimentation and analysis, this work advances the understanding of privacy challenges in federated learning and offers practical solutions for mitigating these risks.
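
Randomised response itself can be illustrated on binary reports: each client flips its true bit with some probability, yet the server can still recover an unbiased aggregate. The toy sketch below assumes classical randomised response applied to bits (for example, the sign bits of an update); the paper's actual obfuscation scheme may differ.

```python
import numpy as np

def randomized_response(bits, p_truth=0.75):
    """Report each true bit with probability p_truth, otherwise flip it,
    giving every client plausible deniability for any single report."""
    bits = np.asarray(bits, dtype=int)
    keep = np.random.rand(bits.size) < p_truth
    return np.where(keep, bits, 1 - bits)

def debias_mean(reported, p_truth=0.75):
    """Recover an unbiased estimate of the true mean from noisy reports."""
    return (reported.mean() - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)

true_bits = np.random.binomial(1, 0.3, size=10_000)  # e.g. sign bits of an update
reported = randomized_response(true_bits)
print(debias_mean(reported))                         # close to the true rate of 0.3
```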

Data Reconstruction Risks in Federated Learning

Scientists have demonstrated vulnerabilities in federated learning systems, revealing that private data can be reconstructed from shared model parameters. This work analyses methods to enhance client data privacy against both the central server and other participating clients within neural network training. The research explored techniques including homomorphic encryption, gradient compression, and gradient noising, alongside modified federated learning approaches like split learning and fully encrypted models. Experiments focused on assessing the impact of these methods on the accuracy of convolutional neural networks used for image classification tasks.
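
Split learning, one of the modified approaches mentioned above, cuts the network between client and server so that only intermediate activations and their gradients cross the boundary. The sketch below is a minimal, assumed single-cut example with a two-layer linear/ReLU model; it is illustrative rather than the architecture evaluated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))                       # private client batch
y = rng.normal(size=(8, 1))

W_client = rng.normal(scale=0.1, size=(16, 32))    # client-side layer
W_server = rng.normal(scale=0.1, size=(32, 1))     # server-side layer
lr = 0.05

for step in range(3):
    # Client: forward to the cut layer, send activations (not raw data).
    h = np.maximum(0.0, x @ W_client)              # ReLU activations at the cut

    # Server: finish the forward pass, compute loss, backprop to the cut.
    pred = h @ W_server
    grad_pred = 2.0 * (pred - y) / len(y)          # d(MSE)/d(pred)
    grad_W_server = h.T @ grad_pred
    grad_h = grad_pred @ W_server.T                # sent back to the client

    # Client: finish backpropagation locally using its private inputs.
    grad_W_client = x.T @ (grad_h * (h > 0))

    W_server -= lr * grad_W_server
    W_client -= lr * grad_W_client
```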

The team measured the difficulty of reconstructing data when employing segmentation networks, finding them more resistant to such attacks than classification models. A proof-of-concept implementation was successfully deployed on a Jetson TX2 module, simulating a federated learning process and providing a practical testbed for the privacy-enhancing techniques. Analysis of gradient compression revealed that pruning small values can offer a beneficial trade-off between accuracy and security, though performance depends on the specific network structure and training data. The study quantified the effectiveness of these methods using the “bits of security” metric, a standard for evaluating cryptographic strength.

Researchers investigated additively homomorphic encryption (AHE) and fully homomorphic encryption (FHE) as server-side privacy solutions. Paillier encryption, an AHE scheme, was tested with a recommended key size of 2048 bits, providing a minimum of 112 bits of security. This scheme encrypts individual scalars, requiring each element of a gradient vector to be encrypted separately. Conversely, the CKKS encryption scheme, an FHE approach, allows entire vectors to be encrypted simultaneously, potentially improving efficiency through packing. Tests with CKKS encryption highlighted the importance of adjustable parameters such as the polynomial degree, ciphertext modulus, and scaling factor, all of which influence both security and computational load.
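
The difference between scalar-wise and packed encryption can be sketched with two widely used open-source implementations: python-paillier (phe) for the Paillier scheme and TenSEAL for CKKS. Using these particular libraries and parameter choices is an assumption made for illustration; the paper's implementation may differ.

```python
import numpy as np
from phe import paillier   # additively homomorphic (Paillier); assumed library choice
import tenseal as ts       # fully homomorphic (CKKS); assumed library choice

gradients = [np.array([0.12, -0.05, 0.33]), np.array([0.08, 0.01, -0.20])]

# --- Paillier: each scalar is encrypted separately (2048-bit key ~ 112 bits of security) ---
pub, priv = paillier.generate_paillier_keypair(n_length=2048)
enc_clients = [[pub.encrypt(float(x)) for x in g] for g in gradients]
enc_sum = [a + b for a, b in zip(*enc_clients)]           # server adds ciphertexts element-wise
print([priv.decrypt(c) for c in enc_sum])                 # approx [0.20, -0.04, 0.13]

# --- CKKS: whole vectors are packed into a single ciphertext ---
ctx = ts.context(ts.SCHEME_TYPE.CKKS,
                 poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])    # illustrative parameter choice
ctx.global_scale = 2 ** 40                                # scaling factor
ctx.generate_galois_keys()
enc_vecs = [ts.ckks_vector(ctx, g.tolist()) for g in gradients]
enc_total = enc_vecs[0] + enc_vecs[1]                     # one addition over the packed vector
print(enc_total.decrypt())                                # approx [0.20, -0.04, 0.13]
```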

The study also examined client-side privacy methods, specifically gradient compression, defined by a function C(G, ε) that sets gradient values whose absolute value falls below a threshold ε to zero, with its strength measured by a prune ratio P. Gradient noising, achieved by adding random noise to the gradients, was also explored as a means of obscuring individual client contributions. The prune ratio P(C(G, ε)) is calculated as the number of gradient elements with absolute values less than ε, divided by the total number of elements n.
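
A minimal NumPy sketch of these two client-side methods follows: threshold-based compression C(G, ε) with its prune ratio P, and additive noising (Gaussian noise is used here as an illustrative choice).

```python
import numpy as np

def compress(gradient, eps):
    """C(G, eps): zero out gradient entries with absolute value below eps."""
    g = np.asarray(gradient, dtype=float)
    return np.where(np.abs(g) < eps, 0.0, g)

def prune_ratio(gradient, eps):
    """P(C(G, eps)): fraction of the n entries that fall below the threshold."""
    g = np.asarray(gradient, dtype=float)
    return np.count_nonzero(np.abs(g) < eps) / g.size

def add_noise(gradient, sigma=0.01):
    """Gradient noising: add zero-mean noise to obscure the true update
    (Gaussian here as an illustrative choice)."""
    g = np.asarray(gradient, dtype=float)
    return g + np.random.normal(0.0, sigma, size=g.shape)

g = np.array([0.40, -0.003, 0.002, -0.25, 0.001])
print(compress(g, eps=0.01))      # [ 0.4   0.    0.   -0.25  0.  ]
print(prune_ratio(g, eps=0.01))   # 0.6
print(add_noise(g, sigma=0.01))
```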

Privacy-Accuracy Tradeoffs in Federated Learning Systems

This work details an investigation into privacy within federated learning systems, specifically focusing on neural networks trained across multiple clients and a central server. Researchers analysed methods to protect client data from reconstruction attacks, considering threats from both the server and other participating clients. Through systematic experimentation using convolutional and segmentation networks, the study quantified the trade-offs between data privacy and model accuracy when employing techniques like homomorphic encryption, gradient compression, and gradient noising. The findings demonstrate that while Paillier encryption offers strong security, its computational cost renders it impractical for large models.

Conversely, the CKKS scheme provided comparable security with significantly faster processing times. Gradient noising proved a more effective privacy measure than gradient compression, achieving comparable protection with less impact on model accuracy. Experiments with segmentation networks indicated that reconstructing data from more complex models and realistic datasets is considerably more challenging, suggesting inherent limitations in current reconstruction techniques. The authors acknowledge that formal privacy guarantees were not established, instead evaluating the effectiveness of privacy-preserving methods against the DLG reconstruction algorithm. Future research should concentrate on developing practical metrics that better capture information leakage, moving beyond traditional numerical error measures. Furthermore, exploring lightweight secure architectures, such as split learning and fully encrypted models, alongside meaningful privacy metrics, represents a key direction for creating deployable and verifiable privacy-preserving federated learning systems for resource-constrained edge devices.

👉 More information
🗞 Decentralized Privacy-Preserving Federal Learning of Computer Vision Models on Edge Devices
🧠 ArXiv: https://arxiv.org/abs/2601.04912

Rohail T.

As a quantum scientist, I explore the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.
