Data privacy and model explainability represent critical challenges in contemporary machine learning. Júlio Oliveira, Rodrigo Ferreira (Institute of Exact and Natural Sciences, Federal University of Pará, Brazil), and André Riker, alongside Glaucio H. S. Carvalho and Eirini Eleni Tsilopoulou, investigate the interplay between these two vital aspects within federated learning systems. Their research addresses a significant gap by exploring how the addition of differential privacy, a technique for enhancing data protection, impacts the interpretability of machine learning models. The authors propose Federated EXplainable Trees with Differential Privacy (FEXT-DP), a novel approach utilising decision trees to balance privacy preservation with the need for transparent and understandable predictions, demonstrating improvements in training speed, accuracy, and explainability despite the inherent trade-offs of differential privacy.
Modern machine learning systems require both compliance with data privacy regulations and the ability to produce understandable outputs, a concept known as eXplainable Artificial Intelligence (XAI).
This work addresses both concerns by leveraging the inherent interpretability of decision trees within a federated learning framework. It also investigates the specific impact of differential privacy (DP) protection on the explainability of the resulting model.
Performance assessments demonstrate that FEXT-DP achieves improvements in training speed, measured by the number of communication rounds, and exhibits enhanced performance in terms of Mean Squared Error and overall explainability. This suggests a potential pathway towards more efficient and transparent machine learning deployments.
This innovative approach contrasts with existing methods that often prioritise either privacy or explainability, frequently at the expense of the other. Prior studies integrating tree-based models into federated learning, such as FedTree, have faced performance overhead due to techniques like homomorphic encryption.
Other solutions, like those employing blockchain, may lack mechanisms for aggregating trees on a central server. FEXT-DP distinguishes itself by focusing on a bagging-based decision tree approach, offering a granular and efficient use of the privacy budget by linking it directly to the depth of leaf nodes.
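The summary does not spell out the exact allocation rule, but a common scheme in the differentially private decision-tree literature, and one consistent with tying the budget to leaf depth, is to divide the total budget ε evenly across tree levels so that deeper splits each consume a smaller share. A minimal, hypothetical sketch (not the authors' formula):

```python
# Hypothetical sketch of depth-based privacy budget allocation; the exact rule
# used by FEXT-DP is not given in this summary, so this shows the common
# "uniform share per level" scheme from the DP decision-tree literature.
def per_level_budget(total_epsilon: float, max_depth: int) -> float:
    """Each tree level (the splits at that depth, plus the leaf estimates at the
    bottom) receives an equal share of the total budget, so deeper trees spend
    less budget per query and must add proportionally more noise."""
    return total_epsilon / (max_depth + 1)  # +1 reserves a share for the leaves

# Example: a total epsilon of 1.0 spread over a depth-4 tree gives 0.2 per level.
print(per_level_budget(1.0, 4))
```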
The research details how FEXT-DP addresses limitations found in previous work, including the trade-off between privacy and explainability in systems like SecureGBM, whose reliance on anonymised features reduces model interpretability. The methodology centres on training decision trees locally on individual clients within a federated learning framework, with these models subsequently transmitted to a central server for aggregation.
This approach prioritises both data privacy and model explainability, leveraging the inherent interpretability of decision trees over more complex neural network architectures. Following local training, the server implements a selective aggregation process, retaining only trees that surpass a predefined accuracy threshold, denoted as ‘K’.
Trees failing to meet this minimum performance level are discarded, ensuring that the aggregated model benefits from consistently accurate components. This selective approach distinguishes FEXT-DP from methods that incorporate all locally trained models, potentially improving overall model performance and stability.
The refined set of trees is then redistributed to the clients, initiating a new training round. The training process continues iteratively until a predetermined stopping criterion is met, facilitating convergence towards an optimal global model. Clients evaluate incoming global models against their most recent local models using Mean Squared Error (MSE), adopting the set of trees exhibiting lower MSE.
This client-side comparison ensures that local models are continuously refined using the collective knowledge of the federation while preventing performance degradation. Differential privacy is integrated to further protect data confidentiality, adding noise during tree construction to obscure individual contributions. In summary, each training round proceeds in three stages: clients train tree-based models on their local datasets and transmit them to the server; the server aggregates the received trees, retaining only those that exceed the minimum accuracy threshold ‘K’; and the server redistributes the selected trees to the clients. Beyond quality control, the ‘K’ threshold also filters out potentially malicious contributions from clients attempting to compromise the federated model. The process repeats until the stopping criteria are met; a simplified sketch of one such round is given below.
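The following Python sketch illustrates this round structure. It is an assumption-laden reconstruction rather than the authors' implementation: it assumes scikit-learn regression trees, treats the server's accuracy check as an R² score against a small validation set held by the server (not stated in the summary), and uses a hypothetical threshold `K`; the differentially private tree construction itself is omitted.

```python
# Illustrative sketch of one FEXT-DP-style training round; tree training,
# the server-side metric, and the threshold K are placeholder assumptions.
from typing import List
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

K = 0.75  # hypothetical minimum accuracy (R^2) threshold applied by the server

def server_aggregate(trees: List[DecisionTreeRegressor],
                     X_val: np.ndarray, y_val: np.ndarray) -> List[DecisionTreeRegressor]:
    """Keep only trees whose validation score exceeds K; discarding the rest
    also filters out low-quality or malicious contributions."""
    return [t for t in trees if t.score(X_val, y_val) >= K]

def client_update(local_trees: List[DecisionTreeRegressor],
                  global_trees: List[DecisionTreeRegressor],
                  X: np.ndarray, y: np.ndarray) -> List[DecisionTreeRegressor]:
    """Adopt whichever set of trees (local or global) yields the lower MSE,
    using the mean prediction of the ensemble (bagging-style aggregation)."""
    if not global_trees:
        return local_trees

    def ensemble_mse(trees: List[DecisionTreeRegressor]) -> float:
        preds = np.mean([t.predict(X) for t in trees], axis=0)
        return mean_squared_error(y, preds)

    return global_trees if ensemble_mse(global_trees) < ensemble_mse(local_trees) else local_trees
```

The client-side rule mirrors the comparison described above: the redistributed global ensemble is adopted only when it lowers the client's local MSE, otherwise the previous local model is kept.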
Algorithm 1 highlights the implementation of differential privacy within the decision tree training process, specifically focusing on a differentially private method for identifying the optimal split during tree construction. The research demonstrates that the proposed method allows for the elimination of low-performance trees, potentially originating from malicious clients, while maintaining model integrity.
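Algorithm 1 itself is not reproduced in this summary. A standard way to make split selection differentially private, offered here only as a plausible illustration, is the exponential mechanism: candidate splits are scored, and one is sampled with probability proportional to exp(ε·score / 2Δ), where Δ bounds how much a single record can change a score. The scoring function, sensitivity, and budget below are assumptions, not the paper's Algorithm 1.

```python
# Hedged sketch of differentially private split selection via the exponential
# mechanism; the quality scores, sensitivity bound, and budget are illustrative.
import numpy as np

def dp_choose_split(scores: np.ndarray, epsilon: float, sensitivity: float,
                    rng: np.random.Generator) -> int:
    """Sample a candidate split index with probability proportional to
    exp(epsilon * score / (2 * sensitivity)), instead of deterministically
    taking the argmax, so no single record can dominate the choice."""
    logits = epsilon * scores / (2.0 * sensitivity)
    logits -= logits.max()            # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(scores), p=probs))

# Example: three candidate splits scored by a bounded variance-reduction measure.
rng = np.random.default_rng(0)
candidate_scores = np.array([0.40, 0.35, 0.10])
chosen = dp_choose_split(candidate_scores, epsilon=0.2, sensitivity=1.0, rng=rng)
print(f"selected split index: {chosen}")
```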
The system’s design prioritises a bagging-based decision tree approach, differentiating it from methods built on gradient-boosted trees or on alternative privacy mechanisms such as homomorphic encryption. Decision trees are favoured for their inherent interpretability and lower computational demands compared with neural network-based approaches, with differential privacy layered on top to further protect data confidentiality.
In this way, the method tackles the challenge of maintaining explainability while strengthening privacy, and the performance assessments bear this out: FEXT-DP converges in fewer communication rounds and improves both mean squared error and explainability.
The study also investigates the impact of differential privacy on the explainability of tree-based federated learning models, revealing a trade-off between privacy protection and model interpretability. While acknowledging that adding differential privacy can diminish explainability, the research highlights the benefits of FEXT-DP in balancing these competing objectives.
The authors note that a key limitation is the inherent compromise between privacy and explainability when employing differential privacy techniques. Future research could explore methods to mitigate the reduction in explainability caused by differential privacy, potentially through novel tree construction or privacy budget allocation strategies. This work establishes a foundation for developing machine learning systems that prioritise both data security and transparency, offering a pathway towards more trustworthy and understandable artificial intelligence.
👉 More information
🗞 Towards Explainable Federated Learning: Understanding the Impact of Differential Privacy
🧠 ArXiv: https://arxiv.org/abs/2602.10100
