Data Privacy in Federated Learning: The Trade-Offs in Balancing Operational Performance with Confidentiality
Abstract
The increasing adoption of Artificial Intelligence (AI) in sensitive domains such as healthcare, finance, and mobile services has intensified the need for robust data privacy mechanisms and strict adherence to regulatory frameworks such as GDPR and HIPAA. Traditional centralized learning frameworks require aggregating all user data on a central server for model training, creating a single point of vulnerability that exposes organizations to cyberattacks, unauthorized access, data misuse, and compliance violations. The problem addressed in this study is that although Federated Learning (FL), a decentralized model-training paradigm, reduces direct data exposure by retaining raw data at the source, recent research shows that FL remains susceptible to indirect privacy leaks. Model inversion, membership inference, and gradient leakage attacks have demonstrated that sensitive user information can still be inferred from model updates, especially in heterogeneous, non-Independent and Identically Distributed (non-IID) environments or when malicious clients or servers participate in training.
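The decentralized paradigm described above can be sketched in a few lines: clients run local gradient steps on their own data and share only the resulting model weights, which the server averages (FedAvg-style). The one-parameter model, client data, and learning rate below are illustrative assumptions, not the study's experimental setup.

```python
def local_update(w, data, lr=0.1):
    """One gradient step on a client's private data for the model y ~ w * x."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad          # only this weight leaves the device

def fedavg_round(global_w, client_datasets):
    """Server averages client weights; raw (x, y) pairs never leave clients."""
    updates = [local_update(global_w, d) for d in client_datasets]
    return sum(updates) / len(updates)

clients = [[(1.0, 2.0), (2.0, 4.0)],   # client 1: y = 2x
           [(1.0, 2.1), (3.0, 6.3)]]   # client 2: y = 2.1x
w = 0.0
for _ in range(50):
    w = fedavg_round(w, clients)
# w converges near the shared slope (about 2)
```

The attacks discussed above arise because the shared weight (or gradient) still encodes information about the private `(x, y)` pairs, which is what the privacy mechanisms evaluated below try to obscure.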
The objective of this study is to investigate how confidentiality-preserving mechanisms in federated learning influence operational performance. Specifically, the study seeks to identify the trade-offs between privacy strength, model accuracy, computational efficiency, and communication overhead, and to determine strategies that enable organizations to balance security requirements with system utility and scalability. The methodology employed an experimental evaluation of three widely used privacy-preserving techniques: Differential Privacy (DP), Secure Aggregation (SA), and Homomorphic Encryption (HE). Using multiple benchmark datasets under both Independent and Identically Distributed (IID) and non-IID configurations, the study measured model accuracy, execution and inference time, communication cost, and resilience against privacy attacks. Comparisons were made between baseline FL and privacy-enhanced FL configurations to quantify the impact of each mechanism. Results show that FL, without additional protection, still leaks information through gradients and model updates. Applying Differential Privacy significantly reduces inference-attack success rates but introduces noise that lowers accuracy by up to 18% in non-IID settings. Secure Aggregation prevents server-side inspection of individual updates but increases communication overhead by 30–60%. Homomorphic Encryption provides the highest confidentiality guarantee but incurs substantial latency, making it impractical for large deep-learning models or edge devices because of its computational cost. The study recommends adopting hybrid and adaptive approaches that combine multiple privacy techniques, such as DP with Secure Aggregation, while dynamically adjusting privacy budgets to context and model sensitivity.
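A minimal, self-contained sketch of the hybrid DP-plus-Secure-Aggregation combination recommended above: each client clips and noises its update (client-level differential privacy), then applies pairwise masks that cancel only in the server's sum (secure aggregation). The parameter values (`clip_norm`, `sigma`) and the simplified shared-seed mask scheme are assumptions for illustration, not the study's protocol, which would also need key agreement and dropout handling.

```python
import random

def dp_privatize(update, clip_norm=1.0, sigma=0.1, rng=None):
    """Clip the update's L2 norm, then add Gaussian noise (client-level DP)."""
    rng = rng or random.Random(0)
    norm = sum(u * u for u in update) ** 0.5
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale + rng.gauss(0, sigma * clip_norm) for u in update]

def pairwise_masks(n_clients, dim, seed=7):
    """Masks agreed by each client pair; they cancel in the global sum."""
    rng = random.Random(seed)
    return {(i, j): [rng.uniform(-1, 1) for _ in range(dim)]
            for i in range(n_clients) for j in range(i + 1, n_clients)}

def secure_aggregate(updates, seed=7):
    n, dim = len(updates), len(updates[0])
    m = pairwise_masks(n, dim, seed)
    masked = []
    for i, u in enumerate(updates):
        v = list(u)
        for j in range(n):
            if i < j:                        # this client adds the shared mask...
                v = [a + b for a, b in zip(v, m[(i, j)])]
            elif j < i:                      # ...its partner subtracts it
                v = [a - b for a, b in zip(v, m[(j, i)])]
        masked.append(v)                     # server sees only masked vectors
    total = [sum(col) for col in zip(*masked)]   # pairwise masks cancel here
    return [t / n for t in total]

rng = random.Random(42)
client_updates = [[3.0, 4.0], [0.5, -0.2], [-1.0, 2.0]]
private = [dp_privatize(u, rng=rng) for u in client_updates]
avg = secure_aggregate(private)              # DP-noised, securely averaged update
```

The trade-offs reported above are visible even in this toy: larger `sigma` degrades the averaged update (accuracy loss), while the masking step roughly doubles what each client transmits per round (communication overhead).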
Organizations should avoid a one-size-fits-all privacy configuration and instead tune privacy parameters based on workload, regulatory requirements, and device capability. This approach ensures that federated learning deployments maintain confidentiality, scalability, and operational efficiency in real-world environments.
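As a concrete illustration of tuning privacy parameters to workload, regulation, and device capability, a deployment might select its privacy stack per data class. The sensitivity tiers, epsilon values, and mechanism policy below are hypothetical examples, not prescriptions from the study.

```python
# Hypothetical policy table mapping data sensitivity to a DP budget (epsilon).
# Lower epsilon = stronger privacy but more noise; values are illustrative only.
BUDGETS = {
    "health_records": 1.0,   # e.g. HIPAA-covered data: strictest budget
    "financial": 2.0,        # e.g. GDPR-sensitive records
    "app_telemetry": 8.0,    # low-sensitivity, accuracy-critical workloads
}

def choose_privacy_stack(data_class, edge_device):
    """Pick a DP budget and mechanisms: DP plus secure aggregation everywhere,
    homomorphic encryption only where hardware can absorb its latency."""
    epsilon = BUDGETS.get(data_class, 1.0)        # default to the strictest tier
    stack = ["differential_privacy", "secure_aggregation"]
    if not edge_device:
        stack.append("homomorphic_encryption")    # too costly on edge hardware
    return epsilon, stack
```

For example, `choose_privacy_stack("app_telemetry", edge_device=True)` yields a loose budget with lightweight mechanisms, whereas an unknown data class falls back to the strictest tier by default.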