Publications
2025
- [Conference] The Many Facets of Variance Reduction in Federated Learning. Angelo Rodio. SIGMETRICS Performance Evaluation Review, Mar 2025
Federated Learning (FL) enables clients (mobile or IoT devices) to train a shared machine learning model coordinated by a central server while keeping their data local, addressing communication and privacy concerns. In the FedAvg algorithm [2], clients perform multiple local stochastic gradient descent (SGD) steps on their datasets and send their model updates to the server. The server then aggregates these client updates to produce the new global model and sends this back to the clients for the subsequent iteration.
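As a rough illustration of this interaction, the NumPy sketch below runs a few FedAvg-style rounds on a toy least-squares problem: each client performs local SGD steps on its own shard and the server averages the resulting models. The data, learning rate, batch size, and number of local steps are illustrative placeholders, not choices from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10  # model dimension (illustrative)

def local_sgd(w, X, y, lr=0.05, local_steps=5, batch_size=16):
    """A few local SGD steps on a client's least-squares objective."""
    w = w.copy()
    for _ in range(local_steps):
        idx = rng.choice(len(X), size=min(batch_size, len(X)), replace=False)
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
        w -= lr * grad
    return w

# Toy federation: each client holds its own data shard (X_k, y_k).
clients = [(rng.normal(size=(100, d)), rng.normal(size=100)) for _ in range(8)]
w_global = np.zeros(d)

for _ in range(20):  # communication rounds
    local_models = [local_sgd(w_global, X, y) for X, y in clients]
    w_global = np.mean(local_models, axis=0)  # server averages the client models
```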
- [Preprint] Optimizing Privacy-Utility Trade-off in Decentralized Learning with Generalized Correlated Noise. Angelo Rodio, Zheng Chen, and Erik G. Larsson. Jul 2025
Decentralized learning enables distributed agents to collaboratively train a shared machine learning model without a central server, through local computation and peer-to-peer communication. Although each agent retains its dataset locally, sharing local models can still expose private information about the local training datasets to adversaries. To mitigate privacy attacks, a common strategy is to inject random artificial noise at each agent before exchanging local models between neighbors. However, this often leads to utility degradation due to the negative effects of cumulated artificial noise on the learning algorithm. In this work, we introduce CorN-DSGD, a novel covariance-based framework for generating correlated privacy noise across agents, which unifies several state-of-the-art methods as special cases. By leveraging network topology and mixing weights, CorN-DSGD optimizes the noise covariance to achieve network-wide noise cancellation. Experimental results show that CorN-DSGD cancels more noise than existing pairwise correlation schemes, improving model performance under formal privacy guarantees.
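The intuition behind correlated noise cancellation can be shown with a deliberately simplified sketch: i.i.d. Gaussian noise is centered across agents so that it sums to zero network-wide. This idealized covariance ignores the topology and mixing-weight constraints that CorN-DSGD actually optimizes over, so it is a toy illustration of the cancellation effect, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n_agents, d, sigma = 5, 3, 1.0  # illustrative sizes

# Independent noise: each agent perturbs its model with its own Gaussian sample.
independent = rng.normal(scale=sigma, size=(n_agents, d))

# Idealized correlated noise: center the same samples so they sum to zero
# across agents (covariance proportional to I - (1/n) * ones); this assumes
# full coordination, unlike the topology-constrained setting of the paper.
correlated = independent - independent.mean(axis=0, keepdims=True)

# After a global averaging step, the correlated noise cancels exactly,
# while the independent noise only shrinks as 1/sqrt(n_agents).
print(np.linalg.norm(independent.mean(axis=0)))  # residual noise, > 0
print(np.linalg.norm(correlated.mean(axis=0)))   # ~ 0, up to numerical precision
```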
2024
- [Journal] Federated Learning Under Heterogeneous and Correlated Client Availability. Angelo Rodio, Francescomaria Faticanti, Othmane Marfoq, Giovanni Neglia, and Emilio Leonardi. IEEE/ACM Transactions on Networking, Apr 2024
In Federated Learning (FL), devices, also referred to as clients, can exhibit heterogeneous availability patterns, often correlated over time and with other clients. This paper addresses the problem of heterogeneous and correlated client availability in FL. Our theoretical analysis is the first to demonstrate the negative impact of correlation on FL algorithms’ convergence rate and highlights a trade-off between optimization error (related to convergence speed) and bias error (indicative of model quality). To optimize this trade-off, we propose Correlation-Aware FL (CA-Fed), a novel algorithm that dynamically balances the competing objectives of fast convergence and minimal model bias. CA-Fed achieves this by dynamically adjusting the aggregation weight assigned to each client and selectively excluding clients with high temporal correlation and low availability. Experimental evaluations on diverse datasets demonstrate the effectiveness of CA-Fed compared to state-of-the-art methods. Specifically, CA-Fed achieves the best trade-off between training time and test accuracy. By dynamically handling clients with high temporal correlation and low availability, CA-Fed emerges as a promising solution to mitigate the detrimental impact of correlated client availability in FL.
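The sketch below illustrates the kind of availability-aware aggregation described above: clients that are both rarely available and strongly correlated over time are excluded, and the remaining updates are reweighted. The exclusion thresholds and weights here are placeholder heuristics for exposition; CA-Fed derives its actual weights and exclusion rule from the bias-variance analysis in the paper.

```python
import numpy as np

def availability_aware_aggregate(updates, availability, correlation,
                                 corr_threshold=0.8, avail_threshold=0.1):
    """Aggregate client updates, excluding clients with low availability and
    high temporal correlation (thresholds are illustrative assumptions).

    updates:      dict client_id -> model update (np.ndarray)
    availability: dict client_id -> estimated participation probability
    correlation:  dict client_id -> estimated temporal correlation of participation
    """
    weights = {}
    for cid in updates:
        if correlation[cid] > corr_threshold and availability[cid] < avail_threshold:
            continue  # skip clients that are both rarely available and highly correlated
        weights[cid] = availability[cid]  # placeholder: weight by participation frequency
    total = sum(weights.values())
    return sum(w / total * updates[cid] for cid, w in weights.items())

rng = np.random.default_rng(2)
updates = {c: rng.normal(size=4) for c in range(5)}
availability = {0: 0.9, 1: 0.5, 2: 0.05, 3: 0.3, 4: 0.02}
correlation = {0: 0.1, 1: 0.2, 2: 0.9, 3: 0.4, 4: 0.95}
print(availability_aware_aggregate(updates, availability, correlation))
```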
- [Conference] FedStale: Leveraging Stale Updates in Federated Learning. Angelo Rodio and Giovanni Neglia. In European Conference on Artificial Intelligence (ECAI), Apr 2024
Federated learning algorithms, such as FedAvg, are negatively affected by data heterogeneity and partial client participation. To mitigate the latter problem, global variance reduction methods, like FedVARP, leverage stale model updates for non-participating clients. These methods are effective under homogeneous client participation. Yet, this paper shows that, when some clients participate much less than others, aggregating updates with different levels of staleness can detrimentally affect the training process. Motivated by this observation, we introduce FedStale, a novel algorithm that updates the global model in each round through a convex combination of "fresh" updates from participating clients and "stale" updates from non-participating ones. By adjusting the weight in the convex combination, FedStale interpolates between FedAvg, which only uses fresh updates, and FedVARP, which treats fresh and stale updates equally. Our analysis of FedStale convergence yields the following novel findings: i) it integrates and extends previous FedAvg and FedVARP analyses to heterogeneous client participation; ii) it underscores how the least participating client influences convergence error; iii) it provides practical guidelines to best exploit stale updates, showing that their usefulness diminishes as data heterogeneity decreases and participation heterogeneity increases. Extensive experiments featuring diverse levels of client data and participation heterogeneity not only confirm these findings but also show that FedStale outperforms both FedAvg and FedVARP in many settings.
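A minimal sketch of the interpolation described above, assuming the server stores the last update received from every client. The stale-update weight beta acts as the convex-combination knob: beta = 0 ignores stale memory as in FedAvg, while beta = 1 treats fresh and stale updates alike in the spirit of FedVARP. The normalization and memory handling are simplified relative to the paper's exact update rule.

```python
import numpy as np

def fedstale_style_round(w_global, fresh_updates, stale_memory, all_clients,
                         beta, server_lr=1.0):
    """One server round mixing fresh and stale client updates (simplified sketch).

    fresh_updates: dict client_id -> update computed this round (participating clients)
    stale_memory:  dict client_id -> last update ever received from that client
    beta:          weight on stale updates from non-participating clients, in [0, 1]
    """
    aggregate = np.zeros_like(w_global)
    for cid in all_clients:
        if cid in fresh_updates:
            aggregate += fresh_updates[cid]          # fresh update, full weight
            stale_memory[cid] = fresh_updates[cid]   # refresh the server-side memory
        elif cid in stale_memory:
            aggregate += beta * stale_memory[cid]    # reuse a stale update, damped by beta
    return w_global + server_lr * aggregate / len(all_clients)
```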
- [Preprint] Federated Learning for Collaborative Inference Systems: The Case of Early Exit Networks. Caelin Kaplan, Angelo Rodio, Tareq Si Salem, Chuan Xu, and Giovanni Neglia. Aug 2024
As Internet of Things (IoT) technology advances, end devices like sensors and smartphones are progressively equipped with AI models tailored to their local memory and computational constraints. Local inference reduces communication costs and latency; however, these smaller models typically underperform compared to more sophisticated models deployed on edge servers or in the cloud. Cooperative Inference Systems (CISs) address this performance trade-off by enabling smaller devices to offload part of their inference tasks to more capable devices. These systems often deploy hierarchical models that share numerous parameters, exemplified by Deep Neural Networks (DNNs) that utilize strategies like early exits or ordered dropout. In such instances, Federated Learning (FL) may be employed to jointly train the models within a CIS. Yet, traditional training methods have overlooked the operational dynamics of CISs during inference, particularly the potential high heterogeneity in serving rates across clients. To address this gap, we propose a novel FL approach designed explicitly for use in CISs that accounts for these variations in serving rates. Our framework not only offers rigorous theoretical guarantees, but also surpasses state-of-the-art (SOTA) training algorithms for CISs, especially in scenarios where inference request rates or data availability are uneven among clients.
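To make the early-exit mechanism concrete, the toy sketch below shows how a constrained device could serve a prediction from an early exit and offload the deeper layers when its confidence is low. The architecture, confidence threshold, and offloading rule are illustrative assumptions, not the training scheme proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy two-block network sharing parameters, with an early exit after the first block.
W1, W_exit1 = rng.normal(size=(8, 16)), rng.normal(size=(16, 3))
W2, W_exit2 = rng.normal(size=(16, 16)), rng.normal(size=(16, 3))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def device_inference(x, confidence_threshold=0.7):
    """Run the early-exit model on the device; offload deeper layers if unsure."""
    h1 = np.maximum(x @ W1, 0)        # first block, assumed to fit on the device
    p1 = softmax(h1 @ W_exit1)        # early-exit classifier
    if p1.max() >= confidence_threshold:
        return p1, "served locally"
    h2 = np.maximum(h1 @ W2, 0)       # deeper block, assumed to run on edge/cloud
    return softmax(h2 @ W_exit2), "offloaded"

print(device_inference(rng.normal(size=8)))
```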
- [PhD Thesis] Client Heterogeneity in Federated Learning Systems. Angelo Rodio. Université Côte d’Azur, Jul 2024
Federated Learning (FL) is a collaborative framework in which clients (mobile devices) train a machine learning model under a central server’s orchestration while keeping their data decentralized. Client participation heterogeneity stems from varied device capabilities in hardware specifications (CPU power, memory capacity), network connectivity types (3G, 4G, 5G, WiFi), and power availability (battery levels), and it is generally beyond server control. This thesis provides a theoretical understanding of federated learning systems under heterogeneous client participation, analyzing the impact of this heterogeneity on the convergence of federated learning algorithms and proposing practical solutions for more efficient system and resource usage.

The first part of the thesis tackles the challenges associated with temporal and spatial correlation in client participation, the former due to correlated client participation dynamics over time, the latter due to the clients’ correlated geographic distributions. We first observe that heterogeneous client participation can bias the learning process. We formalize the bias-variance trade-off induced by heterogeneous client participation by decomposing the optimization error into variance (related to convergence speed) and bias (indicative of model quality). By minimizing these two errors, we demonstrate that assigning larger aggregation weights to frequently participating clients can accelerate convergence. Moreover, we study the impact of temporal and spatial correlation in client participation through a finite-state Markov chain model. We show that correlation slows down convergence, to within a logarithmic factor related to the Markov chain’s geometric mixing time. By minimizing the bias-variance trade-off, we also find that assigning lower aggregation weights to highly correlated clients accelerates convergence. We finally propose an algorithm, Correlation-Aware Federated Learning (CA-Fed), that optimizes the bias-variance trade-off and thus achieves faster convergence.

The second part of the thesis considers the more applied scenario of lossy communication channels. Network conditions, particularly packet losses, represent a major, uncontrollable source of heterogeneity in client participation. Challenging conventional mitigation strategies for packet losses, such as retransmission or error correction, we show that federated learning algorithms can still learn over asymmetric, lossy channels. Our proposed solution modifies traditional federated learning approaches by transmitting model updates in place of models and correcting the averaging step to account for the heterogeneity of the communication channels. Experimental results confirm that, under lossy channels, our algorithm matches the performance achieved in ideal, lossless conditions within a limited number of communication rounds.

The third part investigates variance reduction methods, specifically stale updates, to compensate for heterogeneity in client participation. Recent research has considered similar strategies to mitigate the effects of partial client participation in federated learning. These methods retain the last computed, potentially stale, update for each client and use it to replace the unavailable current updates of non-participating clients. However, existing analyses rely on the assumption of uniform client participation, which is restrictive in real-world scenarios. By broadening the analysis to heterogeneous client participation, we find that convergence is significantly influenced by the least participating clients. This suggests that existing algorithms are not optimally designed for such environments, and we propose a more robust approach, FedStale, to exploit stale model updates under heterogeneous client participation.
2023
- [Conference] Federated Learning under Heterogeneous and Correlated Client Availability. Angelo Rodio, Francescomaria Faticanti, Othmane Marfoq, Giovanni Neglia, and Emilio Leonardi. In IEEE INFOCOM 2023 - IEEE Conference on Computer Communications, May 2023
The enormous amount of data produced by mobile and IoT devices has motivated the development of federated learning (FL), a framework allowing such devices (or clients) to collaboratively train machine learning models without sharing their local data. FL algorithms (like FedAvg) iteratively aggregate model updates computed by clients on their own datasets. Clients may exhibit different levels of participation, often correlated over time and with other clients. This paper presents the first convergence analysis for a FedAvg-like FL algorithm under heterogeneous and correlated client availability. Our analysis highlights how correlation adversely affects the algorithm’s convergence rate and how the aggregation strategy can alleviate this effect at the cost of steering training toward a biased model. Guided by the theoretical analysis, we propose CA-Fed, a new FL algorithm that tries to balance the conflicting goals of maximizing convergence speed and minimizing model bias. To this purpose, CA-Fed dynamically adapts the weight given to each client and may ignore clients with low availability and large correlation. Our experimental results show that CA-Fed achieves higher time-average accuracy and a lower standard deviation than state-of-the-art AdaFed and F3AST, both on synthetic and real datasets.
- [Conference] Federated Learning with Packet Losses. Angelo Rodio, Giovanni Neglia, Fabio Busacca, Stefano Mangione, Sergio Palazzo, Francesco Restuccia, and Ilenia Tinnirello. In 2023 26th International Symposium on Wireless Personal Multimedia Communications (WPMC), Nov 2023
This paper tackles the problem of training Federated Learning (FL) algorithms over real-world wireless networks with packet losses. Lossy communication channels between the orchestrating server and the clients affect the convergence of FL training as well as the quality of the learned model. Although many previous works investigated how to mitigate the adverse effects of packet losses, this paper demonstrates that FL algorithms over asymmetric lossy channels can still learn the optimal model, the same model that would have been trained in a lossless scenario by classic FL algorithms like FedAvg. Convergence to the optimum only requires slight changes to FedAvg: i) while FedAvg computes a new global model by averaging the received clients’ models, our algorithm, UPGA-PL, updates the global model by a pseudo-gradient step; ii) UPGA-PL accounts for the potentially heterogeneous packet losses experienced by the clients to unbias the pseudo-gradient step. Still, UPGA-PL maintains the same computational and communication complexity as FedAvg. In our experiments, UPGA-PL not only outperforms existing state-of-the-art solutions for lossy channels (by more than 5 percentage points on test accuracy) but also matches FedAvg’s performance in lossless scenarios after less than 150 communication rounds.
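One plausible reading of the two modifications above is sketched below: the server applies a pseudo-gradient step and reweights each received update by the inverse of that client's delivery probability, so the aggregate remains unbiased in expectation despite heterogeneous packet losses. The inverse-probability correction is an assumption made for illustration; the exact unbiasing used by UPGA-PL is specified in the paper.

```python
import numpy as np

def lossy_pseudo_gradient_round(w_global, received_updates, delivery_prob,
                                n_clients, server_lr=1.0):
    """Pseudo-gradient server step with an inverse-probability correction (sketch).

    received_updates: dict client_id -> model update that actually arrived this round
    delivery_prob:    dict client_id -> probability that this client's packet gets through
    """
    pseudo_grad = np.zeros_like(w_global)
    for cid, update in received_updates.items():
        # Reweight by 1 / delivery probability so the expected aggregate is unbiased
        # with respect to the random packet losses (illustrative assumption).
        pseudo_grad += update / delivery_prob[cid]
    pseudo_grad /= n_clients
    return w_global + server_lr * pseudo_grad
```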