Analysis of federated learning on non-independent and identically distributed sleep data

UDC.coleccionInvestigación
UDC.departamentoCiencias da Computación e Tecnoloxías da Información
UDC.institutoCentroCITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación
UDC.issue3
UDC.journalTitlePhysiological Measurement
UDC.volume47
dc.contributor.authorAnido Alonso, Adriana
dc.contributor.authorÁlvarez-Estévez, Diego
dc.date.accessioned2026-04-07T16:35:28Z
dc.date.available2026-04-07T16:35:28Z
dc.date.issued2026-03-09
dc.description.abstract[Abstract]: Objective. We investigate the application of federated learning (FL) across heterogeneous, non-independent and identically distributed (non-IID) sleep data. We evaluate three algorithms, namely federated stochastic gradient descent, federated averaging, and federated proximal (FedProx), in a realistic setting where non-IID characteristics arise from distinct sensor configurations, varying acquisition protocols, and diverse patient populations across independent sleep cohort datasets. Approach. We employ a dual-layered evaluation framework. First, we systematically analyze the impact of local training epochs and aggregation schemes (weighted and unweighted) on model convergence. Second, we introduce and adapt a generalized sub-sampling strategy designed to mitigate model drift caused by heterogeneous data distribution and volume imbalances across participating clients. To ensure robust external generalization, our evaluation utilizes six independent databases in a leave-one-database-out cross-validation scheme. Main results. Our analysis shows that increasing the number of local training epochs adversely affects performance across all evaluated federated schemes. This confirms that extended local training exacerbates client drift, hindering global convergence. Furthermore, weighted aggregation consistently underperforms unweighted approaches, suggesting that disproportionate client contributions bias the global data representation. While the inclusion of a proximal term partially mitigates this instability by constraining local updates, the proposed sub-sampling strategy proves most effective. This approach yields consistent generalization results across all algorithms and minimizes performance degradation, while significantly reducing computational overhead. Significance. This work addresses critical privacy concerns in centralized automated sleep staging by validating FL in realistic, multi-center scenarios. We provide evidence that decentralized strategies can achieve performance comparable to centralized methods, effectively overcoming data silos. Ultimately, this approach enables robust collaborative training while strictly maintaining data privacy, a fundamental requirement for widespread clinical implementation.
dc.description.sponsorshipThis study has been supported by project RYC2022-038121-I, funded by MCIN/AEI/10.13039/501100011033 and European Social Fund Plus (ESF+), project PID2023-147422OB-I00 funded by MCIU/AEI/10.13039/501100011033 and by the European FEDER program, and project ED431F 2025/35 funded by Xunta de Galicia. Authors wish to acknowledge the support received from Universidade da Coruña and Centro de Investigación de Galicia ‘CITIC’, center accredited for excellence within the Galician University System and a member of the CIGUS Network. CITIC receives subsidies from the Department of Education, Science, Universities, and Vocational Training of the Xunta de Galicia and it is co-financed by the EU through the FEDER Galicia 2021-27 operational program ED431G 2023/01. Furthermore, this research project was made possible through the access granted by the Galician Supercomputing Center (CESGA) to its supercomputing infrastructure. The supercomputer FinisTerrae III and its permanent data storage system have been funded by the NextGeneration EU 2021 Recovery, Transformation and Resilience Plan, ICT2021-006904, and also from the Pluriregional Operational Programme of Spain 2014-2020 of the European Regional Development Fund (ERDF), ICTS-2019-02-CESGA-3, and from the State Programme for the Promotion of Scientific and Technical Research of Excellence of the State Plan for Scientific and Technical Research and Innovation 2013-2016 State subprogramme for scientific and technical infrastructures and equipment of ERDF, CESG15-DE-3114.
dc.description.sponsorshipXunta de Galicia; ED431F 2025/35
dc.description.sponsorshipXunta de Galicia; ED431G 2023/01
dc.description.sponsorshipXunta de Galicia; ICTS-2019-02-CESGA-3
dc.description.sponsorshipXunta de Galicia; CESG15-DE-3114
dc.identifier.citationA. Anido-Alonso and D. Alvarez-Estevez, "Analysis of federated learning on non-independent and identically distributed sleep data", Physiol. Meas., vol. 47, no. 3, p. 035006, Mar. 2026, doi: 10.1088/1361-6579/ae4a82
dc.identifier.doi10.1088/1361-6579/ae4a82
dc.identifier.issn1361-6579
dc.identifier.urihttps://hdl.handle.net/2183/47887
dc.language.isoeng
dc.publisherIOP Science
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/RYC2022-038121-I/ES/BIOMEDICAL SIGNAL PROCESSING AND ARTIFICIAL INTELLIGENCE FOR AIDING CLINICAL DIAGNOSIS IN SLEEP MEDICINE
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica, Técnica y de Innovación 2021-2023/PID2023-147422OB-I00/ES/ALGORITMOS DE APRENDIZAJE AUTOMATICO DE NUEVA GENERACION PARA EL ANALISIS DE REGISTROS MEDICOS DEL SUEÑO
dc.relation.projectIDinfo:eu-repo/grantAgreement/MICINN/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/ICT2021-006904/ES/
dc.relation.urihttps://doi.org/10.1088/1361-6579/ae4a82
dc.rights© 2026 The Author(s). Published on behalf of Institute of Physics and Engineering in Medicine by IOP Publishing Ltd
dc.rightsAttribution 4.0 International
dc.rights.accessRightsopen access
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/
dc.subjectDeep-learning
dc.subjectData-privacy
dc.subjectFederated learning
dc.subjectNon-independent and identically distributed data
dc.subjectSleep staging
dc.titleAnalysis of federated learning on non-independent and identically distributed sleep data
dc.typejournal article
dc.type.hasVersionVoR
dspace.entity.typePublication
relation.isAuthorOfPublication2f33139f-83f9-4a21-9fb4-43f4322a8a87
relation.isAuthorOfPublication.latestForDiscovery2f33139f-83f9-4a21-9fb4-43f4322a8a87

Files

Original bundle

Name: AnidoAlonso_Adriana_2026_analysis_FL_non_ind.pdf
Size: 3.01 MB
Format: Adobe Portable Document Format