Use this link to cite:
http://hdl.handle.net/2183/31955 Desarrollo de una librería para el aprendizaje federado bajo una arquitectura peer-to-peer
Authors
Guijas Bravo, Pedro
Other responsibilities
Universidade da Coruña. Facultade de Informática
Abstract
[Resumen] Over the last decade, Machine Learning has advanced rapidly, and the most successful models have needed to be fed with large volumes of data. Obtaining and managing these data is often difficult, since they are generally scarce and subject to privacy measures. Federated Learning represents a paradigm shift in the training of Machine Learning models. This new approach allows the learning process to be carried out on data distributed among a large number of clients.
Although this novel technique brings numerous advantages, one of its major limitations is the need for a server that orchestrates the entire learning process, which constitutes a single point of failure. Likewise, a large infrastructure is needed to make these systems scalable. To address these drawbacks, new approaches known as Decentralized Federated Learning have emerged, and one of the most promising solutions is the use of peer-to-peer networks.
Given the absence of any library supporting Decentralized Federated Learning, this project proposes the development of a general-purpose library that enables Federated Learning over peer-to-peer networks using the Gossip protocol. The use of the Gossip protocol will guarantee fault tolerance in the peer-to-peer network, creating a decentralized, scalable and robust ecosystem.
The library aims to support all kinds of devices, with special emphasis on ease of use and future extension. In addition to deployment, it allows simulations to be run, making it possible to perform tests in controlled environments.
The agile SCRUM methodology was used for the development. The central iterations were devoted to implementing the system, while the first and last were devoted to project preparation and testing, respectively. The tests used the MNIST and FEMNIST datasets and obtained results very similar to those of equivalent runs with classical training.
[Abstract] Over the last decade, Machine Learning has advanced rapidly, and the most successful models need to be fed with large volumes of data. Obtaining and managing these data is often complicated, as they are generally scarce and subject to privacy measures. Federated Learning implies a paradigm shift in the training of Machine Learning models. This new approach allows the learning process to be carried out on data distributed among a large number of clients. Although this new technique has numerous advantages, one of its major limitations is the need for a server that orchestrates the entire learning process, which constitutes a single point of failure. Moreover, a large infrastructure is necessary to make these systems scalable. To address these disadvantages, new approaches called Decentralized Federated Learning have emerged, and one of the most promising solutions is the use of peer-to-peer networks. Given the lack of any library to support Decentralized Federated Learning, this project proposes the development of a general-purpose library that enables Federated Learning over peer-to-peer networks using the Gossip protocol. The use of the Gossip protocol will guarantee fault tolerance in the network, creating a decentralized, scalable and robust ecosystem. The library seeks to support all kinds of devices, with special emphasis on ease of use and future extension. In addition to deployment, it allows simulations to be run, making it possible to perform tests in controlled environments. The agile SCRUM methodology was used for the development. The central iterations were devoted to implementing the system, while the initial and final iterations were dedicated to project preparation and testing, respectively.
The tests used the MNIST and FEMNIST datasets and obtained results very similar to those of equivalent runs with classical training.
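The idea of Federated Learning over a peer-to-peer network with gossip-style exchanges can be illustrated with a minimal simulation. This is only a sketch under stated assumptions and not the library's API: the function names (`average`, `gossip_round`, `simulate`), the ring topology and the pairwise averaging rule are all hypothetical choices made for illustration, and plain lists of floats stand in for model parameters. It shows that, with no central server, repeated exchanges between neighbours can drive every node towards the same average model.

```python
import random


def average(vectors):
    """Element-wise mean of equally weighted parameter vectors."""
    count = len(vectors)
    return [sum(values) / count for values in zip(*vectors)]


def gossip_round(params, neighbors, rng):
    """Each node picks one random neighbour and both adopt their pairwise mean.

    Pairwise averaging keeps the sum over all nodes unchanged, so the
    network-wide mean is preserved while the nodes move towards it.
    """
    for node in list(params):
        peer = rng.choice(neighbors[node])
        mean = average([params[node], params[peer]])
        params[node] = list(mean)
        params[peer] = list(mean)


def simulate(num_nodes=8, dim=3, rounds=200, seed=0):
    """Run gossip averaging on a ring (hypothetical topology, no server)."""
    rng = random.Random(seed)
    # Stand-in for locally trained models: one random vector per node.
    params = {n: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
              for n in range(num_nodes)}
    # Each node only knows its two ring neighbours.
    neighbors = {n: [(n - 1) % num_nodes, (n + 1) % num_nodes]
                 for n in range(num_nodes)}
    # What a central server would compute with equal client weights.
    target = average(list(params.values()))
    for _ in range(rounds):
        gossip_round(params, neighbors, rng)
    return params, target


if __name__ == "__main__":
    final_params, target = simulate()
    worst = max(abs(a - b)
                for vec in final_params.values()
                for a, b in zip(vec, target))
    print("largest deviation from the global average:", worst)
```

In a real deployment the vectors would be model weights produced by local training between exchanges, and messages would travel over the network rather than through a shared dictionary. The sketch only captures the averaging step.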
Rights
Attribution-NonCommercial 3.0 Spain








