Use this link to cite:
https://hdl.handle.net/2183/46151
Arquitecturas eficientes para LLMs: estudo e implementación de Mixture of Experts en modelos Transformer
Authors
Gil Torres, Artur
Other responsibilities
Universidade da Coruña. Facultade de Informática
Abstract
[Resumo]: A IA xa está consolidada como a tecnoloxía de vangarda desta década, e o uso de ferramentas de xeración de texto baseadas en LLMs está á orde do día. Porén, o seu elevado consumo enerxético e consecuente impacto medioambiental resaltan a necesidade de facelas máis eficientes. Nos últimos anos, a mellora deste tipo de modelos logrouse principalmente a través do escalado mediante hardware cada vez máis potente e capaz, pero na actualidade esta tendencia está a cambiar. O uso de técnicas de optimización por software, e en especial as técnicas de computación condicional, preséntase como unha alternativa para elevar os resultados de rendemento de maneira sostible. Neste traballo afondarase nunha destas técnicas, a Mistura de Expertos, que permite distribuír a carga computacional activando de maneira selectiva aquelas partes do modelo que mellor se adecuen a cada entrada. Analizarase o seu funcionamento teórico, implementaranse diferentes versións e realizaranse experimentos para avaliar as súas vantaxes e limitacións.
[Abstract]: Artificial intelligence has already been established as the leading technology of this decade, and the use of text-generation tools based on LLMs is on the rise. However, their high energy consumption and the resulting environmental impact highlight the need to make them more efficient. In recent years, the improvement of these models has mainly been achieved through scaling with increasingly powerful hardware, but this trend is now changing. The use of software optimization techniques, especially conditional computation techniques, is emerging as a sustainable alternative to boost performance while reducing maintenance costs. This work will delve into one of these techniques, Mixture of Experts, which distributes the computational load by sparsely activating only those parts of the model that best fit each input. Its theoretical foundations will be analyzed, different implementations will be developed, and experiments will be conducted to evaluate its advantages and limitations.
Rights
The copyright holders authorize the viewing of the content of this work over the Internet, as well as its reproduction, recording on computer media, or printing for private use or for research purposes. Under no circumstances is for-profit use of this document permitted. These rights apply to both the abstract of the work and its content.







