
dc.contributor.author: Pereira-Ruisánchez, Dariel
dc.contributor.author: Fresnedo, Óscar
dc.contributor.author: Pérez-Adán, Darian
dc.contributor.author: Castedo, Luis
dc.date.accessioned: 2023-12-19T18:25:09Z
dc.date.available: 2023-12-19T18:25:09Z
dc.date.issued: 2023-07
dc.identifier.citation: D. Pereira-Ruisánchez, Ó. Fresnedo, D. Pérez-Adán and L. Castedo, "Deep Contextual Bandit and Reinforcement Learning for IRS-Assisted MU-MIMO Systems," in IEEE Transactions on Vehicular Technology, vol. 72, no. 7, pp. 9099-9114, July 2023, doi: 10.1109/TVT.2023.3249353. [es_ES]
dc.identifier.issn: 0018-9545
dc.identifier.uri: http://hdl.handle.net/2183/34562
dc.description: © 2023 IEEE. This version of the article has been accepted for publication, after peer review. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The Version of Record is available online at: https://doi.org/10.1109/TVT.2023.3249353. [es_ES]
dc.description.abstract: [Abstract]: The combination of multiple-input multiple-output (MIMO) systems and intelligent reflecting surfaces (IRSs) is foreseen as a critical enabler of beyond 5G (B5G) and 6G. In this work, two different approaches are considered for the joint optimization of the IRS phase-shift matrix and MIMO precoders of an IRS-assisted multi-stream (MS) multi-user MIMO (MU-MIMO) system. Both approaches aim to maximize the system sum-rate for every channel realization. The first proposed solution is a novel contextual bandit (CB) framework with continuous state and action spaces called deep contextual bandit-oriented deep deterministic policy gradient (DCB-DDPG). The second is an innovative deep reinforcement learning (DRL) formulation where the states, actions, and rewards are selected such that the Markov decision process (MDP) property of reinforcement learning (RL) is appropriately met. Both proposals perform remarkably better than state-of-the-art heuristic methods in scenarios with high multi-user interference. [es_ES]
dc.description.sponsorship: This work has been supported by grants ED431C 2020/15 and ED431G 2019/01 (to support the Centro de Investigación de Galicia “CITIC”) funded by Xunta de Galicia and ERDF Galicia 2014-2020; and by grants PID2019-104958RB-C42 (ADELE) and BES-2017-081955 funded by MCIN/AEI/10.13039/501100011033. [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431C 2020/15 [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431G 2019/01 [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Institute of Electrical and Electronics Engineers [es_ES]
dc.relation: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-104958RB-C42/ES/AVANCES EN CODIFICACIÓN Y PROCESADO DE SEÑAL PARA LA SOCIEDAD DIGITAL [es_ES]
dc.relation: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/BES-2017-081955/ES/ [es_ES]
dc.relation.isversionof: https://doi.org/10.1109/TVT.2023.3249353
dc.relation.uri: https://doi.org/10.1109/TVT.2023.3249353 [es_ES]
dc.rights: © 2023 IEEE. All rights reserved. [es_ES]
dc.subject: Deep contextual bandit [es_ES]
dc.subject: DDPG [es_ES]
dc.subject: Deep reinforcement learning [es_ES]
dc.subject: Intelligent reflecting surfaces [es_ES]
dc.subject: MIMO [es_ES]
dc.title: Deep Contextual Bandit and Reinforcement Learning for IRS-Assisted MU-MIMO Systems [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.rights.access: info:eu-repo/semantics/openAccess [es_ES]
UDC.journalTitle: IEEE Transactions on Vehicular Technology [es_ES]
UDC.volume: 72 [es_ES]
UDC.issue: 7 [es_ES]
UDC.startPage: 9099 [es_ES]
UDC.endPage: 9114 [es_ES]
dc.identifier.doi: 10.1109/TVT.2023.3249353


