Design, Implementation, and Practical Evaluation of a Voice Recognition Based IoT Home Automation System for Low-Resource Languages and Resource-Constrained Edge IoT Devices: A System for Galician and Mobile Opportunistic Scenarios

Froiz-Míguez, Iván; Fraga-Lamas, Paula; Fernández-Caramés, Tiago M.

dc.contributor.author	Froiz-Míguez, Iván
dc.contributor.author	Fraga-Lamas, Paula
dc.contributor.author	Fernández-Caramés, Tiago M.
dc.date.accessioned	2024-06-25T10:13:43Z
dc.date.available	2024-06-25T10:13:43Z
dc.date.issued	2023
dc.identifier.citation	I. Froiz-Míguez, P. Fraga-Lamas, and T. M. Fernández-Caramés, «Design, Implementation, and Practical Evaluation of a Voice Recognition Based IoT Home Automation System for Low-Resource Languages and Resource-Constrained Edge IoT Devices: A System for Galician and Mobile Opportunistic Scenarios», IEEE Access, vol. 11, pp. 63623-63649, 2023, doi: 10.1109/ACCESS.2023.3286391.	es_ES
dc.identifier.issn	2169-3536
dc.identifier.uri	http://hdl.handle.net/2183/37349
dc.description.abstract	[Abstract]: Systems with voice control are an attractive option for increasing technological integration, not only for people with little knowledge of technology or constrained Internet access, but also for people with certain disabilities. In addition, devices based on Alexa or Google Home provide an interesting alternative for interacting with Internet of Things (IoT) devices, but they usually rely on an Internet connection to a cloud server for their full operation. Furthermore, many voice-recognition systems are only available in a limited number of languages, which tend to be those with the highest number of speakers, thus excluding minority-language speakers. To address these issues, this article presents a solution based on Edge Computing and voice commands that carries out offline voice processing and is able to interact with IoT-based systems. The proposed system performs local speech inference and provides a communication interface with IoT devices in a Bluetooth mesh, quickly and without the need for an Internet connection. In addition, the proposed solution can be adapted easily to voice recognition for languages with few resources. This feature is demonstrated with the Galician language, which is spoken by fewer than 3 million people worldwide. In particular, different Automatic Speech Recognition (ASR) models based on three of the most popular ASR development frameworks (wav2vec2, DistilHubert, Whisper) were developed to transcribe short speech and to translate it into IoT commands that perform specific home-automation actions. These models were fine-tuned for Galician with a corpus of approximately 20 hours and were evaluated in static and mobile opportunistic scenarios in terms of accuracy, energy consumption and latency on an embedded platform (that acts as an edge device) and on a cloud server.
The obtained results show that inference is performed in less than 2 seconds on a Raspberry Pi 4 for the two smallest models and in less than 500 ms on a high-end Android smartphone when processing all data locally with CPU-only inference (i.e., without hardware acceleration or external processing). The transcriptions are accurate enough for simple text-distance algorithms to detect keywords in the speech and perform commands on IoT devices. In particular, a maximum success rate of 92% was achieved for detecting the indicated commands when using models optimized for execution on embedded devices. For selected home scenarios, command actions were sent via Bluetooth with average response times of up to 113 ms.	es_ES
dc.description.sponsorship	This work has been funded by the Xunta de Galicia (by grant ED431C 2020/15), and by grants PID2020-118857RA-100 (ORBALLO) and TED2021-129433A-C22 (HELENE) funded by MCIN/AEI/10.13039/501100011033 and the European Union NextGenerationEU/PRTR.	es_ES
dc.description.sponsorship	Xunta de Galicia; ED431C 2020/15	es_ES
dc.language.iso	eng	es_ES
dc.publisher	IEEE	es_ES
dc.relation	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-118857RA-100/ES/EDGE COMPUTING OPORTUNISTA BASADO EN DISPOSITIVOS IOT MÓVILES Y DE BAJA POTENCIA (ORBALLO)	es_ES
dc.relation	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/TED2021-129433A-C22/ES/SISTEMA DE ALTA SEGURIDAD BASADO EN BLOCKCHAIN PARA LA GESTIÓN PRIVADA DE DATOS DE PACIENTES DE SERVICIOS DE SALUD DIGITALES	es_ES
dc.relation.uri	https://doi.org/10.1109/ACCESS.2023.3286391	es_ES
dc.rights	Attribution 3.0 Spain	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by/3.0/es/	*
dc.subject	ASR	es_ES
dc.subject	Machine learning	es_ES
dc.subject	IoT	es_ES
dc.subject	Voice-assistant	es_ES
dc.subject	Edge AI	es_ES
dc.subject	Edge computing	es_ES
dc.subject	Home automation	es_ES
dc.subject	Opportunistic communications	es_ES
dc.title	Design, Implementation, and Practical Evaluation of a Voice Recognition Based IoT Home Automation System for Low-Resource Languages and Resource-Constrained Edge IoT Devices: A System for Galician and Mobile Opportunistic Scenarios	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
dc.rights.access	info:eu-repo/semantics/openAccess	es_ES
UDC.journalTitle	IEEE Access	es_ES
UDC.volume	11	es_ES
UDC.startPage	63623	es_ES
UDC.endPage	63649	es_ES
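
The abstract states that the transcriptions are accurate enough for simple text-distance algorithms to detect keywords and trigger commands on IoT devices. The record does not say which distance or which phrases the authors used, so the following is only a minimal sketch of the general idea, assuming Levenshtein edit distance, a normalized threshold, and a hypothetical command table (the Galician phrases, the command identifiers, and the `max_ratio` value are illustrative and do not come from the paper):

```python
from typing import Optional

# Hypothetical command table: example Galician phrases mapped to made-up command IDs.
COMMANDS = {
    "acender a luz": "LIGHT_ON",
    "apagar a luz": "LIGHT_OFF",
    "abrir a persiana": "BLIND_OPEN",
}


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def match_command(transcript: str, max_ratio: float = 0.3) -> Optional[str]:
    """Return the command whose phrase is closest to the transcript, or None.

    The distance is normalized by the longer string, and a match is accepted
    only if that ratio does not exceed max_ratio (an assumed threshold).
    """
    text = transcript.lower().strip()
    best_cmd, best_ratio = None, 1.0
    for phrase, cmd in COMMANDS.items():
        ratio = levenshtein(text, phrase) / max(len(text), len(phrase), 1)
        if ratio < best_ratio:
            best_cmd, best_ratio = cmd, ratio
    return best_cmd if best_ratio <= max_ratio else None
```

With this table, a transcript containing a single-character ASR error, such as `match_command("acender a lus")`, still resolves to `LIGHT_ON`, while an unrelated utterance falls above the threshold and returns `None`. Normalizing by length keeps the threshold comparable across short and long phrases; a real deployment would need to tune it against measured transcription errors.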

Files in this item

Name: FroizMiguez_Ivan_2023_Design_I ...
Size: 3.379Mb
Format: PDF

Name: license_rdf
Size: 1.337Kb
Format: application/rdf+xml

This item appears in the following collection(s)

GI-GTEC - Artigos [190]