Spatial–temporal feature-based End-to-end Fourier network for 3D sign language recognition

UDC.coleccion: Investigación [es_ES]
UDC.departamento: Ciencias da Computación e Tecnoloxías da Información [es_ES]
UDC.endPage: 15 [es_ES]
UDC.grupoInv: Laboratorio de Investigación e Desenvolvemento en Intelixencia Artificial (LIDIA) [es_ES]
UDC.issue: Article 123258 [es_ES]
UDC.journalTitle: Expert Systems with Applications [es_ES]
UDC.startPage: 1 [es_ES]
UDC.volume: 248 [es_ES]
dc.contributor.author: Abdullahi, Sunusi Bala
dc.contributor.author: Chamnongthai, Kosin
dc.contributor.author: Bolón-Canedo, Verónica
dc.contributor.author: Cancela, Brais
dc.date.accessioned: 2024-11-19T19:24:54Z
dc.date.embargoEndDate: 2026-08-15 [es_ES]
dc.date.embargoLift: 2026-08-15
dc.date.issued: 2024-08-15
dc.description: This is the Accepted Manuscript. This version of the article has been accepted for publication in: Expert Systems with Applications, 248, 123258. The Version of Record is available online at https://doi.org/10.1016/j.eswa.2024.123258. [es_ES]
dc.description.abstract: [Abstract]: Most dynamic sign word misclassifications are caused by redundant spatial–temporal (SPT) feature pruning that lacks language semantic and temporal dependencies. SPT feature recognition is one of the important aspects for the evaluation of the misclassification of dynamic sign words. The redundant pruning of SPT feature space influences the language model of sign confusion, model complexity, and SPT feature similarity. The purpose of this article is to develop a new multi-scale SPT feature-based dynamic sign word recognition approach via a low-cost feature selection method (FS) and End-to-end Fourier convolution neural network (EFCNN). Instead of a sensor fusion technique for obtaining frame position alignment, in the EFCNN, new 3D frame position and coordinates are determined using a pixel weighting and alignment function of the first and succeeding 25 spatial intensities of the 3D video changes across hand motion. The new spatial weight and the original spatial coordinates are fused and truncated in the Fourier domain. We generate the temporal dependence of the fused features. A feature selection known as the FS-EFCNN is introduced to select compact features with a preserved language meaning. Five state-of-the-art feature selection methods, namely Infinite FS (InFS), Relief FS, Fisher, MIM, ILFS, and ensemble FS-EFCNN were deployed to guide and optimize the learning performance of EFCNN. The experimental result analysis highlighted the improved results of the FS-EFCNN method with the best accuracy of 99.86%, 99.89%, and 90.69% on 3D American Sign Language, British Sign Language, and Greek Sign Language data sets, respectively. [es_ES]
dc.description.sponsorship: This research has been financially supported in part by the Spanish Ministerio de Ciencia e Innovación MCIN/AEI/10.13039/501100011033 and "NextGenerationEU"/PRTR under Grants [PID2019-109238GB-C22; PID2021-128045OA-I00; TED2021-130599A-I00], and by the Xunta de Galicia (ED431C 2022/44) with the European Union ERDF funds. CITIC, as a Research Center of the University System of Galicia, is funded by Consellería de Educación, Universidade e Formación Profesional of the Xunta de Galicia, Spain through the European Regional Development Fund (ERDF) and the Secretaría Xeral de Universidades (Ref. ED431G 2019/01). This research is also supported by King Mongkut’s University of Technology Thonburi’s Postdoctoral Fellowship under Research Project ID 27180. [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431C 2022/44 [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431G 2019/01 [es_ES]
dc.description.sponsorship: Thailand. King Mongkut's University of Technology Thonburi; 27180 [es_ES]
dc.identifier.citation: Abdullahi, S. B., Chamnongthai, K., Bolon-Canedo, V., & Cancela, B. (2024). Spatial–temporal feature-based End-to-end Fourier network for 3D sign language recognition. Expert Systems with Applications, 248, 123258. https://doi.org/10.1016/j.eswa.2024.123258 [es_ES]
dc.identifier.doi: 10.1016/j.eswa.2024.123258
dc.identifier.issn: 0957-4174
dc.identifier.issn: 1873-6793
dc.identifier.uri: http://hdl.handle.net/2183/40192
dc.language.iso: eng [es_ES]
dc.publisher: Elsevier [es_ES]
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-109238GB-C22/ES/APRENDIZAJE AUTOMATICO ESCALABLE Y EXPLICABLE [es_ES]
dc.relation.projectID: info:eu-repo/grantAgreement/MICINN/Plan Estatal de Investigación Científica, Técnica y de Innovación 2021-2023/PID2021-128045OA-I00/ES/APRENDIZAJE PROFUNDO ÉTICO [es_ES]
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2024/TED2021-130599A-I00/ES/ALGORITMOS DE SELECCIÓN DE CARACTERÍSTICAS VERDES Y RÁPIDOS [es_ES]
dc.relation.uri: https://doi.org/10.1016/j.eswa.2024.123258. [es_ES]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [es_ES]
dc.rights: © 2024. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/. [es_ES]
dc.rights.accessRights: embargoed access [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.subject: End-to-end deep learning network [es_ES]
dc.subject: Feature selection [es_ES]
dc.subject: Fourier convolution [es_ES]
dc.subject: Hand gestures [es_ES]
dc.subject: Natural language processing [es_ES]
dc.subject: Sign language recognition [es_ES]
dc.subject: Spatial–temporal information [es_ES]
dc.title: Spatial–temporal feature-based End-to-end Fourier network for 3D sign language recognition [es_ES]
dc.type: journal article [es_ES]
dspace.entity.type: Publication
relation.isAuthorOfPublication: c114dccd-76e4-4959-ba6b-7c7c055289b1
relation.isAuthorOfPublication: ba91aca1-bdb4-4be5-b686-463937924910
relation.isAuthorOfPublication.latestForDiscovery: c114dccd-76e4-4959-ba6b-7c7c055289b1

Files

Original bundle

Name: Bolon_Canedo_Veronica_2024_Spatial–temporal_feature-based_End-to-end_Fourier_network.pdf
Size: 1.96 MB
Format: Adobe Portable Document Format