Self-Supervised Multimodal Reconstruction Pre-training for Retinal Computer-Aided Diagnosis

Hervella, Álvaro S.; Rouco, José; Novo Buján, Jorge; Ortega Hortas, Marcos

Use this link to cite:

http://hdl.handle.net/2183/28845

Files

Hervella_AlvaroS_2021_Self-supervised_Multimodal_Reconstruction.pdf (1.94 MB)

Identifiers

URI: http://hdl.handle.net/2183/28845

DOI: 10.1016/j.eswa.2021.115598

Publication date

2021

Authors

Hervella, Álvaro S.

Rouco, José

Novo Buján, Jorge

Ortega Hortas, Marcos

Bibliographic citation

Álvaro S. Hervella, José Rouco, Jorge Novo, Marcos Ortega, Self-supervised multimodal reconstruction pre-training for retinal computer-aided diagnosis, Expert Systems with Applications, Volume 185, 2021, 115598, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2021.115598. (https://www.sciencedirect.com/science/article/pii/S0957417421009982)

Abstract

Computer-aided diagnosis using retinal fundus images is crucial for the early detection of many ocular and systemic diseases. Nowadays, deep learning-based approaches are commonly used for this purpose. However, training deep neural networks usually requires a large amount of annotated data, which is not always available. In practice, this issue is commonly mitigated with different techniques, such as data augmentation or transfer learning. Nevertheless, the latter typically relies on networks that were pre-trained on additional annotated data. An emerging alternative to the traditional transfer learning source tasks is the use of self-supervised tasks that do not require manually annotated data for training. In that regard, we propose a novel self-supervised visual learning strategy for improving retinal computer-aided diagnosis systems using unlabeled multimodal data. In particular, we explore the use of a multimodal reconstruction task between complementary retinal imaging modalities. This makes it possible to take advantage of existing unlabeled multimodal data in the medical domain, improving the diagnosis of different ocular diseases with additional domain-specific knowledge that does not rely on manual annotation. To validate and analyze the proposed approach, we performed several experiments aimed at the diagnosis of different diseases, including two of the most prevalent impairing ocular disorders: glaucoma and age-related macular degeneration. Additionally, the advantages of the proposed approach are clearly demonstrated in the comparisons that we perform against both the common fully-supervised approaches in the literature and current self-supervised alternatives for retinal computer-aided diagnosis. In general, the results show a satisfactory performance of our proposal, which improves on existing alternatives by leveraging the unlabeled multimodal visual data that is commonly available in the medical field.

Description

Funded for open access publication: Universidade da Coruña/CISUG

Keywords

Deep learning; Medical imaging; Self-supervised learning; Eye fundus; Transfer learning; Computer-aided diagnosis

Editor version

https://doi.org/10.1016/j.eswa.2021.115598

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International

Collections

Investigación (FIC)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International
