Multimodal Image Encoding Pre-training for Diabetic Retinopathy Grading

HERVELLA, Álvaro S., et al. Multimodal image encoding pre-training for diabetic retinopathy grading. Computers in Biology and Medicine, 2022, vol. 143, p. 105302. https://doi.org/10.1016/j.compbiomed.2022.105302

Abstract

Diabetic retinopathy is an increasingly prevalent eye disorder that can lead to severe vision impairment. Grading the severity of the disease from retinal images is key to providing adequate treatment. However, in order to learn the diverse patterns and complex relations that are required for the grading, deep neural networks require very large annotated datasets that are not always available. This has typically been addressed by reusing networks that were pre-trained for natural image classification, hence relying on additional annotated data from a different domain. In contrast, we propose a novel pre-training approach that takes advantage of unlabeled multimodal visual data commonly available in ophthalmology. The use of multimodal visual data for pre-training purposes has previously been explored by training a network to predict one image modality from another. However, that approach does not ensure a broad understanding of the retinal images, given that the network may focus exclusively on the similarities between modalities while ignoring the differences. Thus, we propose a novel self-supervised pre-training that explicitly teaches the networks to learn the characteristics shared between modalities as well as the characteristics that are exclusive to the input modality. This provides a complete comprehension of the input domain and facilitates the training of downstream tasks that require a broad understanding of the retinal images, such as the grading of diabetic retinopathy. To validate and analyze the proposed approach, we performed exhaustive experiments on different public datasets. The transfer learning performance for the grading of diabetic retinopathy is evaluated under different settings and compared against previous state-of-the-art pre-training approaches. Additionally, a comparison against relevant state-of-the-art works for the detection and grading of diabetic retinopathy is provided. The results show a satisfactory performance of the proposed approach, which outperforms previous pre-training alternatives in the grading of diabetic retinopathy.

Keywords

Diabetic retinopathy
Computer-aided diagnosis
Medical imaging
Self-supervised learning
Deep learning
Eye fundus

Description

Funded for open access publication: Universidade da Coruña/CISUG

Publisher's version

https://doi.org/10.1016/j.compbiomed.2022.105302

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International