Imaging Software Can Identify Diabetic Retinopathy

An automated retinal image analysis system may offer care options to people living in underresourced settings.

Automated retinal imaging analysis (ARIA) has high sensitivity for detecting both diabetic retinopathy and proliferative diabetic retinopathy, according to a study published in Diabetic Medicine. ARIA is viable as a triage tool for human graders, particularly in underserved rural regions, the researchers report.

Although national diabetic retinopathy screening programs are effective, they require “considerable time” spent analyzing images as well as trained, certified, and quality-assured human graders, according to the investigators. New solutions like ARIA may be successfully used to analyze a large volume of retinal images, facilitating real-time point-of-care diabetic retinopathy grading. 

For ARIA to be clinically applicable, it must perform well in areas served by nonophthalmic clinicians and less experienced healthcare workers. Researchers therefore sought to compare ARIA grading accuracy against manual diabetic retinopathy grading, based on retinal images of Indigenous Australians with type 2 diabetes in remote primary care clinical settings. The UK National Health Service Diabetic Eye Screening (DES) Program guidelines were used as the reference standard for grading. 

The study cohort included 410 Indigenous Australians from the Northern Territory with type 2 diabetes who were recruited between 2013 and 2015 from 3 remote primary care centers. All underwent diabetic retinopathy screening, which included an assessment of visual acuity, pupil dilation, and fundus photography. 

Images evaluated by humans were graded by a single trained and certified grader at the Belfast Ophthalmic Reading Center. 

Four hundred participants had retinal images available for both eyes. For ARIA and human grading, respectively, 60 eyes from 43 participants and 68 eyes from 50 participants were ungradable. For human grading, 742 eyes from 391 participants were available for analysis. For ARIA grading, 750 eyes from 393 participants were available for analysis. In total, 391 participants were graded by both human graders and ARIA. 

Reasons for ungradable ARIA images included cataracts, poor image quality, pterygium, and other or unknown reasons (21.7%, 71.8%, 1.7%, and 5% of images, respectively). 

Based on human grading, diabetic retinopathy prevalence was 47.3%, with 29.7% of participants classified as R1, 9.7% as R2, and 7.9% as R3. Using ARIA grading, the overall prevalence of diabetic retinopathy was 48.6%, with 28.0%, 14.0%, and 9.4% classified as R1, R2, and R3, respectively. 

In total, 14.6% of cases were classified as false negatives and 10.7% as false positives. Following arbitration, primary causes of false positives included poor image quality or small pupil (23.8%), cataract (16.7%), artifacts (14.3%), and other pathology (4.8%). Primary reasons for false negatives included cataract (29.8%), poor image quality or small pupil (24.6%), other pathology or treatment (19.3%), and artifacts (19.3%). 

When comparing ARIA to the reference standard, software sensitivity and specificity were 91.7% (95% CI, 88.0-94.6) and 90.5% (95% CI, 87.3-93.0), respectively. Sensitivity and specificity for each retinopathy grade ranged from 74.6% to 95.7% and 90.5% to 98.6%, respectively. 
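The paper reports these headline figures without the underlying counts. As a reminder of how sensitivity and specificity are derived from a confusion matrix, here is a minimal sketch; the true/false positive and negative counts below are hypothetical, chosen only to roughly reproduce the reported values.

```python
def sensitivity(tp, fn):
    # True positive rate: fraction of eyes with retinopathy
    # that the software correctly flags.
    return tp / (tp + fn)

def specificity(tn, fp):
    # True negative rate: fraction of eyes without retinopathy
    # that the software correctly clears.
    return tn / (tn + fp)

# Hypothetical counts (not from the paper): 120 diseased eyes of which
# ARIA flags 110, and 105 healthy eyes of which ARIA clears 95.
print(round(sensitivity(110, 10) * 100, 1))   # 91.7
print(round(specificity(95, 10) * 100, 1))    # 90.5
```

Note the trade-off implicit in a triage tool: a high sensitivity keeps missed disease (false negatives) rare, while specificity determines how many healthy eyes are needlessly passed on to human graders.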

On a participant level, ARIA software sensitivity and specificity for the diagnosis of any diabetic retinopathy were 91.4% (95% CI, 86.3-95.0) and 85.0% (95% CI, 79.3-89.5), respectively. For levels R1 to R3, ARIA sensitivity and specificity ranged from 70.7% to 96.8% and 89.8% to 98.3%, respectively; R1 had the lowest sensitivity at both the participant level and the eye level. 

Agreement analysis between ARIA and the human grader for any diabetic retinopathy as well as each retinopathy stage ranged between 84.1% and 98.2%; proportionate agreement was 88%, 84.1%, 92.3%, and 98.2% for any diabetic retinopathy, R1, R2, and R3, respectively. 
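Proportionate agreement here is simply the fraction of eyes assigned the same grade by both ARIA and the human grader. A minimal sketch, using hypothetical grades for 10 eyes (the study's per-eye grade lists are not reproduced in this article):

```python
def proportionate_agreement(grades_a, grades_b):
    # Fraction of paired gradings where both graders agree exactly.
    if len(grades_a) != len(grades_b):
        raise ValueError("grade lists must be the same length")
    matches = sum(a == b for a, b in zip(grades_a, grades_b))
    return matches / len(grades_a)

# Hypothetical grades for 10 eyes (R0 = no retinopathy, R1-R3 per DES grading).
human = ["R0", "R1", "R0", "R2", "R0", "R1", "R3", "R0", "R0", "R1"]
aria  = ["R0", "R1", "R1", "R2", "R0", "R0", "R3", "R0", "R0", "R1"]
print(proportionate_agreement(human, aria))  # 0.8
```

Unlike chance-corrected statistics such as Cohen's kappa, raw proportionate agreement does not account for agreement expected by chance, which is worth keeping in mind when grades are heavily skewed toward "no retinopathy."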

For 77.3% of participants, both eyes were graded the same by ARIA and the human grader, while 13.9% of participants had one eye graded the same and 6.3% had both eyes graded differently. Of the 10 participants who had images available for only 1 eye, 8 received the same grade from ARIA and the human grader; ARIA graded 1 eye worse than the human grader, and the remaining eye was ungradable by the human grader (ARIA grade: R3). 

Study limitations include the lack of diabetic maculopathy data, missing information on referable diabetic retinopathy (an important limitation of the current ARIA software version), the use of only 1 human grader, and the need for future research into the diagnosis and evaluation of false positives and false negatives. 

“This study presents a current ARIA software as a potentially useful tool for [diabetic retinopathy] screening in remote and very remote Indigenous Australians for triaging screened participants,” according to the researchers. “This would reduce the workload of human graders by half in this clinical setting and population by eliminating those without [diabetic retinopathy] or reduced vision.” 

“Updated ARIA versions of this application may achieve adequate sensitivity for the detection of referable retinopathy and maculopathy. Human graders could then prioritize their efforts to the quality assurance of all elements of an ARIA-based [diabetic retinopathy] screening program.” 


Quinn N, Brazionis L, Zhu B, et al; for the Centre of Research Excellence in Diabetic Retinopathy Study, TEAMSnet Study Groups. Facilitating diabetic retinopathy screening using automated retinal image analysis in underresourced settings. Diabet Med. Published online April 7, 2021. doi:10.1111/dme.14582