Artificial Intelligence Accurately Differentiates ROP Early Stages

Approximately 90% of the time, retinopathy of prematurity (ROP) progresses only to the first 2 of 5 stages and then regresses without intervention. But for the approximately 10% of patients who progress to type 1 ROP, swift, early treatment is often the only hope of reducing the risk of retinal detachment, permanent visual impairment and, in some cases, blindness. However, diagnosing ROP is difficult and time-consuming, and few experts focus on it. Researchers believe an artificial intelligence (AI) model, specifically a convolutional neural network (CNN) trained to detect stage 1 and stage 2 ROP from fundus images, can improve diagnosis and keep a greater number of patients from progressing.

The proposed deep CNN system accurately differentiates between early stages of ROP, according to research results published in the British Journal of Ophthalmology.

A retrospective, cross-sectional study enrolled premature infants without ROP, with stage 1 ROP, and with stage 2 ROP. Mean patient gestational age was 28.3 (±2.0) weeks (range: 23-32 weeks) in the no ROP group, 27.5 (±1.8) weeks (range: 24-32 weeks) in the stage 1 group, and 26.0 (±1.8) weeks (range: 24-31 weeks) in the stage 2 group. A total of 11,372 retinal fundus images were collected for model development: 10,235 were used for training and 1137 were used for validating the model. Because these 2 groups already account for all 11,372 images, the 244 images used for testing appear to be a separate set. Investigators implemented the deep CNN to classify images by ROP stage.
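The image counts above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the investigators' code; the variable names are illustrative, and it assumes the 244 test images are separate from the 11,372 development images, as the reported counts suggest.

```python
# Image counts as reported in the article.
train_images = 10_235
validation_images = 1_137
test_images = 244  # assumed to be separate from the 11,372 images


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Training and validation together make up the development set.
development_total = train_images + validation_images
train_share = share(train_images, development_total)
validation_share = share(validation_images, development_total)

print(f"training + validation: {development_total}")
print(f"training share:   {train_share:.1f}%")
print(f"validation share: {validation_share:.1f}%")
```

Under that assumption, the training and validation groups amount to roughly a 90/10 split of the development images.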

Researchers evaluated the trained model via fivefold cross-validation with the test dataset of 244 images, achieving an average accuracy of 92.23%. For predicting whether ROP was present, the model achieved 96.14% (±0.87) sensitivity and 95.95% (±0.48) specificity. For distinguishing stage 1 ROP from no ROP and stage 2 ROP, sensitivity and specificity were 91.82% (±2.03) and 94.50% (±0.71), respectively. For distinguishing stage 2 ROP from no ROP and stage 1 ROP, they were 89.81% (±1.82) and 98.99% (±0.40), respectively.
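Each of these comparisons treats one class (or group of classes) as "positive" and everything else as "negative." The sketch below shows how such sensitivity and specificity figures can be derived from a 3-class confusion matrix. It is an illustration only: the counts in `CONFUSION` are invented and are not the study's data, and the function name is hypothetical.

```python
from typing import List, Set, Tuple

CLASSES = ["no ROP", "stage 1", "stage 2"]

# Hypothetical confusion matrix: rows = true class, columns = predicted class.
# These counts are invented for illustration; they are NOT the study's data.
CONFUSION = [
    [95, 4, 1],   # true: no ROP
    [5, 90, 5],   # true: stage 1
    [2, 8, 90],   # true: stage 2
]


def sensitivity_specificity(
    matrix: List[List[int]], positive: Set[int]
) -> Tuple[float, float]:
    """Collapse a multiclass matrix into positive vs rest and score it."""
    tp = fn = fp = tn = 0
    for true_idx, row in enumerate(matrix):
        for pred_idx, count in enumerate(row):
            truly_positive = true_idx in positive
            called_positive = pred_idx in positive
            if truly_positive and called_positive:
                tp += count
            elif truly_positive:
                fn += count
            elif called_positive:
                fp += count
            else:
                tn += count
    return tp / (tp + fn), tn / (tn + fp)


# Any ROP (stage 1 or stage 2) vs no ROP.
any_rop = sensitivity_specificity(CONFUSION, {1, 2})
# Stage 1 vs no ROP and stage 2.
stage_1 = sensitivity_specificity(CONFUSION, {1})
# Stage 2 vs no ROP and stage 1.
stage_2 = sensitivity_specificity(CONFUSION, {2})

for label, (sens, spec) in [
    ("any ROP", any_rop), ("stage 1", stage_1), ("stage 2", stage_2)
]:
    print(f"{label}: sensitivity {sens:.2%}, specificity {spec:.2%}")
```

Note that in the "any ROP" comparison, a stage 1 image labeled as stage 2 still counts as a true positive, which is one reason presence-of-disease sensitivity can exceed the per-stage figures.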

Researchers note that, in this test, the CNN misclassified 6.15% of images. They offered a couple of possible explanations: first, some images did not visualize the stage well due to field of view, and second, some misclassified images were hazy in the periphery, meaning the demarcation line and ridge were obscured by optical artifacts such as “patches” generated by lens flare of the digital camera or excessive light exposure.
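For a sense of scale, the misclassification rate can be converted into an approximate image count. This is a rough sketch that assumes the 6.15% figure refers to the 244-image test dataset; the article does not state the count directly, and the variable names are illustrative.

```python
test_images = 244           # test dataset size reported in the article
reported_error_rate = 6.15  # percent misclassified, as reported

# Nearest whole number of images consistent with the reported rate.
estimated_errors = round(test_images * reported_error_rate / 100)

# Rate implied by that whole-number count, for comparison.
implied_rate = 100 * estimated_errors / test_images

print(f"about {estimated_errors} of {test_images} images "
      f"({implied_rate:.2f}%)")
```

Under that assumption, the rate corresponds to about 15 misclassified images.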

Limitations of the study include the retrospective design and small sample size, the use of imbalanced datasets, and the exclusion of eyes with poor image quality, which means the dataset may not reflect the quality of real-world data.

“Our study demonstrated the feasibility of classifying ROP stages with the proposed CNN model,” the researchers concluded. “This system provides proof of principle that AI techniques can theoretically reduce clinician workloads in the future.” They acknowledge that further studies are needed, and that there are additional challenges to incorporating artificial intelligence into clinical practice.

Disclosure: Several study authors declared affiliations with the pharmaceutical industry. Please see the original reference for a full list of authors’ disclosures.


Huang Y, Basanta H, Kang EYC, et al. Automated detection of early-stage ROP using a deep convolutional neural network. Br J Ophthalmol. Published online August 23, 2020. doi:10.1136/bjophthalmol-2020-316526