Utilizing Machine Learning for Risk Factor Identification in Major Eye Diseases

Optical coherence tomography (OCT) of the optic nerve in a patient at risk of glaucoma. (Photo by: BSIP/Universal Images Group via Getty Images)
The study aimed to quantify both modifiable and nonmodifiable risk factors.

Machine learning can be used to quantify the relative influence of risk factors for 4 major age-related eye diseases, according to research published in the British Journal of Ophthalmology.

Investigators sought to evaluate machine learning utility for determining the relative contributions of both modifiable and nonmodifiable risk factors for retinopathy, cataract, age-related macular degeneration (AMD), and glaucoma. 

Each of the 4 diseases was subclassified: diabetic and nondiabetic retinopathy; nuclear, cortical, and posterior subcapsular cataract; early and late AMD; and primary open-angle and primary angle-closure glaucoma. Grading systems consistent with each disease were used. As risk factors, investigators considered demographic characteristics, ocular characteristics, metabolic and systemic characteristics, ancestry genetic background, lifestyle patterns, and socioeconomic status. A 2-step analysis first identified the metabolites associated with each eye disease, then used a tree-based machine learning model to investigate the relationships between all risk factors and each eye disease. The relative influence of each individual risk factor was estimated using a gradient boosting machine.
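To illustrate the second step, the following is a minimal sketch of how a gradient boosting model can rank risk factors by relative influence. It is not the investigators' code: the use of scikit-learn, the synthetic data, the feature names, and the hyperparameters are all assumptions made for illustration. scikit-learn's impurity-based `feature_importances_` (normalized to sum to 1) stands in here for relative influence.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


def relative_influence(X, y, feature_names, random_state=0):
    """Fit a gradient boosting classifier and return each feature's
    relative influence as a percentage, sorted from highest to lowest."""
    # Hyperparameters are illustrative assumptions, not values from the study.
    model = GradientBoostingClassifier(
        n_estimators=200,
        max_depth=3,
        learning_rate=0.05,
        random_state=random_state,
    )
    model.fit(X, y)
    # feature_importances_ sums to 1, so scaling by 100 gives percentages.
    pct = 100.0 * model.feature_importances_
    ranked = sorted(zip(feature_names, pct), key=lambda kv: kv[1], reverse=True)
    return dict(ranked)


# Synthetic data (hypothetical, NOT study data): disease risk is driven
# mostly by age, weakly by HbA1c, and not at all by the other columns.
rng = np.random.default_rng(0)
n = 2000
age = rng.normal(58, 10, n)
axial_length = rng.normal(23.5, 1.0, n)
hba1c = rng.normal(6.0, 1.0, n)
noise = rng.normal(0.0, 1.0, n)

logit = 0.2 * (age - 58) + 0.2 * (hba1c - 6.0)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
X = np.column_stack([age, axial_length, hba1c, noise])

names = ["age", "axial_length", "hba1c", "noise"]
influence = relative_influence(X, y, names)
for name, pct in influence.items():
    print(f"{name}: {pct:.1f}%")
```

Because the percentages sum to 100, they can be compared across risk factors within one disease model, which is how figures such as "relative influence, 22.1%" are read.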

Participants were enrolled from the Singapore Epidemiology of Eye Disease (SEED) study, a multiethnic Asian, population-based, prospective cohort of adults from Singapore of Chinese, Indian, and Malay descent. In total, 10,033 participants (50.7% female; mean age, 57.7 ± 10.4 years) were included. 

Metabolic profile assessment identified 25 metabolites associated with diabetic retinopathy, 14 associated with early AMD, and 6, 21, and 18 metabolites associated with nuclear, cortical, and posterior subcapsular cataracts, respectively; 12 metabolites were associated with primary open-angle glaucoma. 

Metabolic characteristics were the primary risk factors for both diabetic and nondiabetic retinopathy (relative influence, 64.4% and 40%, respectively), with diabetes duration and hemoglobin A1c as the highest individual risk factors for diabetic retinopathy (relative influence, 22.1% and 8.9%, respectively). Older age and shorter axial length were the primary risk factors for AMD (relative influence, 15% and 5%, respectively), while the relative influence of ancestry genetic background was roughly 20%.

Age was the strongest influence on nuclear and cortical cataract risk (relative influence, approximately 60%). For posterior subcapsular cataracts, longer axial length was the primary individual risk factor (relative influence, 30.8%), and for both primary open-angle and primary angle-closure glaucoma, ocular characteristics were the highest clustered risk factors (relative influence, 38.1% and 33.5%, respectively).

Primary open-angle and primary angle-closure glaucoma were associated with the highest contribution of modifiable risk factors (cumulative relative influence, 34.8% and 30.8%, respectively), followed by retinopathy and late AMD. Modifiable risk factor contributions for early AMD and cataracts were lower than 11%.
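A cumulative relative influence is the sum of the individual relative influences within a group of risk factors. The sketch below shows that arithmetic. The factor names, percentages, and the modifiable/nonmodifiable labels are hypothetical placeholders, not values or classifications from the study.

```python
def cumulative_influence(influence, groups):
    """Sum per-factor relative influence (in percent) within each group.

    influence: mapping of risk factor name -> relative influence (%)
    groups:    mapping of risk factor name -> group label
    Factors missing from `groups` are collected under "unclassified".
    """
    totals = {}
    for factor, pct in influence.items():
        group = groups.get(factor, "unclassified")
        totals[group] = totals.get(group, 0.0) + pct
    return totals


# Hypothetical numbers for illustration only (NOT study results).
example_influence = {
    "age": 40.0,
    "axial_length": 25.0,
    "hba1c": 20.0,
    "smoking": 15.0,
}
# Hypothetical grouping; the study used its own (broad) definitions.
example_groups = {
    "age": "nonmodifiable",
    "axial_length": "nonmodifiable",
    "hba1c": "modifiable",
    "smoking": "modifiable",
}

totals = cumulative_influence(example_influence, example_groups)
print(totals)
```

How a factor is labeled changes the totals directly, which is why the breadth of the modifiable definition matters when interpreting these cumulative figures.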

Study limitations include the cross-sectional design, the possibility that ancestry genetic information reflects overall ancestral genetic background rather than an individual's overall genetic contribution, and the use of broad definitions for modifiable risk factors.

“This study illustrates the utility of [machine learning] in ranking a large number of risk factors, allowing [us] to identify the highest contributors,” the researchers concluded. 


Nusinovici S, Zhang L, Chai X, et al. Machine learning to determine relative contribution of modifiable and non-modifiable risk factors of major eye diseases. Br J Ophthalmol. Published online November 18, 2020. doi:10.1136/bjophthalmol-2020-317454