Taking Electroretinography to the Next Level

Artificial intelligence has the potential to enhance diagnosis and screening capabilities.


Nate R. Lighthizer, OD, FAAO



In the treatment of diseases such as glaucoma, diabetic retinopathy (DR), and macular degeneration, early detection is crucial to ensuring a successful treatment plan. But early detection can prove difficult before the onset of significant symptoms. As we know, a patient’s visual acuity can remain reasonable even as some of these diseases progress.

The electroretinogram (ERG) test is a useful component of ophthalmic diagnosis and screening, but typical ERG analysis focuses on only a few discrete data points, such as the amplitude of the response or the latency, or delay, of the waveform peak. This sort of discrete analysis ignores other features of the waveform that may be useful for early detection of the diseases mentioned above.

Because ERG response waveforms are complex and include many data values, artificial intelligence (AI) is a useful tool for analyzing them. Rather than merely focusing on a few distinct points, AI algorithms can instead analyze the entirety of the patient’s response, providing more thorough classification and evaluation of the patient than is possible with conventional analysis techniques.
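As a rough illustration of this difference (not drawn from the study itself), the sketch below contrasts conventional discrete-point analysis with a whole-waveform representation. The signal, sample count, and variable names are placeholders, not any device's actual output.

```python
import numpy as np

# Placeholder ERG recording: voltage samples (µV) over time (ms).
# The signal shape and sampling parameters are illustrative only.
time_ms = np.linspace(0, 250, 500)
waveform_uv = 50 * np.sin(time_ms / 20)

# Conventional analysis reduces the response to a few discrete values.
peak_amplitude_uv = waveform_uv.max()            # amplitude of the peak
peak_latency_ms = time_ms[waveform_uv.argmax()]  # latency (delay) of the peak

# Whole-waveform analysis keeps every sample as input for an AI model.
feature_vector = waveform_uv                     # shape: (500,)
```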

AI is a rapidly growing field in software development and engineering, and is particularly powerful for analyzing large data sets. When sufficiently trained with patient data, AI programs can be effective tools in the health care field for diagnostic categorization. With this idea in mind, I joined a team that set out to create a viable AI algorithm for use in ERG testing protocols.

AI AND ERG

To construct an AI algorithm, patient data with an established clinical diagnosis are needed. These ERG test results are fed into the AI model as training data, and the model uses these examples to generate benchmarks for classifying unknown patient data as either normal or abnormal. Training an AI program to accurately diagnose and screen unknown patients therefore requires a sizable quantity of previously diagnosed patient ERG data.

In our case, the AI models were based on data from approximately 780 eyes and their corresponding protocol waveforms. All data in the model were anonymous, and no patient identifying data were stored as part of the algorithm. After the model was generated, it was considered locked and did not continue to learn in real time. Eyes in the model were designated as either healthy or abnormal based on independent examination.

The AI model consisted of a support vector machine algorithm, which used the training data to create a boundary between healthy and abnormal ERG results. Subsequent, unknown ERG results could then be compared against this boundary and classified as either healthy or abnormal.
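As a minimal sketch of how a support vector machine separates labeled examples and then classifies a new one, the snippet below uses scikit-learn with randomly generated placeholder data. It is not the team's implementation; the waveform length, kernel choice, and preprocessing are assumptions for illustration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder training set: one row per eye (a full ERG waveform),
# labeled 0 (healthy) or 1 (abnormal) from independent examination.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(780, 500))
y_train = rng.integers(0, 2, size=780)

# Fitting the SVM defines the boundary between healthy and abnormal responses.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)

# An unknown ERG result is then compared against that boundary.
new_waveform = rng.normal(size=(1, 500))
print("abnormal" if model.predict(new_waveform)[0] == 1 else "healthy")
```

Once fit, a model like this stays fixed unless it is deliberately retrained, which mirrors the locked-model approach described above.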

Assembling the data used to construct the AI model is a critical step in generating an effective and efficient algorithm. A diverse set of patient data allows the algorithm to better learn the patterns and nuances of patient responses and ultimately results in more accurate classification. It is equally important that the data in the model be correctly classified as healthy or abnormal based on independent examination; mislabeled data can skew the algorithm's outputs.

PUTTING IT TO THE TEST

To study the effectiveness of the AI algorithm, the model was retrospectively tested in 235 patients at five sites. A total of 119 patients were deemed normal via conventional ophthalmic analysis, including normal visual fields and IOP. Another 104 patients had glaucomatous optic neuropathy and characteristic glaucomatous visual field defects, and the remaining 12 patients had mild nonproliferative DR.

These patients were tested with one of three protocols: pattern ERG, chromatic red/blue screening, or photopic negative response. Each patient was tested on only one protocol. Although the general construction and algorithm of the AI model were the same for each protocol, the data used in each model differed owing to the differing nature and parameters of the testing protocols (Table).
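For readers curious how agreement between the AI classification and the clinical designation can be tabulated in a retrospective test of this kind, the sketch below computes sensitivity and specificity for a single protocol. The labels are placeholders, not study results, and sensitivity/specificity are offered here only as standard summary measures.

```python
from sklearn.metrics import confusion_matrix

# Placeholder labels for one protocol: the independent clinical designation
# versus the AI model's classification. These are not study data.
clinical = ["abnormal", "normal", "normal", "abnormal", "normal"]
ai_calls = ["abnormal", "normal", "abnormal", "abnormal", "normal"]

tn, fp, fn, tp = confusion_matrix(
    clinical, ai_calls, labels=["normal", "abnormal"]
).ravel()

sensitivity = tp / (tp + fn)  # abnormal eyes correctly flagged by the AI
specificity = tn / (tn + fp)  # normal eyes correctly passed by the AI
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```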