A recent study used a systematic review to identify specific algorithms that can be used to type diabetes based on patient health information. The research, published in Diabetes Research and Clinical Practice, identified the best-performing algorithms for typing diabetes using electronic health records (EHRs) and clinical and administrative data.1
Over the past 20 years, large diabetes registries and comprehensive administrative and clinical databases have provided a gold mine for population-based research, researchers say.1 These data have the potential to improve planning of care and health services. “This is particularly crucial for conditions such as early-onset diabetes, where low prevalence makes it difficult to collect sufficient data to obtain meaningful results,” the authors write.1
Diagnosis of diabetes based on traditional criteria has become increasingly unreliable. More than half of new type 1 cases are diagnosed in adults and type 2 is increasing among young people, the authors note. Information from EHR, administrative, and clinical databases could improve the diagnosis and treatment of diabetes. The EHR can improve monitoring and help accurately type diabetes in patients, thereby improving health outcomes.1 Although several efforts have been made to obtain these data, no systematic review has been carried out until now, according to the authors.
The researchers conducted a systematic review of the literature and selected 19 eligible low-bias studies from EMBASE and MEDLINE published between January 2000 and January 2023. The studies examined how well the algorithms identified type 1 and type 2 diabetes. The researchers evaluated each study on how accurately its algorithms defined diabetes type, using diagnostic measures against various reference standards. They assessed the quality of the studies using the Quality Assessment of Diagnostic Accuracy Studies tool.
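To illustrate what evaluating an algorithm against a reference standard involves, here is a minimal Python sketch that computes common diagnostic measures (sensitivity, specificity, and positive and negative predictive value) for type 1 classification. The function name, label strings, and sample data are hypothetical and are not taken from the review, which does not specify how each study computed its measures.

```python
def diagnostic_accuracy(predicted, reference, positive="type 1"):
    """Compare algorithm output with a reference standard, label by label.

    Both arguments are equal-length sequences of labels. The label strings
    and the choice of "type 1" as the positive class are assumptions made
    for this illustration.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    tp = fp = tn = fn = 0
    for pred, ref in zip(predicted, reference):
        if pred == positive and ref == positive:
            tp += 1  # algorithm and reference both say type 1
        elif pred == positive:
            fp += 1  # algorithm says type 1, reference disagrees
        elif ref == positive:
            fn += 1  # algorithm missed a reference type 1 case
        else:
            tn += 1  # both say not type 1

    def ratio(numerator, denominator):
        # Return None when a measure is undefined (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Made-up labels for five patients, for demonstration only.
algorithm_labels = ["type 1", "type 1", "type 2", "type 2", "type 2"]
reference_labels = ["type 1", "type 2", "type 2", "type 2", "type 1"]
metrics = diagnostic_accuracy(algorithm_labels, reference_labels)
```

In this made-up example, the algorithm finds one of the two reference type 1 cases, so sensitivity is 0.5.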
Insulin use data were highly accurate in identifying both types of diabetes.1 However, the researchers noted that algorithms relying on prescriptions for oral hypoglycemic agents (OHAs) were less likely to categorize diabetes accurately. The authors said the algorithms worked better for patients who were younger at diagnosis and that typing accuracy declined with age. Even so, “single-criterion algorithms based on age at diagnosis or OHA prescription generally perform poorly,” the researchers said.1
Algorithms based on multiple diagnosis codes were most effective in predicting type 1 diabetes versus type 2 diabetes using health data.1 This included calculating code ratios to improve typing. Diabetes diagnosis codes, such as ICD-10-CM and ICD-9-CM codes, were critical to the success rate of this model. However, diagnostic code accuracy depends on variables such as the quality of medical record documentation and specific coding guidelines.1
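As a rough illustration of a code-ratio approach, the Python sketch below counts a patient's ICD-10-CM diabetes codes (the E10 category for type 1 and the E11 category for type 2) and assigns a type from the share of type 1 codes. The function name, the 0.5 threshold, and the two-code minimum are assumptions chosen for the example; the review does not prescribe these values, and published algorithms differ in their cutoffs.

```python
from collections import Counter


def classify_by_code_ratio(codes, threshold=0.5, min_codes=2):
    """Assign a diabetes type from the ratio of type 1 to all diabetes codes.

    codes: iterable of ICD-10-CM code strings from one patient's record.
    threshold and min_codes are hypothetical values for illustration only.
    """
    counts = Counter()
    for code in codes:
        normalized = code.strip().upper()
        if normalized.startswith("E10"):
            counts["type1"] += 1  # E10.x: type 1 diabetes mellitus
        elif normalized.startswith("E11"):
            counts["type2"] += 1  # E11.x: type 2 diabetes mellitus
        # Other codes are ignored in this simplified sketch.

    total = counts["type1"] + counts["type2"]
    if total < min_codes:
        # Too few diabetes codes to rely on a ratio.
        return "undetermined"

    type1_share = counts["type1"] / total
    return "type 1" if type1_share > threshold else "type 2"


# Made-up code lists, for demonstration only.
mostly_type1 = classify_by_code_ratio(["E10.9", "E10.65", "E11.9"])
mostly_type2 = classify_by_code_ratio(["E11.9", "E11.65", "E10.9"])
single_code = classify_by_code_ratio(["E10.9"])
```

Requiring more than one code before applying the ratio reflects the idea that a single, possibly miscoded, entry should not decide the type on its own.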
In addition to using multiple codes, “approaches with more than one criterion may also increase sensitivity in distinguishing diabetes type,” the researchers said.1 Most of the top 10 algorithms used multiple criteria.
When diabetes diagnosis codes were not available, self-reported diabetes type, alone or with other predictors, outperformed alternative approaches.1 However, researchers said self-reported data is generally not available in EHRs or administrative databases.
Machine learning (ML) algorithms played a small role in the study, but demonstrated their potential in reducing false positives and false negatives. The authors noted that the combination of rule-based algorithms, clinical guidelines, and ML could be a future direction to improve the classification of diabetes types.
Among the study's limitations, most of the data came from high-income countries, so the results may not apply to low-income countries. Additionally, the authors noted concerns regarding the reference standards used, highlighting the need for more standardized criteria for diabetes typing. They also pointed out that “a critical gap in the literature is the paucity of studies attempting to externally validate already published algorithms.”1
The authors emphasized that their research lays the foundation for more accurate diabetes typing and treatment strategies.
“The results of this review demonstrate that with the use of EHR-based data, the presence of multiple diabetes diagnosis codes is one of the most readily available and accurate predictors for identifying diabetes type,” the researchers concluded.1 “As technology and our ability to analyze big data continue to evolve and improve, ML algorithms with or without human expertise are expected to become an important approach to accurately distinguish diabetes type.”