In this article, Antonio Javier Sutil Jiménez discusses the data presented in the study “Deep learning model for the earlier detection of cognitive decline from clinical notes in electronic health records”.
Why is this study of a learning model based on clinical notes important?
This study addresses the early detection of cognitive decline in adults, which is essential to carry out successful therapeutic interventions, slow down decline, prevent the development of disease, or facilitate participant recruitment for clinical trials.
Alzheimer’s disease
Alzheimer’s disease is a type of dementia that represents a major global problem. It has been diagnosed in nearly 6 million people in the United States, and its prevalence increases with age, so the aging of the population is expected to increase its incidence over the coming years.
However, beyond Alzheimer’s disease, mild cognitive impairment is a highly relevant problem that in many cases is associated with the subsequent development of dementia.
Subjective cognitive decline
Similarly, the category of subjective cognitive decline has recently been created. This term refers to an individual’s perception of experiencing a decline in their cognitive abilities compared to their previous state.
Although this label is not a disease in itself, it has been observed that people with this condition may be in an early stage of cognitive decline.
Detection of cognitive decline
Although great efforts are being made to improve treatments for these patients, the detection of cognitive decline remains a challenge, and improving detection tools is necessary for subsequent treatments to be effective.
Primary care tools
Since the number of specialists available to care for at-risk populations is limited, a possible solution could be to provide tools to primary care physicians. These physicians are not dementia specialists, but they have direct contact with this population, so equipping them with diagnostic tools is a viable solution.
Electronic medical records
The use of electronic health records is proposed as a suitable alternative for developing such tools, as they collect patients’ visit histories within a healthcare system.
However, it is important to highlight the difficulty of identifying signs of cognitive decline not associated with age, which are often documented in cognitive assessments and in patients’ concerns recorded by healthcare professionals. Although studies have been conducted using patients’ clinical information, the use of clinical notes from medical records for this purpose has rarely been explored in depth.
Clinical notes as an informative resource
This study proposes using clinical notes as an informative resource that could capture information not considered in most studies. Manually analyzing clinical notes would be very costly, so the study’s objective was to develop an automatic detection model based on deep learning.
The study’s approach is therefore novel in its use of clinical notes.
Clinical notes are a very important part of health records in the clinical setting. However, their use in scientific research has been limited, which makes their application to the early detection of cognitive decline potentially highly valuable.
What was done?
Database
For this study, data were taken from a private health company, filtering patients by age (they had to be over 50 years old) and by the diagnosis of mild cognitive impairment. Specifically, clinical notes from the 4 years prior to diagnosis were analyzed.
The definition of cognitive decline was based on the mention of symptoms, diagnosis, cognitive assessments, and treatments. When notes indicated improvement, transient episodes, or reversible conditions, they were considered negative for cognitive decline.
Processing of clinical notes and database development
First, due to the length of the clinical notes, a natural language processor was used to split them into sections. This division made it possible to identify whether or not each section indicated cognitive decline.
Next, keywords selected by trained experts were used to identify sections that contained signs of cognitive decline. Three annotators labeled the sections, and conflicts were resolved through discussions with subject-matter experts, achieving a good level of agreement among annotators.
In addition, a labeled dataset of 4,950 sections was created to train and test various machine learning algorithms. Finally, two datasets were created for model development and validation.
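The splitting and keyword screening described above can be sketched as follows. This is only an illustration under stated assumptions: the section-header pattern and the keyword list are hypothetical, and the study's own natural language processor and expert-curated keywords are not reproduced here.

```python
import re

# Hypothetical keywords for illustration only; not the study's expert-selected list.
KEYWORDS = ["memory loss", "forgetful", "cognitive impairment", "confusion"]

# Assumed convention: a section starts with a header line such as "Chief Complaint:".
SECTION_HEADER = re.compile(r"^(?P<title>[A-Z][A-Za-z /]+):[ \t]*$", re.MULTILINE)


def split_sections(note: str) -> list[tuple[str, str]]:
    """Split a note into (title, body) pairs using the assumed header pattern."""
    matches = list(SECTION_HEADER.finditer(note))
    sections = []
    for i, m in enumerate(matches):
        # A section body runs until the next header or the end of the note.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        sections.append((m.group("title"), note[m.end():end].strip()))
    return sections


def has_keyword(text: str) -> bool:
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(k.lower() in lowered for k in KEYWORDS)


example_note = (
    "Chief Complaint:\n"
    "Knee pain.\n"
    "History of Present Illness:\n"
    "Patient reports memory loss over the past year.\n"
)
example_sections = split_sections(example_note)
flagged_titles = [title for title, body in example_sections if has_keyword(body)]
```

Sections flagged this way would still need manual annotation, since a keyword match alone does not establish cognitive decline (for example, "denies memory loss").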
Datasets
The first dataset, used for model development, included only sections with selected keywords. This dataset contained 4,950 annotated sections, ready for developing the machine learning models.
The second dataset consisted of 2,000 randomly selected sections from all notes, excluding those used in the first dataset. This second set was used to test the model’s ability to generalize to note sections without applying a keyword-based filter.
Model development and validation
To develop the model, the authors used a hierarchical attention structure based on deep learning that had been developed in previous work, in addition to four baseline machine learning algorithms: logistic regression, random forest, support vector machine, and XGBoost.
The previously developed model incorporated a context-adapted convolutional neural network, which made it possible to handle word variations and to interpret predictions through attention layers. For more information on the model, readers are advised to consult the original article and its supplementary tables.
Interpretation of the model’s prediction
To interpret the model’s prediction, the words with the highest weight in the attention layers used in the prediction were identified. The words with a relevant weight, that is, at least 2 standard deviations above the mean, were considered high-attention and compared with the originally selected keywords.
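The 2-standard-deviation rule can be illustrated with a minimal sketch. It assumes that each word in a section has a single attention weight and that the mean and standard deviation are taken over the words of that section (population standard deviation); the study may compute them differently. The function name and example values are hypothetical.

```python
from statistics import mean, pstdev


def high_attention_words(words, weights, n_sd=2.0):
    """Return the words whose weight is at least n_sd standard deviations above the mean."""
    threshold = mean(weights) + n_sd * pstdev(weights)
    return [w for w, a in zip(words, weights) if a >= threshold]


# Made-up weights: one word stands out clearly from the rest.
example_words = ["patient", "reports", "worsening", "memory", "over", "the",
                 "past", "year", "per", "daughter"]
example_weights = [0.05, 0.05, 0.05, 0.55, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
selected = high_attention_words(example_words, example_weights)
```

The resulting list can then be compared with the expert keyword list to see which high-attention words were not among the original keywords.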
For the baseline models, on the other hand, sections were represented by term frequency, and the algorithms were trained and tested using cross-validation. Subsequently, the results of the model developed by the research group were compared with those of the four baseline models mentioned above.
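As a rough sketch of this setup, the code below builds a term-frequency representation and generates cross-validation folds using only the standard library. The whitespace tokenization, the contiguous fold layout, and the number of folds are assumptions made for illustration; the classifier itself is omitted.

```python
from collections import Counter


def term_frequency(sections):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in section i."""
    counts = [Counter(s.lower().split()) for s in sections]
    vocab = sorted(set().union(*counts))
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


example_texts = [
    "memory loss noted",
    "no memory complaints",
    "loss of memory and memory lapses",
]
vocab, tf_matrix = term_frequency(example_texts)
folds = list(k_fold_indices(5, 2))
```

In each fold, a baseline classifier would be fitted on the training rows of the matrix and scored on the held-out rows, and the scores averaged across folds.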
Comparison of metrics
The two metrics used for comparison were AUROC (area under the receiver operating characteristic curve) and AUPRC (area under the precision-recall curve).
AUROC is a common evaluation metric for these models, as it summarizes the trade-off between sensitivity and specificity across different classification thresholds. AUPRC is another important metric that provides complementary information for imbalanced data, where the percentage of positive cases is low.
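To make the two metrics concrete, here is a minimal standard-library sketch. AUROC is computed through its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative), and AUPRC is approximated by average precision, which is one common estimator and not necessarily the one used in the study. The labels and scores are made up.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case.

    Tied scores are not handled specially in this sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


example_labels = [1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
example_auroc = auroc(example_labels, example_scores)
example_ap = average_precision(example_labels, example_scores)
```

Both functions assume at least one positive and one negative case; with real data, an established library implementation would be preferable.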
What are the main conclusions of this study of a learning model based on clinical notes?
The main conclusion of this study is that it is possible to make diagnostic predictions of cognitive decline using a model based on clinical notes. These patients could be in the early stages of cognitive decline, which would allow identifying early signals in electronic health records.
The model developed for this purpose was the best predictor for detecting patients who will develop cognitive decline, without relying on structured data. Although the deep learning model was the best, the XGBoost model also showed good predictions, and is proposed as a simpler alternative if the necessary technology is not available.
AUROC and AUPRC metrics
To check these results, the scores obtained in the AUROC and AUPRC metrics can be observed for datasets 1 and 2 (see tables 1 and 2, respectively). It is especially notable that the deep learning-based model is the best predictor on both metrics.
In the case of AUROC, all values are above 0.9, with the deep learning model always performing best. Regarding AUPRC, this is even more evident, as this model is the only one that remains above the value of 0.9.
The differences between these metrics reinforce the consistency of the results, since, while AUROC shows the relationship between true positive rate and false positive rate, AUPRC reflects the relationship between precision and recall.
In imbalanced samples, the AUROC metric can be less conservative with false positives, so the complementary information from AUPRC helps confirm the good performance of this model.
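The effect of class imbalance can be shown with a small calculation. The true positive rate, false positive rate, and prevalence values below are hypothetical and do not come from the study; the function simply applies the definition of precision to those rates.

```python
def precision_at(tpr, fpr, prevalence):
    """Precision implied by a true positive rate, a false positive rate, and a prevalence."""
    true_pos = tpr * prevalence          # share of all cases that are true positives
    false_pos = fpr * (1 - prevalence)   # share of all cases that are false positives
    return true_pos / (true_pos + false_pos)


# Same operating point on the ROC curve, two different prevalences.
balanced = precision_at(tpr=0.9, fpr=0.1, prevalence=0.5)
imbalanced = precision_at(tpr=0.9, fpr=0.1, prevalence=0.05)
```

With the rates held fixed, the ROC curve is unchanged, yet precision drops sharply when positives are rare. This is why AUPRC adds information that AUROC alone does not provide.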
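Table 1. AUROC and AUPRC of each model on dataset 1 (sections with selected keywords).
| Model | AUROC | AUPRC |
| --- | --- | --- |
| Logistic Regression | 0.936 | 0.880 |
| Random Forest | 0.950 | 0.889 |
| Support Vector Machine | 0.939 | 0.883 |
| XGBoost | 0.953 | 0.882 |
| Deep Learning | 0.971 | 0.933 |

Table 2. AUROC and AUPRC of each model on dataset 2 (randomly selected sections).
| Model | AUROC | AUPRC |
| --- | --- | --- |
| Logistic Regression | 0.969 | 0.762 |
| Random Forest | 0.985 | 0.830 |
| Support Vector Machine | 0.954 | 0.723 |
| XGBoost | 0.988 | 0.898 |
| Deep Learning | 0.997 | 0.929 |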
Model performance
Another point highlighted by this study is that note length could affect model performance; however, as long as sections retain sufficient content, section-based classification is shown to be feasible.
Furthermore, this type of model could be applied to other pathologies, although it is important to consider that identifying ambiguous or complex information can be difficult.
Where could NeuronUP contribute in a study like this?
NeuronUP could contribute in various ways to a study like this, as it has extensive experience working with large amounts of data.
As seen in this study, handling large volumes of data is one of the main challenges when working with clinical notes. Therefore, the NeuronUP team, which includes specialists in both the clinical field and data analysis, could make valuable contributions to information processing, either by using keywords or without them.
On the other hand, this study stands out for comparing five different models, which lends robustness to the results obtained for its model. The experience of the NeuronUP team could also be useful in designing a specific model for this purpose, or in creating robust models to compare with the developed model.
Li Zhou. Professor of Medicine at Harvard Medical School for more than ten years, and principal investigator at Brigham and Women’s Hospital. She holds a PhD in Biomedical Informatics from Columbia University, and her research has focused on natural language processing, knowledge management, and support for clinical decision-making. In addition, she has been the principal investigator on numerous research projects funded by AHRQ, NIH, and CRICO/RMF.
Bibliography
- Wang L, Laurentiev J, Yang J, et al. Development and Validation of a Deep Learning Model for Earlier Detection of Cognitive Decline From Clinical Notes in Electronic Health Records. JAMA Netw Open. 2021;4(11):e2135174. doi:10.1001/jamanetworkopen.2021.35174
“This article has been translated. Link to the original article in Spanish:”
Modelo de aprendizaje profundo para la detección temprana del deterioro cognitivo a partir de notas clínicas en historias clínicas electrónicas