Diagnostic codes used in health care billing speed large-scale observational studies, but codes representing suicidal behavior are often omitted from records.
For signs of suicidal behavior in the electronic health record, Cosmin Bejan, PhD, Colin Walsh, MD, MA, and colleagues turned to natural language processing (NLP) of notes written by the patient care team.
As reported in Scientific Reports, the researchers developed machine learning-based NLP queries for suicidal ideation and suicide attempt, using them to process 200 million notes from 3.4 million deidentified records and rank 239,785 records that bore varying indications of these behaviors.
Among the 200 highest-ranked records for each query, manual chart review found 197 cases of suicidal thoughts and 193 cases of attempted suicide. The higher a record’s NLP ranking, the greater its likelihood of containing a diagnosis code for suicidal behavior.
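The chart-review counts above translate into the share of top-ranked records that were confirmed as true cases, a quantity often called precision at k. The short Python sketch below only reproduces that arithmetic from the figures reported here; the function name `precision_at_k` and the variable names are illustrative assumptions, not anything taken from the study's code.

```python
def precision_at_k(confirmed: int, reviewed: int) -> float:
    """Fraction of manually reviewed top-ranked records confirmed as true cases."""
    if reviewed <= 0:
        raise ValueError("reviewed must be a positive count")
    if not 0 <= confirmed <= reviewed:
        raise ValueError("confirmed must be between 0 and reviewed")
    return confirmed / reviewed


# Counts reported in the article: 200 records reviewed per query.
ideation_precision = precision_at_k(confirmed=197, reviewed=200)
attempt_precision = precision_at_k(confirmed=193, reviewed=200)

print(f"Suicidal ideation: {ideation_precision:.1%}")  # 98.5%
print(f"Suicide attempt:   {attempt_precision:.1%}")   # 96.5%
```

These percentages describe only the 200 highest-ranked records for each query; they say nothing about accuracy further down the ranked list.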
Adding NLP helps identify suicidal behavior cases and could be used to directly improve real-time risk prediction algorithms, the authors conclude.
Other study authors include Michael Ripperger, Drew Wilimitis, Ryan Ahmed, MD, MBA, JooEun Kang, PhD, Katelyn Robinson, Theodore Morley and Douglas Ruderfer, PhD. The study was supported by the National Institutes of Health (MH121455).