Result: Electronic health record-enhanced signal detection using tree-based scan statistic methods.

Title:
Electronic health record-enhanced signal detection using tree-based scan statistic methods.
Authors:
Russo M; Department of Statistics, The Ohio State University, Columbus, OH, United States.; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States.
Sreedhara SK; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States.
Smith J; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.
Davis SE; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.
Maro JC; Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, United States.
Deramus T; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States.
Lii J; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States.
Yang J; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States.
Desai RJ; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States.
Hernández-Muñoz JJ; Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States.
Ma Y; Office of Biostatistics, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States.
Wang Y; Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States.
Jones JT; Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, United States.
Wang SV; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States.
Source:
American journal of epidemiology [Am J Epidemiol] 2026 Jan 08; Vol. 195 (1), pp. 178-187.
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: Oxford University Press
Country of Publication: United States
NLM ID: 7910653
Publication Model: Print
Cited Medium: Internet
ISSN: 1476-6256 (Electronic)
Linking ISSN: 00029262
NLM ISO Abbreviation: Am J Epidemiol
Subsets: MEDLINE
Imprint Name(s):
Publication: Cary, NC : Oxford University Press
Original Publication: Baltimore, School of Hygiene and Public Health of Johns Hopkins Univ.
Grant Information:
United States Food and Drug Administration
Contributed Indexing:
Keywords: data mining; electronic health records; natural language processing; permutation testing; pharmacoepidemiology; real-world data; tree-based scan statistics
Substance Nomenclature:
0 (Hypoglycemic Agents)
0 (Sulfonylurea Compounds)
0 (Dipeptidyl-Peptidase IV Inhibitors)
Entry Date(s):
Date Created: 20250908 Date Completed: 20260108 Latest Revision: 20260108
Update Code:
20260109
DOI:
10.1093/aje/kwaf199
PMID:
40916726
Database:
MEDLINE

Further Information

Tree-based scan statistics (TBSS) are data mining methods that screen thousands of hierarchically related health outcomes to detect unsuspected adverse drug effects. TBSS traditionally analyze claims data with outcomes defined via diagnosis codes. TBSS have not previously been applied to the rich clinical information in electronic health records (EHR). We developed approaches for integrating EHR data in TBSS analyses, including outcomes derived from natural language processing (NLP) applied to clinical notes and laboratory results, related via multipath hierarchical structures. We consider 4 settings that sequentially add sources of outcomes to the TBSS tree: (1) diagnosis codes, (2) NLP-derived outcomes, (3) binary outcomes from lab results, and (4) continuous lab results. In a comparative cohort study involving second-generation sulfonylureas (SUs) and dipeptidyl peptidase 4 (DPP-4) inhibitors among adults with type 2 diabetes, with an a priori expected signal of hypoglycemia, diagnosis code data showed no statistical alerts for inpatient or emergency department settings. Adding NLP-derived outcomes resulted in an alert for "Headaches" (P = .047), a nonspecific symptom of hypoglycemia. Progressively adding binary and continuous lab results produced the same alert. Integrating EHR data in TBSS analyses can be useful for detecting safety signals that warrant further investigation.
(© The Author(s) 2025. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For commercial re-use, please contact reprints@oup.com for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site; for further information, please contact journals.permissions@oup.com.)
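
Illustrative sketch (not part of the original record): the abstract describes scanning a hierarchy of outcomes in which a leaf may sit under more than one parent, and the keywords mention permutation testing. The Python below is a minimal, hedged illustration of that general idea, not the authors' implementation. It assumes a one-sided Bernoulli log-likelihood ratio with equal-size cohorts (null probability 0.5), a patient-level permutation of exposure labels, and an entirely hypothetical toy tree and dataset; all node names, counts, and function names are invented for the example.

```python
import math
import random

def bernoulli_llr(c, n, p=0.5):
    """One-sided Bernoulli log-likelihood ratio: c of n events are in the exposed group."""
    if n == 0 or c / n <= p:
        return 0.0  # only an excess among the exposed counts as a signal
    llr = c * math.log(c / (n * p))
    if c < n:  # skip the 0 * log(0) term when every event is exposed
        llr += (n - c) * math.log((n - c) / (n * (1 - p)))
    return llr

def leaves_under(node, children, cache):
    """Leaf outcomes reachable from node; a leaf may have several parents (multipath)."""
    if node not in cache:
        kids = children.get(node, [])
        if not kids:
            cache[node] = {node}
        else:
            cache[node] = set().union(*(leaves_under(k, children, cache) for k in kids))
    return cache[node]

def scan(events, exposed, node_leaves, p=0.5):
    """Return the node with the largest LLR and that LLR (the scan statistic)."""
    best_node, best_llr = None, 0.0
    for node, leaves in node_leaves.items():
        in_node = [pid for pid, leaf in events if leaf in leaves]
        c = sum(exposed[pid] for pid in in_node)
        llr = bernoulli_llr(c, len(in_node), p)
        if llr > best_llr:
            best_node, best_llr = node, llr
    return best_node, best_llr

def permutation_p_value(events, exposed, node_leaves, n_perm=999, seed=0):
    """Shuffle exposure labels across patients and compare the maximum LLRs."""
    rng = random.Random(seed)
    _, t_obs = scan(events, exposed, node_leaves)
    ids = list(exposed)
    labels = [exposed[i] for i in ids]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        _, t = scan(events, dict(zip(ids, labels)), node_leaves)
        if t >= t_obs:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Hypothetical tree: "headache" sits under two parents.
children = {
    "root": ["symptoms", "hypoglycemia_related", "injury"],
    "symptoms": ["headache", "nausea"],
    "hypoglycemia_related": ["headache", "sweating"],
    "injury": ["fracture"],
}
# Hypothetical cohort: patients 0-19 exposed, 20-39 comparator.
exposed = {pid: pid < 20 for pid in range(40)}
leaf_patients = {
    "headache": list(range(0, 12)) + [20, 21],
    "nausea": list(range(0, 5)) + list(range(20, 25)),
    "sweating": [12, 13, 14],
    "fracture": [5, 6, 7] + list(range(25, 29)),
}
events = [(pid, leaf) for leaf, pids in leaf_patients.items() for pid in pids]

cache = {}
all_nodes = set(children) | {k for kids in children.values() for k in kids}
node_leaves = {node: leaves_under(node, children, cache) for node in all_nodes}

best_node, t_obs = scan(events, exposed, node_leaves)
p_value = permutation_p_value(events, exposed, node_leaves)
print(best_node, round(t_obs, 3), p_value)
```

Taking the maximum over all nodes and recomputing it in every permutation is what adjusts the P value for the many overlapping outcomes tested. Because each node collects a set of leaves, an event is counted once per node even when its leaf is reachable by several paths.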