Finding the Conversation in a Haystack: Leveraging AI to Detect Goals-of-Care Documentation
Objectives:
1. Describe a custom AI-based solution for reviewing and tracking goals-of-care conversation documentation and content.
2. Encourage culture change and increase proficiency with AI tools in serious illness care.

At Providence, we encourage goals-of-care (GOC) conversations for seriously ill patients across the healthcare system. However, tracking these conversations is complicated. We developed a prompt-engineered AI model to evaluate clinical notes for narrative GOC documentation. With this strategy, we more accurately detect, report on, and improve GOC conversations.

It is well recognized that patients experiencing serious illness and their families benefit from discussions regarding their wishes and values concerning their health, and that these conversations are a vital component of healthcare quality and equity (1,2). Organizationally, we have implemented system-wide initiatives to improve GOC conversations. However, reliably tracking GOC discussions remains difficult, as details are often written narratively within daily documentation. When prompted to use discrete data entry, clinicians frequently find the workflows restrictive and of less clinical value (3,4). This led us to develop an AI-powered tool to detect GOC conversations written without structured documentation (5).

We built the tool using prompt engineering with OpenAI's ChatGPT. We began with multidisciplinary consensus validation to create annotation guidelines, achieving substantial inter-rater agreement (Fleiss' kappa 0.77). We then developed a Python package for testing across multiple versions of the large language model and tuned the model to maximize specificity and accuracy. We found that GPT-4o (omni) had the highest performance in GOC detection (specificity 0.95, accuracy 0.89, F1 score 0.74), along with significant improvements in efficiency and speed. We also performed subgroup analyses using standard fairness metrics to assess for bias with respect to patients' race and sex, including parity and equalized odds metrics and disparate impact scores. We developed custom programming and extension code to pass note information to Nebula, Epic's cloud platform, where the model operates, and to return results through the predictive modeling architecture. We can then track and report on these data both operationally and within patients' clinical records (6).

We aim to describe this example of innovation, share insights into our build process and lessons learned along the way, and explore further potential AI solutions to improve serious illness care.
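The abstract does not include the production prompt or deployment code, so the sketch below is only a rough illustration of the prompt-engineering approach: a de-identified note is sent to a chat-completion model with instructions to return a yes/no judgment on GOC documentation. The prompt wording, the `classify_note` helper, and the direct OpenAI API call are assumptions for illustration; the Providence model operates inside Epic's Nebula cloud platform rather than through this interface.

```python
# Illustrative sketch only: a prompt-engineered yes/no classifier for GOC
# documentation. The prompt text and helper names are hypothetical; the
# production model runs inside Epic's Nebula platform, not via this API call.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are reviewing clinical notes. Answer YES if the note documents a "
    "goals-of-care (GOC) conversation, for example a discussion of the "
    "patient's wishes, values, or treatment preferences in serious illness. "
    "Otherwise answer NO. Respond with a single word: YES or NO."
)

def classify_note(note_text: str, model: str = "gpt-4o") -> bool:
    """Return True if the model judges the note to contain GOC documentation."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic output for classification
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": note_text},
        ],
    )
    answer = response.choices[0].message.content.strip().upper()
    return answer.startswith("YES")

if __name__ == "__main__":
    example = "Long discussion with patient and family regarding prognosis and wishes..."
    print(classify_note(example))
```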
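The substantial inter-rater agreement reported above (Fleiss' kappa 0.77) is a standard multi-annotator statistic. A minimal sketch of how it can be computed with statsmodels follows; the label matrix is invented for illustration and is not the study data.

```python
# Minimal sketch of Fleiss' kappa for multi-annotator agreement.
# Rows are notes, columns are annotators; 1 = GOC documented, 0 = not.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

labels = np.array([
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
])

# aggregate_raters converts per-rater labels into per-item category counts
table, _categories = aggregate_raters(labels)
print(f"Fleiss' kappa: {fleiss_kappa(table, method='fleiss'):.2f}")
```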
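The reported performance figures (specificity 0.95, accuracy 0.89, F1 score 0.74) correspond to standard binary-classification metrics computed against the annotated gold standard. A sketch of how such metrics might be derived with scikit-learn is shown below; the labels are placeholders, not the study results.

```python
# Sketch of the evaluation metrics reported for GOC detection.
# y_true / y_pred are placeholders standing in for gold-standard annotations
# and model predictions, respectively.
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]   # gold-standard annotations
y_pred = [1, 0, 0, 1, 0, 0, 0, 0, 1, 1]   # model predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)

print(f"specificity: {specificity:.2f}")
print(f"accuracy:    {accuracy_score(y_true, y_pred):.2f}")
print(f"F1 score:    {f1_score(y_true, y_pred):.2f}")
```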
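Similarly, the subgroup fairness checks (parity, equalized odds, and disparate impact) can be expressed as simple group-wise rates. The abstract does not specify which toolkit was used, so the sketch below computes them directly with pandas on placeholder data.

```python
# Sketch of subgroup fairness metrics: selection-rate parity, disparate
# impact, and equalized-odds components, computed on placeholder data.
import pandas as pd

df = pd.DataFrame({
    "group":  ["A", "A", "A", "B", "B", "B", "B", "A"],  # e.g., race or sex
    "y_true": [1, 0, 1, 1, 0, 0, 1, 0],                  # gold-standard label
    "y_pred": [1, 0, 1, 1, 0, 1, 1, 0],                  # model prediction
})

# Selection rate (rate of positive predictions) per group
selection = df.groupby("group")["y_pred"].mean()
print("selection rates:\n", selection)

# Demographic parity difference and disparate impact ratio
print("parity difference:", selection.max() - selection.min())
print("disparate impact:", selection.min() / selection.max())

# Equalized odds: compare true-positive and false-positive rates across groups
tpr = df[df.y_true == 1].groupby("group")["y_pred"].mean()
fpr = df[df.y_true == 0].groupby("group")["y_pred"].mean()
print("TPR by group:\n", tpr)
print("FPR by group:\n", fpr)
```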
References:
1. Paladino J, Koritsanszky L, Nisotel L, et al. Patient and clinician experience of a serious illness conversation guide in oncology: A descriptive analysis. Cancer Med. 2020 Jul;9(13):4550-4560.
2. Hua M, Guo L, Ing C, et al. Specialist Palliative Care Use and End-of-Life Care in Patients With Metastatic Cancer. J Pain Symptom Manage. 2024 May;67(5):357-365.e15.
3. Uyeda AM, Curtis JR, Engelberg RA, et al. Mixed-methods evaluation of three natural language processing modeling approaches for measuring documented goals-of-care discussions in the electronic health record. J Pain Symptom Manage. 2022 Jun;63(6):e713-e723.
4. Lindvall C, Deng CY, Moseley E, et al. Natural Language Processing to Identify Advance Care Planning Documentation in a Multisite Pragmatic Clinical Trial. J Pain Symptom Manage. 2022 Jan;63(1):e29-e36.
5. Lee RY, Brumback LC, Lober WB, et al. Identifying Goals of Care Conversations in the Electronic Health Record Using Natural Language Processing and Machine Learning. J Pain Symptom Manage. 2021 Jan;61(1):136-142.e2.
6. Chua IS, Ritchie CS, Bates DW. Enhancing serious illness communication using artificial intelligence. NPJ Digit Med. 2022 Jan 27;5(1):14.