Title:
Transforming Medical Data Access: The Role and Challenges of Recent Language Models in SQL Query Automation.
Authors:
Tanković, Nikola1 (AUTHOR) robert.sajina@unipu.hr, Šajina, Robert1 (AUTHOR), Lorencin, Ivan1 (AUTHOR) ivan.lorencin@unipu.hr
Source:
Algorithms. Mar2025, Vol. 18 Issue 3, p124. 24p.
Database:
Academic Search Index

More information

Generating accurate SQL queries from natural language is critical for enabling non-experts to interact with complex databases, particularly in high-stakes domains like healthcare. This paper presents an extensive evaluation of state-of-the-art large language models (LLMs), including LLaMA 3.3, Mixtral, Gemini, Claude 3.5, GPT-4o, and Qwen, for transforming medical questions into executable SQL queries using the MIMIC-3 and TREQS datasets. Our approach employs LLMs with various prompts across 1000 natural language questions. The experiments are repeated multiple times to assess performance consistency, token efficiency, and cost-effectiveness. We explore the impact of prompt design on model accuracy through an ablation study, focusing on the role of table data samples and one-shot learning examples. The results highlight substantial trade-offs among accuracy, consistency, and computational cost across the models. This study also underscores the limitations of current models in handling medical terminology and provides insights to improve SQL query generation in the healthcare domain. Future directions include implementing retrieval-augmented generation (RAG) pipelines based on embeddings and reranking models, integrating ICD taxonomies, and refining evaluation metrics for medical query performance. By bridging these gaps, language models can become reliable tools for medical database interaction, enhancing accessibility and decision-making in clinical settings. [ABSTRACT FROM AUTHOR]