Result: Large language model based automated translation of natural language to SQL ; Büyük dil modeli tabanlı doğal dilden SQL’e otomatik çeviri

Title:
Large language model based automated translation of natural language to SQL ; Büyük dil modeli tabanlı doğal dilden SQL’e otomatik çeviri
Contributors:
Tek, Faik Boray, Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Doktora Programı, Işık University, School of Graduate Studies, Ph.D. in Computer Engineering
Publisher Information:
Işık Üniversitesi, Lisansüstü Eğitim Enstitüsü
Publication Year:
2025
Collection:
Işık Üniversitesi: DSpace Repository
Document Type:
Dissertation; doctoral or postdoctoral thesis
File Description:
application/pdf
Language:
English
Relation:
Thesis; Kanburoğlu, A. B. (2025). Large language model based automated translation of natural language to SQL. İstanbul: Işık Üniversitesi Lisansüstü Eğitim Enstitüsü.; https://hdl.handle.net/11729/6412; 927714
Rights:
info:eu-repo/semantics/openAccess
Accession Number:
edsbas.EDA291AF
Database:
BASE

More information

Text in English ; Abstract: Turkish and English ; Includes bibliographical references (leaves 66-75) ; xiii, 76 leaves

Text-to-SQL, the task of converting natural language into SQL queries, has advanced significantly, but challenges remain, particularly for low-resource languages such as Turkish. This thesis makes three contributions to address these challenges. The first is the development and open-access release of TUR2SQL, the first cross-domain Turkish Text-to-SQL dataset, which consists of 10,809 natural language sentences paired with their corresponding SQL queries. We evaluate SQLNet, a deep learning model designed specifically for this task, and ChatGPT, one of the most successful Large Language Models (LLMs), on this dataset. The results show that ChatGPT performs better. The second contribution is the construction and public release of TURSpider, the most extensive Turkish Text-to-SQL dataset. TURSpider is built by translating the widely used cross-domain Spider dataset from English to Turkish. It includes complex queries of varying difficulty, which supports the training and comparison of large language models on Turkish Text-to-SQL tasks. Our comparative analysis shows that fine-tuned Turkish LLMs achieve competitive performance, with some models surpassing OpenAI models in query accuracy. To improve performance further, we apply the Chain-of-Feedback (CoF) methodology and demonstrate its effectiveness across multiple models. Finally, we explore the Mixture-of-Agents (MoA) framework, which combines outputs from multiple models to improve the performance of open-source LLMs on Text-to-SQL tasks. By integrating MoA with the CoF technique, we propose MoAF-SQL, an approach that significantly improves performance, particularly on complex queries. Our experiments show that MoAF-SQL achieves competitive results, highlighting its potential to enhance the Text-to-SQL capabilities of open-source ...
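The abstract names two ideas, feeding errors back to a model and combining the outputs of several models, without giving details. The following is only a rough sketch of how such a pipeline could be wired together; it is not the thesis's MoAF-SQL implementation. Everything in it is an assumption made for illustration: the `ogrenci` table, the three stub "agents" (plain functions standing in for LLM calls), the use of the SQLite error message as the feedback signal, and a majority vote over result sets standing in for the aggregation step.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Execute sql; return (rows, None) on success or (None, error text) on failure."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)


def generate_with_feedback(conn, question, model, max_rounds=3):
    """Ask `model` for SQL, passing back the database error until the query runs."""
    feedback, sql = None, None
    for _ in range(max_rounds):
        sql = model(question, feedback)
        rows, error = run_query(conn, sql)
        if error is None:
            return sql, rows
        feedback = error  # hypothetical feedback signal: the raw SQLite error
    return sql, None


def combine_agents(conn, question, models):
    """Return the (sql, rows) pair whose result set most agents agree on."""
    results = [generate_with_feedback(conn, question, m) for m in models]
    valid = [(sql, rows) for sql, rows in results if rows is not None]
    if not valid:
        return None, None
    key = lambda rows: repr(sorted(rows, key=repr))  # order-insensitive fingerprint
    winner = Counter(key(rows) for _, rows in valid).most_common(1)[0][0]
    return next((sql, rows) for sql, rows in valid if key(rows) == winner)


# Hypothetical stub agents; a real system would call LLMs here.
def agent_a(question, feedback):
    # First attempt uses a column that does not exist; fixed after feedback.
    if feedback is None:
        return "SELECT COUNT(*) FROM ogrenci WHERE sehir = 'Ankara'"
    return "SELECT COUNT(*) FROM ogrenci WHERE il = 'Ankara'"


def agent_b(question, feedback):
    return "SELECT COUNT(*) FROM ogrenci WHERE il = 'Ankara'"


def agent_c(question, feedback):
    # Runs without error but ignores the city filter.
    return "SELECT COUNT(*) FROM ogrenci"


# Toy database: students (ogrenci) with a name (ad) and province (il).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ogrenci (ad TEXT, il TEXT)")
conn.executemany(
    "INSERT INTO ogrenci VALUES (?, ?)",
    [("Ayşe", "Ankara"), ("Mehmet", "İzmir"), ("Elif", "Ankara")],
)

# Turkish question: "How many students are in Ankara?"
sql, rows = combine_agents(
    conn, "Ankara'da kaç öğrenci var?", [agent_a, agent_b, agent_c]
)
print(sql, rows)
```

In this toy run the first agent recovers from its wrong column name once the error is fed back, and the vote discards the third agent's unfiltered count. Voting on executed results is only one possible way to combine candidates; the abstract does not say how the thesis aggregates them.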