Generalization challenges in semantic parsing
Semantic parsing is the task of translating natural language utterances into machine-interpretable programs, which can be executed against a real-world environment to obtain desired responses (e.g., a SQL query against a relational database). It is an established paradigm for building natural language interfaces. However, most existing semantic parsing systems are built only for the conventional yet limited setting, i.e., in the context of a single fixed database. Moreover, they are typically data-hungry, i.e., they require a large number of examples for training. This thesis focuses on extending the limited setting to a diverse spectrum of settings inspired by real-life scenarios, and on addressing the generalization challenges that arise during such extensions. We consider three aspects of semantic parsing along which real-life scenarios deviate from the conventional setting. Firstly, we consider transferability, a property indicating whether a semantic parser is applicable to unseen domains (e.g., unseen databases), which leads to the cross-domain setting. Secondly, we consider three forms of supervision. Apart from standard, yet expensive, utterance-program pairs, we investigate settings where cheap supervision in the form of utterance-response pairs is given, or where no supervision is available at all. Thirdly, we study linguistic coverage, i.e., the extent to which a semantic parser can cover the space of utterances. In practice, it is operationalized by investigating the degree to which a parser can generalize to unseen utterances that are combinations of known fragments (e.g., phrases), which reflects the generative nature of natural languages. For example, we ask how well a parser can generalize to long utterances if it is exposed only to short ones during training.
From a machine learning perspective, the practical settings introduced above can be formulated as two kinds of generalization challenges: settings driven by transferability and linguistic coverage require out-of-distribution generalization, as test data in such ...