Treffer: Effective approaches to combining lexical and syntactical information for code summarization.
Weitere Informationen
Summary: Natural language summaries of source codes are important during software development and maintenance. Recently, deep learning based models have achieved good performance on the task of automatic code summarization, which encode token sequence or abstract syntax tree (AST) of code with neural networks. However, there has been little work on the efficient combination of lexical and syntactical information of code for better summarization quality. In this paper, we propose two general and effective approaches to leveraging both types of information: a convolutional neural network that aims to better extract vector representation of AST node for downstream models; and a Switch Network that learns an adaptive weight vector to combine different code representations for summary generation. We integrate these approaches into a comprehensive code summarization model, which includes a sequential encoder for token sequence of code and a tree based encoder for its AST. We evaluate our model on a large Java dataset. The experimental results show that our model outperforms several state‐of‐the‐art models on various metrics, and the proposed approaches contribute a lot to the improvements. [ABSTRACT FROM AUTHOR]
Copyright of Software: Practice & Experience is the property of Wiley-Blackwell and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)