The aim of this study is to classify requirement sentences in US DOT specifications using natural language processing (NLP) and deep neural networks. At the contract phase of a project, analyzing the requirements in contract documents is an important step in preventing claims or disputes caused by ambiguous or missing clauses, but it is highly labor-intensive, and identifying every requirement within a short period is difficult. This article proposes a requirement sentence identification model based on deep-learning algorithms. First, the critical terms that define a requirement sentence were identified, and all sentences were labeled using these pre-defined terms. Second, three vectorizing methods were used to produce word embeddings: two pre-trained methods (GloVe and Word2Vec) and one self-trained method. Third, automated classification of requirement sentences was tested with three deep-learning models: a convolutional neural network (CNN), a long short-term memory (LSTM) network, and a combined CNN+LSTM model. Across the nine resulting experiments, the CNN model achieved the highest F1 scores, 92.9% and 92.4% with the Word2Vec and GloVe embeddings, respectively. The study shows that a high level of classification accuracy can be achieved with simple deep-learning models and pre-trained embedding models.
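The labeling step and the F1 metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the critical terms (`shall`, `must`, `is required to`), the example sentences, and the helper names `label_sentence` and `f1_score` are all assumptions made for illustration, since the actual term list and data are not given here.

```python
import re

# Hypothetical critical terms; the study's actual term list is not stated here.
CRITICAL_TERMS = ("shall", "must", "is required to")


def label_sentence(sentence, terms=CRITICAL_TERMS):
    """Return 1 if the sentence contains a critical term as a whole word or phrase, else 0."""
    text = sentence.lower()
    return int(any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms))


def f1_score(y_true, y_pred):
    """Compute the F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example sentences, not taken from any real specification.
sentences = [
    "The Contractor shall submit shop drawings before fabrication.",
    "This section describes the general scope of work.",
    "All welds must be inspected by a certified inspector.",
    "Marshall Avenue borders the site to the north.",
]

# Term-based labels act as the reference labels in this sketch.
labels = [label_sentence(s) for s in sentences]

# Pretend output of some classifier, used only to exercise the metric.
predicted = [1, 1, 1, 0]
score = f1_score(labels, predicted)
```

Matching on word boundaries keeps a name such as "Marshall" from being counted as the term "shall". In the study itself, the labeled sentences would then be vectorized and passed to the CNN, LSTM, or CNN+LSTM classifier, which this sketch does not cover.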
| Title of host publication | Lecture Notes in Civil Engineering |
| Number of pages | 9 |
| Publication status | Published - 2021 |
| Name | Lecture Notes in Civil Engineering |
Bibliographical note
Funding Information:
This work was supported by an Institute for Information & Communications Technology Promotion (IITP) Grant funded by the Korea Government (MSIT) (No. 2019-0-01559-001, Digitalizing Construction Project Requirements Using Artificial Intelligence and Natural Language Processing).
All Science Journal Classification (ASJC) codes
- Civil and Structural Engineering