Improving Machine Reading Comprehension with General Reading Strategies

October 31, 2018 · Entered Twilight · 🏛 North American Chapter of the Association for Computational Linguistics

"Last commit was 6.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: README.md, code

Authors Kai Sun, Dian Yu, Dong Yu, Claire Cardie arXiv ID 1810.13441 Category cs.CL: Computation & Language Citations 118 Venue North American Chapter of the Association for Computational Linguistics Repository https://github.com/nlpdata/strategy/ ⭐ 37 Last Checked 2 months ago

Abstract

Reading strategies have been shown to improve comprehension levels, especially for readers lacking adequate prior knowledge. Just as the process of knowledge accumulation is time-consuming for human readers, it is resource-demanding to impart rich general domain knowledge into a deep language model via pre-training. Inspired by reading strategies identified in cognitive science, and given limited computational resources -- just a pre-trained model and a fixed number of training instances -- we propose three general strategies aimed to improve non-extractive machine reading comprehension (MRC): (i) BACK AND FORTH READING that considers both the original and reverse order of an input sequence, (ii) HIGHLIGHTING, which adds a trainable embedding to the text embedding of tokens that are relevant to the question and candidate answers, and (iii) SELF-ASSESSMENT that generates practice questions and candidate answers directly from the text in an unsupervised manner. By fine-tuning a pre-trained language model (Radford et al., 2018) with our proposed strategies on the largest general domain multiple-choice MRC dataset RACE, we obtain a 5.8% absolute increase in accuracy over the previous best result achieved by the same pre-trained model fine-tuned on RACE without the use of strategies. We further fine-tune the resulting model on a target MRC task, leading to an absolute improvement of 6.2% in average accuracy over previous state-of-the-art approaches on six representative non-extractive MRC datasets from different domains (i.e., ARC, OpenBookQA, MCTest, SemEval-2018 Task 11, ROCStories, and MultiRC). These results demonstrate the effectiveness of our proposed strategies and the versatility and general applicability of our fine-tuned models that incorporate these strategies. Core code is available at https://github.com/nlpdata/strategy/.