AmazonQA: A Review-Based Question Answering Task
August 12, 2019 · Entered Twilight · International Joint Conference on Artificial Intelligence
"Last commit was 5.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .gitattributes, .gitignore, README.md, notebooks, paper, requirements.txt, src
Authors
Mansi Gupta, Nitish Kulkarni, Raghuveer Chanda, Anirudha Rayasam, Zachary C Lipton
arXiv ID
1908.04364
Category
cs.CL: Computation & Language
Cross-listed
cs.IR
Citations
78
Venue
International Joint Conference on Artificial Intelligence
Repository
https://github.com/amazonqa/amazonqa
⭐ 113
Last Checked
2 months ago
Abstract
Every day, thousands of customers post questions on Amazon product pages. After some time, if they are fortunate, a knowledgeable customer might answer their question. Observing that many questions can be answered based upon the available product reviews, we propose the task of review-based QA. Given a corpus of reviews and a question, the QA system synthesizes an answer. To this end, we introduce a new dataset and propose a method that combines information retrieval techniques for selecting relevant reviews (given a question) and "reading comprehension" models for synthesizing an answer (given a question and review). Our dataset consists of 923k questions, 3.6M answers and 14M reviews across 156k products. Building on the well-known Amazon dataset, we collect additional annotations, marking each question as either answerable or unanswerable based on the available reviews. A deployed system could first classify a question as answerable and then attempt to generate an answer. Notably, unlike many popular QA datasets, here, the questions, passages, and answers are all extracted from real human interactions. We evaluate numerous models for answer generation and propose strong baselines, demonstrating the challenging nature of this new task.
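The abstract describes a two-stage pipeline: an information-retrieval step selects reviews relevant to the question, and a reading-comprehension model then synthesizes an answer from the question and the selected reviews. Below is a minimal sketch of the retrieval stage only, using a simple TF-IDF-style overlap score; the function names and scoring details are illustrative assumptions, not code from the authors' repository.

import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split on non-alphanumeric characters.
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_reviews(question, reviews, top_k=3):
    """Score each review by TF-IDF-weighted overlap with the question
    and return the top_k reviews to use as context for answer generation."""
    tokenized = [tokenize(r) for r in reviews]
    n = len(reviews)
    # Document frequency of each term across this product's reviews.
    df = Counter(t for toks in tokenized for t in set(toks))
    q_terms = set(tokenize(question))

    def score(toks):
        tf = Counter(toks)
        return sum(
            tf[t] * math.log((n + 1) / (df[t] + 1))
            for t in q_terms if t in tf
        )

    ranked = sorted(range(n), key=lambda i: score(tokenized[i]), reverse=True)
    return [reviews[i] for i in ranked[:top_k]]

if __name__ == "__main__":
    reviews = [
        "The battery easily lasts a full day of heavy use.",
        "Screen is bright but the speakers are weak.",
        "Charging takes about two hours from empty.",
    ]
    question = "How long does the battery last?"
    context = rank_reviews(question, reviews, top_k=2)
    # In the paper's setup, the selected reviews would then be passed,
    # together with the question, to a reading-comprehension or
    # sequence-to-sequence model that generates the answer.
    print(context)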
Similar Papers
In the same crypt – Computation & Language
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding – R.I.P. 👻 Ghosted
Language Models are Few-Shot Learners – R.I.P. 👻 Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach – R.I.P. 👻 Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension – R.I.P. 👻 Ghosted