Paraphrase Identification with Deep Learning: A Review of Datasets and Methods

December 13, 2022 ยท The Cartographer ยท ๐Ÿ› IEEE Access

๐Ÿ“š THE CARTOGRAPHER: The Cartographer
Survey/review paper โ€” maps the landscape rather than implementing a method.

"No code URL or promise found in abstract"
"Title-pattern auto-detect: Paraphrase Identification with Deep Learning: A Review of Datasets and Methods"

Evidence collected by the PWNC Scanner

Authors Chao Zhou, Cheng Qiu, Lizhen Liang, Daniel E. Acuna arXiv ID 2212.06933 Category cs.CL: Computation & Language Cross-listed cs.AI, cs.IR Citations 37 Venue IEEE Access Last Checked 2 days ago
Abstract
The rapid progress of Natural Language Processing (NLP) technologies has led to the widespread availability and effectiveness of text generation tools such as ChatGPT and Claude. While highly useful, these technologies also pose significant risks to the credibility of various media forms if they are employed for paraphrased plagiarism -- one of the most subtle forms of content misuse in scientific literature and general text media. Although automated methods for paraphrase identification have been developed, detecting this type of plagiarism remains challenging due to the inconsistent nature of the datasets used to train these methods. In this article, we examine traditional and contemporary approaches to paraphrase identification, investigating how the under-representation of certain paraphrase types in popular datasets, including those used to train Large Language Models (LLMs), affects the ability to detect plagiarism. We introduce and validate a new refined typology for paraphrases (ReParaphrased, REfined PARAPHRASE typology definitions) to better understand the disparities in paraphrase type representation. Lastly, we propose new directions for future research and dataset development to enhance AI-based paraphrase detection.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Computation & Language

๐ŸŒ… ๐ŸŒ… Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL ๐Ÿ› NeurIPS ๐Ÿ“š 166.0K cites 8 years ago