A Comparison of Semantic Similarity Methods for Maximum Human Interpretability

October 21, 2019 · Declared Dead · 🏛 2019 Artificial Intelligence for Transforming Business and Society (AITB)

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Pinky Sitikhu, Kritish Pahi, Pujan Thapa, Subarna Shakya
arXiv ID: 1910.09129
Category: cs.IR (Information Retrieval)
Cross-listed: cs.CL, cs.LG
Citations: 97
Venue: 2019 Artificial Intelligence for Transforming Business and Society (AITB)
Last Checked: 3 months ago
Abstract
Including semantic information in a similarity measure improves its efficiency and provides human-interpretable results for further analysis. A similarity calculation that focuses only on features related to the text's words gives less accurate results. This paper presents three methods that not only focus on the text's words but also incorporate semantic information of the texts in their feature vectors when computing semantic similarity. The methods, which draw on corpus-based and knowledge-based approaches, are: cosine similarity using tf-idf vectors, cosine similarity using word embeddings, and soft cosine similarity using word embeddings. Among the three, cosine similarity using tf-idf vectors performed best in finding similarities between short news texts. The similar texts returned by this method are easy to interpret and can be used directly in other information retrieval applications.
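Since no code accompanies the paper, the three measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The toy documents, the hand-made 3-dimensional `emb` vectors, the smoothed idf formula, the use of a mean word vector for the embedding variant, and the clipping of negative word similarities to zero in `soft_cosine` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors): one tf-idf vector per tokenized document."""
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    # Smoothed idf (an assumption; the paper's exact weighting is not given).
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[w] * idf[w] for w in vocab])
    return vocab, vectors


def cosine(a, b):
    """Plain cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean_embedding(tokens, emb):
    """Average the word vectors of the tokens found in emb (an assumption)."""
    vecs = [emb[w] for w in tokens if w in emb]
    dim = len(next(iter(emb.values())))
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def soft_cosine(a, b, vocab, emb):
    """Soft cosine: cosine under a word-to-word similarity matrix S."""
    def sim(i, j):
        if i == j:
            return 1.0
        wi, wj = vocab[i], vocab[j]
        if wi not in emb or wj not in emb:
            return 0.0
        # Negative similarities are clipped to 0 (an assumption).
        return max(0.0, cosine(emb[wi], emb[wj]))

    idx = range(len(vocab))

    def form(x, y):
        # Bilinear form x^T S y, skipping zero entries.
        return sum(sim(i, j) * x[i] * y[j]
                   for i in idx for j in idx if x[i] and y[j])

    denom = math.sqrt(form(a, a)) * math.sqrt(form(b, b))
    return form(a, b) / denom if denom else 0.0


# Hypothetical toy data: three short "news" texts and made-up embeddings.
docs = [["stocks", "rise"], ["shares", "climb"], ["rain", "falls"]]
emb = {
    "stocks": [1.0, 0.1, 0.0], "shares": [0.9, 0.2, 0.0],
    "rise": [0.1, 1.0, 0.0], "climb": [0.2, 0.9, 0.0],
    "rain": [0.0, 0.0, 1.0], "falls": [0.0, 0.1, 1.0],
}

vocab, vecs = tfidf_vectors(docs)
tfidf_sim = cosine(vecs[0], vecs[1])
embed_sim = cosine(mean_embedding(docs[0], emb), mean_embedding(docs[1], emb))
soft_sim = soft_cosine(vecs[0], vecs[1], vocab, emb)
print(tfidf_sim, embed_sim, soft_sim)
```

In this toy setup the first two texts share no words, so the tf-idf cosine sees no overlap while the two embedding-based measures do. That only shows how the measures differ mechanically; it says nothing about which one performs best on real short news texts, which is the question the paper evaluates.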