Old Age
Language Model Classifier Aligns Better with Physician Word Sensitivity than XGBoost on Readmission Prediction
November 13, 2022 · Entered Twilight · arXiv.org
Repo contents: API.py, LICENSE, README.md, change_words.py, config_template.ymal, looper.py, looper_data.py
Authors
Grace Yang, Ming Cao, Lavender Y. Jiang, Xujin C. Liu, Alexander T. M. Cheung, Hannah Weiss, David Kurland, Kyunghyun Cho, Eric K. Oermann
arXiv ID
2211.07047
Category
cs.CL: Computation & Language
Citations
4
Venue
arXiv.org
Repository
https://github.com/nyuolab/Model_Sensitivity
⭐ 3
Last Checked
3 months ago
Abstract
Traditional evaluation metrics for classification in natural language processing, such as accuracy and area under the curve, fail to differentiate between models with different predictive behaviors despite their similar performance metrics. We introduce the sensitivity score, a metric that scrutinizes models' behaviors at the vocabulary level to provide insights into disparities in their decision-making logic. We assess the sensitivity score on a set of representative words in the test set using two classifiers trained for hospital readmission classification with similar performance statistics. Our experiments compare the decision-making logic of clinicians and classifiers based on rank correlations of sensitivity scores. The results indicate that the language model's sensitivity score aligns better with the professionals than the XGBoost classifier on TF-IDF embeddings, which suggests that XGBoost uses some spurious features. Overall, this metric offers a novel perspective on assessing models' robustness by quantifying their discrepancy with professional opinions. Our code is available on GitHub (https://github.com/nyuolab/Model_Sensitivity).
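The rank-correlation comparison mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes Spearman's rank correlation (the abstract says only "rank correlations"), and the word list and all scores are made-up placeholders rather than values from the paper. The helper names `average_ranks` and `spearman` are hypothetical and do not come from the repository.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Placeholder data (assumption): one score per word, higher = more influential.
words = ["sepsis", "dialysis", "stable", "the"]
clinician = [0.9, 0.8, 0.3, 0.0]
model_a = [0.7, 0.6, 0.2, 0.1]  # orders the words the same way as the clinician
model_b = [0.1, 0.2, 0.6, 0.7]  # orders the words the opposite way

rho_a = spearman(clinician, model_a)
rho_b = spearman(clinician, model_b)
print(f"model_a vs clinician: {rho_a:+.2f}")
print(f"model_b vs clinician: {rho_b:+.2f}")
```

Under this reading, a model whose word-level scores correlate more strongly with the clinicians' ranking would be considered better aligned, even when accuracy and AUC are similar across models.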
Similar Papers
In the same crypt: Computation & Language
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
Old Age
A large annotated corpus for learning natural language inference
Old Age