Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation
October 17, 2017 · Entered Twilight · AAAI/ACM Conference on AI, Ethics, and Society
"Last commit was 7.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .gitignore, README.md, distillationsetup.pseudocode, process_chicago_police_ssl.R, process_compas_recidivism.R, process_lending_club_loan.R, process_nypd_stopandfrisk_weapon.R, utils.R
Authors
Sarah Tan, Rich Caruana, Giles Hooker, Yin Lou
arXiv ID
1710.06169
Category
stat.ML: Machine Learning (Statistics)
Cross-listed
cs.AI, cs.LG
Citations
203
Venue
AAAI/ACM Conference on AI, Ethics, and Society
Repository
https://github.com/shftan/auditblackbox
★ 9
Last Checked
2 months ago
Abstract
Black-box risk scoring models permeate our lives, yet are typically proprietary or opaque. We propose Distill-and-Compare, a model distillation and comparison approach to audit such models. To gain insight into black-box models, we treat them as teachers, training transparent student models to mimic the risk scores assigned by black-box models. We compare the student model trained with distillation to a second un-distilled transparent model trained on ground-truth outcomes, and use differences between the two models to gain insight into the black-box model. Our approach can be applied in a realistic setting, without probing the black-box model API. We demonstrate the approach on four public data sets: COMPAS, Stop-and-Frisk, Chicago Police, and Lending Club. We also propose a statistical test to determine if a data set is missing key features used to train the black-box model. Our test finds that the ProPublica data is likely missing key feature(s) used in COMPAS.
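The audit described in the abstract can be sketched end-to-end in a few lines. The following is an illustrative toy, not the authors' implementation (the repository contains R preprocessing scripts, and the paper uses interpretable models as the transparent class): the synthetic data, the hypothetical black-box scorer, the unobserved `hidden` feature, and the choice of a linear model as the transparent model class are all assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic audit data: two observed features, plus a "hidden" feature
# the black box uses but the auditor's data set is missing (assumption
# for the sketch, mirroring the paper's missing-feature scenario).
n = 5000
X = rng.normal(size=(n, 2))
hidden = rng.normal(size=n)

# Hypothetical black-box risk score; the auditor sees only its outputs,
# never the model itself, and does not probe any API.
black_box_scores = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 1.5 * hidden

# Ground-truth outcomes the score is supposed to predict.
y = black_box_scores + rng.normal(scale=0.5, size=n)

def fit_transparent(features, target):
    """Least-squares linear fit: a stand-in transparent model class."""
    A = np.column_stack([features, np.ones(len(features))])
    coefs, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coefs

# 1) Transparent "student" distilled from the black-box scores.
student = fit_transparent(X, black_box_scores)

# 2) Un-distilled transparent model trained on ground-truth outcomes.
outcome_model = fit_transparent(X, y)

# 3) Compare the two transparent models: systematic disagreements
# localize where the black box departs from the outcome data.
print("student coefs:      ", np.round(student, 2))
print("outcome-model coefs:", np.round(outcome_model, 2))

# Fidelity check in the spirit of the paper's missing-feature test:
# high residual variance in mimicking the scores suggests the audit
# data lacks features the black box uses (here, `hidden`).
A = np.column_stack([X, np.ones(n)])
resid = black_box_scores - A @ student
print(f"unexplained score variance: {np.var(resid) / np.var(black_box_scores):.0%}")
```

On this toy data the two transparent models largely agree on the observed features, while roughly a third of the score variance stays unexplained, since the distilled student cannot see `hidden`; the paper's actual test formalizes this intuition with a proper statistical procedure.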
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
Similar Papers
In the same crypt · Machine Learning (Statistics)
Distilling the Knowledge in a Neural Network · R.I.P. 👻 Ghosted
Layer Normalization · R.I.P. 👻 Ghosted
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning · R.I.P. 👻 Ghosted
Domain-Adversarial Training of Neural Networks · R.I.P. 👻 Ghosted