Abstract
As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanations for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite this interest, there is little consensus on what interpretable machine learning is and how it should be measured and evaluated. In this paper, we discuss a definition of interpretability and describe when interpretability is needed (and when it is not). We then propose a taxonomy for rigorous evaluation and offer recommendations for researchers. We close with open questions and concrete problems for new researchers.
Acknowledgements
This piece would not have been possible without the dozens of deep conversations about interpretability with machine learning researchers and domain experts. To our friends and colleagues: we appreciate your support. We particularly want to thank Ian Goodfellow, Kush Varshney, Hanna Wallach, Solon Barocas, Stefan Rüping and Jesse Johnson for their feedback.