
Considerations for Evaluation and Generalization in Interpretable Machine Learning


Part of the book series: The Springer Series on Challenges in Machine Learning ((SSCML))

Abstract

As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanations for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite this interest, there is little consensus on what interpretable machine learning is and how it should be measured and evaluated. In this paper, we discuss definitions of interpretability and describe when interpretability is needed (and when it is not). We then present a taxonomy for rigorous evaluation and offer recommendations for researchers. We end with open questions and concrete problems for new researchers.


Notes

  1. Merriam-Webster dictionary, accessed 2017-02-07.


Acknowledgements

This piece would not have been possible without the dozens of deep conversations about interpretability with machine learning researchers and domain experts. To our friends and colleagues: we appreciate your support. We particularly thank Ian Goodfellow, Kush Varshney, Hanna Wallach, Solon Barocas, Stefan Rüping and Jesse Johnson for their feedback.

Author information


Corresponding author

Correspondence to Finale Doshi-Velez.



Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Doshi-Velez, F., Kim, B. (2018). Considerations for Evaluation and Generalization in Interpretable Machine Learning. In: Escalante, H., et al. Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-98131-4_1


  • DOI: https://doi.org/10.1007/978-3-319-98131-4_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98130-7

  • Online ISBN: 978-3-319-98131-4

  • eBook Packages: Computer Science, Computer Science (R0)
