Design Strategy

Learn how to explain an AI system in order to improve the user experience.

Design Strategy Overview

Based on industry and academic research, we propose the following design strategy for explainability in AI. This strategy is the beginning of a conversation; we hope you will experiment, play, use, and break what you see here and send us your feedback so that we can continue to iterate!

01. Users
Who are your users and why do they need an explanation?

Identify the distinct groups of people who are interested in explanations from your AI system and understand the nuances within these groups. There will likely be varying degrees of factors such as domain expertise, self-confidence, attitudes towards AI, and knowledge of how AI works, all of which can influence trust and how people understand the system. For each user group, identify what triggers a need for an explanation as well as the underlying motivations and expectations. 

By identifying and understanding your users, you can ensure the explanation matches their needs and capabilities. There are typically four distinct groups to consider:

User Group

Decision Makers

The people making decisions with AI
WHO

Decision makers are people who use the recommendations of an AI system to make a decision, such as a physician or loan officer.

WHY

Decision makers seek explanations that can build their trust and confidence in the system’s recommendations and possibly provide them with additional insight to improve their future decisions and understanding of the system. These users typically have a high degree of domain expertise, but less tolerance for complex explanations.

User Group

Affected Users

The people affected by the decisions
WHO

Affected users are people who are impacted by the recommendations made by an AI system, such as patients or loan applicants. In some scenarios, a person may be both a decision maker and affected user.

WHY

Affected users seek explanations that can help them understand if they were treated fairly and what factor(s) could be changed to get a different result. These users need the reasons for their outcomes communicated simply and directly, and often have a lower tolerance for both complexity and domain-specific detail.

User Group

Regulatory Bodies

The people checking the system
WHO

Regulatory bodies include government agencies that define and enforce relevant policies, such as the European Union’s General Data Protection Regulation (GDPR). Adopted in 2016, the GDPR includes what is often called a “right to explanation,” requiring that data subjects receive meaningful information about the logic involved in automated decision-making systems.

WHY

Regulatory bodies seek explanations that enable them to ensure decisions are made in a safe and fair manner. Their needs may be satisfied by showing that the overall process, including the training data, is free of negative societal impact. These users may not be able to consume a high level of complexity.

User Group

Internal Stakeholders

The people behind the system
WHO

Internal stakeholders are those who build and deploy an AI system, especially technical individuals such as data scientists and developers.

WHY

Internal stakeholders seek explanations that help them know if the system is working as expected, how to diagnose and improve it, and possibly gain insight from its decisions. They are likely to need and understand a more complex explanation of the system’s inner workings to take action accordingly.

Question Types

Explanations are typically responses to questions, and as such, user needs and triggers for explainability can be written in the form of questions. The following questions are based on common types of information that AI systems can present to users. 

02. Context
When do users need an explanation?

A mental model is a person’s understanding of a system and how it works. Mental models help people set expectations of AI system capabilities, constraints, and value. Expectations impact user satisfaction, behavior, and acceptance of an AI system. When a person’s mental model does not match how the system actually works, it often leads to frustration, misuse, and even product abandonment. 

As a result, mental models play a key role in calibrating trust in human-AI interaction. Mental models can change as a person interacts with an AI system; therefore, the need for an explanation should be contextualized within the phase of a user’s experience. Within each phase, consider existing mental models and how to calibrate them accordingly.

03. Methods
What kind of explanation should be used?

Technical Considerations

When selecting an explainability method, it is important to consider the type of AI system you are explaining (supervised vs. unsupervised) as well as the relationship between the AI system’s prediction and its explanation. A model’s complexity is inversely related to its explainability: the more complex the model, the more difficult it is to interpret and explain. There are two main relationships between an AI system and an explainability approach:

  • Ante-hoc Approaches

    Ante-hoc approaches use the same model for predictions and explanations. These approaches are thought to provide full transparency and are typically model-specific because they are designed for and only applicable to a specific model.

  • Post-hoc Approaches

    Post-hoc approaches use a different model to reverse engineer the inner workings of the original model and provide explanations. These approaches are thought to shed light on the black box of complex models and are typically model-agnostic because they are designed to work with any type of model. A minimal sketch contrasting the two approaches follows this list.
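
To make the distinction concrete, here is a minimal sketch in Python using scikit-learn (the dataset and models are our own illustrative choices, not something this strategy prescribes). A logistic regression serves as an ante-hoc explanation because its coefficients are the explanation, while a shallow decision tree trained to mimic a random forest serves as a post-hoc, model-agnostic surrogate.

```python
# Minimal sketch: ante-hoc vs. post-hoc explanation (illustrative only).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Ante-hoc: the same interpretable model makes and explains its predictions.
# Its fitted coefficients describe how each feature moves the output.
ante_hoc = LogisticRegression(max_iter=5000).fit(X, y)
coefficients = ante_hoc.coef_[0]  # one weight per feature

# Post-hoc: a complex "black box" is explained by a second, simpler model
# trained to mimic its predictions (a global surrogate).
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))  # learn the black box's behavior

# The shallow tree's rules now act as a model-agnostic, global explanation.
```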

Methods

A common way to categorize explanation methods is by scope: global or local. Global (general system) explanations describe how the system behaves overall, while local (specific output) explanations give the rationale behind an individual output. There is a promising line of work focused on combining the strengths of both, suggesting that a hybrid approach could support a human-in-the-loop workflow.

Popular explanation methods for supervised machine learning are shown below.


Global Explanations

Global explanations help users understand and evaluate the system.


Local Explanations

Local explanations help users examine individual cases, which can help identify fairness discrepancies and calibrate trust on a case-by-case basis. The sketch below illustrates one simple way to produce such an explanation.
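
As a simple illustration (our own, not a method this strategy prescribes), the following sketch produces a local explanation for a single case by perturbing one feature at a time and recording how the model’s predicted probability shifts. In practice, dedicated libraries such as LIME or SHAP offer more principled local explanations.

```python
# Minimal, hand-rolled local explanation via single-feature perturbation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

instance = X[0].copy()                        # the individual case to explain
baseline = model.predict_proba([instance])[0, 1]

contributions = {}
for i, name in enumerate(data.feature_names):
    perturbed = instance.copy()
    perturbed[i] = X[:, i].mean()             # neutralize this one feature
    shifted = model.predict_proba([perturbed])[0, 1]
    contributions[name] = baseline - shifted  # how much the feature mattered here

# The largest absolute shifts form the local explanation for this case.
top = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:5]
print(top)
```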

Characteristics

  • Fidelity

    Explanation fidelity includes soundness and completeness. Soundness refers to how truthful an explanation is with respect to the underlying predictive model. Completeness measures how well an explanation generalizes; in other words, to what extent it covers the underlying predictive model. A minimal sketch of measuring soundness follows this list.

  • System Interaction

    Explanations can be static or interactive. Interactive explanations accommodate a wider array of user needs and expectations. Some examples of interactivity include explanations that are reversible, that collect and respond to user feedback, and that allow adjustment of granularity.

  • Format

    Explanations can be delivered as a summarization (typically with statistics), visualization, text, formal argumentation, or a mixture of the above.
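
Soundness, in particular, lends itself to measurement. One common proxy (assumed here for illustration; the strategy does not mandate a specific metric) is the agreement rate between a post-hoc surrogate and the black-box model it explains, computed on held-out data:

```python
# Minimal sketch: fidelity (soundness) as surrogate/black-box agreement.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, _ = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))  # mimic the black box

# How often does the explanation model agree with the model it explains,
# measured on data held out from both fits?
fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
print(f"Surrogate fidelity: {fidelity:.2%}")
```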

Questions to Explanations

Here are recommended connections between questions and common explanation methods. 

04. Evaluation
How can explanations be assessed and validated?

Assessment

Explainability approaches can be assessed with product teams (UX, product management, and engineering) using the following dimensions:

Functional Requirements
Determine whether a particular approach is suitable for a desired application.
Operational Requirements
Determine how users interact with an explainable system and what is expected of them.
Usability Requirements
Determine the properties of explanations that are important from a user’s point of view.
Safety Requirements
Determine the effect of explainability on robustness, security, and privacy aspects of predictive systems.

Validation

Explainability approaches should also be validated, with the audience and task chosen according to what is being tested. For example, application-grounded validation is best suited for testing concrete applications of an explanation, while human-grounded validation is best suited for testing more general notions of explanation quality.

Application-grounded evaluation
Validating an explainability approach on a real task with the intended audience.
Human-grounded evaluation
Validating an explainability approach on a simplified task (that represents the same domain) with a lay audience.
Functionally-grounded evaluation
Validating an explainability approach on a proxy task, using a formal measure of explanation quality rather than human subjects.

References

  • Adadi, A., & Berrada, M. (2018). Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access, 6, 52138-52160. 
  • Amershi, S., Inkpen, K., Teevan, J., Kikin-Gil, R., Horvitz, E., Weld, D., … Bennett, P. N. (2019). Guidelines for Human-AI Interaction. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19).
  • Doshi-Velez, F., & Kim, B. (2017). Towards A Rigorous Science of Interpretable Machine Learning. arXiv: Machine Learning.
  • Google. (n.d.). People + AI Guidebook. Retrieved from https://pair.withgoogle.com/chapter/explainability-trust/
  • Hind, M., Wei, D., Campbell, M., Codella, N.C., Dhurandhar, A., Mojsilovic, A., Ramamurthy, K.N., & Varshney, K.R. (2018). TED: Teaching AI to Explain its Decisions. AIES '19.
  • Kocielnik, R., Amershi, S., & Bennett, P.N. (2019). Will You Accept an Imperfect AI?: Exploring Designs for Adjusting End-user Expectations of AI Systems. CHI '19.
  • Liao, Q. V., Gruen, D., & Miller, S. (2020). Questioning the AI: Informing Design Practices for Explainable AI User Experiences. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI '20).
  • Lim, B.Y., & Dey, A.K. (2009). Assessing demand for intelligibility in context-aware applications. Proceedings of the 11th international conference on Ubiquitous computing.
  • Lim, B.Y., Dey, A.K., & Avrahami, D. (2009). Why and why not explanations improve the intelligibility of context-aware intelligent systems. CHI.
  • Norman, D. (1988). The Design of Everyday Things.
  • Sokol, K., & Flach, P.A. (2020). Explainability fact sheets: a framework for systematic assessment of explainable approaches. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency.