More and more companies, organizations and even government agencies use artificial intelligence to make predictions and decisions. Accurate predictions alone are often not enough: fairness, accountability and transparency must also be maintained. The model must therefore be interpretable and explainable, i.e. able to explain its reasoning in terms of the components of the problem domain and to offer a justification for its predictions. This post briefly explains some of the methods that provide explanations for the outcomes or behaviour of an artificial intelligence agent.
Post-hoc explanation methods
Some machine learning models are intrinsically interpretable. Decision trees can be read as a set of rules, linear/logistic regression can be interpreted through the (log-)linear relationship between features and outcome, and k-Nearest Neighbours can be explained by pointing to similar examples in the dataset. More complex machine learning models, however, do not offer such built-in explanations. For these models you can use post-hoc explanation methods (which can also be applied to intrinsically interpretable models); they are applied after the model has already been trained. Post-hoc methods can give explanations independently of the underlying machine learning model (model-agnostic) or be model-specific, and they can explain the model as a whole (global) or a single prediction (local). The sketch below illustrates the contrast between an intrinsically interpretable model and a black box.
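As a minimal illustration (assuming scikit-learn and its built-in diabetes dataset; this example is not taken from the cited sources), the sketch below trains a shallow decision tree, whose learned rules can be printed directly, next to a gradient boosting model whose reasoning is not visible and therefore calls for post-hoc explanations.

```python
# Minimal sketch (assumes scikit-learn is installed): an intrinsically
# interpretable decision tree versus a black-box gradient boosting model.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = load_diabetes(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Intrinsically interpretable: the fitted tree is a readable set of rules.
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, y_train)
print(export_text(tree, feature_names=list(X.columns)))

# Black box: often more accurate, but its reasoning is not directly visible,
# so post-hoc explanation methods are needed (see below).
black_box = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
print("tree R^2:", tree.score(X_test, y_test))
print("black-box R^2:", black_box.score(X_test, y_test))
```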
Model-agnostic post-hoc methods
- Partial Dependence Plot: This method fixes one independent variable at a specific value for all instances, averages the resulting predictions, and repeats this over a range of values. It shows the marginal effect of the independent variable on the predicted outcome and can be visualized as a curve (see the sketch after this list).
- Individual Conditional Expectation: This method differs from the Partial Dependence Plot in that the effect of the independent variable is computed for each instance separately rather than averaged. This can uncover heterogeneous effects.
- Local surrogates: This method explains an individual prediction by fitting a simple, interpretable surrogate model (often a weighted linear model, as in LIME) to the black-box predictions for perturbed samples around that instance.
- Global surrogates: This method approximates the predictions of the complex model over the whole dataset with a simpler, interpretable surrogate model, such as a decision tree.
- Shapley values: This computationally expensive method, rooted in cooperative game theory, attributes the prediction for a given instance to the individual independent variables by averaging their contributions over possible feature combinations.
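The sketch below illustrates three of these methods on the same black-box model: a partial dependence/ICE computation, a global surrogate tree with a fidelity score, and a simplified LIME-style local surrogate. It is a minimal illustration assuming scikit-learn, numpy and pandas; the perturbation scale, kernel width and dataset are arbitrary choices, and Shapley values are omitted because they are usually computed with a dedicated library such as the shap package.

```python
# Minimal sketch (assumes scikit-learn, numpy and pandas) of three
# model-agnostic post-hoc explanations for a black-box model.
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
black_box = GradientBoostingRegressor(random_state=0).fit(X, y)

# 1. Partial Dependence / ICE for the "bmi" feature: kind="both" returns the
#    averaged curve (PDP) and one curve per instance (ICE).
pdp = partial_dependence(black_box, X, features=[X.columns.get_loc("bmi")], kind="both")
print("PDP curve (first 5 grid points):", np.round(pdp["average"][0][:5], 1))
print("number of ICE curves:", pdp["individual"][0].shape[0])

# 2. Global surrogate: a shallow decision tree fitted to the black-box
#    predictions; its R^2 against those predictions measures fidelity.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))
print("global surrogate fidelity R^2:", surrogate.score(X, black_box.predict(X)))

# 3. Local surrogate (simplified, LIME-style): perturb one instance, weight the
#    perturbed samples by proximity, and fit a weighted linear model whose
#    coefficients explain that single prediction.
rng = np.random.default_rng(0)
x0 = X.iloc[[0]].to_numpy()
noise = rng.normal(scale=X.std().to_numpy() * 0.3, size=(500, X.shape[1]))
perturbed = pd.DataFrame(x0 + noise, columns=X.columns)
dist2 = np.sum((perturbed.to_numpy() - x0) ** 2, axis=1)
weights = np.exp(-dist2 / dist2.mean())  # closer samples get larger weights
local = LinearRegression().fit(perturbed, black_box.predict(perturbed), sample_weight=weights)
top = sorted(zip(X.columns, local.coef_), key=lambda t: -abs(t[1]))[:3]
print("strongest local effects:", [(name, round(c, 1)) for name, c in top])
```

The fidelity R^2 of the global surrogate indicates how faithfully the simple tree mimics the black box; if it is low, the surrogate's explanation should not be trusted.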
Explainability and choosing intrinsically interpretable models
Usually there is a trade-off between the performance of machine learning models and their complexity/interpretability, and the right balance depends on the situation. If the stakes are high, it may be better to choose an intrinsically interpretable model. Following the principle of parsimony, among candidate models that fit the data and contain all the necessary information, the one with the simplest explanation should be preferred.
References:
Gevaert, C. M. (2022, August). Explainable AI for earth observation: A review including societal and regulatory perspectives. International Journal of Applied Earth Observation and Geoinformation, 112, 102869. https://doi.org/10.1016/j.jag.2022.102869
Molnar, C. (2020). Interpretable Machine Learning. Leanpub.
Ragini, R. (2021, December 10). Principle of Parsimony. Medium. https://medium.com/@ruhi3929/principle-of-parsimony-d510356ca06a