The problem of Interpretability in machine learning models

29 September 2020

In recent years, machine learning technologies have been growing in popularity across a wide variety of fields. Vast amounts of data allow data scientists to explore questions that would be impossible to answer accurately without such technologies. Some of these questions can be highly important and sensitive – for example, deciding whether a loan applicant will default, or whether an applicant should be considered for a job. A variety of models can predict such outcomes, but are their predictions generalizable and free of biases?

A classic example of this is a neural network classification model that was built to identify whether a given picture contains a husky or a wolf. Surprisingly, the model was performing with high accuracy and predicting the correct animals. However, researchers quickly realized that something was wrong with it: the model was basing its decisions purely on one aspect – whether the picture contains snow. That is a fairly logical shortcut, as pictures of wolves do indeed usually have snow in them, but given this explanation of its reasoning, I doubt that anyone would want to apply such a biased model in a real-life situation.

In making predictions, there is often a trade-off between interpretability and model accuracy. When a linear model predicts an outcome, the prediction is just the input variables multiplied by different weights – these can be easily explained, as the short sketch below illustrates. However, when using a more advanced model, such as a gradient boosting classifier or a neural network, this interpretation becomes complicated.
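
To make that concrete, here is a minimal sketch of reading a linear model's "explanation" straight off its weights with scikit-learn. The loan features and data are entirely made up for illustration:

```python
# Minimal sketch (made-up loan data): a linear model's explanation is just its weights.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["debt_ratio", "age", "missed_payments"]
X = np.array([[0.2, 35, 0], [0.9, 22, 4], [0.4, 48, 1], [0.8, 30, 3]], dtype=float)
y = np.array([0, 1, 0, 1])  # 1 = defaulted

model = LogisticRegression().fit(X, y)

# Each feature's contribution to the score is simply weight * value.
applicant = X[1]
for name, weight, value in zip(feature_names, model.coef_[0], applicant):
    print(f"{name}: {weight:+.3f} * {value:g} = {weight * value:+.3f}")
```
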
In recent years, however, there has been some effort to tackle this lack of interpretability; one example is the so-called Local Interpretable Model-agnostic Explanations (LIME). Without going too deep into the theoretical details (I will provide some links for further information), LIME is essentially a technique that helps explain the reasoning behind any model's particular prediction, locally – meaning that it only analyzes the model's behaviour in a small region around that one prediction.
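
To make "locally" a bit more concrete, below is a toy, from-scratch version of the idea – not the actual LIME library, and with entirely made-up data: perturb a single observation, ask the black-box model to score the perturbations, weight them by how close they are to the original, and fit a small linear model on that neighbourhood.

```python
# Toy local surrogate in the spirit of LIME (illustrative only, not the real library).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = ((X[:, 0] * X[:, 1] + X[:, 2] ** 2) > 0.5).astype(int)    # a deliberately non-linear target
black_box = GradientBoostingClassifier().fit(X, y)            # the hard-to-interpret model

instance = X[0]                                               # the single prediction we want to explain
perturbed = instance + rng.normal(scale=0.3, size=(1000, 3))  # sample a neighbourhood around it
scores = black_box.predict_proba(perturbed)[:, 1]             # black-box outputs for that neighbourhood
distances = np.linalg.norm(perturbed - instance, axis=1)
weights = np.exp(-(distances ** 2) / 0.5)                     # closer perturbations count more

surrogate = Ridge().fit(perturbed, scores, sample_weight=weights)
print("local feature effects:", surrogate.coef_)              # readable, but only valid near `instance`
```

The surrogate's coefficients are only meaningful near that one instance – step far away from it and a new surrogate has to be fitted.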

In practice, LIME allows us to understand predictions by visualizing the features of a given observation that had the most impact on that particular prediction. Returning to our wolf/husky identifier – in a given picture, LIME will highlight the snow that is (or is not) there, thereby explaining the behaviour of that particular model.
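
With the reference lime package, the image case looks roughly like the sketch below; `image` and `predict_fn` are placeholders standing in for a real photo and a real husky/wolf classifier, and exact arguments may differ between package versions.

```python
# Rough sketch with the lime package's image explainer. `image` and `predict_fn`
# are placeholders standing in for a real photo and a real husky/wolf classifier.
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

image = np.random.rand(64, 64, 3)        # placeholder RGB image in [0, 1]

def predict_fn(images):                  # placeholder classifier: batch of images -> class probabilities
    p = np.random.rand(len(images), 1)
    return np.hstack([p, 1 - p])

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(image, predict_fn,
                                         top_labels=2, num_samples=100)

# Highlight the superpixels (e.g. the snow) that pushed the prediction the most.
img, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                           positive_only=True, num_features=5)
highlighted = mark_boundaries(img, mask)
```
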

The importance of this is that it allows data scientists to better evaluate their models and their applicability. When a model predicts a loan applicant as ‘likely to default’ and a bank decides to deny that loan, managers will be able to explain to the person the exact reasons. In critical situations, we cannot simply use black boxes that give us answers to our simplified questions – we also need to know the reasoning.
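
For the loan scenario, those reasons can be read directly out of a LIME explanation. Here is a hedged sketch with the lime package's LimeTabularExplainer – the features, data, and model are invented for illustration:

```python
# Sketch of turning a 'likely to default' prediction into concrete reasons
# with LimeTabularExplainer (hypothetical features and model).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

feature_names = ["income", "debt_ratio", "missed_payments", "loan_amount"]
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4))
y = (X[:, 1] + X[:, 2] - X[:, 0] > 0.5).astype(int)   # 1 = default (toy rule)
model = RandomForestClassifier().fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["repaid", "default"], mode="classification")
applicant = X[0]
explanation = explainer.explain_instance(applicant, model.predict_proba, num_features=4)

# The exact reasons behind this applicant's score, as (condition, weight) pairs.
for reason, weight in explanation.as_list():
    print(f"{reason}: {weight:+.3f}")
```

Each (condition, weight) pair is a human-readable reason – for example, a high debt ratio pushing the prediction towards ‘default’ – which is exactly what a manager could relay back to the applicant.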

Sources/Further information:
Video: Interpretable Machine Learning Using LIME Framework – Kasia Kulma (PhD), Data Scientist, Aviva
Article: Guide to Interpretable Machine Learning

2 thoughts on “The problem of Interpretability in machine learning models”

  1. Dear Jokubas,

    Thank you so much for the very interesting read. I have always been skeptical about the implementation of machine learning algorithms in life-changing decision making. Machine learning has very useful applications for repetitive workflows and processes, or for standardizable tasks. However, I think it might not always be the best option when we talk about medical insurance, bank mortgages, or recruiting processes. Algorithms might have hidden biases derived from the data provided to train the system. For instance, if past recruiters have been biased towards white male candidates, so will the algorithm be. It is very hard to change the way machines think, as they are trained on millions of past data points. I do believe humans are better for case-by-case judgment where a non-standardized process is needed.

    The LIME framework has very good potential to solve this problem. However, I believe it still has two major drawbacks. Firstly, the LIME framework can only be applied to linear models; for datasets that involve complex, non-interpretable models, non-linearity might occur even in the local region. Nevertheless, it is exactly those models for which we would need more detailed explanations. Secondly, the type of modifications that need to be performed on the data to get proper explanations depends on the specific dataset and case that we are analyzing. In most cases, simple modifications are not enough. Ideally, the changes would be driven by the variation observed in the dataset. Manually steering the perturbations is not a good idea, as it would most likely introduce bias into the model explanations (going back to the above-mentioned issue).

    LIME is a great framework for explaining what machine learning classifiers are doing. It is model-agnostic and leverages a simple, understandable idea. Moreover, it does not require a lot of effort to run. Still, I believe that human interpretation of the output is needed when using LIME.

    1. Hi Beatrice,
      Thank you very much for the insightful comment!
      I definitely agree that ML models are still far from making autonomous decisions in critical areas (recruitment, mortgage, insurance, etc.).
      However, I believe you are not entirely accurate regarding how LIME works. Linear models are in most cases explainable just by inspecting the weights they assign to each of the variables (unless it's a very high-dimensional dataset) – those weights are the explanation. LIME itself is essentially a linear model, applied to a dataset that is generated around a single observation of a more complex ‘black-box’ model. That is the assumption under which it was created – that a linear model is able to explain the behavior of a complex model, locally.
      You are right about the drawbacks – in some cases, even locally these models are non-linear and too complex to explain. I do believe, though, that LIME is very useful in identifying the biases that a model may have.
