Machine learning models can make errors and be difficult to use. That’s why scientists have developed explanation methods to help users understand when and how they should trust a model’s predictions.
These explanations, however, are often complex, sometimes containing information about hundreds of model features. And they are sometimes presented as multifaceted visualizations that can be difficult to understand for users who lack machine-learning expertise.
To help people make sense of AI explanations, MIT researchers used large language models (LLMs) to transform plot-based explanations into plain language.
They developed a two-part system that converts a machine-learning explanation into a paragraph of human-readable text, then automatically rates the quality of the narrative, so the end user knows whether to trust it.
By providing the system with a few sample explanations, researchers can tailor its narrative descriptions to meet user preferences or the requirements of specific applications.
In the long term, researchers hope to build on this technique by allowing users to ask a model follow-up questions about how it arrived at predictions in real-world settings.
“Our goal with this research was to take the first step toward enabling users to have in-depth conversations with machine-learning models about why they made certain predictions, so they can make better decisions about whether to heed the model,” says Alexandra Zytek, a graduate student in electrical engineering and computer science (EECS) and lead author of a paper on this technique.
She is joined on the paper by Sara Pido, an MIT postdoc; Sarah Alnegheimish, an EECS graduate student; Laure Berti-Équille, a research director at the French National Research Institute for Sustainable Development; and senior author Kalyan Veeramachaneni, a principal research scientist in the Laboratory for Information and Decision Systems. The research will be presented at the IEEE Big Data Conference.
Enlightening explanations
The researchers focused on a popular type of machine learning explanation called SHAP. In a SHAP explanation, a value is assigned to each feature used by the model to make a prediction. For example, if a model predicts real estate prices, one feature might be the location of the house. The location would be assigned a positive or negative value that represents how much that feature changed the overall prediction of the model.
Often, SHAP explanations are presented in the form of bar charts that show which features are most or least important. But for a model with more than 100 features, this bar plot quickly becomes cumbersome.
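For readers unfamiliar with SHAP, the following is a minimal sketch of how such values are typically computed and plotted with the open-source shap library; the dataset and model below are generic placeholders, not the ones used in this work.

```python
# Minimal sketch: computing SHAP values for a house-price model.
# The dataset, model, and features here are placeholders for illustration.
import shap
import xgboost
from sklearn.datasets import fetch_california_housing

# Train a simple regression model on a public housing dataset.
data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target
model = xgboost.XGBRegressor().fit(X, y)

# Each prediction gets one SHAP value per feature: a positive value
# pushed the predicted price up, a negative value pushed it down.
explainer = shap.Explainer(model, X)
shap_values = explainer(X.iloc[:100])

# The usual presentation is a bar chart of the most influential features,
# which becomes unwieldy once a model has hundreds of features.
shap.plots.bar(shap_values)
```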
“As researchers, we have to make many choices about what we will present visually. If we choose to only show the top 10, people might wonder what happened to another feature that isn’t in the plot. Using natural language saves us from having to make these choices,” says Veeramachaneni.
However, rather than using a large language model to generate an explanation from scratch, the researchers use the LLM to transform an existing SHAP explanation into a readable narrative.
By having the LLM handle only the natural-language part of the process, the approach limits the opportunity to introduce inaccuracies into the explanation, Zytek says.
Their system, called EXPLINGO, is divided into two elements that work together.
The first component, called NARRATOR, uses an LLM to create narrative descriptions of SHAP explanations that meet user preferences. By initially providing the NARRATOR with three to five written examples of narrative explanations, the LLM will mimic this style when generating the text.
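As an illustration of this few-shot idea, a NARRATOR-style prompt might be assembled along the lines of the sketch below. The `call_llm` helper and the example narrative are placeholders, not part of the published system.

```python
# Rough sketch of the NARRATOR idea: a few-shot prompt that asks an LLM to
# turn SHAP (feature, value) pairs into a narrative in the user's style.
# `call_llm` is a stand-in for whatever LLM API is used.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM provider here")

EXAMPLE_NARRATIVES = [
    "The predicted price is driven up mainly by the house's location and "
    "size, while its age pulls the estimate down slightly.",
    # ... three to five user-written examples set the style to imitate
]

def narrate(shap_pairs: list[tuple[str, float]]) -> str:
    features = "\n".join(f"{name}: {value:+.3f}" for name, value in shap_pairs)
    examples = "\n".join(f"- {ex}" for ex in EXAMPLE_NARRATIVES)
    prompt = (
        "Convert the SHAP explanation below into a short narrative.\n"
        "Match the style of these example narratives:\n"
        f"{examples}\n\n"
        "SHAP explanation (feature: contribution to the prediction):\n"
        f"{features}\n\n"
        "Describe only what the values show; do not add new claims."
    )
    return call_llm(prompt)
```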
“Rather than asking the user to try to define what kind of explanation they’re looking for, it’s easier to just ask them to write what they want to see,” says Zytek.
This allows NARRATOR to be easily customized for new use cases by showing it a different set of manually written examples.
Once NARRATOR creates a plain-language explanation, the second component, GRADER, uses an LLM to rate the narrative on four metrics: conciseness, accuracy, completeness, and fluency. GRADER automatically prompts the LLM with the text produced by NARRATOR and the SHAP explanation it describes.
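A GRADER-style check could be sketched in the same spirit; the prompt wording and the JSON scoring format below are assumptions for illustration, not the paper's actual implementation.

```python
# Sketch of the GRADER idea: prompt an LLM to score a narrative against the
# SHAP explanation it describes on four metrics.
import json

METRICS = ["conciseness", "accuracy", "completeness", "fluency"]

def grade(narrative: str, shap_pairs: list[tuple[str, float]]) -> dict[str, float]:
    features = "\n".join(f"{name}: {value:+.3f}" for name, value in shap_pairs)
    prompt = (
        "You are checking a plain-language description of a SHAP explanation.\n"
        f"SHAP explanation:\n{features}\n\n"
        f"Narrative:\n{narrative}\n\n"
        f"Rate the narrative from 0 to 1 on each of: {', '.join(METRICS)}.\n"
        'Answer with JSON only, e.g. {"conciseness": 0.9, "accuracy": 1.0, ...}'
    )
    # `call_llm` is the same placeholder as in the NARRATOR sketch above.
    return json.loads(call_llm(prompt))
```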
“We find that even when an LLM makes a mistake while completing a task, it often won’t make a mistake when verifying or validating that task,” she says.
Users can also customize GRADER to assign different weights to each metric.
“One could imagine, in a high-stakes case, weighting accuracy and completeness much higher than fluency, for example,” she adds.
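Combining the four scores with user-chosen weights is straightforward; the sketch below shows one way to do it, with purely illustrative weight values.

```python
# Combine per-metric scores into one number using user-supplied weights.
# The weight values below are illustrative only.
def overall_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    total = sum(weights.values())
    return sum(scores[m] * w for m, w in weights.items()) / total

high_stakes_weights = {
    "accuracy": 0.4, "completeness": 0.4, "conciseness": 0.1, "fluency": 0.1,
}
```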
Analyzing the narratives
For Zytek and her colleagues, one of the biggest challenges was tuning the LLM so that it generated natural-sounding narratives. The more guidelines they added to control style, the more likely the LLM was to introduce errors into the explanation.
“A lot of prompt tuning went into finding and fixing each mistake one at a time,” she says.
To test their system, the researchers took nine machine-learning datasets with explanations and had different users write narratives for each dataset. This allowed them to evaluate NARRATOR’s ability to imitate unique styles. They used GRADER to score each narrative explanation on all four metrics.
Ultimately, the researchers found that their system could generate high-quality narrative explanations and effectively imitate different writing styles.
Their results show that providing a few manually written example explanations significantly improves the narrative style. However, those examples must be written carefully; including comparative words, such as “bigger,” can cause GRADER to mark accurate explanations as incorrect.
Based on these results, the researchers want to explore techniques that could help their system better handle comparative words. They also want to extend EXPLINGO by adding rationalization to explanations.
In the long term, they hope to use this work as a stepping stone to an interactive system where the user can ask a model follow-up questions about an explanation.
“It would help with decision-making in many ways. If people disagree with a model’s prediction, we want them to be able to quickly determine whether their intuition is correct, or the model’s intuition is correct, and where that difference comes from,” says Zytek.