
Counterfactuals: Using complex mathematics to build better recommendation algorithms

by admin

A novel machine learning (ML) model developed by a research team at the music streaming service Spotify captures, for the first time, the complex mathematics of so-called counterfactuals: a technique for determining the causes of past events and predicting the effects of future ones.

The ML model, described earlier this year in the journal Nature Machine Intelligence, could improve the accuracy of automated decision-making — particularly when it comes to personalized recommendations — in a range of applications from music services to finance to healthcare.

The basic idea behind counterfactuals is the question of what would have happened in a given situation if certain parameters had been different. In practice, this is like “rewinding” the world, changing a few crucial details, and then pressing play to see what happens now. If the setup is chosen correctly, real causality can be distinguished from correlation and coincidence.
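The “rewind, change, replay” intuition corresponds to Judea Pearl’s three-step recipe for evaluating counterfactuals: abduction, action, prediction. The toy model below is purely illustrative (the mechanism and all numbers are invented, not taken from the Spotify paper), but it shows the recipe in its simplest form.

```python
# Toy structural causal model: outcome Y = T + U, where T is an observed
# treatment and U is unobserved background noise. All values are invented.

def abduct(t_obs, y_obs):
    # Step 1 (abduction): infer the hidden background noise from what we saw.
    return y_obs - t_obs  # in this toy model, U = Y - T

def predict_counterfactual(t_obs, y_obs, t_new):
    # Step 2 (action): "rewind" the world and set T to a different value.
    # Step 3 (prediction): replay the same mechanism with the same noise.
    u = abduct(t_obs, y_obs)
    return t_new + u

# Observed: treatment 1 produced outcome 5. What if treatment had been 0?
print(predict_counterfactual(t_obs=1, y_obs=5, t_new=0))  # -> 4
```

Because the background noise is carried over unchanged, the fictional world differs from the real one only in the detail we chose to rewrite.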

“Understanding cause and effect is very important for decision-making,” says Ciaran Gilligan-Lee, head of Spotify’s Causal Inference Research Lab, who helped develop the model. “You want to understand what impact a decision you are currently making will have on the future.”

In the case of a music service, that could mean deciding which songs to recommend, or when it makes sense for an artist to release a new album. Spotify doesn’t currently use the technology, says Gilligan-Lee. “But it could help answer questions we deal with every day.” Counterfactuals are intuitive: people often form a picture of the world by imagining how things would have turned out if this had happened instead of that. Mathematically, however, the concept is monstrously difficult.

“Counterfactuals are very strange statistical objects,” says Gilligan-Lee. “You have to reason about strange things: you ask about the probability of something happening, given that it didn’t happen.” Gilligan-Lee and his co-authors met after reading about each other’s work in an article in MIT Technology Review. Their counterfactual model is based on so-called twin networks, which were invented by the computer scientists Andrew Balke and Judea Pearl in the 1990s. Pearl received the 2011 Turing Award, considered the Nobel Prize of computer science, for his work on causal reasoning and artificial intelligence.

Gilligan-Lee says Pearl and Balke used twin networks to work through a handful of simple examples, but applying the mathematical framework to larger, more complex real-world cases is difficult. This is where machine learning comes into play. Twin networks treat a counterfactual as a pair of probabilistic models: one represents the actual world, the other the fictional one. The two models are linked in such a way that the actual-world model constrains the fictional-world model to stay the same in every respect, except for the facts you want to change.
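The linkage between the two worlds can be sketched concretely: both copies of the model share the same exogenous noise variables, and only the intervened-upon variable is set by hand. The structural equations below are invented for illustration, not drawn from Balke and Pearl’s examples or the Spotify paper.

```python
import random

# Minimal sketch of a twin network: a factual and a counterfactual copy of
# the same causal model, sharing exogenous noise so the fictional world is
# identical except for the intervention. Mechanisms here are invented.

def sample_twin(intervention, seed=0):
    rng = random.Random(seed)
    # Shared exogenous noise: the "link" that ties the two worlds together.
    u_t = rng.random()
    u_y = rng.random()
    # Factual world: treatment follows its natural mechanism.
    t_factual = 1 if u_t > 0.5 else 0
    y_factual = t_factual + u_y
    # Counterfactual world: treatment is set by hand, noise is reused.
    t_counterfactual = intervention
    y_counterfactual = t_counterfactual + u_y
    return y_factual, y_counterfactual

y_real, y_fictional = sample_twin(intervention=0)
# Both outcomes share u_y, so their difference isolates the effect of treatment.
```

Because the noise terms are shared rather than resampled, comparing the two outcomes answers a counterfactual question about the same individual, not a fresh statistical question about the population.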

Gilligan-Lee and his colleagues used twin networks as the template for a neural network and then trained it to make predictions about how events would play out in the fictional world. The result is a general-purpose counterfactual reasoning program. “You can use this to answer any counterfactual question about a scenario,” says Gilligan-Lee.

The Spotify team tested their model on several real-world case studies, including one on lending in Germany, one on an international clinical trial for stroke medication, and one on water security in Kenya.

In 2020, researchers investigated whether installing pipes and concrete barriers to protect water sources from bacterial contamination in a region of Kenya would reduce the incidence of diarrheal disease in children. They saw a positive effect, but the real cause was unclear, says Gilligan-Lee. Before wells were sealed with concrete across the country, one had to be sure that the drop in disease cases was actually caused by that measure and was not just a side effect.

It is possible, for example, that the researchers who built concrete barriers around the wells also made people aware of the risks of contaminated water, and that this is why households started boiling it at home. “In that case, education would be a cheaper route,” Gilligan-Lee says.

Gilligan-Lee and colleagues ran this scenario through their model and asked whether children who got sick in the real world after drinking from an unprotected well would also have gotten sick in the fictional world after drinking from a protected one. They found that changing where the child drank, while keeping other conditions fixed, such as how water was treated at home, had no significant impact on the outcome. This suggests that the decrease in diarrheal disease among children was not (directly) caused by the installation of pipes and concrete barriers.
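The shape of that query can be illustrated with a deliberately simplified model in which, by construction, only home water treatment affects illness. All mechanisms below are invented for illustration; they merely show what “switching the well while holding everything else fixed” looks like as a computation, not what the Spotify model actually learned.

```python
# Hypothetical sketch of the query described above. In this invented
# mechanism, illness depends only on whether water is boiled at home
# (and on background noise u_y), not on the well itself.

def illness(protected_well, boils_at_home, u_y):
    return (not boils_at_home) and (u_y > 0.3)

def counterfactual_query(boils_at_home, u_y):
    # Same child, same habits, same background factors; only the well changes.
    factual = illness(protected_well=False, boils_at_home=boils_at_home, u_y=u_y)
    fictional = illness(protected_well=True, boils_at_home=boils_at_home, u_y=u_y)
    return factual, fictional

# Holding home treatment and background factors fixed, switching the well
# changes nothing, mirroring the finding reported for this scenario.
print(counterfactual_query(boils_at_home=False, u_y=0.9))  # -> (True, True)
```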

This corresponds to the result of the 2020 study, which also included counterfactual considerations. However, those researchers had hand-built a custom statistical model just to ask that one question, Gilligan-Lee says. In contrast, the Spotify team’s machine learning model is universal and can be used to ask multiple counterfactual questions about many different scenarios.

Spotify isn’t the only tech company scrambling to develop machine learning models that could provide cause-and-effect insights. In recent years, companies such as Meta, Amazon, LinkedIn and ByteDance, owner of TikTok, have also started developing these methods.

“Causal inference is crucial for machine learning,” says Nailong Zhang, a software engineer at Meta. Meta uses such methods, for example, in the machine learning model that decides how many and which types of notifications Instagram should send its users to keep them coming back to the platform.

Romila Pradhan, a data scientist at Purdue University in Indiana, uses counterfactuals to make automated decision-making more transparent. Businesses today use machine learning models to decide who gets, and who doesn’t get, a loan, a job, prison parole, or even an apartment.

Regulators have therefore begun to require companies to explain the outcome of such decisions to those affected. However, it is difficult to reconstruct the steps of a complex algorithm.

Pradhan believes counterfactuals can help here. Say a bank’s machine learning model rejects your loan application and you want to know why. One way to answer that question is with counterfactual scenarios: the application was denied in the real world, but would it have been rejected in a fictional world where your credit history had been different? What if you had a different zip code, job, or income? “The ability to build such questions into future loan approval programs would allow banks to give their customers a rationale rather than just a ‘yes’ or ‘no’,” says Pradhan.
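One simple way to generate such explanations is to flip one attribute at a time and re-run the model, reporting which single change would have reversed the decision. The decision rule and applicant data below are invented stand-ins, not any bank’s actual model or Pradhan’s method.

```python
# Hedged sketch of counterfactual explanations for a loan decision:
# change one feature at a time and see which change flips the outcome.

def approve(applicant):
    # Invented toy rule standing in for a bank's ML model.
    return applicant["income"] >= 40_000 and applicant["credit_history"] == "good"

def counterfactual_explanations(applicant, alternatives):
    flips = []
    for feature, value in alternatives.items():
        changed = {**applicant, feature: value}  # fictional world: one edit
        if approve(changed) != approve(applicant):
            flips.append((feature, value))
    return flips

applicant = {"income": 30_000, "credit_history": "good", "zip": "10001"}
print(counterfactual_explanations(
    applicant,
    {"income": 45_000, "credit_history": "bad", "zip": "94105"},
))  # -> [('income', 45000)]
```

The output is a human-readable rationale of the kind Pradhan describes: “had your income been 45,000, the application would have been approved,” rather than a bare “no.”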

Counterfactual questions are important because they show how people think about different outcomes, she says: “It’s a good way to capture explanations.” The technology can also help companies predict human behavior. Since counterfactual questions allow conclusions about what could happen in a specific situation, not just on average, users could be profiled far more individually.

The same logic used to justify drinking water protection or credit decisions could also be used to optimize responses to Spotify playlists, Instagram notifications, and targeted advertising. If we play this song, will the user listen longer? If we show this image, will that person keep scrolling? “Companies want to understand how to make recommendations for specific users, not just the average user,” says Gilligan-Lee. Whether users want it that way is another question.


More from MIT Technology Review



(bsc)
