Simpson’s paradox and anticovid vaccination data – Quentin Berger

by admin

09 December 2021 13:57

Statistics can produce results that are completely counterintuitive even though they are rigorously proven. These are called paradoxes. The term refers to results that are not false or incompatible with other results, but that run contrary to our intuition.

One of the most striking statistical paradoxes is Simpson's paradox. It states that, when analyzing a population composed of different groups, the same phenomenon may be observed within each group while the opposite phenomenon is observed in the population as a whole. This paradox is the source of many errors of interpretation, even by experienced mathematicians.

Here is an example from the data on hospitalizations and vaccinations in England. In the reports on deaths of people who tested positive for the Delta variant of COVID-19 (the data, full references and calculations are shown here), it is noted that:

  • in the population under the age of fifty, the death rate is about 1.8 times higher among the unvaccinated than among the vaccinated;
  • in the population over the age of fifty, the death rate is about 6.3 times higher among the unvaccinated than among the vaccinated;
  • on the other hand, in the population taken as a whole, the death rate is about 1.3 times lower among the unvaccinated than among the vaccinated.

Two observations are needed at this point. In the first place, the last figure seems to contradict the two previous ones. How can we explain the fact that the vaccine reduces the mortality rate in both those over fifty and under fifty but increases it if we consider the population as a whole?

Secondly, and even more disturbingly, we come to opposite conclusions about the effectiveness of the vaccine depending on whether we look at people under fifty and over fifty separately or at people of all ages together. In other words: if we look at the first two points, the vaccine seems effective in reducing mortality both among those under fifty and among those over fifty, but if we instead consider the population as a whole (the last point), we might conclude that the vaccine is not effective at all, and is even downright dangerous. What is the correct conclusion?

Explanation of the paradox
The precise data are presented here, but it is useful to explain in general form how this paradox can be produced.

The key point is that, in the period under examination, the percentage of people vaccinated was very different between those over fifty (about 95 percent according to the British health service) and those under fifty (about 50 percent).

As a result, a large proportion of unvaccinated people are under the age of fifty and have a low mortality rate (due to age). On the other hand, a large proportion of vaccinated people are over fifty years old and show a higher mortality rate (albeit greatly reduced by the vaccine). This explains why, considering the population as a whole, the percentage of deaths among the unvaccinated may be lower than that of the vaccinated.
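The mechanism just described can be written as a weighted average. What follows is only a sketch, in notation that does not appear in the article: $r_V^{<50}$ and $r_V^{\geq 50}$ denote the death rates among the vaccinated under and over fifty, $w_V$ is the share of vaccinated people who are over fifty, and $r_U^{<50}$, $r_U^{\geq 50}$ and $w_U$ are the same quantities for the unvaccinated.

```latex
r_V = (1 - w_V)\, r_V^{<50} + w_V\, r_V^{\geq 50},
\qquad
r_U = (1 - w_U)\, r_U^{<50} + w_U\, r_U^{\geq 50}.
```

Even if $r_V^{<50} < r_U^{<50}$ and $r_V^{\geq 50} < r_U^{\geq 50}$, the overall rates can satisfy $r_V > r_U$ when $w_V$ is much larger than $w_U$, because the vaccinated average is then dominated by the high-mortality age group.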

Here is a graphical representation where the paradox emerges clearly, with fictitious data to make the phenomenon clearer:

Graphic illustration of the Simpson paradox with fictitious data: each person is represented by a square. The color of the square corresponds to an age group, while the dark or light shade represents the vaccination status. Each cross indicates a death.

(Quentin Berger and Francesco Caravenna)

If we consider the under-50s and the over-50s as two separate groups, it is clear that in both cases the death rate is lower among the vaccinated:

For those under 50 (blue), the death rate is lower among the vaccinated (0 percent) than among the unvaccinated (2.2 percent). Likewise, for those over 50 (red), the death rate is lower among the vaccinated (13.3 percent) than among the unvaccinated (40 percent).

In the total population, the death rate is higher among vaccinated (dark blue and dark red, 12 percent) than among unvaccinated (light blue and light red, 6 percent).

This phenomenon is due to the fact that most of the vaccinated are over fifty years old.
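The percentages of the fictitious example can be checked with a few lines of Python. The head counts below are not given in the text: they are an assumption, reconstructed as one set of numbers (100 people in total) that reproduces the quoted percentages. The function names are likewise only illustrative.

```python
# Assumed counts, chosen to match the percentages quoted in the text.
# (age group, vaccination status) -> (number of people, number of deaths)
counts = {
    ("under 50", "vaccinated"): (5, 0),
    ("under 50", "unvaccinated"): (45, 1),
    ("over 50", "vaccinated"): (45, 6),
    ("over 50", "unvaccinated"): (5, 2),
}


def death_rate(people, deaths):
    """Death rate as a percentage."""
    return 100.0 * deaths / people


def group_rate(age, status):
    """Death rate within one age group and one vaccination status."""
    people, deaths = counts[(age, status)]
    return death_rate(people, deaths)


def pooled_rate(status):
    """Death rate for one vaccination status, all ages combined."""
    people = sum(n for (_, s), (n, _) in counts.items() if s == status)
    deaths = sum(d for (_, s), (_, d) in counts.items() if s == status)
    return death_rate(people, deaths)


for age in ("under 50", "over 50"):
    for status in ("vaccinated", "unvaccinated"):
        print(f"{age:9s} {status:13s} {group_rate(age, status):5.1f} %")

for status in ("vaccinated", "unvaccinated"):
    print(f"all ages  {status:13s} {pooled_rate(status):5.1f} %")
```

With these assumed counts, 45 of the 50 vaccinated people are over fifty, while 45 of the 50 unvaccinated people are under fifty, which is what reverses the comparison once the two age groups are pooled.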

What conclusions can we draw?
From this paradox we can draw an important message: we must be very careful when analyzing statistical data that refer to groups with different characteristics. Here, Simpson's paradox arises because the vaccination rate varies a lot with age. It is therefore important to evaluate the effectiveness of the vaccine within groups of people whose ages are as homogeneous as possible.

Combining different age groups produces the phenomenon known as "selection bias": the set of vaccinated people is largely made up of elderly people, who are more fragile, while the set of unvaccinated people is mostly made up of young people, who are less fragile. Consequently, a comparison between the mortality rates of vaccinated and unvaccinated people of all ages becomes, in effect, a comparison between a predominantly elderly population and a predominantly young population. To claim that mortality is higher among the vaccinated than among the unvaccinated is therefore misleading, because the comparison is distorted by the large variation in the vaccination rate across age groups.

The difficulty of interpreting the statistics
The problems of selection bias are well known in statistics, and are among the most common errors of interpretation.

A classic example is that of the statistician Abraham Wald, who during the Second World War, after examining the planes that had returned from combat, suggested reinforcing the parts that had been hit least by bullets. The reasoning was that those spots were the most critical ones, because planes hit there were less likely to return from combat. Wald understood the importance of correcting the distortion known as survivorship bias, which consists in carrying out statistical analyses that take into account only the data relating to survivors.

Selection biases, whether conscious or not, are often an integral part of the statistical data collection process, as we saw in the previous example. It is important to know which distortions are present in order to correct their effect. In our original example, comparing the death rates of the unvaccinated and the vaccinated across the entire population introduces an age-related bias, as we have seen. One way to correct it is to restrict the comparison to age groups that are as narrow as possible, within which the vaccination rate is stable.
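A related correction, which the article itself does not describe, is direct age standardization: the death rate of each vaccination status is recomputed as if both had the same age composition as the whole population. The sketch below is only an illustration of that idea. The counts are an assumption, chosen to be consistent with the percentages of the fictitious example above, and `standardized_rate` is a hypothetical helper name.

```python
# Assumed counts consistent with the fictitious example in the text.
# (age group, vaccination status) -> (number of people, number of deaths)
counts = {
    ("under 50", "vaccinated"): (5, 0),
    ("under 50", "unvaccinated"): (45, 1),
    ("over 50", "vaccinated"): (45, 6),
    ("over 50", "unvaccinated"): (5, 2),
}
ages = ("under 50", "over 50")


def standardized_rate(status):
    """Death rate (percent) for one status, reweighted to the age mix of
    the whole population instead of the age mix of that status."""
    total = sum(n for (n, _) in counts.values())
    rate = 0.0
    for age in ages:
        # Share of the whole population that falls in this age group.
        age_size = sum(n for (a, _), (n, _) in counts.items() if a == age)
        people, deaths = counts[(age, status)]
        rate += (age_size / total) * (deaths / people)
    return 100.0 * rate


for status in ("vaccinated", "unvaccinated"):
    print(f"{status:13s} {standardized_rate(status):5.1f} %")
```

Once both statuses are given the same age mix, the comparison points in the same direction as the comparisons within each age group: under these assumed counts, the standardized death rate is lower among the vaccinated.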

To conclude, paradoxes remind us of the pitfalls to avoid: thanks to their ability to surprise they help us to refine our intuition, or at least not to trust it too much. They remind us that no one is infallible and that it is not always easy and immediate to solve problems that appear simple. Paradoxes push us to deepen our reflections with humility.

For lovers of paradoxes, here are some of the classics in the field of probability: the birthday paradox, Bertrand's paradox, the Monty Hall problem, the prisoner's dilemma, and the two-children paradox.

(Translation by Andrea Sparacino)
