Correlation Without Causation

Statistics

Cum hoc ergo propter hoc

Valerio Gherardi https://vgherard.github.io
2023-03-30

It is common knowledge that correlation does not imply causation. Absence of causation, say between a condition \(p\) and an effect \(q\), means that the realization of \(p\) has no influence on the presence of \(q\). Even in this case, a statistical correlation between \(p\) and \(q\) can still be present, provided that the realization of \(p\) modifies our state of information about \(q\).

As an example, let \(X,Y\) be two binary random variables that are conditionally independent given \(\Theta\), their common probability of evaluating to one. Think, for instance, of a machine that produces pairs of identical biased coins, with a probability of tails \(\Theta\). If \(\Theta\) is equal to a given value \(\theta\), the joint probability distribution of \(X\) and \(Y\) is:

\[ \text {Pr}(X=x,Y=y\vert \Theta = \theta) = B(x;\theta)B(y;\theta), \tag{1} \] where \(B(z; \theta) = \theta ^z (1 - \theta) ^ {1-z}\). Whether or not this provides a satisfying probabilistic description of experiments on \(X\) and \(Y\) depends on context.

From a frequentist point of view, if \(\Theta\) is fixed once and for all, the right-hand side of Eq. (1) correctly describes the experimental outcomes of \(X\) and \(Y\) for some value of \(\theta\). On the other hand, if \(\Theta\) can change from experiment to experiment in a random fashion, and we do not observe its values \(\theta\), we clearly cannot use Eq. (1) as it stands, since doing so requires knowing \(\theta\). Finally, from a Bayesian point of view, if \(\Theta\) is fixed but unknown, Eq. (1) does not describe our state of knowledge about \(X\) and \(Y\), because it assumes unavailable information (\(\Theta = \theta\)).

In the last two cases, what we’re actually after is the unconditional probability:

\[ \text{Pr}(X=x,\,Y=y)=\intop\,\text{d}P_\Theta(\theta) \,\text{Pr}(X=x,Y=y\vert\Theta = \theta) \tag{2} \] where \(\text{d}P_\Theta(\theta)\) can be regarded either as the actual probability distribution of \(\Theta\) (in a frequentist framework) or as a subjective prior distribution (in a Bayesian framework).

Plugging Eq. (1) into (2), we find:

\[ \begin{split} \text{Pr}(X=1,\,Y=1) & = \mathbb E(\Theta)^2+\text{Var}(\Theta)\\ \text{Pr}(X=1,\, Y=0)&=\mathbb E(\Theta)-\mathbb E(\Theta)^2-\text{Var}(\Theta)\\ \text{Pr}(X=0,\, Y=1)&=\mathbb E(\Theta)-\mathbb E(\Theta)^2-\text{Var}(\Theta)\\ \text{Pr}(X=0,\,Y=0) & = (1-\mathbb E(\Theta))^2+\text{Var}(\Theta) \\ \end{split} \]
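
As a sketch of the intermediate steps, each of these probabilities is an expectation of a polynomial in \(\Theta\), which can be rewritten using the identity \(\mathbb E(\Theta^2) = \mathbb E(\Theta)^2+\text{Var}(\Theta)\):

\[ \begin{split} \text{Pr}(X=1,\,Y=1) & = \intop\,\text{d}P_\Theta(\theta)\,\theta^2 = \mathbb E(\Theta^2) = \mathbb E(\Theta)^2+\text{Var}(\Theta)\\ \text{Pr}(X=1,\,Y=0) & = \intop\,\text{d}P_\Theta(\theta)\,\theta(1-\theta) = \mathbb E(\Theta)-\mathbb E(\Theta^2) = \mathbb E(\Theta)-\mathbb E(\Theta)^2-\text{Var}(\Theta)\\ \text{Pr}(X=0,\,Y=0) & = \intop\,\text{d}P_\Theta(\theta)\,(1-\theta)^2 = \mathbb E\left((1-\Theta)^2\right) = (1-\mathbb E(\Theta))^2+\text{Var}(\Theta) \\ \end{split} \]

The last equality uses \(\text{Var}(1-\Theta)=\text{Var}(\Theta)\), and the case \(X=0,\,Y=1\) follows from the second line by symmetry. Summing the first two lines also gives the marginal \(\text{Pr}(X=1)=\mathbb E(\Theta)\), and likewise \(\text{Pr}(Y=1)=\mathbb E(\Theta)\).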

In particular, we have:

\[ \dfrac{\text{Pr}(Y = 1 \vert\, X = 1)}{\text {Pr}(Y=1)} = 1+\frac{\text{Var}(\Theta)}{\mathbb{E}(\Theta)^2}, \tag{3} \] which means that, unconditionally, \(X\) and \(Y\) are not independent but positively correlated [1], as long as \(\text{Var}(\Theta) > 0\), that is, as long as \(\Theta\) is genuinely random or uncertain.
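
Eq. (3) can be checked numerically. The sketch below is only an illustration under an assumption that is not part of the argument above: \(\Theta\) is drawn from a Beta distribution, chosen because its mean and variance have simple closed forms. The function names are hypothetical, and only the Python standard library is used.

```python
import random


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution (closed forms)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def predicted_ratio(a, b):
    """Right-hand side of Eq. (3): 1 + Var(Theta) / E(Theta)^2."""
    mean, var = beta_mean_var(a, b)
    return 1 + var / mean ** 2


def simulated_ratio(a, b, n_pairs=200_000, seed=0):
    """Monte Carlo estimate of Pr(Y=1 | X=1) / Pr(Y=1)."""
    rng = random.Random(seed)
    n_x1 = n_y1 = n_both = 0
    for _ in range(n_pairs):
        # Assumption: the hidden bias is redrawn from Beta(a, b) for each pair.
        theta = rng.betavariate(a, b)
        # Given theta, the two coins are tossed independently.
        x = rng.random() < theta
        y = rng.random() < theta
        n_x1 += x
        n_y1 += y
        n_both += x and y
    return (n_both / n_x1) / (n_y1 / n_pairs)


if __name__ == "__main__":
    print("predicted:", predicted_ratio(2, 2))
    print("simulated:", simulated_ratio(2, 2))
```

For Beta(2, 2) the mean is 1/2 and the variance is 1/20, so Eq. (3) predicts a ratio of 1.2; the simulated value should agree up to Monte Carlo noise.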

Observations of this kind apply, mutatis mutandis, in many practical situations. For instance, if we were modeling the time series of new visitors to a website, we could reasonably assume that the number of yesterday’s new visitors does not influence the number of today’s (if individual visitors are unlikely to interact with each other). Yet, it would be wrong to assume, and easy to disprove, that these two numbers are by themselves statistically independent, because yesterday’s new visitors carry useful background information on today’s potential new visitors.
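
A toy simulation can make the website example concrete. Everything in the sketch below is an assumption introduced for illustration: daily counts are modeled as Poisson draws, and the underlying visit rate, shared by two consecutive days, is Gamma-distributed and unobserved. The function names and parameter values are hypothetical.

```python
import math
import random


def poisson_draw(rng, rate):
    """Draw one Poisson(rate) count with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def visitor_correlation(n_pairs=50_000, shape=5.0, scale=1.0,
                        fixed_rate=None, seed=0):
    """Correlation between yesterday's and today's simulated counts.

    If fixed_rate is None, each pair of days shares a hidden rate drawn
    from Gamma(shape, scale); otherwise the rate is the known constant
    fixed_rate.
    """
    rng = random.Random(seed)
    yesterday, today = [], []
    for _ in range(n_pairs):
        if fixed_rate is None:
            rate = rng.gammavariate(shape, scale)
        else:
            rate = fixed_rate
        # Given the rate, the two counts are drawn independently:
        # yesterday's count never enters the computation of today's.
        yesterday.append(poisson_draw(rng, rate))
        today.append(poisson_draw(rng, rate))
    return pearson(yesterday, today)


if __name__ == "__main__":
    print("hidden random rate:", visitor_correlation())
    print("known fixed rate:  ", visitor_correlation(fixed_rate=5.0))
```

Under these assumptions the covariance of the two counts equals the variance of the hidden rate, so the theoretical correlation is scale / (1 + scale), that is 0.5 for the default parameters, while it vanishes when the rate is a known constant. In neither case does yesterday's count influence today's.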

The bottom line of this post is that lack of causation does not imply lack of correlation, which is logically equivalent to the original motto… but which, for some strange reason, I find easier to forget.


  [1] Here I’m using the word correlation in a loose sense, as in the popular motto.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. Source code is available at https://github.com/vgherard/vgherard.github.io/, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Gherardi (2023, March 30). vgherard: Correlation Without Causation. Retrieved from https://vgherard.github.io/posts/2023-03-10-correlation-without-causation/

BibTeX citation

@misc{gherardi2023correlation,
  author = {Gherardi, Valerio},
  title = {vgherard: Correlation Without Causation},
  url = {https://vgherard.github.io/posts/2023-03-10-correlation-without-causation/},
  year = {2023}
}