*Cum hoc ergo propter hoc*

It is part of common knowledge that **correlation does not require causation**.
Absence of causation, say between a condition \(p\) and an effect \(q\), means that the realization of \(p\) has no influence on the presence of \(q\). If this is the case,
a statistical correlation between \(p\) and \(q\) can still be present, if the realization of \(p\) modifies our *state of information* about \(q\).

As an example, let \(X,Y\) be two conditionally independent binary random variables, with a common probability \(\Theta\) of evaluating to one. Think, for instance, of a machine that produces pairs of identical biased coins, with a probability of tails \(\Theta\). If \(\Theta\) is equal to a given value \(\theta\), the joint probability distribution of \(X\) and \(Y\) is:

\[ \text {Pr}(X=x,Y=y\vert \Theta = \theta) = B(x;\theta)B(y;\theta), \tag{1} \] where \(B(z; \theta) = \theta ^z (1 - \theta) ^ {1-z}\). Whether or not this provides a satisfying probabilistic description of experiments on \(X\) and \(Y\) depends on context.

From a frequentist point of view, if \(\Theta\) is fixed once and for all, the right hand side of Eq. (1) correctly describes the experimental outcomes of \(X\) and \(Y\) for *some* value of \(\theta\). On the other hand, if \(\Theta\) can change from experiment to experiment in a random fashion, and we do not observe its values \(\theta\), we clearly cannot use Eq. (1) as it stands, as its usage requires knowing \(\theta\).
Finally, from a bayesian’s point of view, if \(\Theta\) is fixed but unknown, Eq. (1) does not describe our *state of knowledge* about \(X\) and \(Y\), because it assumes unavailable information (\(\Theta = \theta\)).

In the last two cases, what we’re actually after is the unconditional probability:

\[ \text{Pr}(X=x,\,Y=y)=\intop\,\text{d}P_\Theta(\theta) \,\text{Pr}(X=x,Y=y\vert\Theta = \theta) \tag{2} \] where \(\text{d}P_\Theta(\theta)\) can be regarded either as the actual probability distribution of \(\Theta\) (in a frequentist framework) or as a subjective prior distribution (in a bayesian framework).

Plugging Eq. (1) into (2), we find:

\[ \begin{split} \text{Pr}(X=1,\,Y=1) & = \mathbb E(\Theta)^2+\text{Var}(\Theta)\\ \text{Pr}(X=1,\, Y=0)&=\mathbb E(\Theta)-\mathbb E(\Theta)^2-\text{Var}(\Theta)\\ \text{Pr}(X=0,\, Y=1)&=\mathbb E(\Theta)-\mathbb E(\Theta)^2-\text{Var}(\Theta)\\ \text{Pr}(X=0,\,Y=0) & = \mathbb (1-\mathbb E(\Theta))^2+\text{Var}(\Theta) \\ \end{split} \]

In particular, we have:

\[
\dfrac{\text{Pr}(Y = 1 \vert\, X = 1)}{\text {Pr}(Y=1)} = 1+\frac{\text{Var}(\Theta)}{\mathbb{E}(\Theta)^2},
\tag{3}
\]
which means that, *unconditionally*, \(X\) and \(Y\) are not independent, but in fact positively correlated^{1}.

Observations of this kind apply, *mutatis mutandis*, in many practical situations. For instance if we were modeling the time series of new visitors to a website, we could reasonably assume that the number of yesterday’s new visitors does not influence the number of today’s ones (if individual visitors are unlikely to interact with each other). Yet, it would be wrong to assume, and easy to disprove, that these two numbers are by themselves statistically independent, because yesterday’s new visitors carry useful background information on today’s potential new visitors.

The bottom line of the post is that **lack of causation does not imply lack of correlation**, which is logically equivalent to the original motto… but, for some strange reason, I find easier to forget.

Here I’m using the word correlation in a loose sense, as in the popular motto.↩︎

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. Source code is available at https://github.com/vgherard/vgherard.github.io/, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

For attribution, please cite this work as

vgherard (2023, March 30). Valerio Gherardi: Correlation Without Causation. Retrieved from https://vgherard.github.io/posts/2023-03-10-correlation-without-causation/

BibTeX citation

@misc{vgherard2023correlation, author = {vgherard, }, title = {Valerio Gherardi: Correlation Without Causation}, url = {https://vgherard.github.io/posts/2023-03-10-correlation-without-causation/}, year = {2023} }