# Fisher’s Randomization Test

Notes and proofs of basic theorems

Valerio Gherardi https://vgherard.github.io
2023-06-07

Let $$N\in \mathbb N$$ be fixed, and let:

• $$\mathbf Y(1),\,\mathbf Y(0)\in \mathbb R ^N$$ be random vectors, with components $$Y_i(1),Y_i(0)\in \mathbb R$$,
• $$\mathbf Z$$ be a random vector with components $$Z_i \in \{0,1\}$$, independent from $$\mathbf Y(1)$$ and $$\mathbf Y(0)$$ above,
• $$\mathbf Y = \mathbf Z\times \mathbf Y(1)+(1-\mathbf Z)\times \mathbf Y(0)$$ (multiplication is component-wise).

Given a scalar function $$t = t(\mathbf Z, \,\mathbf Y)\in \mathbb R$$, define:

$P(t,\mathbf Z, \mathbf Y)=\sum _{\mathbf Z '}\text{Pr}_\mathbf Z(\mathbf Z')\cdot I(t(\mathbf Z',\mathbf Y)\geq t(\mathbf Z,\mathbf Y)),$ where $$\text{Pr}_\mathbf Z(\cdot)$$ is the marginal distribution of treatment assignments.

Theorem. If $$\mathbf Y(0) = \mathbf Y(1)$$ then:

$\text{Pr}(P(t,\mathbf Z,\mathbf Y)\leq \alpha) \leq \alpha.$

Proof. Let $$\mathbf Z'$$ be distributed according to $$\text{Pr}_\mathbf Z(\cdot)$$, and define $$\mathbf Y' = \mathbf Z'\times \mathbf Y(1)+(1-\mathbf Z')\times \mathbf Y(0)$$. Given $$t_0\in \mathbb R$$, we observe that:

$\text {Pr}(t(\mathbf Z',\mathbf Y')\geq t_0 \,\vert\,\mathbf Y(0),\,\mathbf Y(1)) = \sum _{\mathbf Z '}\text{Pr}_\mathbf Z(\mathbf Z')\cdot I(t(\mathbf Z',\mathbf Y')\geq t_0).$ Now, if $$\mathbf Y(0) = \mathbf Y(1)$$, we have $$t(\mathbf Z',\mathbf Y') = t(\mathbf Z',\mathbf Y)$$, so that we may replace $$\mathbf Y'$$ with $$\mathbf Y$$ in the RHS of the previous equation. If, moreover, we choose $$t_0= t(\mathbf Z , \mathbf Y)$$ we obtain:

$P(t, \mathbf Z, \mathbf Y) = \text {Pr}(t(\mathbf Z',\mathbf Y')\geq t(\mathbf Z,\mathbf Y) \,\vert\,\mathbf Y(0),\mathbf Y(1)).$ In other words, $$P(t,\mathbf Z, \mathbf Y)$$ is a conditional $$p$$-value. Therefore:

$\text{Pr}(P(t,\mathbf Z,\mathbf Y)\leq \alpha \,\vert\,\mathbf Y(0),\mathbf Y(1)) \leq \alpha.$

Since this is valid for any value of $$\mathbf Y (0)$$ and $$\mathbf Y(1)$$, the thesis follows.

In the usual setting of causal inference, we interpret:

• $$Z_i$$ as the treatment assignment for the $$i$$-th statistical unit, $$Z_i = 0,1$$ standing for “treatment” and “control”, respectively.
• $$Y_i(1)$$ and $$Y_i(0)$$ as the potential outcomes for the $$i$$-th unit under treatment and control, respectively.
• $$Y_i$$ as the observed outcome for the $$i$$-th unit.
• $$t(\cdot)$$ as a test statistic used to test the null hypothesis $$\mathbf Y(1)= \mathbf Y (0)$$.
• $$P(t,\mathbf Z,\mathbf Y)$$ is the randomization $$p$$-value of $$t(\mathbf Z, \mathbf Y)$$ in a Fisher Randomization Test.

Fisher’s “sharp” null hypothesis is an equality between random variables, the potential outcomes. Typical examples for the distribution $$\text{Pr}_\mathbf Z(\cdot)$$ are:

• Completely Randomized Experiment (CRE):

$\text{Pr}_\mathbf Z (\mathbf Z) = \begin{cases} \binom N {n_1} ^{-1} & \sum _{i=1}^N Z_i =n_1, \\ 0 & \text{otherwise.} \end{cases}$

• Bernoulli Randomized Experiment (BRE):

$\text{Pr}_\mathbf Z (\mathbf Z) = \prod _{i=1} ^N \pi^{Z_i}(1-\pi)^{1-Z_i}.$

An example of test statistic is the difference in means between the treatment and control group, that can be written:

$t(\mathbf Z , \mathbf Y) = \sum_i c_i Y_i,\qquad c_i=\frac{Z_i}{\sum _iZ_i} - \frac{1-Z_i}{\sum _i(1-Z_i)}.$

### Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

### Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. Source code is available at https://github.com/vgherard/vgherard.github.io/, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

### Citation

Gherardi (2023, June 7). vgherard: Fisher's Randomization Test. Retrieved from https://vgherard.github.io/posts/2023-06-07-fishers-randomization-test/
@misc{gherardi2023fisher's,
}