vgherard
https://vgherard.github.io/
Valerio Gherardi's Personal Website
"Data Fission: Splitting a Single Data Point" by Leiner et al.
Valerio Gherardi, Sun, 02 Jun 2024
https://vgherard.github.io/posts/2024-06-03-data-fission-splitting-a-single-data-point-by-leiner-et-al
<p><span class="citation">(Leiner et al. 2023)</span>. The <a
href="https://arxiv.org/pdf/2112.11079">arXiv version</a> is actually a
bit more comfortable to read. Abstract:</p>
<blockquote>
<p>Suppose we observe a random vector <span
class="math inline">\(X\)</span> from some distribution in a known
family with unknown parameters. We ask the following question: when is
it possible to split <span class="math inline">\(X\)</span> into two
pieces <span class="math inline">\(f(X)\)</span> and <span
class="math inline">\(g(X)\)</span> such that neither part is sufficient
to reconstruct <span class="math inline">\(X\)</span> by itself, but
both together can recover <span class="math inline">\(X\)</span> fully,
and their joint distribution is tractable? One common solution to this
problem when multiple samples of <span class="math inline">\(X\)</span>
are observed is data splitting, but Rasines and Young offers an
alternative approach that uses additive Gaussian noise — this enables
post-selection inference in finite samples for Gaussian distributed data
and asymptotically when errors are non-Gaussian. In this article, we
offer a more general methodology for achieving such a split in finite
samples by borrowing ideas from Bayesian inference to yield a
(frequentist) solution that can be viewed as a continuous analog of data
splitting. We call our method data fission, as an alternative to data
splitting, data carving and <span class="math inline">\(p\)</span>-value
masking. We exemplify the method on several prototypical applications,
such as post-selection inference for trend filtering and other
regression problems, and effect size estimation after interactive
multiple testing. Supplementary materials for this article are available
online.</p>
</blockquote>
<p>The paper offers a clear review and systematization of older work,
most prominently <span class="citation">(Rasines and Young 2022)</span>
(cited in the abstract), with some useful generalizations.</p>
<p>The idea is cool, but I find the applications to practical regression
cases given in the paper somewhat… impractical. For usual linear
regression with a continuous response, the applicability of the method
relies on (1) the noise being homoskedastic and Gaussian, (2) the existence
of a consistent estimator <span class="math inline">\(\hat
\sigma\)</span> of noise variance, and (3) samples being large enough
(guarantees are only asymptotic). On the other hand, in the
theoretically simpler case of logistic regression, there’s a technical
complication in that, under the usual GLM assumption <span
class="math inline">\(\theta(X) = X\beta\)</span>, the relevant
log-likelihood for maximization in the inferential stage is not a
concave function of <span class="math inline">\(\beta\)</span>, possibly
hindering optimization. If I understand correctly, the authors suggest ignoring
the conditional dependence of <span
class="math inline">\(g(Y_i)\)</span> on <span
class="math inline">\(f(Y_i)\)</span> to circumvent these complications
(see Appendix E.4), which I honestly don’t understand.</p>
<p>A case in which the planets align and results have a nice analytic form
is that of Poisson regression, for which I will sketch the idea in some
detail. Suppose that we are given data <span
class="math inline">\(\mathcal D _0 = \{(X_i,Y_i)\}_{i=1}^N\)</span>
independently drawn from a joint <span
class="math inline">\((X,Y)\)</span> distribution, and we assume <span
class="math inline">\(Y \vert X \sim \text{Pois}(\lambda (X))\)</span>
for some unknown function <span class="math inline">\(\lambda
(X)\)</span> we would like to model. The key observation is
(<em>cf.</em> Appendix A of the reference) that if <span
class="math inline">\(Z \vert Y \sim \text{Binom}(Y,\,p)\)</span>, then
<span class="math inline">\(Z \sim \text {Pois}(p\lambda)\)</span> and
<span class="math inline">\(\overline Z = Y - Z \sim
\text{Pois}((1-p)\lambda)\)</span>, with <span
class="math inline">\(Z\)</span> and <span
class="math inline">\(\overline Z\)</span> unconditionally independent.
Hence, if we randomly draw <span class="math inline">\(Z _i\)</span>
according to <span class="math inline">\(\text{Binom}(Y_i,\,p)\)</span>,
and set <span class="math inline">\(\overline Z _i = Y_i -Z_i\)</span>,
the two datasets <span class="math inline">\(\mathcal D =
\{(X_i,\,Z_i)\}\)</span> and <span
class="math inline">\(\overline{\mathcal D} = \{(X_i,\,\overline
Z_i)\}\)</span> are conditionally independent given the observed
covariates <span class="math inline">\(X_i\)</span>. This allows one to
decouple different aspects of modeling, such as model selection and
inference, avoiding the usual biases associated with the intrinsic
randomness of the selection step.</p>
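<p>As a quick numerical check of the thinning identity (a sketch of my
own, with made-up values of <span class="math inline">\(\lambda\)</span>
and <span class="math inline">\(p\)</span>):</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy check of the thinning identity: if Y ~ Pois(lam) and
# Z | Y ~ Binom(Y, p), then Z ~ Pois(p * lam), Y - Z ~ Pois((1 - p) * lam),
# and the two pieces are unconditionally independent.
lam, p, n = 5.0, 0.3, 200_000
Y = rng.poisson(lam, size=n)
Z = rng.binomial(Y, p)    # f(Y): kept for the selection stage
Z_bar = Y - Z             # g(Y): kept for the inference stage

print(Z.mean(), Z.var())            # both close to p * lam = 1.5
print(Z_bar.mean(), Z_bar.var())    # both close to (1 - p) * lam = 3.5
print(np.corrcoef(Z, Z_bar)[0, 1])  # close to 0
```

<p>The matching means and variances are consistent with Poisson
marginals, and the vanishing correlation with the claimed
independence.</p>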
<p>The authors focus on regression with fixed covariates, because in
that setting the simpler option of data-splitting is less motivated,
calling for alternatives. However, the method can be applied equally
well to deal with selective inference in random covariates settings,
since it leads - at least in principle - to inferences which are valid
conditionally on the observed covariates and (in the general case) the
randomized responses <span class="math inline">\(f(Y_i)\)</span> of the
selection stage.</p>
<div id="refs" class="references csl-bib-body hanging-indent">
<div id="ref-datafission" class="csl-entry">
James Leiner, Larry Wasserman, Boyan Duan, and Aaditya Ramdas. 2023.
<span>“Data Fission: Splitting a Single Data Point.”</span> <em>Journal
of the American Statistical Association</em> 0 (0): 1–12. <a
href="https://doi.org/10.1080/01621459.2023.2270748">https://doi.org/10.1080/01621459.2023.2270748</a>.
</div>
<div id="ref-splittingstrategies" class="csl-entry">
Rasines, D García, and G A Young. 2022. <span>“<span
class="nocase">Splitting strategies for post-selection
inference</span>.”</span> <em>Biometrika</em> 110 (3): 597–614. <a
href="https://doi.org/10.1093/biomet/asac070">https://doi.org/10.1093/biomet/asac070</a>.
</div>
</div>
Categories: Comment on..., Selective Inference, Model Selection, Statistics

Statements of the Second Law of Thermodynamics
Valerio Gherardi, Sat, 01 Jun 2024
https://vgherard.github.io/posts/2024-06-01-statements-of-the-second-law-of-thermodynamics
<p>The Second Law of Thermodynamics is commonly stated in the forms of
Kelvin’s and Clausius’ postulates. These can be enunciated in the
following way <span class="citation">(Dittman and Zemansky 2021)</span>
<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a>:</p>
<blockquote>
<p><strong>Kelvin’s Postulate.</strong> It is impossible to construct an
engine that, operating in a cycle, will produce no effect other than the
extraction of heat from a reservoir and the performance of an equivalent
amount of work.</p>
</blockquote>
<blockquote>
<p><strong>Clausius’ Postulate.</strong> It is impossible to construct a
refrigerator that, operating in a cycle, will produce no effect other
than the transfer of heat from a lower-temperature reservoir to a higher
temperature reservoir.</p>
</blockquote>
<p>Either formulation is equivalent to the other and leads to the
fundamental <em>Clausius’ theorem</em>. This asserts the existence of a
universal state function <span class="math inline">\(T\)</span>, the
<em>absolute temperature</em>, defined for any thermodynamic system,
that satisfies the <em>Clausius inequality</em>. Concretely, if a system
undergoes a cyclic process, during which it absorbs quantities <span
class="math inline">\(\Delta Q _i\)</span> of energy in the form of heat
from reservoirs at absolute temperatures <span
class="math inline">\(T_i\)</span>, the inequality:</p>
<p><span class="math display">\[
\sum _i\frac{\Delta Q_i}{T_i} \leq 0 (\#eq:ClausiusTheorem)
\]</span> always holds.</p>
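<p>As a sanity check, the inequality is saturated by a reversible Carnot
cycle. The following sketch (my own toy setup, with arbitrary reservoir
temperatures and volumes) verifies this numerically for an ideal gas:</p>

```python
import numpy as np

# A reversible Carnot cycle of an ideal gas saturates Clausius'
# inequality: sum_i Delta Q_i / T_i = 0, with heat exchanged only
# along the two isothermal branches (the adiabats exchange no heat).
R, n_mol = 8.314, 1.0
gamma = 5.0 / 3.0                  # monatomic ideal gas
T_h, T_c = 500.0, 300.0            # hot / cold reservoir temperatures (K)
Va, Vb = 1.0, 2.0                  # isothermal expansion at T_h

# Adiabats obey T * V**(gamma - 1) = const, fixing the cold isotherm:
r = (T_h / T_c) ** (1.0 / (gamma - 1.0))
Vc, Vd = Vb * r, Va * r

Q_h = n_mol * R * T_h * np.log(Vb / Va)  # heat absorbed at T_h (> 0)
Q_c = n_mol * R * T_c * np.log(Vd / Vc)  # heat absorbed at T_c (< 0)

print(Q_h / T_h + Q_c / T_c)  # 0, up to floating point error
```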
<p>The derivation of Eq. @ref(eq:ClausiusTheorem) from Kelvin’s and
Clausius’ postulates, a clever argument that employs ideal Carnot
engines, is standard textbook material; see for example <span
class="citation">(Fermi 1956)</span>. On the other hand, I’ve never seen
the converse stressed, that is, that Clausius’ theorem allows one
to recover versions of Kelvin’s and Clausius’ postulates. Here are two
(fairly obvious) arguments in this direction.</p>
<p>Consider a cyclic process of a thermodynamic system during which a
quantity <span class="math inline">\(\Delta Q\)</span> of heat is
absorbed from a reservoir at constant temperature <span
class="math inline">\(T_0\)</span>. Equation @ref(eq:ClausiusTheorem)
applied to this special process implies:</p>
<p><span class="math display">\[
\Delta Q\leq 0.(\#eq:KelvinProof1)
\]</span> The fact that <span class="math inline">\(\Delta Q\leq
0\)</span> means that the heat reservoir can only absorb energy during a
cycle, which must be supplied by performing positive work on the
system. This is the content of Kelvin’s postulate.</p>
<p>Similarly, if the system performs a cycle exchanging amounts of heat
<span class="math inline">\(\Delta Q_1\)</span> and <span
class="math inline">\(\Delta Q_2\)</span> with two heat sources at
temperatures <span class="math inline">\(T_1\)</span> and <span
class="math inline">\(T_2\)</span> respectively,
@ref(eq:ClausiusTheorem) implies:</p>
<p><span class="math display">\[
\frac{\Delta Q_1}{T_1}+\frac{\Delta Q_2}{T_2}\leq 0(\#eq:ClausiusProof1)
\]</span> But <span class="math inline">\(\Delta Q_1 + \Delta Q_2 =
\Delta Q =-\Delta W\)</span>, the external work performed on the system
during a cycle. Hence:</p>
<p><span class="math display">\[
(\frac{1}{T_1}-\frac{1}{T_2})\Delta Q_1\leq \frac{\Delta
W}{T_2}.(\#eq:ClausiusProof2)
\]</span> Therefore, <span class="math inline">\(\Delta Q_1 \geq
0\)</span> with <span class="math inline">\(T_1 < T_2\)</span>
requires <span class="math inline">\(\Delta W \geq 0\)</span>. In other
words, in order to perform a cycle in which a positive amount of heat is
transferred from a low-temperature reservoir to a high-temperature one,
we must necessarily perform some positive work<a href="#fn2"
class="footnote-ref" id="fnref2"><sup>2</sup></a>. This is the content
of Clausius’ postulate.</p>
<p>A subtle point that may require some elucidation is that, in the
usual logical exposition of Thermodynamics, the temperature to which
Kelvin’s and Clausius’ postulates make reference is the
<em>empirical</em> temperature, call it <span
class="math inline">\(\theta\)</span>. This is the “quantity measured by
a thermometer” <span class="citation">(Fermi 1956)</span>, and is
logically distinct from the absolute temperature <span
class="math inline">\(T\)</span>, whose existence is a consequence of
the second law. What we actually proved here are versions of Kelvin’s
and Clausius’ postulates <em>formulated in terms of the absolute
temperature</em>, <span class="math inline">\(T\)</span>.</p>
<p>Now, if we take Kelvin’s or Clausius’ postulate (formulated in terms
of <span class="math inline">\(\theta\)</span>) as our logical starting
point, we can actually prove that <span class="math inline">\(T\)</span>
is an increasing function of <span
class="math inline">\(\theta\)</span>, in which case there is no point
in specifying which temperature the postulates refer to. However, if our
starting point is Clausius’ Theorem, there is no <em>a priori</em>
logical reason for a
relation between <span class="math inline">\(T\)</span> and <span
class="math inline">\(\theta\)</span>, which should be considered as an
additional assumption.</p>
<hr />
<p>Even though this goes a bit beyond the original scope of the post,
I’d like to show here how @ref(eq:ClausiusTheorem) leads to the existence
of another state function, the <em>entropy</em> <span
class="math inline">\(S\)</span>, which satisfies a generalized version
of @ref(eq:ClausiusTheorem), namely:</p>
<p><span class="math display">\[
\sum _i\frac{\Delta Q_i}{T_i} \leq \Delta S(\#eq:ClausiusTheoremEntropy)
\]</span></p>
<p>where quantities have the same meaning as in Eq.
@ref(eq:ClausiusTheorem), but the process is not necessarily cyclic. One
can additionally show that the differential of <span
class="math inline">\(S\)</span> is given by:</p>
<p><span class="math display">\[
\text dS = \frac{\delta Q _R}{T}(\#eq:EntropyDifferential),
\]</span></p>
<p>where <span class="math inline">\(\delta Q_R = \text d U + \delta
W_R\)</span> is the differential heat absorbed by the system in a
reversible process, and <span class="math inline">\(T\)</span> is the
system’s temperature.</p>
<p>We start by observing that, for a reversible process, equality must
hold in Eq. @ref(eq:ClausiusTheorem). This is so because, for a
reversible cycle, the inverse cycle, in which the system absorbs amounts
<span class="math inline">\(-\Delta Q_i\)</span> of heat at temperatures
<span class="math inline">\(T_i\)</span>, must also be possible.
Altogether, the Clausius inequalities for the direct and inverse cycles
thus imply:</p>
<p><span class="math display">\[
\sum _i\frac{\Delta Q_i}{T_i} = 0\quad \text{(reversible
process)}(\#eq:ClausiusTheoremRev).
\]</span></p>
<p>Imagining an ideal cyclic process, in which the system exchanges
infinitesimal amounts of heat <span class="math inline">\(\delta
Q(T')\)</span> with a continuous distribution of sources at
temperatures <span class="math inline">\(T'\)</span>, we should
replace the sum in Eq. @ref(eq:ClausiusTheoremRev) with an integral:</p>
<p><span class="math display">\[
\intop \frac{\delta Q(T')}{T'} = 0 \quad\text{(reversible
process)}(\#eq:ClausiusTheoremRevInt)
\]</span> We now fix a reference state <span
class="math inline">\(\sigma _0\)</span> of our system, and define for
any other state <span class="math inline">\(\sigma\)</span>:</p>
<p><span class="math display">\[
S(\sigma;\sigma _0) = \intop _{\sigma_0}^\sigma \frac{\delta
Q(T')}{T'}(\#eq:EntropyDef)
\]</span> where the integral is taken along <em>any</em> reversible path
that connects <span class="math inline">\(\sigma _0\)</span> and <span
class="math inline">\(\sigma\)</span>, and <span
class="math inline">\(\delta Q(T')\)</span> is the amount of heat
exchanged at temperature <span class="math inline">\(T'\)</span>
along this representative process. The fact that the integral in
@ref(eq:EntropyDef) depends only upon the initial and final states <span
class="math inline">\(\sigma _0\)</span> and <span
class="math inline">\(\sigma\)</span> is guaranteed by
@ref(eq:ClausiusTheoremRevInt).</p>
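<p>Path independence can also be checked numerically. The sketch below
(my own example, for one mole of a monatomic ideal gas) integrates
<span class="math inline">\(\delta Q(T')/T'\)</span> along a
straight-line reversible path in the <span
class="math inline">\((T, V)\)</span> plane and compares the result with
the closed-form entropy change:</p>

```python
import numpy as np

# For one mole of a monatomic ideal gas, int dQ_R / T between two states
# should not depend on the reversible path. We integrate along a straight
# line in the (T, V) plane and compare with the analytic result
# Cv * log(T2/T1) + R * log(V2/V1).
R = 8.314
Cv = 1.5 * R                            # molar heat capacity at constant V
T1, V1 = 300.0, 1.0                     # initial state
T2, V2 = 450.0, 3.0                     # final state

s = np.linspace(0.0, 1.0, 100_001)      # path parameter
T = T1 + s * (T2 - T1)
V = V1 + s * (V2 - V1)
# dQ_R / T = Cv dT / T + (R / V) dV  (first law + ideal gas law),
# written as an integrand in the path parameter s:
integrand = Cv * (T2 - T1) / T + R * (V2 - V1) / V
S_path = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(s))

S_exact = Cv * np.log(T2 / T1) + R * np.log(V2 / V1)
print(S_path, S_exact)  # the two agree to high accuracy
```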
<p>By construction, we see that Eq. @ref(eq:EntropyDifferential) must
hold with <span class="math inline">\(T\)</span> being the temperature
of a source that, if placed in thermal contact with the system, can
produce a reversible exchange of heat. It remains to be shown that this
temperature is nothing but the temperature of the system itself.
Consider a reversible process in which two systems at temperatures <span
class="math inline">\(T_1\)</span> and <span
class="math inline">\(T_2\)</span> exchange an (infinitesimal) amount of
heat. From what we have just said:</p>
<p><span class="math display">\[
\text d S_1 = \frac{\delta Q_1}{T_2},\quad \text d S_2 = \frac{\delta Q
_2}{T_1},(\#eq:EntropyDifferentialsSwitched)
\]</span> where <span class="math inline">\(\delta Q_i\)</span> is the
heat absorbed by system <span class="math inline">\(i\)</span>, and
<span class="math inline">\(\text d S_i\)</span> is its corresponding
entropy change. However, since the composite system is thermally
insulated, we must have <span class="math inline">\(\delta Q_1 + \delta
Q_2=0\)</span> and <span class="math inline">\(\text d S_1 + \text d S_2
= 0\)</span><a href="#fn3" class="footnote-ref"
id="fnref3"><sup>3</sup></a>. Eq. @ref(eq:EntropyDifferentialsSwitched)
then implies that, if the process is reversible, we must necessarily
have <span class="math inline">\(T_1 = T_2\)</span>. This completes the
proof of @ref(eq:EntropyDifferential).</p>
<div id="refs" class="references csl-bib-body hanging-indent">
<div id="ref-dittman2021heat" class="csl-entry">
Dittman, Richard H, and Mark W Zemansky. 2021. <em>Heat and
Thermodynamics</em>, 7th ed.
</div>
<div id="ref-fermi1956thermodynamics" class="csl-entry">
Fermi, E. 1956. <em>Thermodynamics</em>. Dover Books in Physics and
Mathematical Physics. Dover Publications. <a
href="https://books.google.es/books?id=VEZ1ljsT3IwC">https://books.google.es/books?id=VEZ1ljsT3IwC</a>.
</div>
</div>
<div class="footnotes footnotes-end-of-document">
<hr />
<ol>
<li id="fn1"><p>We may compare these with the corresponding formulations
given in Enrico Fermi’s famous book <span class="citation">(Fermi
1956)</span>. For instance, Kelvin’s postulate reads: <em>“A
transformation whose only final result is to transform into work heat
extracted from a source which is at the same temperature throughout is
impossible.”</em> Even though I’m a big fan of Fermi’s book, I find the
more modern formulations given in <span class="citation">(Dittman and
Zemansky 2021)</span> clearer.<a href="#fnref1"
class="footnote-back">↩︎</a></p></li>
<li id="fn2"><p>In fact, Eq. @ref(eq:ClausiusProof2) tells us a bit more
than Clausius’ postulate, since it gives the maximum theoretical
efficiency of a refrigerator operating between temperatures <span
class="math inline">\(T_1 < T_2\)</span>: <span
class="math display">\[
\frac{\Delta Q_1}{\Delta W} \leq \frac{T_1}{T_2-T_1}
\]</span><a href="#fnref2" class="footnote-back">↩︎</a></p></li>
<li id="fn3"><p>The additivity of entropy is a consequence of the
additivity of heat, which in turn would require a dedicated discussion.
Such a requirement boils down to the additivity of external work, which
holds generally if the interaction energies of the systems being
composed are negligible. This is always assumed (more or less
explicitly) whenever discussing the interaction of a system with a heat
reservoir.</p>
<a href="#fnref3" class="footnote-back">↩︎</a></li>
</ol>
</div>
Categories: Thermodynamics, Physics

Authorship Attribution in Lennon-McCartney Songs
Valerio Gherardi, Thu, 23 May 2024
https://vgherard.github.io/posts/2024-05-23-authorship-attribution-in-lennon-mccartney-songs
<p><span class="citation">(Glickman, Brown, and Song 2019)</span>. An
enjoyable read. The authors present a statistical analysis of the
Beatles’ repertoire from the point of view of authorship (Lennon
<em>vs.</em> McCartney), a <a
href="https://vgherard.github.io/posts/2024-04-25-grammar-as-a-biometric-for-authorship-verification/">topic
with which I’ve been lately involved</a>. As a side-note, this also made
me discover the <a href="https://hdsr.mitpress.mit.edu/">Harvard Data
Science Review</a>.</p>
<p>From the paper’s abstract:</p>
<blockquote>
<p>The songwriting duo of John Lennon and Paul McCartney, the two
founding members of the Beatles, composed some of the most popular and
memorable songs of the last century. Despite having authored songs under
the joint credit agreement of Lennon-McCartney, it is well-documented
that most of their songs or portions of songs were primarily written by
exactly one of the two. Furthermore, the authorship of some
Lennon-McCartney songs is in dispute, with the recollections of
authorship based on previous interviews with Lennon and McCartney in
conflict. For Lennon-McCartney songs of known and unknown authorship
written and recorded over the period 1962-66, we extracted musical
features from each song or song portion. These features consist of the
occurrence of melodic notes, chords, melodic note pairs, chord change
pairs, and four-note melody contours. We developed a prediction model
based on variable screening followed by logistic regression with elastic
net regularization. Out-of-sample classification accuracy for songs with
known authorship was 76%, with a c-statistic from an ROC analysis of
83.7%. We applied our model to the prediction of songs and song portions
with unknown or disputed authorship.</p>
</blockquote>
<p>The modeling approach looks very sound to me, and appropriate to the
small sample size available (<span class="math inline">\(N =
70\)</span>, the statistical unit corresponding to a song of known
authorship). Effective model selection and testing is achieved through
three nested layers of cross-validation (😱): one for elastic net
hyperparameter tuning, one for feature screening, and finally one for
estimating the prediction error.</p>
<p>The discussion of feature importance is insightful, in that it
identifies concrete aspects of McCartney’s compositions that make them
distinguishable from Lennon’s. This type of interpretability is a
big plus for authorship analysis. The general qualitative conclusion,
that McCartney’s music tended to exhibit more complex and unusual
patterns, kinda resonates with my perception of the Beatles’ songs.</p>
<p>Armed with the trained logistic regression model, together with a
valid accuracy estimate (76%), the authors set out to apply their model
to authorship prediction for controversial cases within the Beatles’
corpus (outside of the training sample). I don’t fully understand the
authors’ approach in this part of the paper, and some points appear to be
questionable, for the reasons I explain below.</p>
<p>One of the advantages of fitting a full probability model, such as
logistic regression, rather than a conceptually simpler pure
classification model (like a tree, for example), is that the output of
the former is not a mere class (McCartney or Lennon), but rather a
<em>probability</em> of belonging to that class. This allows one to make
much more informative statements in the analysis of new cases, since the
strength of evidence provided by the data towards the predicted class
can be quantified on a case by case basis. All of this is true, of
course, <em>provided that the fitted model gives a decent approximation
to the true data generating process</em>.</p>
<p>With similar considerations in mind, I suppose, the authors produce
probability estimates for each of the disputed cases considered, in the
form of a point estimate and a confidence interval to represent
uncertainty. I think there is room for improvement here, in two
aspects.</p>
<p>My first objection is what I already pointed out above: nothing in
the modeling process explained in the paper suggests that the final
model provides a good approximation to the true class probability
conditional on features. The model has, with reasonable confidence, a
predictive performance close to the best achievable within the
possibilities considered - quantified by 76% accuracy and 84% AUC - but
this says nothing about its correct specification as a probability
model. Without a careful specification study, it is impossible to
conclude anything on the nature of the true estimation targets of the
fitted “probabilities”: they may well have nothing to do with the
actual <span class="math inline">\(\text{Pr}(\text{author}\,\vert\,
\text{song features})\)</span> the authors are after. There is still
value, I believe, in reporting fitted probabilities as qualitative
measures of evidence, but these should not be conflated with the true
(unknown) class probabilities… at least without some serious attempt to
detect differences between the two.</p>
<p>My second point is a technical one and concerns how they construct
confidence intervals for fitted probabilities. The construction
resembles that of bootstrap percentile confidence intervals but, rather
than the usual bootstrap synthetic datasets, the delete-one datasets
from leave-one-out cross-validation are used to obtain replicas of
the fitted probabilities. This is nothing but jackknife resampling in
disguise, and it is well known that the resampling standard deviation of
such jackknife replicas is roughly <span class="math inline">\(N
^{-1/2}\)</span> times the true standard deviation, see <em>e.g.</em>
<span class="citation">(Tibshirani and Efron 1993)</span>. Therefore, I
have strong reasons to believe that the reported intervals severely
underestimate the uncertainty associated with these probability
estimates.</p>
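<p>The <span class="math inline">\(N^{-1/2}\)</span> shrinkage is easy to
see in a toy example where the estimator is the sample mean (my own
illustration, unrelated to the paper’s model):</p>

```python
import numpy as np

# Leave-one-out replicas of an estimator spread roughly N^{-1/2} times
# the estimator's true sampling standard deviation. For the sample mean
# the ratio is exactly sqrt(N) / (N - 1), for any data.
rng = np.random.default_rng(1)
N = 100
x = rng.normal(size=N)

loo_means = np.array([np.delete(x, i).mean() for i in range(N)])
sd_replicas = loo_means.std(ddof=1)          # spread of jackknife replicas
sd_estimator = x.std(ddof=1) / np.sqrt(N)    # estimated SD of the mean

print(sd_replicas / sd_estimator)  # ~ 0.1 for N = 100
```

<p>The usual jackknife variance estimator compensates for exactly this
shrinkage by inflating the replica variance by a factor of order <span
class="math inline">\(N\)</span>.</p>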
<p>All in all, the attempt to go beyond reporting simple classes -
backed up by an overall 76% accuracy estimate - is well-motivated in
principle, but the final outcome is not very dependable.</p>
<p>As usual, I’m more eloquent when criticizing than when praising, but
let me end on a very positive note. The authors do a <em>great</em>
favor to the reader, by including a discussion of the informal steps
performed prior and in parallel to the formal analysis presented in the
paper. This kind of transparency - which is also present in the rest of
the discussion - is, I believe, not as common as it should be, and is
what ultimately makes it possible to think critically about someone else’s
work.</p>
<div id="refs" class="references csl-bib-body hanging-indent">
<div id="ref-Glickman2019Data" class="csl-entry">
Glickman, Mark, Jason Brown, and Ryan Song. 2019.
<span>“(<span>A</span>) <span>Data</span> in the <span>Life</span>:
Authorship <span>Attribution</span> in
<span>Lennon</span>-<span>McCartney</span> <span>Songs</span>.”</span>
<em>Harvard Data Science Review</em> 1 (1).
</div>
<div id="ref-tibshirani1993introduction" class="csl-entry">
Tibshirani, Robert J, and Bradley Efron. 1993. <span>“An Introduction to
the Bootstrap.”</span> <em>Monographs on Statistics and Applied
Probability</em> 57 (1): 1–436.
</div>
</div>
Categories: Comment on..., Authorship Verification, Natural Language Processing, Machine Learning, Music, Statistics

Frequentist bounds for Bayesian sequential hypothesis testing
Valerio Gherardi
https://vgherard.github.io/posts/2024-05-22-frequentist-bounds-for-bayesian-sequential-hypothesis-testing
<p>I just came across <span class="citation">(Kerridge 1963)</span>, an
old result which falls under the umbrella of “frequentist properties of
Bayesian inference”. Specifically, the theorem proved in this reference
applies to sequential testing, a context in which the mechanics of
Bayesian inference, with its typical sequential updates, may be regarded
as natural.</p>
<p>Suppose we wish to compare two hypotheses <span
class="math inline">\(H_0\)</span> and <span
class="math inline">\(H_1\)</span>, where <span
class="math inline">\(H_0\)</span> is simple<a href="#fn1"
class="footnote-ref" id="fnref1"><sup>1</sup></a>. We start collecting
data until our sample meets some specific requirement, according to some
given <em>stopping rule</em> <span class="math inline">\(S\)</span>. If
this ever occurs, we compute the <em>Bayes factor</em>:</p>
<p><span class="math display">\[
B = \frac{\text{Pr}(\text {data} \vert H_0)}{\text{Pr}(\text {data}
\vert H_1)}.(\#eq:BayesRatio)
\]</span> and reject <span class="math inline">\(H_0\)</span> if <span
class="math inline">\(B \leq b\)</span>, for some <span
class="math inline">\(b > 0\)</span>. The theorem is that if <span
class="math inline">\(H_0\)</span> is the true data generating process,
the above procedure has a false rejection rate no larger than <span
class="math inline">\(b\)</span>, <em>independently of the stopping rule
employed to end sampling</em>.</p>
<p>Notice that the rejection event is composed of two parts:</p>
<ol style="list-style-type: decimal">
<li>Sampling has stopped at some point during data taking.</li>
<li>When sampling stopped, <span class="math inline">\(B \leq b\)</span>
held.</li>
</ol>
<p>We also note that the stopping rule need not be deterministic,
although this appears to be implicitly assumed in the original
reference. In general, the data collected up to a certain point will
only determine the <em>probability</em> that sampling stops at that time
(and, to reinforce the previous point, these probabilities
will not, in general, add up to <span
class="math inline">\(1\)</span>).</p>
<p>In order to prove this theorem, let us set up some notation. Let
<span class="math inline">\((X_n)_{n\in \mathbb N}\)</span> be some
stochastic process representing “data”, where each <span
class="math inline">\(X_n \in \mathcal X\)</span> is a data point. We
denote by <span class="math inline">\(P^{(0)}\)</span> the probability
distribution of <span class="math inline">\(X\)</span> under <span
class="math inline">\(H_0\)</span>, which is completely defined since
<span class="math inline">\(H_0\)</span> is simple. We further denote by
<span class="math inline">\(P_n ^{(0)}\)</span> the corresponding
probability measure on <span class="math inline">\(\mathcal X
^n\)</span> for the set of the first <span
class="math inline">\(n\)</span> observations <span
class="math inline">\(X_1,\,X_2,\,\dots, \,X_n\)</span>.</p>
<p>We first consider the case in which <span
class="math inline">\(H_1\)</span> is also simple, and denote by <span
class="math inline">\(P^{(1)}\)</span> and <span
class="math inline">\(P^{(1)}_n\)</span> the corresponding measures. The
Bayes factor is defined as the Radon-Nikodym derivative:</p>
<p><span class="math display">\[
B_n \equiv \frac{\text d P^{(0)}_n}{\text d
P_n^{(1)}}(\#eq:BayesRatioRadonNikodym)
\]</span> (we assume regularity conditions so that such a derivative
exists).</p>
<p>Also, we assume for the moment that the stopping rule is
deterministic, embodied by binary functions <span
class="math inline">\(S_n=S(X_1,\,X_2,\,\dots,X_n)\)</span> of the first
<span class="math inline">\(n\)</span> observations, with <span
class="math inline">\(S_n = 1\)</span> if sampling can stop at step
<span class="math inline">\(n\)</span>.</p>
<p>Now fix <span class="math inline">\(b>0\)</span>. A rejection of
<span class="math inline">\(H_0\)</span> at sampling step <span
class="math inline">\(n\)</span> is represented by the event:</p>
<p><span class="math display">\[
\mathcal R _{n}(b)\equiv \{B_n\leq b,\,S_n=1,\,S_i=0\,\text{ for
}i<n\},(\#eq:RejectionEvents)
\]</span> which, with abuse of notation, we may identify with a subset
of <span class="math inline">\(\mathcal X ^n\)</span>. The overall
rejection event (at any sampling step) is given by:</p>
<p><span class="math display">\[
\mathcal R (b)\equiv \bigcup _{n=1} ^\infty \mathcal
R_n(b),(\#eq:BigRejectionEvent)
\]</span> so that our theorem amounts to the bound:</p>
<p><span class="math display">\[
\text{Pr}_{H_0}(\mathcal R(b))\leq b. (\#eq:TheoremStatement)
\]</span> In order to prove this, we first note that:</p>
<p><span class="math display">\[
\text{Pr}_{H_0}(\mathcal R _n(b))=
\intop _{\mathcal R _n(b)} \text d P^{(0)}_n=
\intop _{\mathcal R _n(b)}B_n \text d P^{(1)}_n \leq
b\intop _{\mathcal R _n(b)} \text d
P^{(1)}_n=b\cdot\text{Pr}_{H_1}(\mathcal R _n(b)).(\#eq:TheoremStep1)
\]</span> Hence, since the events <span class="math inline">\(\mathcal R
_n(b)\)</span> and <span class="math inline">\(\mathcal R _m(b)\)</span>
are clearly disjoint for <span class="math inline">\(n\neq m\)</span>,
we have:</p>
<p><span class="math display">\[
\text{Pr}_{H_0}(\mathcal R(b))\leq b\cdot\text{Pr}_{H_1}(\mathcal R
(b))(\#eq:TheoremStep2),
\]</span> which, since <span
class="math inline">\(\text{Pr}_{H_1}(\cdot)\leq1\)</span>, implies
@ref(eq:TheoremStatement).</p>
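<p>A small Monte Carlo experiment (my own toy Gaussian setup, not from
the reference) illustrates the bound under the deterministic stopping
rule “stop as soon as <span class="math inline">\(B_n \leq
b\)</span>”:</p>

```python
import numpy as np

# H0: X ~ N(0, 1) versus the simple alternative H1: X ~ N(1, 1), with the
# adversarial stopping rule: stop as soon as B_n <= b. Under H0 the
# rejection rate must stay below b no matter how long we keep sampling.
rng = np.random.default_rng(2)
b, n_max, n_sim = 0.1, 500, 20_000

rejections = 0
for _ in range(n_sim):
    x = rng.normal(0.0, 1.0, size=n_max)  # data generated under H0
    # log B_n = sum_i [log phi(x_i; 0) - log phi(x_i; 1)] = sum_i (1/2 - x_i)
    log_B = np.cumsum(0.5 - x)
    if (log_B <= np.log(b)).any():
        rejections += 1

print(rejections / n_sim)  # below b = 0.1, despite optional stopping
```

<p>Truncating at a finite horizon can only lower the rejection rate, so
the bound applies a fortiori.</p>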
<p>We may relax the assumption that the alternative hypothesis is
simple, by considering a parametric family of measures <span
class="math inline">\((P^{(1)}_\theta)_{\theta \in \Theta}\)</span>,
where the parameter <span class="math inline">\(\theta\)</span> has some
prior probability <span class="math inline">\(\text
d\Phi(\theta)\)</span>. The argument given above still applies to this
case, if <span class="math inline">\(P^{(1)}\)</span> is replaced by the
mixture <span class="math inline">\(P^{(1)} = \intop \text d
\Phi(\theta) P^{(1)}_\theta\)</span> (under appropriate regularity
assumptions). In the notation of Eq. @ref(eq:BayesRatio), the
denominator becomes <span class="math inline">\(\text {Pr}(\text {data} \vert
H_1)\equiv \intop \text d \Phi(\theta)\,\text{Pr}(\text{data} \vert
H_{1,\theta})\)</span>.</p>
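<p>As a concrete instance of such a mixture (again an illustrative setup
of mine, not taken from the post), one may test <span
class="math inline">\(H_0 \colon X_i \sim N(0,1)\)</span> against <span
class="math inline">\(H_{1,\theta} \colon X_i \sim N(\theta,1)\)</span>
with prior <span class="math inline">\(\theta \sim N(0,\tau^2)\)</span>.
The marginal likelihood is Gaussian, and a short computation gives the
closed form <span class="math inline">\(B_n =
\sqrt{n\tau^2+1}\;e^{-s_n^2/(2(n+1/\tau^2))}\)</span>, where <span
class="math inline">\(s_n = x_1+\dots+x_n\)</span>. The bound can then
be checked by simulation:</p>

```python
import numpy as np

rng = np.random.default_rng(1)

def mixture_bayes_ratio(x, tau2):
    """Running Bayes ratio B_n for H0: N(0,1) vs the mixture alternative
    H1: X_i ~ N(theta, 1) with theta ~ N(0, tau2). Closed form:
    B_n = sqrt(n * tau2 + 1) * exp(-s_n ** 2 / (2 * (n + 1 / tau2)))."""
    n = np.arange(1, len(x) + 1)
    s = np.cumsum(x)
    return np.sqrt(n * tau2 + 1) * np.exp(-s ** 2 / (2 * (n + 1 / tau2)))

b, tau2, n_max, n_sim = 0.05, 1.0, 100, 20_000  # illustrative values
rejections = 0
for _ in range(n_sim):
    x = rng.standard_normal(n_max)  # data generated under H0
    # Reject if B_n <= b at any step within the (truncated) horizon.
    if np.any(mixture_bayes_ratio(x, tau2) <= b):
        rejections += 1

rate = rejections / n_sim
assert rate <= b  # the bound survives the mixture alternative
```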
<p>Finally, in order to lift the assumption that our stopping rule is
deterministic, let us first consider the following special
(deterministic) stopping rule:</p>
<p><span class="math display">\[
S^*_n =1\iff B_n \leq b.(\#eq:DataDredging)
\]</span> In other words, we stop sampling whenever the sample would
reject <span class="math inline">\(H_0\)</span> according to <span
class="math inline">\(B_n \leq b\)</span>. The rejection event <span
class="math inline">\(\mathcal R(b)\)</span> for this special stopping
rule is simply:</p>
<p><span class="math display">\[
\mathcal R^*(b) \equiv \{B_n \leq b\text{ for some }n\in \mathbb
N\}.(\#eq:RejectionDataDredging)
\]</span> Since we already proved the theorem for any deterministic
stopping rule, Eq. @ref(eq:TheoremStatement) implies:</p>
<p><span class="math display">\[
\text {Pr}_{H_0}(\mathcal R^*(b)) \leq b.(\#eq:BoundDataDredging)
\]</span> But Eq. @ref(eq:BoundDataDredging) clearly implies the theorem
for any stopping rule, deterministic or not, since in general:</p>
<p><span class="math display">\[
\mathcal R(b) \subseteq \mathcal R^*(b)(\#eq:ProperSubsetDataDredging)
\]</span> (we need <span class="math inline">\(B_n\leq b\)</span> to
hold for some <span class="math inline">\(n\in \mathbb N\)</span> in
order to reject <span class="math inline">\(H_0\)</span>).</p>
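<p>The special rule @ref(eq:DataDredging) itself is easy to simulate
(again with the illustrative pair <span class="math inline">\(H_0 \colon
N(0,1)\)</span> vs. <span class="math inline">\(H_1 \colon
N(1,1)\)</span>, truncating the infinite horizon at a finite
<code>n_max</code>): even if we check <span
class="math inline">\(B_n \leq b\)</span> after every single observation
and stop at the first success, the false rejection rate under <span
class="math inline">\(H_0\)</span> stays below <span
class="math inline">\(b\)</span>:</p>

```python
import numpy as np

rng = np.random.default_rng(2)

b, n_max, n_sim = 0.05, 200, 20_000  # illustrative values
rejections = 0

for _ in range(n_sim):
    x = rng.standard_normal(n_max)  # data generated under H0: N(0, 1)
    # log B_n = sum_{i<=n} (1/2 - x_i) for H0: N(0,1) vs H1: N(1,1)
    log_B = np.cumsum(0.5 - x)
    # Rule S*_n: stop, and reject, the first time B_n <= b; equivalently,
    # reject iff B_n <= b at some step within the horizon.
    if np.any(log_B <= np.log(b)):
        rejections += 1

rate = rejections / n_sim  # estimate of Pr_{H0}(R*(b))
assert rate <= b           # Eq. (BoundDataDredging)
```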
<p>Interestingly, the argument just given leads to a more accurate
statement of our main result @ref(eq:TheoremStatement):</p>
<p><span class="math display">\[
\text{Pr}_{H_0}(\mathcal R(b))\leq \text {Pr}_{H_0}(B_n \leq b\text{ for
some }n\in \mathbb N) \leq b,(\#eq:TheoremStatement2)
\]</span> where the leftmost quantity is the false rejection rate of a
selective testing procedure, such as the one we have been considering so
far, whereas the central quantity is the false rejection rate of a
<em>simultaneous</em> testing procedure (that checks whether <span
class="math inline">\(B_n \leq b\)</span> at each step of sampling).
What’s happening here is analogous to a phenomenon observed in the
context of parameter estimation following model selection <span
class="citation">(Berk et al. 2013)</span>: if the selection rule is
allowed to be completely arbitrary, then in order to guarantee marginal
coverage for the selected parameters one must actually require
<em>simultaneous</em> coverage for all possible parameters.</p>
<p>To conclude the post, let us remark that theorem
@ref(eq:TheoremStatement) was originally formulated in terms of the
posterior probability <span class="math inline">\(Q_n(\pi)\)</span> of
<span class="math inline">\(H_0\)</span>:</p>
<p><span class="math display">\[
Q_n(\pi) = \frac{\pi }{\pi +(1-\pi)B^{-1}_n},(\#eq:PosteriorProb)
\]</span></p>
<p>where <span class="math inline">\(\pi\)</span> and <span
class="math inline">\(1-\pi\)</span> are the prior probabilities of the
two competing models <span class="math inline">\(H_0\)</span> and <span
class="math inline">\(H_1\)</span>, respectively. We may use <span
class="math inline">\(Q_n(\pi) \leq q\)</span>, rather than <span
class="math inline">\(B_n \leq b\)</span>, as the relevant criterion for
rejecting <span class="math inline">\(H_0\)</span>. From the pure
frequentist point of view, this doesn’t add anything to our formulation
in terms of the Bayes ratio, as <span class="math inline">\(Q_n(\pi)\leq
q\)</span> is equivalent to <span class="math inline">\(B_n \leq
b\)</span> as long as <span class="math inline">\(b =
\frac{q}{1-q}\frac{1-\pi}{\pi}\)</span>. In particular, the bound
analogous to @ref(eq:TheoremStatement2) reads:</p>
<p><span class="math display">\[
\text{Pr}_{H_0}(\mathcal R(q))\leq \text {Pr}_{H_0}(Q_n(\pi) \leq
q\text{ for some }n\in \mathbb N) \leq
\frac{q}{1-q}\frac{1-\pi}{\pi}.(\#eq:TheoremStatement3)
\]</span></p>
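<p>The equivalence between the two rejection criteria is worth spelling
out in code (a short Python sketch; the function name is mine):</p>

```python
def posterior_prob_H0(B_n, prior):
    """Posterior probability Q_n(pi) of H0 (Eq. PosteriorProb), given the
    Bayes ratio B_n = Pr(data | H0) / Pr(data | H1) and the prior
    probability `prior` of H0."""
    return prior / (prior + (1 - prior) / B_n)

# Q_n(pi) <= q is equivalent to B_n <= b with b = q/(1-q) * (1-pi)/pi:
pi, q = 0.5, 0.05
b = q / (1 - q) * (1 - pi) / pi

for B in [0.01, 0.04, 0.1, 1.0, 10.0]:  # values on both sides of b ~ 0.0526
    assert (posterior_prob_H0(B, pi) <= q) == (B <= b)
```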
<pre class="r distill-force-highlighting-css"><code></code></pre>
<div id="refs" class="references csl-bib-body hanging-indent">
<div id="ref-berk2013valid" class="csl-entry">
Berk, Richard, Lawrence Brown, Andreas Buja, Kai Zhang, and Linda Zhao.
2013. <span>“Valid Post-Selection Inference.”</span> <em>The Annals of
Statistics</em>, 802–37.
</div>
<div id="ref-kerridge1963bounds" class="csl-entry">
Kerridge, D. 1963. <span>“Bounds for the Frequency of Misleading Bayes
Inferences.”</span> <em>The Annals of Mathematical Statistics</em> 34
(3): 1109–10.
</div>
</div>
<div class="footnotes footnotes-end-of-document">
<hr />
<ol>
<li id="fn1"><p>This is a technical term, meaning that <span
class="math inline">\(H_0\)</span> completely characterizes the
probability distribution of data. An example of a non-simple hypothesis
would be a parametric model depending on some unknown parameter <span
class="math inline">\(\theta\)</span>.<a href="#fnref1"
class="footnote-back">↩︎</a></p></li>
</ol>
</div>