What is the Philosophy of Statistics?
(and how I was drawn to it)
Deborah G Mayo
Dept of Philosophy, Virginia Tech
April 30, 2025
1
In a conversation with Sir David Cox
COX: Deborah, in some fields foundations do not
seem very important, but we both think foundations of
statistical inference are important; why do you think?
MAYO: …in statistics …we invariably cross into
philosophical questions about empirical knowledge
and inductive inference.
(“A Statistical Scientist Meets a Philosopher of Science” 2011)
2
3
At one level, statisticians and
philosophers of science ask
similar questions
• What should be observed and what may
justifiably be inferred from data?
• How well do data confirm or test a model?
Two-way street
• Statistics is a kind of “applied philosophy of
science” (Kempthorne, 1976).
4
5
“The Statistics Wars”
Statistics has long been the subject of
philosophical debate marked by unusual heights
of passion and controversy
6
Philosophy → Statistics
• A central job for philosophers of science:
minister to conceptual, epistemological, and
logical problems of sciences
7
Statistics → Philosophy
Statistical ideas are used in philosophy to:
1) Model Scientific Inference—ways to arrive at
evidence and inference
2) Solve (or reconstruct) Philosophical
Problems about inference and evidence (e.g.,
problem of induction, should we prefer novel
predictions?)
8
Logics of Confirmation
• Failure to justify (enumerative) induction led to
building logics of confirmation C(H,e), like
deductive logic: Carnap
• Evidence e is given, calculating C(H,e) is formal.
• Carnap: Scientists could come to the inductive
logician for rational degree of confirmation
9
“According to modern logical empiricist orthodoxy...It is
quite irrelevant whether e was known first and h
proposed to explain it, or whether e resulted from
testing predictions drawn from h”.
(Alan Musgrave 1974, p. 2).
The search for C(H,e) did not succeed, and the
program is challenged by post-positivists (Popper,
Lakatos, Laudan, Kuhn)
10
Post-Positivists
Evidence e is theory-laden, not given:
Collected with difficulty, must start with a problem
Popper: Falsification, not Confirmation:
Corroborate claims that survive severe tests
11
Popper never adequately cashed out
severity—tried adding requirements like novel
predictive success to C(h, e)
(due partly to his unfamiliarity with statistical
methods)
Neyman and Pearson (N-P):
Methodological falsification
“We may search for rules to govern our behavior…in
following which we insure that in the long run of
experience, we shall not be too often wrong” (Neyman
and Pearson 1933, 141-2)
• Probability is assigned, not to hypotheses, but to
methods
12
Series of models
It is a fundamental contribution of modern mathematical
statistics to have recognized the explicit need of a
model in analyzing the significance of experimental
data. (Patrick Suppes 1969, 33)
13
“I might never have sauntered into that first class on
mathematical statistics had the Department of
Statistics not been situated so closely to the
Department of Philosophy [at U Penn]. ...I
suspected that understanding how these statistical
methods worked would offer up solutions to the
vexing problem of how we learn about the world in
the face of error.” (Mayo 1996, Error and the
Growth of Experimental Knowledge: EGEK)
14
15
What really influenced me in the years before EGEK
(here at Virginia Tech):
In statistics: famous Bayesian I.J. Good:
Why wasn’t I a Bayesian?
In philosophy, leading Popperian Laudan:
Can I make progress on their problems about
large scale units: paradigms and research programs
16
What I found then holds true:
• Insights from grappling with foundational
problems of statistics (and data science)
provide a gold mine for making progress on
philosophical problems
• Turn now to controversies of PhilStat
17
Long-standing philosophical
controversy on roles of probability
Frequentist (performance): to control and assess
the relative frequency of misinterpretations of
data—error probabilities
(e.g., P-values, confidence intervals,
randomization, resampling)
Bayesian (and other probabilisms): to assign
degrees of belief or support in claims
(e.g., Bayes factors, Bayesian posterior
probabilities)
Beyond performance and
probabilism
• Long-standing battles simmer below the surface
in today’s “statistical (replication) crisis” in science
• Biggest source of handwringing? High powered
methods make it easy to find impressive-looking
but spurious effects
18
I set sail with a minimal principle
of evidence (severity principle)
• We don’t have evidence for a claim C if little if
anything has been done that would have
found C flawed, even if it is
19
Statistical inference as
severe testing
Probability arises (in statistical inference) to
assess and control how capable methods are at
uncovering and avoiding erroneous
interpretations of data
20
I invite the reader: “Envision yourself embarking on
a special interest cruise” (Excursions and Tours)
The severity principle is an excavation tool: to get
beyond the “statistics wars” and for appraising
reforms
21
Replication crisis leads to
“reforms”
Several are welcome:
• preregistration of protocol, replication
checks, avoid cookbook statistics
Others are radical
• and even lead to violating our minimal
requirement for evidence
22
Being an outside philosopher helps
To unearth assumptions and combat paradoxical
“reforms” requires taking on strong ideological
leaders (philosophical observer)
23
What is the replication crisis?
• Results that had been found statistically
significant are not found significant when an
independent group seeks to replicate them.
• What are statistical significance tests?
24
25
Simple significance (Fisherian) tests
“…to test the conformity of the particular data
under analysis with H0 in some respect….” (Mayo
and Cox 2006, p. 81)
…the P-value: the probability that the test would yield
an even larger (or more extreme) value of a test
statistic T, assuming H0 (chance variability or noise)
NOT Pr(data|H0)
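A minimal sketch of this definition, under assumptions that are not in the slides: a one-sided test of H0: μ = μ0 for a Normal mean with known σ, with T the standardized sample mean. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def p_value_one_sided(xbar, mu0, sigma, n):
    """Pr(T >= t_obs; H0), where T = (Xbar - mu0) / (sigma / sqrt(n))."""
    t_obs = (xbar - mu0) / (sigma / sqrt(n))
    # Under H0 the statistic T is standard Normal, so the P-value is the upper tail area.
    return 1.0 - NormalDist().cdf(t_obs)

# Hypothetical data: observed mean 0.4, H0: mu = 0, sigma = 1, n = 25, so t_obs = 2.
p = p_value_one_sided(xbar=0.4, mu0=0.0, sigma=1.0, n=25)
print(round(p, 4))
```

The quantity computed is a tail-area probability of the test statistic under H0, not the probability of the data and not the probability of H0.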
26
Testing reasoning:
statistical modus tollens
• Small P-values indicate* some underlying
discrepancy from H0 because very probably
(1- P) you would have seen a less impressive
difference were H0 true.
• Still not evidence of a genuine statistical effect
H1, let alone a scientific conclusion H*
*(until an audit is conducted testing assumptions)
27
Neyman and Pearson tests (1933) put
Fisherian tests on firmer ground:
Introduce alternative hypotheses H0, H1
e.g., H0: no benefit vs. H1: some benefit
Controls Type I errors (erroneous rejections) and
Type II errors (erroneously failing to reject); ensures
that as effect sizes increase, so does test’s power
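The link between effect size and power can be sketched for one simple case. The setup is an assumption for illustration only: a one-sided z test of H0: μ ≤ μ0 with known σ, fixed n, and a Type I error probability of 0.025; none of these numbers come from the slides.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal

def power(mu1, mu0=0.0, sigma=1.0, n=25, alpha=0.025):
    """Pr(test rejects H0; mu = mu1) for the one-sided z test of H0: mu <= mu0."""
    cutoff = Z.inv_cdf(1 - alpha)          # rejection threshold for the statistic T
    shift = (mu1 - mu0) * sqrt(n) / sigma  # mean of T when mu = mu1
    return 1 - Z.cdf(cutoff - shift)

# Power at increasing hypothetical effect sizes mu1 - mu0.
powers = [power(mu1) for mu1 in (0.0, 0.2, 0.4, 0.6)]
print([round(p, 3) for p in powers])
```

At μ1 = μ0 the rejection probability equals the Type I error probability; it then rises with the discrepancy, and the Type II error probability is one minus the power at each μ1.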
Hypotheses vs Events
Statistical hypotheses assign probabilities to data or
events Pr(x0;H0)
• Correctly guessing tea or milk first is like a coin tossing
trial Pr(success) on each trial = .5 (lady tasting tea)
• A drug does not improve lung function in patients
• The deflection of light due to the sun is 1.75 arcseconds
(an estimate)
It’s rare to assign frequentist probabilities to stat
hypotheses (not random variables); inference is
qualified by probabilistic properties of the test method
28
Error control is lost by biasing
selection effects: fooled by
randomness
• Sufficient finagling—cherry-picking,
significance seeking, multiple testing, post-
data subgroups, trying and trying again—
may practically guarantee an impressive-
looking effect, even if it’s unwarranted by
evidence
• Violates error control and severity
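A rough sketch of how multiple testing erodes error control. The assumption, made only for illustration, is k independent tests of true nulls, each reported if its nominal P-value is at or below 0.05; real endpoints and subgroups are usually correlated, so these figures are not a model of any actual study.

```python
def prob_at_least_one_nominal_rejection(k, alpha=0.05):
    """Pr(at least one of k independent tests is nominally significant; all nulls true)."""
    return 1 - (1 - alpha) ** k

# The actual error probability of "report whatever reaches 0.05" grows with k.
for k in (1, 5, 11, 20):
    print(k, round(prob_at_least_one_nominal_rejection(k), 3))
```

The nominal 0.05 level describes a single prespecified test; once the analyst searches over many, the probability of finding some impressive-looking result, even with no real effects, is much higher.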
29
30
Data Dredging (Torturing): P-hacking
The case of the Drug CEO: Finding no statistically
significant benefit on the primary endpoint (lung benefit),
nor on 10 other secondaries....ransacks the data
Harkonen vs. United States (2013)—guilty of wire fraud
Trump pardons
31
Violates: Severity Requirement
We have evidence for a claim C only to the extent C
has been subjected to and passes a test that would
probably have found C flawed, just if it is.
• This probability is the stringency or severity with
which C has passed the test.
• applicable to any error-prone problem
Severity gives an evidential twist
• What is the epistemic value of good performance
relevant for inference in the case at hand?
• What bothers you with selective reporting, cherry
picking, stopping when the data look good, P-hacking
• We cannot say the test has done its job in the case at
hand in avoiding sources of misinterpreting data
32
A claim C is not warranted _______
• Probabilism: unless C is true or probable (gets
a probability boost, made comparatively firmer,
more believable)
• Performance: unless it stems from a method
with low long-run error
• Probativism (severe testing): unless
something (a fair amount) has been done to
probe (& rule out) ways we can be wrong
about C
33
Informal Example: Severe Tests
To test if I’ve gained weight between the start and
end of the pandemic, I use a series of well-calibrated
scales, at the start and after.
All show a gain of over 4 lb, and none shows a difference
in weighing EGEK. I’m forced to infer:
H: I’ve gained at least 4 pounds
We argue about the source of the readings from the
high capacity to reveal if any scales were wrong
34
35
The severe tester is assumed to be
in a context of wanting to find
things out
• I could insist all the scales are wrong—they
work fine with weighing known objects—but
this would prevent correctly finding out about
weight….. (rigged alternative)
• What sort of extraordinary circumstance could
cause them all to go astray just when we do
not know the weight of the test object?
36
Aside:
Severity led to reformulating significance tests.
After 1999, I was going to focus on problems in
philosophy of science and epistemology:
e.g., when can proportions in populations give
probabilities for the single case? (almost never)
(Laudan shifts to work on probability: error in the
law)
37
Prof. Aris Spanos, Wilson Schmidt Professor
of Economics at Virginia Tech
In 2003 I invited Sir David Cox
(Oxford) to be part of a session on
Phil Stat
• “Our goal is to identify a key principle of
evidence by which hypothetical error
probabilities may be used for inductive
inference.” (Mayo and Cox 2006)
• Fisher tried fiducial probability
38
A rival (and more philosophically
popular) view of stat evidence
underlies ‘probabilism’
• Philosopher Ian Hacking (1965) “Law of
Likelihood”:
Pr(x0;H0)/Pr(x0;H1)
x0 supports H0 less well than H1 if H0 is less likely
than H1 in this technical sense.
(Note: the likelihood of a hypothesis is not its
probability)
39
Problem
Any hypothesis that perfectly fits the data is
maximally likely (even if data-dredged)
• “there always is such a rival hypothesis viz.,
that things just had to turn out the way they
actually did” (Barnard 1972, 129)
40
Error probabilities are
“one level above” a fit measure:
Pr(H0 is found less well supported than H1; H0)
is high for some H1 or other
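This error probability can be checked by simulation in a toy case. The assumptions are illustrative only: Normal data with known σ = 1, H0: μ = 0, and the data-dependent rival H1: μ = x̄ (the best-fitting value). The function and variable names are hypothetical.

```python
import random

def log_lr_h0_vs_best_fit(sample):
    """log[Lik(mu = 0) / Lik(mu = xbar)] for Normal data with known sigma = 1.

    The sum-of-squares terms cancel, leaving -n * xbar**2 / 2, which is
    never positive: the best-fitting alternative is at least as likely as H0.
    """
    n = len(sample)
    xbar = sum(sample) / n
    return -n * xbar ** 2 / 2

rng = random.Random(0)  # fixed seed so the run is reproducible
trials, n = 2000, 10
# Every sample is generated with H0 (mu = 0) true.
h0_loses = sum(
    log_lr_h0_vs_best_fit([rng.gauss(0.0, 1.0) for _ in range(n)]) < 0
    for _ in range(trials)
)
fraction_h0_less_supported = h0_loses / trials
print(fraction_h0_less_supported)
```

Although H0 is true in every trial, a comparison against the best-fitting rival practically always finds H0 less well supported. The likelihood ratio alone does not register this; the error probability of the procedure does.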
41
“There is No Such Thing as a Logic
of Statistical Inference”
• Hacking changes his mind (1972, 1980)
“I now believe that Neyman, Peirce, and
Braithwaite were on the right lines to follow in
the analysis of inductive arguments”
(Hacking 1980, 141)
The analogous debate: Mill (1888), Keynes
(1921) vs. Whewell (1847), Peirce (1931-5),
Popper (1959)
42
Likelihood Principle (LP)
A pervasive view remains: all the evidence
from x is contained in the ratio of likelihoods*:
Pr(x;H0) / Pr(x;H1)
• Follows from inference by Bayes theorem
*for inference in a model
43
On the LP, error probabilities
appeal to something irrelevant
“Sampling distributions, significance levels,
power, all depend on something more [than the
likelihood function]–something that is irrelevant
in Bayesian inference–namely the sample
space” (Lindley 1971, 436)
44
Some alternatives offered to replace
significance tests obey the LP
• “Bayes factors can be used in the complete absence
of a sampling plan…” (Bayarri, Benjamin, Berger,
Sellke 2016, 100)
• It seems very strange that a frequentist could not
analyze a given set of data…if the stopping rule is
not given. (Berger and Wolpert, The Likelihood
Principle 1988, 78)
• Stopping rules refer to a second kind of multiplicity
45
Optional Stopping:
H0: no effect vs. H1: some effect
Instead of fixing the sample size n in advance:
Keep sampling until H0 is rejected at (“nominal”)
0.05 level
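The effect of this stopping rule on the actual Type I error probability can be sketched by simulation. The assumptions are illustrative: standard Normal observations with H0 (mean 0) true, a two-sided nominal 0.05 cut-off of 1.96, a check after every observation, and a cap of 100 observations. The function names are hypothetical.

```python
import random
from math import sqrt

def rejects_with_optional_stopping(rng, max_n, z_crit=1.96):
    """Sample one observation at a time; stop as soon as |z| reaches z_crit."""
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(0.0, 1.0)  # H0 is true: mean 0, sigma 1
        if abs(total / sqrt(n)) >= z_crit:
            return True
    return False

def rejects_with_fixed_n(rng, n, z_crit=1.96):
    """Single test at a sample size fixed in advance."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n))
    return abs(total / sqrt(n)) >= z_crit

rng = random.Random(1)  # fixed seed for reproducibility
trials = 2000
optional_rate = sum(rejects_with_optional_stopping(rng, 100) for _ in range(trials)) / trials
fixed_rate = sum(rejects_with_fixed_n(rng, 100) for _ in range(trials)) / trials
print(f"optional stopping: {optional_rate:.3f}, fixed n: {fixed_rate:.3f}")
```

With the sample size fixed, the rejection rate under H0 stays near the nominal level; with trying and trying again it is several times larger, and it keeps growing as the cap on n is raised.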
46
In testing the mean of a standard
normal distribution
47
Optional Stopping
48
• “if an experimenter uses this [optional stopping]
procedure, then with probability 1 he will
eventually reject any sharp null hypothesis,
even though it be true”
(Edwards, Lindman, and Savage 1963, 239)
• Understandably, they observe, the significance
tester deems this cheating, requiring an
adjustment of the P-values
Nevertheless, on their view:
• “the import of the...data actually observed will
be exactly the same as it would had you
planned to take n observations.” (ibid)
• What counts as cheating depends on statistical
philosophy
49
Contrast this with reforms from
replication research
• Simmons, Nelson, and Simonsohn (2011):
“Authors must decide the rule for terminating
data collection before data collection begins and
report this rule in the articles” (ibid. 1362).
50
Replication Paradox
• Significance test critic: It’s too easy to obtain low
P-values
• You: Why is it so hard to replicate low P-values with
preregistered protocols?
• Significance test critic: Initial studies were guilty of
P-hacking, cherry-picking, data-dredging (QRPs)
• You: So, replication researchers want methods that
pick up on these biasing selection effects.
• Significance test critic: Actually, reforms
recommend methods insensitive to gambits that
alter error probabilities
51
52
A question
If a method is insensitive to error probabilities, does
it escape inferential consequences of gambits that
inflate error rates?
It depends on a critical understanding of “escape
inferential consequences”
53
It would not escape consequences for a
severe tester:
• What alters error probabilities alters the
method’s error probing capability
• Alters the severity of what’s inferred
54
Notes/qualifications:
• Bayesians and frequentists use sequential trials—
difference is whether/how to adjust
• Post-data selective inference is a major research
area of its own in AI/ML
• Data-dredging need not be pejorative: It’s a biasing
selection effect only when it injures severity
• It can even increase severity (e.g., searching for a
match with DNA testing in a complete database)
Bayesian Probabilists may (indirectly)
block intuitively unwarranted
inferences
(without error probabilities)
• Likelihoods + prior probabilities
• Give high prior probability to “no effect”
(spike prior)
55
Problems
• It doesn’t show what researchers had done
wrong—battle of beliefs
• The believability of data-dredged hypotheses is
what makes them so seductive
• Additional source of flexibility
56
No help with the severe tester’s
key problem
• How to distinguish the warrant for a single
hypothesis H with different methods
(e.g., one with selection effects, another, pre-
registered results)?
• Since there’s a single H, its prior would be the
same
57
Most Bayesians (last 15 years) use
“objective” or “conventional” priors
• ‘Eliciting’ subjective priors too difficult:
“[V]irtually never would different experts give prior
distributions that even overlapped” (J. Berger 2006,
392)
• Conventional priors are supposed to prevent prior
beliefs from influencing the posteriors–data
dominant (ideally, some performance guarantees)
58
How should we interpret them?
• “The priors are not to be considered expressions
of uncertainty, ignorance, or degree of belief.
Conventional priors may not even be
probabilities…” (Cox and Mayo 2010, 299)
• No agreement on rival systems for default/non-
subjective priors
(maximum entropy, invariance, maximizing the
missing information, coverage matching.)
59
60
Severity Reformulates tests to
avoid classic fallacies
in terms of discrepancies (effect sizes) that are and
are not severely-tested
SEV(Test T, data x, claim C)
• In a nutshell: one tests several discrepancies
from a test hypothesis and infers those well or
poorly warranted
Mayo 1991-; Mayo and Spanos (2006, 2011); Mayo and
Cox (2006); Mayo and Hand (2022)
61
SEV(𝛍 > 𝛍𝟏), 𝛍𝟏 = 𝛍𝟎 + γ
to avoid misinterpreting low P-values
(SE = 1)
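A small numerical sketch of this assessment. The assumptions are stated for illustration: a one-sided test of a Normal mean with H0: μ ≤ μ0, SE = 1, a hypothetical observed mean two standard errors above μ0, and severity for the claim μ > μ1 computed as Pr(X̄ ≤ x̄_obs; μ = μ1), the form used in Mayo and Spanos (2006). Here gamma denotes the discrepancy μ1 − μ0, and the function name is hypothetical.

```python
from statistics import NormalDist

def sev_greater(xbar_obs, mu1, se=1.0):
    """SEV(mu > mu1) = Pr(Xbar <= xbar_obs; mu = mu1) for a Normal mean."""
    return NormalDist(mu=mu1, sigma=se).cdf(xbar_obs)

mu0, xbar_obs = 0.0, 2.0  # hypothetical result: 2 SE above the null value
sev_by_gamma = {
    gamma: sev_greater(xbar_obs, mu0 + gamma) for gamma in (0.0, 1.0, 2.0, 3.0)
}
for gamma, sev in sev_by_gamma.items():
    print(f"SEV(mu > {mu0 + gamma}) = {sev:.3f}")
```

The same low P-value warrants some discrepancy from μ0 with high severity but larger discrepancies poorly, which is how the assessment guards against reading a small P-value as evidence of a large effect.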
Some Bayesians reject probabilism
(Gelman and Shalizi (2013)):
Falsificationist Bayesian
“[M]ost of [the] received view of Bayesian inference is
wrong. ...[C]rucial parts of Bayesian data analysis, …
can be understood as ‘error probes’ in Mayo’s sense.”
(10)
“[W]hat we are advocating, then, is what Cox and
Hinkley (1974) call ‘pure significance testing’.” (10)
• Last part of SIST: (Probabilist) Foundations Lost,
(Probative/Error Statistical) Foundations Found
62
63
• A silver lining to distinguishing highly
probable and highly probed–can use
different methods for different contexts
• Attempts to unify are behind much of the
confusion about stat concepts today in
medicine, economics, law, psychology,
climate science, social science
• Philosophical insight is needed
Aside: there’s a famous argument
(Birnbaum, 1962) that frequentist
principles entail the LP
• I claim the argument is unsound—using logic alone
• In a book out a few weeks ago (Rod Little)—Little
doesn’t agree with me
64
Thank you!
If there’s time after questions, I’ll
say more about the severity
reinterpretation
65
66
SEV(𝛍 > 𝛍𝟏), 𝛍𝟏 = 𝛍𝟎 + γ
to avoid misinterpreting low P-values
(SE = 1)
67
Severity for 𝛍 > 𝛍𝟏 vs Power
In the same way, severity avoids
the “large n” problem
• Fixing the P-value, increasing sample
size n, the 2SE cut-off gets smaller
68
Severity tells us:
• A difference just significant at level α indicates less
of a discrepancy from the null if it results from a larger
(n1) rather than a smaller (n2) sample size (n1 > n2)
• What’s more indicative of a large effect (fire), a fire
alarm that goes off with burnt toast or one that
doesn’t go off unless the house is fully ablaze?
• [The larger sample size is like the one that goes off
with burnt toast]
69
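The sample-size point can be illustrated with numbers. The assumptions are hypothetical: a Normal mean with σ = 1 and μ0 = 0, an observed mean sitting exactly at the 2SE cut-off for each n, and severity for μ > δ computed as Pr(X̄ ≤ x̄_obs; μ = δ). The function name and the choice δ = 0.2 are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def sev_for_just_significant(delta, n, sigma=1.0, z_cut=2.0):
    """SEV(mu > delta) when the observed mean is exactly z_cut standard errors above 0."""
    se = sigma / sqrt(n)
    xbar_obs = z_cut * se  # the 2SE cut-off shrinks as n grows
    return NormalDist(mu=delta, sigma=se).cdf(xbar_obs)

# Same P-value in each case; only the sample size differs.
sev_by_n = {n: sev_for_just_significant(delta=0.2, n=n) for n in (25, 100, 400)}
for n, sev in sev_by_n.items():
    print(f"n = {n}: SEV(mu > 0.2) = {sev:.3f}")
```

A result that is just significant with the smaller sample indicates the discrepancy 0.2 fairly well; the same P-value from the largest sample indicates it very poorly.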
70
What about fallacies of
non-significant results?
• Not evidence of no discrepancy, but not
uninformative
• Minimally: Test was incapable of
distinguishing the effect from noise
• Can also use severity reasoning to rule out
discrepancies
71
SEV(𝛍 < 𝛍𝟏), to set upper bounds
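A sketch of the upper-bound reasoning for a non-significant result. The assumptions are illustrative: a Normal mean with SE = 1, μ0 = 0, a hypothetical observed mean only one standard error above μ0, and severity for the claim μ < μ1 computed as Pr(X̄ > x̄_obs; μ = μ1). The function name is hypothetical.

```python
from statistics import NormalDist

def sev_less(xbar_obs, mu1, se=1.0):
    """SEV(mu < mu1) = Pr(Xbar > xbar_obs; mu = mu1) for a Normal mean."""
    return 1.0 - NormalDist(mu=mu1, sigma=se).cdf(xbar_obs)

xbar_obs = 1.0  # hypothetical non-significant result: 1 SE above mu0 = 0
upper_bounds = {mu1: sev_less(xbar_obs, mu1) for mu1 in (1.0, 2.0, 3.0)}
for mu1, sev in upper_bounds.items():
    print(f"SEV(mu < {mu1}) = {sev:.3f}")
```

The non-significant result does not warrant "no discrepancy", but it does rule out larger discrepancies with increasing severity: had μ been as large as 3, a bigger observed mean would very probably have occurred.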
References
• Barnard, G. (1972). The logic of statistical inference (Review of “The Logic of
Statistical Inference” by Ian Hacking). British Journal for the Philosophy of Science
23(2), 123–32.
• Bayarri, M., Benjamin, D., Berger, J., Sellke, T. (2016). Rejection odds and
rejection ratios: A proposal for statistical practice in testing hypotheses. Journal of
Mathematical Psychology 72, 90-103.
• Berger, J. O. (2006). ‘The Case for Objective Bayesian Analysis’ and ‘Rejoinder’,
Bayesian Analysis 1(3), 385–402; 457–64.
• Berger, J. O. and Wolpert, R. (1988). The Likelihood Principle, 2nd ed. Vol. 6
Lecture Notes-Monograph Series. Hayward, CA: Institute of Mathematical
Statistics.
• Carnap, R. (1962). Logical Foundations of Probability, 2nd ed. Chicago, IL:
University of Chicago Press.
• Cox, D. and Hinkley, D. (1974). Theoretical Statistics. London: Chapman and
Hall.
• Cox, D. R., and Mayo, D. G. (2010). Objectivity and conditionality in frequentist
inference. In D. Mayo & A. Spanos (Eds.), Error and Inference: Recent Exchanges
on Experimental Reasoning, Reliability, and the Objectivity and Rationality of
Science, pp. 276–304. Cambridge: Cambridge University Press
72
• Cox, D. R., and Mayo, D. G. (2011). A Statistical Scientist Meets a Philosopher of
Science: A Conversation between Sir David Cox and Deborah Mayo. Rationality,
Markets and Morals (RMM) 2, 103–14.
• Edwards, W., Lindman, H., and Savage, L. (1963). Bayesian statistical inference for
psychological research. Psychological Review, 70(3), 193-242.
• Fisher, R. A. (1947). The Design of Experiments, 4th ed. Edinburgh: Oliver and Boyd.
• Fisher, R. A., (1955), Statistical Methods and Scientific Induction, J R Stat Soc (B)
17: 69-78.
• Gelman, A. and Shalizi, C. (2013). ‘Philosophy and the Practice of Bayesian
Statistics’ and ‘Rejoinder’. British Journal of Mathematical and Statistical Psychology
66(1), 8–38; 76–80.
• Good, I. J. (1983). Good Thinking: The Foundations of Probability and Its
Applications. Minneapolis, MN: University of Minnesota Press.
• Hacking, I. (1965). Logic of Statistical Inference. Cambridge: Cambridge University
Press.
• Hacking, I. (1972). ‘Review: Likelihood’, British Journal for the Philosophy of Science
23(2), 132–7.
• Hacking, I. (1980). The theory of probable inference: Neyman, Peirce and
Braithwaite. In D. Mellor (Ed.), Science, Belief and Behavior: Essays in Honour of R.
B. Braithwaite, Cambridge: Cambridge University Press, pp. 141–60.
73
• Harkonen v. United States, No. 13– (Supreme Court of the United States, filed
August 5, 2013). Petition for a Writ of Certiorari (Haddad & Philips). (2013[A]).
• Harkonen v. United States, No. 13–180 (Supreme Court of the United States, filed
November, 2013). Brief to United States Supreme Court for the United States in
opposition to petitioner. (Verrilli, Raman, Rao) (2013[B]). Retrieved from
https://www.justice.gov/sites/default/files/osg/briefs/2013/01/01/2013-0180.resp.pdf
• Kempthorne, O. (1976). ‘Statistics and the Philosophers’, in Harper, W. and
Hooker, C. (eds.), pp. 273–314, Foundations of Probability Theory, Statistical
Inference and Statistical Theories of Science, Volume II. Boston, MA: D. Reidel.
• Keynes, J. (1921). A Treatise on Probability. London: MacMillan and Co.
• Kuhn, T. (1962). The Structure of Scientific Revolutions. Chicago: University of
Chicago Press.
• Lakatos, I. (1978). The Methodology of Scientific Research Programmes. Edited
by J. Worrall and G. Currie. Vol. 1 of Philosophical papers. Cambridge:
Cambridge University Press.
• Laudan, L. (1977). Progress and Its Problems. Berkeley: University of California
Press.
• Lindley, D. V. (1971). The estimation of many parameters. In V. Godambe & D.
Sprott, (Eds.), Foundations of Statistical Inference pp. 435–455. Toronto: Holt,
Rinehart and Winston.
74
75
• Little, R. (2025). Seminal Ideas and Controversies in Statistics. Boca Raton, Florida:
CRC Press.
• Mayo, D. G. (1996). Error and the Growth of Experimental Knowledge. Science and Its
Conceptual Foundation. Chicago: University of Chicago Press.
• Mayo, D. G. (2018). Statistical Inference as Severe Testing: How to Get Beyond the
Statistics Wars, Cambridge: Cambridge University Press.
• Mayo, D. G. (2019). P-value Thresholds: Forfeit at Your Peril. European Journal of
Clinical Investigation 49(10): e13170. https://doi.org/10.1111/eci.13170
• Mayo, D. G. (2020). Significance tests: Vitiated or vindicated by the replication crisis in
psychology? Review of Philosophy and Psychology 12, 101-120.
DOI https://doi.org/10.1007/s13164-020-00501-w
• Mayo, D. G. (2020). P-values on trial: Selective reporting of (best practice guides
against) selective reporting. Harvard Data Science Review 2.1.
• Mayo, D. G. (2022). The statistics wars and intellectual conflicts of interest.
Conservation Biology : The Journal of the Society for Conservation Biology, 36(1),
13861. https://doi.org/10.1111/cobi.13861.
• Mayo, D.G. (2023). Sir David Cox’s Statistical Philosophy and its Relevance to Today’s
Statistical Controversies. JSM 2023 Proceedings,
DOI: https://zenodo.org/records/10028243.
• Mayo, D. G. and Cox, D. R. (2006). Frequentist statistics as a theory of inductive
inference. In J. Rojo, (Ed.) The Second Erich L. Lehmann Symposium: Optimality,
2006, pp. 247-275. Lecture Notes-Monograph Series, Volume 49, Institute of
Mathematical Statistics.
76
• Mayo, D.G., Hand, D. (2022). Statistical significance and its critics: practicing
damaging science, or damaging scientific practice?. Synthese 200, 220.
• Mayo, D. G. and Kruse, M. (2001). Principles of inference and their consequences. In
D. Corfield & J. Williamson (Eds.) Foundations of Bayesianism, pp. 381-403.
Dordrecht: Kluwer Academic Publishers.
• Mayo, D. G., and A. Spanos (2006). Severe testing as a basic concept in a Neyman–
Pearson philosophy of induction. British Journal for the Philosophy of Science 57(2)
(June 1), 323–357.
• Mayo, D. G., and A. Spanos (2011). Error statistics. In P. Bandyopadhyay and M.
Forster (Eds.), Philosophy of Statistics, 7, pp. 152–198. Handbook of the Philosophy of
Science. The Netherlands: Elsevier.
• Mill, J. S. (1888). A System of Logic. 8th ed. New York: Harper and Brothers.
• Musgrave, A. (1974). Logical versus Historical Theories of Confirmation. British Journal
for the Philosophy of Science 25, 1-23.
• Neyman, J. and Pearson, E. (1933). On the problem of the most efficient tests of
statistical hypotheses. Philosophical Transactions of the Royal Society of London
Series A 231, 289–337. Reprinted in Joint Statistical Papers of J. Neyman and E. S.
Pearson, pp. 140–85, University of California Press.
• Peirce, C. S. (1931–35). Collected Papers, Volumes 1–6. Hartshorne, C. and Weiss,
P. (eds.), Cambridge, MA: Harvard University Press.
• Popper, K. (1959). The Logic of Scientific Discovery. London, New York: Routledge.
77
• Savage, L. J. (1962). The Foundations of Statistical Inference: A Discussion.
London: Methuen.
• Simmons, J., Nelson, L. and Simonsohn, U. (2011). A false-positive psychology:
Undisclosed flexibility in data collection and analysis allows presenting anything as
significant, Dialogue: Psychological Science, 22(11), 1359-66.
• Suppes, P. (1969). Models of data. In Studies in the Methodology and Foundations
of Science, 24-35. Dordrecht, The Netherlands: D. Reidel.
• Whewell, W. [1847]/(1967). The Philosophy of the Inductive Sciences: Founded
Upon Their History. 2nd ed. Vols. 1 and 2. Reprint, London: Johnson Reprint.
Jimmy Savage on the LP:
“According to Bayes' theorem,…. if y is the
datum of some other experiment, and if it
happens that P(x|µ) and P(y|µ) are
proportional functions of µ (that is,
constant multiples of each other), then
each of the two data x and y have exactly
the same thing to say about the values of
µ…” (Savage 1962, 17)
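A standard textbook illustration of proportional likelihoods (not taken from these slides) is the binomial versus negative binomial pair: 9 heads and 3 tails, obtained either by fixing 12 tosses in advance or by tossing until the 3rd tail. The sketch below assumes that example, with hypothetical function names, and contrasts the likelihoods with the one-sided P-values for H0: θ = 0.5.

```python
from math import comb

def binom_lik(theta, heads=9, tails=3):
    """Likelihood when the number of tosses (heads + tails) was fixed in advance."""
    return comb(heads + tails, heads) * theta**heads * (1 - theta)**tails

def negbin_lik(theta, heads=9, tails=3):
    """Likelihood when tossing continued until the 3rd tail (the last toss is a tail)."""
    return comb(heads + tails - 1, heads) * theta**heads * (1 - theta)**tails

# The two likelihood functions are constant multiples of each other.
ratios = [binom_lik(t) / negbin_lik(t) for t in (0.3, 0.5, 0.7)]

# One-sided P-values under theta = 0.5 depend on the sampling plan.
p_binom = sum(comb(12, k) for k in range(9, 13)) / 2**12  # Pr(at least 9 heads in 12 tosses)
p_negbin = sum(comb(11, k) for k in range(0, 3)) / 2**11  # Pr(at most 2 tails in the first 11 tosses)
print(ratios, round(p_binom, 4), round(p_negbin, 4))
```

On the LP the two data sets say exactly the same thing about θ, yet their sampling distributions, and hence their P-values, differ. This is the sense in which error probabilities depend on something beyond the likelihood function.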
78
Statistical "Reforms": Fixing Science or Threats to Replication and Falsifica...Statistical "Reforms": Fixing Science or Threats to Replication and Falsifica...
Statistical "Reforms": Fixing Science or Threats to Replication and Falsifica...
jemille6
 
P-Value "Reforms": Fixing Science or Threat to Replication and Falsification
P-Value "Reforms": Fixing Science or Threat to Replication and FalsificationP-Value "Reforms": Fixing Science or Threat to Replication and Falsification
P-Value "Reforms": Fixing Science or Threat to Replication and Falsification
jemille6
 
Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)
jemille6
 
D. G. Mayo: Your data-driven claims must still be probed severely
D. G. Mayo: Your data-driven claims must still be probed severelyD. G. Mayo: Your data-driven claims must still be probed severely
D. G. Mayo: Your data-driven claims must still be probed severely
jemille6
 
Severe Testing: The Key to Error Correction
Severe Testing: The Key to Error CorrectionSevere Testing: The Key to Error Correction
Severe Testing: The Key to Error Correction
jemille6
 
Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance
Probing with Severity: Beyond Bayesian Probabilism and Frequentist PerformanceProbing with Severity: Beyond Bayesian Probabilism and Frequentist Performance
Probing with Severity: Beyond Bayesian Probabilism and Frequentist Performance
jemille6
 
D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...
D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...
D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...
jemille6
 
The Statistics Wars and Their Casualties
The Statistics Wars and Their CasualtiesThe Statistics Wars and Their Casualties
The Statistics Wars and Their Casualties
jemille6
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)
jemille6
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)
jemille6
 

More from jemille6 (20)

D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdf
jemille6
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdf
jemille6
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
jemille6
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inference
jemille6
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
jemille6
 
What's the question?
What's the question? What's the question?
What's the question?
jemille6
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metascience
jemille6
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
jemille6
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
jemille6
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
jemille6
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
jemille6
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
jemille6
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
jemille6
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severity
jemille6
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...
jemille6
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
jemille6
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...
jemille6
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...
jemille6
 
The ASA president Task Force Statement on Statistical Significance and Replic...
The ASA president Task Force Statement on Statistical Significance and Replic...The ASA president Task Force Statement on Statistical Significance and Replic...
The ASA president Task Force Statement on Statistical Significance and Replic...
jemille6
 
D. G. Mayo jan 11 slides
D. G. Mayo jan 11 slides D. G. Mayo jan 11 slides
D. G. Mayo jan 11 slides
jemille6
 
D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdf
jemille6
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdf
jemille6
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
jemille6
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inference
jemille6
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
jemille6
 
What's the question?
What's the question? What's the question?
What's the question?
jemille6
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metascience
jemille6
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
jemille6
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
jemille6
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
jemille6
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
jemille6
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
jemille6
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
jemille6
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severity
jemille6
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...
jemille6
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
jemille6
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...
jemille6
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...
jemille6
 
The ASA president Task Force Statement on Statistical Significance and Replic...
The ASA president Task Force Statement on Statistical Significance and Replic...The ASA president Task Force Statement on Statistical Significance and Replic...
The ASA president Task Force Statement on Statistical Significance and Replic...
jemille6
 
D. G. Mayo jan 11 slides
D. G. Mayo jan 11 slides D. G. Mayo jan 11 slides
D. G. Mayo jan 11 slides
jemille6
 
Ad

Recently uploaded (20)

Ancient Stone Sculptures of India: As a Source of Indian History
Ancient Stone Sculptures of India: As a Source of Indian HistoryAncient Stone Sculptures of India: As a Source of Indian History
Ancient Stone Sculptures of India: As a Source of Indian History
Virag Sontakke
 
Computer crime and Legal issues Computer crime and Legal issues
Computer crime and Legal issues Computer crime and Legal issuesComputer crime and Legal issues Computer crime and Legal issues
Computer crime and Legal issues Computer crime and Legal issues
Abhijit Bodhe
 
Lecture 1 Introduction history and institutes of entomology_1.pptx
Lecture 1 Introduction history and institutes of entomology_1.pptxLecture 1 Introduction history and institutes of entomology_1.pptx
Lecture 1 Introduction history and institutes of entomology_1.pptx
Arshad Shaikh
 
Rock Art As a Source of Ancient Indian History
Rock Art As a Source of Ancient Indian HistoryRock Art As a Source of Ancient Indian History
Rock Art As a Source of Ancient Indian History
Virag Sontakke
 
Ranking_Felicidade_2024_com_Educacao_Marketing Educacional_V2.pdf
Ranking_Felicidade_2024_com_Educacao_Marketing Educacional_V2.pdfRanking_Felicidade_2024_com_Educacao_Marketing Educacional_V2.pdf
Ranking_Felicidade_2024_com_Educacao_Marketing Educacional_V2.pdf
Rafael Villas B
 
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and GuestsLDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDM Mia eStudios
 
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptxSCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
Ronisha Das
 
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
Nguyen Thanh Tu Collection
 
Rococo versus Neoclassicism. The artistic styles of the 18th century
Rococo versus Neoclassicism. The artistic styles of the 18th centuryRococo versus Neoclassicism. The artistic styles of the 18th century
Rococo versus Neoclassicism. The artistic styles of the 18th century
Gema
 
How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18
Celine George
 
pulse ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
pulse  ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulsepulse  ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
pulse ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
sushreesangita003
 
Biophysics Chapter 3 Methods of Studying Macromolecules.pdf
Biophysics Chapter 3 Methods of Studying Macromolecules.pdfBiophysics Chapter 3 Methods of Studying Macromolecules.pdf
Biophysics Chapter 3 Methods of Studying Macromolecules.pdf
PKLI-Institute of Nursing and Allied Health Sciences Lahore , Pakistan.
 
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living WorkshopLDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDM Mia eStudios
 
PHYSIOLOGY MCQS By DR. NASIR MUSTAFA (PHYSIOLOGY)
PHYSIOLOGY MCQS By DR. NASIR MUSTAFA (PHYSIOLOGY)PHYSIOLOGY MCQS By DR. NASIR MUSTAFA (PHYSIOLOGY)
PHYSIOLOGY MCQS By DR. NASIR MUSTAFA (PHYSIOLOGY)
Dr. Nasir Mustafa
 
Grade 2 - Mathematics - Printable Worksheet
Grade 2 - Mathematics - Printable WorksheetGrade 2 - Mathematics - Printable Worksheet
Grade 2 - Mathematics - Printable Worksheet
Sritoma Majumder
 
How to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo SlidesHow to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo Slides
Celine George
 
dynastic art of the Pallava dynasty south India
dynastic art of the Pallava dynasty south Indiadynastic art of the Pallava dynasty south India
dynastic art of the Pallava dynasty south India
PrachiSontakke5
 
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFAExercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Dr. Nasir Mustafa
 
Grade 3 - English - Printable Worksheet (PDF Format)
Grade 3 - English - Printable Worksheet  (PDF Format)Grade 3 - English - Printable Worksheet  (PDF Format)
Grade 3 - English - Printable Worksheet (PDF Format)
Sritoma Majumder
 
Ajanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of HistoryAjanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of History
Virag Sontakke
 
Ancient Stone Sculptures of India: As a Source of Indian History
Ancient Stone Sculptures of India: As a Source of Indian HistoryAncient Stone Sculptures of India: As a Source of Indian History
Ancient Stone Sculptures of India: As a Source of Indian History
Virag Sontakke
 
Computer crime and Legal issues Computer crime and Legal issues
Computer crime and Legal issues Computer crime and Legal issuesComputer crime and Legal issues Computer crime and Legal issues
Computer crime and Legal issues Computer crime and Legal issues
Abhijit Bodhe
 
Lecture 1 Introduction history and institutes of entomology_1.pptx
Lecture 1 Introduction history and institutes of entomology_1.pptxLecture 1 Introduction history and institutes of entomology_1.pptx
Lecture 1 Introduction history and institutes of entomology_1.pptx
Arshad Shaikh
 
Rock Art As a Source of Ancient Indian History
Rock Art As a Source of Ancient Indian HistoryRock Art As a Source of Ancient Indian History
Rock Art As a Source of Ancient Indian History
Virag Sontakke
 
Ranking_Felicidade_2024_com_Educacao_Marketing Educacional_V2.pdf
Ranking_Felicidade_2024_com_Educacao_Marketing Educacional_V2.pdfRanking_Felicidade_2024_com_Educacao_Marketing Educacional_V2.pdf
Ranking_Felicidade_2024_com_Educacao_Marketing Educacional_V2.pdf
Rafael Villas B
 
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and GuestsLDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDM Mia eStudios
 
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptxSCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
Ronisha Das
 
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
Nguyen Thanh Tu Collection
 
Rococo versus Neoclassicism. The artistic styles of the 18th century
Rococo versus Neoclassicism. The artistic styles of the 18th centuryRococo versus Neoclassicism. The artistic styles of the 18th century
Rococo versus Neoclassicism. The artistic styles of the 18th century
Gema
 
How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18
Celine George
 
pulse ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
pulse  ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulsepulse  ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
pulse ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
sushreesangita003
 
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living WorkshopLDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDM Mia eStudios
 
PHYSIOLOGY MCQS By DR. NASIR MUSTAFA (PHYSIOLOGY)
PHYSIOLOGY MCQS By DR. NASIR MUSTAFA (PHYSIOLOGY)PHYSIOLOGY MCQS By DR. NASIR MUSTAFA (PHYSIOLOGY)
PHYSIOLOGY MCQS By DR. NASIR MUSTAFA (PHYSIOLOGY)
Dr. Nasir Mustafa
 
Grade 2 - Mathematics - Printable Worksheet
Grade 2 - Mathematics - Printable WorksheetGrade 2 - Mathematics - Printable Worksheet
Grade 2 - Mathematics - Printable Worksheet
Sritoma Majumder
 
How to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo SlidesHow to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo Slides
Celine George
 
dynastic art of the Pallava dynasty south India
dynastic art of the Pallava dynasty south Indiadynastic art of the Pallava dynasty south India
dynastic art of the Pallava dynasty south India
PrachiSontakke5
 
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFAExercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Dr. Nasir Mustafa
 
Grade 3 - English - Printable Worksheet (PDF Format)
Grade 3 - English - Printable Worksheet  (PDF Format)Grade 3 - English - Printable Worksheet  (PDF Format)
Grade 3 - English - Printable Worksheet (PDF Format)
Sritoma Majumder
 
Ajanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of HistoryAjanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of History
Virag Sontakke
 
Ad

What is the Philosophy of Statistics? (and how I was drawn to it)

  • 1. What is the Philosophy of Statistics? (and how I was drawn to it) Deborah G Mayo, Dept of Philosophy, Virginia Tech, April 30, 2025
  • 2. In a conversation with Sir David Cox. COX: Deborah, in some fields foundations do not seem very important, but we both think foundations of statistical inference are important; why do you think? MAYO: …in statistics …we invariably cross into philosophical questions about empirical knowledge and inductive inference. (“A Statistical Scientist Meets a Philosopher of Science” 2011)
  • 3. At one level, statisticians and philosophers of science ask similar questions • What should be observed and what may justifiably be inferred from data? • How well do data confirm or test a model?
  • 4. Two-way street • Statistics is a kind of “applied philosophy of science” (Kempthorne, 1976).
  • 5. “The Statistics Wars”: Statistics has long been the subject of philosophical debate marked by unusual heights of passion and controversy.
  • 6. Philosophy → Statistics • A central job for philosophers of science: minister to conceptual, epistemological, and logical problems of sciences
  • 7. Statistics → Philosophy. Statistical ideas are used in philosophy to: 1) Model Scientific Inference: ways to arrive at evidence and inference 2) Solve (or reconstruct) Philosophical Problems about inference and evidence (e.g., problem of induction, should we prefer novel predictions?)
  • 8. Logics of Confirmation • Failure to justify (enumerative) induction led to building logics of confirmation C(H,e), like deductive logic: Carnap • Evidence e is given, calculating C(H,e) is formal. • Carnap: Scientists could come to the inductive logician for rational degree of confirmation
  • 9. “According to modern logical empiricist orthodoxy...It is quite irrelevant whether e was known first and h proposed to explain it, or whether e resulted from testing predictions drawn from h”. (Alan Musgrave 1974, p. 2). The search for C(H,e) did not succeed, and the program is challenged by post-positivists (Popper, Lakatos, Laudan, Kuhn)
  • 10. Post-Positivists. Evidence e is theory-laden, not given: Collected with difficulty, must start with a problem. Popper: Falsification, not Confirmation: Corroborate claims that survive severe tests
  • 11. Popper never adequately cashed out severity; he tried adding requirements like novel predictive success to C(h, e) (due partly to his unfamiliarity with statistical methods)
  • 12. Neyman and Pearson (N-P): Methodological falsification. “We may search for rules to govern our behavior…in following which we insure that in the long run of experience, we shall not be too often wrong” (Neyman and Pearson 1933, 141-2) • Probability is assigned, not to hypotheses, but to methods
  • 13. Series of models. It is a fundamental contribution of modern mathematical statistics to have recognized the explicit need of a model in analyzing the significance of experimental data. (Patrick Suppes 1969, 33)
  • 14. “I might never have sauntered into that first class on mathematical statistics had the Department of Statistics not been situated so closely to the Department of Philosophy [at U Penn]. ...I suspected that understanding how these statistical methods worked would offer up solutions to the vexing problem of how we learn about the world in the face of error.” (Mayo 1996, Error and the Growth of Experimental Knowledge: EGEK)
  • 15. What really influenced me in the years before EGEK (here at Virginia Tech): In statistics, the famous Bayesian I.J. Good: Why wasn’t I a Bayesian? In philosophy, the leading Popperian Laudan: Can I make progress on their problems about large-scale units (paradigms and research programs)?
  • 16. What I found then holds true: • Insights from grappling with foundational problems of statistics (and data science) provide a gold mine for making progress on philosophical problems • Turn now to controversies of PhilStat
  • 17. Long-standing philosophical controversy on roles of probability. Frequentist (performance): to control and assess the relative frequency of misinterpretations of data, i.e., error probabilities (e.g., P-values, confidence intervals, randomization, resampling). Bayesian (and other probabilisms): to assign degrees of belief or support in claims (e.g., Bayes factors, Bayesian posterior probabilities)
  • 18. Beyond performance and probabilism • Long-standing battles simmer below the surface in today’s “statistical (replication) crisis” in science • Biggest source of handwringing? High-powered methods make it easy to find impressive-looking but spurious effects
  • 19. I set sail with a minimal principle of evidence (severity principle) • We don’t have evidence for a claim C if little, if anything, has been done that would have found C flawed, even if it is
  • 20. Statistical inference as severe testing. Probability arises (in statistical inference) to assess and control how capable methods are at uncovering and avoiding erroneous interpretations of data
  • 21. I invite the reader: “Envision yourself embarking on a special interest cruise” (Excursions and Tours). The severity principle is an excavation tool: to get beyond the “statistics wars” and for appraising reforms
  • 22. Replication crisis leads to “reforms”. Several are welcome: • preregistration of protocol, replication checks, avoid cookbook statistics. Others are radical • and even lead to violating our minimal requirement for evidence
  • 23. Being an outside philosopher helps. To unearth assumptions and combat paradoxical “reforms” requires taking on strong ideological leaders (philosophical observer)
  • 24. What is the replication crisis? • Results that had been found statistically significant are not found significant when an independent group seeks to replicate them. • What are statistical significance tests?
  • 25. Simple significance (Fisherian) tests “…to test the conformity of the particular data under analysis with H0 in some respect….” (Mayo and Cox 2006, p. 81) …the P-value: the probability the test would yield an even larger (or more extreme) value of a test statistic T, assuming H0 (chance variability or noise). It is NOT Pr(data|H0)
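As a minimal sketch of the P-value described in slide 25, assume (hypothetically) a one-sided test of H0: μ = μ0 with Normal data and known σ, so that the test statistic T is the standardized sample mean. The function name and all numbers are illustrative assumptions, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p_value(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Pr(T >= observed t; H0) for the standardized mean T, assuming known sigma."""
    standard_error = sigma / sqrt(n)
    t_observed = (sample_mean - mu0) / standard_error
    # Upper-tail area of the standard Normal beyond the observed statistic.
    return 1.0 - NormalDist().cdf(t_observed)


# Hypothetical data: mean 0.4 from n = 25 observations with sigma = 1, so t is about 2.
p = one_sided_p_value(sample_mean=0.4, mu0=0.0, sigma=1.0, n=25)
print(f"P-value = {p:.4f}")
```

The quantity computed is a tail area for the test statistic under H0, which is why it differs from the probability of the exact data given H0.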
  • 26. Testing reasoning: statistical modus tollens • Small P-values indicate* some underlying discrepancy from H0 because very probably (1 − P) you would have seen a less impressive difference were H0 true. • Still not evidence of a genuine statistical effect H1, let alone a scientific conclusion H*. (*Until an audit is conducted testing assumptions.)
  • 27. Neyman and Pearson tests (1933) put Fisherian tests on firmer ground: They introduce alternative hypotheses: H0, H1, e.g., H0: no benefit vs. H1: some benefit. They control Type I errors (erroneous rejections) and Type II errors (erroneously failing to reject), and ensure that as effect sizes increase, so does the test’s power
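The claim that power grows with effect size can be illustrated with a small sketch. It assumes (hypothetically) a one-sided Normal test with known σ that rejects H0: μ ≤ μ0 when the standardized mean exceeds the critical value for a chosen Type I error rate α. The function name and the numbers are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def power(mu1: float, mu0: float, sigma: float, n: int, alpha: float = 0.025) -> float:
    """Probability the test rejects H0 when the true mean is mu1."""
    standard_error = sigma / sqrt(n)
    # Cut-off chosen so that Pr(reject; mu = mu0) equals alpha (Type I error rate).
    z_critical = STANDARD_NORMAL.inv_cdf(1.0 - alpha)
    shift = (mu1 - mu0) / standard_error
    return 1.0 - STANDARD_NORMAL.cdf(z_critical - shift)


# Hypothetical setting: sigma = 1, n = 25; the Type II error rate is 1 - power.
powers = [power(mu1, mu0=0.0, sigma=1.0, n=25) for mu1 in (0.0, 0.2, 0.4, 0.6)]
for mu1, pw in zip((0.0, 0.2, 0.4, 0.6), powers):
    print(f"true mean {mu1:.1f}: power = {pw:.3f}, Type II error = {1 - pw:.3f}")
```

At μ1 = μ0 the power equals α, and it rises toward 1 as the discrepancy from μ0 grows.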
  • 28. Hypotheses vs Events. Statistical hypotheses assign probabilities to data or events Pr(x0;H0) • Correctly guessing tea or milk first is like a coin tossing trial Pr(success) on each trial = .5 (lady tasting tea) • A drug does not improve lung function in patients • The deflection of light due to the sun is 1.75 degrees (an estimate). It’s rare to assign frequentist probabilities to stat hypotheses (not random variables); inference is qualified by probabilistic properties of the test method
  • 29. Error control is lost by biasing selection effects: fooled by randomness • Sufficient finagling (cherry-picking, significance seeking, multiple testing, post-data subgroups, trying and trying again) may practically guarantee an impressive-looking effect, even if it’s unwarranted by evidence • Violates error control and severity
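A rough sketch of why multiple testing erodes error control: if k tests are each run at level α and all null hypotheses are true, then under the simplifying assumption that the tests are independent, the chance of at least one nominally significant result is 1 − (1 − α)^k. The independence assumption, the function name, and the values of k are illustrative, not from the slides.

```python
def prob_at_least_one_false_positive(k: int, alpha: float = 0.05) -> float:
    """Chance that at least one of k independent true-null tests reaches level alpha."""
    return 1.0 - (1.0 - alpha) ** k


# The nominal 5% error rate only holds for a single, prespecified test.
for k in (1, 5, 11, 20):
    print(f"{k:2d} tests: {prob_at_least_one_false_positive(k):.3f}")
```

With real data the tests are usually correlated, so the exact figure differs, but the qualitative point stands: reporting only the best-looking result hides the actual error probability.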
  • 30. Data Dredging (Torturing): P-hacking. The case of the Drug CEO: Finding no statistically significant benefit on the primary endpoint (lung benefit), nor on 10 other secondaries... ransacks the data. Harkonen vs. United States (2013): guilty of wire fraud. Trump pardons
  • 31. Violates: Severity Requirement. We have evidence for a claim C only to the extent C has been subjected to and passes a test that would probably have found C flawed, just if it is. • This probability is the stringency or severity with which C has passed the test. • Applicable to any error-prone problem
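The slides state the severity requirement informally and give no formula. As a hedged sketch, one formalization used in the severity literature for a one-sided Normal test with known σ takes the severity for the claim μ > μ1, given an observed mean x̄, to be Pr(X̄ ≤ x̄; μ = μ1). The function name and numbers below are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def severity_mu_greater_than(mu1: float, sample_mean: float, sigma: float, n: int) -> float:
    """Pr(sample mean <= observed mean; mu = mu1): how probably a smaller
    result would have occurred if the claim 'mu > mu1' were false."""
    standard_error = sigma / sqrt(n)
    return NormalDist().cdf((sample_mean - mu1) / standard_error)


# Hypothetical data: observed mean 0.4, sigma = 1, n = 25 (standard error 0.2).
severities = {mu1: severity_mu_greater_than(mu1, 0.4, 1.0, 25) for mu1 in (0.0, 0.2, 0.4, 0.6)}
for mu1, sev in severities.items():
    print(f"claim mu > {mu1:.1f}: severity = {sev:.3f}")
```

Under these assumptions the same data pass "μ > 0" with high severity but pass "μ > 0.6" with low severity, so the assessment depends on the specific claim inferred.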
  • 32. Severity gives an evidential twist • What is the epistemic value of good performance relevant for inference in the case at hand? • What bothers you with selective reporting, cherry picking, stopping when the data look good, P-hacking? • We cannot say the test has done its job in the case at hand in avoiding sources of misinterpreting data
  • 33. A claim C is not warranted _______ • Probabilism: unless C is true or probable (gets a probability boost, made comparatively firmer, more believable) • Performance: unless it stems from a method with low long-run error • Probativism (severe testing) unless something (a fair amount) has been done to probe (& rule out) ways we can be wrong about C 33
  • 34. Informal Example: Severe Tests To test if I’ve gained weight between the start and end of the pandemic, I use a series of well-calibrated scales, at the start and after. All show an over 4 lb gain, and none shows a difference in weighing EGEK (an object of known weight). I’m forced to infer: H: I’ve gained at least 4 pounds. We argue about the source of the readings from the high capacity to reveal if any scales were wrong
  • 35. 35 The severe tester is assumed to be in a context of wanting to find things out • I could insist all the scales are wrong—they work fine with weighing known objects—but this would prevent correctly finding out about weight….. (rigged alternative) • What sort of extraordinary circumstance could cause them all to go astray just when we do not know the weight of the test object?
  • 36. 36 Aside: Severity led to reformulating significance tests, after 1999, I was going to focus on problems in philosophy of science and epistemology: e.g., when can proportions in populations give probabilities for the single case? (almost never) (Laudan shifts to work on probability: error in the law)
  • 37. 37 Prof. Aris Spanos Wilson Schmidt Professor of Economics at Virginia Tech
  • 38. In 2003 I invited Sir David Cox (Oxford) to be part of a session on Phil Stat • “Our goal is to identify a key principle of evidence by which hypothetical error probabilities may be used for inductive inference.” (Mayo and Cox 2006) • Fisher tried fiducial probability 38
  • 39. A rival (and more philosophically popular) view of stat evidence underlies ‘probabilism’ • Philosopher Ian Hacking (1965) “Law of Likelihood”: compare Pr(x0;H0)/Pr(x0;H1). x0 supports H0 less well than H1 if Pr(x0;H0) < Pr(x0;H1), that is, if H0 is less likely than H1 in this technical sense. (Note: the likelihood of a hypothesis is not its probability)
  • 40. Problem Any hypothesis that perfectly fits the data is maximally likely (even if data-dredged) • “there always is such a rival hypothesis viz., that things just had to turn out the way they actually did” (Barnard 1972, 129) 40
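A small sketch of the problem, using made-up coin-tossing data (the outcome sequence and variable names are hypothetical). It compares the likelihood of a fair-coin hypothesis, the best-fitting success probability, and Barnard's rigged rival on which the data "just had to turn out" as they did.

```python
def bernoulli_likelihood(outcomes, p):
    """Pr(outcomes; p) for independent trials with success probability p."""
    like = 1.0
    for x in outcomes:
        like *= p if x == 1 else 1 - p
    return like

outcomes = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical data: 7 successes in 10 trials

lik_h0 = bernoulli_likelihood(outcomes, 0.5)                               # fair coin
lik_best_p = bernoulli_likelihood(outcomes, sum(outcomes) / len(outcomes))  # best-fitting p
lik_rigged = 1.0  # the rival "it had to turn out exactly this way" gives the data probability 1

print(lik_h0, lik_best_p, lik_rigged)
print("likelihood ratio, H0 vs rigged rival:", lik_h0 / lik_rigged)
```

On a pure likelihood comparison the rigged rival always wins, however it was arrived at; nothing in the ratio registers that it was constructed after seeing the data.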
  • 41. Error probabilities are “one level above” a fit measure: Pr(H0 is found less well supported than H1; H0) is high for some H1 or other 41
  • 42. “There is No Such Thing as a Logic of Statistical Inference” • Hacking changes his mind (1972, 1980) “I now believe that Neyman, Peirce, and Braithwaite were on the right lines to follow in the analysis of inductive arguments” (Hacking 1980, 141) The analogous debate: Mill (1888), Keynes (1921) vs. Whewell (1847), Peirce (1931-5), Popper (1959)
  • 43. Likelihood Principle (LP) A pervasive view remains: all the evidence from x is contained in the ratio of likelihoods*: Pr(x;H0) / Pr(x;H1) • Follows from inference by Bayes theorem *for inference in a model 43
  • 44. On the LP, error probabilities appeal to something irrelevant “Sampling distributions, significance levels, power, all depend on something more [than the likelihood function]–something that is irrelevant in Bayesian inference–namely the sample space” (Lindley 1971, 436) 44
  • 45. Some alternatives offered to replace significance tests obey the LP • “Bayes factors can be used in the complete absence of a sampling plan…” (Bayarri, Benjamin, Berger, Sellke 2016, 100) • “It seems very strange that a frequentist could not analyze a given set of data…if the stopping rule is not given.” (Berger and Wolpert, The Likelihood Principle 1988, 78) • Stopping rules refer to a second kind of multiplicity
  • 46. Optional Stopping: H0: no effect vs. H1: some effect Instead of fixing the sample size n in advance: Keep sampling until H0 is rejected at (“nominal”) 0.05 level 46
  • 47. In testing the mean of a standard normal distribution 47
  • 48. Optional Stopping 48 • “if an experimenter uses this [optional stopping] procedure, then with probability 1 he will eventually reject any sharp null hypothesis, even though it be true” (Edwards, Lindman, and Savage 1963, 239) • Understandably, they observe, the significance tester deems this cheating, requiring an adjustment of the P-values
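The effect of optional stopping can be checked with a short simulation. This is a sketch under stated assumptions: data are drawn from N(0, 1) so that H0: μ = 0 is true, the test is applied after every observation with the nominal two-sided 0.05 cut-off, and sampling is capped at `max_n` (a real experimenter cannot sample forever). All function names and the seed are illustrative.

```python
import random

def rejects_under_optional_stopping(max_n, rng, z_crit=1.96):
    """Sample from N(0, 1) (so H0: mu = 0 is true) and test after every observation.

    Returns True if the nominal two-sided 0.05 cut-off is crossed at any n <= max_n.
    """
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(0.0, 1.0)
        if abs(total) / n ** 0.5 >= z_crit:  # z = sqrt(n) * sample mean, with sigma = 1
            return True
    return False

def rejection_rate(max_n, trials=1000, seed=0):
    """Fraction of simulated experiments that reject the true H0 at some point."""
    rng = random.Random(seed)
    return sum(rejects_under_optional_stopping(max_n, rng) for _ in range(trials)) / trials

for max_n in (1, 10, 100, 1000):
    print(max_n, rejection_rate(max_n))
```

With a fixed sample size the rejection rate stays near the nominal 0.05; allowing the experimenter to stop at the first "significant" look drives the actual Type I error rate far above it, and it keeps climbing as the cap grows.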
  • 49. Nevertheless, on their view: • “the import of the...data actually observed will be exactly the same as it would had you planned to take n observations.” (ibid) • What counts as cheating depends on statistical philosophy 49
  • 50. Contrast this with reforms from replication research • Simmons, Nelson, and Simonsohn (2011): “Authors must decide the rule for terminating data collection before data collection begins and report this rule in the articles” (ibid. 1362). 50
  • 51. Replication Paradox • Significance test critic: It’s too easy to obtain low P-values • You: Why is it so hard to replicate low P-values with preregistered protocols? • Significance test critic: Initial studies were guilty of P-hacking, cherry-picking, data-dredging (QRPs) • You: So, replication researchers want methods that pick up on these biasing selection effects. • Significance test critic: Actually, reforms recommend methods insensitive to gambits that alter error probabilities 51
  • 52. 52 A question If a method is insensitive to error probabilities, does it escape inferential consequences of gambits that inflate error rates? It depends on a critical understanding of “escape inferential consequences”
  • 53. 53 It would not escape consequences for a severe tester: • What alters error probabilities alters the method’s error probing capability • Alters the severity of what’s inferred
  • 54. Notes/qualifications: • Bayesians and frequentists both use sequential trials; the difference is whether/how to adjust • Post-data selective inference is a major research area of its own in AI/ML • Data-dredging need not be pejorative: it’s a biasing selection effect only when it injures severity • It can even increase severity (e.g., searching for a match with DNA testing in a complete database)
  • 55. Bayesian Probabilists may (indirectly) block intuitively unwarranted inferences (without error probabilities) • Likelihoods + prior probabilities • Give high prior probability to “no effect” (spike prior) 55
  • 56. Problems • It doesn’t show what researchers had done wrong—battle of beliefs • The believability of data-dredged hypotheses is what makes them so seductive • Additional source of flexibility 56
  • 57. No help with the severe tester’s key problem • How to distinguish the warrant for a single hypothesis H with different methods (e.g., one with selection effects, another with pre-registered results)? • Since there’s a single H, its prior would be the same
  • 58. Most Bayesians (last 15 years) use “objective” or “conventional” priors • ‘Eliciting’ subjective priors too difficult: “[V]irtually never would different experts give prior distributions that even overlapped” (J. Berger 2006, 392) • Conventional priors are supposed to prevent prior beliefs from influencing the posteriors–data dominant (ideally, some performance guarantees) 58
  • 59. How should we interpret them? • “The priors are not to be considered expressions of uncertainty, ignorance, or degree of belief. Conventional priors may not even be probabilities…” (Cox and Mayo 2010, 299) • No agreement on rival systems for default/non-subjective priors (maximum entropy, invariance, maximizing the missing information, coverage matching)
  • 60. 60 Severity Reformulates tests to avoid classic fallacies in terms of discrepancies (effect sizes) that are and are not severely-tested SEV(Test T, data x, claim C) • In a nutshell: one tests several discrepancies from a test hypothesis and infers those well or poorly warranted Mayo 1991-; Mayo and Spanos (2006, 2011); Mayo and Cox (2006); Mayo and Hand (2022)
  • 61. SEV(𝛍 > 𝛍𝟏), 𝛍𝟏 = 𝛍𝟎 + 𝛄, to avoid misinterpreting low P-values (SE = 1)
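A sketch of the severity assessment for claims of the form μ > μ1 after a statistically significant result, assuming a one-sided test of H0: μ ≤ μ0 with a Normal sample mean and SE = 1. The formula used, SEV(μ > μ1) = Pr(sample mean ≤ observed mean; μ = μ1), follows the standard presentation in Mayo and Spanos (2006); the observed mean of 2 and the function name are hypothetical choices for illustration.

```python
from statistics import NormalDist

def severity_greater(mu1, xbar_obs, se=1.0):
    """SEV(mu > mu1) = Pr(sample mean <= xbar_obs; mu = mu1) for a one-sided Normal test."""
    return NormalDist().cdf((xbar_obs - mu1) / se)

xbar_obs = 2.0  # hypothetical observed mean, 2 SE above mu0 = 0
for mu1 in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"SEV(mu > {mu1:.1f}) = {severity_greater(mu1, xbar_obs):.3f}")
```

The low P-value warrants μ > 0 with high severity, but claims of larger discrepancies are progressively less well warranted: the claim μ > 2 passes with severity 0.5, and μ > 3 with severity well below that.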
  • 62. Some Bayesians reject probabilism (Gelman and Shalizi 2013): Falsificationist Bayesian “[M]ost of [the] received view of Bayesian inference is wrong. ...[C]rucial parts of Bayesian data analysis, … can be understood as ‘error probes’ in Mayo’s sense.” (10) “[W]hat we are advocating, then, is what Cox and Hinkley (1974) call ‘pure significance testing’.” (10) • Last part of SIST: (Probabilist) Foundations Lost, (Probative/Error Statistical) Foundations Found
  • 63. • A silver lining to distinguishing highly probable and highly probed: we can use different methods for different contexts • Attempts to unify them are behind much of the confusion about stat concepts today in medicine, economics, law, psychology, climate science, social science • Philosophical insight is needed
  • 64. Aside: there’s a famous argument (Birnbaum, 1962) that frequentist principles entail the LP • I claim the argument is unsound, using logic alone • In a book out a few weeks ago (Little 2025), Rod Little doesn’t agree with me
  • 65. Thank you! If there’s time after questions, I’ll say more about the severity reinterpretation 65
  • 66. SEV(𝛍 > 𝛍𝟏), 𝛍𝟏 = 𝛍𝟎 + 𝛄, to avoid misinterpreting low P-values (SE = 1)
  • 67. 67 Severity for 𝛍 > 𝛍𝟏 vs Power
  • 68. In the same way, severity avoids the “large n” problem • Fixing the P-value, increasing sample size n, the 2SE cut-off gets smaller 68
  • 69. Severity tells us: • A difference just significant at level α indicates less of a discrepancy from the null if it results from a larger (n1) rather than a smaller (n2) sample size (n1 > n2) • What’s more indicative of a large effect (fire), a fire alarm that goes off with burnt toast or one that doesn’t go off unless the house is fully ablaze? • [The larger sample size is like the one that goes off with burnt toast]
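The "large n" point can be made numerically. The sketch below assumes a one-sided Normal test with σ = 1 and μ0 = 0, takes the observed mean to sit exactly at the 2SE cut-off for each sample size, and asks how severely the same claim μ > 0.1 passes. The sample sizes, the discrepancy 0.1, and the function name are illustrative assumptions.

```python
from statistics import NormalDist

def severity_at_two_se(n, mu1, mu0=0.0, sigma=1.0):
    """SEV(mu > mu1) when the observed mean sits exactly at the 2SE cut-off for sample size n."""
    se = sigma / n ** 0.5
    xbar_obs = mu0 + 2 * se  # just-significant result; this cut-off shrinks as n grows
    return NormalDist().cdf((xbar_obs - mu1) / se)

for n in (100, 400, 10_000):
    cutoff = 2 / n ** 0.5
    print(f"n = {n:>6}: 2SE cut-off = {cutoff:.3f}, SEV(mu > 0.1) = {severity_at_two_se(n, 0.1):.3f}")
```

The P-value is the same in every row, yet the warrant for a discrepancy of 0.1 drops from about 0.84 at n = 100 to essentially zero at n = 10,000: the sensitive alarm going off tells us less about the size of the fire.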
  • 70. 70 What about fallacies of non-significant results? • Not evidence of no discrepancy, but not uninformative • Minimally: Test was incapable of distinguishing the effect from noise • Can also use severity reasoning to rule out discrepancies
  • 71. 71 SEV(𝛍 < 𝛍𝟏), to set upper bounds
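For non-significant results, severity can be used to set upper bounds. A minimal sketch, assuming the same one-sided Normal test with SE = 1 and using SEV(μ < μ1) = Pr(sample mean > observed mean; μ = μ1); the observed mean of 0.5 and the function name are hypothetical.

```python
from statistics import NormalDist

def severity_less(mu1, xbar_obs, se=1.0):
    """SEV(mu < mu1) = Pr(sample mean > xbar_obs; mu = mu1) for a one-sided Normal test."""
    return 1 - NormalDist().cdf((xbar_obs - mu1) / se)

xbar_obs = 0.5  # hypothetical non-significant observed mean (0.5 SE above mu0 = 0)
for mu1 in (0.5, 1.0, 2.0, 2.5):
    print(f"SEV(mu < {mu1:.1f}) = {severity_less(mu1, xbar_obs):.3f}")
```

The result is poor evidence that μ < 0.5 (severity 0.5) but good evidence that μ < 2.5 (severity about 0.98), so a non-significant result can still rule out discrepancies the test had a high capability of detecting.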
  • 72. References • Barnard, G. (1972). The logic of statistical inference (Review of “The Logic of Statistical Inference” by Ian Hacking). British Journal for the Philosophy of Science 23(2), 123–32. • Bayarri, M., Benjamin, D., Berger, J., Sellke, T. (2016). Rejection odds and rejection ratios: A proposal for statistical practice in testing hypotheses. Journal of Mathematical Psychology 72, 90-103. • Berger, J. O. (2006). ‘The Case for Objective Bayesian Analysis’ and ‘Rejoinder’, Bayesian Analysis 1(3), 385–402; 457–64. • Berger, J. O. and Wolpert, R. (1988). The Likelihood Principle, 2nd ed. Vol. 6 Lecture Notes-Monograph Series. Hayward, CA: Institute of Mathematical Statistics. • Carnap, R. (1962). Logical Foundations of Probability, 2nd ed. Chicago, IL: University of Chicago Press. • Cox, D. and Hinkley, D. (1974). Theoretical Statistics. London: Chapman and Hall. • Cox, D. R., and Mayo, D. G. (2010). Objectivity and conditionality in frequentist inference. In D. Mayo & A. Spanos (Eds.), Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability, and the Objectivity and Rationality of Science, pp. 276–304. Cambridge: Cambridge University Press 72
  • 73. • Cox, D. R., and Mayo, D. G. (2011). A Statistical Scientist Meets a Philosopher of Science: A Conversation between Sir David Cox and Deborah Mayo. Rationality, Markets and Morals (RMM) 2, 103–14. • Edwards, W., Lindman, H., and Savage, L. (1963). Bayesian statistical inference for psychological research. Psychological Review, 70(3), 193-242. • Fisher, R. A. (1947). The Design of Experiments 4th ed., Edinburgh: Oliver and Boyd. • Fisher, R. A., (1955), Statistical Methods and Scientific Induction, J R Stat Soc (B) 17: 69-78. • Gelman, A. and Shalizi, C. (2013). ‘Philosophy and the Practice of Bayesian Statistics’ and ‘Rejoinder’. British Journal of Mathematical and Statistical Psychology 66(1), 8–38; 76–80. • Good, I. J. (1983). Good Thinking: The Foundations of Probability and Its Applications. Minneapolis, MN: University of Minnesota Press. • Hacking, I. (1965). Logic of Statistical Inference. Cambridge: Cambridge University Press. • Hacking, I. (1972). ‘Review: Likelihood’, British Journal for the Philosophy of Science 23(2), 132–7. • Hacking, I. (1980). The theory of probable inference: Neyman, Peirce and Braithwaite. In D. Mellor (Ed.), Science, Belief and Behavior: Essays in Honour of R. B. Braithwaite, Cambridge: Cambridge University Press, pp. 141–60.
  • 74. • Harkonen v. United States, No. 13– (Supreme Court of the United States, filed August 5, 2013). Petition for a Writ of Certiorari (Haddad & Philips). (2013[A]). • Harkonen v. United States, No. 13–180 (Supreme Court of the United States, filed November, 2013). Brief to United States Supreme Court for the United States in opposition to petitioner. (Verrilli, Raman, Rao) (2013[B]). Retrieved from https://www.justice.gov/sites/default/files/osg/briefs/2013/01/01/2013-0180.resp.pdf • Kempthorne, O. (1976). ‘Statistics and the Philosophers’, in Harper, W. and Hooker, C. (eds.), pp. 273–314, Foundations of Probability Theory, Statistical Inference and Statistical Theories of Science, Volume II. Boston, MA: D. Reidel. • Keynes, J. (1921). A Treatise on Probability. London: MacMillan and Co. • Kuhn, T. (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press. • Lakatos, I. (1978). The Methodology of Scientific Research Programmes. Edited by J. Worrall and G. Currie. Vol. 1 of Philosophical papers. Cambridge: Cambridge University Press. • Laudan, L. (1977). Progress and Its Problems. Berkeley: University of California Press. • Lindley, D. V. (1971). The estimation of many parameters. In V. Godambe & D. Sprott, (Eds.), Foundations of Statistical Inference pp. 435–455. Toronto: Holt, Rinehart and Winston.
  • 75. • Little, R. (2025). Seminal Ideas and Controversies in Statistics. Boca Raton, Florida: CRC Press. • Mayo, D. G. (1996). Error and the Growth of Experimental Knowledge. Science and Its Conceptual Foundation. Chicago: University of Chicago Press. • Mayo, D. G. (2018). Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars, Cambridge: Cambridge University Press. • Mayo, D. G. (2019). P-value Thresholds: Forfeit at Your Peril. European Journal of Clinical Investigation 49(10): e13170. https://doi.org/10.1111/eci.13170 • Mayo, D. G. (2020). Significance tests: Vitiated or vindicated by the replication crisis in psychology? Review of Philosophy and Psychology 12, 101-120. https://doi.org/10.1007/s13164-020-00501-w • Mayo, D. G. (2020). P-values on trial: Selective reporting of (best practice guides against) selective reporting. Harvard Data Science Review 2.1. • Mayo, D. G. (2022). The statistics wars and intellectual conflicts of interest. Conservation Biology: The Journal of the Society for Conservation Biology, 36(1), 13861. https://doi.org/10.1111/cobi.13861 • Mayo, D. G. (2023). Sir David Cox’s Statistical Philosophy and its Relevance to Today’s Statistical Controversies. JSM 2023 Proceedings. https://zenodo.org/records/10028243 • Mayo, D. G. and Cox, D. R. (2006). Frequentist statistics as a theory of inductive inference. In J. Rojo, (Ed.) The Second Erich L. Lehmann Symposium: Optimality, 2006, pp. 247-275. Lecture Notes-Monograph Series, Volume 49, Institute of Mathematical Statistics.
  • 76. • Mayo, D. G., Hand, D. (2022). Statistical significance and its critics: practicing damaging science, or damaging scientific practice? Synthese 200, 220. • Mayo, D. G. and Kruse, M. (2001). Principles of inference and their consequences. In D. Corfield & J. Williamson (Eds.) Foundations of Bayesianism, pp. 381-403. Dordrecht: Kluwer Academic Publishers. • Mayo, D. G., and A. Spanos. (2006). Severe testing as a basic concept in a Neyman–Pearson philosophy of induction. British Journal for the Philosophy of Science 57(2) (June 1), 323–357. • Mayo, D. G., and A. Spanos (2011). Error statistics. In P. Bandyopadhyay and M. Forster (Eds.), Philosophy of Statistics, 7, pp. 152–198. Handbook of the Philosophy of Science. The Netherlands: Elsevier. • Mill, J. S. (1888). A System of Logic. 8th ed. New York: Harper and Brothers. • Musgrave, A. (1974). Logical versus Historical Theories of Confirmation. British Journal for the Philosophy of Science 25, 1-23. • Neyman, J. and Pearson, E. (1933). On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London Series A 231, 289–337. Reprinted in Joint Statistical Papers of J. Neyman and E. S. Pearson, pp. 140–85, University of California Press. • Peirce, C. S. (1931–35). Collected Papers, Volumes 1–6. Hartshorne, C. and Weiss, P. (eds.), Cambridge, MA: Harvard University Press. • Popper, K. (1959). The Logic of Scientific Discovery. London, New York: Routledge.
  • 77. • Savage, L. J. (1962). The Foundations of Statistical Inference: A Discussion. London: Methuen. • Simmons, J., Nelson, L., and Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Dialogue: Psychological Science, 22(11), 1359-66. • Suppes, P. (1969). Models of data. In Studies in the Methodology and Foundations of Science, 24-35. Dordrecht, The Netherlands: D. Reidel. • Whewell, W. [1847]/(1967). The Philosophy of the Inductive Sciences: Founded Upon Their History. 2nd ed. Vols. 1 and 2. Reprint, London: Johnson Reprint.
  • 78. Jimmy Savage on the LP: “According to Bayes' theorem,…. if y is the datum of some other experiment, and if it happens that P(x|µ) and P(y|µ) are proportional functions of µ (that is, constant multiples of each other), then each of the two data x and y have exactly the same thing to say about the values of µ…” (Savage 1962, 17) 78
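Savage's condition can be illustrated with the textbook binomial versus negative binomial pair (an example chosen here for illustration, not taken from the slides). Experiment x fixes 12 trials and observes 3 successes; experiment y samples until the 3rd success, which arrives on trial 12. The parameter is written `theta` rather than µ, and the counts are hypothetical.

```python
from math import comb

def binomial_lik(theta, successes=3, trials=12):
    """Pr(x; theta): number of trials fixed in advance, number of successes observed."""
    return comb(trials, successes) * theta**successes * (1 - theta)**(trials - successes)

def negative_binomial_lik(theta, successes=3, trials=12):
    """Pr(y; theta): sample until the required number of successes; total trials observed."""
    # The final trial must be a success; the remaining successes fall among the first trials - 1
    return comb(trials - 1, successes - 1) * theta**successes * (1 - theta)**(trials - successes)

for theta in (0.1, 0.25, 0.5, 0.9):
    ratio = binomial_lik(theta) / negative_binomial_lik(theta)
    print(f"theta = {theta}: likelihood ratio = {ratio:.6f}")
```

The ratio is the same constant for every value of the parameter, so on the LP the two data sets carry identical evidential import, even though the different stopping rules give the two experiments different sampling distributions and hence different error probabilities.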