As discussed on the Bayesian Spectacles blog, some statisticians have a visceral dislike of the point-null hypothesis, fueled in part by the conviction that the point-null is never true. Back in 1938, Berkson (pp. 526-527) already summarized the argument:
“I believe that an observant statistician who has had any considerable experience with applying the chi-square test repeatedly will agree with my statement that, as a matter of observation, when the numbers in the data are quite large, the P’s tend to come out small. Having observed this, and on reflection, I make the following dogmatic statement, referring for illustration to the normal curve: `If the normal curve is fitted to a body of data representing any real observations whatever of quantities in the physical world, then if the number of observations is extremely large—for instance, on the order of 200,000—the chi-square P will be small beyond any usual limit of significance.’
This dogmatic statement is made on the basis of an extrapolation of the observation referred to and can also be defended as a prediction from a priori considerations. For we may assume that it is practically certain that any series of real observations does not actually follow a normal curve with absolute exactitude in all respects, and no matter how small the discrepancy between the normal curve and the true curve of observations, the chi-square P will be small if the sample has a sufficiently large number of observations in it.
If this be so, then we have something here that is apt to trouble the conscience of a reflective statistician using the chi-square test. For I suppose it would be agreed by statisticians that a large sample is always better than a small sample. If, then, we know in advance the P that will result from an application of a chi-square test to a large sample, there would seem to be no use in doing it on a smaller one. But since the result of the former test is known, it is no test at all!”
I believe this statement is easily misinterpreted. Even if the point-null is not exactly true, it may still provide a decent approximation to the true state of nature. Moreover, the point-null represents the idealized position of a skeptic, and without being able to refute that position, only the most gullible researchers will buy the claims that contradict it. These and other arguments are summarized in a handy flowchart. Later in the 1938 paper, Berkson (p. 531) actually provided a way out for the statistician who finds herself confronted with large sample sizes and significant p-values, but is reluctant to reject the null:
“My view is that there is never any valid reason for rejection of the null hypothesis except on the willingness to embrace an alternative one. No matter how rare an experience is under a null hypothesis, this does not warrant logically, and in practice we do not allow it, to reject the null hypothesis if, for any reasons, no alternative hypothesis is credible.”
At any rate, all Bayes factor hypothesis tests in JASP involve a point-null hypothesis. Wouldn’t it be nice if JASP could also be used to test interval-nulls (Morey & Rouder, 2011)? Well, the good news is that, with a little bit of Bayesian magic, you already can!
[Download this annotated JASP file or view it on the Open Science Framework to follow along with the explanation in JASP.]

For concreteness, consider the association between (1) the height of US presidents versus their closest competitors (expressed as a height ratio); and (2) the proportion of the popular vote. Figure 1 below shows the scatter plot (data from Stulp et al., 2013, for the first 46 elections).
Figure 1. Scatterplot for the height ratio of US presidents (over their closest competitor) and the proportion of the popular vote for the first 46 elections.
Let’s first do a default test. With a few mouse clicks in JASP, we obtain the following plot:
Figure 2. Bayesian analysis of the Pearson correlation between height and popularity for US presidents. The dotted line is the default prior.
The important quantity here is BF10 = 6.332. This means that the data are about 6 times more likely under the alternative hypothesis H1 than under the point-null hypothesis H0. Yes, yes, you may complain about the uniform distribution under H1 (should it really assign mass to negative correlations? Should it really be flat?) and in JASP you can explore alternative specifications easily. But for the present demonstration, the bigger fish we want to fry is the point-null. Instead of the point-null, we would really like to use something that isn’t exactly null, but close to null.
So what you can do in JASP (bear with me here, it is worth it) is to conduct a second analysis, and specify a prior distribution for H1 that is highly peaked around 0 – in fact, this distribution represents our interval-null. We obtain the following result:
Figure 3. Bayesian analysis of the Pearson correlation between height and popularity for US presidents. The dotted line is an “interval-null prior” that is relatively peaked around 0.
In fact, the interval-null prior I selected here is still relatively wide and perhaps not the best representation of a reasonable interval-null hypothesis; I have adopted it for illustrative purposes. The analysis shown in Figure 3 pits the point-null against the interval-null; these two models make highly similar predictions, and consequently the Bayes factor will be near 1. In this case, we have BF10 = 1.354, a smidgen of evidence in favor of the interval-null over the point-null.
So here comes the Bayesian magic. Our first analysis pitted the default prior against the point-null; our second analysis pitted the interval-null against the point-null. Given these two Bayes factors we can construct, by transitivity, the third Bayes factor that pits the default prior against the interval-null. Here is how it works:
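Writing H1 for the default alternative, H0 for the point-null, and HI for the interval-null (the subscript I is a label assumed here for illustration), the transitivity relation can be written out as follows. The numerical value of the last term is the reciprocal of the Bayes factor reported for Figure 3.

```latex
\underbrace{\mathrm{BF}_{1I}}_{H_1 \text{ vs. interval-null}}
  \;=\;
\underbrace{\mathrm{BF}_{10}}_{H_1 \text{ vs. point-null}}
  \;\times\;
\underbrace{\mathrm{BF}_{0I}}_{\text{point-null vs. interval-null}},
\qquad
\mathrm{BF}_{0I} \;=\; \frac{1}{\mathrm{BF}_{I0}} \;=\; \frac{1}{1.354} \;\approx\; 0.739 .
```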
When we multiply the two Bayes factors, the common term (the point-null) drops out, and we are left with a Bayes factor that expresses the support for the alternative hypothesis against an interval-null. In this case, this Bayes factor equals 6.332 * 0.739 = 4.679. This is only a little less than the Bayes factor that involved the point-null.
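The arithmetic can be checked with a few lines of Python. This is only a sketch of the multiplication described above; the function and variable names are hypothetical, and the two input Bayes factors are the values reported for Figures 2 and 3.

```python
def chain_bayes_factors(bf_10, bf_i0):
    """Return (correction, bf_1i) for H1 vs. the interval-null.

    bf_10: Bayes factor for H1 over the point-null H0.
    bf_i0: Bayes factor for the interval-null over the point-null H0.
    """
    correction = 1.0 / bf_i0      # point-null vs. interval-null
    bf_1i = bf_10 * correction    # the point-null drops out of the product
    return correction, bf_1i


# Values reported in the text for Figure 2 and Figure 3.
correction, bf_1i = chain_bayes_factors(bf_10=6.332, bf_i0=1.354)
print(f"interval correction factor: {correction:.3f}")
print(f"BF for H1 vs. interval-null: {bf_1i:.2f}")
```

Without intermediate rounding the product is about 4.68; the 4.679 quoted above uses the correction factor rounded to 0.739.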
In general, the above equation reveals that the Bayes factor involving an interval-null instead of a point-null can be obtained by multiplying the point-null result with an “interval correction factor” (the right-most element of the equation above) that compares the predictive adequacy of the point-null against the interval-null.
As you can imagine, for prior distributions that are relatively tightly centered around zero, the interval correction factor will be near 1, which means that the Bayes factor is virtually unchanged, and it does not matter whether you use a point-null or an interval-null (for a specific analysis see Berger & Delampady, 1987).
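The shrinking correction factor can be illustrated numerically. The sketch below rests on several assumptions that are not part of the analysis above: it uses an approximate likelihood for the sample correlation attributed to Jeffreys rather than whatever JASP computes internally, it models the interval-null as a symmetric beta(alpha, alpha) prior stretched to (-1, 1), and the inputs r = 0.3 and n = 50 are hypothetical. The numbers therefore will not reproduce the figures; only the trend matters.

```python
import math


def stretched_beta_pdf(rho, alpha):
    """Density of a symmetric beta(alpha, alpha) prior stretched to (-1, 1)."""
    log_beta = 2.0 * math.lgamma(alpha) - math.lgamma(2.0 * alpha)
    log_norm = (2.0 * alpha - 1.0) * math.log(2.0) + log_beta
    return math.exp((alpha - 1.0) * math.log1p(-rho * rho) - log_norm)


def approx_bf_vs_point_null(r, n, alpha, grid=20001):
    """Approximate Bayes factor for a prior on rho (shape alpha) vs. rho = 0.

    Assumption: the likelihood ratio relative to rho = 0 is taken to be
    (1 - rho^2)^((n - 1) / 2) / (1 - rho * r)^(n - 3 / 2).
    """
    width = 2.0 / grid
    total = 0.0
    for i in range(grid):
        rho = -1.0 + (i + 0.5) * width  # midpoint rule, avoids rho = +/-1
        log_ratio = (0.5 * (n - 1) * math.log1p(-rho * rho)
                     - (n - 1.5) * math.log1p(-rho * r))
        total += math.exp(log_ratio) * stretched_beta_pdf(rho, alpha)
    return total * width


# Hypothetical data summary; larger alpha means a prior more peaked at 0.
r, n = 0.3, 50
for alpha in (5.0, 50.0, 500.0, 5000.0):
    bf_i0 = approx_bf_vs_point_null(r, n, alpha)
    print(f"alpha = {alpha:7.1f}   correction factor = {1.0 / bf_i0:.3f}")
```

As alpha grows, the prior concentrates around zero and the printed correction factor moves toward 1, which is the behavior described above.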
This analysis highlights the elegance of the Bayesian paradigm and illustrates how the use of an interval-null requires a specific correction to the Bayes factor. Nevertheless, it would be even better if the interval-null were available in JASP directly instead of indirectly. However, not many researchers seem to use the interval-null. Perhaps they believe that the interval correction factor will be negligible, or perhaps they are reluctant to specify the interval width? Once more researchers start to express an interest in the interval-null, we should definitely implement this analysis in JASP directly.
PS. The BayesFactor package (Morey & Rouder, 2015) in R allows users to test interval-nulls directly, at least for t-tests and proportions.
References
Berger, J. O., & Delampady, M. (1987). Testing precise hypotheses. Statistical Science, 2, 317-352.
Berkson, J. (1938). Some difficulties of interpretation encountered in the application of the chi-square test. Journal of the American Statistical Association, 33, 526-536.
Morey, R. D., & Rouder, J. N. (2011). Bayes factor approaches for testing interval null hypotheses. Psychological Methods, 16, 406-419.
Morey, R. D., & Rouder, J. N. (2015). BayesFactor 0.9.11-1. Comprehensive R Archive Network, http://cran.r-project.org/web/packages/BayesFactor/index.html.
Stulp, G., Buunk, A. P., Verhulst, S., & Pollet, T. V. (2013). Tall claims? Sense and nonsense about the importance of height of US presidents. The Leadership Quarterly, 24, 159-171.