Bayesian Reanalyses of Clinical A/B Trials with JASP: The Heatmap Robustness Check

A recent addition to JASP is the Bayesian A/B test. This test compares two proportions and quantifies the extent to which the data undercut or support the hypothesis that a treatment has an effect. As the test requires only four numbers (i.e., the number of successes and failures in each of two groups), many clinical trials lend themselves to an easy reanalysis. In the past months, we have reanalyzed the following five medical data sets:

  1. Gronau, Q. F., & Wagenmakers, E.-J. (2020). Hydroxychloroquine in patients with COVID-19 (Chen et al., 2020): Absence of evidence, not evidence of absence.
  2. Gronau, Q. F., & Wagenmakers, E.-J. (2019). Progesterone in women with bleeding in early pregnancy: Absence of evidence, not evidence of absence.
  3. Hulme, O. J., Wagenmakers, E.-J., Damkier, P., Madelung, C. F., Siebner, H. R., Helweg-Larsen, J., Gronau, Q. F., Benfield, T., & Madsen, K. H. (2020). Reply to Gautret et al. 2020: A Bayesian reanalysis of the effects of hydroxychloroquine and azithromycin on viral carriage in patients with COVID-19. Manuscript submitted for publication.
  4. Wagenmakers, E.-J., & Gronau, Q. F. (2020). Efficacy of hydroxychloroquine in patients with COVID-19 (Chen et al., 2020): Moderate evidence for a treatment effect on pneumonia.
  5. Wagenmakers, E.-J., & Gronau, Q. F. (2020). Absence of evidence and evidence of absence in the FLASH Trial: A Bayesian reanalysis.

One reason for conducting these reanalyses is that they showcase how easy it is to conduct informative Bayesian analyses and draw inferences that go considerably beyond the classical conclusions “p<.05” and “p>.05”. Over the course of reanalyzing the above trial data, we realized that all of our reports can be made to follow the same template, which we hope will be useful to other researchers as well.

One feature that we added to JASP even more recently is the “robustness check heatmap”. This heatmap shows how the evidence changes as a result of a two-parameter change in the prior distribution for the log odds ratio: a change in the prior mean and a change in the prior standard deviation. Let’s see the robustness heatmap in action for the case of a recent trial on the effect of hydroxychloroquine in patients with COVID-19.

Example Report: Efficacy of Hydroxychloroquine in Patients with COVID-19 (Chen et al., 2020): Moderate Evidence for a Treatment Effect on Pneumonia

A recent randomized clinical trial assessed the efficacy of hydroxychloroquine (HCQ) in the treatment of patients with common coronavirus disease-19 (COVID-19).¹ The results showed that “the body temperature recovery time and the cough remission time were significantly shortened in the HCQ treatment group. Besides, a larger proportion of patients with improved pneumonia in the HCQ treatment group (80.6%, 25 of 31) compared with the control group (54.8%, 17 of 31).” The authors concluded: “Among patients with COVID-19, the use of HCQ could significantly shorten TTCR [time to clinical recovery] and promote the absorption of pneumonia”.

This conclusion leaves unaddressed the degree to which the data undercut or support the hypothesis that HCQ is (in)effective. Here we focus on the data concerning pneumonia and quantify evidence by conducting a Bayesian logistic regression.²,³ Under the no-effect model H0, the log odds ratio equals ψ=0, whereas under the positive-effect model H+, ψ is assigned a positive-only normal prior N+(μ,σ). A default analysis (i.e., μ=0, σ=1) reveals moderate⁴ evidence for H+: the data are 6 times more likely under the hypothesis that HCQ is beneficial than under the hypothesis that it is ineffective. Figure 1 shows that the evidence is weak or moderate for all combinations of μ in [0, 0.30] and σ in [0.1, 1].

Figure 1. Across a range of different priors, the evidence for the positive-effect H+ over the no-effect H0 is either weak or moderate.
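
For readers who want to check numbers of this kind outside JASP, the Python sketch below approximates the Bayes factor for H+ over H0 by brute-force grid integration, and then repeats the computation over a small grid of prior means and standard deviations, which is the idea behind the robustness heatmap. It is not the routine that JASP uses. It assumes the parameterization logit(θ1) = β − ψ/2 and logit(θ2) = β + ψ/2, with group 1 the control group and group 2 the HCQ group, and it assumes a N(0, 1) prior on the nuisance parameter β. The function names (log_lik, bf_plus0, robustness_grid) and the grid settings are illustrative choices, not part of any JASP interface.

```python
import math
import numpy as np

# Pneumonia data quoted in the report: improved / total per group.
Y1, N1 = 17, 31   # control group
Y2, N2 = 25, 31   # HCQ group


def log_lik(beta, psi):
    """Binomial log-likelihood (constants dropped) under the ASSUMED
    parameterization logit(theta1) = beta - psi/2, logit(theta2) = beta + psi/2."""
    eta1 = beta - psi / 2.0
    eta2 = beta + psi / 2.0
    # log(theta) = -log(1 + exp(-eta)); log(1 - theta) = -log(1 + exp(eta))
    return (-Y1 * np.logaddexp(0.0, -eta1) - (N1 - Y1) * np.logaddexp(0.0, eta1)
            - Y2 * np.logaddexp(0.0, -eta2) - (N2 - Y2) * np.logaddexp(0.0, eta2))


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return np.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def trapezoid(y, x):
    """Trapezoidal rule along the last axis of y."""
    return np.sum((y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1) / 2.0


def bf_plus0(mu=0.0, sigma=1.0, sigma_beta=1.0, n_grid=801):
    """Grid approximation to BF+0 with psi ~ N+(mu, sigma) under H+.
    The N(0, sigma_beta) prior on beta is an assumption of this sketch."""
    beta = np.linspace(-8.0 * sigma_beta, 8.0 * sigma_beta, n_grid)
    psi = np.linspace(0.0, max(mu, 0.0) + 8.0 * sigma, n_grid)
    prior_beta = normal_pdf(beta, 0.0, sigma_beta)

    # Marginal likelihood under H0: psi fixed at 0, integrate over beta.
    m0 = trapezoid(np.exp(log_lik(beta, 0.0)) * prior_beta, beta)

    # Marginal likelihood under H+: integrate over beta for each psi, then over psi.
    lik = np.exp(log_lik(beta[None, :], psi[:, None]))   # shape (psi, beta)
    inner = trapezoid(lik * prior_beta[None, :], beta)
    # Renormalize the normal prior so that it integrates to 1 on psi > 0.
    mass_pos = 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))
    prior_psi = normal_pdf(psi, mu, sigma) / mass_pos
    m_plus = trapezoid(inner * prior_psi, psi)
    return m_plus / m0


def robustness_grid(mus, sigmas, n_grid=401):
    """BF+0 for every (sigma, mu) pair; rows index sigma, columns index mu."""
    return np.array([[bf_plus0(mu, s, n_grid=n_grid) for mu in mus]
                     for s in sigmas])


if __name__ == "__main__":
    print("BF+0 for mu=0, sigma=1:", round(float(bf_plus0()), 2))
    mus = np.linspace(0.0, 0.30, 4)
    sigmas = np.linspace(0.1, 1.0, 4)
    print(np.round(robustness_grid(mus, sigmas), 2))
```

The printed matrix is a coarse numerical counterpart of the heatmap. The exact values depend on the grid resolution and on the assumed prior for β, so small discrepancies from the JASP output are to be expected.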

In addition to hypothesis testing, one may also inspect the posterior distribution for ψ under a two-sided model that assigns ψ a standard normal prior distribution. As Figure 2 shows, the posterior distribution is relatively wide, indicating that, under the assumption that a treatment effect is present, there remains considerable uncertainty about its size.

Figure 2. The posterior distribution for the log odds ratio is relatively wide, signifying a lack of certainty regarding the strength of the treatment effect (assuming it is present).
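
The width of such a posterior can also be inspected numerically. The sketch below evaluates the joint posterior of β and ψ on a grid, sums out β, and reports the posterior mean and a central 95% credible interval for ψ. As before, the parameterization logit(θ1) = β − ψ/2, logit(θ2) = β + ψ/2 and the N(0, 1) prior on β are assumptions of the sketch, and the function names (posterior_psi, summarize) are hypothetical.

```python
import numpy as np

# Pneumonia data quoted in the report: improved / total per group.
Y1, N1 = 17, 31   # control group
Y2, N2 = 25, 31   # HCQ group


def log_lik(beta, psi):
    """Binomial log-likelihood (constants dropped); ASSUMED parameterization
    logit(theta1) = beta - psi/2, logit(theta2) = beta + psi/2."""
    eta1 = beta - psi / 2.0
    eta2 = beta + psi / 2.0
    return (-Y1 * np.logaddexp(0.0, -eta1) - (N1 - Y1) * np.logaddexp(0.0, eta1)
            - Y2 * np.logaddexp(0.0, -eta2) - (N2 - Y2) * np.logaddexp(0.0, eta2))


def posterior_psi(sigma_psi=1.0, sigma_beta=1.0, n_grid=801):
    """Marginal posterior density of psi on a uniform grid, with
    psi ~ N(0, sigma_psi) and (assumed) beta ~ N(0, sigma_beta)."""
    beta = np.linspace(-8.0 * sigma_beta, 8.0 * sigma_beta, n_grid)
    psi = np.linspace(-8.0 * sigma_psi, 8.0 * sigma_psi, n_grid)
    log_prior = (-0.5 * (beta[None, :] / sigma_beta) ** 2
                 - 0.5 * (psi[:, None] / sigma_psi) ** 2)
    log_post = log_lik(beta[None, :], psi[:, None]) + log_prior
    post = np.exp(log_post - log_post.max())   # unnormalized, shape (psi, beta)
    marginal = post.sum(axis=1)                # uniform beta grid: sum is proportional to the integral
    step = psi[1] - psi[0]
    return psi, marginal / (marginal.sum() * step)


def summarize(psi, density, level=0.95):
    """Posterior mean and central credible interval from a gridded density."""
    step = psi[1] - psi[0]
    cdf = np.cumsum(density) * step
    tail = (1.0 - level) / 2.0
    lower = psi[np.searchsorted(cdf, tail)]
    upper = psi[np.searchsorted(cdf, 1.0 - tail)]
    mean = np.sum(psi * density) * step
    return mean, lower, upper


if __name__ == "__main__":
    grid, dens = posterior_psi()
    mean, lower, upper = summarize(grid, dens)
    print(f"posterior mean of psi: {mean:.2f}")
    print(f"central 95% credible interval: [{lower:.2f}, {upper:.2f}]")
```

A wide interval relative to the posterior mean is the numerical counterpart of the visual impression that Figure 2 conveys.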

In sum, the data under consideration support the hypothesis that HCQ improves pneumonia in patients with COVID-19. However, the degree of this support is moderate. In line with the authors’ conclusion, more data are needed to draw definite conclusions.


  1. Chen, Z., Hu, J., Zhang, Z., Jiang, S., Han, S., Yan, D., Zhuang, R., Hu, B., & Zhang, Z. (2020). Efficacy of hydroxychloroquine in patients with COVID-19: Results of a randomized clinical trial. medRxiv.
  2. Kass, R. E., & Vaidyanathan, S. K. (1992). Approximate Bayes factors and orthogonal parameters, with application to testing equality of two binomial proportions. Journal of the Royal Statistical Society: Series B (Methodological), 54, 129-144.
  3. Gronau, Q. F., Raj, K. N. A., & Wagenmakers, E.-J. (2019). Informed Bayesian inference for the A/B test. Manuscript submitted for publication and available on arXiv:
  4. Jeffreys, H. (1939). Theory of Probability (1st ed.). Oxford University Press, Oxford, UK.



About the authors

Eric-Jan Wagenmakers

Eric-Jan (EJ) Wagenmakers is professor at the Psychological Methods Group at the University of Amsterdam. EJ guides the development of JASP.

Quentin Gronau

Quentin is a PhD candidate at the Psychological Methods Group of the University of Amsterdam. At JASP, he is responsible for the t-tests and the binomial test.

Akash Raj

Akash is a software developer at JASP. He is responsible for the implementation of UI elements.