How to Conduct a Multinomial Test and Chi-Square Test in JASP

The analysis of categorical data has a long history and can be traced back to some of the most influential statisticians, among them Karl Pearson and Sir Ronald Fisher. In 1900, Pearson introduced the $\chi^2$ statistic and thus the initial versions of what are now known as the multinomial test and the $\chi^2$ goodness-of-fit test. Today, both tests are standard procedures in the empirical sciences. The multinomial test checks whether the observed cell counts are uniformly distributed, and the $\chi^2$ goodness-of-fit test determines whether the observed cell counts deviate from a particular expected distribution. Both tests are now implemented in JASP 0.8.6.

You can replicate all analyses in this blogpost by downloading either our example dataset as a .csv file or the full analysis as an annotated JASP file from our OSF folder.

Investigating the Memory of Life Stresses

As an example, consider the study conducted by Uhlenhuth, Lipman, Balter, and Stern (1974). To investigate the connection between life stresses and illnesses, Uhlenhuth et al. (1974) asked 735 participants to indicate which life stresses, negative life events, and illnesses they had experienced in the past 18 months. A subset of this data set was reanalyzed by Haberman (1978, p. 3). More precisely, Haberman investigated those 147 participants who had reported only one negative life event over this time span. The objective of Haberman’s reanalysis was to show that the fallibility of human memory makes such retrospective surveys unreliable: participants are more likely to forget negative life events and illnesses the further in the past they occurred.

This is the data of Haberman’s subsample:

Months prior to the interview    Frequency of reported life stresses
1                                15
2                                11
3                                14
4                                17
5                                4
6                                10
7                                9
8                                2
9                                10
10                               15
11                               6
12                               5
13                               2
14                               15
15                               4
16                               2
17                               7
18                               12


Haberman was interested in whether he could reject the hypothesis that the frequency of reported life stresses was equally distributed over the 18-month period. To this end, he conducted a multinomial test. If the results indicate that the reports are not equally distributed, this would call into question the reliability of retrospective surveys and hence the results of Uhlenhuth et al. (1974).

Conducting the Multinomial Test

To conduct the multinomial test in JASP, we go to the ‘Common’ analysis menu in the ribbon, select ‘Frequencies’, and then ‘Multinomial Test’. The field ‘Factor’ refers to the categorical variable that we are interested in. In our example, this is the variable ‘Months prior to the interview’, which is abbreviated as ‘Month’ in our dataset. In the field ‘Counts’ we indicate which column in our data set contains the count data. In our case, we drag the variable ‘Stress.frequency’ into this field.
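For readers who want to inspect the same layout outside JASP, here is a minimal Python sketch of the roles the two fields play: one column names the category, the other holds its count. The function name `read_counts`, the comma delimiter, and the three demo rows are assumptions made for illustration; only the column names ‘Month’ and ‘Stress.frequency’ come from our dataset.

```python
import csv
import io


def read_counts(csv_file, factor_col="Month", count_col="Stress.frequency"):
    """Return {category: count} from a CSV that has one row per category.

    `csv_file` is any open text file or file-like object.
    """
    reader = csv.DictReader(csv_file)
    return {row[factor_col]: int(row[count_col]) for row in reader}


# Made-up three-row file, used only to show the expected layout.
demo_file = io.StringIO("Month,Stress.frequency\n1,12\n2,7\n3,9\n")
counts = read_counts(demo_file)
print(counts)
```

With a real file, the `io.StringIO` object would be replaced by `open(path, newline="")`, where `path` points to the downloaded .csv file.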

The multinomial test is now automatically conducted in JASP. The null hypothesis is tested by means of Pearson’s chi-squared test statistic, which measures the deviation of the observed cell counts from the cell counts expected under the null hypothesis:

    \[\chi^2 = \sum_{i=1}^{k} \frac{(\text{observed count}_i - \text{expected count}_i)^2}{\text{expected count}_i}\]

where $k$ denotes the number of categories. In our example, the variable ‘Month’ has 18 categories, one for each month, so the expected frequency of reported life stresses per month is $147/18 \approx 8.17$. With $\chi^2(17) = 45.367$, $p < .001$, Haberman (1978) rejected the null hypothesis of equal frequencies across categories (see footnote 1). In JASP, these results are displayed in the Results table:

Note that the multinomial test works not only for count data but also for regular factors, that is, columns with one row per observation. In this case, the counts are derived automatically from the factor and do not need to be specified in the ‘Counts’ field.
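To make the computation concrete, the following Python sketch tallies a raw factor into counts and then evaluates Pearson's statistic against equal expected counts. It is an illustration under stated assumptions, not JASP's own code: the function names are hypothetical, and the three-level factor is made up rather than taken from Haberman's data. The only numbers borrowed from the example are 147 reports and 18 months.

```python
from collections import Counter


def counts_from_factor(values):
    """Tally a raw factor (one entry per observation) into counts, ordered by level."""
    tally = Counter(values)
    return [tally[level] for level in sorted(tally)]


def uniform_expected(total, k):
    """Expected count per category when all k categories are equally likely."""
    return [total / k] * k


def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Expected count per month in the example: 147 reports over 18 months.
expected_per_month = 147 / 18  # about 8.17

# Hypothetical raw factor with three levels (not Haberman's data).
raw_factor = ["a"] * 10 + ["b"] * 20 + ["c"] * 30
observed = counts_from_factor(raw_factor)                   # [10, 20, 30]
expected = uniform_expected(sum(observed), len(observed))   # [20.0, 20.0, 20.0]
statistic = pearson_chi_square(observed, expected)          # 10.0
degrees_of_freedom = len(observed) - 1                      # 2
print(statistic, degrees_of_freedom)
```

The statistic is then compared against a $\chi^2$ distribution with $k - 1$ degrees of freedom. One caveat of tallying a raw factor this way is that a level with zero observations never shows up in the data, so it would be missing from the counts.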

Conducting the Chi-Square Goodness-of-Fit Test

The multinomial test is a special case of the $\chi^2$ goodness-of-fit test. Just like the multinomial test, the $\chi^2$ goodness-of-fit test investigates whether the observed distribution of cell counts corresponds to an expected distribution. However, in the $\chi^2$ goodness-of-fit test the expected distribution is not restricted to a uniform distribution; it can take any shape. Using a $\chi^2$ goodness-of-fit test we could, for instance, test whether the data from a replication experiment follow the same distribution as the original data. In JASP we can enter our expectations manually. To do this, we indicate under ‘Hypothesis’ that our null hypothesis is no longer the one under the ‘Multinomial test’ but the one under the ‘Chi-square test’.

The spreadsheet that opens up corresponds to the null hypotheses we are testing. By default, the first hypothesis $H_0$ (a) is the null hypothesis of the multinomial test, under which we expect an equal number of observations per category:

Now we can either change these values by clicking in the cells or click the ‘Add column’ button to create a new hypothesis. We describe our hypothesis by entering the expected number of observations per category, the expected proportions, or in fact any set of numbers that we find useful; JASP automatically normalizes these numbers for us. We can add and test up to five hypotheses at once. The corresponding $\chi^2$ goodness-of-fit test is conducted automatically and is displayed in the Results table:

There is also a second way to conduct the $\chi^2$ goodness-of-fit test. If a column in our data set already contains a variable that reflects our expectations, we can drag this variable directly into the field ‘Expected Counts’. The values in this variable will then be interpreted as our null hypothesis. Here too, JASP automatically normalizes our expected values. Any other hypotheses that we might have specified previously will be grayed out:
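The normalization step can be pictured with a short Python sketch. It assumes that normalizing means rescaling the entered numbers so that the expected counts add up to the observed total; the exact procedure inside JASP is not described here. The function names and the three-category counts are hypothetical.

```python
def expected_counts(weights, total):
    """Rescale arbitrary positive weights so the expected counts sum to `total`."""
    if any(w <= 0 for w in weights):
        raise ValueError("every weight must be positive")
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts for three categories.
observed = [25, 25, 10]
total = sum(observed)

# The same hypothesis entered as ratios or as counts on another scale.
from_ratios = expected_counts([3, 2, 1], total)      # [30.0, 20.0, 10.0]
from_counts = expected_counts([60, 40, 20], total)   # [30.0, 20.0, 10.0]

statistic = pearson_chi_square(observed, from_ratios)
print(from_ratios, from_counts, statistic)
```

Because only the relative sizes of the entered numbers matter, both ways of writing the hypothesis lead to the same expected counts and therefore to the same test statistic.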

Displaying Descriptives

The ‘Display’ option allows us to specify whether the descriptives are shown as counts or as proportions. For this analysis, we chose ‘Counts’. The observed and expected counts, as well as the confidence intervals for the observed values, can then be displayed in the Descriptives table:

Furthermore, the Descriptives plot illustrates the observed data:

For both the Descriptives plot and the Descriptives table, the confidence intervals are based on independent binomial distributions for each category. For more information on how to interpret the multinomial test or the $\chi^2$ goodness-of-fit test, see here or here.
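As a rough illustration of a per-category binomial interval, the Python sketch below treats each category as "this category versus all others" and computes an approximate 95% interval for its count. The interval method JASP uses is not stated in this post; the Wilson score interval is used here purely as a stand-in, and the counts are hypothetical.

```python
import math


def wilson_interval(count, total, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = count / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


def count_intervals(counts, z=1.96):
    """Interval for each category's count, treating the categories independently."""
    total = sum(counts)
    intervals = []
    for count in counts:
        lower, upper = wilson_interval(count, total, z)
        intervals.append((total * lower, total * upper))
    return intervals


# Hypothetical counts for three categories.
counts = [10, 20, 30]
for count, (lower, upper) in zip(counts, count_intervals(counts)):
    print(f"observed {count}: {lower:.1f} to {upper:.1f}")
```

Since each interval is computed separately, the intervals describe the categories one at a time and are not adjusted for looking at several categories at once.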



Footnotes

1 This GIF shows how to change the general settings in JASP to display exact p-values.

References

Haberman, S. J. (1978). Analysis of qualitative data: Introductory topics (Vol. 1). Academic Press.

Uhlenhuth, E. H., Lipman, R. S., Balter, M. B., & Stern, M. (1974). Symptom intensity and life stress in the city. Archives of General Psychiatry, 31, 759-764.

About the authors

Alexandra Sarafoglou

Alexandra Sarafoglou is a PhD candidate at the Department of Psychological Methods at the University of Amsterdam. At JASP, she contributes to the multinomial analysis and the video tutorials. Alexandra is also part of the workshop organization team.

Erik-Jan van Kesteren

Erik-Jan van Kesteren is a PhD candidate at Utrecht University. At JASP, he is responsible for adding plots, functions, and UI elements, and for interfacing R and C++.