An Introduction to Crunch's Statistical Tests of Significance (Hypothesis Testing)
The following is an overview of hypothesis testing in Crunch. For a detailed review, including watch-outs, formulae and other considerations, please see this page.
We’ll start by outlining the different types of statistical testing available and then show how to interpret them, via a worked example.
Crunch is Power Made Simple. The goal is to help identify possible differences that may be of interest in forming a story about the data. The goal of Crunch is not to replicate the statistical testing of other platforms (for which the methods can be outdated). We automatically adjust and apply the most appropriate statistical testing to the data at hand. Our team of expert statisticians ensures that the most appropriate test is applied in every case.
The different types of tests
Crunch offers hypothesis testing within the web app, and on export of the data into tabs/tab books (under the multi-table function).
Within the web app, the asterisk icon in the Display Controller tells you that hypothesis testing is on.
When it is, cells are shaded according to the p-value. You’ll notice the scale at the bottom that describes the shading. In essence, the darker the shade, the smaller the p-value (that is, the stronger the evidence of a difference). Green marks a positive difference (higher), red a negative one (lower).
1) Cell comparisons in the web app (shading based on Z-scores)
The default view of the data is called cell comparisons. Cell comparisons are different to pair-wise column comparisons, which many researchers are familiar with from traditional tab books. In traditional tab books, column comparisons are represented by letters (A, B, C, D, etc) to show how one column compares to another (more on this below). Cell comparisons, by contrast, work differently by taking into account the whole table, as we’ll explain below.
2) Column comparisons in the web app (shading based on t-scores)
If you click under a column when the shading is on, it will allow you to Set Comparisons. This compares that column pairwise to the other columns.
3) Column comparisons in tab books (letters that indicate pair-wise tests)
If you’re exporting a multitable (banner table) to a tab book, you have the choice of the shading options (as per the above), but also to use traditional tab book column comparison letters. Tab books offer other customizations as well.
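As a rough illustration of what a pairwise column comparison does, the Python sketch below tests one column's percentage against one other column's. The counts are invented, the function name `pairwise_column_test` is hypothetical, and the pooled two-proportion statistic with a normal approximation is an assumption made for simplicity. Crunch's own shading is based on t-scores, so its exact computation may differ.

```python
import math

def pairwise_column_test(hits_a, base_a, hits_b, base_b):
    """Pooled two-proportion statistic and two-sided p-value (normal approximation).

    This is an illustrative stand-in, not Crunch's documented method.
    """
    p_a = hits_a / base_a
    p_b = hits_b / base_b
    pooled = (hits_a + hits_b) / (base_a + base_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / base_a + 1 / base_b))
    stat = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value

# Hypothetical counts: 430 of 1,000 in column A vs. 330 of 1,000 in column B.
stat, p_value = pairwise_column_test(430, 1000, 330, 1000)
print(f"statistic = {stat:.2f}, p-value = {p_value:.2g}")
```

Unlike a cell comparison, this looks at only the two chosen columns and ignores the rest of the table.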
How to interpret the Z-score shading (and why it’s awesome)
Consider the table below, which has cell comparisons showing. The figures within the cells are column percentages.
In this table most of the cells are shaded, but what does the shading mean? 43% of 18-29 year olds spend over an hour a day using the Internet at work. This cell is shaded as significantly higher (p < .001).
How to interpret?
In the case of column percentages, it is saying that the 43% is significantly higher than the figure for everyone who is not 18-29 years old. It is NOT a comparison between the 43% and the marginal net (38%). It is a comparison between the 43% and the percentage for all the other age groups combined (everyone aged 30 and over). That is shown in the table below, where we’ve collapsed all the 30+ columns into one.
So the 43% is actually being tested against the 37%, and is significantly higher (hence the dark green).
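The collapsing step can be sketched in a few lines of Python. The counts below are hypothetical (they are not the article's data) and were chosen only so that the percentages land near the figures quoted above. The helper `column_pct` is likewise an illustrative name.

```python
# Hypothetical counts per age group (not the article's data).
counts = {
    "18-29": {"over_hour": 430, "not_over_hour": 570},
    "30-44": {"over_hour": 400, "not_over_hour": 600},
    "45-64": {"over_hour": 380, "not_over_hour": 620},
    "65+":   {"over_hour": 330, "not_over_hour": 670},
}

def column_pct(cols, row):
    """Percentage of `row` among respondents in the given columns, pooled."""
    hits = sum(counts[c][row] for c in cols)
    base = sum(sum(counts[c].values()) for c in cols)
    return 100.0 * hits / base

target = "18-29"
others = [c for c in counts if c != target]

pct_target = column_pct([target], "over_hour")    # the shaded cell
pct_rest = column_pct(others, "over_hour")        # everyone else, collapsed
pct_all = column_pct(list(counts), "over_hour")   # marginal (all respondents)

print(f"{target}: {pct_target:.1f}%  rest: {pct_rest:.1f}%  all: {pct_all:.1f}%")
```

The cell is compared with `pct_rest`, not with `pct_all`, because the marginal figure already includes the target group.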
How cell comparisons are shaded
Cell comparisons are based on the Z-score. You can turn the Z-score on for a particular table (using the Display Controller), as per the below.
You can see that the Z-score is 10.80 for the 18-29 year olds, and it is the same but negative for the complementary cell (those who are NOT 18-29 years old, that is, 30+).
The Z-score is known as the standardized residual. Without getting into the computation, Z-scores measure how different a cell is from what we would expect it to be, based on the row and column totals. You can read up more about Z-scores here if you wish.
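For readers who do want to see a computation, here is a minimal Python sketch. It assumes the adjusted standardized residual, a common definition that corrects for the row and column margins. The article does not confirm that this is Crunch's exact formula, and the counts are hypothetical.

```python
import math

def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        z_row = []
        for j, observed in enumerate(row):
            # Count expected if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            variance = (expected
                        * (1 - row_totals[i] / n)
                        * (1 - col_totals[j] / n))
            z_row.append((observed - expected) / math.sqrt(variance))
        result.append(z_row)
    return result

# Hypothetical collapsed table: columns are 18-29 and 30+,
# rows are "over an hour a day" and "not over an hour a day".
table = [
    [430, 1110],
    [570, 1890],
]
z = adjusted_residuals(table)
for row in z:
    print([round(value, 2) for value in row])
```

In a table collapsed to two columns, the two cells in a row get Z-scores of equal size and opposite sign, which matches the complementary-cell behaviour described above.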
Why are Z-scores so cool? Because unlike column-comparison letters, the Z-scores take into account the whole table. It’s not just looking at all the different combinations of column pairs. This makes it easier to spot interesting differences, and see trends, especially when you are glancing through lots of (big) tables.
Many researchers are familiar with column comparison letters. Part of the reason they have persisted is that, up until now, tab books were static and offered no interactivity if you wanted to run a specific pairwise column test. But you can do that at the click of a button with Crunch, which is the topic of the next section.
Does it matter if you have row or column-wise percentages?
No. The other benefit of using the Z-score is that it works both row-wise and column-wise. Look what happens if the rows of the table are collapsed as well into two categories:
You can see a perfect symmetry. The scores and shading indicate that 18-29 year olds make up a disproportionately larger share of those who use the Internet over an hour a day than of those who use it less than an hour a day. The table below shows the row percentages.
So the 27% here is being indicated as significantly higher than the 22%. If we expand the row categories back out, the 27% is interpreted against everything that is not over an hour a day.
So even though 27% is lower than the 33% and 49%, that is not the point. There are a lot of respondents (N=20,518) in the “Never” category. So that 14% is dragging the average of “Not over an hour a day” right down (to 22%, as per the above).
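The row/column symmetry described in this section can be checked numerically. The Python sketch below again assumes the adjusted standardized residual as the Z-score (an assumption, not a confirmed Crunch formula) and uses invented counts. It computes the scores for a table and for its transpose and checks that they match.

```python
import math

def adjusted_residuals(table):
    """Adjusted standardized residuals for a table of counts (assumed formula)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [
        [
            (obs - r * c / n) / math.sqrt(r * c / n * (1 - r / n) * (1 - c / n))
            for obs, c in zip(row, col_totals)
        ]
        for row, r in zip(table, row_totals)
    ]

def transpose(table):
    return [list(col) for col in zip(*table)]

# Hypothetical counts: rows are usage categories, columns are age groups.
table = [
    [430, 480, 570, 264],
    [370, 450, 600, 300],
    [200, 270, 330, 236],
]

z_original = adjusted_residuals(table)
z_swapped = adjusted_residuals(transpose(table))

# Swapping rows and columns only moves each score to its mirrored position.
symmetric = all(
    math.isclose(a, b, abs_tol=1e-9)
    for row_a, row_b in zip(transpose(z_original), z_swapped)
    for a, b in zip(row_a, row_b)
)
print("Z-scores unchanged by swapping rows and columns:", symmetric)
```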
Z-scores, therefore, don’t change if you swap the rows and columns. As stated, they make it easier to spot trends. There’s a clear trend between age and use of the Internet at work in the example above. The reverse is true if you look at a table (this time of column percentages) of internet usage at HOME by age.
In the table above, the shading suggests that a greater proportion of the 65+ age group uses the internet at home over an hour a day (presumably because they are at home more!).
A word of caution though – with very large base sizes (as in the above), it is common for cells to be shaded as significant. Does that mean it’s an important result? Not necessarily. The 83% is not that much higher than the other age groups, and the vast majority of the 18-29 year olds are still using the internet for more than an hour a day. Please see our full article for a discussion on effect size, and how that plays a role in interpreting results alongside significance.
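The base-size caution can be illustrated with a short Python sketch. It takes the same pair of percentages (83% vs. 80%, hypothetical counts) at two base sizes and reports both the Z-score and an effect size. The phi coefficient is used as the effect-size measure here purely as an assumption for illustration. The full article may discuss a different one.

```python
import math

def z_and_phi(a, b, c, d):
    """For the 2x2 table [[a, b], [c, d]], return (Z-score of cell a, phi)."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    phi = (a * d - b * c) / math.sqrt(margins)   # effect size, independent of n
    return phi * math.sqrt(n), phi               # Z grows with sqrt(n)

# Column 1: 83 of 100 (83%). Column 2: 240 of 300 (80%).
z_small, phi_small = z_and_phi(83, 240, 17, 60)

# Same percentages with bases 100 times larger.
z_large, phi_large = z_and_phi(8300, 24000, 1700, 6000)

print(f"small bases: Z = {z_small:.2f}, phi = {phi_small:.3f}")
print(f"large bases: Z = {z_large:.2f}, phi = {phi_large:.3f}")
```

The Z-score moves from well inside the conventional 1.96 threshold to far beyond it, while the effect size stays the same. The shading alone therefore does not say whether a difference is large enough to matter.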