Overview
Numeric arrays are a variable type that groups several numeric columns of the same data type into a single variable. For example, you might have asked respondents how many times they have engaged in each of a list of activities, how many times they have purchased each of a list of products, or for a numeric rating of how much they agree with each of a list of statements. Like the other array types in Crunch (multiple response variables and categorical arrays), a numeric array is stored as a parent variable with one or more subvariables.
Numeric array variable cards
Here’s an example of a numeric array Variable Card in the Crunch.io variable summaries view. The dot represents the mean number of colas consumed, and the whiskers represent +/-1 standard deviation on each side of the mean:
Hovering over the mean point with your mouse reveals the mean, the standard deviation, and the sample size for that variable:
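As a rough illustration of the numbers behind the card (mean, standard deviation, sample size, and the whisker endpoints), here is a minimal Python sketch. The responses are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the article does not say which formula the card uses.

```python
import statistics

# Hypothetical responses for one subvariable: colas of one brand per respondent.
responses = [0, 1, 2, 2, 3, 4]

n = len(responses)
mean = statistics.mean(responses)
# Sample standard deviation (n - 1 denominator); an assumption, not
# confirmed as the formula Crunch uses.
sd = statistics.stdev(responses)

# Whisker endpoints: one standard deviation on each side of the mean.
lower, upper = mean - sd, mean + sd

print(f"N={n}, mean={mean:.2f}, sd={sd:.2f}, whiskers=({lower:.2f}, {upper:.2f})")
```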
In this example, each of the cola brands is a subvariable, and each subvariable (brand) has its own alias. As with other arrays, you can expand the array in the Variable Sidebar (the panel on the left where you can see all your variables) to see all the subvariables associated with the parent variable. Subvariables can only be viewed on their own in the Tables and Graphs mode and Multitables mode:
Numeric arrays in Tables
The Display Controller (the set of controls at the bottom of the screen) is important here, as it is everywhere, because it lets you change the statistics shown and specify the number of decimal places.
Each subvariable has its own base (N). The base size can vary for many reasons. Most commonly, subvariables have different amounts of missing data. In this example, there is no missing data, so each N is 800 observations:
By default, tables of numeric arrays display the mean for each subvariable. You can change the displayed statistic in the Display Controller. Other options include Sum and Percentage-Share (defined below):
The Sum statistic adds up all the valid cases for each subvariable. In the example below, the sum is the total number of each cola consumed by brand (amongst the valid sample of 800). In our sample, Coke Zero is consumed the most, consistent with the highest mean in the table above displaying the means:
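The relationship between the base (N), the mean, and the sum can be sketched in a few lines of Python. The aliases and values below are hypothetical, and treating `None` as a missing answer is an assumption made for illustration. It shows why each subvariable can end up with a different base.

```python
# Hypothetical array: each key is a subvariable alias, each list holds one
# value per respondent; None marks a missing answer.
array = {
    "brand_a": [2, 0, None, 3],
    "brand_b": [1, 1, 4, None],
    "brand_c": [0, 5, 2, 1],
}

def summarize(values):
    """Return base (valid N), sum, and mean over non-missing values."""
    valid = [v for v in values if v is not None]
    base = len(valid)
    total = sum(valid)
    mean = total / base if base else None
    return {"n": base, "sum": total, "mean": mean}

summary = {alias: summarize(values) for alias, values in array.items()}
for alias, stats in summary.items():
    print(alias, stats)
```

Here brand_a and brand_b each have a base of 3 because one respondent is missing, while brand_c has a base of 4.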
The Percentage-Share is an aggregate-level calculation across the entire numeric array (considering all subvariables). So if 614 Diet Cokes are consumed, how much is that as a Percentage-Share of all the colas consumed?
In this case, the share is 614 / (1454 + 614 + 1582 + 332 + 217 + 739) = 614 / 4938 = 12.4%, as shown in the following:
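The same arithmetic can be checked with a short Python sketch. The six sums come from the formula above. Only the 614 is identified in the text (Diet Coke), so the other values are left unlabeled here.

```python
# Sums per subvariable, in the order they appear in the formula above.
# Only the 614 is identified in the text (Diet Coke).
sums = [1454, 614, 1582, 332, 217, 739]
diet_coke_sum = 614

total = sum(sums)                    # all colas consumed, across brands
share = 100 * diet_coke_sum / total  # percentage share for Diet Coke

print(f"{diet_coke_sum} / {total} = {share:.1f}%")

# Shares for all subvariables add up to 100% (up to floating-point error).
shares = [100 * s / total for s in sums]
```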
If you include a column variable, such as Gender in the example below, you can still select Means, Sums, or Percentage-Shares to display on your table. This table shows the column percentage-share statistics, reflecting, for example, that 32.7% of the colas consumed by males were Coke Zero:
Note that the Percentage-Share, like Percents for categorical variables, can be computed column-wise, row-wise, or table-wise, as per the options in the Display Controller.
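The three directions differ only in the denominator. The sketch below uses hypothetical sums for two brands split by a two-category column variable. It assumes the denominators are the column total, the row total, and the grand total respectively, which matches the column-wise reading given above (the share of colas consumed by males).

```python
# Hypothetical sums: rows are subvariables (brands), columns are categories
# of the column variable.
sums = {
    "brand_a": {"Male": 30, "Female": 10},
    "brand_b": {"Male": 20, "Female": 40},
}
columns = ["Male", "Female"]

col_totals = {c: sum(row[c] for row in sums.values()) for c in columns}
row_totals = {r: sum(row.values()) for r, row in sums.items()}
grand_total = sum(row_totals.values())

def pct(value, denominator):
    """Express value as a percentage of denominator."""
    return 100 * value / denominator

# Column-wise: each cell as a share of its column total.
column_share = {r: {c: pct(v, col_totals[c]) for c, v in row.items()}
                for r, row in sums.items()}
# Row-wise: each cell as a share of its row total.
row_share = {r: {c: pct(v, row_totals[r]) for c, v in row.items()}
             for r, row in sums.items()}
# Table-wise: each cell as a share of the grand total.
table_share = {r: {c: pct(v, grand_total) for c, v in row.items()}
               for r, row in sums.items()}

print(column_share["brand_a"]["Male"],
      row_share["brand_a"]["Male"],
      table_share["brand_a"]["Male"])
```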
You can customize the tooltips using the three dots (…) button at the end of the Display Controller > Display options. This allows you to reveal more information about the statistic in the table by hovering your mouse over statistics in the table (e.g., to see the standard deviation):
The next section covers how to display multiple statistics (i.e., mean, sum, and/or percentage share) at once via an Excel export.
Exporting numeric arrays (to Excel)
You can export numeric arrays to Excel either from a deck or from a Multitables tab book export:
- For the deck, click Edit for the slide, make sure the numeric array is a table (not a graph), and then go to Export Settings > Custom > Numeric measures
- From a multitable, go to the top right corner menu for Export > Export tab book > Settings > Numeric measures
You can select the statistics you would like in the export.
Graphing numeric arrays
Numeric arrays can be graphed as bar plots or shown over time in a time plot. These plots can be both exported into PowerPoint and used in dashboards via a deck:
The statistics available for plotting are the same as for table cells: means, sums, and percentage-shares.
Hypothesis testing and uncertainty bands
At present, only 'Set Comparisons' (t-tests between columns) are available for numeric arrays in the web app. Set Comparisons can be switched on via the * button in the Display Controller, and the column is then activated for comparisons using the Set Comparison button (turning the column grey). In the example below, the 0.9 mean for Diet Coke for Females is significantly higher than the 0.6 mean for Males (p < 0.01, which can be determined from the shading scale).
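To make the idea of a t-test between two columns concrete, here is a minimal Python sketch that compares the means of one subvariable across two groups. The data are hypothetical, and the choice of Welch's unequal-variance t-test is an assumption. The article does not specify which variant Crunch uses, so this should not be read as Crunch's implementation.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Squared standard error of each group's mean.
    se_a, se_b = var_a / len(a), var_b / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df

# Hypothetical counts for one subvariable, split by a two-category column.
group_1 = [0, 1, 1, 2, 3, 0, 1, 2]
group_2 = [0, 0, 1, 0, 1, 0, 0, 1]

t, df = welch_t(group_1, group_2)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A larger absolute t (for a given df) corresponds to a smaller p-value. Converting t to a p-value requires the t distribution, which is outside the Python standard library and is omitted here.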
Only point estimates (not error bands) are plotted on graphs at this time.
For more information on statistical testing in Crunch, please see Crunch's introductory article.
How to form numeric arrays
Crunch will interpret available metadata that defines the numeric array variable. For direct integrations (e.g., Qualtrics, Decipher, SurveyMonkey, and Confirmit), the appropriate metadata are always available. Most data transfer formats, however, do not contain metadata specifically defining array variables.
If variables enter Crunch as discrete numeric variables, you can group related columns into an array. At present, this data manipulation requires the use of Crunch Automation or R.
- GUI — not available
- Crunch Automation — use the CREATE NUMERIC ARRAY command
- R — use the deriveArray() function, passing the argument numeric = TRUE
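Conceptually, grouping columns into an array means bundling several existing numeric columns under one parent while each keeps its own alias. The Python sketch below illustrates only that idea. It is not Crunch Automation or R syntax and does not call any Crunch API. The function `group_into_array`, the column aliases, and the dictionary layout are all hypothetical.

```python
# Hypothetical dataset: separate numeric columns, one value per respondent.
dataset = {
    "cola_a": [2, 0, 3],
    "cola_b": [1, 1, 4],
    "age": [34, 51, 27],
}

def group_into_array(data, aliases, name):
    """Bundle the named columns under one parent, keeping each alias."""
    missing = [a for a in aliases if a not in data]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return {"name": name, "subvariables": {a: data[a] for a in aliases}}

# Only the related columns are grouped; "age" stays a standalone variable.
colas = group_into_array(dataset, ["cola_a", "cola_b"], name="Colas consumed")
print(list(colas["subvariables"]))
```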
Future work for numeric arrays
The following are not currently available for numeric arrays but may be in the future:
- Subtotals — numeric arrays do not have subtotals (unlike categorical variables, multiple response, and categorical arrays).
- Medians — numeric arrays do not display median statistics in the web app.
- Donut charts — numeric arrays cannot be plotted as donut charts (even for percentage-share statistics).
- Overall table shading — Z-score p-value shading is not available for numeric arrays. Only 'Set Comparison' is available.
- Uncertainty bands — no regions of uncertainty are plotted on time plots or other graphs (unlike multiple response and categorical variables, which show error bands).
- Web-based (GUI) array builder — only categorical arrays and multiple response arrays can be built in the GUI. To construct a numeric array from data already in Crunch, you need to use scripting (Crunch Automation or R).