If you work in a DP department (or are a researcher setting up an SPSS file in Crunch), be aware that SPSS files lose metadata that is important for analysis. Likewise, some direct survey importers don't set things up exactly as you might like.
The aim with Crunch is to have a 'clean and tidy' dataset. The 'tidy' part is particularly important because researchers can self-serve more effectively when they can find the data they are looking for. Furthermore, Crunch is designed to deal specifically with questionnaire structures, such as grids. To use Crunch's features most effectively, structure the dataset like a questionnaire (and not like a tab book).
At the data collection stage, questions are set up as multiple response questions and categorical arrays (typically scales, structured like a grid). These often need to be "reconstituted" in a Crunch dataset. Clean and tidy naming also helps users analyse efficiently in Crunch, because it helps them understand and recognise the data.
The basic principle is as follows: set up the dataset so the researcher can easily retrieve information. The most logical way to do this is to structure the dataset like the questionnaire.
This includes:
- Pithy variable names (i.e. titles on variable cards, acting as easy-to-understand headlines)
- Descriptions with the questionnaire wording, with question numbering optional
- Creating multiple response and categorical array variables (and hiding the individual subvariables)
- Organizing the Variable Organizer (left-side panel) with sensible folders
Detailed tips
Everything below falls under the “SETUP” phase in the Definitive Guide of Uploading and Preparing Data.
Follow the structure of the questionnaire when creating the initial folders
- Use the questionnaire breaks as the default (eg: "Screener" section, "Awareness" section, "KPI" section, etc), unless the researcher asks for something different.
- DP and researchers should not be shy about using sub-folders (ie: folders within folders) to keep things even more organized.
- Try to avoid big lists of variables in the Variable Organizer - use folders.
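As a rough illustration of the folder guidance above, the sketch below groups variables into folders by questionnaire section, with optional sub-folders. It is plain Python and does not call any Crunch API; the aliases, section names and the `build_folders` helper are all hypothetical.

```python
# Hypothetical variables: (alias, questionnaire section, optional sub-section).
variables = [
    ("q1_age", "Screener", None),
    ("q2_region", "Screener", None),
    ("q4_aware_unprompted", "Awareness", "Unprompted"),
    ("q5_aware_prompted", "Awareness", "Prompted"),
    ("q9_recommend", "KPI", None),
]


def build_folders(rows):
    """Group aliases into folders, keeping questionnaire order.

    Returns {section: {"variables": [...], "subfolders": {name: [...]}}}.
    """
    folders = {}
    for alias, section, subsection in rows:
        folder = folders.setdefault(section, {"variables": [], "subfolders": {}})
        if subsection is None:
            folder["variables"].append(alias)
        else:
            # A sub-folder keeps a long section from becoming one big list.
            folder["subfolders"].setdefault(subsection, []).append(alias)
    return folders


folders = build_folders(variables)
for section, content in folders.items():
    print(section, content)
```

The resulting plan can be reviewed with the researcher before the folders are actually created in the Variable Organizer.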
Variable names should be accurate without being long-winded (a fine balance).
- For example: “7032 – AGE” and “7132 – AGE VARIABLE” are not that useful. Better names might be:
- “All Age Categories” and “Age – Young vs. Old”
- Note: remember that
- you can make subtotals on the one “All Age Categories” variable, so you may only need one age variable after all
- researchers can make additional variables with Combine Categories
Generally, it's not necessary to include the Question Numbering (eg: “Q4”, "Q3004") in the Variable name, for three reasons:
- It is probably already in the variable as an alias or description
- It looks better as a description than a variable name
- The researcher can use the Search bar at the top to find a variable this way because the search includes descriptions
- Note: it may be useful to have the questionnaire number in the description because a Crunch search won’t consider the aliases (just the names, descriptions and labels)
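The point about search can be pictured with a toy model. The sketch below is not Crunch's search implementation; it only mimics the behaviour described above (names, descriptions and labels are searched, aliases are not). The `search` function and the variable records are hypothetical.

```python
def search(variables, term):
    """Return the names of variables whose name, description or labels
    contain `term`. Aliases are deliberately not searched."""
    term = term.lower()
    hits = []
    for var in variables:
        haystack = [var["name"], var["description"], *var.get("labels", [])]
        if any(term in text.lower() for text in haystack):
            hits.append(var["name"])
    return hits


variables = [
    {
        "alias": "Q4",
        "name": "Brand awareness",
        # Question number kept in the description, so it is searchable.
        "description": "Q4. Which of these brands have you heard of?",
        "labels": ["Brand A", "Brand B"],
    },
    {
        "alias": "Q7",
        "name": "Likelihood to recommend",
        # Question number only in the alias, so a search for it finds nothing.
        "description": "How likely are you to recommend us?",
        "labels": [],
    },
]

found_q4 = search(variables, "Q4")
found_q7 = search(variables, "Q7")
```

In this toy model the first variable is found by its question number and the second is not, which is why keeping the number in the description can be worthwhile.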
Variable names should be sentence case
- CAPS makes it seem as though the Variable Organizer is shouting at you!
- But more importantly, it’s easier to read sentence case at a glance
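If a file arrives with names in capitals, a first pass at sentence case can be scripted before the names are reviewed by hand. This is a minimal sketch in plain Python; the `to_sentence_case` helper and the `KEEP_UPPER` list of abbreviations are assumptions for illustration, not part of Crunch.

```python
# Hypothetical abbreviations that should stay in capitals.
KEEP_UPPER = {"NPS", "KPI", "TV", "UK"}


def to_sentence_case(name):
    """Capitalise only the first word, but leave known abbreviations alone."""
    result = []
    for i, word in enumerate(name.strip().split()):
        if word.upper() in KEEP_UPPER:
            result.append(word.upper())
        elif i == 0:
            result.append(word[:1].upper() + word[1:].lower())
        else:
            result.append(word.lower())
    return " ".join(result)


print(to_sentence_case("ALL AGE CATEGORIES"))
print(to_sentence_case("NPS SCORE BY REGION"))
```

Proper nouns such as brand names are lower-cased by this simple rule, so the output still needs a manual check.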
As a rule, all scale-type questions should be set up as a Categorical Array.
- Arrays are often very long, with lots of subvariables. These can all be subsumed under the parent array variable
- When a researcher is looking at an array, Crunch will automatically give the user the option to arrange the array by subvariable (the Tabs feature). You do not need to show the individual subvariables
- The very first step in setting up a dataset is to ‘atone’ for the loss of metadata in an SPSS file. The metadata lost here is the fact that these variables belong together as a single categorical array. This is the first step in the Definitive Guide for Uploading and Processing Data.
- Tip: set a sensible description, name and then nice short subvariable labels for the categorical array
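Finding which flat SPSS columns belong together can be partly automated. The sketch below assumes a hypothetical naming convention in which grid items share a stem and end in `_<number>` (for example `q10_1`, `q10_2`); real files may follow a different pattern. It is plain Python, and `array_candidates` is an illustrative helper, not a Crunch function.

```python
import re
from collections import defaultdict


def array_candidates(aliases, min_size=2):
    """Group aliases that share a stem and differ only by a trailing _<number>."""
    groups = defaultdict(list)
    for alias in aliases:
        match = re.fullmatch(r"(.+)_(\d+)", alias)
        if match:
            groups[match.group(1)].append(alias)
    # A single column on its own is not an array.
    return {stem: members for stem, members in groups.items() if len(members) >= min_size}


aliases = ["q1_age", "q10_1", "q10_2", "q10_3", "q11", "q12_1", "weight"]
candidates = array_candidates(aliases)
print(candidates)
```

Each candidate group still needs a check that its columns share the same categories before it is combined into one categorical array.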
Remove numeric values from nominal variables (eg: “Which region do you live in”)
- The values can produce means on tables, which will be meaningless and a distraction.
- Note: also consider removing the means on scales if they are not likely to be used.
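To see why those means are a distraction, consider the small sketch below. It is plain Python with a hypothetical category layout (`name`, `numeric_value`, `count`), not an export from Crunch, and the two helper functions are illustrative only.

```python
# Hypothetical region variable: the codes 1-4 are labels, not quantities.
region = [
    {"name": "North", "numeric_value": 1, "count": 100},
    {"name": "South", "numeric_value": 2, "count": 100},
    {"name": "East", "numeric_value": 3, "count": 100},
    {"name": "West", "numeric_value": 4, "count": 100},
]


def category_mean(categories):
    """Count-weighted mean of the numeric values, or None if there are none."""
    valued = [c for c in categories if c["numeric_value"] is not None]
    total = sum(c["count"] for c in valued)
    if total == 0:
        return None
    return sum(c["numeric_value"] * c["count"] for c in valued) / total


def strip_numeric_values(categories):
    """Return copies of the categories with the numeric values removed."""
    return [{**c, "numeric_value": None} for c in categories]


mean_before = category_mean(region)  # an "average region" has no meaning
mean_after = category_mean(strip_numeric_values(region))
print(mean_before, mean_after)
```

With the codes in place, the table reports an average region of 2.5; once the values are removed there is no mean left to report.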
Check to make sure the Variable Organizer makes sense.
- For example, a “Weight” Variable doesn’t belong in a Screener section.
- Tip: remember you can use the “Edit Variables” tab of the Dataset Properties to organize into folders, and quickly change the variable names and descriptions.