Overview
The following is a resource guide for R users who want to set up datasets in R using the Crunch packages. Note that the recommended method for preparing your data is Crunch Automation. Use the R/API methods for manipulating data with caution; in general, the only R command you should need to prepare a dataset is runCrunchAutomation().
These instructions assume that a dataset has already been uploaded into Crunch and shared with the user. The user must also have edit permissions on the dataset and its associated folder(s).
One-time setup
Install RStudio and the crunch R package (with its dependencies).
Project setup
- Open RStudio.
- Load the Crunch package.
- Load your dataset.
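The setup steps above might look roughly like this in an R session. This is a sketch rather than a definitive recipe: the helper name prepare_dataset, the dataset name, and the script file name are hypothetical placeholders, and it assumes your session is already authenticated with Crunch (how you authenticate depends on your package version and account, so check the package documentation).

```r
# One-time setup (uncomment and run once):
# install.packages("crunch")

# Hypothetical helper: wrapped in a function so that nothing contacts the
# Crunch server until you call it yourself.
prepare_dataset <- function(dataset_name, script_path) {
  # Load a dataset that has been shared with you and that you can edit
  ds <- crunch::loadDataset(dataset_name)
  # Apply a Crunch Automation script stored in a local text file
  ds <- crunch::runCrunchAutomation(ds, script_path, is_file = TRUE)
  ds
}
```

A call would then look like ds <- prepare_dataset("My survey", "setup_script.txt"), where both arguments are stand-ins for your own dataset name and script file.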
Optional setup
See the following article for more information about organizing datasets:
Initial cleaning
See the following article for more information about how to delete variables:
Checking and changing variable properties
For information on variable properties in Crunch, please see the following article:
- Rename variables to facilitate analyses
- Best practice is to use a descriptive title that summarizes what the variable is (e.g., 'Gender')
- SET TITLE variables in Crunch Automation
- Give variable Descriptions (optional but recommended)
- SET DESCRIPTION in Crunch Automation
- Typically, this includes the actual wording of the question (e.g., 'Q1: Are you male or female?')
- Change the Aliases (optional)
- RENAME in Crunch Automation
- Modify Variable Type
- REPLACE <TYPE> alias
- Determine whether any variables that are numeric should be changed to categorical
- Determine whether any categorical variables should be changed to numeric
- Set any relevant date/time variables
- See Change Crunch variable types for more information
- Make any variables that are supposed to be weights available as weights
- SET WEIGHT
- Make any variables that are supposed to be filters available as filters
- CREATE FILTER
- Correct the values for any variables for which you want the mean shown, including:
- NPS recoding, reversing the scale for rating-scale questions, or other recodings
- Capping (if relevant)
- Setting outliers to missing values (if relevant)
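To make the value corrections listed above concrete, here is a minimal base R sketch on local vectors. It does not touch a Crunch dataset or use Crunch Automation; the data, variable names, and the cap of 100 are all invented for illustration.

```r
# Hypothetical local data standing in for survey responses
recommend <- c(10, 9, 8, 7, 6, 0, 9, 10)   # 0-10 likelihood to recommend
agree     <- c(1, 2, 5, 4, 3)              # 1-5 scale, 1 = strongly agree
spend     <- c(20, 35, 50, 5000, 40, NA)   # numeric with one extreme value

# NPS recoding: promoters (9-10) = 100, passives (7-8) = 0,
# detractors (0-6) = -100, so the mean of the recoded values is the NPS.
nps_value <- ifelse(recommend >= 9, 100, ifelse(recommend >= 7, 0, -100))
nps_score <- mean(nps_value)

# Scale reversal on a 1-5 scale: 1 becomes 5, 2 becomes 4, and so on
agree_reversed <- 6 - agree

# Capping: values above the (arbitrary) cap are pulled down to the cap
spend_capped <- pmin(spend, 100)

# Outliers to missing: values above the cap become NA instead
spend_clean <- ifelse(spend > 100, NA, spend)
```

Capping keeps the respondent in the base for the mean, while setting the value to missing removes them from it, so the two choices give different means.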
Creating multi-variable sets
- Define Multiple Response questions
- Derive a new variable with an option to hide the original contributing source variables
- Define Categorical Arrays
- Derive a new variable with an option to hide the original contributing source variables
- Split Categorical Array into component variables
- Split Multiple Response questions into component variables
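As a rough illustration of what a multiple-response set represents, the base R sketch below treats several 0/1 columns as one question. It runs on a made-up local data frame and is not how Crunch defines or stores such variables; the column names are hypothetical.

```r
# Hypothetical local data: one 0/1 column per brand a respondent selected
responses <- data.frame(
  brand_a = c(1, 0, 1, 1),
  brand_b = c(0, 0, 1, 0),
  brand_c = c(1, 1, 0, 0)
)

# Viewed as a single multiple-response question: the share of respondents
# selecting each item (shares can add up to more than 1)
mr_shares <- colMeans(responses == 1)

# Number of items each respondent selected
n_selected <- rowSums(responses == 1)
```

Splitting the set back into component variables corresponds to going back to the individual columns.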
Creating more variables (optional)
- Banded time variables (as an approach to time-series smoothing)
- Improved smoothing techniques will become available soon in the Crunch web app
- Use any of the following:
- filters
- interaction variables
- weights
- banners (called Multitables in Crunch)
- other variables you may need
- Standardize variables
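Two of the items above, banded time variables and standardized variables, can be sketched in base R as follows. The dates and scores are invented, and the code works on local vectors rather than on a Crunch dataset.

```r
# Hypothetical interview dates
interview_date <- as.Date(c("2023-01-03", "2023-01-28",
                            "2023-02-14", "2023-04-02"))

# Band dates into months and quarters; coarser bands smooth a time series
month_band   <- format(interview_date, "%Y-%m")
quarter_band <- paste(format(interview_date, "%Y"), quarters(interview_date))

# Standardize a numeric variable to mean 0 and standard deviation 1
score   <- c(2, 4, 6, 8)
score_z <- (score - mean(score)) / sd(score)
```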
Setting the base
- Defining Missing Values for each variable
- For example, include or exclude 'Don’t Knows' from a scale
- Rebasing questions based on other questions
- Fixing survey skips
- Basing to another question’s response
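The effect of these base settings on a mean can be seen in a small base R sketch. The data frame, the code 99 for "Don't know", and the variable names are assumptions made up for this example; nothing here calls the Crunch API.

```r
# Hypothetical local data; 99 is assumed to mean "Don't know"
survey <- data.frame(
  owns_car     = c("Yes", "No", "Yes", "Yes", "No"),
  satisfaction = c(4, 3, 99, 5, 2)
)

# Define "Don't know" as missing so it drops out of the base
survey$satisfaction[survey$satisfaction == 99] <- NA
mean_all <- mean(survey$satisfaction, na.rm = TRUE)

# Rebase to another question's response: keep answers from car owners only
survey$satisfaction_owners <- ifelse(survey$owns_car == "Yes",
                                     survey$satisfaction, NA)
mean_owners <- mean(survey$satisfaction_owners, na.rm = TRUE)
```

Leaving 99 in the data would have inflated the mean, which is why missing values should be defined before any means are reported.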
Changing variable summary information
- Set subtotals (NETs)
- Merging categories
- Derive a new variable with an option to hide the original contributing source variables
- Creating banded categories ("Buckets")
- From a numeric variable
- From a categorical variable
- Derive a new variable with an option to hide the original contributing source variables
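The summary changes above can be illustrated locally in base R. This sketch uses invented data and band boundaries; it leaves the original vectors untouched and creates new ones, mirroring the idea of deriving a new variable instead of overwriting the source.

```r
# Banding a numeric variable into categories ("buckets")
age <- c(19, 25, 34, 47, 52, 68)
age_band <- cut(age, breaks = c(18, 35, 55, Inf), right = FALSE,
                labels = c("18-34", "35-54", "55+"))

# Merging categories of a categorical variable into a new variable
rating <- factor(
  c("Very good", "Good", "Neutral", "Bad", "Very bad", "Good"),
  levels = c("Very good", "Good", "Neutral", "Bad", "Very bad")
)
rating_merged <- rating
levels(rating_merged) <- list(
  Positive = c("Very good", "Good"),
  Neutral  = "Neutral",
  Negative = c("Bad", "Very bad")
)

# A subtotal (NET): share of respondents in the top two categories
top2_share <- mean(rating %in% c("Very good", "Good"))
```

A subtotal only adds a summary row on top of the existing categories, whereas merging replaces them, so the choice depends on whether the detailed categories still need to be shown.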
Housekeeping
- Setting up the Variable Sidebar
- Make folders and sub-folders in the variable organizer
- Put variables into folders
- Change or set the order of variables within each folder
- Hiding variables that you no longer need to see and removing clutter, such as:
- Variables that were used to define the categorical arrays and/or MR questions
Final cleaning
- Removing cases you don’t want in the dataset (also known as "Exclusions"), such as:
- Bad sample (e.g., respondents who do not meet the screener criteria)
- Outliers
- Speedsters
- Flat-liners/straight-liners
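One way to flag such cases before excluding them is sketched below in base R. The data, the column names, and the speedster threshold (one third of the median duration) are assumptions chosen for illustration; the right rules depend on your survey.

```r
# Hypothetical local data: interview length plus three grid items
survey <- data.frame(
  id           = 1:5,
  duration_sec = c(420, 95, 380, 510, 60),
  q1 = c(3, 5, 2, 4, 1),
  q2 = c(4, 5, 2, 3, 2),
  q3 = c(2, 5, 2, 4, 1)
)
grid_items <- c("q1", "q2", "q3")

# Speedsters: finished in under one third of the median duration
speedster <- survey$duration_sec < median(survey$duration_sec) / 3

# Straight-liners: gave the same answer to every item in the grid
straight_liner <- apply(survey[grid_items], 1,
                        function(row) length(unique(row)) == 1)

# Keep only the cases that are not flagged by either rule
kept <- survey[!(speedster | straight_liner), ]
```

Flagging first and excluding second makes it easy to review how many cases each rule removes before committing to it.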
Analysis
See the following article for more information about decks and slides: