Setting up a dataset with R

Overview

The following is a resource guide for R users who want to set up datasets using the Crunch packages. Please note that the recommended method for preparing your data is Crunch Automation. The R/API methods for manipulating data should be used with caution; in general, the only R command you should need to prepare a dataset is runCrunchAutomation().

These instructions assume that a dataset has already been uploaded into Crunch and shared with the user. The user also must have edit permissions to the dataset and associated folder(s).

One-time setup

Install RStudio and the Crunch package (with its dependencies).

Project setup

Open RStudio.
Load the Crunch package.
Load your dataset.
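
The project setup steps, together with the recommended runCrunchAutomation() call, might look like the sketch below. It assumes the crunch package is already installed (for example with install.packages("crunch")) and that authentication is configured as described in the package documentation. The helper name prepare_dataset, the dataset name, and the script file name are hypothetical.

```r
# Sketch only: prepare_dataset() is a hypothetical helper, not part of crunch.
# Assumes the crunch package is installed and authentication is configured.
prepare_dataset <- function(dataset_name, script) {
  # Load the dataset that has been shared with you (edit permissions needed).
  ds <- crunch::loadDataset(dataset_name)
  # Run a Crunch Automation script. `script` may be a path to a text file
  # or a string containing the commands. The updated dataset is returned.
  crunch::runCrunchAutomation(ds, script)
}

# Example call (hypothetical names), left commented out because it needs a
# live Crunch session:
# ds <- prepare_dataset("My survey", "setup_script.txt")
```

Wrapping the two calls in a function is only a convenience; running loadDataset() and runCrunchAutomation() directly in the console works the same way.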

Optional setup

See the following article for more information about organizing datasets:

File your dataset into the right project folder

Initial cleaning

See the following article for more information about how to delete variables:

Deleting variables

Checking and changing variable properties

For information on variable properties in Crunch, please see the following article:

Variable Properties

Rename variables to facilitate analyses
- Best practice is to use a descriptive title that summarizes what the variable is (e.g., 'Gender')
- SET TITLE in Crunch Automation
Give variable Descriptions (optional but recommended)
- SET DESCRIPTION in Crunch Automation
- Typically, this includes the actual wording of the question (e.g., 'Q1: Are you male or female?')
Change the Aliases (optional)
- RENAME in Crunch Automation
Modify Variable Type
- REPLACE <TYPE> alias
- Determine whether any variables that are numeric should be changed to categorical
- Determine whether any categorical variables should be changed to numeric
- Set any relevant date/time variables
- See Change Crunch variable types for more information
Make any variables that are supposed to be weights available as a weight
- SET WEIGHT
Make any variables that are supposed to be filters available as filters
- CREATE FILTER
Correct the values for any variables for which you want the mean shown, including:
- NPS recoding, scale reverse for rating scale questions or other recodings
- Capping (if relevant)
- Setting outliers to missing values (if relevant)
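
The value corrections listed above can be illustrated on plain R vectors. This is a minimal sketch of the arithmetic only, not of how the change is applied to a Crunch variable; the sample values, the cap of 100, and the outlier threshold of 1000 are all illustrative assumptions.

```r
# NPS recoding: promoters (9-10) become 100, passives (7-8) become 0 and
# detractors (0-6) become -100, so the mean of the recoded values is the
# Net Promoter Score.
nps_raw <- c(10, 9, 7, 3, 8, 0, 10)
nps_recoded <- ifelse(nps_raw >= 9, 100, ifelse(nps_raw >= 7, 0, -100))
nps <- mean(nps_recoded)

# Scale reversal for a 1-5 rating scale: 1 becomes 5, 2 becomes 4, and so on.
rating <- c(1, 2, 5, 4)
rating_reversed <- (5 + 1) - rating

# Capping: values above the cap are pulled down to the cap.
spend <- c(20, 35, 5000, 40)
spend_capped <- pmin(spend, 100)

# Outliers to missing: values above the threshold are dropped from the mean.
spend_no_outliers <- replace(spend, spend > 1000, NA)
mean_spend <- mean(spend_no_outliers, na.rm = TRUE)
```

Capping keeps the respondent in the base with a reduced value, whereas setting the value to missing removes the respondent from the base for that variable.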

Creating multi-variable sets

Define Multiple Response questions
- Derive a new variable with an option to hide the original contributing source variables
Define Categorical Arrays
- Derive a new variable with an option to hide the original contributing source variables
Split Categorical Array into component variables
Split Multiple Response questions into component variables
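
For users who do need the R route rather than Crunch Automation, deriving a multiple response variable and a categorical array might look like the sketch below. It assumes the crunch package's deriveArray() and hideVariables() functions; the helper name define_sets, every variable alias, and the "Yes" category are hypothetical and must be adapted to your dataset.

```r
# Sketch only: define_sets() and all aliases below are hypothetical.
# Needs the crunch package and a dataset loaded with loadDataset().
define_sets <- function(ds) {
  # Multiple response: bind yes/no source variables and mark "Yes" as the
  # selected category.
  ds$media_use <- crunch::deriveArray(
    subvariables = list(ds$media_tv, ds$media_radio, ds$media_online),
    name = "Media used in the past week",
    selections = "Yes"
  )

  # Categorical array: bind variables that share the same categories.
  ds$brand_ratings <- crunch::deriveArray(
    subvariables = list(ds$rate_brand_a, ds$rate_brand_b),
    name = "Brand ratings"
  )

  # Optionally hide the contributing source variables.
  ds <- crunch::hideVariables(
    ds,
    c("media_tv", "media_radio", "media_online", "rate_brand_a", "rate_brand_b")
  )
  ds
}
```

Because the new variables are derived, the source variables remain in the dataset and are only hidden, not deleted.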

Creating more variables (optional)

Banded time variables (as an approach to time-series smoothing)
- Improved smoothing techniques will become available soon in the Crunch web app
Use any of the following:
- filters
- interaction variables
- weights
- banners (called Multitables in Crunch)
- other variables you may need

Setting the base

Defining Missing Values for each variable
- For example, include or exclude 'Don’t Knows' from a scale
Rebasing questions based on other questions
- Fixing survey skips
- Basing to another question’s response
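
The effect of both steps on the base can be seen on plain R vectors. This sketch shows only the arithmetic; the code 9 for "Don't know" and the ownership question are illustrative assumptions, not part of any real dataset.

```r
# A 1-5 satisfaction scale where 9 is the (illustrative) code for "Don't know".
satisfaction <- c(5, 4, 9, 3, 9, 4)

# Treating "Don't know" as missing removes it from the base of the mean.
satisfaction_valid <- replace(satisfaction, satisfaction == 9, NA)
mean_with_dk <- mean(satisfaction)
mean_without_dk <- mean(satisfaction_valid, na.rm = TRUE)

# Rebasing: a follow-up question should only count respondents who own the
# product, so everyone else is set to missing.
owns_product <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)
would_rebuy <- c("Yes", "No", "No", "Yes", "No", "Yes")
would_rebuy_rebased <- ifelse(owns_product, would_rebuy, NA)

share_yes_all <- mean(would_rebuy == "Yes")
share_yes_owners <- mean(would_rebuy_rebased == "Yes", na.rm = TRUE)
```

Here the share answering "Yes" rises from 3 of 6 respondents to 3 of 4 once the base is restricted to owners.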

Changing variable summary information

Set subtotals (NETs)
Merging categories
- Derive a new variable with an option to hide the original contributing source variables
Creating banded categories ("Buckets")
- From a numeric variable
- From a categorical variable
- Derive a new variable with an option to hide the original contributing source variables
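
Merging and banding can be sketched in base R on plain vectors. The break points, labels, and region names below are illustrative assumptions; the sketch shows the logic of the recode rather than the Crunch command that applies it.

```r
# Banding a numeric variable: right = FALSE makes each band include its
# lower bound and exclude its upper bound, e.g. [18, 35).
age <- c(18, 25, 34, 35, 52, 67)
age_band <- cut(
  age,
  breaks = c(18, 35, 55, Inf),
  labels = c("18-34", "35-54", "55+"),
  right = FALSE
)

# Merging categories: work on a copy so the source variable is preserved,
# then map groups of old levels to new ones.
region <- factor(c("North", "South", "East", "West", "North"))
region_merged <- region
levels(region_merged) <- list(
  "North/South" = c("North", "South"),
  "East/West" = c("East", "West")
)

table(age_band)
table(region_merged)
```

Keeping the original vector untouched mirrors the recommendation to derive a new variable rather than overwrite the source.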

Housekeeping

Setting up the Variable Sidebar
- Make folders and sub-folders in the variable organizer
- Put variables into folders
- Change or set the order of variables within each folder
Hiding variables that you no longer need to see and removing clutter, such as:
- Variables that were used to define the categorical arrays and/or MR questions

Final cleaning

Removing cases you don’t want in the dataset (also known as "Exclusions"), such as:
- Bad sample (e.g., respondents who did not meet the screener)
- Outliers
- Speedsters
- Flat-liners/straight-liners
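
Flagging speedsters and straight-liners can be sketched on a plain data frame. The column names, the sample data, and the rule of thumb of 30% of the median duration are illustrative assumptions; choose thresholds that suit your survey.

```r
# Illustrative responses: interview length plus a three-item rating grid.
responses <- data.frame(
  duration_sec = c(610, 95, 540, 480),
  q1 = c(4, 3, 2, 5),
  q2 = c(3, 3, 4, 5),
  q3 = c(5, 3, 1, 5)
)

# Speedsters: completed in less than 30% of the median duration.
speed_cutoff <- 0.3 * median(responses$duration_sec)
is_speedster <- responses$duration_sec < speed_cutoff

# Straight-liners: gave the same answer to every item in the grid.
grid <- responses[, c("q1", "q2", "q3")]
is_straightliner <- apply(grid, 1, function(row) length(unique(row)) == 1)

# Cases that would remain after excluding both groups.
kept <- responses[!(is_speedster | is_straightliner), ]
```

In the crunch package, an exclusion is set with an expression on dataset variables via exclusion(ds) <- ..., so flags like these would need to exist as variables in the dataset first.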

Analysis

See the following article for more information about decks and slides:

Automatically creating a deck with slides
