See Crunch Automation basics for more information.
The CREATE CATEGORICAL RECODE command combines categories into a new variable. It achieves the same outcome as the Combine Categories interface in the web app. With Crunch Automation, however, you can specify multiple input variables (if they have the same categories to combine).
Arguments
MAPPING: You need at least one MAPPING argument, which specifies the categories you wish to combine. You can add as many MAPPING arguments as you need to cover the different combinations of categories.
CREATE CATEGORICAL
RECODE alias, ..., alias
MAPPING
code|"label", ..., code|"label" INTO "label" [CODE value [MISSING]],
code|"label", ..., code|"label" INTO "label" [CODE value [MISSING]]
[ELSE "label" [CODE value [MISSING]]|ELSE INTO NULL]
AS alias, ..., alias
[TITLE "string", ..., "string"]
[DESCRIPTION "string", ..., "string" | COPY]
[NOTES "string", ..., "string" | COPY];
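As an illustration of the syntax above, here is a minimal sketch that recodes two variables in one command. The aliases q1 and q2, the category codes 1 to 5, and the output aliases and labels are all hypothetical, and the sketch assumes both inputs share the same categories:

```
CREATE CATEGORICAL
RECODE q1, q2
MAPPING
1, 2 INTO "Agree" CODE 1,
4, 5 INTO "Disagree" CODE 2
ELSE "Neither" CODE 3
AS q1_recoded, q2_recoded
TITLE "Q1 (recoded)", "Q2 (recoded)";
```

Each input alias after RECODE is paired with one output alias after AS, and one title is supplied per output variable.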
Using REPLACE
You can also use REPLACE with the CATEGORICAL RECODE command. The difference is that REPLACE omits the AS... assignment at the end, which names the new variable. Instead, the variables to overwrite are listed at the beginning (before WHEN).
Everything else remains the same, including how the result is computed. The results are not stored in a new variable; they overwrite the listed variables:
REPLACE CATEGORICAL
RECODE alias,...,alias
WHEN condition THEN "label" [CODE value [MISSING]]
...
WHEN condition THEN "label" [CODE value [MISSING]]
[ELSE "string" [CODE value [MISSING]]|ELSE INTO NULL]
END;
Using the DATE attribute
The CREATE CATEGORICAL RECODE command also allows you to create categorical-date variables by specifying a DATE attribute for each of the categories defined in the command.
To create a categorical-date variable, every non-missing category must have a DATE attribute containing a valid, unique ISO 8601 date string:
CREATE CATEGORICAL
RECODE
cat_var
MAPPING
1, 2 INTO "first two" DATE "2022-01-21",
3, 4 INTO "Second two" DATE "2022-01-22",
5 INTO "Last two" DATE "2022-01-23"
AS my_recode TITLE "My recoded variable";
Use case
A few examples of where you might use the CREATE CATEGORICAL RECODE command include:
- A categorical variable with all 50 states of the USA, where you want a new variable that is grouped into regions (North-East, South, Midwest, and so on).
- A 0-10 scale that you want to bin into 0–6, 7–8, and 9–10.
- In the context of NPS questions, this is referred to as binning of Detractors, Passives, and Promoters respectively.
- A 100-point scale is represented as a categorical variable. In fact, you have a set of 15 variables like this: you want to bin them all into quartiles (1-25, 26-50, 51-75, and 76-100).
- Note: if the original variables were numeric, you would use CREATE CATEGORICAL CUT instead.
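The NPS binning described in the list above could be sketched as follows. This assumes a hypothetical categorical variable with the alias nps whose category codes match the scale values 0 to 10; the output alias nps_group and the title are illustrative only:

```
CREATE CATEGORICAL
RECODE nps
MAPPING
0, 1, 2, 3, 4, 5, 6 INTO "Detractors" CODE 1,
7, 8 INTO "Passives" CODE 2,
9, 10 INTO "Promoters" CODE 3
AS nps_group
TITLE "NPS group";
```

If the category codes in your dataset differ from the scale values, you can reference the categories by their quoted labels instead.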
In the Core Trends Mobile Broadband and Technology 2019 example, the following script generates a new region variable from the individual states in the USA:
CREATE CATEGORICAL
RECODE state
MAPPING
"NY", "PA", "MA", "NJ", "CT", "ME", "NH", "RI", "VT" INTO "Northeast",
"IL", "MI", "OH", "WI", "MO", "IN", "MN", "IA", "KS", "NE", "SD", "ND" INTO "Midwest",
"FL", "TX", "NC", "VA", "GA", "TN", "MD", "LA", "KY", "SC", "OK", "AL", "AR", "MS", "WV", "DC", "DE" INTO "South",
"CA", "WA", "AZ", "CO", "OR", "UT", "NV", "AK", "NM", "ID", "MT", "HI", "WY" INTO "West"
AS region_coded
TITLE "Region of USA"
DESCRIPTION "States coded into region";
The variable on the left (state) recodes into the new derived variable on the right: