The CREATE MULTIPLE DICHOTOMY WITH RECODE command allows you to take a set of uniquely-coded categorical variables and indicate which category to use on each of them to create a multiple response variable.
As illustrated in the use cases below, this command efficiently solves the following common data-related problems:
- Creating multiple response data from non-dichotomous values
- Example: instead of 1s and 0s, you start with data that code 1 in the first variable, code 2 in the second variable, and so on.
- Recoding certain cases to missing in a new multiple response variable
- Example: you use the EXCLUDE EMPTY argument to set cases to missing when they have no selected answers in any of the input categorical variables (DP jargon sometimes refers to this as 'basing to only those who answered').
Optional arguments
The template shown below has many optional arguments, which is part of what makes this command so powerful and efficient. As you work through the options, keep in mind that:
- You can indicate the new LABEL you want for each subvariable as you go.
- The ALIAS option allows you to indicate what alias you wish to assign to a subvariable. An alias must be unique in the system.
- You can indicate for each input variable what specific category to include as SELECTED or NOT SELECTED (or multiples of each) in the final array.
- The template shows [square brackets] and (curved parentheses) after each alias. Anything in [square brackets] is optional, but if you do use an option, you must include any (curved parentheses) shown inside it. Pay close attention to how the brackets nest.
- You can even use conditional expressions to define the non-selected values. In this sense, you can customize the missingness of the subvariable you are creating (DP jargon sometimes refers to this as "setting the base" of the output subvariable).
Additionally, you can add a WITH block after the input variables to set default values for all of them. That way, every variable always has SELECTED, NOT SELECTED, EXCLUDE EMPTY, and LABEL settings configured.
CREATE MULTIPLE DICHOTOMY WITH RECODE
alias [([LABEL label] [SELECTED code|label] [NOT SELECTED (code|label|WHEN expression)])],
...
alias [([LABEL label] [SELECTED code|label] [NOT SELECTED (code|label|WHEN expression)])]
[WITH
[ALIAS]
[SELECTED code|label, ..., code|label]
[NOT SELECTED code|label, ..., code|label]
[EXCLUDE EMPTY]
[LABELS FROM CATEGORIES]
]
AS alias
[TITLE "string"]
[DESCRIPTION "string"]
[NOTES "string"];
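One way to picture how the WITH block relates to the per-variable options is as a set of defaults that each input variable starts from. The Python sketch below illustrates that idea only. It assumes that an option given for an individual variable takes precedence over the WITH default, and the names `resolve_options`, `defaults`, and the dictionary keys are hypothetical, not part of the command.

```python
def resolve_options(per_variable, defaults):
    """Return the effective options for one input variable.

    Assumption for this sketch: per-variable options override WITH defaults.
    """
    resolved = dict(defaults)      # start from the WITH block defaults
    resolved.update(per_variable)  # then apply the variable's own options
    return resolved


# Hypothetical WITH block: NOT SELECTED 0, LABELS FROM CATEGORIES
defaults = {"not_selected": [0], "labels_from_categories": True}

# Hypothetical per-variable options
q1_1 = resolve_options({"selected": [1]}, defaults)
q1_2 = resolve_options({"selected": [2], "label": "Adidas"}, defaults)
```

In this sketch, `q1_1` and `q1_2` both end up with a NOT SELECTED setting even though neither declares one itself.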
Use cases
You can use this sample datafile (SPSS .sav format) for the following use cases.
Creating multiple response data from non-dichotomous values
- Example: instead of 1s and 0s, you start with data that code 1 in the first variable, code 2 in the second variable, and so on.
Consider the following example, in which five variables belong together in one multiple response variable. Each input variable has different values and categories:
In each respective variable, CODE 1 = Nike, CODE 2 = Adidas, CODE 3 = Puma, and so on. Every variable also has a CODE 0, which is distinct from missing data.
The goal is to set them into a multiple response variable, as shown in the following:
However, these variables cannot be combined into a multiple dichotomy variable without some recoding. The CREATE MULTIPLE DICHOTOMY WITH RECODE command handles this:
CREATE MULTIPLE DICHOTOMY WITH RECODE
Q1_1 (SELECTED 1),
Q1_2 (SELECTED 2),
Q1_3 (SELECTED 3),
Q1_4 (SELECTED 4),
Q1_5 (SELECTED 5)
WITH
NOT SELECTED 0
LABELS FROM CATEGORIES
AS Q1_mr
TITLE "Sports Brands Purchased"
DESCRIPTION "Which of the following brands have you purchased in the past 6 months?";
- For each input subvariable, a different code is selected. You can also use a label, such as “Nike”, “Adidas”, and so on.
- The WITH section in the above command is essential because without it there would be no NOT SELECTED categories. If you run the code without the WITH section, all of the resulting subvariables show 100% (because they contain only SELECTED and missing data).
- There is a different level of missing data in each of the subvariables (hence the range of base sizes).
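The recoding behavior described in these points can be sketched in Python. This is an illustration only, not how Crunch implements the command. The function names, the sample column, and the use of `None` for missing data are all assumptions made for the sketch.

```python
def recode_subvariable(value, selected, not_selected=()):
    """Map one input code to 'selected', 'not selected', or None (missing)."""
    if value in selected:
        return "selected"
    if value in not_selected:
        return "not selected"
    return None  # any other code, or a missing input, stays missing


def percent_selected(values):
    """Percentage of non-missing values that are 'selected'."""
    valid = [v for v in values if v is not None]
    if not valid:
        return None  # no base to percentage on
    return 100.0 * sum(v == "selected" for v in valid) / len(valid)


# Hypothetical Q1_1 column: 1 = Nike, 0 = not purchased, None = missing
q1_1 = [1, 0, 0, None, 1]

# With the WITH default (NOT SELECTED 0), code 0 counts toward the base
with_default = [recode_subvariable(v, selected={1}, not_selected={0}) for v in q1_1]

# Without it, code 0 falls through to missing and only 'selected' remains
without_default = [recode_subvariable(v, selected={1}) for v in q1_1]
```

With these made-up values, `percent_selected(with_default)` is 50.0 while `percent_selected(without_default)` is 100.0, which mirrors the 100% result described above when no NOT SELECTED category is defined.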
Similarly, Q2 in the following example contains the same counts as Q1, except that it has no 0 categories, only missing data. As a result, each input variable shows 100% for its respective brand:
CREATE MULTIPLE DICHOTOMY WITH RECODE
Q2_1 (SELECTED 1),
Q2_2 (SELECTED 2),
Q2_3 (SELECTED 3),
Q2_4 (SELECTED 4),
Q2_5 (SELECTED 5)
WITH
NOT SELECTED NULL
LABELS FROM CATEGORIES
AS Q2_mr
TITLE "Sports Brands Purchased - Amongst all"
DESCRIPTION "Which of the following brands have you purchased in the past 6 months?";
In this example, using WITH and NOT SELECTED NULL means that all missing values are converted to 0s in the resulting multiple response variable:
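The effect of NOT SELECTED NULL can be sketched in Python as follows. This is a hedged illustration of the behavior described here rather than the actual implementation. The function name and sample data are hypothetical, `None` stands in for missing data, and the handling of codes other than the selected one is an assumption.

```python
def recode_null_as_not_selected(value, selected_code):
    """Return 1 for the selected code and 0 for missing input."""
    if value == selected_code:
        return 1
    if value is None:
        return 0  # NOT SELECTED NULL: missing becomes 'not selected'
    return None   # assumption: any other code is left as missing


# Hypothetical Q2_2 column: 2 = Adidas, None = missing
q2_2 = [2, None, None, 2, None]
recoded = [recode_null_as_not_selected(v, selected_code=2) for v in q2_2]
```

Every respondent now contributes to the base of the subvariable, which is why the title in the example reads "Amongst all".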
Recoding certain cases to missing in a new multiple response variable
- Example: Use the EXCLUDE EMPTY argument to set cases to missing when they have no selected answers in any of the input categorical variables (DP jargon sometimes refers to this as "basing to only those who answered").
Continuing from the Q2 example above, suppose you want to include only respondents who purchased at least one of the five brands. The EXCLUDE EMPTY option sets that for you:
CREATE MULTIPLE DICHOTOMY WITH RECODE
Q2_1 (SELECTED 1),
Q2_2 (SELECTED 2),
Q2_3 (SELECTED 3),
Q2_4 (SELECTED 4),
Q2_5 (SELECTED 5)
WITH
NOT SELECTED NULL
EXCLUDE EMPTY
LABELS FROM CATEGORIES
AS Q2_mr
TITLE "Sports Brands Purchased - Amongst purchasers"
DESCRIPTION "Which of the following brands have you purchased in the past 6 months?"
NOTES "Base: Those indicating they’ve purchased at least one of the brands";
Notice the small, but important, difference from the previous example. In this case, one respondent (out of 20) has been set to missing data (because they didn’t purchase any of the sports brands):
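The EXCLUDE EMPTY step can be sketched in Python like this. It is a minimal illustration under stated assumptions: each respondent is a row of already-recoded values (1 = selected, 0 = not selected), `None` stands in for missing data, and the function name `exclude_empty` and the sample rows are hypothetical.

```python
def exclude_empty(rows):
    """Set a respondent's whole row to missing if nothing is selected."""
    result = []
    for row in rows:
        if any(value == 1 for value in row):
            result.append(list(row))           # at least one selection: keep
        else:
            result.append([None] * len(row))   # no selections: drop from base
    return result


# Hypothetical recoded rows for three respondents and three brands
rows = [
    [1, 0, 0],  # purchased the first brand
    [0, 0, 0],  # purchased none of the brands
    [0, 1, 1],  # purchased the second and third brands
]
based = exclude_empty(rows)
```

In this sketch, only the second respondent is set to missing on every subvariable, so the base shrinks by one, just as one respondent out of 20 drops out in the example above.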