Multiple response data is sometimes stored in a fixed-column format, where the responses are encoded in the categories of several categorical variables rather than in dichotomous variables. Multiple response variables in Crunch must be in a dichotomous format, and the CREATE MULTIPLE SELECTION command performs this transformation.
Given a set of categorical variables as input, the CREATE MULTIPLE SELECTION command returns a multiple response variable. Each resulting subvariable takes one of the input categories as its selected value and reflects the combined responses across all the input variables. In other words, the "net" of each category across the input variables is transformed into a multiple response subvariable.
For example, you may have 10 input variables with the same 300 categories in each. The CREATE MULTIPLE SELECTION command transforms them into one multiple response variable with 300 subvariables. Each output subvariable is named after the corresponding category label.
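As a rough illustration of the "net" described above, the following Python sketch mimics the transformation on a tiny hypothetical dataset. It is not Crunch's implementation: the function name `to_multiple_selection` and the sample rows are assumptions made for this example only.

```python
def to_multiple_selection(rows):
    """Turn fixed-column responses into dichotomous subvariables.

    rows: one tuple per case; each entry is a category label,
          or None where the case gave no further response.
    Returns a dict mapping each category label to a list of booleans
    (True where the case chose that category in any input column).
    """
    # Collect the distinct categories in first-seen order.
    categories = []
    for row in rows:
        for value in row:
            if value is not None and value not in categories:
                categories.append(value)

    # One subvariable per category: the "net" across all input columns.
    return {cat: [cat in row for row in rows] for cat in categories}


# Hypothetical data: three cases, three input variables each.
rows = [
    ("Apple", "Banana", None),
    ("Banana", None, None),
    ("Cherry", "Apple", "Banana"),
]
result = to_multiple_selection(rows)
for label, selected in result.items():
    print(label, selected)
```

Each key of `result` plays the role of one output subvariable, named after its category label.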
Optional arguments
- SELECTED — Indicates the categories to consider as the selected value for all the listed variables. Categories can be given by ID or by name. If used, these categories must exist on all the listed variables.
- NOT SELECTED — If there are categories you don’t wish to include as subvariables in the multiple response output, you can list them here to exclude them. For example, 'None of these' might be a category you want to exclude. Categories can be referred to by category ID (which is not the numeric value) or by label.
- EXCLUDE EMPTY — If a case (a row of the dataset) has no responses in any of the resulting subvariables, that case will have missing data in the output variable. In other words, the numeric tally for every valid case will be at least one.
- INCLUDE MISSING — Marks all-missing rows as valid.
- LABELS — If present, this argument must contain the correct number of strings, one label for each subvariable of the new multiple response variable. The labels must all be unique. If absent, the existing variables' titles will be used.
- ALIASES — Sets the aliases of the subvariables of the new array. You must provide exactly as many aliases as the final array has subvariables (as with any alias, they must be unique in the dataset). If ALIASES is not provided, the subvariable aliases default to new_var_alias_#, where # is a sequential number.
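To make the EXCLUDE EMPTY description above more concrete, here is a minimal Python sketch of that one rule, applied to hypothetical dichotomous data. It only models the wording of the option as described here and is not how Crunch implements it; the function name `exclude_empty` and the sample values are assumptions.

```python
def exclude_empty(subvariables):
    """Mark cases with no selected subvariable as missing (None).

    subvariables: dict mapping a subvariable name to a list of booleans,
                  one entry per case. All lists must have the same length.
    """
    columns = list(subvariables.values())
    n_cases = len(columns[0])

    # A case is "empty" when it is not selected in any subvariable.
    empty = [not any(col[i] for col in columns) for i in range(n_cases)]

    return {
        name: [None if empty[i] else col[i] for i in range(n_cases)]
        for name, col in subvariables.items()
    }


# Hypothetical data: the second case selected nothing.
subvars = {
    "Apple":  [True, False, False],
    "Banana": [True, False, True],
}
cleaned = exclude_empty(subvars)

# Tally of selections for each case that is still valid.
tallies = [
    sum(col[i] for col in cleaned.values())
    for i in range(3)
    if cleaned["Apple"][i] is not None
]
print(cleaned)
print(tallies)
```

After the empty case is marked missing, every remaining case has a tally of at least one, which matches the description of the option.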
CREATE MULTIPLE SELECTION
alias, ..., alias
[SELECTED label|code, ..., label|code]
[NOT SELECTED label|code, ..., label|code]
[EXCLUDE EMPTY]
[LABELS "string", ..., "string"]
[ALIASES alias, ..., alias]
AS alias
[TITLE "string"]
[DESCRIPTION "string"]
[NOTES "string"];
Use case
In the Core Trends Mobile Broadband example, there are four categorical variables that encode race/ethnicity. The first variable captures the first response (hence it has no missing data), and the remaining variables capture the second, third, and fourth responses (e.g., if someone identifies with four different races).
The following script turns them into a single multiple response variable. In doing so, it excludes the 'Don’t know' and 'Refused' categories, which become missing data in the result (n=59):
CREATE MULTIPLE SELECTION
racem1, racem2, racem3, racem4
NOT SELECTED "Don't know", "Refused"
AS race_mr
TITLE "Race"
DESCRIPTION "Which of the following ethnicities do you identify with?";
which results in the following: