The CREATE MULTIPLE DICHOTOMY command allows you to create a multiple response variable from a list of categorical variables.
- Often your dataset will import with metadata that tells Crunch which individual dichotomous columns of data belong together. That should be the case if the data is captured in the survey as a multiple response and the data is stored in a binary format (1s and 0s and missing data). But if that metadata is lacking, you might need to derive the multiple response variables.
- Alternatively, you may want to create a new variable that is a summary of responses across different questions.
A Multiple Response variable is very similar to a Categorical Array, in that you derive a new variable that has subvariables. The key difference is that you must specify what the SELECTED category or categories are. This is what makes it a “dichotomy”: each case is either selected or it’s not (or it’s missing data). See this article for more information on Categorical Arrays and Multiple Response variables.
- The input variables for this command must be individual categorical variables. You do not input an array with this command. If you want to input an array, you should refer to the CREATE MULTIPLE DICHOTOMY FROM command.
- All of the contributing variables MUST have the same categories. If they do not have the same categories, then you may need to do a recode. The CREATE MULTIPLE DICHOTOMY WITH RECODE command is useful in this case.
- See this article for a summary of these related commands.
The LABELS modifier is optional, but without it, you will simply have the aliases of the contributing variables, which more often than not are not particularly informative or attractive. So it's best to either:
- Specify each label now as you create it (“string”) in the same order as the order of the input variables, or
- Borrow from the TITLES or DESCRIPTIONS of the contributing variables (which can be handy if the contributing variables have clean metadata).
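As a minimal sketch of the second option, suppose three categorical variables already carry clean titles and share a "Yes" category. The subvariable labels can then be borrowed rather than typed out. The aliases `q5_1`, `q5_2`, `q5_3` and `q5_mr`, the title text, and the `#` comment line are illustrative assumptions, not part of any real dataset.

```
# Hypothetical aliases; assumes each input has a tidy title and a "Yes" category.
CREATE MULTIPLE DICHOTOMY q5_1, q5_2, q5_3
LABELS COPY(TITLE)
SELECTED "Yes"
AS q5_mr
TITLE "Devices owned";
```

Swapping `COPY(TITLE)` for `COPY(DESCRIPTION)` would borrow the descriptions instead, which is only worthwhile when those are short enough to read well as labels.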
The optional ALIASES argument allows you to indicate the aliases that each of the subvariables will have in the new dichotomy. If absent, they will default to a __1 prefix. When used, it requires you to indicate a new alias for each of the present subvariables, which must be unique within the entire dataset. Otherwise, the command will return an error that indicates incomplete aliases or duplicate ones.
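A sketch of how ALIASES might be used, following the syntax for this command: one new alias is supplied per input variable, and each is assumed not to exist anywhere else in the dataset. All aliases and labels here (`q7_1`, `q7_mr_phone`, and so on) are hypothetical, and listing the new aliases in input order is an assumption.

```
# Hypothetical aliases; one new, dataset-unique alias per input variable.
CREATE MULTIPLE DICHOTOMY q7_1, q7_2, q7_3
LABELS "Smartphone", "Tablet", "Laptop"
ALIASES "q7_mr_phone", "q7_mr_tablet", "q7_mr_laptop"
SELECTED "Yes"
AS q7_mr;
```

Supplying only two aliases for these three inputs, or reusing an alias that already exists in the dataset, should trigger the error described above.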
The HIDE INPUTS modifier is a handy way to move the contributing variables to the hidden folder. If you are setting up a multiple response variable to match the survey structure, you generally do not need to see the contributing variables in the variable tree. Remember, you can still refer to contributing variables via their unique alias even if they are hidden.
CREATE MULTIPLE DICHOTOMY alias, ..., alias [LABELS ("string", ..., "string"| COPY (TITLE|DESCRIPTION))] [ALIASES "string", ..., "string"] SELECTED code|"label", ..., code|"label" [HIDE INPUTS] AS alias [TITLE "string"] [DESCRIPTION "string"] [NOTES "string"];
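To make the syntax concrete, here are two minimal sketches that use only clauses from the line above. The first keeps just the required parts (inputs, SELECTED, and AS). The second selects more than one category by numeric code. The aliases (`q9_1`, `q10_top2`, and so on) are hypothetical, as is the assumption that codes 1 and 2 are the two "agree" categories.

```
# Minimal form: inputs, SELECTED, and AS only (hypothetical aliases).
CREATE MULTIPLE DICHOTOMY q9_1, q9_2, q9_3
SELECTED "Yes"
AS q9_mr;

# Several selected categories, given by code; assumes 1 and 2 are the "agree" codes.
CREATE MULTIPLE DICHOTOMY q10_1, q10_2, q10_3
SELECTED 1, 2
AS q10_top2
TITLE "Agrees with statement";
```

Because the first command omits LABELS, its subvariables would be labeled with the input aliases.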
In the Mobile Technology and Home Broadband 2019 example study, the raw SPSS file does not contain information about the grouping of variables, so we need to declare the grouping when setting up the dataset.
Consider the following variables in a raw import:
These all belong together. If we wanted to, we could make it into a categorical array, but in this case, we are only interested in the “Yes” response—so we make it into a multiple response variable.
CREATE MULTIPLE DICHOTOMY bbsmart3a, bbsmart3b, bbsmart3c, bbsmart3d, bbsmart3e, bbsmart3f
LABELS
  "The monthly cost of a home broadband subscription is too expensive",
  "The cost of a computer is too expensive",
  "Your smartphone lets you do everything online that you need to do",
  "You have other options for internet access outside of your home",
  "Broadband service is not available where you live, or is not available at an acceptable speed",
  "Some other reason I haven't already mentioned (SPECIFY)"
SELECTED "Yes"
HIDE INPUTS
AS bbamrt3_mr
TITLE "Reasons for not having broadband"
DESCRIPTION "Please tell me whether any of the following are reasons why you do not have high-speed internet at home";
In this example, the labels are explicitly declared. LABELS COPY(DESCRIPTION) is also an option, but, as is common in SPSS files, the descriptions of the source variables are too wordy and would require further clean-up. As is standard practice when creating a new multiple response variable from source variables, we hide the input variables with HIDE INPUTS.