Averaging and summing several variables using R

If you have a set of numeric variables, you may want to calculate their average as a new variable, so that each respondent gets the mean of the contributing variables. Or perhaps you'd like to sum them. Both are easily done with simple calculations in R.

This article explores three scenarios:

Creating an average of two or more variables
Summing two or more variables
Tallying the number of responses in a multiple response variable

The worked examples below assume you know how to log in and load your dataset.

Creating an average of two or more variables

Consider the three numeric variables below:

If you wanted to create a new average variable, the code is as follows:

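rowwise_mean <- function(ds_subset) {
  # Add the selected variables together, then divide by how many there are
  Reduce(`+`, ds_subset) / ncol(ds_subset)
}
# Save the result to the dataset as a new variable with the alias "average"
ds$average <- rowwise_mean(ds[, c("q2a_1", "q2a_2", "q2a_3")])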

You simply need to:

change the alias from average to whatever you choose
replace q2a_1, q2a_2 and q2a_3 with the aliases of your own contributing variables, listing as many as you need

The result from the example is as follows:
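
The helper's arithmetic can also be checked away from Crunch. The sketch below is an illustration only: it assumes an ordinary local data frame with made-up values (df and its contents are hypothetical, not the dataset from the example). It suggests that a missing value in any contributing variable gives a missing average, whereas base R's rowMeans() with na.rm = TRUE averages whatever values are present.

```r
# Hypothetical local data; the column names mirror the aliases used above.
df <- data.frame(
  q2a_1 = c(1, 4, NA),
  q2a_2 = c(2, 5, 6),
  q2a_3 = c(3, 6, 9)
)

rowwise_mean <- function(ds_subset) {
  # Add the columns element-wise, then divide by the number of columns.
  Reduce(`+`, ds_subset) / ncol(ds_subset)
}

cols <- c("q2a_1", "q2a_2", "q2a_3")

# Same helper as in the article, applied to a plain data frame.
df$average <- rowwise_mean(df[, cols])

# Base R alternative that skips missing values instead of propagating them.
df$average_na_rm <- rowMeans(df[, cols], na.rm = TRUE)

print(df)
```

Which behaviour you want depends on the analysis: propagating missing values keeps every average on the same base, while skipping them keeps more respondents.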

Summing two or more variables

Creating a sum variable is just as simple. It only requires adding the variables together:

ds$sum <- ds$q2a_1 + ds$q2a_2 + ds$q2a_3
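
When many variables contribute, typing out every term becomes tedious. As a minimal sketch, the same addition can be driven by a vector of aliases using base R's Reduce(). This example assumes an ordinary local data frame with made-up values (df and aliases are hypothetical names); whether the identical pattern suits your Crunch dataset is something to verify before relying on it.

```r
# Hypothetical local data; the column names mirror the aliases used above.
df <- data.frame(
  q2a_1 = c(1, 4),
  q2a_2 = c(2, 5),
  q2a_3 = c(3, 6)
)

# Aliases of the variables to add together.
aliases <- c("q2a_1", "q2a_2", "q2a_3")

# Reduce() applies `+` across the selected columns, one after another,
# which is equivalent to df$q2a_1 + df$q2a_2 + df$q2a_3.
df$sum <- Reduce(`+`, df[aliases])

print(df$sum)
```

Adding or removing a contributing variable then only means editing the aliases vector.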

Tallying the number of responses in a multiple response variable

Consider the following example, which is a multiple response variable that indicates which brands respondents love.

We want to know how many brands (subvariables) each respondent selects.

ds$sum <- selectedDepth(ds$mr_var)

If you need to count only a subset of the subvariables, for example to exclude a "None of these" subvariable from the count, you can use the following code:

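# Alias of the multiple response variable
mr_var_alias <- "brands_liked"
# Names of all of its subvariables, minus the one to exclude
subvars <- subvariables(ds[[mr_var_alias]])
subvars <- setdiff(names(subvars), "None of these")
# Count the selections among the remaining subvariables
ds$sum <- selectedDepth(arraySubsetExpr(ds[[mr_var_alias]], subvars, subvar_id = "name"))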
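
To make the idea of the tally concrete, here is a minimal sketch of the same counting logic in base R. It is not how Crunch computes the result; it assumes a hypothetical local data frame (selected) with one TRUE/FALSE column per subvariable and made-up brand names, and it only mirrors the two steps above: count every selection, then count again after dropping "None of these" by name.

```r
# Hypothetical selections: TRUE means the respondent selected that item.
selected <- data.frame(
  "Brand A" = c(TRUE, FALSE, TRUE),
  "Brand B" = c(TRUE, FALSE, FALSE),
  "None of these" = c(FALSE, TRUE, FALSE),
  check.names = FALSE
)

# Number of selected subvariables per respondent, all columns included.
count_all <- rowSums(selected)

# Drop "None of these" by name, then count only the remaining columns.
keep <- setdiff(names(selected), "None of these")
count_brands <- rowSums(selected[keep])

print(data.frame(count_all, count_brands))
```

The second respondent shows why the exclusion matters: they selected only "None of these", so they count as one selection in the first tally and zero brands in the second.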
