Pairwise tests with weighted data

When displaying or exporting indications of statistical significance (so-called hypothesis tests) in the Crunch application or exported tables, results involving weighted data use the effective N to compute the sample variance. The effective N is never higher than the unweighted sample size, and it is lower whenever the weights vary. As a result, estimated standard errors are higher and tests are more conservative, reflecting the added uncertainty introduced by the weights.

Survey weighting is often used to ensure that a survey more accurately reflects the population it's meant to represent. This approach can sometimes give the illusion that there are more respondents in the sample than there really are. This is because certain responses are given more weight than others to compensate for underrepresentation or overrepresentation of specific groups. While this can be useful for getting a more accurate picture of the overall population, it can also mask the true level of uncertainty in the data. Normally, the larger the sample size, the smaller the uncertainty. But when weights are applied, the sample size appears larger, and the uncertainty seems smaller than it actually is.

The concept of effective n, or effective base size, counteracts this illusion by adjusting the sample size to better reflect the actual number of respondents, taking into account the weights assigned to them. The adjustment is based on what's known as weighting efficiency or the design effect, which is essentially a measure of how much the weighting process has inflated the apparent sample size. The effective n corrects for this by reducing the inflated sample size, which in turn increases the stated uncertainty so that it can be assessed more accurately. This adjustment is especially important when comparing differences between groups within the sample. The effective n is never larger than the true, unweighted number of respondents (n), and it is strictly smaller whenever the weights are not all equal, because it corrects for the artificial inflation caused by weighting.
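As a minimal illustration of the idea, the sketch below computes an effective n from a vector of respondent weights as the squared sum of the weights divided by the sum of the squared weights. The function name `effective_n` and both weight vectors are hypothetical and chosen only for illustration; they are not part of Crunch.

```r
# Effective sample size: (sum of weights)^2 / (sum of squared weights).
# `effective_n` is an illustrative helper, not a Crunch function.
effective_n <- function(w) sum(w)^2 / sum(w^2)

# Hypothetical weights for 8 respondents (both vectors sum to 8).
w_equal   <- rep(1, 8)
w_unequal <- c(0.4, 0.4, 0.6, 0.8, 1.0, 1.2, 1.6, 2.0)

effective_n(w_equal)    # equal weights: effective n equals n (8)
effective_n(w_unequal)  # unequal weights: about 6.2, below n

# Weighting efficiency as a percentage of the unweighted base.
100 * effective_n(w_unequal) / length(w_unequal)
```

With equal weights nothing is lost; the more the weights vary, the further the effective n falls below the number of respondents.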

The significance of the difference between pairs of point estimates (percentages or means) is displayed as cell-color shading in the application, or in exports as letters indicating ‘significant differences’, based on an assumed Normal distribution of errors.

The effective sample size is currently used only in calculating the standard error of differences in pairwise comparisons. The effective N cannot currently be included as a ‘summary row’ or ‘summary column’ of a weighted table, or requested as its own 'measure' in second-order analysis requests by the application.

Examples

We calculate the unweighted and weighted counts for each cell, the latter denoted $n_w$. We then calculate the margins over either the row or the column, giving the unweighted base $n$ and the weighted percentages $p_w = n_w / \sum n_w$ (where the sum is taken over the row or column, depending upon the direction of percentaging). Finally, we choose a column/row to compare to and compute the standard error of the difference (primes indicate the comparison column):

$$ s.e._{\text{diff}}=\sqrt{\frac{{p_w}(1-p_w)}{n} +\frac{{p'_w}(1-p'_w)}{n'}} $$

To get the “weighting efficiency”, we only need to add squared weights to the cube computation: instead of summing the weights in each cell, we sum the squared weights. Call these sums $n_{w^2}$.

We then compute the marginal of these. The “weighting efficiency” for the column/row is the ratio of the effective base to the actual unweighted base, expressed as a percentage:

$$ e=\frac{(\sum n_w)^2}{n \sum n_{w^2}} \times 100 $$

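The effective base for the column/row is then just the squared sum of the weights divided by the sum of the squared weights, which is the weighting efficiency applied to the unweighted base:

$$ n_{\text{eff}}=\frac{(\sum n_w)^2}{\sum n_{w^2}} = \frac{e}{100}\, n $$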

Instead of dividing by $n$ in the standard error equation above, we then divide by $n_{\text{eff}}$.

Bringing it all together, the test is a two-tailed $t$ test at a given confidence level (90%, 95%, etc.) with test statistic:

$$ t = \frac{p_w - p'_w}{\sqrt{\frac{{p_w}(1-p_w)}{n_{\text{eff}}} +\frac{{p'_w}(1-p'_w)}{n'_{\text{eff}}}}} $$

The degrees of freedom for the reference distribution also use the effective base:

$$ \text{d.f.} = n_\text{eff} +n'_\text{eff} -2 $$
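The formulas above can be collected into a short sketch. The function `pairwise_t` and its argument names are hypothetical, and the inputs are made-up numbers; this shows how the pieces fit together rather than how Crunch implements the test.

```r
# Sketch of the pairwise test for one cell, following the formulas above.
# `pairwise_t` and its argument names are illustrative, not Crunch API names.
pairwise_t <- function(p, p_prime, n_eff, n_eff_prime) {
  # Standard error of the difference, using effective bases
  se_diff <- sqrt(p * (1 - p) / n_eff + p_prime * (1 - p_prime) / n_eff_prime)
  t_stat  <- (p - p_prime) / se_diff
  # Degrees of freedom also use the effective bases (need not be integers)
  df      <- n_eff + n_eff_prime - 2
  # Two-tailed p-value from the t reference distribution
  p_value <- 2 * pt(-abs(t_stat), df)
  list(t = t_stat, df = df, p_value = p_value)
}

# Hypothetical weighted percentages and effective bases for two columns.
res <- pairwise_t(p = 0.30, p_prime = 0.22, n_eff = 350, n_eff_prime = 260)
res$p_value < 0.05  # TRUE means the difference is flagged at the 95% level
```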

When does Crunch use the effective N?

The squared weights are added to the cube query when all of the following conditions are met:

Pairwise column tests are requested (whether with a set comparison or all comparisons for export cell letter indications).
The values being compared are percentages or subtotals thereof. Numeric aggregations including means of categorical scales use the unweighted base.
The columns being compared are categorical. Multiple-response columns use a figure adjusted for overlap, but not effective overlapping N.
The cube request itself is weighted.

Example dataset with weights

The following file is an 11,908-row dataset simulated to reproduce the results in Hirotsu 1983:

Hirotsu dataset

The dataset includes a vector of weights simulated from an Exponential distribution with mean (rate) 1 and normalized ($\sum{w}=n=11908$). (The squared weight w_sq is also included for convenience.)

$$ n = 11908 $$ $$ n_{\text{eff}} = 6030.304 $$ $$ \frac{n_{\text{eff}}}{n}*100 = 50.64078\% $$
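As a rough sanity check on this figure (an aside based on standard properties of the Exponential distribution, not on the original simulation): for independent weights with mean 1 and variance 1, $E[w]=1$ and $E[w^2]=\operatorname{Var}(w)+E[w]^2=2$, so in a large sample the efficiency should be close to one half:

$$ \frac{n_{\text{eff}}}{n}=\frac{(\sum w)^2}{n\sum w^2}\approx\frac{(E[w])^2}{E[w^2]}=\frac{1}{2} $$

The simulated value of about 50.6% is in line with this approximation.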


> tt <- xtabs(w~illness+occupation,h)

> tt
         occupation
illness            1          2          3          4          5          6
  slight   152.26652  108.11936  649.46316  174.15498  382.00491   96.78265
  medium   448.68092  331.58111 1943.88193  781.81306 1811.24462  255.29591
  serious   89.85663   51.39372  354.07278  110.13130  287.51654   52.01445
         occupation
illness            7          8          9         10
  slight    92.45000  213.18413   51.00321  242.63994
  medium   316.32241  864.04056  215.53912 1350.40901
  serious   58.36697  163.09703   32.92213  227.75094

## Unweighted column marginal:
> margin.table(xtabs(~illness+occupation,h),2)
occupation
   1    2    3    4    5    6    7    8    9   10 
 678  512 2884 1055 2523  436  486 1228  288 1818

## Weighted column marginal: 
> margin.table(tt,2)
occupation
        1         2         3         4         5         6         7         8 
 690.8041  491.0942 2947.4179 1066.0993 2480.7661  404.0930  467.1394 1240.3217 
        9        10 
 299.4645 1820.7999

## Squared-weight table (sum of w_sq in each cell):
> tsq <- xtabs(w_sq~illness+occupation,h)

## Effective column marginal (notice how much smaller it is!)

> (margin.table(tt,2)/ (margin.table(tsq,2)) * margin.table(tt,2))
occupation
        1         2         3         4         5         6         7         8 
 355.3016  257.1205 1451.9430  531.2172 1296.8639  208.8716  230.0321  631.5161 
        9        10 
 132.9096  940.7413
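
To see how these bases feed into a test, the sketch below compares the share of "slight" illness between occupations 1 and 2, using values copied from the tables above. The variable names are illustrative, and the calculation follows the formulas in the Examples section rather than any Crunch code.

```r
# Values copied from the tables above: "slight" illness, occupations 1 and 2.
n_w    <- c(152.26652, 108.11936)  # weighted cell counts
base_w <- c(690.8041, 491.0942)    # weighted column bases
n_unw  <- c(678, 512)              # unweighted column bases
n_eff  <- c(355.3016, 257.1205)    # effective column bases

# Weighted column percentages (as proportions)
p_w <- n_w / base_w

# Standard error of the difference with effective vs. unweighted bases
se_eff <- sqrt(sum(p_w * (1 - p_w) / n_eff))
se_unw <- sqrt(sum(p_w * (1 - p_w) / n_unw))
se_eff > se_unw  # the effective bases give the larger standard error

# Two-tailed t test using the effective bases
t_stat  <- (p_w[1] - p_w[2]) / se_eff
df      <- sum(n_eff) - 2
p_value <- 2 * pt(-abs(t_stat), df)
```

The two weighted shares are nearly identical here, so this particular pair should come out far from significant.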
