When conducting a survey, having a representative sample of the population is of paramount importance. But in practice, you are prone to oversample some kinds of people and undersample others. Weighting is a statistical technique to compensate for this type of 'sampling bias'. A weight is assigned to:
Reflect the data item's relative importance based on the objective of the data collection;
Take into account the characteristics of sampling design;
Reduce bias arising from nonresponse when the characteristics of the respondents differ from those not responding;
Correct identifiable deviations from population characteristics.
Each individual case in the file is assigned a certain coefficient – individual weight – which is used to multiply the case in order to attain the desired characteristics of the sample.
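As a minimal sketch of how an individual weight enters an estimate, the snippet below compares an unweighted and a weighted proportion. The data and the function name `weighted_mean` are hypothetical and chosen only for illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: each case counts in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical data: 1 = voted, 0 = did not vote
voted = [1, 1, 0, 1, 0]
# Hypothetical individual weights (oversampled cases get weights below 1)
weights = [0.5, 0.5, 2.0, 1.0, 1.0]

unweighted = sum(voted) / len(voted)
weighted = weighted_mean(voted, weights)

print(f"unweighted share: {unweighted:.2f}")
print(f"weighted share:   {weighted:.2f}")
```

Cases with a weight below 1 count for less than one respondent and cases with a weight above 1 count for more, which is why the two shares differ.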
Different types of weights and their different purposes
Several types of weights have different purposes and a different impact on data analysis.
Whether or not to use weights is not a straightforward question. For particular methods of analysis (e.g., estimating associations, regressions, etc.), using weights may be counterproductive. There are also general theoretical and methodological issues which discourage some researchers from using weights. However, different types of weights are useful for different purposes. In some situations it is necessary to take an appropriate weight into account in your analysis (see several types of weighting below).
In all cases, if there are any weights in your data file, the rationale and calculation of the weights must be detailed in the data documentation.
Design weights are constructed to correct for individual units’ unequal probabilities of being sampled. These probabilities are normally not equal when complex sampling procedures combining multiple methods (stratification, cluster sampling) in several stages are implemented. For example, we may want to adjust the probabilities of being sampled for respondents in households. While individuals are the sampling units, households are sampled in the first stage. Therefore, respondents’ probabilities of being selected depend on the number of household members.
To correct for these differences in sampling probabilities, we have to compute design weights. Each design weight is equal to the inverse of the unit's probability of inclusion in the sample. The sum of all design weights should be equal to the total number of units in our population.
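The household example can be sketched as follows. This is an illustration only: it assumes that every household has the same selection probability and that exactly one respondent is then drawn at random within each sampled household. The probability value and the function name are hypothetical.

```python
def design_weight(household_prob, household_size):
    """Inverse of the inclusion probability under the stated assumptions.

    Assumption: one respondent is drawn at random per sampled household,
    so a person's inclusion probability is household_prob / household_size.
    """
    inclusion_prob = household_prob / household_size
    return 1.0 / inclusion_prob


# Hypothetical design: each household is sampled with probability 0.01
household_prob = 0.01
household_sizes = [1, 2, 4]

design_weights = [design_weight(household_prob, size) for size in household_sizes]

for size, weight in zip(household_sizes, design_weights):
    print(f"household of {size}: design weight {weight:.0f}")
```

A respondent from a four-person household had a quarter of the chance of being selected compared with someone living alone, so that respondent receives four times the weight.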
The way certain characteristics such as sex, age and education are distributed in your sample may differ from the way they are distributed in the actual population. For example, your sample may consist of 66 percent men when they make up only 48 percent of the population. Post-stratification weighting is done in order to achieve a distribution equal to that of such known characteristics of the population. It is called a post-stratification weight because it can only be computed after you have collected all of your data. The term stratification refers to the various known strata (such as age groups or sexes) of the population.
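For a single characteristic, a simple form of post-stratification weight is the population share of a group divided by its sample share. The sketch below applies this to the example above; the shares for women (34 and 52 percent) are an assumption derived by treating sex as a two-category variable, and the function name is hypothetical.

```python
def post_stratification_weights(sample_shares, population_shares):
    """Weight per group = population share / sample share."""
    return {
        group: population_shares[group] / sample_shares[group]
        for group in sample_shares
    }


sample_shares = {"men": 0.66, "women": 0.34}      # women: assumed complement
population_shares = {"men": 0.48, "women": 0.52}  # women: assumed complement

ps_weights = post_stratification_weights(sample_shares, population_shares)

for group, weight in ps_weights.items():
    # After weighting, each group's share matches the population share
    weighted_share = sample_shares[group] * weight
    print(f"{group}: weight {weight:.3f}, weighted share {weighted_share:.2f}")
```

Men are overrepresented, so their weight is below 1; women are underrepresented, so their weight is above 1.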
Different groups may be represented in the database in different proportions than they are in reality. Such discrepancies are normally compensated for through weighting. For example, international data files combine data from various countries. Surveys of similar size are usually conducted in each of these countries, even though their total populations differ radically in size. If we want to analyse data about large populations, such as that of Europe, then we have to adjust the proportions in which individual European countries are represented.
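One way to sketch such a population size weight is to make it proportional to a country's population divided by its sample size. The countries, figures, and function name below are hypothetical, and a particular survey may define and scale its own weight differently.

```python
def population_size_weights(population, sample_size):
    """Weight per country = population / sample size (people per respondent)."""
    return {
        country: population[country] / sample_size[country]
        for country in population
    }


# Hypothetical countries with equal samples but very different populations
population = {"A": 80_000_000, "B": 5_000_000}
sample_size = {"A": 2_000, "B": 2_000}

pop_weights = population_size_weights(population, sample_size)

# Share of country A in the pooled data, before and after weighting
unweighted_share_a = sample_size["A"] / sum(sample_size.values())
weighted_total = sum(sample_size[c] * pop_weights[c] for c in population)
weighted_share_a = sample_size["A"] * pop_weights["A"] / weighted_total

print(f"share of A, unweighted: {unweighted_share_a:.2f}")
print(f"share of A, weighted:   {weighted_share_a:.2f}")
```

Without the weight both countries contribute half of the pooled cases; with it, each country contributes in proportion to its population.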
Variable name: netusoft Question: How often a respondent uses internet
In the first column, no weight was applied. In the second column, the Design Weights (DWEIGHT) adjust for different selection probabilities.
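|  | No weight: Frequency | No weight: Valid Percent | Design weight: Frequency | Design weight: Valid Percent |
| --- | --- | --- | --- | --- |
| 1 Never | 244 | 10.8 | 187 | 8.2 |
| 2 Only occasionally | 162 | 7.1 | 155 | 6.8 |
| 3 A few times a week | 302 | 13.3 | 284 | 12.5 |
| 4 Most days | 384 | 16.9 | 379 | 16.6 |
| 5 Every day | 1177 | 51.9 | 1271 | 55.8 |
| Total | 2269 | 100 | 2277 | 100 |
| System missing | 31 |  | 23 |  |
| Total | 2300 |  | 2300 |  |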
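A frequency table of this kind can be sketched in a few lines: the weighted frequency of an answer is the sum of the weights of the cases giving that answer, and the valid percent excludes missing answers. The data below are hypothetical and are not taken from the ESS file; the function name is illustrative.

```python
from collections import defaultdict


def weighted_frequencies(answers, weights=None):
    """Return {answer: (frequency, valid_percent)}, skipping missing answers."""
    if weights is None:
        weights = [1.0] * len(answers)  # unweighted: every case counts once
    freq = defaultdict(float)
    for answer, weight in zip(answers, weights):
        if answer is None:  # missing answers are excluded from valid percent
            continue
        freq[answer] += weight
    valid_total = sum(freq.values())
    return {a: (f, 100.0 * f / valid_total) for a, f in sorted(freq.items())}


# Hypothetical answers (None = missing) and hypothetical design weights
answers = [1, 5, 5, None, 3]
dweights = [1.5, 0.5, 1.0, 1.0, 1.0]

table_unweighted = weighted_frequencies(answers)
table_weighted = weighted_frequencies(answers, dweights)

for answer in table_unweighted:
    print(answer, table_unweighted[answer], table_weighted[answer])
```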
An example: Using weights in European Social Survey data
The following table provides an illustration of using weights in the data from the European Social Survey (n.d.) (ESS). There are three different weights available in the ESS Source Main Questionnaire data file (see European Social Survey, 2014):
The design weight takes into consideration the different probabilities of being sampled given the sampling methods implemented in individual countries;
The post-stratification weight corrects for the differences of the sample from selected population characteristics caused by other sampling and non-sampling errors;
The population size weight corrects the fact that the individual countries’ sample sizes are very similar while there are large variations in the size of their actual populations.
Different types of data analysis then require the use of different weights or their combinations. When analysing data from one country alone or comparing data of two or more countries, only the design weight or the post-stratification weight needs to be applied. When combining different countries, design or post-stratification weights in combination with population size weights should be applied.
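The rule can be sketched as a small helper that returns the weight to apply to a case. It assumes that "in combination with" means multiplying the two weights; the function and parameter names are hypothetical, and the first argument may be either the design weight or the post-stratification weight.

```python
def analysis_weight(base_weight, population_weight, combining_countries):
    """Pick the weight for one case.

    base_weight: design weight or post-stratification weight of the case.
    population_weight: population size weight of the case's country.
    combining_countries: True when countries are pooled into one estimate.
    Assumption: the combined weight is the product of the two weights.
    """
    if combining_countries:
        return base_weight * population_weight
    return base_weight


# Hypothetical case: base weight 1.2, country population size weight 0.5
single_country = analysis_weight(1.2, 0.5, combining_countries=False)
pooled = analysis_weight(1.2, 0.5, combining_countries=True)

print(single_country, pooled)
```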
| Example – voter turnout (% of respondents voting in the last election) | Weights to be used: Design weight / Post-stratification weight | Weights to be used: Population weight |
| --- | --- | --- |
| To examine data from a single country – whether a single variable or a cross-tabulation |  |  |
| Voter turnout in Germany | X |  |
| Voter turnout in Germany by age and gender | X |  |
| To compare results for two or more countries separately – without using totals or averages |  |  |
| Compare voter turnout in France, Germany and the UK | X |  |
| To combine countries – whether on a single variable or via a cross-tabulation |  |  |
| Voter turnout in Scandinavia | X | X |
| Voter turnout in the EU | X | X |
| Voter turnout across all countries participating in the ESS | X | X |
| Compare voter turnout between EU member states and accession countries | X | X |
Voter turnout by age group across all ESS participating countries