Estimates of Sampling Error
The sample of respondents selected in the 2009 Kiribati Demographic Health Survey (KDHS) is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling errors are the errors that result from taking a sample of the covered population through a particular sample design. Non-sampling errors are systematic errors that would be present even if the entire population were covered (e.g. response errors, coding errors and data entry errors).
For the entire covered population and for large subgroups, the KDHS sample is generally sufficiently large to provide reliable estimates. For such populations the sampling error is small and less important than the non-sampling error. However, for small subgroups, sampling errors become very important in providing an objective measure of reliability of the data.
Sampling errors will be displayed only for the total, urban and rural populations and for each sample domain. No other panels should be included in the sampling error table. The choice of variables for which sampling errors are computed depends on the priority given to specific variables. However, it is recommended that sampling errors be calculated for at least the following variables, which was not the case for Kiribati given the small size of its sample compared with other countries in the Pacific.
Sampling errors are usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
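The "95% of all possible samples" statement can be checked empirically. The following is a minimal sketch, not part of the KDHS procedure: it assumes simple random sampling from a hypothetical population (the values, sample size and function name are all invented for illustration), draws many samples, and counts how often the estimate plus or minus two standard errors covers the population mean.

```python
import random
import statistics

def coverage_of_two_se_interval(population, sample_size, replications, seed=0):
    """Share of simple random samples whose mean +/- 2 SE covers the population mean."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    covered = 0
    for _ in range(replications):
        sample = rng.sample(population, sample_size)
        mean = statistics.fmean(sample)
        # Standard error of the mean: square root of the estimated variance of the mean.
        se = statistics.stdev(sample) / sample_size ** 0.5
        if mean - 2 * se <= true_mean <= mean + 2 * se:
            covered += 1
    return covered / replications

# Hypothetical population (assumption): children ever born for 20,000 women.
rng = random.Random(42)
population = [rng.choice([0, 1, 2, 3, 4, 5, 6, 7, 8]) for _ in range(20000)]

coverage = coverage_of_two_se_interval(population, sample_size=400, replications=2000)
print(f"share of samples whose interval covers the true mean: {coverage:.3f}")
```

Under these assumptions the printed share should land close to 0.95. A clustered, stratified design such as the KDHS needs a design-based standard error in place of the simple formula used here.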
If the sample of respondents had been selected by simple random sampling, it would have been possible to use straightforward formulae for calculating sampling errors. However, the 2009 KDHS sample was the result of a multistage stratified design and, consequently, it is necessary to use more complex formulae. The computer software used to calculate sampling errors for the 2009 KDHS is the Integrated Sample Survey Analysis (ISSA) Sampling Error Module. This module uses the Taylor linearisation method of variance estimation for survey estimates that are means or proportions. The jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
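To make the two variance estimation methods concrete, the sketch below applies the textbook forms of each to a ratio estimate r = y/x built from cluster totals. It is an illustration under stated assumptions, not the ISSA implementation: the function names and the cluster data are hypothetical, the finite population correction is ignored, and the jackknife is applied to a simple ratio rather than to a fertility or mortality rate.

```python
from math import sqrt

def taylor_se_ratio(strata):
    """Taylor linearisation SE of r = y/x.

    `strata` is a list of strata; each stratum is a list of (y_hi, x_hi)
    weighted cluster totals. Each stratum needs at least two clusters.
    """
    y = sum(y_hi for clusters in strata for y_hi, _ in clusters)
    x = sum(x_hi for clusters in strata for _, x_hi in clusters)
    r = y / x
    total = 0.0
    for clusters in strata:
        m_h = len(clusters)
        # Linearised cluster values z_hi = y_hi - r * x_hi.
        z = [y_hi - r * x_hi for y_hi, x_hi in clusters]
        total += m_h / (m_h - 1) * (sum(v * v for v in z) - sum(z) ** 2 / m_h)
    return r, sqrt(total) / x

def jackknife_se_ratio(clusters):
    """Delete-one-cluster jackknife SE of r = y/x over a flat list of (y_i, x_i)."""
    k = len(clusters)
    y = sum(y_i for y_i, _ in clusters)
    x = sum(x_i for _, x_i in clusters)
    r = y / x
    pseudo_values = []
    for y_i, x_i in clusters:
        r_without_i = (y - y_i) / (x - x_i)  # estimate with cluster i removed
        pseudo_values.append(k * r - (k - 1) * r_without_i)
    variance = sum((p - r) ** 2 for p in pseudo_values) / (k * (k - 1))
    return r, sqrt(variance)

# Hypothetical cluster totals (assumption): two strata with three clusters each.
strata = [
    [(12, 20), (15, 22), (9, 18)],
    [(30, 25), (28, 24), (35, 27)],
]
all_clusters = [cluster for stratum in strata for cluster in stratum]

r_taylor, se_taylor = taylor_se_ratio(strata)
r_jack, se_jack = jackknife_se_ratio(all_clusters)
print(f"ratio: {r_taylor:.4f}")
print(f"Taylor SE: {se_taylor:.4f}, jackknife SE: {se_jack:.4f}")
```

Both functions return the same point estimate. The standard errors differ here partly because the jackknife sketch ignores stratification, while the linearisation sums the variance contributions stratum by stratum.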
In addition to the standard error, the ISSA software computes the design effect (DEFT) for each estimate, which is defined as the ratio between the standard error using the given sample design and the standard error that would result if a simple random sample had been used. A DEFT value of 1.0 indicates that the sample design is as efficient as a simple random sample, while a value greater than 1.0 indicates the increase in the sampling error due to the use of a more complex and less statistically efficient design. ISSA also computes the relative error and confidence limits for the estimates.
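The DEFT definition above reduces to a single division. The sketch below is illustrative only: the function names and input values are hypothetical, and the simple random sampling standard error of a proportion uses the textbook approximation sqrt(p(1 - p)/n), which may differ in detail from what ISSA computes.

```python
from math import sqrt

def srs_se_proportion(p, n):
    """SE of a proportion p under simple random sampling with n cases (textbook form)."""
    return sqrt(p * (1 - p) / n)

def deft(se_design, se_srs):
    """Ratio of the design-based SE to the simple random sampling SE.

    Returns None when the simple random sampling SE is zero, since the
    ratio is then undefined.
    """
    if se_srs == 0:
        return None
    return se_design / se_srs

# Hypothetical values (assumption): a proportion of 0.5 from 400 cases,
# with a design-based standard error of 0.03.
se_srs = srs_se_proportion(0.5, 400)
value = deft(0.03, se_srs)
print(f"SRS SE: {se_srs:.4f}, DEFT: {value:.2f}")
```

A result above 1.0, as in this made-up case, means the complex design yields a larger standard error than a simple random sample of the same size would.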
Sampling errors for the 2009 KDHS are calculated for selected variables considered to be of primary interest for the women’s and men’s surveys, respectively. The results are presented in this appendix for the country as a whole, and for urban and rural areas. The DEFT is considered undefined when the standard error under simple random sampling is zero (which happens when the estimate is close to 0 or 1). In the case of the total fertility rate, the number of unweighted cases is not relevant, as there is no known unweighted value for woman-years of exposure to childbearing.
The confidence interval can be interpreted as follows, using children ever born to women aged 40–49 as an example: the overall average from the national sample is 4.993 and its SE is 0.145. To obtain the 95% confidence limits, one adds twice the standard error to the sample estimate and subtracts twice the standard error from it (i.e. 4.993 ± 2×0.145). There is therefore a high probability (95%) that the true average number of children ever born to all women aged 40–49 is between 4.703 and 5.283.
Sampling errors are analysed for the national woman sample and for two separate groups of estimates: 1) means and proportions, and 2) complex demographic rates. The relative errors (SE/R) for the means and proportions range between 0.9% and 27.5%; the highest SE/Rs are for estimates of very low values (e.g. currently using IUD). In general, the SE/R for most estimates for the country as a whole is small, except for estimates of very small proportions. For mortality rates, however, the average SE/R for the five-year period rates is generally higher than that for the 10-year estimates.
The SE/R also differs between sub-populations. For example, for the variable want no more children, the SE/Rs as a percentage of the estimated mean for the whole country and for the urban areas are 3.9% and 6.2%, respectively.
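The arithmetic behind the confidence limits and the relative error (SE/R) can be reproduced in a few lines. This is a minimal sketch using the estimate and standard error quoted above for children ever born to women aged 40–49; the function names are illustrative and not taken from ISSA.

```python
def confidence_limits(estimate, se, multiplier=2):
    """Lower and upper limits: estimate minus and plus `multiplier` standard errors."""
    return estimate - multiplier * se, estimate + multiplier * se

def relative_error_percent(estimate, se):
    """Standard error expressed as a percentage of the estimate (SE/R)."""
    return 100 * se / estimate

# Figures quoted in the text for children ever born to women aged 40-49.
estimate, se = 4.993, 0.145

lower, upper = confidence_limits(estimate, se)
relative = relative_error_percent(estimate, se)
print(f"95% confidence limits: {lower:.3f} to {upper:.3f}")
print(f"relative error: {relative:.1f}%")
```

The limits match the 4.703 and 5.283 given in the text, and the relative error for this variable works out to about 2.9%.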
The sampling errors are fully described in Appendix B of "Kiribati 2009 DHS Final Report" (pp. 268–276), provided in the External Resources section.