Dutch Census 2011: Analysis and Methodology
Introduction to the Dutch census 2011
Statistics Netherlands produced the tables for the Dutch Census 2011 by combining existing register and sample survey data. Since the last census based on a complete enumeration was held, in 1971, the willingness of the population to participate has fallen sharply. Statistics Netherlands no longer uses census questionnaires and has found an alternative in the register-based census, using only existing data. The register-based census is cheaper and more socially acceptable. The table results of the Netherlands are not only comparable with earlier Dutch censuses, but also with those of the other countries in the 2011 European census round.
The first part of this book consists of analyses of the results, focusing on the following topics: key figures and a global historical comparison with earlier Dutch censuses [chapter 2], a global comparison of the results of the 2011 Census in the Netherlands with results in other European countries [chapter 3], foreigners in the Netherlands and Dutch people in Europe [chapter 4], and the Caribbean Netherlands compared with the Frisian Islands [chapter 5]. The second part addresses the methodology; it examines the new weighting approach in which microdata of the labour force survey are reused in the 2011 Census [chapter 6] and an additional estimation technique for detailed cells of the census hypercubes that could not be estimated with repeated weighting [chapter 7].
The Dutch population and housing census 2011
The European census round
The 2011 Census Round was coordinated by Eurostat for all European Union [EU] and European Free Trade Association [EFTA] member states. The EU population and housing censuses have a broad basis: they are covered by four regulations [European Commission, 2008, 2009, 2010a and 2010b], which have served to harmonize population definitions, census variables and categories, census hypercubes and metadata within the EU. Moreover, they specify the technical format [SDMX] for data delivery. All EU member states were required to conduct a census for 2011. For most national statistical institutes this was a major operation involving a lot of work and high costs. Each country had to collect census data and validate and protect its census output in the hypercubes. All data had to be transformed to SDMX format and put in the so-called census hub. Lastly, in addition to sixty mandatory hypercubes, all countries had to produce a number of quality hypercubes and a metadata file describing the methodology used.
Census experts at Statistics Netherlands started preparations for the 2011 population and housing census in 2008. In 2009 they started work on the data collection procedures required to collect the census information about the 16,655,799 people living in the Netherlands on 1 January 2011.
The 2011 Census in the Netherlands resulted in sixty high-dimensional tables, so-called hypercubes [European Commission, 2010a]. Five relate to the Netherlands as a whole, forty contain data at provincial level [NUTS 2], ten at COROP level [NUTS 3] and five at municipal level [LAU 2]. The sixty hypercubes fall into three different groups: five are about housing, four relate to commuting and the remainder are demographic tables, concerning economic activity, occupation and level of education, for example.
A register-based census
Data from different sources were combined to produce the 2011 census tables. These data were not obtained by interviewing inhabitants in a complete enumeration, as in traditional censuses in most other countries, but by using data from registers and sample surveys that are already available at Statistics Netherlands. This approach has a number of advantages and disadvantages.
One of the advantages of this innovative approach is a much lower census bill for Dutch taxpayers. A traditional census in the Netherlands would cost a few hundred million euros, while the register-based method costs "only" around 1.4 million euros. This bill includes the costs for all preparatory work, such as extending the methodology and updating and developing accompanying software, as well as the analyses of the results. It does not include the costs of the registers, as these are not kept for censuses but primarily for other purposes. Also, under Dutch law, Statistics Netherlands may access government registers free of charge. This low-cost census approach is only possible for countries with sufficient register information. By way of example, let us compare the costs of the Dutch register-based census with those of the traditional census held in the United Kingdom in 2011. In the United Kingdom the census cost approximately 565 million euros. In terms of PPP per capita [in 2011 US dollars], the census cost 11.82 in the UK, compared with 0.10 in the Netherlands [United Nations, 2014]. A register-based census costing less than 1 percent of a traditional census is not exceptional. Today, the huge costs of traditional censuses are often justified by pointing out the enormous implications of the census results for regional funding distribution. But a register-based census would be impossible in the UK anyway, because of the lack of sufficient register data and access restrictions.
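The cost figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in this chapter [the 1.4 million euro bill, the population of 16,655,799 on 1 January 2011 and the two PPP per capita figures]; the variable names are illustrative. The euro and PPP figures are kept apart because they are expressed in different units.

```python
# Figures taken from the text; variable names are illustrative only.
nl_cost_eur = 1_400_000        # approximate cost of the Dutch register-based census
nl_population = 16_655_799     # population of the Netherlands on 1 January 2011
ppp_per_capita_uk = 11.82      # PPP per capita, 2011 US dollars (United Nations, 2014)
ppp_per_capita_nl = 0.10       # PPP per capita, 2011 US dollars (United Nations, 2014)

# Cost per inhabitant in euros (not comparable with the PPP dollar figures).
nl_cost_per_capita_eur = nl_cost_eur / nl_population

# Dutch cost as a share of the UK cost, both in PPP dollars per capita.
nl_share_of_uk = ppp_per_capita_nl / ppp_per_capita_uk

print(f"Cost per inhabitant: {nl_cost_per_capita_eur:.3f} euro")
print(f"Dutch cost as share of UK cost: {nl_share_of_uk:.2%}")
```

The share comes out below 1 percent, in line with the statement in the text.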
Apart from the financial aspect, there are also other important differences between a traditional census and the register-based census conducted in the Netherlands. A well-known problem with traditional censuses is that participation is limited and selective. In spite of the mandatory character of a traditional census, part of the population will not participate at all [unit non-response] and those who do will not answer all questions [item non-response]. Although correcting for non-response by weighting and imputation techniques is worth trying, traditional correction methods are inadequate to obtain reliable results. The last traditional census in the Netherlands, in 1971, met with many privacy objections against the collection of integral information about the population living in the Netherlands. This increased the non-response problem, and non-response was expected to be even higher if another traditional census were to be held in the Netherlands [Corbey, 1994]. There are almost no objections to a register-based census in the Netherlands, and the non-response problem only plays a role when survey microdata are reused.
Another advantage of the register-based census is the short production time. The register-based census in the Netherlands got off to a later start than traditional censuses in other countries. It would have been pointless to start the production phase of the 2011 census project before all sources were available, and some registers became available relatively late. In spite of this delay, Statistics Netherlands compiled its census tables faster than most other countries in the 2011 European census round. In fact, the Netherlands had one of the shortest production times for the complete set of tables required by Eurostat. Statistics Netherlands had the advantage that no incoming census forms had to be checked and corrected.
A disadvantage of the Dutch census is that for some variables only sample information is available, which meant it was impossible to meet the level of detail required in some census hypercubes. At the moment, however, the Netherlands considers that the advantages of the register-based census in terms of cost and non-response amply outweigh the loss of some detail compared with a traditional census.
The Netherlands is not the only country that uses registers to produce census information. Four Nordic countries [Denmark, Finland, Norway and Sweden], Austria and Slovenia have more variables available in registers than the Netherlands, and the problem of insufficient detail in the outcome does not play a major role there. Most of the other register-based countries are in a similar position to the Netherlands: not all variables relevant for the census can be found in registers. They are therefore very interested in the Dutch approach of combining registers and existing sample surveys and using modern statistical techniques and accompanying software to compile the hypercubes. Obviously, it is essential that statistical bureaus are permitted to make use of registers that are relevant for the census. For Statistics Netherlands this is laid down in the statistical law that came into force in 2004. Nevertheless, Statistics Netherlands will have to maintain the good contact it has established with register holders over the last 25 years. Timely deliveries with relevant variables for Statistics Netherlands are crucial for official statistics production.
The Kingdom of the Netherlands includes the Netherlands in Europe and six islands in the Caribbean. The Kingdom consists of four constituent countries: the Netherlands [consisting of twelve provinces], Aruba, Curacao and St Maarten. The latter three islands have an independent status as a country within the Kingdom of the Netherlands. The other three Caribbean islands [Bonaire, Saba and St Eustatius] are part of the Netherlands and have had the status of "special municipality" since 10 October 2010. All four countries produce their own official statistics. Statistics Netherlands has a regional office in the Caribbean responsible for statistics on Bonaire, Saba and St Eustatius [the Caribbean Netherlands]. Although the Caribbean Netherlands is part of the Netherlands, statistics on the Netherlands do not include the Caribbean Netherlands. All statistics concerning the Caribbean Netherlands are published separately. The results of the Dutch 2011 Census therefore relate only to the European part of the Netherlands. However, chapter 5 of this book compares some figures for the Caribbean Netherlands with those for the European part of the Netherlands. As no census was held in the Caribbean Netherlands in the 2011 Census Round, other sources were used for statistics on Bonaire, Saba and St Eustatius.
Some key results of the 2011 Census in the European part of the Netherlands as well as a brief historical comparison can be found in chapter 2, while chapter 3 compares key results of the Netherlands with those of other European countries. Chapter 4 presents more information about people living in the Netherlands but born in other EU and EFTA countries, and people born in the Netherlands but living in other EU and EFTA countries.
Compilation methods in the Netherlands
The current census results in the Netherlands refer to 2011. The backbone of the Dutch census is the central population register [PR], which combines all the municipal population registers. PR data for 1 January 2011 were used as the basis for the set of hypercubes. The hypercubes focus on frequency counts, not on quantitative information. Data not available or derivable from the PR were taken from other registers. All register variables are now available from Statistics Netherlands' system of social statistical datasets [SSD], and their quality has been improved by applying micro-integration techniques. Micro-integration entails checking the data and adjusting those that are incorrect. It is widely assumed that micro-integrated data provide more reliable results, as they are based on a maximum amount of information. They also provide better coverage of subpopulations: if data are missing in one source, another source can be used.
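The coverage argument for micro-integration can be illustrated with a minimal sketch. It assumes two hypothetical registers keyed by a person identifier and a simple priority order between them; the function name, the register contents and the priority rule are illustrative assumptions, not the procedure used in the SSD. Real micro-integration also checks and adjusts conflicting values, which this sketch does not attempt.

```python
def fill_from_sources(person_ids, sources):
    """For each person, take the first non-missing value in source priority order."""
    combined = {}
    for pid in person_ids:
        combined[pid] = next(
            (src[pid] for src in sources if src.get(pid) is not None),
            None,  # no source has a value for this person
        )
    return combined


# Hypothetical registers: person id -> labour market status (None = missing).
register_a = {1: "employee", 2: None, 3: "self-employed"}
register_b = {2: "employee", 3: "employee", 4: "pensioner"}

status = fill_from_sources([1, 2, 3, 4, 5], [register_a, register_b])
print(status)
```

Person 2 is missing in the first register and is covered by the second; person 3 appears in both, so the higher-priority source wins; person 5 remains uncovered.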
In the 2011 Census, only two variables were not taken from a register: "occupation" and "educational attainment". Records from the labour force survey [LFS] in a three-year period around the enumeration date [1 January 2011] were used to estimate values for these two variables, which are included in 23 of the 60 hypercubes. Table consistency was guaranteed by using repeated weighting for these 23 hypercubes. The method of repeated weighting, described extensively in Houbiers et al., is based on the repeated application of the regression estimator, generating a new set of weights for each table estimated. The weights of the records in the microdata are adjusted in such a way that a new table estimate is consistent with all earlier table estimates.
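The idea behind repeated weighting can be sketched on a toy example. The code below is an illustration under stated assumptions, not the VRD implementation: it uses the linear form of the regression estimator, w_i = d_i (1 + x_i'lambda), assumes every table is calibrated from the same starting weights d_i, and works on a hypothetical sample of six persons with hypothetical register counts. All function and variable names are invented for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def calibrate(start_weights, x, totals):
    """Regression-estimator weights w_i = d_i * (1 + x_i . lam) reproducing `totals`."""
    k = len(totals)
    cross = [[sum(d * xi[p] * xi[q] for d, xi in zip(start_weights, x))
              for q in range(k)] for p in range(k)]
    gap = [totals[p] - sum(d * xi[p] for d, xi in zip(start_weights, x))
           for p in range(k)]
    lam = solve(cross, gap)
    return [d * (1 + sum(l * v for l, v in zip(lam, xi)))
            for d, xi in zip(start_weights, x)]


def total(weights, flags):
    """Weighted count of the records for which the flag is true."""
    return sum(w for w, f in zip(weights, flags) if f)


# Hypothetical sample: (sex, age group, education level) per person.
sample = [("M", "young", "high"), ("M", "old", "low"), ("F", "young", "low"),
          ("F", "old", "high"), ("M", "young", "low"), ("F", "old", "low")]
start = [10.0] * len(sample)            # hypothetical starting weights
sex_counts = {"M": 32, "F": 28}         # hypothetical register counts
age_counts = {"young": 25, "old": 35}   # hypothetical register counts
high = [edu == "high" for _, _, edu in sample]
young = [age == "young" for _, age, _ in sample]

# Table 1 (education by sex): calibrate on the register margin for sex.
x1 = [[sex == "M", sex == "F"] for sex, _, _ in sample]
w1 = calibrate(start, x1, [sex_counts["M"], sex_counts["F"]])
high_table1 = total(w1, high)

# Table 2 (education by age): calibrate on the register margin for age AND
# on the education margin already estimated in table 1.
x2 = [[age == "young", age == "old", edu == "high"] for _, age, edu in sample]
w2 = calibrate(start, x2, [age_counts["young"], age_counts["old"], high_table1])
high_table2 = total(w2, high)

print(round(high_table1, 6), round(high_table2, 6))
```

The second table gets its own set of weights, yet its education margin matches the one published in the first table, and its age margin matches the register. Only one education dummy enters the second calibration, because the full set of education dummies would be collinear with the age dummies.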
We used the latest version of VRD software developed by Statistics Netherlands for this repeated weighting. VRD stands for Vullen [Filling] reference database, and the aim of the application is to fill and manage the reference database. The main functions of VRD are estimating tables via repeated weighting, adding these to the reference database, and withdrawing aggregates from the database. Under the condition of small, independent samples, variances of table values can also be estimated. Such estimated variances were used to set publication rules for cells and to calculate variation coefficients for the quality hypercubes, which serve as a quality assessment of the census hypercubes.
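A variation coefficient is commonly defined as the standard error of an estimate divided by the estimate itself. Assuming that this is the definition used for the quality hypercubes, it can be computed from an estimated variance as in the sketch below; the function name and the example figures are hypothetical.

```python
import math


def variation_coefficient(estimate, variance):
    """Return the standard error relative to the estimate (assumed definition)."""
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    if variance < 0:
        raise ValueError("variance cannot be negative")
    return math.sqrt(variance) / estimate


# Hypothetical cell: estimated count 2,000 with estimated variance 40,000.
cv = variation_coefficient(2000.0, 40000.0)
print(f"{cv:.1%}")
```

A standard error of 200 on an estimate of 2,000 gives a variation coefficient of 10 percent.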
To maximize accuracy, all estimates are based on the largest possible number of records. Tables containing only register variables are counted from the registers. Tables with at least one variable from the LFS are estimated from the largest possible combination of register and survey data. Initial weights have to be available for these estimations. Chapter 6 describes the weights used for the 2011 census and how they were calculated, as well as how the new panel character of the LFS was used: data from different waves were available and the data closest to 1 January 2011 [Census Day] were used to compile the tables. As not all detailed cells could be estimated through repeated weighting only, an additional technique was required. Chapter 7 describes this technique and how it was applied in the reconciliation of hypercubes for the 2011 census.
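Choosing, for each respondent, the LFS observation closest to Census Day can be sketched as follows. The record layout [person identifier, interview date, occupation] and the example records are hypothetical; chapter 6 describes the approach actually used.

```python
from datetime import date

CENSUS_DAY = date(2011, 1, 1)


def closest_to_census_day(records):
    """Keep, per person, the record whose interview date is nearest to Census Day."""
    best = {}
    for person_id, interview_date, occupation in records:
        distance = abs((interview_date - CENSUS_DAY).days)
        if person_id not in best or distance < best[person_id][0]:
            best[person_id] = (distance, interview_date, occupation)
    return {pid: (day, occ) for pid, (_, day, occ) in best.items()}


# Hypothetical panel records from several LFS waves.
lfs_records = [
    (1, date(2010, 10, 15), "teacher"),
    (1, date(2011, 1, 20), "teacher"),
    (1, date(2011, 4, 18), "school head"),
    (2, date(2010, 3, 1), "nurse"),
    (2, date(2010, 12, 15), "care manager"),
]

selected = closest_to_census_day(lfs_records)
print(selected)
```

Ties are resolved in favour of the record that appears first; a production rule would need an explicit choice here.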
As part of the 2011 census was compiled on the basis of sample data, margins of inaccuracy have to be taken into account for some results. A rule of thumb was applied for cell values based on a sample from the census population: only estimated table cells based on at least five persons are published. In addition, rare categories have been made confidential to prevent disclosure of individual information.
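The rule of thumb above amounts to a simple threshold on the number of sample persons behind each estimated cell. The sketch below assumes that each cell carries its estimate together with that count; the cell labels, figures and function name are hypothetical, and the separate confidentiality treatment of rare categories is not covered.

```python
MIN_PERSONS = 5  # threshold stated in the text


def apply_publication_rule(cells, min_persons=MIN_PERSONS):
    """Return the estimate for publishable cells and None for suppressed ones."""
    return {
        label: (estimate if persons >= min_persons else None)
        for label, (estimate, persons) in cells.items()
    }


# Hypothetical cells: label -> (estimated count, number of sample persons).
cells = {"managers": (1200.0, 14), "clerks": (310.0, 5), "pilots": (90.0, 3)}

published = apply_publication_rule(cells)
print(published)
```

A cell based on exactly five persons is published; the cell based on three is suppressed.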
The register-based census has proven to be a successful concept in the Netherlands. It has many advantages compared with traditional censuses: costs are considerably lower, problems with non-response only play a role when survey microdata are reused, and the production time is much shorter. These advantages more than make up for the loss of some detail in tables based on survey variables. The 2011 census provides data on the Netherlands that can be compared to results of earlier Dutch censuses and to results of other countries taking part in the 2011 Census Round.
Although most countries in the world still conduct traditional censuses, the Netherlands is not the only country with a register-based census. A number of countries in Europe have switched to combined and register-based censuses. The 2011 Census was the fourth that the Netherlands conducted without census questionnaires.
Just as in the 2001 Census, the repeated weighting technique was used successfully to produce a consistent set of tables for the 2011 census. A new additional method was introduced for the 23 hypercubes to be estimated. All tables that had to be estimated were based on the largest number of records possible and the resulting hypercubes are mutually consistent. It is important to apply micro-integration of the different sources in the SSD before compiling tables using the estimation techniques. The use of micro-integration and the applied estimation techniques guarantee the consistency between table results from different hypercubes. There is thus no confusion for users of census information, as there is one figure on each socio-economic phenomenon, instead of several figures depending on which sources are used.