Revision History

September 2015:

Added 19 new samples for Armenia, Austria, Costa Rica, Ethiopia, France, Ghana, Mozambique, Paraguay, Portugal, Puerto Rico, South Africa, and Spain. Ethiopia, Mozambique, and Paraguay are new to IPUMS. Samples for the other countries extend pre-existing series for those countries.

Continued to carry out improvements to geography, providing harmonized geographic units for the second administrative level for roughly half the countries. The revisions to geography are expected to be completed by summer 2016. More information about IPUMS geography variables is available here.

Renamed approximately 100 integrated variables, expanding their names to be somewhat more consistent and intuitive. Affected variables with their current and previous names are listed here. Geography variables also underwent wholesale renaming. We apologize for any inconvenience these changes cause users.

July 2014:

Added 20 new samples for Dominican Republic, Ghana, Ireland, Liberia, Mali, Nigeria, Ukraine, Uruguay, and Zambia. The samples for Ghana, Ireland, Mali, and Uruguay extend the pre-existing series for those countries.

July 2014:

Revised the geographic variables, introducing first-level harmonized subnational geography for all countries, along with associated GIS boundary files. Pre-existing geography variables were renamed to conform to a more systematic naming convention intended to distinguish harmonized and unharmonized variables at the first and second administrative levels.

July 2013:

Added 27 new samples for Argentina, Bangladesh, Brazil, Burkina Faso, Cameroon, Ecuador, Fiji, Haiti, Kenya, Kyrgyz Republic, Panama, South Sudan, and the United States. The South Sudan sample contains records that were formerly part of the Sudan 2008 sample, but which are now treated as a separate country.

Added data from four recent censuses from Brazil, Ecuador, Haiti, and Panama that record individual mortality and/or migration events. These files can be downloaded and linked to data produced by the extract system.

June 2012:

Added 26 new samples for El Salvador, Indonesia, Mexico, Morocco, Nicaragua, Turkey, and Uruguay. The sample for Mexico extends the pre-existing series for that country. Added 40 new harmonized variables and approximately 2300 unharmonized variables specific to the individual samples.

August 2011:

Significantly redesigned the extract process in the web dissemination system. The new process is far more streamlined, relegating the numerous steps in the old system to a list of options that users can choose to ignore.

June 2011:

Added 26 new samples for Cambodia, Egypt, France, Germany, Iran, Ireland, Jamaica, Malawi, Palestine, Sierra Leone, Sudan, and Vietnam. The samples for Cambodia, Egypt, France, Palestine, and Vietnam extend pre-existing series for those countries. The data release includes 40 new harmonized variables and approximately 2100 unharmonized variables specific to the individual samples.

February 2011:

Introduced a new version of the web user interface for browsing variables and creating data extracts. The new system is explicitly designed around the concept of a "data cart" that one adds to while browsing and from which one "checks out" to generate a data extract. We continue to develop new features based on this design.

July 2010:

Added a new sample for 2004 India. The sample is an employment survey similar to the other India samples.

June 2010:

Added 28 new samples for Cuba, Mali, Nepal, Pakistan, Peru, Puerto Rico, Saint Lucia, Senegal, Switzerland, Tanzania, and Thailand. The data release includes 55 new harmonized variables and approximately 2500 unharmonized variables specific to the individual samples.

Added a discussion of sampling error that highlights situations where sample design can significantly affect standard errors. We continue to develop this material.

February 2010:

Added downloadable datasets containing fertility, mortality and migration events for seven censuses from developing countries. Because there can be multiple events per person or per household, these data do not fit within the data structure handled by the IPUMS extract system. Instead, these files can be downloaded and matched onto the extract data, giving researchers complete flexibility to devise their own measures.
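The matching step can be sketched as follows. This is only an illustration, not the procedure used by IPUMS: it assumes the event file and the extract share household and person identifiers (called SERIAL and PERNUM here), and the records, the EVENT and YEAR fields, and the link_events helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical extract records, one row per person.
# Assumption: SERIAL (household) and PERNUM (person) are the linking keys.
persons = [
    {"SERIAL": 1, "PERNUM": 1, "AGE": 34},
    {"SERIAL": 1, "PERNUM": 2, "AGE": 31},
    {"SERIAL": 2, "PERNUM": 1, "AGE": 52},
]

# Hypothetical event file, one row per event, so a person can appear
# several times or not at all.
events = [
    {"SERIAL": 1, "PERNUM": 2, "EVENT": "birth", "YEAR": 1995},
    {"SERIAL": 1, "PERNUM": 2, "EVENT": "birth", "YEAR": 1998},
    {"SERIAL": 2, "PERNUM": 1, "EVENT": "migration", "YEAR": 1990},
]


def link_events(persons, events):
    """Attach each person's list of events and an event count."""
    by_person = defaultdict(list)
    for ev in events:
        by_person[(ev["SERIAL"], ev["PERNUM"])].append(ev)

    linked = []
    for p in persons:
        own = by_person.get((p["SERIAL"], p["PERNUM"]), [])
        linked.append({**p, "N_EVENTS": len(own), "EVENTS": own})
    return linked


linked = link_events(persons, events)
```

Keeping the full event list on each record, rather than a single summary value, leaves the choice of measure (counts, timing of first event, intervals between events) to the researcher. Household-level events could be matched the same way using the household identifier alone.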

January 2010:

Introduced a new web interface that integrates variable browsing with the data extract process. The new system also includes a variable search feature.

Corrected a problem with marital status in Brazil 2000. There are three source variables that are not entirely consistent with one another. After review, we altered our interpretation of the data, which results in more consensual unions and separated persons and fewer married, divorced and widowed. The unharmonized source variables are unchanged, so users can access them to explore this issue further.

Corrected the earned income variable for 1981 Venezuela. Too many cases were receiving a value of "1".

May 2009:

Added 19 new samples for Armenia, Bolivia, France, Guinea, India, Italy, Jordan, Kyrgyz Republic, Mongolia, Romania, Slovenia, and South Africa. The Indian samples are large employment surveys that asked many questions common in censuses. The French, Romanian, and South African samples extend pre-existing series of samples for those countries. We also added approximately 60 new harmonized variables and 1700 unharmonized variables specific to the individual samples.

Introduced GIS boundary files. These enable users to map variables with country-level geography and variables relating to the first administrative level within each country, such as place of residence and birthplace.

June 2008:

Added 32 new samples for Austria, Canada, China, Colombia, Egypt, Ghana, Iraq, Malaysia, Mexico, Netherlands, Panama, United Kingdom, United States, and Venezuela. Added approximately 100 new harmonized variables and 2000 unharmonized variables specific to the individual samples.

Developed location-of-mother and location-of-father data for all samples. The constructed parental locator variables (MOMLOC and POPLOC) give the record number within the household of each person's mother or father using information on age, relationship, marital status, child-bearing, and other data. The variables make it easy to attach the characteristics of a person's parent to their own record (such as mother's age or father's occupation), or to summarize the characteristics of dependent children (such as number of own children in school). The basis for making the links is summarized in the variable PARRULE.
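As a rough illustration of how the locator variables can be used, the sketch below attaches each parent's age to a child's record and counts own children per parent within one household. It assumes that the person's record number is stored in a field called PERNUM and that a locator value of 0 means the parent is not present in the household; the sample household, the AGE_MOM and AGE_POP names, and both helper functions are hypothetical.

```python
# One hypothetical household. MOMLOC and POPLOC hold the record number
# (assumed here to be PERNUM) of the mother and father; 0 is assumed to
# mean "no such parent in the household".
household = [
    {"PERNUM": 1, "AGE": 40, "MOMLOC": 0, "POPLOC": 0},
    {"PERNUM": 2, "AGE": 38, "MOMLOC": 0, "POPLOC": 0},
    {"PERNUM": 3, "AGE": 10, "MOMLOC": 2, "POPLOC": 1},
    {"PERNUM": 4, "AGE": 7, "MOMLOC": 2, "POPLOC": 0},
]


def attach_parent_age(household):
    """Copy each parent's AGE onto the child's record (None if absent)."""
    by_pernum = {p["PERNUM"]: p for p in household}
    out = []
    for p in household:
        mom = by_pernum.get(p["MOMLOC"])
        pop = by_pernum.get(p["POPLOC"])
        out.append({
            **p,
            "AGE_MOM": mom["AGE"] if mom else None,
            "AGE_POP": pop["AGE"] if pop else None,
        })
    return out


def count_own_children(household):
    """Count, for every person, the co-resident children who point to them."""
    counts = {p["PERNUM"]: 0 for p in household}
    for p in household:
        for loc in (p["MOMLOC"], p["POPLOC"]):
            if loc in counts:
                counts[loc] += 1
    return counts


with_parents = attach_parent_age(household)
children = count_own_children(household)
```

The same lookup works for any parental characteristic, and the counting loop can be restricted (for example, to children currently in school) by adding a condition on the child's record.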

Made significant changes to the interface to accommodate the growing number of samples and variables, and to give users greater control while browsing the variables or defining a data extract. Key features include:

  • Users can customize their extract size by selecting the number of households or persons they want from each dataset. The extract system draws a subset of households that match the desired case-count or sample fraction and generates syntax files that adjust the weight variables appropriately.
  • While browsing the documentation, users can save variables to include in their data extract later in their web session.
  • The extract system will "attach characteristics": it will use information from the record of the spouse, mother, father, or head to create new variables such as "mother's employment status." The feature uses the constructed family interrelationship "pointer" variables, MOMLOC, POPLOC, and SPLOC.
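The household subsetting described in the first item above can be approximated as follows. This is a minimal sketch under stated assumptions, not the extract system's actual method: it draws whole households at random and inflates the person weight by the inverse of the realized sampling fraction. The field names SERIAL and PERWT, the sample records, and the subsample_households helper are assumptions made for illustration.

```python
import random

# Hypothetical person records: four households of two persons each.
persons = [
    {"SERIAL": s, "PERNUM": n, "PERWT": 10.0}
    for s in (1, 2, 3, 4)
    for n in (1, 2)
]


def subsample_households(persons, fraction, seed=0):
    """Keep a random subset of whole households and rescale PERWT.

    Assumption: weights are divided by the realized sampling fraction so
    that weighted totals stay comparable to the full sample.
    """
    rng = random.Random(seed)
    serials = sorted({p["SERIAL"] for p in persons})
    n_keep = max(1, round(len(serials) * fraction))
    kept = set(rng.sample(serials, n_keep))
    realized = n_keep / len(serials)
    return [
        {**p, "PERWT": p["PERWT"] / realized}
        for p in persons
        if p["SERIAL"] in kept
    ]


sub = subsample_households(persons, 0.5)
```

Sampling by household rather than by person keeps every retained household complete, which the locator variables depend on.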

July 2007:

Corrected an error in the SPLOC (spouse locator) variable in Rwanda 1991, South Africa 2001, and Uganda 1991. Second- and higher-order spouses in polygamous unions were not being linked to their husbands.

Improved the relationship-to-household-head codes in 1980 and 1990 Hungary for persons who were not part of the primary family. The samples only contain relationship-to-subfamily-head information, but it is possible to infer the relationships between subfamilies in most cases.

June 2007:

Added 17 new samples for Argentina, Hungary, Israel, Palestine, Portugal, and Rwanda.

Developed location-of-spouse data for all samples. The constructed spouse locator variable (SPLOC) gives the record number within the household of each person's spouse using information on age, relationship, marital status, and other data. The variable makes it easy to attach the characteristics of a person's spouse to their own record (such as spouse's age or occupation). The basis for making the link is summarized in the variable SPRULE.

December 2006:

Added 16 new samples for Belarus, Cambodia, Greece, Philippines, Romania, Spain, and Uganda. Added unharmonized variables as a new feature of the documentation system, giving users access to the full information of the original samples -- even those variables we have not harmonized cross-nationally. Also introduced content filtering, so only information for selected countries appears on the various documentation pages.

June 2006:

Added 19 new samples for Chile, Costa Rica, Ecuador, South Africa, and Venezuela. Added approximately 50 variables. Introduced a dynamically generated variables page allowing users to customize their view of the contents of the data series. Added a feature that compiles on a single web page, for any IPUMS variable, all relevant enumeration text from every census. This page can also be customized to include only the samples of interest to researchers.

December 2005:

Moved the expanded samples and improved data extraction system developed in March to the regular IPUMS-International site. The beta test site was deactivated.

March 2005:

Added 33 household variables and 100 person variables to the data series on the beta test site. The new release adds substantially to the household record, and nearly completes all remaining person variables from the 28 samples currently in the data series.

Introduced a new data extraction system with improved features. The most significant improvement is the ability to revise and resubmit past data extracts. The system also allows the user, when performing case selection, the choice of including only the persons meeting the selection criteria, or including all persons within households in which any person meets the selection criteria.
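The two case-selection modes can be illustrated with a short sketch. It is not the extract system's code: the SERIAL household identifier, the sample records, and the select_cases helper are assumptions made for the example.

```python
# Hypothetical person records in two households.
persons = [
    {"SERIAL": 1, "PERNUM": 1, "AGE": 70},
    {"SERIAL": 1, "PERNUM": 2, "AGE": 30},
    {"SERIAL": 2, "PERNUM": 1, "AGE": 40},
]


def select_cases(persons, predicate, whole_households=False):
    """Return matching persons, or everyone in households with a match."""
    matches = [p for p in persons if predicate(p)]
    if not whole_households:
        return matches
    serials = {p["SERIAL"] for p in matches}
    return [p for p in persons if p["SERIAL"] in serials]


def is_elderly(person):
    return person["AGE"] >= 65


only_matches = select_cases(persons, is_elderly)
with_household = select_cases(persons, is_elderly, whole_households=True)
```

Here the first call keeps only the 70-year-old, while the second also keeps the 30-year-old who lives in the same household, which matters when household context is needed for analysis.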

September 2004:

Replaced the 1% samples for USA 1980-2000 with 5% samples.

July 2004:

Corrected an error in the OCCISCO variable for Kenya 1989. Persons in the labor force were incorrectly coded to "Not Applicable".

June 2004:

Added preliminary samples for Brazil 1960, 1970, 1980, 1991 and 2000. Some constructed variables, such as the spouse and parental locators, are not yet included in the Brazil samples.

January 2004:

Introduced new versions of the Colombia samples that allow the identification of municipalities and municipality groupings with at least 20,000 population in 1993.

December 2003:

Corrected an error in the relationship-to-head codes in Mexico 1970. The census asked for relationship to head of family, not head of household. In the previous version of the data, we erroneously interpreted these families/subfamilies in multi-family households as if they were separate households in multi-household dwellings. Only households with subfamilies were affected by the error. In the current data, subfamily members in multi-family households are coded "unknown" for relationship to household head, but a separate variable (SUBFREL) retains their relationship to their family head.

October 2003:

Replaced all French samples with versions containing persons organized in households. The previous versions were individual-level samples that did not group co-resident persons.

October 2003:

Corrected an error in urban-rural status for 1970 Mexico. The values for urban and rural had been reversed.

July 2003:

Added numerous variables on birthplace, migration, and disability, among others.

Added 2000 United States sample.

April 2003:

Added preliminary versions of the constructed household and family interrelationship variables, including pointers to mother, father and spouse.

Added the variable "number of children surviving."

Substituted a cleaner version of the Kenya 1999 sample.

March 2003:

Added China 1982 sample in addition to a number of new variables.

August 2002:

Corrected a problem with person weights in the Vietnam 1989 sample. The previous weights were erroneous.