Schemas

15 April 2024 - 17:22

There are two kinds of schemas within historical demography. IPUMS-USA, IPUMS-International, MOSAIC, and NAPP were developed to standardize census data. IDS, LINKS-gen, and PiCo are efforts to standardize data from other types of historical sources, such as the civil registry, militia registers, parish registers, population registers, slave registers, or tax registers. All schemas cater to a very specific audience, as they have been developed for projects with strong institutional boundaries.

Census schemas (IPUMS, MOSAIC, NAPP)

Three schemas describe census data. The driving force behind the processing of census data is IPUMS in Minnesota, which in 1991 started providing "common-format extracts" with common codes and constructed variables. In 1999, IPUMS joined up with scholars from Canada, Denmark, Great Britain, Iceland, Norway, and Scotland in the North Atlantic Population Project (NAPP). A similar international census comparison project took place in the early 2010s at the Max Planck Institute for Demographic Research (MPIDR) in Rostock, Germany. The MPIDR schema, called MOSAIC, standardized census data from the 1700s until 1950 for 18 regions in Europe.

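| Theme | Variable | IPUMS-International | MOSAIC | NAPP |
|---|---|---|---|---|
| Geography | Country code | CNTRY | country | CNTRY |
| Geography | Place | - | place | - |
| Geography | Region | - | region | - |
| Geography | Urban-rural status | URBAN | urban | URBAN |
| Household | Group quarter status | GQ | gq | GQ |
| Household | Household size | PERSONS | hhsize | NUMBERHH |
| Household | Household weight | WTHH | hhwt | HHWT |
| Identifier | Enumeration | SAMPLE | id_enum | SAMPLE |
| Identifier | Household | SERIAL | id_hhold | SERIAL |
| Identifier | Person | PERNUM | id_pers | PERNUM |
| Individual | Age | AGE | age | AGE |
| Individual | Literacy | LIT | lit | LIT |
| Individual | Marital status | MARST | marst | MARST |
| Individual | Occupational title | - | occupan | OCCSTRN |
| Individual | OCCHISCO | OCCHISCO | occhisco | OCCHISCO |
| Individual | Presence at enumeration | RESIDENT | presence | RESIDENT |
| Individual | Relation to household head | - | relate | - |
| Individual | Religion | RELIG | relig | RELIGION |
| Individual | Sex | SEX | sex | SEX |
| Individual | Weight | WTPER | perwt | PERWT |
| Person name | First name | - | fname | NAMEFRST |
| Person name | Last name | - | lname | NAMELAST |
| Provenance | - | - | - | - |
| Quality indicators | Age | - | qage | QAGEGB |
| Quality indicators | Household | - | qhhold | - |
| Quality indicators | Relation to household head | - | qrelate | QRELGB |
| Quality indicators | Marital status | - | qmarst | QMARSTGB |
| Quality indicators | Sex | - | qsex | QSEXGB |
| Source | Enumeration type | - | enumtype | - |
| Source | Enumeration year | YEAR | year | YEAR |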
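The variable names in the table can be read as a crosswalk between the three census schemas. The Python sketch below illustrates that idea for a handful of rows. It is only an illustration and not part of any of the three projects: the `CROSSWALK` dictionary, the schema keys, and the `rename_record` helper are hypothetical names, and the example record contains invented placeholder values.

```python
# Crosswalk transcribed from a few rows of the table above:
# concept -> variable name per schema. None marks a "-" in the table.
CROSSWALK = {
    "Country code": {"ipums_international": "CNTRY", "mosaic": "country", "napp": "CNTRY"},
    "Household size": {"ipums_international": "PERSONS", "mosaic": "hhsize", "napp": "NUMBERHH"},
    "Age": {"ipums_international": "AGE", "mosaic": "age", "napp": "AGE"},
    "Occupational title": {"ipums_international": None, "mosaic": "occupan", "napp": "OCCSTRN"},
    "Religion": {"ipums_international": "RELIG", "mosaic": "relig", "napp": "RELIGION"},
    "Sex": {"ipums_international": "SEX", "mosaic": "sex", "napp": "SEX"},
    "Person weight": {"ipums_international": "WTPER", "mosaic": "perwt", "napp": "PERWT"},
    "First name": {"ipums_international": None, "mosaic": "fname", "napp": "NAMEFRST"},
}


def rename_record(record, source, target):
    """Rename the keys of one record from the source schema to the target schema.

    Variables that are missing from CROSSWALK, or that have no counterpart
    in the target schema, are dropped. Values are passed through unchanged.
    """
    # Build a source-name -> target-name lookup for the requested schema pair.
    lookup = {
        names[source]: names[target]
        for names in CROSSWALK.values()
        if names[source] is not None
    }
    renamed = {}
    for key, value in record.items():
        target_name = lookup.get(key)
        if target_name is not None:
            renamed[target_name] = value
    return renamed


# Hypothetical MOSAIC-style record with placeholder values.
mosaic_row = {"age": 34, "sex": "female", "hhsize": 5, "occupan": "weaver", "fname": "Anna"}

as_napp = rename_record(mosaic_row, "mosaic", "napp")
as_ipums = rename_record(mosaic_row, "mosaic", "ipums_international")
```

The sketch only renames variables. It does not harmonize the codes stored in them, which the three schemas may define differently, so a real conversion would also need a value-level mapping per variable.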

 

Historical person data (IDS, LINKS-gen, PiCo)

IDS, LINKS-gen, and PiCo are efforts to standardize different types of historical person data. Of these schemas, LINKS-gen is by far the most limited: it standardizes data in two tables that are ready for statistical analysis. IDS has been around since 2009 and explicitly states for which point or period in time historical information is valid. PiCo was developed in 2023 by the Center for Family History in the Netherlands as a means to store information on person registrations as well as concomitant records and person reconstructions.


 

Schemas - part of the CLAIR-HD project