Schemas

15 April 2024 - 17:22

There are two kinds of schemas within historical demography. IPUMS-USA, IPUMS-International, MOSAIC, and NAPP were developed to standardize census data. IDS, LINKS-gen, and PiCo are efforts to standardize data from other types of historical sources, such as civil registries, militia registers, parish registers, population registers, slave registers, and tax registers. All of these schemas cater to a very specific audience, as they were developed for projects with strong institutional boundaries.

Census schemas (IPUMS, MOSAIC, NAPP)

Three schemas describe census data. The driving force behind processing census data is IPUMS in Minnesota. In 1991, it started providing "common-format extracts" with common codes and constructed variables. In 1999, IPUMS joined up with scholars from Canada, Denmark, Great Britain, Iceland, Norway, and Scotland in the North Atlantic Population Project (NAPP). A similar international census comparison project took place in the early 2010s at the Max Planck Institute for Demographic Research (MPIDR) in Rostock, Germany. The MPIDR schema, called MOSAIC, standardized census data from the 1700s until 1950 for 18 regions in Europe.

 

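| Theme | Variable | IPUMS-International | MOSAIC | NAPP |
|---|---|---|---|---|
| Geography | Country code | CNTRY | country | CNTRY |
| Geography | Place | | place | - |
| Geography | Region | | region | - |
| Geography | Urban-rural status | URBAN | urban | URBAN |
| Household | Group quarter status | GQ | gq | GQ |
| Household | Household size | PERSONS | hhsize | NUMBERHH |
| Household | Household weight | WTHH | hhwt | HHWT |
| Identifier | Enumeration | SAMPLE | id_enum | SAMPLE |
| Identifier | Household | SERIAL | id_hhold | SERIAL |
| Identifier | Person | PERNUM | id_pers | PERNUM |
| Individual | Age | AGE | age | AGE |
| Individual | Literacy | LIT | lit | LIT |
| Individual | Marital status | MARST | marst | MARST |
| Individual | Occupational title | - | occupan | OCCSTRN |
| Individual | OCCHISCO | OCCHISCO | occhisco | OCCHISCO |
| Individual | Presence at enumeration | RESIDENT | presence | RESIDENT |
| Individual | Relation to household head | - | relate | - |
| Individual | Religion | RELIG | relig | RELIGION |
| Individual | Sex | SEX | sex | SEX |
| Individual | Weight | WTPER | perwt | PERWT |
| Person name | First name | - | fname | NAMEFRST |
| Person name | Last name | - | lname | NAMELAST |
| Provenance | - | - | - | - |
| Quality indicators | Age | - | qage | QAGEGB |
| Quality indicators | Household | - | qhhold | - |
| Quality indicators | Relation to household head | - | qrelate | QRELGB |
| Quality indicators | Marital status | - | qmarst | QMARSTGB |
| Quality indicators | Sex | - | qsex | QSEXGB |
| Source | Enumeration type | - | enumtype | - |
| Source | Enumeration year | YEAR | year | YEAR |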
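
The correspondences in the census table can be read as a crosswalk between variable names. The following is a minimal sketch, not part of any of the three projects' tooling: it renames the keys of a MOSAIC-style record to their NAPP counterparts for a subset of the variables listed above. The function name `rename_record` and the example record are hypothetical, and the sketch only renames variables; any differences in how values are coded would need separate handling.

```python
# Subset of the table above: MOSAIC variable name -> NAPP variable name.
# None marks variables for which the table lists no NAPP counterpart ("-").
MOSAIC_TO_NAPP = {
    "country": "CNTRY",
    "place": None,
    "region": None,
    "urban": "URBAN",
    "id_enum": "SAMPLE",
    "id_hhold": "SERIAL",
    "id_pers": "PERNUM",
    "age": "AGE",
    "marst": "MARST",
    "occupan": "OCCSTRN",
    "occhisco": "OCCHISCO",
    "relate": None,
    "relig": "RELIGION",
    "sex": "SEX",
    "perwt": "PERWT",
    "fname": "NAMEFRST",
    "lname": "NAMELAST",
    "year": "YEAR",
}


def rename_record(record, crosswalk):
    """Rename keys via the crosswalk; also return the names that have no counterpart."""
    renamed, unmapped = {}, []
    for name, value in record.items():
        target = crosswalk.get(name)
        if target is None:
            unmapped.append(name)
        else:
            renamed[target] = value
    return renamed, unmapped


# Hypothetical MOSAIC-style record; the values are invented for illustration.
example = {"id_pers": 1, "age": 34, "sex": "female", "relate": "head", "fname": "Anna"}
napp_record, unmapped = rename_record(example, MOSAIC_TO_NAPP)
print(napp_record)
print(unmapped)
```

Returning the unmapped names separately, rather than silently dropping them, keeps visible which information would be lost when moving from one schema to the other.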

 

Historical person data (IDS, LINKS-gen, PiCo)

IDS, LINKS-gen, and PiCo are efforts to standardize different types of historical person data. Of these schemas, LINKS-gen is by far the most limited: it standardizes data in two tables that are ready for statistical analysis. IDS has been around since 2009 and explicitly states for which point or period in time a piece of historical information is valid. PiCo was developed in 2023 by the Center for Family History in the Netherlands as a means to store information on person registrations as well as the accompanying records and person reconstructions.

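| Theme | Variable | IDS | LINKS-gen | PiCo |
|---|---|---|---|---|
| Date | Baptism | BAPTISM_DATE | - | - |
| Date | Birth | BIRTH_DATE | B_date | schema:birthDate |
| Date | Date | TIMESTAMP | Date | - |
| Date | Death | DEATH_DATE | D_date | schema:deathDate |
| Date | Divorce | DIVORCE_DATE | - | - |
| Date | First observation | START_OBSERVATION | - | - |
| Date | Funeral | FUNERAL_DATE | - | - |
| Date | Last observation | END_OBSERVATION | LastEntryDate | - |
| Date | Marriage | MARRIAGE_DATE | M_date_1 to 5 | - |
| Date | Marriage announcement | MARRIAGE_PROCLAMATION_DATE | - | - |
| Date | Stillbirth | STILLBIRTH_DATE | - | - |
| Geography | Baptism | BAPTISM_LOCATION | - | - |
| Geography | Birth | BIRTH_LOCATION | B_location | schema:birthPlace |
| Geography | Death | DEATH_LOCATION | D_location | schema:deathPlace |
| Geography | Divorce | DIVORCE_LOCATION | - | - |
| Geography | Funeral | FUNERAL_LOCATION | - | - |
| Geography | Marriage | MARRIAGE_LOCATION | M_location_1 to 5 | - |
| Geography | Marriage announcement | MARRIAGE_PROCLAMATION_LOCATION | - | - |
| Geography | Place | - | Location | schema:address |
| Geography | Stillbirth | STILLBIRTH_LOCATION | - | - |
| Household | Relation between individuals | RELATION | - | - |
| Identifier | Father | Id_I_2 | Id_father | schema:parent |
| Identifier | Household | Id_C | - | - |
| Identifier | Mother | Id_I_2 | Id_mother | schema:parent |
| Identifier | Partner | Id_I_2 | Id_partner_1 to 5 | schema:spouse |
| Identifier | Person | Id_I_1 | Id_person | schema:identifier |
| Individual | Age | - | Age | pico:hasAge |
| Individual | Age in years | AGE_YEARS | - | - |
| Individual | Age in months | AGE_MONTHS | - | - |
| Individual | Age in weeks | AGE_WEEKS | - | - |
| Individual | Age in days | AGE_DAYS | - | - |
| Individual | Age at death | - | D_age | - |
| Individual | Age at last observation | - | LastEntryAge | - |
| Individual | Age at marriage | - | M_age_1 to 5 | - |
| Individual | Alive / Dead | ALIVE | - | pico:deceased |
| Individual | Civil status | CIVIL_STATUS | - | - |
| Individual | Died before registration | - | D_deadonregistration | - |
| Individual | Father's age at birth | - | B_age_father | - |
| Individual | HISCO | OCCUPATION_HISCO | HISCO | - |
| Individual | Legitimacy at birth | LEGITIMACY | - | - |
| Individual | Mother's age at birth | - | B_age_mother | - |
| Individual | Nationality | NATIONALITY | - | - |
| Individual | Number of marriages | - | Marriages_N | - |
| Individual | Occupational title | OCCUPATION | Occupation | schema:hasOccupation |
| Individual | Religion | RELIGION | - | pico:hasReligion |
| Individual | Role | - | - | pico:hasRole |
| Individual | Sex | SEX | Sex | schema:gender |
| Individual | Signature available | SIGNATURE | - | - |
| Individual | Twin | MULTIPLE_BIRTH | Twin | - |
| Person name | First name | FIRST_NAME | - | schema:givenName |
| Person name | Last name | LAST_NAME | - | schema:familyName |
| Person name | Prefix | PREFIX_LAST_NAME | - | pnv:prefix |
| Person name | Title | TITLE | - | schema:honorificPrefix, schema:honorificSuffix |
| Provenance | Creator | - | - | prov:wasGeneratedBy |
| Provenance | Original source | - | - | prov:hadPrimarySource |
| Provenance | Person observations | - | - | prov:wasDerivedFrom |
| Quality indicators | Birth certificate available | - | B | - |
| Quality indicators | Death certificate available | - | D | - |
| Quality indicators | Marriage certificate available | - | M | - |
| Quality indicators | Parental marriage certificate available | - | M_parents | - |
| Quality indicators | Person name did not match, but parental names and time range did | - | Postlink_D | - |
| Source | Archive | - | - | schema:holdingArchive |
| Source | Collection | - | - | schema:isPartOf |
| Source | Date | - | - | schema:dateCreated |
| Source | Digital location | - | - | schema:url |
| Source | Last observation | - | LastEntryCert | - |
| Source | Name | - | - | schema:name |
| Source | Place | - | - | schema:contentLocation |
| Source | Primary source | - | - | prov:hadPrimarySource |
| Source | Record type | NAME | - | - |
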
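
The LINKS-gen column of the table lists numbered variables such as "M_date_1 to 5", which suggests a wide layout with one set of columns per marriage. The following is a minimal sketch of that idea, assuming the numbered names expand to M_date_1 through M_date_5 (and likewise for M_location and Id_partner). The function `widen_marriages`, the input keys, and all values are hypothetical and are not taken from the LINKS-gen documentation.

```python
MAX_MARRIAGES = 5  # assumed from "M_date_1 to 5" in the table


def widen_marriages(person_id, marriages):
    """Spread a list of marriage dicts over numbered LINKS-gen-style columns."""
    if len(marriages) > MAX_MARRIAGES:
        raise ValueError(f"at most {MAX_MARRIAGES} marriages fit this layout")
    row = {"Id_person": person_id, "Marriages_N": len(marriages)}
    for slot in range(1, MAX_MARRIAGES + 1):
        # Slots beyond the last marriage are filled with None.
        marriage = marriages[slot - 1] if slot <= len(marriages) else {}
        row[f"M_date_{slot}"] = marriage.get("date")
        row[f"M_location_{slot}"] = marriage.get("location")
        row[f"Id_partner_{slot}"] = marriage.get("partner_id")
    return row


# Hypothetical person with two marriages; all values are invented.
row = widen_marriages(101, [
    {"date": "1850-05-01", "location": "Utrecht", "partner_id": 202},
    {"date": "1862-09-15", "location": "Amersfoort", "partner_id": 303},
])
print(row["Marriages_N"], row["M_date_2"], row["Id_partner_3"])
```

A fixed number of slots keeps every row the same shape, which suits statistical software, at the cost of empty columns for most persons and a hard limit on the number of marriages that can be recorded.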
Figure: A data model diagram of interconnected circles and labels. At the center is an orange circle labeled schema:ArchiveComponent. From this central node, three arrows point downwards to other elements. An arrow labeled schema:locationCreated points to an orange circle named schema:Place. An arrow labeled schema:additionalType points to a blue circle named pico:huwelijksakte (marriage certificate). An arrow labeled schema:dateCreated points to a grey circle named xsd:date.
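
The diagram can be read as a small set of subject-predicate-object statements. The following is a minimal sketch of one possible reading, in which the central circle gives the class of a record and the three arrows give its properties. The identifiers `ex:record1` and `ex:place1` and the date value are hypothetical; only the predicates and class names come from the figure. Prefix declarations are omitted because the namespace URIs are not given in this section, so the output consists of Turtle-like lines, not a complete Turtle document.

```python
# Hypothetical subject and place identifiers; not taken from the figure.
RECORD = "ex:record1"
PLACE = "ex:place1"

triples = [
    (RECORD, "rdf:type", "schema:ArchiveComponent"),
    (RECORD, "schema:locationCreated", PLACE),
    (PLACE, "rdf:type", "schema:Place"),
    (RECORD, "schema:additionalType", "pico:huwelijksakte"),
    # Invented date, typed as xsd:date as in the figure.
    (RECORD, "schema:dateCreated", '"1850-05-01"^^xsd:date'),
]


def to_turtle_lines(statements):
    """Serialize (subject, predicate, object) tuples as one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


turtle = to_turtle_lines(triples)
print(turtle)
```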