Structured, descriptive documentation of the content, meaning, provenance, and access for a single data set. Originally developed as an XML DTD, DDI-Codebook retains the hierarchical structure of a DTD in describing the contents of a descriptive codebook for a data set including: identification, authorship, ownership, purpose, background methodologies, source information, provenance, quality control, access, physical file structures, variables/variable groupings, and related materials. Extensive information is found within the variable description covering the data source, derivation activity, representation, data typing, variable role, and restrictions.
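
To make the hierarchy concrete, the following is a minimal sketch in Python that parses a small, hand-written instance and lists its top-level sections and variables. The sample document is invented for illustration and is neither complete nor schema-valid. The element names (`codeBook`, `stdyDscr`, `fileDscr`, `dataDscr`, `var`, `labl`) and the `ddi:codebook:2_5` namespace are assumed to follow the 2.5 schema and should be checked against the published schema before use.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for DDI Codebook 2.5; verify against the published schema.
NS = {"ddi": "ddi:codebook:2_5"}

# Hand-written, deliberately incomplete sample instance (illustration only).
SAMPLE = """\
<codeBook xmlns="ddi:codebook:2_5" version="2.5">
  <stdyDscr>
    <citation>
      <titlStmt><titl>Example Household Survey</titl></titlStmt>
    </citation>
  </stdyDscr>
  <fileDscr ID="F1">
    <fileTxt><fileName>survey.csv</fileName></fileTxt>
  </fileDscr>
  <dataDscr>
    <var ID="V1" name="AGE"><labl>Age of respondent</labl></var>
    <var ID="V2" name="SEX"><labl>Sex of respondent</labl></var>
  </dataDscr>
</codeBook>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tags."""
    return tag.rsplit("}", 1)[-1]


def summarize(xml_text):
    """Return the study title, top-level sections, and variable labels."""
    root = ET.fromstring(xml_text)
    title = root.findtext(
        "ddi:stdyDscr/ddi:citation/ddi:titlStmt/ddi:titl", namespaces=NS
    )
    sections = [local_name(child.tag) for child in root]
    variables = {
        var.get("name"): var.findtext("ddi:labl", default="", namespaces=NS)
        for var in root.findall("ddi:dataDscr/ddi:var", NS)
    }
    return {"title": title, "sections": sections, "variables": variables}


summary = summarize(SAMPLE)
print(summary["title"])
print(summary["sections"])
for name, label in summary["variables"].items():
    print(f"{name}: {label}")
```

The same traversal could feed a variable bank or a human-readable codebook; only the standard library is used, so no schema validation takes place here.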

Supports Activities:

  • Descriptive documentation of the content, meaning, provenance, and access for a single data set
  • Archival preservation of descriptive content
  • Input basis for more complex descriptions
  • Input content for discovery and exchange of data at the study, data file, variable, and question level
  • Input content for a structured human-readable codebook for the data set as a whole
  • Populate variable and question banks to explore available data and question structures for reuse in new surveys

Markup Examples

A listing of available examples of DDI Codebook instances (filter by version is available)


DDI Codebook 2.5 XML Schema is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
DDI Codebook 2.5 XML Schema is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Other DDI documents are distributed under Creative Commons licenses. 

Earlier versions were published with copyright © DDI Alliance.

Development Work

Each version of DDI Codebook is backward compatible with earlier versions (i.e., an instance compliant with an earlier version will be compliant with the current version). The development of DDI Codebook is managed by the Technical Committee in response to issues raised by the DDI Codebook community of users. Additional work is done to improve alignment with other products in the DDI Product Suite to facilitate transfer of content between products as needed.

Version 2.5 [current version]

Publication date: 2012-01-17 Update: 2014-01-29

Additions in version 2.5: This version is an XML Schema. It incorporates new substantive elements requested by the community to better support use in statistical agencies and is designed to make it easier to migrate documents to DDI Lifecycle for those interested in doing so. 
DDI Codebook Version 2.5 was modified on January 29, 2014. This was a sub-minor version change (version 2.5.1) that DOES NOT change the namespace of DDI-C. The modifications corrected the omission of DataFingerprint from 2.5, relaxed cardinality to support multiple languages, and expanded the documentation.
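
As an illustration of what multi-language content can look like to a consumer, here is a minimal Python sketch that selects a variable label by language. Which elements had their cardinality relaxed in 2.5.1 is not listed above; the sketch simply assumes that a `var` may carry several `labl` elements distinguished by `xml:lang`. The sample instance and the `label_for` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for DDI Codebook 2.5; unchanged by the 2.5.1 update.
NS = {"ddi": "ddi:codebook:2_5"}
# ElementTree exposes the xml:lang attribute under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample (illustration only, not a complete instance).
SAMPLE = """\
<codeBook xmlns="ddi:codebook:2_5">
  <dataDscr>
    <var ID="V1" name="AGE">
      <labl xml:lang="en">Age of respondent</labl>
      <labl xml:lang="de">Alter der befragten Person</labl>
    </var>
  </dataDscr>
</codeBook>
"""


def label_for(var, lang, fallback="en"):
    """Pick the label in `lang`, else in `fallback`, else None."""
    labels = {
        el.get(XML_LANG): (el.text or "").strip()
        for el in var.findall("ddi:labl", NS)
    }
    return labels.get(lang, labels.get(fallback))


root = ET.fromstring(SAMPLE)
age = root.find("ddi:dataDscr/ddi:var", NS)
print(label_for(age, "de"))
print(label_for(age, "es"))  # no Spanish label, so the English one is used
```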

Version 2.1

Publication date: 2003-03-26

DDI Codebook was the first version of the DDI to be published (Version 1 was released in 2000). DDI 2.0 was released in 2003, with Version 2.1 following two years later. 

The canonical expression of DDI 1.* through 2.1 is a Document Type Definition (DTD), although XML Schema versions have been created. The DTD can be converted to a schema using XML software, but the DTD should be used as the authoritative source for instance creation and validation. 

Version 2.0

Published: 2003-03-07

DDI Version 2.0 was a major expansion of data description. At the study level, it added information on geographic coverage through a Geographic Bounding Box and a Bounding Polygon, which support spatial searches by specifying coverage in terms of coordinate points. Version 2.0 also provides a means of describing tabular data of the kind commonly published by statistical agencies as summary data for geographic areas, for example U.S. Census Summary Files. The nCube structure added to data description defines the dimensions of the table using variables, along with overall information on the title, concept, measure, and universe of the table.
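
The spatial search that a bounding box enables can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the element names (`geoBndBox`, `westBL`, `eastBL`, `southBL`, `northBL`) and their location under `stdyDscr/stdyInfo/sumDscr` are assumed from the 2.5 schema, the sample uses the 2.5 namespace (a DTD-based 2.0 instance would have none), the coordinates are invented, and the `bounding_box` and `covers` helpers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for DDI Codebook 2.5; verify against the published schema.
NS = {"ddi": "ddi:codebook:2_5"}

# Hand-written sample with invented coordinates (illustration only).
SAMPLE = """\
<codeBook xmlns="ddi:codebook:2_5">
  <stdyDscr>
    <stdyInfo>
      <sumDscr>
        <geoBndBox>
          <westBL>-124.7</westBL>
          <eastBL>-66.9</eastBL>
          <southBL>24.5</southBL>
          <northBL>49.4</northBL>
        </geoBndBox>
      </sumDscr>
    </stdyInfo>
  </stdyDscr>
</codeBook>
"""


def bounding_box(xml_text):
    """Return the study's bounding box as a dict of floats, or None."""
    root = ET.fromstring(xml_text)
    box = root.find("ddi:stdyDscr/ddi:stdyInfo/ddi:sumDscr/ddi:geoBndBox", NS)
    if box is None:
        return None
    return {
        side: float(box.findtext(f"ddi:{side}BL", namespaces=NS))
        for side in ("west", "east", "south", "north")
    }


def covers(box, lon, lat):
    """Simple containment test; ignores boxes that cross the antimeridian."""
    return (box["west"] <= lon <= box["east"]
            and box["south"] <= lat <= box["north"])


box = bounding_box(SAMPLE)
print(covers(box, -98.0, 39.0))   # a point inside the sample box
print(covers(box, 2.35, 48.85))   # a point east of the sample box
```

A bounding polygon would need a point-in-polygon test instead of the four comparisons used here.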

Version 1.x

Published: 2001-01-11 with minor revisions through 2002-07-15

Structured, descriptive documentation of the content, meaning, provenance, and access for a single data set. Originally developed as an XML DTD, Codebook retains the hierarchical structure of a DTD in describing the contents of a descriptive codebook for a data set including: identification, authorship, ownership, purpose, background methodologies, source information, provenance, quality control, access, physical file structures, variables/variable groupings, and related materials. Extensive information is found within the variable description covering the data source, derivation activity, representation, data typing, variable role, and restrictions.