Create a Codebook

A codebook is an essential document that tells the data user about the study, data files, variables, and categories that make up a complete dataset. A codebook may include a dataset’s record layout, variable names and labels, concepts, categories, cases, missing value codes, frequency counts, notes, and universe statements. When captured using the DDI metadata standard, the information in a codebook is structured, machine-actionable, and usable by software and databases.

Why a Codebook? 

Creating a readable codebook to accompany your dataset goes a long way toward making your data understandable and reusable well into the future. A codebook provides authoritative (straight-from-the-researcher) and citable information, as well as instructions on how to read, analyze, interpret, and verify data for accuracy and replication purposes.

To create a codebook, you need information about the study, its files, and its variables. (See the DDI Glossary for more information about study-level and variable-level metadata.) A researcher sometimes provides this information in a readme file (e.g., a text file), or it may be embedded in a datafile package (e.g., SPSS stores column names, labels, missing values, and notes).

Tip: Don’t worry if you don’t currently have any DDI or structured metadata to work with. A number of tools can extract file- and variable-level metadata (such as variable names, labels, categories, values, and frequency counts) from your completed dataset.
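
To make the idea of extracting variable-level metadata concrete, here is a minimal sketch in Python. It is not how any of the tools on this page work internally; it only assumes a small CSV file (the sample data, the function name summarize_variables, and the choice to treat empty strings as missing are all hypothetical) and collects variable names, frequency counts, and missing counts using the standard library.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; in practice this would be read from your own file.
SAMPLE_CSV = """id,sex,age_group
1,1,2
2,2,3
3,1,
4,2,2
"""

def summarize_variables(csv_text, missing_values=("",)):
    """Map each variable name to its value frequencies and missing count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    summary = {
        name: {"frequencies": Counter(), "missing": 0}
        for name in reader.fieldnames
    }
    for row in reader:
        for name in reader.fieldnames:
            value = row[name]
            # Assumption: an empty cell counts as a missing value.
            if value is None or value in missing_values:
                summary[name]["missing"] += 1
            else:
                summary[name]["frequencies"][value] += 1
    return summary

summary = summarize_variables(SAMPLE_CSV)
for name, info in summary.items():
    print(name, dict(info["frequencies"]), "missing:", info["missing"])
```

A summary like this covers only what can be read from the data itself. Labels, universe statements, and study-level information still have to come from the researcher or from the statistical package.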

Use Cases and Examples

Here is an example of a readable codebook in PDF format. This is the same codebook in machine-actionable DDI format (XML format). These files were generated using the DDI-Codebook standard, and were marked up using Nesstar Publisher. 

Additional codebook examples using ICPSR data are also available for review. 

Getting Started

Once you establish a workflow for codebook creation, you can easily attach codebooks to your published datasets in any data repository. 

Scenario: “I have a complete dataset and I would like to create a codebook”

A codebook accompanies a dataset and helps an end user interpret the data for replication and reuse. A DDI codebook provides metadata about a dataset and enables software tools to read the data correctly for display and further statistical analysis.


A number of tools can generate the DDI metadata needed for a structured codebook.

Tip: If you have data in SPSS, SAS, or Stata format and want to produce DDI from those packages, there are tools to help you. Note that these tools provide only the metadata carried in the statistical packages and not the actual data. 

Note: This is not a comprehensive list of the tools available for DDI. It is a list of enterprise-ready software used by a variety of institutions and organizations that manage and produce data for reuse, and these tools may not be relevant for all use cases. Anyone can use the DDI standard and XML schemas to develop new tools. For more information about using DDI from scratch, please review the DDI for Developers guidance.
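
For readers curious what building DDI from scratch might involve, the following Python sketch writes a small variable description using DDI-Codebook element names (codeBook, dataDscr, var, labl, catgry, catValu, catStat). It is an illustration under stated assumptions, not a complete or schema-valid DDI document: it omits the XML namespace and the study-level sections, and the function name build_codebook_xml and the sample variable are hypothetical.

```python
import xml.etree.ElementTree as ET

def build_codebook_xml(variables):
    """Build a minimal DDI-Codebook-style XML string from variable metadata.

    Assumption: no namespace and no study-level elements are written,
    so the output is a teaching sketch rather than a valid DDI file.
    """
    root = ET.Element("codeBook")
    data_dscr = ET.SubElement(root, "dataDscr")
    for var in variables:
        var_el = ET.SubElement(data_dscr, "var", {"name": var["name"]})
        ET.SubElement(var_el, "labl").text = var["label"]
        for value, label, freq in var.get("categories", []):
            cat = ET.SubElement(var_el, "catgry")
            ET.SubElement(cat, "catValu").text = str(value)
            ET.SubElement(cat, "labl").text = label
            # Frequency count for this category.
            ET.SubElement(cat, "catStat", {"type": "freq"}).text = str(freq)
    return ET.tostring(root, encoding="unicode")

# Hypothetical variable metadata: (value, label, frequency) per category.
variables = [
    {
        "name": "sex",
        "label": "Sex of respondent",
        "categories": [(1, "Male", 2), (2, "Female", 2)],
    },
]

xml_text = build_codebook_xml(variables)
print(xml_text)
```

Anything intended for publication should be validated against the official DDI schema for the version in use, which this sketch does not attempt.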

Codebook Creation Tools

Nesstar Publisher - Download 
Nesstar Publisher can produce PDF Codebooks and DDI-Codebook Version 1.2 XML

Colectica - Download 
Colectica can produce PDF Codebooks and DDI-Lifecycle Version 3.2 XML

DDIEditor - Download 
DDIEditor can produce DDI-Lifecycle XML

Export and Transfer Tools for DDI-XML 

StatTransfer - Download 
StatTransfer can export various formats to DDI-Lifecycle Version 3.1 XML

Sledgehammer - Download 
Sledgehammer can export various formats to DDI-Codebook and DDI-Lifecycle XML

If you have existing DDI metadata in XML format and you want to display it as a readable codebook in PDF or HTML format, you can use existing DDI style sheets to do so.

Style Sheets (DDI-XSL tools)

DDI-Lifecycle Version 3.1 Converter (using XSLT Stylesheets) 
Convert DDI 3.1 XML to MARC-XML, DDI-Codebook XML, DataCite 2.2 XML, PDF, XHTML (DDA), XHTML (generic)

Generic DDI-Codebook Stylesheet to HTML (ICPSR)
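
As a rough illustration of what the style sheets listed above do, the Python sketch below turns a small DDI-Codebook-style XML fragment into readable HTML. It is a stand-in for an XSLT transformation, not a replacement for the listed style sheets. The sample XML, the function name render_codebook_html, and the HTML layout are hypothetical, and the sample has no XML namespace, whereas real DDI files declare one that would need to be handled.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, namespace-free fragment using DDI-Codebook element names.
SAMPLE_DDI = """<codeBook>
  <dataDscr>
    <var name="sex">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
  </dataDscr>
</codeBook>"""

def render_codebook_html(ddi_xml):
    """Render each variable and its categories as a simple HTML page."""
    root = ET.fromstring(ddi_xml)
    parts = ["<html><body><h1>Codebook</h1>"]
    for var in root.iter("var"):
        name = escape(var.get("name", ""))
        # findtext with a bare tag name looks only at direct children,
        # so this picks the variable label, not a category label.
        label = escape(var.findtext("labl", default=""))
        parts.append(f"<h2>{name}: {label}</h2>")
        categories = var.findall("catgry")
        if categories:
            parts.append("<ul>")
            for cat in categories:
                value = escape(cat.findtext("catValu", default=""))
                cat_label = escape(cat.findtext("labl", default=""))
                parts.append(f"<li>{value} = {cat_label}</li>")
            parts.append("</ul>")
    parts.append("</body></html>")
    return "\n".join(parts)

html_text = render_codebook_html(SAMPLE_DDI)
print(html_text)
```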

Other Conversion Software

XPDF - Download
XPDF can convert a PDF file (codebook) to ASCII text

See also:

Glossary of DDI Terms