A codebook is an essential document that informs the data user about the study, data file(s), variables, categories, etc., that make up a complete dataset. The codebook may include a dataset’s record layout, list of variable names and labels, concepts, categories, cases, missing value codes, frequency counts, notes, universe statements, and so on. When captured using the DDI metadata standard, information in a codebook is structured, machine-actionable, and usable by computer software and databases.
Why a Codebook?
Creating a readable codebook to accompany your dataset can go a long way in making your data well understood and reusable well into the future. A codebook provides authoritative (straight-from-the-research) and citable information, as well as instructions on how to read, analyze, interpret, and verify data for accuracy and replication purposes.
To create a codebook, you need information about the study, files, and variables. (See the DDI Glossary for more information about study-level and variable-level metadata.) This information is sometimes provided by the researcher in a readme file (e.g., a text file), or embedded in a data file package (e.g., SPSS stores column names, labels, missing values, and notes).
Tip: Don’t worry if you don’t currently have any DDI or structured metadata to work with, as there are a number of tools that you can use to extract file- and variable-level metadata (such as the variable names, labels, categories, values, frequency counts, etc.) found in your completed dataset.
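To illustrate the kind of variable-level metadata such tools extract, here is a minimal Python sketch that reads variable names and frequency counts from a small CSV file. The sample data, the `summarize_variables` function, and the `<missing>` marker are all hypothetical and only for illustration. It does not reproduce how any of the tools listed on this page work.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for a completed dataset.
SAMPLE_CSV = """id,sex,region
1,1,north
2,2,south
3,2,north
4,,south
"""

def summarize_variables(csv_text):
    """Return {variable name: Counter of observed values} for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: Counter() for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            # Empty cells are counted under an explicit (assumed) missing marker.
            counts[name][value if value != "" else "<missing>"] += 1
    return counts

summary = summarize_variables(SAMPLE_CSV)
for name, freq in summary.items():
    print(name, dict(freq))
```

A plain CSV file carries no labels or missing value codes, so a sketch like this recovers only names, values, and counts. Statistical package formats such as SPSS can carry the richer metadata described above.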
Use Cases and Examples
Here is an example of a readable codebook in PDF format. This is the same codebook in machine-actionable DDI format (XML format). These files were generated using the DDI-Codebook standard, and were marked up using Nesstar Publisher.
Additional codebook examples using ICPSR data are also available for review.
Once you establish a workflow for codebook creation, you can easily attach codebooks to your published datasets in any data repository.
Scenario: “I have a complete dataset and I would like to create a codebook”
A codebook accompanies a dataset and helps end users interpret the data for replication and reuse. A DDI codebook provides metadata about the dataset and enables software tools to read the data correctly for display and further statistical analysis.
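As a rough sketch of what "reading the data appropriately" can mean, the Python example below parses a small XML fragment and uses its category labels to decode raw values. The fragment uses DDI-Codebook element names (`codeBook`, `dataDscr`, `var`, `catgry`, `catValu`, `labl`) but is heavily simplified: it omits the XML namespace and the required study-level sections, so it is an assumption-laden illustration and not a valid DDI instance. The `read_value_labels` function and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment for illustration only.
DDI_FRAGMENT = """
<codeBook>
  <dataDscr>
    <var name="sex">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
  </dataDscr>
</codeBook>
"""

def read_value_labels(xml_text):
    """Return {variable name: {category value: category label}}."""
    root = ET.fromstring(xml_text)
    labels = {}
    for var in root.iter("var"):
        labels[var.get("name")] = {
            cat.findtext("catValu"): cat.findtext("labl")
            for cat in var.findall("catgry")
        }
    return labels

value_labels = read_value_labels(DDI_FRAGMENT)

# Decode raw codes from a (hypothetical) data column into readable labels.
raw_values = ["1", "2", "2"]
decoded = [value_labels["sex"].get(v, v) for v in raw_values]
print(decoded)
```

Real DDI files declare an XML namespace, so element lookups would need to account for it.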
Several tools can generate the DDI metadata needed for a structured codebook.
Tip: If you have data in SPSS, SAS, or Stata format and want to produce DDI from those packages, there are tools to help you. Note that these tools provide only the metadata carried in the statistical packages and not the actual data.
Note: This is not meant to be a comprehensive list of tools available for DDI; instead this represents a list of enterprise-ready software that is used by a variety of institutions and organizations that manage and produce data for reuse. Of course, these tools may not be relevant for all use cases. Anyone can create and use the DDI standard and XML schemas to develop new tools. For more information about using DDI from scratch, please review the guidance DDI for Developers.
Codebook Creation Tools
Nesstar Publisher - Download
Nesstar Publisher can produce PDF Codebooks and DDI-Codebook Version 1.2 XML
- Nesstar Publisher v4.0 User Guide
- Quick Guide (for <odesi>, produced by A. Cooper and J. Fry)
- Video Guides (for <odesi>, produced by Carleton University)
Colectica - Download
Colectica can produce PDF Codebooks and DDI-Lifecycle Version 3.2 XML
- Guide to Creating a PDF Codebook
- Video Guides (see ‘Publish Documentation’)
DDIEditor - Download
DDIEditor can produce DDI-Lifecycle XML
- User Guide (technical documentation)
Export and Transfer Tools for DDI-XML
StatTransfer - Download
StatTransfer can export various formats to DDI-Lifecycle Version 3.1 XML
Sledgehammer - Download
Sledgehammer can export various formats to DDI-Codebook and DDI-Lifecycle XML
If you have existing DDI metadata in XML format and you want to display it as a readable codebook in PDF or HTML format, you can use existing DDI style sheets to do so.
Style Sheets (DDI-XSL tools)
DDI-Lifecycle Version 3.1 Converter (using XSLT Stylesheets)
Convert DDI 3.1 XML to MARC-XML, DDI-Codebook XML, DataCite 2.2 XML, PDF, XHTML (DDA), XHTML (generic)
Generic DDI-Codebook Stylesheet to HTML (ICPSR)
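The style sheets above use XSLT to turn DDI XML into a readable display. The Python sketch below shows the same idea in miniature, but it is not XSLT and is not one of those style sheets: it uses only the standard library to render a simplified, namespace-free fragment as an HTML table. The fragment, the `render_codebook_html` function, and the HTML layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified, namespace-free fragment for illustration only.
DDI_FRAGMENT = """
<codeBook>
  <dataDscr>
    <var name="sex">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
  </dataDscr>
</codeBook>
"""

def render_codebook_html(xml_text):
    """Render each variable and its categories as a simple HTML table."""
    root = ET.fromstring(xml_text)
    parts = ["<html><body>"]
    for var in root.iter("var"):
        name = escape(var.get("name", ""))
        label = escape(var.findtext("labl", default=""))
        parts.append(f"<h2>{name}: {label}</h2>")
        parts.append("<table><tr><th>Value</th><th>Label</th></tr>")
        for cat in var.findall("catgry"):
            value = escape(cat.findtext("catValu", default=""))
            text = escape(cat.findtext("labl", default=""))
            parts.append(f"<tr><td>{value}</td><td>{text}</td></tr>")
        parts.append("</table>")
    parts.append("</body></html>")
    return "\n".join(parts)

html_page = render_codebook_html(DDI_FRAGMENT)
print(html_page)
```

For real DDI files, an existing XSLT style sheet is the better choice, since it already handles the full schema and its namespaces.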
Other Conversion Software
XPDF - Download
Convert PDF file (Codebook) to ASCII text