DDI-CDI is a new standard which is designed to be used with research data from any domain. While it minimally describes metadata for cataloguing and citation, its fundamental purpose is to describe data and process. The specification is domain-neutral and covers the majority of data structures in common use today: Wide, Long, Multi-Dimensional and Key-Value. It offers, for the first time, a mechanism to interoperate disparate data from multiple disciplines and domains at the lowest level of granularity i.e. the datum itself. While it is designed to complement its siblings in the DDI Alliance Product Suite - DDI-Codebook and DDI-Lifecycle, which operate in the Social, Behavioral and Economic domain - it is also intended to work with a wide variety of other domain-specific and generic metadata specifications. Integration is a first-order consideration in DDI-CDI and so it is designed from the ground up to work well with controlled vocabularies from any domain as well as with other standards.

DDI-CDI has three main components. The first one supplies a rich set of foundational metadata for variables, classifications, and other concepts and representations. The second one describes data in rectangular (wide), long (event), multi-dimensional (cube), and no-SQL (big data) data formats. The third one describes process as the primary aspect of data provenance. 
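To make the relationship between these layouts concrete, the following minimal sketch holds the same values in wide, long, and key-value form using plain Python structures. It is an illustration only and not part of DDI-CDI: the variable names (`age`, `income`), the unit identifiers, and the helper functions `wide_to_long` and `long_to_key_value` are all assumptions made for the example.

```python
# Illustrative only: plain Python structures, not a DDI-CDI API.
# The variable names ("age", "income") and unit ids are invented.

# Wide (rectangular): one row per unit, one column per variable.
wide = [
    {"id": "p1", "age": 34, "income": 52000},
    {"id": "p2", "age": 41, "income": 61000},
]


def wide_to_long(rows, key="id"):
    """Turn each cell of a wide table into one (unit, variable, value) row."""
    return [
        {"unit": row[key], "variable": name, "value": value}
        for row in rows
        for name, value in row.items()
        if name != key
    ]


def long_to_key_value(rows):
    """Index every datum by a composite (unit, variable) key."""
    return {(r["unit"], r["variable"]): r["value"] for r in rows}


long_rows = wide_to_long(wide)
key_value = long_to_key_value(long_rows)

# The same datum is reachable in all three layouts.
assert wide[0]["age"] == 34
assert {"unit": "p1", "variable": "age", "value": 34} in long_rows
assert key_value[("p1", "age")] == 34
```

Whatever the layout, the individual value (the datum) stays the same; only the way it is addressed changes. That shared unit is what a datum-oriented description can build on.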

For the DDI suite of metadata standards, DDI-CDI provides a new and expanded focus. Interdisciplinary research brings challenges in establishing trust and transparency for the sources and combination of cross-domain data. Ultimately, these diverse types of data must be seen as an integrated whole for research outputs, complete with a description of the structure, meaning, and provenance of each part. DDI-CDI meets this need.

DDI-CDI is a new kind of specification, aimed both at supplementing existing metadata models and at serving a unique purpose in its own right. Its key features include:

  • Model-driven
  • Domain-independent
  • Datum-oriented data description
  • Provenance-focused: process description down to a datapoint level if required

DDI-CDI goes back to first principles and abstracts the foundational characteristics of different data structures. On this basis, it takes a “model-based” approach built on UML classes. For non-modellers, this simply means that DDI-CDI can be used in the format of your choice, whether you prefer XML, JSON, or other implementation syntaxes.
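As a rough illustration of what "model-based" means in practice, the sketch below defines one small class and writes the same instance out as both JSON and XML. Everything in it is hypothetical: `VariableSketch`, its fields, and the two serializer functions are invented for this example and do not reproduce any DDI-CDI class or official encoding.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class VariableSketch:
    """Hypothetical stand-in for a modelled class; not a DDI-CDI class."""
    name: str
    label: str
    datatype: str


def to_json(obj):
    """Serialize the instance's fields as a JSON object."""
    return json.dumps(asdict(obj), sort_keys=True)


def to_xml(obj):
    """Serialize the same fields as child elements of one XML element."""
    root = ET.Element(type(obj).__name__)
    for field_name, value in asdict(obj).items():
        ET.SubElement(root, field_name).text = str(value)
    return ET.tostring(root, encoding="unicode")


age = VariableSketch(name="age", label="Age in years", datatype="integer")
json_text = to_json(age)
xml_text = to_xml(age)

# Both syntaxes carry the same content because both derive from one model.
assert json.loads(json_text)["label"] == ET.fromstring(xml_text).find("label").text
```

The point of the sketch is the design choice, not the code: the class definition is the single source of truth, and each syntax is just one rendering of it.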

The UML Model

The core of DDI-CDI is a model described using the Unified Modeling Language (UML). It is expressed in Canonical XMI, an exchange format for UML models which has been tested to work with many different UML tools. The subset of UML features conforms to the UML Class Model Interoperable Subset (UCMIS) guidelines, which further constrain the features to guarantee greater interoperability.

Informational Documentation

The DDI-CDI specification comes with an extensive overview document and browsable field-level documentation. The field-level documentation also provides information about the XMI description and the syntax representations for the XML and RDF encodings.

Encodings 

The current encodings and syntax representations provided for the candidate release 2 package are:

Markup Examples 

License

DDI-CDI is free software: you can redistribute it and/or modify it under the terms of the Creative Commons Attribution 4.0 International license. Other DDI documents are distributed under the same Creative Commons license.
 

Credits and Acknowledgements

Future Work

Development of DDI-CDI is managed by the Cross Domain Integration (CDI) Working Group. The work of the group can be found on the Cross Domain Integration (CDI) Working Group page on the DDI Confluence site.