Content
Description
DDI-CDI is a new standard which is designed to be used with research data from any domain. While it minimally describes metadata for cataloguing and citation, its fundamental purpose is to describe data and process. The specification is domain-neutral and covers the majority of data structures in common use today: Wide, Long, Multi-Dimensional and Key-Value. It offers, for the first time, a mechanism to interoperate disparate data from multiple disciplines and domains at the lowest level of granularity i.e. the datum itself. While it is designed to complement its siblings in the DDI Alliance suite - DDI Codebook and DDI Lifecycle, which operate in the Social, Behavioral and Economic domain - it is also intended to work with a wide variety of other domain-specific and generic metadata specifications. Integration is a first-order consideration in DDI-CDI and so it is designed from the ground up to work well with controlled vocabularies from any domain as well as with other standards.
DDI-CDI has three main components. The first one supplies a rich set of foundational metadata for variables, classifications, and other concepts and representations. The second one describes data in rectangular (wide), long (event), multi-dimensional (cube), and no-SQL (big data) data formats. The third one describes process as the primary aspect of data provenance.
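To make the idea of the same datum appearing in different data structures more concrete, here is a minimal sketch in plain Python. It is not DDI-CDI itself: the sample records, the column names (`person_id`, `height_cm`, `weight_kg`) and the helper functions are all assumptions made up for illustration. It only shows how one value can be located in a wide, a long, and a key-value layout.

```python
# Hypothetical example data: two variables measured for two people (wide layout).
wide = [
    {"person_id": 1, "height_cm": 170, "weight_kg": 65},
    {"person_id": 2, "height_cm": 182, "weight_kg": 80},
]


def wide_to_long(rows, key):
    """Return one row per (unit, variable, value) triple (long layout)."""
    return [
        {key: row[key], "variable": name, "value": value}
        for row in rows
        for name, value in row.items()
        if name != key
    ]


def long_to_key_value(rows, key):
    """Flatten long rows into a dict keyed by 'unit/variable' (key-value layout)."""
    return {f"{row[key]}/{row['variable']}": row["value"] for row in rows}


long_rows = wide_to_long(wide, "person_id")
key_value = long_to_key_value(long_rows, "person_id")

# The same datum (person 2's weight) is reachable in all three layouts.
assert wide[1]["weight_kg"] == 80
assert {"person_id": 2, "variable": "weight_kg", "value": 80} in long_rows
assert key_value["2/weight_kg"] == 80
```

The layouts differ only in where the identifying information lives (column name, row content, or key), which is why a description at the level of the individual datum can bridge them.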
For the DDI suite of metadata standards, DDI-CDI provides a new and expanded focus. Interdisciplinary research brings challenges in establishing trust and transparency for the sources and combination of cross-domain data. Ultimately, these diverse types of data must be seen as an integrated whole for research outputs, complete with a description of the structure, meaning, and provenance of each part. DDI-CDI meets this need.
Features
DDI-CDI is a new kind of specification, aimed both at supplementing existing metadata models and at serving a purpose of its own. Its key features include:
- Model-driven
- Domain-independence
- Datum-oriented data description
- Provenance-focused: process description down to the level of an individual data point if required
DDI-CDI goes back to first principles and abstracts the foundational characteristics of different data structures. On this basis, it takes a “model-based” approach built on UML classes. For non-modellers, this simply means that DDI-CDI can be used in the format of your choice, whether you prefer XML, JSON, or other implementation syntaxes.
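As a rough illustration of what a model-based approach means in practice, the sketch below defines one class and derives both a JSON and an XML rendering from it. The `Variable` class, its fields, and the two helper functions are hypothetical stand-ins, not actual DDI-CDI classes or the official syntax representations; the point is only that the structure is defined once and the syntax is a choice made afterwards.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class Variable:
    """Hypothetical stand-in for a class in a UML model (not a real DDI-CDI class)."""
    name: str
    label: str
    datatype: str


def to_json(obj):
    """Render the object as JSON, wrapped in its class name."""
    return json.dumps({type(obj).__name__: asdict(obj)}, sort_keys=True)


def to_xml(obj):
    """Render the same object as XML, one child element per field."""
    root = ET.Element(type(obj).__name__)
    for field_name, value in asdict(obj).items():
        ET.SubElement(root, field_name).text = str(value)
    return ET.tostring(root, encoding="unicode")


height = Variable(name="height_cm", label="Height in centimetres", datatype="integer")
json_text = to_json(height)
xml_text = to_xml(height)

# Both renderings carry the same content, because both derive from one model.
assert json.loads(json_text)["Variable"]["label"] == "Height in centimetres"
assert ET.fromstring(xml_text).find("label").text == "Height in centimetres"
```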
The UML Model
The core of DDI-CDI is a model described using the Unified Modeling Language (UML). It is expressed in Canonical XMI, an exchange format for UML models which has been tested to work with many different UML tools. The subset of UML features conforms to the UML Class Model Interoperable Subset (UCMIS) guidelines, which further constrain the features to guarantee greater interoperability.
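Because XMI is XML, a model file can be inspected with ordinary XML tooling. The following is a minimal sketch that lists the UML classes in an XMI document using Python's standard library. The embedded fragment is hand-written and hypothetical (the class names are invented and do not come from the DDI-CDI model), and the namespace URIs are assumed to be the OMG 2013 ones; check them against the actual file before relying on this. The helper accepts a class name given either as a child element or as an XML attribute, since serializations vary.

```python
import xml.etree.ElementTree as ET

# Assumed XMI namespace URI; verify against the file you are reading.
XMI_NS = "http://www.omg.org/spec/XMI/20131001"

# Hypothetical, hand-written fragment; the names are illustrative only.
SAMPLE_XMI = """<xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
         xmlns:uml="http://www.omg.org/spec/UML/20131001">
  <uml:Model xmi:id="model" xmi:type="uml:Model">
    <name>ExampleModel</name>
    <packagedElement xmi:id="c1" xmi:type="uml:Class">
      <name>ExampleVariable</name>
    </packagedElement>
    <packagedElement xmi:id="c2" xmi:type="uml:Class" name="ExampleDatum"/>
    <packagedElement xmi:id="d1" xmi:type="uml:DataType">
      <name>ExampleType</name>
    </packagedElement>
  </uml:Model>
</xmi:XMI>"""


def class_names(xmi_text):
    """Return the names of all packaged elements typed as uml:Class."""
    root = ET.fromstring(xmi_text)
    names = []
    for element in root.iter("packagedElement"):
        if element.get(f"{{{XMI_NS}}}type") != "uml:Class":
            continue
        # Accept the name as either a child element or an XML attribute.
        names.append(element.findtext("name") or element.get("name"))
    return names


found = class_names(SAMPLE_XMI)
assert found == ["ExampleVariable", "ExampleDatum"]
```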
Informational Documentation
The DDI-CDI specification includes an extensive overview document and browsable field-level documentation, which also provides information about the XMI description and the syntax representations for XML and RDF encodings.