Overview of Current Products
The Data Documentation Initiative (DDI) is a suite of products for describing quantitative and qualitative research data in the social, behavioral, economic, and health sciences. The DDI suite is a set of free standards for documenting and managing the different stages of the research data lifecycle, including conceptualization, collection, processing, distribution, discovery, and archiving.
DDI covers the following content areas:
- Conceptual objects: concept, unit, unit type, universe, population, geographic structures, and representation
- Methodological objects: approaches to sample selection, data capture, weighting, quality control, and process management
- Processing: data capture, data processing, analysis, and data management
- Quantitative and qualitative data objects: concept, universe, representation, usage, data type, record, record relationships, storage, access, and descriptive statistics
- Data management: ownership, access, rights management, restrictions, quality standards, organization, agent management, relationship between products, versioning, and provenance
Products within the DDI suite differ in their area of coverage within DDI, the activities they support, and the level of infrastructure they require. From simple descriptive content for human understanding to structures that support metadata-driven statistics production and analysis, DDI addresses a broad range of data management needs. As a suite of standards, DDI provides a common means of identification for information objects, support for common cross-product content, and an informed means of transforming content between products.
Current DDI Products
[DDI also has one or more products under development. Descriptions of those products are found here.]
DDI-Codebook - Structured, descriptive documentation of the content, meaning, provenance, and access for a single data set.
DDI-Lifecycle - Expands on the idea of DDI-Codebook in terms of content coverage, depth, metadata management over time, reusable metadata, and support for the planning, capture, processing, storage, discovery and dissemination of data. It allows grouping and comparing related studies or series of studies.
DDI-CDI - DDI Cross-Domain Integration (DDI-CDI) is intended to fill the emerging need for integration of data from different disciplinary domains. It is designed to connect disparate forms of data and metadata, whether they are described in DDI-Codebook, DDI-Lifecycle, or in any other fashion.
Controlled Vocabularies - A set of controlled vocabularies commonly used in social science and other disciplines to support systems designed to identify, locate, and access data for research purposes.
XKOS - Extended Knowledge Organization System (XKOS) leverages the Simple Knowledge Organization System (SKOS) for managing statistical classifications and concept management systems. XKOS adds the extensions that are needed to meet the requirements of the statistical community.
SDTL - Structured Data Transformation Language (SDTL) is an independent intermediate language for representing data transformation commands.
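DDI-Codebook documents are serialized as XML, so the descriptive content for a single data set can be read with ordinary XML tooling. The following is a minimal sketch rather than a complete or validated codebook: the element names follow DDI-Codebook conventions, but the namespace declaration and most required content are omitted for brevity, and the study title, variable names, and the `summarize_codebook` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only. A real DDI-Codebook file declares a namespace
# and carries far more content; the values here are invented.
CODEBOOK_XML = """
<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Example Household Survey</titl>
      </titlStmt>
    </citation>
  </stdyDscr>
  <dataDscr>
    <var name="AGE">
      <labl>Age of respondent in years</labl>
    </var>
    <var name="SEX">
      <labl>Sex of respondent</labl>
    </var>
  </dataDscr>
</codeBook>
"""


def summarize_codebook(xml_text):
    """Return the study title and a mapping of variable name to label."""
    root = ET.fromstring(xml_text.strip())
    # Paths are relative to the root <codeBook> element.
    title = root.findtext("stdyDscr/citation/titlStmt/titl")
    variables = {
        var.get("name"): var.findtext("labl")
        for var in root.iterfind("dataDscr/var")
    }
    return title, variables


title, variables = summarize_codebook(CODEBOOK_XML)
print(title)
for name, label in variables.items():
    print(f"{name}: {label}")
```

A namespaced file would need the namespace supplied in each path, so treat the bare paths above as a simplification.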
Product | Description | Supports Activities | Points of Contact with other DDI Products | Available metadata syntax representations |
---|---|---|---|---|
DDI-Codebook | Originally developed as an XML DTD, Codebook retains the hierarchical structure of a DTD in describing the contents of a descriptive codebook for a data set, including: identification, authorship, ownership, purpose, background methodologies, source information, provenance, quality control, access, physical file structures, variables/variable groupings, and related materials. Extensive information is found within the variable description, covering the data source, derivation activity, representation, data typing, variable role, and restrictions. Content coverage: Codebook covers all major content areas but is, in general, limited to descriptive narrative. | | | |
DDI-Lifecycle | DDI-Lifecycle expands on the idea of DDI-Codebook in terms of content coverage, depth, metadata management over time, reusable metadata, and support for the planning, capture, processing, storage, discovery and dissemination of research data. DDI-Lifecycle is the most comprehensive of the DDI products covering conceptual and methodological objects, processing, quantitative and qualitative data objects, and data management. Lifecycle is appropriate for longitudinal, linked, and other complex datasets. | | | |
DDI-CDI | The new DDI Cross-Domain Integration (DDI-CDI) is an application of the model that emerged from many years of work on a "next generation" DDI specification. It is designed to be a model that can connect disparate forms of data with each other, whether they are described in DDI-Codebook, DDI-Lifecycle, or in any other fashion. As such, it can be used to integrate new forms of data with more traditional, existing data, or with each other. The specification must be able to describe new forms of data and to be implemented in a wider range of technologies. Ultimately, the diverse types of data must be seen as an integrated whole, complete with a description of the structure, meaning, and provenance of each part. DDI-CDI is intended to meet this need. | | | |
Controlled Vocabularies | A set of controlled vocabularies commonly used in social science research. They reflect the use of controlled vocabularies to support systems designed to identify, locate, and access data for research purposes. Content coverage is driven by the needs of the DDI community, but use is not limited to this community. | | | |
XKOS | XKOS extends Simple Knowledge Organization System (SKOS) for the needs of statistical classifications. It does so in two main directions. First, it defines a number of terms that enable the representation of statistical classifications with their structure and textual properties, as well as the relations between classifications. Second, it refines SKOS semantic properties to allow the use of more specific relations between concepts. Those specific relations can be used for the representation of classifications or for any other case where SKOS is employed. XKOS adds the extensions that are desirable to meet the requirements of the statistical community. | | | |
SDTL | Structured Data Transformation Language (SDTL) is an independent intermediate language for representing data transformation commands. Statistical analysis packages (e.g., SPSS, Stata, SAS, and R) provide similar functionality, but each one has its own proprietary language. SDTL consists of JSON schemas for common operations, such as RECODE, MERGE FILES, and VARIABLE LABELS. SDTL provides machine-actionable descriptions of variable-level data transformation histories derived from any data transformation language. Provenance metadata represented in SDTL can be added to documentation in DDI and other metadata standards. | | | |
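To make the idea of a package-independent, machine-actionable transformation description more concrete, the sketch below expresses a recode operation as JSON and applies it to a few records. It is only an illustration of the concept: the keys (`command`, `sourceVariable`, `targetVariable`, `rules`) and the `apply_recode` helper are hypothetical and are not taken from the official SDTL schemas.

```python
import json

# Hypothetical command description. These keys are NOT the official SDTL
# schema; they only illustrate a language-neutral record of a recode step.
RECODE_JSON = """
{
  "command": "Recode",
  "sourceVariable": "age",
  "targetVariable": "age_group",
  "rules": [
    {"from": 0, "to": 17, "value": 1},
    {"from": 18, "to": 64, "value": 2},
    {"from": 65, "to": 120, "value": 3}
  ]
}
"""


def apply_recode(command, rows):
    """Return copies of rows with the target variable derived from the source."""
    source = command["sourceVariable"]
    target = command["targetVariable"]
    recoded = []
    for row in rows:
        value = row[source]
        new_value = None  # stays None when no rule matches
        for rule in command["rules"]:
            if rule["from"] <= value <= rule["to"]:
                new_value = rule["value"]
                break
        # Build a new dict so the input rows are left unchanged.
        recoded.append({**row, target: new_value})
    return recoded


command = json.loads(RECODE_JSON)
rows = [{"age": 10}, {"age": 30}, {"age": 70}, {"age": 200}]
result = apply_recode(command, rows)
print([r["age_group"] for r in result])
```

Because the description is plain data rather than SPSS, Stata, SAS, or R syntax, the same record could in principle be stored alongside variable-level documentation as provenance metadata.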