Getting Started with DDI-Lifecycle (DDI-L)
This document provides a basic introduction on how to make use of the DDI-Lifecycle specification. DDI-Lifecycle supports a number of applications, and this page focuses on common content needed to support the referencing and reuse of metadata. It is assumed here that you've had some experience with XML; if not, consult our FAQ.
DDI-Lifecycle expands on the idea of DDI-Codebook in terms of content coverage, depth, metadata management over time, reusable metadata, and support for the planning, capture, processing, storage, discovery and dissemination of research data. DDI-Lifecycle is the most comprehensive of the DDI products, covering conceptual and methodological objects, processing, quantitative and qualitative data objects, and data management. It is appropriate for longitudinal, linked, and other complex datasets. For a more complete description of DDI-Lifecycle and access to current and past versions, see DDI-Lifecycle.
The markup task may appear daunting at first, so we have broken the process down into discrete steps. If you have questions or would like some advice on your specific situation, please don't hesitate to get in touch with us.
If you're curious to see how other organizations make use of DDI XML, see DDI Implementations. Also see the specific applications listed under Getting Started. DDI-Lifecycle supports a range of applications beyond a standard codebook and uses specialized subsets of metadata to support them. The Products/Overview of Products page summarizes the content covered by the DDI Suite. In many areas, this exceeds the coverage of many non-DDI standards.
- Step One: Review the Tags
- Step Two: Mark Up and Validate Documents
- Step Three: Select and Implement Display Software
- Step Four: Test Marked Up Documents
Step One: Review the Tags
Step One is all about organizing information, and for this kind of task it may be useful to consult a librarian or an archivist who is familiar with metadata.
You may also find it helpful to review some markup examples for specific content.
First, you need to determine what information will be recorded in your DDI instances. The DDI features over 300 different tags (most of them optional), but it's unlikely that anyone would make use of all of them. Become familiar with the current high-level technical documentation and best practice documents. These provide essential insight into the structures within DDI-Lifecycle that support internal coherence and the reuse of metadata both within and outside of a single instance. Critical features include:
- Identification
- Versioning
- Reference
- Controlled Vocabulary use
The DDI has ten main schemas and five specialized schemas (each schema has a specific namespace and all can be integrated within a single instance). Refer to the Glossary for terminology definitions.
- Instance -- This is an external wrapper for the published instance. It supports the creation of an instance that is intended to be treated as a document, as well as the publishing of discrete pieces of metadata created for the purpose of transfer or independent management.
- Study Unit -- This is a primary schema intended to pull together the material contained by a traditional study level codebook.
- Group -- This schema supports the grouping of study units, resource packages, and local holding material into series, multi-study collections, or related resource collections.
- Logical Product -- This schema contains information pertaining to the description of a variable. It supports the "variable cascade" approach, differentiating between conceptual variables, represented variables, and instance variables, and describes how they are enumerated, organized into multi-dimensional structures (NCubes), and related (variables within a logical record, unique record identification, and how to relate one record type to another within and across studies).
- Physical Data Product -- This set of schemas provides a means of linking variables to specific storage layouts. The specialized schemas were intended to support the description of different storage types, in particular the way different storage types reference storage locations (character location, array order, name, etc.). The set includes physicaldataproduct, dataset, physicaldataproduct_ncube_inline, physicaldataproduct_ncube_normal, physicaldataproduct_ncube_tabular, and physicaldataproduct_proprietary.
- Physical Instance -- This schema describes and defines a specific data file/storage entity. It includes a citation for the data file, access information, and variable statistics.
- Archive -- This schema contains elements whose content may vary by managing/archiving agency. These include definitions of access rules, collection organization, agent (organization, individual) identification, lifecycle events, data quality, and archive specific information.
- Comparative -- This schema supports capturing information on the similarities and differences between two or more elements of the same type such as concepts, universes, questions, variables, categories, etc.
- Conceptual -- This schema contains the description of concepts and their subsidiary types and organization into groups and schemes.
- DDI Profile -- This schema is a means of defining which elements are used or not used by an application (note this is only for use in XML). It allows constraining cardinality (requiring an object or limiting its use to a single instance) and specifying content (requiring the use of a specific controlled vocabulary, or fixed content).
- Reusable -- This schema contains objects that have broad usage across the various schemas, such as identification, reference, code values (controlled vocabulary usage), name, label, description, date, subject, keyword, etc.
Note that with DDI-Lifecycle v4.0, the multi-schema approach will change to a single schema approach. As of v3.3, elements outside of the physical data product set have unique names (not duplicated in another DDI-Lifecycle namespace).
If you have existing metadata records, the process of defining the fields you want to use is somewhat easier, as you can just look at what was recorded on the old records. At this stage, you should generate a simple list of metadata fields you used, with a brief explanation of each. Note that DDI terminology may differ from that of the metadata system you use. Compare the definitions of fields, using the Glossary as a guide to identifying the appropriate DDI-Lifecycle term. If you are unable to identify an appropriate field, submit a question using the Ask an Expert link before creating a new user defined field.
Again, you're just creating a template that includes the kinds of information you wish to record about the dataset. If you want this information to display in your DDI instance or codebook, it needs to be included on this list.
Next you need to find out what DDI tags those fields map to. This is best done by familiarizing yourself with the field level documentation of the version you are using. Essentially, you're attempting to build a mapping document that will lay out the structure and content of your XML. Pay careful attention to whether or not a field is optional or repeatable, as this will affect your XML.
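A mapping document can be as simple as a table with one row per local field. The sketch below shows one possible way to hold such a table in Python and check a record against it. The local field names, the target element paths, and the required/repeatable flags are hypothetical placeholders; verify each against the field-level documentation for the version you are using.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    local_field: str   # name used in your existing metadata records
    ddi_element: str   # target DDI element (placeholder, verify in the docs)
    required: bool     # False means the target element is optional
    repeatable: bool   # True means the element may occur more than once


# Hypothetical example rows; replace with your own fields and verified targets.
MAPPING = [
    FieldMapping("title", "Citation/Title", required=True, repeatable=False),
    FieldMapping("keywords", "Coverage/TopicalCoverage/Keyword",
                 required=False, repeatable=True),
    FieldMapping("abstract", "Abstract", required=False, repeatable=False),
]


def check_record(record: dict, mapping=MAPPING) -> list:
    """Return a list of problems found when a record is compared to the mapping."""
    problems = []
    known = {m.local_field for m in mapping}
    for m in mapping:
        value = record.get(m.local_field)
        if value in (None, "", []):
            if m.required:
                problems.append(f"missing required field: {m.local_field}")
            continue
        # A list with several values only fits a repeatable element.
        if isinstance(value, list) and len(value) > 1 and not m.repeatable:
            problems.append(f"field is not repeatable: {m.local_field}")
    # Fields with no mapping row would be silently lost in the XML.
    for name in record:
        if name not in known:
            problems.append(f"unmapped field: {name}")
    return problems
```

Running `check_record` over a sample of existing records before any XML is generated is a cheap way to find fields that are missing from the mapping or that carry multiple values where the target element allows only one.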
Determine how documentation will be created
Transferring content:
If you are transferring existing metadata in another format into DDI-Lifecycle, you will need to map your content to DDI elements. Note that DDI-Lifecycle breaks out specific sub-elements to an extent beyond that found in DDI-Codebook and many other metadata structures. This may require pre-editing your current metadata for consistency, and some areas will require sorting current metadata into multiple elements or objects. Check with the DDI community (join DDI User's list) to see if others have already mapped from your current metadata structure into DDI-Lifecycle. In addition to reusable mappings, they may be able to provide specific insights and suggestions for your particular situation. Your goal will probably be to automate the transfer of as much content as possible and then use hand entry to deal with problematic sections.
Hand entry:
Although DDI-Lifecycle instances can be hand-crafted, you will want to use an entry tool if one is available. See the Tools listing to identify available entry software. If you are doing hand entry without an entry tool, we recommend using an XML editor to assist in the appropriate entry of elements and attributes. Low-tech approaches for small amounts of metadata can include hand-created clip libraries that provide a set structure to use in entering content.
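As an illustration of the clip-library idea, the sketch below keeps one reusable snippet as a Python template and fills it with escaped content. The `Category`, `Label`, and `Description` element names are simplified placeholders without namespaces, not a complete DDI structure, and `fill_clip` is a hypothetical helper.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# One "clip": a fixed structure with named slots for the content.
CLIP = Template(
    "<Category><Label>$label</Label>"
    "<Description>$description</Description></Category>"
)


def fill_clip(label: str, description: str) -> str:
    """Fill the clip, escaping &, < and > so the result stays well-formed."""
    xml_text = CLIP.substitute(
        label=escape(label),
        description=escape(description),
    )
    ET.fromstring(xml_text)  # raises ParseError if the result is not well-formed
    return xml_text


print(fill_clip("Yes & No", "Respondent answered yes or no"))
```

The well-formedness check here is not schema validation; it only catches broken markup introduced while editing the clip.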
Be sure to validate any template or test examples in an XML validation tool. Periodic validation will help identify any errors introduced when making alterations or updates in templates or content usage.
Validating your XML should be relatively easy. Conventional XML editors include validation utilities, and if your server is set up to handle XML, you probably have server-side validation utilities installed already. Note that, due to the structure of DDI-Lifecycle, standard schema validation will not catch errors such as duplicated identifiers or invalid internal or external references. Secondary validation tools are required for these checks. Check in Tools or submit a query to the DDI community.
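To give a sense of what a secondary check does, here is a minimal sketch that looks for duplicated identifiers and for references that point at no identifier in the same file. It rests on several assumptions: that identifiers sit in an `ID` child element in the reusable namespace, that the namespace URI is `ddi:reusable:3_3`, and that referencing elements have names ending in `Reference`. It also ignores agency and version, and it cannot tell a broken reference from a legitimate external one, so it is no substitute for the tools listed under Tools. The sample XML is deliberately simplified and is not a valid DDI instance.

```python
import xml.etree.ElementTree as ET
from collections import Counter

R_NS = "ddi:reusable:3_3"   # assumed namespace URI; adjust for your version
ID_TAG = f"{{{R_NS}}}ID"


def check_identifiers(xml_text: str) -> dict:
    """Report identifiers defined more than once and references with no target."""
    root = ET.fromstring(xml_text)   # raises ParseError if not well-formed
    defined = Counter()
    referenced = set()
    for parent in root.iter():
        id_child = parent.find(ID_TAG)   # looks at direct children only
        if id_child is None or not (id_child.text or "").strip():
            continue
        ident = id_child.text.strip()
        local_name = parent.tag.rsplit("}", 1)[-1]
        if local_name.endswith("Reference"):
            referenced.add(ident)        # an ID inside a reference element
        else:
            defined[ident] += 1          # an ID that identifies an object
    return {
        "duplicates": sorted(i for i, n in defined.items() if n > 1),
        "unresolved": sorted(referenced - set(defined)),
    }


SAMPLE = """<Fragment xmlns:r="ddi:reusable:3_3">
  <Concept><r:ID>c1</r:ID></Concept>
  <Concept><r:ID>c1</r:ID></Concept>
  <Variable>
    <r:ID>v1</r:ID>
    <r:ConceptReference><r:ID>c2</r:ID></r:ConceptReference>
  </Variable>
</Fragment>"""

print(check_identifiers(SAMPLE))
```

For the sample, the check reports `c1` as duplicated and `c2` as unresolved. Both errors would pass ordinary schema validation, which is why a second pass of this kind is needed.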
Step Two: Mark Up Documents
Step Two is more of a technical process. At this stage, you may want to consult a programmer, or someone who's proficient in XML, because this stage is all about coding (tagging). There may be ways to automate your markup, depending on your source materials. This step really depends upon the materials you are working with and what your ultimate goal for markup is.
The first thing you need to do is determine what software you're going to use to generate XML. If you're creating new documentation or if you're working from an unformatted text/Word document that's not suitable for text processing, then you may want to purchase an XML editor and begin the process of tagging individual documents, using your template. If your source document has a regular format, you may use text processing to insert DDI tags around the appropriate content. If your metadata is in a database or some sort of delimited format, then you'll most likely want to have a programmer build a script that generates XML from your original metadata files. DDI XML is generally produced in UTF-8. Use of proprietary import documents may introduce non-ASCII-Latin or non-UTF-8 characters. It is advisable to clean your input prior to transformation. Locating and correcting this type of error can be difficult and time-consuming after transfer.
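For metadata held in a delimited file, a generation script might look like the sketch below. It decodes the input strictly as UTF-8 so that encoding problems surface before transformation, then writes one element per row. The column names (`name`, `label`) and the `VariableList`, `Variable`, `Name`, and `Label` elements are hypothetical stand-ins; a real script would emit the namespaced DDI elements identified in your mapping document.

```python
import csv
import io
import xml.etree.ElementTree as ET


def decode_utf8_strict(raw: bytes) -> str:
    """Decode bytes as UTF-8, failing loudly instead of guessing."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        raise ValueError(f"input is not valid UTF-8 at byte {err.start}") from err


def rows_to_xml(csv_text: str) -> bytes:
    """Build a small XML document with one <Variable> element per CSV row."""
    root = ET.Element("VariableList")           # hypothetical wrapper element
    for row in csv.DictReader(io.StringIO(csv_text)):
        var = ET.SubElement(root, "Variable")   # hypothetical, not the DDI schema
        ET.SubElement(var, "Name").text = row["name"].strip()
        ET.SubElement(var, "Label").text = row["label"].strip()
    # ElementTree escapes &, < and > in text and declares the output encoding.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


raw = "name,label\nAGE,Age in years\nINC,Income & benefits\n".encode("utf-8")
print(rows_to_xml(decode_utf8_strict(raw)).decode("utf-8"))
```

Building the tree with an XML library rather than concatenating strings means special characters in the source data are escaped automatically, and rejecting undecodable input up front avoids having to hunt for stray characters in the generated files later.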
Step Three: Select and Implement Display Software
Step Three may involve both a systems/server administrator and an XML person.
In order to render XML into an attractive, understandable document, you'll need a stylesheet to dictate display, and you will need software to read the stylesheet (XSLT document) and render the XML accordingly. Check in Tools or submit a query to the DDI community to locate any existing stylesheet that meets your needs.
For consistent display, you'll want to take advantage of a server-side solution, which should be installed on your Web server by the individual responsible for such things. Be aware that this kind of install is seldom a simple out-of-the-box thing. Apache Cocoon, which is free software, may be used for this.
Step Four: Testing
At this point, you have the XML files validated and sitting in the appropriate folder on your server, as well as your XSLT file(s), if applicable. Your sysadmin has installed the necessary software, and you're ready to begin testing. With any luck, your files will display and the only changes you'll need to make are visual changes dictated by the XSLT.