Oct 4, 2022 1:15 PM Eastern Daylight Time
We are now witnessing the emergence of FAIR data-sharing mechanisms in many areas, with the focus having shifted from the "what" to the "how" in many organizations. In many domains, there are a number of common standards – some of which apply equally across domains, and some of which are specific to the data, processes, and practices within that domain. The challenge of FAIR data sharing – ubiquitous, automated reuse of data and metadata – is particularly acute across domain and infrastructure boundaries, demanding a change in how data are described.
To meet this challenge, it is important to first understand how the different standards and models used to describe data can be employed, so that they speak not only to traditional users, but also to users coming from other domains. One major development in this area is the idea of a FAIR Digital Object Framework (FDOF), where information – both data and metadata – of interest for the discovery and reuse of data can be identified and obtained. The FDOF represents an initial step, but does not address many of the practical issues of interoperability. We must look at the intersection of standards of different types and how they fit into this picture: the idea that every FAIR resource is implemented according to an entirely new set of technical standards is not realistic. The FDOF serves as an agreed way to obtain needed FAIR resources and to learn enough about them to understand some related resources (e.g., metadata schemas) at the level of a protocol. It is not sufficient on its own to produce interoperability, which will require an ability to actually understand the metadata schemas being used. When it comes to standards, some parts of FAIR are better supported than others.
Discovery of FAIR resources increasingly relies on standards and approaches which are widely adopted, and often much the same across domains and institutional boundaries. Cataloguing metadata based on DCAT, Schema.org, and Dublin Core is commonly found in many areas. For other aspects of FAIR, however, this degree of domain-agnostic standardization does not exist. Semantics and vocabularies are often deeply domain-dependent, and other important types of metadata needed for effective reuse – structural metadata, provenance, etc. – are also seen in many different forms, reflecting domain practice. Within any given domain, the standards requiring support may be well understood and limited in number. The same cannot typically be said when data from other domains are the target of reuse. If we are to make use of the FDOF as intended, we need a second tier of domain-agnostic standards which makes this profusion of models, schemas, etc. tractable. Such a second tier should be developed as a mechanism for domain-specific standards to be more easily exchanged and transformed. Technical standards such as RDF, JSON, and XML may provide a useful foundation, but they are not themselves sufficient.
Standard vocabularies and models which are understandable across domains provide an additional needed layer of interoperability. One good example of this is SKOS: many domains use concept systems of different types. If they are described in SKOS, they can at least be exchanged and processed in a coherent way across domain boundaries, even if the specifics of the concepts themselves need further attention. The EOSC Interoperability Framework introduced this idea of a leveled hierarchy of standards, and it is a useful way to understand what a practical approach to interoperability looks like as we progress from the universal toward the domain- and community-specific. This session presents the requirements which lead us to a middle tier of domain-agnostic standards in support of the FDOF, and proposes some candidates for consideration based on implementations and explorations to date. Some examples of such standards are provided, showing how they can work together to provide the complete information set needed to reuse data in a FAIR data-sharing scenario across domain and institutional boundaries.
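The SKOS point can be illustrated with a minimal sketch. It assumes two invented concept schemes (occupations and soil types, with made-up `ex:` identifiers) held as plain subject-predicate-object triples; only the SKOS namespace and the `prefLabel` and `broader` properties come from the SKOS vocabulary itself. The helper `broader_path` is hypothetical and assumes each concept has at most one broader concept, which real SKOS data does not guarantee.

```python
# SKOS core namespace and the two properties used in this sketch.
SKOS = "http://www.w3.org/2004/02/skos/core#"
PREF_LABEL = SKOS + "prefLabel"
BROADER = SKOS + "broader"

# Two invented concept schemes from unrelated domains (illustrative data only).
occupations = [
    ("ex:occ/nurse", PREF_LABEL, "Nurse"),
    ("ex:occ/health", PREF_LABEL, "Health professionals"),
    ("ex:occ/nurse", BROADER, "ex:occ/health"),
]
soils = [
    ("ex:soil/podzol", PREF_LABEL, "Podzol"),
    ("ex:soil/mineral", PREF_LABEL, "Mineral soils"),
    ("ex:soil/podzol", BROADER, "ex:soil/mineral"),
]


def broader_path(triples, concept):
    """Return the labels from a concept up through its broader concepts.

    Assumes at most one skos:broader link per concept (a simplification).
    """
    labels = {s: o for s, p, o in triples if p == PREF_LABEL}
    parent = {s: o for s, p, o in triples if p == BROADER}
    path, seen = [], set()
    # Walk upward, guarding against cycles in the hierarchy.
    while concept is not None and concept not in seen:
        seen.add(concept)
        path.append(labels.get(concept, concept))
        concept = parent.get(concept)
    return path


# The same function handles both schemes without any domain knowledge.
print(broader_path(occupations, "ex:occ/nurse"))
print(broader_path(soils, "ex:soil/podzol"))
```

The function knows nothing about occupations or soils; it relies only on the shared SKOS properties. What the concepts mean, and whether they can be mapped to one another, still requires domain-level attention.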
The focus of the session is on the "interoperability" and "reuse" elements of FAIR, but the session will touch on all aspects of FAIR data sharing and how it might practically be realized. In particular, we aim to present these ideas to the DCMI community, to get feedback, and to understand how this approach may intersect with that community's current activities and thinking and with related initiatives.
Speakers: Arofan Gregory (DDI Alliance and CODATA), Flavio Rizzolo (Statistics Canada), Franck Cotton (INSEE), Simon Hodson (CODATA).