Metadata and Vocabularies

2020-12-29T14:26:07+00:00

Project Description

The potential to create a unified body of scholarly materials is reliant on interoperability – specifically, that repositories follow consistent guidelines, protocols, and standards that allow them to communicate with each other and with other systems in order to transfer information, metadata, and digital objects.

COAR is working towards the alignment of repository networks and other systems via technical and semantic interoperability, including efforts to harmonize metadata and vocabularies in repositories.


COAR is working with the repository community to improve the quality and comprehensiveness of metadata. To that end, COAR raises awareness of the importance of metadata for repositories and for the development of services, and intends to build a knowledge base that will link to the various metadata schemas already in use by the repository community.

Controlled Vocabularies

The use of controlled vocabularies for bibliographic metadata “ensures that everyone is using the same word to mean the same thing”. Continuous revision, updating, and maintenance of the COAR Controlled Vocabularies, together with their adoption by the most commonly used open repository software, enhance interoperability across repositories and with related systems such as harvesters, CRIS systems, data repositories, and publishers.

The COAR Controlled Vocabularies are governed and maintained by the Editorial Board. To define the controlled vocabularies, the Editorial Board analyzes existing vocabularies and dictionaries and uses the most appropriate existing terms whenever possible. Where the community identifies gaps, the Editorial Board defines new terms. The Editorial Board also translates labels into numerous other languages.

The COAR Controlled Vocabularies are described in SKOS. In SKOS, concepts are identified by URIs, labels for the concepts can be offered in multiple languages, notes provide descriptions and other annotations, and concepts from other vocabularies can be linked. The URL prefix is reserved for the concepts of the COAR Controlled Vocabularies.
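
As a minimal sketch of the SKOS features just listed, the Python snippet below models a concept as a plain data structure. The class name, the field names, the labels, and the example.org URIs are illustrative assumptions; they are not the actual COAR identifiers and not the API of any SKOS library.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A simplified, hypothetical stand-in for a SKOS concept."""
    uri: str                                           # identifier of the concept
    pref_labels: dict = field(default_factory=dict)    # language code -> label
    notes: list = field(default_factory=list)          # free-text descriptions
    exact_matches: list = field(default_factory=list)  # URIs in other vocabularies

    def label(self, lang, fallback="en"):
        """Return the label in `lang`, or the `fallback` label if it is missing."""
        return self.pref_labels.get(lang, self.pref_labels.get(fallback))


# Placeholder URIs: example.org stands in for the real vocabulary prefix.
thesis = Concept(
    uri="https://example.org/vocab/resource_type/thesis",
    pref_labels={"en": "thesis", "de": "Abschlussarbeit", "es": "tesis"},
    notes=["Illustrative note describing the concept."],
    exact_matches=["https://example.org/other-vocab/Thesis"],
)

print(thesis.label("de"))  # Abschlussarbeit
print(thesis.label("ja"))  # thesis (falls back to English)
```

Keeping labels in a mapping keyed by language code mirrors the idea that the URI, not any single label, identifies the concept.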

Available Controlled Vocabularies

The Resource Type vocabulary defines concepts to identify the genre of a resource. Such resources, for example publications, research data, and audio and video objects, are typically deposited in institutional and thematic repositories or published in e-journals.

This vocabulary supports a hierarchical model that relates narrower and broader concepts. Multilingual labels take regional distinctions in language and terminology into account. Concepts of this vocabulary are mapped to terms and concepts of similar vocabularies and dictionaries.
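
The broader/narrower relation can be sketched in a few lines of Python. The excerpt below is a hypothetical, heavily reduced hierarchy keyed by English labels for readability; a real implementation would key on concept URIs, and the function names are assumptions made for this illustration.

```python
# Hypothetical excerpt: each concept maps to its broader concept
# (None marks a top-level concept).
BROADER = {
    "text": None,
    "thesis": "text",
    "doctoral thesis": "thesis",
    "dataset": None,
}


def broader_chain(concept):
    """Return all broader concepts of `concept`, nearest first."""
    chain = []
    parent = BROADER[concept]
    while parent is not None:
        chain.append(parent)
        parent = BROADER[parent]
    return chain


def narrower(concept):
    """Return the direct narrower concepts of `concept`, sorted by name."""
    return sorted(c for c, parent in BROADER.items() if parent == concept)


print(broader_chain("doctoral thesis"))  # ['thesis', 'text']
print(narrower("text"))                  # ['thesis']
```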

The Access Rights vocabulary defines concepts to declare the access status of a resource. Multilingual labels take regional distinctions in language and terminology into account. The Access Rights vocabulary builds on access rights defined in info:eu-repo/semantics.

The Version Type vocabulary defines concepts to declare the version of a resource. Multilingual labels take regional distinctions in language and terminology into account. The concepts are adopted from the “Journal Article Versions (JAV): Recommendations of the NISO/ALPSP JAV Technical Working Group”.

COAR provides the repository community with an Implementation Guide for the controlled vocabularies. It is also available on GitHub. The guide covers implementation of the vocabularies on different repository platforms and Open Journal Systems (OJS), and includes a list of repositories that have implemented the COAR Controlled Vocabularies. If you would like to contribute to the guide for a new repository platform or add your repository to the list of use cases, please create an issue on GitHub or email us.

The Editorial Board manages the COAR Controlled Vocabularies and comprises the following members:

  • Jochen Schirrwagen, University of Bielefeld, Germany – (Co-Chair)
  • Isabel Bernal, Consejo Superior de Investigaciones Científicas (CSIC), Spain – (Co-Chair)
  • Alberto Apollaro, Ministerio de Ciencia, Tecnología e Innovación Productiva (MinCyT), Argentina
  • Dom Fripp, Jisc, United Kingdom
  • Gültekin Gürdal, Izmir Institute of Technology Library, Turkey
  • Hilary Jones, Jisc, United Kingdom
  • Ilkay Holt, COAR, United Kingdom
  • Ku (Alan) Liping, The National Science Library, CAS, China
  • Laurence Le Borgne, ADBS, France
  • Liu Dan, Peking University Library, China
  • Milan Ojsteršek, University of Maribor, Slovenia
  • Nie Hua, Peking University Library, China
  • Paola Azrilevich, Ministerio de Ciencia, Tecnología e Innovación Productiva (MinCyT), Argentina
  • Pedro Príncipe, Universidade do Minho, Portugal
  • Sawsan Habre, Lebanese American University, Lebanon
  • Susanna Mornati, 4Science, Italy
  • Tomoko Kataoka, JPCOAR, Japan
  • Wilko Steinhoff, Data Archiving and Networked Services (DANS), Netherlands
  • Yutaka Hayashi, JPCOAR, Japan

Editorial Board members who served in the past:

  • Imma Subirats, Food and Agriculture Organization of the United Nations, Italy
  • Sandor Kopacsi, University of Vienna, Austria
  • Shenghui Wang, OCLC (Online Computer Library Center), Netherlands
  • Ilaria Fava, State and University Library, University of Göttingen, Germany
  • Iryna Solodovnik, Food and Agriculture Organization of the United Nations, Italy
  • Sophie Aubin, INRA – the French National Institute for Agricultural Research, France
  • Nathalie Vedovotto, Inist-CNRS, France


What is a controlled vocabulary?

A controlled vocabulary is an organized arrangement of words and phrases used to index content and/or to retrieve content through browsing or searching. It typically includes preferred and variant terms and has a defined scope or describes a specific domain. Controlled vocabularies capture the richness of variant terms and promote consistency in preferred terms and the assignment of the same terms to similar content.

What is the benefit of controlled vocabularies?

Controlled vocabularies are beneficial during indexing: data providers and repositories consistently apply the same term to refer to the same concept (e.g., a person, place, or thing), which helps with search and discovery of content. They also guide end users in formulating their searches, since users may not know the correct term for a given concept. In fact, the most useful function of controlled vocabularies is to gather variant terms and synonyms for concepts and to link concepts in a logical order or organize them into categories. Consolidating many different synonyms into one controlled term thus increases the number of useful hits returned by a search.
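
The effect of consolidating synonyms can be illustrated with a small Python sketch. The variant terms, record identifiers, and function names below are invented for this example and do not come from any COAR vocabulary or repository.

```python
# Hypothetical variant terms mapped to one preferred term.
PREFERRED = {
    "phd thesis": "doctoral thesis",
    "dissertation": "doctoral thesis",
    "doctoral dissertation": "doctoral thesis",
}

# Invented records whose type was entered as free text.
RECORDS = {
    "rec-1": "PhD thesis",
    "rec-2": "Dissertation",
    "rec-3": "doctoral thesis",
    "rec-4": "journal article",
}


def normalize(term):
    """Map a variant term to its preferred term (case-insensitive)."""
    key = term.strip().lower()
    return PREFERRED.get(key, key)


def search(query, use_vocabulary):
    """Return ids of records matching `query`, with or without the vocabulary."""
    if use_vocabulary:
        wanted = normalize(query)
        return sorted(r for r, t in RECORDS.items() if normalize(t) == wanted)
    wanted = query.strip().lower()
    return sorted(r for r, t in RECORDS.items() if t.strip().lower() == wanted)


print(search("dissertation", use_vocabulary=False))  # ['rec-2']
print(search("dissertation", use_vocabulary=True))   # ['rec-1', 'rec-2', 'rec-3']
```

Without the vocabulary only the literal match is found; with it, all three records describing the same concept are returned.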

What types of controlled vocabularies exist?

There are different types of controlled vocabularies including subject heading lists, controlled lists, synonym ring lists, authority files, taxonomies, alphanumeric classification schemes, thesauri, and ontologies.

Which “controlled vocabularies” are the most relevant for repositories?

Subject heading lists, authority files, taxonomies, alphanumeric classification schemes and ontologies.

What does the Resource Type vocabulary describe?

The aim of this Controlled Vocabulary is to provide concepts that describe the genre of a digital resource.

How can the Resource Type controlled vocabulary be implemented?

The Resource Type Controlled Vocabulary uses the SKOS standard. Each term has properties for the concept-URI, the definition of the concept, and labels in multiple languages, and may have relations to terms in other controlled vocabularies. Moreover, concepts in this vocabulary are organized hierarchically.

How can a controlled vocabulary be used in a metadata record?

In order to describe the genre of a digital resource the most appropriate concept should be chosen. It is not necessary to include broader concepts as they are already logically related in the vocabulary. When referring to a concept from the controlled vocabulary the concept-URI must be included and optionally one or more labels associated with the concept.
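
A minimal sketch of this rule in Python follows. The record layout, the field names, and the example.org URI are assumptions made for illustration; they are not taken from a specific application profile or metadata schema.

```python
def resource_type_field(concept_uri, labels=None):
    """Build a type field: the concept URI is required, labels are optional."""
    if not concept_uri:
        raise ValueError("a concept URI is required")
    entry = {"uri": concept_uri}
    if labels:
        entry["labels"] = dict(labels)  # language code -> label
    return entry


# Only the most specific concept is recorded. Broader concepts are omitted
# because the vocabulary already relates them to the chosen concept.
record = {
    "title": "An example doctoral thesis",
    "resource_type": resource_type_field(
        "https://example.org/vocab/resource_type/doctoral_thesis",  # placeholder URI
        labels={"en": "doctoral thesis"},
    ),
}

print(record["resource_type"]["uri"])
```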

Can I tag a resource with two concepts?

It is up to each application profile to decide whether a resource may be tagged with only one concept or with several.

Why does the Resource Type vocabulary have such a complex hierarchy?

The proposed hierarchy is an attempt to structure all the concepts from a generic down to a granular level. It is, however, not without contradictions, e.g. placing ‘thesis’ under ‘text’. The vocabulary is going to be recommended in repository metadata guidelines, and it is up to those guidelines to decide whether to include all concepts or only a subset, as long as the original concept-URIs and labels are used.
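
Selecting a subset while keeping the original identifiers and labels could look like the following Python sketch. The vocabulary excerpt, the example.org URIs, and the function name are hypothetical.

```python
# Hypothetical excerpt of a vocabulary: concept URI -> labels by language.
VOCABULARY = {
    "https://example.org/vocab/resource_type/text": {"en": "text"},
    "https://example.org/vocab/resource_type/thesis": {"en": "thesis"},
    "https://example.org/vocab/resource_type/dataset": {"en": "dataset"},
}


def select_subset(vocabulary, wanted_uris):
    """Return the chosen concepts with URIs and labels copied unchanged."""
    unknown = [u for u in wanted_uris if u not in vocabulary]
    if unknown:
        raise KeyError(f"not in the vocabulary: {unknown}")
    return {u: dict(vocabulary[u]) for u in wanted_uris}


subset = select_subset(
    VOCABULARY,
    [
        "https://example.org/vocab/resource_type/thesis",
        "https://example.org/vocab/resource_type/dataset",
    ],
)

print(len(subset))  # 2
```

Rejecting unknown URIs, rather than silently creating new entries, keeps a guideline's subset aligned with the source vocabulary.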

Want to contribute?

Email Us!