Metadata and Vocabularies

The potential to create a unified body of scholarly materials relies on interoperability. Specifically, repositories should follow consistent guidelines, protocols, and standards that allow them to communicate with each other and with other systems in order to transfer information, metadata, and digital objects. COAR is working with the repository community to improve the quality and comprehensiveness of metadata in repositories.

Basic Metadata Recommendations

Below are the metadata requirements for a number of user stories developed by a former COAR Metadata Working Group.

User Story: “As a researcher, I want my publications and other research outputs to be available to users via Google and Google Scholar.”

Metadata Requirements

  • Title
  • Author name / ID
  • Publication date
  • Permanent URL (e.g. DOI or Handle)
  • Abstract
  • Domain subject headings or key words
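
One common way for these fields to reach Google Scholar is as HTML meta tags on an item's landing page. The following is a minimal sketch, assuming a record held as a plain dictionary. The tag names citation_title, citation_author and citation_publication_date follow the Highwire Press style described in Google Scholar's inclusion guidelines. The dictionary keys, the function name, the example DOI and the Dublin Core-style names used for the remaining fields are illustrative assumptions, not COAR recommendations.

```python
from html import escape


def scholar_meta_tags(record):
    """Render a record (hypothetical dict layout) as HTML <meta> tags."""
    tags = [("citation_title", record["title"])]
    tags += [("citation_author", name) for name in record["authors"]]
    tags.append(("citation_publication_date", record["date"]))
    # The names below are illustrative choices for the remaining fields,
    # not something Google Scholar is known to require.
    tags.append(("DC.identifier", record["permanent_url"]))
    tags.append(("DC.description", record["abstract"]))
    tags += [("DC.subject", keyword) for keyword in record["keywords"]]
    # escape() also escapes double quotes, so values are safe inside attributes.
    return [
        f'<meta name="{escape(name)}" content="{escape(value)}">'
        for name, value in tags
    ]


# Made-up example record; the DOI is a placeholder.
example = {
    "title": "Open Repositories & Interoperability",
    "authors": ["A. Researcher", "B. Scholar"],
    "date": "2023/11/16",
    "permanent_url": "https://doi.org/10.1234/example",
    "abstract": "A short abstract.",
    "keywords": ["repositories", "metadata"],
}

for tag in scholar_meta_tags(example):
    print(tag)
```

Repeating the author and keyword tags once per value keeps each name or keyword separately machine-readable.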

User Story: “As a teacher, I want to be able to find permissions, or licensing information about an article so I can determine if I can reuse it for my course.”

Metadata Requirements

User Story: “As a researcher, I want to be able to reuse a data set found in a repository.”

Metadata Requirements

  • format
  • link to article
  • link to related protocol / software from research study
  • reuse license
  • domain metadata requirements

User Story: “As a research funder, I want to monitor compliance with our open access policy.”

Metadata Requirements

  • Author name / ID
  • Publication Date
  • Funder name / ID
  • Open access status
  • Article version
  • Journal name and publisher name
  • DOI
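
A repository could check records against a list like the one above before exposing them to a funder's monitoring service. The sketch below is one possible approach, assuming records are plain dictionaries. The key names and the function name are hypothetical and would need to be mapped to the repository's actual metadata schema.

```python
# Hypothetical key names standing in for the fields listed above.
REQUIRED_FOR_FUNDER_COMPLIANCE = [
    "author",
    "publication_date",
    "funder",
    "open_access_status",
    "article_version",
    "journal",
    "publisher",
    "doi",
]


def missing_fields(record, required=REQUIRED_FOR_FUNDER_COMPLIANCE):
    """Return the required keys that are absent or empty in the record."""
    return [key for key in required if not record.get(key)]


# Made-up record with an empty funder and no DOI.
record = {
    "author": "A. Researcher",
    "publication_date": "2023-11-16",
    "funder": "",
    "open_access_status": "open",
    "article_version": "accepted manuscript",
    "journal": "Journal of Examples",
    "publisher": "Example Press",
}

print(missing_fields(record))
```

The same function can be reused for the other user stories by passing a different list of required keys.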

User Story: “As a research institution, I want to document and track the publications of our affiliated researchers.”

Metadata Requirements

  • Author name / ID
  • Institution name / ID

User Story: “As a repository manager, I want to ensure my repository content is included in national research assessment activities.”

Metadata Requirements

  • Title
  • Author name / ID
  • Date
  • Institution name / ID
  • Funder name / ID
  • DOI

User Story: “As a researcher, I want the metadata for my article to be visible and findable in other indexing services.”

Metadata Requirements

  • OAI-PMH
  • PIDs: ORCID, DOIs, etc.
  • Standard vocabularies (e.g. COAR)
  • Mapping to meta-schemas (e.g. schema.org)
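
As an illustration of the last item, a repository record can be mapped to schema.org and embedded in a landing page as JSON-LD. This is a minimal sketch under stated assumptions: the input dictionary layout and the function name are hypothetical, the DOI is a placeholder, and the ORCID iD is the sample identifier ORCID uses for its fictitious researcher Josiah Carberry. Only well-known schema.org terms (ScholarlyArticle, Person, name, author, datePublished, identifier) are used.

```python
import json


def to_schema_org(record):
    """Map a repository record (hypothetical dict layout) to schema.org JSON-LD."""
    authors = []
    for author in record["authors"]:
        person = {"@type": "Person", "name": author["name"]}
        if author.get("orcid"):
            # Express the ORCID iD as a resolvable URI.
            person["identifier"] = "https://orcid.org/" + author["orcid"]
        authors.append(person)
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": record["title"],
        "author": authors,
        "datePublished": record["date"],
        "identifier": record["doi_url"],
    }


# Made-up example record.
example = {
    "title": "Open Repositories and Interoperability",
    "authors": [{"name": "Josiah Carberry", "orcid": "0000-0002-1825-0097"}],
    "date": "2023-11-16",
    "doi_url": "https://doi.org/10.1234/example",
}

print(json.dumps(to_schema_org(example), indent=2))
```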

User Story: “As a repository manager with content in non-English language, I want my records to be available to local users in their local language as well as through international indexes and discovery services.”

Metadata Requirements

These requirements are pending as COAR examines best practices recommendations for managing and exposing content for non-English languages.

User Story: “As a scholar, I want my research outputs to be available over the long term and remain as a permanent part of the scholarly record.”

Metadata Requirements

  • Format information
  • Size of bitstream

COAR’s Controlled Vocabularies

The use of controlled vocabularies for bibliographic metadata “ensures that everyone is using the same word to mean the same thing”. The continuous revision, updating and maintenance of the COAR Controlled Vocabularies, and their adoption by the most commonly used open repository software, enhance interoperability across repositories and with related systems such as harvesters, CRIS systems, data repositories and publishers.

The COAR Controlled Vocabularies are governed and maintained by an Editorial Board. To define the controlled vocabularies, the Editorial Board analyzes existing vocabularies and dictionaries and adopts the most appropriate existing terms and definitions whenever possible. Where the community identifies gaps, the Editorial Board defines new terms. The Editorial Board also translates vocabulary terms into numerous languages.

The Resource Type vocabulary defines concepts to identify the genre of a resource. Such resources, like publications, research data, audio and video objects, are typically deposited in institutional and thematic repositories or published in ejournals.

This vocabulary supports a hierarchical model that relates narrower and broader concepts. Multilingual labels take regional distinctions in language and terminology into account. Concepts in this vocabulary are mapped to terms and concepts in similar vocabularies and dictionaries.

The Access Rights vocabulary defines concepts to declare the access status of a resource. Multilingual labels take regional distinctions in language and terminology into account. The Access Rights vocabulary builds on access rights defined in info:eu-repo/semantics.

The Version Type vocabulary defines concepts to declare the version of a resource. Multilingual labels take regional distinctions in language and terminology into account. The concepts are adopted from the “Journal Article Versions (JAV): Recommendations of the NISO/ALPSP JAV Technical Working Group”.

Resources

Terms of Reference for the Editorial Board

Note – In November-December 2023 we will be renewing and updating the membership in the Editorial Board.
  • Isabel Bernal, Consejo Superior de Investigaciones Científicas (CSIC), Spain – (Chair)
  • Alberto Apollaro, Ministerio de Ciencia, Tecnología e Innovación Productiva (MinCyT), Argentina
  • Brigit Nonó, Universitat de Girona, Spain
  • Cristina Azorín, Universitat Autònoma de Barcelona, Spain
  • Dom Fripp, Jisc, United Kingdom
  • Gültekin Gürdal, Izmir Institute of Technology Library, Turkey
  • Irina Razumova, NEICON, The Russian Federation
  • Jochen Schirrwagen, Universitätsbibliothek, RWTH Aachen University, Germany
  • Juha Hakala, The National Library of Finland, Finland
  • Ku (Alan) Liping, The National Science Library, CAS, China
  • Laurence Le Borgne, ADBS, France
  • Liu Dan, Peking University Library, China
  • Marina Losada, Universitat Pompeu Fabra, Barcelona, Spain
  • Milan Ojsteršek, University of Maribor, Slovenia
  • Nie Hua, Peking University Library, China
  • Paola Azrilevich, Ministerio de Ciencia, Tecnología e Innovación Productiva (MinCyT), Argentina
  • Pedro Príncipe, Universidade do Minho, Portugal
  • Sawsan Habre, Lebanese American University, Lebanon
  • Susanna Mornati, 4Science, Italy
  • Tomoko Kataoka, JPCOAR, Japan
  • Yutaka Hayashi, JPCOAR, Japan
Previous Editorial Board Members:
  • Imma Subirats, Food and Agriculture Organization of the United Nations, Italy
  • Sandor Kopacsi, University of Vienna, Austria
  • Shenghui Wang, OCLC (Online Computer Library Center), Netherlands
  • Ilaria Fava, State and University Library, University of Göttingen, Germany
  • Iryna Solodovnik, Food and Agriculture Organization of the United Nations, Italy
  • Sophie Aubin, INRA – the French National Institute for Agricultural Research, France
  • Nathalie Vedovotto, Inist-CNRS, France
  • Ilkay Holt, COAR, United Kingdom
  • Wilko Steinhoff, Data Archiving and Networked Services (DANS), Netherlands
  • Hilary Jones, Jisc, United Kingdom

COAR provides the repository community with an Implementation Guide for the controlled vocabularies. The guide covers implementing the vocabularies on different repository platforms and on Open Journal Systems (OJS), and includes a list of repositories that have implemented the COAR Controlled Vocabularies.

FAQs

What is a controlled vocabulary?

A controlled vocabulary is an organized arrangement of words and phrases used to index content and/or to retrieve content through browsing or searching. It typically includes preferred and variant terms and has a defined scope or describes a specific domain. Controlled vocabularies capture the richness of variant terms and promote consistency in preferred terms and the assignment of the same terms to similar content.

What is the benefit of controlled vocabularies?

Controlled vocabularies are beneficial during indexing because data providers and repositories apply the same term to the same concept (e.g., person, place or thing) in a consistent way. This helps with search and discovery of content. Controlled vocabularies also help end-users formulate better searches, since they may not know the correct term for a given concept. In fact, the most useful function of controlled vocabularies is to gather together variant terms and synonyms for concepts and to link concepts in a logical order or organize them into categories. Consolidating many different synonyms into one controlled term thus increases the number of useful hits returned by a search.

What types of controlled vocabularies exist?

There are different types of controlled vocabularies including subject heading lists, controlled lists, synonym ring lists, authority files, taxonomies, alphanumeric classification schemes, thesauri, and ontologies.

Which “controlled vocabularies” are the most relevant for repositories?

Subject heading lists, authority files, taxonomies, alphanumeric classification schemes and ontologies.

What does the resource type vocabulary describe?

The aim of this Controlled Vocabulary is to provide concepts that describe the genre of a digital resource.

How can the resource type controlled vocabulary be implemented?

The Resource Type Controlled Vocabulary uses the SKOS standard. Each term has properties for the concept-URI, the definition of the concept and labels in multiple languages, and may have relations to terms in other controlled vocabularies. Moreover, concepts in this vocabulary are organized hierarchically.
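
To make the SKOS structure concrete, the sketch below reads one concept from a small RDF/XML snippet using only the Python standard library. The snippet, its example.org URIs and the placeholder definition are invented for illustration and are not taken from the COAR vocabulary files. Published SKOS files may be serialized differently (for example as rdf:Description elements or as Turtle), in which case a full RDF parser would be a better fit. The SKOS terms themselves (skos:Concept, skos:prefLabel, skos:definition, skos:broader, skos:exactMatch) come from the W3C SKOS standard.

```python
import xml.etree.ElementTree as ET

NS = {"skos": "http://www.w3.org/2004/02/skos/core#"}
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample data; URIs and definition are placeholders.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/vocab/thesis">
    <skos:prefLabel xml:lang="en">thesis</skos:prefLabel>
    <skos:prefLabel xml:lang="es">tesis</skos:prefLabel>
    <skos:definition xml:lang="en">Placeholder definition.</skos:definition>
    <skos:broader rdf:resource="http://example.org/vocab/text"/>
    <skos:exactMatch rdf:resource="http://example.org/other/Thesis"/>
  </skos:Concept>
</rdf:RDF>"""


def read_concepts(rdf_xml):
    """Return {concept_uri: properties} for each skos:Concept under the root."""
    root = ET.fromstring(rdf_xml)
    concepts = {}
    for node in root.findall("skos:Concept", NS):
        definition = node.find("skos:definition", NS)
        concepts[node.get(RDF + "about")] = {
            # One preferred label per language tag.
            "labels": {
                label.get(XML_LANG): label.text
                for label in node.findall("skos:prefLabel", NS)
            },
            "definition": definition.text if definition is not None else None,
            # Hierarchy: URIs of the broader (parent) concepts.
            "broader": [b.get(RDF + "resource") for b in node.findall("skos:broader", NS)],
            # Relations to concepts in other vocabularies.
            "matches": [m.get(RDF + "resource") for m in node.findall("skos:exactMatch", NS)],
        }
    return concepts


print(read_concepts(SAMPLE))
```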

How can I reuse COAR vocabularies? What are the license conditions?

All COAR vocabularies are made available under the Creative Commons Zero 1.0 Universal (CC0 1.0) public domain dedication.

How do COAR Vocabularies support FAIR Principles?

COAR controlled vocabularies play a key role in making semantic artifacts (and repositories) compliant with the FAIR Principles, in particular when it comes to findability and interoperability. Using controlled vocabularies enables repositories to be consistent in describing their resources, helps with search and discovery of content, and allows machine readability for accessibility and interoperability.

How can I share my feedback about existing concepts in COAR Vocabularies?

Any feedback about the vocabularies is very welcome through this online form which is available on the main COAR website as well as on the dedicated site for the vocabularies. Your feedback will be discussed by the Editorial Board members for inclusion in future versions.  

How can I join the COAR Controlled Vocabularies Editorial Board?

Anyone from a COAR member institution who is interested in participating in the Editorial Board may do so. Please email office@coar-repositories.org to indicate your interest. Exceptions to the COAR member-only status are made if the interested person is willing to translate one or more of the COAR vocabularies into a language which is not represented at the time.

How can I contribute? My language is not represented in the COAR controlled vocabularies.

Please email office@coar-repositories.org to indicate your interest. Please note that persons from COAR’s member institutions are prioritised to take on a role in the Editorial Board but there are exceptions to the COAR member-only status if the interested person is willing to translate any of the COAR vocabularies into a language which is not represented at the time.

How do you map concepts in the COAR vocabularies to other vocabularies or ontologies?

Once a year, the COAR Controlled Vocabularies Editorial Board checks the validity of the URIs of matched concepts from third party vocabularies. If the URI of an external concept no longer works, the match is removed from the COAR vocabulary. In addition, the members of the Editorial Board conduct desk research to find other related vocabularies to match with in newer versions. Both activities can be highly time consuming, and COAR therefore welcomes active notification from third party vocabularies whenever there is a change to their service that may affect current mappings.
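
The yearly URI check could in principle be partly automated. The sketch below is not the Editorial Board's actual procedure. It assumes mappings are held as a dictionary from a concept URI to a list of external URIs, and it treats a URI as valid when an HTTP HEAD request succeeds. All function names and the example.org URIs are hypothetical. The checker is passed in as a parameter, so the usage example runs with a stub instead of touching the network.

```python
import urllib.request


def url_resolves(url, timeout=10):
    """Return True if an HTTP HEAD request to the URL succeeds."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # Network failures, HTTP error statuses and malformed URLs
        # all count as "does not resolve".
        return False


def prune_mappings(mappings, resolves=url_resolves):
    """Split {concept_uri: [external_uri, ...]} into kept and removed mappings."""
    kept, removed = {}, {}
    for concept_uri, targets in mappings.items():
        for target in targets:
            bucket = kept if resolves(target) else removed
            bucket.setdefault(concept_uri, []).append(target)
    return kept, removed


# Usage with a stub checker and invented URIs (no network access needed).
mappings = {
    "http://example.org/vocab/thesis": [
        "http://example.org/other/Thesis",
        "http://example.org/dead/Thesis",
    ]
}
kept, removed = prune_mappings(mappings, resolves=lambda url: "/dead/" not in url)
print("kept:", kept)
print("removed:", removed)
```

Some servers reject HEAD requests or block automated clients, so a failed check is better treated as a prompt for manual review than as proof that a mapping is dead.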

How can a controlled vocabulary be used in a metadata record?

To describe the genre of a digital resource, choose the most appropriate concept. It is not necessary to include broader concepts, as they are already logically related in the vocabulary. When referring to a concept from the controlled vocabulary, the concept-URI must be included; one or more labels associated with the concept may optionally be added.
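
The rule above (concept-URI required, labels optional) can be sketched as a small helper that builds the resource type entry of a record. The function name and the dictionary layout are assumptions for illustration; how the value is serialized depends on the application profile in use. The URI shown is believed to identify the "journal article" concept in the COAR Resource Type vocabulary, but it should be checked against the published vocabulary before use.

```python
def resource_type_field(concept_uri, labels=None):
    """Build a resource type entry: the concept URI is mandatory, labels optional."""
    if not concept_uri:
        raise ValueError("a concept URI is required")
    field = {"uri": concept_uri}
    if labels:
        # Labels are keyed by language tag, e.g. {"en": "journal article"}.
        field["labels"] = dict(labels)
    return field


# Only the most specific concept is recorded; broader concepts are implied
# by the vocabulary's hierarchy and are not repeated in the record.
field = resource_type_field(
    "http://purl.org/coar/resource_type/c_6501",  # assumed: "journal article"
    labels={"en": "journal article"},
)
print(field)
```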

How do you collaborate with third party vocabularies/service providers?

COAR vocabularies have mappings to related concepts in other vocabularies. Building and maintaining a network of relations amongst URI-supported vocabularies greatly enhances discovery and enrichment of an open scholarly system. The Editorial Board checks the validity of such mappings on a regular basis. We encourage third party initiatives to get in touch so as to ensure links to their concepts remain active.

How often are COAR controlled vocabularies updated?

The COAR controlled vocabularies are reviewed once a year, and they are updated only when it is determined that there are significant changes needed. Discussions take place regularly in the Editorial Board meetings throughout the year.

Can I tag a resource with two concepts?

It is up to the specific application profile to decide whether a resource can be tagged with only one concept or with several.

Are regional variations accepted for all languages?

Regional differences are taken into account in translations of the concepts. For instance, Spanish translations are checked by editors from Spain and Latin America and any regional differences are reflected in the alternative labels.

Are definitions of concepts in COAR Vocabularies only in English? Is it possible to translate definitions?

All COAR vocabulary concepts are defined in English; translations of the definitions into other languages are optional. Editorial Board members who wish to translate the definitions into their local languages are highly encouraged to do so and to inform the COAR Office.

Is there specific technical documentation to integrate the vocabularies in my repository?

Yes, relevant documentation can be found on the dedicated website for the COAR controlled vocabularies. Specifically, the implementation guide can help with implementing the vocabularies in different repository software platforms, including DSpace, Samvera/Hyrax, Haplo, EPrints and Open Journal Systems.

Can I combine COAR Vocabularies concepts with concepts from other controlled lists in my repository?

Yes, depending on the application profile used in your infrastructure, COAR controlled vocabularies can be used in combination with other vocabularies, e.g. DataCite. Two common examples of such application profiles are the OpenAIRE 4.0 Guidelines and the RIOXX Application Profile 3.0.

Can I suggest other possible vocabularies to be developed by the COAR community?

Yes, suggestions for new vocabularies are welcome. They are to be discussed and evaluated by the Editorial Board.

Do COAR Vocabularies accept mappings with third party vocabularies whose concepts are not URI supported?

Currently, COAR Vocabularies only match with other vocabularies’ concepts that have associated PIDs.

Can I combine concepts from different versions of COAR Resource Types Vocabulary?

It is recommended to use the latest version of the vocabularies but adopters are free to continue using earlier versions of the vocabularies. Each release, with relevant documentation including links to various formats (e.g. SKOS RDF), is available on this website. Deprecated concepts in any release remain available within their respective vocabulary versions and the PURL URIs will continue to resolve them in case there are repositories still using them.

Who maintains the PURLs of the concepts of COAR Vocabularies?

The PURL system provides “persistent URLs” and is operated and maintained by The Internet Archive. COAR maintains a PURL namespace for each COAR Vocabulary. Each PURL is configured to redirect to a web resource giving more information about the vocabulary.

Why does the resource type vocabulary have a complex hierarchy?

The proposed hierarchy is an attempt to structure all the concepts from a generic down to a granular level. It is, however, not without contradiction, e.g. having ‘thesis’ under ‘text’. The vocabulary is going to be recommended in repository metadata guidelines, and it is up to those guidelines to decide whether to include all concepts or only a subset, as long as the original concept-URLs and labels are used.
