Persistent Identifiers: Addressing the challenges of global adoption

The aim of this blog post is to raise awareness of issues related to the adoption of persistent identifiers (PIDs) that especially affect developing countries, and to propose an alternative approach that will enable greater global inclusiveness and more widespread adoption of PIDs.

Persistent Identifiers (PIDs) are an important part of the scholarly ecosystem because they provide long-lasting references to digital resources. To accomplish this, a PID typically has two components: 1. A unique identifier used as a reference to a resource, and 2. A service that correctly forwards (resolves) references to the resource over time, even when the resource's location changes. The first provides a long-term stable reference for users, while the second tracks the current location so users don’t have to.
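As a rough illustration of these two components, here is a minimal Python sketch of an in-memory resolver. It is not how any real PID service is implemented; the class name, the method names, the identifier string and the example.org URLs are all made up for this example.

```python
class PidResolver:
    """Toy resolver: maps a stable identifier to the resource's current location."""

    def __init__(self):
        # identifier -> current URL of the resource
        self._locations = {}

    def register(self, pid, url):
        """Component 1: assign a unique identifier to a resource, once."""
        if pid in self._locations:
            raise ValueError(f"{pid} is already registered")
        self._locations[pid] = url

    def update_location(self, pid, new_url):
        """The resource moved: the identifier stays the same, only the target changes."""
        if pid not in self._locations:
            raise KeyError(pid)
        self._locations[pid] = new_url

    def resolve(self, pid):
        """Component 2: forward the identifier to wherever the resource lives now."""
        return self._locations[pid]


resolver = PidResolver()
pid = "ark:/99999/fk4example"  # hypothetical identifier, for illustration only
resolver.register(pid, "https://repository.example.org/items/123")
first = resolver.resolve(pid)

# The repository migrates to a new platform; citations of the PID stay valid.
resolver.update_location(pid, "https://new-host.example.org/records/123")
second = resolver.resolve(pid)
```

The point of the sketch is that anyone citing the identifier never has to learn about the move. Persistence depends on someone keeping the mapping up to date, which is true for every type of PID.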

There are a number of different types of PIDs available for scholarly resources. The most widely recognized is the DOI (Digital Object Identifier), but there are also ARKs, PURLs, Handles and URNs, all of which have been available for at least two decades. Handles, which have traditionally been used in the repository world, have a robust infrastructure that also provides resolution for the DOI system. ARKs, commonly used by libraries, archives and museums, have a flexible, decentralized infrastructure. (1) Regardless of type, PIDs promote efficient citation and discovery of scholarly resources.

So, why then, if a resource already has a Handle or other type of PID, would you still need to acquire a DOI? As long as the service is properly maintaining the link from the unique identifier to the resource, does it really matter what type of PID you use?

Managed research ecosystems based on DOI-based metadata collections

While the original aim of PID services was to offer persistence, some DOI Registration Agencies, or “RAs” (2), have been repurposing the metadata they collect to build value-added services, turning the DOI system into a kind of managed research ecosystem. At least two DOI-based aggregations (Crossref and DataCite) have been created for the purpose of discovery, tracking, and analysis of research production.

Crossref, for example, presents the vision of a “research nexus”:

The research nexus goes beyond the basic idea of just having persistent identifiers for content. Objects and entities such as journal articles, book chapters, grants, preprints, data, software, statements, dissertations, protocols, affiliations, contributors, etc. should all be identified and that is still an important part of the picture. But what is most important is how they relate to each other and the context in which they make up the whole research ecosystem. The foundation of the research nexus is metadata; the richer and more comprehensive the metadata in Crossref records, the more value there is for our members and for others, including for future generations.

DataCite articulates its value as:

Organizations within the research community join DataCite as members to be able to assign DOIs to all their research outputs. This way, their outputs become discoverable and associated metadata is made available to the community. DataCite then develops additional services to improve the DOI management experience, making it easier for our members to connect and share their DOIs with the broader research ecosystem and to assess the use of their DOIs within that ecosystem.

While value-added services are welcome in themselves, two issues arise when DOI aggregations are marketed as the sole reference point for scholarly outputs.

Cost barriers

Firstly, there are substantial cost barriers to the adoption of DOIs for organizations in developing countries. The costs of minting DOIs (or of joining DataCite or Crossref, even as a consortium) make them unaffordable in many parts of Africa, Asia, and Latin America, where there is often little or no budget for these types of services. In addition, fluctuating currency exchange rates in many countries mean that future costs can be highly unpredictable. Although all PIDs require some resources to maintain them (otherwise they risk becoming inactive or inaccessible), some PIDs, such as Handles and ARKs, are far less expensive to acquire.

There are programs to assist lower-resourced countries, such as the Global Equitable Membership program at Crossref or the Global Access Fund from DataCite. However, these programs provide only temporary or partial relief and do not address the fundamental financial constraints facing many organizations. The DataCite Global Access Fund, for example, offers registration of DOIs free of charge for one year, after which organizations must provide a sustainability plan outlining how they will continue accessing these services once the funding period ends. This is not a long-term solution. It merely entices organizations to join, leaving them to confront the cost issue once the year is over. The Crossref Global Equitable Membership program offers waivers for organizations in some low-income countries, but it covers only a subset of the countries and institutions that face serious financial constraints.

Examples

Latin America: According to an analysis by LA Referencia of over 4.5 million metadata records harvested from journals and repositories in Latin America, only around 20% have DOIs. LA Referencia attributes this low DOI coverage mainly to the cost (in US dollars) of these services for universities and research institutions. As an alternative to DOIs, the Brazilian Institute of Information in Science and Technology (IBICT-Brazil) and LA Referencia (supported by SCOSS) are working on a decentralized technological solution based on ARK identifiers. This initiative seeks to support the assignment and resolution of identifiers through a network of resources provided by Brazilian institutions, and is also being considered by other Latin American countries.

Africa: The same problem exists in Africa. In the AfricaConnect project, WACREN (West and Central Research and Education Network) is collaborating with the Regional Universities Forum for Capacity Building in Agriculture (RUFORUM), a network of 163 African universities in 40 African countries, to provide contemporary platforms for research data sharing and open access publishing. The collaboration is important for advancing research and education in agriculture in the region, and for improving research data management practices, especially with regard to FAIR data and open science. PIDs are an important aspect of the initiative for ensuring permanence of resources. However, given the huge range and volume of research outputs related to the project, paywalled DOIs are not an option. If they were required, this would slow the uptake of PIDs, limit their adoption and hamper collaboration both within Africa and with the global research community. The African NRENs, therefore, are focused on using ARKs, which are free to acquire and which can be provided to the universities directly.

Risk of monopolization

Possibly more problematic than the cost issue is the risk of monopolization. If funders and governments require a DOI (and therefore representation in the metadata aggregation) for a resource to be “counted” or considered “legitimate”, this has the potential to create a quasi-monopolistic system, which gives a few players undue influence and introduces the risk of profiteering. Particularly worrisome is the narrative that associates having a DOI with the general “trustworthiness” or “integrity” of research. Crossref, for example, recently published a blog post about this, saying:

One particular benefit of a rich and transparent metadata network is the opportunity to infer judgments on the integrity of the scholarly record (ISR). Amanda Bartell, Head of Member Experience, highlighted that the community agrees that availability of information about relationships between research outputs, institutions and other elements of the scholarly ecosystem together provide essential context for deciding about trustworthiness of organisations and their published content. Conversely, it can make it harder for parties to pass off information as trustworthy when that context is missing.

If this perspective is widely embraced, it will have a seriously detrimental effect on developing countries because many of their outputs will not be included in those centralized metadata collections. Meanwhile, organizations outside the global north already struggle with lower visibility and perceived credibility of their research. Requirements to assign DOIs in order to be considered legitimate will only further exacerbate this situation. Inferring a relationship between the quality or integrity of a resource and having a DOI is simply wrong, and should be avoided.

The way forward

Discoverability and persistence are critical for ensuring research outputs are widely used and have the broadest impact. However, this can and should be facilitated in a way that is flexible to the needs of everyone in the scholarly community and, as much as possible, reduces structural inequalities. Instead of focusing our efforts on the centralization and use of a few select DOI services, which will end up excluding many due to financial barriers, the best way forward is for institutions, countries, and regions to choose the PID service that is most suitable for their own local context and conditions. Only in this way can we ensure the broadest possible adoption of PIDs across the scholarly landscape. And, while there is a value proposition for focusing on just one or two global PID services, the risk of creating a universe of “haves” and “have nots” must be avoided.

Moreover, the use of several types of PIDs need not result in metadata silos. Harmonization of metadata elements across different PID services should be enough to ensure interoperability of collections, while also creating the conditions for a more inclusive and enduring scholarly commons. As articulated in the UNESCO Open Science Toolkit, Bolstering Open Science Infrastructures for All, “open science should build on long-term practices, services, infrastructures and funding models that ensure the equal participation of scientific producers from less privileged institutions and countries”.
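To make the idea of harmonization a little more concrete, the sketch below maps records from two PID services onto one shared set of elements using a simple crosswalk. It is only an illustration: the field names, the crosswalk tables, the sample records and the identifiers are all invented here, and they do not reproduce the actual metadata schemas of Crossref, DataCite or any ARK provider.

```python
# Hypothetical crosswalks: source field name -> common element name.
# These are illustrative only, not the real schemas of any PID service.
CROSSWALKS = {
    "doi": {"title": "title", "creator": "creator", "year": "publication_year"},
    "ark": {"what": "title", "who": "creator", "when": "publication_year"},
}


def harmonize(pid_type, pid, record):
    """Translate a service-specific record into the shared set of elements."""
    mapping = CROSSWALKS[pid_type]
    common = {"pid": pid, "pid_type": pid_type}
    for source_field, common_field in mapping.items():
        if source_field in record:
            common[common_field] = record[source_field]
    return common


# Two made-up records describing outputs registered with different PID types.
doi_record = {"title": "Soil survey dataset", "creator": "A. Researcher", "year": "2021"}
ark_record = {"what": "Maize yield thesis", "who": "B. Student", "when": "2022"}

collection = [
    harmonize("doi", "doi:10.99999/example", doi_record),
    harmonize("ark", "ark:/99999/fk4example", ark_record),
]

# Every entry now exposes the same keys, so one index can search both.
titles = [entry["title"] for entry in collection]
```

In a design like this, an aggregator only needs one small mapping per PID type, and outputs identified by ARKs or Handles can sit in the same discovery index as those identified by DOIs.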

COAR has long advocated for a distributed yet interoperable environment, which is critical for a resilient and bibliodiverse ecosystem and also reduces the risk of service lock-in. This approach has also been underscored in the UNESCO Recommendation on Open Science, which encourages members to adopt “federated information technology infrastructure for open science… and robust, open and community managed infrastructures, protocols and standards to support bibliodiversity and engagement with society” (3). As such, we urge the broader community to consider the perspectives presented here and ensure global solutions reflect the needs and requirements of all countries and regions.

End Notes

(1)  The UNESCO Digital Library, for example, uses ARKs: https://unesdoc.unesco.org/ark:/48223/pf0000379949.locale=en

(2) The DOI Foundation is funded by annual fees paid by the registration agencies (RAs) and other members. The Foundation manages the DOI system on behalf of the agencies that run DOI registries. DOI registries allocate DOI prefixes, register DOI names, and provide a metadata schema associated with each DOI record. There are currently 12 registration agencies, most of which are focused on minting DOIs for scholarly resources.

(3) From UNESCO Recommendation on Open Science. 2022. (iii) Investing in open science infrastructures and services. Section 18e.


