November 24, 2020
Prepared by Kathleen Shearer, Executive Director, COAR and Dr. Danny Kingsley, Visiting Fellow, Australian National University
The repository community has expressed significant concern about the requirements contained in Data Repository Selection: Criteria that Matter, which sets out a number of criteria for identifying and selecting data repositories that publishers will use to guide authors on where to deposit their data.
COAR agrees that it is important to encourage and support the adoption of best practices in repositories. A number of initiatives already define requirements for repositories, each based on different objectives, including the FAIR Principles, CoreTrustSeal, the TRUST Principles, and the CARE Principles for Indigenous Data Governance. Recently, COAR brought together many of these requirements, assessed and validated them with a range of repository types and across regions, and published the result as the COAR Community Framework for Best Practices in Repositories.
However, there is a risk that if repository requirements are set very high or applied strictly, only a few well-resourced repositories will be able to fully comply. Most domain and generalist data repositories do not currently support the criteria set out in Data Repository Selection: Criteria that Matter, in particular the dataset-level requirements. If implemented by publishers, these criteria will have a very detrimental effect on the open science ecosystem by concentrating repository services within a few organizations, further exacerbating inequalities in access to services. They will also introduce bias against some researchers, for example researchers who prefer to share their data locally, researchers in the Global South, or researchers who want to share their data in a relevant domain repository so that it is visible to their peers and can be integrated with other similar datasets.
The criteria outlined in Data Repository Selection: Criteria that Matter are also too narrowly conceived and do not reflect the range of issues that should be taken into consideration when deciding where to deposit data. Beyond support for FAIR data, researchers, policy makers, and journals should weigh several other factors when choosing or recommending an appropriate data repository:
All of these elements are important in their own right, and each researcher will need to decide how to balance them according to their own circumstances, based on funder requirements, priorities, and needs. This balance should not be determined by the publisher.
Our collective goal should be to develop a sustainable, distributed, and interoperable repository network that can support researchers around the world in managing and sharing their data. To achieve this, we need to work with and strengthen existing services, not disqualify them.
If you have concerns about this issue, please send your feedback to email@example.com