7 things you should know about… Institutional Repositories, CRIS Systems, and their Interoperability

Scenario

  1. What are the main differences between CRISs and IRs?
  2. Are CRISs and IRs incompatible? Why are then CRISs sometimes replacing IRs?
  3. Are all CRISs commercial products? And are all IRs open source?
  4. So how can a repository “play a CRIS role”? Are there any examples for that already?
  5. The expression CRIS/IR interoperability is very often heard these days, but what exactly does it mean?
  6. Is then the CRIS+IR use case the best one for institutional system architecture?
  7. Are there further areas where CRIS/IR interoperability might be applicable?

References / Useful Links

Scenario

System interoperability between institutional repositories (IRs) and research information management systems (CRISs) is a well-explored but scarcely documented topic that has been around for a number of years [1] in countries where consistent reporting on research outputs is required. The UK is perhaps the best example of such a country, with the Government periodically carrying out research assessment exercises for funding allocation purposes, and with institutions needing to report regularly to funders so that these can assess the impact of their research funding programmes [2]. This text will consequently take a somewhat UK-centred approach to the topic, but there are many other countries where this functionality has been exploited for some time now, such as the Netherlands, Germany, Italy, Norway, Portugal or Spain, and the list keeps growing steadily [3].

CRIS/IR interoperability basically aims to exploit (generally at institutional level) the complementary functionality of both repositories and research information management systems in order to enhance the features of both of them and to best serve the needs of the organisation that hosts them. Technically this is achieved by ensuring some degree of information exchange between both platforms. Details on why and how CRIS/IR interoperability is performed are provided in the answers to the seven questions below.

 

1. What are the main differences between CRISs and IRs?

Current Research Information Systems (CRISs) and Institutional Repositories (IRs) are two kinds of platforms for carrying out research information management at institutional, regional, national or subject-specific level. CRISs and IRs share a good deal of functionality, their common goal being to collect information about research activity carried out within a given research environment by means of a series of metadata supported by a specific data model. Both systems will ultimately rely for their information collection purposes on the data provided by researchers and research administrators, either directly or via mediating services such as libraries, research offices, publishers and databases.

Besides these basic similarities, there are also relevant differences in the approach CRISs and IRs take towards collecting and disseminating research information. Some of the main differences are listed below, keeping the analysis at institutional level for simplicity. It’s also worth noting that these differences are described from a ‘historical’ or conceptual viewpoint: nowadays both CRISs and IRs keep evolving rapidly towards an increasing level of integration, and the differences between them are becoming more and more difficult to point out [4].

1. Wide research activity vs Research outputs

CRISs focus on the whole of institutional research activity, including research projects, grants, people, organisations, outputs, and research facilities and equipment. Repositories instead emphasise the collection of institutional research outputs, whose scope has gradually widened: it originally covered mainly institutional publications but has since been extended to additional outputs such as research data.

2. Reporting vs Dissemination

CRISs collect such a wide range of research information in order to be able to describe institutional research activity for reporting purposes, whether at institutional, funder or governmental level. Research assessment exercises – such as the REF in the UK – and research funders’ wish to measure the impact of their investments have both traditionally driven the implementation of ever more sophisticated CRISs at institutions. The main goal of repositories, on the other hand, is the collection and dissemination of institutional research outputs, with a strong emphasis on publications. Repository implementation was driven by the international Open Access movement, which aims to offer free access to publicly funded institutional research publications by removing the barriers that restrict access to research.

3. Internal- vs Externally-oriented

Because CRISs are tools for collecting information about all areas of institutional research activity, including its financial side, they are basically designed for internal use at institutions and have traditionally paid little attention to disseminating the research information they store. The purpose of repositories is exactly the opposite: while many institutions may use them as wider research information management platforms, they are oriented towards the outside world, aiming to showcase, disseminate and grant open access to the institutional research output.

4. Research Office vs Library

As a result of the above, CRISs and IRs have traditionally been managed by different institutional units: while it’s reasonable to expect that CRISs get managed from the same Institutional Research Offices that also deal with research project management, repositories should instead be run and maintained from the institutional area that will usually deal with publications: the Library. Institutions have come a long way in promoting dialogue between their Research Office and Library since repositories and CRISs started getting implemented, but there is still some asymmetry in the way this role distribution places the different systems with regard to researchers: while the Research Office stands naturally close to research departments, whose income it secures by managing funded projects, the Library has traditionally been forced to carry out intensive dissemination campaigns for spreading the word among researchers about Open Access and institutional repositories.

This said, there are also numerous exceptions to this separation of duties: libraries do sometimes manage the data input into CRISs as well as the institutional repository, and institutional research offices often run the data exchange processes between CRISs and IRs. However, this organisational aspect remains a relevant factor at many institutions.

5. Metadata vs Full-text

As accurate reporting tools, CRISs are all about metadata. Repositories will also need metadata, but their emphasis lies rather on full-content availability and digital object creation, curation, preservation and re-use. This does not prevent many repositories from having a larger number of metadata-only items than of full-text (or full-content) ones, but this is not the way repositories are meant to work.

6. CERIF vs Dublin Core

From a technical viewpoint, the main difference between CRISs and IRs lies precisely in the metadata standards they use, and this difference is arguably the main barrier preventing their interoperability: while CRISs run the complex, often non-harmonised metadata standards they need to describe the wide institutional research activity, repositories operate much simpler (often too simple) metadata models which are otherwise extremely consistent across institutions and countries [5]. The Common European Research Information Format (CERIF) is the most widespread CRIS metadata standard, and although its current implementation significantly varies across countries, it is rapidly becoming the default standard that will ensure interoperability at a higher level, not just with repositories, but also with funder and research assessment systems. The main repository metadata standard is qualified Dublin Core or qDC, together with a set of additional, slightly more complex models like MODS, METS or PREMIS. Repository metadata standards are often perceived to be ‘too flat’, ie offering insufficient flexibility for describing complex semantic areas such as research funding, but the extraordinary rate of repository implementation owes a good share of its success to its metadata model simplicity.

Both CRIS and IR metadata standards are permanently evolving towards more complex data models that will often enable a deeper interoperability between them.

7. Commercial vs Open Source Platforms

Although many institutions have traditionally chosen to develop their own in-house-built CRISs to meet their specific reporting needs, CRISs tend to be commercial systems due to their intrinsic complexity, and also to the key fact that commercial CRISs allow user communities to be set up. As for repositories, their system architecture is as a rule much lighter, so they have traditionally been open source platforms and have built upon a wide and committed international community for which sharing was one of the main drivers.

8. Open vs Closed

As a result of the previous difference, the information sharing standards are very different between the CRIS and IR communities. The repository community, driven by the intensive library-based communication activity that existed way before they arrived, has always been completely open and made the case for dissemination not just of research outputs, but of working procedures and open-source code too. The CRIS community on the other hand has only recently started to systematically share and openly discuss its progress beyond the gates of the rather small research manager and administrator community.

Most of these differences between CRISs and IRs are rapidly vanishing nowadays, as tools are developed for increasing system interoperability and as institutional platforms merge their functionality for delivering a better service to researchers and all other involved stakeholders. It’s useful however to know where both systems come from in order to fully understand how the picture gradually evolves.

 

2. Are CRISs and IRs incompatible? Why are then CRISs sometimes replacing IRs?

CRISs and IRs are by no means incompatible: in fact, they provide solutions for different institutional needs and very often work together. However, as CRISs improve their functionality, they have started covering a growing number of features traditionally delivered by repositories, such as OAI-PMH compliance, holding openly accessible full-text files alongside the publications’ metadata, managing embargoes, etc.
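To illustrate what OAI-PMH compliance means in practice, here is a minimal sketch of the kind of harvesting request such a system is expected to answer. The `verb`, `metadataPrefix` and `from` arguments and the `oai_dc` prefix come from the OAI-PMH protocol itself; the base URL and the function name are hypothetical placeholders and do not point to any real CRIS or repository.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access involved)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Selective harvesting: only records changed since this datestamp.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only to show the shape of the request.
url = build_list_records_url("https://cris.example.org/oai", from_date="2014-01-01")
print(url)
```

A harvester would send this request over HTTP and receive an XML response listing the records that match.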

Although they do not yet deliver full repository functionality, CRISs are perceived by a number of institutions as advanced enough (or in the process of becoming so) to justify phasing out their institutional repositories and relying on their CRISs for all research information management-related tasks. The growing degree of functional overlap between both systems is of course one of the main reasons for the occasional ‘replacement’ of IRs by CRISs at a time when expenses are often very closely monitored, but this process will usually only happen at research-intensive institutions where CRISs and IRs coexist – smaller institutions relying solely on their repository for research information management purposes will usually not consider running a CRIS and will hence keep their repository.

 

3. Are all CRISs commercial products? And are all IRs open source?

No. The CRIS environment is understandably perceived to be a commercial one, mainly because commercial CRISs like Pure (Atira-Elsevier) and Converis (Avedas-ThomsonReuters) are widespread platforms in the process of aggressive commercial expansion, and because the big companies in research information management have recently repositioned themselves in this market niche, making significant investments along the way.

Not all institutional repositories run on open source platforms either – even if most do. Platforms like DigiTool or CONTENTdm are proprietary software systems.

In non-English-speaking countries, however, the main CRIS vendors have traditionally been university consortia turned into mixed public-private organisations (such as Cineca in Italy or OCU and Sigma in Spain). Moreover, many institutions choose even today to develop their own in-house-built CRIS platforms, since they perceive this to be the only way to have full control over meeting institutional requirements and keeping implementation costs at bay. Finally, many other higher education institutions (HEIs) are also exploring the possibility of extending their institutional repository features so that the repository can play a CRIS role in the so-called IR-as-CRIS use case. This rather old line of thought inevitably leads to the birth and consolidation of the long-awaited open source CRISs. The whole landscape is in fact evolving so rapidly at present that, while the commercial forces involved in the current picture are massive, it is really difficult to predict what this rather fractured scenario will evolve into.

 

4. So how can a repository “play a CRIS role”? Are there any examples for that already?

A repository may (to some extent) play a CRIS role by extending its underlying data model in order to collect additional research information besides research outputs (in short: publications). There is nothing to prevent a Dublin Core-based metadata model from being extended to cover additional research areas such as research projects. In fact, the EPrints platform team in Southampton has been putting this into practice since as early as 2010 [6], which has led to a widespread and increasing number of enhanced-IR or IR-as-CRIS platforms in the UK.

Repositories will never try to ‘replace’ commercial CRISs since a wider research information management is not their main objective, but they will be able to deliver enough RIM functionality for institutions not wishing to run a sophisticated CRIS system to be able to rely on their IRs instead. Current repository metadata enhancement initiatives such as RIOXX [7] in the UK or the extended OpenAIRE Guidelines at European level are in fact very much supporting the enhanced-IR concept by providing the technical means for ensuring that IRs can collect information about research projects in an interoperable way.
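As a rough illustration of the enhanced-IR idea, the sketch below extends a flat, Dublin Core-style publication record with a linked research project and then lists the outputs of that project. All field names other than the `dc:` elements, as well as the sample values, are hypothetical: this is not the EPrints, RIOXX or OpenAIRE schema, only a picture of what extending the data model means.

```python
def add_project(record, project_id, funder, title):
    """Return a copy of a Dublin Core-style record with one linked project appended."""
    extended = dict(record)
    projects = list(extended.get("projects", []))
    projects.append({"id": project_id, "funder": funder, "title": title})
    extended["projects"] = projects  # hypothetical extension field
    return extended


def outputs_for_project(records, project_id):
    """List the titles of all records linked to the given project."""
    return [
        r["dc:title"]
        for r in records
        if any(p["id"] == project_id for p in r.get("projects", []))
    ]


# A plain publication record, as a repository would traditionally hold it.
publication = {
    "dc:title": "An example article",
    "dc:creator": ["Doe, Jane"],
    "dc:date": "2014",
}

# The same record after the repository data model has been enhanced.
enhanced = add_project(publication, "P-001", "Example Funder", "Example project")
print(outputs_for_project([publication, enhanced], "P-001"))
```

Once project information travels with each output, the repository can answer a reporting question such as "which outputs did this grant produce?", which is the kind of query traditionally reserved for a CRIS.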

There are several best practice examples of repositories playing a CRIS role. The most advanced one may well be the Research Hub at the University of Hong Kong (HKU), where a DSpace-based IR has gradually evolved into a DSpace-CRIS system with the active support of a developer team at Cineca in Italy [8]. Another good example of an IR system that has enhanced its data model to include an ever-growing number of additional research information management features is the Enlighten repository at the University of Glasgow [9], whose approach to CRIS/IR interoperability is described in the best practice factsheet associated with this text.

 

5. The expression CRIS/IR interoperability is very often heard these days, but what exactly does it mean?

CRIS/IR interoperability means enabling some technical mechanism for information exchange between Current Research Information Systems and Institutional Repositories. This information exchange will usually involve some degree of metadata transfer between both systems, thus relieving researchers and/or research administrators of the time-consuming effort of repeatedly typing in the same information describing their publications or some other aspect of their research activity. CRIS/IR interoperability fits within a wider trend to make as many systems as possible highly interoperable, so that the long-cherished motto “one input, many outputs” can actually start becoming a reality. CRISs already provide interoperability with additional internal institutional systems such as Finance or HR modules, so CRIS/IR interoperability is just one more step towards seamless institutional system integration (see the institutional system architecture diagram at the University of St Andrews in [10]).

The most frequent way of achieving CRIS/IR interoperability at institutional level is to map CRIS metadata standards to IR ones so that information can be automatically transferred between both systems. Usually, publication metadata harvested by the CRIS is delivered into the IR, where the researcher or the Library adds the full-text file and exposes it once the appropriate copyright requirements are met. This is the so-called CRIS+IR use case for institutional system configuration, where both systems coexist and cooperate in a way that maximizes institutional benefit: the repository content will grow much more quickly, as researchers will find it much easier to deposit their final manuscripts at the stage when metadata is transferred from the CRIS to the IR, and the CRIS will not need to perform a repository role it was not designed for, but will just store, among its diverse publication metadata, the URL of the full-text file held in the repository.
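The CRIS+IR workflow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the CRIS field names, the mapping table, the in-memory `repository` dictionary and the handle base URL are all hypothetical, and a real implementation would go through each platform’s own import and export interfaces.

```python
# Hypothetical CRIS field names mapped to Dublin Core element names.
CRIS_TO_DC = {
    "title": "dc:title",
    "authors": "dc:creator",
    "publication_year": "dc:date",
    "doi": "dc:identifier",
}


def cris_to_dc(cris_record):
    """Map the publication fields of a CRIS record to a Dublin Core-style dict."""
    return {dc: cris_record[field] for field, dc in CRIS_TO_DC.items() if field in cris_record}


def deposit(cris_record, repository, handle_base="https://repository.example.org/handle/"):
    """Transfer metadata into the repository and store the item URL back in the CRIS record."""
    item_id = str(len(repository) + 1)
    repository[item_id] = {"metadata": cris_to_dc(cris_record), "full_text": None}
    cris_record["repository_url"] = handle_base + item_id
    return item_id


repository = {}
record = {
    "title": "An example article",
    "authors": ["Doe, Jane"],
    "publication_year": "2014",
    "grant_id": "G-42",  # CRIS-only information with no mapping: it stays in the CRIS
}
item_id = deposit(record, repository)

# Later, the researcher or the Library attaches the manuscript in the repository.
repository[item_id]["full_text"] = "accepted_manuscript.pdf"
print(record["repository_url"])
```

The metadata is typed in once, the repository holds the file, and the CRIS keeps only a link to it.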

There are however additional and very important ways of achieving CRIS/IR interoperability that are currently being put in place at cross-institutional level. A key development in this regard is the recent release of the OpenAIRE Guidelines for CRIS Managers [11], which allow the mandatory deposit of publications resulting from EC-funded research projects to be carried out straight from CERIF-compliant CRISs by means of a CERIF-XML gateway. The extension of the OpenAIRE data provider base to include CRISs will result in a significantly larger amount of harvested content, again providing evidence for the win-win nature of CRIS/IR interoperability.

Furthermore, this CERIF-XML gateway for metadata exchange between CRISs and repositories can also be adapted for and applied to non-CERIF-compliant CRISs by mapping their own data model to the CERIF standard. This will make it possible to achieve interoperability far beyond OpenAIRE harvesting purposes, focussing instead on the development of regional or national research portals where metadata is harvested from an array of institutional CRISs, regardless of whether or not they are CERIF-compliant, and linked to the full-text content available at institutional repositories. The Catalan Research Portal presented at the recently held OR2014 conference provides a pioneering example of such functionality [12].
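The portal scenario can be pictured as follows: each institutional CRIS keeps its own field names, a per-institution mapping translates them into one shared model, and the resulting entries are linked to repository full text through a common identifier. The sketch is illustrative only. The shared model here is a simplified stand-in and not actual CERIF, and the institution names, field names, DOIs and URLs are all hypothetical.

```python
# Per-institution mappings from local CRIS field names to a shared, simplified model.
MAPPINGS = {
    "univ_a": {"titulo": "title", "doi": "doi"},
    "univ_b": {"name": "title", "DOI": "doi"},
}


def normalise(source, record):
    """Translate one local CRIS record into the shared model."""
    mapping = MAPPINGS[source]
    common = {target: record[field] for field, target in mapping.items() if field in record}
    common["source"] = source
    return common


def build_portal(cris_records, full_text_by_doi):
    """Merge records from several CRISs and link each one to repository full text, if any."""
    portal = []
    for source, record in cris_records:
        entry = normalise(source, record)
        entry["full_text_url"] = full_text_by_doi.get(entry.get("doi"))
        portal.append(entry)
    return portal


cris_records = [
    ("univ_a", {"titulo": "Article one", "doi": "10.1234/one"}),
    ("univ_b", {"name": "Article two", "DOI": "10.1234/two"}),
]
# Full-text locations as exposed by institutional repositories, keyed by DOI.
full_text_by_doi = {"10.1234/one": "https://repository-a.example.org/handle/1"}

portal = build_portal(cris_records, full_text_by_doi)
for entry in portal:
    print(entry["title"], entry["full_text_url"])
```

Records without a matching repository item simply appear in the portal as metadata-only entries.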

 

6. Is then the CRIS+IR use case the best one for institutional system architecture?

There is no such thing as the best use case for institutional system architecture. The CRIS+IR configuration provides a highly effective mechanism for dealing with the research information management needs of a large, research-intensive university, but smaller universities may not need such a complex architecture if they manage to successfully enhance their repository data model to collect sufficient metadata on the different areas of research activity they are required to report on. The ideal institutional system configuration will depend on many factors, such as the size of the institution, its reporting needs, the IT resources available for system upgrade and maintenance, or the balance of power between the different institutional units at a specific moment. It is important to keep in mind however that once an institution has settled on the system configuration model it perceives as preferable, it should ideally stick to that model in order to fully exploit its advantages – the work on the Enlighten repository platform at the University of Glasgow featured in the companion document [13] being a clear example of such a beneficial approach.

 

7. Are there further areas where CRIS/IR interoperability might be applicable?

There are indeed areas where CRIS/IR interoperability will be required to provide solid cross-institutional reporting on different threads. For instance, Research Data Management (RDM) is presently being implemented at institutions in many different ways, with institutions using a variety of systems for dealing with the task. Since the whole institutional RDM activity is to be jointly reported to research funders – which have already issued mandates for filing the research data underlying research publications – there is a need to harmonise the different ways RDM is dealt with via CRISs, repositories or both [10].

Another area – to some extent UK-specific – where system interoperability will possibly be very welcome is the management of Gold Open Access payments [14]. An analysis is under way to identify the appropriate metadata for describing such activity (together with additional policy-related topics such as content-sharing licenses), but its implementation is very likely to be different for CRISs and repositories, again leading to a requirement for metadata harmonisation and mapping.

In general any new research information management area (eg coding research facilities and equipment) will require some degree of CRIS/IR interoperability to be successfully implemented while a significant dispersion in research information management system configurations persists among institutions.

Pablo de Castro, euroCRIS Board, September 2014

 

References / Useful Links

[1] See the “Repositories and CRIS: working smartly together” conference, held in July 2011 in Nottingham by the Repositories Support Project (RSP), as an early example of coordination initiatives on the issue of CRIS/IR interoperability, http://www.rsp.ac.uk/events/repositories-and-cris-systems-workingsmartly-together/

[2] Rosemary Russell, “CERIF CRIS UK landscape study: work in progress report”. Presentation at the euroCRIS Membership Meeting Spring 2013 (DFG, Bonn, May 13-14, 2013), http://dspacecris.eurocris.org/handle/11366/75

[3] Pablo de Castro, “Institutional CRIS implementation in Europe: one goal, different strategies and speeds”. Presentation delivered at the euroCRIS Membership Meeting Autumn 2013 (Universidade do Porto, Nov 14-15, 2013), http://dspacecris.eurocris.org/handle/11366/71

[4] Pablo de Castro, Kathleen Shearer, Friedrich Summann, “The gradual merging of repository and CRIS solutions to meet institutional research information management requirements”. Proceedings of the 12th International Conference on Current Research Information Systems (2014). Procedia Computer Science 33: 39-46 (2014), http://dx.doi.org/10.1016/j.procs.2014.06.007

[5] Friedrich Summann, “Discovery Metadata: A View from Institutional Repositories”. Presentation delivered at the 11th euroCRIS Strategic Seminar: “Metadata in Research Information Systems” (Brussels, Sep 9-10, 2013), http://dspacecris.eurocris.org/handle/11366/265

[6] Les Carr, “EPrints: A Hybrid CRIS/Repository?”. Presentation delivered at the 1st Workshop on CRIS, CERIF and Institutional Repositories, Module 1: “The intersection of data models and metadata concerning IR and CRIS” (CNR, Rome, 10-11 May 2010), http://eprints.ecs.soton.ac.uk/21048/

[7] EDINA News, Jul 3rd, 2014, “RIOXX 2.0 offers standard for open access publication metadata exposure”, http://edina.ac.uk/cgi-bin/news.cgi?filename=2014-07-03-rioxx.txt [retrieved Sep 16th, 2014]

[8] David T. Palmer, Andrea Bollini, Susanna Mornati, Michele Mennielli, “DSpace-CRIS@HKU: Achieving visibility with a CERIF compliant Open Source System”. Proceedings of the 12th International Conference on Current Research Information Systems (2014). Procedia Computer Science 33: 118-123 (2014), http://dx.doi.org/10.1016/j.procs.2014.06.019

[9] William Nixon, Susan Ashworth, Valerie McCutcheon, “Enlighten: Research and APC funding workflows at the University of Glasgow”. Insights: the UKSG journal, 26 (2). pp. 159-167 (2013), http://eprints.gla.ac.uk/83882/

[10] Anna Clements, Valerie McCutcheon, “Research data meets research information management: Two case studies using (a) Pure CERIF-CRIS and (b) EPrints repository platform with CERIF extensions”. Proceedings of the 12th International Conference on Current Research Information Systems (2014). Procedia Computer Science 33: 199-206 (2014), http://dspacecris.eurocris.org/handle/11366/184

[11] OpenAIRE Guidelines for CRIS Managers based on CERIF-XML, https://guidelines.openaire.eu/wiki/OpenAIRE_Guidelines:_For_CRIS

[12] Ramon Ros, Lluís Anglada, Sandra Reoyo, Ricard de la Vega, “Let’s do data research work: the creation of a portal with research information from Catalan Universities”. Presentation delivered at the Open Repositories 2014 conference (Helsinki, Finland, June 9-13, 2014), https://www.doria.fi/handle/10024/97743

[13] Best practice case study on CRIS/IR interoperability: the University of Glasgow’s Enlighten Institutional Repository and Research System. https://www.coar-repositories.org/activities/repository-observatory/third-edition-ir-and-cris/ir-cris-repository-profiles/

[14] The 12-month Jisc Monitor project, http://jiscmonitor.jiscinvolve.org/wp/, is presently modelling institutional Open Access payment activity in order to explore how a shared service might support institutions in meeting HEFCE’s policy on Open Access for the post-2014 REF assessment exercise.

 
