Thursday, 14 May 2009

Repositories and Research information

I've just spent three days in Athens at the euroCRIS meeting, discussing the relationship between repositories and Current Research Information Systems. The idea behind a CRIS (plural CRIS, not CRISes) is that it forms a cross-institutional information layer that aggregates information from the library (publications), human resources (personnel and organisational structure), finance department (projects and grants), estates management (facilities and equipment) and external sources (funding programmes, citation data), and so integrates at some level with the set of services provided by a repository.

The CRIS initiative comes out of an administrative background (starting in 1991) and so predates repositories and exists tangentially to them. A CRIS is typically concerned with repository metadata (how many papers? which publishers? written by whom?) but not its data contents. So my concern was that the repository should not be sidelined or marginalised, but instead the repository should be seen as a mature partner in the aggregate of information services provided across the institution. The experience gained in the UK's recent research assessment exercise (documented in Institutional Repository Checklist for Serving Institutional Management) has very clearly been that the library, through the repository, provides enormous experience in dealing with bibliographic information, ensuring quality and basic auditing capability on claims of authorship and publication. Treating the repository as a superfluous adjunct to an administrative catalogue is to miss the benefit that a managed repository has to offer.

At the meeting many universities from across Europe spoke of how they were trying to make the two systems work together in one form or another. In some ways, the innovation is not technical, but simply in the concept that institutional information should not be siloed, but that it can be shared between administrative domains for the benefit of the whole institution.

On the technical side, CERIF (Common European Research Information Format) is the data sharing and interoperability standard that euroCRIS are promoting. Now on its third major iteration since 1991, it models many of the entities found in the research environment, particularly people, institutions, projects and research publications, patents and products. The standard is expressed in the language of the relational database, with individual tables defined for each kind of entity. Its particular novelty is that roles like "author" or "project manager" are relationships between independent entities (people, publications or projects) rather than attributes of those entities, and that all relationships are constrained to an explicit time period.
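To make the "roles as time-bounded relationships" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the CERIF schema: the table names (person, publication, person_publication), the column names and the sample rows are all simplified stand-ins that I have made up for illustration. The point is only that "author" lives in a link table with a start and end date, rather than as a column on the person or the publication.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with two entity tables and one link table.

    All table and column names here are illustrative, not taken from CERIF.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE publication (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

        -- The role is a property of the relationship, and every
        -- relationship carries an explicit time period.
        CREATE TABLE person_publication (
            person_id      INTEGER NOT NULL REFERENCES person(id),
            publication_id INTEGER NOT NULL REFERENCES publication(id),
            role           TEXT NOT NULL,
            start_date     TEXT NOT NULL,  -- ISO 8601 date, e.g. 2008-01-01
            end_date       TEXT NOT NULL,
            PRIMARY KEY (person_id, publication_id, role, start_date)
        );
        """
    )
    # Made-up sample data.
    conn.execute("INSERT INTO person VALUES (1, 'A. Researcher')")
    conn.execute("INSERT INTO publication VALUES (1, 'An Example Paper')")
    conn.execute(
        "INSERT INTO person_publication VALUES (1, 1, 'author', '2008-01-01', '2008-12-31')"
    )
    return conn


def roles_on(conn, person_id, day):
    """Return (role, publication_id) pairs that hold for a person on a given day.

    ISO 8601 date strings sort chronologically, so plain string comparison works.
    """
    rows = conn.execute(
        "SELECT role, publication_id FROM person_publication "
        "WHERE person_id = ? AND start_date <= ? AND ? <= end_date "
        "ORDER BY role",
        (person_id, day, day),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_demo()
    print(roles_on(db, 1, "2008-06-01"))
    print(roles_on(db, 1, "2009-06-01"))
```

Asking "what was this person's role on a given date?" then becomes a query over the link table, and adding a new role such as "project manager" needs another link table (or another row), not a change to the person record.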

These requirements are straightforward to satisfy in EPrints - each new entity type (e.g. project) is just an extra dataset with an independent metadata schema and its own workflow and display rules. So an EPrints repository should be able to take on a useful role within a CRIS environment, deploying its comprehensive set of services for ingesting and managing project and personnel data, as well as research publication data. What is not yet clear is whether EPrints should be a helpful adjunct to, a useful component of, or a competent replacement for a CRIS.

That dilemma will be partly resolved by the new JISC R4R (Ready for REF) project, whose aim is to investigate the use of CERIF as a mechanism for exchanging research information between universities (e.g. supporting the movement of staff throughout their careers). R4R, which is a joint activity between King's College London and the University of Southampton, is focusing on the transfer of research information in the context of the forthcoming UK Research Excellence Framework (REF) activities.

In the meantime, there is a lot of interest in this area: the report on Serving Institutional Management that I mentioned above was the most-downloaded item of the OR08 conference.
