Wednesday, 24 June 2009

Hardworking Repositories: Comparing UK & US

To go with the list of UK repositories, here are the top 10 most hardworking US repositories, based on the number of days of deposit activity that they achieved in the last year, according to ROAR.

RIT Digital Media Library (253)
Georgia Tech's Institutional Repository: SMARTech (252)
ScholarSpace at University of Hawaii at Manoa (248)
NITLE DSpace Service: Middlebury College (245)
Trinity University (239)
AgSpace: Home (234)
Florida State University D-Scholarship Repository (231)
DigitalCommons@Florida Atlantic University (230)

Once again, congratulations to those on the list. The methodology for drawing up this list was deliberately devised to reward daily engagement rather than the number of deposits, in order to factor out bulk imports from external data services.
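As a rough illustration of why counting days rather than deposits factors out bulk imports, here is a minimal Python sketch. It is not the script used to build the list: the function name and the sample dates are made up for the example, and it assumes each deposit record simply carries the date on which it was made.

```python
from datetime import date


def active_deposit_days(deposit_dates):
    """Count the distinct days on which at least one deposit was made."""
    return len(set(deposit_dates))


# Hypothetical data: one bulk import of 5000 records on a single day,
# followed by one deposit on each of the next three days.
deposits = [date(2009, 1, 5)] * 5000 + [
    date(2009, 1, 6),
    date(2009, 1, 7),
    date(2009, 1, 8),
]

total_deposits = len(deposits)               # dominated by the bulk import
active_days = active_deposit_days(deposits)  # the bulk import counts once

print(total_deposits, active_days)
```

Under this measure the 5000-record import contributes a single day, exactly the same as a day on which one person deposits one paper.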

(I am slightly hesitant about publishing this list, because I am less familiar with the US repository scene than with that in the UK. That means that I have difficulty sanity-checking the list. In particular, the Middlebury College and Trinity services seem to be registered with the same host, even though their front ends are delivered from different host names. Do they genuinely count as separate repositories?)

These two lists (US/UK) do show some apparent differences in practice. If the headline numbers (days on which deposits are made) are subdivided into three categories according to the number of deposits made that day (few: 1-9, medium: 10-99, high: 100+), then it appears that the UK repositories are dominated by medium deposit days, and the US repositories by few deposit days.
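The subdivision described above could be sketched roughly as follows. This is only an illustration: the function names and sample figures are invented, and it assumes that a day with exactly 10 deposits belongs in the medium category.

```python
from collections import Counter


def classify_day(n_deposits):
    """Label a deposit day as few (1-9), medium (10-99) or high (100+)."""
    if n_deposits >= 100:
        return "high"
    if n_deposits >= 10:  # assumption: exactly 10 counts as medium
        return "medium"
    if n_deposits >= 1:
        return "few"
    raise ValueError("a deposit day must have at least one deposit")


def summarise(daily_counts):
    """Tally deposit days per category, ignoring days with no deposits."""
    return Counter(classify_day(n) for n in daily_counts if n > 0)


# Hypothetical daily deposit counts for one repository.
sample = [1, 3, 0, 12, 50, 250]
summary = summarise(sample)
print(summary["few"], summary["medium"], summary["high"])
```

Comparing these tallies between two repositories, or between the two tables as a whole, is one simple way to see whether few-deposit days or medium-deposit days dominate.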

Is this difference significant? Is it an artefact of the workflows and processes of the repository software platforms (the UK table is dominated by EPrints, the US table by DSpace)? Is it due to the different sizes of the host institutions? Or does it show a genuine difference in practice in terms of individual self-archiving vs proxy deposit? There needs to be some more analysis.


  1. First, I should say that I am very new to the repository world. I've been in academic IT for the past 15+ years, but have only been working with repositories for the past 6 months or so.

    That said, frankly, I find this data a little depressing--even the hardest working repositories were only seeing deposits about 250 days per year, and it was REALLY unusual to have more than 99 assets show up in a given day. It would also be helpful to have an "average" line on these graphs (maybe excluding the highest of the high days or using a log scale if those days would skew things?) so we could get a sense of what a "typical" day was like for these repositories.

    What I read into this (and I could be very wrong) is that these repositories are either too difficult for end users to use, not appealing enough (an activity you SHOULD do, rather than one you WANT to do), or both. I can't imagine that so few assets are being created at major universities on a daily basis.

  2. NITLE coordinates a hosted consortial version of DSpace for 19 colleges and universities. Trinity was an original participant but has since gone their own way, I believe.

    I think you are right to question these statistics. We like to think we are busy here at Middlebury, but I'm not sure we're *that* busy!