Self-deposit rates - external calibration

Southampton University, and our school in particular, never had a CRIS or Research Management System in which to record all publications before the repository came along. Consequently we genuinely can't answer questions about the percentage of our research output that gets put into our repository, because we have no independent way of knowing what the size of our research output is! So we have always reported a figure of "100%" in surveys, or admitted our ignorance in interviews.

My first posting listed "batch importing articles from publishers' web sites" as a summer task for me. It's not something that I got around to in a serious way - I did do a batch upload of several dozen articles and then got stuck when I realised that I would have to manually check them for duplicates.

Anyway, my colleague Stevan Harnad pushed me for a figure for the proportion of our research available in the repository, as he is refining our methods for measuring the "OA Citation advantage". Since it's impossible to refuse one of Stevan's requests, I manually checked a "representative sample" of ECS-affiliated publications in the ACM Digital Library from the year 2006 against our repository holdings. After allowing for trip reports, edited proceedings and oddities like people publishing a paper and immediately taking a post at another university, I could detect only 1 missing deposit in 40 publications - that's a success rate of 97.5%.
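
For anyone who wants to repeat the arithmetic on their own sample, here is a minimal sketch in Python. The counts (39 found out of 40 checked) are the ones from the paragraph above. The function names are made up for illustration, and the Wilson score interval is an addition that was not part of the original manual check; it is only there as a rough reminder of how much uncertainty a sample of 40 carries.

```python
import math


def deposit_rate(found, sampled):
    """Fraction of sampled publications that were found in the repository."""
    if sampled <= 0:
        raise ValueError("sample size must be positive")
    return found / sampled


def wilson_interval(found, sampled, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Assumption: the sample behaves like a simple random sample, which a
    hand-picked "representative sample" may not.
    """
    p = found / sampled
    denom = 1 + z * z / sampled
    centre = (p + z * z / (2 * sampled)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / sampled + z * z / (4 * sampled * sampled)) / denom
    )
    return centre - half_width, centre + half_width


# Counts from the post: 40 publications checked, 1 missing deposit.
sampled = 40
missing = 1
found = sampled - missing

rate = deposit_rate(found, sampled)
low, high = wilson_interval(found, sampled)
print(f"deposit rate: {rate:.1%}")
print(f"rough 95% interval: {low:.1%} to {high:.1%}")
```

The interval is fairly wide with only 40 items, which is one more reason to check a second sample.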

To be honest, I was stunned. I expected to find a lot of missing items. I still need to examine these 39 items more closely and see what percentage have the full texts uploaded! I also ought to check a second sample from ISI's Web of Science, but these are both tasks for another day. Or perhaps later on today, when I take the train to the Open University in Milton Keynes where I have been invited to the official opening of their Repository and Newest Research Building. See for the former and for the latter.

