Thursday, 7 May 2009
I've been taking advantage of the new ISI license to import citation counts into our school repository.
Now that we have Web of Science and Google Scholar citation counts listed for matching eprint records, you can search for eprints that fall into a citation range (e.g. 10 or more) and order search results by either type of citation count.
Now I'm being asked to provide reports of h-factors, citation averages and community-normalised bibliometrics. What larks! I've had to draft in Perl assistance to write the necessary scripts.
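For anyone wondering what those reports boil down to, here is a minimal sketch of the two simplest measures, written in Python rather than the Perl we actually used. It is not our script: the function names and the sample counts are made up for illustration, and it assumes you already have a plain list of citation counts for one author or group.

```python
from statistics import mean


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break  # counts only get smaller from here on
    return h


def citation_average(citations):
    """Mean citations per paper; 0.0 when there are no papers."""
    return mean(citations) if citations else 0.0


# Hypothetical citation counts for one researcher's eprints.
counts = [25, 12, 8, 5, 3, 0]
print(h_index(counts))
print(citation_average(counts))
```

Community normalisation is a different matter, since it needs a baseline for each field, so it is left out of this sketch.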
But what it's taught me is that we're still missing out on an awfully big proportion of our school's research outputs - and we're an engineering school, not a humanities school. So I'm looking to add a THIRD source of citation data - the ACM Digital Library. The ACM run many of the journals and conferences that our researchers publish in - journals and conferences that ISI don't index. And then there's Scopus - that would potentially be a FOURTH citation data source. It looks like we'll need to have a separate "evidence of impact" dataset in the repository.
Integrating all this extra data has been made very easy by some developments from Chris Gutteridge and Tim Brody. Firstly, the EPrints import framework now supports an update option that allows you to merge new data with existing records. Secondly, the Microsoft Excel exporter (which is so useful for generating complex reports and charts) now has a matching importer. Combine these two features and you can use all the user interface features of a spreadsheet to make large-scale batch amendments outside the repository environment and then commit the updates to the repository. This is great for spotting and fixing metadata errors.
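To give a feel for what "merge new data with existing records" means, here is a rough Python sketch of the idea. It is emphatically not the EPrints import code or its API: the function, the field names (eprintid, title, wos_cites) and the rule that blank cells leave the existing value alone are all assumptions of mine for the purpose of illustration.

```python
import csv
import io


def merge_updates(records, updates, key="eprintid"):
    """Overlay non-empty fields from updates onto records sharing the same key.

    Returns the merged records (keyed by id) and the ids of any update rows
    that matched no existing record.
    """
    merged = {row[key]: dict(row) for row in records}  # copy, don't mutate
    unmatched = []
    for row in updates:
        target = merged.get(row[key])
        if target is None:
            unmatched.append(row[key])
            continue
        for field, value in row.items():
            if value != "":  # assumed rule: a blank cell means "no change"
                target[field] = value
    return merged, unmatched


# Hypothetical data standing in for a repository export and an edited sheet.
existing = io.StringIO("eprintid,title,wos_cites\n1,Old titel,\n2,Another paper,3\n")
edited = io.StringIO("eprintid,title,wos_cites\n1,Old title,7\n9,Unknown,1\n")

merged, unmatched = merge_updates(
    list(csv.DictReader(existing)), list(csv.DictReader(edited))
)
print(merged["1"])
print(unmatched)
```

Reporting the unmatched rows separately, rather than quietly creating new records, seemed the safer choice for a sketch about amending what is already there.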