Wednesday 28 January 2009

Using the EPrints Commandline Toolbox

The so-called "EPrints Toolbox" (bin/toolbox) lets an administrator read and change data in EPrints from the command line. It is useful for people like me who aren't Perl programmers (the shame!).

I haven't had a chance to use it much, because it turns out I can often do what I want through the batch editor, but I found it very useful this morning for updating some publication data.

The background is that I am helping run a conference (WebSci09), for which all the submissions have been handled by a web service called "EasyChair". I scraped all the submission data for the accepted papers and posters from the EasyChair web pages and turned it into an EP3 XML file, which I then imported into an existing, subject-based EPrints repository. A few days later, I realised that it would have been nice to import the affiliation and location data that EasyChair maintains for each of the authors. So I added "location" and "affiliation" text subfields to the creators compound field in the eprints dataset, ran "epadmin update_database_structure" to bring the database tables into sync with the updated config definitions, and then used the scraped data to run a sequence of commands like the following:
/opt/eprints3/bin/toolbox devel modifyEprint --eprint 106 << \EOF
<eprint>
<creators>
<item>
<name><given>Leslie</given><family>Carr</family></name>
<id>lac@gmail.com</id>
<affiliation>Department of Computer Science, Gadget University</affiliation>
<location>Japan</location>
</item>
</creators>
</eprint>
EOF
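
A sequence like that can be generated rather than typed by hand. What follows is only a minimal sketch, assuming the scraped data has been saved one creator per line in a pipe-separated form (given|family|id|affiliation|location). That file format and the function names xml_escape and creators_xml are made up for illustration and are not part of EPrints. Because modifyEprint needs the entire contents of a field, all the creators of one eprint go into a single fragment.

```shell
#!/bin/sh
# Hypothetical helpers, not part of EPrints.

# Escape the characters that are special in XML text content.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Read pipe-separated creator rows (given|family|id|affiliation|location)
# from stdin and print one <eprint> fragment holding the whole creators field.
creators_xml() {
    printf '%s\n' '<eprint>' '<creators>'
    while IFS='|' read -r given family id affiliation location; do
        printf '%s\n' '<item>'
        printf '<name><given>%s</given><family>%s</family></name>\n' \
            "$(xml_escape "$given")" "$(xml_escape "$family")"
        printf '<id>%s</id>\n' "$(xml_escape "$id")"
        printf '<affiliation>%s</affiliation>\n' "$(xml_escape "$affiliation")"
        printf '<location>%s</location>\n' "$(xml_escape "$location")"
        printf '%s\n' '</item>'
    done
    printf '%s\n' '</creators>' '</eprint>'
}

# Assumed usage (creators-106.txt is a hypothetical file of scraped rows):
#   creators_xml < creators-106.txt | /opt/eprints3/bin/toolbox devel modifyEprint --eprint 106
```

It is worth running creators_xml on its own first and reading the output before piping anything into toolbox.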

That works for me because I am a dyed-in-the-wool shell programmer, but you can also invoke toolbox functionality from the web via CGI scripts, provided you set up the appropriate security regime first. (The CGI toolbox is disabled by default because it is very dangerous to let all and sundry on the web have edit access to the database!)

I see that toolbox isn't documented in the wiki yet, and we've not had that much experience using it here at Southampton, but the range of facilities is shown below. Note that when you modify an eprint, the modification happens field by field, so you don't need to supply a full eprint record, but you do have to provide the entire contents of each field you change.
toolbox *repository_id* [options] getEprint --eprint eprintid
toolbox *repository_id* [options] getEprintField --eprint eprintid --field fieldname
toolbox *repository_id* [options] createEprint < data
toolbox *repository_id* [options] modifyEprint --eprint eprintid < data
toolbox *repository_id* [options] removeEprint --eprint eprintid
toolbox *repository_id* [options] addDocument --eprint eprintid < data
toolbox *repository_id* [options] modifyDocument --document documentid < data
toolbox *repository_id* [options] removeDocument --document documentid
toolbox *repository_id* [options] getFile --document documentid --filename filename
toolbox *repository_id* [options] addFile --document documentid --filename filename < data
toolbox *repository_id* [options] removeFile --document documentid --filename filename
toolbox *repository_id* [options] idSearchEprints < data
toolbox *repository_id* [options] xmlSearchEprints < data
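
One way to check that a modification took effect is to read the field back with getEprintField from the list above. The wrapper below is a small sketch under stated assumptions: the function name show_field and the TOOLBOX and REPO variables are my own, and the defaults are simply the path and repository id used in the example earlier in this post.

```shell
#!/bin/sh
# Assumed defaults, taken from the modifyEprint example; adjust for your install.
TOOLBOX=${TOOLBOX:-/opt/eprints3/bin/toolbox}
REPO=${REPO:-devel}

# Hypothetical helper: print one field of one eprint.
# $1 is the eprint id, $2 is the field name.
show_field() {
    "$TOOLBOX" "$REPO" getEprintField --eprint "$1" --field "$2"
}

# Assumed usage:
#   show_field 106 creators
```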

If you prefer to do this via the Web, I did successfully access the toolbox functionality in JavaScript (well, the jQuery library) like so:
jQuery.post("http://repository/cgi/toolbox",
{verb: "getEprint", username: "admin", password: "whatever", eprint: 358},
function(xml){ alert(xml); }
);

2 comments:

  1. Hi,
    how can we obtain this toolbox?

  2. The toolbox is a standard part of EPrints v3.1 and is available from the bin directory and the cgi directory
