Thursday, 10 January 2008

The Myth of the Complex Object

Here's an extract from the introduction to a Fedora tutorial that I have a bit of an issue with:

The Problem of Digital Content

Digital content is not just documents, nor is it made up exclusively of the content from digital versions of currently owned non-digital content.

  • Conventional Objects: books and other text objects, geospatial data, images, maps
  • Complex, Compound, Dynamic Objects: video, numeric data sets and their associated code books, timed audio

As users become more sophisticated at creating and using complex digital content, digital repositories must also become more sophisticated.

In summary: "There is a problem with content! Content is complex, compound, dynamic objects! We need more sophisticated repositories to cope!" As a PhD examiner I am used to challenging rather alarmist opening paragraphs like this. So here I go...

This idea that users are creating complex objects which only complex repositories can cater for just doesn't sound right. When I was a child we had the idea that we'd all be walking around in silver spacesuits by the year 2000, but we aren't, and we aren't creating complex digital objects either. What we are creating is files. Lots of them. PDF files, Word files, spreadsheet files, video files, database files, Web files. The media types that I am working with may have got more interesting (richer), but I'm still using applications to create and edit and look after lots of files on my hard disk(s) on my computer(s).

And far from video content not being "just documents" but "complex compound dynamic objects" we in fact see Movie documents or plain old AVI files on our desktops. Even my e-science friends with their robotic labs full of experimental data and analytical data from formally defined workflows are producing and working with lots of files, not complex objects. They know the format and purpose and content of every file, and how it should be used, analysed and checked, and what applications can be used for each of these purposes.

What is complex and problematical about this? Exactly what do we need a new breed of repository for? A repository just needs to be able to manage items containing lots of documents/files, and to deliver them to lots of applications (or "services").
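To make that concrete, here is a minimal sketch in Python of the kind of model the paragraph above describes: an item is some metadata plus a flat list of files, and "delivery" is just picking out the files a given application can handle. The names (`Item`, `StoredFile`, `files_for`) and the sample file names are made up for illustration; they are not taken from Fedora or any other repository software.

```python
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    name: str        # file name as deposited, e.g. "results.csv"
    media_type: str  # media type, e.g. "text/csv"


@dataclass
class Item:
    """One deposited item: a title plus a flat list of files (hypothetical model)."""
    title: str
    files: list[StoredFile] = field(default_factory=list)

    def files_for(self, media_prefix: str) -> list[StoredFile]:
        """Return the files whose media type starts with the given prefix."""
        return [f for f in self.files if f.media_type.startswith(media_prefix)]


# An example item holding three ordinary files of different types.
item = Item("Example item", [
    StoredFile("talk.avi", "video/x-msvideo"),
    StoredFile("results.csv", "text/csv"),
    StoredFile("paper.pdf", "application/pdf"),
])

# A video player (one "service") only asks for the video files.
video_names = [f.name for f in item.files_for("video/")]
```

Nothing here needs the files to be anything other than files; the only structure is the list, and each application decides which entries it cares about.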

If we need better repositories, it is not to handle "more complex content" coming from "more sophisticated users", but to be better integrated with the working practices of those users. What we really need is better ingest/deposit features to help capture as much of their material as possible and better services to help them accomplish their tasks and to excel in their careers.

And that's not just vacuous waffle, because if a repository can't make knowledge workers (e.g. researchers or teachers) more effective at their jobs then there really is a problem!
