<xsl:value-of select="@r"/> = <xsl:value-of select="v"/>
Sunday, 27 February 2011
Rehabilitating The Third Star of Linked Data
The mantra of open data is: put your data on the web / with an open license / in a structured, reusable format / that is open / using open identifiers / that are linked with other data.
The third step/star in this process is commonly explained as using CSV rather than Excel, (because the former is an open format, but the latter is a closed proprietary standard). You'll see this position stated at Linked Data Design at the W3C and sites all around the world are copying it.
We really need to think a bit harder about this: Excel's native format is an open standard, and although an XML encoding of a the complete semantics of a spreadsheet is hardly a straightforward thing to deal with, it is simple enough to extract data from. In particular, I don't see that it is significantly more difficult than dealing with CSV!
Once you've unzipped the Office Open XML data, you can iterate around the contents of the spreadsheet, or extract individual cells with ease. And without any .NET coding or impenetrable Microsoft APIs. Here's a simple example that lists the addresses and contents of all the cells in a spreadsheet.
Of course it's simplified: i've missed off the namespaces, and strings are actually stored in a lookaside table and there are multiple sheets in a single document, but even so I'd rather wrangle XML than wrestle with CSV quotes any day.