Archive for September, 2006

XML: beyond the hype

I was at a workshop today in York, run by the Archaeology Data Service, entitled XML for archaeologists: Beyond the Hype. I went along because I felt that I didn’t really understand how XML works, and I have to say I was very pleasantly surprised: I absolutely loved the workshop and came away feeling that I’d made the conceptual link I needed in order to understand XML.

My problem was that I was confused by the idea of a “language”. I understood that in XML you defined your own markup tags, and that somewhere there was a schema that explained what those tags actually meant. My sticking point was that I couldn’t figure out how you explained what something was, without relying on a whole bunch of other things that you’d need to explain as well. How would you explain to a computer what an elephant, or should I say an <elephant>, is?

What I now see, or what makes sense to me (i.e. not necessarily the truth but good enough for me to work with), is that it’s actually more like a grammar than a language. It defines objects, but only in terms of rules and relationships. In a spoken or written language, that would be like knowing how to conjugate a verb or use it in a sentence without ever knowing what that verb meant. In XML, you don’t have to explain what an <elephant> is, but you can define that it has a <trunk>, four <legs>, two <ears> and so on. The computer doesn’t have to understand what any of those things actually are, but it understands the relationship between them, because you’ve defined that in your schema. Once I made that connection (sorry to those who think I’m terribly slow) I really enjoyed the workshop.
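To make that concrete for myself, here is a minimal sketch in Python. It is not a real XML schema (those are normally written in something like DTD or XML Schema); the RULES table and the conforms function are names I have made up purely for illustration, and I have used one <leg> or <ear> element per limb rather than the plural tags above. The point is only that the rules describe structure and never meaning.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for a schema: for each parent tag, how many of
# each child tag it must contain. Nothing here says what an elephant *is*.
RULES = {
    "elephant": {"trunk": 1, "leg": 4, "ear": 2},
}


def conforms(element, rules=RULES):
    """Return True if the element's direct children match the rule counts."""
    expected = rules.get(element.tag)
    if expected is None:
        return False  # no rule describes this tag at all
    actual = Counter(child.tag for child in element)
    return actual == Counter(expected)


good = ET.fromstring(
    "<elephant><trunk/><leg/><leg/><leg/><leg/><ear/><ear/></elephant>"
)
bad = ET.fromstring("<elephant><trunk/><trunk/><leg/><ear/></elephant>")

print(conforms(good))
print(conforms(bad))
```

The first document passes and the second fails, and at no point does the program need to know anything about elephants.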

What also became clear is that archaeologists, and in fact anyone who has to classify objects and assign behaviours to them, fundamentally understand XML even if they don’t realise that they do. The key is to try to persuade people to codify those objects and their behaviours, and to get them all to use the same language/schema to describe them.

There are some schemas available in archaeology: in the UK the best known is MIDAS XML, but there’s also ArchaeoML. Some people have argued that they don’t go far enough, partly because there isn’t enough formalisation of terms of reference, such as those for colours. To paraphrase a colleague’s analogy, the colour “black” has a completely different meaning depending on whether you are in greyscale, monochrome or full colour. However, I can’t believe that these issues haven’t been considered before, and in fact most can be avoided by ensuring that everyone uses and understands a common vocabulary. MIDAS XML, which is the schema I am most familiar with, comes with a set of thesauri and word lists that should help get around this issue, and if in doubt there are much larger models that we can call upon.
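As a rough illustration of what a common vocabulary buys you, the sketch below checks the colour terms in a record against an agreed word list. Everything in it is an assumption on my part: the <context>, <deposit> and <colour> tags, the COLOUR_TERMS list and the unknown_colours function are invented for the example and are not taken from MIDAS XML or its thesauri.

```python
import xml.etree.ElementTree as ET

# Hypothetical word list; a real project would load the agreed thesaurus.
COLOUR_TERMS = {"black", "dark grey", "reddish brown", "yellowish brown"}


def unknown_colours(xml_text, vocabulary=COLOUR_TERMS):
    """Return the <colour> values in a record that are not in the vocabulary."""
    root = ET.fromstring(xml_text.strip())
    rejected = []
    for node in root.iter("colour"):
        term = (node.text or "").strip().lower()
        if term not in vocabulary:
            rejected.append(term)
    return rejected


record = """
<context>
  <deposit><colour>Dark grey</colour></deposit>
  <deposit><colour>blackish</colour></deposit>
</context>
"""

print(unknown_colours(record))
```

Here “Dark grey” is accepted because it is on the list, while “blackish” is flagged so that somebody can map it to an agreed term.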

One of the most interesting presentations that I attended today was about the TEI, or Text Encoding Initiative, which aims to provide a toolkit for encoding literary texts and other information for online use. Pretty much everybody present at the talk could immediately see a use for this in dealing with archaeological grey literature. This is the huge mass of unpublished reports that commercial archaeological units produce each year. It is a notoriously difficult resource to quantify, let alone search, as most units have neither the time nor the money to make this information available in any sensible form. Currently the job of trying to quantify this resource falls to the Archaeological Investigations Project, based at Bournemouth University. I can quite understand why the AIP needs to exist, but currently their main method of data collection is to send people to visit units, read all of their reports for a year and type the data into a database. They need to do this so that they have a consistent method of recording the pertinent details about a site, and what was found, separately from the remit of the unit, which is to fulfil the terms of the archaeological brief. If units could be persuaded to adopt a common schema and methodology for marking up their reports, as the TEI has demonstrated, then surely this need to actually go and visit every unit in the country could be avoided? After all, broadband is a lot cheaper than train tickets! Marking up content does take time and would add to the cost of each project, but (presumably) this could be offset somehow by the cost savings on the AIP project for English Heritage.
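To show what I mean, here is a small sketch of how the pertinent details could be pulled out of a marked-up report without anyone having to read it on site. The tag names (<report>, <site>, <period>, <find>) and the summarise function are hypothetical; they are not real TEI or MIDAS XML elements, just placeholders for whatever a common schema would eventually define.

```python
import xml.etree.ElementTree as ET


def summarise(report_xml):
    """Collect the marked-up site name, periods and finds from one report."""
    root = ET.fromstring(report_xml.strip())
    return {
        "site": root.findtext("site", default="").strip(),
        # iter() walks the whole tree, so tags buried in prose are found too
        "periods": [p.text.strip() for p in root.iter("period") if p.text],
        "finds": [f.text.strip() for f in root.iter("find") if f.text],
    }


# Hypothetical report fragment: ordinary prose with a few inline tags.
report = """
<report>
  <site>Hypothetical Farm</site>
  <body>
    <p>The ditch fill was dated to the <period>Roman</period> period and
    contained <find>pottery</find> and <find>animal bone</find>.</p>
  </body>
</report>
"""

print(summarise(report))
```

The report still reads as normal text for the client, but the same file can be harvested automatically for the details that a project like the AIP currently has to key in by hand.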

None of this is rocket science, I’m well aware of that, it’s just getting enough people on board and coming up with a way forward that we’re all happy with…

Geomaticians of the UK unite!

At FOSS4G last week, my colleagues and I got chatting with the folks from OSGeo. It was difficult not to, given that they played such a huge part in organising the conference. Anyhow, we identified that it would be a good idea to set up a UK Local Chapter, to provide a UK-specific focus and slant on the work that OSGeo are doing. The kind of things we might look at include providing a first port-of-call for newcomers to the world of geomatics in the UK, with a particular focus on the open source tools available, and providing a focus for lobbying for public access to geodata (you know, the stuff we’ve paid for with our taxes but have to pay again to use).

What we need at the moment is expressions of interest. Enough signatures will convince the board that such a chapter would be worth setting up. We only have a few names at the moment, mainly because we only started canvassing this week, so if you feel you could sign up then pop on over to the wiki and add your name. For more information on local chapters, here’s the place to look.

What we are also trying to do is come up with a manifesto for the group. If you have something to contribute to this, then please feel free! That’s what wikis are for, after all…

If you’re into archaeology and interested in Open Source applications or Open Standards in archaeology, whether or not you’re in the UK, then we’re also investigating the level of interest in an Archaeology Special Interest Group. Again, at the moment we just need expressions of interest and ideas.

Bookshelf

On this page I’m going to list the books and papers that I find most useful.

Archaeology Related Texts:

Digital Archaeology by Thomas L Evans and Patrick Daly

Actually, this is a new book that I haven’t had chance to read yet, but Tom used to be the Head of Geomatics at Oxford Archaeology and did a lot to move the department forward so I’m sure it will be good (and I’ll be buying it myself for sure).

Spatial Technology and Archaeology by David Wheatley and Mark Gillings

This is one of those texts that anyone who is serious about doing good quality geospatial analysis with archaeological data should read.

Web-Based GIS:

Web Mapping Illustrated by Tyler Mitchell

For an outline of the whole area of web-based mapping, this is hard to beat. Amongst other things, it gives you the clearest set of instructions for installing PostgreSQL/PostGIS that I’ve seen, and a very useful glossary of MapServer commands.

Cool Examples of Neogeography

Back from Switzerland after the FOSS4G conference, and a weekend in Geneva. Whew! Geneva would perhaps have been more enjoyable if our hotel hadn’t been on a street having an all-weekend party, complete with blaring music (Pink Floyd and Reggae mix one night, Slipknot or similar the next). Anyhow, we got about: we went out to CERN, visited the United Nations, and even took in a little archaeology at St Peter’s Cathedral.

The end sessions of FOSS4G have been pretty well commented on elsewhere, and I don’t have much to add, except that I want to add my congratulations to Markus Neteler from GRASS for winning the Sol Katz award and to say that the whole conference was extremely interesting, inspiring and enjoyable.

I came across some great examples of what we perhaps have to call Neogeography today, from the beautiful and a little scary Information Aesthetics. They are all examples of using maps to display statistical information, and they prove what all Geomatics types know, which is that maps are a very effective way of getting statistical information across with maximum impact, and that they can be visually attractive too.

Breathing Earth

Real Time Geographical Radio

Topix Forum Activity Map

I particularly like this one, as it has some similarities with the MapChat application that I blogged about from FOSS4G last week.
