Open Source Computing and GIS in the UK

Travels in a digital world

My First Hackathon

In all the years that I’ve been involved with open source, I’ve been a committed advocate for the idea that you don’t need to be a coder to get involved. I’m definitely not a coder- I can write a script or two, and have been known to submit bugs, but that’s as far as it goes. My strengths are in identifying and fixing problems, and getting other people enthused. That’s all well and good, but hackathons- they are for real coders, right? Why expose my terrible deficiency of coding skills to the world? It turns out, I was wrong!

So, for various reasons I don’t need to go into here, I found myself at the first MapAction/AGI Technical SIG Hackathon at our company offices in Epsom, with a mix of 40 other people: coders, MapAction volunteers, Astun colleagues, and other interested parties. MapAction had done a lot of leg-work before the event, identifying particular problems that they wanted to try and solve, with enough variety that everyone could find something of interest. As one of my current work projects involves metadata gathering, when I saw a hack around implementing a metadata gathering script in Quantum GIS, working with an existing python script (thanks Tyler), I thought it was something I could have a go at. The guy I teamed up with had even less scripting experience than me, but was also interested in the end result. Other teams looked at data visualisation, and the fantastically named “Dirty Data Dashboard”, that deliberately creates invalid or “dirty” data for use in training.

With very little experience of implementing python plugins in Quantum GIS, we had a brief attempt at using the Script Runner plugin to access the script. This definitely seems like a useful first approach, but the learning curve of adapting the script to interact with Quantum GIS (for example to return results to the logging console) was more than we could manage in the time available. Furthermore, I think it would be useful to create a wrapper for Quantum GIS, rather than rewriting the original code and risking it no longer working as a stand-alone script. Instead, we worked on extending the script to return more “human-readable” results. For non-coders, this was definitely achievable in the time available, so we had the nice fuzzy feeling of actually having a result at the end of the day.
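To give a flavour of the wrapper idea, here is a minimal Python sketch. Everything in it is made up for illustration: gather_metadata stands in for whatever the real stand-alone script exposes, and the log argument stands in for the QGIS logging console. The point is only that the formatting and the QGIS-facing part live outside the original code, so the script keeps working on its own.

```python
def gather_metadata(path):
    """Hypothetical stand-in for the existing stand-alone script's entry point."""
    return {"path": path, "layer_count": 3, "crs": "EPSG:27700"}


def format_report(metadata):
    """Turn a metadata dictionary into aligned, human-readable lines."""
    width = max(len(key) for key in metadata)
    lines = []
    for key in sorted(metadata):
        label = key.replace("_", " ").capitalize()
        lines.append(f"{label:<{width}} : {metadata[key]}")
    return "\n".join(lines)


def run_from_gis(path, log=print):
    """Wrapper: call the untouched script function and pass the report to a logger."""
    report = format_report(gather_metadata(path))
    log(report)
    return report
```

Calling run_from_gis("data.shp") prints one labelled line per metadata item; inside a plugin you would pass a different log function rather than editing the script itself.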

Time for another first! Well, technically a second, but I’m not sure this counts. I love GitHub, and use plenty of repositories, but as a non-coder, putting your own code on the site for all and sundry to see is quite intimidating, and there are tales of people getting criticised publicly for nothing more than producing imperfect code. Not only does that put me right off, but it also makes me quite angry because it’s deeply elitist and non-constructive. However, once we’d extended this python script, re-committing it back to GitHub was the natural thing to do. Cue much googling on forking a repository, committing back to GitHub from my local copy, then issuing a Pull Request. And yay, it was accepted!

All in all, the MapAction hackathon was definitely a success. Lots of good things were achieved, and I would imagine that it will be repeated fairly soon. There’s no moral to this story, and I still think of myself as a non-coder, but this was just a first toe in the water of engaging differently with the open source world. Finally though, I would say that if you do consider yourself a non-coder, don’t be put off from attending a hackathon, as there will definitely be something you can contribute to.

AGI Scotland

Last week I attended the AGI (Association for Geographic Information) Scotland Showcase- the first in a series of events designed to jump-start the AGI’s regional and special interest groups. It was extremely well-attended, with approximately 140 delegates, which bodes well for future events! The venue was fantastic too- at the rather lovely Hunter Halls in the University of Glasgow.

Unsurprisingly, there was a distinctly Scottish theme to the papers, and my take-away thought is that the Scottish GI industry does seem to be doing things on its own, separate from what’s going on in the rest of the UK. It was interesting to see a demonstration of the Scottish Spatial Data Infrastructure Metadata Editor, based on Geonetwork, and also to hear about the Crofting Register, and the unique challenges of mapping the crofts. I did see a few mapping portals full of so many bells and whistles that their authors clearly need to go and read these articles pretty damn quick!

I gave a workshop on using PostgreSQL and Quantum GIS, using Portable GIS as the platform, which went surprisingly well given the short timeslot that we had. The instructions, emergency powerpoint, and pre-prepped postgresql database backup that I used can be found here. Note that the database is designed to work with Portable GIS and consequently YMMV.

I was slightly frustrated by one paper that I sat in on, entitled “Open Source for the uninitiated”. It felt a little bit like being transported back to 5 years ago, talking about packages being almost as good as the proprietary alternatives, and bringing up concerns about the level of support that people might receive. To me, this feels like damning with faint praise, and it’s never going to win hearts and minds (mixing metaphors, sorry). The one good thing is that there’s obviously still a need to raise the profile of open source, and to demonstrate the true worth of the packages.

Finally, a big shout out to my boss, who did a stonking presentation on “doing something with all of this open stuff”, not about open source but about open data, which won the delegates’ best presentation at the event.

PgRouting and Mapserver

The third in an occasional series of posts dabbling in PgRouting

Once you’ve got your PgRouting database configured in PostgreSQL (see here and here for more information) you might want to use the routing data in an online map. There are a number of tutorials around for doing this using geoserver, and some information on using mapserver, but also a distressing number of posts pleading for assistance!

So, in the spirit of sharing, this is my attempt to pull the various bits of advice together into something a bit more comprehensive.

The key point to remember is that the PgRouting algorithms don’t return geometry, so you can’t automatically display them on a map. You need to join the query results to your network table to get your geometries and give you mappable results.

The most useful links about displaying routing data in mapserver are here and here, and you also need to know a little bit about mapserver runtime substitution, which you will find here.

The approach that worked for me was to take a working shortest_path query based on the examples given above, and test it in psql/pgadmin3. This gives you something like this (go back to my previous post to see the routing table structure for my Ordnance Survey ITN data):

select * from shortest_path('select uid as id, source, target, length as cost from  ways', 45, 65, false, false)

We need to join this to some geometry so we can use it in mapserver, so we alter it to this:

select wkb_geometry from ways join (select * from shortest_path('select uid as id, source, target, length as cost from  ways', 45, 65, false, false)) as route on ways.uid = route.edge_id
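If you would rather assemble that joined query from a script than type it by hand, a small Python helper along these lines would do. The function name is my own invention; the table and column names (ways, uid, wkb_geometry, length) are simply the ones from my ITN example and will differ for other data.

```python
def build_route_query(source, target):
    """Return SQL that joins shortest_path() output to the ways geometry."""
    # int() raises ValueError for strings that are not whole numbers,
    # so a node ID cannot carry extra SQL into the query text.
    source, target = int(source), int(target)
    inner = "select uid as id, source, target, length as cost from ways"
    return (
        "select wkb_geometry from ways join "
        f"(select * from shortest_path('{inner}', {source}, {target}, false, false)) "
        "as route on ways.uid = route.edge_id"
    )
```

build_route_query(45, 65) gives back the same statement as above, ready to paste into psql or pgadmin3.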

In this case, our source node is 45 and our target node is 65, but we want to be able to call these dynamically in our URL (This was the stage that seemed a little hard to find in the online instructions). Our mapserver layer needs to look like this:

LAYER
    CONNECTIONTYPE POSTGIS
    INCLUDE routingconnectiondetails.inc
    DATA "wkb_geometry from ways join (select * from shortest_path('select uid as id, source, target, length as cost from ways', %source%, %target%, false, false)) as route on ways.uid = route.edge_id using unique uid using srid=27700"
    METADATA
        "source_validation_pattern" "^[0-9]+$"
        "target_validation_pattern" "^[0-9]+$"
    END
    NAME "shortest_path"
    STATUS ON
    TYPE LINE
    UNITS METERS
    CLASS
        STYLE
            COLOR 255 0 0
            WIDTH 3
        END
    END
END

Note the layer-level metadata: mapserver checks each parameter you give it in the URL against its validation pattern before substituting it for the matching %source% or %target% placeholder in the data string.
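To make the idea of runtime substitution a bit more concrete, here is a toy Python imitation of it. This is not how mapserver is implemented, just a sketch of the behaviour as I understand it: each %name% placeholder is swapped for the URL parameter of the same name, but only if the value matches the pattern registered for it. It also shows why the pattern matters, since a permissive one such as "." will let almost anything through, while a stricter one such as "^[0-9]+$" only accepts whole numbers.

```python
import re


def substitute(data, params, patterns):
    """Replace %name% placeholders with validated request parameters."""
    def replace(match):
        name = match.group(1)
        value = params[name]
        pattern = patterns[f"{name}_validation_pattern"]
        if not re.search(pattern, value):
            raise ValueError(f"{name}={value!r} fails pattern {pattern!r}")
        return value

    return re.sub(r"%(\w+)%", replace, data)


STRICT = {
    "source_validation_pattern": "^[0-9]+$",
    "target_validation_pattern": "^[0-9]+$",
}
LOOSE = {
    "source_validation_pattern": ".",
    "target_validation_pattern": ".",
}
```

With the STRICT patterns, a request for source=45 and target=65 goes straight through, while something like source=45; drop table ways is rejected before it ever reaches the database.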

We can then call this from the mapserver URL like this:

http://localhost/Mapserver/mapserv60?map=C:\iShareData\workshop\_MapserverConfig\routing.map
&mode=map
&source=45
&target=65
&layers=shortest_path

Which in my case returns something like: this
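If you are calling the map from a script rather than pasting the URL into a browser, it is worth letting a library do the escaping, particularly with a Windows path in the map parameter. A minimal Python sketch, where route_url is just a name I have picked and the base URL and map file path are the ones from my own setup:

```python
from urllib.parse import urlencode


def route_url(base, mapfile, source, target, layer="shortest_path"):
    """Build a mapserver request URL for a route between two node IDs."""
    query = urlencode({
        "map": mapfile,
        "mode": "map",
        "source": int(source),   # node IDs must be whole numbers
        "target": int(target),
        "layers": layer,
    })
    return f"{base}?{query}"


url = route_url(
    "http://localhost/Mapserver/mapserv60",
    r"C:\iShareData\workshop\_MapserverConfig\routing.map",
    45,
    65,
)
```

The backslashes and colon in the path come out percent-encoded, which saves a lot of head-scratching when the request is made from code.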

There are a couple of caveats here:

  • Firstly, in this very basic example, you need to know the ID of the source and target node, which is unlikely in a real-use scenario. You’d need to join your node data to some sort of alternative positioning data (Grid Reference, Address, etc) in order to use more intuitive start and end points, or allow people to click the map to generate them.

  • Secondly, you are calculating the routing on the fly each time the map is generated, which is fine with a small sample dataset but not so good in real life. Ideally you’d save the query results into a temporary table for re-use, but this is a good starting point.
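Both caveats can be sketched in a few lines of Python. The node coordinates below are invented, and cached_route is only a placeholder for the real routing query; in practice you would do the nearest-node lookup in the database and cache in a table, but the principle is the same: turn a clicked coordinate into a node ID, and don’t calculate the same route twice.

```python
from functools import lru_cache
from math import hypot

# Hypothetical node IDs with British National Grid coordinates.
NODES = {
    45: (530000.0, 180000.0),
    65: (531200.0, 180900.0),
    70: (529500.0, 181500.0),
}


def nearest_node(easting, northing, nodes=NODES):
    """Return the ID of the node closest to a clicked coordinate."""
    return min(
        nodes,
        key=lambda n: hypot(nodes[n][0] - easting, nodes[n][1] - northing),
    )


CALLS = []  # records every time the expensive step really runs


@lru_cache(maxsize=256)
def cached_route(source, target):
    """Placeholder for the routing query; repeat requests hit the cache."""
    CALLS.append((source, target))
    return f"route {source}->{target}"
```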

Python Easy_install in Portable GIS

One of the things I’ve wanted to fix with Portable GIS is the method of installing new Python packages. Since the version of Python included in Portable GIS is not in the Windows registry, many Python installers don’t work because they can’t find the installation. This includes packages like setuptools, and pip, for easily installing packages from PyPI. While it’s possible to manually download a package and extract it into the correct location, that’s not fun, elegant, or sustainable, as it takes no account of dependencies or versions.

Enter this post from Nathan Woodrow’s blog. This points us at a bootstrapped installer for setuptools, which will install directly into a folder rather than looking in the registry. Simply download the linked ez_setup.py script, save it to your Portable Python directory (the one that actually includes python.exe) and run it. This should install setuptools into your Portable Python installation and allow you to use easy_install to grab packages (including pip).
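A quick way to check that the install landed in the right place is to ask the interpreter itself. This little Python sketch assumes a reasonably modern Python (3.4 or later, for importlib.util.find_spec), and describe_python is just a name I’ve chosen; run it with the python.exe in your Portable Python directory.

```python
import importlib.util
import os
import sys


def describe_python():
    """Report which interpreter is running and whether setuptools is importable."""
    return {
        "python_dir": os.path.dirname(sys.executable or ""),
        "has_setuptools": importlib.util.find_spec("setuptools") is not None,
    }


info = describe_python()
print(info["python_dir"], info["has_setuptools"])
```

If the directory printed is not your Portable Python folder, you are running a different Python from the one you think you are.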

This will be included automatically in the next release of Portable GIS, but for now, it’s an easy do-it-yourself fix.

Portable GIS 3.1

I’ve done a quick update to Portable GIS version 3, based on some user feedback (keep it coming). The new version (3.1) now contains QGIS Server, fully configured, and also the GDAL 1.9 libraries for Portable Python.

Head over to here to download the latest version, and check out the documentation included in the install for instructions on using the latest features!

AGI GeoCommunity 2012

Last week was the AGI GeoCommunity 2012 event at Nottingham, and as usual, a great time was had by all. In the weeks leading up to the event I’d been a little worried that attendance would be down as many of “the usual suspects” said they weren’t attending. However, in the end attendance was up, with a lot of new faces and new sponsors. I’d love to know the demographic/industry area for these new attendees (hint, hint, AGI). The venue was the East Midlands Conference Centre, which will also be the venue for next year’s FOSS4G, so those of us in the organising committee were casting a critical eye over everything to ensure things are in place for next year. To those that attended OSGIS earlier this month- the WIFI was much better!

You can get a good feel for the event by checking out the geocom hashtag on twitter. Look out for the p0rnbots, who infiltrated the feed early on but made some scarily pertinent comments!

Day One started with an excellent keynote from Tim Stonor of Space Syntax Ltd. Go see wikipedia for a definition of the term if you’re not familiar with it (I wasn’t). Tim’s talk was mainly about spatial connectivity in city planning, and using path accessibility to predict how people will use spaces, both inside and outside. Really interesting and thought-provoking, and it turns out that there’s a QGIS plugin for it!

I then went to a stream on “sharing best practices”, which was really about how to get GIS deployed in large organisations, to people that are not familiar with its use. So what are these “best practices”? Bribery, ambushing them at the coffee pot, and giving them the data as Excel spreadsheets, apparently. Whilst I winced at some of the comments (one speaker conflated Google with Open Source and made half the audience fall off their seats), the general idea is sensible- give people the data in the form that they are comfortable with to start with, and then slowly introduce newer elements. People are just trying to do their job, after all.

I spent the afternoon helping at a joint Ordnance Survey/AGI Tech SIG workshop on using open data with open source software, in which we introduced a whole new set of people to the joy of QGIS.

I don’t think I need to say anything about the evening events- I’ll just say that you potential FOSS4Gers don’t need to worry, these AGI people do know how to party!

The stand-out papers for me on Day Two were all to do with delivering GIS for the London Olympics. Nothing ground-breaking, but just GIS done really well, at a grand scale. An honourable mention also goes to Steven Feldman, incidentally FOSS4G 2013’s conference chair, with an Open (Data, Standards and Source) 101, to prepare GeoCommunity people for next year’s event.

“GIS done well” probably sums up the event for me, actually. There were very few papers saying “look at this cool new shiny thing that I have developed”, but a lot about consolidating and delivering GIS across large organisations. I’d say that the enthusiasm for “open”, and QGIS in particular, bodes well for a meeting of minds next year too.

Portable GIS V3 Released

Finally, after quite a hiatus, I’m proud to announce the release of Portable GIS v3. In brief, this contains updated versions of the various packages BUT it’s rather stripped down in comparison to the old version. It no longer contains a full apache/php/mysql stack, and no longer contains Geoserver, GvSIG or uDIG- see here for more details and a download link. The main reason for this is to make the package easier to maintain and host. If there’s enough interest, I might consider re-integrating some packages but you’ll have to ask really nicely! I’m hoping to release a new version whenever the constituent packages are upgraded, but we’ll have to see how achievable that is.

For those who are new to the idea- this is not like a live-DVD or USB stick- it’s designed to work in your Windows environment without any need for installation. Get a big enough USB stick (2GB or more, and make it FAST) and you can store your data on there too. Note that it’s not a stealth system, in that I can’t guarantee it won’t leave some traces on your PC, and I can’t be responsible for conflicts with existing packages or with paranoid sysadmins. If you do have a problem though, get in touch and I will try and help.

Finally, as an experiment, I’m using dropbox to host the download. If I trip the daily download limit, the link won’t work. Try again tomorrow! If it really doesn’t work, I’ll come up with a different option.

Enjoy- I hope it’s useful!

OSGIS 2012

This week was the fourth annual Open Source GIS Conference, otherwise known as OSGIS 2012, at Nottingham, and a great time was had by all. It might have been the lovely weather and the super venue on the University’s Jubilee Campus, but I very much enjoyed it. The first day was for workshops, and I attended one on OSM-GB. This is a project around measuring and improving the quality of OpenStreetMap for the UK, and providing pre-configured WMS/WFS feeds of openstreetmap data, ready projected into British National Grid format. While I personally really enjoyed the workshop, it was probably a salutary lesson in the need for some serious load-testing of the service, as the sudden spike in demand on the WMS and WFS feeds made the server quite poorly. Check it out though, as if you’re using openstreetmap as background mapping in your desktop GIS it’s probably the easiest way to get an up-to-date version on demand. I also learnt about some useful things from the OSM-GB team, like Nominatim, for creating a searchable Gazetteer from openstreetmap data. I also went to a workshop on using Geoserver for INSPIRE compliance, run by the guys from GeoSolutions, which was again really useful, and more interesting than you might think!

Day Two was talks- and the sessions that I saw were a good mix of academic and practical. The stand-out one for me was by Ken Arroyo and was about a tool for automatically repairing polygons and planar partitions. Not the world’s most punchy title, but you could see everyone in the audience who has ever struggled with getting invalid geometries into postgresql getting really excited. You can find the tool here on github. I did a reprise of a talk I did for the AGI a few months ago on “actually doing things with open and linked data”. You can find it on ELOGeo, where you can coincidentally see all the talks for the conference.

In the evening of Day Two we held the OSGeo:UK AGM, and it was good to see some new faces there. This was my final AGM as the Chapter Representative (of which more in another post), and I’d like to take the opportunity to thank Ian Edwards from the UK Met Office for bravely taking over.

Day Three wasn’t strictly conference, but was the first official Face-To-Face for the FOSS4G 2013 Organising Committee. A lot of work has already been going on behind the scenes- we have the skeleton of a website, and will soon be releasing the provisional dates for the Call for Papers, Registration, and all the other important dates that you need to know. We’re starting to put together the Sponsorship packages, the contract with the venue is nearly ready, and we have a great idea for the t-shirts but we’re not telling you about that yet…

If I had any negative thoughts about the OSGIS event this week, it is that numbers were lower than usual, but that’s probably because it clashed with other conferences (it’s normally in May, which is much better, but that wasn’t an option this year). It was good to see a large number of international delegates, however. There won’t be an OSGIS event next year, because we want everyone to come to FOSS4G instead!

PgRouting With Ordnance Survey ITN Data

A work in progress

I threw together some notes on installing PgRouting on Ubuntu last year sometime but I haven’t really had chance to come back to it and do anything meaningful, until a chance conversation with a client got me thinking about trying again with some Ordnance Survey ITN data. If you do a google search on actually doing anything with ITN data you’ll quickly find out that most people are using ESRI Productivity Suite, or various other components, even if the end result is data in PostgreSQL. So, I thought, how hard can it be? The answer- not that difficult.

PostGIS for Beginners

UPDATE: Shortly after submitting this post, up popped another from Paul Ramsey that does a really good job of explaining why things are done the way they are. I recommend you read it!

Via Paul Ramsey, this post popped onto my radar the other day and got me thinking. My initial impressions about the post were quite negative, and to be honest some of the points still mystify me, but after further investigation, at least some of the issues do make sense, so perhaps there is some room for improvement in our favourite spatial database. If you haven’t read it, do that and come back. I’ll wait…

Let’s take the first statement- that PostGIS can be “mind-numbingly difficult to install and use”. The author of the post is mainly talking about Ubuntu servers, so I think it would be fair to assume some level of IT literacy here. If you go off and do a web search for “Ubuntu PostGIS”, which these days is likely to be the new user’s first port of call, then for me at least the first few links are generally to old blog posts explaining, often with long lists of commands, how to install on say, Ubuntu 7.10 or 8.04, and then there are some posts about compiling it from source with the latest version of PostGIS. There are some links to information about modern versions of Ubuntu as well, but they are not top of the list. Most links also say to add another PPA to your list of repositories, which is fairly standard for Linux but if you approach this with a new-user mindset then it’s not very reassuring. So I went looking to see if PostGIS was in the official Ubuntu repositories, and it is- but you’d have to go looking to find it. So, I wouldn’t go as far as saying mind-numbingly difficult, but as a new user you could end up making things a lot more complicated than they need to be. I’m not sure what the solution is though- manipulate google search results so that the official Ubuntu repositories appear at the top?

Buried amongst this discussion is a point about back-porting fixes or patches to prior stable releases. This might happen, and I can see a scenario where it could be a pain for a system maintainer, but since when was this a problem just with PostGIS, or FOSS in general? It’s just as prevalent, if not worse, in the proprietary software world.

Onto import and export. I tend to agree here, up to a point, that trying to use shp2pgsql at the command line as a new user is not easy. However, every time I’ve done an install recently, I’ve been asked whether I want to include the shp2pgsql loader for PgAdmin, or indeed I’ve just gone with the import tool that comes with QGIS. So using the command line for a simple import and export is not really necessary at all. Anyway, yes, please get the srid right rather than auto-populating it with -1, and please use the (sanitised) shape file name as the default table name. Yes, make it easier to get a csv file with an easting and a northing in it, rather than making users go to ogr2ogr and learn another command line syntax.
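In the meantime, the csv case is easy enough to script yourself. Here is a minimal Python sketch that reads a csv with easting and northing columns and produces EWKT points, which is the text format PostGIS accepts through ST_GeomFromEWKT. The column names and the default SRID of 27700 (British National Grid) are my assumptions, so change them to suit your data- and yes, you still have to know what your SRID is!

```python
import csv
import io


def csv_to_ewkt(text, x_field="easting", y_field="northing", srid=27700):
    """Yield (row, EWKT point) pairs from CSV text with coordinate columns."""
    for row in csv.DictReader(io.StringIO(text)):
        x, y = float(row[x_field]), float(row[y_field])
        yield row, f"SRID={srid};POINT({x} {y})"


SAMPLE = "name,easting,northing\nPub,530000,180000\nChurch,530250.5,180100\n"
points = list(csv_to_ewkt(SAMPLE))
```

Each EWKT string can then be passed as a parameter to an insert statement, rather than being pasted into the SQL by hand.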

However, we then get on to the section that I most vehemently disagree with. The general premise seems to be that the end user should not worry their pretty little head about the database they are importing into, or the user that they are using to do it, or the spatial reference system that the data comes with. Automating all of these processes might make it easier for the end user, but at the expense of them actually understanding what they are doing. The minute that this clever automated process fails, or puts the data somewhere you didn’t expect, then you can be sure that a lot of end users will decide that “open source is rubbish, where’s my ArcGIS”. Been there, seen that. Teaching people to press buttons without thinking leads to rubbish output- be that in open source, proprietary, gis, or any other software.

Forward-compatibility of backups. I’m such a fan of the inherent future-proofing and openness of a plain-text SQL dump, and I’ve never hit an issue with upgrades if I follow the instructions, so this surprised me. Coming at this from a new user’s perspective, though, I can see that it’s not always straightforward, and progress and improvement of software means it’s not always possible to totally guarantee compatibility between versions. Again, this is not the sole province of open source- how many times have you had issues with a doc to docx conversion in Microsoft Office, for example? (Answer- many)
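For what it’s worth, taking a plain-text dump is a one-liner with pg_dump, and it’s easy to wrap in a script so that it actually gets done before an upgrade. A minimal Python sketch, where the function name, database name and file name are just examples, and connection options such as host and user are left out:

```python
def plain_dump_command(dbname, outfile):
    """Build a pg_dump command line that writes a plain-text SQL dump."""
    return ["pg_dump", "--format=plain", f"--file={outfile}", dbname]


command = plain_dump_command("routing", "routing_backup.sql")
# To run it for real (needs pg_dump on the PATH and a reachable database):
#   import subprocess
#   subprocess.run(command, check=True)
```

The resulting file is ordinary SQL that you can open in a text editor, which is exactly why I like it.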

Geometry invalidity- yes, I see this all the time. Client sends data in mapinfo or shape format. We load it into PostgreSQL, it croaks. Sometimes the “really small buffer” trick, or similar, works, sometimes we have to go back to the client to ask them to resolve the issue. Again, I would rather do this, and enforce ideas of data validity and quality, rather than dumb things down so we never have to think about what we’re doing.
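If you want a feel for what “invalid” means before the data even reaches the database, one of the most common culprits is a polygon ring that crosses itself (the classic bow-tie). This rough Python sketch checks for just that one case, using a standard orientation test on pairs of edges. It is nowhere near a full validity check, and inside the database ST_IsValid is the thing to use, but it makes the problem concrete.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, 0 or -1."""
    value = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (value > 0) - (value < 0)


def _segments_cross(p1, p2, p3, p4):
    """True when segment p1-p2 properly crosses segment p3-p4."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)


def ring_self_intersects(ring):
    """Check a closed ring (first point repeated at the end) for crossing edges."""
    edges = list(zip(ring, ring[1:]))
    for i, (a, b) in enumerate(edges):
        # Skip the neighbouring edge, which only shares an end point.
        for c, d in edges[i + 2:]:
            if _segments_cross(a, b, c, d):
                return True
    return False


SQUARE = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
BOWTIE = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]
```

The square passes and the bow-tie fails, and it is the bow-tie sort of data that usually ends up going back to the client.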

Finally, it’s worth remembering that PostGIS is a spatial extension, it’s not a program in its own right. Comparing it to Arc Toolbox is like comparing a car engine with a complete car. Amazingly, the whole article was written without a single mention of PgAdmin, or QGIS, or any other interface that will work with PostGIS, and provide users with a lot of the bells and whistles that the author is asking for.

In conclusion- I reluctantly find that I agree with the points around installation. It could be easier- at least to find documentation. Import into postgresql could be easier, or you can use QGIS. But please, don’t make it so easy that users don’t have to think!