Archive for March, 2011

Upon learning about postgresql arrays and mapserver

I learnt something new this week (week 2 in my new job)- it’s probably not new to everyone else, but just in  case someone is interested I thought I would document it…

Scenario: You have some Ordnance Survey Mastermap data in a PostgreSQL database imported using OGR2OGR. You wish to display it using Mapserver. The key fields you are interested in for styling your data are “descriptivegroup”, “descriptiveterm” and “make”. These dictate the actual detail about the styling, such as the type of building or type of land.

Problem: These fields may well appear in PostgreSQL as arrays rather than simple text, in other words they may hold more than one value per record. Querying arrays in PostgreSQL requires a slightly different syntax to querying normal text fields (this was the first thing I didn’t know beforehand). This means that simple select queries don’t work in MapServer when you try and filter data to style it in different ways.

My Solution (caution, not necessarily the best or most elegant): Use the PostgreSQL function array_to_string in the query MapServer sends to PostgreSQL.  This has the form array_to_string(array, ‘delimiter’) to create a text string from your array using your chosen character to split the terms up. However, it would appear that you can’t simply ask for array_to_string(descriptivegroup,’|') in your select query, you also need to ask for descriptivegroup. Then I used regular expressions to search for the term I need in my Layer Class styling, like so:

DATA “select wkb-geometry from (select wkb_geometry, descriptivegroup, array_to_string(descriptivegroup, ‘|’) as dgrp, fid from topographicarea) as foo using unique fid using unique srid=27700″

LAYER

CLASSITEM “dgrp”

CLASS

EXPRESSION /^Inland Water$/

END

END

END

So basically I’m asking for any data from the topographicarea table where the column “descriptivegroup” has the term ‘Inland Water’ in it at any point. And I’m being good and using the UNIQUE terms for best performance of course!

Now, there are other ways of doing this. I could have converted the data to strings in PostgreSQL rather than leaving it as arrays. However, next time I updated the Mastermap data, I’d have to remember to do the conversion again, or the map wouldn’t display correctly. Alternatively, I could have created a view in PostgreSQL that did the conversion to string, and then simply query the view in MapServer. I’d be interested in knowing which option is better for performance actually- converting the data into a view in the database, or doing it on the fly. If there’s a simpler solution to the problem, I’d also like to know, though I might sulk a little bit if it turns out that I missed something obvious…

Python for GIS (some links for beginners)

So, I’m late to the game and only just learning about the coolness that is python. To be honest, for years the need to keep the indents in the code neat and tidy put me off, but I figured I’d better have a proper look at some point. I spent some time over Christmas going over some tutorials (more below) and more recently I’ve chosen a python-based approach to problems where previously I would have used a different method. So far so good though no doubt my efforts will make grown coders weep…

These are just a few links that I have found really useful- the first is a very good tutorial for new python users who are particularly interested in Geoprocessing, whilst the others deal with more specific problems.

  • Geoprocessing with Python using Open Source GIS. This is a full online course, with examples and assignments to try (and solutions, thank goodness). There is a couple of week’s work here to complete the full course.
  • Using Psycopg2 with PostgreSQL. This allows you to work with postgresql databases in python (disclosure- other drivers are available). I found this straightforward enough to pick up and use in a real-world situation pretty much immediately.
  • Connecting to an OpenOffice spreadsheet with Python. There is plenty of documentation about OpenOffice and Python, but again I found I could get going immediately with this particular link.

If anyone has any other resources for python beginners, particularly around Open Source GIS and PostgreSQL, feel free to add them in the comments.