Wednesday 19 January 2011

Using GB postcodes

As mentioned in the last post, I have been looking at the Code-Point Open data supplied as part of the OpenData released by Ordnance Survey. The data gives the centroid of each postcode in England, Scotland and Wales. There are about 1.7 million postcodes in the dataset, broken up into CSV files, one for each top-level postcode area such as HU. Each file has a lot of fields with no data or dummy data in them, but the postcode, the easting and the northing are what I need.

I wanted to make an overlay displaying the postcode centroids to see if they would be useful in working out the postcode for OSM addresses. Postcodes are part of the jealously-guarded Royal Mail database, sold as the PAF, which makes a couple of million pounds profit each year for Royal Mail. I would have liked to have the detailed address data, or at least the polygon that a postcode covers, but the centroids are the best we have for now.

The process of making an overlay started by converting the OS eastings and northings to longitude and latitude. I pruned the original file to remove the excess fields and then used gdaltransform to make the conversion, using the OSGB coordinate system EPSG:27700 as the source, before loading the result into a database table.
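A minimal sketch of the pruning step in Python, offered as an illustration rather than the script used for this post. The function name extract_coordinates, the column positions and the sample rows are all assumptions; check the column layout of your own copy of the CSV files before relying on the defaults.

```python
import csv
import io


def extract_coordinates(rows, easting_col=2, northing_col=3):
    """Return 'easting northing' strings, one per input row.

    The column positions are assumptions: adjust them to match the
    layout of the CSV files you actually have.
    """
    lines = []
    for row in rows:
        if len(row) <= max(easting_col, northing_col):
            continue  # skip blank or truncated rows
        lines.append(f"{row[easting_col].strip()} {row[northing_col].strip()}")
    return lines


# Made-up sample rows (not real Code-Point Open records):
# postcode, quality flag, easting, northing.
sample = io.StringIO(
    '"HU1 1AA",10,509800,428600\n'
    '"HU1 1AB",10,510200,429100\n'
)
lines = extract_coordinates(csv.reader(sample))
print("\n".join(lines))
```

Writing these lines to a text file gives one "x y" pair per line, which is the form gdaltransform reads from standard input.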

Next I set about drawing a transparent set of tiles to use. I wanted to draw a dot for the centroid and the postcode as text beneath it. If I retrieve only the centroids that fall within a tile, then any point close to the edge of the tile would have text that extends beyond the edge. The next-door tile, however, wouldn't know about that centroid, so the text would be cut off. The simple way to deal with this is to draw each tile a bit bigger: retrieve data for a larger area, draw the centroids and text, then trim each tile to its proper size. That way the text for points close to an edge will appear on the tiles either side of the edge.
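The oversize-and-trim idea can be sketched in Python using the standard slippy-map tile formulas. This is a sketch under assumptions, not the code behind the overlay: the function names, the 256-pixel tile size and the margin of a quarter of a tile are all chosen here for illustration.

```python
import math


def tile_to_lonlat(z, x, y):
    """Longitude and latitude of the north-west corner of tile (x, y) at zoom z.

    x and y may be fractional, which is what lets us step outside the tile.
    """
    n = 2 ** z
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / n))))
    return lon, lat


def buffered_bounds(z, x, y, margin=0.25):
    """(west, south, east, north) of the tile, grown by `margin` tile widths per side."""
    west, north = tile_to_lonlat(z, x - margin, y - margin)
    east, south = tile_to_lonlat(z, x + 1 + margin, y + 1 + margin)
    return west, south, east, north


def crop_box(tile_size=256, margin=0.25):
    """Pixel box (left, top, right, bottom) that trims the oversized image to one tile."""
    offset = round(tile_size * margin)
    return offset, offset, offset + tile_size, offset + tile_size


# Fetch centroids inside buffered_bounds(...), draw them on a canvas of
# tile_size * (1 + 2 * margin) pixels, then crop to crop_box().
print(buffered_bounds(16, 32600, 21000))
print(crop_box())
```

The margin only needs to be as wide as the longest label that can hang over an edge, so a smaller value would cut the amount of data fetched per tile.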

I rendered a set of zoom level 16 tiles for my local area and took a look to see how useful the centroid information is. It varies. For some roads the postcode is obvious from the centroid; for others it is not clear where a postcode changes. I think it is of some use, so I pressed on.

I did not want to fill my online disk space with tiles that might not get used, so I decided to try creating the tiles on the fly to see if the response would be good enough. It is pretty good, and local caching of the tiles helps too. I did have a small problem with the HTTP headers not working, but that is now sorted out. The tiles can be viewed as an overlay at http://www.raggedred.net/codepoint/ and they work as a layer in JOSM and Potlatch 2 by using http://www.raggedred.net/tiles/codepoint as the place to get the tiles from.

I have only loaded the HU postcode area so far, so I will need to load the data for other areas if people want them. I do hope no one uses this to enter centroids into OSM, which seems a pointless exercise to me, but it might help people work out postcodes for addressing.

4 comments:

naesk said...

Hi Chris

I believe this dataset may be of some use to you: http://www.doogal.co.uk/UKPostcodes.php?Search=HU

Chris Hill said...

Thanks Sean.
The postcode data on doogal is the same source as we already have. Their estimates of the addresses seem a bit out, probably because they are calculated automatically. Accurately pinning down the postcode for a real address is the part that needs a survey on the ground.

The actual address data was not released as part of OS OpenData, only the postcode centroids so I wonder where they got that from.

Irma said...

Hi Chris, I am trying to do exactly the same. Can you tell me how you managed to transform the data to long/lat? When I run the command: gdaltransform -s_srs EPSG:27700 -t_srs EPSG:4326 I get the error:
ERROR 6: EPSG PCS/GCS code 27700 not found in EPSG support files. Is this a valid EPSG coordinate system?
Many Thanks,
Irma

Chris Hill said...

Irma,
The command line I use is:

gdaltransform -t_srs WGS84 -s_srs 'EPSG:27700' < ext.txt > lonlat.txt

I create an extract of just the eastings and northings for each centroid (ext.txt), whizz them through this command and then merge the lonlat.txt file back with the other data. A bit messy, but it works well.
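The merge step might look something like the Python sketch below. It is an illustration only, not the script described above: the function name merge_lonlat and the sample values are made up, and it assumes lonlat.txt holds one line per input point, in the same order, with longitude and latitude as the first two whitespace-separated values.

```python
def merge_lonlat(postcodes, lonlat_lines):
    """Pair each postcode with the longitude and latitude on the matching line."""
    postcodes = list(postcodes)
    lonlat_lines = [line for line in lonlat_lines if line.strip()]
    if len(postcodes) != len(lonlat_lines):
        raise ValueError("postcode and coordinate counts differ")
    merged = []
    for postcode, line in zip(postcodes, lonlat_lines):
        lon, lat = line.split()[:2]  # ignore any trailing height value
        merged.append((postcode, float(lon), float(lat)))
    return merged


# Made-up sample values, not real output.
postcodes = ["HU1 1AA", "HU1 1AB"]
lonlat_lines = ["-0.3350 53.7440 0", "-0.3290 53.7480 0"]
merged = merge_lonlat(postcodes, lonlat_lines)
for postcode, lon, lat in merged:
    print(postcode, lon, lat)
```

The length check matters because the merge relies purely on line order; a dropped or extra line would otherwise shift every later postcode onto the wrong coordinates.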