Monday 28 September 2015

Extracting building heights from LIDAR

The UK Environment Agency have released some LIDAR data in Digital Elevation Model (DEM) format. It includes Digital Terrain Model (DTM) data and Digital Surface Model (DSM) data. The DSM data includes buildings and trees, while the DTM is processed to remove these so the underlying terrain is visible. Tim Waters asked whether subtracting the DTM from the DSM would leave just building and tree heights. I'd started to look at this and it turns out building heights are extractable in this way.
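The subtraction itself is simple cell-by-cell arithmetic. Here is a minimal sketch in Python, not the script described in this post. It assumes the tiles are ESRI ASCII grids with a NODATA_value header line, which is an assumption about the file format, and the function names parse_ascii_grid and subtract_grids are made up for illustration.

```python
def parse_ascii_grid(text):
    """Parse ESRI ASCII grid text into (header dict, rows of floats)."""
    header = {}
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        # Header lines look like "ncols 3"; data lines are all numeric.
        if parts[0][0].isalpha():
            header[parts[0].lower()] = float(parts[1])
        else:
            rows.append([float(p) for p in parts])
    return header, rows


def subtract_grids(dsm_text, dtm_text):
    """Return (header, DSM minus DTM), keeping NODATA where either input lacks data."""
    dsm_header, dsm = parse_ascii_grid(dsm_text)
    dtm_header, dtm = parse_ascii_grid(dtm_text)
    if dsm_header != dtm_header:
        raise ValueError("DSM and DTM tiles do not cover the same grid")
    nodata = dsm_header.get("nodata_value", -9999.0)  # assumed default marker
    diff = []
    for surface_row, terrain_row in zip(dsm, dtm):
        diff.append([
            nodata if s == nodata or t == nodata else round(s - t, 3)
            for s, t in zip(surface_row, terrain_row)
        ])
    return dsm_header, diff


HEADER = "ncols 3\nnrows 2\nxllcorner 500000\nyllcorner 430000\ncellsize 1\nNODATA_value -9999\n"
DSM = HEADER + "10.0 18.5 10.2\n-9999 10.1 25.0\n"
DTM = HEADER + "10.0 10.0 10.1\n9.9 10.1 10.0\n"

header, heights = subtract_grids(DSM, DTM)
print(heights)
```

Cells where the two models agree come out at or near zero, and cells over a building or tree hold its height above the ground.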

I have written a script to do the subtraction and create both the difference file and an SQL file to load the data into a postgresql table. I created a hill shading image from one of the difference files to see what features have been stripped from the DTM data. Here's a jpg version of it:

You can see that the buildings and some other features are all well defined. Phil Endecott suggested using the DSM data to create building outlines which could be traced in OSM. His images look good. I would suggest starting his process with this difference data as all the terrain detail has been removed so it may be even better.

Once the SQL file of height data has been loaded into a PostgreSQL database with the PostGIS extension installed, we can run some queries on it. I selected some OSM building polygons within the extent of the loaded data and found which of the height points fell within each polygon. The highest of these is the highest point of the building above the surrounding terrain. I've written a script as a place-holder to extract all the heights for a rectangular area. Someone could extend this to make a file to upload to OSM to add the heights for every building in the defined area. I feel this is clearly an import, so the usual OSM import process needs to be gone through before any import takes place.
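To show the idea without a database, here is a small Python sketch of "highest point inside the polygon". It is a stand-in for the PostGIS query, not the query I ran: it uses a plain ray-casting test, ignores holes in polygons, and the names point_in_polygon and building_height are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going east from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def building_height(polygon, height_points):
    """Return the highest sample inside the polygon, or None if none fall inside."""
    inside = [h for (x, y, h) in height_points if point_in_polygon(x, y, polygon)]
    return max(inside) if inside else None


# A 10 m square building and three height samples (x, y, height above terrain).
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
samples = [(5, 5, 7.2), (2, 8, 9.4), (15, 5, 30.0)]
print(building_height(square, samples))
```

The third sample is taller but lies outside the outline, so it is ignored.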

I think there is real merit in using this data to extract building heights, which are needed for the 3D images of city buildings.

The two scripts are available here: The first needs matching DSM and DTM files and outputs a difference file and the SQL to load into a database table. This will work for any of the resolutions published by the EA. The second, much less polished, script defines a rectangle, extracts the OSM buildings for that rectangle and then finds the height data for each building. I wrote it as a script so it can be extended to create a data file for processing or uploading, or so that overlay tiles could be made from it.

I loaded some OSM data with osm2pgsql (which would normally be used for rendering) and added the table for the heights data to the database. The SQL for the table is:

CREATE TABLE eadata
(
  hid serial NOT NULL,
  height double precision,
  locn geometry(Point,27700),
  CONSTRAINT eaheight_prim PRIMARY KEY (hid)
);

CREATE INDEX eadata_index
  ON eadata
  USING gist
  (locn);
The output SQL data can be loaded into this with the command:
psql -d <database> -f <sqlfile>
where <database> and <sqlfile> are whatever you used. The table name eadata and field names are hard-coded in the script.
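For anyone wanting to produce similar SQL themselves, here is a rough Python sketch of turning a grid of heights into INSERT statements for the eadata table. It is not the script from this post. It assumes an ESRI ASCII grid layout (rows listed north to south, xllcorner and yllcorner at the lower-left corner), places each point at the cell centre, and uses a hypothetical function name grid_to_sql.

```python
def grid_to_sql(header, rows, table="eadata"):
    """Build one INSERT per cell that has data, using cell centres in EPSG:27700."""
    nrows = int(header["nrows"])
    cell = header["cellsize"]
    nodata = header.get("nodata_value", -9999.0)  # assumed default marker
    statements = []
    for r, row in enumerate(rows):
        # Row 0 is the northern edge of the tile.
        y = header["yllcorner"] + (nrows - r - 0.5) * cell
        for c, h in enumerate(row):
            if h == nodata:
                continue
            x = header["xllcorner"] + (c + 0.5) * cell
            statements.append(
                f"INSERT INTO {table} (height, locn) VALUES "
                f"({h:.3f}, ST_SetSRID(ST_MakePoint({x:.2f}, {y:.2f}), 27700));"
            )
    return statements


sample_header = {
    "ncols": 2.0, "nrows": 2.0,
    "xllcorner": 500000.0, "yllcorner": 430000.0,
    "cellsize": 1.0, "nodata_value": -9999.0,
}
sample_rows = [[0.0, 8.5], [-9999.0, 15.0]]

sql = grid_to_sql(sample_header, sample_rows)
print("\n".join(sql))
```

For a full tile, a single COPY or multi-row INSERT would load much faster than one statement per point; the one-per-point form is only used here to keep the sketch readable.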

It is important to say that I would not use a rendering database as the source for upload data, because some fields will be missing: osm2pgsql is a lossy process. You can use the OSM ID to extract a current version from the API or from Overpass and add the height data to that. I used the rendering data for convenience as I already had it available and to satisfy myself that the process works.

I hope this is useful to someone. Please feel free to ask for more information if I've not made anything clear.

Sunday 20 September 2015

More LIDAR goodness

I looked into LIDAR data from the UK Environment Agency some weeks ago. I needed it to help a local group who are investigating flood mitigation options. The data was listed as being about £26,500 but we got a 100% discount if we used it for restricted, research use, so we could afford it! A few weeks after I'd used the height data for the group I got an email from the Environment Agency. They said the data was being made available as Open Data under the Open Government Licence. So now I could use it for any other purposes at no cost. You can get the data from

I decided to make a detailed relief map of part of the area close to home. The data doesn't cover the whole country, only parts that are deemed at risk of flooding. All of Hull and the river Hull catchment area are included in this. I've only looked at my local area so other areas may vary.

The data is downloaded as Ordnance Survey 10km grid tiles. There are 2m, 1m, 50cm and 25cm options and digital terrain model and digital surface model options too, so let's look at these options, but first a bit about LIDAR.

LIDAR is a technology that uses laser light to measure distance repeatedly across an area to create a 3D model of it. If the LIDAR transceiver is mounted at a fixed point it can pan around to record a very detailed image in 3D of everything that can be seen from that point. It works very well in this way inside a building or a cave to make a very accurate model. The US Space Shuttle flew a mission that used a related technique, radar rather than laser light, to record the height of the surface of the Earth from space. This is available as SRTM.

More recently LIDAR equipment has been flown in aircraft. The difficulty of making useful measurements from an aircraft should not be underestimated. The only data LIDAR returns is distance to the target, so knowing PRECISELY where the aircraft is in 3D is the real problem. GPS is hopeless at altitude measurement and scarcely good enough for lateral location; barometric height measurements vary over time and location; and inertial dead-reckoning accuracy falls off with time. A combination of all of these plus post-processing can result in useful data.

The Environment Agency LIDAR distance options specify the distance between the sample points, the 2m option having less detail than the 25cm option. The area that these options cover varies with the highest detail covering the smallest area. I chose the 50cm option as it covered the area I wanted at the highest level of detail. The detail does make for larger datasets and more processing needed to do anything with it.

Clearly the LIDAR measures the distance to the first object it encounters from the aircraft, so it measures tree tops, building roofs and even vehicles. This is known as the digital surface model. This is often a composite from multiple images, as this data is, to compensate for location inaccuracies and to help remove things like vehicles. To get a useful model of the real landscape, without trees and buildings, the data is post processed to create the digital terrain model. This is the data I have used.

The OGL data was different from the data the Environment Agency originally supplied. The original data was in smaller grid squares and the height was rounded to the nearest centimetre. The OGL data is in bigger squares which makes it a bit easier to process but seems to use 18 decimal places of a metre, which is smaller than the diameter of an atom.

I wanted to create a relief map and make contours from the data and, not for the first time, GDAL had the tools. The data uses the UK Ordnance Survey projection, known as OSGB36 or EPSG:27700, so to use any OSM data with it I would need to reproject to WGS84 or EPSG:4326.

To make a relief map I used gdaldem with the hillshade option on each of the data files. These need to be joined together to make a larger image, so the option -compute_edges is also needed. The complete command is:
gdaldem hillshade -compute_edges infile relieftiff/outfile.tif
The output is geoTIFF files which can be merged into a single (large) geoTIFF with the command -o big.tif *.tif
This creates a geoTIFF file which has the image of the relief in a TIFF image and also has the locations of the edges in the original OS projection.
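To give a feel for what hill shading does, here is a simplified Python sketch. It is not gdaldem's exact algorithm: it uses plain central differences and the dot product of the surface normal with the light direction. The light defaults (azimuth 315 degrees, altitude 45 degrees) match gdaldem's defaults, and the function name hillshade is just for illustration. Border cells are left at 0 because they have no neighbours on one side, which is the gap the -compute_edges option deals with in the real tool.

```python
import math


def hillshade(grid, cellsize, azimuth_deg=315.0, altitude_deg=45.0):
    """Shade interior cells 0-255 from the angle between surface normal and light."""
    az = math.radians(azimuth_deg)
    alt = math.radians(altitude_deg)
    # Unit vector pointing towards the light: x is east, y is north, z is up.
    light = (
        math.sin(az) * math.cos(alt),
        math.cos(az) * math.cos(alt),
        math.sin(alt),
    )
    nrows, ncols = len(grid), len(grid[0])
    shaded = [[0] * ncols for _ in range(nrows)]
    for r in range(1, nrows - 1):
        for c in range(1, ncols - 1):
            dzdx = (grid[r][c + 1] - grid[r][c - 1]) / (2 * cellsize)
            # Row 0 is the northern edge, so "north" is row r - 1.
            dzdy = (grid[r - 1][c] - grid[r + 1][c]) / (2 * cellsize)
            norm = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            normal = (-dzdx / norm, -dzdy / norm, 1.0 / norm)
            dot = sum(n * l for n, l in zip(normal, light))
            shaded[r][c] = round(255 * max(0.0, dot))
    return shaded


flat = [[5.0, 5.0, 5.0]] * 3
faces_west = [[0.0, 1.0, 2.0]] * 3   # ground rises towards the east
faces_east = [[2.0, 1.0, 0.0]] * 3   # ground rises towards the west

for name, grid in (("flat", flat), ("west", faces_west), ("east", faces_east)):
    print(name, hillshade(grid, 1.0)[1][1])
```

With the light in the north-west, the west-facing slope comes out brighter than flat ground and the east-facing slope darker, which is what makes buildings and banks stand out in the image.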

The next step is to use gdalwarp to reproject the large TIFF file to one in the WGS84 projection. The command describes the source and target projection and filenames. There are significant missing pieces in the large TIFF as the available data was not rectangular. The -srcnodata 0 and -dstalpha options make missing data transparent rather than black.
gdalwarp -s_srs epsg:27700 -t_srs epsg:4326 -srcnodata 0 -dstalpha big.tif bigr.tif
The new TIFF file is what we want to see, but now it needs turning into tiles to be displayed on a slippy map. I decided that zoom level 13 to 18 would give a useful display. To make these tiles I used, specifying the reprojected TIFF image, the zoom levels and the folder to put the tiles into.
-z13-19 bigr.tif tiles
This makes a set of tiles in the TMS format in the specified folder, in this case tiles.
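TMS tiles use the same grid as the usual slippy map (XYZ) tiles, but count rows from the bottom instead of the top. Here is a small Python sketch of the standard tile-number formula with the TMS flip. The function names are made up, and the example coordinates are only roughly where Hull is.

```python
import math


def xyz_tile(lat, lon, zoom):
    """Standard slippy-map tile column and row (row 0 at the top)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tms_tile(lat, lon, zoom):
    """Same column, but the row is counted from the bottom as in TMS."""
    x, y = xyz_tile(lat, lon, zoom)
    return x, (2 ** zoom - 1) - y


lat, lon = 53.74, -0.33  # approximate location, for illustration only
for zoom in (13, 18):
    print(zoom, xyz_tile(lat, lon, zoom), tms_tile(lat, lon, zoom))
```

If the tiles look vertically scrambled in a map viewer, the viewer is probably expecting XYZ rows and needs its TMS setting turned on.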

Another way to visualise the LIDAR data is contours. I decided to create a set of overlay tiles that are transparent except for the contours. These can have a different density of contours at each zoom level. I chose the smallest contour step to be shown at the highest zoom level to be 0.2 metres. The GDAL tool for the job is gdal_contour which makes a shapefile with a linestring for each contour. The command is
gdal_contour -i 0.2 -a height infile outshapefile.shp
The resulting shapefile needs to be reprojected to WGS84. The tool to reproject shapefiles is ogr2ogr, which takes the output file before the input file:
ogr2ogr -t_srs epsg:4326 -s_srs epsg:27700 new.shp outshapefile.shp
I decided to use Mapnik to make the contour overlay tiles. Mapnik can use shapefiles, but specifying the long list of shapefiles created above would be a problem, so I loaded the shapefiles into a PostgreSQL table in a database with PostGIS enabled. PostGIS comes with shp2pgsql to do this:
shp2pgsql -a -g way new.shp eacontours > new.sql
This makes SQL to load the shapefile into eacontours table, putting the geometry in the field called way. To load this into a database called ukcontours which already has the postgis extension installed the command is
psql -d ukcontours -f new.sql
I then designed the overlay with Tilemill to create the transparent tiles with more contours at higher zoom levels.
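The idea of showing more contours at higher zoom can be sketched in a few lines of Python. The 0.2 metre base step is the one used with gdal_contour above, but the per-zoom table below is hypothetical and is not the set of rules from my Tilemill project.

```python
BASE_STEP = 0.2  # metres, the interval passed to gdal_contour

# Hypothetical choice: draw every Nth 0.2 m contour at each zoom level.
EVERY_NTH = {13: 25, 14: 10, 15: 5, 16: 5, 17: 2, 18: 1}


def contour_visible(height, zoom):
    """Return True if a contour at this height should be drawn at this zoom."""
    zoom = min(max(zoom, 13), 18)  # clamp to the zoom range in the table
    # Work with whole-number contour indexes to avoid floating point remainders.
    index = round(height / BASE_STEP)
    return index % EVERY_NTH[zoom] == 0


for zoom in (13, 15, 18):
    shown = [h / 10 for h in range(0, 52, 2) if contour_visible(h / 10, zoom)]
    print(zoom, shown)
```

With this table zoom 13 shows a contour every 5 metres and zoom 18 shows all of them, every 0.2 metres.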

You can see the results at I added a water overlay and a roads overlay (thanks to MapQuest for the roads) to help position the relief imagery.