Saturday, 16 April 2011

More villages and lint chasing

A couple of years ago we added what we thought was the last road to OSM within the boundary of Hull. That turned out to be a bit optimistic, but fairly close to the truth. We have taken much longer to move towards completion of the East Riding of Yorkshire. There is still a long way to go, with towns like Bridlington and Hornsea still requiring a bit of work and a few largish villages still needing attention. The countryside is pleasant and now the spring is arriving, with blackthorn hedges bursting into better shows of flower than I remember for many years.

Recently we have surveyed Beeford and Brandesburton. Both were easy to do, but the difference in ambience between the two villages was marked. Beeford was pleasant, Brandesburton was distinctly odd. The list of villages needing their roads and basic facilities mapping in East Yorkshire is now probably about 40, and some are tiny, needing only a brief, passing visit. There are still surprises though. We found a secondary road heading out of Beeford which was not mapped. I need to check if there are any more that need adding in the county.

The process of checking roads has been assisted by the release of Ordnance Survey's Locator data. This defines a bounding rectangle and centroid for every named road in the UK. This has proved particularly useful in settlements since it shows up roads that are newly added in an area since it was surveyed. To get this I use the highly useful transparent tiles that ITO World produce. I added them to my own overlay, which now also includes postcodes at high enough zoom levels. The OS Locator data is also available in OS Locator Musical Chairs.
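As a rough sketch of the kind of check this data makes possible: take each named road's bounding rectangle and look for an OSM way with the same name that has at least one point inside it. The class names, field names and the `missing_roads` function below are hypothetical, invented for illustration; they are not the real OS Locator file layout nor how ITO World or OS Locator Musical Chairs actually work. Coordinates are assumed to be in the same system on both sides.

```python
from dataclasses import dataclass


@dataclass
class LocatorRoad:
    """One named road: a name plus its bounding rectangle (hypothetical fields)."""
    name: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float


@dataclass
class OsmWay:
    """A mapped road: a name plus its points as (x, y) tuples."""
    name: str
    points: list


def normalise(name):
    # Compare names case-insensitively and ignore repeated spaces.
    return " ".join(name.lower().split())


def in_box(road, point):
    x, y = point
    return road.min_x <= x <= road.max_x and road.min_y <= y <= road.max_y


def missing_roads(locator_roads, osm_ways):
    """Return names that have no same-named OSM way inside their rectangle."""
    missing = []
    for road in locator_roads:
        wanted = normalise(road.name)
        found = any(
            normalise(way.name) == wanted
            and any(in_box(road, p) for p in way.points)
            for way in osm_ways
        )
        if not found:
            missing.append(road.name)
    return missing


locator = [
    LocatorRoad("MAIN STREET", 0, 0, 10, 10),
    LocatorRoad("Mill Lane", 20, 20, 30, 30),
]
osm = [
    OsmWay("Main Street", [(1, 1), (5, 5)]),
    OsmWay("Mill Lane", [(100, 100)]),  # same name, but somewhere else
]
print(missing_roads(locator, osm))
```

A name that turns up in the result is only a prompt to go and look; it says nothing about which source is right.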

Like all lint tools, this one has some drawbacks. The main one is that people are tempted to just use the name from the OS Locator dataset rather than going out to check what is really there. On many country roads there are no name boards and OS Locator is usually right, but even so I prefer to go and look before using the OS name. At this time of year the ride out is always worth it.

Lint tool providers have a duty, in my mind, to be as impartial as possible. They can wield quite a lot of power in the way they choose to flag up errors. Many people quickly adopt the scheme as the standard way to tag and there is a flurry of activity to reduce the 'errors' for their area. Recently there has been some discussion about the way national speed limits are tagged. I strongly believe that tagging schemes should be designed to assist the surveyors and mappers, but some seem to think that the consumers of data should be treated preferentially over mappers. Here's why I don't agree:
  • There are never enough mappers in an area. Attracting them and keeping them is hard. An overly complex tagging scheme means they will quickly get fed up. "All that just to add a speed limit? No thanks."
  • Complex schemes mean mappers make mistakes. These are then likely to be 'fixed' by some bot or mass edit. Such edits are not made by mappers who have visited, so surveyed data can be lost or changed.
  • Mappers must use a tag over and over again, each time they encounter the thing that needs tagging. Complex schemes mean adding many tags for one object every time you see it, which increases the mapper's workload.
  • Programmers write code to recognise and use a tagging scheme once. The map or analysis or whatever that they produce can then be run over and over again with little or no effort, consuming the tags that mappers took hours to add.
  • Retrospectively increasing the complexity of tagging for a commonly encountered object, like a speed limit, means you are asking people to resurvey huge lengths of road. If you don't need a resurvey then the increased complexity is not needed as it can be inferred some other way in a program.
One thing I have noticed, and which has been commented upon elsewhere, is that many of the people suggesting that data consumers need this tag or that tag to make sense of the data have never actually tried to use the data in a program. They have no idea what is really involved and what techniques can be applied to get around any problem.
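To illustrate the point about inferring things in a program, here is a minimal sketch of a consumer turning one simple tag into a number. It assumes a hypothetical value `maxspeed=national` purely for illustration, not any agreed OSM scheme, and it guesses the carriageway type from `highway` and `oneway`, which is a crude heuristic. The figures are the UK national limits for cars: 60 mph on single carriageways, 70 mph on dual carriageways and motorways.

```python
# UK national speed limits for cars, in mph.
NATIONAL_LIMIT_MPH = {"single": 60, "dual": 70, "motorway": 70}


def carriageway(tags):
    """Guess the carriageway type from other tags (an assumed heuristic)."""
    if tags.get("highway") == "motorway":
        return "motorway"
    if tags.get("oneway") == "yes":
        # Assumption: a one-way road at national limit is half of a dual carriageway.
        return "dual"
    return "single"


def effective_maxspeed_mph(tags):
    """Return a numeric limit in mph, or None if nothing usable is tagged."""
    value = tags.get("maxspeed")
    if value is None:
        return None
    if value == "national":  # hypothetical value, used here only for illustration
        return NATIONAL_LIMIT_MPH[carriageway(tags)]
    number = value.replace("mph", "").strip()
    return int(number) if number.isdigit() else None


print(effective_maxspeed_mph({"highway": "secondary", "maxspeed": "national"}))
print(effective_maxspeed_mph({"highway": "motorway", "maxspeed": "national"}))
print(effective_maxspeed_mph({"highway": "residential", "maxspeed": "30 mph"}))
```

The mapper adds one tag; the lookup is written once and runs on every road afterwards.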
