Friday, 4 March 2011

Who needs approval?

I have seen discussions about what tag to use since I first joined OSM and it's not always easy to understand. Some things seem obvious, like tagging a road. It's a highway, right, so highway= ... ? Residential is fairly easy (but is it a living_street?), but is a substantial road a trunk road or a primary, or even a secondary? Is a small road a service road, a track, an unclassified or even a path?

How do I tag a church? Apparently it's not a church, it's a place of worship, so then it needs an extra religion tag to differentiate churches from temples from synagogues etc. Why not just tag it a church or a temple?

Areas of land are unfathomable. Some get the landuse tag, i.e. how the land is used such as retail or commercial. That makes sense, but then along comes landuse=grass. Not 'used for grazing' or 'used to suppress weeds', just landuse=grass. Hmmm, does that work?

So it seems that somewhere in the past a plan was devised to sort this out: an approval process. New tags would be proposed, discussed and commented upon and then voted on, so tags would be approved for use. Approved tags? What's that all about? There are no approved tags, anyone can use any tag that suits their purpose. That's at the core of OSM and part of what makes OSM so successful.

Imagine what would happen if I couldn't use any tag I want to. Picture the scene: I come across something that is not in the list of approved tags. I want to add it to the map. I have two choices, 1) add it as something else that is in the approved list but that is not what I have seen or 2) I can walk on by and not add it until it is in the approved list.

Both of these are stupid: option 1) will show up as the wrong thing until the tag is approved. What if the tag is not approved? Will I remember to remove it? How will I feel about removing my hard-won object?

Option 2) is even worse. The object will not get into the map database. So not only will no one looking at the map or database know it is there, no one will know there is such a thing waiting to be added, so when (if) the tag is proposed no one will know how many of them there are waiting for the tag. Will I go back and add it to the database after the tag has been approved? Somehow I doubt it, more likely I would be fed up with the bureaucracy of the process and will have left the project to give my time to someone else.

Then there's the thorny question of who should be the one to approve these new tags? No one owns OSM, so there's no one to appoint an approver. How do I know what it is you are interested in or what you want the map to be about? I don't, so doesn't that rule me out from approving your tags? That effectively rules everyone out from being the approver since no one can understand everyone's motivations, so next you need a committee to try to cover more interests. How big will the committee need to be? Who sits on the committee? Am I allowed to vote someone on/off it? Why? What gives me a vote? Do I get a vote because I have been in OSM for six days? Six months? What if I have only changed a single road name in that time? So maybe I get a vote because I've added a thousand nodes. I could just add loads of nodes into existing roads just to get my vote. Can I buy a vote in return for a donation of ten dollars? What about 10,000 dollars?

So even if some sort of committee gets formed, how often do they meet? Will they need to get advice from other experts? So how long will it take to get a tag approved? When they do decide to turn down a tag will there be some sort of appeal process? What if that is challenged by a rich company that wants a tag approved and who has a large legal team?

How will people be bound by any of this? Clearly they will have to agree to this, so as well as signing up to a licence for the data, they will have to agree to be bound by the Approvals Committee. How many people will agree to that?!?

So the answer is simple. Allow people to use any tag they like and allow a process of natural selection to merge disparate tags together as a consensus develops. Publish a list of tags that are widely used based on statistics. A description of how people have chosen to use it would help other people decide if they want to use it too. A list of the tags supported by some of the main data consumers such as renderers and routers would help too.
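As a rough illustration of what "widely used based on statistics" could look like, here is a minimal sketch in Python. It assumes the tags of each object are already available as plain dictionaries; the sample data, the function names `tag_usage` and `widely_used`, and the threshold are all made up for the example and are not part of any OSM tool.

```python
from collections import Counter


def tag_usage(elements):
    """Count how many objects carry each key=value pair."""
    counts = Counter()
    for tags in elements:
        for key, value in tags.items():
            counts[f"{key}={value}"] += 1
    return counts


def widely_used(counts, min_uses):
    """Return (tag, count) pairs used at least min_uses times, most used first."""
    return [(tag, n) for tag, n in counts.most_common() if n >= min_uses]


# Hypothetical sample: the tags of six mapped objects.
sample = [
    {"highway": "residential"},
    {"highway": "residential", "name": "Mill Lane"},
    {"highway": "footway"},
    {"landuse": "grass"},
    {"highway": "footway"},
    {"highway": "residential"},
]

counts = tag_usage(sample)
popular = widely_used(counts, min_uses=2)
for tag, n in popular:
    print(f"{tag}: used {n} times")
```

The point of the sketch is that the list describes what mappers have actually done, with no vote involved; where to set the threshold is a judgment call for whoever publishes the list.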

If this is the right way to do it, then why does the approval process exist? A very good question. I don't think it should be there at all. Removing the process would remove the rite of passage of creating a tag. It would stop the stupidity of someone creating a tag, getting it through the voting process by a few people voting to approve it and then claiming it is an approved tag and demanding it gets rendered even though it has hardly ever been used. It would stop people voting for or against a new tag even though they have not used it, have no knowledge about the subject and only voted because someone invited them to do so on a mailing list.

So I do sometimes struggle to find the best tag to use, but I'm relaxed about that. I am not at all happy about the phoney, abused and irrelevant tag-approval process and I would like it to end now.

9 comments:

davespod said...

Or to put it another way: vote with your feet!

This is the best reasoned argument against the wiki tag approval process I have seen.

Still not 100% sure it should be scrapped (but probably 90% sure). One of the things I like about natural selection is that it is the simple tags that seem to be popular, rather than unnecessarily over-complicated schemata involving relations (not saying these are never necessary, but they rarely are).

Russ Nelson said...

One more step is needed: to document in the wiki 1) that you have used a tag or tag combination, and 2) what you mean by it.

gom1 said...

I agree that freedom to create new tags is valuable, and there is no need for an approval process, but thankfully approval processes can be ignored. I'd have bigger issues with an enforcement process.

Clumsy efforts to achieve consistency can certainly do more harm than good, but there is still value in some consistency. Guidance along the following lines falls short of an approval process, but still seems appropriate.

"There are many different ways that you can describe something when you add it to OSM, and it is up to you which options you choose. Experienced contributors have found that some approaches work better than others in practice. Most beginners find it best to tag that feature in this way." = Good

"Generally accepted best practice among experienced contributors for that feature is to add information that relates to a, and b, using x to mean y and p to mean q." = Good

"When using this data, you can normally assume that people are using such and such a tag to mean this..." = Good

"If you tag something this way, then this is how most data users currently interpret it...." = Good

"Some people do things this way because..., others do them that way because.... At the moment there is no agreement on which is best. But they both mean much the same thing. So you need to handle both alternatives when using the data, but you can take your pick when adding a tag. Here is some information that might help you make your choice." = Good

"Some people use this combination of tags to mean one thing, but others use them to mean something different. At the moment there is no agreement on which interpretation is better, and there is no way to tell what each contributor originally intended. If you want to make your intentions clear, it's probably wise to avoid both alternatives, and use this different set of tags instead." = Good

"We started doing it this way, but we now realise that we hadn't thought about something important. On reflection, most people with a special interest in this particular feature reckon that to capture information accurately it would probably be better if we encouraged people to do it this way from now on" = Good

"The way we go about trying to build consensus and consistency around here is to do things like this, this and this" = Good

Tordanik said...

This article's arguments seem to partially rely on the false dichotomy between "any tag you like" and "approval" processes. These would only be mutually exclusive if you could not use unapproved tags, which is clearly not the case and is not suggested even by supporters of the wiki proposal (RFC & vote) processes.

In my opinion, it's desirable that everyone can create tags for new objects without bureaucratic impediments. It is, however, also desirable that different styles of tagging the same phenomenon ultimately converge towards a single style.

The ability to use any tag you like primarily serves the first goal. The wiki proposal process is one of the factors that contribute to the second goal. It is a means to encourage people to provide feedback, and can serve you well if you aren't quite sure whether your idea is good, or try to get opinions on whether it accounts for the requirements of mappers outside the group of people you normally communicate with - beyond your local/national mapper community.

Sander said...

Why not implement it this way:

1) someone enters a custom tag in an OSM editor (a tag that's not already in the wiki)
2) when the person saves/uploads the changes, he gets a notification that he used a new tag with a question if the tag is correct
3) if the tag is correct, he can enter a little description for it which will appear on the wiki with some kind of {{stub}} template
4) some advanced OSM'er looks at the wiki and edits the page


Problems:
- The editors should support the commenting, but it is possible since there are e.g. wiki readers and editors written in Python
- Every OSM editor should have a wiki account
- extra work for the editors
- An overfull wiki (maybe only present the comment dialog on the 10th or 100th time the tag is used)

Tom Chance said...

Chris,

I think your post describes quite nicely the flaw with your own argument. Leave it too open and you create massive confusion, both for relatively new contributors and for data users.

I quite agree that the tin-pot democrats on the wiki shouldn't be allowed to impose "decisions", especially where they override longstanding conventions or seek to deprecate long used tags. Perhaps this attitude comes from a very English sense of common law!

But there are times when the laissez-faire approach falls down. Look, for example, at the various competing schemes for tagging footpaths, or (as you mention) the various alternative tags that people take to mean "a nondescript patch of grass" (back when I started you just used landuse=recreation_ground!).

As a new user you will find about 3 different answers on footpaths on the wiki, depending on how you look for it. You will find different footpath tagging schemas in the presets of the main editors (Potlatch 1, Potlatch 2, JOSM, Merkaartor). You will find different schemas supported in the "main" renderers (www/mapnik, Tiles@Home, OpenCycleMap, various CloudMade, etc.)

What we need, and I suspect we will be left without for a few more years at least, is a democratic body elected by OSM Foundation members empowered to step in and resolve sticky disagreements. To work through the data contributor and user needs and identify the best tagging schema to endorse and roll out to enough major editors and users for it to gain widespread acceptance.

For every other new tagging idea, the wiki serves as a good space to discuss and improve them before making general use of them. It's fine, just so long as it is never used to override/deprecate widely used tags.

Tom

Chris Hill said...

Thanks to all for your comments. Just to be clear, I am not arguing against a consensus on the way tags are used, I am simply pointing out that voting on a proposal is phoney, open to abuse and, given there are no approved tags, wrong.

I whole-heartedly agree with better documentation. That is hard to do well, and much harder than most people who haven't written any seriously would imagine, but good documentation will help draw together a consensus. There must be a number of people in OSM skilled at objective documentation (there are plenty of people who clearly are not).

I cannot agree with a committee to settle disputes about the best way to tag things for all the reasons I give. Indeed I expect the 'core' tags to diverge. Why should US mappers use highway=motorway, why not use highway=freeway? Then they get the real meaning of a freeway not some phoney meaning of an 'American motorway'. When the law about cycleways in one country allows mopeds and pedestrians and in another country only allows bicycles, why bend one tag to two quite different meanings?

Tom Chance said...

Chris,

That's a valid question. For me the answer is that the goal of compatibility and simplicity for data users makes the added hassle for people writing editor presets worth it.

Anyway, it dodges the footpath example where even in the UK we have several competing schemes that arose because the original simple solution of highway=footway/cycleway/bridleway/etc. is inadequate.

Chris Hill said...

Tom,
The problem of the different tagging schemes in the UK was caused by the path + designated proposal. Until then the highway=footway was the way footways got tagged. The editors just reflect the position of their respective authors. As I recall, people who don't live in England decided that footway, cycleway and especially bridleway didn't mean much in their country and so wrote a proposal for path etcetera, which got voted through. People dug their heels in and here we are. This was not sorted out by voting, that just made some people feel that their proposal was now the official way to do things.

The idea of footway, cycleway, track etc with the addition of access and designation tags works well. Path is useful too.

To be clear, I don't mind that there are different ways of tagging things, especially in various legal jurisdictions. What I want to stop is the voting process that lends an air of importance to something that has none.