Alex Steer

From semantic webs to thesaurus worlds

I've spent a lot of time recently thinking (not blogging) about context-dependent search and location-based digital services of the Layar variety, and in particular what they mean for brands and consumers. There's too much to go into, but on the one hand there is enormous potential to improve opportunities for that most enjoyable of activities, random discovery; and on the other, there is the rather tiresome effect of putting another layer of intermediation between people and the world around them, even as we keep praising the disintermediating effects of the social web.

For now, I'd like to think about one of the newest real-world search applications, the frankly astonishing (if it works) Google Goggles. There has been no shortage of planners, social media commentators and all talking about GG, so I won't rehash those analyses. I will say, with my linguist's goggles on, that this is a particularly clear example of the trend towards making the real world a more semantically well-organised place.

A few years ago it felt like you couldn't move without hearing someone talking about the semantic web: the idea that data on the web should be well structured so it can be read and processed by machines as well as humans, meaning that data can be sliced, diced and re-presented as desired. Though it's less of a buzzword now, the idea of the semantic web (now that it's no longer just mired in tagging) has proved phenomenally useful. We're used to websites that can cut and swap data via APIs and XML specifications - think of all those Facebook apps that use your data, or the various platforms for accessing Twitter. The other big development, of course, has been in natural language processing - helping computers improve their understanding of unstructured data.

All this has made the web a much more searchable place. We are now beginning to see the creation of a semantic world, which functions by interposing a layer of data between device-carrying human beings and physical objects. For individual users our semantic goggles can be switched on and off when need be (though in theory that's true of mobile telephones too), but collectively it means that we are going to start paying a lot more attention to the meanings and categories of things in the world. We are moving from the world as dictionary - a collection of items that we can only read one at a time and try to remember - to the world as thesaurus: a network of semantically interrelated items that we can cross-refer and make associations between. (For anyone interested in what the web means for this kind of thinking about actual dictionaries and thesauruses, I recommend my former colleague James McCracken's excellent paper.)
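For the more code-minded, the dictionary/thesaurus distinction can be put as a difference in data structure. What follows is only a rough sketch in Python, not a description of how Google Goggles or any real service works: the entries, the links between them and the `related` function are all made up for illustration. The dictionary is a flat lookup, one item at a time; the thesaurus is a graph you can walk outwards from any starting point.

```python
from collections import deque

# Hypothetical data. A "dictionary" maps each item to an isolated definition.
dictionary = {
    "pub": "a building where drinks are sold",
    "beer": "an alcoholic drink made from malt",
    "brewery": "a place where beer is made",
}

# A "thesaurus" records which items are related to which (invented links).
thesaurus = {
    "pub": {"beer", "restaurant"},
    "beer": {"pub", "brewery"},
    "brewery": {"beer"},
    "restaurant": {"pub"},
}


def related(start, max_hops):
    """Return {item: hops} for items within max_hops links of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        item = queue.popleft()
        if seen[item] == max_hops:
            continue  # do not follow links beyond the hop limit
        for neighbour in sorted(thesaurus.get(item, ())):
            if neighbour not in seen:
                seen[neighbour] = seen[item] + 1
                queue.append(neighbour)
    del seen[start]  # the starting item is not "related" to itself
    return seen


# Dictionary world: one entry at a time.
print(dictionary["pub"])

# Thesaurus world: follow associations outwards from the same item.
print(related("pub", 2))
```

Looking up "pub" in the dictionary tells you only about pubs; asking the thesaurus about "pub" leads you on to beer, restaurants and, one step further, breweries.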

Inevitably, when a new visual or audio search technology comes along, someone somewhere proclaims the death of language as a medium on the web. Given my expressed scepticism about 'death of' arguments, I obviously have little sympathy, particularly in this case. Real-world search is going to require a massive expansion of both the semantic web and natural language processing as we try to make sense of the poor stimuli and fuzzy logic (not to say fuzzy images) of the world on which we're trying to build a thesaurus. I can't help feeling John Wilkins and the other seventeenth-century theoreticians of philosophical languages should be the patron scientists of the context-dependent web.

# Alex Steer (18/12/2009)