Alex Steer

Better communication through data

When pigs flu: the social life of pandemics

39 words | ~0 min

My post on the social and economic drivers of pandemics (and our fear of them) appears on the Futures Company blog this morning.

# Alex Steer (28/04/2009)


Who invented 'twitter'?

973 words | ~5 min

On Wednesday, the wonderful elves at Qikipedia (the Twitter presence of QI) announced that:

The word 'twitter' was first used by Geoffrey Chaucer in 1374.

Now, this is obviously the sort of thing to make lexicographers (even washed-up former lexicographers) sit up and take notice. So let's get the big, obvious and pedantic problem out of the way first.

If you look at the OED's entry for twitter, v.1 (originally published in 1926; included in the 1989 Second Edition), the first quotation in the first sense ('intr. Of a bird: To utter a succession of light tremulous notes; to chirp continuously with a tremulous effect.') is:

1374 CHAUCER Boeth. III. met. ii. 54 (Camb. MS.) The Iangelynge bryd..enclosed in a streyht cage..twiterith desyrynge the wode with her swete voys.

There are no other quotations that seem to antedate 1374, so Chaucer appears to claim the prize for earliest use.

But that doesn't, of course, mean that he invented the word. Just that his translation of Boethius's Consolatio Philosophiae (normally known as Boece) is recorded in the OED as containing the earliest example of the word that had been found by 1926. This is one of those things that lexicographers will tell you until they die: earliest citation does not necessarily mean invention.

There's another problem, though. A lot of work has been done on medieval books and writing since 1926, and a lot more has been discovered. Language changes, but so does our knowledge of language in use. That's one of the reasons why the OED is being continuously updated, with new and revised entries now being published online every quarter. (See the website for information.) It's common, with the tools now at lexicographers' disposal, to find earlier examples (known as 'antedatings') for words.

In the case of twitter, no new example has been found. Instead, something more complicated has happened to knock Chaucer off his perch. Welcome to the surprising world of historical bibliography...

The second quotation given for the first sense of twitter is:

1387 TREVISA Higden (Rolls) I. 237 Þe nytyngale in his note Twytereþ wel fawnyng Wiþ full swete song.

For those not familiar with Middle English, the weird 'þ' character is called 'thorn', and pronounced 'th'. For those baffled by the OED's slightly cryptic citation, this is the translation by the fourteenth-century Cornish vicar John Trevisa of the Polychronicon, a history of the world by Ranulf Higden, a Benedictine monk from Chester. We know exactly when Trevisa finished his translation, because he noted it at the end:

God be þonked of al his nedes þis translacioun is I ended in a þorsday þe eyȝteþe day of Aueryl þe ȝere of oure lord a þowsand þre hondred foure score and seuene þe tenþe ȝere of kyng Richard.

This might seem irrelevant if the Chaucer quotation comes from 1374. The reason it matters is that we need to know not when we think Geoffrey Chaucer might have finished writing Boece or when John Trevisa might have finished writing Polychronicon, but when the earliest surviving manuscript containing the word twitter dates from.

You see, books and manuscripts were copied and recopied - it was a huge industry in fourteenth and fifteenth-century England, before the invention of print (and after, for some time) - and the copyists would introduce changes. Sometimes this would be to replace words from one local dialect to make the work comprehensible to readers elsewhere in the country; sometimes it just seems to have been personal preference or error. It's quite rare to have an author's original copy of a work in his or her own hand. (These are known as holographs, which sounds quite exciting but isn't. There's a possible holograph of Chaucer's Equatorie of the Planets in the manuscript collection of Peterhouse, Cambridge.) All of this means that you can't guarantee that a word in a manuscript was put there by the author. (You can get more and more sure by comparing different manuscripts, but that's about all.)

Here comes the science. The manuscript of Chaucer's Boece in which the word twitter appears is called Cambridge, University Library MS Ii.1.38, and has been dated to the first quarter of the 15th century, somewhere around 1425. (There are lots of ways of dating manuscripts, which thankfully I'm not going into here.) The manuscript of Trevisa's Polychronicon is Cambridge, St John's College, MS H.1, and dates to the late 14th century, somewhere between 1387 and 1400. It is the earliest manuscript copy of the Polychronicon that survives.

That means that our earliest example of twitter appears in the Polychronicon, not in Boece. Until someone finds an earlier one, anyway.
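For the programmatically inclined, the argument above boils down to sorting by the right date. Here is a minimal sketch in Python. The `Quotation` class and its field names are invented for illustration, and the manuscript date ranges are my own reading of the datings given above ("first quarter of the 15th century" as 1401-1425, "between 1387 and 1400" as 1387-1400), not figures from any catalogue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quotation:
    """Hypothetical record for one dictionary quotation."""
    author: str
    work: str
    cited_year: int    # the date printed in the OED citation
    ms_earliest: int   # assumed earliest date of the surviving manuscript
    ms_latest: int     # assumed latest date of the surviving manuscript


quotations = [
    # Cambridge, University Library MS Ii.1.38
    Quotation("Chaucer", "Boece", 1374, 1401, 1425),
    # Cambridge, St John's College, MS H.1
    Quotation("Trevisa", "Polychronicon", 1387, 1387, 1400),
]

# Ranking by the citation date puts Chaucer first...
by_citation = min(quotations, key=lambda q: q.cited_year)

# ...but ranking by the date of the surviving manuscript puts Trevisa first.
by_manuscript = min(quotations, key=lambda q: (q.ms_earliest, q.ms_latest))

print("Earliest by citation date:", by_citation.author)
print("Earliest by manuscript date:", by_manuscript.author)
```

The only design choice worth noting is the sort key: the same two records give different winners depending on whether you rank by when the text is thought to have been written or by when the surviving copy was made.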

So next time you use Twitter, spare a thought for John Trevisa. He may or may not have invented the word, but he was one of English's great organisers and sharers of content. Not content with a massive chronicle, he also translated the Middle Ages' most popular encyclopedia (Bartholomeus Anglicus's De Proprietatibus Rerum, 'On the Properties of Things'), and may also have been involved in the effort by John Wycliffe's friends and followers to translate the Bible into English.

Though not, admittedly, in 140 characters or fewer.

# Alex Steer (14/03/2009)


I before E except in DCSF

930 words | ~5 min

Update: 16 March 2009 - Rob Wilson's blog no longer seems to be available.

Jim Knight, the Schools Minister, has spelling mistakes on his blog. Predictably, everybody is having a field day with this, dutifully listing some of the more egregious errors. Rob Wilson, the Conservative education spokesman, managed this rather neat jibe about his opponent:

He will be disappointed with his efforts in class but I'm sure he'll make every effort to improve now teacher has noticed he's falling behind.

The papers are loving it too. The Telegraph, always a stickler for standards, has wheeled its grammarians out of whichever corners grammarians inhabit, to remind us that:

Mr Knight, who is responsible for raising education standards, also clearly has problems with the "i before e, except after c" spelling rule taught to primary school pupils.

Two things are clear. The first is that we love a good laugh at someone's expense. There is perhaps no neater world-gone-mad story than one about a schools minister who's not very good at spelling. It gives expression to our dislike for authority figures, while at the same time letting us remind ourselves how much we love hard and fast rules when it comes to language - or, at least, how much we love letting other people know that we know those rules. There's room within the story for shock and disappointment (genuine or otherwise), and for a little bit of smugness to round it off. He may be the schools minister, but we can spell 'recess'.

The second thing that's clear is that none of Jim Knight's fastidious nemeses has heard of Muphry's Law. This is the jokey adage, common among proofreaders, editors, lexicographers and the like, that any written criticism of editing or proofreading will itself contain an editing or proofreading error.

In this case, Muphry's Law has chosen the Telegraph as its victim. Here's the first full paragraph of its piece:

The mispellings of Mr Knight, who was educated at Cambridge University, include "maintainence", "convicned", "curently", "similiar", "foce", "pernsioners", "reccess" and "archeaological".

Yes, that's mispellings.*

But the Law has another, perhaps more perfect victim this week. It's Rob Wilson, the abovementioned Conservative education spokesman. Mr Wilson also has a blog. Both Jim Knight and Rob Wilson have published posts of roughly equivalent length in the last week. (Jim Knight's is 337 words; Rob Wilson's is 465.)

I could be unreasonably cruel to them both and go through their posts as an editor, looking for the whole range of improvements that need making to grammar, style, punctuation, etc. But since this fight is about spelling, let's stick to words that are spelled incorrectly.

In Jim Knight's post:

In their plans schools could set up where ever, with no local co-ordination.

This should read 'wherever'. I count this as a spelling error, because 'wherever' is, in this context, probably best read as a locative adverb (elliptically for 'wherever they like' or similar, in which 'wherever' is a subordinating conjunction), for which you can't really substitute the conjunction + adverb combination 'where ever' (even though this is the root of the subordinating conjunctive use of 'wherever' - stop me if this is getting too exciting). It's therefore not a valid option for spelling the word Jim Knight wanted to use.

In Rob Wilson's post:

I also know that the procedures of Parliament get things right many more times then they get them wrong

Then should be than.

and

I am a lover of Parliamentary democracy and the traditions developed here over hundred's of year.

Hundred's should be hundreds.

So, Rob Wilson's most recent post contains twice as many errors as Jim Knight's. Clearly the Schools Minister has been learning his lesson and checking his work.

But does it matter, even slightly? A sociolinguist will tell you that people apply different standards of orthography according to the media in which they're writing. That's why, despite a million scare stories, kids don't tend to write their homework in text-speak, and why your respectable auntie will send you texts without any vowels in. Anyone, sociolinguist or otherwise, will tell you that sometimes mistakes creep in because of errors in what you do, not what you know. If you type quickly and don't check your spelling, you might end up with 'recieve' instead of 'receive'. It doesn't make you an idiot, just a bit careless. And you might have good reason not to care. Blogs, even MPs' blogs, still have a reputation as being informal means of communication. That's part of their charm. That means they don't go through rigorous proofing and correction (apart from this one, obviously). That's probably especially true when their authors have other things to do, like, say, being responsible for the performance of every school in the country.

You'd think, given the widespread impression that all political communications are now ruthlessly controlled, we might be reassured by a few typos.**


* By the way, Googling 'mispelling' is a hilarious experience. You get a lot of pages of people using the word to complain about other people's misspellings.

** Yes, maybe that's the point. Conspiracy theories to the usual address.

# Alex Steer (11/02/2009)


It's the size that counts, not what you do with it

564 words | ~3 min

AlphaDictionary.com has a resident word columnist, Dr Robert Beard, and he's written a list of The 100 Most Beautiful Words in English. Fact.

There are lots of these lists, of course - I'm just singling this one out because it's in front of me - and it seems amazingly common for people to harbour a 'favourite word' (and just as common to badger lexicographers for theirs). Looking through this one, you notice some startling things which are best expressed, not in words, but in numbers.

The words are long.

The list contains 944 letters, and 312 syllables by my reading. (I've erred slightly on the side of caution with some words, too.) The mean syllable count for words in the list is (obviously) 3.12, the mean length 9.44 letters. Even within this set, there's a bias towards longish words. Here's a graph of the frequency of syllable counts:

Chart - most beautiful words distribution

You might think, is this fair? How do we know how long is long? Well, it's a bit hard to find good data on the whole of the English language, as you might imagine. But as a proxy we do have the list of the 100 commonest English words drawn from the very large and well-balanced Oxford English Corpus.

Here the average number of syllables (again, by my reading) is 1.11, and the average word length is 3.38 letters. Here's the distribution:

Chart - most common words distribution
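If you want to repeat this sort of count on a list of your own, here is a rough sketch in Python. It assumes you have counted the syllables by hand, as I did, and stored each word with its count. The function name `summarise` and the sample words are purely illustrative; the sample is not either of the two lists discussed here.

```python
from collections import Counter


def summarise(words):
    """Summarise a list of (word, syllable_count) pairs.

    Returns the mean word length in letters, the mean syllable count,
    and a Counter mapping syllable count to number of words.
    """
    n = len(words)
    mean_letters = sum(len(word) for word, _ in words) / n
    mean_syllables = sum(syllables for _, syllables in words) / n
    distribution = Counter(syllables for _, syllables in words)
    return mean_letters, mean_syllables, distribution


# Illustrative sample only, with hand-counted syllables.
sample = [("the", 1), ("be", 1), ("to", 1), ("of", 1), ("and", 1), ("people", 2)]

mean_letters, mean_syllables, distribution = summarise(sample)
print(f"Mean length: {mean_letters:.2f} letters")
print(f"Mean syllables: {mean_syllables:.2f}")
for syllables, count in sorted(distribution.items()):
    print(f"{syllables} syllable(s): {count} word(s)")
```

Run on a full 100-word list, the two means are just the totals divided by 100, which is why 944 letters and 312 syllables give 9.44 and 3.12. The `distribution` counter is the data behind a chart of syllable-count frequencies.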

You'll also notice there is absolutely no overlap between the words on the two lists. In fact, and it would be hard to judge this objectively except by doing a frequency analysis of a very large corpus, I'd say that most of the 'most beautiful' words are pretty far from being common.

Why do we put so much value on long, obscure words? Perhaps it's because there's a hint of priestcraft about them: we know that if we know them, and use them, we might sound clever, or be able to bamboozle (sorry) other people with them. But is that beautiful? I'd argue the opposite: that it is one of the ugliest uses of language. Words are, after all, not precious stones or museum pieces but tools for communicating. The ones we should value are the ones that make people able to understand us, and most of those, day to day, are the workmanlike common words we barely notice. Yes, there are a lot of obscure words - technical vocabulary, terms of art or pieces of conceptual language - that convey a lot of meaning to the right people at the right time, and that can save effort and even save lives. (Think of medical terms, for example.) But there again the beauty lies in the utility. Between the common and the beautiful, choose whatever gets you heard and understood.

# Alex Steer (30/01/2009)


Why optional zero tolerance doesn't work

479 words | ~2 min

The UK's Chief Medical Officer, Sir Liam Donaldson, is advising that children under 15 should not ever be given alcohol by their parents. He is quoted:

It is advice to parents. It's their choice at the end of the day within the family setting.

This has, rightly, elicited a lot of supportive comments from alcohol addiction groups and public health professionals, such as this one from Alcohol Concern:

Parents have for too long received mixed messages about whether they should give their children a little bit of alcohol or not.

The BBC article also contains the shock stat that '20% of 13-year-olds [drink] alcohol at least once a week' (though this is from one survey, and I don't know the source so can't comment on its accuracy).

So, the issue of children drinking is a confusing and misunderstood one, and it's tempting to see this new guidance from Sir Liam as a great clarification, cutting through the undergrowth with a sharp message of parental abstinence.

The problem is, it only offers clarification for people who choose to obey it, and it doesn't offer clear reasons for obeying it. I doubt that many people, if you asked them, could reel off the figures for the incidence of death or serious illness caused by alcohol in the under-15 population, or what percentage of the same in adults, or of alcoholism, can be attributed to childhood drinking. (I don't know, for the record.)

Telling people 'all drink is bad for kids', without much to support that, will provoke a 'no it isn't' response, as people recall their own days of mild underage experimentation with alcohol, and note that they are not dead as a result of it. This is pretty bad sample-of-one analysis, but it's what you get if you don't tell people to think about risk. The message needs to be: if you give your child more than x amount of alcohol per week, there is a y per cent chance that it will cause them serious harm.

So lots of parents will ignore this, perhaps because they want to introduce their children to alcohol in the relative safety of their own home, rather than have them caning it on Buckfast down the local youth club. And when they do, they still won't know how much will push their children into the area of serious risk.

All of which, of course, is forgetting that if the '20% of 13-year-olds' stat is even close to true, they'll all be round the back of the bus shelters with the supermarket vodka anyway, trying to figure out the risk-to-reward relationship without the assistance of the Department of Health.

# Alex Steer (29/01/2009)


Stronger, better, righter, worse

244 words | ~1 min

Following up on the post about Wikipedia and Britannica, there's been a lot of talk about the proposal to introduce flagged revisions on Wikipedia. This would mean that edits, at least on some entries, would have to be checked and signed off by a reviewer.

This would reduce deliberate misinformation attacks on entries on current events, such as the recent attempt by someone to convince the world that Ted Kennedy died on Obama's inauguration day. But it would also destroy the quality that makes Wikipedia uniquely useful for following those events. Not the fact that it's written by members of the public, specifically, but the fact that it's fast.

Look, for example, at the edit history for the entry on Benazir Bhutto in late December 2007, when she was assassinated. That's fast work. Yes, quite a lot of the changes were inaccurate, deliberately or accidentally, but they got corrected with equal speed.

There's an old line that journalism is the first draft of history. When news breaks, Wikipedia is the first draft of an encyclopedia: not perfect, but there. Take that speed away, and Wikipedia will still read like the first draft of an encyclopedia, but not in a good way.

# Alex Steer (27/01/2009)


Freerunning from the law

835 words | ~4 min

This piece from the Independent, about the mooted introduction of parkour lessons in secondary schools, is a textbook example of why you should be careful when using social research data to inform policy. The report says:

According to figures from the Metropolitan Police, when sports projects were run in the borough of Westminster during the 2005 Easter holidays, youth crime dropped by 39 per cent. The following year, the most recent for which figures are available, when parkour was added to the projects, youth crime fell by 69 per cent.

So, was it parkour?

There is a fair body of evidence that providing summer activities for young people, particularly in deprived areas, can cause short-term falls in certain types of youth crime, especially vandalism and various strains of antisocial behaviour. One pilot study, done on an estate in Bristol in 1992, found a 29% drop in overall crime during the time the summer scheme ran, and a 68% drop in vehicle theft. Another, in Runcorn, was cited as the cause of a 57% drop in police callouts for youth disturbance in 1993. (Both examples are from Demos, Turning the Tide. There is little evidence for the long-term impact of schemes like this, though that's a different problem.)

Westminster Council takes youth crime reduction seriously. One of the targets for its 2003-5 Youth Crime Reduction Strategy was to 'develop the Positive Activities programme..to enable at risk young people to participate in positive activities during the school holidays'. Importantly, though, when the Strategy was written in April 2003 the activity scheme idea was listed as new. It was part of the Home Office's Positive Futures strategy, launched in 2001, to improve 'social inclusion' using sport and leisure activities. (You know, the kind of thing the Daily Mail loves.)

This already gives us a problem. Because Westminster adopts such a thorough approach to youth crime reduction, especially during the holidays, we can't safely attribute the crime reduction to parkour. (We could, however, attribute increased attendance at Positive Activities to this, though perhaps also to better publicity, word of mouth from last year's kids, etc.)

But there are other problems. I can't find a source for the 39% and 69% drops in youth crime for Westminster in 2005 and 2006. (If anyone knows where I can get them, I'd really like to see.) So I'm going to have to use some proxies. Let's take a look at the number of reported crimes for Westminster over the Easter period from 2005 to 2008. This is all types of crime, done by criminals of all ages.

| Year | March | April | May |
|------|-------|-------|------|
| 2005 | 5399 | 6501 | 6444 |
| 2006 | 5399 | 5169 | 5658 |
| 2007 | 5831 | 5705 | 5617 |
| 2008 | 5264 | 5056 | 5103 |

This doesn't tell us too much: sometimes it seems to rise over the Easter holidays, sometimes to fall. So let's look instead at the number of reported motor vehicle thefts for 2005 and 2006.

2005: 98 2006: 63

I've chosen motor vehicle theft because it's disproportionately a young people's crime, and seems in other studies to have been heavily reduced by the provision of holiday activities. On the face of it, this looks great: a 35.7% reduction in a crime largely committed by youngsters. Clearly something in the Youth Crime Reduction Strategy (though not necessarily parkour) was working.

But let's pull out and look at the figures for the whole period 2000-2008.

2000: 121 2001: 15 2002: 139 2003: 17 2004: 12 2005: 98 2006: 63 2007: 48 2008: 50

Mean: 62.6

Suddenly the drop doesn't look so impressive. In fact, there's little clear pattern at all. It seems it's a classic case of regression toward the mean, in a set of numbers with a pretty wide spread (lowest: 12; highest: 139). When Westminster published its Youth Crime Reduction Strategy in April 2003, little wonder it was worried if this statistic is representative: its last set of Easter holiday stats (for April 2002) showed a huge spike of car thefts on the previous year (from 15 in 2001 to 139 in 2002). Yet at the same time the figure was poised to plunge again for 2003: just 17. It's now hovering somewhere comfortably just below the mean.
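If you'd like to check my arithmetic, here is a small sketch in Python using only the vehicle theft figures quoted above. The helper `pct_change` and the variable names are my own; nothing here comes from the Met's or Westminster's published methods.

```python
from statistics import mean

# Reported motor vehicle thefts in Westminster, as quoted in this post.
thefts = {
    2000: 121, 2001: 15, 2002: 139, 2003: 17, 2004: 12,
    2005: 98, 2006: 63, 2007: 48, 2008: 50,
}


def pct_change(before, after):
    """Percentage change from `before` to `after` (negative means a fall)."""
    return (after - before) / before * 100


drop_2005_2006 = pct_change(thefts[2005], thefts[2006])
average = mean(thefts.values())
lowest, highest = min(thefts.values()), max(thefts.values())

print(f"2005 to 2006: {drop_2005_2006:.1f}%")
print(f"Mean 2000-2008: {average:.1f}")
print(f"Range: {lowest} to {highest}")

# Year-on-year swings, to show how noisy the series is.
years = sorted(thefts)
for prev, curr in zip(years, years[1:]):
    print(f"{prev} -> {curr}: {pct_change(thefts[prev], thefts[curr]):+.0f}%")
```

The year-on-year loop is the useful part: set beside swings like 2001 to 2002, the fall from 2005 to 2006 looks much less remarkable, which is the point about regression toward the mean.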

So has parkour killed off youth crime in Westminster? Not for certain. My proxies tell us a lot less than proper youth crime statistics would, and if I find them I'll report on them here. What I'd like to know, and what I wish the Independent or Westminster or the Met would tell us, is what the figures look like for the last ten years or so. Do they mirror the pattern for vehicle theft? That would make it easier to tell if we're looking at a genuine reduction or just a regression toward the mean.

# Alex Steer (26/01/2009)


Next up, crowdsourcing for brain surgery

1155 words | ~6 min

I'm going to try and get in before the crowd on this one, because the backlash will be inevitable.

The Encyclopedia Britannica is going to allow some user-generated content onto its site. However (according to the BBC News story), unlike Wikipedia, it will be maintaining very tight controls on what is allowed on. From what I can tell, only contributions that get an initial nod from the editors will be allowed onto the site, and even then they will carry a 'Britannica Checked' mark, to distinguish them from the main text.

To explain this new approach, Jorge Cauz, the president of Encyclopedia Britannica Inc., said:

'We are not abdicating our responsibility as publishers or burying it under the now-fashionable 'wisdom of the crowds'... We believe that the creation and documentation of knowledge is a collaborative process but not a democratic one.'

This is clearly going to draw some fire from fans of user-generated content, who (I predict) will call it aloof and offensive. I anticipate that someone, somewhere will cite the analysis by Jim Giles, published in Nature in 2005, which purported to show that Wikipedia and EB were of comparable accuracy on science topics, even though this has been comprehensively rebutted by Britannica. (Nature's response is pretty inadequate.)

So I'm going to defend the position, and not just because I'm a curmudgeonly lexicographer. You see, what's at stake, which Cauz seems to realise, is the verifiability of text as well as its accuracy. To explain: even if the Nature study were unimpeachable, and found that Wikipedia and EB entries were of roughly equal accuracy, there's still no way of assessing the accuracy of an entry in either unless you know a lot about the subject matter. This is obviously a problem, and an encyclopedia is useless if you have to check the facts of whatever you're reading independently. Does this mean encyclopedias are useless, then? No - but it means you have to trust whoever's making them.

This is tricky, of course, but inevitable, and not just in encyclopedias. You also have to trust that your brain surgeon knows how to operate on brains, and that the person who fits your gas boiler knows enough not to blow up your house. This is what's known as the professionalization of society: we can't all do everything, so we willingly hand control over our lives to people who can do what we can't.

The modern movement towards user-generated content began, really, with blogs, in response to the idea that bloggers on the ground can share information and insight more quickly, and perhaps more accurately, than the mainstream media. This is, in large part, probably true, and 'citizen journalism' has rightly been heralded. But journalism is an odd profession. A reporter's main duty is to tell people what is going on in a certain part of the world ('the here') at a given time ('the now'). Aside from the ability to write clearly, and a commitment to broad impartiality (I'm talking hard news reporters here, not comment writers), there aren't that many specific technical skills involved, except maybe shorthand. This is not to denigrate journalism - it's hard work, and hard to do well - but the entry barriers are not particularly high, and so what journalists do can be copied by bloggers.

Writing encyclopedias, like brain surgery or fitting boilers, requires a lot of technical skill. The skill is not the ability to write (though this matters) but the ability to design research: to dig out facts and verify their accuracy. Journalists do this too, but normally on less obscure subjects. This skill is important. Whether you're reading an EB or a Wikipedia entry, you have to trust that whoever wrote it is right.

So whom do you trust: an anonymous EB encyclopedist or an anonymous Wikipedia contributor? Here's a good way of deciding: starvation.

We all, unless we're very rich, have to work to live. Encyclopedists get paid for what they do. Their work is monitored by other encyclopedists, who also have to write and review entries in order to make a living. If they screw up, they get sacked. Also, since encyclopedia-writing is something you get paid for, the entry barrier is quite high: you will have to prove your ability to research before they'll let you do it. You will also, probably, be a near-specialist in a given area: botany, for example, or politics, or history, or Japan. In short, as in all jobs, there is pressure to perform. Here, the pressure is to be right. With Wikipedia, there is no pressure to be right (apart from personal pride) because nobody does it for a living. If I had to pick, with no prior knowledge and no ability to check, whose entry to trust, I'd choose the writer whose kids won't eat if the text isn't good enough.

Now, on the other side, the point can be made that the sheer number of authors on Wikipedia creates an effective check on inaccuracy. This is true, and it seems to work. Aside from occasional vandalism (which it seems unfair to dwell on), to a layman most of the entries seem pretty accurate. In that sense, most of what's in it is 'good enough'. The problem comes when you need to be absolutely sure on a certain detail. How do you know that the research has been done using up-to-date resources, and that something important but obscure isn't missing?

You, as an amateur, can only critique an entry up to the limit of your own knowledge. On Wikipedia, all you know is that the entry has been edited by lots of other people who may not have more expertise than you. Yes, there is weight of numbers, which will allow for small corrections to aggregate, but you can only hope that some of that editing has been done by someone who really knows the subject. Even then, that person would need a lot of spare time and passion to improve an entry dramatically. At least with a professionally-edited encyclopedia you know that serious money and time have been spent by the publisher to get the best possible quality.

And yes, sometimes encyclopedists get it wrong, which is why Britannica's idea of letting people submit corrections and suggestions to the editors is a good one. Collaborative but not democratic: it reminds us that the most important thing about an encyclopedia, like a life-saving operation, is that it is done well, not that everyone gets to have a go.

# Alex Steer (24/01/2009)


Pocket-vacuuming for charities

582 words | ~3 min

The Economist has an interesting piece which reports on a forthcoming paper suggesting that monetary incentives for charitable giving have a confounding effect on image-based incentives. In other words, it's fairly well established that people are more generous when they think other people are watching, but when other people are watching and someone offers you some money every time you give to a good cause, you're likely to give less (because it makes you look bad).

This is interesting from the point of view of behavioural economics, but does it have any real-world application? The Economist piece ends:

Cleverly designed rewards could actually draw out more generosity by exploiting image motivation. Suppose, for example, that rewards were used to encourage people to support a certain cause with a minimum donation. If that cause then publicised those who were generous well beyond the minimum required of them, it would show that they were not just “in it for the money”. Behavioural economics may yet provide charities with some creative new fund-raising techniques.

This may be true, but I don't think it's necessarily anything new. It most obviously applies to fundraising techniques like black tie dinners or auctions. In the first case, you give some large amount of money and get to live it up for the evening, with the (hopefully substantial) remainder of your ticket price going to the cause you're supporting. In theory that's a pure reward mechanism: you give, and you get something in return. With an auction, likewise, you bid for the chance to spend a week on a yacht or a day at a spa, and the sum from the highest bidder goes to the charity. Again, this sounds like the reward-based mechanism.

But it's not. At least, not entirely. People pay disproportionately high ticket prices for charity dinners, and bid themselves to the edge of space at charity auctions for things that are typically worth much less to the winner than what they fetch. The reason given is, usually, that it's 'all in a good cause' - and that's as true of mums and dads paying 50p to do a coconut shy as it is of billionaires at a gala dinner. Using the framework above, we can see another reason. Participating in the auction, the dinner or the coconut shy makes you look good. Whatever you may win (especially if it's a coconut) is a fringe benefit at best.

So it seems that fundraisers are already clued in to this bit of behavioural knowledge. Charities and foundations are good at recognizing their major donors, for exactly this reason: the plaque on the wall, the picture in the paper, they all help to raise the profile of the philanthropist, and encourage him or her to keep on giving.

Knowing what the Economist article tells us, though, we might be able to make smarter decisions about what fundraising events to organise: specifically, should we put on those where people are most free to give above a given baseline, knowing that they're likely to do so in the quest for public praise? By this logic, an auction is a much better idea than a black tie dinner. At a dinner, once you've paid for the ticket, there are limited opportunities to flash your wallet for your good cause. At an auction, you can actually fight with other guests to give away as much as you can.

# Alex Steer (23/01/2009)


The end of civilization in 6,000 words

724 words | ~4 min

This happened over Christmas, but it's been a while since I got my claws into a good language topic, and this one is worth saying a few things about.

A few weeks back, there was one of those eruptions of rage which, for some reason, are periodically launched at the editors and publishers of dictionaries. This time, the target was that most innocuous of tomes, the Oxford Junior Dictionary.

Why? Well, a new edition came out, and a few keen-eyed readers noticed that, while a lot of new headwords had been added, many of them science- and technology-related, a good few had been ditched: several ecclesiastical terms (including bishop, pew and sin), some natural history terms (acorn, sycamore, starling among them), and various terms involving the monarchy and aristocracy (including monarch, no less). A full list is breathlessly cited by the Telegraph.

Naturally, various quarters have gone ballistic, claiming that this is a crackdown by politically-correct lexicographers determined to cut Britain's children off from the country's natural and national heritage. The finest example I've seen, from across the Atlantic, is titled Britain's Language Police. Attempts by OUP's children's dictionaries editor to explain the changes have not been met with broad smiles either.

But is this the cultural revolution that endless journalists are claiming? Will lexicographers be making the rest of the population melt their old dictionaries down for guns?* Of course not.

The dramatic extent of the changes reflects the fact that more and more dictionaries are beginning to be recompiled based on corpora of current English usage. These extremely useful tools (such as OUP's Oxford English Corpus) allow for good-quality frequency analysis, which helps lexicographers determine what should be included in a dictionary of a given size, and what shouldn't. Of course, frequency isn't the only criterion. Some less frequent words are still included, because they're the kind that are frequently looked up in dictionaries.

Now, like it or not, the Oxford Junior Dictionary is designed for children aged about 7-10, and only contains 6,000 headwords. For non-dictionary-fanatics, this is not very big at all. Unlike, say, the OED (where there might be good reason to query if words suddenly started getting removed - and they won't), there is huge competitive pressure for words in the OJD. By the age of 10, the average child (whoever that is) will understand around 40,000 words (see Anglin, 1993). But you don't even need to know that to realize that one small dictionary is not the be-all and end-all of a child's exposure to unfamiliar words. The OJD is designed for teachers to use, based on an understanding of words that children are likely to come across in day-to-day life. It is a reference tool, not a surrogate parent.

Nor is it, despite claims to the contrary, a tool for imaginative exploration. This is a tiny, tiny dictionary full of simple descriptions of fairly mundane things. It's hard to be romantic about it without imagining that it's something it's not. The editors, I would guess, have rather sensibly realized that the world is full of much more interesting tour guides for children, many of which - children's illustrated encyclopedias, for example, or the internet - did not exist when the first dictionaries for young readers were published. And so they should not be demonised for producing something small and useful which a child can use when he or she hears the term 'MP3 player' and wants to know what it means.

And yes, it may happen that the same child wants to know what a vicar is, and will look in the OJD and not find it there. And so will look it up somewhere else. And, as children do, note it and move on. Because it's not the end of the world.

* I know, I know. But metal dictionaries would be great.

# Alex Steer (22/01/2009)