Alex Steer

Sausages and statuses

Two of today's lighter news stories in the Telegraph turn on matters of language; both are slightly questionable.

The first is the bizarre, wonderful story of the man in Benxi, China, who tried to convince a restaurant full of diners that he was a suicide bomber by walking in with sausages strapped to his torso. The Telegraph gives a quote, attributed to one of the police officers attending the scene:

It must have been terrifying for the customers but those things would only have gone off if you'd kept them past their sell by date.

Does this pun ('go off' = detonate/become putrid) even work in Mandarin? Has the gag (which is, admittedly, not bad) been inserted for the benefit of an English readership?

The other story is Facebook's list of the commonest terms in status updates this year. Helpfully, the list gives the topics (e.g. 'Facebook applications') and the specific terms used (e.g. 'Farmville, Social Living'). All are fine except the last:

15 - I
Specific words: I, is

This is far from clear in the Telegraph's coverage, but Facebook's blog entry on the status trends explains that this item tracks the rise of 'I' in status updates and the decline of 'is'.

Until March of 2009, people updated their status in a box that appeared next to their name on the home page and, consequently, many updates started with the word "is." Once that box no longer was shown next to people's name, the usage of "is" dropped off dramatically and usage of "I" doubled almost overnight. Prior to March of 2009, "is" represented about 9 percent of all words in status updates. With the change in interface, it remained high in absolute terms, but dropped all the way to about 1.5 percent recently while "I" increased from 1 percent to about 2.5 percent.

This is a pretty good measure. A more direct and plodding comparison would have been to look at the rise of 'am' against 'is', but this would have given a distorted picture, as it would exclude first-person statuses that don't use the verb 'to be', such as 'I love', 'I hate', 'I've just been', etc. Given Facebook's reputation, back in the days of the is-initial status, for producing syntactically mangled updates (e.g. 'John Smith is I hate Mondays'), it would be fun to know what proportion of posts begin with 'is' and also contain 'I'.
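For anyone who wants to try this, here is a rough Python sketch of both measures: the share of all words that are 'is' or a first-person 'I', and the proportion of is-initial statuses that also contain 'I'. Everything in it is an assumption for illustration. The sample statuses are invented, the function names are mine, the tokeniser is deliberately crude, and I have assumed the poster's name has already been stripped off. It also counts contractions such as 'I've' as first-person, which may not match how Facebook counted 'I'.

```python
import re

# Hypothetical statuses, invented for illustration only (not real data).
# The poster's display name is assumed to have been removed already.
SAMPLE_STATUSES = [
    "is I hate Mondays",
    "is at work",
    "I love Christmas",
    "is wondering why I bother",
    "I've just been shopping",
]

# Crude tokeniser: runs of letters and apostrophes count as words.
TOKEN = re.compile(r"[A-Za-z']+")


def tokenize(status):
    """Split a status into word tokens, keeping contractions whole."""
    return TOKEN.findall(status)


def is_first_person(token):
    """True for 'I' and contractions built on it (I'm, I've, I'd, I'll)."""
    lowered = token.lower()
    return lowered == "i" or lowered.startswith("i'")


def is_the_word_is(token):
    """True for the word 'is', ignoring case."""
    return token.lower() == "is"


def word_share(statuses, matches):
    """Fraction of all tokens in the corpus for which matches(token) is true."""
    tokens = [tok for status in statuses for tok in tokenize(status)]
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if matches(tok)) / len(tokens)


def is_initial_with_first_person(statuses):
    """Among statuses starting with 'is', the fraction that also contain 'I'."""
    is_initial = []
    for status in statuses:
        tokens = tokenize(status)
        if tokens and is_the_word_is(tokens[0]):
            is_initial.append(tokens)
    if not is_initial:
        return 0.0
    mangled = sum(1 for tokens in is_initial if any(map(is_first_person, tokens)))
    return mangled / len(is_initial)


if __name__ == "__main__":
    print("share of 'is':", word_share(SAMPLE_STATUSES, is_the_word_is))
    print("share of 'I':", word_share(SAMPLE_STATUSES, is_first_person))
    print("is-initial with 'I':", is_initial_with_first_person(SAMPLE_STATUSES))
```

Note that the last measure over-counts mangled updates: 'is wondering why I bother' is perfectly grammatical after a name, so a serious version would need some parsing rather than a simple word check.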

Has anyone started building a corpus of social network status updates? Being able to run proper analyses on all that data would be fun. Maybe not that useful, but fun.

Happy Christmas.

# Alex Steer (23/12/2009)