Alex Steer

Better communication through data / about / archive

What the Publicis/Omnicom merger means for big data

1432 words | ~7 min

Update: Since I wrote this, the merger has collapsed.

Disclosure: I work for a company in which WPP has a financial stake, and am a former WPP Marketing Fellow. All opinions are my own.

As more or less everyone in advertising and marketing now knows, Omnicom and Publicis today announced that they are merging, forming what's set to be the world's largest marketing services holding company.

As the New York Times notes, the CEOs of the two companies paid the customary homage to big data when talking about the benefits they expected such a huge merger to deliver:

“The communication and marketing landscape has undergone dramatic changes in recent years including the exponential development of new media giants, the explosion of Big Data, blurring of the roles of all players and profound changes in consumer behavior,” he [Mr. Lévy] said. “This evolution has created both great challenges and tremendous opportunities for clients. John and I have conceived this merger to benefit our clients by bringing together the most comprehensive offering of analog and digital services.” At the news conference, he expanded on that notion. The “billions of people” who are now online and providing data to companies, Mr. Lévy said, provide an opportunity to use advertising technologies to “crunch billions of data in order to come with a message which is relevant to a very narrow audience.”

I'd suggest this should be taken with a pinch of salt. To some extent this is the sort of talking-up you'd expect from any company latching on to fashionable industry concepts to enhance its future share price, and good on them for doing that. But it does raise some interesting questions about the relationship between holding companies and big data.

Whose data is it anyway?

It's worth remembering that this data is not - or should not be - a strategic asset for Publicis/Omnicom. The data is collected on behalf of, and should be owned by, its clients.

Clients may find themselves with a bigger menu of marketing services options as a result of this merger. It remains to be seen to what extent these are powered by data, or whether the newly-merged holding group will do a good job of helping clients join data together and break down operational walls so they can make more effective use of their data.

Ultimately, this will come down to a question of what clients want - and it is likely to be more driven by clients than any holding company wants to admit. At the moment, I see no reason to believe that clients will want to deal direct with holding companies for their data management services. They will expect their agency partners to provide data in common formats, be more open about data sharing and act faster on opportunities found in data; but they may not want their data management platforms provided by their agencies, or even their holding companies. It will be good if Publicis/Omnicom adopt and enforce common data openness standards across all their agencies, because that will make data faster and easier to access, integrate and audit. But the agenda will be set by clients, who will want the data provided by those agencies to play nicely with data from all other sources.

The awkward matter of scale

Holding companies have a bad rep. For small agencies, positioning themselves as the 'hot shops' in contrast to the lumbering industry giants is a pretty easy win. There's a nagging doubt - which clients clearly share to some extent, and perhaps rightly - that dealing with a big agency or group is a recipe for cookie-cutter work or grindingly slow bureaucratic ways of working.

Big data has provided a way for holding companies to talk unashamedly about scale. A few years ago, media planning and buying provided a similar opportunity. This time round there was far more talk of data than of media, which suggests that data is currently a more fashionable and persuasive way of implying 'value through scale'. This was backed up by plenty of talk of efficiency savings too.

What's missing, for me, in a lot of this rhetoric is any mention of creative pride in the scale that big data provides. I've yet to hear a holding company come out and say that, yes, having tons of data and making good use of it can lead to more focused strategies, more interesting creative opportunities, better briefs, fewer things done on a whim, fewer lazy assumptions. I know from experience that some holding companies value these things highly. Mine (WPP) was named most effective holding company at Cannes for the third year in a row. But it would be great to hear holding companies come out and say that being huge, and having better data resources, gives advertisers more scope to be fast, focused, and fearless.

From big data to marketing coordination

Speaking of scale, people say some stupid things about big data. One of the stupidest is, 'Big data's not about how much data you've got, it's about what you do with it.'

No, big data is explicitly about how much data you've got. Of course, the utility value of data is also an important characteristic (duh). But let's not pretend that volume, timeliness and granularity of data aren't also decisive. My view is that if you don't value the scale of your data, you're doing it wrong.

This thoughtless utterance, though, is simply a bad articulation of a very important insight: that at the moment data is typically being used at the wrong points throughout the advertising planning and execution process. Everybody with a bit of common sense in marketing knows this, and could tell you where and when they want to use data more - when identifying specific challenges and opportunities, when taking the temperature of an issue and figuring out a response in real time, when measuring and adapting the performance of creative content mid-stream during a campaign, and when measuring the relationship between short-term activity and long-term brand value and behavioural change.

This stuff is hard, and a lot of it won't be sorted out for several years. But most of the difficulties aren't theoretical ones. There's plenty of chatter about what big data could do - the possibilities, the opportunities, the brilliance of theory. But very few of the pundits are really doing it, and doing it in ways that scale up and let you do it quickly and repeatably and every single day, without going insane.

Doing it well does mean more than having data. But hidden under the banner of 'what you do with it' is, I think, a complete reconfiguration of how marketing works, and how agencies work together. As we all know, clients are demanding more for less - more efficiency, more speed, and more transferability. International clients, in particular, want ideas that travel well, assets that can be localised but that are basically globally consistent. As creative agency people we can tend to rebel against this, because our instinct is to assume that marketing strategy and creative advertising development necessarily sit very close together. And we tend to look down on all the bits of the industry that are more replicable or scalable - research, media, analytics, production, adaptation - as lesser crafts.

Unless there's an astonishing creative revolution (which seems unlikely in the current climate), that way of thinking needs to change fast. If clients want fast, replicable, measurable work, then as people who care about long-term effectiveness and creative difference, we need to find ways to give it to them. That means taking a deep breath and bringing together strategy, creative and production in new combinations, working data seamlessly throughout that process, and using technology and information to make sure we can measure short-term effects and predict long-term ones. Call it marketing coordination if you like. And yes, we'll need to find the right balance between creative speed and quality of craft, and know - as ever - where to focus our efforts and which battles to fight.

So we can stop talking about what big data could do, and just get on with doing amazing work every day.

# Alex Steer (28/07/2013)


Can buying feel more like making?

644 words | ~3 min

You Are Not An Artisan is a cracking long read from Venkatesh Rao's Ribbonfarm blog. In short it argues cogently that younger (Millennial) people, especially in developed markets like the US, apply consumer-style judgements to their choice of career, which leads them disproportionately to seek out careers that feel creative or artisanal. It's a sharper analysis, in a single post, than older work on the concept of a creative class.

This bit of the post caught my eye:

The future of work looks bleaker than it needs to for one simple reason: we bring consumption sensibilities to production behavior choices. Even our language reflects this: we “shop around” for careers. We look for prestigious brands to work for. We look for “fulfillment” at work. Sometimes we even accept pay cuts to be associated with famous names. This is work as fashion accessory and conversation fodder.

This interested me because it also applies the other way round.

A couple of years ago when I was working with The Futures Company on what eventually became their research stream on leading-edge Millennials, we observed that decisions about consumption and brand choice were strongly influenced by feelings about productive creativity. This was true of Millennials more than of older consumers, and especially of 'leading-edge' Millennials (what you might call hipsters if you're feeling unkind).

In other words, Millennials bring production sensibilities to consumption. They seek out consumption choices that reflect their desire not just to be seen as creative (that's the cruel interpretation) but to feel as if they are creating. And, in particular, to be creating artisanally. They are disproportionately willing to choose brands that reflect an artisanal work ethic.

This has been noticed to some extent by marketers but has rarely been applied specifically to leading-edge Millennials (an oversaturated target market, but a lucrative one). It tends to come out either in ideas like 'crowdsourcing' (which very, very rarely produces much of value beyond PR headlines and the odd gimmicky competition), or in embracing a general 'authenticity' aesthetic - typically, re-doing all your shops in wood and leather, putting out some ads about how you support local farmers, and leaving it at that.

To make the most of this tendency, brands would need to start thinking of themselves, for this audience, less as producers of finished products and more as enablers of artisanal craftsmanship. There are plenty of brands that are well set up to do this - think anything in homeware, consumer electronics, cookery, spirit alcohol, or anything else with a 'do-it-yourself' aspect - but few are making the most of it, largely because the economics of those industries are heavily weighted towards the mass market.

As for the brands that make more of their revenue from leading-edge Millennials (ahem, 'cool' brands), their operational and marketing models tend to be a bad fit for this way of thinking. Ironically, most of these involve ultra-refined material production with end-to-end supply chain control (think sportswear, shoes, personal computers, smartphones, etc.) and have maintained their credentials through some fairly heavy-duty brand management. Neither of these lends itself easily to giving up control of product, process or brand experience in any way, let alone handing people the raw materials and letting them hack around with them in fundamental ways.

One day, some major brand is going to nail this one, but it will take a disproportionate amount of courage to slam the history of 20th-century consumer capitalism into reverse, and move from being a seller of things to a provider of raw materials again.

# Alex Steer (13/07/2013)


Vine, Instagram and fake digital trends

876 words | ~4 min

Three recent posts from the Marketing Land blog tell a morality tale on the perils of believing your own hype when it comes to digital trends.

On June 8th, the blog published an article titled 'Vine passes Instagram in total Twitter shares'. It hinged on this chart, from Topsy Analytics, apparently showing volumes of tweets posted to Twitter containing links to Instagram and Vine:

So far, so interesting. Plucky little Vine overtaking its giant competitor.

Then less than three weeks later, another article: 'A Week After Instagram’s Video Launch, Vine Sharing Tanks On Twitter'. And with it, an update of the same chart:

Taken together, these two would suggest that Instagram and Vine were locked in a head-to-head struggle, with some dramatic reversals of fortune. Very exciting stuff for those who love a good trend. First, Vine is the underdog that races into the lead. Next, Instagram knocks it back by introducing video sharing. A rollercoaster of trend-based action.

But hang on. Because this should give us all reason to raise an eyebrow. A quick bit of Googling tells us that Instagram has 130 million active users, while Vine has only 13 million. Are we really to believe that the two services are neck-and-neck?

To Topsy's credit, they saw the story brewing and posted a response. They explained:

The free Topsy service generates trend charts using a sample of the most influential people and tweets. This allows users to see emerging trends among influencers in real time... while the free service gives you a high-level snapshot of the momentum and direction of the social conversation around a topic or domain, Topsy Pro gives you the complete and unfiltered picture that accounts for every single tweet. Because influencers tend to move faster than the general social media population to try the newest things—which is part of what makes them influential—new trends or changes in the direction of trends can appear amplified in charts generated by our free service.

And they published this chart, showing the real ratio of Instagram to Vine shares:

![Instagram and Vine trends chart](http://farm4.staticflickr.com/3735/9167979252_dbf718b04d_o.png)

In short, Vine has grown a lot recently, and has suffered a recent dip. But the whole story is based on nonsense.

In fairness, Marketing Land also published a clarifying post - though I don't think they give due prominence to the 'postscript' on the two original posts, both of which are still up, which points out that their analysis was wrong. Ideally they should take those two posts down and replace them with a retraction.

Topsy, on the other hand, are to be congratulated for handling this well - even though they should probably have a much bigger warning sign on their free product.

But mainly, this is a story about the dangers of unchecked trends. This sort of story can be hugely influential. If you're a digital publisher and you only read the first one, you might decide to shift your efforts from Instagram to Vine; or from Vine to Instagram, if you read the second. As it is, you'd be crazy to act on the basis of either piece of 'information'. But you wouldn't necessarily know that.

Marketers are bombarded with trends by an increasing number of digital marketing and analytics companies. If the marketers don't know the right questions to ask, they are vulnerable to plausibility bias: the tendency to believe that stories constitute evidence, just because they contain numbers and are told by people who sound like they know what they're talking about.

So if you're a marketer, and someone shows you a trend, here are three questions you should always ask:

  1. How are these metrics calculated? Always ask this for metrics that appear to be something other than simple counting - especially scores like 'sentiment', 'influence' or 'customer value'.
  2. Is the data from a single source? Always ask this if someone is asking you to make a comparison. If I show you data on smartphone penetration in the UK vs Botswana, how do you know the data's been collected using the same method and represents a fair comparison?
  3. Is the data sampled? Always ask this. Just always. And then find out how it's sampled and how you can be sure the sample is representative. In the case of the data above, it definitely wasn't.
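To make question 3 concrete, here's a toy simulation (all numbers invented for illustration) of how sampling only 'influencers' can distort a comparison like the Instagram/Vine one above:

```python
# Hypothetical full dataset: 99,000 shares, with Instagram outnumbering
# Vine roughly 10:1 -- mirroring the gap in their real user numbers.
population = ["instagram"] * 90_000 + ["vine"] * 9_000

# Hypothetical 'influencer' sample: early adopters skew towards the
# newer service, as Topsy's own explanation suggests.
influencer_sample = ["instagram"] * 500 + ["vine"] * 450

def vine_share(shares):
    """Fraction of shares that point to Vine."""
    return shares.count("vine") / len(shares)

print(f"Vine share, full data:         {vine_share(population):.1%}")
print(f"Vine share, influencer sample: {vine_share(influencer_sample):.1%}")
```

Same underlying behaviour, two very different charts - which is exactly what happened with the free Topsy service.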

And if you don't get sensible answers to those three, be very careful indeed.

# Alex Steer (29/06/2013)


'Big data' in 1980

326 words | ~2 min

The paragraph below contains the earliest known usage of 'big data', which now appears in the OED's recently-added definition of the term.

Written in 1980, it's from Charles Tilly's The Old New Social History and the New Old Social History. It merits reading in full. (I've added a couple of links.)

The cliometricians "specialize in the assembling of vast quantities of data by teams of assistants, the use of the electronic computer to process it all, and the application of highly sophisticated mathematical procedures to the results obtained" (Stone 1979: 11). Against these procedures, Stone lodges the objections that historical data are too unreliable, that research assistants cannot be trusted with the application of ostensibly uniform rules, that coding loses crucial details, that mathematical results are incomprehensible to the historians they are meant to persuade, that the storage of evidence on computer tapes blocks the verification of conclusions by other historians, that the investigators tend to lose their wit, grace, and sense of proportion in the pursuit of statistical results, that none of the big questions has actually yielded to the bludgeoning of the big-data people, that "in general the sophistication of the methodology has tended to exceed the reliability of the data, while the usefulness of the results seems -- up to a point -- to be in inverse correlation to the mathematical complexity of the methodology and the grandiose scale of data-collection" (Stone 1979: 13). For this eminent European social historian, the large enterprises which took shape in the 1960s have obviously lost their attractions.

Over-complicated, opaque, too proud of itself, and not useful enough. I leave it to you to judge how much has changed in the last 33 years.

# Alex Steer (22/06/2013)


Less than you assume, more than you imagine: Futureproofing online privacy

998 words | ~5 min

Bit of a long read, this. Blame cross-country rail travel.

Henry Porter, writing in the Guardian, is apoplectic about alleged efforts by GCHQ and the NSA to collect vast quantities of internet data direct from the fibreoptic cables that form the backbone of the net. He writes:

The story ... must surely shake that complacency and demand a review of the profit-and-loss account in the safety versus liberty debate. And that must take in the effect the actions and views of a generation of middle-aged politicians, journalists and spies will have on people aged under 25, who may have to live with total surveillance under regimes that may be much less benign than the ones we know.

Despite this being a classic case of slippery slope rhetoric, I tend to agree. But since plenty of people will be writing about this story (as they have already) in terms of liberty vs security, I'm going to talk instead about expectations of privacy online.

What does it mean to have privacy online? In one sense, not much. Online activity is activity in a domain which is defined by communication: the transmission of information between parts of a network. By communicating over the network you are inviting third parties, not just to overhear your communication, but to be part of it. Asking for privacy in the classic sense of not being overheard is a little like asking for privacy in a game of Chinese Whispers.

But obviously this isn't satisfactory, so it can't really be what we mean when we talk informally about online privacy. Imagine that we are playing Chinese Whispers. You want to get a message to me, so you pass it through a chain of other people. They, rather obviously, know what your message is. You do not expect privacy of communication from them. But you do, reasonably, expect that they will keep your message confidential and not pass it on to others who are not in the chain.

So if I send a message from Machine A to Machine Z, and it passes through Machines B, C, D, and so on, can I reasonably expect that a stored copy of it will not be read by any person or machine not involved in its direct transmission to its destination? Or should I expect this to happen and adjust my behaviour accordingly?

Most of us think probabilistically about this, at least informally. Rather than talking in terms of absolute permission or inhibition, you figure out the probability that someone will access your message, and weigh that against the downside risk of them doing so. In other words: how likely is it that my message will go public, and how much damage would it do?

We are all aware that our online activity is part of a vast amount of similar, almost identical activity by others. So we modify our behaviour to some extent, but not as if we were being broadcast to the nation. Suppose I live in an authoritarian society and hold a critical opinion of the president. I may express this in an email to a like-minded friend, because I judge that the effort required by some secret policeman to dig it out from the whole pile of online communications is high enough to make the risk of being caught badmouthing the great leader acceptably low.
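That informal weighing can be written as a one-line expected-cost calculation. A minimal sketch, with all the numbers invented for illustration:

```python
def expected_cost(p_exposure, damage):
    """Probability that the message goes public, times the damage if it does."""
    return p_exposure * damage

# One critical email buried among billions of messages: exposure is
# vanishingly unlikely, even if the consequences would be severe.
buried = expected_cost(1e-7, 1_000_000)

# The same sentiment broadcast publicly: near-certain exposure.
broadcast = expected_cost(0.9, 1_000_000)

print(buried, broadcast)
```

The whole judgement hinges on the estimate of `p_exposure` - which, as the next paragraph argues, is exactly the number we are worst at estimating.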

The problem is, we're rubbish at judging risk.

The Guardian piece at the top demonstrates this in one direction. Henry Porter writes:

The two countries [Britain and the US] are rapidly perfecting a surveillance system that will allow them to capture and analyse a large quantity of international traffic consisting of emails, texts, phone calls, internet searches, chat, photographs, blogposts, videos and the many uses of Google.

Are they? Are they both capturing and analysing it? Because while capturing it may be easy (if expensive), analysing it is much harder. In particular, analysing it down to the level of individual users' individual behaviours is extremely hard, since you're effectively trying to run very granular searches on some of the largest datasets you could imagine. I suspect the author is overestimating the risk to individual liberty by underestimating the cost and complexity of the kind of operations he imagines. This is the conflation of what's plausible with what's possible.

And yet... when it comes to making this sort of judgement we also underestimate the risk, because we tend to think in terms of what we believe is possible now. Which is unwise when we're talking about permanent records of our online activity. Given time, it's perfectly legitimate to worry about what's merely plausible (any logically feasible kind of analysis), because it may become possible (thanks, Moore's Law). We also need to be aware of the fact that there are whole categories of data analysis that are possible now that were impossible a few years ago. I started my career as a dictionary editor, and when the dictionary I edited was first published in the late 19th century, there was no way to search its text except by the alphabetical ordering of its headwords. Now you can call up the results for any word in the dictionary; run regular expression queries to find words and phrases that contain fragments that interest you; and even mine the whole structure of the text for patterns.
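The kind of fragment query I mean can be sketched in a few lines (the word list here is made up, not the dictionary's real data):

```python
import re

# A tiny stand-in for a dictionary's headword list.
headwords = ["privacy", "private", "privation", "deprive", "proviso", "provide"]

# Find every headword containing the fragment 'priv' -- a query that was
# impossible against the printed, alphabetically ordered text.
matches = [w for w in headwords if re.search(r"priv", w)]
print(matches)  # ['privacy', 'private', 'privation', 'deprive']
```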

In short: people with access to your data can probably do less with it now than you assume, but will probably be able to do more with it in future than you imagine. Any serious debate about online privacy should include that assumption.

# Alex Steer (21/06/2013)


The 'fallacy' fallacy

746 words | ~4 min

I've mentioned a couple of times here recently that there are plenty of people in the marketing industry who try to sound smart by trying to make other people's smart pronouncements sound dumb. (Try saying that three times quickly.)

Sadly this piece on Digiday is a textbook example of shoddy thinking in this genre. It's called 'The "Big Data" Fallacy', which obviously drew my attention. Early on in the article, we find this:

Investing in a DMP is something of a credibility test, with advertisers under pressure to make this “big data” technology the central component of all marketing strategy with other pieces, including multiple DSPs and networks, plugging in to the DMP. The problem with this strategy is that it is based on a fallacy — big data is just regular data, and its something every business should already be built on.

Whoa, nelly.

'Its [sic] something every business should already be built on.' That would make it not a fallacy, then. In fact, it would make the statement - advertisers should make data a central component of their activity - a truism.

Just like if I come out and say, 'the sky is blue' (when it is), that's not a fallacy. It's just the bleeding obvious.

It rolls on:

Successful online advertising is not about accumulating data, but actually doing something with it. The DMP and DSP are elements of a larger solution. Marketers don’t need a single data platform – they need a comprehensive marketing operating system.

It would be neat if I could demonstrate that this were a fallacy. It's not - just an unsubstantiated claim. If the problem is that data is already there and not being used, the exact constitution of whatever makes it usable doesn't matter, as long as it does the job. So you immediately start to wonder what the agenda is here.

I'm going to try to avoid quoting the whole article, so I'll stop here:

The age of big data is really no different from what businesses should have been doing all along. Data is prevalent in every organization... This is important data for marketing, but it represents just one component in a well-balanced strategy. Data only provides value when matched with media, so advertisers actually need access to media operating in tandem with their data.

Okay, once again, I've no idea where this comes from. There's no premise in the argument that justifies the leap to 'data'. So one can safely assume that the author has a vested interest in connecting data to media operations. And the author works for MediaMath, who do precisely this.

There's nothing wrong with a sales pitch. In many ways this blog is itself a kind of sales pitch (albeit an odd, roundabout, geeky one), since I work with data and information in the marketing industry. But at the moment the whole domain of 'big data' (and yes, I'll do a post on that term at some point) is full of people trying to demonstrate that their way of doing things is brilliant and everybody else's is wrong. Which would be fine, if it were true.

But it isn't - or, at least, it's not demonstrably true. Big data in marketing is more or less the wild west, with lots of models going round that are more or less unproven. Be wary of anyone who tells you otherwise. There are 'proven' models and there are proven models, but to my knowledge nobody has the kind of rigorously tested normative information yet that you'll find in market research, let alone the actual hard sciences after which a lot of big data practitioners are modelling themselves. (You remember, just like lots of financial engineers used to, before they accidentally blew up the economy.)

All of which means that when you see a blog post, article, conference presentation etc. that opens by dissecting someone else's 'fallacy', you should be aware that there are huge vested interests at play. And there's a good chance that the victim of the dissection wasn't fallacious at all.

I call it the fallacy fallacy: the erroneous tendency to assume that, because someone is in competition with you, he or she is wrong. It's lazy and we all need to stop using it.

# Alex Steer (20/06/2013)


Internet Explorer: makes ads about Do Not Track; tracks you on its website

425 words | ~2 min

I'd like to show you an ad. It's the latest for Microsoft Internet Explorer, and it tries to persuade us that Microsoft is on the side of consumers and their privacy concerns:

Do you see what they did there? Tugged at the heartstrings with an obvious human truth: we all have things we love to share, but we all have things we don't want people to know about us. And that, says the ad, is why Internet Explorer comes with 'Do Not Track' switched on by default, asking websites not to set cookies that remember users' online behaviours and preferences.

Firstly, Do Not Track has absolutely nothing to do with the kind of personal information that the ad talks about. That's the kind of information you pick up by snooping on people's personally identifiable accounts or by listening to what they say on Twitter.

If you come to a website that sets a tracking cookie, here's what that cookie can do:

  • Record which pages you visit
  • Record which actions you take (e.g. buying, adding to basket)
  • Remember your machine if it visits the site again
  • Serve up recommendations (like Amazon does) about things you may find interesting or valuable

And here's what it can't do:

  • Tell who you are
  • Tie your behaviours back to your identity
  • Tell anyone anything about you as a person
  • Harm your reputation in any way

Unless of course you choose to share your identity with it by creating an account, giving your personal details and logging in. In that case, the cookie is irrelevant as all your behaviours are logged against your account, just as they would be if you were a registered customer of an old-fashioned mail order business.
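For a sense of what such a cookie actually contains, here's a minimal sketch (the cookie name and lifetime are illustrative, not any real vendor's format): an anonymous, randomly generated ID, and nothing that identifies you as a person.

```python
import uuid
from http.cookies import SimpleCookie

cookie = SimpleCookie()
# A random client ID: it recognises this browser, not you.
cookie["_client_id"] = str(uuid.uuid4())
cookie["_client_id"]["max-age"] = 60 * 60 * 24 * 365  # remember the machine for a year
cookie["_client_id"]["path"] = "/"

# The Set-Cookie header the site would send back:
print(cookie.output())
```

Everything the site then learns is keyed to that opaque ID, which is why linking it to a real identity requires you to log in or hand over your details.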

But that's not the best thing about the ad. The best thing is that if you go to Internet Explorer's website and take a look at the source code, you'll see that it's running Google Analytics.

Let me say that again. Internet Explorer's website is running Google Analytics.

Which drops a cookie on your computer. To track your behaviours.

Well played.

# Alex Steer (16/06/2013)


Big data and small ambitions

415 words | ~2 min

Having a drink last night with an old friend who spends a lot of time working with data and statistics, we realised we'd both heard too many talks this year that say one of two things.

The first is, Big data is going to change everything.

The second is, It's not the data that counts, it's what you do with it.

Both of which prove mainly that people like sounding smart. (Yes, I'm saying this on a blog. I know, sorry...)

More and more, my response to both of these is:

If you want to know which companies will succeed using big data, look at which ones succeed using small data.

Not exactly world-shaking, is it? But since about halfway through 2012 people have been talking about data as if it never existed before. Suddenly it's as if everyone is deeply committed to the idea that you can use information about people to find out more about their lives, needs, attitudes, values and behaviours. And it's as if this is somehow new.

It's not new, and many of the people who are suddenly banging the big data drum are to be doubted in their convictions, and in the scale of their ambition. Some (mainly on the IT side) have been gathering data for years and not making any serious use of it. Others (mainly on the advertising side) have been so consistently rude about the data that has been made available to them - by media agencies, by market research companies, by clients themselves - that you'd be forgiven for thinking that they considered data to be an impediment.

By contrast to both of these, there are people and there are organisations who for years have made every effort to make the most of any piece of information they could scrape together - using it, thinking laterally with it, taking it past the obvious and using it to keep themselves honest, and to push themselves to be more ambitious, and more creative. Trust them, not the latecomers.

Data can be an impediment to creativity if it is used badly. Used well, though, more data can surface insights that unlock a new perspective on people. On that basis, having more data is valuable in much the same way that having lots of bullets is useful in a fight. You still need someone who knows how to fire the gun.

# Alex Steer (15/06/2013)


How the Guardian reads the Daily Mail

442 words | ~2 min

The data team at the Guardian have created a word-tree visualisation tool which lets us query what commenters on the Daily Mail website have to say on various topics. The logic of the tool is pretty simple, as they explain:

It uses the most recent ten comments from over 100 stories featuring the words "young offenders institution" posted by the MailOnline since 2009. To use it just put in any word and it will say what comes after in any of the comments in the database. For example, if you put in the word "scum" then you can see that many users are happy to throw that word around to describe offenders. "Scum and scummer" was one inventive way that a user got their point across.

The article is a cracking read - as you'd expect, it provides example after example of crazy Daily Mail comment-bait.

But this is where I have a problem with how the tool is being applied - both as a linguist and as someone who works a lot on the fair and balanced use of data. Word-tree visualisation tools are useful for evaluating usage: they spot frequency patterns above the level of the single word.
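The "what comes after" logic the Guardian describes can be sketched in a few lines. This is an illustrative reconstruction, not the Guardian's actual code; the sample comments and the root word below are made up for the example.

```python
from collections import defaultdict

def build_word_tree(comments, root):
    """Map each word that follows `root` to the continuations seen after it.

    A minimal sketch of a word-tree index: for every occurrence of `root`
    in a comment, record the phrase that comes after it, keyed by the
    very next word.
    """
    tree = defaultdict(list)
    for comment in comments:
        words = comment.lower().split()
        for i, word in enumerate(words):
            if word == root and i + 1 < len(words):
                tree[words[i + 1]].append(" ".join(words[i + 1:]))
    return dict(tree)

# Hypothetical comments, not drawn from the MailOnline corpus.
comments = [
    "bring back national service",
    "bring back the birch",
    "bring back national pride",
]
tree = build_word_tree(comments, "back")
# tree["national"] → ["national service", "national pride"]
```

Real word-tree tools add frequency weighting and recursive branching, but the core index is just this: group continuations by their first word.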

But here, the Guardian aren't using the tool to evaluate usage, so much as to judge users.

Anyone who takes at face value the Guardian's (genuine, serious and long-standing) commitment to data-driven journalism should question the approach they've taken in this piece. A quick look suggests that they've picked lemmas (words and phrases) that are geared towards providing a quick thrill for the Guardian's readers, who (I rather suspect) enjoy taking a dim view of the parochial opinions of Mail readers. Lemmas they pick for analysis include:

  • scum
  • bring back
  • this country
  • parents

The piece continues:

Some of the old bugbears of the Daily Mail such as "human rights", "Labour", "jail", "prison", "tax" and "the judge" also make for fun reads.

I can't say this strongly enough. If you go into a piece of analysis with strongly-held prejudices, you will tend to find things that confirm those prejudices, because that's what you'll be looking for. The bigger the data set, the more you will find to confirm what you already believe. (It's one of the reasons why big data analysis in data-rich but complex domains like economics is so fraught with error.) That kind of lazy poke-the-monkey analysis is, I'm sad to say, exactly what the Guardian is guilty of here. They should know better, and should do better.
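The point about bigger data sets yielding more confirmation can be demonstrated with pure noise. The sketch below is my own illustration, not anything from the article: it scans random, unrelated variables for correlations, and the more variables you scan, the more "findings" you get, even though nothing real is there.

```python
import random

def spurious_hits(n_rows, n_variables, threshold=0.25, seed=0):
    """Count pairs of unrelated random variables whose sample correlation
    exceeds `threshold`. The data is pure noise, so every hit is spurious.
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_rows)]
            for _ in range(n_variables)]

    def corr(xs, ys):
        # Plain Pearson correlation, computed from scratch.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in ys)
        return cov / (vx * vy) ** 0.5

    hits = 0
    for i in range(n_variables):
        for j in range(i + 1, n_variables):
            if abs(corr(data[i], data[j])) > threshold:
                hits += 1
    return hits

few = spurious_hits(n_rows=50, n_variables=10)
many = spurious_hits(n_rows=50, n_variables=40)
# `many` comfortably exceeds `few`: same noise, more places to look.
```

Scanning 40 variables means checking 780 pairs rather than 45, so even with identical noise the number of apparent "patterns" grows sharply. A data set full of real human opinions behaves the same way for an analyst hunting for quotes that fit a prior.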

# Alex Steer (27/05/2013)


Microsoft: Work From Here. Until you drop.

324 words | ~2 min

If you travel by train in the UK you might have seen Microsoft's new ad campaign for Office 365. It's called 'Work From Here' and it's a really smart piece of media thinking. Creative Review has the details - they've taken over train stations, railway carriages, ticket halls etc., to reinforce the point that with Office 365 you can work from anywhere.

Thing is, while I like the media (and the creative), I really don't like the thought.

One of the posters reads: Here's where Lisa finalised the figures and posted them just after eight.

Get a life, Lisa.

There are plenty more in the same vein. I saw one about a guy called Ben getting some documents to his angry boss in the nick of time. Poor Ben.

If 'work from here' is supposed to be liberating, I think it has the opposite effect. Office 365 comes across as the successor to the Crackberry, a tool that makes sure you can never blame the mere fact of being away from your desk - or it being night-time, or a weekend, or anything else - for your failure to deliver some unspecified documentation to someone, somewhere. Because whichever 'eight' Lisa is sending her documents at, morning or evening, it's probably a time she should be seeing friends or loved ones instead of Excel-jockeying her life away.

So thanks, Microsoft, for reminding us that wherever we are, whatever we're doing, in any railway station throughout the land, we shouldn't be buying a coffee or playing Angry Birds or browsing the weird souvenirs in WH Smiths. We should be working.

Image via Creative Review, used with thanks.

# Alex Steer (14/04/2013)