Data – it might not mean what you think it means

It certainly didn’t when it came to the £32 bread. That was actually 11 loaves, at £2.82 a loaf.

The lesson from this is that if it sounds ridiculous, it probably is, and it needs to be checked thoroughly – not the easiest thing to do when you’re on deadline.

With this I’m mainly annoyed with myself that I didn’t act on my minor freakout on the Thursday: that a prescription item might not be a single item (i.e. a bag of pasta, a loaf of bread) but might be a crate of multiple individual items. My instinct was right.

However, I had checked the Assembly written answer, which asked for the cost of an item and was replied to on the basis of an item, so it looked like an item. Plus the information had been put to the Welsh Government press office and they hadn’t queried it.

Lesson two: answers to FOIs and written questions may be based on a misreading of the question, or may not be able to answer it fully with the data available, and so may not actually be a direct answer to the question asked.

Good replies will tell you this; bad ones will assume you share a psychic connection with the person in the department who put the data together (and who probably didn’t tell the person who sent out the reply what it means either).

Lesson three: don’t assume that because you presented the figures you are planning to run to the press office, they will notice or question them. Chances are it won’t get spotted until someone with actual knowledge of the situation reads it, once it has been published.

The other minor problem here was unhelpfully presented data. It has since been changed, but when I was looking at it the quantity column had no explanation in the notes as to what it referred to, making it less than clear how much bread was actually being prescribed. When I checked the England figures on Sunday it became a lot clearer, and the Welsh figures have now been fixed.
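As an aside, the kind of check that would have caught this is trivial once you know what the quantity column means. A minimal sketch in Python – the file name (prescriptions.csv) and column names (item, cost, quantity) are my assumptions for illustration, not the real Welsh dataset:

    import csv

    # Divide the cost of a prescription item by its quantity before
    # treating it as a per-unit price; flag anything still ridiculous.
    with open("prescriptions.csv", newline="") as f:
        for row in csv.DictReader(f):
            cost = float(row["cost"])          # cost of the whole item
            quantity = float(row["quantity"])  # e.g. loaves per item
            per_unit = cost / quantity if quantity else cost
            if per_unit > 10:  # arbitrary ‘sounds ridiculous’ threshold
                print(f"Check: {row['item']} at £{per_unit:.2f} per unit")

Run against the bread figures, the ‘£32 loaf’ becomes roughly £32 for an 11-loaf item – under £3 a loaf.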

Interpreting data to find stories is not always straightforward. And generally newspapers want things that make good headlines, so unless something obviously contradicts an interpretation, it’s usually easier to go with what makes a good headline and not check too closely and risk the story collapsing.

The figures do raise questions:

  • could NHS Wales get a better deal on gluten-free products? (even going through the figures with regard to quantities, there are still things like £4 for 500g of pasta, or mince pies at nearly £5 for six; other things appear to be much better value)
  • a bigger actual adding-up mistake: the head of Coeliac UK estimated 300,000 people in Wales might have coeliac disease, with around 75,000 diagnosed, but at the usual estimate of one in 100 that should be 30,000 (I realised this on Sunday night; I seriously need to check figures more). And if under-diagnosis is a problem, is 142,000 prescriptions in 2010 – when a prescription could cover six to 12 loaves or a kilo of pasta – rather a lot of food? (a quick back-of-envelope check is sketched after this list)
  • what is the point of press offices? (only slightly kidding)
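The adding-up point in the second bullet is easy to sanity-check. A back-of-envelope sketch in Python – the Wales population of roughly three million is my assumption; the one-in-100 prevalence, the 142,000 prescriptions and the six-to-12 loaves per prescription are the figures from above:

    # Expected coeliac cases at the usual one-in-100 estimate
    wales_population = 3_000_000         # rough figure (assumption)
    print(int(wales_population * 0.01))  # 30,000 - not 300,000

    # How much bread might 142,000 prescriptions cover?
    prescriptions = 142_000
    print(prescriptions * 6, prescriptions * 12)  # 852,000 to 1,704,000 loaves

Which is why the 300,000 figure looked a factor of ten too high, and why 142,000 prescriptions starts to sound like quite a lot of food.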

The problem is none of these is immediately gripping, so starting a debate about the minor amounts of money NHS Wales could save if it made its procurement of some items a little more effective is probably a non-starter.

And focusing on the £32 figure doesn’t help, because it shuts down debate both in the article (‘we’re not interested in subtleties’) and in the response (‘we can focus on a single problematic figure and therefore not deal with the other issues’).

I think data journalism may be a good place to look at this issue.

At the moment, looking at data is done with a headline/intro in mind – that key statistic that will grab attention. But you usually have reams of data that add context, which may make that statistic less earth-shattering, but that doesn’t make it less interesting.

Saying ‘it’s complicated, it’s possibly a bit boring (but I’ll try to make it entertaining), but potentially it’s important’ might not be a bad idea (and, by the way, would you like to see the background?).

(Obviously it helps to get the numbers right).

Anyway, to stop the meandering ramble about what journalism could look like: there have been a couple of other recent examples where interpreting data is key to the story.

Firstly, the DfT and its internet usage figures – the problem Mary Hamilton blogged about on Saturday – is probably the less excusable one, because the people who answered the FOI flagged the issue of hits versus page views, so someone really should have asked the ‘is this telling me what I think it’s telling me?’ question.

Labour versus the Tories on English health cuts is the other one, though both sides are at least a bit right – Labour mainly, but not really because of the way they’re using the figures (as Full Fact points out).

To get to the bottom of it you need to interpret some less than clear figures – as David Higgerson explains, you have to go searching for an explanation of what the target is that some PCTs are so far off.

I have to agree with his point that clear explanations are needed to help people looking at data, as datasets are usually written for people who already understand what it all means; but journalists have a responsibility to check and double-check anything that’s not blindingly obvious.

Even if it turns the story based on the data from a front-page splash into an interesting downpage piece.
