Request Initiative, a not-for-profit community interest company that helps charities, NGOs and non-profits make use of FOI, is blogging top tips for using the FOI Act.
This week’s topic, looking at asking for datasets, is timely, with the new FOI rules on datasets coming in from September 1.
The advice is good – datasets are always much better for analysing than the response you are likely to get if you let an FOI officer do the number-crunching for you (a dataset usually gives you the chance to find several stories, rather than just the answer to one question).
All good advice until I got to this bit, which just doesn’t match what I’ve found to be the case.
“The best way to request a whole dataset is in CSV (comma separated values) format. Whatever proprietary software might be in use, every database can export to CSV and you can open it with Excel when you receive it.”
Not the bit about asking for CSVs (ask for them, they’re great!), but about every database being able to export to CSV.
You’d think, right?
The problem, and experience has taught me this, is that either no one knows how to get the programme to export to CSV (because it’s never come up before, because that’s not how the programme is used, and because people generally only learn to do what they need to do, so if they’ve never needed to do something they assume the programme can’t do it) or it actually doesn’t export to CSV (or at least everyone in the organisation swears it doesn’t).
Yes, there is an entire story in how the public sector may have invested in proprietary software for organising data and processes, where if they ever decide to change software, they would have no easy way of exporting their old data out and importing it into a new system (except, probably, at huge cost). That, or the programme does have the option but it’s not enabled.
There was a great post yesterday from Andy Dickinson about trying to get data from the Home Office that suggests staff have to look through 75 screens to find information in just one database. Also yesterday, I had a conversation with South Wales Police, well known for their excellent records management, about how it would take them 2,000 plus hours to get some data out of a database (I did ask about exporting, you know, as opposed to typing it out by hand).
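For a sense of what “exporting” can look like when the data sits in an ordinary SQL database that someone is able to query directly, here is a minimal sketch. It is an illustration under assumptions, not a description of any real system: I have no idea what sits underneath any particular force’s or council’s software, SQLite is standing in for whatever database is actually in use, and the `incidents` table, its columns and its rows are all made up.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write every row of `table` to `out` as CSV, header row first.

    Returns the number of data rows written.
    """
    # Table names cannot be passed as SQL parameters, so check the name
    # against the database's own list of tables before using it.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table}")

    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([col[0] for col in cursor.description])

    rows = 0
    for row in cursor:
        writer.writerow(row)
        rows += 1
    return rows


# Hypothetical stand-in data: an in-memory database with one small table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incidents (id INTEGER, category TEXT, recorded TEXT)")
conn.executemany(
    "INSERT INTO incidents VALUES (?, ?, ?)",
    [
        (1, "Burglary", "2013-08-01"),
        (2, "Theft, vehicle", "2013-08-02"),  # the comma gets quoted in the CSV
        (3, "Criminal damage", "2013-08-02"),
    ],
)

buffer = io.StringIO()
count = export_table_to_csv(conn, "incidents", buffer)
print(f"{count} rows exported")
print(buffer.getvalue())
```

To write a file rather than an in-memory buffer, pass `open("incidents.csv", "w", newline="")` as `out`. None of this helps, of course, when the programme doesn’t give anyone that kind of access to its data in the first place, which is rather the point.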
Which brings me to why I don’t think the new rules on datasets are going to make it any easier to get datasets out of public bodies.
The ICO has issued guidance on the new requirements. Reading through it, in theory nothing has changed for most FOIs: small data tables put together to answer a question aren’t datasets covered by the new requirements, so most FOI responses will go out much as they did before.
Apart from that, most responses concerning what are probably datasets under the requirement could be made compliant by not converting spreadsheets to a PDF (seriously, stop doing that), but instead by saving them to a CSV and then making them reusable under the Open Government Licence.
However, I suspect the new dataset requirements are not going to lead to the bounty of data the Government is hoping it will.
The issue with requesting datasets now isn’t format (yes, PDFs are annoying when you get them but most are convertible), it’s getting the raw material out in the first place – the programmes that just don’t work like that, the information that isn’t held in that format.
It’s not going to solve the issue where the requester is incredulously pointing out that being able to export information in a simple, open format ought to be a basic and key requirement of any software purchase, while a council officer claims that, of course, the information can’t be exported out of the programme, because why would anyone need to do that when everything is done in the programme, and the FOI officer is stuck in the middle.
The new requirements may mean more attention starts to be paid to finding out how to export datasets, and may give FOI officers more power to get people to start asking questions of suppliers about export options, but I suspect the caveat that data should be released in this form “so far as reasonably practicable” will win out and datasets will remain locked down.
On the subject of blogging about FOI and blogging weekly, I’m aiming to try (not committing myself there) to blog a bit more regularly (or at all). As I’ve finished my book, I have a little more time, so should probably put it to better use.
The plan is some more posts on FOI, data journalism (and I’m hoping the free time will allow for a bit more experimenting), and open data, at least semi-regularly. I have great hopes but then again, this blog is three-and-a-half years old and has fewer than 35 posts.