This post is a synthesis of various ideas which have come together now that Data.gov is live.
In turn, I would like to see data held by UK Public Sector organisations freed up so that others can create value from it.
This post has been lurking for a few weeks, so I thought I should post it before anything else happened[!].
Photo credit to Will Lion
I first got interested in this area when I learned about XML, and Microsoft BizTalk emerged at the beginning of the Naughties. These two tools promised to allow more effective data interchange between incompatible data sources. XML describes what the data actually *is*, but not how it should be formatted (as HTML does).
Looking back through my archives, I managed to find a presentation about XML & Web Services delivered to some colleagues in ICL / Fujitsu back in 2002, and another in 2003 about how Web Services might change the IT market. The material I used from Ovum, a tech industry analyst, was remarkably prescient (see this story on Silicon.com).
[…] an in-depth pollution report for your county, covering air, water, chemicals, and more. [in the USA]
Scorecard is a mash-up, i.e. it takes a number of different sources of U.S. environmental data and mashes them up into something else – in this case, a consolidated pollution report. Wikipedia defines a mashup as:
a web application that combines data and/or functionality from more than one source
Mashups are effectively a more groovy form of middleware [if that makes more sense for you], i.e. a piece of software that sits between incompatible applications or data sources and allows them to talk with one another [cue techie pedantry ;-)].
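For the techies, here is a minimal sketch of the mashup idea in Python: two sources that know nothing about each other, combined into one report per county. The county names, field names and the `mash_up` function are all made up for illustration; this is not how Scorecard itself is built.

```python
# Hypothetical sample data standing in for two unrelated data sources.
air_quality = [
    {"county": "County A", "pm10": 21.0},
    {"county": "County B", "pm10": 14.5},
]
water_quality = [
    {"county": "County A", "nitrate_mg_l": 32.0},
    {"county": "County C", "nitrate_mg_l": 18.0},
]


def mash_up(*sources, key="county"):
    """Combine records from several sources into one report per key value."""
    report = {}
    for source in sources:
        for record in source:
            # Start a new entry the first time we see this key, then
            # fold in whatever fields this source knows about.
            merged = report.setdefault(record[key], {key: record[key]})
            merged.update(record)
    return report


report = mash_up(air_quality, water_quality)
```

Here `report["County A"]` ends up holding both the air and the water figures, while counties known to only one source keep just the fields that source supplied.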
So, that was the start. I played and poked around with mashup tools like:
- Microsoft Popfly (visual interface, more straightforward for a techie bluffer like me)
- Yahoo! Pipes (rather natty interface, but a bit techie for doofus here)
[I think it’s worth saying that my interest is primarily how these tools can be used for business advantage, and to enable organisational agility.]
So far, so good.
You might have read that I attended SXSW. Whilst there [in my own words], “I died and went to mashup heaven” when I met Kirsten and Oren from Mashery. Mashery creates tools that help expose an organisation’s data to the outside world, using an application programming interface (API).
<phew> Which brings me to the nub of this piece [get on with it! – Ed.]
OMB Director Peter Orszag drops by to introduce us to what will be a key milestone in government transparency:
Today, I’m pleased to announce that the Federal CIO Council is launching Data.gov. Created as part of the President’s commitment to open government and democratizing information, Data.gov will open up the workings of government by making economic, healthcare, environmental, and other government information available on a single website, allowing the public to access raw data and transform it in innovative ways.
Such data are currently fragmented across multiple sites and formats—making them hard to use and even harder to access in the first place. Data.gov will change this, by creating a one-stop shop for free access to data generated across all federal agencies. The Data.gov catalog will allow the American people to find, use, and repackage data held and generated by the government, which we hope will result in citizen feedback and new ideas.
Data.gov will also help government agencies—so that taxpayer dollars get spent more wisely and efficiently. Through live data feeds, agencies will have the ability to easily access data both internally and externally from other agencies, which will allow them to maintain higher levels of performance. In the months and years ahead, our goal is to continuously improve and update Data.gov with a wide variety of available datasets and easy-to-use tools based on public feedback and as we modernize legacy systems over time.
Democratizing government data will help change how government operates—and give citizens the ability to participate in making government services more effective, accessible, and transparent.
Also worth quoting Jake Brewer writing at the Huffington Post:
Sometimes the geekiest stuff is the most important. When it comes to creating a more transparent and accountable government, Thursday, May 21, is one of those sometimes.
On this beautiful morning, our nation’s citizenry received one of the greatest gifts it could receive from its government: raw, freely and easily accessible data.
New federal CIO Vivek Kundra and the Obama Administration have officially launched Data.gov, which is the first-ever catalog of federal data being made freely (and easily) available to citizens.
Now, it’s unlikely the description of Data.gov will send chills down the spine of anyone who doesn’t speak Ruby or Python or MYSQL, and if you visit the site, it’s unlikely you’ll be struck or know to be impressed by what’s there. But if you step back and take a minute to understand what you’re looking at, you’ll realize we’ve just taken an unprecedented first step into the Era of Big Open Government.
When information and process become free and participatory, markets get created (think about weather data), more people engage more deeply with their government (see: Obama’s online townhall), and ultimately, people care more about what their government does and how it serves them. …it’s nearly impossible for people to know more about what’s going on and care less.
Transparency is at the heart of destroying apathy.
The key with this new data, though, is that we do something with it. While opening up data is a beautiful thing in its own right, what will make this release truly great is when citizens actually take the information and create new, brilliant applications.
That’s why Sunlight Labs in partnership with Google, O’Reilly Media, and Craig Newmark of Craigslist has simultaneously launched a contest with $25,000 in awards to incentivize the creation of said brilliance.
This is a wonderful, one-time opportunity to show the administration the good that follows when they make information free. So we need to seize it. And everyone’s help in getting the word out is key — whether you’re a developer, someone who knows developers to share this with, or someone who simply writes and talks to others.
At the end of the day, the more great entries the Apps for America contest receives, the more likely government is to release more data — and the more data government releases the more transparent, accountable, and efficient it can be.
Open, free, raw information — true Transparency — makes government work the way it’s supposed to (for you).
So let’s get on this. Geeks, wonks and active citizens alike.
btw, check out this fab wiki from Wired on Data.gov.
UK Public Sector data should be set free
Impetus in the UK has been inspired by this truly excellent article from the Guardian “Give us back our crown jewels”, which is summarised thus:
- Further info via the Guardian on this via Free Our Data, and also their Data blog.
- Also n.b. the sterling work by Mash the State, who pointed me to a wiki of Public Sector APIs collated by Rewired State.
So, the central idea that these sterling folk have been advocating has been vindicated & shown to work by our good friends on the other side of the Pond.
How it could happen
I was lucky enough to sit next to Stuart Dempster at the Thinking Digital dinner on the Thursday night. I bounced my wacky ideas off him, and although he didn’t say they would work, he felt there was a possibility that they might.
So I was thinking that in the UK Public Sector, and based on my (admittedly dated) knowledge of Government IT, an ‘aunt sally’ might be:
- Public Sector bodies could aggregate the data they wanted to share [no small feat]
- Take advantage of the Departmental Internet Server (aka a BizTalk-based black box) that might already be in use in their Department, Agency etc.
- E.g. re DIS: this case study, Graham Coombes, UK Government Gateway / e-Delivery Team – Customer Skills Checklist, archive Govt. Gateway preso
- Use this to talk to the Government Gateway (e.g. over GSi2)
- And thence to a mega data warehouse in Central Government, which would then offer an API to the World
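To make the ‘aunt sally’ a bit more concrete, here is a toy sketch in Python of the last step: a central store that Public Sector bodies push records into, with a single read-only query method standing in for the API. Everything here is hypothetical (the `CentralWarehouse` class, its method names, the body names and the figures), and it ignores the DIS, the Government Gateway and GSi entirely.

```python
from collections import defaultdict


class CentralWarehouse:
    """Toy stand-in for a central data warehouse with a public query API."""

    def __init__(self):
        # Maps a dataset name to the list of records submitted for it.
        self._datasets = defaultdict(list)

    def ingest(self, body, dataset, records):
        """Accept records from one Public Sector body, tagging their source."""
        for record in records:
            self._datasets[dataset].append({"source": body, **record})

    def api_get(self, dataset, source=None):
        """Return a dataset's records, optionally filtered by submitting body."""
        rows = self._datasets.get(dataset, [])
        if source is not None:
            rows = [row for row in rows if row["source"] == source]
        return list(rows)


# Made-up bodies and figures, purely for illustration.
warehouse = CentralWarehouse()
warehouse.ingest("Department A", "spending", [{"year": 2008, "amount_gbp": 100}])
warehouse.ingest("Agency B", "spending", [{"year": 2008, "amount_gbp": 250}])
all_spending = warehouse.api_get("spending")
```

The point of the sketch is only that the outside world talks to one place, whichever body the data originally came from.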
Where the money might come from for this is, of course, very sketchy!
btw, checking out the various US pages on Data.gov, I got to wondering whether we need our own CIO. Or do we already have one?
So, if you’ve read this far – *many* thanks! I realise this has been a bit of a discursive ramble, so suggestions please about how best to tidy it up. I’ve now got to the stage where I need to publish (or die writing it!).
I’d like to draw your attention to a couple of other interesting links:
- Christopher Chantrill’s UK Public Spending.co.uk – seemingly a mine of data about [yes] UK Public Spending.
- Unfortunately, the data are only available via the website, or in downloadable form
- So, perhaps he needs help with making them mashable?
- Hans Rosling’s Gapminder Foundation
- Whilst attending Thinking Digital recently, I was lucky enough to be able to ask him whether he thought each Public Sector body should have its own API, or whether the data should be collected in a central place – as per Data.gov.
- Hans responded by saying that he thought it best for the data to be collected in one place
And a couple of other ideas:
- Perhaps we need to have a “Freedom of Data” act, to help establish ‘data.gov.uk’?
- Also, that “The Revolution will be visualised”, e.g. this from The Guardian re MPs’ expenses
- My love of visualisation goes back to the days when I was working with search & retrieval technologies, to go *inside* organisations…
I saw this quote about Cyberspace and thought it relevant to visualisation:
The word "cyberspace" (from cybernetics and space) was coined by science fiction novelist and seminal cyberpunk author William Gibson in his 1982 story "Burning Chrome" and popularized by his 1984 novel Neuromancer. The portion of Neuromancer cited in this respect is usually the following:
Cyberspace. A consensual hallucination experienced daily by billions of legitimate operators, in every nation, by children being taught mathematical concepts… A graphic representation of data abstracted from banks of every computer in the human system. Unthinkable complexity. Lines of light ranged in the nonspace of the mind, clusters and constellations of data. Like city lights, receding.
I’ve read and enjoyed Gibson’s Sprawl Trilogy, also Snow Crash, and am presently loving Down & Out in the Magic Kingdom. I find it useful to go back to the inspiration behind many present-day innovations by reading the ‘source material’.
Over and out!