Aaaah, the one I was waiting for – the #ioe12 open data module.
I’m just in the process of writing the slides for the Institutional Web Management Workshop 2012 (IWMW 2012) so this module came in handy. My parallel workshop is entitled Big and Small Web Data and open data definitely falls into the remit.
“A piece of content or data is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and share-alike.” Open Knowledge Foundation
The opening TED talk by Tim Berners-Lee is primarily an advocacy talk for open data. Tim talks about how the community feel that was around at the start of the web is similar to the community feel now around open data. While he asked people to put their documents on the Web many years back, he now wants people to put their data on the web. This is primarily because “you can do all types of stuff with data” and you can link it up – linked data. He explains that the more things you get to link together, the more powerful it is. Tim ends by encouraging a communal shout of “Raw Data Now!” because “data is about our lives”.
It was an enthusiastic talk but lacked depth: there was no discussion of what exactly open data is, the reasons why people might or might not want to be open with their data, or the challenges that they face in doing it.
The Wikipedia entry offers a better overview, exploring the roots of open data (e.g. the Mertonian tradition of science, the open movement), the lack of an agreed definition, commercial issues and the ‘reluctance’ to put licences on data – which causes uncertainty. The arguments for and against open data are contextual and often depend on the type of data and how it can be used. I see the two key arguments for open data as being the use of public money to fund research (i.e. we paid for the data) and the advancement of science through collaboration. The arguments against open data are less clear but centre on safety, commercial and reputational incentives for controlling data use, and the cost of preparing data for publication.
One of the more interesting resources for the module is the data.gov Web site and its open data community section. This is the US government Web site which was launched in late May 2009 as part of the process of “rebuilding confidence in government and business” (Aliya Sternstein). The site was a forerunner of the UK's data.gov.uk, which appeared in beta version in September 2009 and went live in January 2010. The open data community section of the data.gov site is primarily a series of forums and blog posts looking at international governmental data sharing.
There was also the Open Data Commons, which comprises a set of legal tools to help users provide and use Open Data. This includes licences (additional to the CC ones) and dedications. The site was set up by Jordan Hatcher (who I actually worked with on the JISC PoWR project) and was transferred to the Open Knowledge Foundation in January 2009.
Other resources include a list of where to find open data on the Web (e.g. CKAN (Comprehensive Knowledge Archive Network), Infochimps, OpenStreetMap and more) – very handy. The comments add a lot of good resources too. There are also details of the New York Times linked open data work and a link to the Linked data site which provides pointers to resources from across the linked data community. Good to see a list of tools there including tools for publishing and consuming linked data and for end users.
The whole ‘open data’ movement is becoming so huge that it is almost impossible to give a snapshot with just a few resources. I still feel there is so much to learn, and then there is also so much discipline-specific data and tooling too. Phew!