Annual Report 2007

= Annual Report 2006-2007 =


= Introduction =

The Open Knowledge Foundation celebrates its third birthday today. We thought it timely to put together a report of its main activities over the past year.

We’ve had some significant developments in our existing open knowledge projects and tools, and have commenced work on several new ones. Much focus has gone into new releases for three of our central projects: the Open Knowledge Definition (OKD), the Comprehensive Knowledge Archive Network (CKAN) and KForge. The first of these is to help make sure knowledge is openly licensed, the second to make sure open knowledge can be easily found, and the third to provide knowledge users and producers with tools for storage, retrieval and development. Our two 'exemplar' projects, Open Shakespeare and Open Economics, have been launched and significantly developed.

We've also been hard at work campaigning to protect and promote the legislative and institutional conditions under which an infrastructure for open knowledge could flourish. As well as campaigning on the INSPIRE directive, with moderate success, we've responded to consultations from Ofcom and WIPO, and had research published by the IPPR. Our two main events of the last year - the Forum on Civic Information No. 2 and Open Knowledge 1.0 - brought together people from across the open knowledge spectrum for talks, discussions and workshops.

In September 2006 we were pleased to welcome Peter Suber and Benjamin Mako Hill onto our advisory board. Peter is a Research Professor of Philosophy at Earlham College, Senior Researcher at the Scholarly Publishing and Academic Resources Coalition (SPARC), and the Open Access Project Director at Public Knowledge. Benjamin Mako Hill is a technology and intellectual property researcher, activist and consultant, and has been an active member of the Free and Open Source Software (FOSS) community for over a decade.

The Open Knowledge Foundation, 24th May 2007

= Events =

== Forum on Civic Information No. 2 ==
28th November 2006, Sir David Davies Lecture Theatre, UCL

The second Forum on Civic Information was organised in association with the UCL Faculty of Computer Science. Talks were given by John Sheridan (Head of e-Services, Office of Public Sector Information), Richard Pope (Love Music Hate Racism and MySociety), Heather Brooke (author of Your Right To Know) and Julian Todd (Public Whip). There was a good turnout, despite the rainy November weather!

John Sheridan's talk gave insight into the plans and activities of the Office of Public Sector Information (OPSI) pertaining to civic information. He described the OPSI's key responsibilities and objectives, and set out its position in relation to government information policies, producers and consumers. After brief comments on the desirability of grass roots demand for data from the public sector, he outlined the history and value of click-use licensing, and emphasised the importance of standardization and metadata in official documents and datasets. Finally he gave a sketch of the AKTive PSI research project into the potential benefits of semantic web languages and other recent web technologies for the presentation and organisation of government information.

Richard Pope described his experiences working on three websites designed for local communities. Lambeth Planning Alerts was born out of a feeling among local residents that they weren't being sufficiently informed about planned building developments in their area. It uses a screen scraper to grab information from local government websites on a daily basis, and then e-mails it to a list of subscribers with a link back to the original site, a comments page and a Google map. ElectionMemory.com brought manifestos and information about candidates for the Lambeth 2006 local elections together in one place. Whereas the information was previously difficult to find or unavailable online, the site provided a platform for shared public scrutiny of candidates in an area where voter turnout has been low. Love Music Hate Racism uses Pledge Bank to help organize local events for its campaign, which has had few funds or personnel and has relied greatly on the active participation of local groups.

Heather Brooke talked about her work looking at how citizens obtain official information. She summarised the process of applying for information under the Freedom of Information Act, in force since 2005, and suggested that UK authorities suffer from what she called 'data hoarding', which leads to 'stunted civic propriety'. Her book, Your Right To Know, is a citizen's guide to getting hold of official information. She stressed the importance for democracy of citizens having access to the information for which they pay, highlighting the value of being able to query databases directly. As a journalist, she described her requests for data from UK police departments, the problems she encountered with the variety of data formats used, and the unwillingness to make available the information she asked for. Finally she sketched the proposed legislation that would restrict FOI requests, and drew attention to a Downing Street petition on this.

Julian Todd talked about Public Whip and projects like it that he would like to see. He demonstrated the functionality of PublicWhip.org.uk and discussed the parliamentary information that is displayed on the site, before going on to discuss aspects of the US electoral process. After describing how gerrymandering and tactics of 'packing and cracking' are common practices in the US, he explained the publicly available electoral maps and the detailed breakdowns of political finance that are published by government. He argued that resources displaying public information produced in the UK should offer a similarly fine grain of detail. Finally he made some comments on attitudes to using open source software in educational institutions.

In the subsequent discussions, questions about data formats, licensing and the publishing process for civic information were raised. General agreement was reached on the importance of access to public data, in a raw form, with accompanying identifiers.

Links
 * OKFN - Forum on Civic Information No. 2

== Open Knowledge 1.0 ==
Saturday 17th March 2007, Limehouse Town Hall, London

Open Knowledge 1.0 brought together individuals and groups from across the open knowledge spectrum. Most of the day was taken up by three main sessions on open geodata, open media, and open scientific and civic information. A programme of short talks and mini workshops followed these. The event certainly benefited from the wide range of backgrounds and specialisms of the speakers and participants - which ranged from academia, government, and the information sector, to the media, the arts and campaign groups.

=== Open Geodata ===

OKF director Rufus Pollock introduced the conference, focusing on componentization and the potential for commercial exploitation of open knowledge. Following this was the first panel on open geodata, chaired by Becky Hogge of the Open Rights Group.

First was Ed Parsons from Open Geomatics. His talk was about his views on geodata and his experiences as CTO at Ordnance Survey. After proposing that the value of the data increased as the number of people who use it increased, he went on to propose that users should nevertheless pay for use of Ordnance Survey data. The reasons he cited were the massive costs involved in maintaining the complex datasets (an estimated £70 million p.a.) and the political difficulty of justifying the spending on geodata in a context where health, schools, energy and security are further up on the agenda. A comparison with the situation in the US showed that, relatively, the OS is much better funded, and the data is more up to date – as demonstrated by the ad hoc re-mapping that was necessary in the emergency response to Hurricane Katrina. However he suggested that the current licensing of OS data was unsatisfactory and had a damaging effect on potential commercial exploitation of the data. The documentation is overly complicated and ambiguous, and the process of applying to use the data painstaking. He concluded by expressing hopes that Open Space, an open API that he had been working on while at Ordnance Survey, would soon be released, that licensing would improve, and that some of the Ordnance Survey’s older products with less commercial potential would be released under open licenses.

Next, Steve Coast talked about his experiences developing Open Street Map. The project started with the idea of aggregating traces from GPS systems. The more traces were aggregated, the more accurate the emerging map would be. The traces could be turned into maps, and names and icons could be added. The motivation for the project was that UK geodata was not free and not current – leading to a desire for ‘grassroots re-mapping’. The ‘mapping parties’ initiated by Open Street Map proved to be popular in the UK as well as abroad. Their mapping activities revealed errors in Ordnance Survey maps – including deliberate ‘Easter eggs’ included to detect copyright infringement. He also helped to start projects to collect postcode data – which is usually expensive. Ultimately he hoped that as the quality of projects like OSM increased, the OS would be pressured to bring its prices down – paralleling the effect of free software on its proprietary counterparts.

Finally, Charles Arthur of the Guardian’s ‘Free Our Data’ campaign gave his views on the political and economic background to the notion of making impersonal government data available at cost. He accused the government of ‘sitting on the fence’ with respect to the issue and highlighted tensions between legislation to increase availability and requirements to recoup the costs of producing the data. He commented on the comparative availability of US public information, demonstrating a Chicago Police Department mashup of crime data onto a Google map. OnOneMap was cited as an interesting but rare example of a similar project in the UK. He alleged that much of the money gained by data-producing departments came from other departments and was basically circulating tax money. A better approach would be to make the data free, increasing commercial opportunity, so that the costs incurred by government departments could be recovered through taxation.

=== Open Media ===

The Open Media session was chaired by Andrea Rota of Liquid Culture. Paula Le Dieu of Magic Lantern Productions spoke about open licensing in the context of the emerging new media landscape – and the success of web based projects such as Flickr which may have seemed implausible only a few years ago. However media producers still have problems searching for open cultural content – and she warned that with the proliferation of licensing experiments such as those produced by the Creative Commons, we should be wary of inadvertently creating ‘license ghettos’. She related her experiences as director of the BBC’s Creative Archive, and described how the project seemed to lose the BBC's attention, leading to a ‘public value test’ by the corporation. Then she suggested that questions of pricing public value were difficult and there was a widespread inability to give an empirical response to economic questions concerning open media. Finally she warned against conflating legal advice with business advice, and of the danger of losing projects like the Creative Archive as an ideal, with over a million hours of video footage and many hours of audio that could be released under an open license.

Susana Noguero and Olivier Schulbaum of Barcelona-based collective Platoniq talked about several of their projects, including Burn Station and the Bank of Common Knowledge. The former, a ‘mobile copying station’, aims to bring open net culture to public spaces. The latter is an experiment using ‘peer to peer’ communication models for knowledge transfer between community members. They asked about the broader effects of open media, the extent to which the notion of open media entails open access, and the possibility of licensing for collective authorship.

Zoe Young, co-ordinator of the metadata working group for Transmission, discussed their recent recommendations for standardizing metadata for video content. They decided that it was important to make these as easy to understand, as broadly applicable and as workable as possible – leading them to specify a minimal set of required fields along with a larger set of optional fields specifying more detailed information. She also described their work with the Participatory Culture Foundation’s Democracy Player.

=== Open Civic and Scientific Information ===

Jo Walsh of the Open Knowledge Foundation and OSGeo chaired the final session on civic and scientific information. Tim Hubbard of the Sanger Institute started with a presentation on the Human Genome Project and open access publishing of scientific research. He gave background information on the HGP as a high profile case of public, open access, scientific research that succeeded through international collaboration and sharing data – outlining the Bermuda Statement, and reiterating the point that ‘the more people analyze a block of data, the more valuable it is’. He suggested that problems associated with data sharing fall into two broad categories: cultural attitudes and practical issues. In relation to the latter, he discussed data integration and distributed annotation systems. In relation to the former, he discussed the pros and cons of open access publishing economically, academically, and in relation to government perceptions and long term sustainability. Finally he discussed the potential application of an open access model to AIDS drugs development, and his work lobbying for WHO endorsement of more open access scientific research.

Peter Murray-Rust of the Unilever Centre for Molecular Science Informatics, Cambridge University, spoke about his work on the World Wide Molecular Matrix (WWMM) and issues relating to access in chemical research. He mentioned that some large companies are migrating to more open research models, and emphasised the virtues of co-operation and sharing in the procurement and analysis of data. To illustrate the importance of open access, he spoke of the process of trying to investigate the presence of the carcinogenic chemical 'sudan red' in chili powder. Many sources of data or papers on the chemical charge for access. He described how PubChem, an open access registry of chemical information run by the US National Institutes of Health, came under attack from the American Chemical Society for allowing free public access, before going on to look at the ubiquitous financial model whereby both researchers and pharmaceutical companies pay publishers for access to data. Finally he demonstrated the WWMM, talked about Blue Obelisk, a project to produce open access resources for the chemical informatics community, and described the possibility of new publishing models whereby funders would pay publishers upfront on the condition that the material would be released under an open license.

John Sheridan, head of e-Services at the Office of Public Sector Information (OPSI), spoke about efforts to improve access to data and documents produced by government departments. He stressed the importance of high profile public campaigns like 'Free Our Data' in raising awareness of the role and potential of public sector information (PSI). After general comments on the possibilities of recent web technological developments, he discussed the government's model for public service reform, and how the two might intersect to provide a transformation in public services. He closed with comments on the AKTive PSI research into exploiting semantic web technologies to increase the use and usefulness of PSI and to improve the UK's knowledge economy.

=== Mini Workshops, Presentations and Short Talks ===


 * Presentation on the SP-ARK project
 * Becky Hogge on the Open Rights Group
 * Steve Dowding on wikimapia as a campaign tool
 * Chris Armbruster on e-science, e-publishing, and PSI
 * Richard Pope on PlanningAlerts.com
 * Tim Cowlishaw and Rufus Pollock on Public Domain Works
 * Platoniq on The Bank of Common Knowledge
 * Julian Todd on UNDemocracy.com
 * Jo Walsh on the INSPIRE directive
 * Jessica Clark on visualizing public media
 * Hugh Barnard on community currencies
 * Paula Le Dieu on the Public Service Publisher
 * An Mertens on Atomic Literature

Links
 * OKFN - Open Knowledge 1.0 - development page
 * OKFN - Open Knowledge 1.0: Post-Event Information
 * OKF Blog - Open Knowledge 1.0 Has Happened

= Campaigning, Policy and Research =

== Ofcom PSP Consultation Response ==
On 24th January 2007, Ofcom (Office of Communications, UK) invited individuals and organisations to give their views on its notion of an organisation provisionally named the 'Public Service Publisher' (PSP), which could provide 'additional innovation and plurality' to existing public service provisions. The PSP would pay particular attention to content emerging as a result of the proliferation of 'new media' technologies.

In March 2007, the Open Knowledge Foundation, the Open Rights Group and Free Culture UK jointly submitted a response urging Ofcom to support open content, software and infrastructure, as well as to refrain from restricting its outlook to the UK alone. The original discussion document alludes to 'alternative open licensing models' while describing the potential of more 'participatory media environments'. It is hoped that some of the more detailed comments on the advantages of open licensing in our joint response will be taken on board as the idea of a PSP is further developed.

Links
 * Joint response of the Open Knowledge Foundation, the Open Rights Group and Free Culture UK to OfCom’s 'A new approach to public service content in the digital media age' (PDF)
 * OKF Blog - Response to OfCom’s Public Service Publisher Consultation
 * OKF Blog - Zoetropes and Nickelodeons: A response to OFCOM’s ‘Public Service Publisher’ proposal

== INSPIRE Directive ==
The Open Knowledge Foundation was actively involved in campaigning on the European Commission's INSPIRE Directive. The first version of the directive was submitted to the European Council in July 2004 and aimed to build towards a framework for geographic information that would be shared by member states. In the debates that followed, the original proposal was amended to increase the capacity of governments to recover the costs of producing the data by restricting public access. The OKF strongly supported the publicgeodata campaign for legislation more supportive of open access to geodata for European citizens and businesses.

A petition to either amend or reject the directive collected over 8000 signatures from across Europe. National ministries, European parties and MEPs were lobbied with letters and personal contact.

Links
 * OKF Blog - INSPIRE: Where Next?
 * publicgeodata

== WIPO ==
In September 2006, the Open Knowledge Foundation submitted a response to a World Intellectual Property Organization (WIPO) consultation on the Economic, Social and Cultural Impact of Intellectual Property in the Creative Industries. The response urged WIPO to take an economic approach in order to assimilate the various possible effects of policy to a single evaluative axis centred on the economic notion of 'social welfare'. It also recommended that transparency, breadth of scope, citation of empirical evidence and openness to revision should be crucial aspects of the IP policy making process for creative industries.

Links
 * OKF Blog - Response to WIPO Consultation on Performing Impact Assessments for IP in the Creative Industries

== Papers and Conferences ==
In July 2006 Rufus Pollock, director of the Open Knowledge Foundation, delivered a paper at an Institute for Public Policy Research (IPPR) seminar on 'IP and the Public Domain' which was subsequently published. The paper, titled 'The Value of the Public Domain', traces the historical development and importance of the public domain before going on to explore the economic value of the public domain both in theory - looking at the economic concept of 'social value' and the special case of knowledge as a good, and in practice - using a variety of recent case studies.

In May 2006, he gave a talk at the Telecommunications and Media Forum (IIC, Brussels) on open models in media and broadcasting. In April 2007, he spoke about the importance of open access and open data in academic research at the AHRB/British Academy Conference on Copyright and Research in the Humanities and the Social Sciences (Edinburgh University).

Links
 * IPPR - The Value of the Public Domain
 * OKF Blog - IPPR ‘IP and the Public Sphere’ Seminar
 * OKF Blog - The Value of the Public Domain Published
 * OKF Blog - Copyright and Research in the Humanities and Social Sciences Conference
 * OKF Blog - Copyright and Research in the Humanities and Social Sciences
 * OKF Blog - Talk at IIC Telecommunications and Media Forum
 * OKF Blog - IIC Telecommunications and Media Forum: Spectrum Policy

== Open Knowledge Development Process ==
The open knowledge development process is a frequently recurring subject of discussion at the OKF. After the essay ‘Four Principles of (Open) Knowledge Development’ from May 2006, much attention has been focused on further investigation of the features of the process noted: that it should be incremental, decentralized, collaborative, and componentized. A draft paper on the collaborative development of data is in the pipeline, and work on Open Shakespeare has spurred thoughts about annotation systems for the collaborative development of texts. An essay published in April 2007, ‘What do we mean by componentization for knowledge?’, explores the distinction between atomizing and packaging – favouring the latter for knowledge development.

'Dead knowledge: why being explicit about openness matters', from August 2006, emphasises the importance of explicitly open licensing in collaborative knowledge projects, to prevent content remaining unused due to the difficulties involved in asking permission from all those involved in its creation.

Links
 * OKF Blog - The Four Principles of (Open) Knowledge Development
 * OKF Blog - Dead knowledge: why being explicit about openness matters
 * OKF Blog - Thinking about Annotation
 * OKFN Discuss Collaborative Development of Data
 * OKF Blog - What Do We Mean by Componentization (for Knowledge)?

= Projects =

== Open Knowledge Definition ==
The Open Knowledge Definition (OKD) aims to describe, as clearly, simply and succinctly as possible, what open knowledge is and the conditions which a licence must meet in order that what is licensed can be called open knowledge. The first version was circulated in August 2005. Since this time last year, there have been two new versions of the OKD in response to feedback. Version 0.2.1 was published in May 2006 and included minor revisions. Version 1.0 was released in July 2006, taking account of comments received.

In February 2007, a set of 'Open Knowledge' and 'Open Data' web buttons was released. These have started to appear on documents and datasets compatible with the Open Knowledge Definition.

Links
 * Open Knowledge Definition
 * OKF Blog - Version 1.0 of the Open Knowledge Definition Released
 * Open Knowledge and Open Data Web Buttons
 * OKF Blog - Open Knowledge Web Buttons: Get Them Now
 * OKF Blog - Open APIs Don’t Equal Open Knowledge

== CKAN ==
The Open Knowledge Foundation has been working on CKAN - the Comprehensive Knowledge Archive Network - since June 2006. It is a registry of open knowledge packages with tagging and user support.

Version 0.2, released in February 2007, saw a complete re-write of CKAN to use the Pylons web framework. It included support for full CRUD (create, read, update, delete) on packages and tags.

Version 0.3 was released in April 2007 and included the introduction of domain model versioning, a 'recent changes' log, OpenID user authentication and a host of minor fixes and improvements.
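
As a rough illustration of what CRUD on packages and tags means for a registry of this kind, here is a minimal in-memory sketch in Python. It is not CKAN's code or API: the names `Package`, `PackageRegistry` and `with_tag` are hypothetical, and a real registry would sit on a database behind a web framework.

```python
# Hypothetical sketch only: not CKAN's actual code, data model or API.
from dataclasses import dataclass, field


@dataclass
class Package:
    """A registered knowledge package with a unique name and a set of tags."""
    name: str
    title: str = ""
    tags: set = field(default_factory=set)


class PackageRegistry:
    """In-memory registry offering create, read, update and delete."""

    def __init__(self):
        self._packages = {}

    def create(self, name, title="", tags=()):
        # Refuse to overwrite an existing package silently.
        if name in self._packages:
            raise KeyError(f"package already exists: {name}")
        package = Package(name=name, title=title, tags=set(tags))
        self._packages[name] = package
        return package

    def read(self, name):
        # Raises KeyError if the package is not registered.
        return self._packages[name]

    def update(self, name, title=None, tags=None):
        # Only the fields that are passed in are changed.
        package = self._packages[name]
        if title is not None:
            package.title = title
        if tags is not None:
            package.tags = set(tags)
        return package

    def delete(self, name):
        del self._packages[name]

    def with_tag(self, tag):
        """Return the names of all packages carrying the given tag, sorted."""
        return sorted(p.name for p in self._packages.values() if tag in p.tags)
```

Used this way, `registry.create("open-economics", tags=["economics"])` followed by `registry.with_tag("economics")` would list the package by name, and `registry.delete("open-economics")` would remove it again.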

Links
 * CKAN
 * OKFN - CKAN Project Page

== KForge ==
KForge is the Open Knowledge Foundation's flagship project – an open source system for managing software and knowledge projects. It re-uses existing best-of-breed tools such as versioned storage (Subversion), a tracker (Trac), and a wiki (Trac or MoinMoin), integrating them with the system’s own facilities (projects, users, permissions etc). KForge also provides a complete web interface for project administration as well as a fully-developed plugin system so that new services and features can be easily added.

Version 0.11 was released in July 2006. It included a completely re-written web interface, improved logging, system administration and documentation, and a new command line interface.

Version 0.12 was released in January 2007. It added Mailman and WordPress plugins, a 'remember me' option on login, a 'notify' plugin for system administrators and 'forgot password' support. Database support was extended, and integration with subsystems, the usability of the web interface and overall speed were all improved.

It is hoped that future developments will include distributed storage for large datasets, support for other versioning systems, and extended database support.

Links
 * KForge
 * OKFN - KForge Project Page

== Open Shakespeare ==
First released just over a year ago, Open Shakespeare is intended to be a simple exemplar of an open knowledge package. The original idea was to have the complete works of Shakespeare, with ancillary information and a web interface for searching and analysing the material in a form that was open in accordance with the Open Knowledge Definition, unlike other existing cases where versions of the text are under copyright, and code is proprietary. Emphasis was placed on the potential for reusability, the presence of multiple versions of the text, and the development of an open API – to make it easy for users to build their own applications and websites from the package.

Version 0.2 was released in July 2006. It marked significant internal changes, including the use of a domain model and database backend, as well as substantial improvements to the flexibility, speed and functionality of the concordance.

Version 0.3 was released in October 2006. It included more versions of the texts, and introduced the option to view multiple texts side by side. It also marked the installation of the Shakespeare Python package.

Version 0.4 was released in April 2007, and included a new web annotation system and more improvements to the speed of the concordance. The interface was switched from CherryPy to WSGI, and the template language from Kid to Genshi.
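
For readers unfamiliar with the term, a concordance is an index from each word to the places where it occurs in a text. The following is a minimal sketch of the idea in Python, under simplifying assumptions (plain text input, one entry per line of text); the function name `build_concordance` is hypothetical and this is not the Open Shakespeare implementation.

```python
# Hypothetical sketch of a concordance; not the Open Shakespeare code.
import re
from collections import defaultdict


def build_concordance(text):
    """Map each lower-cased word to a list of (line number, line) pairs."""
    index = defaultdict(list)
    for number, line in enumerate(text.splitlines(), start=1):
        # Use a set so a word repeated within one line is recorded only once.
        for word in set(re.findall(r"[a-z']+", line.lower())):
            index[word].append((number, line.strip()))
    return index
```

Looking up a word, for example `build_concordance(play_text)["question"]`, would return every line containing it together with its line number. A production system would more likely store the index in a database than rebuild it on each request.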

Attention is now being focused on extending annotation support, transcribing and proofreading texts, and extracting supplementary material from public domain sources.

Links
 * Open Shakespeare
 * OKFN - Open Shakespeare Project Page

== Open Economics ==
Open Economics is intended to be an exemplar of an open knowledge package with a data store and visualisation tools. It was launched in June 2006 and presently consists of a Python library for building economic models, data stored in a Subversion repository, and a web interface for access, visualisation and various simple 'services'. So far the data store includes figures relating to music sales, patent registrations, price indices, interest rates and population.

Version 0.4 was released in April 2007. It marks a change from Kid to Genshi templates, and includes new datasets, an improved web interface and a better data bundle package.

It is hoped that future efforts will be directed at uploading more data and improving the web interface and the data store.

Links
 * Open Economics
 * OKFN - Open Economics Project Page

== Public Domain Works ==
Conceived and launched in association with Free Culture UK, Public Domain Works is an open registry of artistic works that are in the public domain. First released in June 2006, the project aims to provide a database of metadata in a raw and reusable form. In March 2007 the Open Rights Group joined the project.

Links
 * Public Domain Works
 * OKFN - Public Domain Works Project Page

= Timeline =

May 2006
 * Version 0.2.1 of Open Knowledge Definition
 * Talk at Telecommunications and Media Forum, IIC, Brussels

June 2006
 * Launch of Public Domain Works
 * First release of Open Economics

July 2006
 * Version 0.2 of Open Shakespeare
 * Version 0.11 of KForge
 * Version 1.0 of Open Knowledge Definition
 * Publication of 'The Value of the Public Domain'

September 2006
 * Response to WIPO consultation on creative industries
 * Peter Suber and Benjamin Mako Hill join advisory board

October 2006
 * Version 0.3 of Open Shakespeare

November 2006
 * Forum on Civic Information No. 2

January 2007
 * Version 0.12 of KForge

February 2007
 * Version 0.2 of CKAN

March 2007
 * Open Rights Group join Public Domain Works
 * Open Knowledge 1.0
 * Response to Ofcom PSP consultation

April 2007
 * Talk at AHRB/British Academy Conference on Copyright and Research in the Humanities and the Social Sciences, Edinburgh
 * Version 0.4 of Open Economics
 * Version 0.4 of Open Shakespeare
 * Version 0.3 of CKAN

= People =

Executive Group
 * Saul Albert
 * Rufus Pollock
 * Jo Walsh

Board of Directors
 * Rufus Pollock
 * Martin Keegan
 * James Noyes
 * Natasha Phillips

Advisory Board
 * Dr Tim Hubbard
 * Paula Le Dieu
 * Dr Peter Murray-Rust
 * Professor John Naughton
 * Professor Peter Suber
 * Benjamin Mako Hill