ODDC Coordinator Tim has been in Montreal this week for the Developers for Development hackathon and conference, taking part in a panel today on the Impact of Open Data in Developing Countries. Below are Tim's slides and notes from his talk. A version of this presentation was also given by Web Foundation CEO Anne Jellema at the Data and Accountability for the Post-2015 Development Framework event in New York.
In this short presentation I want to focus on three things. Firstly, I want to present a global snapshot of open data readiness, implementation and impacts around the world.
Secondly, I want to offer some remarks on the importance of how research into open data is framed, and what social research can bring to our understanding of the open data landscape in developing countries.
Lastly, I want to share a number of critical reflections emerging from the work of the ODDC network.
Part 1: A global snapshot
I've often started presentations and papers about open data by commenting on how 'it's just a few short years since the idea of open data gained traction', yet, in 2014 that line is starting to get a little old. Data.gov launched in 2009, Kenya's data portal in 2011. IATI has been with us for a while. Open data is no longer a brand new idea, just waiting to be embraced - it is becoming part of the mainstream discourse of development and government policy. The issue now is less about convincing governments to engage with the open data agenda, than it is about discovering whether open data discourses are translating into effective implementation, and ultimately open data impacts.
Back in June last year, at the Web Foundation we launched a global expert survey to help address that question. All-in-all we collected data covering 77 countries, representing every region, type of government and level of development, and asking about government, civil society and business readiness to secure benefits from open data, the actual availability of key datasets, and observed impacts from open data. The results were striking: over 55% of these diverse countries surveyed had some form of open data policy in place, many with high-level ministerial support.
The policy picture looks good. Yet, when it came to key datasets actually being made available as open data, the picture was very different. Less than 7% of the datasets surveyed in the Barometer were published both in bulk machine-readable forms, and under open licenses: that is, in ways that would meet the Open Definition. And much of this percentage is made up of the datasets published by a few leading developed states. When it comes to essential infrastructural datasets like national maps, company registers or land registries, data availability, of even non-open data, is very poor, and particularly bad in developing countries. In many countries, the kinds of cadastral records that are cited as a key to the economic potential of open data are simply not yet collected with full country coverage. Many countries have long-standing capacity building programmes to help them create land registries or detailed national maps - but many such programmes are years or even decades behind on delivering the required datasets.
The one exception where data was generally available and well curated, albeit not provided in open and accessible forms, was census data. National statistics offices have been the beneficiaries of years of capacity building support: yet the same programmes that have enabled them to manage data well have also helped them to become quasi-independent of governments, complicating whether or not they will easily be covered by government open data policies.
If the implementation story is disappointing, the impact story is even more so. In the Barometer survey we asked expert researchers to cite examples of where open data was reported in the media, or in academic sources, to have had impacts across a range of political, social and economic domains, and to score questions on a 10-point scale for the breadth and depth of impacts identified. The scores were universally low. Of course, whilst the idea of open data can no longer be claimed to be brand new, many country open data initiatives are - and so it is fair to say that outcomes and impacts take time - and are unlikely to be seen in any substantial way over the very short term. Yet, even in countries where open data has been present for a number of years, evidence of impact was light. The impacts cited were often hackathon applications, which, important as they are, generally only prototype and point to potential impacts. Without getting to scale, few demo applications alone can deliver substantial change.
Of course, some of this impact evidence gap may also be down to weaknesses in existing research. Some of the outcomes from open data publication are not easily picked up in visible applications or high profile news stories. That's where the need for a qualitative research agenda really comes in.
Part 2: The Open Data Barometer
The Open Data Barometer is just one part of a wider open data programme at the World Wide Web Foundation, including the Open Data in Developing Countries research project supported by Canada's International Development Research Centre. The main focus of that project over the last 12 months has been on establishing a network of case study research partners based in developing countries, each responding to both local concerns, and a shared research agenda, to understand how open data can be put to use in particular decision making and governance situations.
Our case study partners are drawn from universities, NGOs and independent consultancies, and were selected from responses to an open call for proposals issued in mid 2012. Interestingly, many of these partners were not open data experts, or already involved in open data - but were focussed on particular social and policy issues, and were interested in looking at what open data meant for these. Focus areas for the cases range from budget and aid transparency, to higher education performance, to the location of sanitation facilities in a city. Together, these foundations give the research network a number of important characteristics:
Firstly, whilst we have a shared research framework that highlights particular elements that each case study seeks to incorporate - from looking at the political, social and economic context of open data, through to the technical features of datasets and the actions of intermediaries - cases are also able to look at the different constraints exogenous to datasets themselves which affect whether or not data has a chance of making a difference.
Secondly, the research network works to build critical research capacity around open data - bringing new voices into the open data debate. For example, in Kenya, the Jesuit Hakimani Trust have an established record working on citizens' access to information, but until 2013 had not looked at the issue of open data in Kenya. By incorporating questions about open data in their large-scale surveys of citizen attitudes, they are starting to generate evidence that treats open data alongside other forms of access to information for poor and marginalised citizens, generating new insights.
Thirdly, the research is open to unintended consequences of open data publication: good and bad - and can look for impacts outside the classic logic model of 'data + apps = impact'. Indeed, as researchers in both Sao Paulo and Chennai have found, they have, as respected research intermediaries exploring open data use, been invited to get involved with shaping future government data collection practices. Gisele Craveiro from the University of Sao Paulo uses the metaphor of an iceberg to highlight the importance of looking below the surface. The idea that opening data ultimately changes what data gets collected, and how it is handled inside the state, should not be an alien idea for those involved in IATI - which has led to many aid agencies starting to geocode their data. But it is a route to effects often underplayed in explorations of the changes open data may be part of bringing about.
Part 3: Emerging findings
As mentioned, we've spent much of 2013 building up the Open Data in Developing Countries research network - and our case study partners are right now in the midst of their data collection and analysis. We're looking forward to presenting full findings from this first phase of research towards the summer, but there are some emerging themes that I've been hearing from the network in my role as coordinator that I want to draw out. I should note that these points of analysis are preliminary, and are the product of conversations within the network, rather than being final statements, or points that I claim specific authorship over.
We need to unpack the definition of open data.
Open data is generally presented as a package with a formal definition. Open data is data that is proactively published, in machine-readable formats, and under open licenses. Without all of these, there isn't open data. Yet, ODDC participants have been highlighting how the relative importance of these criteria varies from country to country. In Sierra Leone, for example, machine-readable formats might be argued to be less important right now than proactive publication, as for many datasets the authoritative copy may well be the copy on paper. In India, Nigeria or Brazil, the question of licensing may be moot: as it is either assumed that government data is free to re-use, regardless of explicit statements, or local data re-users may be unconcerned with violating licenses, based on a rational expectation that no-one will come after them.
Now - this is not to say that the Open Definition should be abandoned, but we should be critically aware of its primary strength: it helps to create a global open data commons, and to deliver on a vision of 'Frictionless data'. Open data of this form is easier to access 'top down', and can more easily be incorporated into panopticon-like development dashboards, but the actual impact on 'bottom up' re-use may be minimal. Unless actors in a developing country are equipped with the skills and capacities to draw on this global commons, and to overcome other local 'frictions' to re-using data effectively, the direct ROI on the extra effort to meet a pure Open Definition might not accrue to those putting the effort in: and a dogmatic focus on strict definitions might even in some cases slow down the process of making data relatively more accessible. Understanding the trade-offs here requires more research and analysis - but the point at least is made that there can be differences of emphasis in opening data, and these prioritise different potential users.
Supply is weak, but so is demand.
Talking at the Philippines Good Governance Summit a few weeks ago, Michael Canares presented findings from his research into how the local government Full Disclosure Policy (FDP) is affecting both 'duty bearers' responsible for supplying information on local budgets, projects, spend and so on, and 'claim holders' - citizens and their associations who seek to secure good services from government. A major finding has been that, with publishers being in 'compliance mode', putting up the required information but not in accessible formats, citizen groups articulated very little demand for online access to Full Disclosure Policy information. Awareness that the information was available was low, interest in the particular data published was low (that is, information made available did not match with any specific demand), and where citizen groups were accessing the data they often found they did not have the knowledge to make sense of or use it. The most viewed and downloaded documents garnered no more than 43 visits in the period surveyed.
In open data, as we remove the formal or technical barriers to data re-use that come from licenses and non-standard formats, we encounter the informal hurdles, roadblocks and thickets that lie behind them. And even as those new barriers are removed through capacity building and intermediation, we may find that they were not necessarily holding back a tide of latent demand - but were rather theoretical barriers in the way of a progressive vision of an engaged citizenry and innovative public service provision. Beyond simply calling for the removal of barriers, this vision needs to be elaborated - whether through the designs of civic leaders, or through the distributed actions of a broad range of social activists and entrepreneurs. And the tricky challenge of culture change - changing expectations of who is, and can be, empowered - needs to be brought to the fore.
Innovative intermediation is about more than visualisation.
Early open data portals listed datasets. Then they started listing third party apps. Now, many profile interactive visualisations built with data, or provide visualisation tools. Apps and infographics have become the main thing people think of when it comes to 'intermediaries' making open data accessible. Yet, if you look at how information flows on the ground in developing countries, mobile messaging, community radio, notice boards, churches and chiefs' centres are much more likely to come up as key sites of engagement with public information.
What might open data capacity building look like if we started with these intermediaries, and only brought technology in to improve the flow of data where that was needed? What does data need to be shaped like to enable these intermediaries to act with it? And how do the interests of these intermediaries, and the constituencies they serve, affect what will happen with open data? All these are questions we need to dig into further.
I said in the opening that this would be a presentation of critical reflections. It is important to emphasise that none of this constitutes an argument against open data. The idea that government data should be accessible to citizens retains its strong intrinsic appeal. Rather, in offering some critical remarks, I hope this can help us to consider different directions open data for development can take as it matures, and that ultimately we can move more firmly towards securing impacts from the important open data efforts so many parties are undertaking.