In recent years, governments around the world, including in the UK, have been making their data available via portals and APIs, seeding improvements in public services as well as innovative new products. Could the commercial sector learn from this approach to making non-personal data public? Some pioneers are showing how opening up enterprise data can add value for the originating company.
The beginnings of Open Data
The Open Data movement has been around for decades. It advocates that scientific and public datasets be made freely available so that the latest technologies can be applied to gain insight and enable further social and economic development.
As early as the 1950s, large numbers of scientists were calling for scientific datasets to be made accessible to all. They saw the potential not only for verifying scientific claims, but also for new discoveries resulting from bringing different data sources together. The earth sciences led the way: to make scientific claims on a global scale, researchers needed access to meteorological, seismological and other data from all over the world, and so the World Data Center system was established in 1958.
By the time the Human Genome Project kicked off in 1990, open data was a core component. Over the 13 years that project ran, the internet made the process of publishing and accessing open data much faster, easier and cheaper. The (almost) complete human DNA sequence is now available to anyone on the internet, promising benefits to virologists, oncologists, pharmacologists, forensic scientists, and more.
In 2004, all the OECD countries signed up to make all their publicly funded scientific research data freely accessible.
Opening up government data
Ahead of the curve, Prime Minister Gordon Brown introduced the Open Government Licence in the UK in 2010, as well as the data.gov.uk website, a repository of public sector datasets. This made huge amounts of data, previously only accessible within government, available to anyone with a web browser. The idea was that the data would spur the generation of value by helping businesses, improving public services and empowering citizens to make data-driven decisions.
“It’s such an untapped resource,” said World Wide Web inventor Sir Tim Berners-Lee, who oversaw the project. “Government data is something we have already spent the money on… and when it is sitting there on a disk in somebody’s office it is wasted.”
The UK currently ranks in second place on the World Wide Web Foundation’s Open Data Barometer, which orders governments based on the openness and accessibility of their data.
Initially, there were just a handful of services built on the data.gov.uk datasets (one for reporting road hazards using ONBS location data, for example, and one for finding planning applications on local authority websites). Now there are countless apps and services making use of over 45,000 government datasets, and many of those apps and services are provided by the commercial sector.
One of the first examples to spring to mind is the Citymapper app, which uses Transport for London (TfL) data to help users navigate their way around the capital easily, showing them the fastest routes, the cost of using public transport, the number of calories burned by walking or cycling, and more. Having trained their algorithm on London transport data, the company was then able to apply it to datasets from other municipalities. The app is now live in 39 cities around the world and, testifying to the power of open data for kickstarting economic activity, has much bigger plans. From Forbes’ coverage of The Telegraph Smart Cities Conference last month:
“Omid Ashtari explained how Citymapper is going from an app company that looks at data to running transport in its war against the single occupancy car. Citymapper has become incredibly adept at cleaning data, amalgamating it from multiple sources and combining it with data generated from its own users movements and has monetised this with “Smartride” a cross between a bus and a taxi. You don’t book the vehicle but a seat within it, and it doesn’t come to your door but meets you at a street corner close to where you are but without diverting too much from the journey the other occupants are making.”
One of the clearest examples of how open data can help generate economic growth is the Global Positioning System, developed by the US Department of Defense in the 1970s and originally intended only for military use. As it was made increasingly available for civilian use over the following decades, almost every industry came to rely on GPS data in one way or another, to the point where it is estimated that discontinuing the service would destroy $96 billion of value.
In the developing world, open data can have an even more transformative, and sometimes life-saving, role. The response to the 2014 Ebola outbreak in West Africa, for example, relied heavily on open datasets for mapping the threat and coordinating resources. In Ghana, a company called Esoko is using open government data, in combination with other sources, to help small scale farmers level the playing field in negotiations with buyers and get a better price for their produce.
With the explosion of Internet of Things (IoT) sensors, the amount of data available to governments, both local and national, will increase exponentially. While this creates the opportunity for data-driven solutions to a range of urban and infrastructure planning challenges, it also raises privacy questions. The Open Data Institute is piloting two ‘data trust’ projects as a potential way to increase access to data while safeguarding the privacy of the individuals who create it.
The case for open commercial data
The advantages to the public sector of opening up data are clear: the value created by commercial or civil society organisations using and augmenting the datasets is returned to government in the form of higher economic growth, higher tax revenues and lowered administrative burdens. Could the commercial sector also benefit from this virtuous circle, which leads from data to insight to value and back to data? Some innovative organisations are already applying open data licences to selected datasets to reap the benefits of engaging the hive mind.
Nike created the Materials Sustainability Index (MSI), a database that lets the company compare the sustainability of production materials from a huge range of suppliers, and made it available through a publicly accessible API. It has since been picked up by the Sustainable Apparel Coalition and generates value for the entire industry, helping Nike keep up with increasing demand for environmentally sound products.
Since 2009, The Guardian has published raw data, including all Guardian content, via a public API. This allows app developers to serve content in return for carrying the newspaper’s advertising, which provides an additional source of revenue.
Media and information company Thomson Reuters’ solution to an internal problem – connecting datasets from around the organisation – has now been published under an open licence. This allows customers to benefit from a permanent, machine-readable identifier that provides a unique reference for a wide variety of entity types including organisations, instruments, funds, issuers and people. It also helps embed those customers in the Thomson Reuters ecosystem of products.
If you’re wondering how much value an eye for openness can really generate, consider the story of Amazon. The astronomical growth of the company from minor bookseller to world-leading cloud computing company was fueled in no small part by a mandate issued to staff by Jeff Bezos around 2002. According to a former Amazon engineer, it went something like this:
- All teams will henceforth expose their data and functionality through service interfaces
- Teams must communicate with each other through these interfaces
- There will be no other form of inter-process communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network
- It doesn’t matter what technology they use
- All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
The mandate ended with this line:
“Anyone who doesn’t do this will be fired. Thank you; have a nice day!”
While most businesses were still transferring data between teams using spreadsheets and email, Bezos was building a decoupled and dynamic network of services, gaining the skills and knowledge required to build the Amazon Web Services platform. That business brought in $6.11 billion in revenue for Amazon in Q2 2018.
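To make the mandate a little more concrete, here is a minimal Python sketch of two teams talking only through a service interface. Everything in it is hypothetical and for illustration only: the `StockService` contract, the `InventoryService` and `OrderService` classes, and the SKU names are assumptions, not anything Amazon has published. The network hop is also left out to keep the example short; JSON strings stand in for the wire format.

```python
import json
from typing import Protocol


class StockService(Protocol):
    """Hypothetical published contract: the only route to inventory data."""

    def handle(self, request_json: str) -> str:
        ...


class InventoryService:
    """Owned by a hypothetical inventory team."""

    def __init__(self) -> None:
        # Private data store: no other team reads this directly.
        self._stock = {"book-123": 4, "book-456": 0}

    def handle(self, request_json: str) -> str:
        # Requests and responses are JSON strings, so the same interface
        # could later be exposed over HTTP to outside developers.
        request = json.loads(request_json)
        sku = request["sku"]
        if sku not in self._stock:
            return json.dumps({"sku": sku, "error": "unknown_sku"})
        return json.dumps({"sku": sku, "available": self._stock[sku]})


class OrderService:
    """Owned by a hypothetical orders team; knows only the contract."""

    def __init__(self, stock_service: StockService) -> None:
        self._stock_service = stock_service

    def can_fulfil(self, sku: str, quantity: int) -> bool:
        # The only way to learn about stock is a call through the interface.
        reply = self._stock_service.handle(json.dumps({"sku": sku}))
        response = json.loads(reply)
        return response.get("available", 0) >= quantity


orders = OrderService(InventoryService())
print(orders.can_fulfil("book-123", 2))  # True
print(orders.can_fulfil("book-456", 1))  # False
print(orders.can_fulfil("book-999", 1))  # False
```

Because the orders team depends only on the contract, the inventory team is free to change how it stores its data, and the same interface could in principle be opened to external developers without redesign.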
Look at the data your teams are generating and using. The chances are that there’s additional value lying hidden in that data, just waiting to be unlocked.