Who should govern our data?

Joan Sony Cherian


Data should not be the exclusive property of one person or company but a form of social commons which needs to be properly redistributed

ABSTRACT

Gurumurthy, A., Chami, N. (2022). Governing the Resource of Data: To what end and for Whom?, Data Governance Network Working Paper 23.

When the Internet was just about taking off, techno-enthusiasts ushered in hopes of a world where knowledge and information would be abundant and free from the hands of the elite. The Internet would be democratic, giving everyone open and equitable access.

Yet almost two decades later, this dream is barely alive. The Internet and its data lie almost exclusively in the hands of very large tech companies. The ownership of data has become the currency of the future. Many governments and inter-regional entities have been trying to undo this indiscriminate accumulation of data, and to protect data privacy, by conceptualising different forms of data governance systems. The authors of the paper, “Governing the Resource of Data: To what end and for Whom?”, analyse the different approaches to data governance. They argue that data needs to be fundamentally understood as a form of social commons and that the data economy needs a thorough restructuring. They propose a semi-commons approach as the most pragmatic way to govern data in order to foster innovation and ensure equitable access.

Platform capitalists

A handful of ‘platform capitalists’ now control the Internet. Platform capitalists are companies (Meta, Amazon, Microsoft, etc.) that exploit their first-mover advantage by rapidly expanding across the digital landscape and then offer themselves, for a price, as a platform for third-party players. These companies retain and expand their control through data accumulation and extraction.

The importance of data accumulation in the digital economy cannot be overstated. With the advent of the Internet of Things (IoT), ‘smart’ devices and related technologies, the reach of data extends beyond the virtual to the physical and the social. Control over such data can even be used to predict behavioural patterns. The unbridled control that platform capitalists hold over the data economy leads to exclusion and leaves data under-used for the common good. It also forestalls the prospects of smaller businesses and data communities. Additionally, since most of these companies are based in the West, developing countries are left to fend for themselves, shut out of the immense possibilities and uses of their own data.

The commodification of data has led to a finders-keepers logic which undermines human rights and encourages illegal data mining and profiling.

State regulators have been trying to find better ways to govern and redistribute data. An approach currently in vogue is the European Union’s individualist policy, under which individuals have ownership rights over their own personal data (out of concern for privacy), while non-personal data (data that carries no personal identifiers) is seen as the property of the data processors/collectors. There are multiple issues with this approach. First, the assumption that non-personal data carries no privacy risk is flawed. To quote the authors’ example, the data collected by smart energy systems and by temperature and motion sensors seems harmless. But as it moves up the data value chain, and is combined with other data sets, it allows smart home manufacturers to infer socio-behavioural insights that can profile individual households. Second, letting data collectors hold ownership rights over non-personal data keeps data within the bounds of the finders-keepers logic. Finally, the approach offers no answer to how data can be equitably redistributed.

Another approach is that of data stewardship. Data stewardship “refers to any institutional arrangement where a group of people come together to pool their data and put in place a collective governance process for determining who has access to this data, under what conditions, and to whose benefit.” It can also take the form of a public-private partnership in which private data is used to inform governance and policy. The EU’s proposal for “data altruism organisations”, which would enable the pooling of non-personal data for non-profit, “general interest” purposes, and the World Economic Forum’s ‘Data for Common Purpose’ initiative are plausible examples of such an arrangement. The goal of such privacy-focused data forums is to increase data-based value creation. While this would be a marked improvement on platform capitalism, it remains to be seen whether these collectives can really unlock data’s potential. For one, such initiatives would need state-of-the-art infrastructure. Most countries in the Global South would then be at a huge disadvantage, as they lack adequate equipment and resources. Data stewardship therefore remains ideal in principle but not exactly pragmatic.

The semi-commons approach

Data has three layers: the semantic layer, which holds the encoded information; the syntactic layer, which represents that information as machine-readable datasets; and the physical layer, which is the infrastructure through which data is extracted. An ideal data governance structure should prevent the possessors of the syntactic and physical layers from having exclusive rights over the semantic layer.

A semi-commons approach to data governance seeks to balance public and private claims to data. It fundamentally recognises data as social commons where first movers do not get exclusive rights.

Data holders and seekers

Data holders, whether private, public, or altruistic organisations, can have only non-exclusive rights over the base layer of data (raw, unprocessed data). They can use it and profit from it, but they are required to share it, because other data seekers are entitled to access under a semi-commons approach. Data seekers can have access to raw non-personal data and aggregate non-personal data (after due safeguards are met for irreversible anonymisation). However, this access is not an unconditional right. Different data seekers have different rights depending on the kind of data being sought.

For example, individual data subjects can access their own personal and non-personal data. Public agencies have ‘authority access’ to the raw non-personal data and aggregate non-personal data held by private players. Authority access refers to “entitlements of public agencies to access data on the grounds of fulfilling legitimate public policy functions, backed by specific legislation”. Private organisations can conditionally access raw and aggregate non-personal data. These conditions will have to be aligned with the larger economic and social policies of a country.
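The access rules described above can be pictured as a small lookup table. What follows is only an illustrative sketch, not anything taken from the working paper: the names `Seeker`, `DataKind`, `RULES` and `may_access` are hypothetical, the condition labels are shorthand for the conditions mentioned in the text, and the sketch assumes that any combination the article does not mention is refused. It also assumes that an individual's non-personal data means raw non-personal data about that individual, which the article does not spell out.

```python
from enum import Enum


class Seeker(Enum):
    INDIVIDUAL = "individual data subject"
    PUBLIC_AGENCY = "public agency"
    PRIVATE_ORG = "private organisation"


class DataKind(Enum):
    PERSONAL = "personal"
    RAW_NON_PERSONAL = "raw non-personal"
    AGGREGATE_NON_PERSONAL = "aggregate non-personal"


# Each (seeker, data kind) pair maps to the single condition that must hold
# before access is allowed. Pairs missing from the table are refused
# (an assumption of this sketch, not a statement from the paper).
RULES = {
    (Seeker.INDIVIDUAL, DataKind.PERSONAL): "own_data",
    (Seeker.INDIVIDUAL, DataKind.RAW_NON_PERSONAL): "own_data",
    (Seeker.PUBLIC_AGENCY, DataKind.RAW_NON_PERSONAL): "specific_legislation",
    (Seeker.PUBLIC_AGENCY, DataKind.AGGREGATE_NON_PERSONAL): "specific_legislation",
    (Seeker.PRIVATE_ORG, DataKind.RAW_NON_PERSONAL): "policy_conditions_met",
    (Seeker.PRIVATE_ORG, DataKind.AGGREGATE_NON_PERSONAL): "policy_conditions_met",
}


def may_access(seeker, kind, satisfied=frozenset()):
    """Return True only if a rule exists and its condition has been satisfied."""
    required = RULES.get((seeker, kind))
    return required is not None and required in satisfied


if __name__ == "__main__":
    # A public agency with a legislative basis asks for raw non-personal data.
    print(may_access(Seeker.PUBLIC_AGENCY, DataKind.RAW_NON_PERSONAL,
                     {"specific_legislation"}))
    # A private organisation that has met no conditions is refused.
    print(may_access(Seeker.PRIVATE_ORG, DataKind.AGGREGATE_NON_PERSONAL))
```

The point of the table form is that no seeker holds a blanket right: every entitlement is tied to a condition, which mirrors the article's remark that access is not unconditional.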

A semi-commons approach would need a thorough re-ordering of the current way in which data is hoarded and kept under the exclusive ownership of platform capitalists. One would need to build an equitable data market which encourages production through co-operation. The authors give the example of the Barcelona municipality which is building a smart city by creating a public-funded data infrastructure.

Here, the public is equipped with smart contracts and cryptographic tools which allow them to contribute data directly to the city data commons on their own terms. Local companies and co-operatives are also given access to the city data commons. The municipality also mandates that data be in a machine-readable format with open-source APIs. Furthermore, a semi-commons approach would help foster data-driven solutions and innovation in sectors which badly need them. For example, NITI Aayog has commented that the agriculture sector, which desperately needs more data-driven innovation, would draw only a muted response from private AI players because of its low profitability in comparison to other sectors.

Therefore, a semi-commons approach, in order to be actualised, calls for a thorough change of perspective wherein data is thought of not as the exclusive property of one person or company but as a form of social commons which needs to be properly regulated and redistributed.

THE GIST

The Internet and its data lie in the hands of very large tech companies. The ownership of data has become the currency of the future.

The importance of data accumulation in the digital economy cannot be overstated. With the advent of ‘smart’ devices and related technologies, the reach of data extends beyond the virtual to the physical and the social.

Public agencies have ‘authority access’ to the raw non-personal data and aggregate non-personal data held by private players.