Sharing data: the 4D competition law model for determining data value

At a hackathon organized by EY in Chicago in 2018, three data ‘worlds’ were modeled.[1] Across all three models, one of the common factors was “who owns and controls data will be a critical pivot point”. Competition law is one of the key lenses through which this issue should be viewed.[2] Our competition law analysis identifies a 4D model that gives greater insight into the potential value of data, because competition law varies the degree of commercial freedom to monetize data.

Now is the time to use the 4D model because we have entered Phase 2 in the evolution of data valuation.

“Attitudes to data privacy and ownership will change as the value of data increases. Companies that understand the future of data now, can position themselves to win.”

Robert Holston

EY Global Consumer Industries Advisory Leader

4D model – 1st dimension

Data can be personal or non-personal. The EU’s General Data Protection Regulation (GDPR) governs personal data, namely, data relating to an individual person. Not only is the GDPR relevant to the ability to obtain and use the data, but the data portability right under the GDPR is important to the ability of a company to continue to hold the data.[3] Examples of non-personal data are data relating to a company or a machine. Other rules apply to non-personal data and to the ability of companies to access such data. For example, data are likely to become increasingly available from public sources.[4]

The GDPR permits the data holder to process the data only under certain conditions. Two of those are that the end consumer has given consent or that the data holder has a legitimate purpose. Both conditions, but particularly the second, are likely to evolve in meaning as further guidance is received and as the courts rule on contentious issues. Competition law is likely to have a role where the data holder is in a dominant position. Two cases decided by the German competition authority lead to the conclusion that a dominant entity will be held to a higher standard when determining whether consent has been granted and whether, using a balance of interest test, there is a legitimate purpose.

4D model – 2nd dimension

Data can relate only to a single user (so clearly personal data). It can be bundled data relating to a person but anonymized or pseudonymized, for example, the data generated by a person’s car use. Finally, it can be aggregate-level data, again anonymized but this time relating to a group of data generators, for example, the movement of cars travelling in a town on a particular day.

4D model – 3rd dimension

The data can be volunteered by the owner, as typically occurs when end consumers use an online platform. The data can be observed, namely, captured by recording the actions of individuals, for example, location data when using cell phones. Finally, the data can be inferred, namely, data about individuals based on analysis of volunteered or observed data, for example, credit scores.[5]

The GDPR right of portability applies to volunteered data. It is unclear whether it applies to observed data, while commentators generally identify that it does not apply to inferred data. The right to portability has parameters that should avoid so-called lock-in of an end consumer with a particular data holder. However, there are some practical elements that will create friction to data portability. Commentators suggest that data holders that hold a dominant position will be held to a more stringent regime.
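
The portability position described above can be summarized as a simple lookup. The following Python sketch is illustrative only: the names Origin, PortabilityStatus and portability_status are hypothetical, and the mapping merely restates the paragraph above. It is not a legal test.

```python
from enum import Enum


class Origin(Enum):
    """Third dimension of the 4D model: how the data came into being."""
    VOLUNTEERED = "volunteered"
    OBSERVED = "observed"
    INFERRED = "inferred"


class PortabilityStatus(Enum):
    """Hypothetical labels for whether the GDPR portability right applies."""
    APPLIES = "applies"
    UNCLEAR = "unclear"
    GENERALLY_NOT = "generally considered not to apply"


# Restates the text: volunteered -> applies; observed -> unclear;
# inferred -> commentators generally consider that it does not apply.
PORTABILITY_BY_ORIGIN = {
    Origin.VOLUNTEERED: PortabilityStatus.APPLIES,
    Origin.OBSERVED: PortabilityStatus.UNCLEAR,
    Origin.INFERRED: PortabilityStatus.GENERALLY_NOT,
}


def portability_status(origin: Origin) -> PortabilityStatus:
    """Look up the portability position for a given data origin."""
    return PORTABILITY_BY_ORIGIN[origin]


if __name__ == "__main__":
    for origin in Origin:
        print(f"{origin.value}: {portability_status(origin).value}")
```

A lookup of this kind is only a starting point; the position for observed data in particular may change as guidance and case law develop.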

As regards the Internet of Things (IoT), it is observed data and inferred data that likely have the highest value. Novel use of such data may lead to novel products and services. Whether or not these will be ‘new’ markets and whether such market participants will be found under competition rules to hold market power and even hold a dominant position remains to be seen.

4D model – 4th dimension

The data can be accessed, for example, by advertisers seeking to determine their best advertising strategy. The data can be traded, which by implication means unambiguously sold or transferred, such that the original holder no longer holds a copy of the data. Finally, the data can be shared, which implies a form of partial sale or license, with the original holder retaining a copy. Data sharing that is reciprocal, with each party sharing data with the other, is also known as data pooling.

Data sharing that includes competitively sensitive information may constitute a breach of competition law. The precise nature of what is competitively sensitive information will require expert analysis. The benefits of data sharing or pooling are clear, as existing examples in the insurance and banking sectors show (such as creditworthiness registries). Care and attention will need to be given to the design and implementation of a data pool to ensure compliance with competition law.

An entity in a dominant position might be required to grant access to the data it holds. The questions to consider here are whether access to the data is essential and whether, on balance, access should be granted. These questions arise under the “essential facilities doctrine” of competition law. A novelty to consider is that, unlike traditional essential facility cases, which concerned infrastructure and then intellectual property rights, there is currently no legal recognition under EU law of a general property right in data. This might lead to a lower standard, allowing easier access to data held by a dominant entity. Another novel point relates to a sub-element of the doctrine that considers the issue of substitutability. In the present context, could other data sources that are currently accessible to the requesting entity substitute for (that is, replicate) the data requested? This might be the case for volunteered data, but is less likely for observed data and unlikely for inferred data.

For personal data, consent under EU law will be required for any third party to access the data. Over time the emergence of Personal Information Management Systems[6] and/or data collecting societies for personal data may facilitate access. Finally, real-time or very frequent access data, such as data relating to the movements of a car, are likely to be less substitutable than slow moving or non-dynamic data (for example, the age of a car owner).

Assessment

The value of data held by dominant entities might not be as high as normal market forces would suggest because:

competition law could require the data to be made available to third parties;
in relation to personal data, a dominant entity may face stricter rules in relation to its ability to process personal data; and
at least until an established set of rules or modus operandi exists that is approved by competition authorities, every data access request will likely be an expensive exercise for the dominant entity because refusal will likely be contested.

The value of the data held by a non-dominant company may be higher than expected as the company should be able to require a fee not only for the data per se but an additional fee for the administrative work involved in granting access. Such administrative fees might not be immaterial because ensuring data interoperability (in other words, the data shared with the recipient is in a usable form) will likely be costly to put in place, particularly for real-time data. Additionally, a non-dominant company may have the right to require a dominant company to share its data. While a fee will be payable, this right enables the non-dominant company to build a broader and deeper data bank.

The value of data may be lower than expected where the information includes competitively sensitive information, because the information likely cannot benefit from being in a data pool. In any event, there will be costs imposed as a result of the need to ensure compliance with competition rules, such as the creation and monitoring of robust filters.

Each of the four dimensions of the 4D model is a relevant metric for analyzing the data market, as are the elements within each of the four dimensions. This means there is the possibility of many different market definitions. A business may not hold a dominant position, for example, in relation to the personal, single user data it holds, but could be deemed to hold a dominant position in relation to the aggregated volunteered data. The value of the aggregated volunteered data will be reduced because the business will need to grant access to third parties on fair, reasonable and non-discriminatory terms.
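
To illustrate how quickly candidate market definitions multiply, the four dimensions can be written down as a small data structure. The Python sketch below is an assumption-laden illustration, not part of the 4D model itself: the class and function names (Subject, Granularity, Origin, Transfer, DataSegment, all_segments) are hypothetical, and the element lists simply restate the categories described in the four dimension sections.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product
from typing import List


class Subject(Enum):          # 1st dimension
    PERSONAL = "personal"
    NON_PERSONAL = "non-personal"


class Granularity(Enum):      # 2nd dimension
    SINGLE_USER = "single user"
    BUNDLED = "bundled"
    AGGREGATED = "aggregated"


class Origin(Enum):           # 3rd dimension
    VOLUNTEERED = "volunteered"
    OBSERVED = "observed"
    INFERRED = "inferred"


class Transfer(Enum):         # 4th dimension
    ACCESSED = "accessed"
    TRADED = "traded"
    SHARED = "shared"


@dataclass(frozen=True)
class DataSegment:
    """One hypothetical cell of the 4D model, i.e. a candidate data market."""
    subject: Subject
    granularity: Granularity
    origin: Origin
    transfer: Transfer


def all_segments() -> List[DataSegment]:
    """Enumerate every combination of the four dimensions."""
    return [
        DataSegment(*combo)
        for combo in product(Subject, Granularity, Origin, Transfer)
    ]


if __name__ == "__main__":
    segments = all_segments()
    print(len(segments))  # 2 * 3 * 3 * 3 combinations
```

Under these assumptions the enumeration yields 54 combinations. That figure is an upper bound rather than a count of real markets, since some combinations may not arise in practice (single user data, for instance, is described above as clearly personal), and a dominance assessment would still need to be made segment by segment.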

Using the 4D model can help businesses understand the elements which will influence the value of data, minimize errors in value determination and so facilitate optimization of data monetization.

“The value of data will be significantly sensitive to whether a data holder is deemed to hold market power (dominant position) and the extent to which the data is shared, voluntarily or by rule of law. Market definition will be critical. Companies that use the 4D competition law model to improve their understanding of the value of data now, can position themselves to win.”

Kiran Desai

EU Competition Law Leader

[1] See article Will consumers share their data without a share in its value?, 3 July 2018 by Robert Holston, EY Global Consumer Industries Advisory Leader, which can be found at https://www.ey.com/en_gl/growth/will-consumers-share-their-data-without-a-share-in-its-value

[2] Others include the EU’s General Data Protection Regulation (‘GDPR’), Article 8 of the Charter of Fundamental Rights of the European Union, and the foreseen roles of data collecting societies and Personal Information Management Systems (infra footnote 6).

[3] The original data holder is required under the GDPR to have a process to expunge historic data, such as data relating to former customers, from its systems.

[4] Greater access will come as a result of the application of the Public Sector Information Directive (2003/98/EC as amended).

[5] An alternative expression is that inferred data are the product of probability-based analytic processes. They are a result of the detection of correlations which are used to create predictions of behavior.

[6] https://ec.europa.eu/digital-single-market/en/news/emerging-offer-personal-information-management-services-current-state-service-offers-and