Big data and the future for privacy

Phuah Eng Chye (18 January 2020)

Privacy advocates acknowledge the emergence of big data has diminished the ability to regulate privacy. Neil M. Richards and Jonathan King note “many technologists and futurists predict a digital future in which privacy has no place. Others argue that the benefits of open data and data science mean that certain kinds of privacy rules (like limitations on collection or deletion requirements) make privacy either an obstacle to progress or something highly impractical to enforce in our ubiquitous digital future. Data scientists consider privacy to be an obstacle to the kind of innovative work they want to do, while the leading manuals for data warehouse[1] engineers largely ignore considerations of privacy altogether. At the level of theory, then, privacy is an anachronism hostile to progress, while at the level of practice, it is just impractical and gets in the way of doing things”. Nonetheless, they think the “dismissal of privacy as a foolish anachronism is belied by both common sense and a small but growing scholarly and public literature about the importance of privacy for the kind of sustainable, humanist society we should want to build”.

In a big data landscape, Neil M. Richards and Jonathan King explain that the error is to equate privacy with secrecy, or to assume that information that is shared ceases to be private. In this regard, secrets are already “protected by a wide variety of legal, technological and operational tools, including criminal and contract law, encryption and other security tools, and the whole trade of spycraft. The same is true of corporations around the world that utilize confidentiality regimes, nondisclosure agreements, and trade secret protection to protect secret information”.

They observe that “much of the confusion about privacy law over the past few decades has come from the simplistic idea that privacy is a binary, on-or-off state, and that once information is shared and consent given, it can no longer be protected. Binary notions of privacy are particularly dangerous and can erode trust in our era of Big Data and metadata, in which private information is necessarily shared to some extent in order to be useful. The law has always protected private information in intermediate states, whether through confidentiality rules like the duties lawyers and doctors owe to clients and patients, evidentiary rules like the ones protecting marital communications, or statutory rules like the federal laws protecting health, financial, communications, and intellectual privacies…Understanding that shared private information can remain confidential better helps us see more clearly how to align our expectations of privacy with the rapidly growing secondary uses of big data”.

Neil M. Richards and Jonathan King conclude “we live in an information society, and privacy rules are the rules that govern the information in and out of networks and data sets in this society. Understanding privacy rules as information rules radically changes the questions we might ask about regulation of personal information in the big data context”. In this regard, while data specialists may consider information that is shared “to be public or at least non-private, and therefore beyond any regulation…it would leave us with an essentially lawless, anarchic and unsustainable information society. An information society with no rules has no protection against hackers, malicious code, data breaches, revenge porn, child pornography, cybercrime, or any of our other information age maladies”. Hence, the information rules should not be based on “privateness or publicness, but rather about what kinds of data uses and which kinds of information regulation support the kind of society we might want to live in and which ones do not”.

In their view, “privacy is instead an ethical approach to the management of information flows…ultimately a series of human questions that must be informed by human values…we must inquire exactly what those values should be”. From their perspective, “privacy is not a value in and of itself. Privacy is instead a tool that can be used to restrict access to data or to regulate decisions based upon data. (Analogously, the opposite idea of transparency is also a tool that can be used to shed light upon unknown activity.) But the decision about whether or not to impose an information rule, or what sort of rule to impose, must be made upon the basis of values other than privacy itself. Under our account, because privacy is such a vague and notoriously slippery concept, it is not a helpful concept upon which to base policy decisions. The values that privacy rules can serve, however, are useful”. While many values could be considered, Neil M. Richards and Jonathan King highlight “four such values – identity, equality, security and trust”.

  • Identity. They note research “suggesting that widespread surveillance affects our ability to form our identities ourselves. When it comes to surveillance of intellectual activities, there is good evidence that surveillance dulls our reading and thinking to the boring, the bland, and the mainstream…privacy shelters our ability to play, to engage in determination, and to manage our boundaries between our social selves and the world. Privacy of varying sorts – protection from surveillance or interference – is what enables us to define our identities. In turn, free identities enable free market economies and ultimately govern free democracies”.
  • Equality. “Data science allows firms to better understand their competitors and customers, and permits governments greater transparency over the activities of both noncitizens and citizens. Surveillance of this sort is usually not just to learn about others, but to learn in order to nudge, influence, prevent or control…it is data analytics that make the Internet’s increasingly specific profiling, targeting, and retargeting possible…Big data analytics permit efficient sorting, but the line between sorting and discrimination is a blurry and dangerous one…perfect price discrimination, in which the surplus of consumer transactions could potentially be retained entirely by the sophisticated merchants…Such uses of data science threaten an end-run around well-established legal principles that forbid government or private entities from engaging in intentional race, gender, or other forms of invidious discrimination…Such tools can of course be used to combat discrimination as well, by looking at hidden patterns of bias and denial of opportunity…Privacy rules placed upon big data – including restrictions on collection, algorithmic transparency and accountability, and restrictions on the use of analytics to sort and treat people differently – will be an essential element of the future of privacy law in a data science world…without meaningful new procedures and rules protecting equality, our commitment to equality risks being undermined by big data analytics”.
  • Security. “We have reached a point in our digital society where privacy cannot exist without security and security cannot exist without privacy…The increasing collection of sensitive digital information behind these walls, coupled with increasing vectors of attack have made companies more alluring and vulnerable…In a world in which digital information can be changed easily and surreptitiously, rules and technologies protecting data quality, privacy, and security simultaneously will be essential. More generally, by empowering end users with more privacy controls and employing tools which strengthen data integrity, the systems we access are strengthened as we use them. In this manner, privacy and data integrity are becoming a kind of fitness indicator for systems security”.
  • Trust. “Privacy also promotes trust – trust in systems, trust in networks, and trust in the relationships between individual people and the entities that hold their data. Many commentators have argued that privacy rules are somehow bad for business…missed the important insight that privacy rules can promote trust between users and platforms…When users trust that their information will not be misused or abused by their confidantes, they share more information, more freely, and more accurately with those confidantes…because the doctor promises and society enforces confidentiality, the patient shares more completely, and receives better care as a result…The legally-enforceable promise of privacy (or to be technical, confidentiality) promotes not only the welfare of both parties, but also promotes further trust between the parties, making both of them more likely to deal and share with each other in the future. This is the information-sharing function of confidentiality. Privacy rules therefore promote trust, a form of value creation…In a digital environment in which identity can be fluid and everything else is seemingly negotiable and up for grabs, privacy rules create trust, which in turn allows for long-term stable relationships to flourish. Such forms of economic and social sustainability will be essential in the digital economy, in order for individuals and corporations to share more information and take advantage of their new digital opportunities over the long term, but only as long as they can trust that their data will not be abused by the other parties in information relationships”.

Neil M. Richards and Jonathan King suggest there are “several ways in which privacy as information rules can (and is likely) to be effected in a digital networked society in order to supplement the necessary but diminished role that privacy self-management will retain”.

One is to regulate through a blanket data protection law or a formal data protection agency, for which there are two tracks. “The first will be procedural, reinforcing transparency of processing, the basis for algorithmic decisions (so-called ‘algorithmic accountability’), and providing meaningful notice of data practices and actual choice to opt-out of unwanted collection, use, or disclosure. But the second track must be substantive. Certain kinds of data collection, certain kinds of processing, and certain kinds of decisions based upon algorithmic outputs must be taken off the table. In particular, processing that threatens identity, equality, security, data integrity, and trust must be regulated and when necessary forbidden”.

Another approach is to establish consumer subject review boards or independent ethical boards to assess “the implications of big data for any entity that engages in sensitive consumer analytics at scale. Alternatively, the creation of a regulatory commission to oversee the fairness and honesty of big data practices (or the vesting of such authority in an entity like the Federal Trade Commission) might also be a sensible option”. “Given the problem of privacy unraveling, we should consider either prohibitions on certain kinds of disclosures that have an unraveling effect, or prohibitions on decisions based upon those criteria. As with other protections of meaningful human equality, procedural rules alone will be insufficient, and substantive prohibitions need to be part of the regulatory equation”.

Neil M. Richards and Jonathan King suggest “formal regulation along the lines of statutes or agency rulemaking will not be able to solve all of the problems we identify and represents at best a limited solution…political gridlock…over-regulate…a lag between innovation and regulation”. The most effective approach may be to establish or anoint an agency with regulatory oversight of (unfair and deceptive) data and investigating practices. In addition, “the global nature of the information economy” means actors will “increasingly fall within the regulatory authority of foreign data protection authorities”. For example, court judgements in one country can lead to subsequent litigation or affect practices in other countries. “Although judgments of these sorts rarely have extraterritorial application as a matter of law, they tend to have extraterritorial application as a matter of effect”. “The point here is not whether these controversial foreign judgments are correct, but to observe that precedent changing decisions are occurring and to consider their global effect regardless of their merits as they do”.

Neil M. Richards and Jonathan King also think “competition around privacy itself can also serve as a form of soft regulation”. Some tech companies invest “in privacy research and development”. “As privacy revelations continue to gain headlines, the initial reactions and resulting competitive dynamic offer promise to organically advance privacy” with firms voluntarily providing “users more insight and control over the data”.

“Our point here is that regulation in the digital, networked, global society can occur from a variety of perhaps unexpected sources, and that the failure of the U.S. Congress to pass an American data protection law does not stop other regulatory entrepreneurs such as state or foreign governments or unexpected federal regulators like the FTC from stepping into the regulatory vacuum to make or influence regulatory policy. Perhaps the most unexpected and encouraging of all is the emerging industry trend of innovating, advocating and competing for privacy, not just technology”.

Neil M. Richards and Jonathan King suggest “any meaningful system of privacy rules rest on technologies and social norms…it is essential that we develop some form of big data ethics…we must have a social conversation beyond merely the compulsions of legal rules in order to ensure that the tools of the new data science are employed in ways that are not merely effective and efficient, but ethical as well. Ethical rules need not lag the way that legal doctrine sometimes must. Big data ethics can be embraced into the professional ethos of cross functional leaders on the ground in a way that each can apply their own insights and expertise. By first having a shared vision about what big data ethics mean for their organization, leaders can exert a regulatory effect that emerges naturally around big data deployments, building trust and responsibly harnessing the full power of big data in the process…Big data ethics will be, in large part, the future for privacy”.

To summarise, Neil M. Richards and Jonathan King make several important propositions on the regulation of privacy. First, they propose that privacy rules should be perceived as information rules “that govern the information in and out of networks and data sets in this society”. Second, regulation should be based not on privacy itself but on “the values that privacy rules can serve…identity, equality, security and trust”. Finally, information rules should not be based on “privateness or publicness, but rather about what kinds of data uses and which kinds of information regulation support the kind of society we might want to live in and which ones do not”.


Michael D. Birnhack, Eran Toch, Irit Hadar (25 July 2014) “Privacy mindset, technological mindset”. 55 Jurimetrics 55-114.

Neil M. Richards, Jonathan King (19 October 2014) “Big data and the future for privacy”. Handbook of Research on Digital Transformations (Elgar 2016).

Phuah Eng Chye (21 December 2019) “The debate on regulating surveillance”.

Phuah Eng Chye (4 January 2020) “The economics and regulation of privacy”.

[1] Michael D. Birnhack, Eran Toch and Irit Hadar explore the gaps between privacy law and technology in the contexts of data warehousing and data science, and the challenges of implementing Privacy by Design (PbD), which aims to “embed privacy within the design of a new technological system”.