The Justice Srikrishna Committee has become so mired in privacy concerns that it has severely curtailed the potential benefits of Big Data and AI
One of the most exciting promises that the Justice Srikrishna Committee held out was that the data protection framework it suggested would protect individual privacy while ensuring that the digital economy flourished. It claimed that in doing so it would chart a path distinct from the US, the European Union and China, one that was finely tuned to the new digital economy. If it was going to deliver on this, its biggest challenge was going to be designing its privacy framework to address both the promises and challenges of Artificial Intelligence and Big Data.
As I read through the report, I was glad to note that the committee had devoted considerable space to the subject. While discussing the principles of collection and purpose limitation, the committee observed that the purposes for which Big Data applications use data only become evident at a later point and that it is, therefore, impossible to stipulate a purpose in advance. As a result, the committee had noted that “limiting collection is antithetical to large-scale processing; equally, meaningful purpose specification is impossible with the purposes themselves constantly evolving”. This is the most succinct analysis of the privacy issue central to the regulation of Big Data technologies that I have read. It gave me hope that the report would articulate a solution that achieved this fine balance.
However, other than vaguely suggesting that personal data should be processed in a manner that does not result in a decision being taken about an individual and, where it does result in such a decision, that explicit consent should first be obtained, the report does not provide any new or innovative solution to the concerns that it so eloquently articulated. The accompanying draft Personal Data Protection Bill, 2018 retains the principles of collection and purpose limitation, departing not a whit from the formulation commonly found in most data protection laws. Despite recognizing the many benefits of Big Data and the need to encourage its growth, the committee has offered no useful suggestions as to what should be done.
I had hoped that the committee would encourage the use of de-identified data sets by suggesting that companies that design their systems to de-identify data would be exempted from some of the provisions of the law. This would have encouraged organizations to incorporate privacy into the design of their systems from the ground up. At the same time, it would have generated valuable data sets that could be of use in Big Data applications. Instead, the committee seems to have gotten itself so mired in concerns around the possibility of re-identification that it has exempted from the law only data that has been irreversibly de-identified.
I am sceptical as to whether there can ever be such a thing as completely irreversible anonymization. Experience has shown that machine-learning algorithms are able to derive personal insights from even the most thoroughly anonymized data sets. Instead of prescribing an impossible standard, the committee would have done well to place the onus of ensuring anonymity on the entity responsible for maintaining these anonymized data sets, allowing it exemptions from its privacy obligations only if it could demonstrate that its use of these data sets does not compromise the identity of any individual. Should technology evolve to the point where it is capable of re-identifying individuals in such databases, it would be that entity's responsibility to upgrade its solutions to ensure that anonymity is maintained despite these new advances. As an added advantage, the individuals in these anonymous data sets could, if they wished, consent to being re-identified in order to partake of the benefits that being part of that data set offers them.