Open Government and the need to balance institutional transparency with individual privacy
Remarks at the XXVII German-Canadian Conference
October 1, 2012
Address by Jennifer Stoddart
Privacy Commissioner of Canada
(Check against delivery)
Before I begin, please let me thank the organizers of this forum for including the topic of Open Government so prominently within the proceedings. Put simply, this issue calls for what many say is the need to balance two values which are essential to democracy – privacy and access to information.
Personally, as someone who has served both as the head of Quebec's access to information and privacy commission and as federal Privacy Commissioner, I like to think that, rather than balance, the question before us is “how do we work to ensure ongoing mutual respect between these two important values?”
Today, the Internet is part of daily life in Canada and Germany along with other advanced democracies. More and more information is being furnished online and there is a natural demand among citizens for even more to be put at their fingertips. All told, the provision and management of Open Government is an important topic for the present and future of democracies like ours.
Balancing essential democratic values
In Canada, I believe we are well-positioned to face the future of open government, both institutionally and legally. Legally, the Supreme Court of Canada has described our Access to Information Act and our Privacy Act as a “seamless code with complementary provisions that can and should be interpreted harmoniously.”
Institutionally, Canada's federal and provincial access to information and privacy commissioners signed a joint resolution on Open Government in 2010. We came together to support open government as a means to enhance both accountability and transparency, while recognizing that transparency needs to respect privacy.
This statement was and is important, because the same technological ingenuity allowing anyone to access this vast amount of data on their laptop or mobile device also highlights the need for greater vigilance in respecting and protecting privacy.
Opportunity and challenge in the Information Age
Put another way, that which eases the burden of a focused researcher seeking specific information is also raising the risk of revelation by those who might seek to besmirch someone else's name or by those who are just simply nosy. As a former historian, I remember well the long hours I spent combing through archives seeking specific documents and gazing quizzically at projected sheets of microfiche. But of course, thanks to the power, convenience and intuitiveness of today's search engine technology, that scene is increasingly one of a bygone era.
From a privacy perspective, as this scene fades increasingly to yesteryear, so does the “practical obscurity” posed by paper records. To explain what I mean, let me turn to the case of AB vs Canada. This Federal Court case involved a man who was HIV positive and was granted Canadian citizenship following a judicial review. Years and years ago, the fact that his surname and health status were included in the court record wouldn't have mattered all that much. This decision, however, was posted online. And by the time he realized he could have it anonymized, the case had been cited by another judge and the man was then unable to have that record scrubbed.
In other words, years ago, to retrieve this sensitive personal information, someone would have had to make the effort to go down to the court house and find the file. Today, that information is just a few keystrokes away, anywhere, anytime.
Open Government in Canada today and the need for vigilance
This example springs from the courts, and we're not here to talk about the open court principle, but rather, open government. I just find that it is a powerful example underlining the privacy risks that can come with open data.
Our Office is, however, keeping a keen eye on developments, simply because the terrain on which open data rests is fast-moving territory. And the speed of its movement is quickened by the ever-growing amounts of data being created and the skyrocketing power of computing.
Typically, when datasets are released publicly, certain information is removed to make the data anonymous. But when such data is analyzed in combination with other publicly available information, there is a risk of re-identification – a risk that is constantly increasing thanks to the trends I just mentioned. Privacy scholar Paul Ohm has detailed this challenge in a paper provocatively entitled “The Failure of Anonymization.” In it, he wrote that “data can either be useful or perfectly anonymous but never both.” He has also noted that during the last decade, computers have gotten much faster…. But more importantly, the amount of information people volunteer about themselves online and especially on social networks has simply exploded.
To try and quantify what we call Big Data, consider a study by IDC, which estimated that more than 1.8 zettabytes of information would be created and stored during 2011. Just in case you're curious, 1.8 zettabytes of data is equivalent to 200 billion two-hour-long HD movies, which would take one person 47 million years - uninterrupted - to view. And that was just a year's worth of data, according to the study, which also estimated that the number of servers worldwide will grow by a factor of 10 during the next decade.
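As a rough sanity check on those figures, the arithmetic can be sketched in a few lines of Python. The assumptions here are not from the study: a decimal zettabyte (10^21 bytes), a 365-day year and non-stop viewing. On those assumptions each film works out to about 9 gigabytes, and the viewing time to roughly 46 million years, the same order of magnitude as the quoted 47 million.

```python
# Back-of-envelope check of the quoted Big Data comparison.
# Assumptions (not from the study): decimal units, 365-day years.
ZETTABYTE = 10**21                 # bytes in one zettabyte (SI definition)

total_bytes = 1.8 * ZETTABYTE      # data created and stored during 2011
movies = 200 * 10**9               # 200 billion films, as quoted
hours_per_movie = 2                # each film runs two hours

bytes_per_movie = total_bytes / movies
viewing_hours = movies * hours_per_movie
viewing_years = viewing_hours / (24 * 365)

print(f"{bytes_per_movie / 10**9:.0f} GB per movie")
print(f"{viewing_years / 10**6:.1f} million years of uninterrupted viewing")
```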
In such an environment, we can't assure privacy by merely removing identifiers from data sets. There are many examples where this unfortunate fact was proven correct, and I'll detail one in particular. In 2006, Netflix released anonymized data of more than 100 million movie ratings from nearly 500,000 customers. Researchers from the University of Texas at Austin then combined these records with other movie ratings obtained from the Internet Movie Database.
The researchers concluded that 'if you already knew the identity of someone included in the records along with a few of the movies they liked or disliked, you could use the Netflix dataset to find their entire movie viewing history prior to 2005.'
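To make the mechanics of such a linkage concrete, here is a minimal sketch in Python. It is a simplified illustration rather than the researchers' actual method: the function name `link_record`, the one-star tolerance and the tiny datasets are all hypothetical. It scores each pseudonymous record by how many of a person's publicly known ratings it matches, and reports a match only when one record clearly stands out.

```python
from typing import Dict, Optional

Ratings = Dict[str, int]  # movie title -> star rating

def link_record(released: Dict[str, Ratings],
                known: Ratings,
                tolerance: int = 1) -> Optional[str]:
    """Return the pseudonym whose ratings best match `known`.

    Returns None when nothing matches or when the best score is tied,
    since a tie means the person cannot be singled out.
    """
    scores = {}
    for pseudonym, ratings in released.items():
        # Count the known ratings that this record reproduces (within tolerance).
        scores[pseudonym] = sum(
            1 for movie, stars in known.items()
            if movie in ratings and abs(ratings[movie] - stars) <= tolerance
        )
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked or ranked[0][1] == 0:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]

# Hypothetical "anonymized" release: names replaced by pseudonyms.
released = {
    "user_001": {"Film A": 5, "Film B": 2, "Film C": 4, "Film D": 1},
    "user_002": {"Film A": 3, "Film B": 5, "Film E": 4},
    "user_003": {"Film C": 1, "Film D": 5, "Film E": 2},
}
# Hypothetical ratings the same person posted publicly under their real name.
public_profile = {"Film A": 5, "Film B": 1, "Film D": 1}

match = link_record(released, public_profile)
if match is not None:
    # The full record, including films never discussed publicly, is now exposed.
    print(match, released[match])
```

The point of the sketch is that three publicly known ratings are enough to expose the fourth, which the person never disclosed.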
What to do?
If I were to conclude my remarks now, I would be leaving you with little other than a paradox: the warning about the limits of anonymization on one hand, and our Office's support for Open Government as a means to support transparency and accountability on the other. In other words, “what are organizations to do?”
I should first note that, despite the challenge, all is far from lost when it comes to anonymization. In fact, the Canada Research Chair for Health Information, Dr. Khaled El Emam, has stressed in some of his work that “as long as proper de-identification techniques, combined with re-identification risk measurement procedures are used, de-identification remains a crucial tool in privacy protection.”
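One widely used way of measuring re-identification risk, offered here only as an illustration and not necessarily the procedure used in the work quoted above, is to group records by their quasi-identifiers and look at the smallest group: a person in a group of size k can be singled out with probability at most 1/k. The sketch below assumes hypothetical field names and toy records.

```python
from collections import Counter
from typing import Dict, List, Sequence

Record = Dict[str, str]

def equivalence_class_sizes(records: List[Record],
                            quasi_identifiers: Sequence[str]) -> Counter:
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def max_reidentification_risk(records: List[Record],
                              quasi_identifiers: Sequence[str]) -> float:
    """Worst-case risk: 1 divided by the size of the smallest group."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return 1 / min(sizes.values())

# Hypothetical de-identified records (direct identifiers already removed).
records = [
    {"age_band": "30-39", "postal_prefix": "K1A", "sex": "F"},
    {"age_band": "30-39", "postal_prefix": "K1A", "sex": "F"},
    {"age_band": "30-39", "postal_prefix": "K1A", "sex": "M"},
    {"age_band": "40-49", "postal_prefix": "H2X", "sex": "M"},
    {"age_band": "40-49", "postal_prefix": "H2X", "sex": "M"},
]

# With all three fields released, one record is unique.
risk_full = max_reidentification_risk(records, ["age_band", "postal_prefix", "sex"])
# Withholding one field enlarges the smallest group and lowers the risk.
risk_coarse = max_reidentification_risk(records, ["age_band", "postal_prefix"])

print(risk_full, risk_coarse)
```

Measuring before release, and coarsening or withholding fields until the risk falls below a chosen threshold, is the kind of step that pairs de-identification with risk measurement.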
Moving down to the day-to-day work of organizations, the privacy risks associated with open government underline the need to integrate privacy considerations into their planning throughout their processes. This means that privacy needs to factor prominently into the thinking behind devising new initiatives that collect data.
When determining the real need to gather certain information, organizations should consider the new-era expectations of transparency while recognizing the challenge of de-identification. It also means that privacy impact assessments should be seen and used as building blocks for planning and design rather than being treated as afterthoughts, add-ons, appendices or boxes to be checked. The issues I've discussed also point to the need for ongoing privacy training within organizations.
Going further, our Office joined with our provincial counterparts in developing guidelines for the online posting of administrative tribunal decisions. While these were developed specifically for tribunals, most other organizations could borrow from the suggested best practices.
The guidelines called on tribunals to advise parties of steps they can take to identify and protect personal information in advance of a public hearing. For example, there is usually no reason for people to include social insurance numbers in their submissions.
I would also recommend that organizations consult the Draft Anonymisation Code of Practice issued by the United Kingdom Information Commissioner's Office earlier this year. This document is significant in part because it comes from an office with the combined mandate of data protection and freedom of information. The Code seeks to help organizations identify key issues they need to consider when deciding to anonymize personal data. It also provides advice on the appropriate means to do so and how best to manage risk tied to releasing anonymized information.
And now, in closing, I want to once again commend the organizers for not just including the topic of open government on the agenda, but including it as a prominent one. I am thankful for having the chance to hear Corinne Charette speak today along with Pierre Boucher and Matthi Bolte. Further, I look forward to our discussion being moderated by Ambassador Wnendt. And beyond today, I look forward to further dialogue within the federal community as well as the global community on this important issue. Thank you.