Privacy Enhancing Technologies – A Review of Tools and Techniques

Report prepared by the Technology Analysis Division of the Office of the Privacy Commissioner of Canada

November 2017

Introduction

A number of (relatively) recent developments have contributed to an increased level of awareness of the need for security and privacy (especially of online activity)Footnote 1, notably:

  • the continued evolution of technologies that permit individuals to connect and communicate (e.g., e-mail, instant messaging, chat, online social networks, and so on), resulting in an increasing amount of personal information (generated by and about individuals) being available online;
  • the increased interest that corporationsFootnote 2 have in collecting this information and making use of it in some fashion (e.g., reduced auto insurance rates, targeted advertising, personalization, and so on);
  • revelations about the extent of government surveillance of individual communications and other online activities, including those of law-abiding citizens (e.g., the Snowden revelations, which began in June 2013, about the activities of the US National Security Agency (NSA))Footnote 3; and
  • continuing headlines about major breaches at both government organizations and corporationsFootnote 4, resulting in the compromise of millions upon millions of records containing personal informationFootnote 5.

These developments bring with them potential or real risks of identity disclosure, linking data traffic with identity, location disclosure in connection with data content transfer, user profile disclosure or information disclosure itself. Privacy Enhancing Technologies (PETs)Footnote 6 can help address these risks.

PETs are a category of technologies that have not previously been systematically studied by the Office of the Privacy Commissioner of Canada (OPC). As a result, there were some gaps in our knowledge of these tools and techniques. In order to begin to address these gaps, a more systematic study of these tools and techniques was undertaken, starting with a (non-exhaustive) review of the general types of privacy enhancing technologies available. This paper presents the results of that review.

Scope

While many traditional security technologies (e.g., encryption) can be considered privacy-protective, this review focuses on PETs primarily used by law-abiding consumers and citizens seeking to protect their personal information online. This project was further limited in scope to those technologies that protect information in transit (i.e., communicated / transmitted by information and communications technologies (ICT)). Technologies that protect information at rest (e.g., when stored on mobile devices) are not included, nor are descriptions of the ICT systems to which PETs may be applied (unless required for a proper understanding of PET functionality).

Further, this report does not include:

  • a description of the security and privacy implications, weaknesses or limitations of the identified products or services;
  • verification, by the Office of the Privacy Commissioner of Canada, of the assertions made by PET developers / vendors with respect to PET functionality, either through literature research or testing of any kindFootnote 7;
  • a comparative analysis of PETs in any given category (e.g., anonymization) to determine which PET is “better”; or
  • the development of specific guidance or recommendations for implementation or use of any particular PET.

A Taxonomy of Privacy Enhancing Technologies

Taxonomy is defined as “the practice and science of classification of things or concepts, including the principles that underlie such classification”Footnote 8. Just as there is no universally agreed definition, there is also no universally agreed taxonomy for PETsFootnote 9. For example, the European Union Agency for Network and Information Security (ENISA), in its work on a PETs controls matrix, identifies four major categories of technology: secure messaging, virtual private networks, anonymizing networks and anti-tracking tools for online browsingFootnote 10. Other researchers have characterized PETs “according to their technical contributions (e.g., anonymous communication, and privacy preserving data mining)”Footnote 11.

For the purposes of this paper, the taxonomy for privacy-enhancing technologies (PETs) described belowFootnote 12 is a way of classifying these technologies based on the functionality/capabilities that they provide to an end user. This particular taxonomy has been chosen because it provides a fairly granular way of categorizing the various tools and techniques that have been identified during our review, using terms that often appear in common usage or in the media. It also helps identify areas where additional research and development is required. The principal drawback is that some tools and techniques provide more than one capability, making it somewhat difficult to neatly categorize them.

PETs are intended to allow users to protect their (informational) privacy by allowing them to decide, amongst other things, what information they are willing to share with third parties such as online service providers, under what circumstances that information will be shared, and what the third parties can use that information for. They do this by providing one or more of the following functions/capabilities.

Informed Consent

When an individual discloses his or her personal information to commercial and other entities, he or she also grants, sometimes explicitly, sometimes implicitly, consent for it to be used for one or more purposes. Consent is a key principle of most data protection/privacy legislation. Although the specific language varies, a key element of consent is that it be informed (i.e., based on a clear understanding of what the individual is consenting to). Subsequent control over the storage, use, and onward sharing of that information relies on the notion of trust that the given consent will be respected. Unfortunately, the reality is that, given the complexity of the policy language, the complexity of the business ecosystem behind the organization with whom the individual is dealing, and similar factors, this trust is sometimes misplaced.

As discussed in the OPC’s 2017 Annual ReportFootnote 13, one way for this trust to be restored is through the use of a technique known as “data tagging”. In data tagging, a user’s personal information is labeled or tagged with instructions or preferences specifying how the data should be treated by service providers. These preferences can be expressed in a machine readable format using a privacy policy language, and automatic mechanisms have been proposed to ensure that service providers follow the instructions.

Sticky policies are an example of data tagging. Sticky policies technically enforce preferences when personal data is shared across multiple parties. One way to enforce this is through the use of encryption. The EnCoRe (Ensuring Consent and Revocation) project proposed an architectureFootnote 14 where encrypted personal data, with a machine-readable policy stuck on, can only be decrypted and read by entities that abide by the policy rules. A trust authority enforces this by verifying compliance and only distributing decryption keys to those services that adhere to the policiesFootnote 15.
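The key-release idea described above can be illustrated with a minimal sketch. It is not the EnCoRe architecture: the class and function names (TrustAuthority, StickyPackage, open_package) are hypothetical, the trust authority is assumed to both seal the data and hold the keys, "compliance" is reduced to comparing declared purposes against the policy, and the cipher is a toy used only so that the example runs with the standard library.

```python
import hashlib
import secrets
from dataclasses import dataclass


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). Illustration only, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


@dataclass(frozen=True)
class StickyPackage:
    ciphertext: bytes
    allowed_purposes: frozenset  # the policy that travels with the data


class TrustAuthority:
    """Hypothetical trust authority that releases keys only to compliant services."""

    def __init__(self):
        self._keys = {}

    def seal(self, plaintext: bytes, allowed_purposes) -> StickyPackage:
        key = secrets.token_bytes(32)
        package = StickyPackage(_keystream_xor(key, plaintext), frozenset(allowed_purposes))
        self._keys[package.ciphertext] = key
        return package

    def request_key(self, package: StickyPackage, declared_purposes):
        declared = set(declared_purposes)
        # Release the key only if the service declares at least one purpose
        # and every declared purpose is permitted by the attached policy.
        if declared and declared <= package.allowed_purposes:
            return self._keys[package.ciphertext]
        return None


def open_package(authority, package, declared_purposes):
    """Return the plaintext if the authority releases the key, else None."""
    key = authority.request_key(package, declared_purposes)
    return None if key is None else _keystream_xor(key, package.ciphertext)
```

In this sketch a service that declares only "billing" can read data sealed for billing, while a service that also declares "marketing" is refused the key. A real deployment would need an authenticated cipher and some means of verifying that a service actually behaves as declared.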

Sticky policies are an integral part of certain privacy policy language proposals such as PPL (PrimeLife Policy Language)Footnote 16 and E-P3P (Platform for Enterprise Privacy Practices)Footnote 17. PPL is based on XACMLFootnote 18 and is used to grant service providers access to data as long as the organization’s policy is compatible with the user’s privacy preferences. The use of XACML for tagging data was suggested during the OPC’s recent consent consultations. E-P3P is a privacy-specific access control language that allows organizations to design and deploy machine-readable privacy policies, including identifying opt-in or opt-out choices (depending on the nature of the information) and placing restrictions on access to personal information, and design access control policies to give effect to the privacy policies.

Data tagging and sticky policy research has been ongoing since 2002, but the work remains at the proof of concept stage with few commercial deployments. In general, machine-readable, automated policy languages have had very limited success, perhaps due to complexity, a lack of interoperabilityFootnote 19 and little demand for their capabilities. Most recently, Microsoft discontinued all support for the Platform for Privacy Preferences Project (P3P, discussed under “Negotiate Terms and Conditions”) in its Windows 10 browsersFootnote 20.

Data minimizationFootnote 21

Data minimization is a fundamental privacy design principle which requires that services and applications only process the minimum amount of information strictly necessary for the service or for a particular transaction. The objective is to minimize the amount of personal information collected and used by online service providers (e.g., to mitigate the risk of profiling based on user behaviour). PETs in this category include websites that deliberately choose not to collect and store personal information such as search terms, search history, IP addressesFootnote 22 and so on. Examples include DuckDuckGoFootnote 23, Ixquick (now StartPage)Footnote 24, DisconnectFootnote 25 and UnbubbleFootnote 26.
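As a rough illustration of the principle, a service could discard everything except the fields it strictly needs before a request is processed or logged. The field names and the REQUIRED_FIELDS set below are assumptions made for the example and do not describe how any of the services named above work.

```python
REQUIRED_FIELDS = frozenset({"query"})  # hypothetical: all a search needs


def minimize(request: dict, required=REQUIRED_FIELDS) -> dict:
    """Keep only the fields strictly needed to serve the request."""
    return {name: value for name, value in request.items() if name in required}


incoming = {"query": "privacy tools", "ip": "203.0.113.7", "cookie_id": "abc123"}
stored = minimize(incoming)  # the IP address and cookie identifier are dropped
```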

Other tools that could fall within this category include those designed to protect privacy by deleting browsing history and other computer activities. An example of such a tool is Privacy EraserFootnote 27. Privacy Eraser claims that it will erase all digital footprints - web browser cache, cookies, browsing history, address bar history, typed URLs, autocomplete form history, saved passwords, search history, recent documents, temporary files, recycle bin, and more.

As individuals browse the web, their web browser records information about the browsing activity (e.g., the sites visited, the date and time of each visit, search terms used, and so on). There may be times, however, when an individual might not want that kind of information to be accessible to anyone else who uses that computer. All of the major browsers now support a mode, sometimes referred to as private browsingFootnote 28, in which this information is not retained once the session ends.

It should be noted, however, that there are limitations to the protection provided by private browsing modes:

  • if the computer used is connected to a corporate network, the network administrator could potentially see what sites have been visited;
  • if the computer used has been infected by malware, the user’s online activities could still be tracked;
  • if the computer used has Internet protection software installed (e.g., parental control programs such as QustodioFootnote 29), that software can track private browsing sessions; and
  • the user’s Internet Service Provider can access the user’s online history (e.g., in response to a lawful access request).

Another category of tools or techniques used to implement data minimization is that of ephemeral communications. These tools have been developed in response to the permanence of Internet conversation, which arose once computers began to mediate our online communications. Computers naturally produce conversation records, and these data were often saved and archived. These tools, on the other hand, claim to automatically expire messages, videos, and other contentFootnote 30. Examples of these tools include SnapchatFootnote 31, WickrFootnote 32, ConfideFootnote 33 and FirechatFootnote 34.
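The expiry behaviour these tools claim can be sketched as a message store with a time-to-live on each entry. This is an assumption-laden illustration, not the design of any product named above: the EphemeralStore class is hypothetical, deletion happens lazily when a read is attempted, and nothing here prevents a recipient from keeping a copy (e.g., a screenshot).

```python
import time


class EphemeralStore:
    """Hypothetical message store in which every message carries an expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so that expiry can be tested
        self._messages = {}

    def post(self, message_id, body, ttl_seconds):
        self._messages[message_id] = (body, self._clock() + ttl_seconds)

    def read(self, message_id):
        entry = self._messages.get(message_id)
        if entry is None:
            return None
        body, expires_at = entry
        if self._clock() >= expires_at:
            # The message has expired: delete it rather than return it.
            del self._messages[message_id]
            return None
        return body
```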

Data Tracking

In order for individuals to properly manage their digital privacy, it helps if they have a way to log, archive and look up (data tracking) the information that they have already disclosed, when, to whom, and under what circumstances. This includes allowing an individual to track what information a single site or service provider (e.g., Google) possesses about them, which can be done via a dashboard (see discussion under the heading “Control”), but also allowing individuals to track data disclosure across multiple sites.

One way of doing this is through a tool called Data Track, developed as part of the European Union’s Privacy and Identity Management for Europe (PRIME) projectFootnote 35. Data Track was intended to provide a history of all online transactions, storing for the user information regarding which personal information has been disclosed to whom. Data Track was also intended to provide transparency to users of their online transactions and to enable them to later question data controllers over whether they really treated their personal information as promised. Work on Data Track was suspended in 2011.
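The kind of record keeping described above can be sketched as a local log of disclosures that the user can later query. The sketch below is not the Data Track implementation; the DisclosureLog class, its methods, and the example site names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Disclosure:
    when: datetime
    recipient: str
    attributes: tuple
    purpose: str


class DisclosureLog:
    """Hypothetical user-side log of what was disclosed, to whom, and why."""

    def __init__(self):
        self._entries = []

    def record(self, recipient, attributes, purpose, when=None):
        entry = Disclosure(
            when or datetime.now(timezone.utc),
            recipient,
            tuple(sorted(attributes)),
            purpose,
        )
        self._entries.append(entry)
        return entry

    def disclosed_to(self, recipient):
        """Everything that has been sent to a single recipient."""
        return [e for e in self._entries if e.recipient == recipient]

    def recipients_of(self, attribute):
        """Every recipient that has received a given attribute."""
        return sorted({e.recipient for e in self._entries if attribute in e.attributes})
```

A log of this kind would let an individual answer questions such as "which sites have my e-mail address?" across multiple service providers, which a single provider's dashboard cannot do.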

AnonymityFootnote 36

There are a number of identity/identification “states” that are possible, ranging from fully anonymous to fully identified (sometimes referred to as “verified”)Footnote 37. There is also a range of information that can be used to identify individuals online, from name to IP address. PETs can allow users to choose the degree of anonymity they desire (e.g., by using pseudonyms, anonymizers, or anonymous data credentials).

Communication anonymizers hide the real online identity (e.g., email address, IP address, etc.) of a user and replace it with a non-traceable identity (e.g., disposable / one-time email address, random IP address of hosts participating in an anonymizing network, pseudonym, etc.). They can be applied to email, Web browsing, peer-to-peer (P2P) networking, VoIP, chat, instant messaging, and so onFootnote 38.
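For the e-mail case, the substitution of a disposable address for a real one can be sketched as follows. The AliasRelay class and the relay.example domain are hypothetical. Note that in this simplified model the relay itself knows the mapping, so the user must trust it.

```python
import secrets


class AliasRelay:
    """Hypothetical relay that hands out disposable addresses and forwards mail."""

    def __init__(self, domain="relay.example"):
        self._domain = domain
        self._alias_to_real = {}

    def new_alias(self, real_address):
        # A random alias reveals nothing about the real address.
        alias = f"{secrets.token_hex(6)}@{self._domain}"
        self._alias_to_real[alias] = real_address
        return alias

    def forward(self, alias, message):
        """Return (real address, message) for a live alias, else None."""
        real = self._alias_to_real.get(alias)
        return None if real is None else (real, message)

    def discard(self, alias):
        # Once discarded, mail sent to the alias can no longer reach the user.
        self._alias_to_real.pop(alias, None)
```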

One of the best known communications anonymizers is TorFootnote 39. Tor is a free, world-wide network of relays on the Internet that individuals and groups can use to keep websites from tracking them, or to connect to news sites, instant messaging services, or similar network services when these are blocked by their Internet service providers or when the communication is sensitive in nature. For example, journalists use Tor to communicate more safely with whistleblowers and dissidents. A feature known as ‘hidden services’ also lets users publish web sites and other services without needing to reveal the location of the site.

ControlFootnote 40

PETs in this category allow users to exercise more control over what personal information is sent to, and used by, online service providers and merchants (or other online users). They do so, for example, by allowing individuals to limit the type or quantity of information that they disclose to third parties. These are sometimes referred to as “selective disclosure techniques” or “selective disclosure technologies”.

Almost every day, we are asked to identify ourselves, whether it is to obtain a service (e.g., health care) or purchase some good (e.g., alcohol or cigarettes) that is restricted in some way (e.g., only available to individuals over a certain age). To do this, we typically rely on government-issued identification (e.g., a driver’s license). This results in the revelation of more information than is strictly necessary for the transaction in question. In many cases, we merely need to be able to demonstrate that we meet certain criteria, or possess certain attributes (e.g., that we are residents of a particular place, or that we are of legal age)Footnote 41.

One way to limit the amount of information we disclose in identity-related transactions is through the use of techniques known as attribute-based credentials (sometimes abbreviated ABCs)Footnote 42. These credentials are an important building block of privacy-respecting identity management systems. Among the privacy features of ABCs are the ability of credential holders “to disclose a minimal set of credential attributes to services, or to perform anonymous proofs of possession of certain credentials or attribute values matching certain criteria, while limiting the linkability of identity-related transactions”Footnote 43. Examples of ABCs include Microsoft’s U-ProveFootnote 44 and IBM’s Identity MixerFootnote 45.
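The "minimal disclosure" property can be illustrated with a much simpler technique than the ones real ABC systems use. The sketch below uses salted hash commitments, with an HMAC standing in for the issuer's signature; it is not U-Prove or Identity Mixer, all names are hypothetical, the verifier is assumed to share the issuer's key, and it does nothing to limit linkability (the same digests and tag appear in every presentation).

```python
import hashlib
import hmac
import json
import secrets

ISSUER_KEY = secrets.token_bytes(32)  # stand-in for the issuer's signing key


def _commit(name, value, salt):
    """Salted hash of one attribute; the salt hides undisclosed values."""
    return hashlib.sha256(json.dumps([name, value, salt]).encode()).hexdigest()


def issue(attributes):
    """Issuer: commit to every attribute and authenticate the set of commitments."""
    salts = {name: secrets.token_hex(16) for name in attributes}
    digests = sorted(_commit(n, v, salts[n]) for n, v in attributes.items())
    tag = hmac.new(ISSUER_KEY, json.dumps(digests).encode(), hashlib.sha256).hexdigest()
    return {"digests": digests, "tag": tag}, salts


def present(credential, attributes, salts, reveal):
    """Holder: disclose only the chosen attributes, with their salts."""
    disclosed = {name: (attributes[name], salts[name]) for name in reveal}
    return {"credential": credential, "disclosed": disclosed}


def verify(presentation):
    """Verifier: return the disclosed attributes if they check out, else None."""
    cred = presentation["credential"]
    expected = hmac.new(
        ISSUER_KEY, json.dumps(cred["digests"]).encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, cred["tag"]):
        return None
    revealed = {}
    for name, (value, salt) in presentation["disclosed"].items():
        if _commit(name, value, salt) not in cred["digests"]:
            return None
        revealed[name] = value
    return revealed
```

With a credential issued over a name, a province and an "over 19" flag, the holder could present only the flag; the verifier learns that the issuer vouched for it without seeing the other values.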

Other “control” technologies that have been identified during the course of our review include:

  • Self-sovereign identity: a concept that puts the user at the centre of the administration of their identity. To achieve this, the user’s identity must be interoperable across multiple locations, with the user’s consent, but also subject to true user control of that digital identity, creating user autonomy. A self-sovereign identity must also be transportable and it must also allow ordinary users to make claims about themselves, which could include personally identifying information or facts about personal capability or group membership. It must also meet a series of guiding principlesFootnote 46. An example of this technology is UPortFootnote 47;
  • Personal Information Management Systems (PIMS)Footnote 48: The basic idea behind the PIMS concept is that individuals should be able to decide with whom they share their personal information, for what purposes, and for how long, to be able to keep track of all the information shared, and to be able to retract that information if circumstances warrant and permit. This category of technology encompasses several other components, including personal data ecosystemsFootnote 49, personal data dashboardsFootnote 50, and personal data storesFootnote 51; and
  • Other (Miscellaneous): a number of other “control” technologies were identified, including SieveFootnote 52, TACYTFootnote 53, and Privacy BoxFootnote 54, that don’t easily fit into any of the previous categories.

Negotiate Terms and Conditions

In many cases, the privacy policies established and published by online service providers are of the “take-it-or-leave-it” variety – there is no customization or personalization. However, consumers view privacy differently and are increasingly concerned about the implications of sharing information in light of complex, hard to understand privacy policies. This frequently results in individuals abandoning (not completing) an online transaction. If individuals could negotiate privacy policies as personalized agreements, and if they could trust that online service providers would honour those agreements, this would be a step in the right direction.

The Platform for Privacy Preferences Project (P3P)Footnote 55, developed by the World Wide Web Consortium (W3C), was intended to enable websites to express their privacy practices in a standard machine-readable format that could be retrieved automatically and interpreted easily by user agents. Individuals could use P3P to set their own privacy preferences. P3P user agents, built into web browsers, were then to inform users of site practices (in both machine- and human-readable formats), allow users to screen and search for sites that offer certain privacy protections, and automate decision-making based on these practices when appropriate. As mentioned previously, P3P was never widely adopted and support for it has largely been discontinued.
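The comparison a user agent was meant to automate can be sketched as a check of a site's declared practices against the user's stated preferences. The data structures below are hypothetical and do not use the actual P3P vocabulary or syntax; a practice involving data for which the user has stated no preference is treated as a conflict, which is only one possible design choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Practice:
    data: str            # category of personal information collected
    purpose: str         # what the site says it will use the data for
    retention_days: int  # how long the site says it will keep the data


def evaluate(site_policy, preferences):
    """Return ('accept', []) or ('warn', [conflicting practices])."""
    conflicts = []
    for practice in site_policy:
        pref = preferences.get(practice.data)
        if pref is None:
            conflicts.append(practice)  # no stated preference: flag for the user
        elif (
            practice.purpose not in pref["purposes"]
            or practice.retention_days > pref["max_retention_days"]
        ):
            conflicts.append(practice)
    return ("accept" if not conflicts else "warn", conflicts)
```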

Our review identified some initiatives that were intended to make P3P more usable, including the Policy Aware WebFootnote 56 and the Transparent Accountable Datamining InitiativeFootnote 57, but neither of these appears to have progressed much beyond theory. There have also been some efforts to develop alternatives to P3PFootnote 58, but these do not appear to have made much headway either.

Technical Enforcement

In those instances where individuals are able to negotiate the terms and conditions of a service, PETs in this category provide individuals with the possibility of having these terms and conditions technically enforced by the infrastructures of online service providers and merchants (i.e., not just having to rely on promises, but being confident that it is technically impossible for service providers to violate the agreed upon data handling conditions). Technical enforcement of negotiated terms and conditions can be accomplished in a number of different ways, many of which are currently in use, albeit for different purposes (this list is not intended to be exhaustive):

  • network monitoring: passive or active monitoring of network activity to compare the activity against the agreed terms and conditions (e.g., WiresharkFootnote 59, FiddlerFootnote 60, and so on). Some tools provide real-time prevention of privacy leaksFootnote 61;
  • endpoint event detectionFootnote 62: a category of tools and solutions that focus on detecting, investigating, and mitigating suspicious activities and issues on hosts and endpoints (e.g., McAfee Active ResponseFootnote 63, Symantec Endpoint ProtectionFootnote 64, and so on);
  • web transparency toolsFootnote 65: these tools are primarily intended to provide a user with information about the intended collection, storage and/or data processing of their personal information, or to help the user determine the potential impact of data profiling. Such tools include ad blockers (e.g., Adblock PlusFootnote 66 and GhosteryFootnote 67), and tracking blockers (e.g., Privacy BadgerFootnote 68); and
  • enterprise digital rights management: access control technologies that try to control the use, modification, and distribution of copyrighted works (such as software and multimedia content), as well as systems within devices that enforce these policies (e.g., ContentGuardFootnote 69, DigimarcFootnote 70, and so on).
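To illustrate the network monitoring approach in the list above, the comparison of observed activity against agreed terms can be reduced to a simple check. The sketch assumes that some monitoring tool has already produced a list of (host, data fields) pairs; the function name and data shapes are hypothetical and are not the interface of Wireshark, Fiddler or any other product.

```python
def find_violations(observed_flows, agreed_terms):
    """Flag flows that send data fields the agreed terms do not permit.

    observed_flows: iterable of (host, fields) pairs seen on the network.
    agreed_terms: dict mapping each host to the set of fields it may receive.
    """
    violations = []
    for host, fields in observed_flows:
        allowed = agreed_terms.get(host, set())  # unknown hosts may receive nothing
        extra = set(fields) - allowed
        if extra:
            violations.append((host, sorted(extra)))
    return violations
```

A tool offering real-time prevention would block the offending flow rather than merely report it.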

Remote Audit of Enforcement

PETs in this category provide individuals with the ability to remotely audit the enforcement of the terms and conditions offered by online service providers and merchants. While the term “audit” is most frequently applied to reviews of an organization’s financial information, other areas which can be audited include governance, compliance and risk (GRC) and internal controls. An audit involves the gathering and analysis of information relevant to specified objectives, scope and criteria. While this information has traditionally been gathered in the form of onsite interviews, document reviews and through observation of processes or people, some of this information gathering can now be done remotely.

One way to facilitate the auditing of an organization is for that organization to pre-emptively publish information concerning their policies, procedures and practices. For example, timely, accurate statistical information from private sector firms on government requests for and access to personal information – in the form of clear transparency reportsFootnote 71 at regular intervals – can form the basis for rational consumer choices and build consumer confidence in a growing digital economy and its interface with the state for law enforcement and security purposes.

Another way for individuals to “audit” an organization is for the organization to undergo certification against a trust markFootnote 72, defined as “electronic labels or visual representations indicating that an e-merchant has demonstrated its conformity to standards regarding, e.g., security, privacy and business practice”Footnote 73. Organizations that offer certification against a trust mark often make information about the trust mark, and the criteria an organization needs to satisfy to obtain the mark, available on their website.Footnote 74 Individuals can then research the trust mark, as well as the trust mark provider, and decide if they are prepared to share their personal information with the website in question.

As useful as trust marks might be in helping establish trust in an organization, they have their limitations. For example, a privacy trust mark (e.g., the ones issued by TRUSTe, now TrustArcFootnote 75) does not necessarily guarantee that the organization has implemented specific technical security standards or processes (such as basic traffic encryption or infrastructure vulnerability testing)Footnote 76 as there may be more than one way to meet the requirements of the trust mark.

Use of Legal Rights

Many data protection/privacy laws provide individuals with certain rights, including the right to access the information about them that an organization holds, the right to challenge the accuracy and completeness of that information, and the right to have it amended as appropriateFootnote 77. Typically, exercising these rights requires individuals to send a written request to an organization and then wait for the organization to respond. One way to assist individuals in exercising their rights is to automate the request process for them.

In 2014 the Citizen LabFootnote 78, in partnership with Open EffectFootnote 79 and Open MediaFootnote 80, launched the original version of the Access My Info (AMI)Footnote 81 tool. AMI is a step-by-step wizard that results in the generation of a personalized formal letter requesting access to the information a provider stores and utilizes about a person. The original version only allowed users to generate a letter to telecommunications companies. An improved tool, relaunched in June 2016Footnote 82, provides individuals with the ability to send formal requests to a broader range of organizations, including those that provide fitness trackers and dating applications.
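The letter-generation step can be sketched as filling in a template from the answers a user gives to a wizard. This is not the AMI tool's code or wording; the template text, function name and parameters below are assumptions made for illustration.

```python
from string import Template

# Hypothetical template; a real letter would cite the applicable legislation.
LETTER = Template(
    "$date\n\n"
    "To the Privacy Officer, $organization:\n\n"
    "Under $law, I request access to the personal information that "
    "$organization holds about me, including:\n"
    "$items\n\n"
    "Sincerely,\n"
    "$name\n"
)


def build_access_request(name, organization, law, categories, date):
    """Fill the template with the requester's details and requested categories."""
    items = "\n".join(f"  - {category}" for category in categories)
    return LETTER.substitute(
        date=date, organization=organization, law=law, items=items, name=name
    )


letter = build_access_request(
    "Jane Doe",
    "Example Telecom",
    "the applicable privacy legislation",
    ["call records", "location data"],
    "2017-11-01",
)
```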

The "Failure" of PETs

Our review has shown that there does not seem to be any shortage of good ideas for protecting individual privacy – the PETs listed earlier in this paper only scratch the surface of technologies that are available. A wide range of PETs have been proposed, but few seem to have made their way out of the research environment and into the marketplace or people’s lives in any meaningful way. There are a number of possible reasons for this “failure” of PETs to go mainstream.

The current economic and regulatory environments provide little incentive for deploying promising consent technologies, so further development of technology alone is not likely to lead to significant changes. Much of the online world bases its revenue streams on the collection and processing of personal information, particularly for targeted advertising. At the same time, industry most often relies on implied, opt-out consent where the lack of action is interpreted as permission for the processing of personal information. Consent technologies that make it easier for consumers to take actions, particularly for opting out, would likely reduce revenue streams.

The examples reviewed above illustrate that there is no shortage of good ideas and viable technologies for improving the consent process. There is a shortage of incentives for organizations, mostly commercial companies, to use technology to provide a better ability to consent or not consent. The economics of the current highly competitive environment, dominated by self-regulation and opt-out consent models, may dissuade companies from offering effective consent mechanisms.

The tools may sometimes fail because average individuals consider them too complex. They may not be intuitive, or may require specialized knowledge or skills that the average consumer does not have. They might also fail because there is no consumer demand for privacy protections (which may stem, in part, from a lack of knowledge of what tools are available), or because government might be unwilling to regulate privacy protections for fear of inhibiting innovation.

Potential users may not trust the tools (i.e., that they will provide the protections they claim to). There is some basis for this skepticism. Many PETs only ever seem to be lab prototypes, or used in limited trials, so there is little to no experience of their practical use and their impact on the processing of personal information. Some PETs may involve third parties who are unknown to, and therefore untrusted by, individuals.

Some tools fail because they are unable to overcome the “network effect” (a phenomenon whereby a good or service becomes more valuable when more people use it). Existing powerful or dominant undertakings (e.g., Facebook) are able to exploit “economies of aggregation” and create barriers to entry through their control of huge personal data sets alongside proprietary software which organizes the dataFootnote 83.

So Now What?

As our preliminary review has shown, there is no shortage of good ideas for protecting individual privacy. There are, however, some categories of PETs (e.g., data tracking) that do not seem to have attracted the same degree of research interest as others. It is not clear whether this is due to lack of interest on the part of researchers, or whether the issues the technologies are intended to address are difficult to resolve.

The discussion in the previous section of this report identified a number of possible barriers to the implementation or adoption of PETs including, but not limited to, lack of awareness of the existence of these tools, their lack of usability, and a lack of incentive for organizations to offer or implement these tools. This review did not examine adoption rates of the different technologies identified so it is not clear which specific barrier(s) are most responsible for the low uptake of PETs. Similarly, where a PET has been successful in terms of adoption, it is not clear what factors contributed to that success.

It is clear from this preliminary review that additional research is needed to assess the relative strengths and weaknesses of PETs, develop new PETs or improve the effectiveness of existing ones, and better understand the barriers to deployment and adoption of PETs in the online marketplace. Individuals also need to be better educated about the existence of PETs and supported to make more use of them, should they so wish, to protect their personal information online and give them greater control over its potential use (or not) by others.
