Ethical and Privacy Implications of Online Recruitment Strategies for Research

Remarks at the Ottawa Hospital Research Institute Clinical Training Course

Ottawa, Ontario
October 20, 2014

Address by Patricia Kosseim
Senior General Counsel and Director General, Legal Services, Policy, Research and Technology Analysis Branch

(Check against delivery)


Introduction

Before I begin, I would like to make three disclaimers:

First, as you know, clinical research conducted in universities or teaching hospitals is covered by provincial legislation.   But as private sector companies become more and more involved in online recruitment of potential research participants — as my talk today will demonstrate —  the federal jurisdiction will become more relevant to your world.

Second, although I am a current member of the Board of Governors of the Ottawa Hospital, I am here today representing the Office of the Privacy Commissioner of Canada in my professional capacity as a privacy law expert, and with a background in health law and ethics.

Third, for the purposes of today’s presentation, I use multiple examples of research recruitment methods across a broad range of disciplines, not only from the clinical research context, but also from social science and behavioral research.

Having said that, here is an outline of what I intend to cover:

First, I will review some of the persistent challenges that researchers continue to face in recruiting and enrolling participants for clinical research.

I will then review several new online methods being used to find eligible participants for research.   I have organized these examples along a continuum, ranging from the most common and obvious methods, to some of the newer and less obvious methods.

I will continue on to examine the new public-private partnerships in online research and discuss some of the legal and ethical implications.

I will then turn to examine how the emergence of big data is fundamentally changing our traditional concept and understanding of the research endeavor.

And finally, I will conclude with a few thoughts on ultimate accountability and responsibility for research.

Patient recruitment in research: Persistent Challenges

According to a recent report by the Tufts University Center for the Study of Drug Development, clinical research studies usually take twice as long as expected to meet desired enrollment levels.  Eleven per cent (11%) of sites in a given clinical trial fail to enroll a single patient, while thirty-seven per cent (37%) under-enroll.

According to a recent white paper by InVentiv Clinical Trial Recruitment Solutions entitled “e-Recruiting: Using Digital Platforms, Social Media and Mobile Technologies to Improve Clinical Trial Enrollment”, nearly 30% of the time dedicated to clinical trials is spent on patient recruitment and enrollment, and each day that a drug development program is delayed costs the sponsor $37K in operational costs and anywhere between $600K and $8M in lost opportunity costs.

Online Recruitment Methods: A Continuum

These persistent challenges are driving clinical researchers to find new methods of recruiting potential participants online, and some are even conducting certain aspects of the research itself online, where it is possible.

Let me walk through some examples of how that can be done.

Example 1: Simple Online Ads

In this first example, we see an ad on Kijiji advertising a University of Ottawa study on the health behaviors of survivors of multiple cancers.  Researchers are looking for adults 18 or older, who speak and understand English, and have been told by a doctor or other health professional that they have cancer.  We are assured that the research was approved by the University of Ottawa Ethics Review Board.

If interested, potential participants are invited to click on a link to find out more about the study and be directed to the appropriate consent form.

In essence, there is nothing qualitatively different between this ad and a traditional ad in a newspaper or magazine. Once a potential participant responds to the ad, basic ethics principles would need to be respected, including valid and informed consent and privacy protection.

Example 2: First Party Targeted Ads

In this second example, we see a page being served up by Facebook on behalf of a research company that’s in the business of conducting consumer surveys online.  For a modest sum, Facebook directs the company’s ads to Facebook users by matching their profiles with enrollment criteria, thereby reaching more potentially eligible respondents. A notable benefit of online research is the cost containment of administering questionnaires through digital form rather than traditional mail or phone methods.

Some issues to address in this kind of research may include: the degree of invasiveness of the questions being asked; the potential sample bias; capacity issues that cannot be readily assessed in a virtual world; and whether users’ profiles actually match their identity (respondents may not be who they say they are).

From a privacy perspective, these ads are usually served in accordance with the company’s privacy policy, in this case Facebook’s.  When a user signs up, they are assumed to have read and accepted the terms of service.  

Example 3: Third Party Targeted Ads

Another means available to focus recruitment efforts for research is through third party targeted ads.  This particular ad campaign for Vicks Behind-the-Ear Thermometers combined Google Flu Trends data with user information collected through mobile apps to target potential customers.

Google Flu Trends data were used to identify regions likely to be experiencing influenza, and hence likely to need a thermometer. Within that geographic area, certain potential customers were further targeted according to basic demographic data gathered through their mobile apps, in this case the Pandora music service. Users of Pandora provide basic demographics, including age, gender, ZIP code, and parental status, and through their mobile device, they can reveal their approximate location at any given time. Combining all of this information, Vicks was able to target its ads for Behind-the-Ear Thermometers to only those Pandora listeners who were young mothers, living in a high-risk flu area, within two miles of retailers selling the product (including Walmart, Target, and Babies “R” Us).
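To make this layering of criteria concrete, here is a minimal sketch in Python of how such a filter might work. It is purely illustrative: the `Listener` record, its field names, the sample data, the `is_targeted` helper, and the age cut-off used to stand in for “young” are all assumptions, not a description of the actual systems used by Vicks, Google, or Pandora.

```python
from dataclasses import dataclass


@dataclass
class Listener:
    """Hypothetical record for one app user (all fields are illustrative)."""
    user_id: str
    age: int
    gender: str
    is_parent: bool
    region: str               # derived from ZIP code or device location
    miles_to_retailer: float  # distance to the nearest store selling the product


def is_targeted(listener, high_flu_regions, max_age=35, max_miles=2.0):
    """Return True only if every layered criterion is met."""
    return (
        listener.region in high_flu_regions          # flu-trend signal
        and listener.gender == "F"                   # demographic signal
        and listener.is_parent
        and listener.age <= max_age                  # assumed cut-off for "young"
        and listener.miles_to_retailer <= max_miles  # location signal
    )


# Hypothetical sample data.
listeners = [
    Listener("a", 29, "F", True, "region-1", 1.2),
    Listener("b", 31, "F", True, "region-2", 0.5),  # not in a high-flu region
    Listener("c", 27, "F", True, "region-1", 5.0),  # too far from a retailer
    Listener("d", 40, "M", True, "region-1", 0.8),  # does not match demographics
]
high_flu_regions = {"region-1"}

targeted_ids = [p.user_id for p in listeners if is_targeted(p, high_flu_regions)]
print(targeted_ids)
```

The point of the sketch is that no single field is especially revealing; it is the combination of flu-trend, demographic, and location data that narrows the audience so precisely.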

While this is a documented example of how a targeted ad was used to sell a product, one can easily extrapolate how pharmaceutical or other companies might use similar “smart” methods to target potential research participants matching eligibility criteria for study enrolment. This example gives rise to yet another range of privacy issues, namely whether the app users were aware that their personal information would be used for targeted ads by third parties and whether they were given a meaningful opportunity to opt out.

Example 4: Contextual Ads Related to Search Queries

Contextualized ads are ads that use information about a user’s current visit to a website in order to serve a targeted advertisement.  For example, if a user is searching for active clinical trials for cancer, they may be shown ads for clinical trials currently recruiting patients or they may be shown ads by private research companies willing to provide researchers with recruitment services for a fee.

In this example a search for active clinical trials for cancer served up contextual ads on the site related to alternative cancer clinics and a company advertising genome sequencing services.

Generally, this form of advertising is done in accordance with the accepted user terms and privacy policy of the company, in this case Google.

Example 5: Online Behavioral Advertising (OBA)

Online behavioral advertising, on the other hand, serves users ads that are unrelated to the current web page and are instead based on past queries that have been tracked over time.  Through the use of third party cookies placed in a user’s web browser, a user’s activities can be followed as they navigate around the net. This tracking allows advertisers to create a profile associated with the user’s browser and paint a granular “picture” of the user’s interests.  Ads are then tailored based on these inferred interests and served to users who correspond to a given profile.

According to the OPC’s policy position on OBA, opt-out consent can be acceptable for online behavioral advertising provided certain conditions are met:

  • Individuals are made aware of the practice in a manner that is clear, transparent and understandable;
  • Individuals are informed of such at or before the time of collection and provided with information about the various parties involved in online behavioural advertising;
  • Individuals are able to easily opt-out of the practice — ideally at or before the time the information is collected;
  • The opt-out takes effect immediately and is persistent;
  • Information collected and used is destroyed as soon as possible or effectively de-identified; and
  • The information collected and used is limited, to the extent practicable, to non-sensitive information (avoiding sensitive information such as medical or health information).

In January 2014, our Office announced its first finding in relation to online behavioural advertising.  In a nutshell, these were the facts: the complainant discovered that he was tracked from website to website after having visited websites about sleep apnea devices. Even on websites having nothing to do with the condition, he was served ads peddling sleep apnea devices.

That means he was tracked on the basis of health information — contrary to our Guidelines on Online Behavioural Advertising and, further, contrary to Google’s own privacy policy which reassures users that they will not be tracked on the basis of sensitive information, which includes health information.

We made specific recommendations to Google, which agreed to implement them all.

Example 6: Secondary Use of Personal Information

A sixth example of how research enrollment strategies could be enhanced is through secondary use of information posted online for other purposes. Direct-to-consumer genetic testing sites offer a rich opportunity for identifying potential research participants matching specific eligibility criteria and contacting them for enrolment in a given study. Many of these sites contain highly sensitive personal information.

Whether personally identifiable information about clients or members can be used or disclosed for research purposes depends on the Terms of Use. What were users told about the conditions for joining or signing on to the site in the first place? The inquiry then necessarily turns to whether the terms of use or privacy policies of these websites clearly laid out research as a potential use of personal information, in what form (identifiable or not), and by whom (website owners or third parties); whether the information was sufficiently clear to serve as a valid basis for inferring informed consent; and whether the proposed research actually aligns with users’ reasonable expectations.

Just recently, 23andMe announced its intention to enter the Canadian market and offer genetic tests for $199 that will scan one’s DNA for the possibility of developing 108 different health conditions. Its Privacy Policy describes in quite some detail “23andMe Research” which “refers to scientific research conducted by 23andMe or by third parties in collaboration with 23andMe” and sets out the terms and conditions of a separate consent for this purpose.

Example 7: Social Media Listening

An interesting example of how online listening can be used for research purposes is described in this study published in Nature Biotechnology in 2011. The researchers collected data from PatientsLikeMe.com, a patients’ online forum that allows individuals suffering from serious diseases to share their “real-world health experiences.” In this case, the study examined patient self-reported data describing the effects of using lithium carbonate to treat ALS, also known as Lou Gehrig’s disease. Ultimately, the study found that the use of the treatment in question had no effect on disease progression over a 12-month period, an outcome which effectively mirrored the results of parallel randomized trials.

While the authors acknowledge that observational studies of this nature can never be a true substitute for the gold standard of double-blinded randomized trials, the study nonetheless shows the potential of online patient fora to provide an observational environment for monitoring disease progression and treatment efficacy. That potential holds provided, of course, that people know and understand this and consent to be a part of it, and provided that consideration is given to the particular vulnerability of online patient groups in these types of situations.

In May 2010, it was discovered that someone from Nielsen, a global information company that provides insights into what people watch, listen to and buy online, had infiltrated PatientsLikeMe.  Using sophisticated software, Nielsen scraped all messages from the online discussion forum, many containing the highly sensitive personal information of members. This activity was clearly in contravention of the terms of use of the online forum and contrary to the reasonable expectation of members. To their credit, Nielsen stopped the practice of its own volition, but only after hundreds of “creeped-out” members quit the PatientsLikeMe website out of anger and loss of trust.

A more recent example of how personal information in online fora is used came to light in 2013 in the case of Positive Singles, a dating website for people who test positive for STDs.  Following a complaint, our office’s investigation revealed that the company was sharing highly sensitive information about its users with affiliated websites, without having first explained this practice and obtained proper consent, in contravention of users’ reasonable expectation of privacy.

Example 8: Crowd Sourcing

Also emerging are new crowd sourcing platforms where requesters post human computational tasks that they are willing to pay to have completed, and workers choose which of these little jobs they wish to do.   In essence, these are like online labour markets.

This specific one, Mechanical Turk by Amazon, has 426,018 HITs (Human Intelligence Tasks) available for the taking, with hundreds of thousands of workers and tens of thousands of requesters.

Behavioral researchers have come to realize key benefits of using this online marketplace — a very large and stable set of diverse people persistently available and willing to do tasks (like participate in research studies or fill out online surveys), for relatively low pay.

Each HIT has a title, the name of the requester, wage being offered, number of related HITs being offered, how much time the requester has allocated to completion of each HIT, when the HIT expires, the required country of residence, and any specific qualifications required.

These virtual marketplaces raise a new host of issues: how to obtain meaningful, informed consent before workers accept HITs; when and how to provide debriefing statements if the study involves deception; as with other online strategies, how to restrict populations to prevent children or adults who lack legal capacity from participating; how much to offer by way of compensation; and how to ensure confidentiality, since, although worker IDs are anonymized, workers may be identified in cases where a requester needs to communicate with them.  There are also security issues related to where the data are stored.  The advantage of an “external” HIT is that collected data go straight to the requester, as opposed to “internal” HITs, where collected data get stored first on Amazon’s servers.

Here, for example, is a HIT posted by requester Teachers College Columbia University, to find participants for a research study on spirituality, health, and psychology.  The duration of the HIT is 1.5 to 2.5 hours and the reward being offered is $6.50 per hour.  Workers have five days to respond, after which the HIT expires.  The full survey must be filled out, fluent English is required, and no repeat survey takers will be accepted.  Workers are required to enter their Mechanical Turk ID before starting.

Example 9: Mobile Technology

Another strategy for recruiting potential research participants is through mobile technologies.

An interesting example of how effective this strategy can be was a study conducted by Nathan Eagle at the Harvard School of Public Health. Eagle devised a new blood-bank monitoring system in Kenya.  Recognizing the explosive proliferation of mobile technologies in Africa, he innovated new ways to recruit public health nurses to text in current blood supply levels in rural hospitals across Kenya in real time.   At first, it proved to be a huge success, and recruitment and compliance were at an all-time high, until half the nurses stopped texting in the data when they realized how much the text messages were costing them personally.  The cost of sending in continual SMS messages represented a significant proportion of their daily wages.

Working with the mobile operators in East Africa, the researcher found a way to automatically credit the participating nurses for the cost of each SMS message being sent to the central database, plus an extra penny if it was properly formatted.  Virtually all the nurses thereafter re-engaged.

Eagle has since co-founded a company called Jana that provides organizations with the ability to connect with people in emerging markets through mobile phones.  According to his website, the company rewards people with free mobile airtime as an incentive for completing certain tasks, such as online surveys. His mobile airtime rewards platform has been integrated into the systems of over 200 mobile operator partners, providing clients with a “unique ability to instantly compensate 3.48 billion people in 70 local currencies.”

One thing that needs to be highlighted with regard to the mobile environment is the challenge of obtaining meaningful consent when reading on devices that have such small screens.  Our Office has developed guidance for mobile app developers that advises using techniques such as layering, just-in-time pop-ups and sounds to alert users at instances when they are about to provide personal information in the use of a mobile app.

Example 10: Wearable Computing

Another emerging technology is wearable computing devices. A whole new range of possibilities has opened up to identify eligible research participants based on specific health measurements and to collect data not only from participants, but directly from their devices!  As a case in point, consider contact lenses developed by Google that can read glucose levels from tear fluid in order to manage diabetes.

As body-worn devices are capable of reading health status and digitally communicating results online, patients could soon be directly sending in the very information needed for researchers to assess eligibility criteria for entry into specific research studies.

This raises a whole new set of ethical and legal issues that we will have to think through carefully as these new devices become common in the marketplace.

Research Makes Strange Bedfellows

For decades, lawyers and ethicists have had to work out parameters for guiding public-private partnerships in clinical research, particularly involving pharmaceutical companies —  rules around intellectual property and publication, physicians and patient finder fees, potential conflicts of interest, etc.

With the explosion of web technology it has become abundantly clear that online giants like Google, Facebook, and others hold more social science data than academia will ever be able to hold, reproduce, or even imagine, which ushers in a whole new range of public-private partnerships.

With massive amounts of socio-demographic and behavioural data in their possession, and recognizing that personal data is “the new oil,” private sector companies have begun to play a much more active, direct, and prominent role in human subject research.

Today, virtually everyone has become a researcher: governments, companies, platform operators, even private citizens.

A good example of the power of online giants to conduct research is Facebook’s emotion contagion study. The company recently conducted a psychological experiment by altering the news feeds of 689,000 users.  By tweaking their news feed algorithm, the company showed more positive posts to one study group and more negative posts to another.  They then tested the resulting impact on users’ mood by counting the number of generally positive or negative words in their own subsequent posts.
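The word-counting measure described here is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the two tiny word lists and the `count_emotion_words` helper are hypothetical, and the actual study relied on its own word lists and tooling.

```python
# Hypothetical, deliberately tiny word lists (assumptions for illustration only).
POSITIVE_WORDS = {"happy", "great", "love"}
NEGATIVE_WORDS = {"sad", "awful", "hate"}


def count_emotion_words(post):
    """Return (positive_count, negative_count) for a single post."""
    # Lowercase each word and strip surrounding punctuation before matching.
    words = [w.strip(".,!?;:\"'").lower() for w in post.split()]
    positive = sum(1 for w in words if w in POSITIVE_WORDS)
    negative = sum(1 for w in words if w in NEGATIVE_WORDS)
    return positive, negative


print(count_emotion_words("I love this, great day!"))
print(count_emotion_words("So sad. I hate Mondays"))
```

Comparing these counts between the two study groups is what allows a mood effect to be inferred, which also shows how crude a proxy for “mood” such a measure is.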

The company partnered with researchers from Cornell University.  Although they sought Institutional Review Board (IRB) approval, they were told that since it was Facebook that had done all the collection and analyses of user data, the University itself was not technically engaged in the research and the study was therefore exempt from the Common Rule requirements for IRB approval.  And when the journal required demonstration of informed consent for publication purposes, the company cited the general terms of its user policy as sufficient informed consent.

Some supported Facebook for being transparent about a practice that presented no more risk than that which companies create every day by experimenting with new and improved commercial services.  They fear that too critical a public outcry will make companies like Facebook less likely to want to cooperate with university researchers, meaning that they’ll become even less transparent and their data won’t be available for the public advancement of knowledge.

Others were not so easily placated and saw the loophole used to exempt the study from IRB review as disingenuous.  They were highly critical of Facebook, creeped out by its having unabashedly crossed the line from observation to manipulation.  Some have asked whether Facebook would have benefited from an external, independent ethics review.  Could an independent ethics committee have helped better anticipate the social, ethical, and legal implications of moving people, as one commentator put it, from the proverbial fishbowl to the petri dish?

In defense of Facebook’s practices, OkCupid, a date-matching site, revealed its own online experiments.  Among them was a study on the “Power of Suggestion” which the company undertook to examine the effectiveness of its suggested “match percentages” in influencing the subsequent behavior of users.  In other words, if the company says that two people are a good match, will they act as though they are?  To test this, the company changed its algorithms to reverse good and bad matches: those individuals who scored a high match percentage (90%) were given a score of only 30% and, conversely, those individuals who scored poorly on potential compatibility (30%) were given a high score of 90%.   The company then looked at the number of email communications that followed between individuals using the company website to see if users would pursue relationships with people they were told were good matches (even though they weren’t) or if they would forgo relationships with people they were told were not good matches (even though they were).

The risk of emotional harm for users who might act on the wrong information and actually agree to go out on dates with poorly matched individuals or, conversely, lose out on finding their once-in-a-lifetime soulmate, seemed completely lost on the company.   The company remained unapologetic for its practice, claiming, “If you use the Internet, you’re the subject of hundreds of experiments at any given time, on every site… that’s how websites work.”

Both these examples help to illustrate the fundamentally different cultures of public and private sector research: two different worlds colliding or, as one Tumblr blogger puts it, “two frames clashing.”  “The manifest risk in all these instances”, he writes, as online companies become de facto research institutions, “is that these new, digital networked entrants undermine and circumnavigate hard-won public accords enshrined in laws, regulations, and norms of communities of practice, under the ruse that ‘new technology’ somehow means that ‘old rules don’t apply’”.  As public outcry mounts against some of the practices of online digital companies, well-meaning, reputable, altruistic academic researchers genuinely interested in partnering with private companies to accelerate research in the public interest might, unless they do their proper due diligence, get thrown under the same bus. 

One Big Global Experiment under the Big Data Tent

This fundamental difference in academic and commercial cultures is further exacerbated by the rise of big data, which has enabled research at an unprecedented speed, scale, and computing power.  In a manner of speaking, Big Data has turned the world into one big global experiment in which we have all unwittingly become human subjects.  And there are differences from the kind of research you are used to.

Big Data initiatives in the commercial world are not subject to classic peer review processes designed to challenge the validity of starting assumptions and the rigor of proposed methodologies.  Unlike the humbling world of scientific researchers who get grilled by their peers, and whose proposals are often rejected, sending them back to the drawing board, Big Data projects in the commercial context get no such grilling.  Google Flu Trends, which I mentioned earlier, has recently come under intense criticism in an article published in Nature, and subsequent commentary in Science, that uncovered certain flaws in its methodology ex ante, calling Google’s original claims an example of “Big Data hubris”.

Unlike publicly-funded research that increasingly requires researchers to publish their data and share the benefits of the research findings, there is no obligation for commercial organizations to publish or share their algorithms.  In fact, these algorithms, like the secret ingredients or formulae of a commercial product, are safeguarded as valuable trade secrets and exclusively owned as intellectual property.  Moreover, as the recent Nature article pointed out, the flaws in Google Flu Trends were due in part to the highly complex and dynamic algorithms, constantly being changed by Google engineers to improve commercial services, making it impossible for others to see and follow, let alone replicate.

Finally, and most importantly, unlike publicly-funded scientific research in Canada and the U.S., there is no ethical review of commercial Big Data endeavors to weigh the risks and benefits based on core principles of respect for persons, concern for welfare, fairness, and equity.

This is why Privacy Commissioners around the globe in October issued a joint resolution affirming basic fair information principles, and requiring continuous assessment of both profiles and the underlying algorithms. “This necessitates regular reviews to verify if the results from the profiling are responsible, fair and ethical, and compatible with and proportionate to the purposes for which the profiles are being used.”

It Wasn’t Me:  The Machine Made Me Do It!

Some might argue that the emerging techniques are not human interventions, but machine-made associations based on impersonal algorithms that do not see — or quite frankly care —  who we are at the individual level.   The claim is that these automated algorithms and processes are blind to who the individuals actually are and that these artificial agents are value neutral and operate without human intermediaries.

Many will recall how Target set out to analyze purchase patterns, combined with basic demographic data, to identify women likely to be in their second trimester of pregnancy — apparently a major life transition period that represents a gold mine for retailers.  Interests in maternity clothing, prenatal vitamins, unscented skin lotions, cotton balls, hand sanitizers, and washcloths helped reveal those customers to whom Target should send more targeted and personalized advertisements, thereby augmenting the ultimate effectiveness of their marketing strategies.  In the world of messy, big data, whether Target gets it right all of the time is less important than getting it right most of the time; in fact the only reason this practice came to light in a 2012 New York Times article was because Target “got it” in one case before a teen’s own father did. And when they get it wrong there is no automatic switch to turn off the algorithm, as was recently reported by women who expressed consternation at continually being bombarded with baby-related ads months after having painfully miscarried.

According to a 2013 study led by Harvard University Professor Latanya Sweeney, on “Discrimination in Online Ad Delivery”, racial bias was found in ads connected with certain search terms used in Google and Reuters.  When searching black-identifying first names (such as DeShawn, Darnell, and Jermaine), a higher percentage of ads offering services for criminal record checks appeared than was the case when searching white-identifying names like Brad, Jill, or Emma.

One can argue that machines are value-neutral, but surely it takes some human somewhere to tell the machine what to do, doesn’t it?

The claim that “the machine made me do it” conjures up an emerging literature on the concept of robotic ethics.  Can we teach robots to make ethical choices?  A recent UK research study, led by Alan Winfield at the Bristol Robotics Laboratory, found that we are not there yet.  Having programmed a robot to follow Isaac Asimov’s First Law of Robotics, that “a robot may not injure a human being or, through inaction, allow a human being to come to harm”, the researchers found that when the robot was capable of saving a human surrogate, it did so 100% of the time.  When, however, it was faced with the ethical dilemma of saving one surrogate but not the other, it failed nearly half the time to rescue either.  It became paralyzed with indecision, constantly changing its mind or “dithering” from one to the other until eventually it ran out of time and failed to save either.

Conclusion

Ethical decision-making, for the time being at least, is still the domain of human agency.  We can tell robots and machines what to do or not to do with data.  We can design algorithms for good purposes in the public interest, where societal benefits outweigh risks.  We can build in requirements from the very outset that will limit data uses to fair purposes and avoid discriminatory inferences about who we are. We can avoid non-transparent ways of profiling us without our knowledge.

While academics have a lot to gain from innovative private sector approaches that can help them accelerate their research in the public interest, my key message is that they would be well advised to do their due diligence before choosing which companies to partner with or which advertising or targeting opportunities to exploit.

In an era of big data, with all the wonderful promise it holds for society, let’s think more critically up front about the kind of world we want to live in, in both the academic and commercial worlds, and make sure we design algorithms and programmes in that light.
