The way we interact with our digital devices has evolved over time: from typed commands in command-line interfaces, to graphical user interfaces (GUIs), to touch-based interfaces. Virtual assistants (VAs) are the next step in this evolution, and they present new privacy challenges. These assistants, such as Siri (Apple), Alexa (Amazon), Cortana (Microsoft) or simply ‘Google’, are designed to respond to your spoken or written commands and take some action. Such commands let you place phone calls, order a car service, book a calendar appointment, play music or buy goods.
The use of these assistants is on the rise: a 2015 Gartner study found that 38 per cent of Americans had used a virtual assistant in 2015, and predicted that two-thirds of customers in developed markets would use one daily in 2016. The most commonly used VAs are voice-based; however, much of the information presented here also applies to text-based VAs.
Getting to know you, passively or actively
VAs access and analyze passive and active information to provide services to you. A VA that relies primarily on passive data inputs may operate without direct user intervention; the same cannot be said of active data inputs.
Passive data analysis entails collecting data about a user from pre-existing, or ongoing, data sources such as calendars, emails, geolocation or web browsing history. The result of the analysis is presented to you when the VA believes it will be most useful. For example, in the morning you might be prompted with the weather for the day and a note that it will take longer than normal to get to your morning appointment (found in your calendar), based on weather patterns (geolocation and weather services) and traffic (maps and transit habits). Similarly, when you sit down for coffee, a series of news stories will be presented to you, selected on the basis of past web browsing habits and the interests that you indicated to the VA.
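To make the mechanics concrete, here is a minimal sketch of how such a morning briefing could be assembled from passive sources. It is purely illustrative: the data sources, field names and travel-time arithmetic are assumptions, not any vendor’s actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical record the assistant has already collected passively.
@dataclass
class Appointment:
    title: str
    start: datetime
    location: str

def morning_briefing(appointment: Appointment,
                     usual_travel_minutes: int,
                     traffic_delay_minutes: int,
                     forecast: str) -> str:
    """Combine calendar, transit-habit and weather data into one prompt.

    All inputs are assumed to come from passive sources: a calendar
    entry, the user's historical travel times, live traffic data and a
    weather service. Nothing here requires a direct user request.
    """
    leave_by = appointment.start - timedelta(
        minutes=usual_travel_minutes + traffic_delay_minutes)
    briefing = [f"Today's forecast: {forecast}."]
    if traffic_delay_minutes > 0:
        briefing.append(
            f"Traffic adds ~{traffic_delay_minutes} min to your usual "
            f"{usual_travel_minutes} min trip to '{appointment.title}'; "
            f"leave by {leave_by:%H:%M}.")
    return " ".join(briefing)

print(morning_briefing(
    Appointment("Team stand-up", datetime(2024, 5, 6, 9, 30), "Office"),
    usual_travel_minutes=25, traffic_delay_minutes=15, forecast="rain"))
```

Note that even this toy briefing silently joins four separate data sources; the user sees only the helpful output, not the collection behind it.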
Active data analysis, in contrast, involves you making a direct request of the assistant through specific voice or text input. VAs typically have similar basic components for processing active requests: an ‘input’ mechanism (for voice or text), a server-side component that processes the natural language input to translate it into machine-usable instructions, an instruction implementation layer and an interface to present the returned data to you (either voice or a screen). Voice-based VAs in active mode often listen persistently for a cue that triggers the assistant to process an instruction. As a result, the microphone(s) of the devices they’re installed on may be constantly on, waiting for the particular words that initiate processing of the instruction set.
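The active pipeline can be pictured as a loop: persistent listening for a wake word, server-side interpretation of the utterance, an implementation layer that executes the instruction, and an output step. The sketch below is a simplified mock, with a made-up wake word and a toy intent table standing in for the proprietary components a real VA would use.

```python
from typing import Optional

WAKE_WORD = "hey assistant"  # hypothetical trigger phrase

def parse_intent(utterance: str) -> dict:
    """Stand-in for the server-side natural language component.
    In a real VA this step happens in the vendor's cloud."""
    if "weather" in utterance:
        return {"action": "get_weather"}
    if "call" in utterance:
        return {"action": "place_call", "target": utterance.split()[-1]}
    return {"action": "unknown"}

# Instruction implementation layer: maps parsed intents to handlers.
HANDLERS = {
    "get_weather": lambda intent: "It is 18 degrees and sunny.",
    "place_call": lambda intent: f"Calling {intent['target']}...",
    "unknown": lambda intent: "Sorry, I didn't catch that.",
}

def handle_audio(transcript: str) -> Optional[str]:
    """Persistent-listening step: audio is heard continuously, but
    processing is only triggered once the wake word is detected."""
    if not transcript.lower().startswith(WAKE_WORD):
        return None  # ignored: no instruction processing is triggered
    command = transcript[len(WAKE_WORD):].strip()
    intent = parse_intent(command)              # server-side NLP (mocked)
    return HANDLERS[intent["action"]](intent)   # implement, then present

print(handle_audio("hey assistant what's the weather today"))
print(handle_audio("background chatter, not a command"))
```

The privacy point lives in the first branch of handle_audio: the microphone hears everything, and the wake-word check is the only gate between ambient audio and the processing pipeline.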
Enhanced and personalized services
VAs also make inferences about what you’re interested in, based on general assumptions about you. The assistant is configured to assume certain characteristics about you, which may or may not be accurate depending on how well you match the assumptions built into the algorithms the VA relies on to make suggestions. For instance, the assistant may assume, based on what it knows or infers about your financial situation and physical mobility, that you’re wealthy enough to take a taxi, and able enough to walk to your next calendar appointment in a specific amount of time.
The suggestions offered by VAs may be more or less relevant depending on how ‘good’ their predictions are. Many VAs improve their accuracy and usefulness using profiles developed about you based on your past online use, purchases, behaviour, location, and so on.
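A toy version of this profile-driven guesswork is sketched below. The profile fields, thresholds and suggestion rules are invented for illustration; the point is that the output shifts with assumed attributes such as mobility and income, whether or not those assumptions are accurate.

```python
from dataclasses import dataclass

@dataclass
class InferredProfile:
    # Attributes the assistant has *inferred*, not verified.
    assumed_income_band: str         # e.g. "high", guessed from purchases
    assumed_walking_pace_kmh: float  # guessed from past movement data

def suggest_transport(profile: InferredProfile,
                      distance_km: float,
                      minutes_until_meeting: int) -> str:
    walk_minutes = distance_km / profile.assumed_walking_pace_kmh * 60
    if walk_minutes <= minutes_until_meeting:
        return f"Walk ({walk_minutes:.0f} min) to your appointment."
    if profile.assumed_income_band == "high":
        return "Take a taxi to arrive on time."
    return "Take transit; you may be a few minutes late."

# If the inferred attributes are wrong (the user cannot walk at this
# pace, or cannot afford a taxi), the suggestion misses the mark.
profile = InferredProfile("high", assumed_walking_pace_kmh=5.0)
print(suggest_transport(profile, distance_km=2.0, minutes_until_meeting=20))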
Who’s listening, and reading?
VAs typically rely on cloud computing infrastructure to ascertain what a given utterance means. Your profile(s) can be stored either locally or on a company server. This means that profile information could be available to unauthorized persons, although there’s no evidence that a third party has yet inappropriately extracted or used information contained in a local or cloud VA profile.
IBM disabled, and banned the use of, other companies’ VAs out of concern that sensitive corporate data might be retained on another company’s servers. Such concerns may be accentuated by the fact that some VA companies provide samples of users’ queries to third-party transcription services. Though such samples are typically separated from a user ID, they can still identify an individual who identifies themselves, or discloses personally identifying information, to the VA.
The manner in which audio data is collected may result in capturing information from persons other than the person using the device (e.g., background conversations). Further, when passive data is used by a VA, other persons’ information might also be processed by the assistant. This could include personal emails or communications that were generally meant to be kept private. There’s also the issue of companies providing copies of inputs and responses to third parties to evaluate the effectiveness of natural language processing: this may expose your personal information to parties unknown.
Some companies will disassociate a user’s ID from passive or active engagements with a VA after a given period of time, but still retain the actual inputs for research purposes. Given that queries or presentations of information may themselves include personal information, questions of data retention, appropriate use, safeguards and anonymization policies come to the fore.
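The practice described above can be pictured as follows. In this hypothetical sketch (the retention window and record layout are assumptions), the user ID is dropped from aged records while the raw input is retained; note that the retained text can still contain personal information.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class QueryRecord:
    user_id: Optional[str]
    text: str                # the raw utterance itself is retained
    timestamp: datetime

RETENTION_WINDOW = timedelta(days=180)  # hypothetical policy period

def disassociate_old_records(records: list, now: datetime) -> None:
    """Drop the user ID from aged records, keeping the input itself."""
    for record in records:
        if now - record.timestamp > RETENTION_WINDOW:
            record.user_id = None  # 'anonymized' for research use

records = [QueryRecord("user-42",
                       "remind me about my appointment with Dr. Lee",
                       datetime(2023, 1, 10))]
disassociate_old_records(records, now=datetime(2024, 1, 10))
# The ID is gone, but the text still names the user's doctor:
print(records[0])
```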
The potential uses of virtual assistants are growing as companies like Apple and Google expand the number of partners who can take advantage of their assistants. VAs can access information held in a wide range of services (e.g., calendar, email), including services to which you subscribe (e.g., Lyft, OpenTable), in order to present information to you. The range of information that can be accessed will continue to grow as more applications are updated to support interactions with these VAs.
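Mechanically, such integrations usually let a partner service register a handler for particular intents so the assistant can route requests to it. The registry below is a generic sketch rather than any specific vendor’s SDK; the handler names and parameters are hypothetical, and the services involved are just the examples mentioned above.

```python
# Generic sketch of a partner-integration registry; real VA platforms
# differ in detail and impose their own review and permission rules.
PARTNER_HANDLERS = {}

def register_partner(intent_name):
    """Decorator a partner service might use to claim an intent."""
    def wrap(handler):
        PARTNER_HANDLERS[intent_name] = handler
        return handler
    return wrap

@register_partner("book_table")
def table_booking_handler(params):
    # The partner receives only the fields the VA routes to it...
    return f"Reserved a table for {params['party_size']} at {params['time']}."

@register_partner("request_ride")
def ride_handler(params):
    # ...but each new integration widens the set of services that can
    # receive pieces of your personal information.
    return f"A car is on its way to {params['pickup']}."

def route(intent_name, params):
    handler = PARTNER_HANDLERS.get(intent_name)
    return handler(params) if handler else "No partner handles that yet."

print(route("book_table", {"party_size": 2, "time": "19:00"}))
```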