I am an Associate Professor in Computer Science at De Montfort University in Leicester, UK. My research focuses on privacy technologies, privacy metrics, transparency, and smart cities.

Methods of Corporate Surveillance: a Primer on Experimental Transparency Research


News headlines about privacy invasions, discrimination, and biases discovered on the platforms of big technology companies are commonplace today. These headlines, which range from comprehensive profiling of users, to microtargeting of political messages, to discrimination based on gender, race, and age, show that big tech's operations can cause real-world harm. However, big tech companies are reluctant to disclose how they operate and typically do not release specific information, such as what data they collect or how their algorithms make decisions. This secrecy runs counter to the ideals of transparency, openness, and accountability.

This tutorial will present research methods and findings from the last 5-10 years that use large-scale experiments to systematically interact with the publicly accessible elements of big tech's platforms. These experiments make it possible to infer details about big tech's hidden operations, in essence conducting meta-surveillance against big tech. For example, findings from these experiments have documented how user tracking works, how extensive tracking on the web is today, and how the collected information is used for algorithmic decision-making and ad targeting.

Tutorial schedule

The tutorial will be held at The Web Conference (WWW 2020) in Taipei, on April 21st (morning).

Indicative contents:

  1. The corporate surveillance landscape
    • Corporate surveillance and the need for transparency
    • Consumer-facing services (e.g., mail, search, ride-sharing)
    • Methods of corporate surveillance: Tracking (stateful vs. stateless, cross-device, mobile, cookie synchronization), Profiling (contents, sharing, data brokers), Advertising (ad targeting, ad auctions, real-time bidding)
  2. Methods for meta-surveillance research
    • Experiment design: Challenges of studying black-box systems, Variables under experimenter’s control and methods to control them (virtual personas, crowdsourced user interaction), Response variables
    • Data collection: Automation of browsers and apps, Active traffic capture (desktop vs. mobile, wired vs. wireless, plaintext vs. encrypted), Passive traffic capture (ISPs, search/mail providers)
    • Data analysis: Statistics, statistical tests (causal inferences, statistical significance, analysis of differences), Static and dynamic analysis of mobile apps, Natural language processing, Machine learning, Performance measures (to quantify bias, transparency, privacy, etc.)
  3. Results from transparency research
    • Transparency for corporate surveillance methods (e.g., fingerprinting, cross-device tracking, cookie synchronization)
    • Transparency for corporate services (e.g., advertising, search, ride-sharing)
    • Effectiveness of defense mechanisms (e.g., ad blocking, anti-fingerprinting)
  4. Gaps and challenges in transparency research
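
As a small illustration of the "virtual personas" and "statistical tests" items in part 2 of the outline, the sketch below compares a response variable between two groups of personas using a permutation test on the difference of means. It is a minimal sketch and not taken from the tutorial materials: the function name permutation_test, the choice of test, and the ad counts are all hypothetical and chosen only for illustration.

```python
import random


def permutation_test(control, treatment, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns (observed_difference, p_value). Hypothetical helper, written
    only to illustrate the idea of testing for differences between personas.
    """
    rng = random.Random(seed)
    observed = sum(treatment) / len(treatment) - sum(control) / len(control)

    pooled = list(control) + list(treatment)
    n_treat = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign observations to the two groups.
        rng.shuffle(pooled)
        perm_treat = pooled[:n_treat]
        perm_ctrl = pooled[n_treat:]
        diff = sum(perm_treat) / len(perm_treat) - sum(perm_ctrl) / len(perm_ctrl)
        # Count reassignments at least as extreme as the observed split
        # (small tolerance guards against floating-point rounding).
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up data: number of ads from one category shown to each persona.
# "control" personas browsed neutrally; "treatment" personas first
# visited sites on a particular topic (the variable under our control).
control_counts = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3]
treatment_counts = [6, 7, 5, 8, 6, 7, 9, 5, 6, 7]

observed, p = permutation_test(control_counts, treatment_counts)
print(f"difference in mean ad count: {observed:.2f}, p-value: {p:.4f}")
```

A permutation test is used here because it makes few distributional assumptions, which suits black-box measurements. A real study would also need to control for confounders such as location, time of day, and browser state.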