
An adversarial approach to quantitative analysis of trajectory data 


Problem Statement & Scope

Recent years have seen an increase in the use of digital devices with location capabilities. Mobility/trajectory data are collected by various devices, such as smartphones and cameras. Pervasive use of these devices, although convenient, leaves a non-erasable digital trace of the user. The contextual information attached to a trace can be used to uncover the habitual patterns and activities of users. Our work quantifies the privacy leakage in a given mobility dataset after it has been anonymized with a location privacy protection mechanism (LPPM). We follow an adversarial approach: taking the role of an attacker who tries to recover the original trajectories, given the anonymized trajectory dataset and the mobility profiles of individuals.

Our problem statement can be summarized as:

"How much user information is leaked after a trajectory dataset has been protected with a Location Privacy Preservation Mechanism?"


  • Identify the right metric to measure the privacy of users in an anonymized mobility dataset

  • Quantify privacy for users in a published anonymized mobility dataset

  • Identify the best set of methods and parameters that maximize privacy for the mobility datasets



Datasets

The first dataset provides mobility traces of taxi cabs in San Francisco, USA. It contains GPS coordinates of 536 taxis collected during May 2008. Each trace record has the format [latitude, longitude, occupancy, time]: time is in UNIX epoch format, latitude and longitude are in decimal degrees, and occupancy is either 1 or 0, denoting whether or not the cab is occupied.
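As an illustration, a single record of this format can be parsed as follows. This is a minimal sketch assuming whitespace-separated fields in the order given above; the field names in the returned dictionary are our own.

```python
from datetime import datetime, timezone

def parse_record(line):
    """Parse one cab trace record: latitude, longitude, occupancy, UNIX time."""
    lat, lon, occ, ts = line.split()
    return {
        "latitude": float(lat),    # decimal degrees
        "longitude": float(lon),   # decimal degrees
        "occupied": occ == "1",    # 1 = cab has a passenger
        "time": datetime.fromtimestamp(int(ts), tz=timezone.utc),
    }

rec = parse_record("37.75134 -122.39488 0 1213084687")
print(rec["latitude"], rec["occupied"], rec["time"].year)
```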


The second dataset contains GPS traces of taxis in the city of Beijing, provided by the T-Drive project under the Microsoft Research License Agreement (MSR-LA). It contains week-long trajectories of 10,537 taxis, with around 15 million data points recorded between Feb 2 and Feb 8, 2008.


An Adversarial Approach

Our project quantifies the user location privacy of a given mobility dataset through adversarial attacks against different standard Location Privacy Preserving Mechanisms (LPPMs).


An adversary is an entity that takes the role of an attacker to recover the original traces of users, given the anonymized trajectory dataset and the mobility profiles of individuals. The adversary is assumed to know the anonymization function used to anonymize the actual traces, and to have knowledge of some part of the actual traces.


The approach can be divided into three steps:


(i) KNOWLEDGE CONSTRUCTION: The adversary builds background knowledge from the known part of the actual trace, collecting various pieces of information about the mobility of the users.


(ii) DE-OBFUSCATION: Recovering the actual locations of users, which were previously obfuscated.


(iii) TRACING ATTACK: The adversary attempts to recover the whole sequence (or a partial subsequence) of the user's actual trace. The scope of the tracing attack in our work is limited to reconstruction of the entire trajectory.
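The three steps above can be sketched on a toy, region-discretized trace. This is an illustrative sketch, not the project's implementation: it assumes a first-order Markov mobility profile as the adversary's knowledge, a uniform-noise obfuscation mechanism known to the adversary, and Viterbi decoding for the combined de-obfuscation and tracing attack.

```python
import numpy as np

rng = np.random.default_rng(0)
R = 4  # number of discretized regions (toy example)

# (i) KNOWLEDGE CONSTRUCTION: estimate a Markov mobility profile
# (transition matrix) from the part of the actual trace the adversary knows.
def build_profile(training_trace, n_regions):
    counts = np.ones((n_regions, n_regions))  # Laplace smoothing
    for a, b in zip(training_trace, training_trace[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# Obfuscation model, assumed known to the adversary: with probability
# p_noise the reported region is replaced by a uniformly random region.
p_noise = 0.3
def obs_likelihood(obs, actual, n_regions):
    hit = (1 - p_noise) + p_noise / n_regions
    miss = p_noise / n_regions
    return hit if obs == actual else miss

# (ii) + (iii) DE-OBFUSCATION and TRACING ATTACK: Viterbi decoding of the
# most likely actual region sequence given the obfuscated observations.
def tracing_attack(obs_trace, profile, n_regions):
    T = len(obs_trace)
    logp = np.empty((T, n_regions))
    back = np.zeros((T, n_regions), dtype=int)
    logp[0] = [np.log(obs_likelihood(obs_trace[0], r, n_regions) / n_regions)
               for r in range(n_regions)]
    for t in range(1, T):
        for r in range(n_regions):
            scores = logp[t - 1] + np.log(profile[:, r])
            back[t, r] = int(np.argmax(scores))
            logp[t, r] = scores[back[t, r]] + np.log(
                obs_likelihood(obs_trace[t], r, n_regions))
    path = [int(np.argmax(logp[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

actual = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]           # toy actual trace
profile = build_profile(actual, R)
obs = [x if rng.random() > p_noise else int(rng.integers(R)) for x in actual]
guess = tracing_attack(obs, profile, R)
recovered = sum(g == a for g, a in zip(guess, actual)) / len(actual)
print(f"fraction of trace recovered: {recovered:.2f}")
```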


Trace Visualization 

San Francisco Bay Area - 99 regions


Quantifying Privacy of Users

  • Distortion/correctness was identified as the metric to quantify the success of the adversarial attack and measure users' privacy

  • Correctness/distortion measures how much of the original trace was reconstructed by the adversary

  • Correctness measures the distance between the adversary's outcome and the actual outcome; the actual outcome is what we want to hide from the adversary

  • If the distance is 0, the adversary's outcome and the actual outcome are equal: correctness is maximal, that is, the adversary has complete knowledge and is able to identify the actual trace of the user
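A minimal sketch of such a distortion metric, assuming both traces are time-aligned sequences of (latitude, longitude) points and using great-circle (haversine) distance; the function names are illustrative, not the project's actual code:

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def distortion(actual_trace, reconstructed_trace):
    """Average distance between the adversary's outcome and the actual
    trace; 0 means the attack recovered the trace exactly (no privacy)."""
    return sum(haversine_km(a, r)
               for a, r in zip(actual_trace, reconstructed_trace)) / len(actual_trace)

actual = [(37.7749, -122.4194), (37.7793, -122.4193)]
print(distortion(actual, actual))  # identical traces -> 0.0
```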

Parameters to maximize Privacy

  • The impact of various LPPM parameters on privacy was studied. We tuned these parameters to achieve increasing privacy

  • These parameters included:

  • Number of Regions 

  • Sampling Level

  • Location Hiding Level

  • Hiding Levels Probability

  • Fake Noise Injection Probability

  • One question remains: "How usable will the dataset be to a researcher if the actual information is hidden and distorted?" The scope of our project was limited to the adversary's reconstruction of user traces; evaluating the usability of datasets after applying these LPPMs is left as future work.
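As a sketch of how the hiding and fake-injection probabilities above might enter an LPPM on a region-discretized trace: the parameter names (p_hide, p_fake) and the mechanism below are illustrative assumptions, not the project's actual configuration.

```python
import random

def lppm(trace, n_regions, p_hide=0.2, p_fake=0.1, rng=None):
    """Obfuscate a region-level trace: each event is hidden (reported as
    None) with probability p_hide, replaced by a fake region with
    probability p_fake, and reported truthfully otherwise."""
    rng = rng or random.Random(0)
    out = []
    for region in trace:
        u = rng.random()
        if u < p_hide:
            out.append(None)                       # location hiding
        elif u < p_hide + p_fake:
            out.append(rng.randrange(n_regions))   # fake/noise injection
        else:
            out.append(region)                     # truthful report
    return out

obfuscated = lppm([3, 7, 7, 12, 15], n_regions=99)
print(obfuscated)
```

Raising p_hide or p_fake increases privacy (the adversary's reconstruction degrades) at the cost of dataset utility, which is exactly the trade-off noted above.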


For more Information

Code Link:

Report Link:

Our Team












Our Mentor


Our Sponsor
