Advisor: Jeff Ondich
Your mobile phone talks to the internet when you ask it to. For example, you might use a browser to go to a website, or use an app store to download a new game.
But mobile operating systems also allow apps to run background threads, and many of those threads communicate with remote servers without direct user interaction. Sometimes, an app might communicate with its creators' servers to deliver or request information the app needs to do its job. (After all, a news app can't give you lock-screen news updates without occasionally requesting updated news from a server.) But there is a lot of circumstantial evidence that apps also call back to the mothership to report ostensibly private data about the phone's user.
I'm curious about what exactly is going on with my phone. How much can we learn by monitoring the traffic on our own devices?
In 2017, tech reporter Kashmir Hill was curious about something similar—how much were the smart devices in her home talking to the internet? So she collaborated with artist/engineer/journalist Surya Mattu to set up and spy on a houseful of smart devices. Their report, The House That Spied on Me, did a great job of illuminating just how chatty the typical internet-connected device can be. Working from their inspiration, I want to focus more narrowly on smartphones and dig deeper into the data that's directly available to us as hackers and owners of our own phones.
Another inspiration for me is the big collection of articles from December 2019 by Stuart A. Thompson and Charlie Warzel: One Nation, Tracked. These articles, based on the New York Times's acquisition of a large dataset of cell phone location data, explored the question of what kinds of inferences you could draw from such a dataset. What could you learn about the lives of people individually and in aggregate by looking at the locations of their phones over time?
In this comps project, you will set up and/or write tools to observe the network traffic on your own phone, and you will collect and analyze data about that traffic. You will also think about how that data could be used when it describes just one person, and how it could be used when it describes millions of people.
Here's what you will do.