Short: Can Android Applications Be Identified Using Only TCP/IP Headers of Their Launch Time Traffic?

Hasan Faik Alan, Jasleen Kaur

The ability to identify mobile apps in network traffic has significant implications in many domains, including traffic management, malware detection, and maintaining user privacy. App identification methods in the literature typically use deep packet inspection (DPI) and analyze HTTP headers to extract app fingerprints. However, these methods cannot be used if HTTP traffic is encrypted. We investigate whether Android apps can be identified from their launch-time network traffic using only TCP/IP headers. We first capture network traffic of 86,109 app launches by repeatedly running 1,595 apps on 4 distinct Android devices. We then use supervised learning methods used previously in the web page identification literature to identify the apps that generated the traffic. We find that: (i) popular Android apps can be identified with 88% accuracy, by using the packet sizes of the first 64 packets they generate, when the learning methods are trained and tested on the data collected from the same device; (ii) when the data from an unseen device (but similar operating system/vendor) is used for testing, the apps can be identified with 67% accuracy; (iii) the app identification accuracy does not drop significantly even if the training data are stale by several days; and (iv) the accuracy does drop quite significantly if the operating system/vendor is very different. We discuss the implications of our findings as well as open issues.

Review:
The paper proposes a machine learning technique to identify mobile applications based on the network traffic they generate at launch. While prior work has performed similar experiments focusing on application-layer data (using deep packet inspection), the experiments in this paper use only TCP/IP headers. This matters because, with the increasing use of end-to-end encryption, deep packet inspection may not always provide meaningful results.

The reviewers particularly liked the large-scale data collection that was done for this paper. For their analysis, the authors performed over 86,000 app launches on 4 distinct devices, which provided plenty of training data for their supervised learning method. The authors demonstrated that popular apps can be identified with relatively high accuracy (88% when training and testing on data from the same device) based only on their launch traffic.
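
To make the header-only setup concrete, the following is a minimal sketch of how the packet sizes of the first 64 packets of a launch could be turned into a fixed-length feature vector and fed to a supervised classifier. It is not the authors' implementation: the nearest-neighbor classifier, the Manhattan distance, the zero-padding, the signed-size convention, and the names launch_features, distance, and predict are all assumptions made for illustration, and the traces are made up.

```python
from collections import Counter

NUM_PACKETS = 64  # the abstract reports results for the first 64 packets


def launch_features(packet_sizes, n=NUM_PACKETS):
    """Truncate or zero-pad a launch's packet sizes to a fixed-length vector."""
    head = list(packet_sizes[:n])
    return head + [0] * (n - len(head))


def distance(a, b):
    """Manhattan distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict(train, sizes, k=1):
    """Return the majority label among the k nearest training launches.

    A stand-in classifier (assumption); the paper uses methods from the
    web page identification literature instead.
    """
    query = launch_features(sizes)
    ranked = sorted(train, key=lambda item: distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up traces. Signed sizes (+ outgoing, - incoming) are an
# assumption of this sketch, not something stated in the abstract.
train = [
    (launch_features([60, -60, 52, 583, -1500, -1500]), "app_a"),
    (launch_features([60, -60, 52, 210, -400]), "app_b"),
]

print(predict(train, [60, -60, 52, 590, -1500, -1400]))
```

Only values available in TCP/IP headers (here, packet sizes) enter the feature vector, so the same pipeline would apply unchanged to encrypted traffic.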

A common theme throughout the reviews was that despite being a short paper, the authors managed to pack a significant amount of insight beyond simply identifying applications. For instance, the authors also looked at the impact of application updates and different devices on detection accuracy.