
Android phones keep location cache, too, but it’s harder to access

Following the revelation that iPhones log location data even if you don't use …

After this week's disturbing revelation that iPhones and 3G iPads keep a log of location data based on cell tower and WiFi base station triangulation, developer Magnus Eriksson set out to demonstrate that Android smartphones store the exact same type of data for their location services. While the data is harder to access for the average user, it's just as trivial to access for a knowledgeable hacker or forensics expert.

On Wednesday, security researchers Alasdair Allan and Pete Warden revealed their findings that 3G-capable iOS devices keep a database of location data based on cell tower triangulation and WiFi basestation proximity in a file called "consolidated.db." The iPhone, as well as 3G-equipped iPads, generate this cache even if you don't explicitly use location-based services. This data is also backed up to your computer every time the device is synced with iTunes. Warden wrote an application that can find, parse, and map the location data on a user's computer, provided the iOS device backups are not encrypted (encryption is optional).

Allan and Warden's findings sparked major concerns over privacy, leading some to speculate that Apple was tracking all iPhone users. The controversy prompted letters from Senator Al Franken (D-MN) and US Representative Ed Markey (D-MA) demanding that Apple answer questions about how the data is collected, how or when it is sent to Apple, and how Apple could protect a user's privacy.

Later on Wednesday, iOS data forensics expert Alex Levinson revealed that the consolidated.db file was not new (iOS has kept the same information in the past, just in a different database), nor was its existence necessarily a secret: Levinson had collaborated on a book with fellow security researcher Sean Morrissey that discussed consolidated.db in detail.

Eriksson suspected that his Android device collected similar information. "Following the latest internet outrage to the revelation that iPhone has a cache for its location service, I decided to have look what my Android device caches for the same function," he wrote in a note on GitHub. He put together an application similar to Warden's based on open source cache parsing code, which extracts data from "cache.cell" and "cache.wifi" and displays it on a map.

Like iOS, Android stores these databases in an area that is only accessible by root. To access the caches, an Android device needs to be "rooted," which removes most of the system's security features. Unlike iOS, though, Android phones aren't typically synced with a computer, so the files would need to be extracted from a rooted device directly. This distinction makes the data harder to access for the average user, but easy enough for an experienced hacker or forensic expert.

Another important difference, according to developer Mike Castleman, is that Android keeps less data overall than iOS devices. "The main difference that I can see is that Android seems to have a cache versus iOS's log," Castleman, who contributed some code improvements to Eriksson's tool, told Ars. That is, Android appears to limit the caches to 50 entries for cell tower triangulation and 200 entries for WiFi basestation location. iOS's consolidated.db, on the other hand, seems to keep a running tally of data from the time iOS is first installed and activated on a device. iOS will also keep multiple records of the same tower or basestation, while Android only keeps a single record.
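
To make the cache-versus-log distinction concrete, here is a minimal Python sketch. It is an illustration only, not code from either platform or from Eriksson's tool: the class names, the `simulate` helper, and the oldest-first eviction rule are all assumptions, and only the 50- and 200-entry limits come from the reporting above.

```python
from collections import OrderedDict

CELL_CACHE_LIMIT = 50    # cell tower limit reported for Android
WIFI_CACHE_LIMIT = 200   # WiFi basestation limit reported for Android


class BoundedLocationCache:
    """Cache-style store: one record per station, capped at max_entries."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._records = OrderedDict()

    def record(self, station_id, lat, lon, seen_at):
        # A repeat sighting replaces the earlier record for that station.
        self._records.pop(station_id, None)
        self._records[station_id] = (lat, lon, seen_at)
        # Assumption: the least recently seen station is evicted when full.
        while len(self._records) > self.max_entries:
            self._records.popitem(last=False)

    def __len__(self):
        return len(self._records)


class LocationLog:
    """Log-style store: every sighting is appended, duplicates included."""

    def __init__(self):
        self._rows = []

    def record(self, station_id, lat, lon, seen_at):
        self._rows.append((station_id, lat, lon, seen_at))

    def __len__(self):
        return len(self._rows)


def simulate(sightings, distinct_stations=120):
    """Feed the same made-up sightings to both stores and return their sizes."""
    cache = BoundedLocationCache(CELL_CACHE_LIMIT)
    log = LocationLog()
    for i in range(sightings):
        station_id = f"tower-{i % distinct_stations}"
        for store in (cache, log):
            store.record(station_id, 0.0, 0.0, seen_at=i)
    return len(cache), len(log)
```

Under these assumptions, the cache never grows past its limit no matter how long the phone is in use, while the log grows with every sighting, which is why the log-style file can describe a much longer history of movement.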

Regardless of those differences, however, the data could be used in the same way. For instance, said Castleman, "if you were arrested or something shortly after a crime was committed, either device would contain evidence that could be used against you."

The data in these caches is used when GPS data isn't available, or to more quickly narrow down a location while GPS services are being polled (known as "assisted" or aGPS). Apple and Google both collect some of this data to build and maintain databases of known cell tower and WiFi basestation locations. Both companies previously used similar data from Skyhook, but both recently moved to building and using their own databases (presumably for cost and/or performance reasons).

A security researcher revealed to the Wall Street Journal that Google is also collecting a wide variety of location data from Android devices which could lead to privacy breaches. "According to new research by security analyst Samy Kamkar, an HTC Android phone collected its location every few seconds and transmitted the data to Google at least several times an hour," the WSJ reported. 

Like Apple, Google uses the data to improve its internal cell tower and WiFi location database or to improve call routing, but it also uses the data to improve Google Maps and collect information about traffic patterns. The problem with Google's data collection is that, unlike the information sent to Apple, the information sent to Google contains a unique identification number that can be tied to a particular phone. While technically anonymous, that number could potentially be used to trace back to an individual user.

The fact that smartphones equipped with GPS could be used to track individual users isn't new, and a recent Nielsen survey revealed that many users are extremely wary about privacy when using location-based services via a mobile device. However, the details revealed in the past few days about the extent of location data collection and how easy it can be to access it have heightened privacy concerns even further.

UPDATE: Google spokesperson Randall Sarafa contacted Ars to clarify that its data collection practices are opt-in, as is Apple's. "All location sharing on Android is opt-in by the user. We provide users with notice and control over the collection, sharing and use of location in order to provide a better mobile experience on Android devices," he told Ars.

Furthermore, he explained that the unique identifier number is random, not hashed from the unique IMEI or MEID number associated with all mobile devices. "Any location data that is sent back to Google location servers is anonymized and is not tied or traceable to a specific user," Sarafa said. However, as researchers have shown numerous times in the past, "anonymized" data can often be analyzed and correlated with a single person with surprising accuracy.
