Where to find data for your RFP – Data Acquisition 101

It is fairly common for me to receive a Request for Production (RFP) with a data acquisition request to obtain “emails from the phone” or “Facebook posts from the phone”. These types of requests require me to educate the attorney (and often, by proxy, the judge) about where different types of data can be found, and to explain that a phone is not the most comprehensive source of emails and social media posts. I don’t mind providing this information (it’s my job!), but to save time, I thought I should write it up as a guide that could come in handy when drafting an RFP.

Mobile Devices

A smartphone might seem to have tons of information on it, but in reality, it holds very limited information:

  • Text messages
  • Photos taken with the phone, along with date and time stamp
  • Phone call logs
  • Messages sent/received with chat apps like WhatsApp and Signal
  • Email headers (not contents of emails)
  • Old/deleted messages, call logs, and photos (to some degree; depends on certain factors)

Information that is usually not on a mobile device, or available only in a very limited capacity, and better sourced elsewhere:

  • Email contents
  • Social media posts
  • Watch activity on YouTube, Netflix, etc.
  • How long a user spent on a specific app
  • Origin of photos posted to social media
  • Logins/passwords for social media apps, banking apps, etc.

To find this other information, it is necessary to go to where this information is stored: on the servers of the app provider.

Emails

To obtain a comprehensive set of emails, it is necessary to go to the email server itself.

The most direct approach to get the emails is for the custodian of the email to request to download an archive of all their own emails. For example, if the email account is with Gmail, the custodian can request to download data for their own Gmail account through the Google Takeout service. With the same form, the user can also request data from Google-owned apps like YouTube and Fitbit.

The data is usually not ready right away — Google will email the custodian when the download is ready. However, the custodian must then log into Google with their own account to get the download.

To preserve forensic soundness, it might seem that the analyst should log into the custodian’s account and download the archive, but this can be problematic. Google is (rightly) very security-conscious when it comes to these archives, and will question this new login and put up a few roadblocks, leading to excessive time being spent getting into the account. It also requires the custodian to give the analyst their password, which we like to avoid for the sake of the custodian’s peace of mind.

A better approach is to have the custodian meet with the digital forensics analyst, at the office of one or the other, with the custodian’s laptop on hand. The analyst plugs their own external drive into the laptop, and the custodian downloads the archive directly to that drive. This preserves forensic soundness, as the analyst can attest that they watched the archive go straight onto their drive. The analyst then unplugs the drive and the two part ways, with the analyst going off to parse emails, social media activity, and any other data requested by the RFP.

For work emails, the company can provide a .PST (Personal Storage Table) file, a common format for emails. Any decent analyst can readily work with a PST file to inspect emails.
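For a Gmail archive obtained through Google Takeout, the mail typically arrives as an MBOX file, which Python's standard `mailbox` module can read. The sketch below lists the date, sender, and subject of each message as a first inventory; it is an illustration under the assumption that the export is a well-formed MBOX file, and the function name is hypothetical. A PST file needs a different tool and is not covered here.

```python
import mailbox


def list_messages(mbox_path):
    """Return (date, sender, subject) header tuples for each message in an MBOX file."""
    rows = []
    # create=False makes a wrong path raise an error instead of creating an empty file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # A missing header comes back as None rather than raising.
            rows.append((message["Date"], message["From"], message["Subject"]))
    finally:
        box.close()
    return rows
```

The function only reads headers and leaves the file on disk untouched, so it can be run against a working copy of the archive while the original stays sealed.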

Social media activity

All the large social media platforms provide a method for downloading one’s own data such as posts, comments, likes, and private messages. The process is similar to the Gmail process, with each platform having its own portal for such requests. By and large, the archive isn’t available right away, and the custodian receives an email when the download is ready.

The same process as for Gmail above should be employed to ensure the data is forensically sound, without the custodian having to give up their password or the analyst going through a lengthy (and thus costly) process of logging in on behalf of the custodian.

Instructions for each social media platform can be found by searching for “[platform name] download my data.”

Watch activity

Streaming services like Netflix give custodians the ability to download their watch history. While this has rarely been needed on a case, it does come up from time to time.

Time spent on app

As for the amount of time a person spent using an app, this is best measured by their activity on the app such as posts and comments, which can be obtained from the downloaded archive.

There is currently no way to measure how long a person looked at an app, as an app can be left open on a phone while the person puts it down or hands it to another person.
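To illustrate what activity-based measurement can look like, the sketch below groups timestamps of posts and comments into sessions of continuous activity. It assumes the timestamps have already been extracted from the downloaded archive, and the 30-minute gap is an arbitrary threshold chosen for illustration, not a forensic standard. The result shows when the account was active, not how long anyone was looking at the screen.

```python
from datetime import datetime, timedelta


def group_into_sessions(timestamps, max_gap=timedelta(minutes=30)):
    """Group activity timestamps into sessions separated by more than max_gap.

    Returns a list of (first_event, last_event, event_count) tuples.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][1] <= max_gap:
            # Close enough to the previous event: extend the current session.
            start, _, count = sessions[-1]
            sessions[-1] = (start, ts, count + 1)
        else:
            # Gap too large (or first event): start a new session.
            sessions.append((ts, ts, 1))
    return sessions


# Hypothetical activity times for one day.
events = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 10),
    datetime(2024, 1, 1, 9, 25),
    datetime(2024, 1, 1, 14, 0),
]
sessions = group_into_sessions(events)
```

With these sample events, the three morning actions fall into one session and the afternoon action forms a second, single-event session.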

In summary

I hope this guide helps you organize the gathering of data for your case and write RFPs that lead to a speedy resolution.