Social media has become a notable source of potential forensic evidence, with social media giant Facebook being a primary source of interest. With over 1.35 billion monthly active users as of September 30, 2014 [1], Facebook is considered the largest social networking platform.

Kivu is finding that forensic collection of Facebook (and other sources of social media evidence) can be a significant challenge because of these factors:

1. Facebook content is not a set of static files, but rather a collection of rendered database content and active programmatic scripts. It’s an interactive application delivered to users via a web browser. Each page of delivered Facebook content is uniquely created for a user on a specific device and browser. Setting aside the authentication and legal evidentiary issues, screen prints or PDF printouts of Facebook web pages often do not suffice for collecting this type of information. They miss parts of what would have been visible to the user, including, interestingly, the unique ads that were tailored to the specific user based on their preferences and prior viewing habits.

2. Most forensic collection tools have limitations in the capture of active Internet content, and this includes Facebook. Specialized tools, such as X1 Social Discovery and PageFreezer, can record and preserve Internet content, but gaps remain in the use of such tools. The forensic collection process must adapt to address the gaps (e.g., X1 Social Discovery does not capture all forms of video).

Below are guidelines that we at Kivu have developed for collecting Facebook account content as forensic evidence:

1. Identify the account or accounts that will be collected – Determine whether or not the custodian has provided their Facebook account credentials. If no credentials have been provided, the investigation is a “public collection.” That is, the collection needs to be based on what a Facebook user who is not “friends” with the target individual (or friends with any of the target individual’s friends, depending on how the target individual has set up their privacy settings) can access. If credentials have been provided, it is considered a “private collection,” and the investigator will need to confirm the scope of the collection with attorneys or the client, including what content to collect.

2. Verify the ownership of the account – Verifying an online presence through a collection tool as well as a web browser is a good way to validate the presence of the target account.

3. Identify whether friends’ details will be collected.

4. Determine the scope of collection (e.g., the entire account or just photos).

5. Determine how to perform the collection – Which tool or combination of tools will be most effective? Make sure that your tool of choice can access and view the target profile. X1 Social Discovery, for example, uses the Facebook API to collect information from Facebook. The Facebook API is documented and provides a foundation for consistent collection, in contrast to a custom-built application that may not be entirely validated. Further, Facebook collections from other sources, such as cached Google pages, provide a method of cross-validating the data targeted for collection.

6. Identify gaps in the collection methodology.

a. If photos are of importance and there is a large volume of photos to be collected, a batch script that can export all photos of interest can speed up the collection process. One method of doing so is a mouse recording tool.

b. Videos do not render properly while being downloaded for preservation, even when using forensic capture tools such as X1 Social Discovery. If videos are an integral part of an investigation, the investigator will need to capture videos in their native format in addition to testing any forensic collection tool. Tools are available to download the videos, and these tools in combination with forensic collection tools such as X1 Social Discovery provide the capability to authenticate and preserve video-based evidence.

7. Define the best method to deliver the collection – If there are several hundred photos to collect, determine whether all photos can be collected. Identify whether an automated screen capture method is needed.

8. If the collection is ongoing (e.g., once a week), define the recurring collection parameters.

Kivu is a licensed California private investigations firm, which combines technical and legal expertise to deliver investigative, discovery and forensic solutions worldwide. Author Katherine Delude is a Digital Forensic Analyst in Kivu’s San Francisco office. To learn more about forensically preserving Facebook content, please contact Kivu.

[1] Accessed 11 December 2014.

Internet technology poses a substantial challenge to the collection and preservation of data, and of metadata (data that describes and gives information about other data) in particular. This blog post from Kivu explains the factors to consider when using web pages in forensics investigations.

The challenge stems from the complexity of source-to-endpoint content distribution. Originating content for a single website may be stored on one or more servers and then collectively called and transmitted to an endpoint such as a laptop or tablet. The content may also differ by endpoint: a mobile phone in Germany, for example, may receive different content from the same website than a phone in the United States. As content is served (e.g., sent to a tablet), it may be routed through different channels and re-packaged before reaching the final destination (e.g., an online magazine delivered as an iPhone application).

From a forensics perspective, this dynamic Internet technology increases the difficulty of identifying and preserving content that is presented to a user through a browser or mobile application. To comprehend the issues concerning forensics and Internet technology, we need to understand what web pages are and the differences between the two types of web pages: fixed content (static web pages) and web pages with changing content (dynamic web pages).

What is a Web Page?

A web page is a file that contains content (e.g., a blog article) and links to other files (e.g., an image file). The content within the web page is structured with Hypertext Markup Language (HTML), a markup language that was developed to standardize the display of content in an Internet browser. To illustrate HTML, consider a simple page: the web page’s title, “Web Page Example,” is identified by an HTML <title> tag, and the page content “Hello World” is bolded using a <b> tag.
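The page described above can be sketched in a few lines. The markup below is our reconstruction from that description, not a copy of any original figure, and the class name `TagTextCollector` is purely illustrative. It uses Python’s standard `html.parser` module to show how the <title> and <b> tags wrap the text they describe.

```python
from html.parser import HTMLParser

# A minimal page matching the description above (reconstructed for illustration).
PAGE = """<html>
<head><title>Web Page Example</title></head>
<body><b>Hello World</b></body>
</html>"""


class TagTextCollector(HTMLParser):
    """Collects the text found directly inside each HTML tag, keyed by tag name."""

    def __init__(self):
        super().__init__()
        self.open_tags = []     # stack of tags that are currently open
        self.text_by_tag = {}   # tag name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open tag.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.open_tags:
            self.text_by_tag.setdefault(self.open_tags[-1], []).append(text)


collector = TagTextCollector()
collector.feed(PAGE)
collector.close()

title = collector.text_by_tag["title"][0]
bold = collector.text_by_tag["b"][0]
print(title)
print(bold)
```

A browser does something similar on a much larger scale: it reads the tags to decide how each piece of content should be displayed.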

Web pages that are available on the Internet reside on a web server and are reached through a website address known as a Uniform Resource Locator, or URL. The web server distributes web pages to a user as the user navigates through a website. Most visitors reach a website by entering the domain in a URL bar or by typing keywords into a search engine.

Static versus Dynamic Web Pages

Web pages may be classified as static or dynamic. The difference between static and dynamic web pages stems from the level of interactivity within a web page.

A static web page is an HTML page that is “delivered exactly as it is stored,” meaning that the content stored within the HTML page on the source server is the same content that is delivered to an end-user. A static web page may:

• Contain image(s)
• Link to other web pages
• Have some user interactivity such as a form page used to request information
• Employ formatting files, known as Cascading Style Sheets (CSS)

A dynamic web page is an HTML page that is generated on demand as a user visits a web page. A dynamic page is derived from a combination of:

• Programmatic code file(s)
• Files that define formatting
• Static files such as image files
• Data source(s) such as a database

A dynamic web page has the behavior of a software application delivered in a web browser. Dynamic web page content can vary by numerous factors, including user, device, geographic location or account type (e.g., paid versus free). The underlying software code may exist on the client side (stored on a user’s device), the server side (stored on a remote server) or both. From a user’s perspective, a single dynamic web page is a hidden combination of complex software code, content, images and other files. Finally, the website delivering dynamic web page content can manage multiple concurrent user activities on the same device or manage multiple dynamically generated web pages during one user session on a single device. This behind-the-scenes management of user activity hides the underlying complexity of the numerous activities for a single user session.
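To make the idea concrete, here is a minimal sketch of server-side page generation. Everything in it is hypothetical: the template, the `OFFERS` lookup table (standing in for a database) and the `render_page` function are our own illustration, and real sites use far more elaborate code. The point is only that the same request can produce different HTML depending on who is asking and from where.

```python
from html import escape
from string import Template

# Hypothetical page template; the placeholders are filled in on each request.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><p>Hello, $user.</p><p>$offer</p></body></html>"
)

# Hypothetical data source standing in for a database table.
OFFERS = {
    ("US", "free"): "Upgrade to a paid account for more storage.",
    ("DE", "free"): "Jetzt auf ein bezahltes Konto upgraden.",
    ("US", "paid"): "Thank you for being a subscriber.",
}


def render_page(user, country, account_type):
    """Builds the HTML for one visitor from the template and the data source."""
    offer = OFFERS.get((country, account_type), "Welcome back.")
    return PAGE_TEMPLATE.substitute(
        title="News", user=escape(user), offer=escape(offer)
    )


us_page = render_page("alice", "US", "free")
de_page = render_page("alice", "DE", "free")

# Same user, same "page", different delivered content.
print(us_page != de_page)
```

This is why a capture taken on one device cannot be assumed to match what another user, or the same user in another country, actually saw.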

Web Pages Stored on a User Device as Forensics Evidence

To a forensic examiner, web page artifacts that are stored on a user device may have significant value as evidence in an investigation. Web page artifacts are one type of Internet browser artifact. Other Internet artifacts include: Internet browser history, downloaded files and cookie files. If the device of interest is a mobile device, evidence may also reside in database files such as SQLite files.

Forensic examiners review Internet artifacts to answer specific questions such as, “Was web-mail in use?” or “Is there evidence of file transfer?” Forensic analysis may be used to create a timeline of user activity, locate web-based email communications, identify an individual’s geographic location based on Internet use, or establish theft of corporate data using cloud-based storage such as Dropbox.
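As a rough illustration of timeline building, the sketch below queries a small SQLite table of page visits and orders it by time. The table name, column names and sample rows are assumptions made up for this example. Real browsers and mobile apps use their own schemas and timestamp encodings, so an examiner would need to adapt the query to the artifact at hand and work on a forensic copy rather than the live file.

```python
import sqlite3
from datetime import datetime, timezone

# In-memory database with a hypothetical "visits" table (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (url TEXT, title TEXT, visited_unix INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("https://mail.example.com/inbox", "Inbox", 1418300000),
        ("https://files.example.com/upload", "Upload", 1418296400),
        ("https://www.example.com/", "Home", 1418292800),
    ],
)


def build_timeline(connection):
    """Returns (UTC timestamp, url, title) tuples in chronological order."""
    rows = connection.execute(
        "SELECT visited_unix, url, title FROM visits ORDER BY visited_unix"
    ).fetchall()
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(), url, title)
        for ts, url, title in rows
    ]


timeline = build_timeline(conn)
for when, url, title in timeline:
    print(when, url, title)
```

Converting every timestamp to UTC before sorting keeps the timeline consistent when artifacts come from devices in different time zones.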

Web Content Stored on a Server as Forensics Evidence

Depending on the type of investigation (e.g., a computer hacking investigation), a forensic examiner may search for evidence on servers. Server-side content may be composed of stored files such as log files, software code, style sheets and data sources (e.g., databases).

Server-side content may directly or indirectly relate to web pages or files on a user device. If a user downloaded an Adobe PDF file, for example, the file on the server is likely to match the downloaded file on the user’s device. If the evidence on a user device is a dynamic web page, however, there may be a number of individual files that collectively relate as evidence, including: images, scripts, style sheets and log files.
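One common way to test whether a server file and a downloaded file match is to compare cryptographic hashes. The sketch below is an illustration under stated assumptions: the file names and contents are placeholders written to a temporary directory, and SHA-256 is our choice of hash rather than something the workflow above prescribes.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hashes a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Placeholder files standing in for the server copy and the device copy.
with tempfile.TemporaryDirectory() as workdir:
    server_copy = os.path.join(workdir, "server_report.pdf")
    device_copy = os.path.join(workdir, "device_report.pdf")
    for path in (server_copy, device_copy):
        with open(path, "wb") as handle:
            handle.write(b"%PDF-1.4 sample bytes for illustration")

    server_hash = sha256_of_file(server_copy)
    device_hash = sha256_of_file(device_copy)

files_match = server_hash == device_hash
print(files_match)
```

Matching hashes indicate identical content. A mismatch does not by itself show tampering, since the server file may simply have been updated after the download.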

The individual server-side files are component parts of a web page. A forensic examiner would analyze server-side files by investigating the relationship between the web page content on a user device and related server-side files. A forensic examiner may also review server logs for artifacts such as IP address and user account activity.
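Reviewing server logs for IP address and account activity can be sketched as follows. The sample lines are invented and follow the widely used Common Log Format. Actual servers may be configured with different layouts, so the pattern below is an assumption that would need adjusting to the log in front of you.

```python
import re
from collections import Counter

# Invented sample entries in Common Log Format (documentation IP ranges).
LOG_LINES = [
    '203.0.113.7 - alice [11/Dec/2014:10:13:20 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.7 - alice [11/Dec/2014:10:14:02 +0000] "GET /report.pdf HTTP/1.1" 200 88211',
    '198.51.100.23 - - [11/Dec/2014:10:15:45 +0000] "POST /login HTTP/1.1" 401 312',
]

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Returns the named fields of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


entries = [entry for entry in map(parse_line, LOG_LINES) if entry]
requests_per_ip = Counter(entry["ip"] for entry in entries)

for ip, count in requests_per_ip.most_common():
    print(ip, count)
```

From here an examiner could filter by account name, status code or requested path to tie server-side activity back to artifacts found on a user device.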

Factors to Consider in Web Page Forensics Investigations

1. Analyze the domain associated with web page content. Collect information on:

a. Owner of the domain – WHOIS database lookup.
b. Domain registry company – e.g., GoDaddy.
c. Location of domain – IP address and location of web server.

2. Conduct a search using a search engine such as Google, Yahoo or Bing. Review the first page of search results and then review an additional 2 to 10 pages.

a. Depending on the scope of the content, it may be worth filtering search results by date or other criteria.
b. It may be worth using specialty search tools that focus on blogs or social media.
c. Consider searching sites that track plagiarism.

3. Examine the impact of geo-location filtering. Many companies filter individuals by location in order to provide targeted content.

a. Searches may need to be carried out in different countries.
b. Consider using a proxy server account to facilitate international searches.

4. Use caution when examining server-side metadata. Website files are frequently updated, and the updates change file metadata. A limited number of file types, such as image files, may provide some degree of historical metadata.

5. There is a small possibility that archival sites, such as The Wayback Machine, may contain web page content. However, archival sites may be limited in the number of historical records, unless a paid archiving service is used.
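The domain analysis in step 1 starts with pulling the host name out of a URL of interest and finding where it resolves. The sketch below shows that first step using only Python’s standard library. The function names are our own, the URL is a placeholder, and the WHOIS and registrar lookups are not shown because they depend on the tool or service an examiner chooses.

```python
import socket
from urllib.parse import urlsplit


def hostname_from_url(url):
    """Extracts the host name portion of a URL (lower-cased by urlsplit)."""
    return urlsplit(url).hostname


def resolve_addresses(hostname):
    """Returns the sorted, unique IP addresses for a host. Needs network access."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


host = hostname_from_url("https://www.example.com/articles/post.html?id=7")
print(host)

# Resolution requires network access, so the call is left commented out here:
# print(resolve_addresses(host))
```

Because geo-location filtering (step 3) can also affect name resolution, the addresses returned may differ depending on where the lookup is run, so it is worth recording the location and time of each lookup.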

Kivu is a licensed California private investigations firm, which combines technical and legal expertise to deliver investigative, discovery and forensic solutions worldwide. Author, Megan Bell, directs data analysis projects and manages business development initiatives at Kivu. For more information about using web pages in forensics investigations, please contact Kivu.