Digital Forensics with Open Source Tools
By Cory Altheide and Harlan Carvey
SYNGRESS Copyright © 2011 Elsevier, Inc.
All rights reserved.
Digital Forensics with Open Source Tools
INFORMATION IN THIS CHAPTER
Welcome to "Digital Forensics with Open Source Tools"
What Is "Digital Forensics?"
What Is "Open Source?"
Benefits of Open Source Tools
WELCOME TO "DIGITAL FORENSICS WITH OPEN SOURCE TOOLS"
In digital forensics, we rely upon our expertise as examiners to interpret data and information retrieved by our tools. To provide findings, we must be able to trust our tools. When we use closed source tools exclusively, we will always have a veil of abstraction between our minds and the truth that is impossible to eliminate.
We wrote this book to fill several needs. First, we wanted to provide a work that demonstrated the full capabilities of open source forensics tools. Many examiners who know of and use open source tools do not realize that a complete investigation can be performed using open source tools alone. Second, we wanted to shine a light on the persistence and availability (and subsequent examination) of a wide variety of digital artifacts. It is our sincere hope that the reader learns to understand the wealth of information that is available for use in a forensic examination.
To continue further, we must define what we mean by "Digital Forensics" and what we mean by "Open Source."
WHAT IS "DIGITAL FORENSICS?"
At the first Digital Forensic Research Workshop (DFRWS) in 2001, digital forensics was defined as:
The use of scientifically derived and proven methods toward the preservation, collection, validation, identification, analysis, interpretation, documentation and presentation of digital evidence derived from digital sources for the purpose of facilitating or furthering the reconstruction of events found to be criminal, or helping to anticipate unauthorized actions shown to be disruptive to planned operations.
While digital forensics techniques are used in more contexts than just criminal investigations, the principles and procedures are more or less the same no matter the investigation. While the investigation type may vary widely, the sources of evidence generally do not. Digital forensic examinations use computer-generated data as their source. Historically this has been limited to magnetic and optical storage media, but increasingly snapshots of memory from running systems are the subjects of examination.
Digital forensics is alternately (and simultaneously!) described as an art and a science. In Forensic Discovery, Wietse Venema and Dan Farmer make the argument that at times the examiner acts as a digital archaeologist and, at other times, a digital geologist.
Digital archaeology is about the direct effects from user activity, such as file contents, file access time stamps, information from deleted files, and network flow logs.... Digital geology is about autonomous processes that users have no direct control over, such as the allocation and recycling of disk blocks, file ID numbers, memory pages or process ID numbers.
This mental model of digital forensics may be more apropos than the "digital ballistics" metaphor that has been used historically. No one ever faults an archaeologist for working on the original copy of a 4000-year-old pyramid, for example. Like archaeology and anthropology, digital forensics combines elements from "hard" or natural science with elements from "soft" or social science.
Many have made the suggestion that the dichotomy of the art and science of forensic analysis is not a paradox at all, but simply an apparent inconsistency arising from the conflation of the two aspects of the practice: the science of forensics combined with the art of investigation. Applying scientific method and deductive reasoning to data is the science—interpreting these data to reconstruct an event is the art.
On his Web site, Brian Carrier makes the argument that referring to the practice as "digital forensics" may be partially to blame for some of this confusion. While traditional crime scene forensic analysts are tasked with answering very discrete questions about subsets of evidence posed to them by detectives, digital forensic examiners often wear both hats. Carrier prefers the term "digital forensic investigation" to make this distinction clear.
Goals of Forensic Analysis
The goal of any given forensic examination is to find facts, and via these facts to recreate the truth of an event. The examiner reveals the truth of an event by discovering and exposing the remnants of the event that have been left on the system. In keeping with the digital archaeologist metaphor, these remnants are known as artifacts. They are sometimes also referred to as evidence. As the authors deal frequently with lawyers in writing, we prefer to avoid overusing the term evidence because of its loaded legal connotations. Evidence is something to be used during a legal proceeding, and using this term loosely may get an examiner into trouble. Artifacts are traces left behind by activities and events, which may or may not be innocuous.
As stated by Locard's exchange principle, "with contact between two items, there will be an exchange." This simple statement is the fundamental principle at the core of evidence dynamics and indeed all of digital forensics. Specific to digital forensics, this means that an action taken by an actor on a computer system will leave traces of that activity on the system. Very simple actions may only cause registers to change in the processor. More complex actions have a greater likelihood of creating longer-lasting impressions on the system, but even simple, discrete tasks can create artifacts. To use a real-world crime scene investigation analogy, kicking open a door and picking a lock will both leave artifacts of their actions (a splintered door frame and microscopic abrasions on the tumblers, respectively). Even the act of cleaning up artifacts can leave additional artifacts: the digital equivalent of the smell of bleach at a physical crime scene that has been "washed."
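To make the idea of an action leaving traces concrete, the following minimal Python sketch performs one small action on a file and compares file system metadata before and after. It is only an illustration: the file name, the function names, and the choice of size and modification time as the "artifacts" are assumptions made for this example, and real systems record many more traces than these two values.

```python
import os
import tempfile


def snapshot(path):
    """Return two pieces of file metadata that an action can alter."""
    st = os.stat(path)
    return {"size": st.st_size, "mtime_ns": st.st_mtime_ns}


def demonstrate_trace():
    """Perform one small action on a file; return metadata before and after."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "notes.txt")  # hypothetical file name
        with open(path, "w") as handle:
            handle.write("original contents\n")

        # Backdate the time stamps by one day so the later change is
        # unambiguous regardless of the file system's time stamp resolution.
        one_day_ago = os.stat(path).st_mtime - 86400
        os.utime(path, (one_day_ago, one_day_ago))
        before = snapshot(path)

        # The "action taken by an actor": appending a single line.
        with open(path, "a") as handle:
            handle.write("one more line\n")
        after = snapshot(path)

    return before, after


if __name__ == "__main__":
    before, after = demonstrate_trace()
    print("before:", before)
    print("after: ", after)
```

Even this trivial append changes both the recorded size and the modification time, which is the kind of remnant an examiner may later recover and interpret.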
It is important to reiterate the job of the examiner: to determine truth. Every examination should begin with a hypothesis. Examples include "this computer was hacked into," "my spouse has been having an affair," or "this computer was used to steal the garbage file." The examiner's task is not to prove these assertions. The examiner's task is to uncover artifacts that indicate the hypothesis to be either valid or not valid. In the legal realm, these would be referred to as inculpatory and exculpatory evidence, respectively.
An additional hitch is introduced due to the ease with which items in the digital realm can be manipulated (or fabricated entirely). In many investigations, the examiner must determine whether or not the digital evidence is consistent with the processes and systems that were purported to have generated it. In some cases, determining the consistency of the digital evidence is the sole purpose of an examination.
The Digital Forensics Process
The process of digital forensics can be broken down into three categories of activity: acquisition, analysis, and presentation.
Acquisition refers to the collection of digital media to be examined. Depending on the type of examination, these can be physical hard drives, optical media, storage cards from digital cameras, mobile phones, chips from embedded devices, or even single document files. In any case, media to be examined should be treated delicately. At a minimum the acquisition process should consist of creating a duplicate of the original media (the working copy) as well as maintaining good records of all actions taken with any original media.
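The two minimum requirements above, a working copy and a record of actions, can be sketched in a few lines of Python. This is a hedged illustration rather than a recommended acquisition procedure: it assumes the original is an ordinary file (for example, an image that has already been collected), it adds a SHA-256 comparison to confirm the copy matches (a verification step that is our assumption, not something stated in the text), and the function names and log format are hypothetical.

```python
import hashlib
import shutil
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(source_path, copy_path, log_path):
    """Duplicate source_path to copy_path, logging each action taken."""

    def record(message):
        # Append a UTC time-stamped line to the action log.
        stamp = datetime.now(timezone.utc).isoformat()
        with open(log_path, "a") as log:
            log.write(f"{stamp} {message}\n")

    source_hash = sha256_of(source_path)
    record(f"hashed original {source_path} sha256={source_hash}")

    shutil.copyfile(source_path, copy_path)
    record(f"copied {source_path} to {copy_path}")

    copy_hash = sha256_of(copy_path)
    record(f"hashed working copy {copy_path} sha256={copy_hash}")

    if copy_hash != source_hash:
        record("hash mismatch between original and working copy")
        raise ValueError("working copy does not match the original")
    return copy_hash
```

A call such as make_working_copy("original.img", "working.img", "actions.log") would leave three time-stamped lines in the log and return the digest shared by both files. The sketch reads the original only through ordinary file reads and does not address how physical devices should be accessed.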
Analysis refers to the actual media examination: the "identification, analysis, and interpretation" items from the DFRWS 2001 definition. Identification consists of locating the items present in the media in question and then further reducing this set to items or artifacts of interest. These items are then subjected to the appropriate analysis. This can be file system analysis, file content examination, log analysis, statistical analysis, or any number of other types of review. Finally, the examiner interprets results of this analysis based on the examiner's training, expertise, experimentation, and experience.
Presentation refers to the process by which the examiner shares results of the analysis phase with the interested party or parties. This consists of generating a report of actions taken by the examiner, artifacts uncovered, and the meaning of those artifacts. The presentation phase can also include the examiner defending these findings under challenge.
Note that findings from the analysis phase can drive additional acquisitions, each of which will generate additional analyses, etc. This feedback loop can continue for numerous cycles given an extensive network compromise or a long-running criminal investigation.
This book deals almost exclusively with the analysis phase of the process, although basic acquisition of digital media is discussed.
WHAT IS "OPEN SOURCE?"
Generically, "open source" means just that: the source code is open and available for review. However, just because you can view the source code doesn't mean you have license to do anything else with it. The Open Source Initiative has created a formal definition that lays out the requirements for a software license to be truly open source. In a nutshell, to be considered open source, a piece of software must be freely redistributable, must provide access to the source code, must allow the end user to modify the source code at will, and must not restrict the end use of the software. For more detail, see the full definition at the Open Source Initiative's site.
"Free" vs. "Open"
Due to the overloading of the word "free" in the English language, confusion about what "free" software is can arise. Software available free of charge (gratis) is not necessarily free from restriction (libre). In the open source community, "free software" generally means software considered "open source" and without restriction, in addition to usually being available at no cost. This is in contrast to the various "freeware" applications generally found on Windows systems, which are available at no cost but solely in binary, executable format.
The core material of this book is focused on the use of open source software to perform digital forensic examinations. "Freeware" closed source applications that perform a function not met by any available open source tools or that are otherwise highly useful are discussed in the Appendix.
Open Source Licenses
At the time of this writing, there are 58 licenses recognized as "Open Source" by the Open Source Initiative. Since this is a book about the use of open source software and not a book about the intricacies of software licensing, we briefly discuss only the most commonly used open source licenses. The two most commonly used licenses are the GNU General Public License (GPL) and the Berkeley Software Distribution License (BSD). To grossly simplify, the core difference between these two licenses is that the GPL requires that any modifications made to GPL code that is then incorporated into distributed compiled software be made available in source form as well. The BSD license does not have this requirement, instead only asking for acknowledgment that the distributed software contains code from a BSD-licensed project.
This means that a widget vendor using GPL-licensed code in their widget controller code must provide customers that purchase their widgets the source code upon request. If the widget was driven using BSD-licensed software, this would not be necessary. In other words, the GPL favors the rights of the original producer of the code, while the BSD license favors the rights of the user or consumer of the code. Because of its source-sharing requirement, the GPL is known as a copyleft license (a play on "copyright"). The BSD license is what is known as a permissive license. Most permissive licenses are considered GPL compatible because they give the end user authority over what he or she does with the code, including using it in derivative works that are GPL licensed. Additional popular GPL-compatible licenses include the Apache License (used by Apache Software Foundation projects) and the X11/MIT License.
BENEFITS OF OPEN SOURCE TOOLS
There are a great many passionate screeds about the benefits of open source software, the ethics of software licensing, and the evils of proprietary software. We will not repeat them here, but we will outline a few of the most compelling reasons to use open source tools that are specific to digital forensics.
Excerpted from Digital Forensics with Open Source Tools by Cory Altheide and Harlan Carvey. Copyright © 2011 by Elsevier, Inc. Excerpted by permission of SYNGRESS. All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.