Dual Purpose Volatile Data Collection Script

When responding to a potential security incident, you need a way to quickly triage the system to see what's going on. Is a rogue process running on the system? Who's currently logged on? What other systems are trying to connect over the network? How do I document the actions I took on the system? These are valid questions during incident response, whether the response is for an actual event or a simulation. One area to examine for answers is the system's volatile data. Automating the collection of volatile data saves valuable time, which in turn lets analysts start examining the data sooner. This post briefly describes (and releases) the Tr3Secure volatile data collection script I wrote.

Tr3Secure needed a toolset for responding to systems during attack simulations, and one of the tools had to quickly collect volatile data on a system (I previously discussed what Tr3Secure is here). However, the volatile data collection tool had to serve two functions. First and foremost, it had to properly preserve and acquire data from live systems. The toolset is initially being used in a training environment, but the tools and processes we are learning need to translate to actual security incidents. What good is mastering a collection tool that can’t be used during live incident response activities? The second required function was that the tool had to help train people to examine volatile data. Tr3Secure members come from different information security backgrounds, so not every member will be knowledgeable about volatile data. Collecting data is one thing, but people will eventually need to understand what the data means. The DFIR community has a few volatile data collection scripts, but none of the scripts I found provided this dual functionality for both practical and training usage. So I went ahead and wrote a script to meet our needs.

Practical Usage

These were some of the considerations taken into account to ensure the script can meet the needs of volatile data collection during actual incident response activities.


Different responses will have different requirements for where to store the volatile data that’s collected. At times the data may be stored on the same drive where the DFIR toolset is located, while at other times it may be stored on a different drive. I took this into consideration, and the volatile data collection script allows the output data to be stored on a drive of choice. If one person prefers to run their tools from a CD-ROM while another works with a large USB removable drive, the script can be used by both of them.

        Organize Output
Troy Larson posted a few lines of code from his collection script to the Win4n6 group some time ago. One thing I noticed about his script was that he organized the output data based on a case number. I incorporated his idea into my script; a case number needs to be entered when the script is run on a system. A case folder enables data collected from numerous systems to be stored in the same folder (the folder is named Data-Case#). In addition to organizing data into a case folder, the actual volatile data is stored in a sub-folder named after the system the data came from (the system's computer name is used to name the folder). To prevent data from being overwritten when the script is run multiple times on the same system, I incorporated a timestamp into the folder name (two digit month, day, year, hour, and minute). Appending a timestamp to the folder name means the script can execute against the same system numerous times and each run's volatile data is stored in its own folder. Lastly, the data collected from the system is stored in separate sub-folders for easier access. The screenshot below shows the data collected for Case Number 100 from the system OWNING-U on 01/01/2012 at 15:46.
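To make the folder layout concrete, here is a minimal Python sketch of how such an output path could be assembled. It is not taken from the script: the function name `output_folder`, the hyphen separators, and the exact timestamp format are assumptions for illustration. Only the Data-Case# folder name, the computer name sub-folder, and the timestamp fields come from the description above.

```python
from datetime import datetime
from pathlib import PureWindowsPath


def output_folder(data_drive, case_number, computer_name, when):
    """Build <drive>\\Data-Case<number>\\<computer>-<timestamp>.

    The separator characters and the timestamp layout are assumptions;
    the post only states which fields appear (two digit month, day,
    year, hour, and minute).
    """
    stamp = when.strftime("%m-%d-%y-%H-%M")
    case_folder = PureWindowsPath(data_drive + "\\") / ("Data-Case" + case_number)
    return case_folder / (computer_name + "-" + stamp)


# Case Number 100, system OWNING-U, collected 01/01/2012 at 15:46
path = output_folder("E:", "100", "OWNING-U", datetime(2012, 1, 1, 15, 46))
print(path)
```

Because the timestamp is part of the folder name, two runs against the same system at different minutes end up in different folders, which is the property that prevents overwriting.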


Automating data collection means that documentation can be automated as well. The script documents everything in a collection log. Each case has one collection log, so regardless of whether data is collected from one system or ten, an analyst only has to review one log.

The following information is both displayed on screen for the analyst and written to the collection log file: case number, examiner name, target system, user account used to collect data, drives for tools and data storage, time skew, and program execution. The script prompts the analyst for the case number, their name, and the drive to store data on. This information is automatically stored in the collection log, so the analyst doesn’t have to worry about maintaining documentation elsewhere. In addition, the script prompts the analyst for the current date and time, which is used to record the difference between the system's clock and the actual time. Every program executed by the script is recorded in the collection log along with a timestamp of when the program executed. This makes it easier to account for artifacts left on a system if the system is examined after the script is executed. The screenshot below shows part of the collection log for the data collected from the system OWNING-U.
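The two pieces of that log that involve a little logic are the time skew and the per-program entries. The Python sketch below illustrates both under stated assumptions: the function names, the log line format, and the sign convention for the skew (system time minus actual time) are mine, not the script's.

```python
from datetime import datetime, timedelta


def time_skew(system_time, actual_time):
    """Return how far the system clock is ahead of the actual time.

    A negative result means the system clock is behind. The sign
    convention is an assumption made for this sketch.
    """
    return system_time - actual_time


def record_execution(log, command, when):
    """Append one timestamped 'program executed' line to the log list."""
    log.append(f"{when:%m/%d/%Y %H:%M:%S}  Executed: {command}")


log = []
skew = time_skew(datetime(2012, 1, 1, 15, 46), datetime(2012, 1, 1, 15, 44))
log.append(f"Time skew (system - actual): {skew}")
record_execution(log, "tasklist /v", datetime(2012, 1, 1, 15, 46, 5))
print("\n".join(log))
```

Recording the skew once up front lets a later examiner convert every timestamp in the log, and every artifact timestamp on the system, back to actual time.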


RFC 3227’s Order of Volatility outlines that evidence should be collected starting with the most volatile and then proceeding to the less volatile. The script takes the order of volatility into account during data collection. When all data is selected for collection, memory is imaged first, then volatile data is collected, followed by non-volatile data. The volatile data collected is: process information, network information, logged on users, open files, clipboard, and then system information. The non-volatile data collected is: installed software, security settings, configured users/groups, the system's devices, auto-run locations, and applied group policies.

Another item the script incorporated from Troy Larson’s comment in the Win4n6 group is preserving the prefetch files before volatile data is collected. I never thought about this before I read his comment, but it makes sense. Volatile data gets collected by executing numerous programs on a system, and these actions can overwrite the existing prefetch files with new information or files. Preserving the prefetch files up front ensures analysts will have access to most of the prefetch files that were on the system before the collection occurred (four prefetch files may be overwritten before the script preserves them). The script uses robocopy to copy the prefetch files so that the file system metadata (timestamps, NTFS permissions, and file ownership) is collected along with the files themselves. The screenshot below shows the preserved files for system OWNING-U.
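For readers who have not used robocopy this way, the Python sketch below builds one possible command line for the prefetch copy. The post does not list the switches the script actually passes, so the ones here are an assumption: `/COPY:DATSO` asks robocopy to copy data, attributes, timestamps, security (NTFS ACLs), and owner information, and `/R:1 /W:1` limits retries on locked files. The function name and the example paths are hypothetical.

```python
def prefetch_copy_command(windows_dir, dest_dir):
    """Return a robocopy argument list that copies the Prefetch folder.

    The switches are an assumption for illustration, not the script's
    actual command line:
      /COPY:DATSO  data, attributes, timestamps, security, owner
      /R:1 /W:1    one retry, one second wait between retries
    """
    source = windows_dir + "\\Prefetch"
    return ["robocopy", source, dest_dir, "/COPY:DATSO", "/R:1", "/W:1"]


# Hypothetical destination inside the case folder
cmd = prefetch_copy_command("C:\\Windows", "E:\\Data-Case100\\OWNING-U\\Prefetch")
print(" ".join(cmd))
```

Whatever the exact switches, the point is that this copy has to be the first program the collection runs, since every later program risks creating or updating a prefetch file.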

        Tools Executed

The readme file accompanying the script outlines the various programs used to collect data. The programs include built-in Windows commands and third party utilities. The screenshot below shows the tools folder where the third party utilities are stored.

I’m not going to discuss every program, but I wanted to highlight at least a few. The Windows diskpart command allows disks, partitions, and volumes to be managed from the command line. The script leverages diskpart to make it easy for an analyst to see what drives and volumes are attached to a system. Hopefully, the analyst won’t need to open Windows Explorer to see what the removable media drive mappings are, since the script displays the information automatically as shown below. Note: to make diskpart work, a text file named diskpart_commands.txt needs to be created in the tools folder, and the file needs to contain these two commands on separate lines: list disk and list volume.
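The diskpart_commands.txt file can be created by hand in a text editor. If you would rather generate it, the Python sketch below writes the two commands and builds the corresponding diskpart invocation using diskpart's `/s` switch, which runs commands from a script file. The function names are hypothetical, and how the collection script itself calls diskpart is an assumption.

```python
from pathlib import Path


def write_diskpart_commands(tools_dir):
    """Create diskpart_commands.txt with the two required commands."""
    script = Path(tools_dir) / "diskpart_commands.txt"
    script.write_text("list disk\nlist volume\n")
    return script


def diskpart_command(script):
    """Return the argument list that runs diskpart against the file."""
    return ["diskpart", "/s", str(script)]


if __name__ == "__main__":
    # Writes the file into the current directory; point this at the
    # tools folder of your own toolset instead.
    created = write_diskpart_commands(".")
    print(" ".join(diskpart_command(created)))
```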

Mandiant’s Memoryze is used to obtain a forensic image of the system’s memory. Memoryze supports a wide range of Windows operating systems, which makes the script more versatile for dumping RAM. The key reason the script uses Memoryze is that it’s the only free memory imaging program I found that allows an image to be stored in a folder of your choice. Most programs place the memory image in the folder where the command line is opened. That wouldn’t work because the image would be dropped in the folder where the script is located instead of on the drive the analyst wants. Memoryze uses an XML configuration file to image RAM, so I borrowed a few lines of code from the MemoryDD.bat batch file to create the XML file for the script. Note: the script only needs memoryze.exe; to obtain it, install Memoryze on a computer and then copy memoryze.exe to the Tools folder.

PXServer’s Winaudit program obtains the configuration information from a system and I first became acquainted with the program during my time performing vulnerability assessments. The script uses Winaudit to collect some non-volatile data including the installed software, configured users/groups, and computer devices. Winaudit is capable of collecting a lot more information so it wouldn’t be that hard to incorporate the additional information by modifying the script.

Training Usage

These were the two items put into the script to assist with training members on performing incident response system triage.

        Ordered Output Reports

The script collects a wealth of information about a system, and this may be overwhelming to analysts new to examining volatile data. For example, the script produces six different reports about the processes running on a system. A common question when faced with so many reports is how they should be reviewed. The script’s output reports are numbered in the suggested order for review. This provides a little assistance to analysts until they develop their own process for examining the data. The screenshots below show the process reports in the output folder and those reports opened in Notepad++.

        Understanding Tool Functionality and Volatile Data

The script needs to help people better understand what the collected data says about the system it came from. Two great references for collecting, examining, and understanding volatile data are Windows Forensic Analysis, 2nd edition and Malware Forensics: Investigating and Analyzing Malicious Code. I used both books when researching and selecting the script’s tools to collect volatile data. What better way to help someone understand the tools or data than by directing them to references that explain them? I placed comments in the script containing the page numbers in both books where a specific tool is discussed and its data explained. The screenshot below shows the portion of the script that collects process information, with the references highlighted in red.

Releasing the Tr3Secure Volatile Data Collection Script

There are very few things I do forensically that I think are cool; this script happens to be one of them. There are not many tools or scripts that work as intended while at the same time providing training. People who have more knowledge about volatile data can hit the ground running and use the script to investigate systems. The script automates imaging memory, collecting volatile/non-volatile data, and documenting every action taken on the system. People with less knowledge can leverage the tool to learn how to investigate systems. The script collects the data, and then the ordered output and the references in the comments can be used to interpret it. Talk about killing two birds with one stone.

The following is the location of the zip file containing the script and the readme file <zip download link is here>. Please be advised that a few of the programs the script uses require administrative rights to run properly.

Reposted from http://journeyintoir.blogspot.com/2012/01/dual-purpose-volatile-data-collection.html
