In that tweet, I included the following table:
The table outlines the last modification dates on the emails (batched by date) and shows the earliest and latest timestamps, minimum ID, maximum ID, count and a column titled “FAT.”
What the table illustrates is that the first batches of DNC emails published by WikiLeaks have times that are FAT-like, suggesting the files could have been transferred to a FAT file system (such as is used on USB storage devices).
Having received several queries concerning this, I wanted to give a more detailed explanation and, as further observations have been made, to report on these and make some clarifications.
FAT File System Indicators
The “FAT” column refers to the FAT file system, a file system that, in recent years, is mostly found on USB storage devices (some outdated non-USB disk storage devices used it in the past too, but it’s very rare to find such devices still in use).
One of the shortcomings of the FAT file system is that it stores last-modified timestamps at a lower resolution (to the nearest two seconds). However, this is advantageous for digital forensics: it leaves a pattern (every timestamp has an even number of seconds) that can be detected and used to determine whether files were likely to have been transferred via a FAT file system.
The batches of DNC emails that appear to have been copied to a FAT file system due to this pattern have an “x” in the “FAT” column (in the table referenced at the beginning of this article).
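As a rough illustration of how a table like this could be derived, the sketch below groups (ID, last-modified) pairs by date and flags a batch as FAT-like when every timestamp in it has an even number of seconds. The function name, field names and sample records are hypothetical and made up for illustration; this is not the tooling actually used to build the table.

```python
from collections import defaultdict
from datetime import datetime, timezone


def summarize_batches(records):
    """Group (email_id, last_modified) pairs by date and flag FAT-like batches.

    A batch is flagged when every timestamp in it has an even number of
    seconds, which is what a 2-second timestamp resolution would leave behind.
    """
    batches = defaultdict(list)
    for email_id, modified in records:
        batches[modified.date()].append((email_id, modified))

    summary = []
    for day in sorted(batches):
        ids = [email_id for email_id, _ in batches[day]]
        times = [modified for _, modified in batches[day]]
        summary.append({
            "date": day,
            "earliest": min(times),
            "latest": max(times),
            "min_id": min(ids),
            "max_id": max(ids),
            "count": len(times),
            "fat_like": all(t.second % 2 == 0 for t in times),
        })
    return summary


# Made-up sample records (IDs and times are NOT taken from the real data set).
sample = [
    (101, datetime(2016, 5, 23, 7, 10, 2, tzinfo=timezone.utc)),
    (102, datetime(2016, 5, 23, 7, 10, 4, tzinfo=timezone.utc)),
    (103, datetime(2016, 5, 23, 7, 10, 8, tzinfo=timezone.utc)),
    (201, datetime(2016, 9, 21, 15, 0, 3, tzinfo=timezone.utc)),
    (202, datetime(2016, 9, 21, 15, 0, 6, tzinfo=timezone.utc)),
]

for row in summarize_batches(sample):
    print(row["date"], row["count"], "x" if row["fat_like"] else "")
```

If seconds values were otherwise evenly distributed and independent, the chance of a batch of N files having only even seconds by coincidence would be about 1 in 2^N, which is why the pattern is meaningful for batches containing thousands of files.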
The First Two Batches
Drawing upon a 30-day email retention policy and the sent dates of emails, research in the public domain has suggested for some time that the DNC emails were likely acquired between May 19 and May 25, 2016 [@steemwh1sks].
Looking at the sent dates of emails and the last modified dates of the email files in the first two batches (those with last modification dates in May, two months prior to initial publication) it is possible to determine that:
- Emails appear to have been copied on May 23, 2016 and May 25, 2016.
- Emails were stored on a device using the FAT file system (very likely to be a USB storage device) at some point in time between acquisition and being published by WikiLeaks.
We can’t, however, make any declaration on exactly when the files were moved to a USB device as different types of copy operations could produce the same result even if the files were transferred to USB weeks after acquisition (as it’s possible to retain the last-modified dates in various circumstances).
Interestingly, the FAT file system indication is in line with claims made by Craig Murray, published in December 2016, that WikiLeaks had obtained the DNC leaks through a physical hand-over of the emails.
This particular characteristic was also reported on recently (February 13, 2019) in an article authored by William Binney and Larry Johnson titled “Why The DNC Was Not Hacked By The Russians”. In the article they state:
This data alone does not prove that the emails were copied at the DNC headquarters. But it does show that the data/emails posted by Wikileaks did go through a storage device, like a thumbdrive, before Wikileaks posted the emails on the World Wide Web.
This fact alone is enough to raise reasonable doubts about Mueller’s indictment accusing 12 Russian soldiers as the culprits for the leak of the DNC emails to Wikileaks. A savvy defense attorney will argue, and rightly so, that someone copied the DNC files to a storage device (Eg., USB thumb drive) and transferred that to Wikileaks.
(The article also covers conflicts between intelligence community assessments and Mueller’s July 2018 indictment.)
Looking at the transfer speeds on these batches also gives us reason to doubt that this was a local machine or local network transfer straight to a USB device as the transfers appear to have been at a rate of ~3 megabits/second.
This suggests the files published by WikiLeaks may initially have been transferred remotely.
Some will argue that this supports assertions regarding the DNC being hacked. However, the rates observed alone could just as easily be argued to support statements made by Seymour Hersh, reported on in July/August 2017, which suggest that WikiLeaks obtained access to a password-protected Dropbox where the files [DNC and Podesta emails] had been placed.
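To make the rate estimate concrete, here is a minimal sketch of one plausible way to derive an average transfer rate from file sizes and last-modified times. It assumes each timestamp marks the moment a file finished being written and that files were copied one after another; the method actually used for the ~3 megabits/second figure is not described here, and the function name and sample values are hypothetical.

```python
from datetime import datetime, timedelta


def estimate_rate_mbps(files):
    """Estimate an average transfer rate in megabits per second.

    `files` is a list of (last_modified, size_in_bytes) tuples. The first
    file is excluded from the byte count because it was written before the
    measured time window starts.
    """
    ordered = sorted(files)
    elapsed = (ordered[-1][0] - ordered[0][0]).total_seconds()
    if elapsed <= 0:
        raise ValueError("need at least two files with different timestamps")
    transferred_bytes = sum(size for _, size in ordered[1:])
    return transferred_bytes * 8 / elapsed / 1_000_000


# Made-up sample: 1.5 MB written over a 4-second window.
start = datetime(2016, 5, 23, 7, 10, 2)
sample_files = [
    (start, 100_000),
    (start + timedelta(seconds=2), 750_000),
    (start + timedelta(seconds=4), 750_000),
]

print(f"{estimate_rate_mbps(sample_files):.1f} megabits/second")
```

An estimate like this is only an average over the window; pauses between files or timestamps preserved from an earlier copy would distort it.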
As well as the batches of emails with last-modified dates before the initial publication of the DNC leaks on July 22, 2016, there were two further batches of DNC emails that were made available on the WikiLeaks site at later dates and that had last-modified timestamps in August and September 2016.
The third batch, with last-modified dates of August 26, 2016, also appears to have been transferred via a USB storage device between acquisition and publication.
The fourth batch, with last-modified dates of September 21, 2016, did not have the same 2-second rounding artifact.
While the new tranches included additional DNC staffers, WikiLeaks did not update their web page to reflect that additions were made. However, publication of the batch with the last modified date of September 21, 2016 was announced via the WikiLeaks Twitter account on November 6, 2016 (or November 7 on my side of the Atlantic):
— WikiLeaks (@wikileaks) November 7, 2016
The DNC emails page on WikiLeaks was updated a little over two weeks later (some time between November 22 and 25, 2016) with the new total (44,053 emails).
- Some emails had internal send times that were later than the last-modified timestamps, by as much as 7 hours in some cases. (All times are normalized to GMT.)
- The IDs assigned to the different batches of files aren’t in a consistent sequence and it seems possible that the files were renamed after acquisition. (The May 23 batch, however, did use a subset of IDs used by the May 25 batch.)
- Total counts of emails associated with separate mailboxes that were published by WikiLeaks are interesting too. When looking solely at the emails, there are many sent to mailing list groups and, in these instances, it’s extremely difficult (maybe impossible!?) to determine whose mailbox the email came from (see email 15384 for an example of this). There is also some disparity between the totals WikiLeaks cites and the number of emails that can be identified as belonging to a specific mailbox (with the latter being lower). These factors combined suggest that WikiLeaks were either told the totals for each mailbox or were provided the emails segregated by mailbox.
- There are approximately a thousand older emails (with dates prior to April 1, 2016) that account for a little over 2% of the emails released. While there could be various explanations for this, it appears (based on what is disclosed in one of the leaked emails) that the email retention rules didn’t apply to emails if they were moved into other folders. This at least gives a good explanation for what would otherwise seem an anomalous presence of old emails.
- While there are 44,053 email files, WikiLeaks only indexes 27,515 (as can be seen when doing a blank search in their database):
This disparity appears to be due to almost 40% of the emails being duplicates, with the duplicates not being indexed.
- Based on an analysis of the Sent dates, batches 3 and 4 have a last sent date of May 23. Therefore, it is likely that batches 3 and 4 were also acquired on May 23.
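The proportions quoted in the notes above can be checked with a few lines of arithmetic. The totals are the ones given in the text; the figure of 1,000 older emails is a rounded stand-in for “approximately a thousand”, not an exact count.

```python
total_files = 44_053   # email files published
indexed = 27_515       # emails returned by a blank search
older_emails = 1_000   # assumption: rounded from "approximately a thousand"

unindexed = total_files - indexed
unindexed_share = unindexed / total_files
older_share = older_emails / total_files

print(f"Not indexed: {unindexed} ({unindexed_share:.1%})")
print(f"Older emails: {older_share:.2%} of all files")
```

The share of files that are not indexed works out to a little under 38%, consistent with the “almost 40%” description, and a thousand older emails is a little over 2% of the total.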
Data & Verification
Raw data for last modification timestamps is available here.
Raw data for the above with send dates included is available here.
(The latter has approximately 100 fewer entries than the former, as some emails lacked headers from which a sent date could be determined.)
For those that don’t want to (or don’t have means to) scrape all of the data but wish to do a few manual spot-checks on the data linked to above, you can use your web browser to validate individual dates.
To do this, visit the leak you want to check (on the WikiLeaks site), click on the “View source” tab and make sure your browser’s developer console is open, then click on the “Download raw source” link. Your browser should send a GET request for the file (which will be for a URL that starts with “https://wikileaks.org/dnc-emails//get/” and is followed by the email ID).
If you expand the details and check the headers, you will find the “Last-Modified” date there and that is where the last modified timestamps are coming from.
The example below uses Firefox:
This obviously isn’t practical for fully validating all of the data due to the volume of emails and is only referenced here as a simple way to do a spot-check that is accessible to most people and helps to illustrate where these last modified dates are being sourced from.
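For readers comfortable with a little scripting, the same spot-check can be sketched in Python using only the standard library. The URL prefix is the one quoted above; the helper names are hypothetical, the sample header value in the usage note is made up, and whether the server still responds the same way is an assumption.

```python
from email.utils import parsedate_to_datetime
from urllib.request import urlopen

# URL prefix as quoted in the text; the email ID is appended to it.
BASE_URL = "https://wikileaks.org/dnc-emails//get/"


def raw_source_url(email_id):
    """Build the raw-source URL for a given email ID."""
    return f"{BASE_URL}{email_id}"


def parse_last_modified(header_value):
    """Parse an HTTP date such as 'Mon, 23 May 2016 07:10:02 GMT'."""
    return parsedate_to_datetime(header_value)


def fetch_last_modified(email_id, timeout=30):
    """Send a GET request and return the Last-Modified time, if present."""
    with urlopen(raw_source_url(email_id), timeout=timeout) as response:
        value = response.headers.get("Last-Modified")
    return parse_last_modified(value) if value else None


# Example usage (performs a network request, so it is left commented out):
# print(fetch_last_modified(15384))
```

Parsing a made-up header value such as `parse_last_modified("Mon, 23 May 2016 07:10:02 GMT")` returns a timezone-aware datetime, which can then be compared against the raw data or checked for an even seconds value.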
While the evidence suggests that the first three batches of DNC emails could have been transferred via a USB storage device at some stage (due to the FAT-like rounding pattern), the transfer speeds observed for the batches with last-modified dates matching the dates of acquisition are at approximately 3 megabits/second, much slower than we would expect for a local file transfer.
In other instances, such as the NGP-VAN research, there have been other factors that help to reinforce a conclusion of a USB device being used, but in this instance there are none.
This is important because there are, in fact, other circumstances in which this rounding pattern can be produced, and one of those is unzipping a zipped archive using Linux’s unzip command.
While this may be less likely for the general public, for an entity such as WikiLeaks the chances of this being a factor are significantly higher. This author reached out to WikiLeaks to inquire about the format of the leak and how the leaks were handled and, while they could confirm WikiLeaks was a Linux house, they couldn’t confirm any details about the leaks and/or how they were handled. Despite the FAT-like rounding observed here, it cannot be definitively attributed to a USB device and there are other possibilities that can’t be ruled out.
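The zip-related cause can be demonstrated in a small sketch. The core timestamp field of the ZIP format uses the same 2-second resolution as FAT, so a time with an odd number of seconds does not survive a round trip through that field. Python’s zipfile module is used here as a stand-in; this is not a test of Linux’s unzip command itself, and some archivers also store optional extra fields with finer-grained times, so whether the pattern appears depends on how an archive was created and extracted. The file name and date are made up.

```python
import io
import zipfile


def zip_round_trip_seconds(second):
    """Store a file with the given seconds value and read the value back."""
    buffer = io.BytesIO()
    info = zipfile.ZipInfo("example.eml", date_time=(2016, 5, 23, 7, 10, second))
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(info, b"placeholder")
    with zipfile.ZipFile(buffer) as archive:
        return archive.getinfo("example.eml").date_time[5]


for original in (7, 8, 59):
    print(original, "->", zip_round_trip_seconds(original))
```

Every value read back is even, so a batch of files extracted from such an archive would carry the same signature as a batch copied to a FAT-formatted device.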
Given that (for the May 23 and May 25 batches) the file last-modified times and the internal email sent times are close in many cases, it seems likely that the original emails were copied soon after acquisition. The anomalous time shift between last-modified timestamps and the send times of emails (especially for the May 25 batch) raises the possibility that an intermediary on the West Coast (US) may have copied the emails to a USB drive. The time shift can be explained by then copying the thumb drive to local storage while at a location in London, for example. The (hypothetical) existence of an intermediary doesn’t tell us anything about the individual (or individuals) who originally acquired the emails. Thus, this scenario does not necessarily rule out the possibility of an insider acquiring the emails. If we contemplate the intermediate use of cloud storage, this could have been used as a method to decouple the acquisition of the emails from delivery to another party that subsequently delivered them to WikiLeaks.
Editor’s Note: This article has been updated to clarify other potential causes for the fact pattern observed and to include information about attempts, made after the article was originally published, to obtain confirmation from WikiLeaks.
Credit (and many thanks) to Forensicator for researching, sharing observations, providing the data set, charts and more.