Document Type

Conference Proceeding

Abstract

Fake file systems are used in cyber deception to bait intruders and fool forensic investigators. File system researchers also frequently generate their own synthetic document repositories because of the data privacy and copyright concerns associated with experimenting on real-world corpora. In both fields, realism is critical. Unfortunately, once a set of files and folders has been created, no current testing standards can be applied to validate its authenticity or, conversely, to reliably automate its detection. This paper reviews the previous 30 years of file system surveys on real-world corpora to identify a set of discrete measures for generating synthetic file systems. Statistical distributions, such as the size, age and lifetime of files, common file types, compression and duplication ratios, and directory distribution and depth (and its relationship with the numbers of files and sub-directories), were identified and their respective merits discussed. Additionally, this paper highlights notable absences in these surveys that could be beneficial to address, such as analysing, en masse, text content distribution and file naming habits, and comparing file access times against traditional working hours.
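As an illustration only, several of the measures named in the abstract (file size, file type, and directory depth and its relationship with file and sub-directory counts) could be collected from a directory tree roughly as follows. This is a minimal sketch and not the method of the paper: the function name `survey_tree`, the returned keys, and the choice to treat the root directory as depth 0 are all assumptions made for the example.

```python
import os
from collections import Counter


def survey_tree(root):
    """Collect simple per-file and per-directory measures under `root`.

    Hypothetical helper for illustration; the measures and their names
    are assumptions, not definitions taken from the surveyed literature.
    """
    root = os.path.abspath(root)
    sizes = []                    # file sizes in bytes
    extensions = Counter()        # lower-cased extension -> number of files
    files_by_depth = Counter()    # directory depth -> files stored at that depth
    subdirs_by_depth = Counter()  # directory depth -> sub-directories found there

    # os.walk does not follow symbolic links to directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Assumption: the root itself counts as depth 0.
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        subdirs_by_depth[depth] += len(dirnames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size
            except OSError:
                continue  # skip entries that vanish or cannot be read
            sizes.append(size)
            extensions[os.path.splitext(name)[1].lower()] += 1
            files_by_depth[depth] += 1

    return {
        "sizes": sizes,
        "extensions": extensions,
        "files_by_depth": files_by_depth,
        "subdirs_by_depth": subdirs_by_depth,
    }
```

Running the same function over a real corpus and over a generated one would give two sets of distributions that can then be compared. Measures such as file age, lifetime, compression ratio and duplication ratio are not covered by this sketch.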

Comments

12th Australian Digital Forensics Conference. Held on 1-3 December 2014 at Edith Cowan University, Joondalup Campus, Perth, Western Australia.

DOI

10.4225/75/57b3dc72fb878

Included in

Computer Engineering Commons, Computer Sciences Commons

Australian Digital Forensics Conference

Towards a set of metrics to guide the generation of fake computer file systems