HTTP-aware anonymisation of packet traces
North East Wales Institute
Computing, Health and Science
School of Computer and Information Science, Centre for Security Research
Internet packet traces, which are used to observe the characteristics of current network applications, must for legal reasons be anonymised before they are stored. Anonymisation reduces the range of application-level statistics that can later be derived from the collected traces. This study evaluates how much information can be retrieved from packet traces that were anonymised while retaining the HTTP header tags, and proposes an anonymisation method that supports current non-intrusive research into WWW characteristics without breaching user privacy. The second part of the study uses the proposed technique to provide detailed statistics on the characteristics of HTTP dialogues, as extracted from anonymised network traces. The results revealed possible sources of bias, such as the effect of large files on average object sizes, a relatively high proportion of HTTP 1.0 servers, considering the limitations of that protocol version, and a majority of pages with an age of less than one year.
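To make the idea of anonymising a trace while retaining the HTTP header tags concrete, the following is a minimal sketch and not the method used in the study. It assumes the HTTP payload has already been extracted from a packet. The set of headers treated as sensitive (`SENSITIVE`), the keyed-hash pseudonym and the function names are all hypothetical choices made for illustration. Header names, protocol versions, status lines and non-sensitive values are kept, the request URL and sensitive values are replaced by pseudonyms, and the message body is discarded.

```python
import hashlib
import hmac

# Hypothetical list of privacy-sensitive headers; an assumption, not taken from the study.
SENSITIVE = {"host", "cookie", "set-cookie", "referer", "authorization", "from"}


def pseudonym(value: str, key: bytes) -> str:
    """Map a value to a short keyed hash so that equal values remain comparable."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def anonymise_http(payload: bytes, key: bytes) -> bytes:
    """Keep header tags, hash the URL and sensitive values, drop the body."""
    head, _, _body = payload.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: keep method and version, replace the URL.
    # Status lines (starting with "HTTP/") are left unchanged.
    first = lines[0].split(" ")
    if len(first) == 3 and not first[0].startswith("HTTP/"):
        first[1] = "/" + pseudonym(first[1], key)
    out = [" ".join(first)]

    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() in SENSITIVE:
            value = " " + pseudonym(value.strip(), key)
        out.append(name + sep + value)

    return ("\r\n".join(out) + "\r\n\r\n").encode("iso-8859-1")


request = (
    b"GET /private/page.html HTTP/1.0\r\n"
    b"Host: example.org\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
anonymised = anonymise_http(request, key=b"trace-secret")
```

In this sketch, fields such as the protocol version and `Content-Length` survive anonymisation, so statistics on object sizes or on HTTP version usage could still be computed, while the host name, the URL and the payload are no longer recoverable without the key.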