A meta-analysis approach for feature selection in network traffic research

Abstract

Selecting features for network traffic analysis and anomaly detection is a challenge for experts who aim to build systems that discover traffic patterns, characterize networks, and improve security. There are no major guidelines or best practices for feature selection in the field. The literature is full of differing proposals that ultimately depend on feature availability, types of known traffic, tool limitations, specific goals, and, fundamentally, the experts’ knowledge and intuition. In this work we revisit 71 principal publications in the field of network traffic analysis from 2005 to 2017. Relevant information has been curated according to formalized data structures and stored in JSON format, creating a database for the smart retrieval of network traffic analysis research. A meta-analysis of the explored publications revealed a set of main features that are common to a considerable number of works and could be used as a baseline for future research. Moreover, creating such meta-analysis environments is highly valuable for validation and generalization in network traffic research. It allows criteria for the design of experiments to be homogenized and unified, which helps keep studies from getting lost or becoming irrelevant amid the high complexity and variability that network traffic analysis involves.
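
As a rough illustration of the kind of query such a JSON database makes possible, the sketch below counts how often each feature appears across curated publication records and keeps those used by at least a given share of the works. The record fields (`id`, `year`, `features`), the feature names, the sample entries, and the `common_features` helper are all hypothetical, chosen only for this example; they are not the schema or the data of the paper.

```python
import json
from collections import Counter

# Hypothetical records: field names and feature names are assumptions made
# for illustration only, not the schema or data used in the paper.
RECORDS_JSON = """
[
  {"id": "paper-A", "year": 2009, "features": ["src_ip", "dst_port", "packet_count"]},
  {"id": "paper-B", "year": 2013, "features": ["dst_port", "byte_count", "packet_count"]},
  {"id": "paper-C", "year": 2016, "features": ["dst_port", "duration"]}
]
"""


def common_features(records, min_share=0.5):
    """Return (feature, count) pairs for features used by at least
    `min_share` of the records, ordered by count (descending), then name."""
    counts = Counter()
    for record in records:
        # Use a set so a feature listed twice in one record counts once.
        counts.update(set(record["features"]))
    threshold = min_share * len(records)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(name, n) for name, n in ranked if n >= threshold]


records = json.loads(RECORDS_JSON)
baseline = common_features(records)
print(baseline)
```

With these made-up records, only the features that occur in at least half of the entries survive the filter. The 0.5 threshold is an arbitrary choice for the sketch, not a value taken from the study.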

Publication
Proceedings of the Reproducibility Workshop, SIGCOMM 2017