Radio frequency traffic classification over WLAN networks

Aim

Network traffic classification is the process of analysing traffic flows and associating them with different categories of network applications, i.e., associating a flow of packets with the application that generated it. Traffic analysis and classification is an essential task for both network management architectures and network security solutions.

The aim of this case study is to investigate whether features can be extracted from the Radio Frequency (RF) domain of a WLAN, and whether such features can be used effectively to classify the different types of traffic carried over the wireless network. The ultimate goal is to detect unauthorised traffic on a network of wireless Internet Protocol (IP) based radios for the purpose of intrusion detection.

Background

Several classification techniques have been presented in the literature. They fall into three major approaches to classifying Internet traffic flows: the Port-based approach, the Deep Packet Inspection (DPI) approach, and the Statistical-based approach, which relies on statistical characteristics of flow behaviour. However, each of these approaches has certain limitations.

The simplest Port-based method consists of examining, for example, the TCP port numbers. Port-based approaches are simple because many well-known applications use specific port numbers; for example, HTTP traffic uses port 80 and FTP traffic uses port 21. However, the research community now recognises that Port-based classification is inadequate, mainly because many applications use dynamic port-negotiation mechanisms to hide from firewalls and network security tools. In addition, when traffic is sent over a WLAN, the port numbers are typically encrypted, because they belong to the transport-layer header carried inside the frame payload rather than to the WLAN protocol headers themselves.
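As a minimal sketch of the Port-based idea described above, the following Python fragment looks up a flow's ports in a small table. Only the two port numbers mentioned in the text (80 and 21) are included; the function name `classify_by_port` and the "unknown" label are assumptions made for illustration, not part of the case study.

```python
# Illustrative port-to-application table.
# Only the two ports named in the text are listed here.
WELL_KNOWN_PORTS = {
    80: "HTTP",
    21: "FTP",
}


def classify_by_port(src_port: int, dst_port: int) -> str:
    """Return an application label for a flow, or 'unknown' if no port matches.

    Hypothetical helper: the destination port is checked first,
    then the source port.
    """
    for port in (dst_port, src_port):
        label = WELL_KNOWN_PORTS.get(port)
        if label is not None:
            return label
    return "unknown"


http_label = classify_by_port(49152, 80)
dynamic_label = classify_by_port(49152, 50321)
print(http_label, dynamic_label)
```

The second call shows the weakness noted above: a flow that negotiates its ports dynamically matches no table entry and cannot be labelled.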

Approaches based on Deep Packet Inspection are usually considered very reliable for unencrypted traffic that is not encapsulated in other application-level protocols. However, current trends show that the proportion of encrypted traffic on the Internet is constantly increasing, and several applications use protocol encapsulation or obfuscation to evade network policy enforced through filtering. In addition, access to the full payload is often not possible (e.g., due to privacy and performance issues).

Current Statistical-based approaches have to collect all the packets in a traffic flow before they can classify the network application. This is useful for offline classification, but it cannot be applied in an online environment.

Experimental work

Six popular Internet applications were used to generate the Internet traffic data set. The data set covered the following traffic types: video streaming, online video, FTP, web browsing, email, and Skype. In addition, each traffic type was collected with several variations; for example, the most popular web browsers were used in the data set collection, including Internet Explorer, Firefox, Safari, and Chrome.

List of Internet applications

Application name | Software | Server / Website
Video streaming | Darwin Streaming Server and QuickTime | Different video test sequences
Online video | Internet Explorer, Firefox, Safari, Chrome | YouTube, Live TV
FTP | Internet Explorer, Firefox, Safari, Chrome, FileZilla | Different FTP servers
Web browsing | Internet Explorer, Firefox, Safari, Chrome | Browsing through different websites
Email | Internet Explorer, Firefox, Safari, Chrome, Thunderbird | Hotmail
Skype | Skype | Different Skype conversations

Different classification approaches

Two traffic classification approaches were used for the analysis. The first is a statistical analysis of the encrypted MAC-frame-level traffic. It examines the statistical properties of the captured wireless traces in order to characterise each traffic type in terms of its average Frame Length/Capture Length and its average Frame Inter-Arrival Time (IAT).
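The two statistics named above can be sketched as follows. This is an illustrative computation only: the input format (a time-ordered list of timestamp and frame-length pairs), the use of bytes as the length unit, and the function name `frame_statistics` are assumptions, and the short trace is made-up example data rather than a capture from the case study.

```python
from statistics import mean


def frame_statistics(frames):
    """Compute the average frame length and the average inter-arrival time.

    frames: list of (timestamp_seconds, frame_length_bytes) tuples,
    assumed to be sorted by timestamp.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to compute an inter-arrival time")
    lengths = [length for _, length in frames]
    times = [t for t, _ in frames]
    # Inter-arrival time: gap between each frame and the one before it.
    iats = [later - earlier for earlier, later in zip(times, times[1:])]
    return mean(lengths), mean(iats)


# Made-up trace for illustration: three frames, half a second apart.
trace = [(0.0, 100), (0.5, 200), (1.0, 300)]
avg_len, avg_iat = frame_statistics(trace)
print(avg_len, avg_iat)
```

Both values depend only on frame sizes and timing, which remain observable even when the frame payload is encrypted.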

The second approach was based on Machine Learning algorithms, where features were extracted from the data and used to train various machine learning algorithms. This included feature collation, in which two key attributes were supplied to the algorithms to perform traffic classification: the packet length (number of bits) and the packet inter-arrival time (seconds).
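The text does not name the algorithms that were trained, so the sketch below uses a nearest-centroid classifier purely as a simple stand-in to show how the two attributes (packet length in bits, inter-arrival time in seconds) could feed a learner. The function names, the scaling step, and all sample values are assumptions for illustration; the training samples are synthetic and are not taken from the case study's data set.

```python
from math import dist
from statistics import mean


def train_centroids(samples):
    """samples: list of ((length_bits, iat_seconds), label) pairs.

    Returns (centroids, scale), where scale holds the per-feature maximum
    used to bring both features into a comparable range.
    """
    scale = tuple(max(f[i] for f, _ in samples) or 1.0 for i in range(2))
    grouped = {}
    for features, label in samples:
        scaled = tuple(v / s for v, s in zip(features, scale))
        grouped.setdefault(label, []).append(scaled)
    # One centroid per label: the mean of each scaled feature.
    centroids = {
        label: tuple(mean(column) for column in zip(*points))
        for label, points in grouped.items()
    }
    return centroids, scale


def classify(features, centroids, scale):
    """Return the label whose centroid is closest to the scaled features."""
    scaled = tuple(v / s for v, s in zip(features, scale))
    return min(centroids, key=lambda label: dist(scaled, centroids[label]))


# Synthetic training samples (not measured data).
training = [
    ((12000, 0.002), "FTP"),
    ((11600, 0.003), "FTP"),
    ((1200, 0.020), "Skype"),
    ((1600, 0.018), "Skype"),
]
centroids, scale = train_centroids(training)
print(classify((11000, 0.004), centroids, scale))
print(classify((1400, 0.019), centroids, scale))
```

Scaling matters here because packet lengths in bits are several orders of magnitude larger than inter-arrival times in seconds; without it, the distance would be determined almost entirely by length.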

The results of the statistical analysis approach show that low-data-rate and high-data-rate applications, such as Skype and FTP, can be distinguished statistically to some extent by their average Frame Length/Capture Length and average Inter-Arrival Time. Similar applications, however, differ only by very fine margins, and these margins are affected by time-varying network conditions, by different radio propagation conditions over the WLAN, and by the software used for the application.

In addition, the traffic characteristics of even a single application type can vary greatly between different days and between different software.

Impact and applications

Traffic analysis and classification is an essential task for both network management architectures and network security solutions. This case study has investigated whether features can be extracted from the Radio Frequency (RF) domain of a WLAN, and whether such features can be used effectively to classify the different types of traffic carried over the wireless network.

  • Traffic classification plays an important role in common network management applications, such as network monitoring and intrusion detection. For example, an enterprise or a service provider can apply various rules to protect network resources or enforce organisation policies according to the classification results. Therefore, accurate traffic classification is a key factor in network management and monitoring.
  • Other main motivations for seeking new and reliable traffic classification techniques are to offer adequate Quality of Service (QoS) according to the category of traffic carried by each flow, and to bill customers based not only on bandwidth usage but also on traffic category.
  • Network traffic classification is an essential task in the whole chain of network security. Some of its most important and widespread applications pertain to network security, such as the enforcement of security policies on the use of different applications, the ability to classify encrypted traffic, and the identification of malicious traffic flows.