Argus developer v7 training
Because Argus can be deployed in end systems and embedded in networking devices, its data is well suited to host behavior-based traffic classification as well as NFV/SDN-based classification. Argus data also contains sampled payload data, which supports most payload-based traffic classification, especially encrypted traffic classification. Traffic classification dominates the network ML literature, and Argus data is designed specifically for flow feature-based, as well as early and sub-flow-based, traffic analytics and classification. The source of data is either packets (packet-level features) or network flow data (flow-level features).
Argus is used in the UNSW-NB15 dataset, which is the most frequently used standard data set for network-based machine learning studies because of its rich data features. In the literature, many papers apply ML to network traffic prediction, classification, routing, operations, performance and network security. Because Argus has been used in many ML network research projects, it is a proven source of network traffic data that can deliver predictable and reliable results. Of the four broad categories of problems that can leverage ML, the Argus Project focuses on clustering, classification and regression of network traffic flow features. Designed data, rather than trying to make do with the data that is lying around, is key to successful ML solutions.
Data generation, collection, feature engineering, establishing ground truth, model development and validation, model optimization, and deployment are all complex tasks that must be addressed when considering an operational ML approach to any problem. Non-statistical approaches yield the best prediction results. ML and networking have been an interesting pair for a few decades now, and a number of basic concepts have emerged that will help the data scientist approach the complexities of this topic.
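The distinction between packet-level and flow-level features can be illustrated with a small sketch. The Python below groups packets into bidirectional flows and derives a few flow-level features from them. The Packet record, its field names and the chosen features are hypothetical and picked only for illustration; this is not the Argus record format or its flow model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Hypothetical packet record; field names are illustrative only.
    ts: float    # capture timestamp, in seconds
    src: str     # source address
    dst: str     # destination address
    sport: int   # source port
    dport: int   # destination port
    proto: str   # transport protocol name
    size: int    # packet length, in bytes


def flow_key(p):
    """Direction-independent key so both directions map to one flow."""
    endpoints = sorted([(p.src, p.sport), (p.dst, p.dport)])
    return (p.proto, *endpoints)


def flow_features(packets):
    """Aggregate packet-level records into per-flow feature dictionaries."""
    flows = defaultdict(list)
    for p in packets:
        flows[flow_key(p)].append(p)

    features = {}
    for key, pkts in flows.items():
        times = [p.ts for p in pkts]
        total_bytes = sum(p.size for p in pkts)
        features[key] = {
            "packets": len(pkts),
            "bytes": total_bytes,
            "duration": max(times) - min(times),
            "mean_size": total_bytes / len(pkts),
        }
    return features


packets = [
    Packet(0.00, "10.0.0.1", "10.0.0.2", 40000, 443, "tcp", 60),
    Packet(0.05, "10.0.0.2", "10.0.0.1", 443, 40000, "tcp", 1500),
    Packet(0.25, "10.0.0.1", "10.0.0.2", 40000, 443, "tcp", 500),
    Packet(1.00, "10.0.0.3", "10.0.0.2", 5353, 53, "udp", 80),
]
feats = flow_features(packets)
for key, values in feats.items():
    print(key, values)
```

Feature dictionaries of this kind are the sort of input that clustering, classification or regression models consume. A real deployment would read flow records produced by Argus itself instead of rebuilding them from packets.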
Argus provides the most mature streaming network situational awareness capability available, providing guarantees on data timeliness, order and state, making it a natural choice for ML, which depends on data that is actually useful and reliable.
Deploying ML in operational networks requires a stable and reliable network data generation and processing platform. Argus is known for being one of the first large sources of network data that is well structured, with the right kind of attributes to allow ML to "peek" into network state and condition in real time. Having many attributes associated with the data is a key component for deep learning. Having massive amounts of non-statistical, historical, transactional network activity data is really important for ML model development and training. The very large data capabilities, rich data models, flexible data formats, high-performance data generation and processing, metadata enhancement capabilities, streaming and block processing strategies, and technical maturity all come together to provide an environment where successful ML and AI models can be developed, tested, optimized and deployed. Unsupervised learning using network flow data has been an active research topic for many years, and organizations like Oak Ridge National Lab (ORNL) have had great results using Argus data in their operational system, SITU. The Argus system is the network data source of choice for many prominent Machine Learning (ML) Network Intrusion Detection and Anomaly Detection projects.