The raw network packets of the UNSW-NB15 dataset were created by the IXIA PerfectStorm tool in the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) to generate a hybrid of real modern normal activities and synthetic contemporary attack behaviours.
The Tcpdump tool was used to capture 100 GB of raw traffic (e.g., Pcap files). The dataset has nine types of attacks, namely Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode and Worms. The Argus and Bro-IDS tools were used, and twelve algorithms were developed, to generate a total of 49 features with the class label.
This task uses Apache Hive to convert big raw data into useful information for end users. First, study the dataset carefully. Then write at least 4 Hive queries (refer to the marking scheme). Apply appropriate visualization tools to present your findings numerically and graphically, and interpret your findings briefly.
Finally, include screenshots of your outcomes (e.g., tables and plots), together with the scripts/queries, in the report.
Tip: The mark for this section depends on the complexity of your Hive queries; for instance, a simple SELECT query on its own will not earn full marks.
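As a rough illustration of what a query beyond a simple SELECT might look like, the sketch below combines grouping, conditional aggregation and a HAVING filter. It is only a sketch under stated assumptions: it runs the SQL against an in-memory SQLite database as a stand-in for Hive, and the table name `flows`, its columns and its rows are invented for the example rather than taken from the UNSW-NB15 files. The SELECT statement uses standard SQL constructs, but it has not been run against Hive here.

```python
import sqlite3

# Toy stand-in for a Hive table. The table name, column names and rows are
# hypothetical and are NOT taken from the UNSW-NB15 files.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flows (proto TEXT, attack_cat TEXT, sbytes INTEGER, label INTEGER)"
)
conn.executemany(
    "INSERT INTO flows VALUES (?, ?, ?, ?)",
    [
        ("tcp", "Normal", 500, 0),
        ("tcp", "Exploits", 1200, 1),
        ("tcp", "Exploits", 800, 1),
        ("udp", "Generic", 100, 1),
        ("udp", "Normal", 300, 0),
        ("udp", "Generic", 140, 1),
    ],
)

# Per protocol: number of flows, number of attack flows, and mean source bytes.
# Grouping, conditional aggregation and HAVING go beyond a plain SELECT.
query = """
SELECT proto,
       COUNT(*) AS flows,
       SUM(CASE WHEN label = 1 THEN 1 ELSE 0 END) AS attacks,
       AVG(sbytes) AS avg_sbytes
FROM flows
GROUP BY proto
HAVING COUNT(*) >= 2
ORDER BY proto
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The result of such a query (one row per protocol) is small enough to export and plot, which fits the requirement to present findings both numerically and graphically.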
In this section, you will conduct advanced analytics using PySpark.
You need to learn and understand the data through at least 4 analytical methods (e.g., descriptive statistics, correlation, hypothesis testing, density estimation). Present your work numerically and graphically. Apply tooltip text, legends, titles, X-Y labels, etc. as appropriate to help end users gain insights.
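To make three of the listed methods concrete, the following sketch computes descriptive statistics, a Pearson correlation and a Gaussian kernel density estimate. It is written in plain Python so that it runs without a Spark cluster; it is not PySpark code. The variable names `dur` and `sbytes` and all values are invented toy data, and the bandwidth of 0.3 is an arbitrary choice. On the full dataset, PySpark's `DataFrame.describe()` and `DataFrame.stat.corr()` would be the natural starting points for the first two computations.

```python
import math
import statistics

# Toy samples invented for illustration; NOT values from the dataset.
dur = [0.1, 0.4, 0.35, 0.8, 1.2]
sbytes = [200, 450, 400, 900, 1300]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def gaussian_kde(xs, bandwidth):
    """Return a function that estimates the density of xs with Gaussian kernels."""
    n = len(xs)
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs) / norm

    return density


# Descriptive statistics for one variable.
summary = {
    "mean": statistics.fmean(dur),
    "stdev": statistics.stdev(dur),  # sample standard deviation
    "min": min(dur),
    "max": max(dur),
}
r = pearson(dur, sbytes)
kde = gaussian_kde(dur, bandwidth=0.3)  # bandwidth chosen arbitrarily

print(summary)
print("pearson r:", r)
print("density at 0.4:", kde(0.4))
```

Evaluating `kde` over a grid of points gives the curve for a density plot, and `r` is the kind of value a correlation heatmap would display for a pair of features.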
Discuss (1) what alternative technologies are available for tasks 2 and 3 and how they differ (use academic references), and (2) what surprisingly new thinking was evoked and/or neglected on your part.
Tip: include an individual assessment of each member in the same report.
Document all your work. Your final report must follow the 5 sections detailed in the “format of final submission” section (refer to the next page). Your work must demonstrate an appropriate understanding of academic writing and integrity.