Title:
Processing and Analysis of Large Data Sets from High Bandwidth Tactical Networking Experiments Using High Performance Computing.
Source:
ITEA Journal of Test & Evaluation. Sep 2015, Vol. 36 Issue 3, p220-225. 6p.
Database:
Supplemental Index

Systems-of-systems evaluation is critical for mission success as individually complex systems are combined in even more complex environments. Analysis of recorded data can yield essential information for enabling interoperability, improving performance, and engineering resilient combat support systems. In the field of digital communications, large amounts of valuable test data can be generated, often with significant latent information. The timely processing and analysis of these large datasets becomes a high-value endeavor. The growing dependence on digital communications and the availability of highly sophisticated communications systems require more scalable methods to process large amounts of recorded data into a form that analysts can use to make critical decisions. Data collection from a "small" test of 20-30 high-bandwidth radio systems can yield on the order of 1 terabyte of network data for a single 12-hour test period. Processing this data into a data model for use by analysts can take 2-3 days. As larger, more complex test scenarios become necessary, the current processing approach will soon be overwhelmed. The Army Research Laboratory (ARL) and Aberdeen Test Center (ATC) have partnered to employ High Performance Computing (HPC) to address the growing requirements for data processing and analysis. Using the HPC resources at ARL's DoD Supercomputing Resource Center (DSRC), a framework has been designed for distributed processing and analysis of recorded data that can scale to support large data sets. A series of map-reduce operations distributes the processing load across hundreds to thousands of processing cores. Each map-reduce cycle can partition the problem and data along different domains (such as time, data source, or packet attributes) to achieve higher levels of parallelism. The framework uses pluggable processing modules that perform various types of predetermined data analysis functions to generate structured data outputs. The structured data outputs are loaded into a queryable database schema that can be studied in depth by analysts. Visualization modules are being added to a web-based interface to produce high-level views of the test architecture and communications performance and to enable visual analytics of the data. [ABSTRACT FROM AUTHOR]
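
To make the partitioning idea concrete, the following is a minimal Python sketch of one map-reduce cycle over recorded packet data, keyed along the time domain. All names here (PacketRecord, WINDOW_SECONDS, the throughput reducer) are illustrative assumptions for exposition; the abstract does not describe the actual ARL/ATC framework's interfaces, and a production run on DSRC resources would distribute work across nodes rather than a single machine's process pool as shown.

from collections import defaultdict
from dataclasses import dataclass
from multiprocessing import Pool

@dataclass
class PacketRecord:
    """Assumed shape of one recorded packet (illustrative only)."""
    timestamp: float      # seconds since test start
    source: str           # radio system that emitted the packet
    size_bytes: int

WINDOW_SECONDS = 60       # partition domain for this cycle: time

def map_to_window(record):
    """Map: key each packet by the time window it falls in."""
    return (int(record.timestamp // WINDOW_SECONDS), record.size_bytes)

def reduce_window(item):
    """Reduce: collapse one window's packet sizes into mean throughput (bytes/s)."""
    window, sizes = item
    return (window, sum(sizes) / WINDOW_SECONDS)

def run_cycle(records):
    """One map-reduce cycle: map, shuffle (group by key), reduce."""
    with Pool() as pool:  # stands in for the HPC processing cores
        keyed = pool.map(map_to_window, records)
        groups = defaultdict(list)
        for window, size in keyed:   # shuffle step, done serially here
            groups[window].append(size)
        return dict(pool.map(reduce_window, list(groups.items())))

if __name__ == "__main__":
    # 400 synthetic 1500-byte packets, one every half second
    sample = [PacketRecord(t * 0.5, "radio-1", 1500) for t in range(400)]
    print(run_cycle(sample))  # e.g. {0: 3000.0, 1: 3000.0, 2: 3000.0, 3: 1000.0}

A subsequent cycle could re-key the same records by data source or by a packet attribute instead of by time, exposing a different axis of parallelism, which is the motivation for running multiple map-reduce cycles over different domains.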