Main Page

From OVISWiki
Revision as of 12:28, 16 February 2018 by Gentile (talk | contribs) (Get rid of the ancient viz)

OVIS is a modular system for HPC data collection, transport, storage, analysis, visualization, and response. The OVIS project seeks to enable more effective use of high-performance computing clusters through a better understanding of how applications use resources, including the effects of competition for shared resources; discovery of abnormal system conditions; and intelligent response to conditions of interest.

Data Collection, Transport, and Storage

The Lightweight Distributed Metric Service (LDMS) is the OVIS data collection and transport system. LDMS provides lightweight run-time collection of high-fidelity data. Data can be accessed on-node or transported off-node, and LDMS can also write it to a variety of storage back ends.

Log Message Analysis

OVIS analyses include the Baler tool for log message clustering.

Decision Support

The OVIS project includes research on determining intelligent responses to conditions of interest. This covers dynamic application (re-)mapping based on application needs and resource state, as well as invocation of resiliency responses when potential pre-failure or abnormal conditions are discovered.

Collaborative Analysis Support

Shaun, a cluster supporting collaboration in HPC data analytics, is coming soon.