- New mailing list and users' group with bi-weekly telecons and a face-to-face meeting planned for June 2019 -- Join Now!
- 2017: ISC High Performance 2017 (ISC) Gauss Award Winner: "Diagnosing Performance Variations in HPC Applications Using Machine Learning", which uses LDMS monitoring data as the basis for machine-learning-based performance diagnosis
- LDMS wins 2015 R&D 100 award! LDMS Video
- 2015: ASCR awarded Resilience project "Holistic Measurement Driven Resilience: Combining Operational Fault and Failure Measurements and Fault Injection for Quantifying Fault Detection and Impact"
OVIS/LDMS can be obtained from github.com/ovis-hpc
- Coming mid-to-late December 2018: LDMS v4! The release will be announced here and on the GitHub site.
- The current distribution (v3) includes only the OVIS/LDMS monitoring, transport, and storage components.
Upcoming HPC Monitoring and Analysis Conference Events
- Workshop on Monitoring and Analysis for HPC Systems Plus Applications (HPCMASPA), held in conjunction with IEEE Cluster 2019 in September 2019 in Albuquerque, NM, USA.
- Monitoring Large-Scale HPC Systems -- collaboration and resource site for HPC Monitoring
- Includes materials from SC18 BoF: Monitoring Large-Scale HPC Systems: Extracting and Presenting Meaningful System and Application Insights