Difference between revisions of "OVISWiki:Current events"
Line 1:
  == News ==
+ * See NEWS at [https://github.com/ovis-hpc/ovis/wiki github-wiki] - includes information for the bi-weekly user telecons including notes.
+ ** Coming soon - LDMSCON2020 announcement!
+
  * We are working on a lightweight [[very light secure app monitoring approach]]
−
−
  * [https://csl.illinois.edu/news/protecting-super-computing-environments-artificial-intelligence Sandia-UIUC collaboration on AI for Supercomputer Diagnostics]
  * [http://isc-hpc.com/isc-2017.html 2017 ISC High Performance 2017 (ISC)] Gauss Award Winner: [[Media:ISC_SNL_BU_MachineLearning.pdf| Diagnosing Performance Variations in HPC Applications Using Machine Learning]] - using LDMS monitoring data as the basis for Machine Learning-based Performance Diagnosis
Revision as of 13:48, 29 February 2020
News
- See NEWS at the github-wiki, which includes information and notes for the bi-weekly user telecons.
  - Coming soon: LDMSCON2020 announcement!
- We are working on a very light, secure app monitoring approach
- Sandia-UIUC collaboration on AI for Supercomputer Diagnostics
- ISC High Performance 2017 (ISC) Gauss Award Winner: Diagnosing Performance Variations in HPC Applications Using Machine Learning - using LDMS monitoring data as the basis for Machine Learning-based Performance Diagnosis
- LDMS wins 2015 R&D 100 award! LDMS Video
- 2015: ASCR awarded Resilience project Holistic Measurement Driven Resilience: Combining Operational Fault and Failure Measurements and Fault Injection for Quantifying Fault Detection and Impact
Releases
OVIS/LDMS can be obtained from github.com/ovis-hpc
- LDMS v4 is available at the GitHub site!
- The current distribution includes only the OVIS/LDMS monitoring, transport, and storage components.
Upcoming HPC Monitoring and Analysis Conference Events
- Workshop on Monitoring and Analysis for HPC Systems Plus Applications (HPCMASPA), held in conjunction with IEEE Cluster 2019 in September 2019 in Albuquerque, NM, USA.
- Monitoring Large-Scale HPC Systems -- collaboration and resource site for HPC Monitoring
  - Includes materials from SC18 BoF: Monitoring Large-Scale HPC Systems: Extracting and Presenting Meaningful System and Application Insights