LDMS 3.x Plugins

Latest revision as of 09:18, 14 December 2018

You can list the stores, samplers, and usage hints for your LDMS 3.4 installation with (typically)

 /usr/bin/ldms_list_plugins.sh

This lists the compiled and installed plugins and their options for use in the LDMS configuration language. The list may include experimental plugins. Production plugins generally have a man page; for example, to learn about the meminfo plugin, try

 man Plugin_meminfo

Unfortunately, the man pages may not be installed on your system or they may not yet be complete. If anything in the table below provokes a question, please send it to the ovis-help mailing list described elsewhere.
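Because both the lister script and the man pages may be missing on a given system, it can help to check for them before relying on either. The following is a minimal sketch, not part of LDMS: the helper names have_lister and have_man_page and the LISTER variable are hypothetical, and the default path is simply the typical location quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with LDMS): report whether the plugin
# lister and a plugin man page are available on this system.

# Typical location quoted in the text; override LISTER if your prefix differs.
LISTER=${LISTER:-/usr/bin/ldms_list_plugins.sh}

have_lister() {
    # True when the given path exists and is executable.
    [ -x "$1" ]
}

have_man_page() {
    # True when man can locate a page named Plugin_<plugin>.
    man -w "Plugin_$1" >/dev/null 2>&1
}

if have_lister "$LISTER"; then
    "$LISTER"
else
    echo "plugin lister not found at $LISTER; check your install prefix" >&2
fi

if have_man_page meminfo; then
    echo "man Plugin_meminfo is available"
else
    echo "no man page found for meminfo" >&2
fi
```

If the installation lives under a non-default prefix, set LISTER to the script's full path before running the sketch.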

Migration note: LDMS v4 will introduce ldms-plugins.sh with better functionality.

== Production Sampler Plugins ==

Below is a summary of the plugins that can be considered production quality in release 3.4.4 for commodity Linux environments.

{|class="wikitable"
!name || metric set content
|-
| dstat || memory and other statistics from LDMSD itself
|-
| edac || memory error checking from /sys/devices/system/edac metrics
|-
| jobid || currently running job id and user (requires loose cooperation from the queuing system, e.g. slurm)
|-
| lnet_stats || /proc/sys/lnet/stats metrics, particularly memory
|-
| lustre2_client || Lustre client metrics
|-
| meminfo || /proc/meminfo values
|-
| procinterrupts || interrupt counters (very large datasets on many-core machines)
|-
| procnetdev || /proc/net tcp interface device counters (excludes rdma traffic/errors)
|-
| procnfs || NFS v3 client statistics (calls, bytes)
|-
| procstat || /proc/stat counters (includes cpu tick)
|-
| sysclassib || InfiniBand counters and rates (includes rdma traffic/errors)
|-
| vmstat || /proc/vmstat counters
|}
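Several of the samplers above read a specific file or directory, so a quick way to predict which ones can produce data on a node is to check that those sources exist. The sketch below does only that. The function names source_for and source_status are hypothetical, the mapping covers only the plugins whose source path is stated in the table, and a present source does not by itself mean the plugin was compiled or installed.

```shell
#!/bin/sh
# Hypothetical check: do the data sources named in the sampler table exist
# on this node? Says nothing about whether the plugin itself is installed.

source_for() {
    # Print the data source the table gives for a plugin; fail if none is stated.
    case "$1" in
        edac)       echo /sys/devices/system/edac ;;
        lnet_stats) echo /proc/sys/lnet/stats ;;
        meminfo)    echo /proc/meminfo ;;
        procstat)   echo /proc/stat ;;
        vmstat)     echo /proc/vmstat ;;
        *)          return 1 ;;
    esac
}

source_status() {
    # Print "present" if the path exists, otherwise "missing".
    if [ -e "$1" ]; then echo present; else echo missing; fi
}

for plugin in edac lnet_stats meminfo procstat vmstat; do
    path=$(source_for "$plugin")
    printf '%-12s %-28s %s\n' "$plugin" "$path" "$(source_status "$path")"
done
```

A "missing" result for lnet_stats or edac is expected on nodes without Lustre networking or EDAC support, and is a hint that the corresponding sampler is not worth configuring there.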

== Production Store Plugins ==

{|class="wikitable"
!name || output format || notes
|-
| store_csv || CSV files || requires that metric sets of a given name (schema) have identical size across all nodes.
|-
| store_flatfile || 1-metric files || narrow CSV file per metric with timestamp and source columns. Tolerates conflicting schema definitions.
|-
| store_rabbitv3 || AMQP messages || feeds all metrics to a single AMQP broker via librabbitmq 0.8. (multiple routing keys)
|-
| store_rabbitkw || AMQP messages || feeds all metrics to a single AMQP broker via librabbitmq 0.8. (single routing key)
|}

== Vendor specific ==

Below is a summary of the plugins that can be considered production quality in release 3.4.4 for CLE6.

{|class="wikitable"
| aries_mmr/nic_mmr/rtr_mmr || Cray XC Aries network counters
|-
| cray_system_sampler || wide variety of Cray XE/XC metrics
|-
| kgnilnd || kernel-space sampler for network data
|}

These also work on appropriate platforms:

{|class="wikitable"
| cray_power_sampler || ?
|-
| msr_interlagos || AMD CPU event counters
|}

== Testing/Tutorial ==

{|class="wikitable"
| all_example || (tutorial/testing)
|-
| array_example || arrays (tutorial/testing)
|-
| clock || ticker (tutorial/testing)
|-
| fptrans || floating point transmission test sets (testing)
|-
| generic_sampler || (tutorial/testing)
|-
| synthetic || generates synthetic waveform data sets (testing)
|}


== Experimental ==

{|class="wikitable"
| hfclock || high frequency clock
|-
| perfevent || Process perf event counters
|-
| power_sampler || Power API 1
|-
| rapl || PAPI counters
|-
| spapi || PAPI counters
|-
| switchx || InfiniBand switch port data
|-
| test_sampler || ?
|-
| timer_base || ?
|-
| tsampler || ?
|-
| store_sos || Scalable Object Store storage prototype
|}

== Other ==

All other plugins should be considered unsupported.