LDMS 3.x Plugins

Revision as of 10:23, 16 March 2018 by Baallan (talk | contribs) (Production Store Plugins)

You can list the stores, samplers, and usage hints for your LDMS 3.4 installation with (typically):

 /usr/bin/ldms_list_plugins.sh

This lists the compiled and installed plugins and their options for use in the LDMS configuration language. The list may include experimental plugins. Generally, production plugins have a man page; for example, to learn about the meminfo plugin, try:

 man Plugin_meminfo

Unfortunately, the man pages may not be installed on your system or they may not yet be complete. If anything in the table below provokes a question, please send it to the ovis-help mailing list described elsewhere.
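
Since neither the man pages nor the listing script are guaranteed to be present, it can help to check for them before relying on either. The following is a minimal sketch and is not shipped with LDMS: the function names plugin_help_source and show_plugin_help are hypothetical, and the default script path is only the typical location quoted above, so override LIST_SCRIPT if your installation differs.

```shell
#!/bin/sh
# Hypothetical helpers, not part of LDMS.
# LIST_SCRIPT defaults to the typical location; adjust for your installation.
LIST_SCRIPT="${LIST_SCRIPT:-/usr/bin/ldms_list_plugins.sh}"

# Report which help source is available for a plugin:
#   man  - a Plugin_<name> man page is installed
#   list - no man page, but the listing script is executable
#   none - neither was found
plugin_help_source() {
    if man -w "Plugin_$1" >/dev/null 2>&1; then
        echo "man"
    elif [ -x "$LIST_SCRIPT" ]; then
        echo "list"
    else
        echo "none"
    fi
}

# Show the best available help for a plugin.
show_plugin_help() {
    case "$(plugin_help_source "$1")" in
        man)  man "Plugin_$1" ;;
        list) "$LIST_SCRIPT" ;;
        *)    echo "No man page or listing script found for $1" >&2
              return 1 ;;
    esac
}
```

For example, show_plugin_help meminfo opens the man page when it is installed and otherwise falls back to the full plugin listing.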

Production Sampler Plugins

Below is a summary of the plugins that can be considered production quality in release 3.4.4 for commodity Linux environments.

name            metric set content
edac            memory error checking metrics from /sys/devices/system/edac
jobid           currently running job id and user (requires loose cooperation from the queuing system, e.g. slurm)
lnet_stats      /proc/sys/lnet/stats metrics, particularly memory
lustre2_client  Lustre client metrics
meminfo         /proc/meminfo values
procinterrupts  interrupt counters (very large datasets on many-core machines)
procnetdev      /proc/net/dev interface device counters (excludes RDMA traffic/errors)
procnfs         NFS v3 client statistics (calls, bytes)
procstat        /proc/stat counters (includes cpu tick)
sysclassib      InfiniBand counters and rates (includes RDMA traffic/errors)
vmstat          /proc/vmstat counters
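
Several of the samplers above report values taken from kernel pseudo-files, so looking at the file directly is a quick way to see what raw data a sampler draws on. The sketch below is independent of LDMS: meminfo_pairs is a hypothetical helper that turns /proc/meminfo-style text into name,value pairs, and it makes no attempt to reproduce the metric names or units the meminfo plugin actually publishes.

```shell
#!/bin/sh
# Hypothetical helper, not part of LDMS.
# Reads /proc/meminfo-style lines such as
#   MemTotal:       16316412 kB
# and prints "name,value" with the unit suffix dropped.
# Pass a file name as the first argument, or omit it to read /proc/meminfo.
meminfo_pairs() {
    awk -F':' 'NF >= 2 {
        gsub(/^[ \t]+/, "", $2)      # strip padding before the value
        split($2, v, /[ \t]+/)       # v[1] = number, v[2] = optional unit
        print $1 "," v[1]
    }' "${1:-/proc/meminfo}"
}
```

Running meminfo_pairs with no argument on a Linux node prints one line per /proc/meminfo entry, which gives a rough idea of how many values a single sample of that file contains.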

Production Store Plugins

name            output format   notes
store_csv       CSV files       Requires the size of metric sets of a given name (schema) be identical across all nodes.
store_flatfile  1-metric files  Narrow CSV file per metric with timestamp and source columns. Tolerates conflicting schema definitions.
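
Because store_csv expects every node to report the same set size for a schema, one plausible symptom of a mismatch is a CSV file whose rows differ in width. The following sketch checks for that outside of LDMS. The helpers csv_widths and csv_is_uniform are hypothetical, and they assume a plain comma-separated file with no quoted commas; whether store_csv would actually emit ragged rows in this situation is an assumption here, not documented behavior.

```shell
#!/bin/sh
# Hypothetical helpers, not part of LDMS.
# Assumes plain comma-separated rows without quoted commas.

# Print each distinct field count followed by the number of rows having it.
csv_widths() {
    awk -F',' '{ n[NF]++ } END { for (w in n) print w, n[w] }' "$1" | sort -n
}

# Succeed (exit 0) only if every row has the same number of fields
# as the first row.
csv_is_uniform() {
    awk -F',' 'NR == 1 { w = NF } NF != w { bad = 1 } END { exit bad + 0 }' "$1"
}
```

A typical use is csv_is_uniform FILE || csv_widths FILE, which prints the width histogram only when the file is inconsistent.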

Vendor specific

Below is a summary of the plugins that can be considered production quality in release 3.4.4 for CLE6 (Cray Linux Environment 6):

aries_mmr/nic_mmr/rtr_mmr Cray XC aries network counters
cray_system_sampler wide variety of Cray XC metrics

These also work on appropriate platforms

cray_power_sampler  ?
kgnilnd kernel-space sampler for network data
msr_interlagos AMD CPU event counters

Testing/Tutorial

all_example (tutorial/testing)
array_example arrays (tutorial/testing)
clock ticker (tutorial/testing)
fptrans floating point transmission test sets (testing)
generic_sampler (tutorial/testing)
synthetic generates synthetic waveform data sets (testing)

Need TLC, but can work under some circumstances

procdiskstats  ?
procsensors temperature sensors
sampler_atasmart SMART counters from devices
lustre2_mds
lustre2_oss

Retiring unless a champion volunteers

hadoop ?
knc_sampler knights corner
knc_sampler_copy ditto
knc_sampler_derived ditto

Experimental

hfclock high frequency clock
perfevent Process perf event counters
power_sampler Power API 1
rapl PAPI counters
spapi PAPI counters
switchx infiniband switch port data
test_sampler  ?
timer_base  ?
tsampler  ?