Sampler APIv2

This website contains archival information. For updates, see https://github.com/ovis-hpc/ovis-wiki/wiki

API

The base API is extended by get_set and sample.

get_set is called by ldmsd after the sampler has been configured.

sample is called repeatedly after ldmsd has used get_set.

Lifecycle

1. Load
2. Configuration
3. Get set
4. Sample when told
5. Clear plug-in resources

Stage 2 (Configuration) may be repeated to change the set of metrics collected. These changes propagate through the LDMSD aggregation hierarchy with the next pull.
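The order of calls in the lifecycle can be sketched with a small stand-in for the plug-in side. Everything named here (demo_sampler, demo_set, demo_load, and the metric_count parameter) is hypothetical and is not taken from the ldmsd headers; the sketch only illustrates the stage order and that configuration may be repeated before the set is fetched again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_METRICS 8

/* Hypothetical stand-in for a metric set: names plus current values. */
struct demo_set {
	size_t count;
	const char *names[DEMO_MAX_METRICS];
	unsigned long values[DEMO_MAX_METRICS];
};

/* Hypothetical function table mirroring the lifecycle stages. */
struct demo_sampler {
	int (*config)(size_t metric_count);  /* stage 2, repeatable */
	struct demo_set *(*get_set)(void);   /* stage 3 */
	int (*sample)(void);                 /* stage 4, called repeatedly */
	void (*term)(void);                  /* stage 5 */
};

static struct demo_set the_set;
static unsigned long tick;

static int demo_config(size_t metric_count)
{
	static const char *all_names[DEMO_MAX_METRICS] = {
		"m0", "m1", "m2", "m3", "m4", "m5", "m6", "m7"
	};
	if (metric_count == 0 || metric_count > DEMO_MAX_METRICS)
		return 1; /* non-zero status reports a configuration error */
	/* Reconfiguring replaces the previous set definition. */
	memset(&the_set, 0, sizeof(the_set));
	the_set.count = metric_count;
	for (size_t i = 0; i < metric_count; i++)
		the_set.names[i] = all_names[i];
	return 0;
}

static struct demo_set *demo_get_set(void)
{
	/* No set exists until configuration has succeeded. */
	return the_set.count ? &the_set : NULL;
}

static int demo_sample(void)
{
	if (the_set.count == 0)
		return 1; /* sampling before configuration is an error */
	assert(the_set.count <= DEMO_MAX_METRICS);
	tick++;
	for (size_t i = 0; i < the_set.count; i++)
		the_set.values[i] = tick + i; /* placeholder data */
	return 0;
}

static void demo_term(void)
{
	/* Release plug-in state; here that just means clearing it. */
	memset(&the_set, 0, sizeof(the_set));
	tick = 0;
}

/* Stage 1 (load) is modelled as obtaining the function table. */
static struct demo_sampler *demo_load(void)
{
	static struct demo_sampler s = {
		demo_config, demo_get_set, demo_sample, demo_term
	};
	return &s;
}
```

A caller would run demo_load() once, then config, get_set, and any number of sample calls. Calling config again with a different metric_count, followed by get_set, models the repeated stage 2. term ends the plug-in's life.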

Important things to design around

staticness – Specific store plug-ins configured at the aggregator may not be able to handle change once the initial metric set definition is received. E.g. a CSV store with fixed column definitions in a single file cannot handle column redefinition. In v2, its aggregator must be stopped and restarted with the new set definition.
homogeneity – Specific store plug-ins (again, e.g. CSV) may not be able to handle variation within a set of the same name (e.g. a different number of CPU cores on different nodes). In v2, the customary workaround is to use a configuration parameter to define the maximum possible set of metrics and then record 0 values for any metric not present on a specific node.
OS variability – Specific data sources (e.g. /proc/meminfo) may change the metrics they contain over time. Simply grabbing everything in a file may lead to locally incorrect data and to set definition conflicts at the store.
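One way to respect the homogeneity and OS variability concerns together is to sample against a fixed, ordered list of metric names and to record 0 for any name the source does not provide. The sketch below is illustrative only: fixed_metrics and sample_fixed are hypothetical names, the three metric names are an arbitrary choice, and the input is assumed to be text laid out like /proc/meminfo ("Name:   value kB" per line).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed schema: every node reports exactly these, in this order. */
static const char *const fixed_metrics[] = { "MemTotal", "MemFree", "Cached" };
#define FIXED_COUNT (sizeof(fixed_metrics) / sizeof(fixed_metrics[0]))

/*
 * Fill out[0..FIXED_COUNT-1] from meminfo-style text.
 * Schema names missing from the text are recorded as 0;
 * names in the text that are not in the schema are ignored.
 * Returns the number of matching lines found.
 */
static size_t sample_fixed(const char *text, uint64_t out[FIXED_COUNT])
{
	size_t found = 0;
	assert(text != NULL && out != NULL);
	memset(out, 0, FIXED_COUNT * sizeof(out[0]));

	const char *line = text;
	while (*line != '\0') {
		char name[64];
		unsigned long long value;
		/* Read "key: number"; the key may not cross a newline. */
		if (sscanf(line, " %63[^:\n]: %llu", name, &value) == 2) {
			for (size_t i = 0; i < FIXED_COUNT; i++) {
				if (strcmp(name, fixed_metrics[i]) == 0) {
					out[i] = (uint64_t)value;
					found++;
					break;
				}
			}
		}
		const char *nl = strchr(line, '\n');
		if (nl == NULL)
			break;
		line = nl + 1;
	}
	return found;
}
```

Because the schema never changes, a store with fixed columns always sees the same set definition. The cost is that a recorded 0 does not distinguish an absent metric from a measured zero, so the returned count can help a caller detect missing sources.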

Supporting Utilities

AVL Configuration

The attribute-value list passed to the config function contains the strings passed as key=value from the command line interface.

char *value;
value = av_value(avl, "key1"); /* NULL if key1 was not given */

This returns the value matching key1, or NULL if the user did not include key1.

If a mandatory key is missing, the config function should log an error message and return a non-zero status.

In LDMS v2, the config function provides global properties for the plugin. The config function parses the avl and stores properties for later use in sampler and store activities.
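The handling of mandatory and optional keys can be sketched as follows. To stay self-contained, the example does not use the LDMS headers: demo_avl, demo_pair, and demo_av_value are hypothetical stand-ins that imitate only the behavior described above for av_value (return the value for a key, or NULL). The keys "set" and "max_cores" are likewise invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the attribute-value list. */
struct demo_pair { const char *key; const char *value; };
struct demo_avl { size_t count; const struct demo_pair *pairs; };

/* Imitates the described av_value behavior: value for key, or NULL. */
static const char *demo_av_value(const struct demo_avl *avl, const char *key)
{
	assert(avl != NULL && key != NULL);
	for (size_t i = 0; i < avl->count; i++)
		if (strcmp(avl->pairs[i].key, key) == 0)
			return avl->pairs[i].value;
	return NULL;
}

/* Plug-in-wide properties stored by config for later sampling calls. */
static char cfg_set_name[128];
static unsigned long cfg_max_cores = 1;

static int demo_config(const struct demo_avl *avl)
{
	/* Mandatory key: log and fail when it is absent. */
	const char *value = demo_av_value(avl, "set");
	if (value == NULL) {
		fprintf(stderr, "demo sampler: missing required key 'set'\n");
		return 1;
	}
	size_t len = strlen(value);
	if (len >= sizeof(cfg_set_name)) {
		fprintf(stderr, "demo sampler: value of 'set' is too long\n");
		return 1;
	}
	memcpy(cfg_set_name, value, len + 1);

	/* Optional key: keep the default when it is absent. */
	value = demo_av_value(avl, "max_cores");
	if (value != NULL) {
		char *end;
		unsigned long n = strtoul(value, &end, 10);
		if (end == value || *end != '\0' || n == 0) {
			fprintf(stderr, "demo sampler: bad max_cores '%s'\n", value);
			return 1;
		}
		cfg_max_cores = n;
	}
	return 0;
}
```

Storing the parsed values in file-scope variables matches the v2 model of plugin-wide properties: the sampling code reads them later without touching the attribute-value list again.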

LDMS_JOBID

The LDMS_JOBID macros are optional: if your sampler does not want to be resource manager aware, that’s fine. Samplers without jobids may increase the data analysis effort later for queries involving jobid. The assumptions of the jobid macros are:

The slurmjobid sampler has been loaded and configured first.
The jobid metric can be represented as uint64_t.
The jobid metric is going to be added to the metric set using a particular construction pattern that goes beyond the API requirements.

You can develop your own method of registering the jobid metric in your metric set if your set construction technique varies from the usual.

Background: The slurmjobid sampler provides a singleton collector of resource management information specific to the node it runs on. As of version 2.4, the design assumption of this sampler is that nodes are allocated to a single job only and that some resource manager (which may be Slurm or any other) writes the job information to a place the jobid sampler is configured to read. Any time the file of expected job information is not found or is empty, jobid 0 is assumed.
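The fallback rule (jobid 0 when the file is missing or empty) can be sketched as below. This is not the slurmjobid sampler's code: read_jobid is a hypothetical helper, and the file is assumed to hold a single decimal integer, which the text above does not specify. Treating unparsable content as 0 is an additional assumption of this sketch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Read a job id from the file at path.
 * Assumption: the file contains one decimal integer.
 * Returns 0 when the file is missing, empty, or unparsable.
 */
static uint64_t read_jobid(const char *path)
{
	uint64_t jobid = 0;
	assert(path != NULL);

	FILE *f = fopen(path, "r");
	if (f == NULL)
		return 0; /* file not found: assume jobid 0 */

	/* fscanf returns EOF on an empty file and 0 on non-numeric text. */
	if (fscanf(f, "%" SCNu64, &jobid) != 1)
		jobid = 0;

	fclose(f);
	return jobid;
}
```

A sampler could call such a helper at each sample and store the result as a uint64_t metric, so that rows collected while no job is running carry jobid 0.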