Exporting Custom Output

Introduction

As with --import, the --export option allows one to build custom modules, but in this case for generating output. Unlike --import, which allows multiple modules to be specified, --export accepts only a single module, which in turn overrides all of collectl's output formatting routines.

These modules all end in the extension .ph and are searched for first in the directory collectl is being executed from and then /usr/share/collectl. If the module name is prepended with a directory name, collectl will search only there.
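
The lookup order just described can be sketched in a few lines of Perl. This is only an illustration and not collectl's own code: find_export_module and its three arguments are hypothetical names, and the sketch assumes the module name may be given with or without the .ph extension.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of collectl): resolve an export module
# name to a .ph file using the search order described in the text.
sub find_export_module {
    my ($name, $exec_dir, $share_dir) = @_;

    # Assumption: the extension is optional on the command line.
    $name .= '.ph' unless $name =~ /\.ph$/;

    # A name that carries a directory part is looked up only there.
    my (undef, $dir, undef) = File::Spec->splitpath($name);
    if ($dir ne '') {
        return -f $name ? $name : undef;
    }

    # Otherwise try the directory collectl is executed from,
    # then the shared directory (/usr/share/collectl in the text).
    for my $d ($exec_dir, $share_dir) {
        my $candidate = File::Spec->catfile($d, $name);
        return $candidate if -f $candidate;
    }
    return undef;
}
```

Calling find_export_module('vmstat', '.', '/usr/share/collectl') would return the first matching path, or undef if neither directory holds vmstat.ph.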

The reason you might care about this is that you can now produce your own exportable form of output and print it locally, make it available to another program over a socket, or even write it to a local file, all while still being able to log to raw and/or plot formats. You get all that functionality for free.

How It Works

The interface to all this is really quite simple. At the command line the user types collectl --export name[,options] where:

There are currently 5 different custom exports that are part of a standard collectl distribution:

The first four are the most interesting because they have been built to take advantage of collectl's capability of sending their output over a socket and are therefore the ideal vehicle for interfacing with other tools and environments. All four share a common set of options as described in the following table:

align
    Used in conjunction with i=. When specified, data samples are aligned to whole-minute boundaries. In other words, if used with i=15, data will be reported at the top of the minute, 15 seconds past, etc. The first sample may therefore be partial.
avg|max|min|tot
    Used in conjunction with i=. Sends the average, maximum, minimum or total of the data over the set of intervals specified by i=. If none of these is specified, the values from the most recent monitoring interval are reported.
co
    Takes no value and indicates "changes only": only data elements that have changed since they were last sampled are reported, in an attempt to minimize processing and network bandwidth. If not specified, samples for all reporting intervals will be sent.
d=mask
    Debugging mask; see the beginning of the actual export file for details.
f=file
    Names the output snapshot file; applies only to lexpr. If this option is not used, -f must be, in which case the snapshot file is given the single-character name L and written into the directory associated with -f.
    Caution: if you are writing your own export module that doesn't use a snapshot file, you must explicitly include error checking to ensure it is not run without at least -P or --rawtoo in combination with -f.
h
    Shows help/usage.
i=secs
    Specifies the reporting interval in seconds. In other words, if you specify i=60, a sample will be reported every 60 seconds independent of collectl's monitoring interval. The default is to report every sample. This interval must be a multiple of the base collectl interval. Note that while collectl always rounds rates to the next whole value, when multiple intervals are added together only the totals are rounded.
s
    Specifies a subset of those subsystems specified with -s on the collectl command line; only data collected for that subset will be reported. The default is to report everything. In the case where you only want to report data collected via --import, use s= with no args.
ttl
    The time to live, in intervals, for each piece of performance data. If more than this number of intervals passes, data will be sent regardless of whether it changed or not, and the ttl countdown timer is reset. The default is 5. This actually has a second use for gexpr, and that is to set the gmond ttl to double this number multiplied by the interval.
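
The interaction between co and ttl can be pictured with a small Perl sketch. It is not taken from any of the shipped export modules: should_send, %last and %ttl are hypothetical names, and the exact interval at which the countdown expires is an assumption, since the text only says "more than this number of intervals".

```perl
use strict;
use warnings;

my %last;   # name => last value that was sent
my %ttl;    # name => intervals left before a forced resend

# Hypothetical helper: decide whether a value should be sent this interval
# when "changes only" reporting is in effect.
sub should_send {
    my ($name, $value, $ttl_max) = @_;
    $ttl_max = 5 unless defined $ttl_max;    # default ttl given in the table

    my $first   = !exists $last{$name};
    my $changed = $first || $last{$name} != $value;
    $ttl{$name} = $ttl_max if $first;

    # An unchanged value uses up one interval of its time to live.
    my $expired = !$changed && --$ttl{$name} <= 0;

    return 0 unless $changed || $expired;

    # Remember what was sent and reset the countdown.
    $last{$name} = $value;
    $ttl{$name}  = $ttl_max;
    return 1;
}
```

With a ttl of 3, an unchanging value is sent once, skipped for two intervals, and then sent again when its countdown reaches zero; any change is sent immediately.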

Logging
Collectl can actually create up to 3 different types of log files, and it's worth spending a little more time enumerating how collectl decides where and when to create them.

Example

Perhaps the best way to see how all this works is with a simple example, and it turns out that vmstat.ph is small enough to meet that need. You may also wish to refer to the other modules to see how some of the more exotic capabilities are implemented.

This first section gets called almost immediately by collectl after reading in the various user switches. This is the place to catch switch errors and since this routine always requires -scm we'll just hardcode it to that and reject any user entered ones. This initialization subroutine must be named for our module followed by Init.

sub vmstatInit
{
  error("-s not allowed with 'vmstat'")          if $userSubsys ne '';
  error("-f requires either --rawtoo or -P")     if $filename ne '' && !$rawtooFlag && !$plotFlag;
  error("-P or --rawtoo require -f")             if $filename eq '' && ($rawtooFlag || $plotFlag);
  $subsys=$userSubsys='cm';
}
Next we define the output routine, with the same base name as that of our included file.

The if statement uses collectl's standard idiom for printing headers based on the number of lines printed and whether or not the user wants only a single header, no header or even to clear the screen between headers.

sub vmstat
{
  my $line;
  if (printHeader())
  {
    $line= "${cls}#${miniBlanks}procs ---------------memory (KB)--------------- --swaps-- -----io---- --system-- ----cpu-----\n";
    $line.="#$miniDateTime r  b   swpd   free   buff  cache  inact active   si   so    bi    bo   in    cs us sy  id wa\n";
  }
Next comes the handling of optional date/time prefixes that I stole from printTerm() in formatit.ph and which can be controlled by various switch options. Again, if you have no intent of supporting these you can even put in error handling in your initialization routine or simply ignore the switches.
  my $datetime='';
  if ($options=~/[dDTm]/)
  {
    ($ss, $mm, $hh, $mday, $mon, $year)=localtime($lastSecs);
    $datetime=sprintf("%02d:%02d:%02d", $hh, $mm, $ss);
    $datetime=sprintf("%02d/%02d %s", $mon+1, $mday, $datetime)                  if $options=~/d/;
    $datetime=sprintf("%04d%02d%02d %s", $year+1900, $mon+1, $mday, $datetime)   if $options=~/D/;
    $datetime.=".$usecs"                                                         if ($options=~/m/);
    $datetime.=" ";
  }
Here we build the actual output, noting that we're not really printing anything yet, but rather building up a string (which may contain the header) that we will print in one shot.
  my $i=$NumCpus;
  my $usr=$userP[$i]+$niceP[$i];
  my $sys=$sysP[$i]+$irqP[$i]+$softP[$i]+$stealP[$i];
  $line.=sprintf("%s %2d %2d %6s %6s %6s %6s %6s %6s %4d %4d %5d %5d %4d %5d %2d %2d %3d %2d\n",
                $datetime, $procsRun, $procsBlock,
                cvt($swapUsed,6,1,1),  cvt($memFree,6,1,1),  cvt($memBuf,6,1,1),
                cvt($memCached,6,1,1), cvt($inactive,6,1,1), cvt($active,6,1,1),
                $swapin/$intSecs, $swapout/$intSecs, $pagein/$intSecs, $pageout/$intSecs,
                $intrpt/$intSecs, $ctxt/$intSecs,
                $usr, $sys, $idleP[$i], $waitP[$i]);
Finally comes the output. There is actually a lot of latitude here, and in this case we're calling printText(), which will send the output to the terminal or over a socket. It will not write to a local file as lexpr does, but if you want to see how to do that, refer to its source. As with all Perl require files, the module must return true, and therefore the final line is the digit 1.
  printText($line);
}
1;
Try running it and you'll see all the pagination and time formats work just as they do with standard output formats.
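
The same pattern can be boiled down to a skeleton for a module of your own. The sketch below is a hedged illustration rather than a shipped module: the name hello is hypothetical, the variables and the error() and printText() routines at the top are stand-ins for what collectl itself provides (defined here only so the sketch runs on its own), and the subsystem string is simply reused from vmstat.ph.

```perl
use strict;
use warnings;

# Stand-ins for names collectl normally supplies. They exist here only so
# the sketch runs standalone; a real .ph module would not define them.
our ($userSubsys, $filename, $rawtooFlag, $plotFlag, $subsys) = ('', '', 0, 0, '');
our ($intSecs, $ctxt) = (1, 0);
sub error     { die "Error: $_[0]\n" }
sub printText { print $_[0] }

# Initialization routine: the module name followed by Init.
sub helloInit {
    error("-s not allowed with 'hello'")       if $userSubsys ne '';
    error("-f requires either --rawtoo or -P") if $filename ne '' && !$rawtooFlag && !$plotFlag;
    error("-P or --rawtoo require -f")         if $filename eq '' && ($rawtooFlag || $plotFlag);
    $subsys = $userSubsys = 'cm';    # same subsystems vmstat.ph hardcodes
}

# Output routine: same base name as the module file (hello.ph).
sub hello {
    # Build the whole line first, then hand it over in one shot.
    my $line = sprintf("ctxt/sec: %d\n", $ctxt / $intSecs);
    printText($line);
}

# A require file must return true.
1;
```

Saved as hello.ph without the stand-in block, a module shaped like this would be selected with collectl --export hello, following the name[,options] form described earlier.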
updated November 9, 2012