Collectl Features
The following are descriptions of just some of collectl's many features but are
not intended as a tutorial of how and when to apply them:
Fine Grained, Non-Drifting Monitoring
If the Time::HiRes Perl module is installed, which you can verify with the
command collectl -v, you will be able to run with non-integral
sampling intervals. Whether integral or fractional, sample times will align
extremely closely to the whole second and will not drift, unlike
just about every other tool. You will also be able to use -om to
have all times reported in msec.
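To illustrate what non-drifting means, here is a minimal Python sketch. It is not collectl's own code (collectl is written in Perl), and the function names are hypothetical. The idea is to compute every sample time from an aligned base plus a sample index, rather than sleeping for one interval after each wake-up, so that scheduling jitter never accumulates.

```python
import math


def sample_times(start, interval, count):
    """Return `count` sample timestamps aligned to multiples of `interval`.

    Each target is derived from the aligned base and the sample index,
    not from the previous wake-up time, so small delays do not add up.
    """
    # First interval boundary at or after `start`.
    base = math.ceil(start / interval) * interval
    return [base + n * interval for n in range(count)]


def sleep_needed(now, target):
    """Seconds to sleep from `now` until `target` (never negative)."""
    return max(0.0, target - now)
```

A sampling loop would sleep for `sleep_needed(time.time(), target)` before each sample, so a late wake-up shortens the next sleep instead of shifting every later sample.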
Low Overhead
Collectl uses very little CPU. In fact it has been measured to use <0.1% when run
as a daemon using the default sampling interval of 60 seconds for process and slab
data and 10 seconds for everything else. The overhead can increase on systems that
have dozens of disks or are running hundreds of processes, so you may want to measure
its effect on your own system if you are concerned.
Summary vs Detail Data
You can report aggregated performance numbers on many devices such as CPUs, disks,
networks, interconnects such as InfiniBand or Quadrics, and even the
Lustre file system.
However, you can also report on individual devices if you want to see how the aggregate
load is being generated. Be sure to see the documentation
for more details, particularly the examples and tutorials in the Getting Started
section.
Brief vs Verbose Format
It is often more useful to see less data but for more devices, and collectl recognizes
this by providing brief format as the default interactive display format. This
allows you to see what a number of subsystems are doing on a single line, making it
much easier to spot inconsistencies in the data by scanning a column of numbers. If
you want more detail and are willing to look at multiple lines per sample, verbose
format is what you want. In fact, if you collect data in record mode you can play
it back in both formats, first in brief to look for problems and then again in
verbose to see more of what is happening. This technique also applies to
Summary and Detail data.
Plot Format
Although you can also display interactive output in plot format, this is really intended
for its namesake, plotting. By generating output (or simply playing back recorded data)
in this format, you can then feed the resultant files into plotting tools that recognize
delimiter-separated fields, such as gnuplot,
Excel or even OpenOffice.
Fields are separated by spaces by default; if you need a different separator, you can
change it via the --sep switch.
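As a rough illustration of consuming delimiter-separated output in your own script, the Python sketch below parses such text into rows keyed by column name. It makes two assumptions: comment and header lines start with `#`, and the last `#` line before the data names the columns. The column names and values in `SAMPLE` are invented for the example and are not actual collectl output.

```python
def read_plot_data(text, sep=" "):
    """Parse delimiter-separated text into a list of dicts keyed by column name.

    Assumes lines starting with '#' are comments and that the last such
    line seen before a data row holds the column names.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            header = line.lstrip("#").split(sep)
            continue
        if header is None:
            raise ValueError("data row found before any header line")
        rows.append(dict(zip(header, line.split(sep))))
    return rows


# Hypothetical sample data; real column names will differ.
SAMPLE = """# hypothetical comment line
#Date Time cpu_user cpu_sys
20240101 00:00:10 3 1
20240101 00:00:20 5 2
"""

rows = read_plot_data(SAMPLE)
```

If the data was written with a different separator, pass the same character as `sep`.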
Aligned Monitoring Intervals
If you've installed Time::HiRes and are using an integral sampling interval,
by default collectl will align its sampling on integral second boundaries.
In interactive mode samples will be taken as close as possible to
the nearest second and when run as a daemon, samples will align to the
top of the minute. In the latter case this means if you're running on a cluster
with synchronized clocks, all instances of collectl will collect their
samples within a few msec of each other, making it much easier to correlate
events across the cluster.
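The effect of aligning to the top of the minute can be sketched in a few lines of Python. This only illustrates the arithmetic and is not collectl's implementation; `first_daemon_sample` is a hypothetical name. Two nodes that start at different moments still compute the same first sample time, provided their clocks agree.

```python
def first_daemon_sample(now, period=60.0):
    """Return the first timestamp at or after `now` that is a multiple of `period`."""
    remainder = now % period
    return now if remainder == 0 else now + (period - remainder)


# Two nodes started 46.25 seconds apart agree on the first sample time.
node_a = first_daemon_sample(125.0)
node_b = first_daemon_sample(171.25)
```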
Process and Slab Monitoring
Both of these are higher overhead activities, but collectl provides a secondary
interval so they can be gathered at a lower frequency. In addition, one can
specify a number of filters to select processes by pid, parent, owner, or even name.
One can also request that process threads be included. As for slabs, here too one
can filter by name, and during interactive display one can request that only slabs
that have changed in value be displayed, significantly reducing the output and
enhancing the readability.
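The idea of showing only slabs whose values changed can be sketched as a comparison between two consecutive samples. The Python below is a simplified illustration, not collectl's code; the slab names and counts are made up, and a real slab entry carries several fields rather than a single number.

```python
def changed_only(previous, current):
    """Return the entries of `current` whose value differs from `previous`.

    Names that did not exist in `previous` are treated as changed.
    """
    return {
        name: value
        for name, value in current.items()
        if previous.get(name) != value
    }


# Hypothetical samples taken one interval apart.
before = {"dentry": 100, "inode_cache": 50}
after = {"dentry": 100, "inode_cache": 55, "kmalloc-64": 7}
changed = changed_only(before, after)
```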
Collectl also has the ability to display process data in a way similar to the top
command and one option allows you to sort by top I/O users, something not currently
available in any other utilities.
Process I/O
New to the 2.4.0 release is the monitoring of process I/O statistics. See
this page for more details.
Interrupt Reporting by CPU
New in the 2.5.0 release, you can report interrupts at the CPU level and even
examine how they change in more detail at the individual interrupt level.
Socket Communications
Rather than display its output on the terminal or write it to a file, collectl can
also send its data over a socket, making it possible to integrate it with other
programs.
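A program on the receiving end might read that data as in the Python sketch below. It assumes the data arrives as newline-terminated text over a stream socket, which is an assumption made for the example rather than something stated here. The function name is hypothetical, and the host and port to connect to depend on how collectl was started.

```python
import socket


def read_lines(sock, bufsize=4096):
    """Yield newline-terminated text lines from a connected socket until EOF."""
    pending = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed the connection
            break
        pending += chunk
        # Emit every complete line; keep any partial line for the next recv.
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            yield line.decode("utf-8", errors="replace")
    if pending:
        yield pending.decode("utf-8", errors="replace")
```

A caller would typically obtain the socket with `socket.create_connection((host, port))` and then loop over `read_lines(sock)`, handing each line to its own parser.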
Exportable Data Formats
If you don't like the format of the data collectl presents, feel free to write your
own using --export. There are already 3 that come with collectl, for writing
S-Expressions, List Expressions or even vmstat format. This data
can also be sent over the socket interface.