# collectl -sZ
# PROCESS SUMMARY (faults are /sec)
# PID  User     PR  PPID S   VSZ   RSS  SysT  UsrT Pct  AccuTime MajF MinF Command
21502  root     15  1749 S    6M    2M  0.00  0.00   0   0:06.40    0    0 /usr/sbin/sshd
21504  root     15 21502 S    4M    1M  0.00  0.00   0   0:00.79    0    0 -bash
22984  root     15     1 S    7M    1M  0.00  0.00   0   0:00.78    0    0 cupsd
23073  apache   15  1914 S   18M    8M  0.00  0.00   0   0:00.01    0    0 /usr/sbin/httpd
To monitor processes, specify the Z subsystem along with any optional parameters via --procopts. Since monitoring processes is a heavier-weight function, it is recommended to use a different interval for it, which is specified after the main monitoring interval and separated from it by a colon. The default process interval is 60 seconds. Therefore, to monitor all the processes once every 20 seconds and the rest of the parameters every 5, simply say:
collectl -sZ -i5:20
The biggest mistake people make when running this command interactively is to leave off the interval or specify something like -i1 and then not see any process data. That is because the default process interval is 60 seconds and they just haven't waited long enough for the output! This should be obvious since collectl will announce it is waiting for a 60 second sample.
There are also a few restrictions to the way these intervals are specified. The process interval must be a multiple of the main interval AND cannot be less than it. If you specify a process interval without a main interval, the main interval defaults to the process interval.
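The interval rules above can be summarized in a short sketch. The Python below is illustrative only and is not collectl's own parser: the function name parse_intervals and the default_main parameter are assumptions made for this example (the text does not say what the main interval defaults to when neither value is given), while the 60 second process default comes from the text.

```python
from fractions import Fraction

# Default process interval, in seconds, as stated in the text.
DEFAULT_PROCESS_INTERVAL = Fraction(60)


def parse_intervals(spec, default_main=Fraction(1)):
    """Split an '-i' style value such as '5:20' into (main, process) intervals.

    Hypothetical helper: default_main is an assumed placeholder, not a
    documented collectl default.
    """
    main_text, _, process_text = spec.partition(":")
    process = Fraction(process_text) if process_text else DEFAULT_PROCESS_INTERVAL
    if main_text:
        main = Fraction(main_text)
    elif process_text:
        # A process interval given alone also becomes the main interval.
        main = process
    else:
        main = default_main
    if main <= 0 or process <= 0:
        raise ValueError("intervals must be positive")
    if process < main:
        raise ValueError("process interval cannot be less than the main interval")
    if process % main != 0:
        raise ValueError("process interval must be a multiple of the main interval")
    return main, process
```

Fractions are used so that sub-second values such as .1 stay exact when checking the multiple rule. With this sketch, "5:20" yields a main interval of 5 and a process interval of 20, ":.1" sets both to one tenth of a second, and "5:7" is rejected because 7 is not a multiple of 5.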
Finally, as with other data collected by collectl, you can play back process data by specifying -p. While not exactly plottable data, you can specify -P and the output will be written to a separate file as time stamped space delimited data, one process per line.
If a plus sign immediately follows a process selector any processes selected by it will have their threads monitored as well. See collectl -x or man collectl for more details.
If you do not want this effect and only want to look at those processes that match the selection list at the time collectl is started, specify --procopts p to suppress dynamic process discovery. This holds for process threads as well, suppressing looking for new ones.
Perhaps the best way to see this in effect is to run collectl with the following command:
collectl -i:.1 -sZ --procfilt fabc
The .1 for an interval is not a mistake. It is there to show that you can indeed use collectl to spot the appearance of short lived processes - just don't do it unless you really need to. The --procfilt switch is saying to look for any processes invoked with a command that contains the string 'abc' in it. When this command is invoked there shouldn't be any output unless someone IS running a command with 'abc' in it. Now go to a different window or terminal and edit the file abc with your favorite editor. You will immediately see collectl display process information for your editor and when you exit the editor the output will stop.
When run in non-threaded mode, the times reported include all time consumed by all threads. When run in threaded mode, times are reported for individual threads as well as the main process. In other words, if a process's only job is to start threads, it will typically show times of 0. If you rerun collectl in non-threaded mode you will see it report aggregated times.
# collectl -sZ -i:1 --procopts m
# PID  User     S VmSize  VmLck  VmRSS VmData  VmStk  VmExe  VmLib MajF MinF Command
 9410  root     R 81760K      0 15828K 14132K    84K    16K  3620K    0   18 /usr/bin/perl
    1  root     S  4832K      0   556K   212K    84K    36K  1388K    0    0 init
    2  root     S      0      0      0      0      0      0      0    0    0 kthreadd
    3  root     S      0      0      0      0      0      0      0    0    0 migration/0

# collectl -sZ -i:1
# PROCESS SUMMARY (faults are /sec)
# PID  User     PR  PPID S   VSZ   RSS  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
    1  root     20     0 S    4M  552K  0.00  0.00   0   0:00.68    0    0    0    0 init
    2  root     15     0 S     0     0  0.00  0.00   0   0:00.00    0    0    0    0 kthreadd
    3  root     RT     2 S     0     0  0.00  0.00   0   0:00.02    0    0    0    0 migration/0

# collectl -sZ -i:1 --procfilt cdt -oT
# PROCESS SUMMARY (faults are /sec)
#          PID  User     PR  PPID S   VSZ   RSS  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
09:01:03 13577  root     20 12775 R    1M    1M  0.04  0.00   4   0:01.92    0  16K    0    0 ./dt
09:01:04 13577  root     20 12775 D    1M    1M  0.40  0.00  40   0:02.32    0 118K    0    0 ./dt
09:01:05 13577  root     20 12775 D    1M    1M  0.24  0.00  24   0:02.56    0  65K    0    0 ./dt

# collectl -sZ --procopts i -i:.5 --procfilt cdt -oTm
#              PID  User     S  SysT  UsrT  RKB  WKB RKBC WKBC RSYS WSYS CNCL Command
09:03:24.003 13614  root     D  0.12  0.00    0  32K    0  32K    0   64    0 ./dt
09:03:24.503 13614  root     D  0.14  0.00    0  32K    0  32K    0   64    0 ./dt
09:03:25.003 13614  root     R  0.10  0.00    0  24K    0  24K    0   48    0 ./dt
In its simplest form, the --top switch tells collectl to display the top consumers of CPU. However, as of collectl V2.6.4 you can also tell it to display the list sorted by I/O or page faults. Here I'm simply looking for the top processes sorted by page faults with the command collectl --top flt, and the display fills my window, which in this case is only 10 lines high. To look at the top consumers of I/O, simply use --top io instead:
# PID  User     PR  PPID S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
 3009  root     20     1 S    2M  280K  3  0.00  0.00   0   0:43.01    0    0    0    8 irqbalance
 7144  root     20  6485 R   81M   15M  2  0.00  0.06   6   0:01.70    0    0    0    5 /usr/bin/perl
    1  root     20     0 S    4M  556K  2  0.00  0.00   0   0:03.60    0    0    0    0 init
    2  root     15     0 S     0     0  2  0.00  0.00   0   0:00.00    0    0    0    0 kthreadd
    3  root     RT     2 S     0     0  0  0.00  0.00   0   0:00.10    0    0    0    0 migration/0
    4  root     15     2 S     0     0  0  0.00  0.00   0   0:00.06    0    0    0    0 ksoftirqd/0
    5  root     RT     2 S     0     0  0  0.00  0.00   0   0:00.30    0    0    0    0 watchdog/0
    6  root     RT     2 S     0     0  1  0.00  0.00   0   0:00.08    0    0    0    0 migration/1
    7  root     15     2 S     0     0  1  0.00  0.00   0   0:00.02    0    0    0    0 ksoftirqd/1
The following 3 successive displays are the result of monitoring a process named thread.pl, which creates a couple of threads 10 seconds apart that then do some I/O. In the first display we see the main script, which is actually run under the perl interpreter, so the string thread does exist as part of the argument string, but I chose to leave it off the output to save screen real estate:
collectl --top io --procfilt fthread --procopt t

# PROCESS SUMMARY (faults are /sec) 06:57:42
# PID  User     PR  PPID S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
7024   root     20  6725 S   61M    2M  2  0.00  0.00   0   0:00.00    0    0    0    0 /usr/bin/perl

# PROCESS SUMMARY (faults are /sec) 06:57:52
# PID  User     PR  PPID S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
7065+  root     20  6725 R   73M    5M  0  0.88  0.12 100   0:01.98    0 291K    0    0 thread.pl
7064   root     20  6725 S   73M    5M  2  0.88  0.11  99   0:01.98    0    0    0    0 /usr/bin/perl

# PROCESS SUMMARY (faults are /sec) 06:58:02
# PID  User     PR  PPID S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
7098+  root     20  6725 R   83M    8M  1  0.12  0.02  14   0:00.86    0  29K    0    0 thread.pl
7096+  root     20  6725 R   83M    8M  0  0.16  0.00  16   0:04.24    0  27K    0    0 thread.pl
7095   root     20  6725 S   83M    8M  2  0.28  0.02  30   0:05.13    0    0    0    0 /usr/bin/perl
Including non-process data
The Linux top command can show other types of data besides the top processes,
and so can collectl. Just specify the subsystems you are interested in with -s and
they will be displayed in a scrolling window above the process data. Because
multiple lines of data are scrolled, you are able to see history, something the
Linux top command cannot do. You may also want to include timestamps with the
brief data by using -oT to make it easier to read.
But don't stop with brief data, you can even show verbose data as well. However in the case of multiple subsystems it just isn't practical to show scrolling history and so you will only see the latest sample. If you choose to show a single verbose subsystem you will see scrolled data.
Finally, if you want to customize the way the screen real estate is allocated between the process and other data, you can change the size of the process section by including the number of lines to display as the second argument to --top. You can also control the size of the subsystem data with --hr lines, a synonym for --headerrepeat lines.
There are actually 2 main functional components to this format, the main one being to determine the parent child relationship between all processes (there IS some additional overhead involved here). A second function is the aggregation of various counters and meters.
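The two functions described above, building the parent/child relationships and aggregating counters, can be illustrated with a minimal Python sketch. This is not proctree's actual code: the function name aggregate_tree and the input layout (a dictionary of pid to (ppid, value)) are assumptions made for the example, and the value stands in for any counter such as CPU time in hundredths of a second.

```python
from collections import defaultdict


def aggregate_tree(processes):
    """Return {pid: total}, where total is the pid's own value plus the
    values of all of its descendants.

    processes maps pid -> (ppid, value). Hypothetical layout for this sketch.
    """
    # Step 1: determine the parent/child relationships.
    children = defaultdict(list)
    for pid, (ppid, _value) in processes.items():
        if ppid in processes and ppid != pid:
            children[ppid].append(pid)

    # Step 2: aggregate each counter up through the parent hierarchy.
    totals = {}

    def total_for(pid):
        if pid not in totals:
            own = processes[pid][1]
            totals[pid] = own + sum(total_for(c) for c in children.get(pid, []))
        return totals[pid]

    for pid in processes:
        total_for(pid)
    return totals
```

For example, given a chain of five processes in which only the deepest one has a non-zero value of 6, every ancestor in the chain reports 6 as well, because each parent's total includes its descendants.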
Proctree can also be combined with --top to limit the number of processes displayed OR in playback mode with or without --top. Consider the following output when playing back a file with --top --export proctree:
# PID  PPID User     PR S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime MajF MinF Command
00001     0 root     15 S    2G  108M  0  0.03  0.03   0   0:18.21    0    0 init
05535     1 root     15 R  106M   15M  2  0.01  0.03   0   0:07.68    0    0 /usr/bin/perl
05452     1 haldaemo 15 S   85M    7M  1  0.02  0.00   0   0:06.99    0    0 hald
05453  5452 root     15 S   55M    3M  0  0.02  0.00   0   0:05.42    0    0 hald-runner
05474  5453 root     16 S    9M  652K  0  0.02  0.00   0   0:05.41    0    0 hald-addon-storage:
One can also use most of the process options as well (see --showsubopts for the complete set).
Additional interactive --top options
Proctree was really developed for real-time display with --top, so more options
are available there. The main one to consider is the suppression of entries with
zero in a given field. In the previous example, entries with 0 CPU were suppressed
because by default --top sorts by CPU (even though we're not sorting). If one were
to choose a different sort field with --top, proctree will use that field to
suppress entries with zero in them. In fact, there are a number of different
switches one can select interactively, one of which changes the suppression
value from 0 to something else.
So let's take a closer look at running in interactive mode by typing the command
collectl --top --export proctree
and at some time after the first data screen is displayed, type RETURN. You will now see a menu like this:
Enter a command and RETURN while in display mode:
  pid    only display this pid and its children
  a      toggle aggregation between 'on' and 'off'
  dxx    change display hierarchy depth to xx
  i      change display format to 'I/O'
  k      toggle multiplication of I/O numbers by 1024 between 'on' and 'off'
  m      change display format to 'memory'
  p      change display format to 'process'
  h      show this menu
  stype  where 'type' is a valid sorting type (see --showtopopts)
         entries with 0s in those field(s) will be skipped
  wxx    max width for display of command arguments
  z      toggle 'skip' logic between 'on' and 'off'
  Zxx    when skipping, only keep entries with I/O fields > xxKB
Press RETURN to go back to display mode...
These commands fall into several categories, one being those that toggle behavior such as aggregation, multiplication and the skip logic. By default, all values are aggregated up through their parent hierarchy, and typing the a command followed by a RETURN will turn this behavior off. Similarly, when the values of the I/O counters are too large to easily read you can force their division by 1K with the k command. And finally, you can disable the logic that skips zero-based entries with the z command. If you'd rather skip on some value other than 0 you can set the skip value with Zxx.
Look at the display line above the following process data:
Process Tree 09:06:03 [skip when 'time'<=0 is 'on' aggr: 'on' x1024: 'off' depth 5]
# PID  PPID User     PR S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime MajF MinF Command
00001     0 root     15 S  674M  272M  1  0.00  0.06   6   0:09.96    0    0 init
01766     1 root     15 S   50M   24M  0  0.00  0.06   6   0:01.30    0    0 /usr/sbin/sshd
02142  1766 root     15 S   25M   14M  1  0.00  0.06   6   0:00.88    0    0 /usr/sbin/sshd
02144  2142 root     15 S   18M   12M  1  0.00  0.06   6   0:00.87    0    0 -bash
02229  2144 root     19 R   14M   10M  0  0.00  0.06   6   0:00.84    0    0 /usr/bin/perl
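The status line above reads "skip when 'time'<=0 is 'on'", which suggests a simple threshold test on the current sort field. A minimal Python sketch of that idea follows; it is an assumption about the behavior rather than proctree's implementation, and the names visible_entries, field, threshold and skip are hypothetical.

```python
def visible_entries(entries, field, threshold=0, skip=True):
    """Return the entries that would remain on screen.

    entries:   list of dicts, one per process (hypothetical layout)
    field:     name of the field the skip logic tests, e.g. 'time'
    threshold: entries with field <= threshold are skipped (Zxx changes it)
    skip:      False models toggling the skip logic off with z
    """
    if not skip:
        return list(entries)
    return [entry for entry in entries if entry[field] > threshold]
```

With the default threshold of 0, an idle process whose 'time' is 0 drops out of the display, and turning skip off brings it back.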
As with other process displays, you can also choose whether you want to see the default display, one that shows all I/O fields or one focused on memory using the p, i or m commands. You can easily switch between these formats at any time.
If you enter a number as a command, it is interpreted as a process PID and the display will be adjusted so that this process becomes the first entry in the display. If you would like to skip on something other than the current field, you can easily change that with the s command immediately followed by one of the sort field names listed with --showtopopts. Finally, if you use the wide command option with --procopts w, long command strings will cause wrapping and make the display unreadable. The w command can be used to set the maximum width of the command field.
As with other collectl options, there are simply far too many combinations to describe which are appropriate for a particular situation (such as using --procopt) so it is recommended you experiment to better understand the many capabilities of proctree.
The fields themselves summarize all the key data elements associated with each process, making it possible to see the process start/end times, CPU consumption, I/O (if the kernel supports I/O stats), page faults and even the ranges of the different types of memory consumed. And since the data elements are separated by a single character delimiter you can easily load the file into your favorite spreadsheet and perform deeper analysis (the data is actually not very user friendly as written).
It is also important to remember a couple of things:
Collectl maintains 2 data structures that control monitoring: pids-to-monitor and pids-to-ignore. These lists are built at the time collectl starts, so if --procopts p is not specified, the effect is to execute a ps command and save all the pids in the pids-to-monitor list. If filters are specified with --procfilt, only those pids that match are placed in the pids-to-monitor list and the rest are placed in the pids-to-ignore list. When filters are used there can therefore be a significant reduction in overhead, since collectl need not examine every process's data.
If collectl is only monitoring a specific set of processes, either because --procopts p was specified or because --procfilt was used and specified only specific pids (not ppids), then on each monitoring pass collectl looks only at the pids in the to-be-monitored list. In other words, this is as efficient as it gets because it needn't look for processes that are in neither list, that is, newly created processes.
If doing dynamic process monitoring, every monitoring pass collectl has to read all the pids in /proc to get a list of ALL current processes. While it ignores any in the do-not-monitor list, it must look at the rest. If any of these are in the to-be-monitored list and have had thread monitoring requested, additional work is required to see if any new threads have shown up. Any processes not in the to-be-monitored list are obviously NEW processes and must then be examined to see if they match any selection criteria and this involves reading the /proc/pid/stat file. That pid is then placed in one of the two lists. It should be understood that during any particular interval a lot of processes come and go, such as cat, ls, etc. However, these are short lived enough as to not even be seen by collectl, unless of course collectl is running at a very fine grained monitoring level.
Occasionally a process being monitored disappears because it has terminated. When this happens its pid is removed from the to-be-monitored list.
Finally, these data structures (and a couple of others that have not been described) need maintenance to keep them from growing. If the number of processes to monitor has been fixed, this maintenance is significantly reduced.
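The bookkeeping described in the points above can be sketched in a few lines of Python. This is a simplified model, not collectl's code: the function name monitoring_pass and its parameters are hypothetical, the list of live pids is passed in rather than read from /proc, the matches callable stands in for the --procfilt check, and pruning the ignore set is an assumption about what the maintenance step might involve.

```python
def monitoring_pass(live_pids, to_monitor, to_ignore, matches, dynamic=True):
    """Update both pid sets in place for one pass; return the pids to sample.

    live_pids: pids currently alive (collectl would find these in /proc)
    matches:   callable(pid) -> bool, a stand-in for the selection criteria
    dynamic:   False models --procopts p (no discovery of new processes)
    """
    live = set(live_pids)
    # Terminated processes are dropped so the sets do not keep growing.
    to_monitor &= live
    to_ignore &= live
    if dynamic:
        for pid in sorted(live - to_monitor - to_ignore):
            # A pid in neither set is new and is classified exactly once.
            if matches(pid):
                to_monitor.add(pid)
            else:
                to_ignore.add(pid)
    return sorted(to_monitor)
```

With dynamic set to False the loop over new pids is skipped entirely, which mirrors why a fixed set of processes is the cheapest case to monitor.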
So the bottom line is that if you have to use dynamic monitoring, try to bound the number of processes and/or threads. If you really need to see it all, don't be afraid to, but be mindful of the overhead. Collecting all process data with the default interval has been observed to take about 1 minute of CPU time, which is less than 0.1%, on a lightly loaded ProLiant DL380, but that load will be higher with more active processes.
updated Dec 16, 2008