Playback

Playing back one or more files

There are two reasons for playing back a file: to generate plottable files, or simply to examine the data in the same format you would see when running collectl interactively. The following discussion applies to both cases; the only real difference is that to generate plot files you include the switches -P and -f.

You tell collectl to play back one or more files using -p followed by one or more filenames separated by commas or whitespace. You need to quote the string if it contains spaces or wildcard characters. The files will be played back as if they were a single file, with monotonically increasing sample numbers for each unique host. If these files contain samples for different subsystems, the resulting stream will contain data elements for all of them, zero-filled as appropriate. When this occurs, a message will be displayed if -m has been specified.
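
As a rough illustration of the -p syntax described above, the following Python sketch builds a playback command line without running it. The helper name playback_command, the file paths and the idea of passing a directory to -f are assumptions made for this example; only the switches -p, -P and -f come from the text.

```python
import shlex


def playback_command(patterns, plot_dir=None):
    """Build an argument list for a collectl playback run.

    patterns: filenames or wildcard patterns, joined with commas for -p.
    plot_dir: hypothetical output location; when given, -P and -f are added
              to request plot files (assumes -f accepts a directory here).
    """
    argv = ["collectl", "-p", ",".join(patterns)]
    if plot_dir is not None:
        argv += ["-P", "-f", plot_dir]
    return argv


# Hypothetical path; the wildcard must reach collectl unexpanded.
argv = playback_command(["/var/log/collectl/*.raw.gz"])

# shlex.join shows the form you would type in a shell, with the
# wildcard string quoted so the shell does not expand it first.
print(shlex.join(argv))
```

If the list is passed to subprocess.run without a shell, no quoting is needed because each element is delivered to collectl as a single argument.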

Collectl will generate plot format if requested with -P, writing the output to multiple output files if both summary and detail data are specified or when a file with data for a different host or a different collection date is encountered. NOTE: files that contain data that crosses midnight will not force creation of a second file when the date changes.

Filtering with --from and --thru
You restrict the timeframe for which data is reported by using --from and --thru. However, since collectl doesn't require you to specify both switches, nor does it require both a date and a time for each switch, it tries to make an intelligent guess as to what timeframe you really meant. In most cases it guesses right, but sometimes it guesses wrong. The simple fix if it gets confused is to remove the ambiguity and specify full dates/times with both switches.

When you specify playback files using wildcarding in the name string, collectl initially selects all the files that match. However, in some cases you may not really intend for them to all be played back, especially if you selected a timeframe using --from and --thru switches. Have no fear. Collectl will use these switches to select the appropriate subset of files that match your selection.

Caution
When filtering wildcarded files using --from and/or --thru switches, collectl compares those values with the timestamps in the filenames to determine whether or not each file is likely to contain the data requested. However, collectl doesn't know whether a file contains data that spans midnight, so it simply assumes it doesn't, since this is a rarely used feature. Therefore, collectl only relaxes these tests when wildcards are not specified in the filename playback string.

Processing files that span midnight
The main reason for the complexity in the interpretation of --from and --thru is to allow collectl to deal with files that contain data that crosses midnight. As an example, consider a single file with data collected from midnight on one day to 2AM, 26 hours later. If you want to tell collectl to process the entire file, don't specify any time filters and it will report everything. But if you want to process the data from midnight to 1AM, you need to tell collectl which dates are involved! As stated in the rules above:

A word about the first record reported
Collectl always needs data from a base interval from which to begin calculating changes in counters and that interval is never displayed. In other words, if you collected data every 10 seconds starting at 10:00:00 and then played it back, the first time reported will be for 10:00:10.
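
To make the base interval concrete, here is a minimal Python sketch of how rates are derived from successive counter readings. It is not collectl's code; the function name playback_rates and the counter values are invented for illustration, and only the timestamps and the 10 second interval follow the example in the text.

```python
def playback_rates(samples, interval):
    """Turn raw counter samples into per-second rates.

    samples: list of (timestamp, counter_value) pairs in time order.
    The first sample only serves as the base for the first difference,
    so it never appears in the result.
    """
    reported = []
    for (_, previous), (timestamp, current) in zip(samples, samples[1:]):
        reported.append((timestamp, (current - previous) / interval))
    return reported


# Invented counter values, sampled every 10 seconds from 10:00:00.
samples = [("10:00:00", 1000), ("10:00:10", 1500), ("10:00:20", 1800)]

for timestamp, rate in playback_rates(samples, 10):
    print(timestamp, rate)
# 10:00:10 50.0
# 10:00:20 30.0
```

Three samples yield two reported intervals, and the first one reported is 10:00:10, as described above.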

To mitigate this when playing back data with a --from time, collectl attempts to read a sample from the previous interval so that you actually see the time you requested. Further, when multiple files are processed, collectl is smart enough to know whether they are contiguous and, if so, to use the last set of data from one file as the base interval for the next one. As a result there will be no holes in the data as reported. However, if they are not contiguous, a new base level must be taken for the new file and its first record skipped. This can be confusing and is probably not even that important, but consider 2 files generated contiguously:

If you have 2 non-contiguous files, you will see the same results whether you process them one at a time or together using a wildcard: no first record for either.
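
The difference between contiguous and non-contiguous files can be sketched in Python as follows. How collectl actually decides that two files are contiguous is not described here, so this example assumes a simple rule: a sample is reported only if it arrives exactly one interval after the previous sample, otherwise it just becomes a new base. The function name report_times and the timestamps (in seconds) are hypothetical.

```python
def report_times(files, interval):
    """Return the sample times that would be reported on playback.

    files: list of files, each a list of sample times in seconds.
    Assumption for this sketch: a sample is reported only when the
    previous sample, possibly from the previous file, is exactly one
    interval older. Otherwise it only establishes a new base.
    """
    reported = []
    base = None
    for samples in files:
        for timestamp in samples:
            if base is not None and timestamp - base == interval:
                reported.append(timestamp)
            base = timestamp
    return reported


# Contiguous files: the last sample of the first file is the base for the second.
print(report_times([[0, 10, 20], [30, 40]], 10))    # [10, 20, 30, 40]

# Non-contiguous files: the second file loses its first record.
print(report_times([[0, 10, 20], [100, 110]], 10))  # [10, 20, 110]
```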

How --from and --thru are really interpreted
This section probably contains more detail than you should really care about and is here more for completeness than anything else. Remember, these switches can contain a date, a time or both! You can also use the shorthand form of combining both switches under --from by separating the two with a hyphen. In other words, you can do something like --from 12:00-13:00, noting that you can use any combination of dates/times on either side of the hyphen.
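
The hyphen shorthand can be pictured with a small Python sketch. This is not how collectl itself parses the switch; the function name split_from_thru is hypothetical, and the sketch assumes that the date and time values themselves never contain a hyphen.

```python
def split_from_thru(value):
    """Split a --from value into its (from, thru) parts.

    'A-B' yields ('A', 'B'); a value without a hyphen yields (value, None),
    meaning no thru part was given in the shorthand.
    """
    start, separator, end = value.partition("-")
    return start, (end if separator else None)


print(split_from_thru("12:00-13:00"))  # ('12:00', '13:00')
print(split_from_thru("12:00"))        # ('12:00', None)
```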

Important Tip
Perhaps the most important thing to keep in mind is that when you play back a file, collectl will by default use the same switches as were specified during collection. In other words, if you collect cpu, disk and network data using -scdn, when you play it back you will get cpu, disk and network summary data either displayed on the terminal or written to a file. However, you could just as easily have chosen a different subsystem specification such as -scND, in which case you'd still get CPU summary data but now you'd get network and disk detail data. This feature can be extremely useful, especially when combined with different output formatting switches such as -o and/or --verbose.
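
As a hedged illustration of this tip, the Python sketch below builds two playback command lines for the same file: one that leaves the subsystem selection alone and one that overrides it with -scND and adds --verbose. The helper name playback_with_subsystems and the file name are made up for the example, and the commands are only printed, not run.

```python
import shlex


def playback_with_subsystems(filename, subsystems=None, verbose=False):
    """Build a playback argument list, optionally overriding -s.

    subsystems: letters to pass to -s (for example 'cND'); when None,
                no -s switch is added and the recorded selection applies.
    """
    argv = ["collectl", "-p", filename]
    if subsystems:
        argv.append("-s" + subsystems)
    if verbose:
        argv.append("--verbose")
    return argv


# Hypothetical file name, used only for illustration.
recorded = "example.raw.gz"

print(shlex.join(playback_with_subsystems(recorded)))
print(shlex.join(playback_with_subsystems(recorded, "cND", verbose=True)))
```
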
updated May 04, 2011