Tcp/Ip Extended Stats

EXPERIMENTAL

Introduction

Having recently discovered /proc/net/snmp (not really sure how I missed it), I decided to try to incorporate those statistics into collectl for 2 very good reasons:

Unfortunately, as is often the case, this got messy very quickly. On the two systems I looked at, one running Debian squeeze and the other RHEL 5.2 (I did not do an extensive investigation), the data was different. More confusing, some fields were dropped while others were added! Furthermore, this is the data that drives netstat -s, and that too shows inconsistent fields.
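To see this kind of field mismatch for yourself, something like the following Python sketch may help. It is not part of collectl (which is written in Perl); it only relies on the fact that /proc/net/snmp pairs a header line of field names with a line of values for each protocol. The function names parse_snmp and field_diff are made up for illustration, and the two samples are shortened, invented data rather than real output from either system mentioned above.

```python
from typing import Dict, List

Snapshot = Dict[str, Dict[str, int]]


def parse_snmp(text: str) -> Snapshot:
    """Parse text laid out like /proc/net/snmp into {protocol: {field: value}}."""
    stats: Snapshot = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Lines come in pairs: a header line of field names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            raise ValueError(f"mismatched pair: {proto!r} vs {value_proto!r}")
        stats[proto] = dict(zip(names.split(), (int(n) for n in numbers.split())))
    return stats


def field_diff(a: Snapshot, b: Snapshot) -> Dict[str, Dict[str, List[str]]]:
    """Return, per protocol, the field names present in only one snapshot."""
    diff: Dict[str, Dict[str, List[str]]] = {}
    for proto in sorted(set(a) | set(b)):
        fields_a, fields_b = set(a.get(proto, {})), set(b.get(proto, {}))
        if fields_a != fields_b:
            diff[proto] = {
                "only_a": sorted(fields_a - fields_b),
                "only_b": sorted(fields_b - fields_a),
            }
    return diff


# Invented, shortened samples (NOT real output from Debian or RHEL).
SAMPLE_A = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 120 3 0 118
"""
SAMPLE_B = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 95 1 0 97 0 0
"""

if __name__ == "__main__":
    print(field_diff(parse_snmp(SAMPLE_A), parse_snmp(SAMPLE_B)))
```

On a real system you would pass in the contents of /proc/net/snmp saved from each machine instead of the samples.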

Being torn between the challenge of getting this right the first time and getting something useful out there for people to use, I decided to:

How to use this module

It's real simple as shown below:
collectl --import snmp
#<-------Tcp/Ip Errors-------->
# Loss FTrn   Ip Icmp  Tcp  Udp
     0    0    0    0    0    0
and like all other import modules, you can combine it with any combination of subsystems, like this:
collectl --import snmp -scn
#<----CPU[HYPER]-----><----------Network----------><-------Tcp/Ip Errors-------->
#cpu sys inter  ctxsw   KBIn  PktIn  KBOut  PktOut  Loss FTrn   Ip Icmp  Tcp  Udp
   0   0    80     65      0      2      0       1     0    0    0    0    0    0
   0   0   123     76      0      1      0       1     0    0    0    0    0    0
You can also combine it with other switches for time formatting, write data to a file and even play it back. Just remember in playback mode you still need to include --import snmp so collectl will load the necessary print routines.

Options

The examples above show this module running in brief mode, in which it simply summarizes some error counts. In verbose mode it shows a LOT more, and the output can be controlled by the o= modifier as shown here:

collectl --import snmp,o=i
# SNMP SUMMARY
#<----------------------------------IpPkts----------------------------------->
# Receiv Delivr Forwrd DiscdI InvAdd   Sent DiscrO ReasRq ReasOK FragOK FragCr
       1      1      0      0      0      1      0      0      0      0      0
       1      1      0      0      0      1      0      0      0      0      0
There are currently 7 options, one for each type of data. It should also be noted that this module reads the same data read by the -st option, so you can use that as well or not, for reasons that may become clearer later on. You can combine the options in any way you wish, but note that their output will be appended to the same line. You may need a wider terminal and/or smaller font if you choose all at once. Their values and corresponding fields in /proc/net/snmp and /proc/net/netstat are:

The completeness of the information reported for each option varies and the header names can be confusing (feel free to make suggestions on the mailing list or support forum). Note too that not all fields are reported; an attempt was made to at least include what netstat reports, but since not all versions of netstat report the same thing, that isn't always the case either. Again, all comments welcome.

Finally, at this point in time this module does NOT write anything in plot format - a good reason to use -st if you really want that data. It also doesn't write anything in export format.

Normalization

One big change is a departure from collectl's usual method of normalizing results as /sec values. Most of the data reported by this module are error counts or otherwise small numbers, and even values as low as 1 can indicate problems. When reported as /sec values over intervals longer than 1 second, such values will be reported as 0 and missed unless one remembers to include -on. Therefore almost all values are reported as actual counts for the entire reporting interval's duration. The only exceptions are:

Once again, feedback is welcome here too.
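The arithmetic behind this choice can be sketched in a few lines of Python. This is only an illustration, not collectl's actual code: it assumes the counters are cumulative (so each report is the difference between two samples) and that a /sec value is truncated to a whole number for display. The function names are hypothetical.

```python
def interval_count(prev: int, curr: int) -> int:
    """Events seen during the interval: difference of two cumulative samples."""
    return curr - prev


def per_second(prev: int, curr: int, interval_secs: int) -> int:
    """The same events as a /sec rate, ASSUMING truncation to a whole number."""
    return int((curr - prev) / interval_secs)


if __name__ == "__main__":
    # One error during a 10 second interval: visible as a count, lost as a rate.
    prev, curr, interval = 100, 101, 10
    print("count for interval:", interval_count(prev, curr))
    print("normalized to /sec:", per_second(prev, curr, interval))
```

With a 1 second interval the two values agree, which is why the problem only shows up at longer intervals.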

Future Plans

The plan is to ultimately replace what -st currently reports with what this module reports. In other words, some day you'll be able to run collectl -st --tcpopts xxx, replacing xxx with the same options currently used by the snmp module. That will also mean collectl will no longer report PureAcks and HPAcks as it currently does in brief mode, as it is felt the other information reported is more useful. Again, any and all opinions are welcome.

updated Jan 31, 2012