Interrupts

Introduction

Prior to V2.5.0, collectl reported only the total number of interrupts across all CPUs, as part of the CPU summary data, and reported nothing about interrupts in the CPU detail. Since the kernel actually reports interrupt counts for each CPU by individual interrupt number, that information is now available in both summary and detail formats. The two formats categorize the data slightly differently: interrupt summary data is reported by CPU, while detailed interrupt data breaks out the counts by individual interrupt for each CPU.
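To illustrate the difference between the two views, here is a minimal Python sketch that parses text in the layout of /proc/interrupts and derives a per-CPU total (the summary view) from the per-interrupt rows (the detail view). It is not collectl's own code. The sample text, the function names parse_interrupts and per_cpu_totals, and the decision to skip rows that lack a count for every CPU are all assumptions made for illustration. The counters in /proc/interrupts are cumulative, so a monitoring tool would difference successive samples rather than use the raw totals.

```python
# Hypothetical sample in the layout of /proc/interrupts (values are made up).
SAMPLE = """\
           CPU0       CPU1
  0:         46          0   IO-APIC-edge      timer
 82:       1200       7731   PCI-MSI-X         eth2
NMI:          5          7   Non-maskable interrupts
ERR:          0
"""

def parse_interrupts(text):
    """Return (cpu_names, {interrupt_label: [count_per_cpu, ...]})."""
    lines = text.strip("\n").splitlines()
    cpus = lines[0].split()              # header row names the CPUs
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # The first len(cpus) fields should be numeric per-CPU counters.
        counts = [int(tok) for tok in fields[:len(cpus)] if tok.isdigit()]
        if len(counts) != len(cpus):
            continue                     # assumption: skip rows that are not per-CPU
        table[label.strip()] = counts
    return cpus, table

def per_cpu_totals(cpus, table):
    """Summary view: one total per CPU, summed over all interrupts."""
    return {cpu: sum(counts[i] for counts in table.values())
            for i, cpu in enumerate(cpus)}

cpus, table = parse_interrupts(SAMPLE)
print(table["82"])                  # detail view for one interrupt: [1200, 7731]
print(per_cpu_totals(cpus, table))  # {'CPU0': 1251, 'CPU1': 7738}
```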

Requesting Interrupt Statistics

Since -si is already used for reporting inode statistics, interrupt summary data should be requested using -sj and detail data using -sJ. Interrupts are also treated differently from other statistics in two ways:

Rather than reporting a fixed number of fields like other summary data, the number of fields is variable and equal to the number of CPUs.
If you specify -sCj you will NOT get a separate verbose format for the Interrupt Summary; instead, those details are included as part of the CPU detail report, as shown in the examples below.

Plot Data

Although interrupt data in brief or verbose format fits quite well with collectl's reporting methodology, it doesn't fit the plot format, which expects a fixed number of fields for summary data and a variable number of fields for detail data, indexed by a device number. Interrupt summary data is variable in size, based on the number of CPUs, and interrupt detail data doesn't fit either model, so trying to plot it will generate an error. Therefore, to report interrupt data in plot format you must also request CPU detail data, since interrupts are reported as part of that data.

Examples

The following examples only show interrupt data except where CPU data is required. Any of these reports can be extended to include other data types.

This first example shows the basic interrupt output displayed in brief mode on an 8-processor system, with timestamps. If you include --verbose you get essentially the same display, except with more significant digits for each interrupt count.

[mjs]# ./collectl.pl -sj -oT
#         <-----------------Int----------------->
#Time     Cpu0 Cpu1 Cpu2 Cpu3 Cpu4 Cpu5 Cpu6 Cpu7
12:49:55  4828  16K 1000  16K   18  16K  16K    0
12:49:56  4804  16K 1000  16K    0  16K  16K    0
12:49:57  4811  16K 1000  16K   18  16K  16K    0

As mentioned above, specifying -sCj is special in that it takes the data for two separate subsystems (CPU detail and interrupt summary) and combines them in a single display, as shown here for two sampling periods on a 4-processor system:

[mjs]# ./collectl.pl -sCj -oT
# SINGLE CPU STATISTICS
#            CPU  USER NICE  SYS WAIT IRQ  SOFT STEAL IDLE INTRPT
14:36:12       0     0    0    0    0    0    0     0  100     11
14:36:12       1     0    0    0    0    0    0     0   99    999
14:36:12       2     0    0    0    0    0    0     0  100      0
14:36:12       3     0    0    0    0    0    0     0  100      0
14:36:13       0     0    0    0    0    0    0     0  100     13
14:36:13       1     0    0    0    0    0    0     0  100   1000
14:36:13       2     0    0    0    0    0    0     0  100      0
14:36:13       3     0    0    0    0    0    0     0  100      0

A third form that can be particularly useful is the Interrupt Details report, which shows the distribution of the individual interrupts across the CPUs. What makes this form especially handy is that it shows only those interrupts that changed during the monitoring cycle, which is considerably easier to read than /proc/interrupts itself.

[mjs]# ./collectl.pl -sJ -oT
# INTERRUPT DETAILS
#          Int    Cpu0   Cpu1   Cpu2   Cpu3   Cpu4   Cpu5   Cpu6   Cpu7   Type            Device(s)
12:48:50   082       0      0      0   7731      0      0      0      0   PCI-MSI-X       eth2 (queue 0)
12:48:50   098       0      0      0      0   2037      0      0      0   PCI-MSI-X       eth2 (queue 2)
12:48:50   122       0      0   2240      0      0      0      0      0   PCI-MSI-X       eth2 (queue 5)
12:48:50   138       0   7084      0      0      0      0      0      0   PCI-MSI-X       eth2 (queue 7)
12:48:50   154       0      0      0      0      0   7723      0      0   PCI-MSI-X       eth3 (queue 0)
12:48:50   162    9082      0      0      0      0      0      0      0   PCI-MSI-X       eth3 (queue 1)
12:48:50   178       0      0      0      0      0      0   8253      0   PCI-MSI-X       eth3 (queue 3)
12:48:50   210       0      0      0      0      0      0      0   6417   PCI-MSI-X       eth3 (queue 7)
12:48:50   218       1      0      0      0      0      0      0      0   PCI-MSI         eth0
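
The filtering behind this report, where only interrupts whose counts changed during the interval are displayed, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not collectl's actual implementation: the function name changed_interrupts and the two snapshot dictionaries are hypothetical, and each snapshot is assumed to map an interrupt number to its list of cumulative per-CPU counts.

```python
def changed_interrupts(prev, curr):
    """Return per-CPU deltas for interrupts that fired between two snapshots.

    prev and curr map an interrupt label to a list of cumulative
    per-CPU counts. Interrupts with no change on any CPU are omitted.
    """
    changed = {}
    for irq, now in curr.items():
        # Assumption: an interrupt absent from the earlier snapshot starts at zero.
        before = prev.get(irq, [0] * len(now))
        delta = [n - b for n, b in zip(now, before)]
        if any(delta):                   # keep only rows with activity
            changed[irq] = delta
    return changed

# Hypothetical cumulative counts for a 2-CPU system at two sample times.
prev = {"82": [0, 100], "98": [5, 5], "218": [10, 0]}
curr = {"82": [0, 7831], "98": [5, 5], "218": [11, 0]}

print(changed_interrupts(prev, curr))   # {'82': [0, 7731], '218': [1, 0]}
```

Interrupt 98 is dropped from the result because none of its per-CPU counters moved, which is what keeps the report short on systems with many idle interrupt lines.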

Restrictions

If the number of CPUs changes during processing, this can only be detected when CPU data is being monitored at the same time. If you are monitoring only interrupt data and the number of CPUs changes, the output will become unreliable. As users typically monitor interrupt and CPU data together, it is not felt to be worth the extra effort or processing overhead to try to accommodate this rare case.

updated June 25, 2010