So why should you care? This probably won't affect any of your running applications, but do you ever run top, iostat, sar, or any other monitoring tool? If you're reading this, you probably run collectl. Most monitoring tools are fairly lightweight, and for good reason: if you're trying to measure something, you don't want the tool's overhead to get in the way. Unfortunately, with this regression it now will!
A bug has been logged with Red Hat (see bugzilla 761293), and we are also working with a kernel developer on some patch testing. Results look promising, so a fix should eventually make it into a newer kernel release, but until then there will be kernels in circulation with higher /proc read overhead. Also remember that this only happens on systems with high core counts, which rules out most systems.
In the following example, you can see monitoring CPU data takes about 3 seconds to read almost 9K samples and write them to a file on a 2-socket/dual-core system. Very efficient!
time collectl -sc -i0 -c 8640 -f/tmp
real 0m2.879s
user 0m1.908s
sys 0m0.913s

time collectl -sc -i0 -c 864 -f/tmp
real 0m16.783s
user 0m3.003s
sys 0m13.523s
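Because the two runs above collect different numbers of samples, the raw times are easier to compare per sample. The following is only a rough sketch of that arithmetic. The helper name per_sample_ms is made up for illustration and is not part of collectl.

```shell
# Hypothetical helper (not part of collectl): wall-clock cost per sample
# in milliseconds, computed as elapsed_seconds / samples * 1000.
per_sample_ms() {
    awk -v secs="$1" -v n="$2" 'BEGIN { printf "%.3f\n", secs / n * 1000 }'
}

per_sample_ms 2.879 8640    # first run:  prints 0.333
per_sample_ms 16.783 864    # second run: prints 19.425
```

On these figures the second run costs roughly 58 times more wall-clock time per sample than the first, and most of that time is spent in sys.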
Since a simple uname command will tell you your kernel version, you might think that's all it takes, but nothing is ever that simple: most vendors patch their kernels, so you can't always be sure what code is actually running.
One way to tell for sure is to run the very simple test below, which uses strace to time a read of /proc/stat (which seems to be the most heavily affected file) and see how much time is spent in the actual read.
The following is on my 2-socket/dual-core system:
strace -c cat /proc/stat >/dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000251         251         1           execve
  0.00    0.000000           0         3           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0         4           open
  0.00    0.000000           0         5           close
  0.00    0.000000           0         5           fstat
  0.00    0.000000           0         8           mmap
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000251                    37         1 total
strace -c cat /proc/stat >/dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.014997        4999         3           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0        20        16 open
  0.00    0.000000           0         6           close
  0.00    0.000000           0        12        10 stat
  0.00    0.000000           0         5           fstat
  0.00    0.000000           0         8           mmap
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         4           brk
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.014997                    66        27 total
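If you want to compare several machines or kernels, it may be handy to pull just the read row out of the strace summary. What follows is a minimal sketch rather than anything collectl ships. The helper name read_usecs_per_call is made up for illustration, and it assumes the `strace -c` column layout shown above, where usecs/call is the third column and the syscall name is the last.

```shell
# Hypothetical helper (not part of collectl or strace): print the usecs/call
# value of the "read" row from `strace -c` summary text supplied on stdin.
# Assumes usecs/call is field 3 and the syscall name is the last field.
read_usecs_per_call() {
    awk '$NF == "read" { print $3 }'
}

# Live use: strace writes its summary to stderr, so stderr is sent into the
# pipe first and cat's own output is then discarded:
#   strace -c cat /proc/stat 2>&1 >/dev/null | read_usecs_per_call

# Demo on two rows copied from the second summary above; prints 4999.
printf '%s\n' '100.00 0.014997 4999 3 read' '0.00 0.000000 0 1 write' \
    | read_usecs_per_call
```

A value near zero, as in the first summary, suggests the kernel is not affected. A value in the thousands of microseconds, as in the second, suggests it is.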
Updated Dec 13, 2011