From: Gautam H Thaker <gthaker@atl.lmco.com>
To: linux-kernel@vger.kernel.org
Cc: "Gautam H. Thaker - LM ATL" <gthaker@atl.lmco.com>,
Ingo Molnar <mingo@redhat.com>
Subject: ~5x greater CPU load for a networked application when using 2.6.15-rt15-smp vs. 2.6.12-1.1390_FC4
Date: Thu, 23 Feb 2006 14:55:56 -0500 [thread overview]
Message-ID: <43FE134C.6070600@atl.lmco.com> (raw)
The real-time patches at the URL below do a great job of endowing Linux with
real-time capabilities.
http://people.redhat.com/mingo/realtime-preempt/
It has been documented before (and accepted) that this patch turns Linux into
a RT kernel but considerably slows down the code paths, esp. thru the I/O
subsystem. I want to provide some additional measurements and seek opinions
of if it might ever be possible to improve on this situation.
In my tests I used 20 3GHZ Intel Xeon PCs on an isoloated gigabit network.
One of the nodes has a "monitor" process that listens to incoming UDP packets
from the other 19 nodes. Each node is sending approximately 2000 UDP
packets/sec to the monitor process for a total of about 38,000 incoming UDP
packest/sec. These UDP packets are small with application payload being ~10
bytes, for total BW usage of less than 4 Mbits/sec at application level and
less than 15 Mbits/sec counting all headers. (Total BW usage is not high but
there is a large number of packets that are coming in.) Monitor process does
some fairly simple processing per packet.
I measured the CPU usage of the "monitor" process when the testbed was used
with two different operating system. The monitor process is the "nalive.p"
process in the "top" output below. The CPU laod is fairly stable and "top"
gives the following information:
::::::::::::::
top: 2.6.12-1.1390_FC4 # STANDARD KERNEL
::::::::::::::
top - 14:34:39 up 2:32, 2 users, load average: 0.10, 0.05, 0.01
Tasks: 56 total, 2 running, 54 sleeping, 0 stopped, 0 zombie
top - 14:35:32 up 2:33, 2 users, load average: 0.11, 0.06, 0.01
Tasks: 56 total, 2 running, 54 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.4% us, 7.0% sy, 0.0% ni, 80.8% id, 0.2% wa, 7.0% hi, 3.6% si
Mem: 2076008k total, 100292k used, 1975716k free, 16192k buffers
Swap: 128512k total, 0k used, 128512k free, 50376k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4823 root -66 0 22712 2236 1484 S 8.4 0.1 0:37.74 nalive.p
4860 gthaker 16 0 7396 2380 1904 R 0.2 0.1 0:00.04 sshd
1 root 16 0 1748 572 492 S 0.0 0.0 0:01.06 init
2 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
3 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/0
::::::::::::::
top: 2.6.15-rt15-smp.out # REAL_TIME KERNEL
::::::::::::::
node0> top
top - 09:52:48 up 1:47, 3 users, load average: 0.91, 1.05, 1.02
Tasks: 98 total, 1 running, 97 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.5% us, 41.8% sy, 0.0% ni, 55.6% id, 0.1% wa, 0.0% hi, 0.0% si
Mem: 2058608k total, 88104k used, 1970504k free, 9072k buffers
Swap: 128512k total, 0k used, 128512k free, 39208k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2906 root -66 0 18624 2244 1480 S 41.4 0.1 27:11.21 nalive.p
6 root -91 0 0 0 0 S 32.3 0.0 21:04.53 softirq-net-rx/
1379 root -40 -5 0 0 0 S 14.5 0.0 9:54.76 IRQ 23
400 root 15 0 0 0 0 S 0.2 0.0 0:00.13 kjournald
1 root 16 0 1740 564 488 S 0.0 0.0 0:04.03 init
The %CPU is at 8% for the non-real-time, uniprocessor kernel, while it is at
least 41% (and may be 41.4%+32.3%+14.5% = 88%) for the real-time SMP kernel.)
My question is this: How much improvement in raw efficiency is possible for
real-time patches? We take very long view, so if there is a belief that in 5
years the penalty will be reduced from 5-10x in this application to less than
2x that would be great. If we think this is about as well as can be done it
helps knowing that too.
There is nothing else going on on the machines, all code paths should be
going down "happy path" with no contention or blocking - my naive view is
that a 2x overhead is possible, but 5-10x seems harder to understand. And
this is not the case of finding some large non-preemptible region - since
real-time performance is excellent, but about why the code paths seem so "heavy".
Gautam Thaker
next reply other threads:[~2006-02-23 19:56 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-02-23 19:55 Gautam H Thaker [this message]
2006-02-23 20:15 ` ~5x greater CPU load for a networked application when using 2.6.15-rt15-smp vs. 2.6.12-1.1390_FC4 Benjamin LaHaise
2006-02-23 20:58 ` Ingo Molnar
2006-02-23 21:06 ` Nish Aravamudan
2006-02-23 21:08 ` Ingo Molnar
2006-02-23 21:14 ` Nish Aravamudan
2006-02-23 22:07 ` Esben Nielsen
2006-02-24 8:03 ` Jan Engelhardt
2006-02-24 12:11 ` Andrew Morton
2006-02-24 20:06 ` Gautam H Thaker
2006-02-24 20:31 ` Andrew Morton
2006-02-24 20:44 ` Gautam H Thaker
2006-02-24 16:52 ` Theodore Ts'o
2006-02-24 19:25 ` Gautam H Thaker
2006-02-28 19:27 ` Matt Mackall
2006-02-28 22:19 ` Gautam H Thaker
-- strict thread matches above, loose matches on Subject: below --
2006-07-11 18:08 Jonathan Walsh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=43FE134C.6070600@atl.lmco.com \
--to=gthaker@atl.lmco.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox