From: Andrew Morton <akpm@digeo.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Andi Kleen <ak@suse.de>, davem@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [BENCHMARK] Lmbench 2.5.54-mm2 (impressive improvements)
Date: Sun, 05 Jan 2003 01:18:48 -0800
Message-ID: <3E17F878.21A363BF@digeo.com>
In-Reply-To: <Pine.LNX.4.44.0301041930300.1388-100000@home.transmeta.com>
Linus Torvalds wrote:
>
> ...
> It doesn't show up on lmbench (insufficient precision), but your AIM9
> numbers are quite interesting. Are they stable?
OK, a closer look. This is on a dual 1.7G P4, with HT disabled (involuntarily,
grr.) Looks like an 8-10% hit on context-switch intensive stuff.
2.5.54+BK
=========
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host                 OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
i686-linu  Linux 2.5.54     3      4     11      6     48      12      53
*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host OS 2p/0K Pipe AF UDP RPC/ TCP RPC/ TCP
ctxsw UNIX UDP TCP conn
tbench 32: (85k switches/sec)
Throughput 114.633 MB/sec (NB=143.291 MB/sec 1146.33 MBit/sec)
Throughput 114.157 MB/sec (NB=142.696 MB/sec 1141.57 MBit/sec)
Throughput 115.095 MB/sec (NB=143.869 MB/sec 1150.95 MBit/sec)
pollbench 1 100 5000 (118k switches/sec)
result with handles 1 processes 100 loops 5000:time 8.371942 sec.
result with handles 1 processes 100 loops 5000:time 8.381814 sec.
result with handles 1 processes 100 loops 5000:time 8.367576 sec.
pollbench 2 100 2000 (105k switches/sec)
result with handles 2 processes 100 loops 2000:time 3.694412 sec.
result with handles 2 processes 100 loops 2000:time 3.672226 sec.
result with handles 2 processes 100 loops 2000:time 3.657455 sec.
pollbench 5 100 2000 (79k switches/sec)
result with handles 5 processes 100 loops 2000:time 4.564727 sec.
result with handles 5 processes 100 loops 2000:time 4.783192 sec.
result with handles 5 processes 100 loops 2000:time 4.561067 sec.
2.5.54+BK+broken-wrmsr-backout-patch:
=====================================
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host                 OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
i686-linu  Linux 2.5.54     3      4     11      6     48      12      53
i686-linu  Linux 2.5.54     1      3      8      4     40      10      51
*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host OS 2p/0K Pipe AF UDP RPC/ TCP RPC/ TCP
ctxsw UNIX UDP TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
i686-linu Linux 2.5.54 3 14 22 26 30 57
i686-linu Linux 2.5.54 1 12 28 22 32 58
tbench 32:
Throughput 121.701 MB/sec (NB=152.126 MB/sec 1217.01 MBit/sec)
Throughput 124.958 MB/sec (NB=156.197 MB/sec 1249.58 MBit/sec)
Throughput 124.086 MB/sec (NB=155.107 MB/sec 1240.86 MBit/sec)
pollbench 1 100 5000
result with handles 1 processes 100 loops 5000:time 7.306432 sec.
result with handles 1 processes 100 loops 5000:time 7.352913 sec.
result with handles 1 processes 100 loops 5000:time 7.337019 sec.
pollbench 2 100 2000
result with handles 2 processes 100 loops 2000:time 3.184550 sec.
result with handles 2 processes 100 loops 2000:time 3.251854 sec.
result with handles 2 processes 100 loops 2000:time 3.209147 sec.
pollbench 5 100 2000
result with handles 5 processes 100 loops 2000:time 4.135773 sec.
result with handles 5 processes 100 loops 2000:time 4.117304 sec.
result with handles 5 processes 100 loops 2000:time 4.119047 sec.
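To put a number on the pollbench difference, the run times above can be averaged per configuration and compared. This is only a rough sketch: the choice of the backout kernel as the baseline, the use of a plain mean over three runs, and the names bk, backout and relative_change are all assumptions made for illustration, not part of the original measurements.

```python
from statistics import mean

# Wall-clock seconds copied from the pollbench runs quoted above.
bk = {
    "1 100 5000": [8.371942, 8.381814, 8.367576],
    "2 100 2000": [3.694412, 3.672226, 3.657455],
    "5 100 2000": [4.564727, 4.783192, 4.561067],
}
backout = {
    "1 100 5000": [7.306432, 7.352913, 7.337019],
    "2 100 2000": [3.184550, 3.251854, 3.209147],
    "5 100 2000": [4.135773, 4.117304, 4.119047],
}


def relative_change(args):
    """Fractional change in mean run time of 2.5.54+BK relative to the backout kernel."""
    return mean(bk[args]) / mean(backout[args]) - 1.0


for args in bk:
    print(f"pollbench {args}: {relative_change(args):+.1%} run time with 2.5.54+BK")
```

The percentage shifts depending on which kernel is taken as the baseline, so the direction of the comparison should be stated alongside any figure quoted from it.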
The tbench changes should probably be ignored. After profiling tbench
I can say that this throughput difference is _not_ due to the task switcher
change (__switch_to is only 1%). I left the numbers here to show what
the effect of simply relinking and rebooting the kernel can be.
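One way to look at the tbench numbers is to compare the gap between the two kernels with the run-to-run spread inside each set of three runs. The sketch below does that with the throughput figures quoted earlier; the spread measure (max minus min over the mean) and the variable names are assumptions chosen for illustration.

```python
from statistics import mean

# tbench 32 throughput in MB/sec, copied from the runs quoted above.
bk = [114.633, 114.157, 115.095]
backout = [121.701, 124.958, 124.086]


def spread(runs):
    """Run-to-run spread (max - min) as a fraction of the mean."""
    return (max(runs) - min(runs)) / mean(runs)


# Fractional throughput difference between the two kernels.
between = mean(backout) / mean(bk) - 1.0

print(f"spread within 2.5.54+BK runs:      {spread(bk):.1%}")
print(f"spread within backout-patch runs:  {spread(backout):.1%}")
print(f"difference between the two means:  {between:+.1%}")
```

A gap larger than the spread within each set shows the runs are repeatable on a given kernel image, but it does not say what caused the gap. That is the caveat above: the profile, not the throughput number, is what rules out the task switcher.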
BTW, the pollbench numbers are not stunningly better than the 500MHz PIII:
pollbench 1 100 5000
result with handles 1 processes 100 loops 5000:time 9.609487 sec.
pollbench 2 100 2000
result with handles 2 processes 100 loops 2000:time 4.016496 sec.
pollbench 5 100 2000
result with handles 5 processes 100 loops 2000:time 4.917921 sec.
I didn't profile the P4. John has promised P4 oprofile support for
next week, which will be nice.
I did profile Manfred's pollbench on the PIII, uniprocessor build. Note
that there is only a 5% throughput difference on this machine. It's all
in __switch_to(). Here the PIII is doing 70k switches/sec.
2.5.54+BK:
c012abbc 534 2.69888 buffered_rmqueue
c0116714 617 3.11837 __wake_up_common
c010a606 635 3.20934 restore_all
c014b038 745 3.76529 do_poll
c013d4dc 757 3.82594 fget
c014551c 766 3.87142 pipe_write
c010a5c4 1249 6.31254 system_call
c014b0f0 1273 6.43384 sys_poll
c01090a4 1775 8.97099 __switch_to
c0116484 1922 9.71394 schedule
2.5.54+BK+backout-patch:
c012abbc 768 3.1024 buffered_rmqueue
c0116714 790 3.19127 __wake_up_common
c010a5e6 809 3.26803 restore_all
c013d4dc 918 3.70834 fget
c014551c 936 3.78105 pipe_write
c014b038 977 3.94668 do_poll
c01090a4 1070 4.32236 __switch_to
c014b0f0 1606 6.48758 sys_poll
c010a5a4 1678 6.77843 system_call
c0116484 2542 10.2686 schedule
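The two profiles are easier to compare side by side. The sketch below parses a few of the lines above and prints the change in each symbol's share. It assumes the columns are address, sample count, percent of total samples, and symbol name, which the listing itself does not label; parse_profile is a hypothetical helper, not part of any profiling tool. Percentages are compared rather than raw sample counts because the two runs collected different totals.

```python
def parse_profile(text):
    """Map symbol -> (samples, percent) from 'addr samples percent symbol' lines.

    The column meaning is an assumption; the original listing has no header.
    """
    rows = {}
    for line in text.strip().splitlines():
        _addr, samples, percent, symbol = line.split()
        rows[symbol] = (int(samples), float(percent))
    return rows


# A subset of the profile lines quoted above.
BK = """
c010a5c4 1249 6.31254 system_call
c014b0f0 1273 6.43384 sys_poll
c01090a4 1775 8.97099 __switch_to
c0116484 1922 9.71394 schedule
"""
BACKOUT = """
c01090a4 1070 4.32236 __switch_to
c014b0f0 1606 6.48758 sys_poll
c010a5a4 1678 6.77843 system_call
c0116484 2542 10.2686 schedule
"""

bk = parse_profile(BK)
backout = parse_profile(BACKOUT)

for symbol in bk:
    before = bk[symbol][1]
    after = backout[symbol][1]
    print(f"{symbol:12s} {before:6.2f}% -> {after:6.2f}%  ({before - after:+.2f} points)")
```

Of the symbols listed, only __switch_to moves by more than a point between the two kernels.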