From: Momchil Velikov <velco@fadata.bg>
To: vda@port.imtp.ilyichevsk.odessa.ua
Cc: Russell King <rmk@arm.linux.org.uk>,
Roy Sigurd Karlsbakk <roy@karlsbakk.net>,
netdev@oss.sgi.com,
Kernel mailing list <linux-kernel@vger.kernel.org>,
libc-alpha@sources.redhat.com
Subject: Re: Csum and csum copyroutines benchmark
Date: 25 Oct 2002 10:48:10 +0300
Message-ID: <87n0p3x8lh.fsf@fadata.bg>
In-Reply-To: <200210250643.g9P6hop13980@Port.imtp.ilyichevsk.odessa.ua>
>>>>> "Denis" == Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua> writes:
Denis> /me said:
>> I'm experimenting with different csum_ routines in userspace now.
Denis> Short conclusion:
Denis> 1. It is possible to speed up csum routines for AMD processors by 30%.
Denis> 2. It is possible to speed up csum_copy routines for both AMD and Intel
Denis> three times or more. Roy, do you like that? ;)
Additional data point:
Short summary:
1. Checksum - kernelpii_csum is ~19% faster
2. Copy - kernelpii_copy is ~6% faster
Dual Pentium III, 1266 MHz, 512K cache, 2G SDRAM (133 MHz, ECC)
The only changes I made were to decrease the buffer size to 1K (as I
think this is more representative of a network packet size, correct me
if I'm wrong) and to increase the number of runs to 1024. Max values
are indeed worthless.
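For illustration, the measurement loop described above (many runs over a 1 KB buffer, keeping the min and max) might look roughly like the sketch below. It is written in Python for brevity and is not the benchmark's own code: `measure` is a hypothetical helper, the built-in `sum` merely stands in for a checksum routine, and it times in nanoseconds via `time.perf_counter_ns` rather than in CPU cycles.

```python
import time

def measure(func, buf, runs=1024):
    """Call func(buf) `runs` times; return (min_ns, max_ns, last_result)."""
    best, worst, result = None, 0, None
    for _ in range(runs):
        start = time.perf_counter_ns()
        result = func(buf)
        elapsed = time.perf_counter_ns() - start
        # The minimum approximates the undisturbed cost of one call;
        # the maximum mostly reflects interference from the rest of the system.
        best = elapsed if best is None else min(best, elapsed)
        worst = max(worst, elapsed)
    return best, worst, result

buf = bytes(range(256)) * 4            # 1 KB test buffer
best, worst, total = measure(sum, buf)  # `sum` is only a stand-in routine
print(f"took {worst} max, {best} min ns per kb. sum={total:#x}")
```

Only the minimum is meaningful for comparing routines, which is why the max column is ignored.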
Csum benchmark program
buffer size: 1 K
Each test tried 1024 times, max and min CPU cycles are reported.
Please disregard max values. They are due to system interference only.
csum tests:
kernel_csum - took 941 max, 740 min cycles per kb. sum=0x44000077
kernel_csum - took 748 max, 742 min cycles per kb. sum=0x44000077
kernel_csum - took 60559 max, 742 min cycles per kb. sum=0x44000077
kernelpii_csum - took 52804 max, 601 min cycles per kb. sum=0x44000077
kernelpiipf_csum - took 12930 max, 601 min cycles per kb. sum=0x44000077
pfm_csum - took 10161 max, 1402 min cycles per kb. sum=0x44000077
pfm2_csum - took 864 max, 838 min cycles per kb. sum=0x44000077
copy tests:
kernel_copy - took 339 max, 239 min cycles per kb. sum=0x44000077
kernel_copy - took 239 max, 239 min cycles per kb. sum=0x44000077
kernel_copy - took 239 max, 239 min cycles per kb. sum=0x44000077
kernelpii_copy - took 244 max, 225 min cycles per kb. sum=0x44000077
ntqpf_copy - took 10867 max, 512 min cycles per kb. sum=0x44000077
ntqpfm_copy - took 710 max, 403 min cycles per kb. sum=0x44000077
ntq_copy - took 4535 max, 443 min cycles per kb. sum=0x44000077
ntqpf2_copy - took 563 max, 555 min cycles per kb. sum=0x44000077
Done
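The percentages in the short summary can be reproduced from the minimum cycle counts in this run. A small sketch of that arithmetic, in Python (`percent_faster` is a hypothetical helper name, and "faster" is taken here to mean cycles saved relative to the baseline routine):

```python
def percent_faster(baseline_cycles: int, candidate_cycles: int) -> float:
    """Percentage of cycles the candidate saves relative to the baseline."""
    return 100.0 * (baseline_cycles - candidate_cycles) / baseline_cycles

# Minimum cycles per KB, taken from the output above.
csum_gain = percent_faster(740, 601)  # kernel_csum vs kernelpii_csum
copy_gain = percent_faster(239, 225)  # kernel_copy vs kernelpii_copy

print(f"csum: {csum_gain:.1f}%")  # prints "csum: 18.8%"
print(f"copy: {copy_gain:.1f}%")  # prints "copy: 5.9%"
```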
HOWEVER ...
sometimes (say 1/30) I get the following output:
Csum benchmark program
buffer size: 1 K
Each test tried 1024 times, max and min CPU cycles are reported.
Please disregard max values. They are due to system interference only.
csum tests:
kernel_csum - took 958 max, 740 min cycles per kb. sum=0x44000077
kernel_csum - took 748 max, 740 min cycles per kb. sum=0x44000077
kernel_csum - took 752 max, 740 min cycles per kb. sum=0x44000077
kernelpii_csum - took 624 max, 600 min cycles per kb. sum=0x44000077
kernelpiipf_csum - took 877211 max, 601 min cycles per kb. sum=0x44000077
Bad sum
Aborted
which is to say that the pfm_csum and pfm2_csum results are not to be
trusted, at least on PIII (or perhaps with my kernel config,
CONFIG_MPENTIUMIII=y?).
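The "Bad sum" abort suggests the benchmark compares each routine's result against an expected value. A rough sketch of such a cross-check, in Python and under assumptions: the reference is a plain RFC 1071 style ones' complement sum over 16-bit big-endian words, and `fold16`, `reference_csum` and `check` are hypothetical names, not taken from the benchmark source.

```python
def fold16(partial: int) -> int:
    """Fold a wide partial sum into 16 bits with end-around carry."""
    while partial >> 16:
        partial = (partial & 0xFFFF) + (partial >> 16)
    return partial

def reference_csum(data: bytes) -> int:
    """Ones' complement sum of 16-bit big-endian words (not inverted)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return fold16(total)

def check(name, routine, data):
    """Raise if the routine under test disagrees with the reference."""
    got = fold16(routine(data))
    want = reference_csum(data)
    if got != want:
        raise RuntimeError(f"Bad sum in {name}: got {got:#06x}, want {want:#06x}")
    return got
```

A routine that fails such a check only intermittently, as here, cannot be trusted for timing comparisons either, since it may be skipping work on some runs.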
~velco