From: Andrew Morton <andrewm@uow.edu.au>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [beta patch] SSE copy_page() / clear_page()
Date: Sat, 17 Feb 2001 02:27:02 +1100
Message-ID: <3A8D46C6.3873DF22@uow.edu.au>
In-Reply-To: <3A846C84.109F1D7D@colorfullife.com> <200102092240.OAA15902@penguin.transmeta.com> <3A8B08C7.BD79E3B4@colorfullife.com>
Manfred Spraul wrote:
>
> Intel Pentium III and P4 have hardcoded "fast string copy" operations
> that invalidate whole cachelines during write (documented in the most
> obvious place: multiprocessor management, memory ordering).
Which are dramatically slower than a simple `mov' loop for just
about all alignments, except for source and dest both eight-byte
aligned.
For example, copying an uncached source to an uncached dest,
with the source misaligned, my PIII Coppermine does 108 MBytes/sec
with `rep;movsl' and 149 MBytes/sec with an open-coded variant
of our copy_csum routines. That's a lot. Similar results
on a PII and a PIII Katmai.
On the K6-2, however, the string operation is almost always
a win.
It seems that a good approximation for our bulk-copy strategy is:
	if (AMD) {
		string_copy();
	} else if (intel) {
		if ((source|dest) & 7)
			duff_copy();
		else
			string_copy();
	} else {
		quack();
	}
This will make our Intel copies 20-40% faster than
at present, depending upon the distribution of
alignments. (And for networking, the distribution
is pretty much uniform).
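
As a rough illustration only, the selection logic above could be written
out in C along the following lines. The enum names and choose_copy() are
hypothetical and are not taken from any kernel source; the sketch only
decides which routine would be used and does no copying itself.

```c
#include <stdint.h>

/* Hypothetical vendor tag; the pseudocode only says "AMD" and "intel". */
enum cpu_vendor { VENDOR_AMD, VENDOR_INTEL, VENDOR_OTHER };

/* Hypothetical labels for the routines named in the pseudocode. */
enum copy_method { COPY_STRING, COPY_DUFF, COPY_FALLBACK };

/*
 * Pick a copy method from the vendor and the low three bits of both
 * addresses.  OR-ing the addresses means a single test catches either
 * pointer being off an eight-byte boundary.
 */
static enum copy_method choose_copy(enum cpu_vendor vendor,
                                    const void *dst, const void *src)
{
    uintptr_t bits = (uintptr_t)src | (uintptr_t)dst;

    if (vendor == VENDOR_AMD)
        return COPY_STRING;
    if (vendor == VENDOR_INTEL)
        return (bits & 7) ? COPY_DUFF : COPY_STRING;
    return COPY_FALLBACK;   /* the quack() branch: nothing measured yet */
}
```

Plain C cannot OR two pointers directly, which is why the sketch casts
both to uintptr_t before testing the low bits.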
Somewhere on my to-do list is getting lots of people to
test lots of architectures with lots of combinations of
[source/dest][cached/uncached] at lots of alignments
to confirm if this will work.
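
To give an idea of what such a sweep involves, here is a minimal
user-space sketch in C. It is not cptimer and does not reflect its
interface: BUF_BYTES, mov_copy(), lib_copy(), copy_rate() and sweep() are
all made up for illustration. It covers only the alignment axis (every
source/dest offset from 0 to 7) and leaves cached versus uncached out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define BUF_BYTES (1u << 20)   /* hypothetical size: 1 MiB per copy */

typedef void (*copy_fn)(void *dst, const void *src, size_t n);

/*
 * Open-coded copy using 32-bit loads and stores, standing in for a
 * `mov' loop.  The 4-byte memcpy() calls keep unaligned accesses well
 * defined in C.  An optimizing compiler may rewrite this loop, so the
 * generated code needs checking before any numbers are trusted.
 */
static void mov_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (; n >= 4; n -= 4, d += 4, s += 4) {
        uint32_t w;
        memcpy(&w, s, 4);
        memcpy(d, &w, 4);
    }
    while (n--)
        *d++ = *s++;
}

/* Library memcpy() wrapped to match copy_fn, as a second candidate. */
static void lib_copy(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Time `reps' copies of n bytes; returns MBytes/sec of CPU time,
 * or 0 if the run was too short for clock() to resolve. */
static double copy_rate(copy_fn fn, unsigned char *dst,
                        const unsigned char *src, size_t n, int reps)
{
    clock_t t0 = clock();
    for (int i = 0; i < reps; i++)
        fn(dst, src, n);
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return secs > 0 ? (double)n * reps / secs / 1e6 : 0.0;
}

/* Round a pointer up to the next eight-byte boundary. */
static unsigned char *align8(unsigned char *p)
{
    return (unsigned char *)(((uintptr_t)p + 7) & ~(uintptr_t)7);
}

/* Fill rate[s][d] for every source offset s and dest offset d in 0..7. */
static void sweep(copy_fn fn, double rate[8][8], int reps)
{
    static unsigned char src_buf[BUF_BYTES + 16], dst_buf[BUF_BYTES + 16];
    unsigned char *src = align8(src_buf), *dst = align8(dst_buf);

    for (int s = 0; s < 8; s++)
        for (int d = 0; d < 8; d++)
            rate[s][d] = copy_rate(fn, dst + d, src + s, BUF_BYTES, reps);
}
```

Calling sweep(mov_copy, rate, 16) and sweep(lib_copy, rate2, 16) and
printing both tables side by side would show how each routine responds
to misalignment on a given machine. A real test would also need to
control cache state, which this sketch does not attempt.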
If you have time, could you please grab
http://www.uow.edu.au/~andrewm/linux/cptimer.tar.gz
and teach it how to do SSE copies, in preparation for this
great event?
Thanks.