From: Doug Ledford <dledford@redhat.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [beta patch] SSE copy_page() / clear_page()
Date: Fri, 09 Feb 2001 18:03:05 -0500
Message-ID: <3A847729.2C868879@redhat.com>
In-Reply-To: <3A846C84.109F1D7D@colorfullife.com> <961rkk$fgm$1@penguin.transmeta.com>
Linus Torvalds wrote:
>
> In article <3A846C84.109F1D7D@colorfullife.com>,
> Manfred Spraul <manfred@colorfullife.com> wrote:
> >
> >* use sse for normal memcopy. The main advantage of sse over mmx is
> >that only the clobbered registers must be saved, not the full fpu state.
> >
> >* verify that the code doesn't break SSE enabled apps.
> >I checked a sse enabled mp3 encoder and Mesa.
>
> Ehh.. Did you try this with pending FPU exceptions that have not yet
> triggered?
There are other gotchas here as well (speaking from experience :-(
> I have this strong suspicion that your kernel will lock up in a bad way
> if you have somebody do something like divide by zero without actually
> touching a single FP instruction after the divide (so that the error has
> happened, but has not yet been raised as an exception).
Or much worse, let the kernel mix and match SSE- and MMX-optimized routines
without doing a full FPU save in the SSE routines. An MMX routine then does
its FPU save while kernel data is still sitting in the SSE registers, that
data shows up later when the app touches those SSE registers, and you get
user space corruption. My code to handle this type of situation was *very*
complex, and I don't think I ever got it quite right without simply imposing
a rule that the kernel could never use both SSE and MMX instructions on the
same CPU.
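
The sequence described above can be sketched as a toy model in plain C. This
is only an illustration under stated assumptions, not kernel code: the
structs, the single mm0/xmm0 pair, and every function name here
(full_fpu_save, lazy_restore, mmx_routine, sse_routine, run_scenario,
check_model) are hypothetical, and the lazy-restore behavior is a simplified
assumption about how the saved state gets reloaded for the app.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (an assumption, not hardware): one MMX and one SSE register. */
struct fpu_regs {
    unsigned mm0;
    unsigned xmm0;
};

struct cpu  { struct fpu_regs live; };                     /* live registers */
struct task { struct fpu_regs saved; bool saved_valid; };  /* per-task save area */

#define USER_MM0    0x11111111u
#define USER_XMM0   0x22222222u
#define KERNEL_DATA 0xDEADBEEFu

/* Full save, as the MMX path does: copies everything, SSE register included. */
static void full_fpu_save(const struct cpu *c, struct task *t)
{
    t->saved = c->live;
    t->saved_valid = true;
}

/* Assumed lazy restore: reload the save area when the app next uses the FPU. */
static void lazy_restore(struct cpu *c, struct task *t)
{
    if (t->saved_valid) {
        c->live = t->saved;
        t->saved_valid = false;
    }
}

/* Hypothetical MMX routine that runs in the middle of the SSE routine. */
static void mmx_routine(struct cpu *c, struct task *t)
{
    full_fpu_save(c, t);         /* snapshots xmm0 while it holds kernel data */
    c->live.mm0 = 0xAAAAAAAAu;   /* kernel scratch use of the MMX register */
}

/* Hypothetical SSE routine that saves only the register it clobbers. */
static void sse_routine(struct cpu *c, struct task *t, bool interrupted)
{
    unsigned stack_xmm0 = c->live.xmm0;  /* partial save on the kernel stack */
    c->live.xmm0 = KERNEL_DATA;          /* kernel data in flight */
    if (interrupted)
        mmx_routine(c, t);
    c->live.xmm0 = stack_xmm0;           /* partial restore looks correct here */
}

/* Returns the xmm0 value the application sees afterwards. */
unsigned run_scenario(bool interrupted)
{
    struct cpu c = { { USER_MM0, USER_XMM0 } };
    struct task t = { { 0, 0 }, false };

    sse_routine(&c, &t, interrupted);
    lazy_restore(&c, &t);   /* app touches the FPU again */
    return c.live.xmm0;
}

/* Checks both outcomes of the model. */
void check_model(void)
{
    assert(run_scenario(false) == USER_XMM0);    /* SSE alone: state intact */
    assert(run_scenario(true)  == KERNEL_DATA);  /* mixed: kernel data leaks */
}
```

In this model the SSE routine on its own is harmless. Once the MMX routine's
full save lands between the partial save and the partial restore, the stale
snapshot overwrites the app's xmm0 on the next restore, which is the kind of
bookkeeping a "never both on the same CPU" rule avoids.
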
> And when it hits your SSE copy routines with the pending error, it will
> likely loop forever on taking the fault in kernel space.
>
> Basically, kernel use of MMX and SSE are a lot harder to get right than
> many people seem to realize. Why do you think I threw out all the
> patches that tried to do this?
>
> And no, the bug won't show up in any normal testing. You'll never know
> about it until somebody malicious turns your machine into a doorstop.
>
> Finally, did you actually see any performance gain in any benchmarks?
Now this I can comment on. Real life gain. We had to re-run a SpecWEB99 test
after taking the SSE instructions out of our (buggy) 2.2.14-5.0 kernel that
shipped in Red Hat Linux 6.2 (hanging bag over my head). We recompiled the
exact same kernel without the SSE stuff, re-ran the SpecWEB99 test, and saw a
performance drop of roughly 2% overall from the change. So, yes, they make an
aggregate performance difference to the typical running of the kernel (and
this was without copy_page or zero_page being optimized; I had instead done
memset, memcpy, and copy_*_user as optimized routines), and they make a huge
difference to benchmarks that target the areas they help the most (like disk
I/O benchmarks, which would sometimes see a 40% or more boost in
performance).
--
Doug Ledford <dledford@redhat.com> http://people.redhat.com/dledford
Please check my web site for aic7xxx updates/answers before
e-mailing me about problems