From: Andi Kleen <ak@muc.de>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Andi Kleen <ak@muc.de>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Runtime memory barrier patching
Date: Tue, 22 Apr 2003 00:59:38 +0200
Message-ID: <20030421225938.GA14947@averell>
In-Reply-To: <Pine.LNX.4.44.0304211514200.17938-100000@home.transmeta.com>

On Tue, Apr 22, 2003 at 12:23:10AM +0200, Linus Torvalds wrote:
> 
> On Tue, 22 Apr 2003, Andi Kleen wrote:
> > 
> > At least on Athlon/Opteron these sequences are the fastest because they are
> > special cased in the decoder and do not consume any execution resources.  
> 
> Is that true even on the 32-bit Athlons, especially the older ones?

It is not the recommended form for Athlons (see my other mail),
but I doubt it's a big issue. The Athlon has a pretty good decoder.

> 
> I can understand the special-casing on Opteron, since in 64-bit mode
> you'll see more of the prefixes, but for older K7s?

64-bit mode needs it special-cased anyway, because otherwise the common
xchg %eax,%eax nop (opcode 0x90) would not be a nop: as a 32-bit register
write it would zero-extend the register to 64 bits.
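
To illustrate the point, here is a small user-space model in C. It is
only a sketch, not code from the patch: the reg64 type and all function
names are made up for the example, and it assumes the nop in question is
the 32-bit xchg %eax,%eax form of opcode 0x90.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit general-purpose register. */
typedef struct { uint64_t value; } reg64;

/* In 64-bit mode, writing the 32-bit half of a register clears bits 63..32. */
static void write32(reg64 *r, uint32_t v)
{
    r->value = (uint64_t)v;
}

/* What 0x90 would do if it were decoded as an ordinary 32-bit exchange:
 * the low half is read and written back, so the upper half is lost. */
static void xchg_eax_eax_not_special_cased(reg64 *rax)
{
    uint32_t low = (uint32_t)rax->value;
    write32(rax, low);
}

/* What 0x90 does when the decoder special-cases it: nothing at all. */
static void nop_0x90(reg64 *rax)
{
    (void)rax;
}

/* Self-check for the model above. */
static void check_nop_model(void)
{
    reg64 a = { 0x1122334455667788ULL };
    reg64 b = { 0x1122334455667788ULL };

    xchg_eax_eax_not_special_cased(&a);
    nop_0x90(&b);

    assert(a.value == 0x0000000055667788ULL); /* upper 32 bits cleared */
    assert(b.value == 0x1122334455667788ULL); /* register untouched */
}
```

Calling check_nop_model() from a test main() exercises both paths.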

> I think the P3 (which is still Intel's "current" offering as it comes to 
> the mobile Pentium-M side) has problems. And there are still people who 
> use even older chips.

P3 should be fine now.

> > I'm using the GAS sequences for the Intel case Ulrich pointed out now,
> > but only upto 4 bytes (memory barrier only needs 3 bytes currently). 
> > This will hopefully satisfy all nop optimizers ;)
> 
> Looks good to me.
> 
> I do have _one_ more small niggling issue - I think this patch also makes
> the CONFIG_X86_SSE2 define be a thing of the past. Or is it used for
> something else still? It would be good to remove it, and try to make most
> of the architecture choices be pure optimization hints (apart from some of
> the more painful architecture updates like the broken write protect on the
> original 386). That will make it easier for distribution makers.

CONFIG_X86_SSE2 is a nop now, yes. But it does not matter, because
the user cannot set it directly. I can remove it in a followup patch.

I'm thinking of using it for the prefetches in the future.
I wrote prefetch-using versions of these for 64-bit and it
helps, so it may make sense to port it over.

I also experimented with replacing the local_irq_restore with
a P4-optimized version (bt $9,oldflags ; jnc 1f ; sti ; 1: instead
of pushl oldflags ; popfl), which takes 47 cycles instead of 60, but I ran
into weird binutils problems and it bloated the code quite a lot
(2 bytes -> 7 bytes) for only a few cycles, so I dropped it again.
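
For reference, the logic of the two variants can be modelled in plain C.
This is a user-space sketch only, not the kernel code: struct cpu_model
and the model_* functions are invented for the example. Bit 9 of EFLAGS
is the interrupt-enable flag (IF).

```c
#include <assert.h>
#include <stdbool.h>

#define X86_EFLAGS_IF (1UL << 9) /* interrupt-enable flag, bit 9 of EFLAGS */

/* Hypothetical stand-in for the CPU's interrupt-enable state. */
struct cpu_model { bool irqs_enabled; };

static void model_sti(struct cpu_model *cpu)
{
    cpu->irqs_enabled = true;
}

/* Models: bt $9,oldflags ; jnc 1f ; sti ; 1:
 * Interrupts are turned on only if IF was set in the saved flags. */
static void model_irq_restore_bt(struct cpu_model *cpu, unsigned long oldflags)
{
    if (oldflags & X86_EFLAGS_IF)
        model_sti(cpu);
}

/* Models: pushl oldflags ; popfl
 * IF is taken from the saved flags unconditionally. */
static void model_irq_restore_popf(struct cpu_model *cpu, unsigned long oldflags)
{
    cpu->irqs_enabled = (oldflags & X86_EFLAGS_IF) != 0;
}

/* Self-check: with interrupts currently off, both variants agree. */
static void check_irq_restore_model(void)
{
    unsigned long saved[] = { 0x0UL, X86_EFLAGS_IF, 0x246UL, 0x046UL };

    for (unsigned i = 0; i < sizeof(saved) / sizeof(saved[0]); i++) {
        struct cpu_model a = { false };
        struct cpu_model b = { false };

        model_irq_restore_bt(&a, saved[i]);
        model_irq_restore_popf(&b, saved[i]);
        assert(a.irqs_enabled == b.irqs_enabled);
    }
}
```

Note that in this model the bt/sti form can only enable interrupts, never
disable them, so the two match only when the restore runs with interrupts
off, which is the usual situation after a save-and-disable.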

But please put the first version of the patch in first so that
we get the infrastructure for future work.

-Andi



