Re: [rfc][patch 3/3] x86: optimise barriers


From: Jarek Poplawski <jarkao2@o2.pl>
To: Nick Piggin <npiggin@suse.de>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andi Kleen <ak@suse.de>
Subject: Re: [rfc][patch 3/3] x86: optimise barriers
Date: Fri, 12 Oct 2007 11:55:05 +0200
Message-ID: <20071012095505.GD1962@ff.dom.local>
In-Reply-To: <20071012085733.GA19237@wotan.suse.de>

On Fri, Oct 12, 2007 at 10:57:33AM +0200, Nick Piggin wrote:
> On Fri, Oct 12, 2007 at 10:25:34AM +0200, Jarek Poplawski wrote:
> > On 04-10-2007 07:23, Nick Piggin wrote:
> > > According to latest memory ordering specification documents from Intel and
> > > AMD, both manufacturers are committed to in-order loads from cacheable memory
> > > for the x86 architecture. Hence, smp_rmb() may be a simple barrier.
> > ...
> > 
> > Great news!
> > 
> > First it looks like a really great thing that it's revealed at last.
> > But then... there is probably some confusion: did we have to use
> > ineffective code for so long?
> 
> I'm not sure exactly what the situation is with the manufacturers,
> but maybe they (at least Intel) wanted to keep their options open
> WRT their barrier semantics, even if current implementations were
> not taking full liberty of them.
> 
>  
> > First again, we could try to blame Intel etc. But then, wait a minute:
> > is it such a mystery knowledge? If this reordering is done there are
> > some easy rules broken (just like in examples from these manuals). And
> > if somebody cared to do this for optimization, then this is probably
> > noticeable optimization, let's say 5 or 10%. Then any test shouldn't
> > need to take very long to tell the truth in less than 100 loops!
> 
> I don't know quite what you're saying... the CPUs could probably get
> performance by having weakly ordered loads, OTOH I think the Intel
> ones might already do this speculatively so they appear in order but
> essentially have the performance of weak order.

I meant: if any reordering were possible, it should be quite
distinctly visible, because why would any vendor enable such nasty
things if not for performance? But now I'm starting to doubt: of course
it's possible that someone does this reordering for some other
reason, in cases so rare that they're hard to check. And this
someone knows its processors look less efficient because of, e.g.,
mostly unneeded read barriers used by operating systems...

> 
> If you're just talking about this patch, then it probably isn't much
> performance gain. I'm guessing you'd be lucky to measure it from
> userspace.

No, it's only about this comment in the patch: "Hence, smp_rmb() may be
a simple barrier".

> 
> 
> > So, maybe Linux needs something like this, instead of waiting a few
> > years with each new model for vendors' goodwill? IMHO, even for less
> > popular processors, this could be checked under some debugging option
> > at the system start (after disabling a suspicious barrier for a while
> > plus some WARN_ONs).
> 
> I don't know if that would be worthwhile. It actually isn't always
> trivial to trigger reordering. For example, on my dual-core core2,
> in order to see reads pass writes, I have to do work on a set that
> exceeds the cache size and does a huge amount of work to ensure it
> is going to trigger that. If you can actually come up with a test
> case that triggers load/load or store/store reordering, I'm sure
> Intel / AMD would like to see it ;)

Anyway, it seems any heavy testing such as yours should give us the
same information years earlier than any vendor's manual, and then any
gain is multiplied by millions of users. Then only the still doubtful
cases would need to be treated with additional caution and some
debugging code.
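
For illustration, the kind of test discussed here can be sketched in
userspace as a message-passing litmus test. This is only a sketch under
stated assumptions: the names run_mp_litmus, mp_args, writer and reader
are hypothetical, the smp_wmb()/smp_rmb() macros below are local
stand-ins defined as compiler-only barriers (not the kernel's
definitions and not the code from this patch), and the asm barrier
assumes GCC or Clang. A count of zero proves nothing about the
architecture; a nonzero count would indicate store/store or load/load
reordering.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Compiler-only barrier (GCC/Clang): emits no instruction. */
#define barrier()  __asm__ __volatile__("" ::: "memory")
/* Assumption under test: the CPU itself keeps stores and loads in order. */
#define smp_wmb()  barrier()
#define smp_rmb()  barrier()

static atomic_int data, flag;

struct mp_args {
	int iterations;
	long violations;
};

static void *writer(void *p)
{
	struct mp_args *a = p;
	int i;

	for (i = 1; i <= a->iterations; i++) {
		atomic_store_explicit(&data, i, memory_order_relaxed);
		smp_wmb();	/* data must become visible before flag */
		atomic_store_explicit(&flag, i, memory_order_relaxed);
	}
	return NULL;
}

static void *reader(void *p)
{
	struct mp_args *a = p;
	int f, d;

	do {
		f = atomic_load_explicit(&flag, memory_order_relaxed);
		smp_rmb();	/* flag must be read before data */
		d = atomic_load_explicit(&data, memory_order_relaxed);
		if (d < f)	/* saw the new flag but stale data */
			a->violations++;
	} while (f < a->iterations);
	return NULL;
}

/* Returns how many times the reader observed data older than flag. */
long run_mp_litmus(int iterations)
{
	struct mp_args w = { iterations, 0 }, r = { iterations, 0 };
	pthread_t wt, rt;
	int rc;

	atomic_store(&data, 0);
	atomic_store(&flag, 0);

	rc = pthread_create(&rt, NULL, reader, &r);
	assert(rc == 0);
	rc = pthread_create(&wt, NULL, writer, &w);
	assert(rc == 0);
	(void)rc;

	pthread_join(wt, NULL);
	pthread_join(rt, NULL);
	return r.violations;
}
```

Calling run_mp_litmus() with a large iteration count on two different
cores is the idea; as noted in the quoted text, actually triggering
reordering may take far more effort than a simple loop like this.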

> 
> All existing processors as far as we know are in-order WRT loads vs
> loads and stores vs stores. It was just a matter of getting the docs
> clarified, which gives us more confidence that we're correct and a
> reasonable guarantee of forward compatibility.

After reading Intel's legal notice, I don't think you should feel so
much more confident about the future...

> 
> So, I think the plan is just to merge these 3 patches during the
> current window.
> 

And they really should be!

Jarek P.
