qemu-devel.nongnu.org archive mirror
From: David Gibson <dwg@au1.ibm.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: aliguori@us.ibm.com, aik@ozlabs.ru, rusty@rustcorp.com.au,
	qemu-devel@nongnu.org, agraf@suse.de,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] virtio: Make memory barriers be memory barriers
Date: Tue, 6 Sep 2011 13:12:24 +1000
Message-ID: <20110906031224.GH30278@yookeroo.fritz.box>
In-Reply-To: <20110905091945.GC16038@redhat.com>

On Mon, Sep 05, 2011 at 12:19:46PM +0300, Michael S. Tsirkin wrote:
> On Mon, Sep 05, 2011 at 02:43:16PM +1000, David Gibson wrote:
> > On Sun, Sep 04, 2011 at 12:16:43PM +0300, Michael S. Tsirkin wrote:
> > > On Sun, Sep 04, 2011 at 12:46:35AM +1000, David Gibson wrote:
> > > > On Fri, Sep 02, 2011 at 06:45:50PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Sep 01, 2011 at 04:31:09PM -0400, Paolo Bonzini wrote:
> > > > > > > > > Why not limit the change to ppc then?
> > > > > > > >
> > > > > > > > Because the bug is masked by the x86 memory model, but it is still
> > > > > > > > there even there conceptually. It is not really true that x86 does
> > > > > > > > not need memory barriers, though it doesn't in this case:
> > > > > > > >
> > > > > > > > http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
> > > > > > > >
> > > > > > > > Paolo
> > > > > > > 
> > > > > > > Right.
> > > > > > > To summarize, on x86 we probably want wmb and rmb to be compiler
> > > > > > > barrier only. Only mb might in theory need to be an mfence.
> > > > > > 
> > > > > > No, wmb needs to be sfence and rmb needs to be lfence.  GCC does
> > > > > > not provide those, so they should become __sync_synchronize() too,
> > > > > > or you should use inline assembly.
> > > > > > 
> > > > > > > But there might be reasons why that is not an issue either
> > > > > > > if we look closely enough.
> > > > > > 
> > > > > > Since the ring buffers are not using locked instructions (no xchg
> > > > > > or cmpxchg) the barriers simply must be there, even on x86.  Whether
> > > > > > it works in practice is not interesting, only the formal model is
> > > > > > interesting.
> > > > > > 
> > > > > > Paolo
> > > > > 
> > > > > Well, can you describe an issue in virtio that lfence/sfence help solve
> > > > > in terms of a memory model please?
> > > > > Pls note that guest uses smp_ variants for barriers.
> > > > 
> > > > Ok, so, I'm having a bit of trouble with the fact that I'm having to
> > > > argue the case that things the protocol requires to be memory
> > > > barriers actually *be* memory barriers on all platforms.
> > > > 
> > > > I mean argue for a richer set of barriers, with per-arch minimal
> > > > implementations instead of the large but portable hammer of
> > > > sync_synchronize, if you will.
> > > 
> > > That's what I'm saying really. On x86 the richer set of barriers
> > > need not insert code at all for both wmb and rmb macros. All we
> > > might need is an 'optimization barrier' - e.g. Linux does
> > >  __asm__ __volatile__("": : :"memory")
> > > ppc needs something like sync_synchronize there.
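To make the split being proposed here concrete, a minimal sketch (illustrative only, not QEMU's actual macros; assumes GCC and takes the thread's claims about x86 ordering as given):

```c
#include <assert.h>

/* On x86, stores are not reordered with other stores, nor loads with
 * other loads, so wmb/rmb need only stop the *compiler* from
 * reordering.  On other architectures, fall back to a full hardware
 * barrier via the GCC builtin. */
#if defined(__i386__) || defined(__x86_64__)
#define wmb()  __asm__ __volatile__("" ::: "memory")
#define rmb()  __asm__ __volatile__("" ::: "memory")
#else
#define wmb()  __sync_synchronize()
#define rmb()  __sync_synchronize()
#endif

/* Publish data, then a flag, with the store order enforced. */
int publish(int *data, int *flag)
{
    *data = 42;
    wmb();            /* data must be visible before the flag is set */
    *flag = 1;
    return *data;
}
```

The x86 branch costs nothing at run time; the fallback is the portable hammer discussed above.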
> > 
> > But you're approaching this the wrong way around - correctness should
> > come first.  That is, we should first ensure that there is a
> > sufficient memory barrier to satisfy the protocol.  Then, *if* there
> > is a measurable performance improvement and *if* we can show that a
> > weaker barrier is sufficient on a given platform, then we can whittle
> > it down to a lighter barrier.
> 
> You are only looking at ppc. But on x86 this code ships in
> production. So changes should be made in a way to reduce
> a potential for regressions, balancing risk versus potential benefit.
> I'm trying to point out a way to do this.

Oh, please.  Adding a stronger barrier has a minuscule chance of
breaking things.  And this in a project that has build-breaking
regressions with tedious frequency.

> > > > But just leaving them out on x86!?
> > > > Seriously, wtf?  Do you enjoy having software that works chiefly by
> > > > accident?
> > > 
> > > I'm surprised at the controversy too. People seem to argue, at the
> > > same time, that the x86 cpu does not reorder stores and that we
> > > need an sfence between stores to prevent the guest from seeing
> > > them out of order.
> > 
> > I don't know the x86 storage model well enough to definitively say
> > that the barrier is not necessary there - nor to say that it is
> > necessary.  All I know is that the x86 model is quite strongly
> > ordered, and I assume that is why the lack of barrier has not caused
> > an observed problem on x86.
> 
> Please review Documentation/memory-barriers.txt as one reference.
> Then look at how SMP barriers are implemented on various systems.
> In particular, note how it says 'Mandatory barriers should not be used
> to control SMP effects'.

No, again, correctness first; the onus of showing it's safe is on
those who want weaker barriers.

> > Again, correctness first.  sync_synchronize should be a sufficient
> > barrier for wmb() on all platforms.  If you really don't want it, the
> > onus is on you
> 
> Just for fun, I did a quick hack replacing all barriers with mb()
> in the userspace virtio test. This is on x86.
> 
> Before:
> [mst@tuck virtio]$ sudo time ./virtio_test 
> spurious wakeus: 0x1da
> 24.53user 14.63system 0:41.91elapsed 93%CPU (0avgtext+0avgdata
> 464maxresident)k
> 0inputs+0outputs (0major+154minor)pagefaults 0swaps
> 
> After:
> [mst@tuck virtio]$ sudo time ./virtio_test 
> spurious wakeus: 0x218
> 33.97user 6.22system 0:42.10elapsed 95%CPU (0avgtext+0avgdata
> 464maxresident)k
> 0inputs+0outputs (0major+154minor)pagefaults 0swaps
> 
> So user time went up significantly, as expected. Surprisingly, the
> kernel side started working more efficiently - surprising, since the
> kernel was not changed - with the net effect close to evening out.

Right.  So small overall performance impact, and that's on a dedicated
testcase which does nothing but the virtio protocol.  I *strongly*
suspect the extra cost of the memory barriers will be well and truly
lost in the rest of the overhead of the qemu networking code.

> So a risk of performance regressions from unnecessary fencing
> seems to be non-zero, assuming that time doesn't lie.
> This might be worth investigating, but I'm out of time right now.
> 
> 
> > to show that (a) it's safe to do so and
> > (b) it's actually worth it.
> 
> Worth what? I'm asking you to minimise disruption to other platforms
> while you fix ppc.

I'm not "fixing ppc".  I'm fixing a fundamental flaw in the protocol
implementation.  _So far_ I've only observed the effects on ppc, but
that doesn't mean they don't exist.
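
The ordering requirement at issue can be sketched like this (illustrative C only, not QEMU's code; the struct layout and ring size are made up for the example):

```c
#include <assert.h>

/* Why the barrier is part of the protocol itself: one side fills a
 * used-ring entry, then publishes it by bumping the index.  Without a
 * write barrier between those stores, a weakly ordered CPU (ppc, for
 * instance) may let the other side observe the new index before the
 * entry it points at. */
struct used_elem { unsigned id, len; };
struct used_ring { unsigned idx; struct used_elem ring[256]; };

static void wmb(void) { __sync_synchronize(); }

void push_used(struct used_ring *u, unsigned id, unsigned len)
{
    unsigned i = u->idx % 256;
    u->ring[i].id  = id;   /* 1: write the element ...              */
    u->ring[i].len = len;
    wmb();                 /* 2: ... force it to be visible ...     */
    u->idx = u->idx + 1;   /* 3: ... then publish the index.        */
}
```

On x86 the hardware happens to preserve the store order even without step 2; the protocol still requires it.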

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

