qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH 13/13] iommu: Add a memory barrier to DMA RW function
Date: Thu, 17 May 2012 07:10:45 +1000	[thread overview]
Message-ID: <1337202645.30558.13.camel@pasglop> (raw)
In-Reply-To: <4FB4028F.7070003@codemonkey.ws>

On Wed, 2012-05-16 at 14:39 -0500, Anthony Liguori wrote:

> I must confess, I have no idea what PCI et al guarantee with respect to 
> ordering.  What's nasty about this patch is that you're not just ordering wrt 
> device writes/reads, but also with the other VCPUs.  I don't suspect this would 
> be prohibitively expensive but it still worries me.

So the precise ordering rules of various busses can vary slightly.

We could try to get as precise & fine grained as those busses are in HW
or ... it's my belief that it makes sense to simply guarantee that the
DMA accesses done by emulated devices always appear to other VCPUs in
the order they were done by the device emulation code.

IE. If we can prove that the cost of doing so is negligible, then it's
also the simplest approach since just sticking that one barrier here
will provide that ordering guarantee (at least for anything using the
dma_* accessors).

Ordering problems can be really sneaky & nasty to debug and so I'm
really tempted to use that big hammer approach here, provided there is
no problematic performance loss.

> > So by putting the barrier right in the dma_* accessor we kill all the
> > birds with one stone without having to audit all drivers for use of the
> > right accessors and all bus types.
> >
> > Also while the goal of using more targeted barriers might be worthwhile
> > in the long run, it's not totally trivial because we do want to order
> > store vs. subsequent loads in all cases and load vs. loads, and we don't
> > want to have to keep track of what the previous access was, so at this
> > stage it's simply easier to just use a full barrier.
> >
> > So my suggestion is to see if that patch introduces a measurable
> > performance regression anywhere we care about (ie on x86) and if not,
> > just go for it, it will solve a very real problem and we can ponder ways
> > to do it better as a second step if it's worthwhile.
> >
> > Anthony, how do you usually benchmark these things ? Any chance you can
> > run a few tests to see if there's any visible loss ?
> 
> My concern would really be limited to virtio ring processing. It all depends on 
> where you place the barriers in the end.

So virtio doesn't use the dma_* interface since it bypasses the iommu
(on purpose).

> I really don't want to just conservatively stick barriers everywhere either. 
> I'd like to have a specific ordering guarantee and then implement that and deal 
> with the performance consequences.

Well, my idea is to provide a well defined ordering semantic of all DMA
accesses issued by a device :-) IE. All DMAs done by the device
emulation appear to other VCPUs in the order they were issued by the
emulation code. IE. Making the storage accesses visible in the right
order to "other VCPUs" is the whole point of the exercise.

This is well defined, though a bit broad and possibly broader than
strictly necessary but it's a cost/benefit game here. If the cost is low
enough, the benefit is that it's going to be safe, we won't have subtle
cases of things passing each other etc... and it's also simpler to
implement and maintain since it's basically one barrier in the right
place.

I have a long experience with dealing with ordering issues on large SMP
systems and believe me, anything "fine grained" is really really hard to
generally get right, and the resulting bugs are really nasty to track
down and even identify. So I have a strong bias toward the big hammer
approach that is guaranteed to avoid the problem for anything using the
right DMA accessors.

> I also wonder if the "fix" that you see from this is papering around a bigger 
> problem.  Can you explain the ohci problem that led you to do this in the first 
> place?

Well, we did an audit of OHCI and we discovered several bugs there which
have been fixed since then, mostly cases where the emulated device would
incorrectly read/modify/write entire data structures in guest memory
rather than just updating the fields it's supposed to update, causing
simultaneous updates of other fields by the guest driver to be lost.

The result was that we still had an occasional mild instability where
every now and then the host would seem to get errors or miss
completions, which the barrier appeared to fix.

On the other hand, we -know- that not having the barrier is incorrect so
that was enough for me to be happy about the diagnosis.

IE. The OHCI -will- update fields that must be visible in the right
order by the host driver (such as a link pointer in a TD followed by the
done list pointer pointing to that TD) and we know that POWER cpus are
very good at shooting stores out of order, so missing TDs on completion
being one of our symptoms, I think we pretty much nailed it.

I'm going to try on some fast x86 using things like AHCI to see if I can
show a performance issue.

Cheers,
Ben.

> Regards,
> 
> Anthony Liguori
> 
> >
> > Cheers,
> > Ben.
> >
> >
> >

  reply	other threads:[~2012-05-16 21:11 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-10  4:48 [Qemu-devel] [PATCH 00/13] IOMMU infrastructure Benjamin Herrenschmidt
2012-05-10  4:48 ` [Qemu-devel] [PATCH 01/13] Better support for dma_addr_t variables Benjamin Herrenschmidt
2012-05-10  4:48 ` [Qemu-devel] [PATCH 02/13] Implement cpu_physical_memory_zero() Benjamin Herrenschmidt
2012-05-15  0:42   ` Anthony Liguori
2012-05-15  1:23     ` David Gibson
2012-05-15  2:03       ` Anthony Liguori
2012-05-10  4:48 ` [Qemu-devel] [PATCH 03/13] iommu: Add universal DMA helper functions Benjamin Herrenschmidt
2012-05-10  4:48 ` [Qemu-devel] [PATCH 04/13] usb-ohci: Use " Benjamin Herrenschmidt
2012-05-10  4:48 ` [Qemu-devel] [PATCH 05/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 06/13] ide/ahci: Use universal DMA helper functions Benjamin Herrenschmidt
2012-05-21  1:51   ` [Qemu-devel] [PATCH 06/13 - UPDATED] " Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 07/13] usb: Convert usb_packet_{map, unmap} to universal DMA helpers Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 08/13] iommu: Introduce IOMMU emulation infrastructure Benjamin Herrenschmidt
2012-05-15  0:49   ` Anthony Liguori
2012-05-15  1:42     ` David Gibson
2012-05-15  2:03       ` Anthony Liguori
2012-05-15  2:32         ` Benjamin Herrenschmidt
2012-05-15  2:50           ` Anthony Liguori
2012-05-15  3:02             ` Benjamin Herrenschmidt
2012-05-15 14:02               ` Anthony Liguori
2012-05-15 21:55                 ` Benjamin Herrenschmidt
2012-05-15 22:02                   ` Anthony Liguori
2012-05-15 23:08                     ` Benjamin Herrenschmidt
2012-05-15 23:58                       ` Anthony Liguori
2012-05-16  0:41                         ` Benjamin Herrenschmidt
2012-05-16  0:54                           ` Anthony Liguori
2012-05-16  1:20                             ` Benjamin Herrenschmidt
2012-05-16 19:36                               ` Anthony Liguori
2012-05-10  4:49 ` [Qemu-devel] [PATCH 09/13] iommu: Add facility to cancel in-use dma memory maps Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 10/13] pseries: Convert sPAPR TCEs to use generic IOMMU infrastructure Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 11/13] iommu: Allow PCI to use " Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 12/13] pseries: Implement IOMMU and DMA for PAPR PCI devices Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 13/13] iommu: Add a memory barrier to DMA RW function Benjamin Herrenschmidt
2012-05-15  0:52   ` Anthony Liguori
2012-05-15  1:11     ` Benjamin Herrenschmidt
2012-05-15  1:44     ` David Gibson
2012-05-16  4:35       ` Benjamin Herrenschmidt
2012-05-16  5:51         ` David Gibson
2012-05-16 19:39         ` Anthony Liguori
2012-05-16 21:10           ` Benjamin Herrenschmidt [this message]
2012-05-16 21:12             ` Benjamin Herrenschmidt
2012-05-17  0:07           ` Benjamin Herrenschmidt
2012-05-17  0:24             ` Benjamin Herrenschmidt
2012-05-17  0:52               ` [Qemu-devel] [RFC/PATCH] Add a memory barrier to guest memory access functions Benjamin Herrenschmidt
2012-05-17  2:28                 ` Anthony Liguori
2012-05-17  2:44                   ` Benjamin Herrenschmidt
2012-05-17 22:09                     ` Anthony Liguori
2012-05-18  1:04                       ` David Gibson
2012-05-18  1:16                       ` Benjamin Herrenschmidt
2012-05-17  3:35                   ` David Gibson
2012-05-18  6:53               ` [Qemu-devel] [PATCH 13/13] iommu: Add a memory barrier to DMA RW function Paolo Bonzini
2012-05-18  8:18                 ` Benjamin Herrenschmidt
2012-05-18  8:57                   ` Paolo Bonzini
2012-05-18 22:26                     ` Benjamin Herrenschmidt
2012-05-19  7:24                       ` Paolo Bonzini
2012-05-20 21:36                         ` Benjamin Herrenschmidt
2012-05-21  1:56                           ` [Qemu-devel] [PATCH] Add a memory barrier to guest memory access functions Benjamin Herrenschmidt
2012-05-21  8:11                             ` Paolo Bonzini
2012-05-21  8:31                               ` Michael S. Tsirkin
2012-05-21  8:58                                 ` Benjamin Herrenschmidt
2012-05-21  9:07                                   ` Benjamin Herrenschmidt
2012-05-21  9:16                                     ` Benjamin Herrenschmidt
2012-05-21  9:34                                       ` Michael S. Tsirkin
2012-05-21  9:53                                         ` Benjamin Herrenschmidt
2012-05-21 10:31                                           ` Michael S. Tsirkin
2012-05-21 11:45                                             ` Benjamin Herrenschmidt
2012-05-21 12:18                                               ` Michael S. Tsirkin
2012-05-21 15:16                                                 ` Paolo Bonzini
2012-05-21 21:58                                                 ` [Qemu-devel] [PATCH] Add a memory barrier to guest memory access function Benjamin Herrenschmidt
2012-05-21 22:22                                                   ` Michael S. Tsirkin
2012-05-21 22:56                                                     ` Benjamin Herrenschmidt
2012-05-22  5:11                                                       ` Michael S. Tsirkin
2012-05-22  0:00                                                     ` Benjamin Herrenschmidt
2012-05-22  4:19                                                   ` Rusty Russell
2012-05-21 22:18                                 ` [Qemu-devel] [PATCH] Add a memory barrier to guest memory access functions Anthony Liguori
2012-05-21 22:26                                   ` Benjamin Herrenschmidt
2012-05-21 22:31                                     ` Anthony Liguori
2012-05-21 22:44                                       ` Michael S. Tsirkin
2012-05-21 23:02                                         ` Benjamin Herrenschmidt
2012-05-22  4:34                                         ` [Qemu-devel] [PATCH] Add a memory barrier to DMA functions Benjamin Herrenschmidt
2012-05-22  4:51                                           ` Benjamin Herrenschmidt
2012-05-22  7:17                                           ` Benjamin Herrenschmidt
2012-05-22 11:14                                             ` Michael S. Tsirkin
2012-05-22 11:41                                               ` Benjamin Herrenschmidt
2012-05-22 12:03                                                 ` Michael S. Tsirkin
2012-05-22 21:24                                                   ` Benjamin Herrenschmidt
2012-05-22 21:40                                                   ` Anthony Liguori
2012-05-21 22:37                                   ` [Qemu-devel] [PATCH] Add a memory barrier to guest memory access functions Michael S. Tsirkin
2012-05-15  0:52 ` [Qemu-devel] [PATCH 00/13] IOMMU infrastructure Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1337202645.30558.13.camel@pasglop \
    --to=benh@kernel.crashing.org \
    --cc=anthony@codemonkey.ws \
    --cc=david@gibson.dropbear.id.au \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).