qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: qemu-devel@nongnu.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH 09/13] iommu: Add facility to cancel in-use dma memory maps
Date: Thu, 21 Jun 2012 07:52:53 +1000	[thread overview]
Message-ID: <1340229173.28143.194.camel@pasglop> (raw)
In-Reply-To: <4FE23FAC.80007@codemonkey.ws>

On Wed, 2012-06-20 at 16:25 -0500, Anthony Liguori wrote:

> So this cancellation stuff is hopelessly broken
> 
> It's simply not possible to fully cancel pending DMA in a synchronous callback.

Well, at least for PAPR H_PUT_TCE, cancellation must be synchronous, ie
the hypercall must not return until the cancellation is complete.

> Indeed, bdrv_aio_cancel ends up having a nasty little loop in it:
> 
>      if (active) {
>          /* fail safe: if the aio could not be canceled, we wait for
>             it */
>          while (qemu_paio_error(acb) == EINPROGRESS)
>              ;
>      }
> 
> That spins w/100% CPU.
> 
> Can you explain when DMA cancellation really happens and what the effect would 
> be if we simply ignored it?

It will almost never happen in practice. It will actually never happen
with the current patch series. Where it -will- eventually happen in the
long run is if the guest removes a translation that is "in use" by a
dma_map() mapping established by a device. It's always a guest
programming error though and it's not an attack vector since the guest
can only shoot itself in the foot, but it might make things like kdump
less reliable inside the guest.

We need a way to signal the device that the translation is going away
and we need -a- way to synchronize though it could be a two step process
(see below).

> Can we do something more clever like use an asynchronous callback to handle 
> flushing active DMA mappings?
> 
> There's just no way a callback like this is going to work.

Ok so first let's see what happens in real HW: One of the DMA accesses
gets a target abort return from the host bridge. The device interrupts
it's operations and signals an error.

Now, I agree that requiring a cancel callback to act synchronously might
be a bit fishy, so what about we define the following semantics:

 - First this assumes our iommu backend decides to implement that level
of correctness, as I said above, none do in that patch series (yet)

 - The basic idea is that for most iommu, there's an MMIO to start a TLB
flush and an MMIO the guest uses to spin on to get the status as to
whether the TLB flush has completed, so we can do things asynchronously
that way. However we -still- need to do things synchronously for the
hypercall used by PAPR, but as we discussed earlier that can be done
without spinning, by delaying the completion of the hypercall.

 - So step 1, no callback at all. When an iommu TLB flush operation is
started, we tag all pending maps (see below). We signal completion when
all those maps have been unmapped.

 - The above tagging can be done using some kind of generation count
along with an ordered list of maps, we keep track of the "oldest" map
still active, that sort of thing. Not too hard.

 - step 2, because some maps can be long lived and we don't want to
delay invalidations for ever, we add a cancel callback which device can
optionally installed along with a map. This callback is only meant to
-initiate- a cancellation in order to speed up when the unmap will
occur.

What do you think ? Would that work ? As I explained in my email
exchange with Gerd, there's quite a few issues in actually implementing
cancellation properly anyway, for example, today on PAPR, H_PUT_TCE is
implemented by the kernel KVM in real mode for performance reasons. So
we would need to change the KVM API to be able to keep the kernel
informed that there are maps covering portions of the iommu space (a
bitmap ?) to force exists to qemu when an invalidation collides with a
map for example.

Additionally, to be totally -correct-, we would need synchronization
with qemu to a much larger extent. IE. Any invalidation must also make
sure that anything that used a previous translation has completed, ie,
even the simple dma_ld/st ops must be synchronized in theory.

My conclusion is that the complexity of solving the problem is huge,
while the actual problem scope is close to non-existent. So I think we
can safely merge the series and ignore the issue for the time being.

Cheers,
Ben.

  reply	other threads:[~2012-06-20 21:53 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-06-19  6:39 [Qemu-devel] [PATCH 00/13] iommu series Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 01/13] Better support for dma_addr_t variables Benjamin Herrenschmidt
2012-06-20 21:14   ` Anthony Liguori
2012-06-20 21:29     ` Benjamin Herrenschmidt
2012-06-21  1:44       ` David Gibson
2012-06-20 22:26     ` Peter Maydell
2012-06-20 22:59       ` Anthony Liguori
2012-06-21  7:54         ` Peter Maydell
2012-06-22  1:58     ` Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 02/13] Implement cpu_physical_memory_set() Benjamin Herrenschmidt
2012-06-20 21:15   ` Anthony Liguori
2012-06-20 21:30     ` Benjamin Herrenschmidt
2012-06-20 21:37       ` Anthony Liguori
2012-06-21  1:45     ` David Gibson
2012-06-21  1:46       ` David Gibson
2012-06-21  2:50       ` Benjamin Herrenschmidt
2012-06-22  1:58     ` Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 03/13] iommu: Add universal DMA helper functions Benjamin Herrenschmidt
2012-06-20 21:16   ` Anthony Liguori
2012-06-20 21:32     ` Michael S. Tsirkin
2012-06-20 21:38       ` Anthony Liguori
2012-06-20 21:42         ` Michael S. Tsirkin
2012-06-20 21:46           ` Anthony Liguori
2012-06-20 22:00             ` Michael S. Tsirkin
2012-06-20 21:33     ` Benjamin Herrenschmidt
2012-06-20 21:40     ` Michael S. Tsirkin
2012-06-20 22:01       ` Anthony Liguori
2012-06-21  1:48     ` David Gibson
2012-06-22  2:02     ` Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 04/13] usb-ohci: Use " Benjamin Herrenschmidt
2012-06-20 21:18   ` Anthony Liguori
2012-06-20 21:36     ` Benjamin Herrenschmidt
2012-06-20 21:40       ` Anthony Liguori
2012-06-20 22:02         ` Benjamin Herrenschmidt
2012-06-21  7:33           ` Michael S. Tsirkin
2012-06-21 12:55             ` Anthony Liguori
2012-06-21 14:10               ` Michael S. Tsirkin
2012-06-22  2:28               ` Benjamin Herrenschmidt
2012-06-21  6:43         ` Gerd Hoffmann
2012-06-19  6:39 ` [Qemu-devel] [PATCH 05/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers Benjamin Herrenschmidt
2012-06-20 21:21   ` Anthony Liguori
2012-06-20 21:37     ` Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 06/13] ide/ahci: Use universal DMA helper functions Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 07/13] usb: Convert usb_packet_{map, unmap} to universal DMA helpers Benjamin Herrenschmidt
2012-06-19 13:42   ` Gerd Hoffmann
2012-06-19 20:23     ` Benjamin Herrenschmidt
2012-06-20  3:14       ` David Gibson
2012-06-20  3:52         ` Benjamin Herrenschmidt
2012-06-21  1:42           ` David Gibson
2012-06-20  6:25         ` Gerd Hoffmann
2012-06-20  9:25           ` Benjamin Herrenschmidt
2012-06-20  9:54             ` Gerd Hoffmann
2012-06-19  6:39 ` [Qemu-devel] [PATCH 08/13] iommu: Introduce IOMMU emulation infrastructure Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 09/13] iommu: Add facility to cancel in-use dma memory maps Benjamin Herrenschmidt
2012-06-20 21:25   ` Anthony Liguori
2012-06-20 21:52     ` Benjamin Herrenschmidt [this message]
2012-06-22  3:18     ` Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 10/13] pseries: Convert sPAPR TCEs to use generic IOMMU infrastructure Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 11/13] iommu: Allow PCI to use " Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 12/13] pseries: Implement IOMMU and DMA for PAPR PCI devices Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 13/13] Add a memory barrier to DMA functions Benjamin Herrenschmidt
2012-06-20 21:12 ` [Qemu-devel] [PATCH 00/13] iommu series Anthony Liguori
  -- strict thread matches above, loose matches on Subject: below --
2012-05-10  4:48 [Qemu-devel] [PATCH 00/13] IOMMU infrastructure Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 09/13] iommu: Add facility to cancel in-use dma memory maps Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1340229173.28143.194.camel@pasglop \
    --to=benh@kernel.crashing.org \
    --cc=anthony@codemonkey.ws \
    --cc=david@gibson.dropbear.id.au \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).