qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Richard Henderson <rth@twiddle.net>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org,
	Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Subject: Re: [Qemu-devel] [PATCH 08/13] iommu: Introduce IOMMU emulation infrastructure
Date: Wed, 16 May 2012 07:55:43 +1000	[thread overview]
Message-ID: <1337118943.6727.93.camel@pasglop> (raw)
In-Reply-To: <4FB26212.5050409@codemonkey.ws>

On Tue, 2012-05-15 at 09:02 -0500, Anthony Liguori wrote:

> I think the crux of your argument is that upon a change to the translation 
> table, the operation acts as a barrier such that the exact moment it returns, 
> you're guaranteed that no DMAs are in flight with the old translation mapping.

Not when the translation is changed in memory but whenever the
translation cache are invalidated or whatever other mechanism the HW
provides to do that synchronization. On PAPR, this guarantee is provided
by the H_PUT_TCE hypervisor call which we use to manipulate
translations.

[ Note that for performance reasons, it might end up being very
impractical to provide that guarantee since it prevents us from handling
H_PUT_TCE entirely in kernel real mode like we to today... we'll have to
figure our what we want to do here for the TCE backend implementation,
maybe have qemu mark "in use" translations and cause exist when those
are modified ... ]

> That's not my understanding of at least VT-d and I have a hard time believing 
> it's true for other IOMMUs as that kind of synchronization seems like it would 
> be very expensive to implement in hardware.

How so ? It's perfectly standard stuff ... it's usually part of the TLB
flushing op.

> Rather, when the IOTLB is flushed, I believe the only guarantee that you have is 
> that future IOTLB lookups will return the new mapping.  But that doesn't mean 
> that there isn't a request in flight that uses the old mapping.

I would be very surprised if that was the case :-)

I don't think any sane HW implementation would fail to provide full
synchronization with invalidations. That's how MMUs operate and I don't
see any reason why an iommu shouldn't be held to the same standards.

If it didn't, you'd have a nice host attack... have a guest doing
pass-through start a very long transaction and immediately commit
suicide. KVM starts reclaiming the pages, they go back to the host,
might be re-used immediately ... while still being DMAed to.

> I will grant you that PCI transactions are typically much smaller than QEMU 
> transactions such that we may continue to use the old mappings for much longer 
> than real hardware would.  But I think that still puts us well within the realm 
> of correctness.

No, a "random amount of time after invalidation" is not and will never
be correct. On large SMP machines, the time between a page being freed
and that page being re-used can be very small. The memory being re-used
by something like kexec can happen almost immediately while qemu is
blocked on an AIO that takes milliseconds ... etc....

At least because this is emulated iommu, qemu only writes to virtual
addresses mapping the guest space, so this isn't a host attack (unlike
with a real HW iommu however where the lack of such synchronization
would definitely be, as I described earlier).

> > Cases where that matter are unloading of a (broken) driver, kexec/kdump
> > from one guest to another etc... all involve potentially clearing all
> > iommu tables while a driver might have left a device DMA'ing. The
> > expectation is that the device will get target aborts from the iommu
> > until the situation gets "cleaned up" in SW.
> 
> Yes, this would be worse in QEMU than on bare metal because we essentially have 
> a much larger translation TLB.  But as I said above, I think we're well within 
> the specified behavior here.

No :-)

> >> Why does this need to be guaranteed?  How can software depend on this in a
> >> meaningful way?
> >
> > The same as TLB invalidations :-)
> >
> > In real HW, this is a property of the HW itself, ie, whatever MMIO is
> > used to invalidate the HW TLB provides a way to ensure (usually by
> > reading back) that any request pending in the iommu pipeline has either
> > been completed or canned.
> 
> Can you point to a spec that says this?  This doesn't match my understanding.

Appart from common sense ? I'd have to dig to get you actual specs but
it should be plain obvious that you need that sort of sync or you simply
cannot trust your iommu to do virtualization.

> > When we start having page fault capable iommu's this will be even more
> > important as faults will be be part of the non-error case.
> 
> We can revisit this discussion after every PCI device is changed to cope with a 
> page fault capable IOMMU ;-)

Heh, well, the point is that is still part of the base iommu model, page
faulting is just going to make the problem worse.

> >>> David's approach may not be the best long term, but provided it's not
> >>> totally broken (I don't know qemu locking well enough to judge how
> >>> dangerous it is) then it might be a "good enough" first step until we
> >>> come up with something better ?
> >>
> >> No, it's definitely not good enough.  Dropping the global mutex in random places
> >> is asking for worlds of hurt.
> >>
> >> If this is really important, then we need some sort of cancellation API to go
> >> along with map/unmap although I doubt that's really possible.
> >>
> >> MMIO/PIO operations cannot block.
> >
> > Well, there's a truckload of cases in real HW where an MMIO/PIO read is
> > used to synchronize some sort of HW operation.... I suppose nothing that
> > involves blocking at this stage in qemu but I would be careful with your
> > expectations here... writes are usually pipelined but blocking on a read
> > response does make a lot of sense.
> 
> Blocking on an MMIO/PIO request effectively freezes a CPU.  All sorts of badness 
> results from that.  Best case scenario, you trigger soft lockup warnings.

Well, that's exactly what happens in HW on PIO accesses and MMIO reads
waiting for a reply...

> > In any case, for the problem at hand, I can just drop the wait for now
> > and maybe just print a warning if I see an existing map.
> >
> > We still need some kind of either locking or barrier to simply ensure
> > that the updates to the TCE table are visible to other processors but
> > that can be done in the backend.
> >
> > But I wouldn't just forget about the issue, it's going to come back and
> > bite...
> 
> I think working out the exact semantics of what we need to do is absolutely 
> important.  But I think you're taking an overly conservative approach to what we 
> need to provide here.

I'm happy to have the patches merged without that for now, it will get
us going with USB emulation etc... which we need for graphics, but we do
need to sort this out eventually.

I'll re-submit without it.

Cheers,
Ben.

> Regards,
> 
> Anthony Liguori
> 
> >
> > Cheers,
> > Ben.
> >
> >> Regards,
> >>
> >> Anthony Liguori
> >>
> >>>
> >>> The normal case will be that no map exist, ie, it will almost always be
> >>> a guest programming error to remove an iommu mapping while a device is
> >>> actively using it, so having this case be slow is probably a non-issue.
> >>>
> >>> Cheers,
> >>> Ben.
> >>>
> >>>
> >
> >

  reply	other threads:[~2012-05-15 21:56 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-10  4:48 [Qemu-devel] [PATCH 00/13] IOMMU infrastructure Benjamin Herrenschmidt
2012-05-10  4:48 ` [Qemu-devel] [PATCH 01/13] Better support for dma_addr_t variables Benjamin Herrenschmidt
2012-05-10  4:48 ` [Qemu-devel] [PATCH 02/13] Implement cpu_physical_memory_zero() Benjamin Herrenschmidt
2012-05-15  0:42   ` Anthony Liguori
2012-05-15  1:23     ` David Gibson
2012-05-15  2:03       ` Anthony Liguori
2012-05-10  4:48 ` [Qemu-devel] [PATCH 03/13] iommu: Add universal DMA helper functions Benjamin Herrenschmidt
2012-05-10  4:48 ` [Qemu-devel] [PATCH 04/13] usb-ohci: Use " Benjamin Herrenschmidt
2012-05-10  4:48 ` [Qemu-devel] [PATCH 05/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 06/13] ide/ahci: Use universal DMA helper functions Benjamin Herrenschmidt
2012-05-21  1:51   ` [Qemu-devel] [PATCH 06/13 - UPDATED] " Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 07/13] usb: Convert usb_packet_{map, unmap} to universal DMA helpers Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 08/13] iommu: Introduce IOMMU emulation infrastructure Benjamin Herrenschmidt
2012-05-15  0:49   ` Anthony Liguori
2012-05-15  1:42     ` David Gibson
2012-05-15  2:03       ` Anthony Liguori
2012-05-15  2:32         ` Benjamin Herrenschmidt
2012-05-15  2:50           ` Anthony Liguori
2012-05-15  3:02             ` Benjamin Herrenschmidt
2012-05-15 14:02               ` Anthony Liguori
2012-05-15 21:55                 ` Benjamin Herrenschmidt [this message]
2012-05-15 22:02                   ` Anthony Liguori
2012-05-15 23:08                     ` Benjamin Herrenschmidt
2012-05-15 23:58                       ` Anthony Liguori
2012-05-16  0:41                         ` Benjamin Herrenschmidt
2012-05-16  0:54                           ` Anthony Liguori
2012-05-16  1:20                             ` Benjamin Herrenschmidt
2012-05-16 19:36                               ` Anthony Liguori
2012-05-10  4:49 ` [Qemu-devel] [PATCH 09/13] iommu: Add facility to cancel in-use dma memory maps Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 10/13] pseries: Convert sPAPR TCEs to use generic IOMMU infrastructure Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 11/13] iommu: Allow PCI to use " Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 12/13] pseries: Implement IOMMU and DMA for PAPR PCI devices Benjamin Herrenschmidt
2012-05-10  4:49 ` [Qemu-devel] [PATCH 13/13] iommu: Add a memory barrier to DMA RW function Benjamin Herrenschmidt
2012-05-15  0:52   ` Anthony Liguori
2012-05-15  1:11     ` Benjamin Herrenschmidt
2012-05-15  1:44     ` David Gibson
2012-05-16  4:35       ` Benjamin Herrenschmidt
2012-05-16  5:51         ` David Gibson
2012-05-16 19:39         ` Anthony Liguori
2012-05-16 21:10           ` Benjamin Herrenschmidt
2012-05-16 21:12             ` Benjamin Herrenschmidt
2012-05-17  0:07           ` Benjamin Herrenschmidt
2012-05-17  0:24             ` Benjamin Herrenschmidt
2012-05-17  0:52               ` [Qemu-devel] [RFC/PATCH] Add a memory barrier to guest memory access functions Benjamin Herrenschmidt
2012-05-17  2:28                 ` Anthony Liguori
2012-05-17  2:44                   ` Benjamin Herrenschmidt
2012-05-17 22:09                     ` Anthony Liguori
2012-05-18  1:04                       ` David Gibson
2012-05-18  1:16                       ` Benjamin Herrenschmidt
2012-05-17  3:35                   ` David Gibson
2012-05-18  6:53               ` [Qemu-devel] [PATCH 13/13] iommu: Add a memory barrier to DMA RW function Paolo Bonzini
2012-05-18  8:18                 ` Benjamin Herrenschmidt
2012-05-18  8:57                   ` Paolo Bonzini
2012-05-18 22:26                     ` Benjamin Herrenschmidt
2012-05-19  7:24                       ` Paolo Bonzini
2012-05-20 21:36                         ` Benjamin Herrenschmidt
2012-05-21  1:56                           ` [Qemu-devel] [PATCH] Add a memory barrier to guest memory access functions Benjamin Herrenschmidt
2012-05-21  8:11                             ` Paolo Bonzini
2012-05-21  8:31                               ` Michael S. Tsirkin
2012-05-21  8:58                                 ` Benjamin Herrenschmidt
2012-05-21  9:07                                   ` Benjamin Herrenschmidt
2012-05-21  9:16                                     ` Benjamin Herrenschmidt
2012-05-21  9:34                                       ` Michael S. Tsirkin
2012-05-21  9:53                                         ` Benjamin Herrenschmidt
2012-05-21 10:31                                           ` Michael S. Tsirkin
2012-05-21 11:45                                             ` Benjamin Herrenschmidt
2012-05-21 12:18                                               ` Michael S. Tsirkin
2012-05-21 15:16                                                 ` Paolo Bonzini
2012-05-21 21:58                                                 ` [Qemu-devel] [PATCH] Add a memory barrier to guest memory access function Benjamin Herrenschmidt
2012-05-21 22:22                                                   ` Michael S. Tsirkin
2012-05-21 22:56                                                     ` Benjamin Herrenschmidt
2012-05-22  5:11                                                       ` Michael S. Tsirkin
2012-05-22  0:00                                                     ` Benjamin Herrenschmidt
2012-05-22  4:19                                                   ` Rusty Russell
2012-05-21 22:18                                 ` [Qemu-devel] [PATCH] Add a memory barrier to guest memory access functions Anthony Liguori
2012-05-21 22:26                                   ` Benjamin Herrenschmidt
2012-05-21 22:31                                     ` Anthony Liguori
2012-05-21 22:44                                       ` Michael S. Tsirkin
2012-05-21 23:02                                         ` Benjamin Herrenschmidt
2012-05-22  4:34                                         ` [Qemu-devel] [PATCH] Add a memory barrier to DMA functions Benjamin Herrenschmidt
2012-05-22  4:51                                           ` Benjamin Herrenschmidt
2012-05-22  7:17                                           ` Benjamin Herrenschmidt
2012-05-22 11:14                                             ` Michael S. Tsirkin
2012-05-22 11:41                                               ` Benjamin Herrenschmidt
2012-05-22 12:03                                                 ` Michael S. Tsirkin
2012-05-22 21:24                                                   ` Benjamin Herrenschmidt
2012-05-22 21:40                                                   ` Anthony Liguori
2012-05-21 22:37                                   ` [Qemu-devel] [PATCH] Add a memory barrier to guest memory access functions Michael S. Tsirkin
2012-05-15  0:52 ` [Qemu-devel] [PATCH 00/13] IOMMU infrastructure Anthony Liguori
  -- strict thread matches above, loose matches on Subject: below --
2012-06-19  6:39 [Qemu-devel] [PATCH 00/13] iommu series Benjamin Herrenschmidt
2012-06-19  6:39 ` [Qemu-devel] [PATCH 08/13] iommu: Introduce IOMMU emulation infrastructure Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1337118943.6727.93.camel@pasglop \
    --to=benh@kernel.crashing.org \
    --cc=alex.williamson@redhat.com \
    --cc=anthony@codemonkey.ws \
    --cc=eduard.munteanu@linux360.ro \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).