* [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
@ 2011-11-11 22:49 Alex Williamson
2011-11-12 0:17 ` Chris Wright
2011-11-12 0:37 ` David Woodhouse
0 siblings, 2 replies; 17+ messages in thread
From: Alex Williamson @ 2011-11-11 22:49 UTC (permalink / raw)
To: dwmw2; +Cc: iommu, linux-pci, linux-kernel, chrisw, ddutile, alex.williamson
domain_update_iommu_coherency() currently default to setting domains
as coherent when the domain is not attached to any iommus. This
allows for a window in domain_context_mapping_one() where such a
domain can update context entries non-coherenty, and only after
update the domain capability to clear iommu_coherency.
This can be seen using KVM device assignment on VT-d systems that
do not support coherency in the ecap register. When a device is
added to a guest, a domain is created (iommu_coherency = 0), the
device is attached, and ranges are mapped. If we then hot unplug
the device, the coherency is updated and set to the default (1)
since no iommus are attached to the domain. A subsequenct attach
of a device makes use of the same dmar domain (now marked coherent)
updates context entries with cohrency enabled, and only disables
coherency as the last step in the process.
To fix this, switch domain_update_iommu_coherency() to use the
safer, non-coherent default for domains not attached to iommus.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Donald Dutile <ddutile@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
---
drivers/iommu/intel-iommu.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index c0c7820..6b9d8c1 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -560,7 +560,9 @@ static void domain_update_iommu_coherency(struct dmar_domain *domain)
{
int i;
- domain->iommu_coherency = 1;
+ i = find_first_bit(&domain->iommu_bmp, g_num_of_iommus);
+
+ domain->iommu_coherency = i < g_num_of_iommus ? 1 : 0;
for_each_set_bit(i, &domain->iommu_bmp, g_num_of_iommus) {
if (!ecap_coherent(g_iommus[i]->ecap)) {
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-11 22:49 [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus Alex Williamson
@ 2011-11-12 0:17 ` Chris Wright
2011-11-12 0:37 ` David Woodhouse
1 sibling, 0 replies; 17+ messages in thread
From: Chris Wright @ 2011-11-12 0:17 UTC (permalink / raw)
To: Alex Williamson; +Cc: dwmw2, iommu, linux-pci, linux-kernel, chrisw, ddutile
* Alex Williamson (alex.williamson@redhat.com) wrote:
> domain_update_iommu_coherency() currently default to setting domains
> as coherent when the domain is not attached to any iommus. This
> allows for a window in domain_context_mapping_one() where such a
> domain can update context entries non-coherenty, and only after
> update the domain capability to clear iommu_coherency.
>
> This can be seen using KVM device assignment on VT-d systems that
> do not support coherency in the ecap register. When a device is
> added to a guest, a domain is created (iommu_coherency = 0), the
> device is attached, and ranges are mapped. If we then hot unplug
> the device, the coherency is updated and set to the default (1)
> since no iommus are attached to the domain. A subsequenct attach
> of a device makes use of the same dmar domain (now marked coherent)
> updates context entries with cohrency enabled, and only disables
> coherency as the last step in the process.
Specifically, in domain_context_mapping_one():
context_set_translation_type(context, translation);
context_set_fault_enable(context);
context_set_present(context);
domain_flush_cache(domain, context, sizeof(*context)); <-- noop
This does not update context entry, leaving IOMMU w/ a non-present
context entry, and therefore subsequent DMA generates DMAR faults:
DMAR:[DMA Read] Request device [08:10.0] fault addr 37298000
DMAR:[fault reason 02] Present bit in context entry is clear
> To fix this, switch domain_update_iommu_coherency() to use the
> safer, non-coherent default for domains not attached to iommus.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Tested-by: Donald Dutile <ddutile@redhat.com>
> Acked-by: Donald Dutile <ddutile@redhat.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
thanks,
-chris
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-11 22:49 [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus Alex Williamson
2011-11-12 0:17 ` Chris Wright
@ 2011-11-12 0:37 ` David Woodhouse
2011-11-12 0:45 ` Chris Wright
2011-11-12 0:46 ` Roland Dreier
1 sibling, 2 replies; 17+ messages in thread
From: David Woodhouse @ 2011-11-12 0:37 UTC (permalink / raw)
To: Alex Williamson; +Cc: iommu, linux-pci, linux-kernel, chrisw, ddutile
[-- Attachment #1: Type: text/plain, Size: 1366 bytes --]
On Fri, 2011-11-11 at 15:49 -0700, Alex Williamson wrote:
> To fix this, switch domain_update_iommu_coherency() to use the
> safer, non-coherent default for domains not attached to iommus.
That isn't a fix for the problem you described.
The problem is that changing a domain from coherent to non-coherent is
*broken*. It probably needs to flush the cache for the *entire* set of
page tables — not just the new context entry it adds.
You might have removed the *common* case where we trigger that bug, but
it certainly isn't a fix.
However, I'd be receptive to an argument that the situation you describe
is in fact the *only* time we'd have to switch from coherent to
non-coherent at run time, because the coherency is an all-or-nothing
characteristic of the chipset. Either all the IOMMUs are coherent, or
none of them, right? This brain-damage only affects the first chipsets
from before we worked out that cache incoherency was a *really* f*cking
stupid idea, doesn't it?
So if you were to ditch the whole idea of a per-domain runtime update,
and instead calculate a global value for 'iommu_coherency' at boot time,
by iterating over for_each_active_iommu()¹, I think that would be a
better way to deal with the issue. And you *could* really call that a
'fix'.
Make sense?
--
dwmw2
¹ just in *case* they ever differ
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5818 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:37 ` David Woodhouse
@ 2011-11-12 0:45 ` Chris Wright
2011-11-12 0:47 ` Chris Wright
2011-11-12 0:51 ` David Woodhouse
2011-11-12 0:46 ` Roland Dreier
1 sibling, 2 replies; 17+ messages in thread
From: Chris Wright @ 2011-11-12 0:45 UTC (permalink / raw)
To: David Woodhouse; +Cc: Alex Williamson, chrisw, linux-pci, iommu, linux-kernel
* David Woodhouse (dwmw2@infradead.org) wrote:
> On Fri, 2011-11-11 at 15:49 -0700, Alex Williamson wrote:
> > To fix this, switch domain_update_iommu_coherency() to use the
> > safer, non-coherent default for domains not attached to iommus.
>
> That isn't a fix for the problem you described.
>
> The problem is that changing a domain from coherent to non-coherent is
> *broken*. It probably needs to flush the cache for the *entire* set of
> page tables — not just the new context entry it adds.
For a guest domain, the page tables aren't actually changing.
And for the snoop mode change, we remap the pages.
> You might have removed the *common* case where we trigger that bug, but
> it certainly isn't a fix.
>
> However, I'd be receptive to an argument that the situation you describe
> is in fact the *only* time we'd have to switch from coherent to
> non-coherent at run time, because the coherency is an all-or-nothing
> characteristic of the chipset. Either all the IOMMUs are coherent, or
> none of them, right? This brain-damage only affects the first chipsets
> from before we worked out that cache incoherency was a *really* f*cking
> stupid idea, doesn't it?
Dunno if it exists going forward (I've stopped being surprised by the
brain damage in this area ;), but those machines are still out there.
> So if you were to ditch the whole idea of a per-domain runtime update,
> and instead calculate a global value for 'iommu_coherency' at boot time,
> by iterating over for_each_active_iommu()¹, I think that would be a
> better way to deal with the issue. And you *could* really call that a
> 'fix'.
>
> Make sense?
Ideally, yes. Not sure we can practically do it though. Would have to
be sure we force incoherent access mode for the busted hw.
thanks,
-chris
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:45 ` Chris Wright
@ 2011-11-12 0:47 ` Chris Wright
2011-11-12 0:51 ` David Woodhouse
1 sibling, 0 replies; 17+ messages in thread
From: Chris Wright @ 2011-11-12 0:47 UTC (permalink / raw)
To: Chris Wright; +Cc: David Woodhouse, linux-pci, iommu, linux-kernel
* Chris Wright (chrisw@sous-sol.org) wrote:
> * David Woodhouse (dwmw2@infradead.org) wrote:
> > So if you were to ditch the whole idea of a per-domain runtime update,
> > and instead calculate a global value for 'iommu_coherency' at boot time,
> > by iterating over for_each_active_iommu()¹, I think that would be a
> > better way to deal with the issue. And you *could* really call that a
> > 'fix'.
> >
> > Make sense?
>
> Ideally, yes. Not sure we can practically do it though. Would have to
> be sure we force incoherent access mode for the busted hw.
Forgot to mention, the simple patch may be the better choice for -stable
(assuming the page table mapping issue...isn't).
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:45 ` Chris Wright
2011-11-12 0:47 ` Chris Wright
@ 2011-11-12 0:51 ` David Woodhouse
2011-11-12 0:58 ` Chris Wright
1 sibling, 1 reply; 17+ messages in thread
From: David Woodhouse @ 2011-11-12 0:51 UTC (permalink / raw)
To: Chris Wright; +Cc: Alex Williamson, linux-pci, iommu, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 858 bytes --]
On Fri, 2011-11-11 at 16:45 -0800, Chris Wright wrote:
> > So if you were to ditch the whole idea of a per-domain runtime update,
> > and instead calculate a global value for 'iommu_coherency' at boot time,
> > by iterating over for_each_active_iommu()¹, I think that would be a
> > better way to deal with the issue. And you *could* really call that a
> > 'fix'.
> >
> > Make sense?
>
> Ideally, yes. Not sure we can practically do it though. Would have to
> be sure we force incoherent access mode for the busted hw.
Why's it not practical? You do the *same* loop we currently have in
domain_update_iommu_coherency(), except that you do it just *once*, over
all active IOMMUs, at boot time. And then you just use that result
forever more.
So if *any* IOMMU in the system is non-coherent, you run them all that
way.
--
dwmw2
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5818 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:51 ` David Woodhouse
@ 2011-11-12 0:58 ` Chris Wright
2011-11-12 1:03 ` David Woodhouse
0 siblings, 1 reply; 17+ messages in thread
From: Chris Wright @ 2011-11-12 0:58 UTC (permalink / raw)
To: David Woodhouse
Cc: Chris Wright, Alex Williamson, linux-pci, iommu, linux-kernel
* David Woodhouse (dwmw2@infradead.org) wrote:
> On Fri, 2011-11-11 at 16:45 -0800, Chris Wright wrote:
> > > So if you were to ditch the whole idea of a per-domain runtime update,
> > > and instead calculate a global value for 'iommu_coherency' at boot time,
> > > by iterating over for_each_active_iommu()¹, I think that would be a
> > > better way to deal with the issue. And you *could* really call that a
> > > 'fix'.
> > >
> > > Make sense?
> >
> > Ideally, yes. Not sure we can practically do it though. Would have to
> > be sure we force incoherent access mode for the busted hw.
>
> Why's it not practical? You do the *same* loop we currently have in
> domain_update_iommu_coherency(), except that you do it just *once*, over
> all active IOMMUs, at boot time. And then you just use that result
> forever more.
Yeah, you're right, that should be simple enough. What about snoop
control? Same thing...should we expect it to be system wide? Because
that one's exported out to KVM and used.
> So if *any* IOMMU in the system is non-coherent, you run them all that
> way.
Minus the measureable slowdown for some devices that were behind coherent
IOMMU (IIRC, the chipsets that had this issue were mobile anyway, so
not as performance sensitive), *nod* .
thanks,
-chris
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:58 ` Chris Wright
@ 2011-11-12 1:03 ` David Woodhouse
0 siblings, 0 replies; 17+ messages in thread
From: David Woodhouse @ 2011-11-12 1:03 UTC (permalink / raw)
To: Chris Wright; +Cc: Alex Williamson, linux-pci, iommu, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]
On Fri, 2011-11-11 at 16:58 -0800, Chris Wright wrote:
> * David Woodhouse (dwmw2@infradead.org) wrote:
> > On Fri, 2011-11-11 at 16:45 -0800, Chris Wright wrote:
> > > > So if you were to ditch the whole idea of a per-domain runtime update,
> > > > and instead calculate a global value for 'iommu_coherency' at boot time,
> > > > by iterating over for_each_active_iommu()¹, I think that would be a
> > > > better way to deal with the issue. And you *could* really call that a
> > > > 'fix'.
> > > >
> > > > Make sense?
> > >
> > > Ideally, yes. Not sure we can practically do it though. Would have to
> > > be sure we force incoherent access mode for the busted hw.
> >
> > Why's it not practical? You do the *same* loop we currently have in
> > domain_update_iommu_coherency(), except that you do it just *once*, over
> > all active IOMMUs, at boot time. And then you just use that result
> > forever more.
>
> Yeah, you're right, that should be simple enough. What about snoop
> control? Same thing...should we expect it to be system wide? Because
> that one's exported out to KVM and used.
Not sure. That might differ for the graphics IOMMU (large page support
certainly does). But I don't *think* the coherency does, does it?
> > So if *any* IOMMU in the system is non-coherent, you run them all that
> > way.
>
> Minus the measureable slowdown for some devices that were behind coherent
> IOMMU (IIRC, the chipsets that had this issue were mobile anyway, so
> not as performance sensitive), *nod* .
Are there really machines where only *some* of the IOMMUs are coherent?
I didn't think there were, which would mean there's no real performance
penalty on the hardware that actually exists today.
--
dwmw2
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5818 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:37 ` David Woodhouse
2011-11-12 0:45 ` Chris Wright
@ 2011-11-12 0:46 ` Roland Dreier
2011-11-12 0:51 ` Chris Wright
1 sibling, 1 reply; 17+ messages in thread
From: Roland Dreier @ 2011-11-12 0:46 UTC (permalink / raw)
To: David Woodhouse
Cc: Alex Williamson, iommu, linux-pci, linux-kernel, chrisw, ddutile
On Fri, Nov 11, 2011 at 4:37 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> This brain-damage only affects the first chipsets
> from before we worked out that cache incoherency was a *really* f*cking
> stupid idea, doesn't it?
As we talked about at KS, I have some Westmere EP (ie latest
and greatest server platform) systems where the BIOS exposes
an option that allows choosing VT-d coherency on or off, and
defaults it to "off".
What is the "official" Intel line on coherency with Westmere and
Tylersburg -- because as I also mentioned, I was seeing some
problems with VT-d and the default "coherency off" setting that
looked like the IOMMU HW is getting stale PTEs (ie a missing
or not working cache flush).
- R.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:46 ` Roland Dreier
@ 2011-11-12 0:51 ` Chris Wright
2011-11-12 0:55 ` David Woodhouse
0 siblings, 1 reply; 17+ messages in thread
From: Chris Wright @ 2011-11-12 0:51 UTC (permalink / raw)
To: Roland Dreier
Cc: David Woodhouse, Alex Williamson, iommu, linux-pci, linux-kernel,
chrisw, ddutile
* Roland Dreier (roland@purestorage.com) wrote:
> On Fri, Nov 11, 2011 at 4:37 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> > This brain-damage only affects the first chipsets
> > from before we worked out that cache incoherency was a *really* f*cking
> > stupid idea, doesn't it?
>
> As we talked about at KS, I have some Westmere EP (ie latest
> and greatest server platform) systems where the BIOS exposes
> an option that allows choosing VT-d coherency on or off, and
> defaults it to "off".
That's just more brain damage AFAICT. Esp. if you do performance
testing (and choose not to use passthrough mode)... have and it's
quite measurable. I switched default to on long time ago, w/out
issue.
> What is the "official" Intel line on coherency with Westmere and
> Tylersburg -- because as I also mentioned, I was seeing some
> problems with VT-d and the default "coherency off" setting that
> looked like the IOMMU HW is getting stale PTEs (ie a missing
> or not working cache flush).
That sounds like sw bugs more than official recommendation issue.
thanks,
-chris
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:51 ` Chris Wright
@ 2011-11-12 0:55 ` David Woodhouse
2011-11-12 1:08 ` Chris Wright
0 siblings, 1 reply; 17+ messages in thread
From: David Woodhouse @ 2011-11-12 0:55 UTC (permalink / raw)
To: Chris Wright
Cc: Roland Dreier, Alex Williamson, iommu, linux-pci, linux-kernel,
ddutile
[-- Attachment #1: Type: text/plain, Size: 1657 bytes --]
On Fri, 2011-11-11 at 16:51 -0800, Chris Wright wrote:
> * Roland Dreier (roland@purestorage.com) wrote:
> > On Fri, Nov 11, 2011 at 4:37 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> > > This brain-damage only affects the first chipsets
> > > from before we worked out that cache incoherency was a *really* f*cking
> > > stupid idea, doesn't it?
> >
> > As we talked about at KS, I have some Westmere EP (ie latest
> > and greatest server platform) systems where the BIOS exposes
> > an option that allows choosing VT-d coherency on or off, and
> > defaults it to "off".
>
> That's just more brain damage AFAICT. Esp. if you do performance
> testing (and choose not to use passthrough mode)... have and it's
> quite measurable. I switched default to on long time ago, w/out
> issue.
>
> > What is the "official" Intel line on coherency with Westmere and
> > Tylersburg -- because as I also mentioned, I was seeing some
> > problems with VT-d and the default "coherency off" setting that
> > looked like the IOMMU HW is getting stale PTEs (ie a missing
> > or not working cache flush).
>
> That sounds like sw bugs more than official recommendation issue.
Although the cache-flushing has been tested on the original chipsets
fairly well, and it's one of the parts I've mostly rewritten when doing
performance work since I inherited the code, so that might not be my
first suspicion. I would be more inclined to suspect that there's some
chipset buffering that we aren't correctly flushing (which might in
itself be a hardware issue, since the way to flush the cache is supposed
to be well-defined).
--
dwmw2
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5818 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 0:55 ` David Woodhouse
@ 2011-11-12 1:08 ` Chris Wright
2011-11-12 1:20 ` David Woodhouse
2011-11-12 1:30 ` Roland Dreier
0 siblings, 2 replies; 17+ messages in thread
From: Chris Wright @ 2011-11-12 1:08 UTC (permalink / raw)
To: David Woodhouse
Cc: Chris Wright, Roland Dreier, Alex Williamson, iommu, linux-pci,
linux-kernel, ddutile
* David Woodhouse (dwmw2@infradead.org) wrote:
> On Fri, 2011-11-11 at 16:51 -0800, Chris Wright wrote:
> > * Roland Dreier (roland@purestorage.com) wrote:
> > > On Fri, Nov 11, 2011 at 4:37 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> > > > This brain-damage only affects the first chipsets
> > > > from before we worked out that cache incoherency was a *really* f*cking
> > > > stupid idea, doesn't it?
> > >
> > > As we talked about at KS, I have some Westmere EP (ie latest
> > > and greatest server platform) systems where the BIOS exposes
> > > an option that allows choosing VT-d coherency on or off, and
> > > defaults it to "off".
> >
> > That's just more brain damage AFAICT. Esp. if you do performance
> > testing (and choose not to use passthrough mode)... have and it's
> > quite measurable. I switched default to on long time ago, w/out
> > issue.
> >
> > > What is the "official" Intel line on coherency with Westmere and
> > > Tylersburg -- because as I also mentioned, I was seeing some
> > > problems with VT-d and the default "coherency off" setting that
> > > looked like the IOMMU HW is getting stale PTEs (ie a missing
> > > or not working cache flush).
> >
> > That sounds like sw bugs more than official recommendation issue.
>
> Although the cache-flushing has been tested on the original chipsets
> fairly well, and it's one of the parts I've mostly rewritten when doing
> performance work since I inherited the code, so that might not be my
> first suspicion.
All the stale PTE issues I've encountered in the past have turned into
fixed sw bugs (perhaps it's since been fixed?). Also, I thought with
Coherency On/Off it's only effecting the use of clflush, not IOTLB or
Context Entry cache flushing (invalidations).
On a slightly separate, but performance related note...have you ever
tried using the hw queue? Currently we only have a sw queue, but the
submission path for invalidations doesn't really queue (unless I missed
it). It seems to pull from the software queue and submit/wait,
submit/wait...Seems simple enough to submit the whole queue and then
issue the wait.
This would be a huge win if we ever have an emulated IOMMU. We could
make the sw queue bigger, and allocate more than a single page for the
hw queue. We'd only exit on running the queue rather than every
invalidation.
> I would be more inclined to suspect that there's some
> chipset buffering that we aren't correctly flushing (which might in
> itself be a hardware issue, since the way to flush the cache is supposed
> to be well-defined).
Roland, have you tried switching BIOS to Coherency On and can do you
ever see stale PTEs?
thanks,
-chris
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 1:08 ` Chris Wright
@ 2011-11-12 1:20 ` David Woodhouse
2011-11-15 7:12 ` cody
2011-11-12 1:30 ` Roland Dreier
1 sibling, 1 reply; 17+ messages in thread
From: David Woodhouse @ 2011-11-12 1:20 UTC (permalink / raw)
To: Chris Wright
Cc: Roland Dreier, Alex Williamson, iommu, linux-pci, linux-kernel,
ddutile
[-- Attachment #1: Type: text/plain, Size: 1080 bytes --]
On Fri, 2011-11-11 at 17:08 -0800, Chris Wright wrote:
> All the stale PTE issues I've encountered in the past have turned into
> fixed sw bugs (perhaps it's since been fixed?). Also, I thought with
> Coherency On/Off it's only effecting the use of clflush, not IOTLB or
> Context Entry cache flushing (invalidations).
Yeah, it's supposed to be *just* clflush. Nevertheless, I can imagine it
being screwed up and there actually being a buffer in the chipset too.
We certainly made that mistake with the graphics engine in some cases...
> On a slightly separate, but performance related note...have you ever
> tried using the hw queue? Currently we only have a sw queue, but the
> submission path for invalidations doesn't really queue (unless I missed
> it). It seems to pull from the software queue and submit/wait,
> submit/wait...Seems simple enough to submit the whole queue and then
> issue the wait.
I have a feeling we trigger errata if we do that — although if we're
only doing it for an emulated IOMMU that shouldn't be an issue.
--
dwmw2
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5818 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 1:20 ` David Woodhouse
@ 2011-11-15 7:12 ` cody
2011-11-15 4:54 ` Chris Wright
0 siblings, 1 reply; 17+ messages in thread
From: cody @ 2011-11-15 7:12 UTC (permalink / raw)
To: David Woodhouse
Cc: Chris Wright, Roland Dreier, Alex Williamson, iommu, linux-pci,
linux-kernel, ddutile
On 11/11/2011 05:20 PM, David Woodhouse wrote:
> On Fri, 2011-11-11 at 17:08 -0800, Chris Wright wrote:
>
>> All the stale PTE issues I've encountered in the past have turned into
>> fixed sw bugs (perhaps it's since been fixed?). Also, I thought with
>> Coherency On/Off it's only effecting the use of clflush, not IOTLB or
>> Context Entry cache flushing (invalidations).
>>
> Yeah, it's supposed to be *just* clflush. Nevertheless, I can imagine it
> being screwed up and there actually being a buffer in the chipset too.
> We certainly made that mistake with the graphics engine in some cases...
>
>
>> On a slightly separate, but performance related note...have you ever
>> tried using the hw queue? Currently we only have a sw queue, but the
>> submission path for invalidations doesn't really queue (unless I missed
>> it). It seems to pull from the software queue and submit/wait,
>> submit/wait...Seems simple enough to submit the whole queue and then
>> issue the wait.
>>
> I have a feeling we trigger errata if we do that — although if we're
> only doing it for an emulated IOMMU that shouldn't be an issue.
>
>
What does the emulated IOMMU here? Does it mean the emulated IOMMU
exposed to guest VM?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-15 7:12 ` cody
@ 2011-11-15 4:54 ` Chris Wright
2011-11-15 5:55 ` Kai Huang
0 siblings, 1 reply; 17+ messages in thread
From: Chris Wright @ 2011-11-15 4:54 UTC (permalink / raw)
To: cody
Cc: David Woodhouse, Roland Dreier, linux-pci, iommu, linux-kernel,
Chris Wright
* cody (mail.kai.huang@gmail.com) wrote:
> On 11/11/2011 05:20 PM, David Woodhouse wrote:
> >On Fri, 2011-11-11 at 17:08 -0800, Chris Wright wrote:
> >>On a slightly separate, but performance related note...have you ever
> >>tried using the hw queue? Currently we only have a sw queue, but the
> >>submission path for invalidations doesn't really queue (unless I missed
> >>it). It seems to pull from the software queue and submit/wait,
> >>submit/wait...Seems simple enough to submit the whole queue and then
> >>issue the wait.
> >I have a feeling we trigger errata if we do that — although if we're
> >only doing it for an emulated IOMMU that shouldn't be an issue.
> >
> What does the emulated IOMMU here? Does it mean the emulated IOMMU
> exposed to guest VM?
Yes. So for KVM, the IOMMU emulation would be in QEMU.
thanks,
-chris
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-15 4:54 ` Chris Wright
@ 2011-11-15 5:55 ` Kai Huang
0 siblings, 0 replies; 17+ messages in thread
From: Kai Huang @ 2011-11-15 5:55 UTC (permalink / raw)
To: Chris Wright
Cc: David Woodhouse, Roland Dreier, linux-pci, iommu, linux-kernel
On Tue, Nov 15, 2011 at 12:54 PM, Chris Wright <chrisw@sous-sol.org> wrote:
> * cody (mail.kai.huang@gmail.com) wrote:
>> On 11/11/2011 05:20 PM, David Woodhouse wrote:
>> >On Fri, 2011-11-11 at 17:08 -0800, Chris Wright wrote:
>> >>On a slightly separate, but performance related note...have you ever
>> >>tried using the hw queue? Currently we only have a sw queue, but the
>> >>submission path for invalidations doesn't really queue (unless I missed
>> >>it). It seems to pull from the software queue and submit/wait,
>> >>submit/wait...Seems simple enough to submit the whole queue and then
>> >>issue the wait.
>> >I have a feeling we trigger errata if we do that — although if we're
>> >only doing it for an emulated IOMMU that shouldn't be an issue.
>> >
>> What does the emulated IOMMU here? Does it mean the emulated IOMMU
>> exposed to guest VM?
>
> Yes. So for KVM, the IOMMU emulation would be in QEMU.
>
What will emulated IOMMU be used for to the guest VM? To provide
another layer of address translation (in guest VM) that allows guest
VM to allocate large contiguous DVMA space for device? Does this mean
we need to keep something like *shadow page table* for the real IOMMU?
And this will be only used for direct IO, correct?
> thanks,
> -chris
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus
2011-11-12 1:08 ` Chris Wright
2011-11-12 1:20 ` David Woodhouse
@ 2011-11-12 1:30 ` Roland Dreier
1 sibling, 0 replies; 17+ messages in thread
From: Roland Dreier @ 2011-11-12 1:30 UTC (permalink / raw)
To: Chris Wright
Cc: David Woodhouse, Alex Williamson, iommu, linux-pci, linux-kernel,
ddutile
On Fri, Nov 11, 2011 at 5:08 PM, Chris Wright <chrisw@sous-sol.org> wrote:
>> I would be more inclined to suspect that there's some
>> chipset buffering that we aren't correctly flushing (which might in
>> itself be a hardware issue, since the way to flush the cache is supposed
>> to be well-defined).
>
> Roland, have you tried switching BIOS to Coherency On and can do you
> ever see stale PTEs?
Yeah, the problems go away if I turn on coherency.
More details here:
https://lkml.org/lkml/2011/10/11/412
(and after that email, I let the box run long enough that I'm pretty
sure turning VT-d coherency on does fix things)
I'd love to fix this but not sure how much I can contribute beyond
tests (I don't see anything wrong in the VT-d cache flushing code
in the kernel, and I did look for a while).
- R.
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2011-11-15 5:55 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-11 22:49 [PATCH] intel-iommu: Default to non-coherent for domains unattached to iommus Alex Williamson
2011-11-12 0:17 ` Chris Wright
2011-11-12 0:37 ` David Woodhouse
2011-11-12 0:45 ` Chris Wright
2011-11-12 0:47 ` Chris Wright
2011-11-12 0:51 ` David Woodhouse
2011-11-12 0:58 ` Chris Wright
2011-11-12 1:03 ` David Woodhouse
2011-11-12 0:46 ` Roland Dreier
2011-11-12 0:51 ` Chris Wright
2011-11-12 0:55 ` David Woodhouse
2011-11-12 1:08 ` Chris Wright
2011-11-12 1:20 ` David Woodhouse
2011-11-15 7:12 ` cody
2011-11-15 4:54 ` Chris Wright
2011-11-15 5:55 ` Kai Huang
2011-11-12 1:30 ` Roland Dreier
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox