* Re: VT-d interrup remapping errata workaround
2013-07-08 13:42 ` Jan Beulich
@ 2013-07-08 16:22 ` Neil Horman
2013-07-08 17:11 ` Neil Horman
2013-07-08 19:13 ` [RFC PATCH] iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset Neil Horman
2 siblings, 0 replies; 13+ messages in thread
From: Neil Horman @ 2013-07-08 16:22 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Joerg Roedel, Malcolm Crossley, xen-devel
On Mon, Jul 08, 2013 at 02:42:55PM +0100, Jan Beulich wrote:
> >>> On 08.07.13 at 15:33, Neil Horman <nhorman@tuxdriver.com> wrote:
> > On Mon, Jul 08, 2013 at 01:29:43PM +0100, Jan Beulich wrote:
> >> All,
> >>
> >> just having spotted the backport of Linux commit 03bbcb2e I notice
> >> a certain discrepancy with the Xen commit having the same purpose
> >> as well as with the actual specification updates:
> >>
> >> The Linux solution keys off of device IDs 3403 and 3406, as listed in
> >> the specification update, but this way fails to cover the X58 chipset,
> >> which has - under different numbers (62 and 69) - the same errata
> >> (the Linux commit message also only mentions erratum 53 for the
> >> 55x0 chipsets, albeit I believe 47 is as much of an issue there as it is
> >> for Xen).
> >>
> >> The Xen solution keys off of ID 342e, with no explanation in the
> >> commit description on where this association comes from.
> >>
> >> I therefore wonder whether
> >> - the Xen solution may have false positives
> >> - the Linux solution would still have false negatives even after
> >> adding ID 3405
> >>
> > Don Dutile noted that there was a problem he thought he noted with an X58
> > chipset that was simmilar to the one I fixed in the above commit. I'm
> > unabel to
> > track down an errata sheet for the X58 chipset, but if you have one and can
> > forward it to me I think it would be best (assuming that it has the same
> > errata
> > that the 5500/5520 chipsets do), that we just add their pci device id's in
> > to the existing linux errata.
>
> Attached. I don't have a weblink at hand, I'm sending a copy cached
> on my local FS.
>
> > To answer your above questions, I'm not sure if the xen solution may offer
> > false
> > positives, as I've not looked at the code. The Linux solution may offer
> > false
> > negatives, but only to the extent that this problem exists in chips not
> > covered
> > by any errata sheet that I've found to date.
>
> Yeah, the hidden question behind that really was - do we have a
> way to ensure no false negatives and no false positives? I.e. some
> consolidated set of errata listing not just an individual chipset's
> revisions that are affected, but all chipsets with all their revisions.
> I already proded my Intel contacts for this, but am of little hope
> that I'll get back a positive answer.
>
Its a limitation of the organizational system. Errata sheets are filed
per-device, so you have to look at each devices sheet to know if its affected,
and its on us to add devices to the appropriate quirks manually.
I wouldn't hold out for Intel on this either. They can help if you point to a
specific device and errata item, but they don't have a database thats searachble
by errata text super easily.
Neil
> Jan
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VT-d interrup remapping errata workaround
2013-07-08 13:42 ` Jan Beulich
2013-07-08 16:22 ` Neil Horman
@ 2013-07-08 17:11 ` Neil Horman
2013-07-09 6:40 ` Jan Beulich
2013-07-08 19:13 ` [RFC PATCH] iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset Neil Horman
2 siblings, 1 reply; 13+ messages in thread
From: Neil Horman @ 2013-07-08 17:11 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Joerg Roedel, Malcolm Crossley, xen-devel
On Mon, Jul 08, 2013 at 02:42:55PM +0100, Jan Beulich wrote:
> >>> On 08.07.13 at 15:33, Neil Horman <nhorman@tuxdriver.com> wrote:
> > On Mon, Jul 08, 2013 at 01:29:43PM +0100, Jan Beulich wrote:
> >> All,
> >>
> >> just having spotted the backport of Linux commit 03bbcb2e I notice
> >> a certain discrepancy with the Xen commit having the same purpose
> >> as well as with the actual specification updates:
> >>
> >> The Linux solution keys off of device IDs 3403 and 3406, as listed in
> >> the specification update, but this way fails to cover the X58 chipset,
> >> which has - under different numbers (62 and 69) - the same errata
> >> (the Linux commit message also only mentions erratum 53 for the
> >> 55x0 chipsets, albeit I believe 47 is as much of an issue there as it is
> >> for Xen).
> >>
> >> The Xen solution keys off of ID 342e, with no explanation in the
> >> commit description on where this association comes from.
> >>
> >> I therefore wonder whether
> >> - the Xen solution may have false positives
> >> - the Linux solution would still have false negatives even after
> >> adding ID 3405
> >>
> > Don Dutile noted that there was a problem he thought he noted with an X58
> > chipset that was simmilar to the one I fixed in the above commit. I'm
> > unabel to
> > track down an errata sheet for the X58 chipset, but if you have one and can
> > forward it to me I think it would be best (assuming that it has the same
> > errata
> > that the 5500/5520 chipsets do), that we just add their pci device id's in
> > to the existing linux errata.
>
> Attached. I don't have a weblink at hand, I'm sending a copy cached
> on my local FS.
>
Thank you I'll see if I can find an online version to point others to, but
regardless, errata 61 seems to be the kicker here. Looks like we need to add
and new pci id to the quirk, and check for some additional revisions if that one
is seen. I'll get to it shortly. Thanks for sending this out. Lets hope this
problem doesn't exist in even more chipsets.
Neil
> > To answer your above questions, I'm not sure if the xen solution may offer
> > false
> > positives, as I've not looked at the code. The Linux solution may offer
> > false
> > negatives, but only to the extent that this problem exists in chips not
> > covered
> > by any errata sheet that I've found to date.
>
> Yeah, the hidden question behind that really was - do we have a
> way to ensure no false negatives and no false positives? I.e. some
> consolidated set of errata listing not just an individual chipset's
> revisions that are affected, but all chipsets with all their revisions.
> I already proded my Intel contacts for this, but am of little hope
> that I'll get back a positive answer.
>
> Jan
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VT-d interrup remapping errata workaround
2013-07-08 17:11 ` Neil Horman
@ 2013-07-09 6:40 ` Jan Beulich
2013-07-09 9:34 ` Malcolm Crossley
0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2013-07-09 6:40 UTC (permalink / raw)
To: Malcolm Crossley, Neil Horman; +Cc: Andrew Cooper, Joerg Roedel, xen-devel
>>> On 08.07.13 at 19:11, Neil Horman <nhorman@tuxdriver.com> wrote:
> Thank you I'll see if I can find an online version to point others to, but
> regardless, errata 61 seems to be the kicker here. Looks like we need to add
> and new pci id to the quirk, and check for some additional revisions if that one
> is seen. I'll get to it shortly. Thanks for sending this out. Lets hope this
> problem doesn't exist in even more chipsets.
Erratum 61? That doesn't require remapping to be disabled as
workaround. 62 and 69 do.
But you make a very good point regarding revisions: 62 also
applies to rev 0x12, and 69 applies to both 0x12 and 0x22.
Xen doesn't check for either of them right now. Malcolm - this
in turn means that keying off of device ID 0x342e indeed isn't
correct/complete. Are you going to be able to submit a Xen
side fix relatively soon, or shall I take care of this?
Jan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VT-d interrup remapping errata workaround
2013-07-09 6:40 ` Jan Beulich
@ 2013-07-09 9:34 ` Malcolm Crossley
2013-07-09 9:48 ` Jan Beulich
0 siblings, 1 reply; 13+ messages in thread
From: Malcolm Crossley @ 2013-07-09 9:34 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Joerg Roedel, Neil Horman, xen-devel
On 09/07/13 07:40, Jan Beulich wrote:
>>>> On 08.07.13 at 19:11, Neil Horman <nhorman@tuxdriver.com> wrote:
>> Thank you I'll see if I can find an online version to point others to, but
>> regardless, errata 61 seems to be the kicker here. Looks like we need to add
>> and new pci id to the quirk, and check for some additional revisions if that one
>> is seen. I'll get to it shortly. Thanks for sending this out. Lets hope this
>> problem doesn't exist in even more chipsets.
> Erratum 61? That doesn't require remapping to be disabled as
> workaround. 62 and 69 do.
>
> But you make a very good point regarding revisions: 62 also
> applies to rev 0x12, and 69 applies to both 0x12 and 0x22.
> Xen doesn't check for either of them right now. Malcolm - this
> in turn means that keying off of device ID 0x342e indeed isn't
> correct/complete. Are you going to be able to submit a Xen
> side fix relatively soon, or shall I take care of this?
I didn't notice that errata 69 is not listed as fixed on rev 0x22 for
the X58
chipset, sorry about that.
Have we had confirmation from Intel about this particular errata affecting
the X58? It seems strange that they both have the same revision ID but
different fixes for errata. If we have got confirmation then I'll create
a patch
straight away, I'll try our Intel contacts as well, it might speed
things up a bit.
Malcolm
> Jan
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: VT-d interrup remapping errata workaround
2013-07-09 9:34 ` Malcolm Crossley
@ 2013-07-09 9:48 ` Jan Beulich
0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2013-07-09 9:48 UTC (permalink / raw)
To: Malcolm Crossley; +Cc: Andrew Cooper, Joerg Roedel, Neil Horman, xen-devel
>>> On 09.07.13 at 11:34, Malcolm Crossley <malcolm.crossley@citrix.com> wrote:
> Have we had confirmation from Intel about this particular errata affecting
> the X58? It seems strange that they both have the same revision ID but
> different fixes for errata. If we have got confirmation then I'll create
> a patch straight away, I'll try our Intel contacts as well, it might speed
> things up a bit.
No, there was no confirmation so far.
Jan
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH] iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset
2013-07-08 13:42 ` Jan Beulich
2013-07-08 16:22 ` Neil Horman
2013-07-08 17:11 ` Neil Horman
@ 2013-07-08 19:13 ` Neil Horman
2013-07-09 6:44 ` Jan Beulich
2 siblings, 1 reply; 13+ messages in thread
From: Neil Horman @ 2013-07-08 19:13 UTC (permalink / raw)
To: Jan Beulich
Cc: Andrew Cooper, Malcolm Crossley, Joerg Roedel, Neil Horman,
xen-devel
Recently we added an early quirk to detect 5500/5520 chipsets with early
revisions that had problems with irq draining with interrupt remapping enabled.
It turns out this same problem is present in the intel X58 chipset as well. See
errata 69 here:
http://www.intel.com/content/www/us/en/chipsets/x58-express-specification-update.html
This patch extends the pci early quirk so that the chip devices/revisions
specified in the above update are also covered in the same way:
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
arch/x86/kernel/early-quirks.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 94ab6b9..f1997b4 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -196,14 +196,21 @@ static void __init ati_bugs_contd(int num, int slot, int func)
static void __init intel_remapping_check(int num, int slot, int func)
{
u8 revision;
+ u16 device;
+ device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
revision = read_pci_config_byte(num, slot, func, PCI_REVISION_ID);
/*
* Revision 0x13 of this chipset supports irq remapping
* but has an erratum that breaks its behavior, flag it as such
*/
- if (revision == 0x13)
+ if ((device == 3405) &&
+ ((revision == 0x12) ||
+ (revision == 0x13) ||
+ (revision == 0x22))) {
+ set_irq_remapping_broken();
+ } else if (revision == 0x13)
set_irq_remapping_broken();
}
@@ -239,8 +246,11 @@ static struct chipset early_qrk[] __initdata = {
PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, 0, ati_bugs_contd },
{ PCI_VENDOR_ID_INTEL, 0x3403, PCI_CLASS_BRIDGE_HOST,
PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
+ { PCI_VENDOR_ID_INTEL, 0x3405, PCI_CLASS_BRIDGE_HOST,
+ PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
{ PCI_VENDOR_ID_INTEL, 0x3406, PCI_CLASS_BRIDGE_HOST,
PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
+
{}
};
--
1.8.1.4
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [RFC PATCH] iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset
2013-07-08 19:13 ` [RFC PATCH] iommu/vt-d: Expand interrupt remapping quirk to cover x58 chipset Neil Horman
@ 2013-07-09 6:44 ` Jan Beulich
0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2013-07-09 6:44 UTC (permalink / raw)
To: Neil Horman; +Cc: Andrew Cooper, Joerg Roedel, Malcolm Crossley, xen-devel
>>> On 08.07.13 at 21:13, Neil Horman <nhorman@tuxdriver.com> wrote:
> --- a/arch/x86/kernel/early-quirks.c
> +++ b/arch/x86/kernel/early-quirks.c
> @@ -196,14 +196,21 @@ static void __init ati_bugs_contd(int num, int slot, int func)
> static void __init intel_remapping_check(int num, int slot, int func)
> {
> u8 revision;
> + u16 device;
>
> + device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
> revision = read_pci_config_byte(num, slot, func, PCI_REVISION_ID);
>
> /*
> * Revision 0x13 of this chipset supports irq remapping
> * but has an erratum that breaks its behavior, flag it as such
> */
> - if (revision == 0x13)
> + if ((device == 3405) &&
0x3405.
Also, I'd do the revision == 0x13 check first (uniformly for all
matched IDs), but I admit that this is a matter of taste. So
consider this
Reviewed-by: Jan Beulich <jbeulich@suse.com>
with the above mistake fixed.
Jan
> + ((revision == 0x12) ||
> + (revision == 0x13) ||
> + (revision == 0x22))) {
> + set_irq_remapping_broken();
> + } else if (revision == 0x13)
> set_irq_remapping_broken();
>
> }
^ permalink raw reply [flat|nested] 13+ messages in thread