From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "Keir (Xen.org)" <keir@xen.org>, Jacob Shin <jacob.shin@amd.com>,
SuraveeSuthikulpanit <suravee.suthikulpanit@amd.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2] AMD/IO-APIC: Use old IO-APIC ack method if AMD 813{1, 2} PCI-X tunnel is present
Date: Tue, 4 Jun 2013 17:48:56 +0100 [thread overview]
Message-ID: <51AE1A78.5030805@citrix.com> (raw)
In-Reply-To: <51AE31FE02000078000DB300@nat28.tlf.novell.com>
On 04/06/13 17:29, Jan Beulich wrote:
>>>> On 04.06.13 at 12:13, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> References at time of patch:
>>
>> AMD 8132 PCI-X HyperTransport Tunnel
>> http://support.amd.com/us/ChipsetMotherboard_TechDocs/30801.pdf
>>
>> AMD 8131 PCI-X HyperTransport Tunnel
>> http://support.amd.com/us/ChipsetMotherboard_TechDocs/26310_AMD-8131_HyperTra
>> nsport_PCI-X_Tunnel_Revision_Guide_rev_3_18.pdf
>>
>> The IO-APICs on the above two PCI-X HyperTransport Tunnels will issue illegal
>> HT packets if an interrupt gets masked while active.
>>
>> This is sadly exactly what the "new" IO-APIC ack mode does. The "old" ack
>> mode does however avoid this bad behaviour, and stabilises an affected
>> system belonging to a customer.
> Reading the errata descriptions, I'm getting the impression that
> your patch only tries to deal with some portion of the issue, and
> even that only on the basis that devices (and perhaps even their
> drivers) are well behaved. That's because there's no silencing of
> interrupt sources being done here, and I also can't see how we
> would from the hypervisor.
>
> If that's correct, is adding the workaround code really worth it,
> rather than documenting that on those systems "ioapic_ack=old"
> will make them a little more reliable?
>
> If it's not correct, would you mind explaining in some more detail
> how you see your change dealing with all aspects of the erratum,
> namely for edge triggered IRQs as well as for the masking that's
> done with the old ack mode for level triggered ones (or why you
> don't have to deal with those cases)?
The observation we have from the customer is that using "new" mode will
result in host hangs/crashes about once every two hours. With "old",
the system has been rock solid for a week.
Unfortunately I do not have the hardware at my disposal.
>
>> +void __init amd813x_errata_quirks(void)
>> +{
>> + unsigned i, found = 0;
>> + uint32_t bus, dev, vendor_id;
>> +
>> + if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
>> + return;
>> +
>> + /* Check for the affected IO-APIC version (0x11) to save scanning the PCI
>> + * bus on most systems. */
>> + for ( i = 0; i < nr_ioapics; ++i )
>> + {
>> + if ( mp_ioapics[i].mpc_apicver == 0x11 )
>> + {
>> + found = 1;
>> + break;
>> + }
>> + }
>> +
>> + if ( !found )
>> + return;
>> +
>> + for ( bus = 0; bus < 256; ++bus )
>> + {
>> + for ( dev = 0; dev < 32; ++dev )
>> + {
>> + /* IO-APIC is always Function 1. Check for Multifunction. */
>> + if ( !( pci_conf_read8(0, bus, dev, 0, PCI_HEADER_TYPE ) & 0x80 ) )
>> + continue;
>> +
>> + vendor_id = pci_conf_read32(0, bus, dev, 1, PCI_VENDOR_ID);
>> +
>> + /* 0x7451 is AMD-8131 PCI-X IO-APIC */
>> + if ( vendor_id == 0x74511022 )
>> + printk(XENLOG_WARNING "Found AMD 8131 PCI-X tunnel, erraturm #63: ");
> "erratum"
Doh.
>
>> + /* 0x7459 is AMD-8132 PCI-X IO-APIC */
>> + else if ( vendor_id == 0x74591022 )
>> + printk(XENLOG_WARNING "Found AMD 8132 PCI-X tunnel, erraturm #81: ");
> Again.
>
>> + else
>> + continue;
> And overall this is the kind of thing that I think a switch statement
> is dealing with in a more legible way, at once avoiding the need for
> the "vendor_id" local variable.
Ok
>
>> +
>> + if ( ioapic_ack_forced )
>> + {
> Inverting the condition would allow you to make this a "if/else-if/else"
> sequence, with less deep indentation...
>
>> + if ( ioapic_ack_new != 0 )
> != 0 on a boolean variable? Two lines earlier you didn't do so,
> please be consistent.
Sorry - I thought I checked this.
>
>> + printk("Not overriding command line. System will be unstable. "
> Is "will" really right here? These chipsets are rather old, and not
> having seen reports of instabilities makes me imply that many
> such systems just happen to work fine. Therefore, "may" might
> be the better term...
Yes - see description above.
>
>> + "Use ioapic_ack_mode=old\n");
> "Don't use \"ioapic_ack=new\"\n" would seem better.
>
> Jan
No - this is more informative to someone who doesn't know the Xen
command line arguments inside/out.
~Andrew
>
>> + else
>> + printk("Command line is ok for system stability\n");
>> + }
>> + else
>> + {
>> + printk("Forcing 'old' IO-APIC ack method\n");
>> + ioapic_ack_new = 0;
>> + }
>> + return;
>> + }
>> + }
>> +}
>> +
>> void __init setup_IO_APIC(void)
>> {
>> + amd813x_errata_quirks();
>> +
>> enable_IO_APIC();
>>
>> if (acpi_ioapic)
>
next prev parent reply other threads:[~2013-06-04 16:48 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-04 10:13 [PATCH v2] AMD/IO-APIC: Use old IO-APIC ack method if AMD 813{1, 2} PCI-X tunnel is present Andrew Cooper
2013-06-04 10:36 ` George Dunlap
2013-06-04 16:29 ` Jan Beulich
2013-06-04 16:48 ` Andrew Cooper [this message]
2013-06-05 8:20 ` Jan Beulich
2013-06-05 9:43 ` Andrew Cooper
2013-06-05 9:54 ` Jan Beulich
2013-06-05 10:39 ` Andrew Cooper
2013-06-05 11:33 ` Jan Beulich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51AE1A78.5030805@citrix.com \
--to=andrew.cooper3@citrix.com \
--cc=JBeulich@suse.com \
--cc=jacob.shin@amd.com \
--cc=keir@xen.org \
--cc=suravee.suthikulpanit@amd.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.