All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Keir Fraser <keir.xen@gmail.com>, Jan Beulich <JBeulich@suse.com>
Subject: IO-APIC line level race condition
Date: Fri, 17 Feb 2012 13:27:24 +0000	[thread overview]
Message-ID: <4F3E55BC.4050004@citrix.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 1562 bytes --]

Sadly, we have discovered another line level interrupt race condition in
Xen-4.1.  The result was that an outstanding un-eoi'd interrupt at the
IO-APIC resulted in the mptsas controller offlining the root filesystem.

This is now two separate IO-APIC bugs found recently.

1) Cisco C210 M2 server - EOI Broadcast Suppression, io_apci_ack=old
2) Dell R710 - No EOI Broadcast Suppression, io_apic_ack=new

Both servers use IO-APIC version 0x20 and have an mptsas controller for
their disks, using Legacy PCI line level interrupts.  Workload on both
servers appear to have more active vcpus

Case 1 is now considered stable by the customer after I provided a
private fix which caused Xen to never consider turning on EOI Broadcast
Suppression.  I have re-attached a patch which allows this problem to be
"fixed" by specifying "ioapic_ack=new" on the command line, rather than
requiring a patch and recompile of Xen.

Case 2 has only been seen once (this morning) so we currently have no
idea as to its reproducibility.  However, given that this hardware is
fairly common in our test infrastructure, i would say that it is fairly
rare.

With Case1 and the new patch, Case1 becomes the same as Case2 with
respect to IO-APIC setup, presumably meaning that the Case2 bug still
exists with Case1.


I will start working on cleaning up the IO-APIC code as soon as I can,
as reducing the unnecessary complexity should make race conditions like
this easier to find.

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


[-- Attachment #2: ioapic-ack-fix.patch --]
[-- Type: text/x-patch, Size: 2476 bytes --]

# HG changeset patch
# Parent 9ad1e42c341bc78463b6f6610a6300f75b535fbb
IO-APIC: Prevent using EOI broadcast suppression if user specified
ioapic_ack=new on the command line.

Currently, if EOI broadcast suppression is advertised on the BSP LAPIC, Xen will
discard any user specified option regarding IO-APIC ack mode.

This patch introduces a check which prevents EOI Broadcast suppression from
forcing the IO-APIC ack mode to old if the user has explicitly asked for the new
ack mode on the command line.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 9ad1e42c341b xen/arch/x86/apic.c
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -424,9 +424,16 @@ int __init verify_local_APIC(void)
      */
     if ( reg0 & APIC_LVR_DIRECTED_EOI )
     {
-        ioapic_ack_new = 0;
-        directed_eoi_enabled = 1;
-        printk("Enabled directed EOI with ioapic_ack_old on!\n");
+        if ( ioapic_ack_new == 1 && ioapic_ack_forced == 1 )
+            printk("Not enabling EOI broadcast suppression because "
+                   "ioapic_ack_new has been forced on the command line\n");
+        else
+        {
+            ioapic_ack_new = 0;
+            directed_eoi_enabled = 1;
+            printk("Enabled EOI broadcast suppression with ioapic_ack_old "
+                   "on!\n");
+        }
     }
 
     /*
diff -r 9ad1e42c341b xen/arch/x86/io_apic.c
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -44,6 +44,7 @@ static DEFINE_SPINLOCK(ioapic_lock);
 
 bool_t __read_mostly skip_ioapic_setup;
 bool_t __read_mostly ioapic_ack_new = 1;
+bool_t __read_mostly ioapic_ack_forced = 0;
 
 #ifndef sis_apic_bug
 /*
@@ -1543,9 +1544,15 @@ static unsigned int startup_level_ioapic
 static void __init setup_ioapic_ack(char *s)
 {
     if ( !strcmp(s, "old") )
+    {
         ioapic_ack_new = 0;
+        ioapic_ack_forced = 1;
+    }
     else if ( !strcmp(s, "new") )
+    {
         ioapic_ack_new = 1;
+        ioapic_ack_forced = 1;
+    }
     else
         printk("Unknown ioapic_ack value specified: '%s'\n", s);
 }
diff -r 9ad1e42c341b xen/include/asm-x86/io_apic.h
--- a/xen/include/asm-x86/io_apic.h
+++ b/xen/include/asm-x86/io_apic.h
@@ -180,6 +180,7 @@ static inline void io_apic_modify(unsign
 /* 1 if "noapic" boot option passed */
 extern bool_t skip_ioapic_setup;
 extern bool_t ioapic_ack_new;
+extern bool_t ioapic_ack_forced;
 
 #ifdef CONFIG_ACPI_BOOT
 extern int io_apic_get_unique_id (int ioapic, int apic_id);

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

                 reply	other threads:[~2012-02-17 13:27 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F3E55BC.4050004@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir.xen@gmail.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.