From: Ashok Raj <ashok.raj@intel.com>
To: linux-kernel@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>,
Sukumar Ghorai <sukumar.ghorai@intel.com>,
Srikanth Nandamuri <srikanth.nandamuri@intel.com>,
Evan Green <evgreen@chromium.org>,
Mathias Nyman <mathias.nyman@linux.intel.com>,
Bjorn Helgaas <bhelgaas@google.com>,
stable@vger.kernel.org
Subject: [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated
Date: Thu, 20 Aug 2020 17:42:03 -0700 [thread overview]
Message-ID: <1597970523-24797-1-git-send-email-ashok.raj@intel.com> (raw)
When offlining CPUs, fixup_irqs() migrates all interrupts away from the
outgoing CPU to an online CPU. It's always possible the device sent an
interrupt to the previous CPU destination. Pending interrupt bit in IRR in
LAPIC identifies such interrupts. apic_soft_disable() will not capture any
new interrupts in IRR. This causes interrupts from device to be lost during
CPU offline. The issue was found when explicitly setting MSI affinity to a
CPU and immediately offlining it. It was simple to recreate with a USB
ethernet device and doing I/O to it while the CPU is offlined. Lost
interrupts happen even when Interrupt Remapping is enabled.
Current code does apic_soft_disable() before migrating interrupts.
native_cpu_disable()
{
...
apic_soft_disable();
cpu_disable_common();
--> fixup_irqs(); // Too late to capture anything in IRR.
}
Just flipping the above call sequence seems to hit the IRR checks
and the lost interrupt is fixed for both legacy MSI and when
interrupt remapping is enabled.
Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")
Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/
Reported-by: Evan Green <evgreen@chromium.org>
Tested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
---
v2:
- Typos and fixes suggested by Randy Dunlap
To: linux-kernel@vger.kernel.org
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Sukumar Ghorai <sukumar.ghorai@intel.com>
Cc: Srikanth Nandamuri <srikanth.nandamuri@intel.com>
Cc: Evan Green <evgreen@chromium.org>
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
---
arch/x86/kernel/smpboot.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 27aa04a95702..3016c3b627ce 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1594,13 +1594,20 @@ int native_cpu_disable(void)
if (ret)
return ret;
+ cpu_disable_common();
/*
* Disable the local APIC. Otherwise IPI broadcasts will reach
* it. It still responds normally to INIT, NMI, SMI, and SIPI
- * messages.
+ * messages. It's important to do apic_soft_disable() after
+ * fixup_irqs(), because fixup_irqs() called from cpu_disable_common()
+ * depends on IRR being set. After apic_soft_disable() CPU preserves
+ * currently set IRR/ISR but new interrupts will not set IRR.
+ * This causes interrupts sent to outgoing CPU before completion
+ * of IRQ migration to be lost. Check SDM Vol 3 "10.4.7.2 Local
+ * APIC State after It Has been Software Disabled" section for more
+ * details.
*/
apic_soft_disable();
- cpu_disable_common();
return 0;
}
--
2.7.4
next reply other threads:[~2020-08-21 0:43 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-21 0:42 Ashok Raj [this message]
2020-08-21 3:16 ` [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated Randy Dunlap
2020-08-23 16:48 ` Raj, Ashok
2020-08-26 0:40 ` Thomas Gleixner
2020-08-26 2:50 ` Raj, Ashok
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1597970523-24797-1-git-send-email-ashok.raj@intel.com \
--to=ashok.raj@intel.com \
--cc=bhelgaas@google.com \
--cc=evgreen@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mathias.nyman@linux.intel.com \
--cc=srikanth.nandamuri@intel.com \
--cc=stable@vger.kernel.org \
--cc=sukumar.ghorai@intel.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox