From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: x86@kernel.org
Cc: linux-pci <linux-pci@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
Bjorn Helgaas <bhelgaas@google.com>,
Evan Green <evgreen@chromium.org>,
"Ghorai, Sukumar" <sukumar.ghorai@intel.com>,
"Amara, Madhusudanarao" <madhusudanarao.amara@intel.com>,
"Nandamuri, Srikanth" <srikanth.nandamuri@intel.com>
Subject: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
Date: Wed, 18 Mar 2020 21:25:39 +0200
Message-ID: <806c51fa-992b-33ac-61a9-00a606f82edb@linux.intel.com>
Hi,
I can reproduce the lost MSI interrupt issue on 5.6-rc6, which includes
the "Plug non-maskable MSI affinity race" patch.
I can see this on a couple of platforms. I'm running a script that first
generates a lot of USB traffic, and then in a busy loop sets irq affinity and
turns CPUs off and on:
for i in 1 3 5 7; do
    echo "1" > /sys/devices/system/cpu/cpu$i/online
done
echo "A" > "/proc/irq/*/smp_affinity"
echo "A" > "/proc/irq/*/smp_affinity"
echo "F" > "/proc/irq/*/smp_affinity"
for i in 1 3 5 7; do
    echo "0" > /sys/devices/system/cpu/cpu$i/online
done
I added some very simple debugging, but I don't really know what to look for.
The xhci interrupts (irq 122) just stop after an MSI affinity change, even
though the same irq survived many similar msi_set_affinity() calls before this.
I'm not that familiar with the inner workings of this, but I'll be happy to
help out with adding debugging and testing patches.
Details:
cat /proc/interrupts
        CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6    CPU7
  0:      26       0       0       0       0       0       0       0  IO-APIC  2-edge       timer
  1:       0       0       0       0       0       7       0       0  IO-APIC  1-edge       i8042
  4:       0       4   59941       0       0       0       0       0  IO-APIC  4-edge       ttyS0
  8:       0       0       0       0       0       0       1       0  IO-APIC  8-edge       rtc0
  9:       0      40       8       0       0       0       0       0  IO-APIC  9-fasteoi    acpi
 16:       0       0       0       0       0       0       0       0  IO-APIC  16-fasteoi   i801_smbus
120:       0       0     293       0       0       0       0       0  PCI-MSI  32768-edge   i915
121:     728       0       0      58       0       0       0       0  PCI-MSI  520192-edge  enp0s31f6
122:   63575    2271       0    1957    7262       0       0       0  PCI-MSI  327680-edge  xhci_hcd
123:       0       0       0       0       0       0       0       0  PCI-MSI  514048-edge  snd_hda_intel:card0
NMI:       0       0       0       0       0       0       0       0  Non-maskable interrupts
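To check whether an irq has actually stopped firing, one option is to sum its per-CPU counts from /proc/interrupts and compare two samples taken a few seconds apart. The helper below is a minimal sketch; the name irq_total is hypothetical, and it assumes the usual layout where the per-CPU counters directly follow the "N:" label.

```shell
#!/bin/bash
# Sketch only: print the sum of the per-CPU counts for one irq.
# irq_total is a hypothetical helper name.
# Usage: irq_total FILE IRQ   (FILE is normally /proc/interrupts)
irq_total() {
    local file="$1" irq="$2"
    awk -v irq="$irq" '
        $1 == (irq ":") {
            sum = 0
            # Counters are the leading numeric fields; stop at the
            # first non-numeric field (the interrupt chip name).
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]+$/) sum += $i
                else break
            }
            print sum
            found = 1
        }
        END { exit !found }   # non-zero exit status if the irq is absent
    ' "$file"
}
```

Calling `irq_total /proc/interrupts 122` twice while USB traffic is running and getting the same number both times would match the stall described here.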
trace snippet:
<idle>-0 [001] d.h. 129.676900: xhci_irq: xhci irq
<idle>-0 [001] d.h. 129.677507: xhci_irq: xhci irq
<idle>-0 [001] d.h. 129.677556: xhci_irq: xhci irq
<idle>-0 [001] d.h. 129.677647: xhci_irq: xhci irq
<...>-14 [001] d..1 129.679802: msi_set_affinity: direct update msi 122, vector 33 -> 33, apicid: 2 -> 6
<idle>-0 [003] d.h. 129.682639: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.682769: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.682908: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.683552: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.683677: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.683819: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.689017: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.689140: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.689307: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.689984: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.690107: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.690278: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.695541: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.695674: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.695839: xhci_irq: xhci irq
<idle>-0 [003] d.H. 129.696667: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.696797: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.696973: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.702288: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.702380: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.702493: xhci_irq: xhci irq
migration/3-24 [003] d..1 129.703150: msi_set_affinity: direct update msi 122, vector 33 -> 33, apicid: 6 -> 0
kworker/0:0-5 [000] d.h. 131.328790: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 133.312704: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 135.360786: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
<idle>-0 [000] d.h. 137.344694: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 139.128679: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 141.312686: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 143.360703: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 145.344791: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
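For longer traces, a small filter can find the last affinity change for a given irq and count how many xhci_irq events followed it. This is only a sketch: last_affinity_change is a hypothetical name, and it assumes the trace text uses the two trace_printk() formats from the debug patch in this message plus an "xhci_irq:" line per handled interrupt.

```shell
#!/bin/bash
# Sketch only: report the last msi_set_affinity line for an irq and the
# number of xhci_irq events logged after it.
# last_affinity_change is a hypothetical helper name.
# Usage: last_affinity_change TRACE_FILE IRQ
last_affinity_change() {
    local trace="$1" irq="$2"
    awk -v irq="$irq" '
        # Matches both "direct update msi N," and "twostep update msi, irq N,".
        /msi_set_affinity:/ && $0 ~ ("(msi|irq) " irq ",") {
            last = $0
            after = 0
            next
        }
        /xhci_irq:/ { after++ }
        END {
            if (last == "") exit 1   # no affinity change seen for this irq
            print last
            print "xhci_irq events after it: " after
        }
    ' "$trace"
}
```

On the snippet above, the last change for irq 122 is the "apicid: 6 -> 0" update at 129.703150, and no xhci_irq events follow it.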
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -92,6 +92,10 @@ msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force)
 	    cfg->vector == old_cfg.vector ||
 	    old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR ||
 	    cfg->dest_apicid == old_cfg.dest_apicid) {
+		trace_printk("direct update msi %u, vector %u -> %u, apicid: %u -> %u\n",
+			     irqd->irq,
+			     old_cfg.vector, cfg->vector,
+			     old_cfg.dest_apicid, cfg->dest_apicid);
 		irq_msi_update_msg(irqd, cfg);
 		return ret;
 	}
@@ -134,7 +138,10 @@ msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force)
 	 */
 	if (IS_ERR_OR_NULL(this_cpu_read(vector_irq[cfg->vector])))
 		this_cpu_write(vector_irq[cfg->vector], VECTOR_RETRIGGERED);
-
+	trace_printk("twostep update msi, irq %u, vector %u -> %u, apicid: %u -> %u\n",
+		     irqd->irq,
+		     old_cfg.vector, cfg->vector,
+		     old_cfg.dest_apicid, cfg->dest_apicid);
 	/* Redirect it to the new vector on the local CPU temporarily */
 	old_cfg.vector = cfg->vector;
 	irq_msi_update_msg(irqd, &old_cfg);
Thanks
-Mathias