From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
axboe@kernel.dk, linux-scsi@vger.kernel.org,
Sumit Saxena <sumit.saxena@broadcom.com>,
Marc Zyngier <marc.zyngier@arm.com>,
mpe@ellerman.id.au,
Shivasharan Srikanteshwara
<shivasharan.srikanteshwara@broadcom.com>,
Kashyap Desai <kashyap.desai@broadcom.com>,
keith.busch@intel.com, peterz@infradead.org,
Hannes Reinecke <hare@suse.de>, Christoph Hellwig <hch@lst.de>
Subject: [PATCH 4.13 41/53] genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs
Date: Mon, 16 Oct 2017 18:16:38 +0200 [thread overview]
Message-ID: <20171016161443.962836398@linuxfoundation.org> (raw)
In-Reply-To: <20171016161442.263947886@linuxfoundation.org>
4.13-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner <tglx@linutronix.de>
commit e43b3b58548051f8809391eb7bec7a27ed3003ea upstream.
Managed interrupts can end up in a stale state on CPU hotplug. If the
interrupt is not targeting a single CPU, i.e. the affinity mask spawns
multiple CPUs then the following can happen:
After boot:
dstate: 0x01601200
IRQD_ACTIVATED
IRQD_IRQ_STARTED
IRQD_SINGLE_TARGET
IRQD_AFFINITY_SET
IRQD_AFFINITY_MANAGED
node: 0
affinity: 24-31
effectiv: 24
pending: 0
After offlining CPU 31 - 24
dstate: 0x01a31000
IRQD_IRQ_DISABLED
IRQD_IRQ_MASKED
IRQD_SINGLE_TARGET
IRQD_AFFINITY_SET
IRQD_AFFINITY_MANAGED
IRQD_MANAGED_SHUTDOWN
node: 0
affinity: 24-31
effectiv: 24
pending: 0
Now CPU 25 gets onlined again, so it should get the effective interrupt
affinity for this interruopt, but due to the x86 interrupt affinity setter
restrictions this ends up after restarting the interrupt with:
dstate: 0x01601300
IRQD_ACTIVATED
IRQD_IRQ_STARTED
IRQD_SINGLE_TARGET
IRQD_AFFINITY_SET
IRQD_SETAFFINITY_PENDING
IRQD_AFFINITY_MANAGED
node: 0
affinity: 24-31
effectiv: 24
pending: 24-31
So the interrupt is still affine to CPU 24, which was the last CPU to go
offline of that affinity set and the move to an online CPU within 24-31,
in this case 25, is pending. This mechanism is x86/ia64 specific as those
architectures cannot move interrupts from thread context and do this when
an interrupt is actually handled. So the move is set to pending.
Whats worse is that offlining CPU 25 again results in:
dstate: 0x01601300
IRQD_ACTIVATED
IRQD_IRQ_STARTED
IRQD_SINGLE_TARGET
IRQD_AFFINITY_SET
IRQD_SETAFFINITY_PENDING
IRQD_AFFINITY_MANAGED
node: 0
affinity: 24-31
effectiv: 24
pending: 24-31
This means the interrupt has not been shut down, because the outgoing CPU
is not in the effective affinity mask, but of course nothing notices that
the effective affinity mask is pointing at an offline CPU.
In the case of restarting a managed interrupt the move restriction does not
apply, so the affinity setting can be made unconditional. This needs to be
done _before_ the interrupt is started up as otherwise the condition for
moving it from thread context would not longer be fulfilled.
With that change applied onlining CPU 25 after offlining 31-24 results in:
dstate: 0x01600200
IRQD_ACTIVATED
IRQD_IRQ_STARTED
IRQD_SINGLE_TARGET
IRQD_AFFINITY_MANAGED
node: 0
affinity: 24-31
effectiv: 25
pending:
And after offlining CPU 25:
dstate: 0x01a30000
IRQD_IRQ_DISABLED
IRQD_IRQ_MASKED
IRQD_SINGLE_TARGET
IRQD_AFFINITY_MANAGED
IRQD_MANAGED_SHUTDOWN
node: 0
affinity: 24-31
effectiv: 25
pending:
which is the correct and expected result.
Fixes: 761ea388e8c4 ("genirq: Handle managed irqs gracefully in irq_startup()")
Reported-by: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: axboe@kernel.dk
Cc: linux-scsi@vger.kernel.org
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: mpe@ellerman.id.au
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: keith.busch@intel.com
Cc: peterz@infradead.org
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710042208400.2406@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/irq/chip.c | 2 +-
kernel/irq/manage.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -265,8 +265,8 @@ int irq_startup(struct irq_desc *desc, b
irq_setup_affinity(desc);
break;
case IRQ_STARTUP_MANAGED:
+ irq_do_set_affinity(d, aff, false);
ret = __irq_startup(desc);
- irq_set_affinity_locked(d, aff, false);
break;
case IRQ_STARTUP_ABORT:
return 0;
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -175,6 +175,9 @@ int irq_do_set_affinity(struct irq_data
struct irq_chip *chip = irq_data_get_irq_chip(data);
int ret;
+ if (!chip || !chip->irq_set_affinity)
+ return -EINVAL;
+
ret = chip->irq_set_affinity(data, mask, force);
switch (ret) {
case IRQ_SET_MASK_OK:
next prev parent reply other threads:[~2017-10-16 16:21 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-16 16:15 [PATCH 4.13 00/53] 4.13.8-stable review Greg Kroah-Hartman
2017-10-16 16:15 ` [PATCH 4.13 01/53] USB: dummy-hcd: Fix deadlock caused by disconnect detection Greg Kroah-Hartman
2017-10-16 16:15 ` [PATCH 4.13 02/53] MIPS: math-emu: Remove pr_err() calls from fpu_emu() Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 03/53] MIPS: bpf: Fix uninitialised target compiler error Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 04/53] mei: always use domain runtime pm callbacks Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 05/53] dmaengine: edma: Align the memcpy acnt array size with the transfer Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 06/53] dmaengine: ti-dma-crossbar: Fix possible race condition with dma_inuse Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 07/53] NFS: Fix uninitialized rpc_wait_queue Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 08/53] nfs/filelayout: fix oops when freeing filelayout segment Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 09/53] HID: usbhid: fix out-of-bounds bug Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 12/53] KVM: MMU: always terminate page walks at level 1 Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 13/53] KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 14/53] usb: renesas_usbhs: Fix DMAC sequence for receiving zero-length packet Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 16/53] iommu/amd: Finish TLB flush in amd_iommu_unmap() Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 17/53] device property: Track owner device of device property Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 18/53] Revert "vmalloc: back off when the current task is killed" Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 19/53] fs/mpage.c: fix mpage_writepage() for pages with buffers Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 20/53] ALSA: usb-audio: Kill stray URB at exiting Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 21/53] ALSA: seq: Fix use-after-free at creating a port Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 22/53] ALSA: seq: Fix copy_from_user() call inside lock Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 23/53] ALSA: caiaq: Fix stray URB at probe error path Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 24/53] ALSA: line6: Fix NULL dereference at podhd_disconnect() Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 25/53] ALSA: line6: Fix missing initialization before error path Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 26/53] ALSA: line6: Fix leftover URB at error-path during probe Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 27/53] drm/atomic: Unref duplicated drm_atomic_state in drm_atomic_helper_resume() Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 28/53] drm/i915/edp: Get the Panel Power Off timestamp after panel is off Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 31/53] drm/i915: Use crtc_state_is_legacy_gamma in intel_color_check Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 32/53] usb: gadget: configfs: Fix memory leak of interface directory data Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 33/53] usb: gadget: composite: Fix use-after-free in usb_composite_overwrite_options Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 34/53] PCI: aardvark: Move to struct pci_host_bridge IRQ mapping functions Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 35/53] Revert "PCI: tegra: Do not allocate MSI target memory" Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 36/53] direct-io: Prevent NULL pointer access in submit_page_section Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 37/53] fix unbalanced page refcounting in bio_map_user_iov Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 38/53] more bio_map_user_iov() leak fixes Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 39/53] bio_copy_user_iov(): dont ignore ->iov_offset Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 40/53] perf script: Add missing separator for "-F ip,brstack" (and brstackoff) Greg Kroah-Hartman
2017-10-16 16:16 ` Greg Kroah-Hartman [this message]
2017-10-16 16:16 ` [PATCH 4.13 42/53] genirq/cpuhotplug: Add sanity check for effective affinity mask Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 43/53] USB: serial: ftdi_sio: add id for Cypress WICED dev board Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 44/53] USB: serial: cp210x: fix partnum regression Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 45/53] USB: serial: cp210x: add support for ELV TFD500 Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 46/53] USB: serial: option: add support for TP-Link LTE module Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 47/53] USB: serial: qcserial: add Dell DW5818, DW5819 Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 48/53] USB: serial: console: fix use-after-free on disconnect Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 49/53] USB: serial: console: fix use-after-free after failed setup Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 50/53] RAS/CEC: Use the right length for "cec_disable" Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 51/53] x86/microcode: Do the family check first Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 52/53] x86/alternatives: Fix alt_max_short macro to really be a max() Greg Kroah-Hartman
2017-10-16 16:16 ` [PATCH 4.13 53/53] KVM: nVMX: update last_nonleaf_level when initializing nested EPT Greg Kroah-Hartman
2017-10-16 23:41 ` [PATCH 4.13 00/53] 4.13.8-stable review Shuah Khan
2017-10-17 6:59 ` Greg Kroah-Hartman
2017-10-17 0:25 ` Guenter Roeck
2017-10-17 13:21 ` Greg Kroah-Hartman
[not found] ` <20171016161443.534299546@linuxfoundation.org>
[not found] ` <866e97b1-08dd-fd43-7713-699759f63fcf@3CityElectronics.com>
2017-10-17 7:02 ` [PATCH 4.13 30/53] drm/i915/bios: parse DDI ports also for CHV for HDMI DDC pin and DP AUX channel Jani Nikula
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171016161443.962836398@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=axboe@kernel.dk \
--cc=hare@suse.de \
--cc=hch@lst.de \
--cc=kashyap.desai@broadcom.com \
--cc=keith.busch@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=marc.zyngier@arm.com \
--cc=mpe@ellerman.id.au \
--cc=peterz@infradead.org \
--cc=shivasharan.srikanteshwara@broadcom.com \
--cc=stable@vger.kernel.org \
--cc=sumit.saxena@broadcom.com \
--cc=tglx@linutronix.de \
--cc=yasu.isimatu@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).