From: Sasha Levin <sasha.levin@oracle.com>
To: stable@vger.kernel.org, stable-commits@vger.kernel.org
Cc: "Wanpeng Li" <wanpeng.li@hotmail.com>,
"Tejun Heo" <tj@kernel.org>,
"Lai Jiangshan" <jiangshanlai@gmail.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Peter Zijlstra" <peterz@infradead.org>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Sasha Levin" <sasha.levin@oracle.com>
Subject: [added to the 4.1 stable tree] workqueue: fix rebind bound workers warning
Date: Thu, 19 May 2016 00:20:01 -0400 [thread overview]
Message-ID: <1463631606-32540-62-git-send-email-sasha.levin@oracle.com> (raw)
In-Reply-To: <1463631606-32540-1-git-send-email-sasha.levin@oracle.com>
From: Wanpeng Li <wanpeng.li@hotmail.com>
This patch has been added to the 4.1 stable tree. If you have any
objections, please let us know.
===============
[ Upstream commit f7c17d26f43d5cc1b7a6b896cd2fa24a079739b9 ]
------------[ cut here ]------------
WARNING: CPU: 0 PID: 16 at kernel/workqueue.c:4559 rebind_workers+0x1c0/0x1d0
Modules linked in:
CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 4.6.0-rc4+ #31
Hardware name: IBM IBM System x3550 M4 Server -[7914IUW]-/00Y8603, BIOS -[D7E128FUS-1.40]- 07/23/2013
0000000000000000 ffff881037babb58 ffffffff8139d885 0000000000000010
0000000000000000 0000000000000000 0000000000000000 ffff881037babba8
ffffffff8108505d ffff881037ba0000 000011cf3e7d6e60 0000000000000046
Call Trace:
dump_stack+0x89/0xd4
__warn+0xfd/0x120
warn_slowpath_null+0x1d/0x20
rebind_workers+0x1c0/0x1d0
workqueue_cpu_up_callback+0xf5/0x1d0
notifier_call_chain+0x64/0x90
? trace_hardirqs_on_caller+0xf2/0x220
? notify_prepare+0x80/0x80
__raw_notifier_call_chain+0xe/0x10
__cpu_notify+0x35/0x50
notify_down_prepare+0x5e/0x80
? notify_prepare+0x80/0x80
cpuhp_invoke_callback+0x73/0x330
? __schedule+0x33e/0x8a0
cpuhp_down_callbacks+0x51/0xc0
cpuhp_thread_fun+0xc1/0xf0
smpboot_thread_fn+0x159/0x2a0
? smpboot_create_threads+0x80/0x80
kthread+0xef/0x110
? wait_for_completion+0xf0/0x120
? schedule_tail+0x35/0xf0
ret_from_fork+0x22/0x50
? __init_kthread_worker+0x70/0x70
---[ end trace eb12ae47d2382d8f ]---
notify_down_prepare: attempt to take down CPU 0 failed
This bug can be reproduced by below config w/ nohz_full= all cpus:
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
CONFIG_DEBUG_HOTPLUG_CPU0=y
CONFIG_NO_HZ_FULL=y
As Thomas pointed out:
| If a down prepare callback fails, then DOWN_FAILED is invoked for all
| callbacks which have successfully executed DOWN_PREPARE.
|
| But, workqueue has actually two notifiers. One which handles
| UP/DOWN_FAILED/ONLINE and one which handles DOWN_PREPARE.
|
| Now look at the priorities of those callbacks:
|
| CPU_PRI_WORKQUEUE_UP = 5
| CPU_PRI_WORKQUEUE_DOWN = -5
|
| So the call order on DOWN_PREPARE is:
|
| CB 1
| CB ...
| CB workqueue_up() -> Ignores DOWN_PREPARE
| CB ...
| CB X ---> Fails
|
| So we call up to CB X with DOWN_FAILED
|
| CB 1
| CB ...
| CB workqueue_up() -> Handles DOWN_FAILED
| CB ...
| CB X-1
|
| So the problem is that the workqueue stuff handles DOWN_FAILED in the up
| callback, while it should do it in the down callback. Which is not a good idea
| either because it wants to be called early on rollback...
|
| Brilliant stuff, isn't it? The hotplug rework will solve this problem because
| the callbacks become symetric, but for the existing mess, we need some
| workaround in the workqueue code.
The boot CPU handles housekeeping duty(unbound timers, workqueues,
timekeeping, ...) on behalf of full dynticks CPUs. It must remain
online when nohz full is enabled. There is a priority set to every
notifier_blocks:
workqueue_cpu_up > tick_nohz_cpu_down > workqueue_cpu_down
So tick_nohz_cpu_down callback failed when down prepare cpu 0, and
notifier_blocks behind tick_nohz_cpu_down will not be called any
more, which leads to workers are actually not unbound. Then hotplug
state machine will fallback to undo and online cpu 0 again. Workers
will be rebound unconditionally even if they are not unbound and
trigger the warning in this progress.
This patch fix it by catching !DISASSOCIATED to avoid rebind bound
workers.
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: stable@vger.kernel.org
Suggested-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
---
kernel/workqueue.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 32152a4..d0efe92 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4477,6 +4477,17 @@ static void rebind_workers(struct worker_pool *pool)
pool->attrs->cpumask) < 0);
spin_lock_irq(&pool->lock);
+
+ /*
+ * XXX: CPU hotplug notifiers are weird and can call DOWN_FAILED
+ * w/o preceding DOWN_PREPARE. Work around it. CPU hotplug is
+ * being reworked and this can go away in time.
+ */
+ if (!(pool->flags & POOL_DISASSOCIATED)) {
+ spin_unlock_irq(&pool->lock);
+ return;
+ }
+
pool->flags &= ~POOL_DISASSOCIATED;
for_each_pool_worker(worker, pool) {
--
2.5.0
next prev parent reply other threads:[~2016-05-19 4:22 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-19 4:19 [added to the 4.1 stable tree] Revert "usb: hub: do not clear BOS field during reset device" Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] stable: remove artifact created on backport Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] iwlwifi: pcie: lower the debug level for RSA semaphore access Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ASoC: rt5640: Correct the digital interface data select Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] regulator: s2mps11: Fix invalid selector mask and voltages for buck9 Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] libahci: save port map for forced port map Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ata: ahci-platform: Add ports-implemented DT bindings Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] iio: ak8975: Fix NULL pointer exception on early interrupt Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] efi: Fix out-of-bounds read in variable_matches() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] USB: serial: cp210x: add ID for Link ECU Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] USB: serial: cp210x: add Straizona Focusers device ids Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] [media] v4l2-dv-timings.h: fix polarity for 4k formats Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] MD: make bio mergeable Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Add dock support for ThinkPad X260 Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] workqueue: fix ghost PENDING flag while doing MQ IO Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/dp/mst: Restore primary hub guid on resume Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] cxl: Keep IRQ mappings on context teardown Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/radeon: fix vertical bars appear on monitor (v2) Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] IB/security: Restrict use of the write() interface Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm: vmscan: reclaim highmem zone if buffer_heads is over limit Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm: soft-offline: don't free target page in successful page migration Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_* Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2) Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] atomic_open(): fix the handling of create_error Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Drivers: hv: ring_buffer.c: fix comment style Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Drivers: hv_vmbus: Fix signal to host condition Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] powerpc: Fix bad inline asm constraint in create_zero_mask() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Minimal fix-up of bad hashing behavior of hash_64() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] tracing: Don't display trigger file for events that can't be enabled Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/radeon: make sure vertical front porch is at least 1 Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] MAINTAINERS: Remove asterisk from EFI directory names Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ACPICA: Dispatcher: Update thread ID for recursive method calls Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] crypto: hash - Fix page length clamping in hash walk Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] x86/sysfb_efi: Fix valid BAR address range check Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] fs/pnode.c: treat zero mnt_group_id-s as unequal Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] propogate_mnt: Handle the first propogated copy being a slave Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/radeon: fix DP link training issue with second 4K monitor Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm, cma: prevent nr_isolated_* counters from going negative Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] get_rock_ridge_filename(): handle malformed NM entries Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Apply fix for white noise on Asus N550JV, too Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Fix white noise on Asus UX501VW headset Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Input: max8997-haptic - fix NULL pointer dereference Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/i915: Bail out of pipe config compute loop on LPT Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Fix broken reconfig Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Asus N750JV external subwoofer fixup Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Fix white noise on Asus N750JV headphone Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: usb-audio: Yet another Phoneix Audio device quirk Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] perf/core: Disable the event on a truncated AUX record Sasha Levin
2016-05-23 6:59 ` Alexander Shishkin
2016-05-30 21:50 ` Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] tools lib traceevent: Do not reassign parg after collapse_tree() Sasha Levin
2016-05-19 4:20 ` Sasha Levin [this message]
2016-05-19 4:20 ` [added to the 4.1 stable tree] drm/radeon: fix DP mode validation Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] ocfs2: fix SGID not inherited issue Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] ocfs2: fix posix_acl_create deadlock Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] nf_conntrack: avoid kernel pointer value leak in slab name Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1463631606-32540-62-git-send-email-sasha.levin@oracle.com \
--to=sasha.levin@oracle.com \
--cc=fweisbec@gmail.com \
--cc=jiangshanlai@gmail.com \
--cc=peterz@infradead.org \
--cc=stable-commits@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=wanpeng.li@hotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox