From: Gautham R Shenoy <ego@linux.vnet.ibm.com>
To: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Tejun Heo <htejun@gmail.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Abdul Haleem <abdhalee@linux.vnet.ibm.com>,
Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] Fix CPU Online handling for unbounded worker threads
Date: Mon, 13 Jun 2016 11:14:04 +0530
Message-ID: <20160613054404.GA18423@in.ibm.com>
In-Reply-To: <cover.1465311052.git.ego@linux.vnet.ibm.com>
Hi Peter, Thomas,
On Tue, Jun 07, 2016 at 08:44:01PM +0530, Gautham R. Shenoy wrote:
> Hi,
>
> This patchset fixes a couple of issues in the CPU_ONLINE notification
> handling for the workqueues with respect to unbounded worker
> threads.
Any thoughts on these patches? They fix a race that has been
consistently triggering a WARN_ON() on POWER machines since 4.6.
Could you please review them?
--
Thanks and Regards
gautham.
>
> Patch 1 ensures that the affinity of an unbound worker thread
> associated with a node whose very first CPU has come online is set
> correctly. In the existing code path we will never call
> set_cpus_allowed_ptr() for unbound worker threads that have been
> created on a CPU Online operation after boot.
>
> Patch 2 fixes the following WARN_ON() reported by Abdul when
> set_cpus_allowed_ptr() for an unbound worker thread is invoked when
> only one of the CPUs in its cpumask is online but not yet active.
>
> ------------[ cut here ]------------
> WARNING: CPU: 40 PID: 248 at kernel/sched/core.c:1166 __set_cpus_allowed_ptr+0x21c/0x290
> Modules linked in:
> CPU: 40 PID: 248 Comm: cpuhp/40 Not tainted 4.6.0-autotest #1
> task: c000000f27284200 ti: c000000f273fc000 task.ti: c000000f273fc000
> NIP: c00000000010488c LR: c000000000104874 CTR: 0000000000000000
> REGS: c000000f273ff7d0 TRAP: 0700 Not tainted (4.6.0-autotest)
> MSR: 9000000100029033 <SF,HV,EE,ME,IR,DR,RI,LE,TM[E]> CR: 28002804 XER: 20000000
> CFAR: c0000000005b0888 SOFTE: 0
> GPR00: c00000000010478c c000000f273ffa50 c0000000013ce400 0000000000000000
> GPR04: c00000000140ed98 0000000000000800 c0000007f64d9408 0000000000000000
> GPR08: 0000000000000000 0000000000000028 c00000000140ee90 0000000000000020
> GPR12: 0000000000002200 c00000000fb96800 c0000000000f44a8 c0000007fa158480
> GPR16: c0000007fc621a70 c000000f2721f800 0000000000000000 0000000000000001
> GPR20: c000000001571ef0 0000000000000000 c00000000134879f c0000000012bc510
> GPR24: 0000010000000000 0000000000000000 c00000000140ea98 c0000007f64d9408
> GPR28: c0000007fbc21c00 ffffffffffffffea 0000000000000000 c000000f27280000
> NIP [c00000000010488c] __set_cpus_allowed_ptr+0x21c/0x290
> LR [c000000000104874] __set_cpus_allowed_ptr+0x204/0x290
> Call Trace:
> [c000000f273ffa50] [c00000000010478c] __set_cpus_allowed_ptr+0x11c/0x290 (unreliable)
> [c000000f273ffac0] [c0000000000ed4b0] workqueue_cpu_up_callback+0x2c0/0x470
> [c000000f273ffb70] [c0000000000f5c58] notifier_call_chain+0x98/0x100
> [c000000f273ffbc0] [c0000000000c5ed0] __cpu_notify+0x70/0xe0
> [c000000f273ffc00] [c0000000000c6028] notify_online+0x38/0x50
> [c000000f273ffc30] [c0000000000c5214] cpuhp_invoke_callback+0x84/0x250
> [c000000f273ffc90] [c0000000000c562c] cpuhp_up_callbacks+0x5c/0x120
> [c000000f273ffce0] [c0000000000c64d4] cpuhp_thread_fun+0x184/0x1c0
> [c000000f273ffd20] [c0000000000fa050] smpboot_thread_fn+0x290/0x2a0
> [c000000f273ffd80] [c0000000000f45b0] kthread+0x110/0x130
> [c000000f273ffe30] [c000000000009570] ret_from_kernel_thread+0x5c/0x6c
> Instruction dump:
> 419eff3c 3d420004 38a00800 388a0998 7f63db78 484abfa1 60000000 2fa30000
> 409eff1c 813f0378 2f890001 419eff10 <0fe00000> 4bffff08 60000000 60000000
> ---[ end trace cbc1c5cfbc9591d0 ]---
>
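For readers following the Patch 2 description, the state it refers to can be sketched as a small userspace model in C. This is not kernel code. The type toy_cpumask and the helpers toy_intersects(), toy_weight() and toy_affinity_would_warn() are hypothetical names made up for illustration. The sketch assumes the warning is raised for a kernel thread whose requested mask spans more than one CPU, contains at least one online CPU, and contains no active CPU. The exact check at kernel/sched/core.c:1166 is not quoted in this thread, so the predicate should be read as illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per CPU, up to 64 CPUs. Not the kernel's struct cpumask. */
typedef uint64_t toy_cpumask;

/* True when the two masks share at least one CPU. */
static bool toy_intersects(toy_cpumask a, toy_cpumask b)
{
	return (a & b) != 0;
}

/* Number of CPUs set in the mask (clears the lowest set bit each step). */
static unsigned int toy_weight(toy_cpumask m)
{
	unsigned int n = 0;

	for (; m; m &= m - 1)
		n++;
	return n;
}

/*
 * Hypothetical predicate (an assumption, see the text above): the
 * requested mask covers several CPUs, at least one of them is online,
 * and none of them is active yet.
 */
static bool toy_affinity_would_warn(toy_cpumask new_mask,
				    toy_cpumask online,
				    toy_cpumask active)
{
	return toy_intersects(new_mask, online) &&
	       !toy_intersects(new_mask, active) &&
	       toy_weight(new_mask) != 1;
}
```

As an example with made-up numbers: take a node mask covering CPUs 40-47, with CPUs 0-39 online and active and CPU 40 online but not yet active. The predicate is true in that window, and becomes false once CPU 40 is also marked active, or if the mask is narrowed to CPU 40 alone.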
> The patches are based on 4.7-rc2. I have tested the patches on a
> multi-node x86_64 and a ppc64
>
> Gautham R. Shenoy (2):
> workqueue: Move wq_update_unbound_numa() to the beginning of
> CPU_ONLINE
> workqueue:Fix affinity of an unbound worker of a node with 1 online
> CPU
>
> kernel/workqueue.c | 27 +++++++++++++++++++--------
> 1 file changed, 19 insertions(+), 8 deletions(-)
>
> --
> 1.9.3
>
Thread overview: 32+ messages
2016-05-19 10:57 WARNING at kernel/sched/core.c:1166 while booting 4.6.0 mainline on ppc64le bare metal abdhalee
2016-05-19 12:34 ` Gavin Shan
2016-05-26 15:11 ` Gautham R Shenoy
2016-06-07 12:29 ` Abdul Haleem
2016-06-07 15:14 ` [PATCH 0/2] Fix CPU Online handling for unbounded worker threads Gautham R. Shenoy
2016-06-07 15:14 ` [PATCH 1/2] workqueue: Move wq_update_unbound_numa() to the beginning of CPU_ONLINE Gautham R. Shenoy
2016-06-15 15:53 ` Tejun Heo
2016-06-15 19:28 ` Gautham R Shenoy
2016-06-16 19:35 ` Tejun Heo
2016-06-21 14:12 ` Gautham R Shenoy
2016-06-21 15:36 ` Tejun Heo
2016-06-21 19:37 ` Peter Zijlstra
2016-06-21 19:43 ` Tejun Heo
2016-06-21 19:47 ` Peter Zijlstra
2016-06-22 5:15 ` Gautham R Shenoy
2016-06-24 9:00 ` [tip:sched/urgent] sched/core: Allow kthreads to fall back to online && !active cpus tip-bot for Tejun Heo
2016-06-07 15:14 ` [PATCH 2/2] workqueue:Fix affinity of an unbound worker of a node with 1 online CPU Gautham R. Shenoy
2016-06-08 6:03 ` Abdul Haleem
2016-06-14 11:22 ` Peter Zijlstra
2016-06-15 10:19 ` Gautham R Shenoy
2016-06-15 11:32 ` Peter Zijlstra
2016-06-15 12:50 ` Gautham R Shenoy
2016-06-15 13:14 ` Peter Zijlstra
2016-06-15 16:01 ` Tejun Heo
2016-06-16 12:11 ` Michael Ellerman
2016-06-16 12:45 ` Peter Zijlstra
2016-06-16 19:39 ` Tejun Heo
2016-06-17 1:49 ` Michael Ellerman
2016-07-15 5:27 ` Gautham R Shenoy
2016-07-15 5:30 ` Michael Ellerman
[not found] ` <57887507.911f240a.687de.08c5SMTPIN_ADDED_BROKEN@mx.google.com>
2016-07-15 12:10 ` Tejun Heo
2016-06-13 5:44 ` Gautham R Shenoy [this message]