From: "Jason J. Herne" <jjherne@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>,
	Sasha Levin <sasha.levin@oracle.com>, Tejun Heo <tj@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Dave Jones <davej@redhat.com>, Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: workqueue: WARN at at kernel/workqueue.c:2176
Date: Thu, 29 May 2014 12:23:37 -0400
Message-ID: <53875F09.3090607@linux.vnet.ibm.com>
In-Reply-To: <20140527142637.GB19143@laptop.programming.kicks-ass.net>

On 05/27/2014 10:26 AM, Peter Zijlstra wrote:
> On Tue, May 27, 2014 at 10:18:31AM -0400, Jason J. Herne wrote:
>> On 05/16/2014 12:29 PM, Peter Zijlstra wrote:
>>> On Sat, May 17, 2014 at 12:18:06AM +0800, Lai Jiangshan wrote:
>>>> so the scheduler/set_cpus_allowed_ptr()/cpu_active_mask should be the first
>>>> place to fix.
>>>
>>> I'm not arguing about that, not to mention that this is userspace
>>> exposed and nobody protects that.
>>>
>>> But I was expecting kernel stuff that calls it on hotplug to be
>>> serialized thusly, but apparently not so.
>>>
>>
>> Was a final patch posted for this issue? The discussion made it sound like
>> there were still a few things to figure out before we could resolve this
>> bug. I can recreate this as needed and I'm happy to test any patches.
>
>
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=sched/urgent&id=6acbfb96976fc3350e30d964acb1dbbdf876d55e
>
> which should make its way to Linus soonish I suppose.
>

I applied the patch on top of c7208164e66f63e3ec1759b98087849286410741
and I am still hitting the problem.
Should I have applied it to a different branch or commit to pick up any
other needed changes?

Patch applied:

diff --git a/kernel/cpu.c b/kernel/cpu.c
index a9e710e..247979a 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -726,10 +726,12 @@ void set_cpu_present(unsigned int cpu, bool present)
 void set_cpu_online(unsigned int cpu, bool online)
 {
-	if (online)
+	if (online) {
 		cpumask_set_cpu(cpu, to_cpumask(cpu_online_bits));
-	else
+		cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits));
+	} else {
 		cpumask_clear_cpu(cpu, to_cpumask(cpu_online_bits));
+	}
 }
 void set_cpu_active(unsigned int cpu, bool active)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 44e00ab..86f3890 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5076,7 +5076,6 @@ static int sched_cpu_active(struct notifier_block *nfb,
 				      unsigned long action, void *hcpu)
 {
 	switch (action & ~CPU_TASKS_FROZEN) {
-	case CPU_STARTING:
 	case CPU_DOWN_FAILED:
 		set_cpu_active((long)hcpu, true);
 		return NOTIFY_OK;
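
To make the intent of the kernel/cpu.c hunk explicit, here is a rough
userspace sketch of the behaviour after the patch. It is a toy model, not
kernel code: the model_* names and the plain 64-bit masks are assumptions
standing in for cpu_online_bits and cpu_active_bits. Marking a CPU online
also marks it active, while marking it offline leaves the active bit for
set_cpu_active() to clear.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: plain bitmasks stand in for the kernel's
 * cpu_online_bits and cpu_active_bits. All model_* names are
 * hypothetical and exist only for this illustration. */
#define MODEL_NR_CPUS 64u

static unsigned long long model_online_mask;
static unsigned long long model_active_mask;

/* Mirrors the patched set_cpu_online(): onlining a CPU sets both the
 * online and the active bit; offlining clears only the online bit. */
static void model_set_cpu_online(unsigned int cpu, bool online)
{
	assert(cpu < MODEL_NR_CPUS);
	if (online) {
		model_online_mask |= 1ULL << cpu;
		model_active_mask |= 1ULL << cpu;
	} else {
		model_online_mask &= ~(1ULL << cpu);
	}
}

/* Mirrors set_cpu_active(): touches only the active bit. */
static void model_set_cpu_active(unsigned int cpu, bool active)
{
	assert(cpu < MODEL_NR_CPUS);
	if (active)
		model_active_mask |= 1ULL << cpu;
	else
		model_active_mask &= ~(1ULL << cpu);
}

static bool model_cpu_online(unsigned int cpu)
{
	return (model_online_mask >> cpu) & 1ULL;
}

static bool model_cpu_active(unsigned int cpu)
{
	return (model_active_mask >> cpu) & 1ULL;
}
```

In this model, calling model_set_cpu_online(7, true) makes both
model_cpu_online(7) and model_cpu_active(7) true in a single step, which
is the window the patch is meant to close.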

Here is the output from the recreation using this patch:

[ 3634.146233] ------------[ cut here ]------------
[ 3634.146238] WARNING: at kernel/workqueue.c:2176
[ 3634.146239] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack 
xt_CHECKSUM iptable_mangle bridge stp llc ip6table_filter ip6_tables 
ebtable_nat ebtables iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi qeth_l2 tape_3590 tape tape_class vhost_net tun 
vhost macvtap macvlan eadm_sch qeth ccwgroup zfcp scsi_transport_fc 
scsi_tgt qdio dasd_eckd_mod dasd_mod dm_multipath [last unloaded: kvm]
[ 3634.146260] CPU: 6 PID: 28009 Comm: kworker/7:0 Not tainted 3.15.0-rc7 #1
[ 3634.146263] Workqueue: \xffffff80           (null)
[ 3634.146264] task: 000000025def32e0 ti: 000000026dca0000 task.ti: 000000026dca0000
[ 3634.146266] Krnl PSW : 0404c00180000000 000000000015ad1a (process_one_work+0x2e6/0x4c0)
[ 3634.146272]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000000000 0000000000bc649a 00000002764b0980 0000000000b94f40
[ 3634.146275]            0000000000b94f40 0000000000000000 0000000000000000 0000000000bc6496
[ 3634.146277]            0000000000000000 000000008b65b600 000000008b657000 000000008b657018
[ 3634.146278]            00000002764b0980 0000000000b94f40 000000026dca3dd0 000000026dca3d70
[ 3634.146287] Krnl Code: 000000000015ad0e: 95001000		cli	0(%r1),0
            000000000015ad12: a774fece		brc	7,15aaae
           #000000000015ad16: a7f40001		brc	15,15ad18
           >000000000015ad1a: 92011000		mvi	0(%r1),1
            000000000015ad1e: a7f4fec8		brc	15,15aaae
            000000000015ad22: e31003180004	lg	%r1,792
            000000000015ad28: 58301024		l	%r3,36(%r1)
            000000000015ad2c: a73a0001		ahi	%r3,1
[ 3634.146299] Call Trace:
[ 3634.146301] ([<000000000015ace8>] process_one_work+0x2b4/0x4c0)
[ 3634.146303]  [<000000000015c100>] worker_thread+0x178/0x39c
[ 3634.146305]  [<0000000000164ba6>] kthread+0x10e/0x128
[ 3634.146310]  [<000000000072d026>] kernel_thread_starter+0x6/0xc
[ 3634.146312]  [<000000000072d020>] kernel_thread_starter+0x0/0xc
[ 3634.146313] Last Breaking-Event-Address:
[ 3634.146315]  [<000000000015ad16>] process_one_work+0x2e2/0x4c0
[ 3634.146316] ---[ end trace 03f51c9126c24171 ]---

I don't think this output provides anything new. Please let me know if I 
can gather any more data.

-- 
-- Jason J. Herne (jjherne@linux.vnet.ibm.com)

