From: "Jason J. Herne" <jjherne@linux.vnet.ibm.com>
To: tj@kernel.org, linux-kernel@vger.kernel.org
Subject: Subject: Warning in workqueue.c
Date: Fri, 07 Feb 2014 09:39:24 -0500
Message-ID: <52F4F01C.1070800@linux.vnet.ibm.com>

I've been able to reproduce the following warning using several kernel 
versions on the S390 platform, including the latest master: 3.14-rc1 
(38dbfb59d1175ef458d006556061adeaa8751b72).

[28718.212810] ------------[ cut here ]------------
[28718.212819] WARNING: at kernel/workqueue.c:2156
[28718.212822] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack 
xt_CHECKSUM iptable_mangle bridge stp llc ip6table_filter ip6_tables 
ebtable_nat ebtables iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi tape_3590 qeth_l2 tape tape_class vhost_net tun 
vhost macvtap macvlan lcs dasd_eckd_mod dasd_mod qeth ccwgroup zfcp 
scsi_transport_fc scsi_tgt qdio dm_multipath [last unloaded: kvm]
[28718.212857] CPU: 2 PID: 20 Comm: kworker/3:0 Not tainted 3.14.0-rc1 #1
[28718.212862] task: 00000000f7b23260 ti: 00000000f7b2c000 task.ti: 
00000000f7b2c000
[28718.212874] Krnl PSW : 0404c00180000000 000000000015b0be 
(process_one_work+0x2e6/0x4c0)
[28718.212881]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 
PM:0 EA:3
Krnl GPRS: 0000000001727790 0000000000bc2a52 00000000f7f21900 
0000000000b92500
[28718.212883]            0000000000b92500 0000000000105b24 
0000000000000000 0000000000bc2a4e
[28718.212887]            0000000000000000 0000000084a2b500 
0000000084a27000 0000000084a27018
[28718.212888]            00000000f7f21900 0000000000b92500 
00000000f7b2fdd0 00000000f7b2fd70
[28718.212907] Krnl Code: 000000000015b0b2: 95001000		cli	0(%r1),0
            000000000015b0b6: a774fece		brc	7,15ae52
           #000000000015b0ba: a7f40001		brc	15,15b0bc
           >000000000015b0be: 92011000		mvi	0(%r1),1
            000000000015b0c2: a7f4fec8		brc	15,15ae52
            000000000015b0c6: e31003180004	lg	%r1,792
            000000000015b0cc: 58301024		l	%r3,36(%r1)
            000000000015b0d0: a73a0001		ahi	%r3,1
[28718.212937] Call Trace:
[28718.212940] ([<000000000015b08c>] process_one_work+0x2b4/0x4c0)
[28718.212944]  [<000000000015b858>] worker_thread+0x178/0x39c
[28718.212949]  [<0000000000164652>] kthread+0x10e/0x128
[28718.212956]  [<0000000000728c66>] kernel_thread_starter+0x6/0xc
[28718.212960]  [<0000000000728c60>] kernel_thread_starter+0x0/0xc
[28718.212962] Last Breaking-Event-Address:
[28718.212965]  [<000000000015b0ba>] process_one_work+0x2e2/0x4c0
[28718.212968] ---[ end trace 6d115577307998c2 ]---

The workload is:
- 2 processes onlining random CPUs in a tight loop using
  'echo 1 > /sys/bus/cpu.../online'
- 2 processes offlining random CPUs in a tight loop using
  'echo 0 > /sys/bus/cpu.../online'
The system is otherwise fairly idle. Load average: 5.82, 6.27, 6.27
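
For illustration, one such worker could be sketched in C roughly as
follows. This is a minimal sketch, not the script used for the report.
The sysfs path format is an assumption (the path above is abbreviated),
and the helper names cpu_online_path, write_cpu_state and hotplug_loop
are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* ASSUMPTION: the usual sysfs CPU hotplug location. The report
 * abbreviates the path, so adjust this format to match your system. */
#define CPU_SYSFS_FMT "/sys/devices/system/cpu/cpu%d/online"

/* Build the sysfs path for one CPU.
 * Returns 0 on success, -1 if the buffer is too small. */
static int cpu_online_path(char *buf, size_t len, int cpu)
{
	int n = snprintf(buf, len, CPU_SYSFS_FMT, cpu);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Equivalent of "echo 1 > path" or "echo 0 > path".
 * Returns 0 on success, -1 on any I/O error. */
static int write_cpu_state(const char *path, int online)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(online ? "1\n" : "0\n", f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* One worker process: repeatedly pick a random CPU in [0, ncpus)
 * and request the given state. A negative iteration count loops
 * forever. Returns the number of failed writes. rand() is left
 * unseeded here for brevity. */
static long hotplug_loop(int ncpus, int online, long iterations)
{
	char path[64];
	long failures = 0;

	for (long i = 0; iterations < 0 || i < iterations; i++) {
		int cpu = rand() % ncpus;

		if (cpu_online_path(path, sizeof(path), cpu) != 0 ||
		    write_cpu_state(path, online) != 0)
			failures++;
	}
	return failures;
}
```

To approximate the workload, run four processes as root: two calling
hotplug_loop(10, 1, -1) and two calling hotplug_loop(10, 0, -1).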

The machine has 10 processors.
The warning sometimes hits within a few minutes of starting the 
workload. Other times it takes several hours.

The particular spot in the code is:
	/*
	 * Ensure we're on the correct CPU.  DISASSOCIATED test is
	 * necessary to avoid spurious warnings from rescuers servicing the
	 * unbound or a disassociated pool.
	 */
	WARN_ON_ONCE(!(worker->flags & WORKER_UNBOUND) &&
		     !(pool->flags & POOL_DISASSOCIATED) &&
		     raw_smp_processor_id() != pool->cpu);
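
For illustration, the condition can be modelled in user space as below.
This is a simplified sketch: the struct layouts and flag bit values are
hypothetical stand-ins, not the kernel's definitions. In the trace
above, the task is kworker/3:0 (a worker of CPU 3's pool) while the
report shows it running on CPU 2, which matches the third term.

```c
#include <stdbool.h>

/* HYPOTHETICAL flag bits for illustration only; the real values
 * are defined in kernel/workqueue.c. */
enum { WORKER_UNBOUND = 1 << 0 };
enum { POOL_DISASSOCIATED = 1 << 1 };

/* Simplified stand-ins for the kernel's worker and pool structs. */
struct worker_model {
	unsigned int flags;
};

struct pool_model {
	unsigned int flags;
	int cpu;	/* CPU this pool is bound to */
};

/* True when the check would fire: the worker is bound, its pool is
 * still associated with a CPU, and the code is running elsewhere. */
static bool wrong_cpu_warning(const struct worker_model *worker,
			      const struct pool_model *pool,
			      int current_cpu)
{
	return !(worker->flags & WORKER_UNBOUND) &&
	       !(pool->flags & POOL_DISASSOCIATED) &&
	       current_cpu != pool->cpu;
}
```

With a bound worker and a pool for CPU 3, wrong_cpu_warning() returns
true for current_cpu == 2 and false for current_cpu == 3.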

I'm not familiar with the scheduler or workqueue internals, so I'm not 
sure how to debug this further.
I would be happy to run tests and/or collect debugging data.

-- 
-- Jason J. Herne (jjherne@linux.vnet.ibm.com)

