From: baiyaowei <baiyaowei@cmss.chinamobile.com>
To: tj@kernel.org, jiangshanlai@gmail.com
Cc: akpm@linux-foundation.org, baiyaowei@cmss.chinamobile.com,
linux-kernel@vger.kernel.org
Subject: NULL pointer dereference in process_one_work
Date: Fri, 24 Nov 2017 02:23:44 -0500
Message-ID: <20171124072344.GA313@byw>
Hi tj and jiangshan,
I built a Ceph storage pool to run some benchmarks on a 3.10 kernel.
Occasionally, when the CPU load is very high, some nodes crash with the
message below.
[292273.612014] BUG: unable to handle kernel NULL pointer dereference at
0000000000000008
[292273.612057] IP: [<ffffffff8109d4b1>] process_one_work+0x31/0x470
[292273.612087] PGD 0
[292273.612099] Oops: 0000 [#1] SMP
[292273.612117] Modules linked in: rbd(OE) bcache(OE) ip_vs xfs
xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
xt_conntrack nf_conntrack ipt_REJECT tun bridge stp llc ebtable_filter
ebtables ip6table_filter ip6_tables iptable_filter bonding
intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul
ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper
cryptd mxm_wmi iTCO_wdt iTCO_vendor_support dcdbas ipmi_devintf pcspkr
ipmi_ssif mei_me sg lpc_ich mei sb_edac ipmi_si mfd_core edac_core
ipmi_msghandler shpchp wmi acpi_power_meter nfsd auth_rpcgss nfs_acl
lockd grace sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif
crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit
drm_kms_helper
[292273.612495] crct10dif_pclmul crct10dif_common ttm crc32c_intel drm
ahci nvme bnx2x libahci i2c_core libata mdio libcrc32c megaraid_sas ptp
pps_core dm_mirror dm_region_hash dm_log dm_mod
[292273.612580] CPU: 16 PID: 353223 Comm: kworker/16:2 Tainted: G
OE ------------ 3.10.0-327.el7.x86_64 #1
[292273.612620] Hardware name: Dell Inc. PowerEdge R730xd/0WCJNT, BIOS
2.4.3 01/17/2017
[292273.612655] task: ffff8801f55e6780 ti: ffff882a199b0000 task.ti:
ffff882a199b0000
[292273.612685] RIP: 0010:[<ffffffff8109d4b1>] [<ffffffff8109d4b1>]
process_one_work+0x31/0x470
[292273.612721] RSP: 0018:ffff882a199b3e28 EFLAGS: 00010046
[292273.612743] RAX: 0000000000000000 RBX: ffff88088b273028 RCX:
ffff882a199b3fd8
[292273.612771] RDX: 0000000000000000 RSI: ffff88088b273028 RDI:
ffff88088b273000
[292273.612799] RBP: ffff882a199b3e60 R08: 0000000000000000 R09:
0000000000000770
[292273.612827] R10: ffff8822a3bb1f80 R11: ffff8822a3bb1f80 R12:
ffff88088b273000
[292273.612855] R13: ffff881fff313fc0 R14: 0000000000000000 R15:
ffff881fff313fc0
[292273.612883] FS: 0000000000000000(0000) GS:ffff881fff300000(0000)
knlGS:0000000000000000
[292273.612914] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[292273.612937] CR2: 00000000000000b8 CR3: 000000000194a000 CR4:
00000000003407e0
[292273.612965] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[292273.612994] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[292273.613021] Stack:
[292273.613031] 00000000ff313fd8 0000000000000000 ffff881fff313fd8
000188088b273030
[292273.613069] ffff8801f55e6780 ffff88088b273000 ffff881fff313fc0
ffff882a199b3ec0
[292273.613108] ffffffff8109e4cc ffff882a199b3fd8 ffff882a199b3fd8
ffff8801f55e6780
[292273.613146] Call Trace:
[292273.613160] [<ffffffff8109e4cc>] worker_thread+0x21c/0x400
[292273.613185] [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[292273.613212] [<ffffffff810a5aef>] kthread+0xcf/0xe0
[292273.613234] [<ffffffff810a5a20>] ?
kthread_create_on_node+0x140/0x140
[292273.613263] [<ffffffff81645858>] ret_from_fork+0x58/0x90
[292273.613287] [<ffffffff810a5a20>] ?
kthread_create_on_node+0x140/0x140
[292273.614303] Code: 48 89 e5 41 57 41 56 45 31 f6 41 55 41 54 49 89 fc
53 48 89 f3 48 83 ec 10 48 8b 06 4c 8b 6f 48 48 89 c2 30 d2 a8 04 4c 0f
45 f2 <49> 8b 46 08 44 8b b8 00 01 00 00 41 c1 ef 05 44 89 f8 83 e0 01
[292273.617971] RIP [<ffffffff8109d4b1>] process_one_work+0x31/0x470
[292273.620011] RSP <ffff882a199b3e28>
[292273.621940] CR2: 0000000000000008
Some crash messages:
crash> sys
KERNEL: /usr/lib/debug/lib/modules/3.10.0-327.el7.x86_64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 32
DATE: Wed Oct 18 05:21:14 2017
UPTIME: 3 days, 09:07:25
LOAD AVERAGE: 221.70, 222.22, 224.96
TASKS: 3115
NODENAME: node121
RELEASE: 3.10.0-327.el7.x86_64
VERSION: #1 SMP Thu Nov 19 22:10:57 UTC 2015
MACHINE: x86_64 (2099 Mhz)
MEMORY: 255.9 GB
PANIC: "BUG: unable to handle kernel NULL pointer dereference at
0000000000000008"
crash> bt
PID: 353223 TASK: ffff8801f55e6780 CPU: 16 COMMAND: "kworker/16:2"
#0 [ffff882a199b3af0] machine_kexec at ffffffff81051beb
#1 [ffff882a199b3b50] crash_kexec at ffffffff810f2542
#2 [ffff882a199b3c20] oops_end at ffffffff8163e1a8
#3 [ffff882a199b3c48] no_context at ffffffff8162e2b8
#4 [ffff882a199b3c98] __bad_area_nosemaphore at ffffffff8162e34e
#5 [ffff882a199b3ce0] bad_area_nosemaphore at ffffffff8162e4b8
#6 [ffff882a199b3cf0] __do_page_fault at ffffffff81640fce
#7 [ffff882a199b3d48] do_page_fault at ffffffff81641113
#8 [ffff882a199b3d70] page_fault at ffffffff8163d408
[exception RIP: process_one_work+49]
RIP: ffffffff8109d4b1 RSP: ffff882a199b3e28 RFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88088b273028 RCX: ffff882a199b3fd8
RDX: 0000000000000000 RSI: ffff88088b273028 RDI: ffff88088b273000
RBP: ffff882a199b3e60 R8: 0000000000000000 R9: 0000000000000770
R10: ffff8822a3bb1f80 R11: ffff8822a3bb1f80 R12: ffff88088b273000
R13: ffff881fff313fc0 R14: 0000000000000000 R15: ffff881fff313fc0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff882a199b3e68] worker_thread at ffffffff8109e4cc
#10 [ffff882a199b3ec8] kthread at ffffffff810a5aef
#11 [ffff882a199b3f50] ret_from_fork at ffffffff81645858
crash> dis process_one_work
0xffffffff8109d480 <process_one_work>: nopl 0x0(%rax,%rax,1) [FTRACE
NOP]
0xffffffff8109d485 <process_one_work+5>: push %rbp
0xffffffff8109d486 <process_one_work+6>: mov %rsp,%rbp
0xffffffff8109d489 <process_one_work+9>: push %r15
0xffffffff8109d48b <process_one_work+11>: push %r14
0xffffffff8109d48d <process_one_work+13>: xor %r14d,%r14d
0xffffffff8109d490 <process_one_work+16>: push %r13
0xffffffff8109d492 <process_one_work+18>: push %r12
0xffffffff8109d494 <process_one_work+20>: mov %rdi,%r12
0xffffffff8109d497 <process_one_work+23>: push %rbx
0xffffffff8109d498 <process_one_work+24>: mov %rsi,%rbx
0xffffffff8109d49b <process_one_work+27>: sub $0x10,%rsp
0xffffffff8109d49f <process_one_work+31>: mov (%rsi),%rax
0xffffffff8109d4a2 <process_one_work+34>: mov 0x48(%rdi),%r13
0xffffffff8109d4a6 <process_one_work+38>: mov %rax,%rdx
0xffffffff8109d4a9 <process_one_work+41>: xor %dl,%dl
0xffffffff8109d4ab <process_one_work+43>: test $0x4,%al
0xffffffff8109d4ad <process_one_work+45>: cmovne %rdx,%r14
0xffffffff8109d4b1 <process_one_work+49>: mov 0x8(%r14),%rax
0xffffffff8109d4b5 <process_one_work+53>: mov 0x100(%rax),%r15d
0xffffffff8109d4bc <process_one_work+60>: shr $0x5,%r15d
0xffffffff8109d4c0 <process_one_work+64>: mov %r15d,%eax
0xffffffff8109d4c3 <process_one_work+67>: and $0x1,%eax
0xffffffff8109d4c6 <process_one_work+70>: testb $0x80,0x58(%rdi)
0xffffffff8109d4ca <process_one_work+74>: mov %al,-0x30(%rbp)
0xffffffff8109d4cd <process_one_work+77>: jne
0xffffffff8109d4da <process_one_work+90>
crash> *work_struct ffff88088b273028
struct work_struct {
data = {
counter = 0
},
entry = {
next = 0xffff88088b273030,
prev = 0xffff88088b273030
},
func = 0xffff8801f55e6780
}
crash> *worker ffff88088b273000
struct worker {
{
entry = {
next = 0x0,
prev = 0x0
},
hentry = {
next = 0x0,
pprev = 0x0
}
},
current_work = 0x0,
current_func = 0x0,
current_pwq = 0x0,
desc_valid = false,
scheduled = {
next = 0xffff88088b273030,
prev = 0xffff88088b273030
},
task = 0xffff8801f55e6780,
pool = 0xffff881fff313fc0,
last_active = 4586712874,
flags = 1,
id = 2,
desc =
"\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
rescue_wq = 0x0
}
crash>
It looks like the work parameter is mistakenly pointing into the worker
parameter: work (RSI) is ffff88088b273028, which is 0x28 bytes into the
worker at ffff88088b273000 (RDI), and work->entry has the same next/prev
values as worker->scheduled. I noticed a similar, though not identical,
case reported before:
https://lists.gt.net/linux/kernel/2022989
I wonder how this situation could happen. When process_one_work is called,
the work being processed should be on worker_pool->worklist, but here it
is not.
crash> *worker_pool 0xffff881fff313fc0
struct worker_pool {
lock = {
{
rlock = {
raw_lock = {
{
head_tail = 4142331618,
tickets = {
head = 63202,
tail = 63206
}
}
}
}
}
},
cpu = 16,
node = 0,
id = 32,
flags = 0,
worklist = {
next = 0xffff881fff313fd8,
prev = 0xffff881fff313fd8
},
nr_workers = 3,
nr_idle = 2,
idle_list = {
next = 0xffff880245a18280,
prev = 0xffff88078d3c5780
},
...
Maybe we could add an if (get_work_pwq(work)) check in process_one_work,
or add a BUG_ON like this:
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index dde6298..82a92e0 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2012,11 +2012,15 @@ static void process_one_work(struct worker
*worker, struct work_struct *work)
__releases(&pool->lock)
__acquires(&pool->lock)
{
- struct pool_workqueue *pwq = get_work_pwq(work);
+ struct pool_workqueue *pwq;
struct worker_pool *pool = worker->pool;
- bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
+ bool cpu_intensive;
int work_color;
struct worker *collision;
+
+ BUG_ON(!(pwq = get_work_pwq(work)));
+ cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
+
#ifdef CONFIG_LOCKDEP
I would really appreciate any ideas. Please let me know if you want
any more information from the crashed system.
Thanks,
Yaowei,Bai