From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932256AbcDSOhz (ORCPT ); Tue, 19 Apr 2016 10:37:55 -0400
Received: from mail-wm0-f51.google.com ([74.125.82.51]:38306 "EHLO mail-wm0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932143AbcDSOhw (ORCPT ); Tue, 19 Apr 2016 10:37:52 -0400
To: "Linux-Kernel@Vger. Kernel. Org"
Cc: Peter Zijlstra , SiteGround Operations
From: Nikolay Borisov
Subject: Crash in __wake_up_common
Message-ID: <571642BC.9040606@kyup.com>
Date: Tue, 19 Apr 2016 17:37:48 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On a 4.4.1 kernel I observed the following crash:

[1157738.189104] BUG: unable to handle kernel NULL pointer dereference at (null)
[1157738.189374] IP: [] __wake_up_common+0x2e/0x90
[1157738.189596] PGD 4382a6067 PUD 43827e067 PMD 0
[1157738.189901] Oops: 0000 [#1] SMP
[1157738.190158] Modules linked in: tcp_scalable dm_snapshot dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio xt_multiport xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables zfs(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) zunicode(PO) ext2 ib_umad ib_ipoib ib_cm ib_sa ses enclosure igb i2c_algo_bit x86_pkg_temp_thermal crc32_pclmul sb_edac edac_core i2c_i801 lpc_ich mfd_core ioatdma shpchp ipmi_devintf ipmi_si ipmi_msghandler ib_qib dca ib_mad ib_core ib_addr ipv6
[1157738.193517] CPU: 2 PID: 11460 Comm: z_wr_iss Tainted: P W O 4.4.1-clouder2 #69
[1157738.193688] Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
[1157738.193859] task: ffff8802d102a700 ti: ffff88005b068000 task.ti: ffff88005b068000
[1157738.194029] RIP: 0010:[] [] __wake_up_common+0x2e/0x90
[1157738.194247] RSP: 0018:ffff88005b06bd48 EFLAGS: 00010096
[1157738.194415] RAX: ffffffffffffffe8 RBX: ffff880438ef52c8 RCX: 0000000000000000
[1157738.194585] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff880438ef52c8
[1157738.194756] RBP: ffff88005b06bd88 R08: 0000000000000000 R09: 0000000000000002
[1157738.194926] R10: 0000000000000001 R11: 0000000000000078 R12: 0000000000000086
[1157738.195098] R13: ffff880438ef52d0 R14: 0000000000000000 R15: 0000000000000000
[1157738.195267] FS: 0000000000000000(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000
[1157738.195440] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1157738.195607] CR2: 0000000000000000 CR3: 00000004382a9000 CR4: 00000000001406e0
[1157738.195778] Stack:
[1157738.195939] ffff880438ef52c8 0000000300000001 0000000000000001 ffff880438ef52c8
[1157738.196294] 0000000000000086 0000000000000003 0000000000000001 0000000000000000
[1157738.196648] ffff88005b06bdc8 ffffffff810e0ee8 ffffffffa0296000 ffff880438ef5200
[1157738.197002] Call Trace:
[1157738.197168] [] __wake_up+0x48/0x70
[1157738.197343] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[1157738.197515] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[1157738.197687] [] taskq_thread+0x106/0x580 [spl]
[1157738.197857] [] ? try_to_wake_up+0x3b0/0x3b0
[1157738.198028] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[1157738.198199] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[1157738.198370] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[1157738.198541] [] kthread+0xd7/0xf0
[1157738.198711] [] ? schedule_tail+0x1e/0xd0
[1157738.198880] [] ? kthread_freezable_should_stop+0x80/0x80
[1157738.199053] [] ret_from_fork+0x3f/0x70
[1157738.199222] [] ? kthread_freezable_should_stop+0x80/0x80
[1157738.199393] Code: e5 41 57 41 56 41 55 41 54 53 48 83 ec 18 0f 1f 44 00 00 89 75 cc 89 55 c8 4c 8d 6f 08 48 8b 57 08 41 89 cf 48 8d 42 e8 4d 89 c6 <48> 8b 58 18 49 39 d5 74 3b 48 83 eb 18 eb 07 48 89 d8 48 8d 5a
[1157738.202805] RIP [] __wake_up_common+0x2e/0x90
[1157738.203020] RSP
[1157738.203184] CR2: 0000000000000000

ffffffff810e08be points to this line in __wake_up_common:

    list_for_each_entry_safe(curr, next, &q->task_list, task_list) {

This is the wait_queue_head_t:

crash> struct wait_queue_head_t ffff880438ef52c8
struct wait_queue_head_t {
  lock = {
    {
      rlock = {
        raw_lock = {
          val = {
            counter = 1
          }
        }
      }
    }
  },
  task_list = {
    next = 0x0,
    prev = 0xffff880438ef52d8
  }
}

nr_exclusive seems to be 1, and mode is 3 (TASK_NORMAL). The spl module is
coming from zfs(ZoL) but I dunno whether this might be a bug in the
scheduler or in the zfs. The line which led to the __wake_up is this:

    wake_up(&tq->tq_wait_waitq);
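
For what it's worth, the register dump seems consistent with the iterator walking off a NULL task_list.next. The bytes before the faulting instruction decode as "mov 0x8(%rdi),%rdx" (load q->task_list.next) and "lea -0x18(%rdx),%rax" (the container_of step), and the faulting instruction itself is "mov 0x18(%rax),%rbx". Below is a rough userspace sketch of that arithmetic, not kernel code. The 0x18 offset is read from the disassembly, the helper names are made up for illustration, and the address of task_list (head + 8) is an assumption based on R13.

```c
#include <stdint.h>

/* Offset of task_list inside the wait queue entry, taken from the
 * "lea -0x18(%rdx),%rax" displacement in the oops Code: bytes.
 * This is an assumption about this particular build, not a general fact. */
#define TASK_LIST_OFFSET 0x18u

/* Hypothetical helper: the entry pointer that container_of() style
 * arithmetic produces from a raw task_list.next value. */
static uint64_t entry_from_next(uint64_t next)
{
    return next - TASK_LIST_OFFSET;   /* wraps around when next == 0 */
}

/* Hypothetical helper: the address the iterator reads when it loads
 * curr->task_list.next to find the following entry. */
static uint64_t first_load_address(uint64_t next)
{
    return entry_from_next(next) + TASK_LIST_OFFSET;
}

/* Hypothetical helper: a freshly initialised, empty list head has both
 * next and prev pointing back at the head itself. */
static int head_looks_empty(uint64_t head, uint64_t next, uint64_t prev)
{
    return next == head && prev == head;
}
```

With next = 0, entry_from_next() gives 0xffffffffffffffe8, which matches RAX, and first_load_address() gives 0, which matches CR2. Assuming task_list lives at 0xffff880438ef52d0, head_looks_empty() is false for the dumped values (next = 0x0, prev = 0xffff880438ef52d8), so the head does not look like an initialised empty list either.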