* NVMe induced NULL deref in bt_iter()
@ 2017-06-30 17:26 Jens Axboe
  2017-07-02 10:45 ` Max Gurtovoy
  0 siblings, 1 reply; 29+ messages in thread
From: Jens Axboe @ 2017-06-30 17:26 UTC (permalink / raw)
  To: Max Gurtovoy; +Cc: linux-block@vger.kernel.org

Hi Max,

I remembered you reporting this. I think this is a regression introduced
with the scheduling, since ->rqs[] isn't static anymore. ->static_rqs[]
is, but that's not indexable by the tag we find. So I think we need to
guard those with a NULL check. The actual requests themselves are
static, so we know the memory itself isn't going away. But if we race
with completion, we could find a NULL there, validly.

Since you could reproduce it, can you try the below?

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d0be72ccb091..b856b2827157 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 		bitnr += tags->nr_reserved_tags;
 	rq = tags->rqs[bitnr];

-	if (rq->q == hctx->queue)
+	if (rq && rq->q == hctx->queue)
 		iter_data->fn(hctx, rq, iter_data->data, reserved);
 	return true;
 }
@@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	if (!reserved)
 		bitnr += tags->nr_reserved_tags;
 	rq = tags->rqs[bitnr];
-
-	iter_data->fn(rq, iter_data->data, reserved);
+	if (rq)
+		iter_data->fn(rq, iter_data->data, reserved);
 	return true;
 }

-- 
Jens Axboe
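For readers following the thread, the guard described in the message above can be illustrated outside the kernel. The following is only a minimal user-space sketch, not the blk-mq code: `struct tag_table`, `visit_tag()`, `count_busy()` and `count_requests()` are hypothetical names made up for this illustration, and the sketch is single-threaded, so it shows the shape of the NULL check rather than the race with completion itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 8

/* Hypothetical stand-in for a request; queue_id plays the role of rq->q. */
struct request {
	int queue_id;
};

/* Hypothetical stand-in for the tag map: a slot may legitimately be NULL. */
struct tag_table {
	struct request *rqs[NR_TAGS];
	unsigned int nr_reserved_tags;
};

typedef void (*busy_fn)(struct request *rq, void *data);

/* Visit one set bit, using the same guarded shape as the patched bt_iter(). */
static bool visit_tag(struct tag_table *tags, unsigned int bitnr,
		      bool reserved, int queue_id, busy_fn fn, void *data)
{
	struct request *rq;

	assert(tags != NULL);

	if (!reserved)
		bitnr += tags->nr_reserved_tags;
	if (bitnr >= NR_TAGS)
		return true;

	rq = tags->rqs[bitnr];

	/* The slot may be empty, so test the pointer before dereferencing it. */
	if (rq && rq->queue_id == queue_id)
		fn(rq, data);
	return true;
}

/* Callback that counts how many requests were visited. */
static void count_busy(struct request *rq, void *data)
{
	(void)rq;
	++*(unsigned int *)data;
}

/* Walk every non-reserved tag and count requests owned by queue_id. */
static unsigned int count_requests(struct tag_table *tags, int queue_id)
{
	unsigned int n = 0;
	unsigned int bit;

	for (bit = 0; bit + tags->nr_reserved_tags < NR_TAGS; bit++)
		visit_tag(tags, bit, false, queue_id, count_busy, &n);
	return n;
}
```

Calling `count_requests()` on a table whose `rqs[]` array contains NULL entries simply skips those slots; without the `rq &&` test the comparison would dereference a NULL pointer.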
* Re: NVMe induced NULL deref in bt_iter()
  2017-06-30 17:26 NVMe induced NULL deref in bt_iter() Jens Axboe
@ 2017-07-02 10:45 ` Max Gurtovoy
  0 siblings, 0 replies; 29+ messages in thread
From: Max Gurtovoy @ 2017-07-02 10:45 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, sagig

[-- Attachment #1: Type: text/plain, Size: 2592 bytes --]

On 6/30/2017 8:26 PM, Jens Axboe wrote:
> Hi Max,

Hi Jens,

>
> I remembered you reporting this. I think this is a regression introduced
> with the scheduling, since ->rqs[] isn't static anymore. ->static_rqs[]
> is, but that's not indexable by the tag we find. So I think we need to
> guard those with a NULL check. The actual requests themselves are
> static, so we know the memory itself isn't going away. But if we race
> with completion, we could find a NULL there, validly.
>
> Since you could reproduce it, can you try the below?

I can still reproduce the NULL deref with this patch applied.

>
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index d0be72ccb091..b856b2827157 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>  		bitnr += tags->nr_reserved_tags;
>  	rq = tags->rqs[bitnr];
>
> -	if (rq->q == hctx->queue)
> +	if (rq && rq->q == hctx->queue)
>  		iter_data->fn(hctx, rq, iter_data->data, reserved);
>  	return true;
>  }
> @@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>  	if (!reserved)
>  		bitnr += tags->nr_reserved_tags;
>  	rq = tags->rqs[bitnr];
> -
> -	iter_data->fn(rq, iter_data->data, reserved);
> +	if (rq)
> +		iter_data->fn(rq, iter_data->data, reserved);
>  	return true;
>  }

See the attached file for the dmesg output.

Output of gdb:

(gdb) list *(blk_mq_flush_busy_ctxs+0x48)
0xffffffff8127b108 is in blk_mq_flush_busy_ctxs (./include/linux/sbitmap.h:234).
229
230		for (i = 0; i < sb->map_nr; i++) {
231			struct sbitmap_word *word = &sb->map[i];
232			unsigned int off, nr;
233
234			if (!word->word)
235				continue;
236
237			nr = 0;
238			off = i << sb->shift;

When I change the "if (!word->word)" to "if (word && !word->word)",
I get a NULL deref at "nr = find_next_bit(&word->word, word->depth, nr);".
It seems that word somehow becomes NULL.

Adding the linux-nvme guys too.
Sagi has mentioned that this can be NULL only if we remove the tagset
while I/O is trying to get a tag. When killing the target we get into
error recovery and periodic reconnects, which does _NOT_ include freeing
the tagset, so this is probably the admin tagset.

Sagi,
you've mentioned a patch for centralizing the treatment of the admin
tagset in the nvme core. I think I missed this patch, so can you please
send a pointer to it and I'll check if it helps?

[-- Attachment #2: null_deref_4_12_rc_5.log --]
[-- Type: text/plain, Size: 96308 bytes --]

Linux version 4.12.0-rc5+ (root@rsws34) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #62 SMP Mon Jun 19 15:33:59 IDT 2017
nvme nvme0: creating 24 I/O queues.
nvme nvme0: new ctrl: NQN "subsystem_rsws33.mtr.labs.mlnx_1", addr 11.212.40.110:1023 perf: interrupt took too long (2502 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 perf: interrupt took too long (3128 > 3127), lowering kernel.perf_event_max_sample_rate to 63000 perf: interrupt took too long (3918 > 3910), lowering kernel.perf_event_max_sample_rate to 51000 perf: interrupt took too long (4903 > 4897), lowering kernel.perf_event_max_sample_rate to 40000 blk_update_request: I/O error, dev nvme0n1, sector 486252953 blk_update_request: I/O error, dev nvme0n1, sector 254324451 blk_update_request: I/O error, dev nvme0n1, sector 486828506 blk_update_request: I/O error, dev nvme0n1, sector 175268160 blk_update_request: I/O error, dev nvme0n1, sector 204249372 blk_update_request: I/O error, dev nvme0n1, sector 45725385 blk_update_request: I/O error, dev nvme0n1, sector 503167578 blk_update_request: I/O error, dev nvme0n1, sector 220671103 blk_update_request: I/O error, dev nvme0n1, sector 351009498 blk_update_request: I/O error, dev nvme0n1, sector 509040223 nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning Buffer I/O error on dev nvme0n100, logical block 65535984, async page read nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning Buffer I/O error on dev nvme0n100, logical block 65535998, async page read Buffer I/O error on dev nvme0n100, logical block 0, async page read Buffer I/O error on dev nvme0n100, logical block 1, 
async page read Buffer I/O error on dev nvme0n1, logical block 65535984, async page read Buffer I/O error on dev nvme0n10, logical block 65535984, async page read nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning Buffer I/O error on dev nvme0n1, logical block 65535998, async page read Buffer I/O error on dev nvme0n10, logical block 65535998, async page read Buffer I/O error on dev nvme0n1, logical block 0, async page read Buffer I/O error on dev nvme0n1, logical block 1, async page read nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: 
nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: 
rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: 
rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme 
nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: 
rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: 
rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: 
Identify failure nvme nvme0: rescanning BUG: unable to handle kernel NULL pointer dereference at (null) nvme nvme0: Reconnecting in 10 seconds... IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#1] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 1 PID: 973 Comm: kworker/1:1H Tainted: G E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046daa2a40 task.stack: ffffc9000497c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc9000497fba8 EFLAGS: 00010246 RAX: ffffc9000497fc18 RBX: 0000000000000000 RCX: ffff8804572f0040 RDX: ffff8804458b7ca0 RSI: ffffc9000497fc18 RDI: ffff8804572f0000 RBP: ffffc9000497fbf8 R08: 0000000000000001 R09: fffffffffff68c3c R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804572f00d8 R13: ffffc9000497fbb8 R14: ffff8804572f0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fa40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? 
maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000497fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce3 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#2] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 2 PID: 5734 Comm: kworker/2:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046a9dc1c0 task.stack: ffffc90004f1c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004f1fba8 EFLAGS: 00010246 RAX: ffffc90004f1fc18 RBX: 0000000000000000 RCX: ffff880457300040 RDX: ffff8804458b7cc0 RSI: ffffc90004f1fc18 RDI: ffff880457300000 RBP: ffffc90004f1fbf8 R08: 0000000000000001 R09: fffffffffff69477 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573000d8 R13: ffffc90004f1fbb8 R14: ffff880457300000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: 
blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004f1fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce4 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#3] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 3 PID: 4319 Comm: kworker/3:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff880467c78640 task.stack: ffffc90006efc000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90006effba8 EFLAGS: 00010246 RAX: ffffc90006effc18 RBX: 0000000000000000 RCX: ffff880457320040 RDX: ffff8804458b7ce0 RSI: ffffc90006effc18 RDI: ffff880457320000 RBP: ffffc90006effbf8 R08: 0000000000000001 R09: fffffffffff69c47 R10: 
0000000000000001 R11: 0000000000000001 R12: ffff8804573200d8 R13: ffffc90006effbb8 R14: ffff880457320000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 ? pick_next_entity+0x7b/0x120 worker_thread+0x77/0x420 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90006effba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce5 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#4] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 4 PID: 1029 Comm: kworker/4:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046c46a7c0 
task.stack: ffffc90004974000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004977ba8 EFLAGS: 00010246 RAX: ffffc90004977c18 RBX: 0000000000000000 RCX: ffff880457330040 RDX: ffff8804458b7d00 RSI: ffffc90004977c18 RDI: ffff880457330000 RBP: ffffc90004977bf8 R08: 0000000000000001 R09: fffffffffff6a46a R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573300d8 R13: ffffc90004977bb8 R14: ffff880457330000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004977ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce6 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#5] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 5 PID: 964 Comm: kworker/5:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046cd1e5c0 task.stack: ffffc90004044000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004047ba8 EFLAGS: 00010246 RAX: ffffc90004047c18 RBX: 0000000000000000 RCX: ffff880457340040 RDX: ffff8804458b7d20 RSI: ffffc90004047c18 RDI: ffff880457340000 RBP: ffffc90004047bf8 R08: 0000000000000001 R09: fffffffffff6accd R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573400d8 R13: ffffc90004047bb8 R14: ffff880457340000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004047ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce7 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#6] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 7 PID: 976 Comm: kworker/7:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086dce2500 task.stack: ffffc90004064000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004067ba8 EFLAGS: 00010246 RAX: ffffc90004067c18 RBX: 0000000000000000 RCX: ffff880850ab0040 RDX: ffff8808509a9ba0 RSI: ffffc90004067c18 RDI: ffff880850ab0000 RBP: ffffc90004067bf8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: 
ffff880850ab00d8 R13: ffffc90004067bb8 R14: ffff880850ab0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fa40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004067ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce8 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#7] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 6 PID: 936 Comm: kworker/6:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086cede840 task.stack: ffffc90004034000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 
RSP: 0018:ffffc90004037ba8 EFLAGS: 00010246 RAX: ffffc90004037c18 RBX: 0000000000000000 RCX: ffff880850aa0040 RDX: ffff8808509a9bc0 RSI: ffffc90004037c18 RDI: ffff880850aa0000 RBP: ffffc90004037bf8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850aa00d8 R13: ffffc90004037bb8 R14: ffff880850aa0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004037ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce9 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#8] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last 
unloaded: mlx4_core] CPU: 8 PID: 1211 Comm: kworker/8:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c20e540 task.stack: ffffc90004084000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004087ba8 EFLAGS: 00010246 RAX: ffffc90004087c18 RBX: 0000000000000000 RCX: ffff880850ac0040 RDX: ffff8808509a9b80 RSI: ffffc90004087c18 RDI: ffff880850ac0000 RBP: ffffc90004087bf8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850ac00d8 R13: ffffc90004087bb8 R14: ffff880850ac0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004087ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cea ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#9] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 9 PID: 949 Comm: kworker/9:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086e3e01c0 task.stack: ffffc90003ef0000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003ef3ba8 EFLAGS: 00010246 RAX: ffffc90003ef3c18 RBX: 0000000000000000 RCX: ffff880850ad0040 RDX: ffff8808509a9b60 RSI: ffffc90003ef3c18 RDI: ffff880850ad0000 RBP: ffffc90003ef3bf8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850ad00d8 R13: ffffc90003ef3bb8 R14: ffff880850ad0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003ef3ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ceb ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#10] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 10 PID: 950 Comm: kworker/10:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086e3dc180 task.stack: ffffc90003f50000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003f53ba8 EFLAGS: 00010246 RAX: ffffc90003f53c18 RBX: 0000000000000000 RCX: ffff880850ae0040 RDX: ffff8808509a9b40 RSI: ffffc90003f53c18 RDI: ffff880850ae0000 RBP: ffffc90003f53bf8 R08: 0000000000000001 R09: fffffffffff6d8a7 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850ae00d8 R13: 
ffffc90003f53bb8 R14: ffff880850ae0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003f53ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cec ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#11] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 11 PID: 960 Comm: kworker/11:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c0a40c0 task.stack: ffffc9000400c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 
0018:ffffc9000400fba8 EFLAGS: 00010246 RAX: ffffc9000400fc18 RBX: 0000000000000000 RCX: ffff880850af0040 RDX: ffff8808509a9b20 RSI: ffffc9000400fc18 RDI: ffff880850af0000 RBP: ffffc9000400fbf8 R08: 0000000000000001 R09: fffffffffff6e0e5 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850af00d8 R13: ffffc9000400fbb8 R14: ffff880850af0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000400fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ced ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#12] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last 
unloaded: mlx4_core] CPU: 12 PID: 2505 Comm: kworker/12:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046e15a440 task.stack: ffffc90005e4c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90005e4fba8 EFLAGS: 00010246 RAX: ffffc90005e4fc18 RBX: 0000000000000000 RCX: ffff880457350040 RDX: ffff8804458b7d40 RSI: ffffc90005e4fc18 RDI: ffff880457350000 RBP: ffffc90005e4fbf8 R08: 0000000000000001 R09: fffffffffff6e459 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573500d8 R13: ffffc90005e4fbb8 R14: ffff880457350000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 ? pick_next_entity+0x7b/0x120 worker_thread+0x77/0x420 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90005e4fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cee ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#13] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 13 PID: 1001 Comm: kworker/13:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046cfc2800 task.stack: ffffc90004674000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004677ba8 EFLAGS: 00010246 RAX: ffffc90004677c18 RBX: 0000000000000000 RCX: ffff880457360040 RDX: ffff8804458b7d60 RSI: ffffc90004677c18 RDI: ffff880457360000 RBP: ffffc90004677bf8 R08: 0000000000000001 R09: fffffffffff6f05a R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573600d8 R13: ffffc90004677bb8 R14: ffff880457360000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fbc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004677ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cef ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#14] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 14 PID: 947 Comm: kworker/14:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046ccd8500 task.stack: ffffc90003ed8000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003edbba8 EFLAGS: 00010246 RAX: ffffc90003edbc18 RBX: 0000000000000000 RCX: ffff880457370040 RDX: ffff8804458b7d80 RSI: ffffc90003edbc18 RDI: ffff880457370000 RBP: ffffc90003edbbf8 R08: 0000000000000001 R09: fffffffffff6f8d1 R10: 0000000000000001 R11: 0000000000000001 R12: 
ffff8804573700d8 R13: ffffc90003edbbb8 R14: ffff880457370000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003edbba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf0 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#15] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 15 PID: 987 Comm: kworker/15:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046cd1c580 task.stack: ffffc9000408c000 RIP: 
0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc9000408fba8 EFLAGS: 00010246 RAX: ffffc9000408fc18 RBX: 0000000000000000 RCX: ffff880457380040 RDX: ffff8804458b7da0 RSI: ffffc9000408fc18 RDI: ffff880457380000 RBP: ffffc9000408fbf8 R08: 0000000000000001 R09: fffffffffff70360 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573800d8 R13: ffffc9000408fbb8 R14: ffff880457380000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000408fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf1 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#16] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) 
sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 16 PID: 963 Comm: kworker/16:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046d1988c0 task.stack: ffffc90004024000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004027ba8 EFLAGS: 00010246 RAX: ffffc90004027c18 RBX: 0000000000000000 RCX: ffff880457390040 RDX: ffff8804458b7dc0 RSI: ffffc90004027c18 RDI: ffff880457390000 RBP: ffffc90004027bf8 R08: 0000000000000001 R09: fffffffffff70c9b R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573900d8 R13: ffffc90004027bb8 R14: ffff880457390000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004027ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf2 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#17] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 17 PID: 5820 Comm: kworker/17:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046d2c69c0 task.stack: ffffc90004094000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004097ba8 EFLAGS: 00010246 RAX: ffffc90004097c18 RBX: 0000000000000000 RCX: ffff8804573b0040 RDX: ffff8804458b7de0 RSI: ffffc90004097c18 RDI: ffff8804573b0000 RBP: ffffc90004097bf8 R08: 0000000000000001 R09: fffffffffff7159b R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573b00d8 R13: ffffc90004097bb8 R14: ffff8804573b0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004097ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf3 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#18] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 18 PID: 920 Comm: kworker/18:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086d87a7c0 task.stack: ffffc90003ed0000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003ed3ba8 EFLAGS: 00010246 RAX: ffffc90003ed3c18 RBX: 0000000000000000 RCX: ffff880850b00040 RDX: ffff8808509a9b00 RSI: ffffc90003ed3c18 RDI: ffff880850b00000 RBP: ffffc90003ed3bf8 R08: 0000000000000001 R09: fffffffffff72142 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b000d8 R13: 
ffffc90003ed3bb8 R14: ffff880850b00000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003ed3ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf4 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#19] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 19 PID: 1259 Comm: kworker/19:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c1864c0 task.stack: ffffc9000543c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 
0018:ffffc9000543fba8 EFLAGS: 00010246 RAX: ffffc9000543fc18 RBX: 0000000000000000 RCX: ffff880850b10040 RDX: ffff8808509a9ae0 RSI: ffffc9000543fc18 RDI: ffff880850b10000 RBP: ffffc9000543fbf8 R08: 0000000000000001 R09: fffffffffff72b3b R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b100d8 R13: ffffc9000543fbb8 R14: ffff880850b10000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fbc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000543fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf5 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#20] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) 
scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 20 PID: 989 Comm: kworker/20:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086df08740 task.stack: ffffc9000427c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc9000427fba8 EFLAGS: 00010246 RAX: ffffc9000427fc18 RBX: 0000000000000000 RCX: ffff880850b20040 RDX: ffff8808509a9ac0 RSI: ffffc9000427fc18 RDI: ffff880850b20000 RBP: ffffc9000427fbf8 R08: 0000000000000001 R09: fffffffffff7351d R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b200d8 R13: ffffc9000427fbb8 R14: ffff880850b20000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000427fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf6 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#21] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 21 PID: 919 Comm: kworker/21:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086da2e9c0 task.stack: ffffc900047a4000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc900047a7ba8 EFLAGS: 00010246 RAX: ffffc900047a7c18 RBX: 0000000000000000 RCX: ffff880850b30040 RDX: ffff8808509a9aa0 RSI: ffffc900047a7c18 RDI: ffff880850b30000 RBP: ffffc900047a7bf8 R08: 0000000000000001 R09: fffffffffff73ee8 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b300d8 R13: ffffc900047a7bb8 R14: ffff880850b30000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc900047a7ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf7 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#22] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 22 PID: 932 Comm: kworker/22:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c5e63c0 task.stack: ffffc90003e78000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003e7bba8 EFLAGS: 00010246 RAX: ffffc90003e7bc18 RBX: 0000000000000000 RCX: ffff880850b40040 RDX: ffff8808509a9a80 RSI: ffffc90003e7bc18 RDI: ffff880850b40000 RBP: ffffc90003e7bbf8 R08: 0000000000000001 R09: fffffffffff74859 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b400d8 R13: 
ffffc90003e7bbb8 R14: ffff880850b40000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003e7bba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf8 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#23] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 23 PID: 959 Comm: kworker/23:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c0e6140 task.stack: ffffc90004004000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 
0018:ffffc90004007ba8 EFLAGS: 00010246 RAX: ffffc90004007c18 RBX: 0000000000000000 RCX: ffff880850b50040 RDX: ffff8808509a9a60 RSI: ffffc90004007c18 RDI: ffff880850b50000 RBP: ffffc90004007bf8 R08: 0000000000000001 R09: fffffffffff751ab R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b500d8 R13: ffffc90004007bb8 R14: ffff880850b50000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004007ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf9 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#24] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last 
unloaded: mlx4_core] CPU: 0 PID: 928 Comm: kworker/0:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046c442780 task.stack: ffffc90003ef8000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003efbba8 EFLAGS: 00010246 RAX: ffffc90003efbc18 RBX: 0000000000000000 RCX: ffff8804572e0040 RDX: ffff8804458b7c80 RSI: ffffc90003efbc18 RDI: ffff8804572e0000 RBP: ffffc90003efbbf8 R08: 0000000000000002 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804572e00d8 R13: ffffc90003efbbb8 R14: ffff8804572e0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406f0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 ? blk_mq_requeue_work+0x18f/0x1b0 ? pwq_activate_delayed_work+0x47/0x70 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 ? schedule+0x1/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003efbba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cfa ]--- nvme-fabrics ctl: nvme_revalidate_ns: Identify failure BUG: unable to handle kernel NULL pointer dereference at (null) IP: sbitmap_any_bit_set+0x11/0x40 PGD 0 P4D 0 Oops: 0000 [#25] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 0 PID: 14184 Comm: kworker/0:2H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_requeue_work task: ffff88046d2c8a00 task.stack: ffffc900040ec000 RIP: 0010:sbitmap_any_bit_set+0x11/0x40 RSP: 0018:ffffc900040efbd8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8804572e0000 RCX: ffff880850a3dbb0 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8804572e00d8 RBP: ffffc900040efbd8 R08: 0000000000000001 R09: fffffffffffffff4 R10: 0000000000000005 R11: 000000000001c2c8 R12: ffff8804572e0000 R13: ffff880850a3d560 R14: 0000000000000000 R15: ffffc900040efc38 FS: 0000000000000000(0000) GS:ffff88047fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406f0 Call Trace: 
blk_mq_hctx_has_pending+0x18/0x70 blk_mq_run_hw_queues+0x42/0x70 blk_mq_requeue_work+0x18f/0x1b0 ? finish_task_switch+0x1d5/0x230 ? pick_next_task_idle+0x40/0x50 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 4f 10 2b 74 01 08 39 57 08 77 d8 c9 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 8b 77 08 55 48 89 e5 85 f6 74 22 48 8b 57 10 31 c0 <48> 83 3a 00 74 0f eb 18 48 8b 4a 40 48 83 c2 40 48 85 c9 75 0b RIP: sbitmap_any_bit_set+0x11/0x40 RSP: ffffc900040efbd8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cfb ]--- nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: 
nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: 
Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure 
nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: Connect rejected: status 8 (invalid service ID). 
nvme nvme0: rdma_resolve_addr wait failed (-104). nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: Failed reconnect attempt 1 nvme nvme0: Reconnecting in 10 seconds... nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify 
failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure 
nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: 
nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: 
Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure 
nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: 
nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: Connect rejected: status 8 (invalid service ID). nvme nvme0: rdma_resolve_addr wait failed (-104). nvme nvme0: Failed reconnect attempt 2 nvme nvme0: Reconnecting in 10 seconds... nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: Connect rejected: status 8 (invalid service ID). nvme nvme0: rdma_resolve_addr wait failed (-104). nvme nvme0: Failed reconnect attempt 3 nvme nvme0: Reconnecting in 10 seconds... nvme nvme0: Connect rejected: status 8 (invalid service ID). nvme nvme0: rdma_resolve_addr wait failed (-104). nvme nvme0: Failed reconnect attempt 4 nvme nvme0: Reconnecting in 10 seconds... nvme nvme0: Connect rejected: status 8 (invalid service ID). 
nvme nvme0: rdma_resolve_addr wait failed (-104). nvme nvme0: Failed reconnect attempt 5 nvme nvme0: Reconnecting in 10 seconds... ^ permalink raw reply [flat|nested] 29+ messages in thread
* NVMe induced NULL deref in bt_iter()
@ 2017-07-02 10:45 ` Max Gurtovoy
  0 siblings, 0 replies; 29+ messages in thread
From: Max Gurtovoy @ 2017-07-02 10:45 UTC (permalink / raw)

On 6/30/2017 8:26 PM, Jens Axboe wrote:
> Hi Max,

Hi Jens,

>
> I remembered you reporting this. I think this is a regression introduced
> with the scheduling, since ->rqs[] isn't static anymore. ->static_rqs[]
> is, but that's not indexable by the tag we find. So I think we need to
> guard those with a NULL check. The actual requests themselves are
> static, so we know the memory itself isn't going away. But if we race
> with completion, we could find a NULL there, validly.
>
> Since you could reproduce it, can you try the below?

I still can repro the null deref with this patch applied.

>
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index d0be72ccb091..b856b2827157 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>  		bitnr += tags->nr_reserved_tags;
>  	rq = tags->rqs[bitnr];
>
> -	if (rq->q == hctx->queue)
> +	if (rq && rq->q == hctx->queue)
>  		iter_data->fn(hctx, rq, iter_data->data, reserved);
>  	return true;
>  }
> @@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>  	if (!reserved)
>  		bitnr += tags->nr_reserved_tags;
>  	rq = tags->rqs[bitnr];
> -
> -	iter_data->fn(rq, iter_data->data, reserved);
> +	if (rq)
> +		iter_data->fn(rq, iter_data->data, reserved);
>  	return true;
>  }

see the attached file for dmesg output.

output of gdb:

(gdb) list *(blk_mq_flush_busy_ctxs+0x48)
0xffffffff8127b108 is in blk_mq_flush_busy_ctxs (./include/linux/sbitmap.h:234).
229
230		for (i = 0; i < sb->map_nr; i++) {
231			struct sbitmap_word *word = &sb->map[i];
232			unsigned int off, nr;
233
234			if (!word->word)
235				continue;
236
237			nr = 0;
238			off = i << sb->shift;

When I change the "if (!word->word)" to "if (word && !word->word)",
I get a NULL deref at "nr = find_next_bit(&word->word, word->depth, nr);"
instead. It seems that word somehow becomes NULL.

Adding the linux-nvme guys too.
Sagi has mentioned that this can be NULL only if we remove the tagset
while I/O is trying to get a tag. When killing the target we get into
error recovery and periodic reconnects, which does _NOT_ include freeing
the tagset, so this is probably the admin tagset.

Sagi,
you mentioned a patch for centralizing the treatment of the admin
tagset in the nvme core. I think I missed this patch, so can you please
send a pointer to it and I'll check if it helps?

-------------- next part --------------
Linux version 4.12.0-rc5+ (root@rsws34) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #62 SMP Mon Jun 19 15:33:59 IDT 2017
nvme nvme0: creating 24 I/O queues.
nvme nvme0: new ctrl: NQN "subsystem_rsws33.mtr.labs.mlnx_1", addr 11.212.40.110:1023 perf: interrupt took too long (2502 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 perf: interrupt took too long (3128 > 3127), lowering kernel.perf_event_max_sample_rate to 63000 perf: interrupt took too long (3918 > 3910), lowering kernel.perf_event_max_sample_rate to 51000 perf: interrupt took too long (4903 > 4897), lowering kernel.perf_event_max_sample_rate to 40000 blk_update_request: I/O error, dev nvme0n1, sector 486252953 blk_update_request: I/O error, dev nvme0n1, sector 254324451 blk_update_request: I/O error, dev nvme0n1, sector 486828506 blk_update_request: I/O error, dev nvme0n1, sector 175268160 blk_update_request: I/O error, dev nvme0n1, sector 204249372 blk_update_request: I/O error, dev nvme0n1, sector 45725385 blk_update_request: I/O error, dev nvme0n1, sector 503167578 blk_update_request: I/O error, dev nvme0n1, sector 220671103 blk_update_request: I/O error, dev nvme0n1, sector 351009498 blk_update_request: I/O error, dev nvme0n1, sector 509040223 nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning Buffer I/O error on dev nvme0n100, logical block 65535984, async page read nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning Buffer I/O error on dev nvme0n100, logical block 65535998, async page read Buffer I/O error on dev nvme0n100, logical block 0, async page read Buffer I/O error on dev nvme0n100, logical block 1, 
async page read Buffer I/O error on dev nvme0n1, logical block 65535984, async page read Buffer I/O error on dev nvme0n10, logical block 65535984, async page read nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning Buffer I/O error on dev nvme0n1, logical block 65535998, async page read Buffer I/O error on dev nvme0n10, logical block 65535998, async page read Buffer I/O error on dev nvme0n1, logical block 0, async page read Buffer I/O error on dev nvme0n1, logical block 1, async page read nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: 
nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: 
rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: 
rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme 
nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: 
rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: 
rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme nvme0: rescanning nvme-fabrics ctl: nvme_revalidate_ns: 
Identify failure nvme nvme0: rescanning BUG: unable to handle kernel NULL pointer dereference at (null) nvme nvme0: Reconnecting in 10 seconds... IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#1] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 1 PID: 973 Comm: kworker/1:1H Tainted: G E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046daa2a40 task.stack: ffffc9000497c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc9000497fba8 EFLAGS: 00010246 RAX: ffffc9000497fc18 RBX: 0000000000000000 RCX: ffff8804572f0040 RDX: ffff8804458b7ca0 RSI: ffffc9000497fc18 RDI: ffff8804572f0000 RBP: ffffc9000497fbf8 R08: 0000000000000001 R09: fffffffffff68c3c R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804572f00d8 R13: ffffc9000497fbb8 R14: ffff8804572f0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fa40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? 
maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000497fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce3 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#2] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 2 PID: 5734 Comm: kworker/2:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046a9dc1c0 task.stack: ffffc90004f1c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004f1fba8 EFLAGS: 00010246 RAX: ffffc90004f1fc18 RBX: 0000000000000000 RCX: ffff880457300040 RDX: ffff8804458b7cc0 RSI: ffffc90004f1fc18 RDI: ffff880457300000 RBP: ffffc90004f1fbf8 R08: 0000000000000001 R09: fffffffffff69477 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573000d8 R13: ffffc90004f1fbb8 R14: ffff880457300000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: 
blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004f1fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce4 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#3] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 3 PID: 4319 Comm: kworker/3:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff880467c78640 task.stack: ffffc90006efc000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90006effba8 EFLAGS: 00010246 RAX: ffffc90006effc18 RBX: 0000000000000000 RCX: ffff880457320040 RDX: ffff8804458b7ce0 RSI: ffffc90006effc18 RDI: ffff880457320000 RBP: ffffc90006effbf8 R08: 0000000000000001 R09: fffffffffff69c47 R10: 
0000000000000001 R11: 0000000000000001 R12: ffff8804573200d8 R13: ffffc90006effbb8 R14: ffff880457320000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 ? pick_next_entity+0x7b/0x120 worker_thread+0x77/0x420 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90006effba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce5 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#4] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 4 PID: 1029 Comm: kworker/4:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046c46a7c0 
task.stack: ffffc90004974000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004977ba8 EFLAGS: 00010246 RAX: ffffc90004977c18 RBX: 0000000000000000 RCX: ffff880457330040 RDX: ffff8804458b7d00 RSI: ffffc90004977c18 RDI: ffff880457330000 RBP: ffffc90004977bf8 R08: 0000000000000001 R09: fffffffffff6a46a R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573300d8 R13: ffffc90004977bb8 R14: ffff880457330000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004977ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce6 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#5] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 5 PID: 964 Comm: kworker/5:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046cd1e5c0 task.stack: ffffc90004044000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004047ba8 EFLAGS: 00010246 RAX: ffffc90004047c18 RBX: 0000000000000000 RCX: ffff880457340040 RDX: ffff8804458b7d20 RSI: ffffc90004047c18 RDI: ffff880457340000 RBP: ffffc90004047bf8 R08: 0000000000000001 R09: fffffffffff6accd R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573400d8 R13: ffffc90004047bb8 R14: ffff880457340000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004047ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce7 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#6] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 7 PID: 976 Comm: kworker/7:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086dce2500 task.stack: ffffc90004064000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004067ba8 EFLAGS: 00010246 RAX: ffffc90004067c18 RBX: 0000000000000000 RCX: ffff880850ab0040 RDX: ffff8808509a9ba0 RSI: ffffc90004067c18 RDI: ffff880850ab0000 RBP: ffffc90004067bf8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: 
ffff880850ab00d8 R13: ffffc90004067bb8 R14: ffff880850ab0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fa40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004067ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce8 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#7] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 6 PID: 936 Comm: kworker/6:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086cede840 task.stack: ffffc90004034000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 
RSP: 0018:ffffc90004037ba8 EFLAGS: 00010246 RAX: ffffc90004037c18 RBX: 0000000000000000 RCX: ffff880850aa0040 RDX: ffff8808509a9bc0 RSI: ffffc90004037c18 RDI: ffff880850aa0000 RBP: ffffc90004037bf8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850aa00d8 R13: ffffc90004037bb8 R14: ffff880850aa0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004037ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ce9 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#8] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last 
unloaded: mlx4_core] CPU: 8 PID: 1211 Comm: kworker/8:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c20e540 task.stack: ffffc90004084000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004087ba8 EFLAGS: 00010246 RAX: ffffc90004087c18 RBX: 0000000000000000 RCX: ffff880850ac0040 RDX: ffff8808509a9b80 RSI: ffffc90004087c18 RDI: ffff880850ac0000 RBP: ffffc90004087bf8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850ac00d8 R13: ffffc90004087bb8 R14: ffff880850ac0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004087ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cea ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#9] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 9 PID: 949 Comm: kworker/9:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086e3e01c0 task.stack: ffffc90003ef0000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003ef3ba8 EFLAGS: 00010246 RAX: ffffc90003ef3c18 RBX: 0000000000000000 RCX: ffff880850ad0040 RDX: ffff8808509a9b60 RSI: ffffc90003ef3c18 RDI: ffff880850ad0000 RBP: ffffc90003ef3bf8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850ad00d8 R13: ffffc90003ef3bb8 R14: ffff880850ad0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003ef3ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ceb ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#10] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 10 PID: 950 Comm: kworker/10:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086e3dc180 task.stack: ffffc90003f50000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003f53ba8 EFLAGS: 00010246 RAX: ffffc90003f53c18 RBX: 0000000000000000 RCX: ffff880850ae0040 RDX: ffff8808509a9b40 RSI: ffffc90003f53c18 RDI: ffff880850ae0000 RBP: ffffc90003f53bf8 R08: 0000000000000001 R09: fffffffffff6d8a7 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850ae00d8 R13: 
ffffc90003f53bb8 R14: ffff880850ae0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003f53ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cec ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#11] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 11 PID: 960 Comm: kworker/11:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c0a40c0 task.stack: ffffc9000400c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 
0018:ffffc9000400fba8 EFLAGS: 00010246 RAX: ffffc9000400fc18 RBX: 0000000000000000 RCX: ffff880850af0040 RDX: ffff8808509a9b20 RSI: ffffc9000400fc18 RDI: ffff880850af0000 RBP: ffffc9000400fbf8 R08: 0000000000000001 R09: fffffffffff6e0e5 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850af00d8 R13: ffffc9000400fbb8 R14: ffff880850af0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000400fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017ced ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#12] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last 
unloaded: mlx4_core] CPU: 12 PID: 2505 Comm: kworker/12:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046e15a440 task.stack: ffffc90005e4c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90005e4fba8 EFLAGS: 00010246 RAX: ffffc90005e4fc18 RBX: 0000000000000000 RCX: ffff880457350040 RDX: ffff8804458b7d40 RSI: ffffc90005e4fc18 RDI: ffff880457350000 RBP: ffffc90005e4fbf8 R08: 0000000000000001 R09: fffffffffff6e459 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573500d8 R13: ffffc90005e4fbb8 R14: ffff880457350000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 ? pick_next_entity+0x7b/0x120 worker_thread+0x77/0x420 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90005e4fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cee ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#13] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 13 PID: 1001 Comm: kworker/13:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046cfc2800 task.stack: ffffc90004674000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004677ba8 EFLAGS: 00010246 RAX: ffffc90004677c18 RBX: 0000000000000000 RCX: ffff880457360040 RDX: ffff8804458b7d60 RSI: ffffc90004677c18 RDI: ffff880457360000 RBP: ffffc90004677bf8 R08: 0000000000000001 R09: fffffffffff6f05a R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573600d8 R13: ffffc90004677bb8 R14: ffff880457360000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fbc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004677ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cef ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#14] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 14 PID: 947 Comm: kworker/14:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046ccd8500 task.stack: ffffc90003ed8000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003edbba8 EFLAGS: 00010246 RAX: ffffc90003edbc18 RBX: 0000000000000000 RCX: ffff880457370040 RDX: ffff8804458b7d80 RSI: ffffc90003edbc18 RDI: ffff880457370000 RBP: ffffc90003edbbf8 R08: 0000000000000001 R09: fffffffffff6f8d1 R10: 0000000000000001 R11: 0000000000000001 R12: 
ffff8804573700d8 R13: ffffc90003edbbb8 R14: ffff880457370000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003edbba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf0 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#15] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 15 PID: 987 Comm: kworker/15:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046cd1c580 task.stack: ffffc9000408c000 RIP: 
0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc9000408fba8 EFLAGS: 00010246 RAX: ffffc9000408fc18 RBX: 0000000000000000 RCX: ffff880457380040 RDX: ffff8804458b7da0 RSI: ffffc9000408fc18 RDI: ffff880457380000 RBP: ffffc9000408fbf8 R08: 0000000000000001 R09: fffffffffff70360 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573800d8 R13: ffffc9000408fbb8 R14: ffff880457380000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000408fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf1 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#16] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) 
sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 16 PID: 963 Comm: kworker/16:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046d1988c0 task.stack: ffffc90004024000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004027ba8 EFLAGS: 00010246 RAX: ffffc90004027c18 RBX: 0000000000000000 RCX: ffff880457390040 RDX: ffff8804458b7dc0 RSI: ffffc90004027c18 RDI: ffff880457390000 RBP: ffffc90004027bf8 R08: 0000000000000001 R09: fffffffffff70c9b R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573900d8 R13: ffffc90004027bb8 R14: ffff880457390000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? ttwu_do_wakeup+0x22/0x100 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004027ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf2 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#17] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 17 PID: 5820 Comm: kworker/17:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046d2c69c0 task.stack: ffffc90004094000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90004097ba8 EFLAGS: 00010246 RAX: ffffc90004097c18 RBX: 0000000000000000 RCX: ffff8804573b0040 RDX: ffff8804458b7de0 RSI: ffffc90004097c18 RDI: ffff8804573b0000 RBP: ffffc90004097bf8 R08: 0000000000000001 R09: fffffffffff7159b R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804573b00d8 R13: ffffc90004097bb8 R14: ffff8804573b0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004097ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf3 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#18] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 18 PID: 920 Comm: kworker/18:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086d87a7c0 task.stack: ffffc90003ed0000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003ed3ba8 EFLAGS: 00010246 RAX: ffffc90003ed3c18 RBX: 0000000000000000 RCX: ffff880850b00040 RDX: ffff8808509a9b00 RSI: ffffc90003ed3c18 RDI: ffff880850b00000 RBP: ffffc90003ed3bf8 R08: 0000000000000001 R09: fffffffffff72142 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b000d8 R13: 
ffffc90003ed3bb8 R14: ffff880850b00000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003ed3ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf4 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#19] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 19 PID: 1259 Comm: kworker/19:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c1864c0 task.stack: ffffc9000543c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 
0018:ffffc9000543fba8 EFLAGS: 00010246 RAX: ffffc9000543fc18 RBX: 0000000000000000 RCX: ffff880850b10040 RDX: ffff8808509a9ae0 RSI: ffffc9000543fc18 RDI: ffff880850b10000 RBP: ffffc9000543fbf8 R08: 0000000000000001 R09: fffffffffff72b3b R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b100d8 R13: ffffc9000543fbb8 R14: ffff880850b10000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fbc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000543fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf5 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#20] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) 
scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 20 PID: 989 Comm: kworker/20:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086df08740 task.stack: ffffc9000427c000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc9000427fba8 EFLAGS: 00010246 RAX: ffffc9000427fc18 RBX: 0000000000000000 RCX: ffff880850b20040 RDX: ffff8808509a9ac0 RSI: ffffc9000427fc18 RDI: ffff880850b20000 RBP: ffffc9000427fbf8 R08: 0000000000000001 R09: fffffffffff7351d R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b200d8 R13: ffffc9000427fbb8 R14: ffff880850b20000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc9000427fba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf6 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#21] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 21 PID: 919 Comm: kworker/21:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086da2e9c0 task.stack: ffffc900047a4000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc900047a7ba8 EFLAGS: 00010246 RAX: ffffc900047a7c18 RBX: 0000000000000000 RCX: ffff880850b30040 RDX: ffff8808509a9aa0 RSI: ffffc900047a7c18 RDI: ffff880850b30000 RBP: ffffc900047a7bf8 R08: 0000000000000001 R09: fffffffffff73ee8 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b300d8 R13: ffffc900047a7bb8 R14: ffff880850b30000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 
__blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc900047a7ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf7 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#22] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 22 PID: 932 Comm: kworker/22:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c5e63c0 task.stack: ffffc90003e78000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003e7bba8 EFLAGS: 00010246 RAX: ffffc90003e7bc18 RBX: 0000000000000000 RCX: ffff880850b40040 RDX: ffff8808509a9a80 RSI: ffffc90003e7bc18 RDI: ffff880850b40000 RBP: ffffc90003e7bbf8 R08: 0000000000000001 R09: fffffffffff74859 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b400d8 R13: 
ffffc90003e7bbb8 R14: ffff880850b40000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003e7bba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf8 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#23] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 23 PID: 959 Comm: kworker/23:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88086c0e6140 task.stack: ffffc90004004000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 
0018:ffffc90004007ba8 EFLAGS: 00010246 RAX: ffffc90004007c18 RBX: 0000000000000000 RCX: ffff880850b50040 RDX: ffff8808509a9a60 RSI: ffffc90004007c18 RDI: ffff880850b50000 RBP: ffffc90004007bf8 R08: 0000000000000001 R09: fffffffffff751ab R10: 0000000000000001 R11: 0000000000000001 R12: ffff880850b500d8 R13: ffffc90004007bb8 R14: ffff880850b50000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88087fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90004007ba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cf9 ]--- BUG: unable to handle kernel NULL pointer dereference at (null) IP: blk_mq_flush_busy_ctxs+0x48/0xc0 PGD 0 P4D 0 Oops: 0000 [#24] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last 
unloaded: mlx4_core] CPU: 0 PID: 928 Comm: kworker/0:1H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_run_work_fn task: ffff88046c442780 task.stack: ffffc90003ef8000 RIP: 0010:blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: 0018:ffffc90003efbba8 EFLAGS: 00010246 RAX: ffffc90003efbc18 RBX: 0000000000000000 RCX: ffff8804572e0040 RDX: ffff8804458b7c80 RSI: ffffc90003efbc18 RDI: ffff8804572e0000 RBP: ffffc90003efbbf8 R08: 0000000000000002 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8804572e00d8 R13: ffffc90003efbbb8 R14: ffff8804572e0000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406f0 Call Trace: blk_mq_sched_dispatch_requests+0x16d/0x190 ? blk_mq_requeue_work+0x18f/0x1b0 ? pwq_activate_delayed_work+0x47/0x70 __blk_mq_run_hw_queue+0xa0/0xb0 blk_mq_run_work_fn+0x2c/0x30 process_one_work+0x170/0x310 ? schedule+0x35/0xa0 ? schedule+0x1/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? 
kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 7d c0 48 89 75 c8 44 8b 9f e0 00 00 00 45 85 db 74 77 4c 8d 6d c0 c7 45 b8 00 00 00 00 8b 5d b8 48 c1 e3 06 49 03 9e e8 00 00 00 <48> 83 3b 00 74 48 41 8b 8e dc 00 00 00 8b 45 b8 45 31 ff d3 e0 RIP: blk_mq_flush_busy_ctxs+0x48/0xc0 RSP: ffffc90003efbba8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cfa ]--- nvme-fabrics ctl: nvme_revalidate_ns: Identify failure BUG: unable to handle kernel NULL pointer dereference at (null) IP: sbitmap_any_bit_set+0x11/0x40 PGD 0 P4D 0 Oops: 0000 [#25] SMP Modules linked in: nvme_rdma rdma_cm iw_cm nvme_fabrics nvme_core ib_ipoib ib_cm netconsole configfs rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd grace autofs4 sunrpc dm_mirror dm_region_hash dm_log dm_multipath uinput iTCO_wdt iTCO_vendor_support sg pcspkr serio_raw i2c_i801 lpc_ich mfd_core shpchp ipmi_si ipmi_msghandler mlx5_ib ib_core ioatdma mlx5_core ipv6 crc_ccitt dm_mod igb dax dca i2c_algo_bit i2c_core ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) isci(E) libsas(E) scsi_transport_sas(E) [last unloaded: mlx4_core] CPU: 0 PID: 14184 Comm: kworker/0:2H Tainted: G D E 4.12.0-rc5+ #62 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 Workqueue: kblockd blk_mq_requeue_work task: ffff88046d2c8a00 task.stack: ffffc900040ec000 RIP: 0010:sbitmap_any_bit_set+0x11/0x40 RSP: 0018:ffffc900040efbd8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8804572e0000 RCX: ffff880850a3dbb0 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8804572e00d8 RBP: ffffc900040efbd8 R08: 0000000000000001 R09: fffffffffffffff4 R10: 0000000000000005 R11: 000000000001c2c8 R12: ffff8804572e0000 R13: ffff880850a3d560 R14: 0000000000000000 R15: ffffc900040efc38 FS: 0000000000000000(0000) GS:ffff88047fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406f0 Call Trace: 
blk_mq_hctx_has_pending+0x18/0x70 blk_mq_run_hw_queues+0x42/0x70 blk_mq_requeue_work+0x18f/0x1b0 ? finish_task_switch+0x1d5/0x230 ? pick_next_task_idle+0x40/0x50 process_one_work+0x170/0x310 ? sched_clock_cpu+0x22/0xc0 ? schedule+0x35/0xa0 worker_thread+0x77/0x420 ? pick_next_task_idle+0x40/0x50 ? default_wake_function+0xd/0x10 ? maybe_create_worker+0x110/0x110 ? schedule+0x35/0xa0 ? maybe_create_worker+0x110/0x110 kthread+0x107/0x140 ? kthread_create_worker+0x50/0x50 ret_from_fork+0x22/0x30 Code: 4f 10 2b 74 01 08 39 57 08 77 d8 c9 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 8b 77 08 55 48 89 e5 85 f6 74 22 48 8b 57 10 31 c0 <48> 83 3a 00 74 0f eb 18 48 8b 4a 40 48 83 c2 40 48 85 c9 75 0b RIP: sbitmap_any_bit_set+0x11/0x40 RSP: ffffc900040efbd8 CR2: 0000000000000000 ---[ end trace 762d84a0fc017cfb ]--- nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: 
nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: 
Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure 
nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: Connect rejected: status 8 (invalid service ID). 
nvme nvme0: rdma_resolve_addr wait failed (-104). nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: Failed reconnect attempt 1 nvme nvme0: Reconnecting in 10 seconds... nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify 
failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure 
nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: 
nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: 
Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure 
nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: 
nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: Connect rejected: status 8 (invalid service ID). nvme nvme0: rdma_resolve_addr wait failed (-104). nvme nvme0: Failed reconnect attempt 2 nvme nvme0: Reconnecting in 10 seconds... nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme-fabrics ctl: nvme_revalidate_ns: Identify failure nvme nvme0: Connect rejected: status 8 (invalid service ID). nvme nvme0: rdma_resolve_addr wait failed (-104). nvme nvme0: Failed reconnect attempt 3 nvme nvme0: Reconnecting in 10 seconds... nvme nvme0: Connect rejected: status 8 (invalid service ID). nvme nvme0: rdma_resolve_addr wait failed (-104). nvme nvme0: Failed reconnect attempt 4 nvme nvme0: Reconnecting in 10 seconds... nvme nvme0: Connect rejected: status 8 (invalid service ID). 
nvme nvme0: rdma_resolve_addr wait failed (-104). nvme nvme0: Failed reconnect attempt 5 nvme nvme0: Reconnecting in 10 seconds...
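A note on reading the oopses above: going by the Code: bytes, the faulting instruction at blk_mq_flush_busy_ctxs+0x48 is a compare against the memory RBX points to, and RBX is 0 in every dump; at sbitmap_any_bit_set+0x11 it is the same compare through RDX, which is also 0. That suggests sb->map itself was NULL while sb->map_nr was nonzero. It would also fit the report that adding a "word &&" test only moved the crash to find_next_bit(): with i == 0, &sb->map[0] is NULL, the guarded test is skipped, and the next use of word faults. The user-space sketch below illustrates the point under that assumption. The toy_* names are hypothetical stand-ins, not the kernel's struct sbitmap, and the sketch is not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct sbitmap_word / struct sbitmap. */
struct toy_word {
	unsigned long word;	/* bits held by this chunk */
	unsigned long depth;	/* number of valid bits in this chunk */
};

struct toy_sbitmap {
	unsigned int map_nr;	/* number of entries in map[] */
	struct toy_word *map;	/* NULL if never allocated or already freed */
};

/*
 * Count the set bits in the bitmap. Returns -1 when the map array is
 * missing. The guard has to be on sb->map: a test on
 * "word = &sb->map[i]" can only ever catch the i == 0 case, and even
 * then every later use of word would still have to be guarded.
 */
static int toy_count_set(const struct toy_sbitmap *sb)
{
	unsigned int i;
	int count = 0;

	assert(sb != NULL);
	if (!sb->map)
		return -1;

	for (i = 0; i < sb->map_nr; i++) {
		const struct toy_word *word = &sb->map[i];
		unsigned long bits = word->word;

		/* clear the lowest set bit until none remain */
		while (bits) {
			bits &= bits - 1;
			count++;
		}
	}
	return count;
}
```

A caller would treat the -1 return as "bitmap not usable" and skip the iteration entirely. In the kernel, the more interesting question is why the map is NULL at that point in the first place.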
* Re: NVMe induced NULL deref in bt_iter() 2017-07-02 10:45 ` Max Gurtovoy @ 2017-07-02 11:56 ` Sagi Grimberg -1 siblings, 0 replies; 29+ messages in thread From: Sagi Grimberg @ 2017-07-02 11:56 UTC (permalink / raw) To: Max Gurtovoy, Jens Axboe Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org On 02/07/17 13:45, Max Gurtovoy wrote: > > > On 6/30/2017 8:26 PM, Jens Axboe wrote: >> Hi Max, > > Hi Jens, > >> >> I remembered you reporting this. I think this is a regression introduced >> with the scheduling, since ->rqs[] isn't static anymore. ->static_rqs[] >> is, but that's not indexable by the tag we find. So I think we need to >> guard those with a NULL check. The actual requests themselves are >> static, so we know the memory itself isn't going away. But if we race >> with completion, we could find a NULL there, validly. >> >> Since you could reproduce it, can you try the below? > > I still can repro the null deref with this patch applied. > >> >> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c >> index d0be72ccb091..b856b2827157 100644 >> --- a/block/blk-mq-tag.c >> +++ b/block/blk-mq-tag.c >> @@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, >> unsigned int bitnr, void *data) >> bitnr += tags->nr_reserved_tags; >> rq = tags->rqs[bitnr]; >> >> - if (rq->q == hctx->queue) >> + if (rq && rq->q == hctx->queue) >> iter_data->fn(hctx, rq, iter_data->data, reserved); >> return true; >> } >> @@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, >> unsigned int bitnr, void *data) >> if (!reserved) >> bitnr += tags->nr_reserved_tags; >> rq = tags->rqs[bitnr]; >> - >> - iter_data->fn(rq, iter_data->data, reserved); >> + if (rq) >> + iter_data->fn(rq, iter_data->data, reserved); >> return true; >> } > > see the attached file for dmesg output. > > output of gdb: > > (gdb) list *(blk_mq_flush_busy_ctxs+0x48) > 0xffffffff8127b108 is in blk_mq_flush_busy_ctxs > (./include/linux/sbitmap.h:234). 
> 229 > 230 for (i = 0; i < sb->map_nr; i++) { > 231 struct sbitmap_word *word = &sb->map[i]; > 232 unsigned int off, nr; > 233 > 234 if (!word->word) > 235 continue; > 236 > 237 nr = 0; > 238 off = i << sb->shift; > > > when I change the "if (!word->word)" to "if (word && !word->word)" > I can get null deref at "nr = find_next_bit(&word->word, word->depth, > nr);". Seems like somehow word becomes NULL. > > Adding the linux-nvme guys too. > Sagi has mentioned that this can be null only if we remove the tagset > while I/O is trying to get a tag and when killing the target we get into > error recovery and periodic reconnects, which does _NOT_ include freeing > the tagset, so this is probably the admin tagset. > > Sagi, > you've mention a patch for centrelizing the treatment of the admin > tagset to the nvme core. I think I missed this patch, so can you please > send a pointer to it and I'll check if it helps ? Hmm, In the above flow we should not be freeing the tag_set, not on admin either. The target keeps removing namespaces and finally removes the subsystem, which triggers an error recovery flow. What we at least try to do is: 1. mark rdma queues as not live 2. stop all the sw queues (admin and io) 3. fail inflight I/Os 4. restart all sw queues (to fast fail until we recover) We shouldn't be freeing the tagsets (although we might update them when we recover and the cpu map has changed - which I don't think is happening). However, I do see a difference between bt_tags_for_each and blk_mq_flush_busy_ctxs (checks tags->rqs not being NULL). Unrelated to this, I think we should quiesce/unquiesce the admin_q instead of stop/start because it respects the submission path rcu [1]. It might hide the issue, but given that we never free the tagset it seems like it's not in nvme-rdma (Max, can you see if this makes the issue go away?) 
[1]: -- diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index e3996db22738..094873a4ee38 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -785,7 +785,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work) if (ctrl->ctrl.queue_count > 1) nvme_stop_queues(&ctrl->ctrl); - blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); + blk_mq_quiesce_queue(ctrl->ctrl.admin_q); /* We must take care of fastfail/requeue all our inflight requests */ if (ctrl->ctrl.queue_count > 1) @@ -798,7 +798,8 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work) * queues are not a live anymore, so restart the queues to fail fast * new IO */ - blk_mq_start_stopped_hw_queues(ctrl->ctrl.admin_q, true); + blk_mq_unquiesce_queue(ctrl->ctrl.admin_q); + blk_mq_kick_requeue_list(ctrl->ctrl.admin_q); nvme_start_queues(&ctrl->ctrl); nvme_rdma_reconnect_or_remove(ctrl); @@ -1651,7 +1652,7 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl) if (test_bit(NVME_RDMA_Q_LIVE, &ctrl->queues[0].flags)) nvme_shutdown_ctrl(&ctrl->ctrl); - blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); + blk_mq_quiesce_queue(ctrl->ctrl.admin_q); blk_mq_tagset_busy_iter(&ctrl->admin_tag_set, nvme_cancel_request, &ctrl->ctrl); nvme_rdma_destroy_admin_queue(ctrl); -- ^ permalink raw reply related [flat|nested] 29+ messages in thread
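For readers following the gdb listing quoted in this message, here is a minimal user-space sketch of the word-by-word walk that the sbitmap.h excerpt shows. It is a toy model and not the kernel code: `toy_word`, `toy_sbitmap`, `toy_for_each_set_bit` and `toy_sum` are hypothetical names, and the inner loop is a plain bit scan standing in for find_next_bit(). The one thing it is meant to show is that `word` is computed as `&sb->map[i]`, so a check on `word` only says something about `sb->map` itself. The sketch therefore tests `sb->map` once, before the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space stand-ins (hypothetical); NOT the kernel's struct sbitmap. */
struct toy_word {
	unsigned long word;   /* set bits = tags currently in use */
	unsigned int depth;   /* number of valid bits in this word */
};

struct toy_sbitmap {
	unsigned int shift;   /* log2 of the number of bits per word */
	unsigned int map_nr;  /* number of entries in map[] */
	struct toy_word *map; /* NULL models the state that faults in the report */
};

typedef void (*toy_fn)(unsigned int bitnr, void *data);

/*
 * Visit every set bit, one word at a time, like the quoted loop.
 * Returns -1 if the map pointer is NULL, else the number of bits visited.
 */
static int toy_for_each_set_bit(const struct toy_sbitmap *sb, toy_fn fn,
				void *data)
{
	unsigned int i, nr;
	int visited = 0;

	assert(sb != NULL && fn != NULL);

	/* word = &sb->map[i] is derived from sb->map, so guard the base once. */
	if (!sb->map)
		return -1;

	for (i = 0; i < sb->map_nr; i++) {
		const struct toy_word *word = &sb->map[i];
		unsigned int off = i << sb->shift; /* global index of bit 0 */

		if (!word->word)
			continue; /* no tags in use in this word */

		/* Plain scan standing in for find_next_bit(). */
		for (nr = 0; nr < word->depth; nr++) {
			if (word->word & (1UL << nr)) {
				fn(off + nr, data);
				visited++;
			}
		}
	}
	return visited;
}

/* Example callback: accumulate the visited bit numbers. */
static void toy_sum(unsigned int bitnr, void *data)
{
	*(unsigned int *)data += bitnr;
}
```

With two 8-bit words holding 0x5 and 0x2, the callback sees bits 0, 2 and 9. With `map` set to NULL the function returns -1 instead of dereferencing. This only illustrates the pointer arithmetic; it does not explain how the real map could end up NULL, which is still the open question in the thread.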
* Re: NVMe induced NULL deref in bt_iter() 2017-07-02 11:56 ` Sagi Grimberg @ 2017-07-02 14:37 ` Max Gurtovoy -1 siblings, 0 replies; 29+ messages in thread From: Max Gurtovoy @ 2017-07-02 14:37 UTC (permalink / raw) To: Sagi Grimberg, Jens Axboe Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org On 7/2/2017 2:56 PM, Sagi Grimberg wrote: > > > On 02/07/17 13:45, Max Gurtovoy wrote: >> >> >> On 6/30/2017 8:26 PM, Jens Axboe wrote: >>> Hi Max, >> >> Hi Jens, >> >>> >>> I remembered you reporting this. I think this is a regression introduced >>> with the scheduling, since ->rqs[] isn't static anymore. ->static_rqs[] >>> is, but that's not indexable by the tag we find. So I think we need to >>> guard those with a NULL check. The actual requests themselves are >>> static, so we know the memory itself isn't going away. But if we race >>> with completion, we could find a NULL there, validly. >>> >>> Since you could reproduce it, can you try the below? >> >> I still can repro the null deref with this patch applied. >> >>> >>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c >>> index d0be72ccb091..b856b2827157 100644 >>> --- a/block/blk-mq-tag.c >>> +++ b/block/blk-mq-tag.c >>> @@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, >>> unsigned int bitnr, void *data) >>> bitnr += tags->nr_reserved_tags; >>> rq = tags->rqs[bitnr]; >>> >>> - if (rq->q == hctx->queue) >>> + if (rq && rq->q == hctx->queue) >>> iter_data->fn(hctx, rq, iter_data->data, reserved); >>> return true; >>> } >>> @@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, >>> unsigned int bitnr, void *data) >>> if (!reserved) >>> bitnr += tags->nr_reserved_tags; >>> rq = tags->rqs[bitnr]; >>> - >>> - iter_data->fn(rq, iter_data->data, reserved); >>> + if (rq) >>> + iter_data->fn(rq, iter_data->data, reserved); >>> return true; >>> } >> >> see the attached file for dmesg output. 
>> >> output of gdb: >> >> (gdb) list *(blk_mq_flush_busy_ctxs+0x48) >> 0xffffffff8127b108 is in blk_mq_flush_busy_ctxs >> (./include/linux/sbitmap.h:234). >> 229 >> 230 for (i = 0; i < sb->map_nr; i++) { >> 231 struct sbitmap_word *word = &sb->map[i]; >> 232 unsigned int off, nr; >> 233 >> 234 if (!word->word) >> 235 continue; >> 236 >> 237 nr = 0; >> 238 off = i << sb->shift; >> >> >> when I change the "if (!word->word)" to "if (word && !word->word)" >> I can get null deref at "nr = find_next_bit(&word->word, word->depth, >> nr);". Seems like somehow word becomes NULL. >> >> Adding the linux-nvme guys too. >> Sagi has mentioned that this can be null only if we remove the tagset >> while I/O is trying to get a tag and when killing the target we get into >> error recovery and periodic reconnects, which does _NOT_ include freeing >> the tagset, so this is probably the admin tagset. >> >> Sagi, >> you've mention a patch for centrelizing the treatment of the admin >> tagset to the nvme core. I think I missed this patch, so can you >> please send a pointer to it and I'll check if it helps ? > > Hmm, > > In the above flow we should not be freeing the tag_set, not on admin as > well. The target keep removing namespaces and finally removes the > subsystem which generates a error recovery flow. What we at least try > to do is: > > 1. mark rdma queues as not live > 2. stop all the sw queues (admin and io) > 3. fail inflight I/Os > 4. restart all sw queues (to fast fail until we recover) > > We shouldn't be freeing the tagsets (although we might update them > when we recover and cpu map changed - which I don't think is happening). > > However, I do see a difference between bt_tags_for_each > and blk_mq_flush_busy_ctxs (checks tags->rqs not being NULL). > > Unrelated to this I think we should quiesce/unquiesce the admin_q > instead of stop/start because it respects the submission path rcu [1]. 
> > It might hide the issue, but given that we never free the tagset its > seems like it's not in nvme-rdma (max, can you see if this makes the > issue go away?) Yes, this fixes the null deref issue. I run some additional login/logout tests that passed too. This fix is important also for stable kernel (with needed backports to blk_mq_quiesce_queue/blk_mq_unquiesce_queue functions). You can add my: Tested-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Let me know if you want me to push this fix to the mailing list to save time (can we make it to 4.12 ?) > > [1]: > -- > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c > index e3996db22738..094873a4ee38 100644 > --- a/drivers/nvme/host/rdma.c > +++ b/drivers/nvme/host/rdma.c > @@ -785,7 +785,7 @@ static void nvme_rdma_error_recovery_work(struct > work_struct *work) > > if (ctrl->ctrl.queue_count > 1) > nvme_stop_queues(&ctrl->ctrl); > - blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); > + blk_mq_quiesce_queue(ctrl->ctrl.admin_q); > > /* We must take care of fastfail/requeue all our inflight > requests */ > if (ctrl->ctrl.queue_count > 1) > @@ -798,7 +798,8 @@ static void nvme_rdma_error_recovery_work(struct > work_struct *work) > * queues are not a live anymore, so restart the queues to fail > fast > * new IO > */ > - blk_mq_start_stopped_hw_queues(ctrl->ctrl.admin_q, true); > + blk_mq_unquiesce_queue(ctrl->ctrl.admin_q); > + blk_mq_kick_requeue_list(ctrl->ctrl.admin_q); > nvme_start_queues(&ctrl->ctrl); > > nvme_rdma_reconnect_or_remove(ctrl); > @@ -1651,7 +1652,7 @@ static void nvme_rdma_shutdown_ctrl(struct > nvme_rdma_ctrl *ctrl) > if (test_bit(NVME_RDMA_Q_LIVE, &ctrl->queues[0].flags)) > nvme_shutdown_ctrl(&ctrl->ctrl); > > - blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); > + blk_mq_quiesce_queue(ctrl->ctrl.admin_q); > blk_mq_tagset_busy_iter(&ctrl->admin_tag_set, > nvme_cancel_request, &ctrl->ctrl); > nvme_rdma_destroy_admin_queue(ctrl); > -- ^ permalink raw reply 
[flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter() 2017-07-02 14:37 ` Max Gurtovoy @ 2017-07-02 15:08 ` Sagi Grimberg -1 siblings, 0 replies; 29+ messages in thread From: Sagi Grimberg @ 2017-07-02 15:08 UTC (permalink / raw) To: Max Gurtovoy, Jens Axboe Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org >> Hmm, >> >> In the above flow we should not be freeing the tag_set, not on admin as >> well. The target keep removing namespaces and finally removes the >> subsystem which generates a error recovery flow. What we at least try >> to do is: >> >> 1. mark rdma queues as not live >> 2. stop all the sw queues (admin and io) >> 3. fail inflight I/Os >> 4. restart all sw queues (to fast fail until we recover) >> >> We shouldn't be freeing the tagsets (although we might update them >> when we recover and cpu map changed - which I don't think is happening). >> >> However, I do see a difference between bt_tags_for_each >> and blk_mq_flush_busy_ctxs (checks tags->rqs not being NULL). >> >> Unrelated to this I think we should quiesce/unquiesce the admin_q >> instead of stop/start because it respects the submission path rcu [1]. >> >> It might hide the issue, but given that we never free the tagset its >> seems like it's not in nvme-rdma (max, can you see if this makes the >> issue go away?) > > Yes, this fixes the null deref issue. > I run some additional login/logout tests that passed too. > This fix is important also for stable kernel (with needed backports to > blk_mq_quiesce_queue/blk_mq_unquiesce_queue functions). > You can add my: > Tested-by: Max Gurtovoy <maxg@mellanox.com> > Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Thanks for clarifying, Max. However, I still think it's not the root cause (unless I don't understand it). As I said, we do not free the tagset, so I'm not sure why we get a NULL deref in the sbitmap code. Jens, can you explain why changing blk_mq_stop_hw_queues to blk_mq_quiesce_queue makes the issue go away? 
I know that quiesce respects the rcu grace period, but I still do not understand why we get a NULL sb->map without it. > Let me know if you want me to push this fix to the mailing list to save > time (can we make it to 4.12 ?) I can send patches; we need it in pci, fc and loop too. I don't think it's 4.12 material, as we are way too late for this sort of fix. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter() 2017-07-02 11:56 ` Sagi Grimberg @ 2017-07-03 9:40 ` Ming Lei -1 siblings, 0 replies; 29+ messages in thread From: Ming Lei @ 2017-07-03 9:40 UTC (permalink / raw) To: Sagi Grimberg Cc: Max Gurtovoy, Jens Axboe, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org On Sun, Jul 02, 2017 at 02:56:56PM +0300, Sagi Grimberg wrote: > > > On 02/07/17 13:45, Max Gurtovoy wrote: > > > > > > On 6/30/2017 8:26 PM, Jens Axboe wrote: > > > Hi Max, > > > > Hi Jens, > > > > > > > > I remembered you reporting this. I think this is a regression introduced > > > with the scheduling, since ->rqs[] isn't static anymore. ->static_rqs[] > > > is, but that's not indexable by the tag we find. So I think we need to > > > guard those with a NULL check. The actual requests themselves are > > > static, so we know the memory itself isn't going away. But if we race > > > with completion, we could find a NULL there, validly. > > > > > > Since you could reproduce it, can you try the below? > > > > I still can repro the null deref with this patch applied. 
> > > > > > > > diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c > > > index d0be72ccb091..b856b2827157 100644 > > > --- a/block/blk-mq-tag.c > > > +++ b/block/blk-mq-tag.c > > > @@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, > > > unsigned int bitnr, void *data) > > > bitnr += tags->nr_reserved_tags; > > > rq = tags->rqs[bitnr]; > > > > > > - if (rq->q == hctx->queue) > > > + if (rq && rq->q == hctx->queue) > > > iter_data->fn(hctx, rq, iter_data->data, reserved); > > > return true; > > > } > > > @@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, > > > unsigned int bitnr, void *data) > > > if (!reserved) > > > bitnr += tags->nr_reserved_tags; > > > rq = tags->rqs[bitnr]; > > > - > > > - iter_data->fn(rq, iter_data->data, reserved); > > > + if (rq) > > > + iter_data->fn(rq, iter_data->data, reserved); > > > return true; > > > } > > > > see the attached file for dmesg output. > > > > output of gdb: > > > > (gdb) list *(blk_mq_flush_busy_ctxs+0x48) > > 0xffffffff8127b108 is in blk_mq_flush_busy_ctxs > > (./include/linux/sbitmap.h:234). > > 229 > > 230 for (i = 0; i < sb->map_nr; i++) { > > 231 struct sbitmap_word *word = &sb->map[i]; > > 232 unsigned int off, nr; > > 233 > > 234 if (!word->word) > > 235 continue; > > 236 > > 237 nr = 0; > > 238 off = i << sb->shift; > > > > > > when I change the "if (!word->word)" to "if (word && !word->word)" > > I can get null deref at "nr = find_next_bit(&word->word, word->depth, > > nr);". Seems like somehow word becomes NULL. > > > > Adding the linux-nvme guys too. > > Sagi has mentioned that this can be null only if we remove the tagset > > while I/O is trying to get a tag and when killing the target we get into > > error recovery and periodic reconnects, which does _NOT_ include freeing > > the tagset, so this is probably the admin tagset. > > > > Sagi, > > you've mention a patch for centrelizing the treatment of the admin > > tagset to the nvme core. 
I think I missed this patch, so can you please > > send a pointer to it and I'll check if it helps ? > > Hmm, > > In the above flow we should not be freeing the tag_set, not on admin as > well. The target keep removing namespaces and finally removes the > subsystem which generates a error recovery flow. What we at least try > to do is: > > 1. mark rdma queues as not live > 2. stop all the sw queues (admin and io) > 3. fail inflight I/Os > 4. restart all sw queues (to fast fail until we recover) > > We shouldn't be freeing the tagsets (although we might update them > when we recover and cpu map changed - which I don't think is happening). > > However, I do see a difference between bt_tags_for_each > and blk_mq_flush_busy_ctxs (checks tags->rqs not being NULL). > > Unrelated to this I think we should quiesce/unquiesce the admin_q > instead of stop/start because it respects the submission path rcu [1]. > > It might hide the issue, but given that we never free the tagset its > seems like it's not in nvme-rdma (max, can you see if this makes the > issue go away?) 
> > [1]: > -- > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c > index e3996db22738..094873a4ee38 100644 > --- a/drivers/nvme/host/rdma.c > +++ b/drivers/nvme/host/rdma.c > @@ -785,7 +785,7 @@ static void nvme_rdma_error_recovery_work(struct > work_struct *work) > > if (ctrl->ctrl.queue_count > 1) > nvme_stop_queues(&ctrl->ctrl); > - blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); > + blk_mq_quiesce_queue(ctrl->ctrl.admin_q); > > /* We must take care of fastfail/requeue all our inflight requests > */ > if (ctrl->ctrl.queue_count > 1) > @@ -798,7 +798,8 @@ static void nvme_rdma_error_recovery_work(struct > work_struct *work) > * queues are not a live anymore, so restart the queues to fail fast > * new IO > */ > - blk_mq_start_stopped_hw_queues(ctrl->ctrl.admin_q, true); > + blk_mq_unquiesce_queue(ctrl->ctrl.admin_q); > + blk_mq_kick_requeue_list(ctrl->ctrl.admin_q); > nvme_start_queues(&ctrl->ctrl); > > nvme_rdma_reconnect_or_remove(ctrl); > @@ -1651,7 +1652,7 @@ static void nvme_rdma_shutdown_ctrl(struct > nvme_rdma_ctrl *ctrl) > if (test_bit(NVME_RDMA_Q_LIVE, &ctrl->queues[0].flags)) > nvme_shutdown_ctrl(&ctrl->ctrl); > > - blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); > + blk_mq_quiesce_queue(ctrl->ctrl.admin_q); > blk_mq_tagset_busy_iter(&ctrl->admin_tag_set, > nvme_cancel_request, &ctrl->ctrl); > nvme_rdma_destroy_admin_queue(ctrl); Yeah, the above change is correct, for any canceling requests in this way we should use blk_mq_quiesce_queue(). Thanks, Ming ^ permalink raw reply [flat|nested] 29+ messages in thread
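The ordering point in this reply (quiesce before cancelling requests) can be sketched with a single-threaded toy model. This is an assumption-laden illustration, not the blk-mq implementation: every `toy_*` name is hypothetical, and "waiting for the grace period" is modeled by letting in-flight dispatchers run to completion. It assumes, in line with the remark above that quiesce "respects the submission path rcu", that stopping only prevents new dispatch from starting, while quiescing also waits for dispatch that has already started.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical names); not the kernel's request_queue. */
struct toy_queue {
	bool stopped;    /* no new dispatch may start */
	int in_dispatch; /* dispatchers currently inside the submission path */
};

/* A dispatcher may enter the submission path only if the queue runs. */
static bool toy_dispatch_enter(struct toy_queue *q)
{
	if (q->stopped)
		return false;
	q->in_dispatch++;
	return true;
}

static void toy_dispatch_exit(struct toy_queue *q)
{
	assert(q->in_dispatch > 0);
	q->in_dispatch--;
}

/* Models "stop": flips the flag and returns without waiting. */
static void toy_stop(struct toy_queue *q)
{
	q->stopped = true;
}

/*
 * Models "quiesce": flips the flag, then waits for dispatchers that are
 * already inside. In this single-threaded toy, waiting is modeled by
 * letting each in-flight dispatcher finish.
 */
static void toy_quiesce(struct toy_queue *q)
{
	q->stopped = true;
	while (q->in_dispatch > 0)
		toy_dispatch_exit(q);
}

static void toy_unquiesce(struct toy_queue *q)
{
	q->stopped = false;
}

/* Cancelling is only safe once nobody can still be touching requests. */
static bool toy_safe_to_cancel(const struct toy_queue *q)
{
	return q->stopped && q->in_dispatch == 0;
}
```

In this model, a dispatcher that entered before toy_stop() is still counted as in flight afterwards, so toy_safe_to_cancel() is false. After toy_quiesce() it is true. That is the window the stop/start variant leaves open in the toy; whether the same window explains the reported oops is not established in this thread.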
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-03  9:40 ` Ming Lei
@ 2017-07-03 10:07 ` Sagi Grimberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Sagi Grimberg @ 2017-07-03 10:07 UTC (permalink / raw)
  To: Ming Lei
  Cc: Max Gurtovoy, Jens Axboe, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org

Hi Ming,

> Yeah, the above change is correct, for any canceling requests in this
> way we should use blk_mq_quiesce_queue().

I still don't understand why should blk_mq_flush_busy_ctxs hit a NULL
deref if we don't touch the tagset...

Also, I'm wondering in what case we shouldn't use
blk_mq_quiesce_queue()? Maybe we should unexport blk_mq_stop_hw_queues()
and blk_mq_start_stopped_hw_queues() and use the quiesce/unquiesce
equivalent always?

The only fishy usage is in nvme_fc_start_fcp_op() where if submission
failed the code stops the hw queues and delays it, but I think it should
be handled differently..

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-03 10:07 ` Sagi Grimberg
@ 2017-07-03 12:03 ` Ming Lei
  -1 siblings, 0 replies; 29+ messages in thread
From: Ming Lei @ 2017-07-03 12:03 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Max Gurtovoy, Jens Axboe, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org

On Mon, Jul 03, 2017 at 01:07:44PM +0300, Sagi Grimberg wrote:
> Hi Ming,
>
> > Yeah, the above change is correct, for any canceling requests in this
> > way we should use blk_mq_quiesce_queue().
>
> I still don't understand why should blk_mq_flush_busy_ctxs hit a NULL
> deref if we don't touch the tagset...

Looks no one mentioned the steps for reproduction, then it isn't easy
to understand the related use case, could anyone share the steps for
reproduction?

>
> Also, I'm wondering in what case we shouldn't use
> blk_mq_quiesce_queue()? Maybe we should unexport blk_mq_stop_hw_queues()
> and blk_mq_start_stopped_hw_queues() and use the quiesce/unquiesce
> equivalent always?

There is at least one case in which we have to use stop queues:

- when QUEUE_BUSY (now it becomes BLK_STS_RESOURCE) happens, some drivers
  need to stop queues to avoid hurting the CPU, such as virtio-blk, ...

>
> The only fishy usage is in nvme_fc_start_fcp_op() where if submission
> failed the code stops the hw queues and delays it, but I think it should
> be handled differently..

It looks like the old way of scsi-mq, but scsi has removed this way and
avoids stopping the queue.

Thanks,
Ming

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-03 12:03 ` Ming Lei
@ 2017-07-03 12:46 ` Max Gurtovoy
  -1 siblings, 0 replies; 29+ messages in thread
From: Max Gurtovoy @ 2017-07-03 12:46 UTC (permalink / raw)
  To: Ming Lei, Sagi Grimberg
  Cc: Jens Axboe, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org

On 7/3/2017 3:03 PM, Ming Lei wrote:
> On Mon, Jul 03, 2017 at 01:07:44PM +0300, Sagi Grimberg wrote:
>> Hi Ming,
>>
>>> Yeah, the above change is correct, for any canceling requests in this
>>> way we should use blk_mq_quiesce_queue().
>>
>> I still don't understand why should blk_mq_flush_busy_ctxs hit a NULL
>> deref if we don't touch the tagset...
>
> Looks no one mentioned the steps for reproduction, then it isn't easy
> to understand the related use case, could anyone share the steps for
> reproduction?

Hi Ming,
I create 500 ns per 1 subsystem (using a CX4 target and C-IB initiator,
but I also saw it in a CX5 vs. CX5 setup).
The null deref happens when I remove all configuration in the target
(1 port, 1 subsystem and 500 namespaces, plus nvmet modules unload)
during traffic to 1 nvme device/ns from the initiator.
I get a NULL deref in the blk_mq_flush_busy_ctxs function, which calls
sbitmap_for_each_set, in the initiator. It seems like the
"struct sbitmap_word *word = &sb->map[i];" is null. It actually might
not be null at the beginning of the function and become null while
running the loop there.

>
>>
>> Also, I'm wondering in what case we shouldn't use
>> blk_mq_quiesce_queue()? Maybe we should unexport blk_mq_stop_hw_queues()
>> and blk_mq_start_stopped_hw_queues() and use the quiesce/unquiesce
>> equivalent always?
>
> There is at least one case in which we have to use stop queues:
>
> - when QUEUE_BUSY (now it becomes BLK_STS_RESOURCE) happens, some drivers
>   need to stop queues to avoid hurting the CPU, such as virtio-blk, ...
>
>>
>> The only fishy usage is in nvme_fc_start_fcp_op() where if submission
>> failed the code stops the hw queues and delays it, but I think it should
>> be handled differently..
>
> It looks like the old way of scsi-mq, but scsi has removed this way and
> avoids stopping the queue.
>
>
> Thanks,
> Ming
>

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-03 12:46 ` Max Gurtovoy
@ 2017-07-03 15:54 ` Ming Lei
  -1 siblings, 0 replies; 29+ messages in thread
From: Ming Lei @ 2017-07-03 15:54 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Sagi Grimberg, Jens Axboe, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org

On Mon, Jul 03, 2017 at 03:46:34PM +0300, Max Gurtovoy wrote:
>
>
> On 7/3/2017 3:03 PM, Ming Lei wrote:
> > On Mon, Jul 03, 2017 at 01:07:44PM +0300, Sagi Grimberg wrote:
> > > Hi Ming,
> > >
> > > > Yeah, the above change is correct, for any canceling requests in this
> > > > way we should use blk_mq_quiesce_queue().
> > >
> > > I still don't understand why should blk_mq_flush_busy_ctxs hit a NULL
> > > deref if we don't touch the tagset...
> >
> > Looks no one mentioned the steps for reproduction, then it isn't easy
> > to understand the related use case, could anyone share the steps for
> > reproduction?
>
> Hi Ming,
> I create 500 ns per 1 subsystem (using a CX4 target and C-IB initiator,
> but I also saw it in a CX5 vs. CX5 setup).
> The null deref happens when I remove all configuration in the target
> (1 port, 1 subsystem and 500 namespaces, plus nvmet modules unload)
> during traffic to 1 nvme device/ns from the initiator.
> I get a NULL deref in the blk_mq_flush_busy_ctxs function, which calls
> sbitmap_for_each_set, in the initiator. It seems like the
> "struct sbitmap_word *word = &sb->map[i];" is null. It actually might
> not be null at the beginning of the function and become null while
> running the loop there.

So looks it is still a normal release in initiator.

Per my experience, without quiescing the queue before
blk_mq_tagset_busy_iter() for canceling requests, a request double free
can be caused: one submitted req in .queue_rq can be completed in
blk_mq_end_request(), meantime it can be completed in
nvme_cancel_request(). That is why we have to quiesce the queue
first before canceling requests in this way. Besides NVMe, looks
NBD and mtip32xx need a fix too.

This way might cause blk_cleanup_queue() to complete early, then
a NULL deref can be triggered in blk_mq_flush_busy_ctxs(). But in
my previous debugging on PCI NVMe, this wasn't seen yet.

Whether the above is true should be verified by adding some
debug messages inside blk_cleanup_queue().

Thanks,
Ming

^ permalink raw reply [flat|nested] 29+ messages in thread
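The ordering Ming describes (quiesce first, then cancel through
blk_mq_tagset_busy_iter()) can be illustrated with a small userspace toy
model. This is only a sketch under stated assumptions: toy_request,
toy_queue and every toy_* function are hypothetical names made up for
illustration, not kernel APIs, and the two interleavings are replayed
deterministically instead of being raced on real threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block request; not the kernel's struct request. */
struct toy_request {
	bool started;     /* visible to the busy-tag iterator */
	int completions;  /* how many times the request was ended */
};

/* Hypothetical stand-in for a request queue. */
struct toy_queue {
	bool quiesced;              /* no new dispatch once set */
	struct toy_request *in_rq;  /* request currently inside "queue_rq", or NULL */
};

/* First half of a dispatch: the request becomes visible as busy. */
static bool toy_queue_rq_begin(struct toy_queue *q, struct toy_request *rq)
{
	if (q->quiesced)
		return false;
	rq->started = true;
	q->in_rq = rq;
	return true;
}

/* Second half: submission failed, so the driver ends the request itself. */
static void toy_queue_rq_finish(struct toy_queue *q)
{
	if (!q->in_rq)
		return;
	q->in_rq->started = false;
	q->in_rq->completions++;
	q->in_rq = NULL;
}

/* Models quiescing: block new dispatch and wait for the in-flight one. */
static void toy_quiesce(struct toy_queue *q)
{
	q->quiesced = true;
	toy_queue_rq_finish(q);  /* "waiting" = letting queue_rq run to its end */
}

/* Models the cancel callback invoked by the busy-tag iterator. */
static void toy_cancel(struct toy_request *rq)
{
	if (!rq->started)
		return;
	rq->started = false;
	rq->completions++;
}

/* Replays one interleaving and returns how often the request was ended. */
static int toy_race(bool quiesce_first)
{
	struct toy_request rq = { false, 0 };
	struct toy_queue q = { false, NULL };

	toy_queue_rq_begin(&q, &rq);  /* dispatch is part-way through */
	if (quiesce_first)
		toy_quiesce(&q);      /* drains the dispatch before cancel */
	toy_cancel(&rq);
	toy_queue_rq_finish(&q);      /* no-op if quiesce already drained it */
	return rq.completions;
}
```

In this model toy_race(false) ends the request twice, while
toy_race(true) ends it exactly once, which mirrors the double completion
described above.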
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-03 15:54 ` Ming Lei
@ 2017-07-04  6:58 ` Sagi Grimberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Sagi Grimberg @ 2017-07-04 6:58 UTC (permalink / raw)
  To: Ming Lei, Max Gurtovoy
  Cc: Jens Axboe, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org

> So looks it is still a normal release in initiator.
>
> Per my experience, without quiescing the queue before
> blk_mq_tagset_busy_iter() for canceling requests, a request double free
> can be caused: one submitted req in .queue_rq can be completed in
> blk_mq_end_request(), meantime it can be completed in
> nvme_cancel_request(). That is why we have to quiesce the queue
> first before canceling requests in this way. Besides NVMe, looks
> NBD and mtip32xx need a fix too.

Let me cook some patches for those as well...

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-03 12:03 ` Ming Lei
@ 2017-07-04  7:56 ` Sagi Grimberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Sagi Grimberg @ 2017-07-04 7:56 UTC (permalink / raw)
  To: Ming Lei
  Cc: Max Gurtovoy, Jens Axboe, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org

> There is at least one case in which we have to use stop queues:
>
> - when QUEUE_BUSY (now it becomes BLK_STS_RESOURCE) happens, some drivers
>   need to stop queues to avoid hurting the CPU, such as virtio-blk, ...

Why isn't virtio_blk using blk_mq_delay_run_hw_queue like scsi does?

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-04  7:56 ` Sagi Grimberg
@ 2017-07-04  8:08 ` Ming Lei
  -1 siblings, 0 replies; 29+ messages in thread
From: Ming Lei @ 2017-07-04 8:08 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Max Gurtovoy, Jens Axboe, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org

On Tue, Jul 04, 2017 at 10:56:23AM +0300, Sagi Grimberg wrote:
>
> > There is at least one case in which we have to use stop queues:
> >
> > - when QUEUE_BUSY (now it becomes BLK_STS_RESOURCE) happens, some drivers
> >   need to stop queues to avoid hurting the CPU, such as virtio-blk, ...
>
> Why isn't virtio_blk using blk_mq_delay_run_hw_queue like scsi does?

IMO it isn't easy to figure out one perfect delay time, and it
should be self-adaptive.

Also I think it might be possible to move this kind of stop action into
blk-mq core code, and not let drivers touch the stop state. Finally we
may kill all stopping in drivers.

Thanks,
Ming

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-04  8:08 ` Ming Lei
@ 2017-07-04  9:14 ` Sagi Grimberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Sagi Grimberg @ 2017-07-04 9:14 UTC (permalink / raw)
  To: Ming Lei
  Cc: Max Gurtovoy, Jens Axboe, linux-block@vger.kernel.org,
      linux-nvme@lists.infradead.org

On 04/07/17 11:08, Ming Lei wrote:
> On Tue, Jul 04, 2017 at 10:56:23AM +0300, Sagi Grimberg wrote:
>>
>>> There is at least one case in which we have to use stop queues:
>>>
>>> - when QUEUE_BUSY (now it becomes BLK_STS_RESOURCE) happens, some drivers
>>>   need to stop queues to avoid hurting the CPU, such as virtio-blk, ...
>>
>> Why isn't virtio_blk using blk_mq_delay_run_hw_queue like scsi does?
>
> IMO it isn't easy to figure out one perfect delay time,

It doesn't need to be perfect, just something that is sufficient to
not hog the cpu and won't have noticeable effects...

> and it should be self-adaptive.

But IMO always starting the queues on *every* completion is a waste...
why iterate over all the hw queues on each completion?

> Also I think it might be possible to move this kind of stop action into
> blk-mq core code, and not let drivers touch the stop state. Finally we
> may kill all stopping in drivers.

That's a good idea!

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: NVMe induced NULL deref in bt_iter()
  2017-07-02 10:45 ` Max Gurtovoy
@ 2017-07-03 16:01 ` Jens Axboe
  -1 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2017-07-03 16:01 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, sagig

On 07/02/2017 04:45 AM, Max Gurtovoy wrote:
>
>
> On 6/30/2017 8:26 PM, Jens Axboe wrote:
>> Hi Max,
>
> Hi Jens,
>
>>
>> I remembered you reporting this. I think this is a regression introduced
>> with the scheduling, since ->rqs[] isn't static anymore. ->static_rqs[]
>> is, but that's not indexable by the tag we find. So I think we need to
>> guard those with a NULL check. The actual requests themselves are
>> static, so we know the memory itself isn't going away. But if we race
>> with completion, we could find a NULL there, validly.
>>
>> Since you could reproduce it, can you try the below?
>
> I still can repro the null deref with this patch applied.
>
>>
>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
>> index d0be72ccb091..b856b2827157 100644
>> --- a/block/blk-mq-tag.c
>> +++ b/block/blk-mq-tag.c
>> @@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>>  		bitnr += tags->nr_reserved_tags;
>>  	rq = tags->rqs[bitnr];
>>
>> -	if (rq->q == hctx->queue)
>> +	if (rq && rq->q == hctx->queue)
>>  		iter_data->fn(hctx, rq, iter_data->data, reserved);
>>  	return true;
>>  }
>> @@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>>  	if (!reserved)
>>  		bitnr += tags->nr_reserved_tags;
>>  	rq = tags->rqs[bitnr];
>> -
>> -	iter_data->fn(rq, iter_data->data, reserved);
>> +	if (rq)
>> +		iter_data->fn(rq, iter_data->data, reserved);
>>  	return true;
>>  }
>
> see the attached file for dmesg output.
>
> output of gdb:
>
> (gdb) list *(blk_mq_flush_busy_ctxs+0x48)
> 0xffffffff8127b108 is in blk_mq_flush_busy_ctxs
> (./include/linux/sbitmap.h:234).
> 229
> 230             for (i = 0; i < sb->map_nr; i++) {
> 231                     struct sbitmap_word *word = &sb->map[i];
> 232                     unsigned int off, nr;
> 233
> 234                     if (!word->word)
> 235                             continue;
> 236
> 237                     nr = 0;
> 238                     off = i << sb->shift;
>
>
> when I change the "if (!word->word)" to "if (word && !word->word)"
> I can get null deref at "nr = find_next_bit(&word->word, word->depth,
> nr);". Seems like somehow word becomes NULL.
>
> Adding the linux-nvme guys too.
> Sagi has mentioned that this can be null only if we remove the tagset
> while I/O is trying to get a tag and when killing the target we get into
> error recovery and periodic reconnects, which does _NOT_ include freeing
> the tagset, so this is probably the admin tagset.
>
> Sagi,
> you've mentioned a patch for centralizing the treatment of the admin
> tagset to the nvme core. I think I missed this patch, so can you please
> send a pointer to it and I'll check if it helps ?

Right, this is clearly a different issue and my first thought as well
was that it's a missing quiesce of the queue. We're iterating the tags
when they are being torn down. Looks like Sagi's patch fixes the issue,
so I'm considering this one resolved.

--
Jens Axboe

^ permalink raw reply [flat|nested] 29+ messages in thread
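For readers following the gdb listing quoted above, the shape of the
sbitmap_for_each_set() loop can be modeled in userspace roughly as
follows. This is a simplified sketch, not the kernel implementation: all
toy_* names are hypothetical, and a plain bit loop stands in for
find_next_bit(). Note that word is computed from sb->map, so in this
model it can only be a bad pointer if sb->map itself is gone, which would
be consistent with the tags being torn down while they are iterated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified counterpart of struct sbitmap_word. */
struct toy_sbitmap_word {
	unsigned long word;   /* the bits of this word */
	unsigned int depth;   /* number of valid bits in this word */
};

/* Hypothetical, simplified counterpart of struct sbitmap. */
struct toy_sbitmap {
	unsigned int shift;   /* log2 of bits per word */
	unsigned int map_nr;  /* number of words in map[] */
	struct toy_sbitmap_word *map;
};

/* Callback type: return false to stop the iteration early. */
typedef bool (*toy_sb_fn)(unsigned int bitnr, void *data);

/* Calls fn once for every set bit, in ascending bit order. */
static void toy_sbitmap_for_each_set(struct toy_sbitmap *sb,
				     toy_sb_fn fn, void *data)
{
	for (unsigned int i = 0; i < sb->map_nr; i++) {
		struct toy_sbitmap_word *word = &sb->map[i];
		unsigned int off = i << sb->shift;

		if (!word->word)
			continue;  /* nothing set in this word */

		for (unsigned int nr = 0; nr < word->depth; nr++) {
			if (!(word->word & (1UL << nr)))
				continue;
			if (!fn(off + nr, data))
				return;
		}
	}
}

/* Small collector used to record which bits were visited. */
struct toy_hits {
	unsigned int bits[16];
	size_t n;
};

static bool toy_collect(unsigned int bitnr, void *data)
{
	struct toy_hits *hits = data;

	if (hits->n == 16)
		return false;  /* collector is full, stop iterating */
	hits->bits[hits->n++] = bitnr;
	return true;
}
```

With two 8-bit words holding 0x02 and 0x81 and shift set to 3, the
callback is invoked for bits 1, 8 and 15.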
end of thread, other threads:[~2017-07-04 9:14 UTC | newest]

Thread overview: 29+ messages
2017-06-30 17:26 NVMe induced NULL deref in bt_iter() Jens Axboe
2017-07-02 10:45 ` Max Gurtovoy
2017-07-02 10:45 ` Max Gurtovoy
2017-07-02 11:56 ` Sagi Grimberg
2017-07-02 11:56 ` Sagi Grimberg
2017-07-02 14:37 ` Max Gurtovoy
2017-07-02 14:37 ` Max Gurtovoy
2017-07-02 15:08 ` Sagi Grimberg
2017-07-02 15:08 ` Sagi Grimberg
2017-07-03 9:40 ` Ming Lei
2017-07-03 9:40 ` Ming Lei
2017-07-03 10:07 ` Sagi Grimberg
2017-07-03 10:07 ` Sagi Grimberg
2017-07-03 12:03 ` Ming Lei
2017-07-03 12:03 ` Ming Lei
2017-07-03 12:46 ` Max Gurtovoy
2017-07-03 12:46 ` Max Gurtovoy
2017-07-03 15:54 ` Ming Lei
2017-07-03 15:54 ` Ming Lei
2017-07-04 6:58 ` Sagi Grimberg
2017-07-04 6:58 ` Sagi Grimberg
2017-07-04 7:56 ` Sagi Grimberg
2017-07-04 7:56 ` Sagi Grimberg
2017-07-04 8:08 ` Ming Lei
2017-07-04 8:08 ` Ming Lei
2017-07-04 9:14 ` Sagi Grimberg
2017-07-04 9:14 ` Sagi Grimberg
2017-07-03 16:01 ` Jens Axboe
2017-07-03 16:01 ` Jens Axboe