From mboxrd@z Thu Jan 1 00:00:00 1970
From: swise@opengridcomputing.com (Steve Wise)
Date: Tue, 1 Nov 2016 11:20:15 -0500
Subject: nvmet_rdma crash - DISCONNECT event with NULL queue
In-Reply-To: <6f42d056-284d-00fc-2b98-189f54957980@grimberg.me>
References: <01b401d23458$af277210$0d765630$@opengridcomputing.com> <6f42d056-284d-00fc-2b98-189f54957980@grimberg.me>
Message-ID: <01cc01d2345b$d445acd0$7cd10670$@opengridcomputing.com>

> > I just hit an nvmf target NULL pointer deref BUG after a few hours of
> > keep-alive timeout testing. It appears that nvmet_rdma_cm_handler() was
> > called with cm_id->qp == NULL, so the local nvmet_rdma_queue * variable
> > queue is left as NULL. But then nvmet_rdma_queue_disconnect() is called
> > with queue == NULL, which causes the crash.
>
> AFAICT, the only way cm_id->qp is NULL is for a scenario where we didn't
> even get to allocate a queue-pair (e.g. by calling rdma_create_qp). The
> teardown paths do not nullify cm_id->qp...

rdma_destroy_qp() nulls out cm_id->qp.

> Are you sure that the flow is indeed a DISCONNECTED event?

That is the only event I would expect during this test. From
nvmet_rdma_cm_handler():

	case RDMA_CM_EVENT_ADDR_CHANGE:
	case RDMA_CM_EVENT_DISCONNECTED:
	case RDMA_CM_EVENT_TIMEWAIT_EXIT:
		nvmet_rdma_queue_disconnect(queue);
		break;
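For context on why the local ends up NULL: the handler derives the queue from
the qp's context near the top of the function, roughly like this (paraphrasing
the 4.8-era code from memory, so treat it as a sketch rather than a verbatim
quote):

	struct nvmet_rdma_queue *queue = NULL;

	/* only valid while the qp still exists */
	if (cm_id->qp)
		queue = cm_id->qp->qp_context;

So once rdma_destroy_qp() has run, any of the three events above ends up
calling nvmet_rdma_queue_disconnect() with a NULL queue.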
> > In the log, I see that the target side keep-alive fired:
> >
> > [20676.867545] eth2: link up, 40Gbps, full-duplex, Tx/Rx PAUSE
> > [20677.079669] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
> > [20677.079684] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
>
> Wow, two keep-alive timeouts on the same controller? That is
> seriously wrong...
>
> > [20677.088402] nvmet_rdma: freeing queue 276
> > [20677.090981] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000120
> > [20677.090988] IP: [] nvmet_rdma_queue_disconnect+0x24/0x90
> > [nvmet_rdma]
>
> No stack trace?

Sure:

[20677.090992] Oops: 0000 [#1] SMP
[20677.091030] Modules linked in: iw_cxgb4(OE) cxgb4(OE) null_blk brd nvmet_rdma nvmet ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge 8021q mrp garp stp llc ipmi_devintf cachefiles fscache rdma_ucm rdma_cm iw_cm ib_ipoib ib_cm ib_uverbs ib_umad ocrdma be2net iw_nes libcrc32c iw_cxgb3 cxgb3 mdio ib_qib rdmavt mlx5_ib mlx5_core mlx4_ib mlx4_en mlx4_core ib_mthca ib_core binfmt_misc dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan vhost tun kvm irqbypass uinput iTCO_wdt iTCO_vendor_support mxm_wmi pcspkr dm_mod i2c_i801 i2c_smbus sg ipmi_si ipmi_msghandler nvme nvme_core lpc_ich mfd_core
[20677.091040]  mei_me mei igb dca ptp pps_core wmi ext4(E) mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) libata(E) mgag200(E) ttm(E) drm_kms_helper(E) drm(E) fb_sys_fops(E) sysimgblt(E) sysfillrect(E) syscopyarea(E) i2c_algo_bit(E) i2c_core(E) [last unloaded: cxgb4]
[20677.091043] CPU: 10 PID: 24331 Comm: kworker/u64:1 Tainted: G OE 4.8.0 #130
[20677.091044] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
[20677.091049] Workqueue: iw_cm_wq cm_work_handler [iw_cm]
[20677.091050] task: ffff88102fa4ab80 task.stack: ffff880f111d8000
[20677.091053] RIP: 0010:[] [] nvmet_rdma_queue_disconnect+0x24/0x90 [nvmet_rdma]
[20677.091054] RSP: 0018:ffff880f111db968 EFLAGS: 00010296
[20677.091055] RAX: ffff88102fa4ab80 RBX: 0000000000000000 RCX: 0000000000000000
[20677.091056] RDX: 0000000000000005 RSI: 000000000000000a RDI: ffffffffa084f040
[20677.091056] RBP: ffff880f111db998 R08: ffff88102a30e358 R09: ffff880f111dba78
[20677.091058] R10: 0000000000000c2c R11: 0000000000000001 R12: 0000000000000000
[20677.091059] R13: ffff880805ca2c00 R14: ffff880f111db9e8 R15: ffff880f111dbc48
[20677.091060] FS:  0000000000000000(0000) GS:ffff88105f280000(0000) knlGS:0000000000000000
[20677.091061] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[20677.091062] CR2: 0000000000000120 CR3: 000000105a203000 CR4: 00000000000406e0
[20677.091063] Stack:
[20677.091065]  ffff881053f6c100 0000000000000046 000012ce42a8b7b3 ffff88105bdf7000
[20677.091067]  ffff880f111db9e8 0000000000000000 ffff880f111db9c8 ffffffffa084d484
[20677.091069]  ffff880f111dbb28 ffff880805ca2c00 ffff880805ca2e70 ffff880f111db9e8
[20677.091070] Call Trace:
[20677.091074]  [] nvmet_rdma_cm_handler+0x154/0x1a0 [nvmet_rdma]
[20677.091078]  [] cma_iw_handler+0x110/0x1c0 [rdma_cm]
[20677.091084]  [] ? ring_buffer_lock_reserve+0x7e/0xc0
[20677.091086]  [] cm_close_handler+0x93/0xc0 [iw_cm]
[20677.091089]  [] process_event+0xd7/0xf0 [iw_cm]
[20677.091092]  [] cm_work_handler+0x130/0x190 [iw_cm]
[20677.091097]  [] ? trace_event_raw_event_workqueue_execute_start+0x66/0xa0
[20677.091099]  [] process_one_work+0x183/0x4d0
[20677.091103]  [] ? __schedule+0x1f0/0x5b0
[20677.091106]  [] ? schedule+0x40/0xb0
[20677.091108]  [] worker_thread+0x16d/0x530
[20677.091110]  [] ? __schedule+0x1f0/0x5b0
[20677.091115]  [] ? __wake_up_common+0x56/0x90
[20677.091117]  [] ? maybe_create_worker+0x120/0x120
[20677.091119]  [] ? schedule+0x40/0xb0
[20677.091121]  [] ? maybe_create_worker+0x120/0x120
[20677.091123]  [] kthread+0xcc/0xf0
[20677.091126]  [] ? schedule_tail+0x1e/0xc0
[20677.091129]  [] ret_from_fork+0x1f/0x40
[20677.091131]  [] ? kthread_freezable_should_stop+0x70/0x70
[20677.091149] Code: 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 30 48 89 5d f0 4c 89 65 f8 66 66 66 66 90 48 89 fb 48 c7 c7 40 f0 84 a0 e8 1c 55 e9 e0 <48> 8b 83 20 01 00 00 4c 8d a3 20 01 00 00 49 39 c4 48 89 45 e0
[20677.091152] RIP  [] nvmet_rdma_queue_disconnect+0x24/0x90 [nvmet_rdma]
[20677.091152]  RSP
[20677.091153] CR2: 0000000000000120

> >
> > So maybe there is just a race in that keep-alive can free the queue and
> > yet a DISCONNECTED event is still received on the cm_id after the queue
> > is freed?
>
> rdma_destroy_id should barrier this scenario.

I'm guessing the cm_id hasn't been destroyed.  But the qp has...
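If that is indeed what is happening (cm_id still alive, qp and queue already
torn down by the keep-alive path), one possible band-aid on the target side
would be to tolerate a NULL queue for the disconnect-type events. Something
like the untested sketch below, just to illustrate the idea:

	case RDMA_CM_EVENT_ADDR_CHANGE:
	case RDMA_CM_EVENT_DISCONNECTED:
	case RDMA_CM_EVENT_TIMEWAIT_EXIT:
		/*
		 * The qp (and thus the queue) may already be gone because
		 * queue release is in progress, so just ignore the event.
		 */
		if (queue)
			nvmet_rdma_queue_disconnect(queue);
		break;

That doesn't answer the real question, though, which is whether it is legal
for a DISCONNECTED event to show up on a cm_id whose qp has already been
destroyed.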