From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Steve Wise"
Subject: RE: stuck iscsi/iser target with linux-4.15.0-rc1
Date: Wed, 13 Dec 2017 15:40:12 -0600
Message-ID: <01bd01d3745a$f4bc82f0$de3588d0$@opengridcomputing.com>
References: <000801d36ac6$9e9f5f70$dbde1e50$@opengridcomputing.com> <0ba7e891-f020-26fb-9945-9e824332593c@grimberg.me> <018901d36d17$6a703410$3f509c30$@opengridcomputing.com> <1dee9f68-a81b-b7b8-9e70-e0ef5c63c520@grimberg.me>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8BIT
In-Reply-To: <1dee9f68-a81b-b7b8-9e70-e0ef5c63c520-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
Content-Language: en-us
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: 'Sagi Grimberg', 'target-devel'
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-rdma@vger.kernel.org

> > [239800.115739] target_wait_for_sess_cmds: Waiting for se_cmd:
> ffff88034082c998 t_state: 6, fabric state: 12
>
> Hmm, this means that the command was delegated to isert to send
> data+response... Which means we lose a reference put somewhere here.
>
> I'm assuming that this happens before your changes to ib_drain_qp
> correct? If this does not happen without your changes it might indicate
> that drain_qp is missing an error (or successful?) completion which
> would prevent a final reference drop (isert_completion_put).

Hey Sagi,

I'm trying to reproduce this on CX4 cards with mlx5. I have the two nodes set up via RoCEv2 and rping works over mlx5 fine, but when I try to discover the iSER targets, the initiator fails with:

[root@potato1 ~]# iscsiadm -m discovery -t sendtargets -p 172.16.99.239:3260 -I iser
iscsiadm: recv's end state machine bug?
iscsiadm: Could not perform SendTargets discovery: iSCSI PDU timed out
[root@potato1 ~]# uname -r
4.15.0-rc3+

And the target logs this:

[  873.240460] mlx5_0:dump_cqe:277:(pid 494): dump error cqe
[  873.246665] 00000000 00000000 00000000 00000000
[  873.251942] 00000000 00000000 00000000 00000000
[  873.257214] 00000000 00000000 00000000 00000000
[  873.262472] 00000000 00008a12 0a0000f6 00014bd2
[  873.267711] isert: isert_print_wc: send failure: invalid request error (9) vend_err 8a

Any ideas? I'm using straight 4.15.0-rc3 + a workaround to avoid crashing my x86 systems at bootup from here:

https://www.mail-archive.com/netdev@vger.kernel.org/msg203210.html

Thanks,

Steve.
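For anyone reading the error CQE dump in the target log above, here is a rough Python sketch that pulls the fields out of the 64 bytes. It assumes the dump is printed as big-endian 32-bit words and that the layout matches my recollection of struct mlx5_err_cqe in include/linux/mlx5/device.h. The syndrome names in the table are likewise from memory and should be checked against the header. The function name decode_err_cqe is made up for this example.

```python
import struct

# Hex words copied from the "dump error cqe" lines in the target log,
# with the dmesg timestamps stripped.
DUMP = """
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00008a12 0a0000f6 00014bd2
"""

# Assumption: a subset of the MLX5_CQE_SYNDROME_* values, quoted from memory.
SYNDROMES = {
    0x01: "LOCAL_LENGTH_ERR",
    0x02: "LOCAL_QP_OP_ERR",
    0x04: "LOCAL_PROT_ERR",
    0x05: "WR_FLUSH_ERR",
    0x11: "LOCAL_ACCESS_ERR",
    0x12: "REMOTE_INVAL_REQ_ERR",
    0x13: "REMOTE_ACCESS_ERR",
}


def decode_err_cqe(dump: str) -> dict:
    """Decode a 64-byte error CQE given as whitespace-separated hex words."""
    raw = bytes.fromhex("".join(dump.split()))
    if len(raw) != 64:
        raise ValueError(f"expected 64 bytes, got {len(raw)}")

    # Assumed layout: 32 reserved bytes, be32 srqn, 18 reserved bytes,
    # u8 vendor_err_synd, u8 syndrome, be32 s_wqe_opcode_qpn,
    # be16 wqe_counter, u8 signature, u8 op_own.
    (srqn, vendor_err, syndrome, opcode_qpn,
     wqe_counter, signature, op_own) = struct.unpack(">32xI18xBBIHBB", raw)

    return {
        "srqn": srqn,
        "vendor_err_synd": vendor_err,
        "syndrome": syndrome,
        "syndrome_name": SYNDROMES.get(syndrome, "unknown"),
        "wqe_opcode": opcode_qpn >> 24,      # top byte: opcode of the failed WQE
        "qpn": opcode_qpn & 0xFFFFFF,        # low 24 bits: QP number
        "wqe_counter": wqe_counter,
        "signature": signature,
        "cqe_opcode": op_own >> 4,           # high nibble of op_own
    }


if __name__ == "__main__":
    for key, value in decode_err_cqe(DUMP).items():
        print(f"{key}: {value if isinstance(value, str) else hex(value)}")
```

If the layout assumption holds, the vendor_err_synd byte comes out as 0x8a, which agrees with the "vend_err 8a" that isert_print_wc reports. The syndrome byte is 0x12, which would correspond to the "invalid request error (9)" status.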