linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Potnuri Bharat Teja <bharat@chelsio.com>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: target-devel@vger.kernel.org, nab@linux-iscsi.org,
	linux-scsi@vger.kernel.org, swise@opengridcomputing.com,
	varun@chelsio.com, rajur@chelsio.com
Subject: Re: IB/isert: Return value of iser target transport handlers ignored by iscsi target
Date: Thu, 18 Aug 2016 17:33:02 +0530	[thread overview]
Message-ID: <20160818120249.GA29925@chelsio.com> (raw)
In-Reply-To: <8f3ad0ed-a17d-571a-7e1a-eb61caa263c9@grimberg.me>

On Sunday, August 08/07/16, 2016 at 20:09:08 +0300, Sagi Grimberg wrote:
> >Hi,
> 
> Hi Baharat,
> 
> >In iSER target during iwarp connection tear-down due to ping timeouts, the rdma
> >queues are set to error state and subsequent queued iscsi session commands posted
> >shall fail with corresponding errno returned by ib_post_send/recv.
> >At this stage iser target handlers (Ex: isert_put_datain())
> >return the error to iscsci target, but these errors are
> >not being handled by the iscsi target handlers(Ex: lio_queue_status()).
> 
> Indeed looks like lio_queue_data_in() assumes that
> ->iscsit_queue_data_in() cannot fail. This either mean
> that isert_put_data_in() should take care of
> the extra put in this case, or the iscsi should correctly
> handle a queue_full condition (because we should not be hitting
> this in the normal IO flow).
> 
> Does this (untested) patch help?
> --
> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c
> b/drivers/infiniband/ulp/isert/ib_isert.c
> index a990c04208c9..ff5dfc0f7c50 100644
> --- a/drivers/infiniband/ulp/isert/ib_isert.c
> +++ b/drivers/infiniband/ulp/isert/ib_isert.c
> @@ -1810,6 +1810,7 @@ isert_put_response(struct iscsi_conn *conn, struct
> iscsi_cmd *cmd)
>         struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
>         struct iscsi_scsi_rsp *hdr = (struct iscsi_scsi_rsp *)
>                                 &isert_cmd->tx_desc.iscsi_header;
> +       int ret;
> 
>         isert_create_send_desc(isert_conn, isert_cmd, &isert_cmd->tx_desc);
>         iscsit_build_rsp_pdu(cmd, conn, true, hdr);
> @@ -1848,7 +1849,10 @@ isert_put_response(struct iscsi_conn *conn, struct
> iscsi_cmd *cmd)
> 
>         isert_dbg("Posting SCSI Response\n");
> 
> -       return isert_post_response(isert_conn, isert_cmd);
> +       ret = isert_post_response(isert_conn, isert_cmd);
> +       if (ret)
> +               target_put_sess_cmd(&cmd->se_cmd);
> +       return 0;
>  }
> 
>  static void
> @@ -2128,7 +2132,7 @@ isert_put_datain(struct iscsi_conn *conn, struct
> iscsi_cmd *cmd)
>         struct isert_conn *isert_conn = conn->context;
>         struct ib_cqe *cqe = NULL;
>         struct ib_send_wr *chain_wr = NULL;
> -       int rc;
> +       int ret;
> 
>         isert_dbg("Cmd: %p RDMA_WRITE data_length: %u\n",
>                  isert_cmd, se_cmd->data_length);
> @@ -2148,34 +2152,41 @@ isert_put_datain(struct iscsi_conn *conn, struct
> iscsi_cmd *cmd)
>                 isert_init_send_wr(isert_conn, isert_cmd,
>                                    &isert_cmd->tx_desc.send_wr);
> 
> -               rc = isert_post_recv(isert_conn, isert_cmd->rx_desc);
> -               if (rc) {
> -                       isert_err("ib_post_recv failed with %d\n", rc);
> -                       return rc;
> +               ret = isert_post_recv(isert_conn, isert_cmd->rx_desc);
> +               if (ret) {
> +                       isert_err("ib_post_recv failed with %d\n", ret);
> +                       goto out;
>                 }
> 
>                 chain_wr = &isert_cmd->tx_desc.send_wr;
>         }
> 
> -       isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr);
> +       ret = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr);
>         isert_dbg("Cmd: %p posted RDMA_WRITE for iSER Data READ\n",
> isert_cmd);
> -       return 1;
> +out:
> +       if (ret)
> +               target_put_sess_cmd(&cmd->se_cmd);
> +       return 0;
>  }
> 
>  static int
>  isert_get_dataout(struct iscsi_conn *conn, struct iscsi_cmd *cmd, bool
> recovery)
>  {
>         struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
> +       int ret;
> 
>         isert_dbg("Cmd: %p RDMA_READ data_length: %u write_data_done: %u\n",
>                  isert_cmd, cmd->se_cmd.data_length, cmd->write_data_done);
> 
>         isert_cmd->tx_desc.tx_cqe.done = isert_rdma_read_done;
> -       isert_rdma_rw_ctx_post(isert_cmd, conn->context,
> +       ret = isert_rdma_rw_ctx_post(isert_cmd, conn->context,
>                         &isert_cmd->tx_desc.tx_cqe, NULL);
> 
>         isert_dbg("Cmd: %p posted RDMA_READ memory for ISER Data WRITE\n",
>                  isert_cmd);+out:
> +       if (ret)
> +               target_put_sess_cmd(&cmd->se_cmd);
>         return 0;
>  }
> 
> --
>
Hi Sagi,
We found another issue in cq poll mechanism within iwarp module,
masking the above failure, so had to fix it first. Now I am able to
reproduce the actual stall due to pending underefed commands in command
list.
The above patch doesn't hold good for all cases and I see the same stall
waiting for completion event. Debugging further there are few more post
commands like isert_put_nopin(), which are also needed to be modified to
handle the post send failures.
I tried with few more changes in addition to the above change to handle
all possible post send/recv failures during ping timeout. Here is the  
snippet of change I tried:

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index ba6be06..a96f16c 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
static void
@@ -1892,6 +1897,7 @@ isert_put_nopin(struct iscsi_cmd *cmd, struct
iscsi_conn *conn,
        struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
        struct isert_conn *isert_conn = conn->context;
        struct ib_send_wr *send_wr =
			 &isert_cmd->tx_desc.send_wr;
+       int ret;

	isert_create_send_desc(isert_conn,
				 isert_cmd, &isert_cmd->tx_desc);
	iscsit_build_nopin_rsp(cmd, conn, (struct iscsi_nopin *)
@@ -1902,7 +1908,11 @@
	isert_put_nopin(struct iscsi_cmd *cmd, struct
					 iscsi_conn *conn,

	isert_dbg("conn %p Posting NOPIN Response\n", isert_conn);

-       return isert_post_response(isert_conn, isert_cmd);
+       ret = isert_post_response(isert_conn, isert_cmd);
+       if (ret)
+		 target_put_sess_cmd(&cmd->se_cmd);
+
+       return ret;
}

Similar change in all function calls of switch case of
isert_response_queue().

Yet that doesnt solve it all. With all above changes I see random
crashes due to null pointer dereference. I believe these crashes are due
to some race between isert threads.

I am currently dubugging those failures to see if our changes are proper
fix for the issue.

Will update with more details.

> >
> >-> While closing the session in case of ping timeouts, isert_close_connection()->
> >isert_wait_conn()->isert_wait4cmds() checks for the queued session commands
> >and waits infinitely for command completion 'cmd_wait_comp' in target_wait_for_sess_cmds().
> >'cmd_wait_comp' will be never complete as the kref on the session command is
> >not derefed, due to which target_release_cmd_kref() is not called by kref_put().
> >Due to this, the older session is not cleared causing the next login negotiation to fail
> >as the older session is still active(Older SID exists).
> 
> Makes sense...
> 
> >[Query 1] If the return value of ib_post_send/recv() are handled to deref the
> >corresponding queued session commands, the wait on cmd_wait_comp will be
> >complete and clears off the session successfully. Is this the rightway to
> >do it here?
> 
> I think that given that ->iscsit_queue_data_in() and
> ->iscsit_queue_status() are never expected to fail we probably
> should take care of it in isert...
> 
> Nic, any input on the iscsit side?
It is the same on the iscsit side too, Most of the code doesnt expect
for return value or failures.
> 
> >
> >[Query 2] An extra deref is done in case of transport_status CMD_T_TAS
> >in target_wait_for_sess_cmds(), can similar
> >implementation be done for transport state CMD_T_FABRIC_STOP?
> 
> I think that can work also given that target_sess_cmd_list_set_waiting()
> is indeed set when we are starting teardown. Not sure how this will
> affect other transports though...
Thanks,
Bharat.

      reply	other threads:[~2016-08-18 12:03 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-04 10:35 IB/isert: Return value of iser target transport handlers ignored by iscsi target Potnuri Bharat Teja
2016-08-07 17:09 ` Sagi Grimberg
2016-08-18 12:03   ` Potnuri Bharat Teja [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160818120249.GA29925@chelsio.com \
    --to=bharat@chelsio.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=nab@linux-iscsi.org \
    --cc=rajur@chelsio.com \
    --cc=sagi@grimberg.me \
    --cc=swise@opengridcomputing.com \
    --cc=target-devel@vger.kernel.org \
    --cc=varun@chelsio.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).