All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
To: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: James Bottomley <jbottomley-wo1vFcy6AUs@public.gmane.org>,
	Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Sebastian Parschauer
	<sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>,
	linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH 04/12] IB/srp: Fix connection state tracking
Date: Wed, 6 May 2015 11:29:16 +0200	[thread overview]
Message-ID: <5549DEEC.9050501@sandisk.com> (raw)
In-Reply-To: <1430842201.2407.226.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Hello Doug,

On 05/05/15 18:10, Doug Ledford wrote:
> Be that as it may, that doesn't change what I said about posting a
> command to a known disconnected QP.  You could just fail immediately.
> Something like:
>
> if (!ch->connected) {
> 	scmnd->result = DID_NO_CONNECT;
> 	goto err;
> }
>
> right after getting the channel in queuecommand would work.  That would
> save a couple spinlocks, several DMA mappings, a call into the low level
> driver, and a few other things.  (And I only left requeue on the table
> because I wasn't sure how the blk_mq dealt with just a single channel
> being down versus all of them being down)

What you wrote above looks correct to me. However, it is intentional 
that such a check is not present in srp_queuecommand(). The intention 
was to optimize the hot path of that driver as much as possible. Hence 
the choice to post a work request on the QP even after it has been 
disconnected and to let the HCA generate an error completion.

> But my point in all of this is that if you have a single qp between
> yourself and the target, then any error including a qp resource error ==
> path error since you only have one path.  When you have a multi queue
> device, that's no longer true.  A transient resource problem on one qp
> does not mean a path event (at least not necessarily, although your
> statement below converts a QP event into a path event by virtue
> disconnecting and reconnecting all of the QPs).  My curiosity is now
> moot given what you wrote about tearing everything down and reconnecting
> (unless the error handling is modified to be more subtle in its
> workings), but the original question in my mind was what happens at the
> blk_mq level if you did have a single queue drop but not all of them and
> you weren't using multipath.

If we want to support this without adding similar code to handle this in 
every SCSI LLD I think we need to change first how blk-mq and 
dm-multipath interact. Today dm-multipath is a layer on top of blk-mq. 
Supporting the above scenario properly is possible e.g. by integrating 
multipath support in the blk-mq layer. I think Hannes and Christoph have 
already started to work on this.

>> If only one channel fails all other channels are disconnected and the
>> transport layer error handling mechanism is started.
>
> I missed that.  I assume it's done in srp_start_tl_fail_timers()?

Yes, that's correct. Both QP errors and reception of a DREQ trigger a 
call of srp_tl_err_work(). That last function calls 
srp_start_tl_fail_timers() which starts the reconnection mechanism, at 
least if the reconnect_delay parameter has a positive value (> 0).

Bart.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-05-06  9:29 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-30  8:56 [PATCH 0/12] IB/srp patches for kernel v4.2 Bart Van Assche
2015-04-30  8:56 ` [PATCH 01/12] scsi_transport_srp: Introduce srp_wait_for_queuecommand() Bart Van Assche
     [not found]   ` <5541EE4A.30803-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30  9:32     ` Sagi Grimberg
2015-04-30  9:37     ` Christoph Hellwig
2015-04-30 10:26       ` Bart Van Assche
2015-04-30 10:32         ` Sagi Grimberg
2015-04-30 10:58           ` Bart Van Assche
     [not found]             ` <55420AEA.10108-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30 14:13               ` Sagi Grimberg
2015-04-30 17:25               ` Christoph Hellwig
2015-05-06  9:59                 ` Bart Van Assche
2015-05-11  7:50                   ` hosts resets in SRP and the rest of the world, was: " Christoph Hellwig
2015-05-11  8:54                     ` Bart Van Assche
2015-05-11  9:31                       ` Christoph Hellwig
2015-05-11  9:58                         ` Bart Van Assche
2015-05-11 11:47                           ` Christoph Hellwig
2015-05-11 10:58                         ` Bart Van Assche
2015-05-11 11:50                           ` Christoph Hellwig
2015-05-12  8:49                             ` Bart Van Assche
2015-05-12 18:02                               ` Christoph Hellwig
2015-04-30  8:57 ` [PATCH 02/12] scsi_transport_srp: Fix a race condition Bart Van Assche
     [not found]   ` <5541EE66.7090608-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30  9:44     ` Sagi Grimberg
     [not found]       ` <5541F96F.8090503-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-04-30 10:20         ` Bart Van Assche
2015-04-30  8:58 ` [PATCH 04/12] IB/srp: Fix connection state tracking Bart Van Assche
2015-04-30  9:51   ` Sagi Grimberg
2015-04-30 11:25     ` Bart Van Assche
     [not found]       ` <5542111E.1080305-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30 15:00         ` Sagi Grimberg
     [not found]           ` <5542439D.1000107-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-05-05  9:31             ` Bart Van Assche
     [not found]               ` <55488E06.8040308-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-05-05  9:45                 ` Sagi Grimberg
     [not found]                   ` <5548911F.8060505-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-05-05  9:59                     ` Bart Van Assche
2015-04-30 16:08   ` Doug Ledford
     [not found]     ` <1430410094.102408.71.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-05  9:21       ` Bart Van Assche
     [not found]         ` <55488BAE.7070006-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-05-05 14:10           ` Doug Ledford
2015-05-05 14:26             ` Bart Van Assche
2015-05-05 15:10               ` Doug Ledford
2015-05-05 15:27                 ` Bart Van Assche
     [not found]                   ` <5548E155.70007-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-05-05 16:10                     ` Doug Ledford
     [not found]                       ` <1430842201.2407.226.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-06  9:29                         ` Bart Van Assche [this message]
     [not found]                           ` <5549DEEC.9050501-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-05-07 13:44                             ` Doug Ledford
2015-04-30  8:58 ` [PATCH 05/12] IB/srp: Fix reconnection failure handling Bart Van Assche
2015-04-30  8:59 ` [PATCH 06/12] scsi_transport_srp: Reduce failover time Bart Van Assche
2015-04-30 10:13   ` Sagi Grimberg
2015-04-30 11:02     ` Bart Van Assche
     [not found]       ` <55420BAA.7060507-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30 15:14         ` Sagi Grimberg
     [not found]           ` <554246E6.9020503-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-05-05  9:38             ` Bart Van Assche
2015-04-30  9:00 ` [PATCH 07/12] IB/srp: Remove superfluous casts Bart Van Assche
2015-04-30 10:13   ` Sagi Grimberg
2015-04-30  9:00 ` [PATCH 08/12] IB/srp: Rearrange module description Bart Van Assche
     [not found]   ` <5541EF39.6040301-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30 10:15     ` Sagi Grimberg
2015-04-30  9:01 ` [PATCH 09/12] IB/srp: Remove a superfluous check from srp_free_req_data() Bart Van Assche
     [not found]   ` <5541EF4F.6050200-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30 10:18     ` Sagi Grimberg
2015-04-30 10:37       ` Bart Van Assche
2015-04-30  9:01 ` [PATCH 10/12] IB/srp: Remove !ch->target tests from the reconnect code Bart Van Assche
2015-04-30 10:19   ` Sagi Grimberg
     [not found] ` <5541EE21.3050809-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30  8:57   ` [PATCH 03/12] IB/srp: Remove an extraneous scsi_host_put() from an error path Bart Van Assche
2015-04-30  9:44     ` Sagi Grimberg
2015-04-30  9:02   ` [PATCH 11/12] IB/srp: Add 64-bit LUN support Bart Van Assche
2015-04-30  9:02   ` [PATCH 12/12] IB/srp: Make CM timeout dependent on subnet timeout Bart Van Assche
     [not found]     ` <5541EFB3.6030704-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2015-04-30 10:27       ` Sagi Grimberg
2015-04-30 10:45         ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5549DEEC.9050501@sandisk.com \
    --to=bart.vanassche-xdaiopvojttbdgjk7y7tuq@public.gmane.org \
    --cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=jbottomley-wo1vFcy6AUs@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.