From: Ming Lei <ming.lei@redhat.com>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Bart Van Assche <bvanassche@acm.org>,
Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
linux-nvme@lists.infradead.org
Subject: Re: [PATCH 1/2] blk-mq: introduce blk_mq_complete_request_sync()
Date: Thu, 21 Mar 2019 09:39:09 +0800 [thread overview]
Message-ID: <20190321013908.GA15115@ming.t460p> (raw)
In-Reply-To: <448615db-64e2-cbe7-c09e-19b2d86a720a@grimberg.me>
On Wed, Mar 20, 2019 at 05:47:01PM -0700, Sagi Grimberg wrote:
>
> > Hi Ming,
> >
> > Just like the NVMeOF initiator driver, the SRP initiator driver uses an
> > RDMA RC connection for all of its communication over the network. If
> > communication between initiator and target fails the target driver will
> > close the connection or one of the work requests that was posted by the
> > initiator driver will complete with an error status (wc->status !=
> > IB_WC_SUCCESS). In the latter case the function srp_handle_qp_err() will
> > try to reestablish the connection between initiator and target after a
> > certain delay:
> >
> > if (delay > 0)
> > queue_delayed_work(system_long_wq, &rport->reconnect_work,
> > 1UL * delay * HZ);
> >
> > SCSI timeouts may kick the SCSI error handler. That results in calls of
> > the srp_reset_device() and/or srp_reset_host() functions. srp_reset_host()
> > terminates all outstanding requests after having disconnected the RDMA RC
> > connection. Disconnecting the RC connection first guarantees that there
> > are no concurrent request completion calls from the regular completion
> > path and from the error handler.
>
> Hi Bart,
>
> If I understand the race correctly, its not between the requests
> completion and the queue pairs removal nor the timeout handler
> necessarily, but rather it is between the async requests completion and
> the tagset deallocation.
>
> Think of surprise removal (or disconnect) during I/O, drivers
> usually stop/quiesce/freeze the queues, terminate/abort inflight
> I/Os and then teardown the hw queues and the tagset.
>
> IIRC, the same race holds for srp if this happens during I/O:
> 1. srp_rport_delete() -> srp_remove_target() -> srp_stop_rport_timers() ->
> __rport_fail_io_fast()
>
> 2. complete all I/Os (async remotely via smp)
>
> Then continue..
>
> 3. scsi_host_put() -> scsi_host_dev_release() -> scsi_mq_destroy_tags()
>
> What is preventing (3) from happening before (2) if its async? I would
> think that scsi drivers need the exact same thing...
blk_cleanup_queue() will do that, but it can't be used in device recovery
obviously.
BTW, blk_mq_complete_request_sync() is a bit misleading, maybe
blk_mq_complete_request_locally() is better.
Thanks,
Ming
next prev parent reply other threads:[~2019-03-21 1:39 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-18 3:29 [PATCH 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
2019-03-18 3:29 ` [PATCH 1/2] blk-mq: introduce blk_mq_complete_request_sync() Ming Lei
2019-03-18 4:09 ` Bart Van Assche
2019-03-18 7:38 ` Ming Lei
2019-03-18 15:04 ` Bart Van Assche
2019-03-18 15:16 ` Ming Lei
2019-03-18 15:49 ` Bart Van Assche
2019-03-18 16:06 ` Ming Lei
2019-03-21 0:47 ` Sagi Grimberg
2019-03-21 1:39 ` Ming Lei [this message]
2019-03-21 2:04 ` Sagi Grimberg
2019-03-21 2:32 ` Ming Lei
2019-03-21 21:40 ` Sagi Grimberg
2019-03-27 8:27 ` Christoph Hellwig
2019-03-21 2:15 ` Bart Van Assche
2019-03-21 2:13 ` Sagi Grimberg
2019-03-18 14:40 ` Keith Busch
2019-03-18 17:30 ` James Smart
2019-03-18 17:37 ` James Smart
2019-03-19 1:06 ` Ming Lei
2019-03-19 3:37 ` James Smart
2019-03-19 3:50 ` Ming Lei
2019-03-19 1:31 ` Ming Lei
2019-03-19 4:04 ` James Smart
2019-03-19 4:28 ` Ming Lei
2019-03-27 8:30 ` Christoph Hellwig
2019-03-18 3:29 ` [PATCH 2/2] nvme: cancel request synchronously Ming Lei
2019-03-27 8:30 ` Christoph Hellwig
2019-03-27 2:06 ` [PATCH 0/2] blk-mq/nvme: " Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190321013908.GA15115@ming.t460p \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=hch@lst.de \
--cc=linux-block@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).