From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
Subject: Re: hosts resets in SRP and the rest of the world, was: Re: [PATCH 01/12] scsi_transport_srp: Introduce srp_wait_for_queuecommand()
Date: Mon, 11 May 2015 12:58:03 +0200
Message-ID: <55508B3B.9080806@sandisk.com>
References: <5541EE21.3050809@sandisk.com> <5541EE4A.30803@sandisk.com> <20150430093719.GA23486@infradead.org> <5542034D.5010300@sandisk.com> <554204D7.9050204@dev.mellanox.co.il> <55420AEA.10108@sandisk.com> <20150430172516.GA19200@infradead.org> <5549E600.9050208@sandisk.com> <20150511075058.GA18483@infradead.org> <55506E46.2060103@sandisk.com> <20150511093130.GA30217@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-bl2on0067.outbound.protection.outlook.com ([65.55.169.67]:60657 "EHLO na01-bl2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753496AbbEKK6M (ORCPT ); Mon, 11 May 2015 06:58:12 -0400
In-Reply-To: <20150511093130.GA30217@infradead.org>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Christoph Hellwig
Cc: Sagi Grimberg , Doug Ledford , James Bottomley , Sagi Grimberg , Sebastian Parschauer , Jens Axboe , "linux-scsi@vger.kernel.org" , Hannes Reinecke

On 05/11/15 11:31, Christoph Hellwig wrote:
> On Mon, May 11, 2015 at 10:54:30AM +0200, Bart Van Assche wrote:
>> Hello Christoph,
>>
>> There are multiple events that can cause the SRP initiator driver to
>> initiate a reconnect:
>> 1. The SCSI core invoking eh_host_reset_handler().
>> 2. An error reported by the IB HCA or by the IB core, e.g. an RDMA
>> transmit timeout or a transport layer disconnect reported by the
>> IB/CM.
>
> Right, I missed the srp_reconnect_work case. But even with that I
> think what I wrote above still stands. srp_reconnect_work in that
> case would just directly trigger the abort all commands and
> reconnect operation.
>
> The main point I was trying to make is that instead of having a sequence
> of:
>
> 1) block new queuecommand instances
> 2) flush out pending queuecommand instances
> 3) do part of the disconnect
> 4) fail all in-flight commands
> 5) reconnect
>
> we should aim for:
>
> 1) block new queuecommand instances
> 2) fail all in-flight commands
> 3) disconnect and reconnect
>
> to avoid the need to keep track of pending queuecommand instances,
> and instead re-use the existing infrastructure to fail all in-flight
> commands, which we have the infrastructure for, and which we need
> to do anyway.

Hello Christoph,

What I'm wondering about is whether it will be possible with the above
approach to trigger path failover before (2 * SCSI timeout) has expired?
Starting SCSI error handling immediately after the block layer has
reported the first SCSI timeout is only safe if all ongoing SCSI
commands are canceled in some way. Is this what the function
blk_abort_request() is intended for? As far as I can see, invoking that
function or any function with a similar purpose is only safe after the
queuecommand() callback function has finished. However,
blk_mq_run_hw_queue() invokes mq_ops->queue_rq() without holding any
lock. So it's not clear to me how to safely cancel ongoing blk-mq
requests without waiting until these have timed out. I hope that this
means that I have overlooked something?

Bart.
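
For illustration only, the three-step sequence proposed in the quoted text
(block new queuecommand instances, fail all in-flight commands, then
disconnect and reconnect) can be modelled in user-space C roughly as
follows. This is a sketch under stated assumptions, not driver code: every
name in it (struct model_host, model_queuecommand(), model_fail_in_flight(),
model_reconnect()) is hypothetical and none of them is a kernel API. The
model is single-threaded, so it deliberately does not reproduce the
concurrency question raised above about queue_rq() being invoked without
holding any lock.

```c
/* User-space model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CMDS 8

enum cmd_state { CMD_FREE, CMD_IN_FLIGHT, CMD_FAILED };

struct model_host {
	bool blocked;                  /* true while new submissions are rejected */
	bool connected;                /* true while the transport is usable */
	enum cmd_state cmds[MAX_CMDS]; /* one slot per outstanding command */
};

/*
 * Hypothetical stand-in for queuecommand(): claims a free slot and returns
 * its index, or -1 if the host is blocked, disconnected or full.
 */
int model_queuecommand(struct model_host *h)
{
	if (h->blocked || !h->connected)
		return -1;
	for (size_t i = 0; i < MAX_CMDS; i++) {
		if (h->cmds[i] == CMD_FREE) {
			h->cmds[i] = CMD_IN_FLIGHT;
			return (int)i;
		}
	}
	return -1;
}

/* Step 2: fail every in-flight command; returns how many were failed. */
int model_fail_in_flight(struct model_host *h)
{
	int n = 0;

	for (size_t i = 0; i < MAX_CMDS; i++) {
		if (h->cmds[i] == CMD_IN_FLIGHT) {
			h->cmds[i] = CMD_FAILED;
			n++;
		}
	}
	return n;
}

/* The three-step sequence; returns the number of commands failed. */
int model_reconnect(struct model_host *h)
{
	int failed;

	h->blocked = true;                    /* 1) block new queuecommand instances */
	failed = model_fail_in_flight(h);     /* 2) fail all in-flight commands */
	h->connected = false;                 /* 3) disconnect ... */
	for (size_t i = 0; i < MAX_CMDS; i++) /*    nothing may still be in flight */
		assert(h->cmds[i] != CMD_IN_FLIGHT);
	h->connected = true;                  /*    ... and reconnect */
	h->blocked = false;
	return failed;
}
```

In this model, failed slots are left in the CMD_FAILED state so that a
caller can inspect them afterwards. A real driver would complete those
commands back to the SCSI core, and would also have to deal with
queuecommand() calls that are still executing when step 2 starts, which
is the open question in this message.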