From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
Subject: Re: [PATCH 02/12] scsi_transport_srp: Fix a race condition
Date: Thu, 30 Apr 2015 12:20:59 +0200
Message-ID: <5542020B.40704@sandisk.com>
References: <5541EE21.3050809@sandisk.com> <5541EE66.7090608@sandisk.com> <5541F96F.8090503@dev.mellanox.co.il>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <5541F96F.8090503-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Sagi Grimberg, Doug Ledford
Cc: James Bottomley, Sagi Grimberg, Sebastian Parschauer, linux-rdma, "linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
List-Id: linux-rdma@vger.kernel.org

On 04/30/15 11:44, Sagi Grimberg wrote:
> On 4/30/2015 11:57 AM, Bart Van Assche wrote:
>> Avoid that srp_terminate_io() can get invoked while srp_queuecommand()
>> is in progress. This patch avoids that an I/O timeout can trigger the
>> following kernel warning:
>>
>> WARNING: at drivers/infiniband/ulp/srp/ib_srp.c:1447 srp_terminate_io+0xef/0x100 [ib_srp]()
>> Call Trace:
>> [] dump_stack+0x4e/0x68
>> [] warn_slowpath_common+0x81/0xa0
>> [] warn_slowpath_null+0x1a/0x20
>> [] srp_terminate_io+0xef/0x100 [ib_srp]
>> [] __rport_fail_io_fast+0xba/0xc0 [scsi_transport_srp]
>> [] rport_fast_io_fail_timedout+0xe0/0xf0 [scsi_transport_srp]
>> [] process_one_work+0x1db/0x780
>> [] worker_thread+0x11b/0x450
>> [] kthread+0xe4/0x100
>> [] ret_from_fork+0x7c/0xb0
>>
>> See also patch "scsi_transport_srp: Add transport layer error
>> handling" (commit ID 29c17324803c).
>>
>> Signed-off-by: Bart Van Assche
>> Cc: James Bottomley
>> Cc: Sagi Grimberg
>> Cc: Sebastian Parschauer
>> Cc: #v3.13
>> ---
>>  drivers/scsi/scsi_transport_srp.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
>> index 6ce1c48..4a44337 100644
>> --- a/drivers/scsi/scsi_transport_srp.c
>> +++ b/drivers/scsi/scsi_transport_srp.c
>> @@ -437,8 +437,10 @@ static void __rport_fail_io_fast(struct srp_rport *rport)
>>
>>  	/* Involve the LLD if possible to terminate all I/O on the rport. */
>>  	i = to_srp_internal(shost->transportt);
>> -	if (i->f->terminate_rport_io)
>> +	if (i->f->terminate_rport_io) {
>> +		srp_wait_for_queuecommand(shost);
>>  		i->f->terminate_rport_io(rport);
>> +	}
>
> Why not just terminate the inflight IO before unblocking the target?

Sorry, but I don't think that would prevent the described race
condition. The call trace in the description of this patch illustrates
that srp_queuecommand() can still be active even after the transport
state has been changed to "offline". Hence, if terminate_rport_io()
were invoked earlier, the same race would still exist.

Bart.
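
P.S. For readers following the thread, the ordering argument above can be replayed with a small toy model in plain C. This is only a sketch under stated assumptions: it is single-threaded, it is not kernel code, and every name in it (struct model_rport and the model_*() helpers) is hypothetical and chosen for illustration. In particular, model_wait_for_queuecommand() is not the implementation of srp_wait_for_queuecommand(); it merely lets the pending submitter finish so that the ordering becomes visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded model. All names are hypothetical, not kernel APIs. */
struct model_rport {
	bool offline;         /* transport state has been set to "offline" */
	int in_queuecommand;  /* submitters currently inside queuecommand */
	int warnings;         /* terminate calls that saw an active submitter */
};

/* A submitter checks the state first and only then enters queuecommand. */
static bool model_queuecommand_enter(struct model_rport *r)
{
	if (r->offline)
		return false;
	r->in_queuecommand++;
	return true;
}

static void model_queuecommand_exit(struct model_rport *r)
{
	assert(r->in_queuecommand > 0);
	r->in_queuecommand--;
}

/*
 * Stand-in for waiting: in this model the pending submitters simply run
 * to completion here. Real code would have to sleep or poll instead.
 */
static void model_wait_for_queuecommand(struct model_rport *r)
{
	while (r->in_queuecommand > 0)
		model_queuecommand_exit(r);
}

/* Terminating I/O while a submitter is still active counts as a warning. */
static void model_terminate_io(struct model_rport *r)
{
	if (r->in_queuecommand > 0)
		r->warnings++;
}

/* Replays the interleaving and returns the number of warnings observed. */
static int model_fail_io_fast(bool wait_first)
{
	struct model_rport r = { .offline = false, .in_queuecommand = 0, .warnings = 0 };

	model_queuecommand_enter(&r);  /* submitter passes the state check */
	r.offline = true;              /* state changes while it is still inside */
	if (wait_first)
		model_wait_for_queuecommand(&r);
	model_terminate_io(&r);
	return r.warnings;
}
```

In this model, model_fail_io_fast(false) reports one warning because the submitter entered before the state changed, whereas model_fail_io_fast(true) reports none. Moving the terminate call earlier would not help, since the submitter could be inside queuecommand at that point as well.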