From mboxrd@z Thu Jan 1 00:00:00 1970 From: James Bottomley Subject: Re: Deadlock in usb-storage error handling Date: Fri, 28 Mar 2014 09:52:22 -0700 Message-ID: <1396025542.15365.17.camel@dabdike> References: Mime-Version: 1.0 Content-Type: text/plain; charset="ISO-8859-15" Content-Transfer-Encoding: 7bit Return-path: Received: from bedivere.hansenpartnership.com ([66.63.167.143]:44089 "EHLO bedivere.hansenpartnership.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752450AbaC1QwZ (ORCPT ); Fri, 28 Mar 2014 12:52:25 -0400 In-Reply-To: Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: Alan Stern Cc: Andreas Reis , USB list , SCSI development list On Thu, 2014-03-20 at 15:49 -0400, Alan Stern wrote: > On Thu, 20 Mar 2014, James Bottomley wrote: > > > On Thu, 2014-03-20 at 12:34 -0400, Alan Stern wrote: > > > On Thu, 20 Mar 2014, James Bottomley wrote: > > > > > > > OK, so I think we have three things to do > > > > > > > > 1. Investigate SCSI and fix it's abort state problem that's causing > > > > it not to send the abort second time around > > > > 2. Fix usb-storage to fail a reset it can't do (i.e. device reset > > > > with outstanding commands) > > > > 3. Find out why we're sending a spurious request sense. > > > > > > > > I can look at 1 and 3 if you want to take 2. > > > > > > It's a deal! Thanks for your help. > > > > OK, I think this is the fix for 1, if you could try it out. > > > > Thanks, > > > > James > > > > --- > > > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > > index 771c16b..c52bfb2 100644 > > --- a/drivers/scsi/scsi_error.c > > +++ b/drivers/scsi/scsi_error.c > > @@ -145,14 +145,14 @@ scmd_eh_abort_handler(struct work_struct *work) > > "scmd %p retry " > > "aborted command\n", scmd)); > > scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY); > > - return; > > + goto out; > > } else { > > SCSI_LOG_ERROR_RECOVERY(3, > > scmd_printk(KERN_WARNING, scmd, > > "scmd %p finish " > > "aborted command\n", scmd)); > > scsi_finish_command(scmd); > > - return; > > + goto out; > > } > > } else { > > SCSI_LOG_ERROR_RECOVERY(3, > > @@ -162,6 +162,8 @@ scmd_eh_abort_handler(struct work_struct *work) > > } > > } > > > > + scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED; > > + > > if (!scsi_eh_scmd_add(scmd, 0)) { > > SCSI_LOG_ERROR_RECOVERY(3, > > scmd_printk(KERN_WARNING, scmd, > > @@ -170,6 +172,10 @@ scmd_eh_abort_handler(struct work_struct *work) > > scmd->result |= DID_TIME_OUT << 16; > > scsi_finish_command(scmd); > > } > > + return; > > + out: > > + scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED; > > + return; > > } > > > > /** > > This worked the first time. :-) > > But I wonder, is it safe to access scmd after calling > scsi_finish_command()? Agree, I've redone the patch integrated into the try_to_abort call instead. Will post shortly. James