From mboxrd@z Thu Jan 1 00:00:00 1970 From: Hannes Reinecke Subject: Re: [PATCH 1/3] Remove two cancel_delayed_work() calls from the error handler Date: Tue, 27 May 2014 08:22:51 +0200 Message-ID: <53842F3B.5010106@suse.de> References: <538359DB.9080601@acm.org> <53835A52.9010306@acm.org> <53842545.50502@suse.de> <53842BE7.5060304@acm.org> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Return-path: Received: from cantor2.suse.de ([195.135.220.15]:53764 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751605AbaE0GXI (ORCPT ); Tue, 27 May 2014 02:23:08 -0400 In-Reply-To: <53842BE7.5060304@acm.org> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: Bart Van Assche Cc: "linux-scsi@vger.kernel.org" (Resending; mailer rejected it ...) On 05/27/2014 08:08 AM, Bart Van Assche wrote: > On 05/27/14 07:40, Hannes Reinecke wrote: >> On 05/26/2014 05:14 PM, Bart Van Assche wrote: >>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c >>> index f17aa7a..5232583 100644 >>> --- a/drivers/scsi/scsi_error.c >>> +++ b/drivers/scsi/scsi_error.c >>> @@ -193,7 +193,7 @@ scsi_abort_command(struct scsi_cmnd *scmd) >>> SCSI_LOG_ERROR_RECOVERY(3, >>> scmd_printk(KERN_INFO, scmd, >>> "scmd %p previous abort failed\n", scmd)); >>> - cancel_delayed_work(&scmd->abort_work); >>> + WARN_ON_ONCE(delayed_work_pending(&scmd->abort_work)); >>> return FAILED; >>> } >>> >>> >> The first bit is okay, the second isn't. >> >> The second bit is for these cases where the abort got scheduled (in >> scsi_abort_command()), but the workqueue didn't get executed by the = time >> the next timeout occured. >> I know, highly unlikely, but there is no safeguarding that it _canno= t_ >> happen. >> So the second cancel_delayed_work() has to stay. > > But how could that next timeout occur while abort_work is still pendi= ng > ? The block layer removes a request from the timeout list before > invoking the timeout handler (see also blk_rq_check_expired()). This > means that no block layer timers are active after abort_work has been > scheduled and before scmd_eh_abort_handler() is called. This also mea= ns > that a second timeout can only occur after a SCSI command has been > reinserted to a SCSI device queue. And such a reinsertion can only oc= cur > after scmd_eh_abort_handler() has started. The pending bit is cleared > from a work struct before the associated handler is invoked. This is = why > I think the above cancel_delayed_work() statement is not necessary. O= r > did I perhaps overlook something ? > Hmm. Okay, convinced. Acked-by: Hannes Reinecke Cheers, Hannes --=20 Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N=FCrnberg GF: J. Hawn, J. Guild, F. Imend=F6rffer, HRB 16746 (AG N=FCrnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" i= n the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html