From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Christie
Subject: Re: [PATCH 4/4] scsi: Stop accepting SCSI requests before removing a device
Date: Wed, 06 Jun 2012 10:21:02 -0500
Message-ID: <4FCF755E.6090301@cs.wisc.edu>
References: <4FCE3D20.4000205@acm.org> <4FCE3E63.7000002@acm.org>
	<4FCE7BC9.1010504@cs.wisc.edu> <4FCF4A52.20406@acm.org>
	<4FCF5B38.7020707@cs.wisc.edu> <4FCF6EE8.7010907@acm.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
Received: from sabe.cs.wisc.edu ([128.105.6.20]:50457 "EHLO sabe.cs.wisc.edu"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754061Ab2FFPVj (ORCPT ); Wed, 6 Jun 2012 11:21:39 -0400
In-Reply-To: <4FCF6EE8.7010907@acm.org>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Bart Van Assche
Cc: linux-scsi, James Bottomley, Jun'ichi Nomura, Stefan Richter, Jens Axboe, Joe Lawrence

On 06/06/2012 09:53 AM, Bart Van Assche wrote:
> On 06/06/12 13:29, Mike Christie wrote:
>
>> On 06/06/2012 07:17 AM, Bart Van Assche wrote:
>>> On 06/05/12 21:36, Mike Christie wrote:
>>>
>>>> On 06/05/2012 12:14 PM, Bart Van Assche wrote:
>>>>> Avoid that the code for requeueing SCSI requests triggers a
>>>>> crash by making sure that that code isn't scheduled anymore
>>>>> after a device has been removed.
>>>>>
>>>>> Also, source code inspection of __scsi_remove_device() revealed
>>>>> a race condition in this function: no new SCSI requests must be
>>>>> accepted for a SCSI device after device removal started.
>>>>>
>>>>> Signed-off-by: Bart Van Assche
>>>>> Cc: Mike Christie
>>>>> Cc: James Bottomley
>>>>> Cc: Jens Axboe
>>>>> Cc: Joe Lawrence
>>>>> Cc: Jun'ichi Nomura
>>>>> Cc:
>>>>> ---
>>>>>  drivers/scsi/scsi_lib.c   |    7 ++++---
>>>>>  drivers/scsi/scsi_sysfs.c |   11 +++++++++--
>>>>>  2 files changed, 13 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>>>> index 082c1e5..b722a8b 100644
>>>>> --- a/drivers/scsi/scsi_lib.c
>>>>> +++ b/drivers/scsi/scsi_lib.c
>>>>> @@ -158,10 +158,11 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int reason, int unbusy)
>>>>>  	 * that are already in the queue.
>>>>>  	 */
>>>>>  	spin_lock_irqsave(q->queue_lock, flags);
>>>>> -	blk_requeue_request(q, cmd->request);
>>>>> +	if (!blk_queue_dead(q)) {
>>>>> +		blk_requeue_request(q, cmd->request);
>>>>> +		kblockd_schedule_work(q, &device->requeue_work);
>>>>> +	}
>>>>
>>>> If we do not requeue what eventually frees the request?
>>>
>>> As far as I can see any request passed to __scsi_queue_insert() has
>>> already been started. So if it isn't requeued it's timer remains active
>>> and hence will fire eventually.
>>
>> This is true in the scsi_dispatch_cmd path, but not others.
>>
>> If we were requeueing from the error handler scsi_eh_flush_done_q then
>> the timer would have been stopped because that is how we got into the eh.
>
>
> Given the above I propose to add a __blk_end_request_all(req, -ENXIO)
> call if the queue is dead.
>

Seems ok to me. Maybe a little messy. Don't forget to free the scsi
stuff like the scatterlist, scsi_cmnd, etc. If it is REQ_TYPE_BLOCK_PC
set req->errors.
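
The dead-queue handling discussed above could look roughly like the
following. This is an untested userspace model, not kernel code: every
model_* type and function is a made-up stand-in so the control flow can be
compiled and run on its own, and MODEL_ERRORS_PLACEHOLDER is a placeholder
because the thread does not say which value req->errors should get.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects named in the thread. */
enum model_req_type { MODEL_REQ_FS, MODEL_REQ_BLOCK_PC };

/* Placeholder only: the thread does not specify the value to store. */
#define MODEL_ERRORS_PLACEHOLDER 1

struct model_request {
	enum model_req_type type;
	int errors;      /* models req->errors */
	bool requeued;   /* models blk_requeue_request() having run */
	bool ended;      /* models __blk_end_request_all() having run */
	int end_error;   /* error code the request was ended with */
};

struct model_queue {
	bool dead;            /* models blk_queue_dead(q) */
	int requeue_work_runs; /* models kblockd_schedule_work() calls */
};

struct model_cmnd {
	struct model_request *request;
	bool has_scatterlist; /* models the command's scatterlist */
	bool allocated;       /* models the scsi_cmnd itself */
};

/* Models "free the scsi stuff like the scatterlist, scsi_cmnd, etc." */
static void model_free_cmnd(struct model_cmnd *cmd)
{
	cmd->has_scatterlist = false;
	cmd->allocated = false;
}

/* Requeue on a live queue; on a dead queue, clean up and end with -ENXIO. */
static void model_queue_insert(struct model_queue *q, struct model_cmnd *cmd)
{
	struct model_request *req;

	assert(q != NULL && cmd != NULL && cmd->request != NULL);
	req = cmd->request;

	if (!q->dead) {
		req->requeued = true;
		q->requeue_work_runs++;
		return;
	}

	/* Dead queue: nothing will ever run the request again. */
	if (req->type == MODEL_REQ_BLOCK_PC)
		req->errors = MODEL_ERRORS_PLACEHOLDER;
	model_free_cmnd(cmd);
	req->ended = true;
	req->end_error = -ENXIO;
}
```

The point of the model is only that the dead-queue branch has to do three
things instead of silently dropping the request: report an error for
BLOCK_PC requests, release the SCSI-side resources, and end the request so
something actually frees it. Locking and the real ordering of those steps
are not modelled.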