From: Mike Christie <michaelc@cs.wisc.edu>
To: Bart Van Assche <bvanassche@acm.org>
Cc: linux-scsi <linux-scsi@vger.kernel.org>,
James Bottomley <jbottomley@parallels.com>,
Jun'ichi Nomura <j-nomura@ce.jp.nec.com>,
Stefan Richter <stefanr@s5r6.in-berlin.de>,
Tomas Henzl <thenzl@redhat.com>,
Mike Snitzer <snitzer@redhat.com>
Subject: Re: [PATCH 2/3] Stop accepting SCSI requests before removing a device
Date: Wed, 30 May 2012 12:27:36 -0500
Message-ID: <4FC65888.3000907@cs.wisc.edu>
In-Reply-To: <4FC5C488.4010307@acm.org>
On 05/30/2012 01:56 AM, Bart Van Assche wrote:
> On 05/29/12 17:35, Mike Christie wrote:
>
>> On 05/29/2012 10:00 AM, Bart Van Assche wrote:
>>> The patch below makes sure that blk_drain_queue() and blk_cleanup_queue()
>>> wait until all queuecommand invocations have finished and hence fixes a
>>> race between the SCSI error handler and __scsi_remove_device(). Any feedback
>>> is welcome.
>>>
>>> ---
>>> drivers/scsi/scsi_error.c | 14 +++++++++++++-
>>> 1 files changed, 13 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>>> index 386f0c5..947f627 100644
>>> --- a/drivers/scsi/scsi_error.c
>>> +++ b/drivers/scsi/scsi_error.c
>>> @@ -781,10 +781,17 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
>>> struct scsi_device *sdev = scmd->device;
>>> struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
>>> struct Scsi_Host *shost = sdev->host;
>>> + struct request_queue *q = sdev->request_queue;
>>> DECLARE_COMPLETION_ONSTACK(done);
>>> unsigned long timeleft;
>>> struct scsi_eh_save ses;
>>> - int rtn;
>>> + int rtn = FAILED;
>>> +
>>> + spin_lock_irq(q->queue_lock);
>>> + if (blk_queue_dead(q))
>>> + goto out_unlock;
>>> + q->rq.count[BLK_RW_SYNC]++;
>>> + spin_unlock_irq(q->queue_lock);
>>
>> Are you hitting a case where a scsi_cmnd does not have a request struct
>> that was allocated through the block layer functions like
>> blk_get_request, but is getting sent through this path? What code is
>> doing this?
>>
>> Or, are you hitting a bug where somehow the request is freed (so the
>> rq.count is decremented) but the scsi eh is still working on a scsi_cmnd
>> that had a request struct allocated for it?
>
>
> I haven't hit any such bugs. This patch is what I came up with after
> analyzing what would be necessary to make sure that queuecommand isn't
> called anymore after blk_cleanup_queue() finished and also to make sure
> that blk_drain_queue() waits until all active queuecommand calls have
> finished.

It should already be waiting if the scsi_cmnd has a request backing it,
shouldn't it? We allocate a request struct with blk_get_request or one of
the other blk helpers for each scsi_cmnd, and that increments
q->rq.count. If we then go down the error path because a cmd timed out or
because scsi_decide_disposition returned FAILED, that request still backs
the scsi_cmnd and the count is still incremented for it. When we call
scsi_send_eh_cmnd for eh operations, the request is still there and has
not been freed yet. The request gets freed later, when
scsi_eh_flush_done_q is called. In there we either retry or call
scsi_finish_command, which goes through the normal completion process
and eventually calls __blk_put_request and freed_request.
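
To make the accounting argument above concrete, here is a small userspace
sketch of the lifecycle. It is not kernel code: struct toy_queue,
struct toy_request and every toy_* function are hypothetical stand-ins
invented for illustration, and the only thing modeled is the counter that
plays the role of q->rq.count. The point it tries to show is that the
count stays elevated from allocation, through the eh command, until the
done queue is flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-queue request accounting (q->rq.count in
 * the thread). This is NOT the kernel's struct request_queue. */
struct toy_queue {
	int rq_count;		/* requests allocated and not yet freed */
};

/* Toy stand-in for the request that backs a scsi_cmnd. */
struct toy_request {
	struct toy_queue *q;	/* NULL once the request has been freed */
};

/* Plays the role of blk_get_request(): allocation bumps the count. */
static void toy_get_request(struct toy_queue *q, struct toy_request *rq)
{
	q->rq_count++;
	rq->q = q;
}

/* Plays the role of __blk_put_request()/freed_request(): freeing the
 * request is the only place where the count drops. */
static void toy_put_request(struct toy_request *rq)
{
	assert(rq->q != NULL && rq->q->rq_count > 0);
	rq->q->rq_count--;
	rq->q = NULL;
}

/* Plays the role of scsi_send_eh_cmnd(): the eh uses the command, and
 * the backing request is neither freed nor re-counted here. */
static void toy_send_eh_cmnd(const struct toy_request *rq)
{
	assert(rq->q != NULL);	/* request must still be alive */
}

/* Plays the role of scsi_eh_flush_done_q() taking the
 * scsi_finish_command() branch: normal completion frees the request. */
static void toy_eh_flush_done_q(struct toy_request *rq)
{
	toy_put_request(rq);
}

/* Simplified drain condition: draining can only finish once no
 * request is outstanding. */
static bool toy_drain_would_finish(const struct toy_queue *q)
{
	return q->rq_count == 0;
}
```

Calling toy_get_request(), then toy_send_eh_cmnd(), then
toy_eh_flush_done_q() on the same request walks the path described above:
toy_drain_would_finish() stays false until the flush step, which is why an
extra increment inside the eh path looks redundant in this simplified
model. The retry branch is left out, since there the request is not freed
at all.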
Thread overview: 21+ messages
2012-05-04 15:00 [PATCH 0/3 v6] Fixes for SCSI device removal Bart Van Assche
2012-05-04 15:03 ` [PATCH 1/3] sd: Fix device removal NULL pointer dereference Bart Van Assche
2012-05-04 15:06 ` [PATCH 2/3] Stop accepting SCSI requests before removing a device Bart Van Assche
2012-05-04 20:16 ` Mike Christie
2012-05-04 20:30 ` Mike Christie
2012-05-05 13:04 ` Bart Van Assche
2012-05-29 15:00 ` Bart Van Assche
2012-05-29 17:35 ` Mike Christie
2012-05-30 6:56 ` Bart Van Assche
2012-05-30 17:27 ` Mike Christie [this message]
2012-05-30 20:00 ` Bart Van Assche
2012-06-01 3:13 ` Mike Christie
2012-05-04 15:07 ` [PATCH 3/3] Make scsi_free_queue() abort pending requests Bart Van Assche
2012-05-04 20:25 ` Mike Christie
2012-05-04 20:32 ` Mike Christie
2012-05-05 6:07 ` Bart Van Assche
2012-05-07 0:44 ` Mike Christie
2012-05-07 1:15 ` Mike Christie
2012-05-14 18:43 ` Bart Van Assche
2012-05-29 14:56 ` Bart Van Assche
2012-05-05 13:41 ` Bart Van Assche