All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bart Van Assche <bvanassche@acm.org>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	Tomas Henzl <thenzl@redhat.com>,
	"'linux-scsi@vger.kernel.org'" <linux-scsi@vger.kernel.org>,
	Stanislaw Gruszka <sgruszka@redhat.com>
Subject: Re: [RFC] How to fix an async scan - rmmod race?
Date: Thu, 12 Apr 2012 10:48:20 +0000	[thread overview]
Message-ID: <4F86B2F4.3020101@acm.org> (raw)
In-Reply-To: <4F860572.6090404@cs.wisc.edu>

On 04/11/12 22:28, Mike Christie wrote:

> On 04/11/2012 02:47 PM, Bart Van Assche wrote:
>> disadvantage is that this approach will only work fine if the LLD stops
>> I/O completion notifications before invoking scsi_remove_host(). Several
> 
> I don't think you would want to do that, because you have IO from the
> sd_shutdown path that you do want to execute. After the remove/shutdown
> callouts have been run then you do not want new IO to be sent to the LLD.
> 
> So scsi_remove_host sets the host state to cancel initially. It then
> calls scsi_forget_host which will loop over devices and remove them.
> That could cause IO to be sent by functions like sd_shutdown.


So that means that with an operational transport layer it's wrong for a
SCSI LLD to stop processing SCSI commands before scsi_remove_host()
finished ? It looks like several SCSI LLD authors are not aware of this.
I have found several examples of high-profile SCSI LLD drivers in the
kernel tree that cause newly submitted SCSI commands to fail during
kernel module removal even before scsi_remove_host() gets invoked.

> After the ULD code is run __scsi_remove_device will set the state to
> SDEV_DEL and scsi_remove_host will then set the state to SHOST_DEL. So
> that would prevent new IO from getting queued.
> 
> But then is there a race that you were hitting?


scsi_remove_host() can get invoked after the SCSI core has submitted a
request to the LLD via queuecommand() but before the LLD has received
the I/O completion notification that will be generated once that request
finishes. I see three alternatives to handle this:
- The LLD stops I/O completion notifications before invoking
  scsi_remove_host() (which is not correct because it prevents
  sd_shutdown() to send SCSI commands to the device).
- The SCSI core keeps the LLD around long enough until it is sure that
  no new I/O notifications will arrive.
- The SCSI LLD stops I/O completion notifications after having invoked
  scsi_remove_host() and kills all pending SCSI commands before
  continuing with LLD-specific host removal tasks. As far as I can see
  the SCSI core doesn't provide a function yet that would allow an
  LLD to kill all pending requests. Maybe blk_abort_queue() could be
  helpful here.

Bart.

  reply	other threads:[~2012-04-12 10:48 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-05 13:58 [RFC] How to fix an async scan - rmmod race? Tomas Henzl
2012-04-05 15:57 ` Mike Christie
2012-04-05 16:05   ` Mike Christie
2012-04-05 18:00 ` Bart Van Assche
2012-04-05 21:29   ` Mike Christie
2012-04-06  9:24     ` Bart Van Assche
2012-04-06 17:22       ` Mike Christie
2012-04-06 18:37         ` Bart Van Assche
2012-04-11 21:46         ` Mike Christie
2012-04-06  9:54     ` Tomas Henzl
2012-04-06 15:20       ` James Bottomley
2012-04-06 16:15         ` Bart Van Assche
2012-04-06 16:35           ` James Bottomley
2012-04-06 17:01             ` Bart Van Assche
2012-04-06 17:15               ` James Bottomley
2012-04-06 17:59                 ` Bart Van Assche
2012-04-08 17:38                 ` Bart Van Assche
2012-04-11 18:17                   ` Mike Christie
2012-04-11 18:30                     ` Mike Christie
2012-04-11 19:47                     ` Bart Van Assche
2012-04-11 22:28                       ` Mike Christie
2012-04-12 10:48                         ` Bart Van Assche [this message]
2012-04-06  9:39 ` Bart Van Assche
2012-04-06 10:14   ` Tomas Henzl
2012-04-06 13:13     ` Tomas Henzl
2012-04-06 14:38       ` Bart Van Assche
2012-04-06 15:32         ` Tomas Henzl
2012-04-12 12:48 ` [RFC] How to fix an async scan - rmmod race? try_module_get Tomas Henzl
2012-04-18 16:48   ` [RFC] How to fix an async scan - 'rmmod --wait' race? Tomas Henzl
2012-04-18 18:18     ` Bart Van Assche
2012-05-17  8:42     ` James Bottomley
2012-05-17  8:55       ` Bart Van Assche
2012-05-17  9:01         ` James Bottomley
2012-05-17 14:51           ` Tomas Henzl
2012-05-22 10:05             ` James Bottomley
2012-05-25 15:13               ` Tomas Henzl
2012-05-25 18:46                 ` Dan Williams
2012-05-28 11:58                   ` Tomas Henzl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F86B2F4.3020101@acm.org \
    --to=bvanassche@acm.org \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=michaelc@cs.wisc.edu \
    --cc=sgruszka@redhat.com \
    --cc=thenzl@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.