From: Mike Christie <michaelc@cs.wisc.edu>
To: Bart Van Assche <bvanassche@acm.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
Tomas Henzl <thenzl@redhat.com>,
"'linux-scsi@vger.kernel.org'" <linux-scsi@vger.kernel.org>,
Stanislaw Gruszka <sgruszka@redhat.com>
Subject: Re: [RFC] How to fix an async scan - rmmod race?
Date: Wed, 11 Apr 2012 17:28:02 -0500 [thread overview]
Message-ID: <4F860572.6090404@cs.wisc.edu> (raw)
In-Reply-To: <4F85DFEB.5080000@acm.org>
On 04/11/2012 02:47 PM, Bart Van Assche wrote:
> On 04/11/12 18:17, Mike Christie wrote:
>
>> I would like scsi core to handle this, but I think right now drivers are
>> doing work arounds like your patch.
>
>
> I've been wondering whether the following approach would make sense
> (haven't tested this yet):
> - Let scsi_remove_host() stop the SCSI error handler thread instead of
> keeping the error handler thread around until scsi_host_dev_release()
> gets invoked.
> - After blk_cleanup_queue() finished, kill all outstanding SCSI
> requests from inside scsi_remove_device() (those requests that have
> already been passed to the LLD via queuecommand) instead of waiting
> until the SCSI error handler detects a timeout.
>
> An advantage of that approach would be that independent of the context
> from which an I/O request is submitted (scanning / user space / ...)
> that no new requests would be passed to the SCSI LLD after
> scsi_remove_host() has finished. So this approach could be an
> alternative for Tomas' patch at the start of this thread. However, a
I do not think it solve's Tomas's issue, because if a async scan is
running it could have already completed the scan related IO and is just
about to call the slave_configure callout when the user does a rmmod.
The scsi_remove_host from the rmmod will not see any IOs in flight and
go right along removing the host and rmmod will complete and remove the
module. The scsi_scan.c code could then try to access the
slave_configure callout which is now invalid.
> disadvantage is that this approach will only work fine if the LLD stops
> I/O completion notifications before invoking scsi_remove_host(). Several
I don't think you would want to do that, because you have IO from the
sd_shutdown path that you do want to execute. After the remove/shutdown
callouts have been run then you do not want new IO to be sent to the LLD.
So scsi_remove_host sets the host state to cancel initially. It then
calls scsi_forget_host which will loop over devices and remove them.
That could cause IO to be sent by functions like sd_shutdown.
After the ULD code is run __scsi_remove_device will set the state to
SDEV_DEL and scsi_remvoe_host will then set the state to SHOST_DEL. So
that would prevent new IO from getting queued.
But then is there a race that you were hitting? Were you hitting
something like IO was in the reuqest_queue, the sd_shutdown IO got
queued before those IOs, but then the sd shutdown IO completed and the
other queued IO got sent to the LLD before the sdev state could get set
to SDEV_DEL. So then IO could get sent to the LLD queuecommand when we
did not want it to.
next prev parent reply other threads:[~2012-04-11 22:28 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-05 13:58 [RFC] How to fix an async scan - rmmod race? Tomas Henzl
2012-04-05 15:57 ` Mike Christie
2012-04-05 16:05 ` Mike Christie
2012-04-05 18:00 ` Bart Van Assche
2012-04-05 21:29 ` Mike Christie
2012-04-06 9:24 ` Bart Van Assche
2012-04-06 17:22 ` Mike Christie
2012-04-06 18:37 ` Bart Van Assche
2012-04-11 21:46 ` Mike Christie
2012-04-06 9:54 ` Tomas Henzl
2012-04-06 15:20 ` James Bottomley
2012-04-06 16:15 ` Bart Van Assche
2012-04-06 16:35 ` James Bottomley
2012-04-06 17:01 ` Bart Van Assche
2012-04-06 17:15 ` James Bottomley
2012-04-06 17:59 ` Bart Van Assche
2012-04-08 17:38 ` Bart Van Assche
2012-04-11 18:17 ` Mike Christie
2012-04-11 18:30 ` Mike Christie
2012-04-11 19:47 ` Bart Van Assche
2012-04-11 22:28 ` Mike Christie [this message]
2012-04-12 10:48 ` Bart Van Assche
2012-04-06 9:39 ` Bart Van Assche
2012-04-06 10:14 ` Tomas Henzl
2012-04-06 13:13 ` Tomas Henzl
2012-04-06 14:38 ` Bart Van Assche
2012-04-06 15:32 ` Tomas Henzl
2012-04-12 12:48 ` [RFC] How to fix an async scan - rmmod race? try_module_get Tomas Henzl
2012-04-18 16:48 ` [RFC] How to fix an async scan - 'rmmod --wait' race? Tomas Henzl
2012-04-18 18:18 ` Bart Van Assche
2012-05-17 8:42 ` James Bottomley
2012-05-17 8:55 ` Bart Van Assche
2012-05-17 9:01 ` James Bottomley
2012-05-17 14:51 ` Tomas Henzl
2012-05-22 10:05 ` James Bottomley
2012-05-25 15:13 ` Tomas Henzl
2012-05-25 18:46 ` Dan Williams
2012-05-28 11:58 ` Tomas Henzl
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F860572.6090404@cs.wisc.edu \
--to=michaelc@cs.wisc.edu \
--cc=James.Bottomley@HansenPartnership.com \
--cc=bvanassche@acm.org \
--cc=linux-scsi@vger.kernel.org \
--cc=sgruszka@redhat.com \
--cc=thenzl@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.