From: Tomas Henzl <thenzl@redhat.com>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: Bart Van Assche <bvanassche@acm.org>,
"'linux-scsi@vger.kernel.org'" <linux-scsi@vger.kernel.org>,
Stanislaw Gruszka <sgruszka@redhat.com>
Subject: Re: [RFC] How to fix an async scan - rmmod race?
Date: Fri, 06 Apr 2012 11:54:02 +0200
Message-ID: <4F7EBD3A.8070509@redhat.com>
In-Reply-To: <4F7E0EBF.80407@cs.wisc.edu>
On 04/05/2012 11:29 PM, Mike Christie wrote:
> On 04/05/2012 01:00 PM, Bart Van Assche wrote:
>> On 04/05/12 13:58, Tomas Henzl wrote:
>>
>>> When a rmmod is tried then in some cases the kernel is not able to handle a paging request:
>>> [ 727.154296] BUG: unable to handle kernel paging request at ffffffffa01874b8
>>> From what I observed it happens when we call the rmmod only a while after a modprobe
>>> (in this case it is the mpt2sas driver). More accurately said, it happens when rmmod is called
>>> while scsi async is still at work. The driver is removed but the scsi_host_template is still filled
>>> with now invalid pointers, in this case it is most likely the hostt->scan_finished which causes the BUG.
>>
>> Are you sure the above analysis is correct ? I've triggered several
>> million device removal events with ib_srp but I haven't ever seen the
>> above crash.
> ib_srp uses scsi_scan_target when the target is added so you are going
> down a different code path. If you are rmmoding ib_srp while the driver
> is calling scsi_scan_target() that would be a similar problem.
If the driver doesn't define 'scsi_host_template.scan_finished', the problem
is less visible. That's because do_scsi_scan_host then takes another path, where
the scsi_host_scan_allowed test is used to skip further scanning, which I think reduces
the risk significantly. ib_srp doesn't define scan_finished, so it is not safe either, but it is
less likely to hit this.
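To make the two paths concrete, here is a minimal userspace sketch in C. It is a model, not kernel code: every model_* name is hypothetical, the structures are simplified stand-ins, and the per-target check only approximates what scsi_host_scan_allowed does in the generic path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's real structures. */
struct model_host;

struct model_host_template {
	/* Optional driver callback; in the kernel it points into module text. */
	bool (*scan_finished)(struct model_host *host, unsigned long elapsed);
};

struct model_host {
	const struct model_host_template *hostt;
	bool scan_allowed;    /* cleared once host removal has started */
	unsigned int polls;   /* number of scan_finished calls */
	unsigned int scanned; /* targets scanned by the generic path */
};

/* Generic path: re-checks the host state before every target. */
void model_scan_selected(struct model_host *host, unsigned int targets)
{
	for (unsigned int id = 0; id < targets; id++) {
		if (!host->scan_allowed)
			return; /* removal started, skip the rest */
		host->scanned++;
	}
}

void model_do_scan_host(struct model_host *host, unsigned int targets)
{
	assert(host != NULL && host->hostt != NULL);

	if (host->hostt->scan_finished) {
		unsigned long elapsed = 0;
		bool done;

		/*
		 * Driver path: every iteration calls through a pointer owned
		 * by the driver. If the module were unloaded meanwhile, this
		 * call is where the invalid pointer would be used. The real
		 * loop sleeps between polls; that is omitted here.
		 */
		do {
			host->polls++;
			done = host->hostt->scan_finished(host, elapsed++);
		} while (!done);
	} else {
		model_scan_selected(host, targets);
	}
}

/* Example callback: reports completion once elapsed reaches 3. */
bool model_finished_after_three(struct model_host *host, unsigned long elapsed)
{
	(void)host;
	return elapsed >= 3;
}
```

In this model only the first branch keeps calling into driver code for the whole scan, which is the reason the window is wider for drivers that define scan_finished.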
>
> Tomas's bug occurs when drivers use scsi_scan_host, use the async scsi
> device scanning, and then rmmod the LLD while the scan is still in progress.
>
> I think a general problem that we might hit similar to Tomas's issue is
> when scanning from userspace then rmmoding the driver. Maybe that means
> we need a more generic fix? Or, maybe that could be handled by having
> scsi_scan() do a try_module_get before scanning.
I like this idea (try_module_get); it is easy and elegant, and it is already used in
scsi_rescan_device. But a scan can take a long time, and during that time the driver
couldn't be removed. With a flag set in scsi_remove_host, on the other hand, the scan
can be cancelled when the user rmmods the driver.
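The trade-off between the two approaches can be sketched in userspace C. This is only a model under stated assumptions: the model_* names are hypothetical, the reference count imitates what try_module_get and module_put do for a real module, and the cancel flag stands for the flag that scsi_remove_host would set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a module's reference counting state. */
struct model_module {
	unsigned int refcount; /* users that pinned the module */
	bool going;            /* unload already in progress */
};

/* Modelled on try_module_get(): fails once unloading has begun. */
bool model_try_get(struct model_module *mod)
{
	if (mod->going)
		return false;
	mod->refcount++;
	return true;
}

/* Modelled on module_put(): drops one reference. */
void model_put(struct model_module *mod)
{
	assert(mod->refcount > 0);
	mod->refcount--;
}

/* Modelled on a non-waiting rmmod: refused while the module is pinned. */
bool model_unload(struct model_module *mod)
{
	if (mod->refcount > 0)
		return false;
	mod->going = true;
	return true;
}

/* Hypothetical scan state with the cancel flag discussed above. */
struct model_scan {
	bool cancel;          /* would be set by host removal */
	unsigned int scanned; /* targets scanned so far */
};

/*
 * Scan that checks the flag before each target. remove_before simulates
 * the removal request arriving just before that target index.
 */
unsigned int model_scan_cancellable(struct model_scan *scan,
				    unsigned int targets,
				    unsigned int remove_before)
{
	for (unsigned int id = 0; id < targets; id++) {
		if (id == remove_before)
			scan->cancel = true;
		if (scan->cancel)
			break;
		scan->scanned++;
	}
	return scan->scanned;
}
```

With the reference held for the whole scan, model_unload keeps failing until model_put is called; with the flag, the scan stops at the next target and the unload does not have to wait for the full scan.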