linux-scsi.vger.kernel.org archive mirror
From: Tomas Henzl <thenzl@redhat.com>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: Bart Van Assche <bvanassche@acm.org>,
	"'linux-scsi@vger.kernel.org'" <linux-scsi@vger.kernel.org>,
	Stanislaw Gruszka <sgruszka@redhat.com>
Subject: Re: [RFC] How to fix an async scan - rmmod race?
Date: Fri, 06 Apr 2012 11:54:02 +0200
Message-ID: <4F7EBD3A.8070509@redhat.com>
In-Reply-To: <4F7E0EBF.80407@cs.wisc.edu>

On 04/05/2012 11:29 PM, Mike Christie wrote:
> On 04/05/2012 01:00 PM, Bart Van Assche wrote:
>> On 04/05/12 13:58, Tomas Henzl wrote:
>>
>>> When a rmmod is tried then in some cases the kernel is not able to handle a paging request:
>>> [  727.154296] BUG: unable to handle kernel paging request at ffffffffa01874b8
>>> From what I observed, it happens when we call rmmod only a short while after a modprobe
>>> (in this case it is the mpt2sas driver). More accurately said, it happens when rmmod is called
>>> while scsi async is still at work. The driver is removed but the scsi_host_template is still filled
>>> with now invalid pointers, in this case it is most likely the hostt->scan_finished which causes the BUG.
>>
>> Are you sure the above analysis is correct ? I've triggered several
>> million device removal events with ib_srp but I haven't ever seen the
>> above crash.
> ib_srp uses scsi_scan_target when the target is added so you are going
> down a different code path. If you are rmmoding ib_srp while the driver
> is calling scsi_scan_target() that would be a similar problem.
If the driver doesn't define 'scsi_host_template.scan_finished', the problem
is less visible. In that case do_scsi_scan_host takes the other path, where
the scsi_host_scan_allowed test is used to skip further scanning, which I think
reduces the risk significantly. ib_srp doesn't define scan_finished, so it is
not safe either, but it is less likely to hit this.
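
To illustrate the two paths, here is a minimal userspace sketch in C. It is
not kernel code: the model_* types and functions are hypothetical stand-ins
for struct scsi_host_template, struct Scsi_Host and do_scsi_scan_host, and the
scan_allowed field only models what scsi_host_scan_allowed() reports. The
point it tries to show is that the polling path keeps calling through a
driver-owned pointer and never looks at the host state, while the other path
can stop early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_host;

/* Hypothetical stand-in for struct scsi_host_template. */
struct model_template {
	/* Optional callback living in driver module text; NULL if not defined. */
	bool (*scan_finished)(struct model_host *host, unsigned long elapsed);
};

/* Hypothetical stand-in for struct Scsi_Host. */
struct model_host {
	const struct model_template *hostt;
	bool scan_allowed;      /* models the result of scsi_host_scan_allowed() */
	unsigned int max_id;    /* number of targets the midlayer would probe */
	unsigned int callbacks; /* calls made into driver code */
	unsigned int probed;    /* targets probed on the midlayer path */
};

/* Models the shape of do_scsi_scan_host(): poll the driver, or scan directly. */
static void model_do_scan_host(struct model_host *host)
{
	assert(host != NULL && host->hostt != NULL);

	if (host->hostt->scan_finished) {
		unsigned long elapsed = 0;

		/* Each iteration jumps through a pointer owned by the driver. */
		for (;;) {
			host->callbacks++;
			if (host->hostt->scan_finished(host, elapsed))
				break;
			elapsed++;
		}
	} else {
		/* This path re-checks the host state before each target. */
		for (unsigned int id = 0; id < host->max_id; id++) {
			if (!host->scan_allowed)
				break;
			host->probed++;
		}
	}
}

/* Sample driver callback: reports completion once elapsed reaches 3. */
static bool sample_scan_finished(struct model_host *host, unsigned long elapsed)
{
	(void)host;
	return elapsed >= 3;
}
```

In this model, clearing scan_allowed stops the second path immediately, but
has no effect on the first one; if the callback pointer went stale during the
loop, the next iteration would be the faulting call.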

>
> Tomas's bug occurs when drivers use scsi_scan_host, use the async scsi
> device scanning, and then rmmod the LLD while the scan is still in progress.
>
> I think a general problem that we might hit similar to Tomas's issue is
> when scanning from userspace then rmmoding the driver. Maybe that means
> we need a more generic fix? Or, maybe that could be handled by having
> scsi_scan() do a try_module_get before scanning.
I like this idea (try_module_get): it is easy and elegant, and it is already
used in scsi_rescan_device. But a scan can take a long time, and during that
time the driver could not be removed. If scsi_remove_host set a flag, the scan
could instead be cancelled when the user rmmods the driver.
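
A rough userspace sketch in C of how the two ideas could combine. Everything
here is hypothetical: the model_* names are not kernel APIs, model_try_get
only imitates the behaviour of try_module_get, and the cancel flag is the
proposed flag, which does not exist yet. The scan pins the module while it
runs, and host removal sets the flag so that the scan stops at the next
target instead of holding the reference for the full scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a module reference count; not struct module. */
struct model_module {
	unsigned int refcount;
	bool going; /* set once unload has started */
};

/* Imitates try_module_get(): no new references once unload has begun. */
static bool model_try_get(struct model_module *mod)
{
	if (mod->going)
		return false;
	mod->refcount++;
	return true;
}

static void model_put(struct model_module *mod)
{
	assert(mod->refcount > 0);
	mod->refcount--;
}

/* Imitates a non-forced rmmod: refused while references are held. */
static bool model_try_unload(struct model_module *mod)
{
	if (mod->refcount > 0)
		return false;
	mod->going = true;
	return true;
}

struct model_scan {
	struct model_module *owner;
	bool cancel;         /* hypothetical flag set by host removal */
	unsigned int max_id; /* targets to probe */
	unsigned int probed; /* targets probed so far */
};

/* Pin the driver module for the duration of the scan. */
static bool model_scan_begin(struct model_scan *scan)
{
	return model_try_get(scan->owner);
}

/* Probe one target; returns false when finished or cancelled. */
static bool model_scan_step(struct model_scan *scan)
{
	if (scan->cancel || scan->probed >= scan->max_id)
		return false;
	scan->probed++;
	return true;
}

/* Drop the reference taken in model_scan_begin(). */
static void model_scan_end(struct model_scan *scan)
{
	model_put(scan->owner);
}

/* What host removal would do under this proposal: request cancellation. */
static void model_remove_host(struct model_scan *scan)
{
	scan->cancel = true;
}
```

With only the reference count, unload is refused until all max_id targets
have been probed; with the flag as well, the wait is bounded by the time
needed to finish the target currently being probed.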
