linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [Bugme-new] [Bug 9674] New: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core
       [not found] ` <20080101181026.38298e13.akpm@linux-foundation.org>
@ 2008-01-02  3:24   ` James Bottomley
  2008-01-02 12:21     ` Rafael J. Wysocki
  2008-01-02 15:49     ` Pete Wyckoff
  0 siblings, 2 replies; 6+ messages in thread
From: James Bottomley @ 2008-01-02  3:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rafael J. Wysocki, Pierre Ossman, Jens Axboe, bugme-daemon,
	gentuu, linux-scsi


On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > 
> >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc,
> >                     mmc_core
> 
> Guys, this is a very recent regression.  Could you please take a look, see
> if it's due to mmc, block or scsi changes?

There's not a lot of information to go on.  The stack trace looks bogus,
so I guess the kernel is compiled without a frame pointer.  However, it
does look like the initial insertion of sr_mod is going through and it
generates a command which gets into scsi_request_fn and then indirects
through a bogus queueucommand pointer.

What's the actual underlying device the cdrom is attached to?

There's no real changes to SCSI in this area from 2.6.24-rc4 ...
however, the reinsertion is suggestive, it's like the removal is
retriggering a module request for some reason.

James



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bugme-new] [Bug 9674] New: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core
  2008-01-02  3:24   ` [Bugme-new] [Bug 9674] New: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core James Bottomley
@ 2008-01-02 12:21     ` Rafael J. Wysocki
  2008-01-02 15:41       ` James Bottomley
  2008-01-02 15:49     ` Pete Wyckoff
  1 sibling, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2008-01-02 12:21 UTC (permalink / raw)
  To: James Bottomley
  Cc: Andrew Morton, Pierre Ossman, Jens Axboe, gentuu, linux-scsi,
	bugme-daemon

On Wednesday, 2 of January 2008, James Bottomley wrote:
> 
> On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> > On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > > 
> > >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc,
> > >                     mmc_core
> > 
> > Guys, this is a very recent regression.  Could you please take a look, see
> > if it's due to mmc, block or scsi changes?
> 
> There's not a lot of information to go on.  The stack trace looks bogus,
> so I guess the kernel is compiled without a frame pointer.

The bug report has been updated with a stack trace from a kernel compiled
with a frame pointer.  Please have a look.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bugme-new] [Bug 9674] New: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core
  2008-01-02 12:21     ` Rafael J. Wysocki
@ 2008-01-02 15:41       ` James Bottomley
  0 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2008-01-02 15:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, Pierre Ossman, Jens Axboe, gentuu, linux-scsi,
	bugme-daemon


On Wed, 2008-01-02 at 13:21 +0100, Rafael J. Wysocki wrote:
> On Wednesday, 2 of January 2008, James Bottomley wrote:
> > 
> > On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> > > On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> > > 
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > > > 
> > > >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc,
> > > >                     mmc_core
> > > 
> > > Guys, this is a very recent regression.  Could you please take a look, see
> > > if it's due to mmc, block or scsi changes?
> > 
> > There's not a lot of information to go on.  The stack trace looks bogus,
> > so I guess the kernel is compiled without a frame pointer.
> 
> The bug report has been updated with a stack trace from a kernel compiled
> with a frame pointer.  Please have a look.

Please, please don't do this.  Filing something in bugzilla is
tantamount to putting it in the file and forget folder.  The reason I
cc'd the SCSI mailing list and asked for more details is so that we get
the email flow that might trigger direct interaction between the
reporter and someone on the list who recognised the symptoms.  

Let me say again, catagorically, that if you want to give a bug the best
chance of being fixed, the correct flow of information is:

file a bugzilla and note the bugid.
Then email a complete report to the relevant list, but add [BUG <bugid>]
to the subject line and cc bugme-daemon@bugzilla.kernel.org  If you do
this, bugzilla will keep track of the entire discussion as it progresses
and allow those who track bugs through bugzilla to get a pretty accurate
idea of the status.  You should never need to touch bugzilla again once
the initial bug report is filed: all future information flow is via the
mailing lists.

Also, using urls unless for historical purposes is also a killer.  Many
people travel, and their MO is to download the email and read it on the
plane/train/whatever.  If you embed a url containing critical
information, the email gets marked as read, but since I can't get to the
information, nothing happens.  Then it gets forgotten.

This is the relevant piece of information that should have been on the
mailing list:


[  101.359083] Unable to handle kernel paging request at ffffffff88021cc0 RIP:
[  101.359092]  [<ffffffff88021cc0>]
[  101.359099] PGD 203067 PUD 207063 PMD 3d34a067 PTE 0
[  101.359108] Oops: 0010 [1] PREEMPT SMP
[  101.359115] CPU 0
[  101.359118] Modules linked in: sr_mod tcp_westwood ipt_REJECT xt_state
iptable_filter ipt_owner ipt_MASQUERADE xt_tcpudp xt_multiport iptable_nat
nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables iwl3945 ricoh_mmc
cdrom
[  101.359150] Pid: 4496, comm: modprobe Not tainted 2.6.24-rc6-git7 #5
[  101.359154] RIP: 0010:[<ffffffff88021cc0>]  [<ffffffff88021cc0>]
[  101.359159] RSP: 0018:ffff81002b457970  EFLAGS: 00010086
[  101.359163] RAX: ffffffff88021cc0 RBX: ffff81003f1627e0 RCX:
ffff81003f023b38
[  101.359167] RDX: 0000000000000000 RSI: ffff810030efd000 RDI:
ffff81003f1627e0
[  101.359171] RBP: ffff81002b4579b8 R08: 0000000000000001 R09:
0000000000000001
[  101.359175] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff810030efd000
[  101.359179] R13: ffff81002b457988 R14: ffffffff00000010 R15:
0000000100000010
[  101.359185] FS:  00002adcf7ea0b00(0000) GS:ffffffff80733000(0000)
knlGS:0000000000000000
[  101.359189] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  101.359193] CR2: ffffffff88021cc0 CR3: 000000002b497000 CR4:
00000000000006e0
[  101.359197] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  101.359201] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[  101.359206] Process modprobe (pid: 4496, threadinfo ffff81002b456000, task
ffff81003de1ef50)
[  101.359210] Stack:  ffffffff80333a98 0000000000000086 ffff81003f023af8
ffff810030efd000
[  101.359221]  ffff81003f1627e0 ffff81003f023800 ffff81003e31c000
ffff81003f1627e0
[  101.359230]  0000000000000000 ffff81002b457a08 ffffffff803fe7e1
ffff81003f023b38
[  101.359237] Call Trace:
[  101.359248]  [<ffffffff80333a98>] elv_next_request+0xe8/0x180
[  101.359256]  [<ffffffff803fe7e1>] scsi_request_fn+0x71/0x380
[  101.359264]  [<ffffffff803375b8>] __generic_unplug_device+0x28/0x30
[  101.359270]  [<ffffffff80337623>] blk_execute_rq_nowait+0x63/0xb0
[  101.359276]  [<ffffffff80339113>] blk_execute_rq+0x73/0xe0
[  101.359283]  [<ffffffff80337775>] get_request_wait+0x25/0x120
[  101.359288]  [<ffffffff80337896>] blk_get_request+0x26/0x80
[  101.359296]  [<ffffffff803fe5b2>] scsi_execute+0xe2/0x110
[  101.359301]  [<ffffffff803fe661>] scsi_execute_req+0x81/0xf0
[  101.359312]  [<ffffffff8800d713>] :sr_mod:sr_probe+0x1e3/0x630
[  101.359323]  [<ffffffff803c8d01>] driver_probe_device+0xa1/0x1c0
[  101.359329]  [<ffffffff803c8ff5>] __driver_attach+0xe5/0xf0
[  101.359334]  [<ffffffff803c8f10>] __driver_attach+0x0/0xf0
[  101.359342]  [<ffffffff803c7ee3>] bus_for_each_dev+0x53/0x80
[  101.359348]  [<ffffffff803c8b3c>] driver_attach+0x1c/0x20
[  101.359353]  [<ffffffff803c8305>] bus_add_driver+0xa5/0x210
[  101.359360]  [<ffffffff803c922a>] driver_register+0x4a/0x80
[  101.359367]  [<ffffffff80402241>] scsi_register_driver+0x11/0x20
[  101.359374]  [<ffffffff8801303c>] :sr_mod:init_sr+0x3c/0x5c
[  101.359382]  [<ffffffff80268a23>] sys_init_module+0x153/0x1a80
[  101.359395]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[  101.359399]
[  101.359401]
[  101.359402] Code:  Bad RIP value.
[  101.359409] RIP  [<ffffffff88021cc0>]
[  101.359414]  RSP <ffff81002b457970>
[  101.359416] CR2: ffffffff88021cc0
[  101.359425] ---[ end trace c303fca3a91ba9a8 ]---
[  101.359429] note: modprobe[4496] exited with preempt_count 1
[  101.359438] BUG: sleeping function called from invalid context at
kernel/rwsem.c:21
[  101.359451] in_atomic():1, irqs_disabled():0
[  101.359460] INFO: lockdep is turned off.
[  101.359469] Pid: 4496, comm: modprobe Tainted: G      D 2.6.24-rc6-git7 #5
[  101.359480]
[  101.359482] Call Trace:
[  101.359494]  [<ffffffff80232ee2>] __might_sleep+0xc2/0xf0
[  101.359509]  [<ffffffff8056f56d>] down_read+0x1d/0x50
[  101.359521]  [<ffffffff8023cb80>] exit_mm+0x30/0x110
[  101.359532]  [<ffffffff8023e6d7>] do_exit+0x1c7/0x950
[  101.359546]  [<ffffffff802268b8>] do_page_fault+0x5a8/0x860
[  101.359564]  [<ffffffff805712ad>] error_exit+0x0/0xa9
[  101.359578]  [<ffffffff80333a98>] elv_next_request+0xe8/0x180
[  101.359591]  [<ffffffff803fe7e1>] scsi_request_fn+0x71/0x380
[  101.359606]  [<ffffffff803375b8>] __generic_unplug_device+0x28/0x30
[  101.359619]  [<ffffffff80337623>] blk_execute_rq_nowait+0x63/0xb0
[  101.359633]  [<ffffffff80339113>] blk_execute_rq+0x73/0xe0
[  101.359647]  [<ffffffff80337775>] get_request_wait+0x25/0x120
[  101.359660]  [<ffffffff80337896>] blk_get_request+0x26/0x80
[  101.359674]  [<ffffffff803fe5b2>] scsi_execute+0xe2/0x110
[  101.359688]  [<ffffffff803fe661>] scsi_execute_req+0x81/0xf0
[  101.359703]  [<ffffffff8800d713>] :sr_mod:sr_probe+0x1e3/0x630
[  101.359719]  [<ffffffff803c8d01>] driver_probe_device+0xa1/0x1c0
[  101.359731]  [<ffffffff803c8ff5>] __driver_attach+0xe5/0xf0
[  101.359742]  [<ffffffff803c8f10>] __driver_attach+0x0/0xf0
[  101.359756]  [<ffffffff803c7ee3>] bus_for_each_dev+0x53/0x80
[  101.359768]  [<ffffffff803c8b3c>] driver_attach+0x1c/0x20
[  101.359780]  [<ffffffff803c8305>] bus_add_driver+0xa5/0x210
[  101.359793]  [<ffffffff803c922a>] driver_register+0x4a/0x80
[  101.359807]  [<ffffffff80402241>] scsi_register_driver+0x11/0x20
[  101.359821]  [<ffffffff8801303c>] :sr_mod:init_sr+0x3c/0x5c
[  101.359834]  [<ffffffff80268a23>] sys_init_module+0x153/0x1a80
[  101.359853]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[  101.359865]

This is clearly showing it was in the request function and probably
indirected via a bad pointer.

I think something in the remove chain triggered a reinsertion of the
module (this would be something in the user land hotplug scripts or
something in the kernel triggering it).  Now, for the second time of
asking, what is the underlying driver or HBA the cdrom is attached to?

James



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bugme-new] [Bug 9674] New: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core
  2008-01-02  3:24   ` [Bugme-new] [Bug 9674] New: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core James Bottomley
  2008-01-02 12:21     ` Rafael J. Wysocki
@ 2008-01-02 15:49     ` Pete Wyckoff
  2008-01-02 16:33       ` James Bottomley
  1 sibling, 1 reply; 6+ messages in thread
From: Pete Wyckoff @ 2008-01-02 15:49 UTC (permalink / raw)
  To: James Bottomley
  Cc: Andrew Morton, Rafael J. Wysocki, Pierre Ossman, Jens Axboe,
	bugme-daemon, gentuu, linux-scsi

James.Bottomley@HansenPartnership.com wrote on Tue, 01 Jan 2008 21:24 -0600:
> 
> On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> > On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > > 
> > >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc,
> > >                     mmc_core
> > 
> > Guys, this is a very recent regression.  Could you please take a look, see
> > if it's due to mmc, block or scsi changes?
> 
> There's not a lot of information to go on.  The stack trace looks bogus,
> so I guess the kernel is compiled without a frame pointer.  However, it
> does look like the initial insertion of sr_mod is going through and it
> generates a command which gets into scsi_request_fn and then indirects
> through a bogus queueucommand pointer.

Bogus prep_rq_fn actually.

> What's the actual underlying device the cdrom is attached to?
> 
> There's no real changes to SCSI in this area from 2.6.24-rc4 ...
> however, the reinsertion is suggestive, it's like the removal is
> retriggering a module request for some reason.

Here's a guess.  When sr_mod is removed, it looks like the request
queue prep_rq_fn is still pointing to the now nonexistent
sr_prep_fn.  This may have been due to a commit that went in early
2.6.24:

    commit 7f9a6bc4e9d59e7fcf03ed23f60cd81ca5d80b65
    Author: James Bottomley <James.Bottomley@steeleye.com>
    Date:   Sat Aug 4 10:06:25 2007 -0500

    [SCSI] move ULD attachment into the prep function
    
    One of the intents of the block prep function was to allow ULDs to use
    it for preprocessing.  The original SCSI model was to have a single prep
    function and add a pointer indirect filter to build the necessary
    commands.  This patch reverses that, does away with the init_command
    field of the scsi_driver structure and makes ULDs attach directly to the
    prep function instead.  The value is really that it allows us to begin
    to separate the ULDs from the SCSI mid layer (as long as they don't use
    any core functions---which is hard at the moment---a ULD doesn't even
    need SCSI to bind).
    
    Acked-by: Jens Axboe <jens.axboe@oracle.com>
    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

When the module is re-inserted, it does a few SCSI commands before
setting up the new prep_rq_fn, presumably hitting this bogus
pointer.

One fix would be to have sr remember the original prep function and
restore it in sr_kref_release.  Sd and a few other drivers have this
issue.  Ide-cd bothers to set prep_rq_fn to NULL as it releases
the device.

		-- Pete

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Bugme-new] [Bug 9674] New: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core
  2008-01-02 15:49     ` Pete Wyckoff
@ 2008-01-02 16:33       ` James Bottomley
  2008-01-02 17:14         ` [PATCH] scsi_sysfs: restore prep_fn when ULD is removed James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2008-01-02 16:33 UTC (permalink / raw)
  To: Pete Wyckoff
  Cc: Andrew Morton, Rafael J. Wysocki, Pierre Ossman, Jens Axboe,
	bugme-daemon, gentuu, linux-scsi


On Wed, 2008-01-02 at 10:49 -0500, Pete Wyckoff wrote:
> James.Bottomley@HansenPartnership.com wrote on Tue, 01 Jan 2008 21:24 -0600:
> > 
> > On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> > > On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> > > 
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > > > 
> > > >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc,
> > > >                     mmc_core
> > > 
> > > Guys, this is a very recent regression.  Could you please take a look, see
> > > if it's due to mmc, block or scsi changes?
> > 
> > There's not a lot of information to go on.  The stack trace looks bogus,
> > so I guess the kernel is compiled without a frame pointer.  However, it
> > does look like the initial insertion of sr_mod is going through and it
> > generates a command which gets into scsi_request_fn and then indirects
> > through a bogus queueucommand pointer.
> 
> Bogus prep_rq_fn actually.
> 
> > What's the actual underlying device the cdrom is attached to?
> > 
> > There's no real changes to SCSI in this area from 2.6.24-rc4 ...
> > however, the reinsertion is suggestive, it's like the removal is
> > retriggering a module request for some reason.
> 
> Here's a guess.  When sr_mod is removed, it looks like the request
> queue prep_rq_fn is still pointing to the now nonexistent
> sr_prep_fn.  This may have been due to a commit that went in early
> 2.6.24:
> 
>     commit 7f9a6bc4e9d59e7fcf03ed23f60cd81ca5d80b65
>     Author: James Bottomley <James.Bottomley@steeleye.com>
>     Date:   Sat Aug 4 10:06:25 2007 -0500
> 
>     [SCSI] move ULD attachment into the prep function
>     
>     One of the intents of the block prep function was to allow ULDs to use
>     it for preprocessing.  The original SCSI model was to have a single prep
>     function and add a pointer indirect filter to build the necessary
>     commands.  This patch reverses that, does away with the init_command
>     field of the scsi_driver structure and makes ULDs attach directly to the
>     prep function instead.  The value is really that it allows us to begin
>     to separate the ULDs from the SCSI mid layer (as long as they don't use
>     any core functions---which is hard at the moment---a ULD doesn't even
>     need SCSI to bind).
>     
>     Acked-by: Jens Axboe <jens.axboe@oracle.com>
>     Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
> 
> When the module is re-inserted, it does a few SCSI commands before
> setting up the new prep_rq_fn, presumably hitting this bogus
> pointer.
> 
> One fix would be to have sr remember the original prep function and
> restore it in sr_kref_release.  Sd and a few other drivers have this
> issue.  Ide-cd bothers to set prep_rq_fn to NULL as it releases
> the device.

Bingo .. that's why we ask the list, thanks Pete!

I don't think the fix is the correct one, but I've attached what I think
is the correct fix (basically, there's a bus callback we can use to
ensure the right thing always gets done rather than relying on drivers
doing it in their own release methods, that way they can't forget).

The reason it was showing up in -rc4 I suspect is because something was
structurally altering the module stack, and the address that sr_mod was
loaded was changed, so the prep_fn as Pete said was pointing into bogus
address space.

The way to trigger this bug 100% of the time is to rmmod sr_mod and then
send an inquiry (or another command) to the device using the sg node.

James

---

Index: BUILD-2.6/drivers/scsi/scsi_lib.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_lib.c	2008-01-01 10:13:33.000000000 -0600
+++ BUILD-2.6/drivers/scsi/scsi_lib.c	2008-01-02 10:17:51.000000000 -0600
@@ -1324,7 +1324,7 @@ int scsi_prep_return(struct request_queu
 }
 EXPORT_SYMBOL(scsi_prep_return);
 
-static int scsi_prep_fn(struct request_queue *q, struct request *req)
+int scsi_prep_fn(struct request_queue *q, struct request *req)
 {
 	struct scsi_device *sdev = q->queuedata;
 	int ret = BLKPREP_KILL;
Index: BUILD-2.6/drivers/scsi/scsi_priv.h
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_priv.h	2007-11-03 09:08:46.000000000 -0500
+++ BUILD-2.6/drivers/scsi/scsi_priv.h	2008-01-02 10:20:09.000000000 -0600
@@ -74,6 +74,9 @@ extern struct request_queue *scsi_alloc_
 extern void scsi_free_queue(struct request_queue *q);
 extern int scsi_init_queue(void);
 extern void scsi_exit_queue(void);
+struct request_queue;
+struct request;
+extern int scsi_prep_fn(struct request_queue *, struct request *);
 
 /* scsi_proc.c */
 #ifdef CONFIG_SCSI_PROC_FS
Index: BUILD-2.6/drivers/scsi/scsi_sysfs.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_sysfs.c	2007-11-03 10:08:02.000000000 -0500
+++ BUILD-2.6/drivers/scsi/scsi_sysfs.c	2008-01-02 10:31:33.000000000 -0600
@@ -373,12 +373,24 @@ static int scsi_bus_resume(struct device
 	return err;
 }
 
+static int scsi_bus_remove(struct device *dev)
+{
+	struct scsi_device *sdev = to_scsi_device(dev);
+
+	/* reset the prep_fn back to the default since the
+	 * driver may have altered it and it's being removed */
+	blk_queue_prep_rq(sdev->request_queue, scsi_prep_fn);
+
+	return 0;
+}
+
 struct bus_type scsi_bus_type = {
         .name		= "scsi",
         .match		= scsi_bus_match,
 	.uevent		= scsi_bus_uevent,
 	.suspend	= scsi_bus_suspend,
 	.resume		= scsi_bus_resume,
+	.remove		= scsi_bus_remove,
 };
 
 int scsi_sysfs_register(void)



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] scsi_sysfs: restore prep_fn when ULD is removed
  2008-01-02 16:33       ` James Bottomley
@ 2008-01-02 17:14         ` James Bottomley
  0 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2008-01-02 17:14 UTC (permalink / raw)
  To: Pete Wyckoff
  Cc: Andrew Morton, Rafael J. Wysocki, Pierre Ossman, Jens Axboe,
	bugme-daemon, gentuu, linux-scsi

A recent bug report:

http://bugzilla.kernel.org/show_bug.cgi?id=9674

Was caused because the ULDs now set their own prep functions, but don't
necessarily reset the prep function back to the SCSI default when they
are removed.  This leads to panics if commands are sent to the device
after the module is removed because the prep_fn is still pointing to the
old module code.  The fix for this is to implement a bus remove method
that resets the prep_fn pointer correctly before calling the driver
specific prep_fn.

James

----

P.S. The patch in the bug report is slightly wrong.  It doesn't include
the piece to call the ULD specific remove function.

Index: BUILD-2.6/drivers/scsi/scsi_lib.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_lib.c	2008-01-01 10:13:33.000000000 -0600
+++ BUILD-2.6/drivers/scsi/scsi_lib.c	2008-01-02 10:17:51.000000000 -0600
@@ -1324,7 +1324,7 @@ int scsi_prep_return(struct request_queu
 }
 EXPORT_SYMBOL(scsi_prep_return);
 
-static int scsi_prep_fn(struct request_queue *q, struct request *req)
+int scsi_prep_fn(struct request_queue *q, struct request *req)
 {
 	struct scsi_device *sdev = q->queuedata;
 	int ret = BLKPREP_KILL;
Index: BUILD-2.6/drivers/scsi/scsi_priv.h
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_priv.h	2007-11-03 09:08:46.000000000 -0500
+++ BUILD-2.6/drivers/scsi/scsi_priv.h	2008-01-02 10:20:09.000000000 -0600
@@ -74,6 +74,9 @@ extern struct request_queue *scsi_alloc_
 extern void scsi_free_queue(struct request_queue *q);
 extern int scsi_init_queue(void);
 extern void scsi_exit_queue(void);
+struct request_queue;
+struct request;
+extern int scsi_prep_fn(struct request_queue *, struct request *);
 
 /* scsi_proc.c */
 #ifdef CONFIG_SCSI_PROC_FS
Index: BUILD-2.6/drivers/scsi/scsi_sysfs.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_sysfs.c	2007-11-03 10:08:02.000000000 -0500
+++ BUILD-2.6/drivers/scsi/scsi_sysfs.c	2008-01-02 10:58:17.000000000 -0600
@@ -373,12 +373,29 @@ static int scsi_bus_resume(struct device
 	return err;
 }
 
+static int scsi_bus_remove(struct device *dev)
+{
+	struct device_driver *drv = dev->driver;
+	struct scsi_device *sdev = to_scsi_device(dev);
+	int err = 0;
+
+	/* reset the prep_fn back to the default since the
+	 * driver may have altered it and it's being removed */
+	blk_queue_prep_rq(sdev->request_queue, scsi_prep_fn);
+
+	if (drv && drv->remove)
+		err = drv->remove(dev);
+
+	return 0;
+}
+
 struct bus_type scsi_bus_type = {
         .name		= "scsi",
         .match		= scsi_bus_match,
 	.uevent		= scsi_bus_uevent,
 	.suspend	= scsi_bus_suspend,
 	.resume		= scsi_bus_resume,
+	.remove		= scsi_bus_remove,
 };
 
 int scsi_sysfs_register(void)



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-01-02 17:14 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <bug-9674-10286@http.bugzilla.kernel.org/>
     [not found] ` <20080101181026.38298e13.akpm@linux-foundation.org>
2008-01-02  3:24   ` [Bugme-new] [Bug 9674] New: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core James Bottomley
2008-01-02 12:21     ` Rafael J. Wysocki
2008-01-02 15:41       ` James Bottomley
2008-01-02 15:49     ` Pete Wyckoff
2008-01-02 16:33       ` James Bottomley
2008-01-02 17:14         ` [PATCH] scsi_sysfs: restore prep_fn when ULD is removed James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).