* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
From: James Bottomley @ 2008-04-19 13:22 UTC
To: Ingo Molnar; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi
On Sat, 2008-04-19 at 10:57 +0200, Ingo Molnar wrote:
> x86.git allyesconfig bootup test produced the following warning in
> slub.c (and a stream of similar warnings later on):
>
> [ 47.295141] ------------[ cut here ]------------
> [ 47.298969] WARNING: at mm/slub.c:2443 kmem_cache_destroy+0xf8/0x102()
> [ 47.302967] Modules linked in:
> [ 47.307205] Pid: 1, comm: swapper Not tainted 2.6.25-sched-devel.git-x86-latest.git #24
> [ 47.310970] [<c014be1a>] warn_on_slowpath+0x46/0x56
> [ 47.317464] [<c019685f>] ? get_pageblock_flags_group+0x56/0x74
> [ 47.322969] [<c04f2e03>] ? list_add+0xa/0xf
> [ 47.328814] [<c123083a>] ? _spin_unlock+0x22/0x25
> [ 47.333327] [<c0191ce8>] ? time_hardirqs_off+0xe/0x1f
> [ 47.338972] [<c0167ff7>] ? trace_hardirqs_off_caller+0x15/0xab
> [ 47.342970] [<c0168098>] ? trace_hardirqs_off+0xb/0xd
> [ 47.350971] [<c01976f7>] ? free_hot_cold_page+0x11d/0x139
> [ 47.354970] [<c0191f58>] ? time_hardirqs_on+0xe/0x1f
> [ 47.361595] [<c0169779>] ? trace_hardirqs_on_caller+0x16/0x13c
> [ 47.366973] [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
> [ 47.370972] [<c01b1a62>] kmem_cache_destroy+0xf8/0x102
> [ 47.377727] [<c09dab7f>] scsi_put_host_cmd_pool+0x42/0x58
> [ 47.382973] [<c09db1b4>] scsi_destroy_command_freelist+0x54/0x5c
> [ 47.386972] [<c09db2ef>] scsi_host_dev_release+0x79/0xa9
> [ 47.394973] [<c067ee95>] device_release+0x3e/0x54
> [ 47.398974] [<c04e0e62>] kobject_release+0x45/0x55
> [ 47.405383] [<c04e0e1d>] ? kobject_release+0x0/0x55
> [ 47.409513] [<c04e1968>] kref_put+0x3e/0x49
> [ 47.414975] [<c04e0d8d>] kobject_put+0x19/0x1b
> [ 47.418975] [<c067f44e>] put_device+0x16/0x18
> [ 47.422975] [<c09db274>] scsi_host_put+0x12/0x14
> [ 47.426975] [<c09db3dc>] scsi_unregister+0x1d/0x20
> [ 47.433383] [<c1ceddcc>] aha1542_detect+0x7db/0x7f5
> [ 47.438977] [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
> [ 47.442976] [<c1ceddf1>] ? init_this_scsi_driver+0xb/0xd0
> [ 47.450666] [<c1cede44>] init_this_scsi_driver+0x5e/0xd0
> [ 47.454978] [<c1c944db>] kernel_init+0x152/0x2b0
> [ 47.458978] [<c1c94389>] ? kernel_init+0x0/0x2b0
> [ 47.465249] [<c1c94389>] ? kernel_init+0x0/0x2b0
> [ 47.469258] [<c01199c3>] kernel_thread_helper+0x7/0x10
> [ 47.474978] =======================
> [ 47.478980] ---[ end trace 778e504de7e3b1e3 ]---
> [ 47.483297] ------------[ cut here ]------------
>
> config and bootlog at:
>
> http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
> http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad
>
> [a few .config options were turned off: just accept all the defaults
> after 'make oldconfig']
The WARN_ON is caused by kmem_cache_destroy() with apparently
outstanding objects, isn't it?
The most significant piece of the log seems to be before with all those
isa SCSI drivers ... I assume you don't actually have any of the
hardware, you're just randomly inserting the modules?
James
* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
From: Ingo Molnar @ 2008-04-21 13:49 UTC
To: James Bottomley; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi
* James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > x86.git allyesconfig bootup test produced the following warning in
> > slub.c (and a stream of similar warnings later on):
[...]
> > config and bootlog at:
> >
> > http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
> > http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad
> >
> > [a few .config options were turned off: just accept all the defaults
> > after 'make oldconfig']
>
> The WARN_ON is caused by kmem_cache_destroy() with apparently
> outstanding objects, isn't it?
>
> The most significant piece of the log seems to be before with all
> those isa SCSI drivers ... I assume you don't actually have any of the
> hardware, you're just randomly inserting the modules?
correct. As i mentioned in the first sentence, this is an allyesconfig
bzImage bootup. I.e. this is the bootup log of a "make allyesconfig"
kernel - roughly analogous to (trying to) insert every module in
existence. In the boot log you'll find 4871 initcalls, run by over 3000
drivers, each of which the kernel attempts to load (!).
I do those bootups to "run as much as possible" kernel code and to make
sure that the maximum combination of debug and other features still
produces a working kernel.
I had to work half a year to gradually get the kernel to that stage
(started with it more than a year ago, as part of the -rt kernel) but
these days i'm booting a 32-bit and a 64-bit allyesconfig bzImage kernel
almost daily :) These bootups already caught a healthy amount of bugs in
the kernel, both important and unimportant ones. Recently the size of
the allyesconfig bzImage kernel surpassed 42MB, so it's massive.
Btw., i also boot "allnoconfig" kernels. [with just the minimal set of
features turned on to make the kernel minimally boot up and report back
via networking.]
Ingo
* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
From: James Bottomley @ 2008-04-21 15:57 UTC
To: Ingo Molnar; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi
On Mon, 2008-04-21 at 15:49 +0200, Ingo Molnar wrote:
> * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>
> > > x86.git allyesconfig bootup test produced the following warning in
> > > slub.c (and a stream of similar warnings later on):
> [...]
>
> > > config and bootlog at:
> > >
> > > http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
> > > http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad
> > >
> > > [a few .config options were turned off: just accept all the defaults
> > > after 'make oldconfig']
> >
> > The WARN_ON is caused by kmem_cache_destroy() with apparently
> > outstanding objects, isn't it?
> >
> > The most significant piece of the log seems to be before with all
> > those isa SCSI drivers ... I assume you don't actually have any of the
> > hardware, you're just randomly inserting the modules?
>
> correct. As i mentioned in the first sentence, this is an allyesconfig
> bzImage bootup. I.e. this is the bootup log of a "make allyesconfig"
> kernel - roughly analogous to (trying to) insert every module in
> existence. In the boot log you'll find 4871 initcalls, run by over 3000
> drivers, each of which the kernel attempts to load (!).
>
> I do those bootups to "run as much as possible" kernel code and to make
> sure that the maximum combination of debug and other features still
> produces a working kernel.
>
> I had to work half a year to gradually get the kernel to that stage
> (started with it more than a year ago, as part of the -rt kernel) but
> these days i'm booting a 32-bit and a 64-bit allyesconfig bzImage kernel
> almost daily :) These bootups already caught a healthy amount of bugs in
> the kernel, both important and unimportant ones. Recently the size of
> the allyesconfig bzImage kernel surpassed 42MB, so it's massive.
>
> Btw., i also boot "allnoconfig" kernels. [with just the minimal set of
> features turned on to make the kernel minimally boot up and report back
> via networking.]
Thanks ... it looks like we may have trouble from devices that alter the
unchecked isa dma flag after scsi_host_alloc. The guilty parties appear
to be gdth, eata, u14-34f, ultrastor, BusLogic and advansys.
The trouble is that if you alloc the host with it one way and free it
with it the other, the wrong freelist is used and the ref counts are
invalid.
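The mismatch described above can be modelled in userspace. This is only a toy sketch under stated assumptions: struct cmd_pool, struct host, get_pool(), put_pool() and lifecycle_is_balanced() are hypothetical names for illustration, not the kernel's definitions. It shows how picking the pool from a mutable flag at both alloc and free time unbalances the user counts once the flag changes in between.

```c
#include <stdbool.h>

/* Toy stand-in for a refcounted command pool; NOT the kernel's struct. */
struct cmd_pool {
	const char *name;
	int users;       /* hosts currently holding a reference */
	bool cache_live; /* stands in for the backing slab cache */
};

/* Toy host carrying the mutable flag discussed in the thread. */
struct host {
	bool unchecked_isa_dma;
};

static struct cmd_pool normal_pool = { "normal", 0, false };
static struct cmd_pool dma_pool = { "dma", 0, false };

/* The pool is chosen from the flag's value at call time. */
static struct cmd_pool *pool_for(const struct host *h)
{
	return h->unchecked_isa_dma ? &dma_pool : &normal_pool;
}

/* First user creates the cache. */
static void get_pool(const struct host *h)
{
	struct cmd_pool *p = pool_for(h);

	if (p->users++ == 0)
		p->cache_live = true;
}

/* Last user destroys the cache. */
static void put_pool(const struct host *h)
{
	struct cmd_pool *p = pool_for(h);

	if (--p->users == 0)
		p->cache_live = false;
}

/* Returns true when both counts are back at zero after alloc and free. */
static bool lifecycle_is_balanced(bool flip_flag_after_alloc)
{
	struct host h = { .unchecked_isa_dma = false };

	normal_pool = (struct cmd_pool){ "normal", 0, false };
	dma_pool = (struct cmd_pool){ "dma", 0, false };

	get_pool(&h);                       /* taken at alloc time */
	if (flip_flag_after_alloc)
		h.unchecked_isa_dma = true; /* driver changes the flag */
	put_pool(&h);                       /* released at free time */

	return normal_pool.users == 0 && dma_pool.users == 0;
}
```

Calling lifecycle_is_balanced(true) leaves the normal pool with a leaked user and the DMA pool with a negative count, which is the kind of inconsistency the paragraph above describes.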
Try this pseudo fix: it avoids allocating the freelist until add time
(by which time they should all have fixed the flag). It still doesn't
change the fact that the host is allocated in the wrong region, but that
shouldn't matter too much.
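The ordering change can be sketched the same way. Again this is a toy userspace model, not the patch itself: host_alloc(), host_add(), host_release() and the freelist_ready field are hypothetical, and skipping the release for a host that was never added is an assumption of this sketch rather than something the patch below does.

```c
#include <stdbool.h>

struct pool {
	int users; /* hosts currently holding a reference */
};

struct host {
	bool unchecked_isa_dma;
	bool freelist_ready; /* hypothetical: set once a pool is taken */
};

static struct pool pools[2]; /* [0] normal, [1] ISA DMA */

static struct pool *pool_for(const struct host *h)
{
	return &pools[h->unchecked_isa_dma ? 1 : 0];
}

/* Alloc no longer touches any pool. */
static void host_alloc(struct host *h)
{
	h->unchecked_isa_dma = false;
	h->freelist_ready = false;
}

/* The pool is taken at add time, after the driver has set the flag. */
static void host_add(struct host *h)
{
	pool_for(h)->users++;
	h->freelist_ready = true;
}

/* Release only drops a reference that was actually taken. */
static void host_release(struct host *h)
{
	if (h->freelist_ready) {
		pool_for(h)->users--;
		h->freelist_ready = false;
	}
}

/* Alloc, flag change, add, release: counts should end at zero. */
static bool deferred_setup_is_balanced(void)
{
	struct host h;

	pools[0].users = 0;
	pools[1].users = 0;

	host_alloc(&h);
	h.unchecked_isa_dma = true; /* driver probe fixes the flag */
	host_add(&h);
	host_release(&h);

	return pools[0].users == 0 && pools[1].users == 0;
}
```

Because the flag is final before the pool is chosen, the get and the put resolve to the same pool in this model.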
James
---
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 1592640..75af254 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -199,9 +199,13 @@ int scsi_add_host(struct Scsi_Host *shost, struct device *dev)
 	if (!shost->can_queue) {
 		printk(KERN_ERR "%s: can_queue = 0 no longer supported\n",
 				sht->name);
-		goto out;
+		goto fail;
 	}
 
+	error = scsi_setup_command_freelist(shost);
+	if (error)
+		goto fail;
+
 	if (!shost->shost_gendev.parent)
 		shost->shost_gendev.parent = dev ? dev : &platform_bus;
 
@@ -255,6 +259,8 @@ int scsi_add_host(struct Scsi_Host *shost, struct device *dev)
  out_del_gendev:
 	device_del(&shost->shost_gendev);
  out:
+	scsi_destroy_command_freelist(shost);
+ fail:
 	return error;
 }
 EXPORT_SYMBOL(scsi_add_host);
@@ -376,10 +382,6 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	else
 		shost->dma_boundary = 0xffffffff;
 
-	rval = scsi_setup_command_freelist(shost);
-	if (rval)
-		goto fail_kfree;
-
 	device_initialize(&shost->shost_gendev);
 	snprintf(shost->shost_gendev.bus_id, BUS_ID_SIZE, "host%d",
 		shost->host_no);
@@ -395,14 +397,12 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 			"scsi_eh_%d", shost->host_no);
 	if (IS_ERR(shost->ehandler)) {
 		rval = PTR_ERR(shost->ehandler);
-		goto fail_destroy_freelist;
+		goto fail_kfree;
 	}
 
 	scsi_proc_hostdir_add(shost->hostt);
 	return shost;
 
- fail_destroy_freelist:
-	scsi_destroy_command_freelist(shost);
  fail_kfree:
 	kfree(shost);
 	return NULL;
* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
From: Ingo Molnar @ 2008-04-22 13:10 UTC
To: James Bottomley; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi
* James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> Thanks ... it looks like we may have trouble from devices that alter
> the unchecked isa dma flag after scsi_host_alloc. The guilty parties
> appear to be gdth, eata, u14-34f, ultrastor, BusLogic and advansys.
>
> The trouble is that if you alloc the host with it one way and free it
> with it the other, the wrong freelist is used and the ref counts are
> invalid.
>
> Try this pseudo fix: it avoids allocating the freelist until add time
> (by which time they should all have fixed the flag). It still doesn't
> change the fact that the host is allocated in the wrong region, but
> that shouldn't matter too much.
ok - do you intend to push this pseudo-fix upstream? If yes then please
consider it fixed as far as i'm concerned - i'll re-reply if the warning
resurfaces (it wasn't lethal to the bootup otherwise). Or if you've got
some other approach/fix then i can test that too.
Ingo
* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
From: James Bottomley @ 2008-04-22 13:39 UTC
To: Ingo Molnar; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi
On Tue, 2008-04-22 at 15:10 +0200, Ingo Molnar wrote:
> * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>
> > Thanks ... it looks like we may have trouble from devices that alter
> > the unchecked isa dma flag after scsi_host_alloc. The guilty parties
> > appear to be gdth, eata, u14-34f, ultrastor, BusLogic and advansys.
> >
> > The trouble is that if you alloc the host with it one way and free it
> > with it the other, the wrong freelist is used and the ref counts are
> > invalid.
> >
> > Try this pseudo fix: it avoids allocating the freelist until add time
> > (by which time they should all have fixed the flag). It still doesn't
> > change the fact that the host is allocated in the wrong region, but
> > that shouldn't matter too much.
>
> ok - do you intend to push this pseudo-fix upstream? If yes then please
> consider it fixed as far as i'm concerned - i'll re-reply if the warning
> resurfaces (it wasn't lethal to the bootup otherwise). Or if you've got
> some other approach/fix then i can test that too.
It's certainly the line of least resistance. The more correct fix would
be to haul unchecked_isa_dma out of the host and make everything use the
template, so it becomes immutable (as it should be). However, that's a
lot of work and given that Andi's gunning for unchecked_isa_dma anyway,
probably not worth it.
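For illustration only, the "more correct" direction might look like the following userspace sketch, assuming the flag lives solely in a read-only template. The struct layouts and the helper names host_uses_isa_dma() and flag_stable_across_lifetime() are hypothetical and far simpler than the real SCSI structures.

```c
#include <stdbool.h>

/* Toy template: the single owner of the flag. */
struct host_template {
	const char *name;
	bool unchecked_isa_dma;
};

/* Toy host: keeps no copy of the flag, only a const pointer. */
struct host {
	const struct host_template *hostt;
};

/* Every consumer reads the flag through the template. */
static bool host_uses_isa_dma(const struct host *h)
{
	return h->hostt->unchecked_isa_dma;
}

/* The value seen at alloc time and at free time is the same. */
static bool flag_stable_across_lifetime(void)
{
	static const struct host_template tmpl = { "toy-isa", true };
	struct host h = { &tmpl };
	bool at_alloc = host_uses_isa_dma(&h);
	/* A driver probe cannot write the flag through the const pointer. */
	bool at_free = host_uses_isa_dma(&h);

	return at_alloc == at_free;
}
```

With no per-host copy to modify, pool selection at alloc and free time cannot diverge in this model, at the cost of converting every user of the per-host flag.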
So, unless I think of something better in the next few days, this is the
way I'll fix it.
James