public inbox for linux-kernel@vger.kernel.org
* [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
@ 2008-04-19  8:57 Ingo Molnar
  2008-04-19  9:11 ` Pekka J Enberg
  2008-04-19 13:22 ` James Bottomley
  0 siblings, 2 replies; 10+ messages in thread
From: Ingo Molnar @ 2008-04-19  8:57 UTC
  To: linux-kernel; +Cc: James Bottomley, Pekka Enberg, Christoph Lameter


x86.git allyesconfig bootup test produced the following warning in 
slub.c (and a stream of similar warnings later on):

[   47.295141] ------------[ cut here ]------------
[   47.298969] WARNING: at mm/slub.c:2443 kmem_cache_destroy+0xf8/0x102()
[   47.302967] Modules linked in:
[   47.307205] Pid: 1, comm: swapper Not tainted 2.6.25-sched-devel.git-x86-latest.git #24
[   47.310970]  [<c014be1a>] warn_on_slowpath+0x46/0x56
[   47.317464]  [<c019685f>] ? get_pageblock_flags_group+0x56/0x74
[   47.322969]  [<c04f2e03>] ? list_add+0xa/0xf
[   47.328814]  [<c123083a>] ? _spin_unlock+0x22/0x25
[   47.333327]  [<c0191ce8>] ? time_hardirqs_off+0xe/0x1f
[   47.338972]  [<c0167ff7>] ? trace_hardirqs_off_caller+0x15/0xab
[   47.342970]  [<c0168098>] ? trace_hardirqs_off+0xb/0xd
[   47.350971]  [<c01976f7>] ? free_hot_cold_page+0x11d/0x139
[   47.354970]  [<c0191f58>] ? time_hardirqs_on+0xe/0x1f
[   47.361595]  [<c0169779>] ? trace_hardirqs_on_caller+0x16/0x13c
[   47.366973]  [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
[   47.370972]  [<c01b1a62>] kmem_cache_destroy+0xf8/0x102
[   47.377727]  [<c09dab7f>] scsi_put_host_cmd_pool+0x42/0x58
[   47.382973]  [<c09db1b4>] scsi_destroy_command_freelist+0x54/0x5c
[   47.386972]  [<c09db2ef>] scsi_host_dev_release+0x79/0xa9
[   47.394973]  [<c067ee95>] device_release+0x3e/0x54
[   47.398974]  [<c04e0e62>] kobject_release+0x45/0x55
[   47.405383]  [<c04e0e1d>] ? kobject_release+0x0/0x55
[   47.409513]  [<c04e1968>] kref_put+0x3e/0x49
[   47.414975]  [<c04e0d8d>] kobject_put+0x19/0x1b
[   47.418975]  [<c067f44e>] put_device+0x16/0x18
[   47.422975]  [<c09db274>] scsi_host_put+0x12/0x14
[   47.426975]  [<c09db3dc>] scsi_unregister+0x1d/0x20
[   47.433383]  [<c1ceddcc>] aha1542_detect+0x7db/0x7f5
[   47.438977]  [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
[   47.442976]  [<c1ceddf1>] ? init_this_scsi_driver+0xb/0xd0
[   47.450666]  [<c1cede44>] init_this_scsi_driver+0x5e/0xd0
[   47.454978]  [<c1c944db>] kernel_init+0x152/0x2b0
[   47.458978]  [<c1c94389>] ? kernel_init+0x0/0x2b0
[   47.465249]  [<c1c94389>] ? kernel_init+0x0/0x2b0
[   47.469258]  [<c01199c3>] kernel_thread_helper+0x7/0x10
[   47.474978]  =======================
[   47.478980] ---[ end trace 778e504de7e3b1e3 ]---
[   47.483297] ------------[ cut here ]------------

config and bootlog at:

 http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
 http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad

[a few .config options were turned off: just accept all the defaults 
 after 'make oldconfig']

	Ingo


* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-19  8:57 [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool() Ingo Molnar
@ 2008-04-19  9:11 ` Pekka J Enberg
  2008-04-19 10:43   ` Ingo Molnar
  2008-04-21  5:58   ` Christoph Lameter
  2008-04-19 13:22 ` James Bottomley
  1 sibling, 2 replies; 10+ messages in thread
From: Pekka J Enberg @ 2008-04-19  9:11 UTC
  To: Ingo Molnar; +Cc: linux-kernel, James Bottomley, Christoph Lameter

On Sat, 19 Apr 2008, Ingo Molnar wrote:
> x86.git allyesconfig bootup test produced the following warning in 
> slub.c (and a stream of similar warnings later on):
> 
> [   47.295141] ------------[ cut here ]------------
> [   47.298969] WARNING: at mm/slub.c:2443 kmem_cache_destroy+0xf8/0x102()
> [   47.302967] Modules linked in:
> [   47.307205] Pid: 1, comm: swapper Not tainted 2.6.25-sched-devel.git-x86-latest.git #24
> [   47.310970]  [<c014be1a>] warn_on_slowpath+0x46/0x56
> [   47.317464]  [<c019685f>] ? get_pageblock_flags_group+0x56/0x74
> [   47.322969]  [<c04f2e03>] ? list_add+0xa/0xf
> [   47.328814]  [<c123083a>] ? _spin_unlock+0x22/0x25
> [   47.333327]  [<c0191ce8>] ? time_hardirqs_off+0xe/0x1f
> [   47.338972]  [<c0167ff7>] ? trace_hardirqs_off_caller+0x15/0xab
> [   47.342970]  [<c0168098>] ? trace_hardirqs_off+0xb/0xd
> [   47.350971]  [<c01976f7>] ? free_hot_cold_page+0x11d/0x139
> [   47.354970]  [<c0191f58>] ? time_hardirqs_on+0xe/0x1f
> [   47.361595]  [<c0169779>] ? trace_hardirqs_on_caller+0x16/0x13c
> [   47.366973]  [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
> [   47.370972]  [<c01b1a62>] kmem_cache_destroy+0xf8/0x102
> [   47.377727]  [<c09dab7f>] scsi_put_host_cmd_pool+0x42/0x58
> [   47.382973]  [<c09db1b4>] scsi_destroy_command_freelist+0x54/0x5c
> [   47.386972]  [<c09db2ef>] scsi_host_dev_release+0x79/0xa9
> [   47.394973]  [<c067ee95>] device_release+0x3e/0x54
> [   47.398974]  [<c04e0e62>] kobject_release+0x45/0x55
> [   47.405383]  [<c04e0e1d>] ? kobject_release+0x0/0x55
> [   47.409513]  [<c04e1968>] kref_put+0x3e/0x49
> [   47.414975]  [<c04e0d8d>] kobject_put+0x19/0x1b
> [   47.418975]  [<c067f44e>] put_device+0x16/0x18
> [   47.422975]  [<c09db274>] scsi_host_put+0x12/0x14
> [   47.426975]  [<c09db3dc>] scsi_unregister+0x1d/0x20
> [   47.433383]  [<c1ceddcc>] aha1542_detect+0x7db/0x7f5
> [   47.438977]  [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
> [   47.442976]  [<c1ceddf1>] ? init_this_scsi_driver+0xb/0xd0
> [   47.450666]  [<c1cede44>] init_this_scsi_driver+0x5e/0xd0
> [   47.454978]  [<c1c944db>] kernel_init+0x152/0x2b0
> [   47.458978]  [<c1c94389>] ? kernel_init+0x0/0x2b0
> [   47.465249]  [<c1c94389>] ? kernel_init+0x0/0x2b0
> [   47.469258]  [<c01199c3>] kernel_thread_helper+0x7/0x10
> [   47.474978]  =======================
> [   47.478980] ---[ end trace 778e504de7e3b1e3 ]---
> [   47.483297] ------------[ cut here ]------------
> 
> config and bootlog at:
> 
>  http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
>  http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad
> 
> [a few .config options were turned off: just accept all the defaults 
>  after 'make oldconfig']

I couldn't spot anything in particular in SLUB, which makes me think the
SCSI code simply didn't free all objects before scsi_put_host_cmd_pool()
called kmem_cache_destroy() to kill the cache.

James, does this make sense or should I just look at SLUB harder?

			Pekka


* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-19  9:11 ` Pekka J Enberg
@ 2008-04-19 10:43   ` Ingo Molnar
  2008-04-21  5:58   ` Christoph Lameter
  1 sibling, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2008-04-19 10:43 UTC
  To: Pekka J Enberg; +Cc: linux-kernel, James Bottomley, Christoph Lameter


* Pekka J Enberg <penberg@cs.helsinki.fi> wrote:

> > [   47.370972]  [<c01b1a62>] kmem_cache_destroy+0xf8/0x102
> > [   47.377727]  [<c09dab7f>] scsi_put_host_cmd_pool+0x42/0x58
> > [   47.382973]  [<c09db1b4>] scsi_destroy_command_freelist+0x54/0x5c
> > [   47.386972]  [<c09db2ef>] scsi_host_dev_release+0x79/0xa9
> > [   47.394973]  [<c067ee95>] device_release+0x3e/0x54
> > [   47.398974]  [<c04e0e62>] kobject_release+0x45/0x55
> > [   47.405383]  [<c04e0e1d>] ? kobject_release+0x0/0x55
> > [   47.409513]  [<c04e1968>] kref_put+0x3e/0x49
> > [   47.414975]  [<c04e0d8d>] kobject_put+0x19/0x1b
> > [   47.418975]  [<c067f44e>] put_device+0x16/0x18
> > [   47.422975]  [<c09db274>] scsi_host_put+0x12/0x14
> > [   47.426975]  [<c09db3dc>] scsi_unregister+0x1d/0x20
> > [   47.433383]  [<c1ceddcc>] aha1542_detect+0x7db/0x7f5
> > [   47.438977]  [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
> > [   47.442976]  [<c1ceddf1>] ? init_this_scsi_driver+0xb/0xd0
> > [   47.450666]  [<c1cede44>] init_this_scsi_driver+0x5e/0xd0
> > [   47.454978]  [<c1c944db>] kernel_init+0x152/0x2b0
> > [   47.458978]  [<c1c94389>] ? kernel_init+0x0/0x2b0
> > [   47.465249]  [<c1c94389>] ? kernel_init+0x0/0x2b0
> > [   47.469258]  [<c01199c3>] kernel_thread_helper+0x7/0x10
> > [   47.474978]  =======================
> > [   47.478980] ---[ end trace 778e504de7e3b1e3 ]---
> > [   47.483297] ------------[ cut here ]------------
> > 
> > config and bootlog at:
> > 
> >  http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
> >  http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad
> > 
> > [a few .config options were turned off: just accept all the defaults 
> >  after 'make oldconfig']
> 
> I couldn't spot anything in particular in SLUB which makes me think 
> SCSI code simply didn't free all objects before 
> scsi_put_host_cmd_pool() called kmem_cache_destroy() to kill the 
> cache.
> 
> James, does this make sense or should I just look at SLUB harder?

yesterday's x86.git auto-qa passed fine with hundreds of successful 
builds and bootups so SLUB cannot be the culprit - i think it's more 
likely the SCSI layer changes that were pulled last night.

	Ingo


* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-19  8:57 [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool() Ingo Molnar
  2008-04-19  9:11 ` Pekka J Enberg
@ 2008-04-19 13:22 ` James Bottomley
  2008-04-21 13:49   ` Ingo Molnar
  1 sibling, 1 reply; 10+ messages in thread
From: James Bottomley @ 2008-04-19 13:22 UTC
  To: Ingo Molnar; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi

On Sat, 2008-04-19 at 10:57 +0200, Ingo Molnar wrote:
> x86.git allyesconfig bootup test produced the following warning in 
> slub.c (and a stream of similar warnings later on):
> 
> [   47.295141] ------------[ cut here ]------------
> [   47.298969] WARNING: at mm/slub.c:2443 kmem_cache_destroy+0xf8/0x102()
> [   47.302967] Modules linked in:
> [   47.307205] Pid: 1, comm: swapper Not tainted 2.6.25-sched-devel.git-x86-latest.git #24
> [   47.310970]  [<c014be1a>] warn_on_slowpath+0x46/0x56
> [   47.317464]  [<c019685f>] ? get_pageblock_flags_group+0x56/0x74
> [   47.322969]  [<c04f2e03>] ? list_add+0xa/0xf
> [   47.328814]  [<c123083a>] ? _spin_unlock+0x22/0x25
> [   47.333327]  [<c0191ce8>] ? time_hardirqs_off+0xe/0x1f
> [   47.338972]  [<c0167ff7>] ? trace_hardirqs_off_caller+0x15/0xab
> [   47.342970]  [<c0168098>] ? trace_hardirqs_off+0xb/0xd
> [   47.350971]  [<c01976f7>] ? free_hot_cold_page+0x11d/0x139
> [   47.354970]  [<c0191f58>] ? time_hardirqs_on+0xe/0x1f
> [   47.361595]  [<c0169779>] ? trace_hardirqs_on_caller+0x16/0x13c
> [   47.366973]  [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
> [   47.370972]  [<c01b1a62>] kmem_cache_destroy+0xf8/0x102
> [   47.377727]  [<c09dab7f>] scsi_put_host_cmd_pool+0x42/0x58
> [   47.382973]  [<c09db1b4>] scsi_destroy_command_freelist+0x54/0x5c
> [   47.386972]  [<c09db2ef>] scsi_host_dev_release+0x79/0xa9
> [   47.394973]  [<c067ee95>] device_release+0x3e/0x54
> [   47.398974]  [<c04e0e62>] kobject_release+0x45/0x55
> [   47.405383]  [<c04e0e1d>] ? kobject_release+0x0/0x55
> [   47.409513]  [<c04e1968>] kref_put+0x3e/0x49
> [   47.414975]  [<c04e0d8d>] kobject_put+0x19/0x1b
> [   47.418975]  [<c067f44e>] put_device+0x16/0x18
> [   47.422975]  [<c09db274>] scsi_host_put+0x12/0x14
> [   47.426975]  [<c09db3dc>] scsi_unregister+0x1d/0x20
> [   47.433383]  [<c1ceddcc>] aha1542_detect+0x7db/0x7f5
> [   47.438977]  [<c01698aa>] ? trace_hardirqs_on+0xb/0xd
> [   47.442976]  [<c1ceddf1>] ? init_this_scsi_driver+0xb/0xd0
> [   47.450666]  [<c1cede44>] init_this_scsi_driver+0x5e/0xd0
> [   47.454978]  [<c1c944db>] kernel_init+0x152/0x2b0
> [   47.458978]  [<c1c94389>] ? kernel_init+0x0/0x2b0
> [   47.465249]  [<c1c94389>] ? kernel_init+0x0/0x2b0
> [   47.469258]  [<c01199c3>] kernel_thread_helper+0x7/0x10
> [   47.474978]  =======================
> [   47.478980] ---[ end trace 778e504de7e3b1e3 ]---
> [   47.483297] ------------[ cut here ]------------
> 
> config and bootlog at:
> 
>  http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
>  http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad
> 
> [a few .config options were turned off: just accept all the defaults 
>  after 'make oldconfig']

The WARN_ON is caused by kmem_cache_destroy() with apparently
outstanding objects, isn't it?

The most significant piece of the log seems to be the part before it, with
all those ISA SCSI drivers ... I assume you don't actually have any of the
hardware, you're just randomly inserting the modules?

James

* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-19  9:11 ` Pekka J Enberg
  2008-04-19 10:43   ` Ingo Molnar
@ 2008-04-21  5:58   ` Christoph Lameter
  2008-04-21 13:50     ` Ingo Molnar
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Lameter @ 2008-04-21  5:58 UTC
  To: Pekka J Enberg; +Cc: Ingo Molnar, linux-kernel, James Bottomley

On Sat, 19 Apr 2008, Pekka J Enberg wrote:

> > [a few .config options were turned off: just accept all the defaults 
> >  after 'make oldconfig']
> 
> I couldn't spot anything in particular in SLUB which makes me think SCSI 
> code simply didn't free all objects before scsi_put_host_cmd_pool() called 
> kmem_cache_destroy() to kill the cache.
> 
> James, does this make sense or should I just look at SLUB harder?

The WARN is intended to flag that kmem_cache_destroy() was run on a cache
that still had objects which were not freed.

* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-19 13:22 ` James Bottomley
@ 2008-04-21 13:49   ` Ingo Molnar
  2008-04-21 15:57     ` James Bottomley
  0 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2008-04-21 13:49 UTC
  To: James Bottomley; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi


* James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> > x86.git allyesconfig bootup test produced the following warning in 
> > slub.c (and a stream of similar warnings later on):
[...]

> > config and bootlog at:
> > 
> >  http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
> >  http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad
> > 
> > [a few .config options were turned off: just accept all the defaults 
> >  after 'make oldconfig']
> 
> The WARN_ON is caused by kmem_cache_destroy() with apparently 
> outstanding objects, isn't it?
> 
> The most significant piece of the log seems to be before with all 
> those isa SCSI drivers ... I assume you don't actually have any of the 
> hardware, you're just randomly inserting the modules?

correct. As i mentioned in the first sentence, this is an allyesconfig 
bzImage bootup, i.e. the bootup log of a "make allyesconfig" 
kernel - roughly analogous to (trying to) insert every module in 
existence. In the boot log you'll find 4871 initcalls, made by over 3000 
drivers, each of which the kernel attempts to load (!).

I do those bootups to run as much kernel code as possible and to make 
sure that the maximum combination of debug and other features still 
produces a working kernel.

I had to work half a year to gradually get the kernel to that stage 
(i started on it more than a year ago, as part of the -rt kernel) but 
these days i'm booting a 32-bit and a 64-bit allyesconfig bzImage kernel 
almost daily :) These bootups have already caught a healthy number of bugs 
in the kernel, both important and unimportant ones. Recently the size of 
the allyesconfig bzImage kernel surpassed 42MB, so it's massive.

Btw., i also boot "allnoconfig" kernels. [with just the minimal set of 
features turned on to make the kernel minimally boot up and report back 
via networking.]

	Ingo


* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-21  5:58   ` Christoph Lameter
@ 2008-04-21 13:50     ` Ingo Molnar
  0 siblings, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2008-04-21 13:50 UTC
  To: Christoph Lameter; +Cc: Pekka J Enberg, linux-kernel, James Bottomley


* Christoph Lameter <clameter@sgi.com> wrote:

> On Sat, 19 Apr 2008, Pekka J Enberg wrote:
> 
> > > [a few .config options were turned off: just accept all the defaults 
> > >  after 'make oldconfig']
> > 
> > I couldn't spot anything in particular in SLUB which makes me think 
> > SCSI code simply didn't free all objects before 
> > scsi_put_host_cmd_pool() called kmem_cache_destroy() to kill the 
> > cache.
> > 
> > James, does this make sense or should I just look at SLUB harder?
> 
> The WARN is intended to warn that a kmem_cache_destroy was run with 
> objects not freed.

i suspect that if that warn-on triggers more frequently, it might make 
sense to turn it into a pretty SLUB warning that names the cache, with a 
stackdump at the end. (that way people are not tricked into mistakenly 
believing that it's a SLUB bug)
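
A rough userspace sketch of what such a message could carry is below. It is
not SLUB code: struct toy_cache, toy_cache_destroy() and the message wording
are all made up for illustration, and "scsi_cmd_cache" in the usage note is
only an example name. The point is just that naming the cache and the
outstanding object count steers the reader toward the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a slab cache; only what the message needs. */
struct toy_cache {
	const char *name;
	long objects_in_use;
};

/*
 * Returns 0 if the cache is empty and may be destroyed, -1 otherwise.
 * On failure, writes a message that names the cache and points at the
 * caller, so it is less likely to be read as an allocator bug.
 */
static int toy_cache_destroy(const struct toy_cache *c, char *msg, size_t len)
{
	assert(c && msg && len > 0);

	if (c->objects_in_use == 0) {
		msg[0] = '\0';
		return 0;
	}
	snprintf(msg, len,
		 "cache %s: destroyed with %ld object(s) still in use; "
		 "check the caller",
		 c->name, c->objects_in_use);
	return -1;
}
```

Calling toy_cache_destroy() on a cache named "scsi_cmd_cache" with two
objects in use returns -1 and fills msg; with zero objects it returns 0 and
leaves msg empty.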

	Ingo


* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-21 13:49   ` Ingo Molnar
@ 2008-04-21 15:57     ` James Bottomley
  2008-04-22 13:10       ` Ingo Molnar
  0 siblings, 1 reply; 10+ messages in thread
From: James Bottomley @ 2008-04-21 15:57 UTC
  To: Ingo Molnar; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi

On Mon, 2008-04-21 at 15:49 +0200, Ingo Molnar wrote:
> * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > > x86.git allyesconfig bootup test produced the following warning in 
> > > slub.c (and a stream of similar warnings later on):
> [...]
> 
> > > config and bootlog at:
> > > 
> > >  http://redhat.com/~mingo/misc/config-Sat_Apr_19_10_28_28_CEST_2008.bad
> > >  http://redhat.com/~mingo/misc/log-Sat_Apr_19_10_28_28_CEST_2008.bad
> > > 
> > > [a few .config options were turned off: just accept all the defaults 
> > >  after 'make oldconfig']
> > 
> > The WARN_ON is caused by kmem_cache_destroy() with apparently 
> > outstanding objects, isn't it?
> > 
> > The most significant piece of the log seems to be before with all 
> > those isa SCSI drivers ... I assume you don't actually have any of the 
> > hardware, you're just randomly inserting the modules?
> 
> correct. As i mentioned it in the first sentence this is an allyesconfig 
> bzImage bootup. I.e. this is the bootup log of a "make allyesconfig" 
> kernel - roughly analogous to (trying to) insert every module in 
> existence. In the boot log you'll find 4871 initcalls, done by over 3000 
> drivers that each is attempted to be loaded by the kernel (!).
> 
> I do those bootups to "run as much as possible" kernel code and to make 
> sure that the maximum combination of debug and other features still 
> produces a working kernel.
> 
> I had to work half a year to gradually get the kernel to that stage 
> (started with it more than a year ago, as part of the -rt kernel) but 
> these days i'm booting a 32-bit and a 64-bit allyesconfig bzImage kernel 
> almost daily :) These bootups already caught a healthy amount of bugs in 
> the kernel, both important and unimportant ones. Recently the size of 
> the allyesconfig bzImage kernel surpassed 42MB, so it's massive.
> 
> Btw., i also boot "allnoconfig" kernels. [with just the minimal set of 
> features turned on to make the kernel minimally boot up and report back 
> via networking.]

Thanks ... it looks like we may have trouble from drivers that alter the
unchecked_isa_dma flag after scsi_host_alloc.  The guilty parties appear
to be gdth, eata, u14-34f, ultrastor, BusLogic and advansys.

The trouble is that if you alloc the host with the flag set one way and
free it with the flag set the other way, the wrong freelist is used and
the ref counts are invalid.

Try this pseudo fix: it avoids allocating the freelist until add time
(by which time they should all have fixed the flag).  It still doesn't
change the fact that the host is allocated in the wrong region, but that
shouldn't matter too much.
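
A minimal userspace sketch of that mismatch follows. It is a toy model, not
the scsi.c code: struct cmd_pool, struct host, the two static pools and all
function names are hypothetical, and the only assumption carried over from
the above is that the flag selects which refcounted pool a host uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted command pool. */
struct cmd_pool {
	int users;                /* hosts currently holding a reference */
};

/* Hypothetical host; only the flag that selects the pool is modelled. */
struct host {
	bool unchecked_isa_dma;
};

static struct cmd_pool normal_pool;
static struct cmd_pool dma_pool;

/* Pick the pool from the flag as it is set right now. */
static struct cmd_pool *pool_for(const struct host *h)
{
	assert(h);
	return h->unchecked_isa_dma ? &dma_pool : &normal_pool;
}

static void setup_freelist(struct host *h)
{
	pool_for(h)->users++;
}

static void destroy_freelist(struct host *h)
{
	pool_for(h)->users--;
}

/* True when neither pool has a leaked or negative user count. */
static bool pools_balanced(void)
{
	return normal_pool.users == 0 && dma_pool.users == 0;
}

/* Driver flips the flag after the freelist was set up: counts go wrong. */
static bool flag_changed_after_setup(void)
{
	struct host h = { .unchecked_isa_dma = false };

	normal_pool.users = dma_pool.users = 0;
	setup_freelist(&h);           /* reference taken on normal_pool */
	h.unchecked_isa_dma = true;   /* driver changes its mind */
	destroy_freelist(&h);         /* reference dropped on dma_pool */
	return pools_balanced();
}

/* Freelist set up only once the flag is final: counts stay balanced. */
static bool flag_changed_before_setup(void)
{
	struct host h = { .unchecked_isa_dma = false };

	normal_pool.users = dma_pool.users = 0;
	h.unchecked_isa_dma = true;
	setup_freelist(&h);
	destroy_freelist(&h);
	return pools_balanced();
}
```

In this model flag_changed_after_setup() leaves normal_pool with one user
and dma_pool with minus one, while flag_changed_before_setup() leaves both
at zero, which is the effect of deferring the setup to add time.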

James

---

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 1592640..75af254 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -199,9 +199,13 @@ int scsi_add_host(struct Scsi_Host *shost, struct device *dev)
 	if (!shost->can_queue) {
 		printk(KERN_ERR "%s: can_queue = 0 no longer supported\n",
 				sht->name);
-		goto out;
+		goto fail;
 	}
 
+	error = scsi_setup_command_freelist(shost);
+	if (error)
+		goto fail;
+
 	if (!shost->shost_gendev.parent)
 		shost->shost_gendev.parent = dev ? dev : &platform_bus;
 
@@ -255,6 +259,8 @@ int scsi_add_host(struct Scsi_Host *shost, struct device *dev)
  out_del_gendev:
 	device_del(&shost->shost_gendev);
  out:
+	scsi_destroy_command_freelist(shost);
+ fail:
 	return error;
 }
 EXPORT_SYMBOL(scsi_add_host);
@@ -376,10 +382,6 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	else
 		shost->dma_boundary = 0xffffffff;
 
-	rval = scsi_setup_command_freelist(shost);
-	if (rval)
-		goto fail_kfree;
-
 	device_initialize(&shost->shost_gendev);
 	snprintf(shost->shost_gendev.bus_id, BUS_ID_SIZE, "host%d",
 		shost->host_no);
@@ -395,14 +397,12 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 			"scsi_eh_%d", shost->host_no);
 	if (IS_ERR(shost->ehandler)) {
 		rval = PTR_ERR(shost->ehandler);
-		goto fail_destroy_freelist;
+		goto fail_kfree;
 	}
 
 	scsi_proc_hostdir_add(shost->hostt);
 	return shost;
 
- fail_destroy_freelist:
-	scsi_destroy_command_freelist(shost);
  fail_kfree:
 	kfree(shost);
 	return NULL;




* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-21 15:57     ` James Bottomley
@ 2008-04-22 13:10       ` Ingo Molnar
  2008-04-22 13:39         ` James Bottomley
  0 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2008-04-22 13:10 UTC
  To: James Bottomley; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi


* James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> Thanks ... it looks like we may have trouble from devices that alter 
> the unchecked isa dma flag after scsi_host_alloc.  The guilty parties 
> appear to be gdth, eata, u14-34f, ultrastor, BusLogic and advansys.
> 
> The trouble is that if you alloc the host with it one way and free it 
> with it the other, the wrong freelist is used and the ref counts are 
> invalid.
> 
> Try this pseudo fix: it avoids allocating the freelist until add time 
> (by which time they should all have fixed the flag).  It still doesn't 
> change the fact that the host is allocated in the wrong region, but 
> that shouldn't matter too much.

ok - do you intend to push this pseudo-fix upstream? If yes then please 
consider it fixed as far as i'm concerned - i'll re-reply if the warning 
resurfaces (it wasn't lethal to the bootup otherwise). Or if you've got 
some other approach/fix then i can test that too.

	Ingo


* Re: [bug] SCSI/SLUB - latest -git: WARNING: at mm/slub.c:2443 kmem_cache_destroy, scsi_put_host_cmd_pool()
  2008-04-22 13:10       ` Ingo Molnar
@ 2008-04-22 13:39         ` James Bottomley
  0 siblings, 0 replies; 10+ messages in thread
From: James Bottomley @ 2008-04-22 13:39 UTC
  To: Ingo Molnar; +Cc: linux-kernel, Pekka Enberg, Christoph Lameter, linux-scsi

On Tue, 2008-04-22 at 15:10 +0200, Ingo Molnar wrote:
> * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > Thanks ... it looks like we may have trouble from devices that alter 
> > the unchecked isa dma flag after scsi_host_alloc.  The guilty parties 
> > appear to be gdth, eata, u14-34f, ultrastor, BusLogic and advansys.
> > 
> > The trouble is that if you alloc the host with it one way and free it 
> > with it the other, the wrong freelist is used and the ref counts are 
> > invalid.
> > 
> > Try this pseudo fix: it avoids allocating the freelist until add time 
> > (by which time they should all have fixed the flag).  It still doesn't 
> > change the fact that the host is allocated in the wrong region, but 
> > that shouldn't matter too much.
> 
> ok - do you intend to push this pseudo-fix upstream? If yes then please 
> consider it fixed as far as i'm concerned - i'll re-reply if the warning 
> resurfaces (it wasn't lethal to the bootup otherwise). Or if you've got 
> some other approach/fix then i can test that too.

It's certainly the line of least resistance.  The more correct fix would
be to haul unchecked_isa_dma out of the host and make everything use the
template, so it becomes immutable (as it should be).  However, that's a
lot of work and given that Andi's gunning for unchecked_isa_dma anyway,
probably not worth it.
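
For illustration only, the "use the template" idea could look roughly like
the sketch below. The toy_template and toy_host types and
host_uses_isa_dma() are invented for this example and are not the existing
SCSI structures; the assumption is simply that the host keeps no copy of
the flag and reaches it through a pointer to const.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical template: shared, read-only description of a driver. */
struct toy_template {
	bool unchecked_isa_dma;
};

/* Hypothetical host: no flag of its own, only a const view of the template. */
struct toy_host {
	const struct toy_template *tmpl;
};

/*
 * Every caller derives the flag from the template, so allocation and
 * teardown cannot disagree about which pool the host belongs to.
 */
static bool host_uses_isa_dma(const struct toy_host *h)
{
	assert(h && h->tmpl);
	return h->tmpl->unchecked_isa_dma;
}
```

With this shape a driver has nothing per-host to flip after allocation; an
assignment through h->tmpl would be rejected by the compiler.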

So, unless I think of something better in the next few days, this is the
way I'll fix it.

James
