* sgpool-8 double free
@ 2006-02-19 20:29 Dave Jones
2006-02-19 21:56 ` James Bottomley
From: Dave Jones @ 2006-02-19 20:29 UTC
To: linux-scsi; +Cc: bcollins
A user running a 2.6.16-rc4 kernel reported the following trace to us.
(It has actually been there since at least 2.6.15.)
He can trigger it easily with just a 'modprobe sbp2'.
Whilst it sounds FireWire-specific, the trace doesn't finger
sbp2 at all, but points to scsi_mod.
More info at https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=182005
Dave
Feb 18 22:30:17 fgrbhw01 kernel: sbp2: $Rev: 1306 $ Ben Collins <bcollins@debian.org>
Feb 18 22:30:17 fgrbhw01 kernel: ieee1394: sbp2: Driver forced to serialize I/O (serialize_io=1)
Feb 18 22:30:17 fgrbhw01 kernel: ieee1394: sbp2: Try serialize_io=0 for better performance
Feb 18 22:30:17 fgrbhw01 kernel: scsi2 : SCSI emulation for IEEE-1394 SBP-2 Devices
Feb 18 22:30:17 fgrbhw01 kernel: ieee1394: sbp2: Node 0-00:1023: Using 36byte inquiry workaround
Feb 18 22:30:18 fgrbhw01 kernel: ieee1394: sbp2: Logged into SBP-2 device
Feb 18 22:30:18 fgrbhw01 kernel: Vendor: Initio Model: 0KLAT80 Rev: 2.05
Feb 18 22:30:18 fgrbhw01 kernel: Type: Direct-Access ANSI SCSI revision: 00
Feb 18 22:30:18 fgrbhw01 kernel: SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
Feb 18 22:30:18 fgrbhw01 kernel: slab error in cache_free_debugcheck(): cache `sgpool-8': double free, or memory outside object was overwritten
Feb 18 22:30:18 fgrbhw01 kernel: [<c014d8bf>] cache_free_debugcheck+0xce/0x1b9
[<c01486cb>] mempool_free+0x5f/0x63
Feb 18 22:30:18 fgrbhw01 kernel: [<c014e230>] kmem_cache_free+0x2a/0x5c
[<c01486cb>] mempool_free+0x5f/0x63
Feb 18 22:30:18 fgrbhw01 kernel: [<f8864f65>] scsi_io_completion+0x65/0x3ce
[scsi_mod] [<f8860bb3>] scsi_finish_command+0xb8/0xbd [scsi_mod]
Feb 18 22:30:18 fgrbhw01 kernel: [<f8860ab6>] scsi_softirq+0x109/0x128
[scsi_mod] [<c0127098>] __do_softirq+0x58/0xc2
Feb 18 22:30:18 fgrbhw01 kernel: [<c0105f75>] do_softirq+0x46/0x4e
Feb 18 22:30:18 fgrbhw01 kernel: =======================
Feb 18 22:30:18 fgrbhw01 kernel: [<c0105e9a>] do_IRQ+0x72/0x7b [<c01048fe>]
common_interrupt+0x1a/0x20
Feb 18 22:30:18 fgrbhw01 kernel: [<f88c940b>] ext3_get_block_handle+0x0/0x2a5
[ext3] [<f88c9714>] ext3_get_block+0x64/0x6c [ext3]
Feb 18 22:30:18 fgrbhw01 kernel: [<f88c9f0f>] ext3_bmap+0x0/0x6d [ext3]
[<c0165dec>] generic_block_bmap+0x28/0x35
Feb 18 22:30:18 fgrbhw01 kernel: [<c02f599a>] io_schedule+0x26/0x30
[<c02f5cd3>] out_of_line_wait_on_bit_lock+0x75/0x7d
Feb 18 22:30:18 fgrbhw01 kernel: [<c01631d3>] sync_buffer+0x0/0x33
[<f88c9f75>] ext3_bmap+0x66/0x6d [ext3]
Feb 18 22:30:18 fgrbhw01 kernel: [<f88c96b0>] ext3_get_block+0x0/0x6c [ext3]
[<f88c9f0f>] ext3_bmap+0x0/0x6d [ext3]
Feb 18 22:30:18 fgrbhw01 kernel: [<c0178e14>] bmap+0x23/0x27 [<f88961e9>]
journal_bmap+0x1d/0x64 [jbd]
Feb 18 22:30:18 fgrbhw01 kernel: [<c01347cd>] wake_bit_function+0x0/0x3c
[<c014d9a2>] cache_free_debugcheck+0x1b1/0x1b9
Feb 18 22:30:18 fgrbhw01 kernel: [<f88961bd>] journal_next_log_block+0x74/0x83
[jbd] [<f889623f>] journal_get_descriptor_buffer+0xf/0x8d [jbd]
Feb 18 22:30:19 fgrbhw01 kernel: [<f8893709>]
journal_commit_transaction+0x61c/0xdbf [jbd] [<c02f6269>]
_spin_lock_irqsave+0x9/0xd
Feb 18 22:30:19 fgrbhw01 kernel: [<c012a32b>] try_to_del_timer_sync+0x44/0x4a
[<f88959aa>] kjournald+0xbd/0x20e [jbd]
Feb 18 22:30:19 fgrbhw01 kernel: [<c011d4c9>] schedule_tail+0x36/0x8b
[<f88958e8>] commit_timeout+0x0/0x5 [jbd]
Feb 18 22:30:19 fgrbhw01 kernel: [<c01347a0>] autoremove_wake_function+0x0/0x2d
[<f88958ed>] kjournald+0x0/0x20e [jbd]
Feb 18 22:30:19 fgrbhw01 kernel: [<c01023a9>] kernel_thread_helper+0x5/0xb
Feb 18 22:30:19 fgrbhw01 kernel: f3fa3888: redzone 1: 0x170fc2a5, redzone 2:
0xc01485d0.
* Re: sgpool-8 double free
2006-02-19 20:29 sgpool-8 double free Dave Jones
@ 2006-02-19 21:56 ` James Bottomley
2006-02-19 22:58 ` Stefan Richter
From: James Bottomley @ 2006-02-19 21:56 UTC
To: Dave Jones; +Cc: linux-scsi, bcollins
On Sun, 2006-02-19 at 15:29 -0500, Dave Jones wrote:
> Feb 18 22:30:18 fgrbhw01 kernel: ieee1394: sbp2: Logged into SBP-2 device
> Feb 18 22:30:18 fgrbhw01 kernel: Vendor: Initio Model: 0KLAT80 Rev: 2.05
> Feb 18 22:30:18 fgrbhw01 kernel: Type: Direct-Access ANSI SCSI revision: 00
> Feb 18 22:30:18 fgrbhw01 kernel: SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
> Feb 18 22:30:18 fgrbhw01 kernel: slab error in cache_free_debugcheck(): cache `sgpool-8': double free, or memory outside object was overwritten
> Feb 18 22:30:18 fgrbhw01 kernel: [<c014d8bf>] cache_free_debugcheck+0xce/0x1b9
> [<c01486cb>] mempool_free+0x5f/0x63
> Feb 18 22:30:18 fgrbhw01 kernel: [<c014e230>] kmem_cache_free+0x2a/0x5c
> [<c01486cb>] mempool_free+0x5f/0x63
> Feb 18 22:30:18 fgrbhw01 kernel: [<f8864f65>] scsi_io_completion+0x65/0x3ce
> [scsi_mod] [<f8860bb3>] scsi_finish_command+0xb8/0xbd [scsi_mod]
> Feb 18 22:30:18 fgrbhw01 kernel: [<f8860ab6>] scsi_softirq+0x109/0x128
This is a characteristic trace for double done() on the same SCSI
command.
James
* Re: sgpool-8 double free
2006-02-19 21:56 ` James Bottomley
@ 2006-02-19 22:58 ` Stefan Richter
2006-02-19 23:10 ` Stefan Richter
From: Stefan Richter @ 2006-02-19 22:58 UTC
To: James Bottomley; +Cc: Dave Jones, linux-scsi, bcollins
James Bottomley wrote:
> On Sun, 2006-02-19 at 15:29 -0500, Dave Jones wrote:
>
>>Feb 18 22:30:18 fgrbhw01 kernel: ieee1394: sbp2: Logged into SBP-2 device
>>Feb 18 22:30:18 fgrbhw01 kernel: Vendor: Initio Model: 0KLAT80 Rev: 2.05
>>Feb 18 22:30:18 fgrbhw01 kernel: Type: Direct-Access ANSI SCSI revision: 00
>>Feb 18 22:30:18 fgrbhw01 kernel: SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
>>Feb 18 22:30:18 fgrbhw01 kernel: slab error in cache_free_debugcheck(): cache `sgpool-8': double free, or memory outside object was overwritten
>>Feb 18 22:30:18 fgrbhw01 kernel: [<c014d8bf>] cache_free_debugcheck+0xce/0x1b9
>> [<c01486cb>] mempool_free+0x5f/0x63
>>Feb 18 22:30:18 fgrbhw01 kernel: [<c014e230>] kmem_cache_free+0x2a/0x5c
>>[<c01486cb>] mempool_free+0x5f/0x63
>>Feb 18 22:30:18 fgrbhw01 kernel: [<f8864f65>] scsi_io_completion+0x65/0x3ce
>>[scsi_mod] [<f8860bb3>] scsi_finish_command+0xb8/0xbd [scsi_mod]
>>Feb 18 22:30:18 fgrbhw01 kernel: [<f8860ab6>] scsi_softirq+0x109/0x128
>
>
> This is a characteristic trace for double done() on the same SCSI
> command.
Perhaps. OTOH, maybe there was indeed memory overwritten. SBP-2 targets
write data into the PC's memory anywhere they see fit, without driver
intervention. So IMO it is possible that a buggy target writes outside
of the response data buffer.
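One way to weigh the two explanations is to compare the redzone values at the end of Dave's trace with the slab debug markers. The sketch below assumes the RED_ACTIVE and RED_INACTIVE values from mm/slab.c in kernels of this era; classify_redzone is a hypothetical helper written for illustration, not an existing tool.

```shell
# Slab debug redzone markers, assumed from mm/slab.c of that era.
RED_ACTIVE=0x170fc2a5    # object is currently allocated
RED_INACTIVE=0x5a2cf071  # object is currently free

# classify_redzone is a hypothetical helper, not a kernel or distro tool.
# It lowercases the hex value and compares it with the two known markers.
classify_redzone() {
    case "$(printf '%s' "$1" | tr 'A-F' 'a-f')" in
        "$RED_ACTIVE")   echo "active" ;;
        "$RED_INACTIVE") echo "inactive" ;;
        *)               echo "unknown" ;;
    esac
}

classify_redzone 0x170fc2a5   # redzone 1 from the trace
classify_redzone 0xc01485d0   # redzone 2 from the trace
```

If the reported values are accurate, redzone 1 is still the "active" marker and redzone 2 matches neither marker. A plain double free would be expected to leave both markers inactive, so this is at least consistent with something having written past the end of the object, though it does not prove it.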
I have an Initio based bridge which also claims to implement type
"Direct-Access". After this device receives a mode_sense command, my
system reboots straight away (kernel panic while in interrupt; I
reported it in the thread "TYPE_RBC cache fixes (sbp2.c affected)").
I have not had the time to debug it yet. But at least the very latest
1394 drivers flag the affected device with BLIST_MS_SKIP_PAGE_08, which
keeps sd_mod from sending mode_sense and thus avoids the panic. Patch:
"sbp2: update 36byte inquiry workaround (fix compatibility regression)"
http://www.kernel.org/git/?p=linux/kernel/git/scjody/ieee1394.git;a=commit;h=99496037c6744fd938ffb8ccfc8fc91762322ff8
(The skip-page-08 part was actually added for a different bridge with a
different problem.)
As I can see from the log in the bug report, sbp2's blacklist will also
spring into action for the reporter's device.
The mentioned patch applies to 2.6.16-rc1 or later. Users of 2.6.14 and
2.6.15 may apply it by means of rediffs which I provide:
http://me.in-berlin.de/~s5r6/linux1394/updates/
However instead of patching sbp2, the BLIST_MS_SKIP_PAGE_08 workaround
can certainly also be enabled by feeding it into scsi_mod's dev_flags
parameter or default_dev_flags parameter. I guess the syntax would be
something like
# modprobe scsi_mod dev_flags=Initio:0KLAT80:8192
or
# modprobe scsi_mod default_dev_flags=8192
before sbp2 and scsi_mod are loaded or
# echo 8192 > /sys/module/scsi_mod/parameters/default_dev_flags
when scsi_mod is loaded but before the disk is connected. (Note, I never
tried setting these parameters myself. Correct me if I got it wrong.)
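For reference, the 8192 in these commands is the decimal form of BLIST_MS_SKIP_PAGE_08, which include/scsi/scsi_devinfo.h defines as 0x2000. A minimal sketch of how such an entry could be assembled, assuming the vendor:model:flags form guessed above is right; dev_flags_entry is a hypothetical helper, not part of any kernel tooling.

```shell
# BLIST_MS_SKIP_PAGE_08 as defined in include/scsi/scsi_devinfo.h.
BLIST_MS_SKIP_PAGE_08=0x2000

# dev_flags_entry is a hypothetical helper for illustration only.
# It prints vendor:model:flags with the flag value converted to decimal.
dev_flags_entry() {
    printf '%s:%s:%d\n' "$1" "$2" "$3"
}

dev_flags_entry Initio 0KLAT80 "$BLIST_MS_SKIP_PAGE_08"
```

Several flags could be combined the same way by OR-ing their values before converting to decimal. Whether scsi_mod accepts the resulting string exactly as written is still untested here.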
I will add a pointer to this posting to bugzilla.redhat.com.
--
Stefan Richter
-=====-=-==- --=- =--==
http://arcgraph.de/sr/
* Re: sgpool-8 double free
2006-02-19 22:58 ` Stefan Richter
@ 2006-02-19 23:10 ` Stefan Richter
From: Stefan Richter @ 2006-02-19 23:10 UTC
To: James Bottomley; +Cc: Dave Jones, linux-scsi, bcollins
I wrote:
> James Bottomley wrote:
>> This is a characteristic trace for double done() on the same SCSI
>> command.
>
> Perhaps. OTOH, maybe there was indeed memory overwritten.
PS: I suspect sbp2 may indeed doubly call done() in corner cases:
http://bugzilla.kernel.org/show_bug.cgi?id=5998
However a double done() is extremely unlikely in the case reported by
Dave. AFAICS from the messages at
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=182005 , sbp2 does
not run any of the code paths which lead to alternative routes to
done(), besides the normal command completion. (These routes are
FireWire bus reset handling and SCSI error handling. In theory these
routes cannot doubly call done() either...)
--
Stefan Richter
-=====-=-==- --=- =--==
http://arcgraph.de/sr/