* new kernel oops in recent kernels
@ 2008-03-16 10:49 Giuseppe Sacco
  2008-03-16 20:27 ` Compiler error? [was: Re: new kernel oops in recent kernels] Giuseppe Sacco
  0 siblings, 1 reply; 18+ messages in thread
From: Giuseppe Sacco @ 2008-03-16 10:49 UTC (permalink / raw)
  To: linux-mips

Hi all,
While testing the latest kernels on an SGI O2, I found this new kernel
oops. It has been present (even though the code has changed a bit) since
2.6.22, but the oops attached here was produced with a kernel built from
last night's linux-mips.org git.

I don't know if this is a problem that I should report to this list, or
if I should address a different list. If you have any suggestion, please
let me know.

Thanks a lot,
Giuseppe

CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == 0000000000000000, ra == 0000000000000000
Oops[#1]:
Cpu 0
$ 0   : 0000000000000000 ffffffff9001fce0 ffffffffffffff86 0000000000000028
$ 4   : 980000000fc01140 0000000000000080 0000000000024000 0000000000000000
$ 8   : 980000000fc54700 0000000000000001 0000000000008000 404000130a0808ff
$12   : 0000000000000008 ffffffff801b8db8 0000000000000000 ffffffff803f0000
$16   : 980000000ff2fa70 980000000c417bb8 980000000c417c20 980000000fdeb610
$20   : 000000007fffffff 980000000f9211a0 980000000fc26000 000000007fa51ecd
$24   : 0000000000000000 ffffffff80074290                                  
$28   : 980000000c414000 980000000c417bb0 0000000000400000 0000000000000000
Hi    : 0000000000000000
Lo    : 003d08dbda057200
epc   : 0000000000000000 0x0     Not tainted
ra    : 0000000000000000 0x0
Status: 9001fce3    KX SX UX KERNEL EXL IE 
Cause : 00000008
BadVA : 0000000000000000
PrId  : 00002321 (R5000)
Modules linked in: parport_pc lp parport ipv6 deflate zlib_deflate ctr twofish twofish_common camellia serpent blowfish des_generic cbc aes_generic xcbc sha256_generic sha1_generic crypto_null crypto_blkcipher dm_snapshot dm_mirror dm_mod ehci_hcd ohci_hcd r8169 usbcore sg evdev
Process hald-probe-stor (pid: 1937, threadinfo=980000000c414000, task=980000000ebf47d8)
Stack : 980000000c417be0 980000000c417de0 0800000000000000 980000000c417bb0
        00000008ffffff86 0000000000000000 0200000000000001 000006d600000000
        0000000000000000 980000000c417de0 980000000fdeb610 0000000000000001
        0000000000005326 ffffffff802460b0 0000000070023a00 000000000f9211a0
        ffffffff80490000 ffffffff8024bb84 980000000fc10e80 980000000f80bb28
        0000000000000000 980000000f9210e0 0000010100000001 00000000800d1618
        0000000000000004 980000000fc8f850 000000007fffffff 980000000fde4000
        0000000000005326 000000007fffffff 980000000c407540 980000000f9211a0
        980000000fc26000 000000007fa51ecd ffffffff80245c6c 980000000c407540
        0000000000000000 fffffffffffffdfd 0000000000005326 ffffffff801ad8bc
        ...
Call Trace:
[<ffffffff802460b0>] sr_drive_status+0x50/0xe8
[<ffffffff8024bb84>] cdrom_ioctl+0x5f4/0x1208
[<ffffffff80245c6c>] sr_block_ioctl+0x64/0xe8
[<ffffffff801ad8bc>] compat_blkdev_ioctl+0x7cc/0x18e0
[<ffffffff800d1870>] do_open+0x98/0x310
[<ffffffff800d1d60>] blkdev_open+0x0/0xc0
[<ffffffff800d1da8>] blkdev_open+0x48/0xc0
[<ffffffff8009c444>] __dentry_open+0x114/0x2e0
[<ffffffff8009c740>] do_filp_open+0x48/0x58
[<ffffffff8009c740>] do_filp_open+0x48/0x58
[<ffffffff800def8c>] compat_sys_ioctl+0xf4/0x440
[<ffffffff80019154>] handle_sys+0x114/0x130
[<ffffffff8001fcf3>] fpu_emulator_cop1Handler+0x362/0x2270


Code: (Bad address in epc)
 


* new kernel oops in recent kernels
@ 2008-03-16 15:19 Giuseppe Sacco
  2008-03-16 16:39 ` James Bottomley
  2008-03-16 16:42 ` Matthew Wilcox
  0 siblings, 2 replies; 18+ messages in thread
From: Giuseppe Sacco @ 2008-03-16 15:19 UTC (permalink / raw)
  To: linux-scsi

Hi all,
While testing the latest kernels on an SGI O2, I found this new kernel
oops. It was produced with a kernel built from last night's
linux-mips.org git. A very similar oops has been reported by others[0]
using 2.6.22.

As you can see, the oops happens while booting the machine, when init
runs all the scripts via rc. One of those scripts runs
hald-probe-storage, the process that actually triggers the oops.

I am not able to identify the cause or to propose a solution, but I am
willing to test any patch for this problem.

Thanks a lot,
Giuseppe

CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == 0000000000000000, ra == 0000000000000000
Oops[#1]:
Cpu 0
$ 0   : 0000000000000000 ffffffff9001fce0 ffffffffffffff86 0000000000000028
$ 4   : 980000000fc01140 0000000000000080 0000000000024000 0000000000000000
$ 8   : 980000000fc54700 0000000000000001 0000000000008000 404000130a0808ff
$12   : 0000000000000008 ffffffff801b8db8 0000000000000000 ffffffff803f0000
$16   : 980000000ff2fa70 980000000c417bb8 980000000c417c20 980000000fdeb610
$20   : 000000007fffffff 980000000f9211a0 980000000fc26000 000000007fa51ecd
$24   : 0000000000000000 ffffffff80074290                                  
$28   : 980000000c414000 980000000c417bb0 0000000000400000 0000000000000000
Hi    : 0000000000000000
Lo    : 003d08dbda057200
epc   : 0000000000000000 0x0     Not tainted
ra    : 0000000000000000 0x0
Status: 9001fce3    KX SX UX KERNEL EXL IE 
Cause : 00000008
BadVA : 0000000000000000
PrId  : 00002321 (R5000)
Modules linked in: parport_pc lp parport ipv6 deflate zlib_deflate ctr twofish twofish_common camellia serpent blowfish des_generic cbc aes_generic xcbc sha256_generic sha1_generic crypto_null crypto_blkcipher dm_snapshot dm_mirror dm_mod ehci_hcd ohci_hcd r8169 usbcore sg evdev
Process hald-probe-stor (pid: 1937, threadinfo=980000000c414000, task=980000000ebf47d8)
Stack : 980000000c417be0 980000000c417de0 0800000000000000 980000000c417bb0
        00000008ffffff86 0000000000000000 0200000000000001 000006d600000000
        0000000000000000 980000000c417de0 980000000fdeb610 0000000000000001
        0000000000005326 ffffffff802460b0 0000000070023a00 000000000f9211a0
        ffffffff80490000 ffffffff8024bb84 980000000fc10e80 980000000f80bb28
        0000000000000000 980000000f9210e0 0000010100000001 00000000800d1618
        0000000000000004 980000000fc8f850 000000007fffffff 980000000fde4000
        0000000000005326 000000007fffffff 980000000c407540 980000000f9211a0
        980000000fc26000 000000007fa51ecd ffffffff80245c6c 980000000c407540
        0000000000000000 fffffffffffffdfd 0000000000005326 ffffffff801ad8bc
        ...
Call Trace:
[<ffffffff802460b0>] sr_drive_status+0x50/0xe8
[<ffffffff8024bb84>] cdrom_ioctl+0x5f4/0x1208
[<ffffffff80245c6c>] sr_block_ioctl+0x64/0xe8
[<ffffffff801ad8bc>] compat_blkdev_ioctl+0x7cc/0x18e0
[<ffffffff800d1870>] do_open+0x98/0x310
[<ffffffff800d1d60>] blkdev_open+0x0/0xc0
[<ffffffff800d1da8>] blkdev_open+0x48/0xc0
[<ffffffff8009c444>] __dentry_open+0x114/0x2e0
[<ffffffff8009c740>] do_filp_open+0x48/0x58
[<ffffffff8009c740>] do_filp_open+0x48/0x58
[<ffffffff800def8c>] compat_sys_ioctl+0xf4/0x440
[<ffffffff80019154>] handle_sys+0x114/0x130
[<ffffffff8001fcf3>] fpu_emulator_cop1Handler+0x362/0x2270


Code: (Bad address in epc)

[0]http://lists.debian.org/debian-mips/2008/03/msg00082.html



* Re: new kernel oops in recent kernels
  2008-03-16 15:19 new kernel oops in recent kernels Giuseppe Sacco
@ 2008-03-16 16:39 ` James Bottomley
  2008-03-16 18:32   ` Giuseppe Sacco
  2008-03-16 16:42 ` Matthew Wilcox
  1 sibling, 1 reply; 18+ messages in thread
From: James Bottomley @ 2008-03-16 16:39 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: linux-scsi

On Sun, 2008-03-16 at 16:19 +0100, Giuseppe Sacco wrote:
> Hi all,
> While testing the latest kernels on an SGI O2, I found this new kernel
> oops. It was produced with a kernel built from last night's
> linux-mips.org git. A very similar oops has been reported by others[0]
> using 2.6.22.
> 
> As you can see, the oops happens while booting the machine, when init
> runs all the scripts via rc. One of those scripts runs
> hald-probe-storage, the process that actually triggers the oops.
> 
> I am not able to identify the cause or to propose a solution, but I am
> willing to test any patch for this problem.
> 
> Thanks a lot,
> Giuseppe
> 
> CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == 0000000000000000, ra == 0000000000000000
> Oops[#1]:
> Cpu 0
> $ 0   : 0000000000000000 ffffffff9001fce0 ffffffffffffff86 0000000000000028
> $ 4   : 980000000fc01140 0000000000000080 0000000000024000 0000000000000000
> $ 8   : 980000000fc54700 0000000000000001 0000000000008000 404000130a0808ff
> $12   : 0000000000000008 ffffffff801b8db8 0000000000000000 ffffffff803f0000
> $16   : 980000000ff2fa70 980000000c417bb8 980000000c417c20 980000000fdeb610
> $20   : 000000007fffffff 980000000f9211a0 980000000fc26000 000000007fa51ecd
> $24   : 0000000000000000 ffffffff80074290                                  
> $28   : 980000000c414000 980000000c417bb0 0000000000400000 0000000000000000
> Hi    : 0000000000000000
> Lo    : 003d08dbda057200
> epc   : 0000000000000000 0x0     Not tainted
> ra    : 0000000000000000 0x0
> Status: 9001fce3    KX SX UX KERNEL EXL IE 
> Cause : 00000008
> BadVA : 0000000000000000
> PrId  : 00002321 (R5000)
> Modules linked in: parport_pc lp parport ipv6 deflate zlib_deflate ctr twofish twofish_common camellia serpent blowfish des_generic cbc aes_generic xcbc sha256_generic sha1_generic crypto_null crypto_blkcipher dm_snapshot dm_mirror dm_mod ehci_hcd ohci_hcd r8169 usbcore sg evdev
> Process hald-probe-stor (pid: 1937, threadinfo=980000000c414000, task=980000000ebf47d8)
> Stack : 980000000c417be0 980000000c417de0 0800000000000000 980000000c417bb0
>         00000008ffffff86 0000000000000000 0200000000000001 000006d600000000
>         0000000000000000 980000000c417de0 980000000fdeb610 0000000000000001
>         0000000000005326 ffffffff802460b0 0000000070023a00 000000000f9211a0
>         ffffffff80490000 ffffffff8024bb84 980000000fc10e80 980000000f80bb28
>         0000000000000000 980000000f9210e0 0000010100000001 00000000800d1618
>         0000000000000004 980000000fc8f850 000000007fffffff 980000000fde4000
>         0000000000005326 000000007fffffff 980000000c407540 980000000f9211a0
>         980000000fc26000 000000007fa51ecd ffffffff80245c6c 980000000c407540
>         0000000000000000 fffffffffffffdfd 0000000000005326 ffffffff801ad8bc
>         ...
> Call Trace:
> [<ffffffff802460b0>] sr_drive_status+0x50/0xe8
> [<ffffffff8024bb84>] cdrom_ioctl+0x5f4/0x1208
> [<ffffffff80245c6c>] sr_block_ioctl+0x64/0xe8
> [<ffffffff801ad8bc>] compat_blkdev_ioctl+0x7cc/0x18e0
> [<ffffffff800d1870>] do_open+0x98/0x310
> [<ffffffff800d1d60>] blkdev_open+0x0/0xc0
> [<ffffffff800d1da8>] blkdev_open+0x48/0xc0
> [<ffffffff8009c444>] __dentry_open+0x114/0x2e0
> [<ffffffff8009c740>] do_filp_open+0x48/0x58
> [<ffffffff8009c740>] do_filp_open+0x48/0x58
> [<ffffffff800def8c>] compat_sys_ioctl+0xf4/0x440
> [<ffffffff80019154>] handle_sys+0x114/0x130
> [<ffffffff8001fcf3>] fpu_emulator_cop1Handler+0x362/0x2270

This is a bit strange.  It's obviously O2 specific, which makes it a lot
harder.  Can you compile the kernel with CONFIG_DEBUG_INFO and reproduce
(just in case this changes the symbol layout).  Then ask gdb where
sr_drive_status+0x50 (or what it moves to) is:

gdb vmlinux
b *(sr_drive_status+0x50)

should identify the file and line.

The signature implies that cdi->handle may be NULL, so you could put in
a check for that as well.
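
As a rough illustration of such a check, here is a minimal user-space sketch in C. The struct is a hypothetical stand-in for struct cdrom_device_info that models only the handle field, and the function name is made up; in the driver itself the test would go at the top of sr_drive_status(), before cdi->handle is dereferenced.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cdrom_device_info (only "handle" is modelled). */
struct mock_cdrom_device_info {
	void *handle;
};

/*
 * Return 0 when it is safe to dereference cdi->handle,
 * or -EINVAL when either pointer is NULL.
 */
int mock_check_handle(const struct mock_cdrom_device_info *cdi)
{
	if (cdi == NULL || cdi->handle == NULL)
		return -EINVAL;
	return 0;
}

/* Small self-check showing the intended behaviour. */
void mock_check_handle_selfcheck(void)
{
	struct mock_cdrom_device_info no_handle = { NULL };

	assert(mock_check_handle(&no_handle) == -EINVAL);
	assert(mock_check_handle(NULL) == -EINVAL);
}
```

If the check fires on the real machine, the NULL handle is confirmed; if it never fires, the bad pointer must come from somewhere else.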

Thanks,

James




* Re: new kernel oops in recent kernels
  2008-03-16 15:19 new kernel oops in recent kernels Giuseppe Sacco
  2008-03-16 16:39 ` James Bottomley
@ 2008-03-16 16:42 ` Matthew Wilcox
  2008-03-16 18:29   ` Giuseppe Sacco
  1 sibling, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2008-03-16 16:42 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: linux-scsi

On Sun, Mar 16, 2008 at 04:19:08PM +0100, Giuseppe Sacco wrote:
> While testing the latest kernels on an SGI O2, I found this new kernel
> oops. It was produced with a kernel built from last night's
> linux-mips.org git. A very similar oops has been reported by others[0]
> using 2.6.22.

> CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == 0000000000000000, ra == 0000000000000000

I'm not familiar with MIPS; is epc the program counter?  If so, this
would be a branch to 0.  That's somewhat confusing as I don't see any
function pointers used within sr_drive_status().  How accurate are MIPS
backtraces?
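
For reference, a minimal C sketch of decoding the Cause value shown in the oops. It assumes the standard MIPS CP0 layout, in which ExcCode occupies bits 6..2 of Cause; the function names are made up for illustration. Cause 00000008 decodes to ExcCode 2 (TLBL, a TLB miss on a load or instruction fetch), which would be consistent with the CPU trying to fetch an instruction from epc == 0.

```c
#include <assert.h>
#include <stdint.h>

/* ExcCode lives in bits 6..2 of the MIPS CP0 Cause register. */
unsigned int cause_exccode(uint32_t cause)
{
	return (cause >> 2) & 0x1f;
}

/* Names for the first few exception codes; everything else is "other". */
const char *exccode_name(unsigned int code)
{
	switch (code) {
	case 0: return "Int (interrupt)";
	case 1: return "Mod (TLB modification)";
	case 2: return "TLBL (TLB miss on load or instruction fetch)";
	case 3: return "TLBS (TLB miss on store)";
	case 4: return "AdEL (address error on load or fetch)";
	case 5: return "AdES (address error on store)";
	default: return "other";
	}
}

/* Worked example with the Cause value from the oops. */
void oops_cause_selfcheck(void)
{
	assert(cause_exccode(0x00000008u) == 2u);
}
```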

> Call Trace:
> [<ffffffff802460b0>] sr_drive_status+0x50/0xe8
> [<ffffffff8024bb84>] cdrom_ioctl+0x5f4/0x1208
> [<ffffffff80245c6c>] sr_block_ioctl+0x64/0xe8

It would be interesting to see a disassembly (objdump -dr
drivers/scsi/sr_ioctl.o) of sr_drive_status from say 0x40 to 0x60.

And if that calls a function, it would be interesting to put in printks
to figure out where we're dereferencing a null pointer.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: new kernel oops in recent kernels
  2008-03-16 16:42 ` Matthew Wilcox
@ 2008-03-16 18:29   ` Giuseppe Sacco
  2008-03-17  3:58     ` Matthew Wilcox
  2008-03-17  4:41     ` Matthew Wilcox
  0 siblings, 2 replies; 18+ messages in thread
From: Giuseppe Sacco @ 2008-03-16 18:29 UTC (permalink / raw)
  To: linux-scsi

Hi all,

On Sun, 16/03/2008 at 10.42 -0600, Matthew Wilcox wrote:
> On Sun, Mar 16, 2008 at 04:19:08PM +0100, Giuseppe Sacco wrote:
[...]
> > Call Trace:
> > [<ffffffff802460b0>] sr_drive_status+0x50/0xe8
> > [<ffffffff8024bb84>] cdrom_ioctl+0x5f4/0x1208
> > [<ffffffff80245c6c>] sr_block_ioctl+0x64/0xe8
> 
> It would be interesting to see a disassembly (objdump -dr
> drivers/scsi/sr_ioctl.o) of sr_drive_status from say 0x40 to 0x60.

here it is:

(gdb) disassemble sr_drive_status+0x50
Dump of assembler code for function sr_drive_status:
0xffffffff80246060 <sr_drive_status+0>:	daddiu	sp,sp,-32
0xffffffff80246064 <sr_drive_status+4>:	lui	v0,0x7fff
0xffffffff80246068 <sr_drive_status+8>:	sd	s0,16(sp)
0xffffffff8024606c <sr_drive_status+12>:	sd	ra,24(sp)
0xffffffff80246070 <sr_drive_status+16>:	ori	v0,v0,0xffff
0xffffffff80246074 <sr_drive_status+20>:	move	s0,a0
0xffffffff80246078 <sr_drive_status+24>:	bne	a1,v0,0xffffffff802460e8 <sr_drive_status+136>
0xffffffff8024607c <sr_drive_status+28>:	ld	v1,24(a0)
0xffffffff80246080 <sr_drive_status+32>:	ld	a0,16(v1)
0xffffffff80246084 <sr_drive_status+36>:	jal	0xffffffff80244c70 <sr_test_unit_ready>
0xffffffff80246088 <sr_drive_status+40>:	daddiu	a1,sp,4
0xffffffff8024608c <sr_drive_status+44>:	bnez	v0,0xffffffff802460a8 <sr_drive_status+72>
0xffffffff80246090 <sr_drive_status+48>:	move	a0,s0
0xffffffff80246094 <sr_drive_status+52>:	li	v0,4
0xffffffff80246098 <sr_drive_status+56>:	ld	ra,24(sp)
0xffffffff8024609c <sr_drive_status+60>:	ld	s0,16(sp)
0xffffffff802460a0 <sr_drive_status+64>:	jr	ra
0xffffffff802460a4 <sr_drive_status+68>:	daddiu	sp,sp,32
0xffffffff802460a8 <sr_drive_status+72>:	jal	0xffffffff8024c838 <cdrom_get_media_event>
0xffffffff802460ac <sr_drive_status+76>:	move	a1,sp
0xffffffff802460b0 <sr_drive_status+80>:	bnez	v0,0xffffffff802460fc <sr_drive_status+156>
0xffffffff802460b4 <sr_drive_status+84>:	lhu	v0,0(sp)
0xffffffff802460b8 <sr_drive_status+88>:	sll	v0,v0,0x0
0xffffffff802460bc <sr_drive_status+92>:	andi	v0,v0,0xff
0xffffffff802460c0 <sr_drive_status+96>:	andi	v1,v0,0x2
0xffffffff802460c4 <sr_drive_status+100>:	bnez	v1,0xffffffff80246094 <sr_drive_status+52>
0xffffffff802460c8 <sr_drive_status+104>:	andi	v0,v0,0x1
0xffffffff802460cc <sr_drive_status+108>:	beqz	v0,0xffffffff80246098 <sr_drive_status+56>
0xffffffff802460d0 <sr_drive_status+112>:	li	v0,1
0xffffffff802460d4 <sr_drive_status+116>:	ld	ra,24(sp)

> And if that calls a function, it would be interesting to put in printks
> to figure out where we're dereferencing a null pointer.
> 



* Re: new kernel oops in recent kernels
  2008-03-16 16:39 ` James Bottomley
@ 2008-03-16 18:32   ` Giuseppe Sacco
  2008-03-16 18:47     ` James Bottomley
  0 siblings, 1 reply; 18+ messages in thread
From: Giuseppe Sacco @ 2008-03-16 18:32 UTC (permalink / raw)
  To: linux-scsi

Hi James,

On Sun, 16/03/2008 at 11.39 -0500, James Bottomley wrote:
> On Sun, 2008-03-16 at 16:19 +0100, Giuseppe Sacco wrote:
[...]
> This is a bit strange.  It's obviously O2 specific, which makes it a lot
> harder.  Can you compile the kernel with CONFIG_DEBUG_INFO and reproduce
> (just in case this changes the symbol layout).  Then ask gdb where
[...]

I cannot find any CONFIG_DEBUG_INFO. Do you mean CONFIG_DEBUG_KERNEL?

Thanks,
Giuseppe



* Re: new kernel oops in recent kernels
  2008-03-16 18:32   ` Giuseppe Sacco
@ 2008-03-16 18:47     ` James Bottomley
  0 siblings, 0 replies; 18+ messages in thread
From: James Bottomley @ 2008-03-16 18:47 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: linux-scsi

On Sun, 2008-03-16 at 19:32 +0100, Giuseppe Sacco wrote:
> Hi James,
> 
> On Sun, 16/03/2008 at 11.39 -0500, James Bottomley wrote:
> > On Sun, 2008-03-16 at 16:19 +0100, Giuseppe Sacco wrote:
> [...]
> > This is a bit strange.  It's obviously O2 specific, which makes it a lot
> > harder.  Can you compile the kernel with CONFIG_DEBUG_INFO and reproduce
> > (just in case this changes the symbol layout).  Then ask gdb where
> [...]
> 
> I cannot find any CONFIG_DEBUG_INFO. Do you mean CONFIG_DEBUG_KERNEL?

This is from lib/Kconfig.debug:

config DEBUG_INFO
        bool "Compile the kernel with debug info"
        depends on DEBUG_KERNEL
        help
          If you say Y here the resulting kernel image will include
          debugging info resulting in a larger kernel image.
          This adds debug symbols to the kernel and modules (gcc -g), and
          is needed if you intend to use kernel crashdump or binary object
          tools like crash, kgdb, LKCD, gdb, etc on the kernel.
          Say Y here only if you plan to debug the kernel.

          If unsure, say N.

It does depend on CONFIG_DEBUG_KERNEL according to the depends clause.

James




* Compiler error? [was: Re: new kernel oops in recent kernels]
  2008-03-16 10:49 new kernel oops in recent kernels Giuseppe Sacco
@ 2008-03-16 20:27 ` Giuseppe Sacco
  2008-03-16 23:36   ` Thomas Bogendoerfer
  0 siblings, 1 reply; 18+ messages in thread
From: Giuseppe Sacco @ 2008-03-16 20:27 UTC (permalink / raw)
  To: linux-mips

Hi all,
the oops I reported earlier today may be related to a problem in the
GNU C compiler, but I do not know MIPS assembly, so I am asking for help.

Call Trace of the original oops:

> [<ffffffff802460b0>] sr_drive_status+0x50/0xe8
> [<ffffffff8024bb84>] cdrom_ioctl+0x5f4/0x1208
> [<ffffffff80245c6c>] sr_block_ioctl+0x64/0xe8
> [<ffffffff801ad8bc>] compat_blkdev_ioctl+0x7cc/0x18e0
> [<ffffffff800d1870>] do_open+0x98/0x310
> [<ffffffff800d1d60>] blkdev_open+0x0/0xc0
> [<ffffffff800d1da8>] blkdev_open+0x48/0xc0
> [<ffffffff8009c444>] __dentry_open+0x114/0x2e0
> [<ffffffff8009c740>] do_filp_open+0x48/0x58
> [<ffffffff8009c740>] do_filp_open+0x48/0x58
> [<ffffffff800def8c>] compat_sys_ioctl+0xf4/0x440
> [<ffffffff80019154>] handle_sys+0x114/0x130
> [<ffffffff8001fcf3>] fpu_emulator_cop1Handler+0x362/0x2270

sr_drive_status+0x50 is, in decimal, sr_drive_status+80.
gdb disassembles the code as follows:

(gdb) disassemble sr_drive_status+0x50
Dump of assembler code for function sr_drive_status:
0xffffffff80246060 <sr_drive_status+0>: daddiu  sp,sp,-32
0xffffffff80246064 <sr_drive_status+4>: lui     v0,0x7fff
0xffffffff80246068 <sr_drive_status+8>: sd      s0,16(sp)
0xffffffff8024606c <sr_drive_status+12>:        sd      ra,24(sp)
0xffffffff80246070 <sr_drive_status+16>:        ori     v0,v0,0xffff
0xffffffff80246074 <sr_drive_status+20>:        move    s0,a0
0xffffffff80246078 <sr_drive_status+24>:        bne     a1,v0,0xffffffff802460e8 <sr_drive_status+136>
0xffffffff8024607c <sr_drive_status+28>:        ld      v1,24(a0)
0xffffffff80246080 <sr_drive_status+32>:        ld      a0,16(v1)
0xffffffff80246084 <sr_drive_status+36>:        jal     0xffffffff80244c70 <sr_test_unit_ready>
0xffffffff80246088 <sr_drive_status+40>:        daddiu  a1,sp,4
0xffffffff8024608c <sr_drive_status+44>:        bnez    v0,0xffffffff802460a8 <sr_drive_status+72>
0xffffffff80246090 <sr_drive_status+48>:        move    a0,s0
0xffffffff80246094 <sr_drive_status+52>:        li      v0,4
0xffffffff80246098 <sr_drive_status+56>:        ld      ra,24(sp)
0xffffffff8024609c <sr_drive_status+60>:        ld      s0,16(sp)
0xffffffff802460a0 <sr_drive_status+64>:        jr      ra
0xffffffff802460a4 <sr_drive_status+68>:        daddiu  sp,sp,32
0xffffffff802460a8 <sr_drive_status+72>:        jal     0xffffffff8024c838 <cdrom_get_media_event>
0xffffffff802460ac <sr_drive_status+76>:        move    a1,sp
0xffffffff802460b0 <sr_drive_status+80>:        bnez    v0,0xffffffff802460fc <sr_drive_status+156>
0xffffffff802460b4 <sr_drive_status+84>:        lhu     v0,0(sp)
0xffffffff802460b8 <sr_drive_status+88>:        sll     v0,v0,0x0
0xffffffff802460bc <sr_drive_status+92>:        andi    v0,v0,0xff
0xffffffff802460c0 <sr_drive_status+96>:        andi    v1,v0,0x2
0xffffffff802460c4 <sr_drive_status+100>:       bnez    v1,0xffffffff80246094 <sr_drive_status+52>
0xffffffff802460c8 <sr_drive_status+104>:       andi    v0,v0,0x1
0xffffffff802460cc <sr_drive_status+108>:       beqz    v0,0xffffffff80246098 <sr_drive_status+56>
0xffffffff802460d0 <sr_drive_status+112>:       li      v0,1
0xffffffff802460d4 <sr_drive_status+116>:       ld      ra,24(sp)

Then I changed the code in sr_drive_status, adding the printk line
shown below:

int sr_drive_status(struct cdrom_device_info *cdi, int slot)
{
        struct scsi_cd *cd = cdi->handle;
        struct scsi_sense_hdr sshdr;
        struct media_event_desc med;

        if (CDSL_CURRENT != slot) {
                /* we have no changer support */
                return -EINVAL;
        }
        if (0 == sr_test_unit_ready(cd->device, &sshdr))
                return CDS_DISC_OK;

printk(KERN_INFO "sr_drive_status() cdi=0x%p, cd=0x%p\n", cdi, cd);

        if (!cdrom_get_media_event(cdi, &med)) {
                if (med.media_present)
                        return CDS_DISC_OK;
[...]

and now I can no longer reproduce the oops.

The new assembly code is:

0xffffffff80246060 <sr_drive_status+0>: daddiu  sp,sp,-48
0xffffffff80246064 <sr_drive_status+4>: lui     v0,0x7fff
0xffffffff80246068 <sr_drive_status+8>: sd      s0,16(sp)
0xffffffff8024606c <sr_drive_status+12>:        sd      ra,32(sp)
0xffffffff80246070 <sr_drive_status+16>:        sd      s1,24(sp)
0xffffffff80246074 <sr_drive_status+20>:        ori     v0,v0,0xffff
0xffffffff80246078 <sr_drive_status+24>:        move    s0,a0
0xffffffff8024607c <sr_drive_status+28>:        bne     a1,v0,0xffffffff80246108 <sr_drive_status+168>
0xffffffff80246080 <sr_drive_status+32>:        ld      s1,24(a0)
0xffffffff80246084 <sr_drive_status+36>:        ld      a0,16(s1)
0xffffffff80246088 <sr_drive_status+40>:        jal     0xffffffff80244c70 <sr_test_unit_ready>
0xffffffff8024608c <sr_drive_status+44>:        daddiu  a1,sp,4
0xffffffff80246090 <sr_drive_status+48>:        bnez    v0,0xffffffff802460b0 <sr_drive_status+80>
0xffffffff80246094 <sr_drive_status+52>:        lui     a0,0x803c
0xffffffff80246098 <sr_drive_status+56>:        li      v0,4
0xffffffff8024609c <sr_drive_status+60>:        ld      ra,32(sp)
0xffffffff802460a0 <sr_drive_status+64>:        ld      s1,24(sp)
0xffffffff802460a4 <sr_drive_status+68>:        ld      s0,16(sp)
0xffffffff802460a8 <sr_drive_status+72>:        jr      ra
0xffffffff802460ac <sr_drive_status+76>:        daddiu  sp,sp,48
0xffffffff802460b0 <sr_drive_status+80>:        daddiu  a0,a0,-4560
0xffffffff802460b4 <sr_drive_status+84>:        move    a1,s0
0xffffffff802460b8 <sr_drive_status+88>:        jal     0xffffffff80032ba8 <printk>
0xffffffff802460bc <sr_drive_status+92>:        move    a2,s1
0xffffffff802460c0 <sr_drive_status+96>:        move    a0,s0
0xffffffff802460c4 <sr_drive_status+100>:       jal     0xffffffff8024c858 <cdrom_get_media_event>
0xffffffff802460c8 <sr_drive_status+104>:       move    a1,sp
0xffffffff802460cc <sr_drive_status+108>:       bnez    v0,0xffffffff80246120 <sr_drive_status+192>
0xffffffff802460d0 <sr_drive_status+112>:       lhu     v0,0(sp)

The gcc I am using is version 4.1.2. Any help is really appreciated.

Thanks,
Giuseppe


* Re: Compiler error? [was: Re: new kernel oops in recent kernels]
  2008-03-16 20:27 ` Compiler error? [was: Re: new kernel oops in recent kernels] Giuseppe Sacco
@ 2008-03-16 23:36   ` Thomas Bogendoerfer
  2008-03-17  8:05     ` Giuseppe Sacco
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Bogendoerfer @ 2008-03-16 23:36 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: linux-mips

On Sun, Mar 16, 2008 at 09:27:37PM +0100, Giuseppe Sacco wrote:
> The gcc I am using is version 4.1.2. Any help is really appreciated.

4.2.1 generates nearly the same (reasonable) code. The major difference
between the version with printk and the version without is the size
of the local stack. I guess this prevents killing of *cd.
Could you try the hack below and tell me if it helps? This hack
ensures that the buffer given to the scsi driver is one cache
line big (at least on R5k O2s). If this helps, there are more
places to fix for non-coherent machines...

Thomas.

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 12f5bae..acb98a8 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -482,7 +482,7 @@ int cdrom_get_media_event(struct cdrom_device_info *cdi,
 			  struct media_event_desc *med)
 {
 	struct packet_command cgc;
-	unsigned char buffer[8];
+	unsigned char buffer[32];
 	struct event_header *eh = (struct event_header *) buffer;
 
 	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
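
To illustrate why the buffer size matters on a machine without coherent DMA, here is a minimal sketch in C. It assumes a 32-byte cache line (the size implied above for R5k O2s) and that the cache is invalidated in whole lines around a DMA buffer; the helper names are hypothetical. The function counts how many bytes outside a buffer live in the same cache lines as the buffer, and so could be clobbered when those lines are invalidated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes (32 on the R5k O2 discussed here). */
#define LINE_SIZE 32u

/* Round an address down to the start of its cache line. */
uintptr_t line_start(uintptr_t addr)
{
	return addr & ~(uintptr_t)(LINE_SIZE - 1);
}

/*
 * Number of bytes outside [addr, addr + len) that share a cache line
 * with the buffer. Zero means the buffer owns its lines completely.
 */
size_t bytes_sharing_lines(uintptr_t addr, size_t len)
{
	uintptr_t first, end;

	assert(len > 0);
	first = line_start(addr);
	end = line_start(addr + len - 1) + LINE_SIZE;
	return (size_t)(end - first) - len;
}
```

With these assumptions, an 8-byte buffer at offset 8 within a line shares that line with 24 neighbouring bytes, which on a stack could include saved registers. A 32-byte buffer avoids sharing only if it also starts on a line boundary, which a stack array is not guaranteed to do, so the hack is better read as a test than as a fix.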




-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea.                                                [ RFC1925, 2.3 ]


* Re: new kernel oops in recent kernels
  2008-03-16 18:29   ` Giuseppe Sacco
@ 2008-03-17  3:58     ` Matthew Wilcox
  2008-03-17  4:41     ` Matthew Wilcox
  1 sibling, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2008-03-17  3:58 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: linux-scsi

On Sun, Mar 16, 2008 at 07:29:07PM +0100, Giuseppe Sacco wrote:
> > It would be interesting to see a disassembly (objdump -dr
> > drivers/scsi/sr_ioctl.o) of sr_drive_status from say 0x40 to 0x60.
> 
> here it is:
> 
> (gdb) disassemble sr_drive_status+0x50
> 0xffffffff802460b0 <sr_drive_status+80>:	bnez	v0,0xffffffff802460fc <sr_drive_status+156>

The thing about objdump -dr is that it gives you the name of the
function that's being called.  gdb apparently doesn't, or would need a
different command from "disassemble".

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: new kernel oops in recent kernels
  2008-03-16 18:29   ` Giuseppe Sacco
  2008-03-17  3:58     ` Matthew Wilcox
@ 2008-03-17  4:41     ` Matthew Wilcox
  2008-03-17  8:17       ` Giuseppe Sacco
  1 sibling, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2008-03-17  4:41 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: linux-scsi

On Sun, Mar 16, 2008 at 07:29:07PM +0100, Giuseppe Sacco wrote:
> > > [<ffffffff802460b0>] sr_drive_status+0x50/0xe8
> > > [<ffffffff8024bb84>] cdrom_ioctl+0x5f4/0x1208
> > > [<ffffffff80245c6c>] sr_block_ioctl+0x64/0xe8
> > 

> 0xffffffff802460a4 <sr_drive_status+68>:	daddiu	sp,sp,32
> 0xffffffff802460a8 <sr_drive_status+72>:	jal	0xffffffff8024c838 <cdrom_get_media_event>
> 0xffffffff802460ac <sr_drive_status+76>:	move	a1,sp
> 0xffffffff802460b0 <sr_drive_status+80>:	bnez	v0,0xffffffff802460fc <sr_drive_status+156>

I think I was confused earlier.  156 is 0x9c, thus within the function.
The backtrace must be incorrect; this is really 0x48 and thus a call to
cdrom_get_media_event, which points the finger at
cdi->ops->generic_packet being NULL.

Put a BUG_ON(!cdi->ops->generic_packet) in drivers/cdrom/cdrom.c right
before the line that calls it (ie line 11 of cdrom_get_media_event).
That should trigger and give a better backtrace.  Then it's a simple (*)
matter of figuring out why it's NULL.

* This is sarcasm.
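
One way to reconcile the trace entry with the disassembly, sketched in C under the assumption that the trace records saved return addresses: on MIPS, jal stores the address of the instruction following the branch delay slot in ra, that is, the call site plus 8. The helper names below are hypothetical; the addresses are the ones from the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Every MIPS instruction here is 4 bytes long. */
#define MIPS_INSN_SIZE 4u

/*
 * jal/jalr set ra to the instruction after the delay slot,
 * so the call itself sits two instructions before the return address.
 */
uint64_t mips_call_site(uint64_t return_address)
{
	return return_address - 2 * MIPS_INSN_SIZE;
}

/* Offset of an address from the start of its function. */
uint64_t symbol_offset(uint64_t address, uint64_t symbol_start)
{
	assert(address >= symbol_start);
	return address - symbol_start;
}
```

Applied to the entry sr_drive_status+0x50 (0xffffffff802460b0, with the function starting at 0xffffffff80246060), this gives a call site of 0xffffffff802460a8, i.e. offset 0x48, the jal to cdrom_get_media_event.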

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: Compiler error? [was: Re: new kernel oops in recent kernels]
  2008-03-16 23:36   ` Thomas Bogendoerfer
@ 2008-03-17  8:05     ` Giuseppe Sacco
  2008-03-17 14:18       ` Ralf Baechle
  0 siblings, 1 reply; 18+ messages in thread
From: Giuseppe Sacco @ 2008-03-17  8:05 UTC (permalink / raw)
  To: Thomas Bogendoerfer; +Cc: linux-mips

Hi Thomas,

On Mon, 17/03/2008 at 00.36 +0100, Thomas Bogendoerfer wrote:
> On Sun, Mar 16, 2008 at 09:27:37PM +0100, Giuseppe Sacco wrote:
> > The gcc I am using is version 4.1.2. Any help is really appreciated.
> 
> 4.2.1 generates nearly the same (reasonable) code. The major difference
> between the version with printk and the version without is the size
> of the local stack. I guess this prevents killing of *cd.
> Could you try the hack below and tell me if it helps? This hack
> ensures that the buffer given to the scsi driver is one cache
> line big (at least on R5k O2s). If this helps, there are more
> places to fix for non-coherent machines...
[...]

The patch you proposed, which uses a larger buffer, does not seem to
trigger the bug.

Thanks,
Giuseppe


* Re: new kernel oops in recent kernels
  2008-03-17  4:41     ` Matthew Wilcox
@ 2008-03-17  8:17       ` Giuseppe Sacco
  0 siblings, 0 replies; 18+ messages in thread
From: Giuseppe Sacco @ 2008-03-17  8:17 UTC (permalink / raw)
  To: linux-scsi

Hi all,
I wrote a message to the linux-mips mailing list to investigate the
assembly code generated for sr_drive_status(), since adding the suggested
printk() stopped the oops. I supposed there was a problem with the
compiler, but people there are investigating a problem with cache
coherency.
You can follow the thread in the public web archive at
http://www.linux-mips.org/archives/linux-mips/2008-03/msg00079.html

I'll be back on this list after checking for any problems with cache
coherency and the C compiler.

Thanks,
Giuseppe



* Re: Compiler error? [was: Re: new kernel oops in recent kernels]
  2008-03-17  8:05     ` Giuseppe Sacco
@ 2008-03-17 14:18       ` Ralf Baechle
  2008-03-17 14:32         ` Thomas Bogendoerfer
  0 siblings, 1 reply; 18+ messages in thread
From: Ralf Baechle @ 2008-03-17 14:18 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: Thomas Bogendoerfer, linux-mips

On Mon, Mar 17, 2008 at 09:05:42AM +0100, Giuseppe Sacco wrote:

> The patch you proposed, which uses a larger buffer, does not seem to
> trigger the bug.

It may help but doesn't have a chance to be accepted upstream.  So
this is no more than a useful litmus test.

  Ralf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Compiler error? [was: Re: new kernel oops in recent kernels]
  2008-03-17 14:18       ` Ralf Baechle
@ 2008-03-17 14:32         ` Thomas Bogendoerfer
  2008-03-21 23:00           ` Thomas Bogendoerfer
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Bogendoerfer @ 2008-03-17 14:32 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Giuseppe Sacco, linux-mips

On Mon, Mar 17, 2008 at 02:18:28PM +0000, Ralf Baechle wrote:
> On Mon, Mar 17, 2008 at 09:05:42AM +0100, Giuseppe Sacco wrote:
> 
> > The patch you proposed, which uses a larger buffer, does not seem to
> > trigger the bug.
> 
> It may help but doesn't have a chance to be accepted upstream.  So
> this is no more than a useful litmus test.

Sure, that's why I called it a hack. It was just to make sure that I'm
on the right track.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea.                                                [ RFC1925, 2.3 ]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Compiler error? [was: Re: new kernel oops in recent kernels]
  2008-03-17 14:32         ` Thomas Bogendoerfer
@ 2008-03-21 23:00           ` Thomas Bogendoerfer
  2008-03-22 23:39             ` Giuseppe Sacco
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Bogendoerfer @ 2008-03-21 23:00 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Giuseppe Sacco, linux-mips

On Mon, Mar 17, 2008 at 03:32:15PM +0100, Thomas Bogendoerfer wrote:
> On Mon, Mar 17, 2008 at 02:18:28PM +0000, Ralf Baechle wrote:
> > On Mon, Mar 17, 2008 at 09:05:42AM +0100, Giuseppe Sacco wrote:
> > 
> > > The patch you proposed, which uses a larger buffer, does not seem to
> > > trigger the bug.
> > 
> > It may help but doesn't have a chance to be accepted upstream.  So
> > this is no more than a useful litmus test.
> 
> Sure, that's why I called it a hack. It was just to make sure that I'm
> on the right track.

Below is a patch that replaces all on-stack buffers passed to the
SCSI layer with kmalloc'ed ones.

Giuseppe, could you please check whether this fixes your problem and
doesn't cause new regressions?

Thomas.


diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 12f5bae..fd45563 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -482,27 +482,37 @@ int cdrom_get_media_event(struct cdrom_device_info *cdi,
 			  struct media_event_desc *med)
 {
 	struct packet_command cgc;
-	unsigned char buffer[8];
-	struct event_header *eh = (struct event_header *) buffer;
+	unsigned char *buffer;
+	struct event_header *eh;
+	int ret = 1;
+
+	buffer = kmalloc(8, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
 
-	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
+	eh = (struct event_header *)buffer;
+
+	init_cdrom_command(&cgc, buffer, 8, CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_GET_EVENT_STATUS_NOTIFICATION;
 	cgc.cmd[1] = 1;		/* IMMED */
 	cgc.cmd[4] = 1 << 4;	/* media event */
-	cgc.cmd[8] = sizeof(buffer);
+	cgc.cmd[8] = 8;
 	cgc.quiet = 1;
 
 	if (cdi->ops->generic_packet(cdi, &cgc))
-		return 1;
+		goto err;
 
 	if (be16_to_cpu(eh->data_len) < sizeof(*med))
-		return 1;
+		goto err;
 
 	if (eh->nea || eh->notification_class != 0x4)
-		return 1;
+		goto err;
 
-	memcpy(med, &buffer[sizeof(*eh)], sizeof(*med));
-	return 0;
+	memcpy(med, buffer + sizeof(*eh), sizeof(*med));
+	ret = 0;
+err:
+	kfree(buffer);
+	return ret;
 }
 
 /*
@@ -512,68 +522,82 @@ int cdrom_get_media_event(struct cdrom_device_info *cdi,
 static int cdrom_mrw_probe_pc(struct cdrom_device_info *cdi)
 {
 	struct packet_command cgc;
-	char buffer[16];
+	char *buffer;
+	int ret = 1;
+
+	buffer = kmalloc(16, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
 
-	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
+	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
 
 	cgc.timeout = HZ;
 	cgc.quiet = 1;
 
 	if (!cdrom_mode_sense(cdi, &cgc, MRW_MODE_PC, 0)) {
 		cdi->mrw_mode_page = MRW_MODE_PC;
-		return 0;
+		ret = 0;
 	} else if (!cdrom_mode_sense(cdi, &cgc, MRW_MODE_PC_PRE1, 0)) {
 		cdi->mrw_mode_page = MRW_MODE_PC_PRE1;
-		return 0;
+		ret = 0;
 	}
-
-	return 1;
+	kfree(buffer);
+	return ret;
 }
 
 static int cdrom_is_mrw(struct cdrom_device_info *cdi, int *write)
 {
 	struct packet_command cgc;
 	struct mrw_feature_desc *mfd;
-	unsigned char buffer[16];
+	unsigned char *buffer;
 	int ret;
 
 	*write = 0;
+	buffer = kmalloc(16, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
 
-	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
+	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
 
 	cgc.cmd[0] = GPCMD_GET_CONFIGURATION;
 	cgc.cmd[3] = CDF_MRW;
-	cgc.cmd[8] = sizeof(buffer);
+	cgc.cmd[8] = 16;
 	cgc.quiet = 1;
 
 	if ((ret = cdi->ops->generic_packet(cdi, &cgc)))
-		return ret;
+		goto err;
 
 	mfd = (struct mrw_feature_desc *)&buffer[sizeof(struct feature_header)];
-	if (be16_to_cpu(mfd->feature_code) != CDF_MRW)
-		return 1;
+	if (be16_to_cpu(mfd->feature_code) != CDF_MRW) {
+		ret = 1;
+		goto err;
+	}
 	*write = mfd->write;
 
 	if ((ret = cdrom_mrw_probe_pc(cdi))) {
 		*write = 0;
-		return ret;
 	}
-
-	return 0;
+err:
+	kfree(buffer);
+	return ret;
 }
 
 static int cdrom_mrw_bgformat(struct cdrom_device_info *cdi, int cont)
 {
 	struct packet_command cgc;
-	unsigned char buffer[12];
+	unsigned char *buffer;
 	int ret;
 
 	printk(KERN_INFO "cdrom: %sstarting format\n", cont ? "Re" : "");
 
+	buffer = kmalloc(12, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
+
 	/*
 	 * FmtData bit set (bit 4), format type is 1
 	 */
-	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_WRITE);
+	init_cdrom_command(&cgc, buffer, 12, CGC_DATA_WRITE);
 	cgc.cmd[0] = GPCMD_FORMAT_UNIT;
 	cgc.cmd[1] = (1 << 4) | 1;
 
@@ -600,6 +624,7 @@ static int cdrom_mrw_bgformat(struct cdrom_device_info *cdi, int cont)
 	if (ret)
 		printk(KERN_INFO "cdrom: bgformat failed\n");
 
+	kfree(buffer);
 	return ret;
 }
 
@@ -659,16 +684,17 @@ static int cdrom_mrw_set_lba_space(struct cdrom_device_info *cdi, int space)
 {
 	struct packet_command cgc;
 	struct mode_page_header *mph;
-	char buffer[16];
+	char *buffer;
 	int ret, offset, size;
 
-	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
+	buffer = kmalloc(16, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
 
-	cgc.buffer = buffer;
-	cgc.buflen = sizeof(buffer);
+	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
 
 	if ((ret = cdrom_mode_sense(cdi, &cgc, cdi->mrw_mode_page, 0)))
-		return ret;
+		goto err;
 
 	mph = (struct mode_page_header *) buffer;
 	offset = be16_to_cpu(mph->desc_length);
@@ -678,55 +704,70 @@ static int cdrom_mrw_set_lba_space(struct cdrom_device_info *cdi, int space)
 	cgc.buflen = size;
 
 	if ((ret = cdrom_mode_select(cdi, &cgc)))
-		return ret;
+		goto err;
 
 	printk(KERN_INFO "cdrom: %s: mrw address space %s selected\n", cdi->name, mrw_address_space[space]);
-	return 0;
+	ret = 0;
+err:
+	kfree(buffer);
+	return ret;
 }
 
 static int cdrom_get_random_writable(struct cdrom_device_info *cdi,
 			      struct rwrt_feature_desc *rfd)
 {
 	struct packet_command cgc;
-	char buffer[24];
+	char *buffer;
 	int ret;
 
-	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
+	buffer = kmalloc(24, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
+
+	init_cdrom_command(&cgc, buffer, 24, CGC_DATA_READ);
 
 	cgc.cmd[0] = GPCMD_GET_CONFIGURATION;	/* often 0x46 */
 	cgc.cmd[3] = CDF_RWRT;			/* often 0x0020 */
-	cgc.cmd[8] = sizeof(buffer);		/* often 0x18 */
+	cgc.cmd[8] = 24;		        /* often 0x18 */
 	cgc.quiet = 1;
 
 	if ((ret = cdi->ops->generic_packet(cdi, &cgc)))
-		return ret;
+		goto err;
 
 	memcpy(rfd, &buffer[sizeof(struct feature_header)], sizeof (*rfd));
-	return 0;
+	ret = 0;
+err:
+	kfree(buffer);
+	return ret;
 }
 
 static int cdrom_has_defect_mgt(struct cdrom_device_info *cdi)
 {
 	struct packet_command cgc;
-	char buffer[16];
+	char *buffer;
 	__be16 *feature_code;
 	int ret;
 
-	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
+	buffer = kmalloc(16, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
+
+	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
 
 	cgc.cmd[0] = GPCMD_GET_CONFIGURATION;
 	cgc.cmd[3] = CDF_HWDM;
-	cgc.cmd[8] = sizeof(buffer);
+	cgc.cmd[8] = 16;
 	cgc.quiet = 1;
 
 	if ((ret = cdi->ops->generic_packet(cdi, &cgc)))
-		return ret;
+		goto err;
 
 	feature_code = (__be16 *) &buffer[sizeof(struct feature_header)];
 	if (be16_to_cpu(*feature_code) == CDF_HWDM)
-		return 0;
-
-	return 1;
+		ret = 0;
+err:
+	kfree(buffer);
+	return ret;
 }
 
 
@@ -817,10 +858,14 @@ static int cdrom_mrw_open_write(struct cdrom_device_info *cdi)
 static int mo_open_write(struct cdrom_device_info *cdi)
 {
 	struct packet_command cgc;
-	char buffer[255];
+	char *buffer;
 	int ret;
 
-	init_cdrom_command(&cgc, &buffer, 4, CGC_DATA_READ);
+	buffer = kmalloc(255, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
+
+	init_cdrom_command(&cgc, buffer, 4, CGC_DATA_READ);
 	cgc.quiet = 1;
 
 	/*
@@ -837,10 +882,15 @@ static int mo_open_write(struct cdrom_device_info *cdi)
 	}
 
 	/* drive gave us no info, let the user go ahead */
-	if (ret)
-		return 0;
+	if (ret) {
+		ret = 0;
+		goto err;
+	}
 
-	return buffer[3] & 0x80;
+	ret = buffer[3] & 0x80;
+err:
+	kfree(buffer);
+	return ret;
 }
 
 static int cdrom_ram_open_write(struct cdrom_device_info *cdi)
@@ -863,15 +913,19 @@ static int cdrom_ram_open_write(struct cdrom_device_info *cdi)
 static void cdrom_mmc3_profile(struct cdrom_device_info *cdi)
 {
 	struct packet_command cgc;
-	char buffer[32];
+	char *buffer;
 	int ret, mmc3_profile;
 
-	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
+	buffer = kmalloc(32, GFP_KERNEL);
+	if (!buffer)
+		return;
+
+	init_cdrom_command(&cgc, buffer, 32, CGC_DATA_READ);
 
 	cgc.cmd[0] = GPCMD_GET_CONFIGURATION;
 	cgc.cmd[1] = 0;
 	cgc.cmd[2] = cgc.cmd[3] = 0;		/* Starting Feature Number */
-	cgc.cmd[8] = sizeof(buffer);		/* Allocation Length */
+	cgc.cmd[8] = 32;		        /* Allocation Length */
 	cgc.quiet = 1;
 
 	if ((ret = cdi->ops->generic_packet(cdi, &cgc)))
@@ -880,6 +934,7 @@ static void cdrom_mmc3_profile(struct cdrom_device_info *cdi)
 		mmc3_profile = (buffer[6] << 8) | buffer[7];
 
 	cdi->mmc3_profile = mmc3_profile;
+	kfree(buffer);
 }
 
 static int cdrom_is_dvd_rw(struct cdrom_device_info *cdi)
@@ -1594,12 +1649,15 @@ static void setup_send_key(struct packet_command *cgc, unsigned agid, unsigned t
 static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 {
 	int ret;
-	u_char buf[20];
+	u_char *buf;
 	struct packet_command cgc;
 	struct cdrom_device_ops *cdo = cdi->ops;
-	rpc_state_t rpc_state;
+	rpc_state_t *rpc_state;
+
+	buf = kzalloc(20, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
 
-	memset(buf, 0, sizeof(buf));
 	init_cdrom_command(&cgc, buf, 0, CGC_DATA_READ);
 
 	switch (ai->type) {
@@ -1610,7 +1668,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		setup_report_key(&cgc, ai->lsa.agid, 0);
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 
 		ai->lsa.agid = buf[7] >> 6;
 		/* Returning data, let host change state */
@@ -1621,7 +1679,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		setup_report_key(&cgc, ai->lsk.agid, 2);
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 
 		copy_key(ai->lsk.key, &buf[4]);
 		/* Returning data, let host change state */
@@ -1632,7 +1690,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		setup_report_key(&cgc, ai->lsc.agid, 1);
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 
 		copy_chal(ai->lsc.chal, &buf[4]);
 		/* Returning data, let host change state */
@@ -1649,7 +1707,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		cgc.cmd[2] = ai->lstk.lba >> 24;
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 
 		ai->lstk.cpm = (buf[4] >> 7) & 1;
 		ai->lstk.cp_sec = (buf[4] >> 6) & 1;
@@ -1663,7 +1721,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		setup_report_key(&cgc, ai->lsasf.agid, 5);
 		
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 
 		ai->lsasf.asf = buf[7] & 1;
 		break;
@@ -1676,7 +1734,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		copy_chal(&buf[4], ai->hsc.chal);
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 
 		ai->type = DVD_LU_SEND_KEY1;
 		break;
@@ -1689,7 +1747,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 
 		if ((ret = cdo->generic_packet(cdi, &cgc))) {
 			ai->type = DVD_AUTH_FAILURE;
-			return ret;
+			goto err;
 		}
 		ai->type = DVD_AUTH_ESTABLISHED;
 		break;
@@ -1700,24 +1758,23 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		cdinfo(CD_DVD, "entering DVD_INVALIDATE_AGID\n"); 
 		setup_report_key(&cgc, ai->lsa.agid, 0x3f);
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 		break;
 
 	/* Get region settings */
 	case DVD_LU_SEND_RPC_STATE:
 		cdinfo(CD_DVD, "entering DVD_LU_SEND_RPC_STATE\n");
 		setup_report_key(&cgc, 0, 8);
-		memset(&rpc_state, 0, sizeof(rpc_state_t));
-		cgc.buffer = (char *) &rpc_state;
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 
-		ai->lrpcs.type = rpc_state.type_code;
-		ai->lrpcs.vra = rpc_state.vra;
-		ai->lrpcs.ucca = rpc_state.ucca;
-		ai->lrpcs.region_mask = rpc_state.region_mask;
-		ai->lrpcs.rpc_scheme = rpc_state.rpc_scheme;
+		rpc_state = (rpc_state_t *)buf;
+		ai->lrpcs.type = rpc_state->type_code;
+		ai->lrpcs.vra = rpc_state->vra;
+		ai->lrpcs.ucca = rpc_state->ucca;
+		ai->lrpcs.region_mask = rpc_state->region_mask;
+		ai->lrpcs.rpc_scheme = rpc_state->rpc_scheme;
 		break;
 
 	/* Set region settings */
@@ -1728,20 +1785,23 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		buf[4] = ai->hrpcs.pdrc;
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			return ret;
+			goto err;
 		break;
 
 	default:
 		cdinfo(CD_WARNING, "Invalid DVD key ioctl (%d)\n", ai->type);
-		return -ENOTTY;
+		ret = -ENOTTY;
+		goto err;
 	}
-
-	return 0;
+	ret = 0;
+err:
+	kfree(buf);
+	return ret;
 }
 
 static int dvd_read_physical(struct cdrom_device_info *cdi, dvd_struct *s)
 {
-	unsigned char buf[21], *base;
+	unsigned char *buf, *base;
 	struct dvd_layer *layer;
 	struct packet_command cgc;
 	struct cdrom_device_ops *cdo = cdi->ops;
@@ -1750,7 +1810,11 @@ static int dvd_read_physical(struct cdrom_device_info *cdi, dvd_struct *s)
 	if (layer_num >= DVD_LAYERS)
 		return -EINVAL;
 
-	init_cdrom_command(&cgc, buf, sizeof(buf), CGC_DATA_READ);
+	buf = kmalloc(21, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	init_cdrom_command(&cgc, buf, 21, CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_READ_DVD_STRUCTURE;
 	cgc.cmd[6] = layer_num;
 	cgc.cmd[7] = s->type;
@@ -1762,7 +1826,7 @@ static int dvd_read_physical(struct cdrom_device_info *cdi, dvd_struct *s)
 	cgc.quiet = 1;
 
 	if ((ret = cdo->generic_packet(cdi, &cgc)))
-		return ret;
+		goto err;
 
 	base = &buf[4];
 	layer = &s->physical.layer[layer_num];
@@ -1786,17 +1850,24 @@ static int dvd_read_physical(struct cdrom_device_info *cdi, dvd_struct *s)
 	layer->end_sector_l0 = base[13] << 16 | base[14] << 8 | base[15];
 	layer->bca = base[16] >> 7;
 
-	return 0;
+	ret = 0;
+err:
+	kfree(buf);
+	return ret;
 }
 
 static int dvd_read_copyright(struct cdrom_device_info *cdi, dvd_struct *s)
 {
 	int ret;
-	u_char buf[8];
+	u_char *buf;
 	struct packet_command cgc;
 	struct cdrom_device_ops *cdo = cdi->ops;
 
-	init_cdrom_command(&cgc, buf, sizeof(buf), CGC_DATA_READ);
+	buf = kmalloc(8, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	init_cdrom_command(&cgc, buf, 8, CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_READ_DVD_STRUCTURE;
 	cgc.cmd[6] = s->copyright.layer_num;
 	cgc.cmd[7] = s->type;
@@ -1804,12 +1875,15 @@ static int dvd_read_copyright(struct cdrom_device_info *cdi, dvd_struct *s)
 	cgc.cmd[9] = cgc.buflen & 0xff;
 
 	if ((ret = cdo->generic_packet(cdi, &cgc)))
-		return ret;
+		goto err;
 
 	s->copyright.cpst = buf[4];
 	s->copyright.rmi = buf[5];
 
-	return 0;
+	ret = 0;
+err:
+	kfree(buf);
+	return ret;
 }
 
 static int dvd_read_disckey(struct cdrom_device_info *cdi, dvd_struct *s)
@@ -1841,26 +1915,33 @@ static int dvd_read_disckey(struct cdrom_device_info *cdi, dvd_struct *s)
 static int dvd_read_bca(struct cdrom_device_info *cdi, dvd_struct *s)
 {
 	int ret;
-	u_char buf[4 + 188];
+	u_char *buf;
 	struct packet_command cgc;
 	struct cdrom_device_ops *cdo = cdi->ops;
 
-	init_cdrom_command(&cgc, buf, sizeof(buf), CGC_DATA_READ);
+	buf = kmalloc(4 + 188, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	init_cdrom_command(&cgc, buf, 4 + 188, CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_READ_DVD_STRUCTURE;
 	cgc.cmd[7] = s->type;
 	cgc.cmd[9] = cgc.buflen & 0xff;
 
 	if ((ret = cdo->generic_packet(cdi, &cgc)))
-		return ret;
+		goto err;
 
 	s->bca.len = buf[0] << 8 | buf[1];
 	if (s->bca.len < 12 || s->bca.len > 188) {
 		cdinfo(CD_WARNING, "Received invalid BCA length (%d)\n", s->bca.len);
-		return -EIO;
+		ret = -EIO;
+		goto err;
 	}
 	memcpy(s->bca.value, &buf[4], s->bca.len);
-
-	return 0;
+	ret = 0;
+err:
+	kfree(buf);
+	return ret;
 }
 
 static int dvd_read_manufact(struct cdrom_device_info *cdi, dvd_struct *s)
@@ -1960,9 +2041,13 @@ static int cdrom_read_subchannel(struct cdrom_device_info *cdi,
 {
 	struct cdrom_device_ops *cdo = cdi->ops;
 	struct packet_command cgc;
-	char buffer[32];
+	char *buffer;
 	int ret;
 
+	buffer = kmalloc(32, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
+
 	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_READ_SUBCHANNEL;
 	cgc.cmd[1] = 2;     /* MSF addressing */
@@ -1971,7 +2056,7 @@ static int cdrom_read_subchannel(struct cdrom_device_info *cdi,
 	cgc.cmd[8] = 16;
 
 	if ((ret = cdo->generic_packet(cdi, &cgc)))
-		return ret;
+		goto err;
 
 	subchnl->cdsc_audiostatus = cgc.buffer[1];
 	subchnl->cdsc_format = CDROM_MSF;
@@ -1986,7 +2071,10 @@ static int cdrom_read_subchannel(struct cdrom_device_info *cdi,
 	subchnl->cdsc_absaddr.msf.second = cgc.buffer[10];
 	subchnl->cdsc_absaddr.msf.frame = cgc.buffer[11];
 
-	return 0;
+	ret = 0;
+err:
+	kfree(buffer);
+	return ret;
 }
 
 /*

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea.                                                [ RFC1925, 2.3 ]

^ permalink raw reply related	[flat|nested] 18+ messages in thread
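
The conversion applied throughout the patch above follows one shape: allocate the buffer on the heap, bail out early if allocation fails, and route every later exit through a single label that frees the buffer. A minimal userspace sketch of that shape follows, using `malloc()`/`free()` in place of `kmalloc()`/`kfree()`. The names `fake_packet()` and `read_event()` are hypothetical stand-ins for illustration only and do not exist in the cdrom driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for cdi->ops->generic_packet(): fills buf or fails. */
static int fake_packet(unsigned char *buf, size_t len, int fail)
{
	if (fail)
		return -EIO;
	memset(buf, 0xab, len);
	return 0;
}

/* Copy len bytes of "device" data into out, going through a heap buffer. */
static int read_event(unsigned char *out, size_t len, int fail)
{
	unsigned char *buffer;
	int ret;

	buffer = malloc(len);		/* kmalloc(len, GFP_KERNEL) in the kernel */
	if (!buffer)
		return -ENOMEM;		/* nothing allocated yet, plain return is fine */

	ret = fake_packet(buffer, len, fail);
	if (ret)
		goto err;		/* error path must still free the buffer */

	memcpy(out, buffer, len);
	ret = 0;
err:
	free(buffer);			/* kfree(buffer) in the kernel */
	return ret;
}
```

With a single exit label, every path has to leave the intended value in `ret` before it reaches the label. A path that previously ended in a distinct `return 1` needs an explicit assignment, otherwise it falls through with whatever `ret` held last.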

* Re: Compiler error? [was: Re: new kernel oops in recent kernels]
  2008-03-21 23:00           ` Thomas Bogendoerfer
@ 2008-03-22 23:39             ` Giuseppe Sacco
  2008-03-23 11:16               ` Thomas Bogendoerfer
  0 siblings, 1 reply; 18+ messages in thread
From: Giuseppe Sacco @ 2008-03-22 23:39 UTC (permalink / raw)
  To: linux-mips

Hi Thomas,

On Sat, 22/03/2008 at 00:00 +0100, Thomas Bogendoerfer wrote:
[...]
> Below is a patch that replaces all on-stack buffers passed to the
> SCSI layer with kmalloc'ed ones.
> 
> Giuseppe, could you please check whether this fixes your problem and
> doesn't cause new regressions?

I rebuilt a kernel (pulling the latest code from git) with your patch. I
no longer get the Oops at boot time; moreover, I can mount a CD-ROM
and copy data from it to a local SCSI disk.

Bye,
Giuseppe

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Compiler error? [was: Re: new kernel oops in recent kernels]
  2008-03-22 23:39             ` Giuseppe Sacco
@ 2008-03-23 11:16               ` Thomas Bogendoerfer
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Bogendoerfer @ 2008-03-23 11:16 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: linux-mips

On Sun, Mar 23, 2008 at 12:39:58AM +0100, Giuseppe Sacco wrote:
> > Giuseppe, could you please check whether this fixes your problem and
> > doesn't cause new regressions?
> 
> I rebuilt a kernel (pulling the latest code from git) with your patch. I
> no longer get the Oops at boot time; moreover, I can mount a CD-ROM
> and copy data from it to a local SCSI disk.

Great, thank you for testing. I'll submit the patch to the maintainer.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea.                                                [ RFC1925, 2.3 ]

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2008-03-23 11:16 UTC | newest]

Thread overview: 18+ messages
2008-03-16 10:49 new kernel oops in recent kernels Giuseppe Sacco
2008-03-16 20:27 ` Compiler error? [was: Re: new kernel oops in recent kernels] Giuseppe Sacco
2008-03-16 23:36   ` Thomas Bogendoerfer
2008-03-17  8:05     ` Giuseppe Sacco
2008-03-17 14:18       ` Ralf Baechle
2008-03-17 14:32         ` Thomas Bogendoerfer
2008-03-21 23:00           ` Thomas Bogendoerfer
2008-03-22 23:39             ` Giuseppe Sacco
2008-03-23 11:16               ` Thomas Bogendoerfer
  -- strict thread matches above, loose matches on Subject: below --
2008-03-16 15:19 new kernel oops in recent kernels Giuseppe Sacco
2008-03-16 16:39 ` James Bottomley
2008-03-16 18:32   ` Giuseppe Sacco
2008-03-16 18:47     ` James Bottomley
2008-03-16 16:42 ` Matthew Wilcox
2008-03-16 18:29   ` Giuseppe Sacco
2008-03-17  3:58     ` Matthew Wilcox
2008-03-17  4:41     ` Matthew Wilcox
2008-03-17  8:17       ` Giuseppe Sacco
