linux-fbdev.vger.kernel.org archive mirror
* Re: tty-related oops in latest kernel(s)?
       [not found]                 ` <Pine.LNX.4.64.0705301857370.29485@jalava.cc.jyu.fi>
@ 2007-05-30 16:09                   ` Andrew Morton
  2007-05-30 18:04                     ` Alexey Dobriyan
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2007-05-30 16:09 UTC
  To: Tero Roponen
  Cc: Pekka Enberg, linux-kernel, Alan Cox, Andy Whitcroft,
	linux-fbdev-devel, Antonino A. Daplas

On Wed, 30 May 2007 19:01:09 +0300 (EEST) Tero Roponen <teanropo@jyu.fi> wrote:

> On Wed, 30 May 2007, Andrew Morton wrote:
> 
> > On Wed, 30 May 2007 15:02:49 +0300 (EEST) Tero Roponen <teanropo@jyu.fi> wrote:
> > 
> > > On Wed, 30 May 2007, Pekka Enberg wrote:
> > > 
> > > > On 5/30/07, Tero Roponen <teanropo@jyu.fi> wrote:
> > > > > Hmmm, I just found something interesting. In 2.6.21.3 the /sbin/init
> > > > > gets corrupted when I watch the video!
> > > > >
> > > > > $ cp /sbin/init init.before
> > > > > $ mplayer kiwi.flv
> > > > > $ cp /sbin/init init.after
> > > > >
> > > > > The sha1sums are here:
> > > > >
> > > > > 52c8d643057619cbe137b8e69d4709ce3bdd832d  init.after
> > > > > 8efc7864a5b535a9e336fa82e9d7f112f3d956c1  init.before
> > > > >
> > > > > It seems that something corrupts memory somewhere...
> > > > 
> > > > To debug this a bit further:
> > > > 
> > > > $ od -a -t x1 -v init.after > init.after.dump
> > > > $ od -a -t x1 -v init.before > init.before.dump
> > > > $ diff -u init.before.dump init.after.dump | less
> > > > 
> > > > -0011340  nul nul nul  e9  f0  fe  ff  ff  ff   %   < soh enq  bs   h  80
> > > > -           00  00  00  e9  f0  fe  ff  ff  ff  25  3c  01  05  08  68  80
> > > > +0010000    y ack nul nul   y ack nul nul   y ack nul nul   y ack nul nul
> > > > +           79  06  00  00  79  06  00  00  79  06  00  00  79  06  00  00
> > > > +0010020    y ack nul nul   y ack nul nul   y ack nul nul   y ack nul nul
> > > > +           79  06  00  00  79  06  00  00  79  06  00  00  79  06  00  00
> > > > +0011340    y ack nul nul   y ack nul nul  ff   %   < soh enq  bs   h  80
> > > > +           79  06  00  00  79  06  00  00  ff  25  3c  01  05  08  68  80
> > > > 
> > > > The file at offset 0010000 - 0011348 is overwritten with the byte
> > > > pattern 79 06 00 00.
> > > > 
> > > > Do you see anything in the logs or is this a silent corruption? Did
> > > > you see this corruption with 2.6.19 or 2.6.22-rc3?
> > > > 
> > > 
> > > I recompiled 2.6.22-rc3 and booted it with slub_debug. Now I can't oops
> > > the kernel, but ./slabinfo -v gives me a warning:
> > > 
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > neofb: no support for 32bpp
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1024x768) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1152x864) larger than the LCD panel (800x600)
> > > Mode (1024x1024) larger than the LCD panel (800x600)
> > > Mode (1024x1024) larger than the LCD panel (800x600)
> > > Mode (1024x1024) larger than the LCD panel (800x600)
> > > Mode (1024x1024) larger than the LCD panel (800x600)
> > > Mode (1280x1024) larger than the LCD panel (800x600)
> > > Mode (1280x1024) larger than the LCD panel (800x600)
> > > Mode (1280x1024) larger than the LCD panel (800x600)
> > > Mode (1280x1024) larger than the LCD panel (800x600)
> > > *** SLUB kmalloc-1024: Redzone Active@0xc10be860 slab 0xc10217c0
> > >     offset=2144 flags=0x80004082 inuse=7 freelist=0x00000000
> > >   Bytes b4 0xc10be850:  00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
> > >     Object 0xc10be860:  00 00 00 00 00 20 00 00 20 03 00 00 58 02 00 00 ............X...
> > >     Object 0xc10be870:  20 03 00 00 58 02 00 00 00 00 00 00 00 00 00 00 ....X...........
> > >     Object 0xc10be880:  10 00 00 00 00 00 00 00 0b 00 00 00 05 00 00 00 ................
> > >     Object 0xc10be890:  00 00 00 00 05 00 00 00 06 00 00 00 00 00 00 00 ................
> > >     Object 0xc10be8a0:  00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 ................
> > >     Object 0xc10be8b0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > >     Object 0xc10be8c0:  ff ff ff ff ff ff ff ff 00 00 00 00 a8 61 00 00 ÿÿÿÿÿÿÿÿ....¨a..
> > >     Object 0xc10be8d0:  58 00 00 00 28 00 00 00 17 00 00 00 01 00 00 00 X...(...........
> > >    Redzone 0xc10bec60:  4d 6b 00 00                                     Mk..            
> > > FreePointer 0xc10bec64 -> 0x00006b4d
> > > Last alloc: 0x6b4d jiffies_ago=4294923792 cpu=27469 pid=27469
> > > Last free : 0x6b4d jiffies_ago=4294923792 cpu=27469 pid=27469
> > >     Filler 0xc10bec88:  4d 6b 00 00 4d 6b 00 00                         Mk..Mk..        
> > >  [<c013f717>] check_object+0x64/0x23d
> > >  [<c0141371>] validate_slab+0xff/0x12a
> > >  [<c01413aa>] validate_slab_slab+0xe/0x51
> > >  [<c0141488>] validate_store+0x9b/0xe8
> > >  [<c01343d1>] __handle_mm_fault+0x370/0x68b
> > >  [<c01413ed>] validate_store+0x0/0xe8
> > >  [<c013eaa6>] slab_attr_store+0x1e/0x22
> > >  [<c016e470>] sysfs_write_file+0xad/0xd6
> > >  [<c016e3c3>] sysfs_write_file+0x0/0xd6
> > >  [<c0143341>] vfs_write+0x8a/0x10c
> > >  [<c01437d7>] sys_write+0x41/0x67
> > >  [<c01022c2>] sysenter_past_esp+0x5f/0x85
> > >  =======================
> > > @@@ SLUB kmalloc-1024: Restoring redzone (0xcc) from 0xc10bec60-0xc10bec63
> > > 
> > 
> > So something did an overwrite of a 1024-byte kmalloc.  Unfortunately that
> > overwrite seems to have trashed our last-alloc info, so we don't know who
> > allocated that memory.  Darn.
> > 
> > Does the problem go away if you disable CONFIG_SLUB and enable CONFIG_SLAB?
> > 
> > 
> 
> Hi,
> 
> after some trial and error I found a simple way to trigger the
> corruption:
> 
> [root@terrop ~]# ./slabinfo -v
> [root@terrop ~]# ./oops
> [root@terrop ~]# ./slabinfo -v

Whoa.  Impressed.

> *** SLUB kmalloc-1024: Redzone Active@0xc10be860 slab 0xc10217c0
>     offset=2144 flags=0x80004082 inuse=7 freelist=0x00000000
>   Bytes b4 0xc10be850:  00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
>     Object 0xc10be860:  00 00 00 00 00 20 00 00 20 03 00 00 58 02 00 00 ............X...
>     Object 0xc10be870:  20 03 00 00 58 02 00 00 00 00 00 00 00 00 00 00 ....X...........
>     Object 0xc10be880:  18 00 00 00 00 00 00 00 10 00 00 00 08 00 00 00 ................
>     Object 0xc10be890:  00 00 00 00 08 00 00 00 08 00 00 00 00 00 00 00 ................
>     Object 0xc10be8a0:  00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 ................
>     Object 0xc10be8b0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>     Object 0xc10be8c0:  ff ff ff ff ff ff ff ff 00 00 00 00 a8 61 00 00 ÿÿÿÿÿÿÿÿ....¨a..
>     Object 0xc10be8d0:  58 00 00 00 28 00 00 00 17 00 00 00 01 00 00 00 X...(...........
>    Redzone 0xc10bec60:  6b 6b 6b 00                                     kkk.            
> FreePointer 0xc10bec64 -> 0x006b6b6b
> Last alloc: 0x6b6b6b jiffies_ago=4287907122 cpu=7039851 pid=7039851
> Last free : 0x6b6b6b jiffies_ago=4287907122 cpu=7039851 pid=7039851
>     Filler 0xc10bec88:  6b 6b 6b 00 6b 6b 6b 00                         kkk.kkk.        
>  [<c013f717>] check_object+0x64/0x23d
>  [<c0141371>] validate_slab+0xff/0x12a
>  [<c01413aa>] validate_slab_slab+0xe/0x51
>  [<c0141488>] validate_store+0x9b/0xe8
>  [<c01343d1>] __handle_mm_fault+0x370/0x68b
>  [<c01413ed>] validate_store+0x0/0xe8
>  [<c013eaa6>] slab_attr_store+0x1e/0x22
>  [<c016e470>] sysfs_write_file+0xad/0xd6
>  [<c016e3c3>] sysfs_write_file+0x0/0xd6
>  [<c0143341>] vfs_write+0x8a/0x10c
>  [<c01437d7>] sys_write+0x41/0x67
>  [<c01022c2>] sysenter_past_esp+0x5f/0x85
>  =======================
> @@@ SLUB kmalloc-1024: Restoring redzone (0xcc) from 0xc10bec60-0xc10bec63
> 
> [root@terrop ~]# cat oops.c
> #include <sys/ioctl.h>
> #include <stdio.h>
> #include <linux/fb.h>
> #include <fcntl.h>
> 
> int main(void)
> {
>         struct fb_var_screeninfo fbinfo;
>         int fd = open("/dev/fb0", O_RDWR);
>         if (fd < 0)
>                 return 1;
> 
>         /* Get screeninfo */
>         ioctl(fd, FBIOGET_VSCREENINFO, &fbinfo);
> 
>         /* Change depth from current 16 to 24. */
>         fbinfo.bits_per_pixel = 24;
>         ioctl(fd, FBIOPUT_VSCREENINFO, &fbinfo);
> 
>         return 0;
> }
> 
> So this seems to be a framebuffer error.
> 

cc's added ;)

Thanks.

Tony, this is with SLUB enabled, which might be detecting a
hitherto-undetected bug.

Config is at http://userweb.kernel.org/~akpm/config-tero.txt
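
For anyone reading the reports above for the first time: the "Redzone"
line is a guard word that sits right behind the object when slub_debug
is on, and the "Restoring redzone (0xcc)" line gives the fill byte it
is expected to hold.  Below is a minimal userspace sketch of that kind
of check.  It is not the kernel's code; struct guarded_obj and the
three helper names are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBJ_SIZE     1024   /* payload size, as in kmalloc-1024 */
#define REDZONE_SIZE 4      /* the report restores 4 bytes */
#define REDZONE_BYTE 0xcc   /* fill value named in the report */

/* Hypothetical layout: payload followed directly by its guard bytes. */
struct guarded_obj {
	uint8_t payload[OBJ_SIZE];
	uint8_t redzone[REDZONE_SIZE];
};

/* Arm the guard when the object is handed out. */
static void guarded_init(struct guarded_obj *o)
{
	assert(o != NULL);
	memset(o->payload, 0, sizeof(o->payload));
	memset(o->redzone, REDZONE_BYTE, sizeof(o->redzone));
}

/* Return how many guard bytes no longer hold the fill value. */
static size_t guarded_check(const struct guarded_obj *o)
{
	size_t i, bad = 0;

	assert(o != NULL);
	for (i = 0; i < sizeof(o->redzone); i++)
		if (o->redzone[i] != REDZONE_BYTE)
			bad++;
	return bad;
}

/* Simulate a stray 32-bit store that lands on the guard. */
static void guarded_stomp(struct guarded_obj *o, uint32_t value)
{
	assert(o != NULL);
	memcpy(o->redzone, &value, sizeof(value));
}
```

Stomping the guard with 0x00006b4d, the value seen in the first report,
makes guarded_check() flag all four bytes, since none of them is 0xcc.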


* Re: tty-related oops in latest kernel(s)?
  2007-05-30 16:09                   ` tty-related oops in latest kernel(s)? Andrew Morton
@ 2007-05-30 18:04                     ` Alexey Dobriyan
  2007-05-30 23:14                       ` Antonino A. Daplas
  0 siblings, 1 reply; 7+ messages in thread
From: Alexey Dobriyan @ 2007-05-30 18:04 UTC
  To: Andrew Morton
  Cc: Tero Roponen, Pekka Enberg, linux-kernel, Alan Cox,
	Andy Whitcroft, linux-fbdev-devel, Antonino A. Daplas

On Wed, May 30, 2007 at 09:09:45AM -0700, Andrew Morton wrote:
> On Wed, 30 May 2007 19:01:09 +0300 (EEST) Tero Roponen <teanropo@jyu.fi> wrote:
> 
> > On Wed, 30 May 2007, Andrew Morton wrote:
> > 
> > > On Wed, 30 May 2007 15:02:49 +0300 (EEST) Tero Roponen <teanropo@jyu.fi> wrote:
> > > 
> > > > On Wed, 30 May 2007, Pekka Enberg wrote:
> > > > 
> > > > > On 5/30/07, Tero Roponen <teanropo@jyu.fi> wrote:
[snip]

Two suspicious things for me:

1)

--- a/drivers/video/neofb.c
+++ b/drivers/video/neofb.c
@@ -1295,7 +1295,7 @@ static int neofb_setcolreg(u_int regno, 
 		outb(blue >> 10, 0x3c9);
 		break;
 	case 16:
-		((u32 *) fb->pseudo_palette)[regno] =
+		((u16 *) fb->pseudo_palette)[regno] =
 				((red & 0xf800)) | ((green & 0xfc00) >> 5) |
 				((blue & 0xf800) >> 11);
 		break;



2) palette in neofb_par is "u32 palette[16];" which is 4x16 = 64 bytes.
   struct fb_info::pseudo_palette is assigned to it in neo_alloc_fb_info().
   Yet, we check at the beginning of neofb_setcolreg() for color map
   length which neofb advertises as 256 which seems too many.

   printk()s showing "regno" at the beginning of neofb_setcolreg()
   welcome.
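
   As a cross-check on both points, here is a small userspace sketch
   (not driver code).  pack_rgb565() repeats the "case 16:" expression
   from the hunk above; palette_overrun() is a made-up helper that
   computes how many bytes past the end of a 16-entry u32 array a store
   for a given regno would reach.  The grey input 0x6b6b used in the
   note below is an assumption, picked because it reproduces a value
   from the report, so treat the match as suggestive rather than proof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PALETTE_ENTRIES 16    /* "u32 palette[16]" in neofb_par */
#define CMAP_LEN        256   /* color map length neofb advertises */

/* Same expression as the "case 16:" branch, with 16-bit inputs. */
static uint32_t pack_rgb565(uint16_t red, uint16_t green, uint16_t blue)
{
	return (uint32_t)(red & 0xf800) |
	       (uint32_t)((green & 0xfc00) >> 5) |
	       (uint32_t)((blue & 0xf800) >> 11);
}

/*
 * Number of bytes past the end of the 16-entry array that a u32 store
 * to element regno reaches; 0 means the store stays inside the array.
 */
static size_t palette_overrun(unsigned int regno)
{
	assert(regno < CMAP_LEN);
	if (regno < PALETTE_ENTRIES)
		return 0;
	return (regno + 1 - PALETTE_ENTRIES) * sizeof(uint32_t);
}
```

   pack_rgb565(0x6b6b, 0x6b6b, 0x6b6b) comes out as 0x6b4d, which is
   the word found in the redzone of the first report, and
   palette_overrun(255) is 960, so the last color register would be
   stored almost a kilobyte beyond the 64-byte array.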

   Alexey, who only knows how to spell framebuffer and a bit.


* Re: tty-related oops in latest kernel(s)?
  2007-05-30 18:04                     ` Alexey Dobriyan
@ 2007-05-30 23:14                       ` Antonino A. Daplas
  2007-05-30 23:18                         ` David Miller
  2007-05-31  7:17                         ` Geert Uytterhoeven
  0 siblings, 2 replies; 7+ messages in thread
From: Antonino A. Daplas @ 2007-05-30 23:14 UTC
  To: Alexey Dobriyan
  Cc: linux-fbdev-devel, linux-kernel, Pekka Enberg, Tero Roponen,
	Andy Whitcroft, Andrew Morton, Alan Cox

On Wed, 2007-05-30 at 22:04 +0400, Alexey Dobriyan wrote:
> On Wed, May 30, 2007 at 09:09:45AM -0700, Andrew Morton wrote:
> > On Wed, 30 May 2007 19:01:09 +0300 (EEST) Tero Roponen <teanropo@jyu.fi> wrote:
> > 
> > > On Wed, 30 May 2007, Andrew Morton wrote:
> > > 
> > > > On Wed, 30 May 2007 15:02:49 +0300 (EEST) Tero Roponen <teanropo@jyu.fi> wrote:
> > > > 
> > > > > On Wed, 30 May 2007, Pekka Enberg wrote:
> > > > > 
> > > > > > On 5/30/07, Tero Roponen <teanropo@jyu.fi> wrote:
[snip]
> Two suspicious things for me:
> 
> 1)
> 
> --- a/drivers/video/neofb.c
> +++ b/drivers/video/neofb.c
> @@ -1295,7 +1295,7 @@ static int neofb_setcolreg(u_int regno, 
>  		outb(blue >> 10, 0x3c9);
>  		break;
>  	case 16:
> -		((u32 *) fb->pseudo_palette)[regno] =
> +		((u16 *) fb->pseudo_palette)[regno] =

u32 is correct.

>  				((red & 0xf800)) | ((green & 0xfc00) >> 5) |
>  				((blue & 0xf800) >> 11);
>  		break;
> 
> 
> 
> 2) palette in neofb_par is "u32 palette[16];" which is 4x16 = 64 bytes.
>    struct fb_info::pseudo_palette is assigned to it in neo_alloc_fb_info().
>    Yet, we check at the beginning of neofb_setcolreg() for color map
>    length which neofb advertises as 256 which seems too many.
> 

Yes, 256 is too many. The pseudo_palette is used for the 16-color
console only.
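
A guard along the following lines would keep the stores inside the
16-entry array.  This is a userspace sketch of the idea only, not a
patch against neofb.c: struct fake_par, fake_setcolreg() and the canary
field are invented for the illustration, and only the 16 bpp packing is
modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PSEUDO_PALETTE_LEN 16    /* console colors only */
#define CMAP_LEN           256   /* what the driver advertises */

/* Invented stand-in for the driver's private data. */
struct fake_par {
	uint32_t palette[PSEUDO_PALETTE_LEN];
	uint32_t canary;             /* neighbour that must never change */
};

/*
 * Returns 0 on success, 1 if regno is outside the color map.
 * Registers 16..255 are accepted (a real driver would still program
 * the hardware) but never touch the pseudo palette.
 */
static int fake_setcolreg(struct fake_par *par, unsigned int regno,
			  uint16_t red, uint16_t green, uint16_t blue)
{
	assert(par != NULL);
	if (regno >= CMAP_LEN)
		return 1;

	if (regno < PSEUDO_PALETTE_LEN)
		par->palette[regno] = (uint32_t)(red & 0xf800) |
				      (uint32_t)((green & 0xfc00) >> 5) |
				      (uint32_t)((blue & 0xf800) >> 11);
	return 0;
}
```

With the guard in place, a call for regno 200 returns 0 and leaves both
the palette and the canary behind it untouched.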

I'm impressed that this bug has escaped notice for this long. It has
been present since the 2.5.x era.

Probably, the best thing to do is hide the pseudo_palette from the
drivers and move it to the console layer where it belongs to spare
future driver writers from palette usage confusion. That will be a
thankless job.

Tony



-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/


* Re: tty-related oops in latest kernel(s)?
  2007-05-30 23:14                       ` Antonino A. Daplas
@ 2007-05-30 23:18                         ` David Miller
  2007-05-30 23:28                           ` Antonino A. Daplas
  2007-05-31  7:17                         ` Geert Uytterhoeven
  1 sibling, 1 reply; 7+ messages in thread
From: David Miller @ 2007-05-30 23:18 UTC
  To: adaplas
  Cc: linux-fbdev-devel, linux-kernel, penberg, teanropo, apw, akpm,
	adobriyan, alan

From: "Antonino A. Daplas" <adaplas@gmail.com>
Date: Thu, 31 May 2007 07:14:46 +0800

> Yes, 256 is too many. The pseudo_palette is used for the 16-color
> console only.

Many many drivers allocate 256 entries, just FYI :-)  They
all should be fixed up I guess.


* Re: tty-related oops in latest kernel(s)?
  2007-05-30 23:18                         ` David Miller
@ 2007-05-30 23:28                           ` Antonino A. Daplas
  0 siblings, 0 replies; 7+ messages in thread
From: Antonino A. Daplas @ 2007-05-30 23:28 UTC
  To: David Miller
  Cc: linux-fbdev-devel, linux-kernel, penberg, teanropo, apw, akpm,
	adobriyan, alan

On Wed, 2007-05-30 at 16:18 -0700, David Miller wrote:
> From: "Antonino A. Daplas" <adaplas@gmail.com>
> Date: Thu, 31 May 2007 07:14:46 +0800
> 
> > Yes, 256 is too many. The pseudo_palette is used for the 16-color
> > console only.
> 
> Many many drivers allocate 256 entries, just FYI :-)  They
> all should be fixed up I guess.

I did a pseudo_palette allocation audit before; it might be high time to
run one again :-(

Tony



* Re: tty-related oops in latest kernel(s)?
  2007-05-30 23:14                       ` Antonino A. Daplas
  2007-05-30 23:18                         ` David Miller
@ 2007-05-31  7:17                         ` Geert Uytterhoeven
  2007-05-31  9:04                           ` Antonino A. Daplas
  1 sibling, 1 reply; 7+ messages in thread
From: Geert Uytterhoeven @ 2007-05-31  7:17 UTC
  To: linux-fbdev-devel
  Cc: linux-kernel, Pekka Enberg, Tero Roponen, Andy Whitcroft,
	Andrew Morton, Alexey Dobriyan, Alan Cox

On Thu, 31 May 2007, Antonino A. Daplas wrote:
> On Wed, 2007-05-30 at 22:04 +0400, Alexey Dobriyan wrote:
> > 2) palette in neofb_par is "u32 palette[16];" which is 4x16 = 64 bytes.
> >    struct fb_info::pseudo_palette is assigned to it in neo_alloc_fb_info().
> >    Yet, we check at the beginning of neofb_setcolreg() for color map
> >    length which neofb advertises as 256 which seems too many.
> > 
> 
> Yes, 256 is too many. The pseudo_palette is used for the 16-color
> console only.
> 
> I'm impressed that this bug has escaped notice for this long. It has
> been present since the 2.5.x era.
> 
> Probably, the best thing to do is hide the pseudo_palette from the
> drivers and move it to the console layer where it belongs to spare
> future driver writers from palette usage confusion. That will be a
> thankless job.

The console layer doesn't know how to fill in the pseudo palette in all
cases; that's why the drivers have to do it.

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds


* Re: tty-related oops in latest kernel(s)?
  2007-05-31  7:17                         ` Geert Uytterhoeven
@ 2007-05-31  9:04                           ` Antonino A. Daplas
  0 siblings, 0 replies; 7+ messages in thread
From: Antonino A. Daplas @ 2007-05-31  9:04 UTC
  To: linux-fbdev-devel
  Cc: linux-kernel, Pekka Enberg, Tero Roponen, Andy Whitcroft,
	Geert Uytterhoeven, Andrew Morton, Alexey Dobriyan, Alan Cox

On Thu, 2007-05-31 at 09:17 +0200, Geert Uytterhoeven wrote:
> On Thu, 31 May 2007, Antonino A. Daplas wrote:
> > On Wed, 2007-05-30 at 22:04 +0400, Alexey Dobriyan wrote:
> > > 2) palette in neofb_par is "u32 palette[16];" which is 4x16 = 64 bytes.
> > >    struct fb_info::pseudo_palette is assigned to it in neo_alloc_fb_info().
> > >    Yet, we check at the beginning of neofb_setcolreg() for color map
> > >    length which neofb advertises as 256 which seems too many.
> > > 
> > 
> > Yes, 256 is too many. The pseudo_palette is used for the 16-color
> > console only.
> > 
> > I'm impressed that this bug has escaped notice for this long. It has
> > been present since the 2.5.x era.
> > 
> > Probably, the best thing to do is hide the pseudo_palette from the
> > drivers and move it to the console layer where it belongs to spare
> > future driver writers from palette usage confusion. That will be a
> > thankless job.
> 
> The console layer doesn't know how to fill in the pseudo palette in all
> cases; that's why the drivers have to do it.
> 

I have actually started working on that.  It involves breaking down
fb_setcolreg() so it deals only with writing to the actual hardware
registers.  The part of fb_setcolreg() that adds entries to the
pseudo_palette can be separated out as a new method, fb_get_pixel():
given red, green, blue and transp, the driver returns a u32 pixel value
that can be written to the pseudo_palette.

So fbcon can hold a copy of the pseudo_palette and fill it up by
calling info->fbops->fb_get_pixel() successively.
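
Roughly, the split could look like the following.  Everything here is a
userspace mock-up to show the shape of the idea: fb_get_pixel() is only
a proposal at this point, so its signature, struct mock_fbops, struct
mock_fbcon, struct mock_color and the RGB565 example method are all
assumptions, not existing kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FBCON_COLORS 16

/* Assumed shape of the new method: color in, native pixel value out. */
struct mock_fbops {
	uint32_t (*fb_get_pixel)(uint16_t red, uint16_t green,
				 uint16_t blue, uint16_t transp);
};

/* The console, not the driver, owns the 16-entry table. */
struct mock_fbcon {
	uint32_t pseudo_palette[FBCON_COLORS];
};

struct mock_color {
	uint16_t red, green, blue, transp;
};

/* Example driver method for an RGB565 visual; transp is ignored. */
static uint32_t rgb565_get_pixel(uint16_t red, uint16_t green,
				 uint16_t blue, uint16_t transp)
{
	(void)transp;
	return (uint32_t)(red & 0xf800) |
	       (uint32_t)((green & 0xfc00) >> 5) |
	       (uint32_t)((blue & 0xf800) >> 11);
}

/* Fill the console's table by asking the driver once per color. */
static void fbcon_fill_palette(struct mock_fbcon *con,
			       const struct mock_fbops *ops,
			       const struct mock_color *cmap, size_t n)
{
	size_t i;

	assert(con && ops && ops->fb_get_pixel && cmap);
	for (i = 0; i < n && i < FBCON_COLORS; i++)
		con->pseudo_palette[i] = ops->fb_get_pixel(cmap[i].red,
							   cmap[i].green,
							   cmap[i].blue,
							   cmap[i].transp);
}
```

The loop bound is the point of the exercise: the table size lives in one
place, next to the only code that writes to it, so a driver cannot
overrun it no matter how long a color map it advertises.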

This will touch the logo code, the drawing libraries, each driver, etc.,
so it's a lot of work.  During the conversion period, we can support
having info->pseudo_palette and fbcon->pseudo_palette at the same time.
Once all drivers are converted, we can remove info->pseudo_palette.

One use for an fb_get_pixel() method is as an
rgb888-image-to-raw-framebuffer-format converter.
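
For example, a converter built on such a method might look like this.
Again a userspace sketch under assumptions: the callback type mirrors
the proposed fb_get_pixel() but its signature is a guess, and
rgb888_to_native(), widen8() and the 16-bit RGB565 target are made up
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Guessed signature for the proposed method (16-bit components). */
typedef uint32_t (*get_pixel_fn)(uint16_t red, uint16_t green,
				 uint16_t blue, uint16_t transp);

/* Stand-in driver method for an RGB565 framebuffer. */
static uint32_t rgb565_pixel(uint16_t red, uint16_t green,
			     uint16_t blue, uint16_t transp)
{
	(void)transp;
	return (uint32_t)(red & 0xf800) |
	       (uint32_t)((green & 0xfc00) >> 5) |
	       (uint32_t)((blue & 0xf800) >> 11);
}

/* Widen an 8-bit component to 16 bits so that 0xff maps to 0xffff. */
static uint16_t widen8(uint8_t c)
{
	return (uint16_t)((c << 8) | c);
}

/*
 * Convert npixels of packed R,G,B bytes in src into 16-bit native
 * pixels in dst, with one get_pixel() call per pixel.
 */
static void rgb888_to_native(const uint8_t *src, uint16_t *dst,
			     size_t npixels, get_pixel_fn get_pixel)
{
	size_t i;

	assert(src && dst && get_pixel);
	for (i = 0; i < npixels; i++) {
		const uint8_t *p = src + 3 * i;

		dst[i] = (uint16_t)get_pixel(widen8(p[0]), widen8(p[1]),
					     widen8(p[2]), 0);
	}
}
```

The converter itself knows nothing about the pixel layout; swapping in
another driver's method would retarget it to that driver's format.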

Currently, I have only converted vesafb. Once the core code is done,
I'll start converting the rest of the drivers one by one.

Tony





end of thread, other threads:[~2007-05-31  9:04 UTC | newest]

Thread overview: 7+ messages
     [not found] <Pine.LNX.4.64.0705271200360.4955@jalava.cc.jyu.fi>
     [not found] ` <84144f020705280022lf3902caj1def02ed56e0bff@mail.gmail.com>
     [not found]   ` <84144f020705280234g39aa04b3hfe369f4477e6043d@mail.gmail.com>
     [not found]     ` <Pine.LNX.4.64.0705291900001.32656@jalava.cc.jyu.fi>
     [not found]       ` <84144f020705291157k465ec6c4sb81081844bb57514@mail.gmail.com>
     [not found]         ` <Pine.LNX.4.64.0705300654493.5241@jalava.cc.jyu.fi>
     [not found]           ` <84144f020705292254o319f6619m787bf29491c92509@mail.gmail.com>
     [not found]             ` <Pine.LNX.4.64.0705301458400.4634@jalava.cc.jyu.fi>
     [not found]               ` <20070530083953.9909bcef.akpm@linux-foundation.org>
     [not found]                 ` <Pine.LNX.4.64.0705301857370.29485@jalava.cc.jyu.fi>
2007-05-30 16:09                   ` tty-related oops in latest kernel(s)? Andrew Morton
2007-05-30 18:04                     ` Alexey Dobriyan
2007-05-30 23:14                       ` Antonino A. Daplas
2007-05-30 23:18                         ` David Miller
2007-05-30 23:28                           ` Antonino A. Daplas
2007-05-31  7:17                         ` Geert Uytterhoeven
2007-05-31  9:04                           ` Antonino A. Daplas
