* Re: Slub debugging NAND error in 2.6.25.10.atmel.2 [not found] ` <6B5648EA2E2C2D42AF2FF7691522AD92417CA6@wpr01.wprmedical.local> @ 2008-08-29 9:48 ` Haavard Skinnemoen 2008-08-29 10:46 ` Eirik Aanonsen 2008-08-29 14:28 ` MTD/block regression (was Re: Slub debugging NAND error in 2.6.25.10.atmel.2) Haavard Skinnemoen 0 siblings, 2 replies; 8+ messages in thread From: Haavard Skinnemoen @ 2008-08-29 9:48 UTC (permalink / raw) To: Eirik Aanonsen; +Cc: linux-mtd, kernel, David Woodhouse [adding linux-mtd and David back to Cc] "Eirik Aanonsen" <eaa@wprmedical.com> wrote: > > Using physmap partition information > > Creating 5 MTD partitions on "physmap-flash.0": > > 0x00000000-0x00020000 : "u-boot" > > 0x00020000-0x00640000 : "root" > > 0x00640000-0x00720000 : "kernel1" > > 0x00720000-0x007e0000 : "modules" > > 0x007e0000-0x00800000 : "env" > > NAND device: Manufacturer ID: 0xec, Chip ID: 0xd5 (Samsung NAND 2GiB 3,3V 8-bit) > > Scanning device for bad blocks > > Bad eraseblock 31 at 0x00f80000 > > Bad eraseblock 1579 at 0x31580000 > > Bad eraseblock 2921 at 0x5b480000 > > Bad eraseblock 2931 at 0x5b980000 > > Bad eraseblock 3359 at 0x68f80000 > > Creating 1 MTD partitions on "atmel_nand": > > 0x00000000-0x80000000 : "main" > > ============================================================================= > > BUG kmalloc-4096: Poison overwritten > > ----------------------------------------------------------------------------- Hmm...I just saw this when booting 2.6.27-rc5 on the NGW100: physmap platform flash device: 00800000 at 00000000 physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank Amd/Fujitsu Extended Query Table at 0x0041 number of CFI chips: 1 cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness. 
RedBoot partition parsing not available Using physmap partition information Creating 3 MTD partitions on "physmap-flash.0": 0x00000000-0x00020000 : "u-boot" 0x00020000-0x007f0000 : "root" kobject (91ce8410): tried to init an initialized object, something is seriously wrong. Call trace: [<90017184>] dump_stack+0x18/0x20 [<900c1894>] kobject_init+0x28/0x5c [<900c1bf6>] kobject_init_and_add+0xe/0x24 [<900beff0>] blk_register_filter+0x28/0x40 [<900be224>] add_disk+0x38/0x68 [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 [<900e748e>] mtdblock_add_mtd+0x36/0x3c [<900e6e38>] blktrans_notify_add+0x1a/0x3a [<900e533c>] add_mtd_device+0x60/0xa0 [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c [<900e0f24>] platform_drv_probe+0x10/0x12 [<900e06d0>] driver_probe_device+0x84/0xf0 [<900e076a>] __driver_attach+0x2e/0x44 [<900e0096>] bus_for_each_dev+0x2e/0x4c [<900e05b6>] driver_attach+0x12/0x14 [<900e036c>] bus_add_driver+0x6c/0x178 [<900e08a4>] driver_register+0x58/0xb0 [<900e1126>] platform_driver_register+0x56/0x5c [<9000aaf6>] physmap_init+0xa/0x10 [<9001422a>] do_one_initcall+0x2a/0x10c [<900005b8>] kernel_init+0x48/0x90 [<9001fcc0>] do_exit+0x0/0x4cc 0x007f0000-0x00800000 : "env" kobject (91ce8410): tried to init an initialized object, something is seriously wrong. 
Call trace: [<90017184>] dump_stack+0x18/0x20 [<900c1894>] kobject_init+0x28/0x5c [<900c1bf6>] kobject_init_and_add+0xe/0x24 [<900beff0>] blk_register_filter+0x28/0x40 [<900be224>] add_disk+0x38/0x68 [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 [<900e748e>] mtdblock_add_mtd+0x36/0x3c [<900e6e38>] blktrans_notify_add+0x1a/0x3a [<900e533c>] add_mtd_device+0x60/0xa0 [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c [<900e0f24>] platform_drv_probe+0x10/0x12 [<900e06d0>] driver_probe_device+0x84/0xf0 [<900e076a>] __driver_attach+0x2e/0x44 [<900e0096>] bus_for_each_dev+0x2e/0x4c [<900e05b6>] driver_attach+0x12/0x14 [<900e036c>] bus_add_driver+0x6c/0x178 [<900e08a4>] driver_register+0x58/0xb0 [<900e1126>] platform_driver_register+0x56/0x5c [<9000aaf6>] physmap_init+0xa/0x10 [<9001422a>] do_one_initcall+0x2a/0x10c [<900005b8>] kernel_init+0x48/0x90 [<9001fcc0>] do_exit+0x0/0x4cc I wonder if it's related? Haavard [fullquoting since others might find the below interesting] > > INFO: 0x91c0a0c0-0x91c0a13f. 
First byte 0xff instead of 0x6b > > INFO: Slab 0x90157100 used=2 fp=0x91c0a080 flags=0x40c2 > > INFO: Object 0x91c0a080 @offset=8320 fp=0x91c0b0c0 > > > > Bytes b4 0x91c0a070: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > > Object 0x91c0a080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > Object 0x91c0a090: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > Object 0x91c0a0a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > Object 0x91c0a0b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > > Object 0x91c0a0c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ > > Object 0x91c0a0d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ > > Object 0x91c0a0e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ > > Object 0x91c0a0f0: ff ff ff ff ff ff ff ff ff ff ff ff 3d ff 3d ff ÿÿÿÿÿÿÿÿÿÿÿÿ=ÿ=ÿ > > Redzone 0x91c0b080: bb bb bb bb »»»» > > Padding 0x91c0b0a8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > > Padding 0x91c0b0b8: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ > > Call trace: > > [<90010680>] dump_stack+0x18/0x20 > > [<90048468>] print_trailer+0xdc/0x108 > > [<90048500>] check_bytes_and_report+0x6c/0x8c > > [<900486b0>] check_object+0x84/0x170 > > [<900490ae>] __slab_alloc+0x2c2/0x35c > > [<900499fc>] kmem_cache_alloc+0x20/0x50 > > [<900967d8>] kobject_uevent_env+0x6c/0x1c8 > > [<9009693c>] kobject_uevent+0x8/0xc > > [<90071c5e>] register_disk+0xbe/0xf0 > > [<90092d88>] add_disk+0x2c/0x38 > > [<900b25cc>] add_mtd_blktrans_dev+0x16c/0x17c > > [<900b296a>] mtdblock_add_mtd+0x36/0x3c > > [<900b233c>] blktrans_notify_add+0x1a/0x36 > > [<900b0832>] add_mtd_device+0x62/0x9c > > [<900b147a>] add_mtd_partitions+0x386/0x3a8 > > [<900090dc>] atmel_nand_probe+0x2b8/0x300 > > [<900aeaf0>] platform_drv_probe+0x10/0x12 > > [<900ad9c8>] driver_probe_device+0x7c/0xe8 > > [<900adb22>] __driver_attach+0x4e/0x88 > > [<900ad12e>] 
bus_for_each_dev+0x2e/0x4c
> > [<900ad8b6>] driver_attach+0x12/0x14
> > [<900ad700>] bus_add_driver+0x6c/0x178
> > [<900adcda>] driver_register+0x3e/0x94
> > [<900aec5a>] platform_driver_register+0x4a/0x50
> > [<900aec6a>] platform_driver_probe+0xa/0x38
> > [<90008e1e>] atmel_nand_init+0xe/0x14
> > [<900003e2>] kernel_init+0x8e/0x1c8
> > [<9001966c>] do_exit+0x0/0x428
> >
> > FIX kmalloc-4096: Restoring 0x91c0a0c0-0x91c0a13f=0x6b
> >
> > FIX kmalloc-4096: Marking all objects used
> > atmel_spi atmel_spi.0: Atmel SPI Controller at 0xffe00000 (irq 3)
> > atmel_usba_udc atmel_usba_udc.0: MMIO registers at 0xfff03000 mapped at fff03000
> > atmel_usba_udc atmel_usba_udc.0: FIFO at 0xff300000 mapped at ff300000
> > at32ap700x_rtc at32ap700x_rtc.0: rtc core: registered at32ap700x_rtc as rtc0
> > at32ap700x_rtc at32ap700x_rtc.0: Atmel RTC for AT32AP700x at fff00080 irq 21
> > at32_wdt at32_wdt.0: AT32AP700X WDT at 0xfff000b0, timeout 2 sec (nowayout=0)
> > cpufreq: AT32AP CPU frequency driver
> > at32ap700x_rtc at32ap700x_rtc.0: setting system clock to 1970-01-01 00:00:00 UTC (0)
> > VFS: Mounted root (jffs2 filesystem).
> > Freeing init memory: 48K (90000000 - 9000c000)
> >
> > It seems the allocation that triggers this error is reached through
> > the following chain:
> >
> > drivers/mtd/nand/atmel_nand.c line 634:
> >     return platform_driver_probe(&atmel_nand_driver, atmel_nand_probe);
> >
> > ...
> >
> > fs/partitions/check.c line 446:
> >     kobject_uevent(&disk->dev.kobj, KOBJ_ADD);
> >
> > lib/kobject_uevent.c line 156:
> >     env = kzalloc(sizeof(struct kobj_uevent_env), GFP_KERNEL);
> >
> > include/linux/slab.h line 271:
> >     return kmalloc(size, flags | __GFP_ZERO);
> >
> > The struct needs 2184 bytes, and kzalloc is called with size 2184 as
> > well. The allocation therefore comes from the 4 KB cache
> > (kmalloc-4096), where an object that should be free has been
> > overwritten.
> >
> > The function in check.c is called many times when registering
> > different partitions, and it only fails when the atmel_nand driver
> > is involved.
> >
> > Does anyone know how I can go from here to try and locate what is wrong?
> >
> > The error occurs in both 2.6.25.10.atmel.2 and 2.6.25.6.atmel.1.
> >
> > ____________________________________________________
> >
> > Eirik Aanonsen
> > SW Developer
> > E-mail: eaa@wprmedical.com
> > Phone: +47 90 68 11 92
> > Fax: +47 37 03 56 77
> > ____________________________________________________
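The "Poison overwritten" report quoted above can be illustrated with a small userspace sketch. It is not the SLUB implementation: the helper names are hypothetical, and size_class() is a simplification that ignores the non-power-of-two kmalloc caches. The poison byte values 0x6b and 0xa5 are the ones defined in include/linux/poison.h. The sketch shows why a 2184-byte request lands in a 4096-byte size class, and how scanning a freed object for its poison pattern finds the first overwritten byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Poison values used by slab debugging (see include/linux/poison.h). */
#define POISON_FREE 0x6b /* fills a freed object */
#define POISON_END  0xa5 /* marks the last byte of a freed object */

/* Hypothetical helper: fill a freed object with the poison pattern. */
static void poison_object(unsigned char *obj, size_t size)
{
    memset(obj, POISON_FREE, size - 1);
    obj[size - 1] = POISON_END;
}

/*
 * Hypothetical helper: return the offset of the first byte that no longer
 * holds the expected poison value, or -1 if the object is intact.
 */
static long first_corrupt_byte(const unsigned char *obj, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        unsigned char want = (i == size - 1) ? POISON_END : POISON_FREE;

        if (obj[i] != want)
            return (long)i;
    }
    return -1;
}

/*
 * Hypothetical helper: smallest power-of-two size class that fits `size`.
 * Simplified on purpose; it ignores caches that are not powers of two.
 */
static size_t size_class(size_t size)
{
    size_t c = 8;

    while (c < size)
        c <<= 1;
    return c;
}
```

In the report, the object starts at 0x91c0a080 and the damaged range starts at 0x91c0a0c0, which is offset 0x40 into the object. A stray write of 0xff bytes at that offset is what a scan like first_corrupt_byte() would flag.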
* RE: Slub debugging NAND error in 2.6.25.10.atmel.2 2008-08-29 9:48 ` Slub debugging NAND error in 2.6.25.10.atmel.2 Haavard Skinnemoen @ 2008-08-29 10:46 ` Eirik Aanonsen 2008-08-29 11:29 ` Haavard Skinnemoen 2008-08-29 14:28 ` MTD/block regression (was Re: Slub debugging NAND error in 2.6.25.10.atmel.2) Haavard Skinnemoen 1 sibling, 1 reply; 8+ messages in thread From: Eirik Aanonsen @ 2008-08-29 10:46 UTC (permalink / raw) To: Haavard Skinnemoen; +Cc: linux-mtd, kernel, David Woodhouse [-- Attachment #1: Type: text/plain, Size: 4345 bytes --] >"Eirik Aanonsen" <eaa@wprmedical.com> wrote: >> > Using physmap partition information >> > Creating 5 MTD partitions on "physmap-flash.0": >> > 0x00000000-0x00020000 : "u-boot" >> > 0x00020000-0x00640000 : "root" >> > 0x00640000-0x00720000 : "kernel1" >> > 0x00720000-0x007e0000 : "modules" >> > 0x007e0000-0x00800000 : "env" >> > NAND device: Manufacturer ID: 0xec, Chip ID: 0xd5 (Samsung NAND 2GiB >3,3V 8-bit) >> > Scanning device for bad blocks >> > Bad eraseblock 31 at 0x00f80000 >> > Bad eraseblock 1579 at 0x31580000 >> > Bad eraseblock 2921 at 0x5b480000 >> > Bad eraseblock 2931 at 0x5b980000 >> > Bad eraseblock 3359 at 0x68f80000 >> > Creating 1 MTD partitions on "atmel_nand": >> > 0x00000000-0x80000000 : "main" >> > >======================================================================== >===== >> > BUG kmalloc-4096: Poison overwritten >> > -------------------------------------------------------------------- >--------- > >Hmm...I just saw this when booting 2.6.27-rc5 on the NGW100: > >physmap platform flash device: 00800000 at 00000000 >physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank > Amd/Fujitsu Extended Query Table at 0x0041 >number of CFI chips: 1 >cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness. 
>RedBoot partition parsing not available >Using physmap partition information >Creating 3 MTD partitions on "physmap-flash.0": >0x00000000-0x00020000 : "u-boot" >0x00020000-0x007f0000 : "root" >kobject (91ce8410): tried to init an initialized object, something is >seriously wrong. >Call trace: > [<90017184>] dump_stack+0x18/0x20 > [<900c1894>] kobject_init+0x28/0x5c > [<900c1bf6>] kobject_init_and_add+0xe/0x24 > [<900beff0>] blk_register_filter+0x28/0x40 > [<900be224>] add_disk+0x38/0x68 > [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 > [<900e748e>] mtdblock_add_mtd+0x36/0x3c > [<900e6e38>] blktrans_notify_add+0x1a/0x3a > [<900e533c>] add_mtd_device+0x60/0xa0 > [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 > [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c > [<900e0f24>] platform_drv_probe+0x10/0x12 > [<900e06d0>] driver_probe_device+0x84/0xf0 > [<900e076a>] __driver_attach+0x2e/0x44 > [<900e0096>] bus_for_each_dev+0x2e/0x4c > [<900e05b6>] driver_attach+0x12/0x14 > [<900e036c>] bus_add_driver+0x6c/0x178 > [<900e08a4>] driver_register+0x58/0xb0 > [<900e1126>] platform_driver_register+0x56/0x5c > [<9000aaf6>] physmap_init+0xa/0x10 > [<9001422a>] do_one_initcall+0x2a/0x10c > [<900005b8>] kernel_init+0x48/0x90 > [<9001fcc0>] do_exit+0x0/0x4cc > >0x007f0000-0x00800000 : "env" >kobject (91ce8410): tried to init an initialized object, something is >seriously wrong. 
>Call trace:
> [<90017184>] dump_stack+0x18/0x20
> [<900c1894>] kobject_init+0x28/0x5c
> [<900c1bf6>] kobject_init_and_add+0xe/0x24
> [<900beff0>] blk_register_filter+0x28/0x40
> [<900be224>] add_disk+0x38/0x68
> [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184
> [<900e748e>] mtdblock_add_mtd+0x36/0x3c
> [<900e6e38>] blktrans_notify_add+0x1a/0x3a
> [<900e533c>] add_mtd_device+0x60/0xa0
> [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0
> [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c
> [<900e0f24>] platform_drv_probe+0x10/0x12
> [<900e06d0>] driver_probe_device+0x84/0xf0
> [<900e076a>] __driver_attach+0x2e/0x44
> [<900e0096>] bus_for_each_dev+0x2e/0x4c
> [<900e05b6>] driver_attach+0x12/0x14
> [<900e036c>] bus_add_driver+0x6c/0x178
> [<900e08a4>] driver_register+0x58/0xb0
> [<900e1126>] platform_driver_register+0x56/0x5c
> [<9000aaf6>] physmap_init+0xa/0x10
> [<9001422a>] do_one_initcall+0x2a/0x10c
> [<900005b8>] kernel_init+0x48/0x90
> [<9001fcc0>] do_exit+0x0/0x4cc
>
>I wonder if it's related?
>
>Haavard
>

The fault from my posts here seems to be caused by a definition that is
wrong for my board. I had to change include/linux/mtd/nand.h, replacing:

#define NAND_MAX_OOBSIZE 64
#define NAND_MAX_PAGESIZE 2048

With:

#define NAND_MAX_OOBSIZE 128
#define NAND_MAX_PAGESIZE 4096

The NAND driver did not allocate enough memory to handle my NAND flash.
This is what caused the memory to be overwritten in my case.

Would it be useful to have these values in Kconfig instead of in a
header, since they differ a lot from chip to chip?

Eirik Aanonsen

[-- Attachment #2: neko_test.elf --]
[-- Type: application/octet-stream, Size: 665804 bytes --]
* Re: Slub debugging NAND error in 2.6.25.10.atmel.2
  2008-08-29 10:46 ` Eirik Aanonsen
@ 2008-08-29 11:29 ` Haavard Skinnemoen
  0 siblings, 0 replies; 8+ messages in thread
From: Haavard Skinnemoen @ 2008-08-29 11:29 UTC
To: Eirik Aanonsen; +Cc: linux-mtd, kernel, David Woodhouse

"Eirik Aanonsen" <eaa@wprmedical.com> wrote:
> The fault from my posts here seems to be caused by a definition that is
> wrong for my board. I had to change include/linux/mtd/nand.h, replacing:
> #define NAND_MAX_OOBSIZE 64
> #define NAND_MAX_PAGESIZE 2048
> With:
> #define NAND_MAX_OOBSIZE 128
> #define NAND_MAX_PAGESIZE 4096
> The NAND driver did not allocate enough memory to handle my NAND flash.
> This is what caused the memory to be overwritten in my case.

Ah. That makes sense. It probably explains that weird oops loop you got
too -- all kinds of weird things can happen when memory is overwritten
like this.

> Would it be useful to have these values in Kconfig instead of in a
> header, since they differ a lot from chip to chip?

I think they should just be adjusted. At least that's what the comment
directly above them says:

/* This constant declares the max. oobsize / page, which
 * is supported now. If you add a chip with bigger oobsize/page
 * adjust this accordingly.
 */

Haavard
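The effect of limits that are too small can be sketched in a few lines of C. This is only an illustration: struct nand_buffers_sketch and both helpers are hypothetical stand-ins, not the kernel's actual structures, and the 4096-byte page with 128 bytes of OOB is an assumption based on the new values quoted above. The sketch shows that a buffer sized from the old limits cannot hold one page plus OOB of such a chip, so a driver that copies the full page into it would write past the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Limits as quoted above, before the change. */
#define NAND_MAX_OOBSIZE  64
#define NAND_MAX_PAGESIZE 2048

/* Hypothetical, simplified stand-in for a statically sized page buffer. */
struct nand_buffers_sketch {
    unsigned char databuf[NAND_MAX_PAGESIZE + NAND_MAX_OOBSIZE];
};

/* Hypothetical check: does a chip's geometry fit the compile-time limits? */
static bool nand_geometry_fits(size_t pagesize, size_t oobsize)
{
    return pagesize <= NAND_MAX_PAGESIZE && oobsize <= NAND_MAX_OOBSIZE;
}

/*
 * Hypothetical helper: how many bytes would land beyond the buffer if a
 * full page plus OOB were copied into it without any bounds check.
 */
static size_t nand_overrun_bytes(size_t pagesize, size_t oobsize)
{
    size_t need = pagesize + oobsize;
    size_t have = sizeof(((struct nand_buffers_sketch *)0)->databuf);

    return need > have ? need - have : 0;
}
```

A probe-time check along the lines of nand_geometry_fits() would turn silent memory corruption into an explicit failure. Whether that, a Kconfig option, or simply raising the constants is the right approach is the question discussed in this message.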
* MTD/block regression (was Re: Slub debugging NAND error in 2.6.25.10.atmel.2) 2008-08-29 9:48 ` Slub debugging NAND error in 2.6.25.10.atmel.2 Haavard Skinnemoen 2008-08-29 10:46 ` Eirik Aanonsen @ 2008-08-29 14:28 ` Haavard Skinnemoen 2008-08-30 1:34 ` FUJITA Tomonori 1 sibling, 1 reply; 8+ messages in thread From: Haavard Skinnemoen @ 2008-08-29 14:28 UTC (permalink / raw) To: Haavard Skinnemoen Cc: Eirik Aanonsen, kernel, linux-kernel, Tomonori, linux-mtd, Jens Axboe, David Woodhouse, FUJITA Haavard Skinnemoen <haavard.skinnemoen@atmel.com> wrote: > Hmm...I just saw this when booting 2.6.27-rc5 on the NGW100: > > physmap platform flash device: 00800000 at 00000000 > physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank > Amd/Fujitsu Extended Query Table at 0x0041 > number of CFI chips: 1 > cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness. > RedBoot partition parsing not available > Using physmap partition information > Creating 3 MTD partitions on "physmap-flash.0": > 0x00000000-0x00020000 : "u-boot" > 0x00020000-0x007f0000 : "root" > kobject (91ce8410): tried to init an initialized object, something is seriously wrong. 
> Call trace: > [<90017184>] dump_stack+0x18/0x20 > [<900c1894>] kobject_init+0x28/0x5c > [<900c1bf6>] kobject_init_and_add+0xe/0x24 > [<900beff0>] blk_register_filter+0x28/0x40 > [<900be224>] add_disk+0x38/0x68 > [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 > [<900e748e>] mtdblock_add_mtd+0x36/0x3c > [<900e6e38>] blktrans_notify_add+0x1a/0x3a > [<900e533c>] add_mtd_device+0x60/0xa0 > [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 > [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c > [<900e0f24>] platform_drv_probe+0x10/0x12 > [<900e06d0>] driver_probe_device+0x84/0xf0 > [<900e076a>] __driver_attach+0x2e/0x44 > [<900e0096>] bus_for_each_dev+0x2e/0x4c > [<900e05b6>] driver_attach+0x12/0x14 > [<900e036c>] bus_add_driver+0x6c/0x178 > [<900e08a4>] driver_register+0x58/0xb0 > [<900e1126>] platform_driver_register+0x56/0x5c > [<9000aaf6>] physmap_init+0xa/0x10 > [<9001422a>] do_one_initcall+0x2a/0x10c > [<900005b8>] kernel_init+0x48/0x90 > [<9001fcc0>] do_exit+0x0/0x4cc > > 0x007f0000-0x00800000 : "env" > kobject (91ce8410): tried to init an initialized object, something is seriously wrong. 
> Call trace: > [<90017184>] dump_stack+0x18/0x20 > [<900c1894>] kobject_init+0x28/0x5c > [<900c1bf6>] kobject_init_and_add+0xe/0x24 > [<900beff0>] blk_register_filter+0x28/0x40 > [<900be224>] add_disk+0x38/0x68 > [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 > [<900e748e>] mtdblock_add_mtd+0x36/0x3c > [<900e6e38>] blktrans_notify_add+0x1a/0x3a > [<900e533c>] add_mtd_device+0x60/0xa0 > [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 > [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c > [<900e0f24>] platform_drv_probe+0x10/0x12 > [<900e06d0>] driver_probe_device+0x84/0xf0 > [<900e076a>] __driver_attach+0x2e/0x44 > [<900e0096>] bus_for_each_dev+0x2e/0x4c > [<900e05b6>] driver_attach+0x12/0x14 > [<900e036c>] bus_add_driver+0x6c/0x178 > [<900e08a4>] driver_register+0x58/0xb0 > [<900e1126>] platform_driver_register+0x56/0x5c > [<9000aaf6>] physmap_init+0xa/0x10 > [<9001422a>] do_one_initcall+0x2a/0x10c > [<900005b8>] kernel_init+0x48/0x90 > [<9001fcc0>] do_exit+0x0/0x4cc > > I wonder if it's related? Ok, it turns out it's not related. It's a newly introduced regression which I've bisected down to: commit abf5439370491dd6fbb4fe1a7939680d2a9bc9d4 Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Date: Sat Aug 16 14:10:05 2008 +0900 block: move cmdfilter from gendisk to request_queue cmd_filter works only for the block layer SG_IO with SCSI block devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI character devices (such as st). We hit a kernel crash with them. The problem is that cmd_filter code accesses to gendisk (having struct blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. It works for only SCSI block device files. With character device files, inode->i_bdev leads you to struct cdev. inode->i_bdev->bd_disk->blk_scsi_cmd_filter isn't safe. SCSI ULDs don't expose gendisk; they keep it private. bsg needs to be independent on any protocols. We shouldn't change ULDs to expose their gendisk. 
    This patch moves struct blk_scsi_cmd_filter from gendisk to
    request_queue, a common object, which everyone can access.

    The user interface doesn't change; users can change the filters via
    /sys/block/. gendisk has a pointer to request_queue so the cmd_filter
    code accesses struct blk_scsi_cmd_filter through it.

    Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Unfortunately, I can't revert it cleanly, so it could be a false
positive. But it does sort of make sense, since it makes the filter
per-queue instead of per-gendisk, so if MTD uses the same queue for
several block devices, the filter kobject might end up being
initialized multiple times. Or something.

Haavard
* Re: MTD/block regression (was Re: Slub debugging NAND error in 2.6.25.10.atmel.2) 2008-08-29 14:28 ` MTD/block regression (was Re: Slub debugging NAND error in 2.6.25.10.atmel.2) Haavard Skinnemoen @ 2008-08-30 1:34 ` FUJITA Tomonori 2008-08-30 10:49 ` Haavard Skinnemoen 2008-08-31 12:00 ` Geert Uytterhoeven 0 siblings, 2 replies; 8+ messages in thread From: FUJITA Tomonori @ 2008-08-30 1:34 UTC (permalink / raw) To: haavard.skinnemoen Cc: eaa, kernel, linux-kernel, fujita.tomonori, linux-mtd, jens.axboe, dwmw2 On Fri, 29 Aug 2008 16:28:24 +0200 Haavard Skinnemoen <haavard.skinnemoen@atmel.com> wrote: > Haavard Skinnemoen <haavard.skinnemoen@atmel.com> wrote: > > Hmm...I just saw this when booting 2.6.27-rc5 on the NGW100: > > > > physmap platform flash device: 00800000 at 00000000 > > physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank > > Amd/Fujitsu Extended Query Table at 0x0041 > > number of CFI chips: 1 > > cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness. > > RedBoot partition parsing not available > > Using physmap partition information > > Creating 3 MTD partitions on "physmap-flash.0": > > 0x00000000-0x00020000 : "u-boot" > > 0x00020000-0x007f0000 : "root" > > kobject (91ce8410): tried to init an initialized object, something is seriously wrong. 
> > Call trace: > > [<90017184>] dump_stack+0x18/0x20 > > [<900c1894>] kobject_init+0x28/0x5c > > [<900c1bf6>] kobject_init_and_add+0xe/0x24 > > [<900beff0>] blk_register_filter+0x28/0x40 > > [<900be224>] add_disk+0x38/0x68 > > [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 > > [<900e748e>] mtdblock_add_mtd+0x36/0x3c > > [<900e6e38>] blktrans_notify_add+0x1a/0x3a > > [<900e533c>] add_mtd_device+0x60/0xa0 > > [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 > > [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c > > [<900e0f24>] platform_drv_probe+0x10/0x12 > > [<900e06d0>] driver_probe_device+0x84/0xf0 > > [<900e076a>] __driver_attach+0x2e/0x44 > > [<900e0096>] bus_for_each_dev+0x2e/0x4c > > [<900e05b6>] driver_attach+0x12/0x14 > > [<900e036c>] bus_add_driver+0x6c/0x178 > > [<900e08a4>] driver_register+0x58/0xb0 > > [<900e1126>] platform_driver_register+0x56/0x5c > > [<9000aaf6>] physmap_init+0xa/0x10 > > [<9001422a>] do_one_initcall+0x2a/0x10c > > [<900005b8>] kernel_init+0x48/0x90 > > [<9001fcc0>] do_exit+0x0/0x4cc > > > > 0x007f0000-0x00800000 : "env" > > kobject (91ce8410): tried to init an initialized object, something is seriously wrong. 
> > Call trace: > > [<90017184>] dump_stack+0x18/0x20 > > [<900c1894>] kobject_init+0x28/0x5c > > [<900c1bf6>] kobject_init_and_add+0xe/0x24 > > [<900beff0>] blk_register_filter+0x28/0x40 > > [<900be224>] add_disk+0x38/0x68 > > [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 > > [<900e748e>] mtdblock_add_mtd+0x36/0x3c > > [<900e6e38>] blktrans_notify_add+0x1a/0x3a > > [<900e533c>] add_mtd_device+0x60/0xa0 > > [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 > > [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c > > [<900e0f24>] platform_drv_probe+0x10/0x12 > > [<900e06d0>] driver_probe_device+0x84/0xf0 > > [<900e076a>] __driver_attach+0x2e/0x44 > > [<900e0096>] bus_for_each_dev+0x2e/0x4c > > [<900e05b6>] driver_attach+0x12/0x14 > > [<900e036c>] bus_add_driver+0x6c/0x178 > > [<900e08a4>] driver_register+0x58/0xb0 > > [<900e1126>] platform_driver_register+0x56/0x5c > > [<9000aaf6>] physmap_init+0xa/0x10 > > [<9001422a>] do_one_initcall+0x2a/0x10c > > [<900005b8>] kernel_init+0x48/0x90 > > [<9001fcc0>] do_exit+0x0/0x4cc > > > > I wonder if it's related? > > Ok, it turns out it's not related. It's a newly introduced regression > which I've bisected down to: > > commit abf5439370491dd6fbb4fe1a7939680d2a9bc9d4 > Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> > Date: Sat Aug 16 14:10:05 2008 +0900 > > block: move cmdfilter from gendisk to request_queue Really sorry about that. A fix was queued in Jens' tree: http://marc.info/?l=linux-kernel&m=122000748432301&w=2 > Unfortunately, I can't revert it cleanly, so it could be a false > positive. But it does sort of make sense, since it makes the filter > per-queue instead of per-gendisk, so if MTD uses the same queue for > several block devices, the filter kobject might end up being > initialized multiple times. Or something. Right, the problem is that MTD uses the same queue for multiple gendisks. It would be great if a MTD developer could fix it. Thanks, ^ permalink raw reply [flat|nested] 8+ messages in thread
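The explanation above (one queue shared by several gendisks, with the filter kobject now living in the queue) can be modelled with a small userspace sketch. All type and function names here are hypothetical and heavily simplified; this is neither the block layer code nor the fix queued in Jens' tree. It only shows why registering three disks on one shared queue produces two "tried to init an initialized object" warnings, which matches the log: none for "u-boot", one each for "root" and "env".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of an object that must be initialized only once. */
struct kobj_sketch {
    bool state_initialized;
};

/* Hypothetical miniature of a queue that owns the filter object. */
struct queue_sketch {
    struct kobj_sketch filter_kobj;
};

/* Hypothetical miniature of a disk that points at a (possibly shared) queue. */
struct disk_sketch {
    struct queue_sketch *queue;
};

/* Counts how often an already initialized object was initialized again. */
static int init_warnings;

static void kobj_init_sketch(struct kobj_sketch *k)
{
    if (k->state_initialized)
        init_warnings++; /* the kernel prints a warning and a call trace here */
    k->state_initialized = true;
}

/* Registers the per-queue filter object every time a disk is added. */
static void add_disk_sketch(struct disk_sketch *d)
{
    kobj_init_sketch(&d->queue->filter_kobj);
}

/* One possible guard, shown only for illustration: skip if already done. */
static void add_disk_guarded_sketch(struct disk_sketch *d)
{
    if (!d->queue->filter_kobj.state_initialized)
        kobj_init_sketch(&d->queue->filter_kobj);
}
```

With a per-disk object, each add would touch a fresh object and no warning would appear. Once the object moves into the shared queue, every disk after the first hits the already initialized object unless something like the guard is in place.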
* Re: MTD/block regression (was Re: Slub debugging NAND error in 2.6.25.10.atmel.2) 2008-08-30 1:34 ` FUJITA Tomonori @ 2008-08-30 10:49 ` Haavard Skinnemoen 2008-08-31 12:00 ` Geert Uytterhoeven 1 sibling, 0 replies; 8+ messages in thread From: Haavard Skinnemoen @ 2008-08-30 10:49 UTC (permalink / raw) To: FUJITA Tomonori Cc: eaa, kernel, linux-kernel, fujita.tomonori, linux-mtd, jens.axboe, dwmw2 On Sat, 30 Aug 2008 10:34:12 +0900 FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote: > Really sorry about that. A fix was queued in Jens' tree: > > http://marc.info/?l=linux-kernel&m=122000748432301&w=2 Ah, great. Sorry for not searching the list before posting. > > Unfortunately, I can't revert it cleanly, so it could be a false > > positive. But it does sort of make sense, since it makes the filter > > per-queue instead of per-gendisk, so if MTD uses the same queue for > > several block devices, the filter kobject might end up being > > initialized multiple times. Or something. > > Right, the problem is that MTD uses the same queue for multiple > gendisks. It would be great if a MTD developer could fix it. Yeah, I sort of suspected that MTD was doing something unusual. Unfortunately, I'm not familiar enough with the MTD and block code to help out here... Haavard ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: MTD/block regression (was Re: Slub debugging NAND error in 2.6.25.10.atmel.2) 2008-08-30 1:34 ` FUJITA Tomonori 2008-08-30 10:49 ` Haavard Skinnemoen @ 2008-08-31 12:00 ` Geert Uytterhoeven 2008-08-31 16:55 ` Jens Axboe 1 sibling, 1 reply; 8+ messages in thread From: Geert Uytterhoeven @ 2008-08-31 12:00 UTC (permalink / raw) To: FUJITA Tomonori Cc: eaa, Linux/m68k, kernel, Linux Kernel Development, linux-mtd, jens.axboe, haavard.skinnemoen, Andrew Morton, dwmw2 On Sat, 30 Aug 2008, FUJITA Tomonori wrote: > On Fri, 29 Aug 2008 16:28:24 +0200 > Haavard Skinnemoen <haavard.skinnemoen@atmel.com> wrote: > > Haavard Skinnemoen <haavard.skinnemoen@atmel.com> wrote: > > > Hmm...I just saw this when booting 2.6.27-rc5 on the NGW100: > > > > > > kobject (91ce8410): tried to init an initialized object, something is seriously wrong. > > > Call trace: > > > [<90017184>] dump_stack+0x18/0x20 > > > [<900c1894>] kobject_init+0x28/0x5c > > > [<900c1bf6>] kobject_init_and_add+0xe/0x24 > > > [<900beff0>] blk_register_filter+0x28/0x40 > > > [<900be224>] add_disk+0x38/0x68 > > > [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 > > > [<900e748e>] mtdblock_add_mtd+0x36/0x3c > > > [<900e6e38>] blktrans_notify_add+0x1a/0x3a > > > [<900e533c>] add_mtd_device+0x60/0xa0 > > > [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 > > > [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c > > > [<900e0f24>] platform_drv_probe+0x10/0x12 > > > [<900e06d0>] driver_probe_device+0x84/0xf0 > > > [<900e076a>] __driver_attach+0x2e/0x44 > > > [<900e0096>] bus_for_each_dev+0x2e/0x4c > > > [<900e05b6>] driver_attach+0x12/0x14 > > > [<900e036c>] bus_add_driver+0x6c/0x178 > > > [<900e08a4>] driver_register+0x58/0xb0 > > > [<900e1126>] platform_driver_register+0x56/0x5c > > > [<9000aaf6>] physmap_init+0xa/0x10 > > > [<9001422a>] do_one_initcall+0x2a/0x10c > > > [<900005b8>] kernel_init+0x48/0x90 > > > [<9001fcc0>] do_exit+0x0/0x4cc > > > > Ok, it turns out it's not related. 
It's a newly introduced regression > > which I've bisected down to: > > > > commit abf5439370491dd6fbb4fe1a7939680d2a9bc9d4 > > Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> > > Date: Sat Aug 16 14:10:05 2008 +0900 > > > > block: move cmdfilter from gendisk to request_queue > > Really sorry about that. A fix was queued in Jens' tree: > > http://marc.info/?l=linux-kernel&m=122000748432301&w=2 > > > > Unfortunately, I can't revert it cleanly, so it could be a false > > positive. But it does sort of make sense, since it makes the filter > > per-queue instead of per-gendisk, so if MTD uses the same queue for > > several block devices, the filter kobject might end up being > > initialized multiple times. Or something. > > Right, the problem is that MTD uses the same queue for multiple > gendisks. It would be great if a MTD developer could fix it. I'm also seeing it with drivers/block/ataflop.c (also a single queue) on ARAnyM. And from looking at drivers/block/floppy.c and drivers/block/amiflop.c, I guess it happens there, too. Any other single-queue drivers that got broken??? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: MTD/block regression (was Re: Slub debugging NAND error in 2.6.25.10.atmel.2) 2008-08-31 12:00 ` Geert Uytterhoeven @ 2008-08-31 16:55 ` Jens Axboe 0 siblings, 0 replies; 8+ messages in thread From: Jens Axboe @ 2008-08-31 16:55 UTC (permalink / raw) To: Geert Uytterhoeven Cc: eaa, Linux/m68k, kernel, Linux Kernel Development, FUJITA Tomonori, haavard.skinnemoen, linux-mtd, Andrew Morton, dwmw2 On Sun, Aug 31 2008, Geert Uytterhoeven wrote: > On Sat, 30 Aug 2008, FUJITA Tomonori wrote: > > On Fri, 29 Aug 2008 16:28:24 +0200 > > Haavard Skinnemoen <haavard.skinnemoen@atmel.com> wrote: > > > Haavard Skinnemoen <haavard.skinnemoen@atmel.com> wrote: > > > > Hmm...I just saw this when booting 2.6.27-rc5 on the NGW100: > > > > > > > > kobject (91ce8410): tried to init an initialized object, something is seriously wrong. > > > > Call trace: > > > > [<90017184>] dump_stack+0x18/0x20 > > > > [<900c1894>] kobject_init+0x28/0x5c > > > > [<900c1bf6>] kobject_init_and_add+0xe/0x24 > > > > [<900beff0>] blk_register_filter+0x28/0x40 > > > > [<900be224>] add_disk+0x38/0x68 > > > > [<900e70f0>] add_mtd_blktrans_dev+0x174/0x184 > > > > [<900e748e>] mtdblock_add_mtd+0x36/0x3c > > > > [<900e6e38>] blktrans_notify_add+0x1a/0x3a > > > > [<900e533c>] add_mtd_device+0x60/0xa0 > > > > [<900e5f7e>] add_mtd_partitions+0x37a/0x3a0 > > > > [<900ec4d0>] physmap_flash_probe+0x1ec/0x21c > > > > [<900e0f24>] platform_drv_probe+0x10/0x12 > > > > [<900e06d0>] driver_probe_device+0x84/0xf0 > > > > [<900e076a>] __driver_attach+0x2e/0x44 > > > > [<900e0096>] bus_for_each_dev+0x2e/0x4c > > > > [<900e05b6>] driver_attach+0x12/0x14 > > > > [<900e036c>] bus_add_driver+0x6c/0x178 > > > > [<900e08a4>] driver_register+0x58/0xb0 > > > > [<900e1126>] platform_driver_register+0x56/0x5c > > > > [<9000aaf6>] physmap_init+0xa/0x10 > > > > [<9001422a>] do_one_initcall+0x2a/0x10c > > > > [<900005b8>] kernel_init+0x48/0x90 > > > > [<9001fcc0>] do_exit+0x0/0x4cc > > > > > > Ok, it turns out it's not related. 
It's a newly introduced regression > > > which I've bisected down to: > > > > > > commit abf5439370491dd6fbb4fe1a7939680d2a9bc9d4 > > > Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> > > > Date: Sat Aug 16 14:10:05 2008 +0900 > > > > > > block: move cmdfilter from gendisk to request_queue > > > > Really sorry about that. A fix was queued in Jens' tree: > > > > http://marc.info/?l=linux-kernel&m=122000748432301&w=2 > > > > > > > Unfortunately, I can't revert it cleanly, so it could be a false > > > positive. But it does sort of make sense, since it makes the filter > > > per-queue instead of per-gendisk, so if MTD uses the same queue for > > > several block devices, the filter kobject might end up being > > > initialized multiple times. Or something. > > > > Right, the problem is that MTD uses the same queue for multiple > > gendisks. It would be great if a MTD developer could fix it. > > I'm also seeing it with drivers/block/ataflop.c (also a single queue) on > ARAnyM. > > And from looking at drivers/block/floppy.c and drivers/block/amiflop.c, > I guess it happens there, too. > > Any other single-queue drivers that got broken??? The pending change will eliminate this problem so that single queue devices will work fine again. They should still work fine in -rc5, just with the annoying WARN_ON() for each device per queue. -- Jens Axboe ^ permalink raw reply [flat|nested] 8+ messages in thread