* linux-next: Tree for June 29
@ 2009-06-29  6:48 Stephen Rothwell
  2009-06-29  9:44 ` Next June 29: Boot failure with SLQB on s390 Sachin Sant
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Rothwell @ 2009-06-29  6:48 UTC
To: linux-next; +Cc: LKML

Hi all,

Changes since 20090626:

New tree: percpu

My fixes tree contains this commit:

	fbdev: work around old compiler bug

This tree fails to build for powerpc allyesconfig.

The hid tree lost its build failure.

The rr tree gained a conflict against Linus' tree.

The block tree lost its build failure.

The percpu tree gained a conflict against Linus' tree.

----------------------------------------------------------------------------

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/people/sfr/linux-next/).
If you are tracking the linux-next tree using git, you should not use
"git pull" to do so as that will try to merge the new linux-next release
with the old one.  You should use "git fetch" as mentioned in the FAQ on
the wiki (see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64.  After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES) and i386, sparc and sparc64 defconfig.
These builds also have CONFIG_ENABLE_WARN_DEPRECATED,
CONFIG_ENABLE_MUST_CHECK and CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 131 trees (counting Linus' and 19 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Jan Dittmer for adding the linux-next tree to his build tests
at http://l4x.org/k/ , the guys at http://test.kernel.org/ and Randy
Dunlap for doing many randconfig builds.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master
Merging fixes/fixes
Merging arm-current/master
Merging m68k-current/for-linus
Merging powerpc-merge/merge
Merging sparc-current/master
Merging scsi-rc-fixes/master
Merging net-current/master
Merging sound-current/for-linus
Merging pci-current/for-linus
Merging wireless-current/master
Merging kbuild-current/master
Merging quilt/driver-core.current
Merging quilt/usb.current
Merging cpufreq-current/fixes
Merging input-current/for-linus
Merging md-current/for-linus
Merging audit-current/for-linus
Merging crypto-current/master
Merging dwmw2/master
Merging arm/devel
Merging davinci/for-next
Merging pxa/for-next
Merging avr32/avr32-arch
Merging blackfin/for-linus
Merging cris/for-next
Merging ia64/test
Merging m68k/for-next
Merging m68knommu/for-next
Merging microblaze/next
Merging mips/mips-for-linux-next
CONFLICT (add/add): Merge conflict in arch/mips/cavium-octeon/executive/cvmx-helper-errata.c
CONFLICT (content): Merge conflict in arch/mips/mm/tlbex.c
CONFLICT (content): Merge conflict in arch/mips/sibyte/swarm/setup.c
CONFLICT (content): Merge conflict in drivers/char/hw_random/Kconfig
CONFLICT (content): Merge conflict in drivers/char/hw_random/Makefile
CONFLICT (add/add): Merge conflict in drivers/dma/txx9dmac.c
Merging parisc/master
Merging powerpc/next
Merging 4xx/next
Merging galak/next
Merging s390/features
Merging sh/master
Merging sparc/master
Merging xtensa/master
Merging cifs/master
Merging configfs/linux-next
CONFLICT (content): Merge conflict in fs/configfs/dir.c
Merging ext4/next
Merging fatfs/master
Merging fuse/for-next
Merging gfs2/master
Merging jfs/next
Merging nfs/linux-next
Merging nfsd/nfsd-next
Merging nilfs2/for-next
Merging ocfs2/linux-next
Merging squashfs/master
Merging v9fs/for-next
CONFLICT (content): Merge conflict in net/9p/protocol.c
Merging ubifs/linux-next
Merging xfs/master
Merging reiserfs-bkl/reiserfs/kill-bkl-rc6
CONFLICT (content): Merge conflict in fs/reiserfs/super.c
Merging vfs/for-next
Merging pci/linux-next
Merging hid/for-next
Merging quilt/i2c
Merging quilt/jdelvare-hwmon
Merging quilt/kernel-doc
Merging v4l-dvb/master
Merging quota/for_next
Merging kbuild/master
Merging ide/master
Merging libata/NEXT
Merging infiniband/for-next
Merging acpi/test
Merging ieee1394/for-next
Merging ubi/linux-next
Merging kvm/master
CONFLICT (content): Merge conflict in arch/x86/kvm/x86.c
CONFLICT (content): Merge conflict in virt/kvm/kvm_main.c
Merging dlm/next
Merging scsi/master
Merging async_tx/next
Merging udf/for_next
Merging net/master
Merging wireless/master
Merging mtd/master
Merging crypto/master
Merging sound/for-next
Merging cpufreq/next
Merging quilt/rr
CONFLICT (content): Merge conflict in arch/powerpc/platforms/powermac/setup.c
CONFLICT (content): Merge conflict in kernel/cpu.c
Applying: rr/pmac: fix for cpumask accessor changes
Applying: UML: Fix some apparent bitrot in mmu_context.h
Merging mmc/next
Merging input/next
Merging bkl-removal/bkl-removal
Merging lsm/for-next
Merging block/for-next
Merging quilt/device-mapper
Merging embedded/master
Merging firmware/master
Merging pcmcia/master
Merging battery/master
Merging leds/for-mm
Merging backlight/for-mm
Merging kgdb/kgdb-next
Merging slab/for-next
Merging uclinux/for-next
Merging md/for-next
Merging mfd/for-next
Merging hdlc/hdlc-next
Merging drm/drm-next
Merging voltage/for-next
Merging security-testing/next
Merging lblnet/master
Merging quilt/ttydev
Merging agp/agp-next
Merging uwb/for-upstream
Merging watchdog/master
Merging bdev/master
Merging dwmw2-iommu/master
Merging cputime/cputime
Merging osd/linux-next
Merging jc_docs/docs-next
Merging nommu/master
Merging trivial/for-next
Merging audit/for-next
Merging omap/for-next
Merging quilt/aoe
Merging suspend/linux-next
Merging bluetooth/master
Merging edac-amd/for-next
Merging fsnotify/for-next
Merging irda/for-next
Merging hwlat/for-linus
Merging tip/auto-latest
CONFLICT (content): Merge conflict in arch/x86/include/asm/termios.h
Merging percpu/for-next
CONFLICT (content): Merge conflict in arch/mn10300/kernel/vmlinux.lds.S
Merging asm-generic/next
Merging quilt/driver-core
Merging quilt/usb
Merging quilt/staging
Merging scsi-post-merge/master
Applying: kvm/powerpc: fix build error
* Next June 29: Boot failure with SLQB on s390
  2009-06-29  6:48 linux-next: Tree for June 29 Stephen Rothwell
@ 2009-06-29  9:44 ` Sachin Sant
  2009-06-29 10:31   ` Heiko Carstens
  0 siblings, 1 reply; 19+ messages in thread
From: Sachin Sant @ 2009-06-29  9:44 UTC
To: Pekka Enberg; +Cc: Stephen Rothwell, linux-next, linux-s390

I still have problems booting next with SLQB on a s390 box.

Write protected kernel read-only data: 0x12000 - 0x446fff
Experimental hierarchical RCU implementation.
Experimental hierarchical RCU init done.
console [ttyS0] enabled
Unable to handle kernel pointer dereference at virtual kernel address (null)
Oops: 0004 [#1] SMP
Modules linked in:
CPU: 0 Not tainted 2.6.31-rc1-autotest-next-20090629 #1
Process swapper (pid: 0, task: 0000000000447700, ksp: 0000000000498000)
Krnl PSW : 0700200180000000 00000000002be00c (init_section_page_cgroup+0x8c/0x10c)
           R:0 T:1 IO:1 EX:1 Key:0 M:0 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000000 0000000000000028 0000000000000010 0000000000000010
           000003e040000000 0000000000010000 0000000000498a28 0000000000000000
           00000000004cc000 00000000004bb408 00000000004cc010 0000000080808000
           0000000000280000 00000000002d1cf0 00000000004aa69e 0000000000497ed0
Krnl Code: 00000000002be000: a7c90000 lghi %r12,0
           00000000002be004: a7f40031 brc 15,2be066
           00000000002be008: 41102018 la %r1,24(%r2)
          >00000000002be00c: e34020100024 stg %r4,16(%r2)
           00000000002be012: a7a90000 lghi %r10,0
           00000000002be01c: e3a020000024 stg %r10,0(%r2)
           00000000002be022: a74b0038 aghi %r4,56
Call Trace:
([<00000000004b0bbe>] console_init+0x36/0x50)
 [<000000000049897a>] start_kernel+0x312/0x3c0
 [<0000000000012020>] _ehead+0x20/0x80
Last Breaking-Event-Address:
 [<00000000002be07c>] init_section_page_cgroup+0xfc/0x10c

---[ end trace 31fd0ba7d8756001 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
CPU: 0 Tainted: G D 2.6.31-rc1-autotest-next-20090629 #1
Process swapper (pid: 0, task: 0000000000447700, ksp: 0000000000498000)
0000000000000005 0000000000497b20 0000000000000002 0000000000000000
       0000000000497bc0 0000000000497b38 0000000000497b38 0000000000049fc2
       00000000002c7cf0 0000000000000000 0000000000000000 000000000000000b
       0000000000000008 0000000000000000 0000000000497b20 0000000000497b98
       00000000002c7c78 00000000000163aa 0000000000497b20 0000000000497b70
Call Trace:
([<000000000001630c>] show_trace+0xdc/0xec)
 [<0000000000048eb0>] panic+0x84/0x1c0
 [<000000000004ce70>] do_exit+0x74/0x6f0
 [<0000000000016862>] die+0x152/0x154
 [<000000000001298c>] do_no_context+0xa0/0xac
 [<00000000000131aa>] do_protection_exception+0x252/0x260
 [<0000000000025f7c>] sysc_return+0x0/0x8
 [<00000000002be00c>] init_section_page_cgroup+0x8c/0x10c
([<00000000004b0bbe>] console_init+0x36/0x50)
 [<000000000049897a>] start_kernel+0x312/0x3c0
 [<0000000000012020>] _ehead+0x20/0x80
01: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00020E3E

With SLUB the machine boots fine. Should I be testing this
(SLQB/s390) combination?

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29  9:44 ` Next June 29: Boot failure with SLQB on s390 Sachin Sant
@ 2009-06-29 10:31   ` Heiko Carstens
  2009-06-29 10:39     ` Nick Piggin
  0 siblings, 1 reply; 19+ messages in thread
From: Heiko Carstens @ 2009-06-29 10:31 UTC
To: Sachin Sant, Nick Piggin
Cc: Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Mon, Jun 29, 2009 at 03:14:40PM +0530, Sachin Sant wrote:
> I still have problems booting next with SLQB on a s390 box.
>
> Write protected kernel read-only data: 0x12000 - 0x446fff
> Experimental hierarchical RCU implementation.
> Experimental hierarchical RCU init done.
> console [ttyS0] enabled
> Unable to handle kernel pointer dereference at virtual kernel address (null)
> Oops: 0004 [#1] SMP
> Modules linked in:
> CPU: 0 Not tainted 2.6.31-rc1-autotest-next-20090629 #1
> Process swapper (pid: 0, task: 0000000000447700, ksp: 0000000000498000)
> Krnl PSW : 0700200180000000 00000000002be00c (init_section_page_cgroup+0x8c/0x10c)
>            R:0 T:1 IO:1 EX:1 Key:0 M:0 W:0 P:0 AS:0 CC:2 PM:0 EA:3
> Krnl GPRS: 0000000000000000 0000000000000028 0000000000000010 0000000000000010
>            000003e040000000 0000000000010000 0000000000498a28 0000000000000000
>            00000000004cc000 00000000004bb408 00000000004cc010 0000000080808000
>            0000000000280000 00000000002d1cf0 00000000004aa69e 0000000000497ed0
> Krnl Code: 00000000002be000: a7c90000 lghi %r12,0
>            00000000002be004: a7f40031 brc 15,2be066
>            00000000002be008: 41102018 la %r1,24(%r2)
>           >00000000002be00c: e34020100024 stg %r4,16(%r2)
>            00000000002be012: a7a90000 lghi %r10,0
>            00000000002be01c: e3a020000024 stg %r10,0(%r2)
>            00000000002be022: a74b0038 aghi %r4,56
> Call Trace:
> ([<00000000004b0bbe>] console_init+0x36/0x50)
>  [<000000000049897a>] start_kernel+0x312/0x3c0
>  [<0000000000012020>] _ehead+0x20/0x80
> Last Breaking-Event-Address:
>  [<00000000002be07c>] init_section_page_cgroup+0xfc/0x10c
>
> ---[ end trace 31fd0ba7d8756001 ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> CPU: 0 Tainted: G D 2.6.31-rc1-autotest-next-20090629 #1
> Process swapper (pid: 0, task: 0000000000447700, ksp: 0000000000498000)
> 0000000000000005 0000000000497b20 0000000000000002 0000000000000000
>        0000000000497bc0 0000000000497b38 0000000000497b38 0000000000049fc2
>        00000000002c7cf0 0000000000000000 0000000000000000 000000000000000b
>        0000000000000008 0000000000000000 0000000000497b20 0000000000497b98
>        00000000002c7c78 00000000000163aa 0000000000497b20 0000000000497b70
> Call Trace:
> ([<000000000001630c>] show_trace+0xdc/0xec)
>  [<0000000000048eb0>] panic+0x84/0x1c0
>  [<000000000004ce70>] do_exit+0x74/0x6f0
>  [<0000000000016862>] die+0x152/0x154
>  [<000000000001298c>] do_no_context+0xa0/0xac
>  [<00000000000131aa>] do_protection_exception+0x252/0x260
>  [<0000000000025f7c>] sysc_return+0x0/0x8
>  [<00000000002be00c>] init_section_page_cgroup+0x8c/0x10c
> ([<00000000004b0bbe>] console_init+0x36/0x50)
>  [<000000000049897a>] start_kernel+0x312/0x3c0
>  [<0000000000012020>] _ehead+0x20/0x80
> 01: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00020E3E
>
> With SLUB the machine boots fine. Should I be testing this
> (SLQB/s390) combination?

No, just stay with SLUB/SLAB for the time being.
Maybe the backtrace looks familiar to Nick?
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29 10:31   ` Heiko Carstens
@ 2009-06-29 10:39     ` Nick Piggin
  2009-06-29 11:50       ` Heiko Carstens
  2009-06-30  5:33       ` Sachin Sant
  0 siblings, 2 replies; 19+ messages in thread
From: Nick Piggin @ 2009-06-29 10:39 UTC
To: Heiko Carstens
Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Mon, Jun 29, 2009 at 12:31:23PM +0200, Heiko Carstens wrote:
> On Mon, Jun 29, 2009 at 03:14:40PM +0530, Sachin Sant wrote:
> > I still have problems booting next with SLQB on a s390 box.
> >
> > Write protected kernel read-only data: 0x12000 - 0x446fff
> > Experimental hierarchical RCU implementation.
> > Experimental hierarchical RCU init done.
> > console [ttyS0] enabled
> > Unable to handle kernel pointer dereference at virtual kernel address (null)

This could, I suppose, be due to a failed allocation where the caller
isn't expecting failure (or using SLAB_PANIC).

Did you manage to test with the printk debugging patch for SLQB that
I sent for the power6 boot failure? I don't think I saw a reply from
you but maybe I missed it?

Thanks,
Nick
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29 10:39     ` Nick Piggin
@ 2009-06-29 11:50       ` Heiko Carstens
  2009-06-29 11:58         ` Nick Piggin
  2009-06-30  5:33       ` Sachin Sant
  1 sibling, 1 reply; 19+ messages in thread
From: Heiko Carstens @ 2009-06-29 11:50 UTC
To: Nick Piggin
Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Mon, Jun 29, 2009 at 12:39:43PM +0200, Nick Piggin wrote:
> On Mon, Jun 29, 2009 at 12:31:23PM +0200, Heiko Carstens wrote:
> > On Mon, Jun 29, 2009 at 03:14:40PM +0530, Sachin Sant wrote:
> > > I still have problems booting next with SLQB on a s390 box.
> > >
> > > Write protected kernel read-only data: 0x12000 - 0x446fff
> > > Experimental hierarchical RCU implementation.
> > > Experimental hierarchical RCU init done.
> > > console [ttyS0] enabled
> > > Unable to handle kernel pointer dereference at virtual kernel address (null)
>
> This could, I suppose, be due to a failed allocation where the caller
> isn't expecting failure (or using SLAB_PANIC).
>
> Did you manage to test with the printk debugging patch for SLQB that
> I sent for the power6 boot failure? I don't think I saw a reply from
> you but maybe I missed it?

Could you send me the debug patch as well? I can give it a quick run too.
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29 11:50       ` Heiko Carstens
@ 2009-06-29 11:58         ` Nick Piggin
  2009-06-29 13:09           ` Heiko Carstens
  2009-06-29 14:12           ` Heiko Carstens
  0 siblings, 2 replies; 19+ messages in thread
From: Nick Piggin @ 2009-06-29 11:58 UTC
To: Heiko Carstens
Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Mon, Jun 29, 2009 at 01:50:38PM +0200, Heiko Carstens wrote:
> On Mon, Jun 29, 2009 at 12:39:43PM +0200, Nick Piggin wrote:
> > On Mon, Jun 29, 2009 at 12:31:23PM +0200, Heiko Carstens wrote:
> > > On Mon, Jun 29, 2009 at 03:14:40PM +0530, Sachin Sant wrote:
> > > > I still have problems booting next with SLQB on a s390 box.
> > > >
> > > > Write protected kernel read-only data: 0x12000 - 0x446fff
> > > > Experimental hierarchical RCU implementation.
> > > > Experimental hierarchical RCU init done.
> > > > console [ttyS0] enabled
> > > > Unable to handle kernel pointer dereference at virtual kernel address (null)
> >
> > This could, I suppose, be due to a failed allocation where the caller
> > isn't expecting failure (or using SLAB_PANIC).
> >
> > Did you manage to test with the printk debugging patch for SLQB that
> > I sent for the power6 boot failure? I don't think I saw a reply from
> > you but maybe I missed it?
>
> Could you send me the debug patch as well? I can give it a quick run as well.

This is what I had. It is only helpful for the power6
failure where there was a problem in an allocation from
kmem_cache_create.
---
 mm/slqb.c | 40 ++++++++++++++++++++++++++++++++++------
 1 file changed, 34 insertions(+), 6 deletions(-)

Index: linux-2.6/mm/slqb.c
===================================================================
--- linux-2.6.orig/mm/slqb.c
+++ linux-2.6/mm/slqb.c
@@ -1456,7 +1456,7 @@ static void *__remote_slab_alloc_node(st
 }

 static noinline void *__remote_slab_alloc(struct kmem_cache *s,
-		gfp_t gfpflags, int node)
+		gfp_t gfpflags, int node, int trace)
 {
 	void *object;
 	struct zonelist *zonelist;
@@ -1465,19 +1465,32 @@ static noinline void *__remote_slab_allo
 	enum zone_type high_zoneidx = gfp_zone(gfpflags);

 	object = __remote_slab_alloc_node(s, gfpflags, node);
+	if (trace && !object)
+		printk("__remote_slab_alloc_node(node:%d) failed\n", node);
 	if (likely(object || (gfpflags & __GFP_THISNODE)))
 		return object;

-	zonelist = node_zonelist(slab_node(current->mempolicy), gfpflags);
+	node = slab_node(current->mempolicy);
+	if (trace)
+		printk("slab_node(current->mempolicy) = %d\n", node);
+
+	zonelist = node_zonelist(node, gfpflags);
 	for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
-		if (!cpuset_zone_allowed_hardwall(zone, gfpflags))
+		if (!cpuset_zone_allowed_hardwall(zone, gfpflags)) {
+			if (trace)
+				printk("cpuset not allowed node:%d\n", zone_to_nid(zone));
 			continue;
+		}

 		node = zone_to_nid(zone);
 		object = __remote_slab_alloc_node(s, gfpflags, node);
 		if (likely(object))
 			return object;
+		if (trace)
+			printk("__remote_slab_alloc_node(node:%d) failed\n", node);
 	}
+	if (trace)
+		printk("__remote_slab_alloc failed\n");
 	return NULL;
 }
 #endif
@@ -1488,7 +1501,7 @@ static noinline void *__remote_slab_allo
  * Must be called with interrupts disabled.
  */
 static __always_inline void *__slab_alloc(struct kmem_cache *s,
-		gfp_t gfpflags, int node)
+		gfp_t gfpflags, int node, int trace)
 {
 	void *object;
 	struct kmem_cache_cpu *c;
@@ -1497,7 +1510,7 @@ static __always_inline void *__slab_allo
 #ifdef CONFIG_NUMA
 	if (unlikely(node != -1) && unlikely(node != numa_node_id())) {
 try_remote:
-		return __remote_slab_alloc(s, gfpflags, node);
+		return __remote_slab_alloc(s, gfpflags, node, trace);
 	}
 #endif

@@ -1509,6 +1522,8 @@ try_remote:
 		object = cache_list_get_page(s, l);
 		if (unlikely(!object)) {
 			object = __slab_alloc_page(s, gfpflags, node);
+			if (trace && !object)
+				printk("__slab_alloc_page(node:%d) failed\n", node);
 #ifdef CONFIG_NUMA
 			if (unlikely(!object)) {
 				node = numa_node_id();
@@ -1532,10 +1547,11 @@ static __always_inline void *slab_alloc(
 {
 	void *object;
 	unsigned long flags;
+	int trace = 0;

 again:
 	local_irq_save(flags);
-	object = __slab_alloc(s, gfpflags, node);
+	object = __slab_alloc(s, gfpflags, node, trace);
 	local_irq_restore(flags);

 	if (unlikely(slab_debug(s)) && likely(object)) {
@@ -1546,6 +1562,18 @@ again:
 	if (unlikely(gfpflags & __GFP_ZERO) && likely(object))
 		memset(object, 0, s->objsize);

+	if (!object && !trace) {
+		trace = 1;
+		dump_stack();
+		printk("slab_alloc allocation failed\n");
+		printk("slab:%s flags:%x node:%d\n", s->name, gfpflags, node);
+		goto again;
+	}
+	if (trace) {
+		if (object)
+			printk("slab_alloc allocation worked when being traced, bugger\n");
+	}
+
 	return object;
 }
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29 11:58         ` Nick Piggin
@ 2009-06-29 13:09           ` Heiko Carstens
  2009-06-29 14:12           ` Heiko Carstens
  1 sibling, 0 replies; 19+ messages in thread
From: Heiko Carstens @ 2009-06-29 13:09 UTC
To: Nick Piggin
Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

> > > > > Unable to handle kernel pointer dereference at virtual kernel address (null)
> > >
> > > This could I suppose be due to failed allocation where the caller
> > > isn't expecting failure (or using SLAB_PANIC).
> > >
> > > Did you manage to test with the printk debugging patch for SLQB that
> > > I sent for the power6 boot failure? I don't think I saw a reply from
> > > you but maybe I missed it?
> >
> > Could you send me the debug patch as well? I can give it a quick run as well.
>
> This is what I had. It is only helpful for the power6
> failure where there was a problem in an allocation from
> kmem_cache_create.

It doesn't print anything out to the console.

kmalloc_node returns 0x10 for the large allocation in
init_section_page_cgroup(). The page allocator won't be able to satisfy
this request since the requested order is larger than MAX_ORDER on our
platform.
Now we only need to figure out why the SLQB allocator returns 0x10
instead of NULL ;)
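Editorial sketch (not part of the original mail): the 0x10 that Heiko reports matches the value of ZERO_SIZE_PTR, and r2 in the register dump above is also 0x10, so the faulting "stg %r4,16(%r2)" would be a store to address 0x20. The user-space model below shows why a plain NULL test lets such a pointer through. The two macros are reproduced from memory of include/linux/slab.h (with uintptr_t standing in for unsigned long); the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Kernel definitions as recalled from include/linux/slab.h. */
#define ZERO_SIZE_PTR ((void *)16)
#define ZERO_OR_NULL_PTR(x) ((uintptr_t)(x) <= (uintptr_t)ZERO_SIZE_PTR)

/* Hypothetical caller-side test: the usual "did the allocation fail?" check. */
static int naive_check_passes(const void *p)
{
    return p != NULL;
}

/* Stricter test that also rejects the zero-size sentinel. */
static int safe_to_dereference(const void *p)
{
    return !ZERO_OR_NULL_PTR(p);
}

/* Address a store at a given byte offset would touch (nothing is dereferenced). */
static uintptr_t store_target(const void *base, size_t offset)
{
    return (uintptr_t)base + offset;
}

/* Walks through the case from the oops: base 0x10, member offset 16. */
static void zero_size_ptr_self_check(void)
{
    const void *p = ZERO_SIZE_PTR;

    assert(naive_check_passes(p));        /* looks like a successful allocation */
    assert(!safe_to_dereference(p));      /* but must never be written through */
    assert(!safe_to_dereference(NULL));
    assert(store_target(p, 16) == 0x20);  /* low address, hence the exception */
}
```

Calling zero_size_ptr_self_check() from a test harness exercises all three helpers. The point of the model is only that a caller testing for NULL cannot tell ZERO_SIZE_PTR from a real object.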
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29 11:58         ` Nick Piggin
  2009-06-29 13:09           ` Heiko Carstens
@ 2009-06-29 14:12           ` Heiko Carstens
  2009-06-30  7:34             ` Nick Piggin
  2009-06-30  9:06             ` Nick Piggin
  1 sibling, 2 replies; 19+ messages in thread
From: Heiko Carstens @ 2009-06-29 14:12 UTC
To: Nick Piggin
Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Mon, Jun 29, 2009 at 01:58:35PM +0200, Nick Piggin wrote:
> On Mon, Jun 29, 2009 at 01:50:38PM +0200, Heiko Carstens wrote:
> > On Mon, Jun 29, 2009 at 12:39:43PM +0200, Nick Piggin wrote:
> > > On Mon, Jun 29, 2009 at 12:31:23PM +0200, Heiko Carstens wrote:
> > > > On Mon, Jun 29, 2009 at 03:14:40PM +0530, Sachin Sant wrote:
> > > > > I still have problems booting next with SLQB on a s390 box.
> > > > >
> > > > > Write protected kernel read-only data: 0x12000 - 0x446fff
> > > > > Experimental hierarchical RCU implementation.
> > > > > Experimental hierarchical RCU init done.
> > > > > console [ttyS0] enabled
> > > > > Unable to handle kernel pointer dereference at virtual kernel address (null)
> > >
> > > This could I suppose be due to failed allocation where the caller
> > > isn't expecting failure (or using SLAB_PANIC).
> > >
> > > Did you manage to test with the printk debugging patch for SLQB that
> > > I sent for the power6 boot failure? I don't think I saw a reply from
> > > you but maybe I missed it?
> >
> > Could you send me the debug patch as well? I can give it a quick run as well.
>
> This is what I had. It is only helpful for the power6
> failure where there was a problem in an allocation from
> kmem_cache_create.

slqb returns ZERO_SIZE_PTR instead of NULL for large size requests it cannot
handle.
The patch below would fix it. But I think it's too ugly. So I leave it up to
Nick to come up with a real and nice patch ;)

diff --git a/include/linux/slqb_def.h b/include/linux/slqb_def.h
index 7b4a601..9d03485 100644
--- a/include/linux/slqb_def.h
+++ b/include/linux/slqb_def.h
@@ -187,7 +187,7 @@ static __always_inline int kmalloc_index(size_t size)
 	if (unlikely(!size))
 		return 0;
 	if (unlikely(size > 1UL << KMALLOC_SHIFT_SLQB_HIGH))
-		return 0;
+		return -1;

 	if (unlikely(size <= KMALLOC_MIN_SIZE))
 		return KMALLOC_SHIFT_LOW;
@@ -219,7 +219,7 @@ static __always_inline int kmalloc_index(size_t size)
 	if (size <= 512 * 1024) return 19;
 	if (size <= 1024 * 1024) return 20;
 	if (size <= 2 * 1024 * 1024) return 21;
-	return -1;
+	return -2;
 }

 #ifdef CONFIG_ZONE_DMA
@@ -239,8 +239,12 @@ static __always_inline struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
 	int index;

 	index = kmalloc_index(size);
-	if (unlikely(index == 0))
-		return ZERO_SIZE_PTR;
+	if (unlikely(index <= 0)) {
+		if (index == 0)
+			return ZERO_SIZE_PTR;
+		if (index == -1)
+			return NULL;
+	}

 	if (likely(!(flags & SLQB_DMA)))
 		return &kmalloc_caches[index];
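Editorial sketch (not part of the original mail): the bug and Heiko's fix both come down to which sentinel the index function returns for an oversized request. The user-space model below assumes power-of-two size classes from 8 bytes up to 2MB only (the real kmalloc_index also has intermediate classes); every name prefixed model_ is hypothetical. Passing 0 as the "too large" code reproduces the old behaviour, passing -1 the patched one.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SHIFT_LOW   3    /* assumed smallest class: 8 bytes */
#define MODEL_SHIFT_HIGH  21   /* assumed largest class: 2MB */
#define MODEL_ZERO_SIZE_PTR ((void *)16)

/* Stand-in for kmalloc_caches[]; only the addresses matter here. */
static char model_caches[MODEL_SHIFT_HIGH + 1];

/*
 * Returns 0 for a zero-size request, too_large_code for an oversized one,
 * otherwise the index of the smallest power-of-two class that fits.
 */
static int model_index(size_t size, int too_large_code)
{
    int index = MODEL_SHIFT_LOW;

    if (size == 0)
        return 0;
    if (size > ((size_t)1 << MODEL_SHIFT_HIGH))
        return too_large_code;
    while (((size_t)1 << index) < size)
        index++;
    return index;
}

/* Maps an index to a cache, as the patched kmalloc_slab() would. */
static void *model_slab_from_index(int index)
{
    if (index == 0)
        return MODEL_ZERO_SIZE_PTR;
    if (index < 0)
        return NULL;
    return &model_caches[index];
}

static void model_index_self_check(void)
{
    size_t too_big = (size_t)4 << 20;   /* 4MB, above the assumed limit */

    /* Old behaviour: oversized request is mistaken for a zero-size one. */
    assert(model_slab_from_index(model_index(too_big, 0)) == MODEL_ZERO_SIZE_PTR);
    /* Patched behaviour: oversized request fails cleanly. */
    assert(model_slab_from_index(model_index(too_big, -1)) == NULL);
}
```

With the separate -1 code the caller sees NULL and can handle the failure, which appears to be what init_section_page_cgroup() expects when it falls back to another allocator.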
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29 14:12           ` Heiko Carstens
@ 2009-06-30  7:34             ` Nick Piggin
  2009-06-30  9:06             ` Nick Piggin
  1 sibling, 0 replies; 19+ messages in thread
From: Nick Piggin @ 2009-06-30  7:34 UTC
To: Heiko Carstens
Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Mon, Jun 29, 2009 at 04:12:34PM +0200, Heiko Carstens wrote:
> slqb returns ZERO_SIZE_PTR instead of NULL for large size requests it cannot
> handle.

Ah, thank you for debugging this. Great.

> The patch below would fix it. But I think it's too ugly. So I leave it up to
> Nick to come up with a real and nice patch ;)

The patch isn't so bad :) It is for the constant case anyway so it
should optimize away. But I think I have a similar problem for the
non-constant case too, so I will look at doing another patch.

Thanks,
Nick

>
> diff --git a/include/linux/slqb_def.h b/include/linux/slqb_def.h
> index 7b4a601..9d03485 100644
> --- a/include/linux/slqb_def.h
> +++ b/include/linux/slqb_def.h
> @@ -187,7 +187,7 @@ static __always_inline int kmalloc_index(size_t size)
>  	if (unlikely(!size))
>  		return 0;
>  	if (unlikely(size > 1UL << KMALLOC_SHIFT_SLQB_HIGH))
> -		return 0;
> +		return -1;
>
>  	if (unlikely(size <= KMALLOC_MIN_SIZE))
>  		return KMALLOC_SHIFT_LOW;
> @@ -219,7 +219,7 @@ static __always_inline int kmalloc_index(size_t size)
>  	if (size <= 512 * 1024) return 19;
>  	if (size <= 1024 * 1024) return 20;
>  	if (size <= 2 * 1024 * 1024) return 21;
> -	return -1;
> +	return -2;
>  }
>
>  #ifdef CONFIG_ZONE_DMA
> @@ -239,8 +239,12 @@ static __always_inline struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
>  	int index;
>
>  	index = kmalloc_index(size);
> -	if (unlikely(index == 0))
> -		return ZERO_SIZE_PTR;
> +	if (unlikely(index <= 0)) {
> +		if (index == 0)
> +			return ZERO_SIZE_PTR;
> +		if (index == -1)
> +			return NULL;
> +	}
>
>  	if (likely(!(flags & SLQB_DMA)))
>  		return &kmalloc_caches[index];
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29 14:12           ` Heiko Carstens
  2009-06-30  7:34             ` Nick Piggin
@ 2009-06-30  9:06             ` Nick Piggin
  2009-06-30  9:20               ` Pekka Enberg
  2009-06-30 10:09               ` Heiko Carstens
  1 sibling, 2 replies; 19+ messages in thread
From: Nick Piggin @ 2009-06-30  9:06 UTC
To: Heiko Carstens
Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Mon, Jun 29, 2009 at 04:12:34PM +0200, Heiko Carstens wrote:
> slqb returns ZERO_SIZE_PTR instead of NULL for large size requests it cannot
> handle.
> The patch below would fix it. But I think it's too ugly. So I leave it up to
> Nick to come up with a real and nice patch ;)

Could you try this patch and see if it helps? (it fixes a number of
simple corner cases here, *blush*)

--
SLQB: fix allocation size checking

SLQB would return ZERO_SIZE_PTR rather than NULL if the requested size is too
large. Debugged by Heiko Carstens. Fix this by checking size edge cases up
front rather than in the slab index calculation.

Additionally, if the size parameter was non-constant and too large, then
the checks may not have been performed at all which could cause corruption.

Next, ARCH_KMALLOC_MINALIGN may not be obeyed if size is non-constant. So
test for KMALLOC_MIN_SIZE in that case.

Finally, if KMALLOC_SHIFT_SLQB_HIGH is larger than 2MB, then kmalloc_index
could silently run off the end of its precomputed table and return a -1
index into the kmalloc slab array, which could result in corruption. Extend
this to allow up to 32MB (to match SLAB), and add a compile-time error in
the case that the table is exceeded (also like SLAB).

---
 include/linux/slqb_def.h | 17 ++++++++++-------
 mm/slqb.c                | 18 ++++++++++++++----
 2 files changed, 24 insertions(+), 11 deletions(-)

Index: linux-2.6/include/linux/slqb_def.h
===================================================================
--- linux-2.6.orig/include/linux/slqb_def.h
+++ linux-2.6/include/linux/slqb_def.h
@@ -184,10 +184,7 @@ extern struct kmem_cache kmalloc_caches_
  */
 static __always_inline int kmalloc_index(size_t size)
 {
-	if (unlikely(!size))
-		return 0;
-	if (unlikely(size > 1UL << KMALLOC_SHIFT_SLQB_HIGH))
-		return 0;
+	extern int ____kmalloc_too_large(void);

 	if (unlikely(size <= KMALLOC_MIN_SIZE))
 		return KMALLOC_SHIFT_LOW;
@@ -219,7 +216,11 @@ static __always_inline int kmalloc_index
 	if (size <= 512 * 1024) return 19;
 	if (size <= 1024 * 1024) return 20;
 	if (size <= 2 * 1024 * 1024) return 21;
-	return -1;
+	if (size <= 4 * 1024 * 1024) return 22;
+	if (size <= 8 * 1024 * 1024) return 23;
+	if (size <= 16 * 1024 * 1024) return 24;
+	if (size <= 32 * 1024 * 1024) return 25;
+	return ____kmalloc_too_large();
 }

 #ifdef CONFIG_ZONE_DMA
@@ -238,10 +239,12 @@ static __always_inline struct kmem_cache
 {
 	int index;

-	index = kmalloc_index(size);
-	if (unlikely(index == 0))
+	if (unlikely(size > 1UL << KMALLOC_SHIFT_SLQB_HIGH))
+		return NULL;
+	if (unlikely(!size))
 		return ZERO_SIZE_PTR;

+	index = kmalloc_index(size);
 	if (likely(!(flags & SLQB_DMA)))
 		return &kmalloc_caches[index];
 	else
Index: linux-2.6/mm/slqb.c
===================================================================
--- linux-2.6.orig/mm/slqb.c
+++ linux-2.6/mm/slqb.c
@@ -2514,18 +2514,28 @@ static struct kmem_cache *get_slab(size_
 {
 	int index;

+	if (unlikely(size <= KMALLOC_MIN_SIZE)) {
+		if (unlikely(!size))
+			return ZERO_SIZE_PTR;
+
+		index = KMALLOC_SHIFT_LOW;
+		goto got_index;
+	}
+
 #if L1_CACHE_BYTES >= 128
 	if (size <= 128) {
 #else
 	if (size <= 192) {
 #endif
-		if (unlikely(!size))
-			return ZERO_SIZE_PTR;
-
 		index = size_index[(size - 1) / 8];
-	} else
+	} else {
+		if (unlikely(size > 1UL << KMALLOC_SHIFT_SLQB_HIGH))
+			return NULL;
+
 		index = fls(size - 1);
+	}

+got_index:
 	if (unlikely((flags & SLQB_DMA)))
 		return &kmalloc_caches_dma[index];
 	else
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-30  9:06             ` Nick Piggin
@ 2009-06-30  9:20               ` Pekka Enberg
  2009-06-30  9:27                 ` Nick Piggin
  2009-06-30 10:09               ` Heiko Carstens
  1 sibling, 1 reply; 19+ messages in thread
From: Pekka Enberg @ 2009-06-30  9:20 UTC
To: Nick Piggin
Cc: Heiko Carstens, Sachin Sant, Stephen Rothwell, linux-next, linux-s390

On Tue, 2009-06-30 at 11:06 +0200, Nick Piggin wrote:
> Finally, if KMALLOC_SHIFT_SLQB_HIGH is larger than 2MB, then kmalloc_index
> could silently run off the end of its precomputed table and return a -1
> index into the kmalloc slab array, which could result in corruption. Extend
> this to allow up to 32MB (to match SLAB), and add a compile-time error in
> the case that the table is exceeded (also like SLAB).

I wonder if SLQB should just do page allocator pass-through for really
big allocations? That way callers don't need to worry about whether
they're running under SLAB/SLUB/SLOB/SLQB.

			Pekka
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-30  9:20 ` Pekka Enberg
@ 2009-06-30  9:27   ` Nick Piggin
  2009-06-30  9:30     ` Pekka Enberg
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2009-06-30  9:27 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Heiko Carstens, Sachin Sant, Stephen Rothwell, linux-next, linux-s390

On Tue, Jun 30, 2009 at 12:20:10PM +0300, Pekka Enberg wrote:
> On Tue, 2009-06-30 at 11:06 +0200, Nick Piggin wrote:
> > Finally, if KMALLOC_SHIFT_SLQB_HIGH is larger than 2MB, then kmalloc_index
> > could silently run off the end of its precomputed table and return a -1
> > index into the kmalloc slab array, which could result in corruption. Extend
> > this to allow up to 32MB (to match SLAB), and add a compile-time error in
> > the case that the table is exceeded (also like SLAB).
>
> I wonder if SLQB should just do page allocator pass-through for really
> big allocations? That way callers don't need to worry about whether
> they're running under SLAB/SLUB/SLOB/SLQB.

Well it could, OTOH it should be pretty well in line with SLAB
after this patch so I don't see much need.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-30  9:27 ` Nick Piggin
@ 2009-06-30  9:30   ` Pekka Enberg
  0 siblings, 0 replies; 19+ messages in thread
From: Pekka Enberg @ 2009-06-30  9:30 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Heiko Carstens, Sachin Sant, Stephen Rothwell, linux-next, linux-s390

On Tue, 2009-06-30 at 11:06 +0200, Nick Piggin wrote:
> > > Finally, if KMALLOC_SHIFT_SLQB_HIGH is larger than 2MB, then kmalloc_index
> > > could silently run off the end of its precomputed table and return a -1
> > > index into the kmalloc slab array, which could result in corruption. Extend
> > > this to allow up to 32MB (to match SLAB), and add a compile-time error in
> > > the case that the table is exceeded (also like SLAB).

On Tue, Jun 30, 2009 at 12:20:10PM +0300, Pekka Enberg wrote:
> > I wonder if SLQB should just do page allocator pass-through for really
> > big allocations? That way callers don't need to worry about whether
> > they're running under SLAB/SLUB/SLOB/SLQB.

On Tue, 2009-06-30 at 11:27 +0200, Nick Piggin wrote:
> Well it could, OTOH it should be pretty well in line with SLAB
> after this patch so I don't see much need.

True. But with page allocator fall-through, we don't need to bump up the
slab limit to 32 MB (which is pretty damn big IMHO). Anyway, up to you,
really.

			Pekka

^ permalink raw reply	[flat|nested] 19+ messages in thread
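The pass-through idea discussed in this sub-thread can be sketched in user space roughly as follows. This is only an illustration of the dispatch, not SLQB code and not something the thread settled on: the page size, the cut-over threshold and all `sketch_` or `SKETCH_` names are assumptions made for the example. Requests up to the threshold would go to a slab cache, and anything larger would be handed to the page allocator as a power-of-two number of pages, so no slab size table has to grow to 32MB.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE   4096UL                  /* assumed page size */
#define SKETCH_SLAB_LIMIT  (2 * SKETCH_PAGE_SIZE)  /* hypothetical cut-over point */
#define SKETCH_MAX_REQUEST (1UL << 30)             /* arbitrary sanity bound for the sketch */

enum sketch_path { SKETCH_ZERO, SKETCH_SLAB, SKETCH_PAGES };

/* Decide which allocator would serve a request of the given size. */
static enum sketch_path sketch_route(size_t size)
{
	if (size == 0)
		return SKETCH_ZERO;
	if (size <= SKETCH_SLAB_LIMIT)
		return SKETCH_SLAB;
	return SKETCH_PAGES;
}

/* Smallest order such that (page size << order) covers the request. */
static unsigned int sketch_order(size_t size)
{
	unsigned int order = 0;

	assert(size <= SKETCH_MAX_REQUEST);  /* keeps the shift below from overflowing */
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}
```

With these assumed values a 32MB request maps to order 13, which the page allocator may or may not be able to satisfy; the sketch only shows where the decision would be made.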
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-30  9:06 ` Nick Piggin
  2009-06-30  9:20   ` Pekka Enberg
@ 2009-06-30 10:09   ` Heiko Carstens
  2009-06-30 10:29     ` Nick Piggin
  1 sibling, 1 reply; 19+ messages in thread
From: Heiko Carstens @ 2009-06-30 10:09 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Tue, Jun 30, 2009 at 11:06:31AM +0200, Nick Piggin wrote:
> On Mon, Jun 29, 2009 at 04:12:34PM +0200, Heiko Carstens wrote:
> > slqb returns ZERO_SIZE_PTR instead of NULL for large size requests it cannot
> > handle.
> > The patch below would fix it. But I think it's too ugly. So I leave it up to
> > Nick to come up with a real and nice patch ;)
>
> Could you try this patch and see if it helps? (it fixes a number
> of simple corner cases here, *blush*)

Yes, it does work now. Thanks!

> SLQB: fix allocation size checking
>
> SLQB would return ZERO_SIZE_PTR rather than NULL if the requested size is too
> large. Debugged by Heiko Carstens. Fix this by checking size edge cases up
> front rather than in the slab index calculation.
>
> Additionally, if the size parameter was non-constant and too large, then
> the checks may not have been performed at all which could cause corruption.
>
> Next, ARCH_KMALLOC_MINALIGN may not be obeyed if size is non-constant. So
> test for KMALLOC_MIN_SIZE in that case.
>
> Finally, if KMALLOC_SHIFT_SLQB_HIGH is larger than 2MB, then kmalloc_index
> could silently run off the end of its precomputed table and return a -1
> index into the kmalloc slab array, which could result in corruption. Extend
> this to allow up to 32MB (to match SLAB), and add a compile-time error in
> the case that the table is exceeded (also like SLAB).

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-30 10:09 ` Heiko Carstens
@ 2009-06-30 10:29   ` Nick Piggin
  2009-06-30 10:57     ` Pekka Enberg
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2009-06-30 10:29 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Sachin Sant, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

On Tue, Jun 30, 2009 at 12:09:29PM +0200, Heiko Carstens wrote:
> On Tue, Jun 30, 2009 at 11:06:31AM +0200, Nick Piggin wrote:
> > On Mon, Jun 29, 2009 at 04:12:34PM +0200, Heiko Carstens wrote:
> > > slqb returns ZERO_SIZE_PTR instead of NULL for large size requests it cannot
> > > handle.
> > > The patch below would fix it. But I think it's too ugly. So I leave it up to
> > > Nick to come up with a real and nice patch ;)
> >
> > Could you try this patch and see if it helps? (it fixes a number
> > of simple corner cases here, *blush*)
>
> Yes, it does work now. Thanks!

Thanks. Pekka, can you please merge this with

Signed-off-by: Nick Piggin <npiggin@suse.de>

>
> > SLQB: fix allocation size checking
> >
> > SLQB would return ZERO_SIZE_PTR rather than NULL if the requested size is too
> > large. Debugged by Heiko Carstens. Fix this by checking size edge cases up
> > front rather than in the slab index calculation.
> >
> > Additionally, if the size parameter was non-constant and too large, then
> > the checks may not have been performed at all which could cause corruption.
> >
> > Next, ARCH_KMALLOC_MINALIGN may not be obeyed if size is non-constant. So
> > test for KMALLOC_MIN_SIZE in that case.
> >
> > Finally, if KMALLOC_SHIFT_SLQB_HIGH is larger than 2MB, then kmalloc_index
> > could silently run off the end of its precomputed table and return a -1
> > index into the kmalloc slab array, which could result in corruption. Extend
> > this to allow up to 32MB (to match SLAB), and add a compile-time error in
> > the case that the table is exceeded (also like SLAB).

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-30 10:29 ` Nick Piggin
@ 2009-06-30 10:57   ` Pekka Enberg
  0 siblings, 0 replies; 19+ messages in thread
From: Pekka Enberg @ 2009-06-30 10:57 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Heiko Carstens, Sachin Sant, Stephen Rothwell, linux-next, linux-s390

On Tue, 2009-06-30 at 12:29 +0200, Nick Piggin wrote:
> On Tue, Jun 30, 2009 at 12:09:29PM +0200, Heiko Carstens wrote:
> > On Tue, Jun 30, 2009 at 11:06:31AM +0200, Nick Piggin wrote:
> > > On Mon, Jun 29, 2009 at 04:12:34PM +0200, Heiko Carstens wrote:
> > > > slqb returns ZERO_SIZE_PTR instead of NULL for large size requests it cannot
> > > > handle.
> > > > The patch below would fix it. But I think it's too ugly. So I leave it up to
> > > > Nick to come up with a real and nice patch ;)
> > >
> > > Could you try this patch and see if it helps? (it fixes a number
> > > of simple corner cases here, *blush*)
> >
> > Yes, it does work now. Thanks!
>
> Thanks. Pekka, can you please merge this with
>
> Signed-off-by: Nick Piggin <npiggin@suse.de>

Done. Thanks everyone!

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-29 10:39 ` Nick Piggin
  2009-06-29 11:50   ` Heiko Carstens
@ 2009-06-30  5:33   ` Sachin Sant
  2009-06-30  8:34     ` Nick Piggin
  1 sibling, 1 reply; 19+ messages in thread
From: Sachin Sant @ 2009-06-30  5:33 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Heiko Carstens, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

[-- Attachment #1: Type: text/plain, Size: 659 bytes --]

Nick Piggin wrote:
> This could I suppose be due to failed allocation where the caller
> isn't expecting failure (or using SLAB_PANIC).
>
> Did you manage to test with the printk debugging patch for SLQB that
> I sent for the power6 boot failure? I don't think I saw a reply from
> you but maybe I missed it?
>
Hi Nick,

Sorry for the delay in getting the debug output. Attaching the boot log
from the Power6 box with the debug patch, although I don't see any extra
messages.

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

[-- Attachment #2: boot-log --]
[-- Type: text/plain, Size: 13185 bytes --]

Using 007c5c91 bytes for initrd buffer
Please wait, loading kernel...
Allocated 01200000 bytes for kernel @ 02300000
Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded 007c5c91 @ 03500000
OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 2.6.31-rc1-next-20090629 (root@mpower6lp5) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Tue Jun 30 10:31:38 IST 2009
Calling ibm,client-architecture... done
command line: root=/dev/sda3 sysrq=8 insmod=sym53c8xx insmod=ipr crashkernel=512M-:256M loglevel=8 mminit_loglevel=4
memory layout at init:
  alloc_bottom : 0000000003cd0000
  alloc_top : 0000000008000000
  alloc_top_hi : 0000000008000000
  rmo_top : 0000000008000000
  ram_top : 0000000008000000
instantiating rtas at 0x00000000074e0000... done
boot cpu hw idx 0000000000000000
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000003ce0000 -> 0x0000000003ce15c2
Device tree struct 0x0000000003cf0000 -> 0x0000000003d10000
Calling quiesce...
returning from prom_init
Crash kernel location must be 0x2000000
Reserving 256MB of memory at 32MB for crashkernel (System RAM: 4096MB)
Phyp-dump disabled at boot time
Using pSeries machine description
Page orders: linear mapping = 16, virtual = 16, io = 12
Using 1TB segments
Found initrd at 0xc000000003500000:0xc000000003cc5c91
console [udbg0] enabled
Partition configured for 2 cpus.
CPU maps initialized for 2 threads per core
 (thread shift is 1)
Starting Linux PPC64 #1 SMP Tue Jun 30 10:31:38 IST 2009
-----------------------------------------------------
ppc64_pft_size = 0x1a
physicalMemorySize = 0x100000000
htab_hash_mask = 0x7ffff
-----------------------------------------------------
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.31-rc1-next-20090629 (root@mpower6lp5) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Tue Jun 30 10:31:38 IST 2009
[boot]0012 Setup Arch
mminit::memory_register Entering add_active_range(2, 0x0, 0x800) 0 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x800, 0x1000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x1000, 0x1800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x1800, 0x2000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x2000, 0x2800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x2800, 0x3000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x3000, 0x3800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x3800, 0x4000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x4000, 0x4800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x4800, 0x5000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x5000, 0x5800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x5800, 0x6000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x6000, 0x6800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x6800, 0x7000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x7000, 0x7800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x7800, 0x8000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x8000, 0x8800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x8800, 0x9000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x9000, 0x9800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0x9800, 0xa000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0xa000, 0xa800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0xa800, 0xb000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0xb000, 0xb800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0xb800, 0xc000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0xc000, 0xc800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0xc800, 0xd000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0xd000, 0xd800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(2, 0xd800, 0xe000) 1 entries of 256 used
mminit::memory_register Entering add_active_range(3, 0xe000, 0xe800) 1 entries of 256 used
mminit::memory_register Entering add_active_range(3, 0xe800, 0xf000) 2 entries of 256 used
mminit::memory_register Entering add_active_range(3, 0xf000, 0xf800) 2 entries of 256 used
mminit::memory_register Entering add_active_range(3, 0xf800, 0x10000) 2 entries of 256 used
Node 0 Memory:
Node 2 Memory: 0x0-0xe0000000
Node 3 Memory: 0xe0000000-0x100000000
EEH: No capable adapters found
PPC64 nvram contains 15360 bytes
Using shared processor idle loop
Zone PFN ranges:
  DMA 0x00000000 -> 0x00010000
  Normal 0x00010000 -> 0x00010000
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
  2: 0x00000000 -> 0x0000e000
  3: 0x0000e000 -> 0x00010000
mminit::pageflags_layout_widths Section 20 Node 4 Zone 2 Flags 23
mminit::pageflags_layout_shifts Section 20 Node 4 Zone 2
mminit::pageflags_layout_offsets Section 44 Node 40 Zone 38
mminit::pageflags_layout_zoneid Zone ID: 38 -> 44
mminit::pageflags_layout_usage location: 64 -> 38 unused 38 -> 23 flags 23 -> 0
Could not find start_pfn for node 0
On node 0 totalpages: 0
On node 2 totalpages: 57344
  DMA zone: 56 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 57288 pages, LIFO batch:1
mminit::memmap_init Initialising map node 2 zone 0 pfns 0 -> 57344
On node 3 totalpages: 8192
  DMA zone: 8 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 8184 pages, LIFO batch:0
mminit::memmap_init Initialising map node 3 zone 0 pfns 57344 -> 65536
[boot]0015 Setup Done
mminit::zonelist general 2:DMA = 2:DMA 3:DMA
mminit::zonelist thisnode 2:DMA = 2:DMA
mminit::zonelist general 3:DMA = 3:DMA 2:DMA
mminit::zonelist thisnode 3:DMA = 3:DMA
Built 3 zonelists in Node order, mobility grouping on. Total pages: 65472
Policy zone: DMA
Kernel command line: root=/dev/sda3 sysrq=8 insmod=sym53c8xx insmod=ipr crashkernel=512M-:256M loglevel=8 mminit_loglevel=4
PID hash table entries: 4096 (order: 12, 32768 bytes)
freeing bootmem node 2
freeing bootmem node 3
Memory: 3897728k/4194304k available (9216k kernel code, 296576k reserved, 2112k data, 4289k bss, 512k init)
Experimental hierarchical RCU implementation.
RCU-based detection of stalled CPUs is enabled.
Experimental hierarchical RCU init done.
NR_IRQS:512
[boot]0020 XICS Init
[boot]0021 XICS Done
pic: no ISA interrupt controller
time_init: decrementer frequency = 512.000000 MHz
time_init: processor frequency = 4704.000000 MHz
clocksource: timebase mult[7d0000] shift[22] registered
clockevent: decrementer mult[83126e97] shift[32] cpu[0]
Console: colour dummy device 80x25
console handover: boot [udbg0] -> real [hvc0]
allocated 2621440 bytes of page_cgroup
please try 'cgroup_disable=memory' option if you don't want memory cgroups
Security Framework initialized
SELinux: Disabled at boot.
Dentry cache hash table entries: 524288 (order: 6, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 5, 2097152 bytes)
Mount-cache hash table entries: 4096
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
Initializing cgroup subsys memory
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
irq: irq 2 on host null mapped to virtual irq 16
clockevent: decrementer mult[83126e97] shift[32] cpu[1]
Processor 1 found.
Brought up 2 CPUs
Node 0 CPUs: 0-1
Node 2 CPUs:
Node 3 CPUs:
CPU0 attaching sched-domain:
 domain 0: span 0-1 level SIBLING
  groups: 0 1
  domain 1: span 0-1 level NODE
   groups: 0-1
CPU1 attaching sched-domain:
 domain 0: span 0-1 level SIBLING
  groups: 1 0
  domain 1: span 0-1 level NODE
   groups: 0-1
NET: Registered protocol family 16
IBM eBus Device Driver
POWER6 performance monitor hardware support registered
PCI: Probing PCI hardware
PCI: Probing PCI hardware done
bio: create slab <bio-0> at 0
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 2, 262144 bytes)
TCP established hash table entries: 131072 (order: 5, 2097152 bytes)
TCP bind hash table entries: 65536 (order: 5, 2097152 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
NET: Registered protocol family 1
Unpacking initramfs...
Switched to high resolution mode on CPU 0
Switched to high resolution mode on CPU 1
irq: irq 655360 on host null mapped to virtual irq 17
irq: irq 655367 on host null mapped to virtual irq 18
IOMMU table initialized, virtual merging enabled
irq: irq 589825 on host null mapped to virtual irq 19
RTAS daemon started
audit: initializing netlink socket (disabled)
type=2000 audit(1246339143.235:1): initialized
Kprobe smoke test started
Kprobe smoke test passed successfully
HugeTLB registered 16 MB page size, pre-allocated 0 pages
HugeTLB registered 16 GB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
Btrfs loaded
msgmni has been set to 7612
alg: No test for stdrng (krng)
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
vio_register_driver: driver hvc_console registering
HVSI: registered 0 devices
Generic RTC Driver v1.07
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
Platform driver 'serial8250' needs updating - please use dev_pm_ops
pmac_zilog: 0.6 (Benjamin Herrenschmidt <benh@kernel.crashing.org>)
input: Macintosh mouse button emulation as /devices/virtual/input/input0
Uniform Multi-Platform E-IDE driver
ide-gd driver 1.18
IBM eHEA ethernet device driver (Release EHEA_0101)
irq: irq 590088 on host null mapped to virtual irq 264
ehea: eth0: Jumbo frames are disabled
ehea: eth0 -> logical port id #2
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
mice: PS/2 mouse device common for all mice
EDAC MC: Ver: 2.1.0 Jun 30 2009
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
TCP cubic registered
NET: Registered protocol family 15
registered taskstats version 1
Freeing unused kernel memory: 512k freed
doing fast boot
SysRq : Changing Loglevel
Loglevel set to 8
Unable to handle kernel paging request for data at address 0xc0000000008f4504
Faulting instruction address: 0xc000000000391094
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 DEBUG_PAGEALLOC NUMA pSeries
Modules linked in: scsi_mod(+)
NIP: c000000000391094 LR: c00000000060cd88 CTR: 0000000000000008
REGS: c0000000c63f3590 TRAP: 0300 Not tainted (2.6.31-rc1-next-20090629)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 24222428 XER: 20000001
DAR: c0000000008f4504, DSISR: 0000000040000000
TASK = c0000000c63e0a80[62] 'modprobe' THREAD: c0000000c63f0000 CPU: 1
GPR00: c00000000060cd88 c0000000c63f3810 c000000000b0c488 c0000000008f4500
GPR04: c0000000ddcd0000 0000000000000001 0000000000000000 c0000000dfff8480
GPR08: 0000000000000001 c0000000de000010 0000000000000002 c000000000c681f8
GPR12: 0000000024222428 c000000000be2600 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000018 ffffffffffffffff 0000000000210d00 0000000000000010
GPR24: 0000000000210d00 c0000000dfc60ea0 c000000000fc1580 c0000000008f4500
GPR28: c0000000008f4448 c0000000008f4500 c000000000a87bd0 c0000000dfc60e80
NIP [c000000000391094] ._raw_spin_lock+0x30/0x184
LR [c00000000060cd88] ._spin_lock+0x10/0x24
Call Trace:
[c0000000c63f38b0] [c00000000060cd88] ._spin_lock+0x10/0x24
[c0000000c63f3920] [c000000000150030] .__slab_alloc_page+0x390/0x430
[c0000000c63f39e0] [c0000000001518f0] .kmem_cache_alloc+0x160/0x2bc
[c0000000c63f3aa0] [c000000000152308] .kmem_cache_create+0x294/0x2a8
[c0000000c63f3b90] [d000000000eb177c] .scsi_init_queue+0x38/0x170 [scsi_mod]
[c0000000c63f3c20] [d000000000eb1678] .init_scsi+0x1c/0xe8 [scsi_mod]
[c0000000c63f3ca0] [c0000000000097a0] .do_one_initcall+0x80/0x19c
[c0000000c63f3d90] [c0000000000c7a08] .SyS_init_module+0x118/0x28c
[c0000000c63f3e30] [c000000000008534] syscall_exit+0x0/0x40
Instruction dump:
7c0802a6 fba1ffe8 7d800026 7c7d1b78 fbc1fff0 ebc2c268 f8010010 fb61ffd8
fb81ffe0 fbe1fff8 91810008 f821ff61 <80030004> 6c09dead 2f894ead 419e000c
---[ end trace e7d1b9681037bc75 ]---

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-30  5:33 ` Sachin Sant
@ 2009-06-30  8:34   ` Nick Piggin
  2009-06-30 10:56     ` Sachin Sant
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2009-06-30  8:34 UTC (permalink / raw)
  To: Sachin Sant
  Cc: Heiko Carstens, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

If you are not seeing any extra messages from the patch, then it
could be that you are seeing ZERO_SIZE_PTR or similar errors as
well. I will ask if you can try the patch I sent in the other
thread.

Thanks,
Nick

On Tue, Jun 30, 2009 at 11:03:40AM +0530, Sachin Sant wrote:
> Loglevel set to 8
> Unable to handle kernel paging request for data at address 0xc0000000008f4504
> Faulting instruction address: 0xc000000000391094
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=1024 DEBUG_PAGEALLOC NUMA pSeries
> Modules linked in: scsi_mod(+)
> NIP: c000000000391094 LR: c00000000060cd88 CTR: 0000000000000008
> REGS: c0000000c63f3590 TRAP: 0300 Not tainted (2.6.31-rc1-next-20090629)
> MSR: 8000000000009032 <EE,ME,IR,DR> CR: 24222428 XER: 20000001
> DAR: c0000000008f4504, DSISR: 0000000040000000
> TASK = c0000000c63e0a80[62] 'modprobe' THREAD: c0000000c63f0000 CPU: 1
> GPR00: c00000000060cd88 c0000000c63f3810 c000000000b0c488 c0000000008f4500
> GPR04: c0000000ddcd0000 0000000000000001 0000000000000000 c0000000dfff8480
> GPR08: 0000000000000001 c0000000de000010 0000000000000002 c000000000c681f8
> GPR12: 0000000024222428 c000000000be2600 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000000018 ffffffffffffffff 0000000000210d00 0000000000000010
> GPR24: 0000000000210d00 c0000000dfc60ea0 c000000000fc1580 c0000000008f4500
> GPR28: c0000000008f4448 c0000000008f4500 c000000000a87bd0 c0000000dfc60e80
> NIP [c000000000391094] ._raw_spin_lock+0x30/0x184
> LR [c00000000060cd88] ._spin_lock+0x10/0x24
> Call Trace:
> [c0000000c63f38b0] [c00000000060cd88] ._spin_lock+0x10/0x24
> [c0000000c63f3920] [c000000000150030] .__slab_alloc_page+0x390/0x430
> [c0000000c63f39e0] [c0000000001518f0] .kmem_cache_alloc+0x160/0x2bc
> [c0000000c63f3aa0] [c000000000152308] .kmem_cache_create+0x294/0x2a8
> [c0000000c63f3b90] [d000000000eb177c] .scsi_init_queue+0x38/0x170 [scsi_mod]
> [c0000000c63f3c20] [d000000000eb1678] .init_scsi+0x1c/0xe8 [scsi_mod]
> [c0000000c63f3ca0] [c0000000000097a0] .do_one_initcall+0x80/0x19c
> [c0000000c63f3d90] [c0000000000c7a08] .SyS_init_module+0x118/0x28c
> [c0000000c63f3e30] [c000000000008534] syscall_exit+0x0/0x40
> Instruction dump:
> 7c0802a6 fba1ffe8 7d800026 7c7d1b78 fbc1fff0 ebc2c268 f8010010 fb61ffd8
> fb81ffe0 fbe1fff8 91810008 f821ff61 <80030004> 6c09dead 2f894ead 419e000c
> ---[ end trace e7d1b9681037bc75 ]---

^ permalink raw reply	[flat|nested] 19+ messages in thread
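The oops quoted above can be cross-checked by hand. The word marked `<80030004>` in the instruction dump is the faulting instruction; assuming the standard PowerPC D-form layout (6-bit opcode, two 5-bit register fields, 16-bit signed displacement) and that opcode 32 is `lwz`, it decodes to `lwz r0,4(r3)`. GPR03 holds c0000000008f4500, so the effective address is c0000000008f4504, which matches the DAR. The sketch below does that arithmetic; the struct and function names are hypothetical and exist only for this example. Nothing in the thread states what the load was reading, so any interpretation beyond the address match is a guess.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a PowerPC D-form instruction word. */
struct sketch_dform {
	unsigned int opcode;  /* bits 0-5 (most significant) */
	unsigned int rt;      /* target register */
	unsigned int ra;      /* base register */
	int16_t d;            /* signed 16-bit displacement */
};

static struct sketch_dform sketch_decode_dform(uint32_t insn)
{
	struct sketch_dform f;

	f.opcode = insn >> 26;
	f.rt = (insn >> 21) & 0x1f;
	f.ra = (insn >> 16) & 0x1f;
	f.d = (int16_t)(insn & 0xffff);
	return f;
}

/* Effective address of a D-form load; a base field of 0 means "no base register". */
static uint64_t sketch_effective_address(struct sketch_dform f, uint64_t ra_value)
{
	uint64_t base = f.ra ? ra_value : 0;

	return base + (uint64_t)(int64_t)f.d;
}

/* Reproduce the numbers from the oops: instruction 80030004, GPR03 = c0000000008f4500. */
static void sketch_check_oops(void)
{
	struct sketch_dform f = sketch_decode_dform(0x80030004u);

	assert(f.opcode == 32 && f.rt == 0 && f.ra == 3 && f.d == 4);
	assert(sketch_effective_address(f, 0xc0000000008f4500ULL) ==
	       0xc0000000008f4504ULL);
}
```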
* Re: Next June 29: Boot failure with SLQB on s390
  2009-06-30  8:34 ` Nick Piggin
@ 2009-06-30 10:56   ` Sachin Sant
  0 siblings, 0 replies; 19+ messages in thread
From: Sachin Sant @ 2009-06-30 10:56 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Heiko Carstens, Pekka Enberg, Stephen Rothwell, linux-next, linux-s390

Nick Piggin wrote:
> If you are not seeing any extra messages from the patch, then it
> could be that you are seeing ZERO_SIZE_PTR or similar errors as
> well. I will ask if you can try the patch I sent in the other
> thread.
Still facing the same problem after applying the patch you mentioned.
Probably the s390 issue is different from the Power6 issue.

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

^ permalink raw reply	[flat|nested] 19+ messages in thread
end of thread, other threads:[~2009-06-30 10:57 UTC | newest]

Thread overview: 19+ messages
2009-06-29  6:48 linux-next: Tree for June 29 Stephen Rothwell
2009-06-29  9:44 ` Next June 29: Boot failure with SLQB on s390 Sachin Sant
2009-06-29 10:31 ` Heiko Carstens
2009-06-29 10:39 ` Nick Piggin
2009-06-29 11:50 ` Heiko Carstens
2009-06-29 11:58 ` Nick Piggin
2009-06-29 13:09 ` Heiko Carstens
2009-06-29 14:12 ` Heiko Carstens
2009-06-30  7:34 ` Nick Piggin
2009-06-30  9:06 ` Nick Piggin
2009-06-30  9:20 ` Pekka Enberg
2009-06-30  9:27 ` Nick Piggin
2009-06-30  9:30 ` Pekka Enberg
2009-06-30 10:09 ` Heiko Carstens
2009-06-30 10:29 ` Nick Piggin
2009-06-30 10:57 ` Pekka Enberg
2009-06-30  5:33 ` Sachin Sant
2009-06-30  8:34 ` Nick Piggin
2009-06-30 10:56 ` Sachin Sant