public inbox for linux-kernel@vger.kernel.org
* Re: boot panic after oops from sysfs_new_dirent
       [not found] <Pine.SOC.4.64.0908311646150.20934@math.ut.ee>
@ 2009-08-31 21:22 ` David Miller
  2009-09-01  7:05   ` Meelis Roos
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2009-08-31 21:22 UTC
  To: mroos; +Cc: sparclinux, linux-kernel, tj

From: Meelis Roos <mroos@linux.ee>
Date: Mon, 31 Aug 2009 16:51:45 +0300 (EEST)

[ CC:'ing Tejun Heo and lkml as Tejun recently made some changes which
  were meant to "fix" things in this area :-) ]

> I powered on my 6-CPU E3000 after a summer pause and tried today's
> 2.6.31-rc8+git on it. It fails: first it spills lots of warnings about
> percpu things, then oopses in sysfs_new_dirent and then panics.
> 
> The TOD failure is a known problem waiting for replacement clock board, 
> it does not seem to disturb other Linux versions (2.6.30 runs fine).
> 
> If it gives no ideas then I might find the time to bisect it but not 
> today.
> 
> 3,0>ERROR: TEST=NVRAM Devices,SUBTEST=M48T59 (TOD) Init ID=8.1
> 3,0>Component under test: Board 16 Firehose Bus
> 3,0>TODC battery is low bit set
> Detected failed TOD on clock board. Using backup TOD on board in slot 1 
> fhc ac simm-status environment sram flashprom SUNW,UltraSPARC-II SUNW,UltraSPARC-II 
> fhc ac simm-status environment sram flashprom SUNW,UltraSPARC-II SUNW,UltraSPARC-II 
> fhc ac simm-status environment sram flashprom SUNW,UltraSPARC-II SUNW,UltraSPARC-II 
> Probing UPA Slot at 2,0   sbus fhc ac environment flashprom eeprom sbus-speed counter-timer 
> Probing UPA Slot at 3,0   sbus counter-timer 
> Probing /sbus@2,0 at d,0  SUNW,soc 
> Probing /sbus@2,0 at 1,0  SUNW,socal sf ssd sf ssd 
> Probing /sbus@2,0 at 2,0  QLGC,isp sd st 
> Probing /sbus@3,0 at 3,0  SUNW,hme SUNW,fas sd st 
> Probing /sbus@3,0 at 0,0  SUNW,qfe SUNW,qfe SUNW,qfe SUNW,qfe 
> 4-slot Sun Enterprise 3000, No Keyboard
> OpenBoot 3.2.29, 3584 MB memory installed, Serial #8631214.
> Copyright 2001 Sun Microsystems, Inc.  All rights reserved
> Ethernet address 8:0:20:83:b3:ae, Host ID: 8083b3ae.
> 
> 
> 
> Boot device: disk  File and args: 
> SILO Version 1.4.13
> boot: 
> Linux                    LinuxOLD                 test                     
> hea                      
> boot: test console=ttyS0
> Allocated 8 Megs of memory at 0x40000000 for kernel
> Uncompressing image...
> Loaded kernel version 2.6.31
> 
> [    0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.29 2001/06/18 17:28'
> [    0.000000] PROMLIB: Root node compatible: 
> [    0.000000] Linux version 2.6.31-rc8 (mroos@mandel) (gcc version 4.3.4 (Debian 4.3.4-2) ) #11 SMP Mon Aug 31 14:24:32 EEST 2009
> [    0.000000] debug: ignoring loglevel setting.
> [    0.000000] console [earlyprom0] enabled
> [    0.000000] ARCH: SUN4U
> [    0.000000] Ethernet address: 08:00:20:83:b3:ae
> [    0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
> [    0.000000] Remapping the kernel... done.
> [    0.000000] OF stdout device is: /central@1f,0/fhc@0,f8800000/zs@0,902000:a
> [    0.000000] PROM: Built device tree with 70462 bytes of memory.
> [    0.000000] Top of RAM: 0xdfd14000, Total RAM: 0xdf960000
> [    0.000000] Memory hole size: 3MB
> [    0.000000] [0000000200000000-fffff80001400000] page_structs=131072 node=0 entry=0/0
> [    0.000000] [0000000200000000-fffff80001800000] page_structs=131072 node=0 entry=1/0
> [    0.000000] [0000000200000000-fffff80001c00000] page_structs=131072 node=0 entry=2/0
> [    0.000000] [0000000200c00000-fffff80002000000] page_structs=131072 node=0 entry=3/0
> [    0.000000] [0000000200c00000-fffff80002400000] page_structs=131072 node=0 entry=4/0
> [    0.000000] [0000000200c00000-fffff80002800000] page_structs=131072 node=0 entry=5/0
> [    0.000000] [0000000201800000-fffff80002c00000] page_structs=131072 node=0 entry=6/0
> [    0.000000] [0000000201800000-fffff80003000000] page_structs=131072 node=0 entry=7/0
> [    0.000000] [0000000201800000-fffff80003400000] page_structs=131072 node=0 entry=8/0
> [    0.000000] [0000000202400000-fffff80003800000] page_structs=131072 node=0 entry=9/0
> [    0.000000] [0000000202400000-fffff80003c00000] page_structs=131072 node=0 entry=10/0
> [    0.000000] [0000000202400000-fffff80004000000] page_structs=131072 node=0 entry=11/0
> [    0.000000] Zone PFN ranges:
> [    0.000000]   Normal   0x00000000 -> 0x0006fe8a
> [    0.000000] Movable zone start PFN for each node
> [    0.000000] early_node_map[3] active PFN ranges
> [    0.000000]     0: 0x00000000 -> 0x0006fc27
> [    0.000000]     0: 0x0006fe00 -> 0x0006fe7f
> [    0.000000]     0: 0x0006fe80 -> 0x0006fe8a
> [    0.000000] On node 0 totalpages: 457904
> [    0.000000]   Normal zone: 5372 pages used for memmap
> [    0.000000]   Normal zone: 0 pages reserved
> [    0.000000]   Normal zone: 452532 pages, LIFO batch:15
> [    0.000000] Booting Linux...
> [    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 452532
> [    0.000000] Kernel command line: root=/dev/sda2 ro debug ignore_loglevel console=ttyS0
> [    0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
> [    0.000000] Dentry cache hash table entries: 524288 (order: 9, 4194304 bytes)
> [    0.000000] Inode-cache hash table entries: 262144 (order: 8, 2097152 bytes)
> [    0.000000] Memory: 3593384k available (2896k kernel code, 1392k data, 152k init) [fffff80000000000,00000000dfd14000]
> [    0.000000] SLUB: Genslabs=14, HWalign=32, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
> [    0.000000] RCU-based detection of stalled CPUs is enabled.
> [    0.000000] NR_IRQS:255
> [    0.000000] clocksource: mult[40842] shift[16]
> [    0.000000] clockevent: mult[3f7ced91] shift[32]
> [  126.197098] Console: colour dummy device 80x25
> [  126.250110] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
> [  126.342739] ... MAX_LOCKDEP_SUBCLASSES:  8
> [  126.391701] ... MAX_LOCK_DEPTH:          48
> [  126.441711] ... MAX_LOCKDEP_KEYS:        8191
> [  126.493801] ... CLASSHASH_SIZE:          4096
> [  126.545893] ... MAX_LOCKDEP_ENTRIES:     16384
> [  126.599026] ... MAX_LOCKDEP_CHAINS:      32768
> [  126.652159] ... CHAINHASH_SIZE:          16384
> [  126.705292]  memory used by lock dependency info: 5695 kB
> [  126.769885]  per task-struct memory footprint: 1920 bytes
> [  126.984803] Calibrating delay using timer specific routine.. 498.95 BogoMIPS (lpj=2494757)
> [  127.083344] ------------[ cut here ]------------
> [  127.138101] WARNING: at mm/percpu.c:651 pcpu_map+0xdc/0x100()
> [  127.206826] Modules linked in:
> [  127.243239] Call Trace:
> [  127.272498]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  127.342282]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  127.409997]  [00000000004d493c] pcpu_map+0xdc/0x100
> [  127.468338]  [00000000004d59a4] pcpu_alloc+0x3e4/0x4e0
> [  127.529806]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  127.593363]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  127.664223]  [000000000084d6cc] mmap_init+0x14/0x24
> [  127.722547]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  127.788179]  [0000000000842910] start_kernel+0x2b4/0x338
> [  127.851748]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  127.915271]  [0000000000000000] (null)
> [  127.960244] ---[ end trace 139ce121c98e96c9 ]---
> [  128.015292] ------------[ cut here ]------------
> [  128.070525] WARNING: at mm/vmalloc.c:106 vmap_page_range_noflush+0x228/0x280()
> [  128.156969] Modules linked in:
> [  128.193383] Call Trace:
> [  128.222627]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  128.292429]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  128.360144]  [00000000004c3ce8] vmap_page_range_noflush+0x228/0x280
> [  128.435155]  [00000000004c42fc] map_kernel_range_noflush+0x1c/0x40
> [  128.509122]  [00000000004d4908] pcpu_map+0xa8/0x100
> [  128.567466]  [00000000004d59a4] pcpu_alloc+0x3e4/0x4e0
> [  128.628932]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  128.692485]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  128.763336]  [000000000084d6cc] mmap_init+0x14/0x24
> [  128.821670]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  128.887305]  [0000000000842910] start_kernel+0x2b4/0x338
> [  128.950861]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  129.014396]  [0000000000000000] (null)
> [  129.059195] ---[ end trace 139ce121c98e96ca ]---
> [  129.114459] ------------[ cut here ]------------
> [  129.169644] WARNING: at mm/percpu.c:565 pcpu_depopulate_chunk+0x1e4/0x200()
> [  129.252971] Modules linked in:
> [  129.289383] Call Trace:
> [  129.318627]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  129.388429]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  129.456143]  [00000000004d4b44] pcpu_depopulate_chunk+0x1e4/0x200
> [  129.529071]  [00000000004d5970] pcpu_alloc+0x3b0/0x4e0
> [  129.590535]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  129.654088]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  129.724940]  [000000000084d6cc] mmap_init+0x14/0x24
> [  129.783276]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  129.848909]  [0000000000842910] start_kernel+0x2b4/0x338
> [  129.912466]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  129.976002]  [0000000000000000] (null)
> [  130.020798] ---[ end trace 139ce121c98e96cb ]---
> [  130.076017] ------------[ cut here ]------------
> [  130.131247] WARNING: at mm/vmalloc.c:43 vunmap_page_range+0x104/0x160()
> [  130.210407] Modules linked in:
> [  130.246822] Call Trace:
> [  130.276062]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  130.345867]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  130.413579]  [00000000004c2ae4] vunmap_page_range+0x104/0x160
> [  130.482342]  [00000000004c2b94] unmap_kernel_range_noflush+0x14/0x40
> [  130.558394]  [00000000004d4aa4] pcpu_depopulate_chunk+0x144/0x200
> [  130.631322]  [00000000004d5970] pcpu_alloc+0x3b0/0x4e0
> [  130.692788]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  130.756343]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  130.827190]  [000000000084d6cc] mmap_init+0x14/0x24
> [  130.885529]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  130.951161]  [0000000000842910] start_kernel+0x2b4/0x338
> [  131.014717]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  131.078252]  [0000000000000000] (null)
> [  131.123049] ---[ end trace 139ce121c98e96cc ]---
> [  131.178268] ------------[ cut here ]------------
> [  131.233497] WARNING: at mm/vmalloc.c:43 vunmap_page_range+0x104/0x160()
> [  131.312658] Modules linked in:
> [  131.349071] Call Trace:
> [  131.378316]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  131.448117]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  131.515830]  [00000000004c2ae4] vunmap_page_range+0x104/0x160
> [  131.584593]  [00000000004c2b94] unmap_kernel_range_noflush+0x14/0x40
> [  131.660647]  [00000000004d4aa4] pcpu_depopulate_chunk+0x144/0x200
> [  131.733573]  [00000000004d5970] pcpu_alloc+0x3b0/0x4e0
> [  131.795040]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  131.858591]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  131.929440]  [000000000084d6cc] mmap_init+0x14/0x24
> [  131.987779]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  132.053412]  [0000000000842910] start_kernel+0x2b4/0x338
> [  132.116967]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  132.180504]  [0000000000000000] (null)
> [  132.225299] ---[ end trace 139ce121c98e96cd ]---
> [  132.280518] ------------[ cut here ]------------
> [  132.335748] WARNING: at mm/vmalloc.c:43 vunmap_page_range+0x104/0x160()
> [  132.414909] Modules linked in:
> [  132.451323] Call Trace:
> [  132.480566]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  132.550368]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  132.618083]  [00000000004c2ae4] vunmap_page_range+0x104/0x160
> [  132.686844]  [00000000004c2b94] unmap_kernel_range_noflush+0x14/0x40
> [  132.762896]  [00000000004d4aa4] pcpu_depopulate_chunk+0x144/0x200
> [  132.835822]  [00000000004d5970] pcpu_alloc+0x3b0/0x4e0
> [  132.897291]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  132.960844]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  133.031691]  [000000000084d6cc] mmap_init+0x14/0x24
> [  133.090030]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  133.155662]  [0000000000842910] start_kernel+0x2b4/0x338
> [  133.219219]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  133.282756]  [0000000000000000] (null)
> [  133.327552] ---[ end trace 139ce121c98e96ce ]---
> [  133.382770] ------------[ cut here ]------------
> [  133.438001] WARNING: at mm/vmalloc.c:43 vunmap_page_range+0x104/0x160()
> [  133.517162] Modules linked in:
> [  133.553574] Call Trace:
> [  133.582818]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  133.652618]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  133.720334]  [00000000004c2ae4] vunmap_page_range+0x104/0x160
> [  133.789096]  [00000000004c2b94] unmap_kernel_range_noflush+0x14/0x40
> [  133.865149]  [00000000004d4aa4] pcpu_depopulate_chunk+0x144/0x200
> [  133.938075]  [00000000004d5970] pcpu_alloc+0x3b0/0x4e0
> [  133.999544]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  134.063095]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  134.133943]  [000000000084d6cc] mmap_init+0x14/0x24
> [  134.192282]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  134.257916]  [0000000000842910] start_kernel+0x2b4/0x338
> [  134.321471]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  134.385006]  [0000000000000000] (null)
> [  134.429803] ---[ end trace 139ce121c98e96cf ]---
> [  134.485020] ------------[ cut here ]------------
> [  134.540252] WARNING: at mm/vmalloc.c:43 vunmap_page_range+0x104/0x160()
> [  134.619414] Modules linked in:
> [  134.655826] Call Trace:
> [  134.685068]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  134.754872]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  134.822585]  [00000000004c2ae4] vunmap_page_range+0x104/0x160
> [  134.891346]  [00000000004c2b94] unmap_kernel_range_noflush+0x14/0x40
> [  134.967399]  [00000000004d4aa4] pcpu_depopulate_chunk+0x144/0x200
> [  135.040325]  [00000000004d5970] pcpu_alloc+0x3b0/0x4e0
> [  135.101794]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  135.165347]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  135.236197]  [000000000084d6cc] mmap_init+0x14/0x24
> [  135.294533]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  135.360165]  [0000000000842910] start_kernel+0x2b4/0x338
> [  135.423723]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  135.487257]  [0000000000000000] (null)
> [  135.532055] ---[ end trace 139ce121c98e96d0 ]---
> [  135.587271] ------------[ cut here ]------------
> [  135.642504] WARNING: at mm/vmalloc.c:43 vunmap_page_range+0x104/0x160()
> [  135.721665] Modules linked in:
> [  135.758077] Call Trace:
> [  135.787320]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  135.857123]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  135.924836]  [00000000004c2ae4] vunmap_page_range+0x104/0x160
> [  135.993597]  [00000000004c2b94] unmap_kernel_range_noflush+0x14/0x40
> [  136.069651]  [00000000004d4aa4] pcpu_depopulate_chunk+0x144/0x200
> [  136.142576]  [00000000004d5970] pcpu_alloc+0x3b0/0x4e0
> [  136.204044]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  136.267598]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  136.338446]  [000000000084d6cc] mmap_init+0x14/0x24
> [  136.396784]  [000000000084910c] proc_caches_init+0xdc/0xec
> [  136.462418]  [0000000000842910] start_kernel+0x2b4/0x338
> [  136.525971]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  136.589510]  [0000000000000000] (null)
> [  136.634308] ---[ end trace 139ce121c98e96d1 ]---
> [  136.691136] ------------[ cut here ]------------
> [  136.745488] WARNING: at mm/percpu.c:651 pcpu_map+0xdc/0x100()
> [  136.814227] Modules linked in:
> [  136.850641] Call Trace:
> [  136.879883]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  136.949683]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  137.017398]  [00000000004d493c] pcpu_map+0xdc/0x100
> [  137.075741]  [00000000004d59a4] pcpu_alloc+0x3e4/0x4e0
> [  137.137209]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  137.200760]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  137.271601]  [000000000084f2b4] files_init+0x68/0x78
> [  137.330987]  [000000000084f450] vfs_caches_init+0xa0/0x150
> [  137.396620]  [0000000000842928] start_kernel+0x2cc/0x338
> [  137.460178]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  137.523715]  [0000000000000000] (null)
> [  137.568512] ---[ end trace 139ce121c98e96d2 ]---
> [  137.623820] Mount-cache hash table entries: 512
> [  137.678029] ------------[ cut here ]------------
> [  137.733137] WARNING: at mm/percpu.c:651 pcpu_map+0xdc/0x100()
> [  137.801877] Modules linked in:
> [  137.838289] Call Trace:
> [  137.867534]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  137.937335]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  138.005048]  [00000000004d493c] pcpu_map+0xdc/0x100
> [  138.063389]  [00000000004d59a4] pcpu_alloc+0x3e4/0x4e0
> [  138.124859]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  138.188411]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  138.259264]  [00000000004b5338] bdi_init+0x38/0xc0
> [  138.316559]  [00000000008508d0] sysfs_inode_init+0x8/0x1c
> [  138.381145]  [0000000000850914] sysfs_init+0x30/0xac
> [  138.440530]  [000000000084f9b0] mnt_init+0x90/0x1a4
> [  138.498873]  [000000000084f458] vfs_caches_init+0xa8/0x150
> [  138.564509]  [0000000000842928] start_kernel+0x2cc/0x338
> [  138.628065]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  138.691601]  [0000000000000000] (null)
> [  138.736396] ---[ end trace 139ce121c98e96d3 ]---
> [  138.791615] ------------[ cut here ]------------
> [  138.846847] WARNING: at mm/vmalloc.c:106 vmap_page_range_noflush+0x228/0x280()
> [  138.933300] Modules linked in:
> [  138.969712] Call Trace:
> [  138.998957]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  139.068758]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  139.136472]  [00000000004c3ce8] vmap_page_range_noflush+0x228/0x280
> [  139.211484]  [00000000004c42fc] map_kernel_range_noflush+0x1c/0x40
> [  139.285451]  [00000000004d4908] pcpu_map+0xa8/0x100
> [  139.343795]  [00000000004d59a4] pcpu_alloc+0x3e4/0x4e0
> [  139.405263]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  139.468814]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  139.539659]  [00000000004b5338] bdi_init+0x38/0xc0
> [  139.596959]  [00000000008508d0] sysfs_inode_init+0x8/0x1c
> [  139.661551]  [0000000000850914] sysfs_init+0x30/0xac
> [  139.720934]  [000000000084f9b0] mnt_init+0x90/0x1a4
> [  139.779276]  [000000000084f458] vfs_caches_init+0xa8/0x150
> [  139.844911]  [0000000000842928] start_kernel+0x2cc/0x338
> [  139.908469]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  139.972005]  [0000000000000000] (null)
> [  140.016800] ---[ end trace 139ce121c98e96d4 ]---
> [  140.072067] ------------[ cut here ]------------
> [  140.127250] WARNING: at mm/percpu.c:565 pcpu_depopulate_chunk+0x1e4/0x200()
> [  140.210578] Modules linked in:
> [  140.246991] Call Trace:
> [  140.276235]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  140.346036]  [000000000045ebdc] warn_slowpath_null+0x1c/0x40
> [  140.413750]  [00000000004d4b44] pcpu_depopulate_chunk+0x1e4/0x200
> [  140.486675]  [00000000004d5970] pcpu_alloc+0x3b0/0x4e0
> [  140.548143]  [00000000004d5af8] __alloc_percpu+0x18/0x40
> [  140.611697]  [00000000005b112c] __percpu_counter_init+0x4c/0xc0
> [  140.682543]  [00000000004b5338] bdi_init+0x38/0xc0
> [  140.739842]  [00000000008508d0] sysfs_inode_init+0x8/0x1c
> [  140.804433]  [0000000000850914] sysfs_init+0x30/0xac
> [  140.863815]  [000000000084f9b0] mnt_init+0x90/0x1a4
> [  140.922159]  [000000000084f458] vfs_caches_init+0xa8/0x150
> [  140.987795]  [0000000000842928] start_kernel+0x2cc/0x338
> [  141.051351]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  141.114887]  [0000000000000000] (null)
> [  141.159683] ---[ end trace 139ce121c98e96d5 ]---
> [  141.216152] ------------[ cut here ]------------
> [  141.270459] WARNING: at lib/kobject.c:595 kobject_put+0x40/0x60()
> [  141.343360] kobject: '<NULL>' (fffff800df022ad0): is not initialized, yet kobject_put() is being called.
> [  141.456910] Modules linked in:
> [  141.493327] Call Trace:
> [  141.522571]  [000000000045eb70] warn_slowpath_common+0x50/0xa0
> [  141.592368]  [000000000045ec34] warn_slowpath_fmt+0x34/0x60
> [  141.659045]  [0000000000598d60] kobject_put+0x40/0x60
> [  141.719469]  [00000000004d3ba0] kmem_cache_destroy+0x180/0x1e0
> [  141.789270]  [0000000000850940] sysfs_init+0x5c/0xac
> [  141.848650]  [000000000084f9b0] mnt_init+0x90/0x1a4
> [  141.906994]  [000000000084f458] vfs_caches_init+0xa8/0x150
> [  141.972628]  [0000000000842928] start_kernel+0x2cc/0x338
> [  142.036187]  [00000000006c5004] tlb_fixup_done+0xa0/0xbc
> [  142.099721]  [0000000000000000] (null)
> [  142.144518] ---[ end trace 139ce121c98e96d6 ]---
> [  142.199742] mnt_init: sysfs_init error: -12
> [  142.249979] Unable to handle kernel NULL pointer dereference
> [  142.317414] tsk->{mm,active_mm}->context = 0000000000000000
> [  142.384089] tsk->{mm,active_mm}->pgd = fffff8000086bb74
> [  142.446597]               \|/ ____ \|/
> [  142.446613]               "@'/ .. \`@"
> [  142.446627]               /_| \__/ |_\
> [  142.446640]                  \__U_/
> [  142.622671] swapper(0): Oops [#1]
> [  142.662298] TSTATE: 0000000080e01603 TPC: 00000000004d2198 TNPC: 00000000004d219c Y: 00000000    Tainted: G        W 
> [  142.789371] TPC: <kmem_cache_alloc+0x58/0x120>
> [  142.842497] g0: 0000000000000000 g1: 0000000000000140 g2: 0000000000000000 g3: 00000000007d6010
> [  142.946679] g4: 00000000007d6010 g5: 0000000100f9c000 g6: 0000000000834000 g7: 0000000000667300
> [  143.050861] o0: 000000000078d9f0 o1: 00000000000006a1 o2: 0000000000000000 o3: 0000000000000008
> [  143.155046] o4: fffff800df056008 o5: 0000000066730000 sp: 0000000000836e81 ret_pc: 00000000004d2184
> [  143.263397] RPC: <kmem_cache_alloc+0x44/0x120>
> [  143.316526] l0: 0000000000000000 l1: 00000000000080d0 l2: 00000000000012d0 l3: 000000000053c1b0
> [  143.420710] l4: 0000000000000000 l5: 0000000000000000 l6: 00000000000212d0 l7: 0000000000000010
> [  143.524891] i0: 0000000000000000 i1: 00000000000080d0 i2: 0000000000000001 i3: 0000000000000001
> [  143.629075] i4: 0000000000000000 i5: 0000000000000002 i6: 0000000000836f41 i7: 000000000053c1b0
> [  143.733265] I7: <sysfs_new_dirent+0x30/0x120>
> [  143.785344] Disabling lock debugging due to kernel taint
> [  143.848907] Caller[000000000053c1b0]: sysfs_new_dirent+0x30/0x120
> [  143.921834] Caller[000000000053c7a4]: create_dir+0x24/0xc0
> [  143.987469] Caller[000000000053c870]: sysfs_create_dir+0x30/0x80
> [  144.059361] Caller[00000000005990e8]: kobject_add_internal+0xc8/0x200
> [  144.136454] Caller[0000000000599354]: kobject_add_varg+0x34/0x60
> [  144.208341] Caller[0000000000599424]: kobject_add+0x44/0x80
> [  144.275018] Caller[0000000000599490]: kobject_create_and_add+0x30/0x80
> [  144.353148] Caller[000000000084f9cc]: mnt_init+0xac/0x1a4
> [  144.417742] Caller[000000000084f458]: vfs_caches_init+0xa8/0x150
> [  144.489628] Caller[0000000000842928]: start_kernel+0x2cc/0x338
> [  144.559437] Caller[00000000006c5004]: tlb_fixup_done+0xa0/0xbc
> [  144.629223] Caller[0000000000000000]: (null)
> [  144.680270] Instruction DUMP: c211a03e  82006022  83287003 <d85d0001> f05b0000  02c60026  e4032018  c2032014  83287003 
> [  144.809463] Kernel panic - not syncing: Attempted to kill the idle task!
> [  144.889675] Call Trace:
> [  144.918870]  [00000000006d3958] panic+0x5c/0x1a0
> [  144.974082]  [0000000000462d4c] do_exit+0x62c/0x6a0
> [  145.032424]  [0000000000427e54] die_if_kernel+0xf4/0x300
> [  145.095991]  [0000000000447bec] unhandled_fault+0x6c/0xc0
> [  145.160570]  [0000000000447d20] do_sparc64_fault+0xe0/0x680
> [  145.227247]  [0000000000407b04] sparc64_realfault_common+0x10/0x20
> [  145.301210]  [00000000004d2198] kmem_cache_alloc+0x58/0x120
> [  145.367886]  [000000000053c1b0] sysfs_new_dirent+0x30/0x120
> [  145.434564]  [000000000053c7a4] create_dir+0x24/0xc0
> [  145.493947]  [000000000053c870] sysfs_create_dir+0x30/0x80
> [  145.559589]  [00000000005990e8] kobject_add_internal+0xc8/0x200
> [  145.630432]  [0000000000599354] kobject_add_varg+0x34/0x60
> [  145.696067]  [0000000000599424] kobject_add+0x44/0x80
> [  145.756495]  [0000000000599490] kobject_create_and_add+0x30/0x80
> [  145.828373]  [000000000084f9cc] mnt_init+0xac/0x1a4
> [  145.886717]  [000000000084f458] vfs_caches_init+0xa8/0x150
> [  145.952344] Press Stop-A (L1-A) to return to the boot prom
> 
> -- 
> Meelis Roos (mroos@linux.ee)

* Re: boot panic after oops from sysfs_new_dirent
  2009-08-31 21:22 ` boot panic after oops from sysfs_new_dirent David Miller
@ 2009-09-01  7:05   ` Meelis Roos
  2009-09-01  7:14     ` Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: Meelis Roos @ 2009-09-01  7:05 UTC
  To: David Miller; +Cc: sparclinux, Linux Kernel list, tj

> > I powered on my 6-CPU E3000 after a summer pause and tried today's
> > 2.6.31-rc8+git on it. It fails: first it spills lots of warnings about
> > percpu things, then oopses in sysfs_new_dirent and then panics.

Bisecting shows that there may be different crashes, like the one at commit
56513ed50cc8a5c184a3f347e81d74c850cc14fa below:

3,0>ERROR: TEST=NVRAM Devices,SUBTEST=M48T59 (TOD) Init ID=8.1
3,0>Component under test: Board 16 Firehose Bus
3,0>TODC battery is low bit set
Detected failed TOD on clock board. Using backup TOD on board in slot 1 
fhc ac simm-status environment sram flashprom SUNW,UltraSPARC-II SUNW,UltraSPARC-II 
fhc ac simm-status environment sram flashprom SUNW,UltraSPARC-II SUNW,UltraSPARC-II 
fhc ac simm-status environment sram flashprom SUNW,UltraSPARC-II SUNW,UltraSPARC-II 
Probing UPA Slot at 2,0   sbus fhc ac environment flashprom eeprom sbus-speed counter-timer 
Probing UPA Slot at 3,0   sbus counter-timer 
Probing /sbus@2,0 at d,0  SUNW,soc 
Probing /sbus@2,0 at 1,0  SUNW,socal sf ssd sf ssd 
Probing /sbus@2,0 at 2,0  QLGC,isp sd st 
Probing /sbus@3,0 at 3,0  SUNW,hme SUNW,fas sd st 
Probing /sbus@3,0 at 0,0  SUNW,qfe SUNW,qfe SUNW,qfe SUNW,qfe 
4-slot Sun Enterprise 3000, No Keyboard
OpenBoot 3.2.29, 3584 MB memory installed, Serial #8631214.
Copyright 2001 Sun Microsystems, Inc.  All rights reserved
Ethernet address 8:0:20:83:b3:ae, Host ID: 8083b3ae.



Boot device: disk  File and args: 
SILO Version 1.4.13
boot: 
Linux                    LinuxOLD                 test                     
hea                      
boot: test console=ttyS0
Allocated 8 Megs of memory at 0x40000000 for kernel
Uncompressing image...
Loaded kernel version 2.6.30

[    0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.29 2001/06/18 17:28'
[    0.000000] PROMLIB: Root node compatible: 
[    0.000000] Linux version 2.6.30 (mroos@mandel) (gcc version 4.3.4 (Debian 4.3.4-2) ) #12 SMP Mon Aug 31 18:04:20 EEST 2009
[    0.000000] debug: ignoring loglevel setting.
[    0.000000] console [earlyprom0] enabled
[    0.000000] ARCH: SUN4U
[    0.000000] Ethernet address: 08:00:20:83:b3:ae
[    0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
[    0.000000] Remapping the kernel... done.
[    0.000000] OF stdout device is: /central@1f,0/fhc@0,f8800000/zs@0,902000:a
[    0.000000] PROM: Built device tree with 70462 bytes of memory.
[    0.000000] Top of RAM: 0xdfd14000, Total RAM: 0xdf960000
[    0.000000] Memory hole size: 3MB
[    0.000000] [0000000200000000-fffff80001400000] page_structs=131072 node=0 entry=0/0
[    0.000000] [0000000200000000-fffff80001800000] page_structs=131072 node=0 entry=1/0
[    0.000000] [0000000200000000-fffff80001c00000] page_structs=131072 node=0 entry=2/0
[    0.000000] [0000000200c00000-fffff80002000000] page_structs=131072 node=0 entry=3/0
[    0.000000] [0000000200c00000-fffff80002400000] page_structs=131072 node=0 entry=4/0
[    0.000000] [0000000200c00000-fffff80002800000] page_structs=131072 node=0 entry=5/0
[    0.000000] [0000000201800000-fffff80002c00000] page_structs=131072 node=0 entry=6/0
[    0.000000] [0000000201800000-fffff80003000000] page_structs=131072 node=0 entry=7/0
[    0.000000] [0000000201800000-fffff80003400000] page_structs=131072 node=0 entry=8/0
[    0.000000] [0000000202400000-fffff80003800000] page_structs=131072 node=0 entry=9/0
[    0.000000] [0000000202400000-fffff80003c00000] page_structs=131072 node=0 entry=10/0
[    0.000000] [0000000202400000-fffff80004000000] page_structs=131072 node=0 entry=11/0
[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0006fe8a
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[3] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x0006fc27
[    0.000000]     0: 0x0006fe00 -> 0x0006fe7f
[    0.000000]     0: 0x0006fe80 -> 0x0006fe8a
[    0.000000] On node 0 totalpages: 457904
[    0.000000]   Normal zone: 5372 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 452532 pages, LIFO batch:15
[    0.000000] Booting Linux...
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 452532
[    0.000000] Kernel command line: root=/dev/sda2 ro debug ignore_loglevel console=ttyS0
[    0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[    0.000000] Dentry cache hash table entries: 524288 (order: 9, 4194304 bytes)
[    0.000000] Inode-cache hash table entries: 262144 (order: 8, 2097152 bytes)
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at kernel/lockdep.c:2282 lockdep_trace_alloc+0xb8/0x100()
[    0.000000] Modules linked in:
[    0.000000] Call Trace:
[    0.000000]  [000000000045ec50] warn_slowpath_common+0x50/0xa0
[    0.000000]  [000000000045ecbc] warn_slowpath_null+0x1c/0x40
[    0.000000]  [0000000000488cb8] lockdep_trace_alloc+0xb8/0x100
[    0.000000]  [00000000004a7334] __alloc_pages_internal+0x34/0x4a0
[    0.000000]  [00000000008438bc] mem_init+0x244/0x32c
[    0.000000]  [000000000083e7b4] start_kernel+0x18c/0x334
[    0.000000]  [00000000006c2204] tlb_fixup_done+0xa0/0xbc
[    0.000000]  [0000000000000000] (null)
[    0.000000] ---[ end trace 139ce121c98e96c9 ]---
[    0.000000] Memory: 3593504k available (2888k kernel code, 1392k data, 152k init) [fffff80000000000,00000000dfd14000]
[    0.000000] SLUB: Genslabs=14, HWalign=32, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
[    0.000000] RCU-based detection of stalled CPUs is enabled.
[    0.000000] NR_IRQS:255
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at mm/bootmem.c:535 alloc_arch_preferred_bootmem+0x48/0x5c()
[    0.000000] Modules linked in:
[    0.000000] Call Trace:
[    0.000000]  [000000000045ec50] warn_slowpath_common+0x50/0xa0
[    0.000000]  [000000000045ecbc] warn_slowpath_null+0x1c/0x40
[    0.000000]  [0000000000846ff0] alloc_arch_preferred_bootmem+0x48/0x5c
[    0.000000]  [000000000084781c] ___alloc_bootmem_nopanic+0x20/0xcc
[    0.000000]  [00000000008479ac] ___alloc_bootmem+0x10/0x44
[    0.000000]  [0000000000847b70] __alloc_bootmem+0x10/0x20
[    0.000000]  [000000000083f5bc] init_IRQ+0xc0/0x2b0
[    0.000000]  [000000000083e7f0] start_kernel+0x1c8/0x334
[    0.000000]  [00000000006c2204] tlb_fixup_done+0xa0/0xbc
[    0.000000]  [0000000000000000] (null)
[    0.000000] ---[ end trace 139ce121c98e96ca ]---
[    0.000000] clocksource: mult[40842] shift[16]
[    0.000000] clockevent: mult[3f7ced91] shift[32]
[  125.625114] Console: colour dummy device 80x25
[  125.678129] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[  125.770812] ... MAX_LOCKDEP_SUBCLASSES:  8
[  125.819774] ... MAX_LOCK_DEPTH:          48
[  125.869784] ... MAX_LOCKDEP_KEYS:        8191
[  125.921874] ... CLASSHASH_SIZE:          4096
[  125.973966] ... MAX_LOCKDEP_ENTRIES:     16384
[  126.027100] ... MAX_LOCKDEP_CHAINS:      32768
[  126.080233] ... CHAINHASH_SIZE:          16384
[  126.133367]  memory used by lock dependency info: 5695 kB
[  126.197958]  per task-struct memory footprint: 1920 bytes
[  126.412806] Calibrating delay using timer specific routine.. 498.27 BogoMIPS (lpj=2491393)
[  126.511769] ------------[ cut here ]------------
[  126.566062] WARNING: at mm/vmalloc.c:106 vmap_page_range_noflush+0x228/0x280()
[  126.652505] Modules linked in:
[  126.688929] Call Trace:
[  126.718168]  [000000000045ec50] warn_slowpath_common+0x50/0xa0
[  126.787961]  [000000000045ecbc] warn_slowpath_null+0x1c/0x40
[  126.855677]  [00000000004c3a08] vmap_page_range_noflush+0x228/0x280
[  126.930690]  [00000000004c401c] map_kernel_range_noflush+0x1c/0x40
[  127.004660]  [00000000004d3fbc] pcpu_map+0xbc/0x100
[  127.062999]  [00000000004d5064] pcpu_alloc+0x3e4/0x4e0
[  127.124467]  [00000000004d51b8] __alloc_percpu+0x18/0x40
[  127.188020]  [00000000005af40c] __percpu_counter_init+0x4c/0xc0
[  127.258870]  [000000000084b2d8] files_init+0x68/0x78
[  127.318247]  [000000000084b474] vfs_caches_init+0xa0/0x150
[  127.383888]  [000000000083e8f0] start_kernel+0x2c8/0x334
[  127.447436]  [00000000006c2204] tlb_fixup_done+0xa0/0xbc
[  127.510971]  [0000000000000000] (null)
[  127.555772] ---[ end trace 139ce121c98e96cb ]---
[  127.611597] Mount-cache hash table entries: 512
[  127.665366] ------------[ cut here ]------------
[  127.720400] WARNING: at mm/vmalloc.c:106 vmap_page_range_noflush+0x228/0x280()
[  127.806845] Modules linked in:
[  127.843271] Call Trace:
[  127.872505]  [000000000045ec50] warn_slowpath_common+0x50/0xa0
[  127.942305]  [000000000045ecbc] warn_slowpath_null+0x1c/0x40
[  128.010019]  [00000000004c3a08] vmap_page_range_noflush+0x228/0x280
[  128.085033]  [00000000004c401c] map_kernel_range_noflush+0x1c/0x40
[  128.158998]  [00000000004d3fbc] pcpu_map+0xbc/0x100
[  128.217341]  [00000000004d5064] pcpu_alloc+0x3e4/0x4e0
[  128.278809]  [00000000004d51b8] __alloc_percpu+0x18/0x40
[  128.342360]  [00000000005af40c] __percpu_counter_init+0x4c/0xc0
[  128.413215]  [00000000004b5170] bdi_init+0x50/0xc0
[  128.470512]  [000000000084c958] sysfs_inode_init+0x8/0x1c
[  128.535098]  [000000000084c99c] sysfs_init+0x30/0xac
[  128.594483]  [000000000084b9d4] mnt_init+0x90/0x1ec
[  128.652827]  [000000000084b47c] vfs_caches_init+0xa8/0x150
[  128.718464]  [000000000083e8f0] start_kernel+0x2c8/0x334
[  128.782012]  [00000000006c2204] tlb_fixup_done+0xa0/0xbc
[  128.845550]  [0000000000000000] (null)
[  128.890346] ---[ end trace 139ce121c98e96cc ]---
[  128.946007] Unable to handle kernel NULL pointer dereference
[  129.013245] tsk->{mm,active_mm}->context = 0000000000000000
[  129.079919] tsk->{mm,active_mm}->pgd = fffff80000867934
[  129.142432]               \|/ ____ \|/
[  129.142447]               "@'/ .. \`@"
[  129.142461]               /_| \__/ |_\
[  129.142474]                  \__U_/
[  129.318502] swapper(0): Oops [#1]
[  129.358101] TSTATE: 0000000080e01600 TPC: 00000000004d46b4 TNPC: 00000000004d46b8 Y: 00000000    Tainted: G        W 
[  129.485202] TPC: <free_percpu+0x74/0x180>
[  129.533120] g0: 0000000000000008 g1: e000000004402036 g2: 0000000000000000 g3: fffff80000000000
[  129.637301] g4: 00000000007d2458 g5: 0000000100fa0000 g6: 0000000000830000 g7: 0000000000200200
[  129.741483] o0: 0000000000000000 o1: 00000000008339a0 o2: 0000000000830000 o3: 0000000000000000
[  129.845666] o4: 0000000000000002 o5: 0000000000000000 sp: 0000000000833111 ret_pc: 00000000004d46ac
[  129.954021] RPC: <free_percpu+0x6c/0x180>
[  130.001942] l0: fffff800058181c0 l1: 0000000000000000 l2: 00000000007e2c00 l3: 0000000000784000
[  130.106121] l4: 00000000008339a0 l5: 000000000087eaa0 l6: 0000000000000000 l7: 0000000000000000
[  130.210306] i0: 0000000101802004 i1: 0000000000000000 i2: 000000000106b330 i3: 0000000000000000
[  130.314489] i4: 0000000000000066 i5: 00000000007fdad0 i6: 00000000008331d1 i7: 00000000005af394
[  130.418676] I7: <percpu_counter_destroy+0x54/0x80>
[  130.475979] Caller[00000000005af394]: percpu_counter_destroy+0x54/0x80
[  130.554119] Caller[00000000004b51bc]: bdi_init+0x9c/0xc0
[  130.617671] Caller[000000000084c958]: sysfs_inode_init+0x8/0x1c
[  130.688512] Caller[000000000084c99c]: sysfs_init+0x30/0xac
[  130.754147] Caller[000000000084b9d4]: mnt_init+0x90/0x1ec
[  130.818740] Caller[000000000084b47c]: vfs_caches_init+0xa8/0x150
[  130.890629] Caller[000000000083e8f0]: start_kernel+0x2c8/0x334
[  130.960430] Caller[00000000006c2204]: tlb_fixup_done+0xa0/0xbc
[  131.030219] Caller[0000000000000000]: (null)
[  131.081264] Instruction DUMP: 050041b0  7fffb605  90100018 <e05a2040> c25c2018  d0586008  92260008  90100010  7fffff4e 
[  131.210457] Kernel panic - not syncing: Attempted to kill the idle task!
[  131.290670] Call Trace:
[  131.319860]  [00000000006d0818] panic+0x5c/0x1a0
[  131.375075]  [0000000000462eac] do_exit+0x62c/0x6a0
[  131.433419]  [0000000000427e54] die_if_kernel+0xf4/0x300
[  131.496986]  [0000000000447c0c] unhandled_fault+0x6c/0xc0
[  131.561563]  [0000000000447d40] do_sparc64_fault+0xe0/0x6a0
[  131.628240]  [0000000000407ac4] sparc64_realfault_common+0x10/0x20
[  131.702207]  [00000000004d46b4] free_percpu+0x74/0x180
[  131.763673]  [00000000005af394] percpu_counter_destroy+0x54/0x80
[  131.835563]  [00000000004b51bc] bdi_init+0x9c/0xc0
[  131.892866]  [000000000084c958] sysfs_inode_init+0x8/0x1c
[  131.957456]  [000000000084c99c] sysfs_init+0x30/0xac
[  132.016837]  [000000000084b9d4] mnt_init+0x90/0x1ec
[  132.075182]  [000000000084b47c] vfs_caches_init+0xa8/0x150
[  132.140816]  [000000000083e8f0] start_kernel+0x2c8/0x334
[  132.204367]  [00000000006c2204] tlb_fixup_done+0xa0/0xbc
[  132.267905]  [0000000000000000] (null)
[  132.312705] Press Stop-A (L1-A) to return to the boot prom

-- 
Meelis Roos (mroos@linux.ee)


* Re: boot panic after oops from sysfs_new_dirent
  2009-09-01  7:05   ` Meelis Roos
@ 2009-09-01  7:14     ` Tejun Heo
  2009-09-01  7:30       ` Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2009-09-01  7:14 UTC (permalink / raw)
  To: Meelis Roos; +Cc: David Miller, sparclinux, Linux Kernel list

Meelis Roos wrote:
>>> I powered on my 6-CPU E3000 after a summer pause and tried today's
>>> 2.6.31-rc8+git on it. It fails - first it spills lots of warnings about
>>> percpu things, then oopses in sysfs_new_dirent and then panics.
> 
> Bisecting shows that there may be different crashes, like in
> 56513ed50cc8a5c184a3f347e81d74c850cc14fa below:

Yeah, I was trying to fix that one with
74d46d6b2d23d44d72c37df4c6a5d2e782f7b088 and it seemingly fixed it on
my machine.  Apparently broken on yours.  Are the errors different
before and after 74d46d6b2d?

Thanks.

-- 
tejun


* Re: boot panic after oops from sysfs_new_dirent
  2009-09-01  7:14     ` Tejun Heo
@ 2009-09-01  7:30       ` Tejun Heo
  2009-09-01  7:33         ` Meelis Roos
  0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2009-09-01  7:30 UTC (permalink / raw)
  To: Meelis Roos; +Cc: David Miller, sparclinux, Linux Kernel list

Tejun Heo wrote:
> Meelis Roos wrote:
>>>> I powered on my 6-CPU E3000 after a summer pause and tried today's
>>>> 2.6.31-rc8+git on it. It fails - first it spills lots of warnings about
>>>> percpu things, then oopses in sysfs_new_dirent and then panics.
>> Bisecting shows that there may be different crashes, like in
>> 56513ed50cc8a5c184a3f347e81d74c850cc14fa below:
> 
> Yeah, I was trying to fix that one with
> 74d46d6b2d23d44d72c37df4c6a5d2e782f7b088 and it seemingly fixed it on
> my machine.  Apparently broken on yours.  Are the errors different
> before and after 74d46d6b2d?

Ummm.... one more question.  Is cpu0 missing?  Can you please post the
output of "cat /proc/cpuinfo"?

Thanks.

-- 
tejun


* Re: boot panic after oops from sysfs_new_dirent
  2009-09-01  7:30       ` Tejun Heo
@ 2009-09-01  7:33         ` Meelis Roos
  2009-09-01  7:58           ` Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: Meelis Roos @ 2009-09-01  7:33 UTC (permalink / raw)
  To: Tejun Heo; +Cc: David Miller, sparclinux, Linux Kernel list

> Ummm.... one more question.  Is cpu0 missing?  Can you please post the
> output of "cat /proc/cpuinfo"?

Yes, there's no CPU numbered 0; the numbers come from the hardware slots
on the backplane.

cpu             : TI UltraSparc II  (BlackBird)
fpu             : UltraSparc II integrated FPU
pmu             : ultra12
prom            : OBP 3.2.29 2001/06/18 17:28
type            : sun4u
ncpus probed    : 6
ncpus active    : 6
D$ parity tl1   : 0
I$ parity tl1   : 0
Cpu6ClkTck      : 000000000ec82e00
Cpu7ClkTck      : 000000000ec82e00
Cpu10ClkTck     : 000000000ec82e00
Cpu11ClkTck     : 000000000ec82e00
Cpu14ClkTck     : 000000000ec82e00
Cpu15ClkTck     : 000000000ec82e00
MMU Type        : Spitfire
State:
CPU6:           online
CPU7:           online
CPU10:          online
CPU11:          online
CPU14:          online
CPU15:          online
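
The listing above shows possible CPU ids 6, 7, 10, 11, 14 and 15, with no
id 0. As a minimal userspace sketch (not kernel code) of why a per-CPU
table probed only at index 0 looks empty on such a machine, consider the
following. All names here (toy_chunk, toy_populate, toy_occupied_on and
so on) are hypothetical and exist only for illustration; only the CPU ids
come from the listing.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 16

/* Possible CPU ids copied from the cpuinfo listing; note there is no 0. */
static const int toy_possible_cpus[] = { 6, 7, 10, 11, 14, 15 };
#define TOY_NR_POSSIBLE \
	(sizeof(toy_possible_cpus) / sizeof(toy_possible_cpus[0]))

/* Hypothetical stand-in for a per-CPU chunk: one page slot per CPU id. */
struct toy_chunk {
	const void *page[TOY_NR_CPUS];
};

/* Return true if `cpu` is one of the possible CPU ids. */
static bool toy_cpu_possible(int cpu)
{
	for (size_t i = 0; i < TOY_NR_POSSIBLE; i++)
		if (toy_possible_cpus[i] == cpu)
			return true;
	return false;
}

/* Populate the slot of every possible CPU; impossible ids stay NULL. */
static void toy_populate(struct toy_chunk *c, const void *backing)
{
	for (size_t i = 0; i < TOY_NR_POSSIBLE; i++)
		c->page[toy_possible_cpus[i]] = backing;
}

/* Occupancy check hard-wired to CPU 0: wrong when CPU 0 does not exist. */
static bool toy_occupied_cpu0(const struct toy_chunk *c)
{
	return c->page[0] != NULL;
}

/* Occupancy check against a caller-supplied possible CPU id. */
static bool toy_occupied_on(const struct toy_chunk *c, int cpu)
{
	return c->page[cpu] != NULL;
}
```

After toy_populate(), toy_occupied_cpu0() still reports the chunk as
empty, while toy_occupied_on() with any possible id (6, for example)
reports it as populated.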


-- 
Meelis Roos (mroos@linux.ee)


* Re: boot panic after oops from sysfs_new_dirent
  2009-09-01  7:33         ` Meelis Roos
@ 2009-09-01  7:58           ` Tejun Heo
  2009-09-01 11:48             ` Meelis Roos
  0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2009-09-01  7:58 UTC (permalink / raw)
  To: Meelis Roos; +Cc: David Miller, sparclinux, Linux Kernel list

Meelis Roos wrote:
>> Ummm.... one more question.  Is cpu0 missing?  Can you please post the
>> output of "cat /proc/cpuinfo"?
> 
> Yes, there's no CPU numbered 0; the numbers come from the hardware slots
> on the backplane.

Aha...  Does the following patch fix the problem?

diff --git a/mm/percpu.c b/mm/percpu.c
index 5fe3784..3311c89 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -197,7 +197,12 @@ static unsigned long pcpu_chunk_addr(struct pcpu_chunk *chunk,
 static bool pcpu_chunk_page_occupied(struct pcpu_chunk *chunk,
 				     int page_idx)
 {
-	return *pcpu_chunk_pagep(chunk, 0, page_idx) != NULL;
+	/*
+	 * Any possible cpu id can be used here, so there's no need to
+	 * worry about preemption or cpu hotplug.
+	 */
+	return *pcpu_chunk_pagep(chunk, raw_smp_processor_id(),
+				 page_idx) != NULL;
 }

 /* set the pointer to a chunk in a page struct */
@@ -297,6 +302,14 @@ static struct pcpu_chunk *pcpu_chunk_addr_search(void *addr)
 		return pcpu_first_chunk;
 	}

+	/*
+	 * The address is relative to unit0 which might be unused and
+	 * thus unmapped.  Offset the address to the unit space of the
+	 * current processor before looking it up in the vmalloc
+	 * space.  Note that any possible cpu id can be used here, so
+	 * there's no need to worry about preemption or cpu hotplug.
+	 */
+	addr += raw_smp_processor_id() * pcpu_unit_size;
 	return pcpu_get_page_chunk(vmalloc_to_page(addr));
 }

-- 
tejun


* Re: boot panic after oops from sysfs_new_dirent
  2009-09-01  7:58           ` Tejun Heo
@ 2009-09-01 11:48             ` Meelis Roos
  2009-09-01 12:30               ` [PATCH] percpu: don't assume existence of cpu0 Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: Meelis Roos @ 2009-09-01 11:48 UTC (permalink / raw)
  To: Tejun Heo; +Cc: David Miller, sparclinux, Linux Kernel list

> Aha...  Does the following patch fix the problem?

Yes, yesterday's 2.6.31-rc8-git plus this patch seems to work fine: no
warnings/oopses/panics. It is now happily churning through a Debian
unstable upgrade.

-- 
Meelis Roos (mroos@linux.ee)


* [PATCH] percpu: don't assume existence of cpu0
  2009-09-01 11:48             ` Meelis Roos
@ 2009-09-01 12:30               ` Tejun Heo
  2009-09-01 22:52                 ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2009-09-01 12:30 UTC (permalink / raw)
  To: Meelis Roos; +Cc: David Miller, sparclinux, Linux Kernel list

percpu incorrectly assumed that cpu0 was always there, which led to the
following warning and eventual oops on sparc machines w/o cpu0.

  WARNING: at mm/percpu.c:651 pcpu_map+0xdc/0x100()
  Modules linked in:
  Call Trace:
    [000000000045eb70] warn_slowpath_common+0x50/0xa0
    [000000000045ebdc] warn_slowpath_null+0x1c/0x40
    [00000000004d493c] pcpu_map+0xdc/0x100
    [00000000004d59a4] pcpu_alloc+0x3e4/0x4e0
    [00000000004d5af8] __alloc_percpu+0x18/0x40
    [00000000005b112c] __percpu_counter_init+0x4c/0xc0
  ...
  Unable to handle kernel NULL pointer dereference
  ...
   I7: <sysfs_new_dirent+0x30/0x120>
   Disabling lock debugging due to kernel taint
   Caller[000000000053c1b0]: sysfs_new_dirent+0x30/0x120
   Caller[000000000053c7a4]: create_dir+0x24/0xc0
   Caller[000000000053c870]: sysfs_create_dir+0x30/0x80
   Caller[00000000005990e8]: kobject_add_internal+0xc8/0x200
  ...
   Kernel panic - not syncing: Attempted to kill the idle task!

This patch fixes the problem by backporting parts from the devel branch
to make the percpu core not depend on the existence of cpu0.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Meelis Roos <mroos@linux.ee>
Cc: David Miller <davem@davemloft.net>
---
Meelis Roos wrote:
>> Aha...  Does the following patch fix the problem?
> 
> Yes, yesterday's 2.6.31-rc8-git plus this patch seems to work fine: no
> warnings/oopses/panics. It is now happily churning through a Debian
> unstable upgrade.

Good to hear.  Will forward to Linus right away.  Thanks.

 mm/percpu.c |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 5fe3784..3311c89 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -197,7 +197,12 @@ static unsigned long pcpu_chunk_addr(struct pcpu_chunk *chunk,
 static bool pcpu_chunk_page_occupied(struct pcpu_chunk *chunk,
 				     int page_idx)
 {
-	return *pcpu_chunk_pagep(chunk, 0, page_idx) != NULL;
+	/*
+	 * Any possible cpu id can be used here, so there's no need to
+	 * worry about preemption or cpu hotplug.
+	 */
+	return *pcpu_chunk_pagep(chunk, raw_smp_processor_id(),
+				 page_idx) != NULL;
 }

 /* set the pointer to a chunk in a page struct */
@@ -297,6 +302,14 @@ static struct pcpu_chunk *pcpu_chunk_addr_search(void *addr)
 		return pcpu_first_chunk;
 	}

+	/*
+	 * The address is relative to unit0 which might be unused and
+	 * thus unmapped.  Offset the address to the unit space of the
+	 * current processor before looking it up in the vmalloc
+	 * space.  Note that any possible cpu id can be used here, so
+	 * there's no need to worry about preemption or cpu hotplug.
+	 */
+	addr += raw_smp_processor_id() * pcpu_unit_size;
 	return pcpu_get_page_chunk(vmalloc_to_page(addr));
 }

-- 
1.6.0.2
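
The second hunk of the patch shifts a unit0-relative address into the
unit of an existing CPU before the lookup. The following is a rough
userspace sketch of that idea only, assuming a flat area split into
fixed-size units of which just some are mapped. The names toy_area,
toy_lookup and toy_lookup_via_cpu, and the 4096-byte unit size, are
assumptions made for illustration and are not taken from mm/percpu.c.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_UNITS  16
#define TOY_UNIT_SIZE 4096u	/* assumed unit size, for illustration only */

/* Hypothetical area: unit N starts at base + N * TOY_UNIT_SIZE. */
struct toy_area {
	uintptr_t base;				/* address of unit 0 */
	unsigned char mapped[TOY_NR_UNITS];	/* 1 if the unit has backing */
};

/*
 * Stand-in for a page lookup: return the unit index that backs `addr`,
 * or -1 if the address falls outside the area or in an unmapped unit.
 */
static int toy_lookup(const struct toy_area *a, uintptr_t addr)
{
	size_t unit;

	if (addr < a->base)
		return -1;
	unit = (addr - a->base) / TOY_UNIT_SIZE;
	if (unit >= TOY_NR_UNITS || !a->mapped[unit])
		return -1;
	return (int)unit;
}

/*
 * Shift a unit0-relative address into the unit of `cpu` before looking
 * it up, so that the lookup lands in a unit that is actually mapped.
 */
static int toy_lookup_via_cpu(const struct toy_area *a, uintptr_t addr,
			      unsigned int cpu)
{
	return toy_lookup(a, addr + (uintptr_t)cpu * TOY_UNIT_SIZE);
}
```

With only unit 6 mapped, toy_lookup() on an address inside unit 0 fails,
while toy_lookup_via_cpu() with cpu 6 finds the mapped unit.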



* Re: [PATCH] percpu: don't assume existence of cpu0
  2009-09-01 12:30               ` [PATCH] percpu: don't assume existence of cpu0 Tejun Heo
@ 2009-09-01 22:52                 ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2009-09-01 22:52 UTC (permalink / raw)
  To: tj; +Cc: mroos, sparclinux, linux-kernel

From: Tejun Heo <tj@kernel.org>
Date: Tue, 01 Sep 2009 21:30:40 +0900

> Meelis Roos wrote:
>>> Aha...  Does the following patch fix the problem?
>> 
>> Yes, yesterday's 2.6.31-rc8-git plus this patch seems to work fine: no
>> warnings/oopses/panics. It is now happily churning through a Debian
>> unstable upgrade.
> 
> Good to hear.  Will forward to Linus right away.  Thanks.

Thanks a lot for fixing this so quickly Tejun.

