* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
[not found] <200603010120.k211KqVP009559@shell0.pdx.osdl.net>
@ 2006-03-01 2:18 ` Paul Jackson
2006-03-01 2:36 ` Andrew Morton
0 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 2:18 UTC (permalink / raw)
To: akpm, ebiederm; +Cc: linux-kernel
Andrew - the following should address your concerns in patch:
proc-dont-lock-task_structs-indefinitely-cpuset-fix.patch
where you had to include "../fs/proc/internal.h" in kernel/cpuset.c
Eric wrote (off list in a patch to Andrew, apparently):
> I just refactored fs/proc/base.c to use task_refs to ensure there are not
> long user triggerable hold times of task_struct. It looks like I missed
> cpuset.c. Oops.
>
> This patch updates proc_cpuset_show to handle the task dying between when
> the file is opened and when data is read out.
Thanks for catching this, Eric.
I was just about to send a patch that moved the cpuset_open(),
cpuset_release() and proc_cpuset_operations{} code from
kernel/cpuset.c to fs/proc/base.c, leaving behind a now publicly
exported proc_cpuset_show() routine that handles the cpuset
specific details.
For lurkers, this is the code that prints a task's cpuset path
in the file /proc/<pid>/cpuset. That code had some proc file
specific details buried in its kernel/cpuset.c implementation,
and Eric is changing those proc details. Proc stuff should go
in fs/proc and cpuset stuff in kernel/cpuset.c.
I will remerge with your fixes to handle possibly null task_refs
correctly and try again to send my above patch.
However, I have some debugging to do on this kernel first.
It blows up on boot (ia64 sn2_defconfig).
I haven't started to analyze it any yet. I don't know if it's a bug
or pilot error yet.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 2:18 ` + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree Paul Jackson
@ 2006-03-01 2:36 ` Andrew Morton
2006-03-01 3:45 ` Paul Jackson
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2006-03-01 2:36 UTC (permalink / raw)
To: Paul Jackson; +Cc: ebiederm, linux-kernel
Paul Jackson <pj@sgi.com> wrote:
>
> However, I have some debugging to do on this kernel first.
>
> It blows up on boot (ia64 sn2_defconfig).
>
> I haven't started to analyze it any yet. I don't know if it's a bug
> or pilot error yet.
-rc5-mm1 appears to be a trainwreck. It's a bit of a mystery - I've tried
several further configs and it all works swimmingly.
Hopefully some of the people who are hitting this will be able to tell us
whether http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm1.1.gz
or http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm1.2.gz work
OK, so I know how much to drop..
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 2:36 ` Andrew Morton
@ 2006-03-01 3:45 ` Paul Jackson
2006-03-01 4:10 ` Paul Jackson
2006-03-01 4:31 ` Eric W. Biederman
0 siblings, 2 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 3:45 UTC (permalink / raw)
To: Andrew Morton; +Cc: ebiederm, linux-kernel
> -rc5-mm1 appears to be a trainwreck. It's a bit of a mystery - I've tried
> several further configs and it all works swimmingly.
Getting closer.
Without the patches:
proc-dont-lock-task_structs-indefinitely.patch
proc-dont-lock-task_structs-indefinitely-git-nfs-fix.patch
proc-dont-lock-task_structs-indefinitely-cpuset-fix.patch
it boots and works (except for the /proc/*/fd/* permission
complaints I made earlier to Eric). And I was able to run my SGI
specific application that generates the 50 such permission complaints.
With these patches, it still boots, and looks fine ... until
I fire up my SGI specific application, and then it dies.
Once it died with some complaint (lost now) from a swap
daemon. This latest time, it died with just:
Kernel panic - not syncing: Attempted to kill init!
So I think the above 3 patches make it easy for user space
to kill the kernel.
The SGI app is some rather largish tool used for system
monitoring and maintenance - I will have to stare at it
to reduce out any useful explanation of what it is doing
that is so painful here.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 3:45 ` Paul Jackson
@ 2006-03-01 4:10 ` Paul Jackson
2006-03-01 5:05 ` Eric W. Biederman
2006-03-01 4:31 ` Eric W. Biederman
1 sibling, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 4:10 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, ebiederm, linux-kernel
With these three patches:
proc-dont-lock-task_structs-indefinitely.patch
proc-dont-lock-task_structs-indefinitely-git-nfs-fix.patch
proc-dont-lock-task_structs-indefinitely-cpuset-fix.patch
the command:
/bin/fuser -n tcp 5553
kills my kernel very quickly. This latest time it died with
the swap command failure I mentioned before. And this command
shows the permission problem (2) that I reported to Eric.
The full output, from command entry to death, is as
follows. The second '#', near the end, is my shell
prompt returning to me, microseconds before death.
# /bin/fuser -n tcp 5553
Cannot stat file /proc/1675/fd/3: Permission denied
Cannot stat file /proc/1675/fd/4: Permission denied
Cannot stat file /proc/1675/fd/5: Permission denied
Cannot stat file /proc/1675/fd/6: Permission denied
Cannot stat file /proc/1675/fd/7: Permission denied
Cannot stat file /proc/2852/fd/6: Permission denied
Cannot stat file /proc/2852/fd/7: Permission denied
Cannot stat file /proc/2853/fd/6: Permission denied
Cannot stat file /proc/2853/fd/7: Permission denied
Cannot stat file /proc/2854/fd/6: Permission denied
Cannot stat file /proc/2854/fd/7: Permission denied
Cannot stat file /proc/2855/fd/6: Permission denied
Cannot stat file /proc/2855/fd/7: Permission denied
Cannot stat file /proc/2866/fd/0: Permission denied
Cannot stat file /proc/2866/fd/1: Permission denied
Cannot stat file /proc/2867/fd/0: Permission denied
Cannot stat file /proc/2867/fd/1: Permission denied
Cannot stat file /proc/2868/fd/0: Permission denied
Cannot stat file /proc/2868/fd/1: Permission denied
Cannot stat file /proc/2869/fd/0: Permission denied
Cannot stat file /proc/2869/fd/1: Permission denied
Cannot stat file /proc/2897/fd/3: Permission denied
Cannot stat file /proc/2902/fd/4: Permission denied
Cannot stat file /proc/2911/fd/3: Permission denied
Cannot stat file /proc/2914/fd/1: Permission denied
Cannot stat file /proc/2921/fd/3: Permission denied
Cannot stat file /proc/2921/fd/5: Permission denied
Cannot stat file /proc/2921/fd/6: Permission denied
Cannot stat file /proc/3512/fd/3: Permission denied
Cannot stat file /proc/3512/fd/4: Permission denied
Cannot stat file /proc/3512/fd/6: Permission denied
Cannot stat file /proc/3512/fd/7: Permission denied
Cannot stat file /proc/3537/fd/3: Permission denied
Cannot stat file /proc/3537/fd/4: Permission denied
Cannot stat file /proc/3645/fd/3: Permission denied
Cannot stat file /proc/3676/fd/4: Permission denied
Cannot stat file /proc/3676/fd/5: Permission denied
Cannot stat file /proc/3676/fd/7: Permission denied
Cannot stat file /proc/3680/fd/0: Permission denied
Cannot stat file /proc/3680/fd/2: Permission denied
Cannot stat file /proc/3680/fd/3: Permission denied
Cannot stat file /proc/3680/fd/4: Permission denied
Cannot stat file /proc/3700/fd/3: Permission denied
Cannot stat file /proc/3735/fd/1: Permission denied
Cannot stat file /proc/3735/fd/2: Permission denied
Cannot stat file /proc/3735/fd/4: Permission denied
Cannot stat file /proc/3735/fd/5: Permission denied
Cannot stat file /proc/3735/fd/7: Permission denied
Cannot stat file /proc/3946/fd/3: Permission denied
Cannot stat file /proc/3946/fd/4: Permission denied
Cannot stat file /proc/3946/fd/5: Permission denied
Cannot stat file /proc/3946/fd/6: Permission denied
Cannot stat file /proc/3948/fd/3: Permission denied
Cannot stat file /proc/3948/fd/4: Permission denied
Cannot stat file /proc/3948/fd/5: Permission denied
Cannot stat file /proc/3948/fd/7: Permission denied
Cannot stat file /proc/3948/fd/8: Permission denied
Cannot stat file /proc/3948/fd/9: Permission denied
Cannot stat file /proc/3948/fd/10: Permission denied
Cannot stat file /proc/3948/fd/11: Permission denied
Cannot stat file /proc/3948/fd/12: Permission denied
Cannot stat file /proc/3948/fd/13: Permission denied
Cannot stat file /proc/3975/fd/0: Permission denied
Cannot stat file /proc/3984/fd/9: Permission denied
Cannot stat file /proc/3984/fd/10: Permission denied
Cannot stat file /proc/4020/fd/4: Permission denied
# swapper[0]: bugcheck! 0 [1]
Modules linked in:
Pid: 0, CPU 3, comm: swapper
psr : 0000101008026038 ifs : 8000000000000288 ip : [<a000000100106a10>] Not tainted
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 3:45 ` Paul Jackson
2006-03-01 4:10 ` Paul Jackson
@ 2006-03-01 4:31 ` Eric W. Biederman
2006-03-01 4:58 ` Paul Jackson
1 sibling, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2006-03-01 4:31 UTC (permalink / raw)
To: Paul Jackson; +Cc: Andrew Morton, linux-kernel
Paul Jackson <pj@sgi.com> writes:
>> -rc5-mm1 appears to be a trainwreck. It's a bit of a mystery - I've tried
>> several further configs and it all works swimmingly.
>
> Getting closer.
>
> Without the patches:
>
> proc-dont-lock-task_structs-indefinitely.patch
> proc-dont-lock-task_structs-indefinitely-git-nfs-fix.patch
> proc-dont-lock-task_structs-indefinitely-cpuset-fix.patch
That definitely makes sense if there is a reference counting bug
somewhere.
What is also possible but scary is that I don't have a reference
counting bug and something else is wrong with process management,
and by holding a much lighter grasp on the tasks in /proc
I have managed to make the bug much easier to trigger.
Hmm. I think I can see at least one reference counting bug..
Unfortunately it is in the wrong direction.
> With these patches, it still boots, and looks fine ... until
> I fire up my SGI specific application, and then it dies.
> Once it died with some complaint (lost now) from a swap
> daemon. This latest time, it died with just:
>
> Kernel panic - not syncing: Attempted to kill init!
Ouch.
> So I think the above 3 patches make it easy for user space
> to kill the kernel.
The intent was the opposite... But until the bugs get out...
Eric
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 4:31 ` Eric W. Biederman
@ 2006-03-01 4:58 ` Paul Jackson
0 siblings, 0 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 4:58 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: akpm, linux-kernel
I turned on the following debug:
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SPINLOCK=y
> CONFIG_DEBUG_SPINLOCK_SLEEP=y
and now *with* or *without* the following three patches,
it dies during system boot.
proc-dont-lock-task_structs-indefinitely.patch
proc-dont-lock-task_structs-indefinitely-git-nfs-fix.patch
proc-dont-lock-task_structs-indefinitely-cpuset-fix.patch
I will start popping patches until one boots with these
DEBUG options.
My current config has the following options set:
CONFIG_PREEMPT=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
CONFIG_DEBUG_MUTEXES=y
The entire failing boot output is:
Uncompressing Linux... donei(3|0)/Scsi(Pun2,Lun0)/HD(Part8,Sig461D0E2E-AD03-4C5ELoading file initrd...done
Linux version 2.6.16-rc5-mm1 (pj@jackhammer) (gcc version 3.3.3 (SuSE Linux)) #8 SMP PREEMPT Tue Feb 28 20:19:57 PST 2006
EFI v1.10 by INTEL: SALsystab=0x230027c9070 ACPI 2.0=0x230027c9840
Number of logical nodes in system = 2
Number of memory chunks in system = 2
Initial ramdisk at: 0xe00002bc39fa9000 (4386320 bytes)
SAL 2.9: SGI SN2 version 4.32
SAL Platform features: ITC_Drift
SAL: AP wakeup using external interrupt vector 0x12
No logical to physical processor mapping available
ACPI: Local APIC address c0000000fee00000
ACPI: Error parsing MADT - no IOSAPIC entries
register_intr: No IOSAPIC for GSI 52
4 CPUs available, 4 CPUs total
Increasing MCA rendezvous timeout from 20000 to 49000 milliseconds
MCA related initialization done
SGI SAL version 4.32
Virtual mem_map starts at 0xa0007ffd43c40000
Built 2 zonelists
Kernel command line: BOOT_IMAGE=scsi2:\efi\SuSE\vmlinuz.pj5 root=/dev/sdb6 selinux=0 console=ttySG0 splash=silent thash_entries=2097152 ro
PID hash table entries: 4096 (order: 12, 131072 bytes)
Console: colour dummy device 80x25
Memory: 7567248k/7730080k available (6903k code, 180272k reserved, 4023k data, 384k init)
McKinley Errata 9 workaround not needed; disabling it
Dentry cache hash table entries: 1048576 (order: 9, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 8, 4194304 bytes)
Mount-cache hash table entries: 1024
Boot processor id 0x0/0x8
Brought up 4 CPUs
Total of 4 processors activated (7782.40 BogoMIPS).
migration_cost=7018,36910
checking if image is initramfs... it is
Freeing initrd memory: 4256kB freed
DMI not present or invalid.
NET: Registered protocol family 16
ACPI: bus type pci registered
Altix IO Topology Information
*****************************
Serial Number:N0000015
PCI SEGMENT PCIBUS NUMBER BRICK RACK:SLOT BUS CONNECTION TOPOLOGY
----------- ------------- --------------------- -------------------
0x0001 0x01 IXbrick 001:27 01 001c24:slab0:widget12:bus0
0x0002 0x01 IXbrick 001:27 02 001c24:slab0:widget12:bus1
0x0003 0x01 IXbrick 001:27 03 001c24:slab0:widget15:bus0
0x0004 0x01 IXbrick 001:27 04 001c24:slab0:widget15:bus1
0x0005 0x01 IXbrick 001:27 05 001c24:slab0:widget13:bus0
0x0006 0x01 IXbrick 001:27 06 001c24:slab0:widget13:bus1
PROM version < 4.50 -- implementing old PROM flush WAR
ACPI: Subsystem revision 20060210
ACPI: SCI (ACPI GSI 52) not registered
ACPI: Interpreter enabled
ACPI: Using IOSAPIC for interrupt routing
SCSI subsystem initialized
perfmon: version 2.0 IRQ 238
perfmon: Itanium 2 PMU detected, 16 PMCs, 18 PMDs, 4 counters (47 bits)
PAL Information Facility v0.5
perfmon: added sampling format default_format
perfmon_default_smpl: default_format v2.0 registered
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 2048 (order 0, 16384 bytes)
SGI XFS with ACLs, realtime, large block/inode numbers, no debug enabled
SGI XFS Quota Management subsystem
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
SGI Altix RTC Timer: v2.1, 20 MHz
EFI Time Services Driver v0.4
Linux agpgart interface v0.101 (c) Dave Jones
sn_console: Console driver init
ttySG0 at I/O 0x0 (irq = 0) is a SGI SN L1
Unable to handle kernel NULL pointer dereference (address 0000000000000058)
swapper[1]: Oops 8813272891392 [1]
Modules linked in:
Pid: 1, CPU 0, comm: swapper
psr : 0000101008026018 ifs : 800000000000040b ip : [<a0000001001f1950>] Not tainted
ip is at sysfs_create_group+0x30/0x2a0
unat: 0000000000000000 pfs : 0000000000000308 rsc : 0000000000000003
rnat: 0000000002000027 bsps: 0000000000000002 pr : 0000000000005649
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a00000010081ad30 b6 : e000023002310080 b7 : a00000010081ad80
f6 : 1003e0000000000000000 f7 : 1003e20c49ba5e353f7cf
f8 : 1003e0000000000002ff0 f9 : 1003e0000000000000068
f10 : 1003e0000000000000000 f11 : 1003e0000000000000000
r1 : a000000100c93a70 r2 : 0000000000000058 r3 : a000000100aa3520
r8 : 0000000000000000 r9 : a000000100cba720 r10 : ffffffffffffffff
r11 : 0000000000000400 r12 : e00002343bdb7d50 r13 : e00002343bdb0000
r14 : a000000100aa4378 r15 : a000000100cba720 r16 : a000000100aa3528
r17 : 00000000000003c0 r18 : 0000000000000001 r19 : 0000000000000002
r20 : ffffffffffffffff r21 : 000000000000000e r22 : 0000000000000000
r23 : a000000100a94e80 r24 : a000000100824c80 r25 : a000000100aa3ad8
r26 : 0000000000004000 r27 : a000000100913e80 r28 : e00002300c6ff918
r29 : 0000000000000001 r30 : a0000001007d4ba8 r31 : a00000010081ad80
Call Trace:
[<a000000100013280>] show_stack+0x40/0xa0
sp=e00002343bdb78e0 bsp=e00002343bdb1298
[<a000000100013ab0>] show_regs+0x7d0/0x800
sp=e00002343bdb7ab0 bsp=e00002343bdb1248
[<a000000100036970>] die+0x210/0x320
sp=e00002343bdb7ab0 bsp=e00002343bdb1200
[<a00000010005a800>] ia64_do_page_fault+0x900/0xa80
sp=e00002343bdb7ad0 bsp=e00002343bdb1198
[<a00000010000bbc0>] ia64_leave_kernel+0x0/0x290
sp=e00002343bdb7b80 bsp=e00002343bdb1198
[<a0000001001f1950>] sysfs_create_group+0x30/0x2a0
sp=e00002343bdb7d50 bsp=e00002343bdb1140
[<a00000010081ad30>] topology_cpu_callback+0x70/0xc0
sp=e00002343bdb7d60 bsp=e00002343bdb1110
[<a00000010081ae00>] topology_sysfs_init+0x80/0x120
sp=e00002343bdb7d60 bsp=e00002343bdb10f0
[<a000000100009860>] init+0x580/0x8e0
sp=e00002343bdb7d60 bsp=e00002343bdb10c8
[<a000000100011740>] kernel_thread_helper+0xe0/0x100
sp=e00002343bdb7e30 bsp=e00002343bdb10a0
[<a000000100009140>] start_kernel_thread+0x20/0x40
sp=e00002343bdb7e30 bsp=e00002343bdb10a0
<0>Kernel panic - not syncing: Attempted to kill init!
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 4:10 ` Paul Jackson
@ 2006-03-01 5:05 ` Eric W. Biederman
2006-03-01 5:25 ` Paul Jackson
0 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2006-03-01 5:05 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, ebiederm, linux-kernel
Paul Jackson <pj@sgi.com> writes:
> With these three patches:
> proc-dont-lock-task_structs-indefinitely.patch
> proc-dont-lock-task_structs-indefinitely-git-nfs-fix.patch
> proc-dont-lock-task_structs-indefinitely-cpuset-fix.patch
>
> the command:
>
> /bin/fuser -n tcp 5553
I can kill a kernel this way as well. Thanks, this looks like
a good reproducer. I will see if I can figure out why.
Eric
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 5:05 ` Eric W. Biederman
@ 2006-03-01 5:25 ` Paul Jackson
2006-03-01 6:11 ` Eric W. Biederman
` (3 more replies)
0 siblings, 4 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 5:25 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: akpm, ebiederm, linux-kernel
Eric wrote:
> I can kill a kernel this way as well. Thanks this looks like
> a good reproducer I will see if I can figure out why.
I suspect two problems, one with your patches that the fuser provokes,
and a separate bug earlier in *-mm that the DEBUG options noted below
provoke.
Details:
In addition to the problem that shows up with the three patches
> proc-dont-lock-task_structs-indefinitely.patch
> proc-dont-lock-task_structs-indefinitely-git-nfs-fix.patch
> proc-dont-lock-task_structs-indefinitely-cpuset-fix.patch
when running the fuser command:
> /bin/fuser -n tcp 5553
I am seeing as a separate bug the crash during boot that I reported
last, when I turned on some DEBUG options. That crash occurs even with
none of your proc patches.
That is, to be specific, with this patch at the top of my applied stack:
rtc-subsystem-rs5c372-driver.patch
and these patches at the front of my unapplied queue:
trivial-cleanup-to-proc_check_chroot.patch
proc-fix-the-inode-number-on-proc-pid-fd.patch
and the debug options:
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SPINLOCK=y
> CONFIG_DEBUG_SPINLOCK_SLEEP=y
I die during system boot with:
==============================
...
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
SGI Altix RTC Timer: v2.1, 20 MHz
EFI Time Services Driver v0.4
Linux agpgart interface v0.101 (c) Dave Jones
sn_console: Console driver init
ttySG0 at I/O 0x0 (irq = 0) is a SGI SN L1
Unable to handle kernel NULL pointer dereference (address 0000000000000058)
swapper[1]: Oops 8813272891392 [1]
Modules linked in:
Pid: 1, CPU 0, comm: swapper
psr : 0000101008026018 ifs : 800000000000040b ip : [<a0000001001f0a50>] Not tainted
ip is at sysfs_create_group+0x30/0x2a0
unat: 0000000000000000 pfs : 0000000000000308 rsc : 0000000000000003
rnat: 0000000002000027 bsps: 0000000000000002 pr : 0000000000005649
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a00000010081ad30 b6 : e000023002310080 b7 : a00000010081ad80
f6 : 1003e0000000000000000 f7 : 1003e20c49ba5e353f7cf
f8 : 1003e0000000000002ff0 f9 : 1003e0000000000000068
f10 : 1003e0000000000000000 f11 : 1003e0000000000000000
r1 : a000000100c93ab0 r2 : 0000000000000058 r3 : a000000100aa3560
r8 : 0000000000000000 r9 : a000000100cba7a0 r10 : ffffffffffffffff
r11 : 0000000000000400 r12 : e00002343bdb7d50 r13 : e00002343bdb0000
r14 : a000000100aa8300 r15 : a000000100cba7a0 r16 : a000000100aa3568
r17 : 00000000000003c0 r18 : 0000000000000001 r19 : 0000000000000002
r20 : ffffffffffffffff r21 : 000000000000000e r22 : 0000000000000000
r23 : a000000100a94eb8 r24 : a000000100824c80 r25 : a000000100aa3b18
r26 : 0000000000004000 r27 : a000000100913e80 r28 : e00002343b2d3918
r29 : 0000000000000001 r30 : a0000001007d9228 r31 : a00000010081ad80
Call Trace:
[<a000000100013280>] show_stack+0x40/0xa0
sp=e00002343bdb78e0 bsp=e00002343bdb1298
[<a000000100013ab0>] show_regs+0x7d0/0x800
sp=e00002343bdb7ab0 bsp=e00002343bdb1248
[<a000000100036970>] die+0x210/0x320
sp=e00002343bdb7ab0 bsp=e00002343bdb1200
[<a00000010005a800>] ia64_do_page_fault+0x900/0xa80
sp=e00002343bdb7ad0 bsp=e00002343bdb1198
[<a00000010000bbc0>] ia64_leave_kernel+0x0/0x290
sp=e00002343bdb7b80 bsp=e00002343bdb1198
[<a0000001001f0a50>] sysfs_create_group+0x30/0x2a0
sp=e00002343bdb7d50 bsp=e00002343bdb1140
[<a00000010081ad30>] topology_cpu_callback+0x70/0xc0
sp=e00002343bdb7d60 bsp=e00002343bdb1110
[<a00000010081ae00>] topology_sysfs_init+0x80/0x120
sp=e00002343bdb7d60 bsp=e00002343bdb10f0
[<a000000100009860>] init+0x580/0x8e0
sp=e00002343bdb7d60 bsp=e00002343bdb10c8
[<a000000100011740>] kernel_thread_helper+0xe0/0x100
sp=e00002343bdb7e30 bsp=e00002343bdb10a0
[<a000000100009140>] start_kernel_thread+0x20/0x40
sp=e00002343bdb7e30 bsp=e00002343bdb10a0
<0>Kernel panic - not syncing: Attempted to kill init!
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 5:25 ` Paul Jackson
@ 2006-03-01 6:11 ` Eric W. Biederman
2006-03-01 6:15 ` Eric W. Biederman
2006-03-01 7:20 ` [PATCH] proc: Reference couting fix Eric W. Biederman
` (2 subsequent siblings)
3 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2006-03-01 6:11 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, linux-kernel
Paul Jackson <pj@sgi.com> writes:
> Eric wrote:
>> I can kill a kernel this way as well. Thanks this looks like
>> a good reproducer I will see if I can figure out why.
>
> I suspect two problems, one with your patches that the fuser provokes,
> and a separate bug earlier in *-mm that the DEBUG options noted below
> provoke.
>
> Details:
>
> In addition to the problem that shows up with the three patches
>> proc-dont-lock-task_structs-indefinitely.patch
>> proc-dont-lock-task_structs-indefinitely-git-nfs-fix.patch
>> proc-dont-lock-task_structs-indefinitely-cpuset-fix.patch
>
> when running the fuser command:
>> /bin/fuser -n tcp 5553
>
> I am seeing as a separate bug the crash during boot that I reported
> last, when I turned on some DEBUG options. That crash occurs even with
> none of your proc patches.
Ok. I think I have found the big bug in my task_ref patches.
I had missed that __unhash_process got moved outside of the
tasklist_lock. Which messed up my serialization with detach_pid.
My gut feel says modifying doubly linked lists without a lock isn't
even safe. Which is probably why I never considered the possibility.
Now to think through what this means in terms of locking.
Eric
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 6:11 ` Eric W. Biederman
@ 2006-03-01 6:15 ` Eric W. Biederman
0 siblings, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2006-03-01 6:15 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, linux-kernel
ebiederm@xmission.com (Eric W. Biederman) writes:
> Ok. I think I have found the big bug in my task_ref patches.
>
Nope. __unhash_process was just moved; it is still under
the tasklist_lock.
Eric
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH] proc: Reference couting fix.
2006-03-01 5:25 ` Paul Jackson
2006-03-01 6:11 ` Eric W. Biederman
@ 2006-03-01 7:20 ` Eric W. Biederman
2006-03-01 7:26 ` [PATCH] proc: task_mmu bug fix Eric W. Biederman
2006-03-01 7:48 ` + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree Paul Jackson
3 siblings, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2006-03-01 7:20 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, Paul Jackson
Fix reference counts in seccomp_write, and mem_read.
While looking for the bug I found two other places I goofed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
fs/proc/base.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
0774b9b05aa41a25d72f31498fc2967bfe8e60b7
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 8d73c6a..6a26847 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -671,6 +671,9 @@ static ssize_t mem_read(struct file * fi
int ret = -ESRCH;
struct mm_struct *mm;
+ if (!task)
+ goto out_no_task;
+
if (!MAY_PTRACE(task) || !ptrace_may_attach(task))
goto out;
@@ -720,6 +723,8 @@ out_put:
out_free:
free_page((unsigned long) page);
out:
+ put_task_struct(task);
+out_no_task:
return ret;
}
@@ -965,10 +970,12 @@ static ssize_t seccomp_write(struct file
if (unlikely(tsk->seccomp.mode))
goto out;
+ result = -EFAULT;
memset(__buf, 0, sizeof(__buf));
count = min(count, sizeof(__buf) - 1);
if (copy_from_user(__buf, buf, count))
- return -EFAULT;
+ goto out;
+
seccomp_mode = simple_strtoul(__buf, &end, 0);
if (*end == '\n')
end++;
--
1.2.2.g709a-dirty
^ permalink raw reply related [flat|nested] 43+ messages in thread
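The mem_read() hunk above follows the usual goto-cleanup shape: skip the
put when no reference was taken, and drop exactly one reference on every
other exit. A minimal user-space sketch of that shape follows. Everything
named toy_* is hypothetical and only models a refcount; it is not the
kernel's task_struct or task_ref API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted task; not the kernel's task_struct. */
struct toy_task {
    int refs;        /* outstanding references */
    int may_access;  /* stand-in for the ptrace permission checks */
};

/* Hypothetical lookup: returns NULL if the task is already gone,
 * otherwise takes a reference that the caller must drop. */
static struct toy_task *toy_get_task(struct toy_task *t)
{
    if (t == NULL)
        return NULL;
    t->refs++;
    return t;
}

static void toy_put_task(struct toy_task *t)
{
    assert(t->refs > 0);
    t->refs--;
}

/* Same exit structure as the fixed mem_read(): nothing is dropped when
 * nothing was taken, and every later exit drops exactly one reference. */
static int toy_read(struct toy_task *maybe_task)
{
    struct toy_task *task = toy_get_task(maybe_task);
    int ret = -ESRCH;

    if (!task)
        goto out_no_task;

    if (!task->may_access)
        goto out;

    ret = 0;  /* the real function would copy data out here */
out:
    toy_put_task(task);
out_no_task:
    return ret;
}
```

Calling toy_read() on a task with may_access set or cleared should leave
refs unchanged afterwards, and toy_read(NULL) should return -ESRCH without
touching any counter.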
* [PATCH] proc: task_mmu bug fix.
2006-03-01 5:25 ` Paul Jackson
2006-03-01 6:11 ` Eric W. Biederman
2006-03-01 7:20 ` [PATCH] proc: Reference couting fix Eric W. Biederman
@ 2006-03-01 7:26 ` Eric W. Biederman
2006-03-01 7:46 ` Andrew Morton
2006-03-01 7:48 ` + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree Paul Jackson
3 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2006-03-01 7:26 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, Paul Jackson
This should fix the big bug that has been crashing kernels when
fuser is called. At least it is the bug I observed here. It seems
you need the right access pattern on /proc/<pid>/maps to trigger this.
seq_operations ->stop is only called once per start making it safe to
call put_task_struct there. However m_next was calling m_stop which
totally messed me up.
Technically the task_struct needs to be held for the duration, so
split m_stop into two functions such that only vma_stop is called
multiple times per start.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
fs/proc/task_mmu.c | 18 ++++++++++++------
1 files changed, 12 insertions(+), 6 deletions(-)
4217fed6dbbf2b5615d8a498b39aad5ee28d3e5f
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4772543..f299538 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -363,17 +363,13 @@ out:
return priv->tail_vma;
}
-static void m_stop(struct seq_file *m, void *v)
+static void vma_stop(struct proc_maps_private *priv, struct vm_area_struct *vma)
{
- struct proc_maps_private *priv = m->private;
- struct vm_area_struct *vma = v;
if (vma && vma != priv->tail_vma) {
struct mm_struct *mm = vma->vm_mm;
up_read(&mm->mmap_sem);
mmput(mm);
}
- if (priv->task)
- put_task_struct(priv->task);
}
static void *m_next(struct seq_file *m, void *v, loff_t *pos)
@@ -385,10 +381,20 @@ static void *m_next(struct seq_file *m,
(*pos)++;
if (vma && (vma != tail_vma) && vma->vm_next)
return vma->vm_next;
- m_stop(m, v);
+ vma_stop(priv, vma);
return (vma != tail_vma)? tail_vma: NULL;
}
+static void m_stop(struct seq_file *m, void *v)
+{
+ struct proc_maps_private *priv = m->private;
+ struct vm_area_struct *vma = v;
+
+ vma_stop(priv, vma);
+ if (priv->task)
+ put_task_struct(priv->task);
+}
+
static struct seq_operations proc_pid_maps_op = {
.start = m_start,
.next = m_next,
--
1.2.2.g709a-dirty
^ permalink raw reply related [flat|nested] 43+ messages in thread
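The double put described in the patch above can be modelled in a few
lines. This is only a sketch under stated assumptions: the toy_* names are
hypothetical, the counters stand in for the task reference and mmap_sem,
and the walk is reduced to the single case where ->next reaches the end of
the vma list before the seq_file core calls ->stop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not the kernel seq_file API: the reference taken in start()
 * must be dropped exactly once, in stop(). */
struct toy_priv {
    int task_refs;   /* references held on the (pretend) task */
    int mm_locked;   /* 1 while the (pretend) mmap_sem is held */
};

static void toy_start(struct toy_priv *p)
{
    p->task_refs++;
    p->mm_locked = 1;
}

/* Per-iteration cleanup: harmless if called more than once per start. */
static void toy_vma_stop(struct toy_priv *p)
{
    p->mm_locked = 0;
}

/* Full stop: per-iteration cleanup plus the one-time reference drop. */
static void toy_stop(struct toy_priv *p)
{
    toy_vma_stop(p);
    p->task_refs--;
}

/* Runs one start/next/stop cycle and returns the final reference count.
 * buggy != 0 models the old m_next(), which called the full stop. */
static int toy_walk(struct toy_priv *p, int buggy)
{
    toy_start(p);
    if (buggy)
        toy_stop(p);      /* old m_next(): drops the task reference early */
    else
        toy_vma_stop(p);  /* fixed m_next(): releases only the mm side */
    toy_stop(p);          /* the core calls ->stop once per start */
    return p->task_refs;
}
```

With the fixed shape the count returns to its starting value; with the
buggy shape it ends one below, which in the real kernel would mean a
task_struct freed while still in use.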
* Re: [PATCH] proc: task_mmu bug fix.
2006-03-01 7:26 ` [PATCH] proc: task_mmu bug fix Eric W. Biederman
@ 2006-03-01 7:46 ` Andrew Morton
2006-03-01 12:49 ` Eric W. Biederman
2006-03-01 18:33 ` Paul Jackson
0 siblings, 2 replies; 43+ messages in thread
From: Andrew Morton @ 2006-03-01 7:46 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: linux-kernel, pj
ebiederm@xmission.com (Eric W. Biederman) wrote:
>
> This should fix the big bug that has been crashing kernels when
> fuser is called. At least it is the bug I observed here. It seems
> you need the right access pattern on /proc/<pid>/maps to trigger this.
Thanks. Do you think this is likely to fix the crashes reported by
Laurent, Jesper, Paul, Rafael and Martin?
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 5:25 ` Paul Jackson
` (2 preceding siblings ...)
2006-03-01 7:26 ` [PATCH] proc: task_mmu bug fix Eric W. Biederman
@ 2006-03-01 7:48 ` Paul Jackson
2006-03-01 8:26 ` Andrew Morton
3 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 7:48 UTC (permalink / raw)
To: Paul Jackson; +Cc: ebiederm, akpm, linux-kernel
> I am seeing as a separate bug the crash during boot that I reported
> last, when I turned on some DEBUG options.
I have narrowed it down to between the following two patches
in *-mm (patch numbers 20 and 90 in 2.6.16-rc5-mm1, roughly):
multiple-exports-of-strpbrk.patch == ok
git-drm.patch == bad
I have to set this aside for now.
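For anyone picking this up: narrowing the range further is a plain
bisection over the ordered patch stack. A minimal sketch, assuming exactly
one good-to-bad transition between a known-good and a known-bad position.
The boots_fn predicate and toy_boots() are hypothetical stand-ins for a
real apply, build and boot cycle, and the culprit position 57 is invented
for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate: does the kernel boot with the first
 * n_applied patches of the series applied? */
typedef int (*boots_fn)(size_t n_applied);

/* Precondition: boots(lo) is true, boots(hi) is false, and lo < hi.
 * Returns the position of the first patch whose application breaks boot,
 * assuming a single transition from good to bad in (lo, hi]. */
static size_t first_bad_patch(boots_fn boots, size_t lo, size_t hi)
{
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (boots(mid))
            lo = mid;   /* still good: culprit is later */
        else
            hi = mid;   /* already bad: culprit is here or earlier */
    }
    return hi;
}

/* Toy stand-in for a real boot test: pretend patch 57 is the culprit. */
static int toy_boots(size_t n_applied)
{
    return n_applied < 57;
}
```

Starting from positions 20 (ok) and 90 (bad), this needs about seven boot
tests rather than seventy.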
As stated before, the bad patch won't boot on my ia64 SN2 Altix
sn2_defconfig plus debug options:
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SPINLOCK=y
> CONFIG_DEBUG_SPINLOCK_SLEEP=y
It fails with:
==============
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
SGI Altix RTC Timer: v2.1, 20 MHz
EFI Time Services Driver v0.4
Linux agpgart interface v0.101 (c) Dave Jones
sn_console: Console driver init
ttySG0 at I/O 0x0 (irq = 0) is a SGI SN L1
Unable to handle kernel NULL pointer dereference (address 0000000000000058)
swapper[1]: Oops 8813272891392 [1]
Modules linked in:
Pid: 1, CPU 0, comm: swapper
psr : 0000101008026018 ifs : 800000000000040b ip : [<a0000001001ea8b0>] Not tainted
ip is at sysfs_create_group+0x30/0x2a0
unat: 0000000000000000 pfs : 0000000000000308 rsc : 0000000000000003
rnat: 0000000002000027 bsps: 0000000000000002 pr : 0000000000005649
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a000000100809190 b6 : e000023002310080 b7 : a0000001008091e0
f6 : 1003e0000000000000000 f7 : 1003e20c49ba5e353f7cf
f8 : 1003e0000000000003398 f9 : 1003e000000000000007f
f10 : 1003e0000000000000000 f11 : 1003e0000000000000000
r1 : a000000100c70d60 r2 : 0000000000000058 r3 : a000000100a866a8
r8 : 0000000000000000 r9 : a000000100c96920 r10 : ffffffffffffffff
r11 : 0000000000000400 r12 : e00002343bd97d50 r13 : e00002343bd90000
r14 : a000000100a87378 r15 : a000000100c96920 r16 : a000000100a866b0
r17 : 00000000000003c0 r18 : 0000000000000001 r19 : 0000000000000002
r20 : ffffffffffffffff r21 : 0000000000000000 r22 : 000000000000000e
r23 : a000000100a72150 r24 : a000000100812c40 r25 : a000000100a72670
r26 : a000000100a88c20 r27 : a0000001008f3b88 r28 : e00002bc3a0432f0
r29 : 0000000000000001 r30 : a0000001007cac58 r31 : a0000001008091e0
Call Trace:
[<a0000001000132c0>] show_stack+0x40/0xa0
sp=e00002343bd978e0 bsp=e00002343bd91278
[<a000000100013af0>] show_regs+0x7d0/0x800
sp=e00002343bd97ab0 bsp=e00002343bd91228
[<a000000100036df0>] die+0x210/0x320
sp=e00002343bd97ab0 bsp=e00002343bd911d8
[<a00000010005a840>] ia64_do_page_fault+0x900/0xa80
sp=e00002343bd97ad0 bsp=e00002343bd91178
[<a00000010000bd00>] ia64_leave_kernel+0x0/0x290
sp=e00002343bd97b80 bsp=e00002343bd91178
[<a0000001001ea8b0>] sysfs_create_group+0x30/0x2a0
sp=e00002343bd97d50 bsp=e00002343bd91120
[<a000000100809190>] topology_cpu_callback+0x70/0xc0
sp=e00002343bd97d60 bsp=e00002343bd910f0
[<a000000100809260>] topology_sysfs_init+0x80/0x120
sp=e00002343bd97d60 bsp=e00002343bd910d0
[<a000000100009860>] init+0x580/0x8e0
sp=e00002343bd97d60 bsp=e00002343bd910a8
[<a000000100011780>] kernel_thread_helper+0xe0/0x100
sp=e00002343bd97e30 bsp=e00002343bd91080
[<a000000100009140>] start_kernel_thread+0x20/0x40
sp=e00002343bd97e30 bsp=e00002343bd91080
<0>Kernel panic - not syncing: Attempted to kill init!
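For lurkers: a fault address as small as 0x58 usually means a member was read through a NULL structure pointer, so the address is just that member's offset within the structure. Which pointer and which structure are involved here has not been established in this thread. A minimal sketch in C, assuming a made-up layout (`fake_kobject`, `pad` and `member_fault_address` are hypothetical names, not kernel definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real struct kobject: padded so that
 * 'dentry' sits at byte offset 0x58. */
struct fake_kobject {
    char pad[0x58];
    void *dentry;
};

/* Address that a load of base->dentry would touch, computed without
 * dereferencing anything. */
static uintptr_t member_fault_address(uintptr_t base)
{
    /* The whole illustration rests on this layout assumption. */
    assert(offsetof(struct fake_kobject, dentry) == 0x58);
    return base + offsetof(struct fake_kobject, dentry);
}
```

With `base` equal to 0 (a NULL pointer) the computed address is the member offset itself, which is the pattern the oops line above would fit.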
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 7:48 ` + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree Paul Jackson
@ 2006-03-01 8:26 ` Andrew Morton
2006-03-01 8:39 ` Paul Jackson
2006-03-01 9:53 ` Paul Jackson
0 siblings, 2 replies; 43+ messages in thread
From: Andrew Morton @ 2006-03-01 8:26 UTC (permalink / raw)
To: Paul Jackson; +Cc: pj, ebiederm, linux-kernel
Paul Jackson <pj@sgi.com> wrote:
>
> > I am seeing as a separate bug the crash during boot that I reported
> > last, when I turned on some DEBUG options.
>
> I have narrowed it down to between the following two patches
> in *-mm (patch numbers 20 and 90 in 2.6.16-rc5-mm1, roughly):
I hope that machine doesn't take too long to boot.
> multiple-exports-of-strpbrk.patch == ok
> git-drm.patch == bad
That would point at either the sysfs changes in gregkh-driver-* or acpi.
There have been no changes in the acpi patch in a couple of weeks. Did
that machine run rc4-mm2?
> ip is at sysfs_create_group+0x30/0x2a0
It does have a sysfsy feel. But I don't immediately see how any of the
patches in the driver tree can affect this.
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 8:26 ` Andrew Morton
@ 2006-03-01 8:39 ` Paul Jackson
2006-03-01 9:53 ` Paul Jackson
1 sibling, 0 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 8:39 UTC (permalink / raw)
To: Andrew Morton; +Cc: ebiederm, linux-kernel
> Did that machine run rc4-mm2?
Its twin sister ran rc4-mm2, as that is where I
tested the last "cpuset memory spread slab file i/o"
patch I sent you.
> That would point at either the sysfs changes in gregkh-driver-* or acpi.
> There have been no changes in the acpi patch in a couple of weeks.
I'll give these gregkh changes a higher weighting in my
"biased binary search". Thanks.
I've got time now to do a few more slices.
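A "biased binary search" can be read as ordinary bisection where the cut point is chosen by accumulated suspicion rather than by patch count. A minimal sketch in C, assuming per-patch suspicion weights; `next_slice`, `weight`, `good` and `bad` are made-up names for illustration, not part of any -mm tooling:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick how many patches to apply for the next test boot.
 *   weight[i] : how suspicious patch i looks (hypothetical, >= 0)
 *   good      : a stack of the first 'good' patches boots
 *   bad       : a stack of the first 'bad' patches crashes (bad > good)
 * Returns k with good < k < bad that splits the suspicion as evenly
 * as possible, or 'bad' when a single candidate is left (patch index
 * bad - 1 is then the culprit).
 */
static size_t next_slice(const double *weight, size_t good, size_t bad)
{
    double total = 0.0, acc = 0.0, best_gap;
    size_t i, best;

    assert(bad > good);
    if (bad - good == 1)
        return bad;

    for (i = good; i < bad; i++)
        total += weight[i];

    best = good + 1;
    best_gap = total;   /* no real candidate can be worse than this */
    for (i = good; i + 1 < bad; i++) {
        double half = total / 2, gap;

        acc += weight[i];
        gap = acc > half ? acc - half : half - acc;
        if (gap < best_gap) {
            best_gap = gap;
            best = i + 1;
        }
    }
    return best;
}
```

With equal weights this degenerates to plain bisection; giving one patch a large weight moves the next cut right next to it, so a single boot settles whether the culprit lies before that patch.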
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 8:26 ` Andrew Morton
2006-03-01 8:39 ` Paul Jackson
@ 2006-03-01 9:53 ` Paul Jackson
2006-03-01 10:02 ` Andrew Morton
` (2 more replies)
1 sibling, 3 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 9:53 UTC (permalink / raw)
To: Andrew Morton; +Cc: ebiederm, linux-kernel
Ok - down to the patch:
1) gregkh-driver-empty_release_functions_are_broken.patch - good
2) gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch - special case
3) gregkh-driver-fix-up-the-sysfs-pollable-patch.patch - bad
Up through and including (1), it all seems fine.
With (3) or more loaded, it fails to boot, with the crash
given before (and appended below for completeness).
With patches up through (2) loaded, it boots, but complains 27
times during the boot.
One of the 27 complaints for special case (2):
================================= begin =================================
Debug: sleeping function called from invalid context at drivers/base/core.c:343
in_atomic():1, irqs_disabled():0
Call Trace:
[<a0000001000132c0>] show_stack+0x40/0xa0
sp=e00002bc3a49f9b0 bsp=e00002bc3a499558
[<a000000100013b50>] dump_stack+0x30/0x60
sp=e00002bc3a49fb80 bsp=e00002bc3a499540
[<a00000010008ff80>] __might_sleep+0x200/0x220
sp=e00002bc3a49fb80 bsp=e00002bc3a499510
[<a0000001004b58b0>] put_device+0x30/0x60
sp=e00002bc3a49fb90 bsp=e00002bc3a4994f0
[<a00000010051e470>] scsi_put_command+0x170/0x1a0
sp=e00002bc3a49fb90 bsp=e00002bc3a499498
[<a000000100528c80>] scsi_next_command+0x40/0x80
sp=e00002bc3a49fb90 bsp=e00002bc3a499468
[<a0000001005296a0>] scsi_end_request+0x1a0/0x1e0
sp=e00002bc3a49fb90 bsp=e00002bc3a499420
[<a000000100529a50>] scsi_io_completion+0x370/0x820
sp=e00002bc3a49fb90 bsp=e00002bc3a499388
[<a000000100529f90>] scsi_blk_pc_done+0x90/0xc0
sp=e00002bc3a49fba0 bsp=e00002bc3a499368
[<a00000010051d2b0>] scsi_finish_command+0x150/0x180
sp=e00002bc3a49fba0 bsp=e00002bc3a499338
[<a00000010052a590>] scsi_softirq_done+0x270/0x2a0
sp=e00002bc3a49fba0 bsp=e00002bc3a499310
[<a0000001003be420>] blk_done_softirq+0x1a0/0x200
sp=e00002bc3a49fbb0 bsp=e00002bc3a4992f8
[<a0000001000b0070>] __do_softirq+0xd0/0x240
sp=e00002bc3a49fbc0 bsp=e00002bc3a499280
[<a0000001000b0260>] do_softirq+0x80/0xe0
sp=e00002bc3a49fbc0 bsp=e00002bc3a499220
[<a0000001000b0340>] irq_exit+0x80/0xc0
sp=e00002bc3a49fbc0 bsp=e00002bc3a499208
[<a000000100010240>] ia64_handle_irq+0x120/0x140
sp=e00002bc3a49fbc0 bsp=e00002bc3a4991d0
[<a0000001003ebe40>] __copy_user+0x100/0x960
sp=e00002bc3a49fd90 bsp=e00002bc3a499110
[<a000000100011ac0>] default_idle+0xc0/0x160
sp=e00002bc3a49fd90 bsp=e00002bc3a4990f0
[<a000000100012a90>] cpu_idle+0x230/0x300
sp=e00002bc3a49fe30 bsp=e00002bc3a4990c0
[<a000000100055340>] start_secondary+0x340/0x360
sp=e00002bc3a49fe30 bsp=e00002bc3a499080
[<a000000100008650>] __end_ivt_text+0x330/0x360
sp=e00002bc3a49fe30 bsp=e00002bc3a499080
================================== end ==================================
The boottime crash seen in case (3) and beyond:
================================= begin =================================
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
SGI Altix RTC Timer: v2.1, 20 MHz
EFI Time Services Driver v0.4
Linux agpgart interface v0.101 (c) Dave Jones
sn_console: Console driver init
ttySG0 at I/O 0x0 (irq = 0) is a SGI SN L1
Unable to handle kernel NULL pointer dereference (address 0000000000000058)
swapper[1]: Oops 8813272891392 [1]
Modules linked in:
Pid: 1, CPU 0, comm: swapper
psr : 0000101008026018 ifs : 800000000000040b ip : [<a0000001001eac90>] Not tainted
ip is at sysfs_create_group+0x30/0x2a0
unat: 0000000000000000 pfs : 0000000000000308 rsc : 0000000000000003
rnat: 0000000002000027 bsps: 0000000000000002 pr : 0000000000005649
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a000000100809190 b6 : e000023002310080 b7 : a0000001008091e0
f6 : 1003e0000000000000000 f7 : 1003e20c49ba5e353f7cf
f8 : 1003e0000000000003398 f9 : 1003e000000000000007f
f10 : 1003e0000000000000000 f11 : 1003e0000000000000000
r1 : a000000100c70cc0 r2 : 0000000000000058 r3 : a000000100a80ef8
r8 : 0000000000000000 r9 : a000000100c95820 r10 : ffffffffffffffff
r11 : 0000000000000400 r12 : e00002343bd97d50 r13 : e00002343bd90000
r14 : a000000100a83360 r15 : a000000100c95820 r16 : a000000100a80f00
r17 : 00000000000003c0 r18 : 0000000000000001 r19 : 0000000000000002
r20 : ffffffffffffffff r21 : 0000000000000000 r22 : 000000000000000e
r23 : a000000100a720a8 r24 : a000000100812c40 r25 : a000000100a77698
r26 : a000000100a88b60 r27 : a0000001008f3b88 r28 : e00002bc3a0432f0
r29 : 0000000000000001 r30 : a0000001007d0db8 r31 : a0000001008091e0
Call Trace:
[<a0000001000132c0>] show_stack+0x40/0xa0
sp=e00002343bd978e0 bsp=e00002343bd91278
[<a000000100013af0>] show_regs+0x7d0/0x800
sp=e00002343bd97ab0 bsp=e00002343bd91228
[<a000000100036df0>] die+0x210/0x320
sp=e00002343bd97ab0 bsp=e00002343bd911d8
[<a00000010005a840>] ia64_do_page_fault+0x900/0xa80
sp=e00002343bd97ad0 bsp=e00002343bd91178
[<a00000010000bd00>] ia64_leave_kernel+0x0/0x290
sp=e00002343bd97b80 bsp=e00002343bd91178
[<a0000001001eac90>] sysfs_create_group+0x30/0x2a0
sp=e00002343bd97d50 bsp=e00002343bd91120
[<a000000100809190>] topology_cpu_callback+0x70/0xc0
sp=e00002343bd97d60 bsp=e00002343bd910f0
[<a000000100809260>] topology_sysfs_init+0x80/0x120
sp=e00002343bd97d60 bsp=e00002343bd910d0
[<a000000100009860>] init+0x580/0x8e0
sp=e00002343bd97d60 bsp=e00002343bd910a8
[<a000000100011780>] kernel_thread_helper+0xe0/0x100
sp=e00002343bd97e30 bsp=e00002343bd91080
[<a000000100009140>] start_kernel_thread+0x20/0x40
sp=e00002343bd97e30 bsp=e00002343bd91080
<0>Kernel panic - not syncing: Attempted to kill init!
================================== end ==================================
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 9:53 ` Paul Jackson
@ 2006-03-01 10:02 ` Andrew Morton
2006-03-01 10:14 ` Paul Jackson
2006-03-01 10:11 ` Paul Jackson
2006-03-01 19:21 ` Greg KH
2 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2006-03-01 10:02 UTC (permalink / raw)
To: Paul Jackson; +Cc: ebiederm, linux-kernel, Greg KH, Neil Brown
Paul Jackson <pj@sgi.com> wrote:
>
> Ok - down to the patch:
>
> 1) gregkh-driver-empty_release_functions_are_broken.patch - good
> 2) gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch - special case
> 3) gregkh-driver-fix-up-the-sysfs-pollable-patch.patch - bad
>
> Up through and including (1), it all seems fine.
OK, thanks. So
gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch is the
problem. Odd.
<looks at sysfs_poll()>
If that gets called on a top-level file in /sys, won't
filp->f_dentry->d_parent be pointing at a non-sysfs dentry?
> With (3) or more loaded, it fails to boot, with the crash
> given before (and appended below for completeness).
>
> With patches up through (2) loaded, it boots, but complains 27
> times during the boot
>
> One of the 27 complaints for special case (2):
> ================================= begin =================================
> Debug: sleeping function called from invalid context at drivers/base/core.c:343
> in_atomic():1, irqs_disabled():0
>
> Call Trace:
> [<a0000001000132c0>] show_stack+0x40/0xa0
> sp=e00002bc3a49f9b0 bsp=e00002bc3a499558
> [<a000000100013b50>] dump_stack+0x30/0x60
> sp=e00002bc3a49fb80 bsp=e00002bc3a499540
> [<a00000010008ff80>] __might_sleep+0x200/0x220
> sp=e00002bc3a49fb80 bsp=e00002bc3a499510
> [<a0000001004b58b0>] put_device+0x30/0x60
> sp=e00002bc3a49fb90 bsp=e00002bc3a4994f0
> [<a00000010051e470>] scsi_put_command+0x170/0x1a0
> sp=e00002bc3a49fb90 bsp=e00002bc3a499498
> [<a000000100528c80>] scsi_next_command+0x40/0x80
> sp=e00002bc3a49fb90 bsp=e00002bc3a499468
> [<a0000001005296a0>] scsi_end_request+0x1a0/0x1e0
> sp=e00002bc3a49fb90 bsp=e00002bc3a499420
> [<a000000100529a50>] scsi_io_completion+0x370/0x820
> sp=e00002bc3a49fb90 bsp=e00002bc3a499388
Yeah, known problem - big messiness in scsi. That's why -mm includes
revert-gregkh-driver-put_device-might_sleep.patch. Looks like I need to
add revert-gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch too ;)
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 9:53 ` Paul Jackson
2006-03-01 10:02 ` Andrew Morton
@ 2006-03-01 10:11 ` Paul Jackson
2006-03-01 10:31 ` Paul Jackson
2006-03-01 19:21 ` Greg KH
2 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 10:11 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, ebiederm, linux-kernel
Drat - off by one in my reporting again. The first "good" (1) case
was one patch earlier than reported above.
cpufreq-_ppc-frequency-change-issues-freq-already-lowered-by-bios.patch - good
gregkh-driver-put_device-might_sleep.patch - untested
gregkh-driver-empty_release_functions_are_broken.patch - special case
gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch - bad
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 10:02 ` Andrew Morton
@ 2006-03-01 10:14 ` Paul Jackson
0 siblings, 0 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 10:14 UTC (permalink / raw)
To: Andrew Morton; +Cc: ebiederm, linux-kernel, greg, neilb
> OK, thanks. So
> gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch is the
> problem. Odd.
See my "drat" correction that crossed paths with your message.
I'm testing gregkh-driver-put_device-might_sleep.patch now.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 10:11 ` Paul Jackson
@ 2006-03-01 10:31 ` Paul Jackson
0 siblings, 0 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 10:31 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, ebiederm, linux-kernel
Updated and corrected results, now including the previously untested patch:
cpufreq-_ppc-frequency-change-issues-freq-already-lowered-by-bios.patch - good
gregkh-driver-put_device-might_sleep.patch - special case
gregkh-driver-empty_release_functions_are_broken.patch - special case
gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch - bad
where the two special cases boot with the warnings reported,
and the bad case crashes on boot as reported.
That a patch named "*might_sleep*" causes the warning:
Debug: sleeping function called from invalid context at drivers/base/core.c:343
in_atomic():1, irqs_disabled():0
seems likely enough to me.
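For lurkers, the kind of check behind that warning can be imitated in userspace. This is a rough model only, assuming a simple depth counter in place of the kernel's preempt accounting; `atomic_depth`, `model_might_sleep` and `model_put_device` are invented names, not the kernel's implementation:

```c
#include <assert.h>
#include <stdio.h>

static int atomic_depth;    /* stand-in for the kernel's atomic-context state */
static int warnings;        /* how many times the check has fired */

static void enter_atomic(void)
{
    atomic_depth++;
}

static void leave_atomic(void)
{
    assert(atomic_depth > 0);   /* unbalanced leave would be a bug in the model */
    atomic_depth--;
}

/* Userspace imitation of a might_sleep()-style annotation. */
static void model_might_sleep(const char *file, int line)
{
    if (atomic_depth > 0) {
        warnings++;
        fprintf(stderr, "Debug: sleeping function called from invalid "
                "context at %s:%d\n", file, line);
    }
}

/* A function that may block, carrying the annotation at its entry. */
static void model_put_device(void)
{
    model_might_sleep(__FILE__, __LINE__);
    /* ... drop the reference, possibly sleeping ... */
}
```

Calling `model_put_device()` between `enter_atomic()` and `leave_atomic()` prints the warning, much as the traces above show `put_device()` being reached from softirq context; outside that window it stays silent.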
Perhaps the problem isn't so much a mainline bug in:
gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch
but rather perhaps this patch has trouble handling this DEBUG warning ??
Recall, as noted before, this crash requires some DEBUG options.
If I disable CONFIG_DEBUG_SPINLOCK and CONFIG_DEBUG_SPINLOCK_SLEEP,
then a build up through and including (and beyond) the following patches:
cpufreq-_ppc-frequency-change-issues-freq-already-lowered-by-bios.patch
gregkh-driver-put_device-might_sleep.patch
gregkh-driver-empty_release_functions_are_broken.patch
gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch
boots fine. With these two DEBUG options, it crashes during boot
(the "bad" above).
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: [PATCH] proc: task_mmu bug fix.
2006-03-01 7:46 ` Andrew Morton
@ 2006-03-01 12:49 ` Eric W. Biederman
2006-03-01 13:14 ` Hugh Dickins
2006-03-01 13:15 ` Rafael J. Wysocki
2006-03-01 18:33 ` Paul Jackson
1 sibling, 2 replies; 43+ messages in thread
From: Eric W. Biederman @ 2006-03-01 12:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, pj
Andrew Morton <akpm@osdl.org> writes:
> ebiederm@xmission.com (Eric W. Biederman) wrote:
>>
>> This should fix the big bug that has been crashing kernels when
>> fuser is called. At least it is the bug I observed here. It seems
>> you need the right access pattern on /proc/<pid>/maps to trigger this.
>
> Thanks. Do you think this is likely to fix the crashes reported by
> Laurent, Jesper, Paul, Rafael and Martin?
So I haven't tracked down all of the bug reports yet. But the
few bits I have seen make it likely. First the task_mmu change
was one of the largest changes in logic I had to make. Second
the ugly bug reports seem to be about an extra decrement. Third
it seems to be my task_ref work that is the most implicated.
I will certainly follow and see what I can do to confirm that I have
gotten everything.
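For anyone following along, an "extra decrement" on a reference count looks roughly like the sketch below. It is a toy model under stated assumptions, not the task_ref or task_struct code; `struct obj`, `obj_get` and `obj_put` are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical object, standing in for any refcounted structure. */
struct obj {
    int refs;
    bool freed;
};

static void obj_get(struct obj *o)
{
    assert(!o->freed);      /* taking a reference on a freed object is a bug */
    o->refs++;
}

/* Returns false when the put is unbalanced, i.e. the object is already gone. */
static bool obj_put(struct obj *o)
{
    if (o->freed)
        return false;       /* extra decrement: real code would touch freed memory */
    if (--o->refs == 0)
        o->freed = true;    /* real code would release the object here */
    return true;
}
```

Each `obj_get()` must be paired with exactly one `obj_put()`; a code path that puts twice for one get is what the third call in a get/put/put/put sequence would catch here.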
Eric
* Re: [PATCH] proc: task_mmu bug fix.
2006-03-01 12:49 ` Eric W. Biederman
@ 2006-03-01 13:14 ` Hugh Dickins
2006-03-01 13:15 ` Rafael J. Wysocki
1 sibling, 0 replies; 43+ messages in thread
From: Hugh Dickins @ 2006-03-01 13:14 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Andrew Morton, Rafael J. Wysocki, linux-kernel, pj
On Wed, 1 Mar 2006, Eric W. Biederman wrote:
> Andrew Morton <akpm@osdl.org> writes:
> >
> > Thanks. Do you think this is likely to fix the crashes reported by
> > Laurent, Jesper, Paul, Rafael and Martin?
>
> So I haven't tracked down all of the bug reports yet. But the
> few bits I have seen make it likely. First the task_mmu change
> was one of the largest changes in logic I had to make. Second
> the ugly bug reports seem to be about an extra decrement. Third
> it seems to be my task_ref work that is the most implicated.
I was getting the same bootup __put_task_struct symptoms as Rafael,
and this patch fixes those nicely: looks stable now, thanks Eric.
Hugh
* Re: [PATCH] proc: task_mmu bug fix.
2006-03-01 12:49 ` Eric W. Biederman
2006-03-01 13:14 ` Hugh Dickins
@ 2006-03-01 13:15 ` Rafael J. Wysocki
1 sibling, 0 replies; 43+ messages in thread
From: Rafael J. Wysocki @ 2006-03-01 13:15 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Andrew Morton, linux-kernel, pj
On Wednesday 01 March 2006 13:49, Eric W. Biederman wrote:
> Andrew Morton <akpm@osdl.org> writes:
>
> > ebiederm@xmission.com (Eric W. Biederman) wrote:
> >>
> >> This should fix the big bug that has been crashing kernels when
> >> fuser is called. At least it is the bug I observed here. It seems
> >> you need the right access pattern on /proc/<pid>/maps to trigger this.
> >
> > Thanks. Do you think this is likely to fix the crashes reported by
> > Laurent, Jesper, Paul, Rafael and Martin?
>
> So I haven't tracked down all of the bug reports yet. But the
> few bits I have seen make it likely. First the task_mmu change
> > was one of the largest changes in logic I had to make. Second
> the ugly bug reports seem to be about an extra decrement. Third
> it seems to be my task_ref work that is the most implicated.
>
> I will certainly follow and see what I can do to confirm that I have
> gotten everything.
I can confirm it fixes the problem that I have reported.
Thanks a lot,
Rafael
* Re: [PATCH] proc: task_mmu bug fix.
2006-03-01 7:46 ` Andrew Morton
2006-03-01 12:49 ` Eric W. Biederman
@ 2006-03-01 18:33 ` Paul Jackson
1 sibling, 0 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: ebiederm, linux-kernel
Andrew wrote:
> Thanks. Do you think this is likely to fix the crashes reported by
> Laurent, Jesper, Paul, Rafael and Martin?
I presume it was getting the 'fuser ...' crash,
since Eric was using the same command I was using.
I need to run Eric's fix with my SGI inhouse
application that I first saw this on, to be sure
it's happy too.
I'm optimistic that will work too. Hopefully
I can get to this sometime this evening.
The gregkh/sysfs/... boot failure is a separate
bug, as I trust you are aware from your responses
on that failure.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 9:53 ` Paul Jackson
2006-03-01 10:02 ` Andrew Morton
2006-03-01 10:11 ` Paul Jackson
@ 2006-03-01 19:21 ` Greg KH
2006-03-01 20:58 ` Paul Jackson
2 siblings, 1 reply; 43+ messages in thread
From: Greg KH @ 2006-03-01 19:21 UTC (permalink / raw)
To: Paul Jackson; +Cc: Andrew Morton, ebiederm, linux-kernel
On Wed, Mar 01, 2006 at 01:53:38AM -0800, Paul Jackson wrote:
> Ok - down to the patch:
>
> 1) gregkh-driver-empty_release_functions_are_broken.patch - good
> 2) gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch - special case
> 3) gregkh-driver-fix-up-the-sysfs-pollable-patch.patch - bad
>
> Up through and including (1), it all seems fine.
>
> With (3) or more loaded, it fails to boot, with the crash
> given before (and appended below for completeness).
>
> With patches up through (2) loaded, it boots, but complains 27
> times during the boot
>
> One of the 27 complaints for special case (2):
> ================================= begin =================================
> Debug: sleeping function called from invalid context at drivers/base/core.c:343
> in_atomic():1, irqs_disabled():0
>
> Call Trace:
> [<a0000001000132c0>] show_stack+0x40/0xa0
> sp=e00002bc3a49f9b0 bsp=e00002bc3a499558
<snip>
As reported this is expected, and can be ignored safely. It's just scsi
being bad :)
> The boottime crash seen in case (3) and beyond:
> ================================= begin =================================
> pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> SGI Altix RTC Timer: v2.1, 20 MHz
> EFI Time Services Driver v0.4
> Linux agpgart interface v0.101 (c) Dave Jones
> sn_console: Console driver init
> ttySG0 at I/O 0x0 (irq = 0) is a SGI SN L1
> Unable to handle kernel NULL pointer dereference (address 0000000000000058)
> swapper[1]: Oops 8813272891392 [1]
> Modules linked in:
>
> Pid: 1, CPU 0, comm: swapper
> psr : 0000101008026018 ifs : 800000000000040b ip : [<a0000001001eac90>] Not tainted
> ip is at sysfs_create_group+0x30/0x2a0
> unat: 0000000000000000 pfs : 0000000000000308 rsc : 0000000000000003
> rnat: 0000000002000027 bsps: 0000000000000002 pr : 0000000000005649
> ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
> csd : 0000000000000000 ssd : 0000000000000000
> b0 : a000000100809190 b6 : e000023002310080 b7 : a0000001008091e0
> f6 : 1003e0000000000000000 f7 : 1003e20c49ba5e353f7cf
> f8 : 1003e0000000000003398 f9 : 1003e000000000000007f
> f10 : 1003e0000000000000000 f11 : 1003e0000000000000000
> r1 : a000000100c70cc0 r2 : 0000000000000058 r3 : a000000100a80ef8
> r8 : 0000000000000000 r9 : a000000100c95820 r10 : ffffffffffffffff
> r11 : 0000000000000400 r12 : e00002343bd97d50 r13 : e00002343bd90000
> r14 : a000000100a83360 r15 : a000000100c95820 r16 : a000000100a80f00
> r17 : 00000000000003c0 r18 : 0000000000000001 r19 : 0000000000000002
> r20 : ffffffffffffffff r21 : 0000000000000000 r22 : 000000000000000e
> r23 : a000000100a720a8 r24 : a000000100812c40 r25 : a000000100a77698
> r26 : a000000100a88b60 r27 : a0000001008f3b88 r28 : e00002bc3a0432f0
> r29 : 0000000000000001 r30 : a0000001007d0db8 r31 : a0000001008091e0
>
> Call Trace:
> [<a0000001000132c0>] show_stack+0x40/0xa0
> sp=e00002343bd978e0 bsp=e00002343bd91278
> [<a000000100013af0>] show_regs+0x7d0/0x800
> sp=e00002343bd97ab0 bsp=e00002343bd91228
> [<a000000100036df0>] die+0x210/0x320
> sp=e00002343bd97ab0 bsp=e00002343bd911d8
> [<a00000010005a840>] ia64_do_page_fault+0x900/0xa80
> sp=e00002343bd97ad0 bsp=e00002343bd91178
> [<a00000010000bd00>] ia64_leave_kernel+0x0/0x290
> sp=e00002343bd97b80 bsp=e00002343bd91178
> [<a0000001001eac90>] sysfs_create_group+0x30/0x2a0
> sp=e00002343bd97d50 bsp=e00002343bd91120
> [<a000000100809190>] topology_cpu_callback+0x70/0xc0
> sp=e00002343bd97d60 bsp=e00002343bd910f0
> [<a000000100809260>] topology_sysfs_init+0x80/0x120
> sp=e00002343bd97d60 bsp=e00002343bd910d0
This points at the sysfs cpu patches that are in -mm, which are not in
my tree...
thanks,
greg k-h
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 19:21 ` Greg KH
@ 2006-03-01 20:58 ` Paul Jackson
2006-03-01 21:30 ` Greg KH
0 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 20:58 UTC (permalink / raw)
To: Greg KH; +Cc: akpm, ebiederm, linux-kernel
Greg wrote:
> As reported this is expected, and can be ignored safely. It's just scsi
> being bad :)
Yeah - so I eventually realized.
> > [<a0000001001eac90>] sysfs_create_group+0x30/0x2a0
> > sp=e00002343bd97d50 bsp=e00002343bd91120
> > [<a000000100809190>] topology_cpu_callback+0x70/0xc0
> > sp=e00002343bd97d60 bsp=e00002343bd910f0
> > [<a000000100809260>] topology_sysfs_init+0x80/0x120
> > sp=e00002343bd97d60 bsp=e00002343bd910d0
>
> This points at the sysfs cpu patches that are in -mm, which are not in
> my tree...
So ... what does that mean for who should be looking at this?
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 20:58 ` Paul Jackson
@ 2006-03-01 21:30 ` Greg KH
2006-03-01 22:26 ` Andrew Morton
0 siblings, 1 reply; 43+ messages in thread
From: Greg KH @ 2006-03-01 21:30 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, ebiederm, linux-kernel
On Wed, Mar 01, 2006 at 12:58:02PM -0800, Paul Jackson wrote:
> Greg wrote:
> > As reported this is expected, and can be ignored safely. It's just scsi
> > being bad :)
>
> Yeah - so I eventually realized.
>
> > > [<a0000001001eac90>] sysfs_create_group+0x30/0x2a0
> > > sp=e00002343bd97d50 bsp=e00002343bd91120
> > > [<a000000100809190>] topology_cpu_callback+0x70/0xc0
> > > sp=e00002343bd97d60 bsp=e00002343bd910f0
> > > [<a000000100809260>] topology_sysfs_init+0x80/0x120
> > > sp=e00002343bd97d60 bsp=e00002343bd910d0
> >
> > This points at the sysfs cpu patches that are in -mm, which are not in
> > my tree...
>
> So ... what does that mean for who should be looking at this?
Hm, looks like that stuff went into mainline already, sorry I thought it
was still in -mm.
Look at changeset 69dcc99199fe29b0a29471a3488d39d9d33b25fc for details.
I've cced Yanmin, who did that work.
thanks,
greg k-h
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 21:30 ` Greg KH
@ 2006-03-01 22:26 ` Andrew Morton
2006-03-01 22:50 ` Greg KH
2006-03-01 23:10 ` Paul Jackson
0 siblings, 2 replies; 43+ messages in thread
From: Andrew Morton @ 2006-03-01 22:26 UTC (permalink / raw)
To: Greg KH; +Cc: pj, ebiederm, linux-kernel, Zhang, Yanmin
Greg KH <greg@kroah.com> wrote:
>
> On Wed, Mar 01, 2006 at 12:58:02PM -0800, Paul Jackson wrote:
> > Greg wrote:
> > > As reported this is expected, and can be ignored safely. It's just scsi
> > > being bad :)
> >
> > Yeah - so I eventually realized.
> >
> > > > [<a0000001001eac90>] sysfs_create_group+0x30/0x2a0
> > > > sp=e00002343bd97d50 bsp=e00002343bd91120
> > > > [<a000000100809190>] topology_cpu_callback+0x70/0xc0
> > > > sp=e00002343bd97d60 bsp=e00002343bd910f0
> > > > [<a000000100809260>] topology_sysfs_init+0x80/0x120
> > > > sp=e00002343bd97d60 bsp=e00002343bd910d0
> > >
> > > This points at the sysfs cpu patches that are in -mm, which are not in
> > > my tree...
> >
> > So ... what does that mean for who should be looking at this?
>
> Hm, looks like that stuff went into mainline already, sorry I thought it
> was still in -mm.
>
> Look at changeset 69dcc99199fe29b0a29471a3488d39d9d33b25fc for details.
But Paul bisected it down to a particular not-merged patch,
gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch, which I'll
admit doesn't look like it'll cause this.
Paul, did you test
http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz? That has the
sysfs-pollable patches reverted.
> I've cced Yanmin, who did that work.
You missed. I've added Yanmin now.
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 22:26 ` Andrew Morton
@ 2006-03-01 22:50 ` Greg KH
2006-03-01 23:20 ` Paul Jackson
2006-03-01 23:10 ` Paul Jackson
1 sibling, 1 reply; 43+ messages in thread
From: Greg KH @ 2006-03-01 22:50 UTC (permalink / raw)
To: Andrew Morton; +Cc: pj, ebiederm, linux-kernel, Zhang, Yanmin
On Wed, Mar 01, 2006 at 02:26:31PM -0800, Andrew Morton wrote:
> Greg KH <greg@kroah.com> wrote:
> >
> > On Wed, Mar 01, 2006 at 12:58:02PM -0800, Paul Jackson wrote:
> > > Greg wrote:
> > > > As reported this is expected, and can be ignored safely. It's just scsi
> > > > being bad :)
> > >
> > > Yeah - so I eventually realized.
> > >
> > > > > [<a0000001001eac90>] sysfs_create_group+0x30/0x2a0
> > > > > sp=e00002343bd97d50 bsp=e00002343bd91120
> > > > > [<a000000100809190>] topology_cpu_callback+0x70/0xc0
> > > > > sp=e00002343bd97d60 bsp=e00002343bd910f0
> > > > > [<a000000100809260>] topology_sysfs_init+0x80/0x120
> > > > > sp=e00002343bd97d60 bsp=e00002343bd910d0
> > > >
> > > > This points at the sysfs cpu patches that are in -mm, which are not in
> > > > my tree...
> > >
> > > So ... what does that mean for who should be looking at this?
> >
> > Hm, looks like that stuff went into mainline already, sorry I thought it
> > was still in -mm.
> >
> > Look at changeset 69dcc99199fe29b0a29471a3488d39d9d33b25fc for details.
>
> But Paul bisected it down to a particular not-merged patch,
> gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch, which I'll
> admit doesn't look like it'll cause this.
Yeah, I realize that, it just really seems odd that this code dies, and
I thought it was still in your tree at the time, sorry.
Oh, and Paul, this all works just fine with no -mm, right?
thanks,
greg k-h
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 22:26 ` Andrew Morton
2006-03-01 22:50 ` Greg KH
@ 2006-03-01 23:10 ` Paul Jackson
2006-03-01 23:40 ` Paul Jackson
1 sibling, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 23:10 UTC (permalink / raw)
To: Andrew Morton; +Cc: greg, ebiederm, linux-kernel, yanmin.zhang
Andrew wrote:
> But Paul bisected it down to a particular not-merged patch,
> gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch, which I'll
> admit doesn't look like it'll cause this.
Yes - though I did have a couple of "drat" errors. I am double
checking now that that is correct. I think it is. I'll be back
in a little while with greater certainty.
> Paul, did you test
> http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz? That has the
> sysfs-pollable patches reverted.
I did not test this -- I will try it shortly.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 22:50 ` Greg KH
@ 2006-03-01 23:20 ` Paul Jackson
2006-03-01 23:40 ` Andrew Morton
0 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 23:20 UTC (permalink / raw)
To: Greg KH; +Cc: akpm, ebiederm, linux-kernel, yanmin.zhang
> Oh, and Paul, this all works just fine with no -mm, right?
Do you mean - does 2.6.16-rc5 work for me?
I haven't tried that yet. The lowest I went in Andrew's 2.6.16-rc5-mm1
patch stack was one patch, which would be 2.6.16-rc5 plus
"linus.patch." That worked for me, as did a few points between that and
gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch
I will try a plain 2.6.16-rc5 as well, shortly.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 23:10 ` Paul Jackson
@ 2006-03-01 23:40 ` Paul Jackson
2006-03-02 4:20 ` Andrew Morton
0 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-01 23:40 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, greg, ebiederm, linux-kernel, yanmin.zhang
Andrew wrote:
> But Paul bisected it down to a particular not-merged patch,
> gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch, which I'll
> admit doesn't look like it'll cause this.
Verified.
Also ...
1) As stated before, this is -only- with CONFIG_DEBUG_SPINLOCK and
CONFIG_DEBUG_SPINLOCK_SLEEP enabled. If I turn these two off,
it boots fine.
2) Because the patch:
gregkh-driver-put_device-might_sleep.patch
that comes just before here was causing me to worry with its
added error messages, I removed it from my stack. That made
no difference to the failing boot (it simply removed the
error messages, as Greg would predict.)
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 23:20 ` Paul Jackson
@ 2006-03-01 23:40 ` Andrew Morton
2006-03-02 0:10 ` Paul Jackson
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2006-03-01 23:40 UTC (permalink / raw)
To: Paul Jackson; +Cc: greg, ebiederm, linux-kernel, yanmin.zhang
Paul Jackson <pj@sgi.com> wrote:
>
> > Oh, and Paul, this all works just fine with no -mm, right?
>
> Do you mean - does 2.6.16-rc5 work for me?
>
> I haven't tried that yet. The lowest I went in Andrew's 2.6.16-rc5-mm1
> patch stack was one patch, which would be 2.6.16-rc5 plus
> "linus.patch."
That's tip-of-linus-tree a day or so after -rc5.
> I will try a plain 2.6.16-rc5 as well, shortly.
I don't think there's much point in that - the sysfs-topology code went in
on Feb 15.
If 2.6.16-rc5+linus.patch works and
2.6.16-rc5+linus.patch+gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch
crashes then we have a pretty good idea of where to look.
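The narrowing-down described here is a binary search over an ordered patch stack. A minimal userspace sketch follows, assuming the failure is monotonic (once a patch breaks the boot, every longer prefix of the stack also fails). The names `first_bad_patch`, `boot_fails_fn` and `fails_from` are hypothetical and are not taken from any tool mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical predicate: returns nonzero if a kernel built with the
 * first `applied` patches of the stack fails to boot.
 */
typedef int (*boot_fails_fn)(size_t applied, void *ctx);

/*
 * Find the first bad patch in a stack of n patches.
 * Assumes: zero patches applied boots, all n applied fails, and the
 * failure is monotonic in the number of applied patches.
 * Returns the 1-based index of the patch whose addition breaks the boot.
 */
size_t first_bad_patch(size_t n, boot_fails_fn boot_fails, void *ctx)
{
	size_t good = 0;	/* largest prefix known to boot */
	size_t bad = n;		/* smallest prefix known to fail */

	assert(boot_fails != NULL);

	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (boot_fails(mid, ctx))
			bad = mid;
		else
			good = mid;
	}
	return bad;
}

/* Toy predicate for illustration: fails once *ctx patches are applied. */
int fails_from(size_t applied, void *ctx)
{
	const size_t *first_bad = ctx;

	return applied >= *first_bad;
}
```

In practice each call to the predicate is a full build and boot attempt, so the logarithmic number of probes is what makes this affordable on a long -mm patch stack.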
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 23:40 ` Andrew Morton
@ 2006-03-02 0:10 ` Paul Jackson
2006-03-02 0:35 ` Paul Jackson
0 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-02 0:10 UTC (permalink / raw)
To: Andrew Morton; +Cc: greg, ebiederm, linux-kernel, yanmin.zhang
Andrew wrote:
> If 2.6.16-rc5+linus.patch works and
> 2.6.16-rc5+linus.patch+gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch
> crashes then we have a pretty good idea of where to look.
Now I'm trying:
http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz
since that's the only thing left on my list that you haven't
suggested would be of little help.
I'll skip the plain 2.6.16-rc5 unless someone asks for it again.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-02 0:10 ` Paul Jackson
@ 2006-03-02 0:35 ` Paul Jackson
0 siblings, 0 replies; 43+ messages in thread
From: Paul Jackson @ 2006-03-02 0:35 UTC (permalink / raw)
To: Paul Jackson; +Cc: akpm, greg, ebiederm, linux-kernel, yanmin.zhang
pj wrote:
> Now I'm trying:
> http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz
That boots fine, both with and without CONFIG_DEBUG_SPINLOCK and
CONFIG_DEBUG_SPINLOCK_SLEEP.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-01 23:40 ` Paul Jackson
@ 2006-03-02 4:20 ` Andrew Morton
2006-03-02 6:14 ` Paul Jackson
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2006-03-02 4:20 UTC (permalink / raw)
To: Paul Jackson; +Cc: pj, greg, ebiederm, linux-kernel, yanmin.zhang, Neil Brown
Paul Jackson <pj@sgi.com> wrote:
>
> Andrew wrote:
> > But Paul bisected it down to a particular not-merged patch,
> > gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch, which I'll
> > admit doesn't look like it'll cause this.
>
> Verified.
All very strange. afaict that patch is a no-op. The changelog claims that
"This patch also uses sysfs_notify to allow /sys/block/md*/md/sync_action
to be pollable", except that part is AWOL.
It'd be interesting to see if just the data structure expansion:
--- gregkh-2.6.orig/fs/sysfs/file.c
+++ gregkh-2.6/fs/sysfs/file.c
@@ -6,6 +6,7 @@
 #include <linux/fsnotify.h>
 #include <linux/kobject.h>
 #include <linux/namei.h>
+#include <linux/poll.h>
 #include <asm/uaccess.h>
 #include <asm/semaphore.h>
@@ -57,6 +58,7 @@ struct sysfs_buffer {
 	struct sysfs_ops * ops;
 	struct semaphore sem;
 	int needs_read_fill;
+	int event;
 };
--- gregkh-2.6.orig/include/linux/kobject.h
+++ gregkh-2.6/include/linux/kobject.h
@@ -24,6 +24,7 @@
 #include <linux/rwsem.h>
 #include <linux/kref.h>
 #include <linux/kernel.h>
+#include <linux/wait.h>
 #include <asm/atomic.h>

 #define KOBJ_NAME_LEN 20
@@ -56,6 +57,7 @@ struct kobject {
 	struct kset * kset;
 	struct kobj_type * ktype;
 	struct dentry * dentry;
+	wait_queue_head_t poll;
 };

 extern int kobject_set_name(struct kobject *, const char *, ...)
--- gregkh-2.6.orig/include/linux/sysfs.h
+++ gregkh-2.6/include/linux/sysfs.h
@@ -74,6 +74,7 @@ struct sysfs_dirent {
 	umode_t s_mode;
 	struct dentry * s_dentry;
 	struct iattr * s_iattr;
+	atomic_t s_event;
 };

 #define SYSFS_ROOT 0x0001
is sufficient to break it.
Somewhat OT, but why is that patch dinking around with the attribute's
parent directory? Why not just poll the attribute's sysfs file directly?
<xenuflects>. Possibly because we want the poller to be woken when an
attribute actually gets instantiated within the directory?? Dunno.
And it's a bit sad that poll() on an unpollable attribute will just hang.
One would expect poll() to come back with -EINVAL.
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-02 4:20 ` Andrew Morton
@ 2006-03-02 6:14 ` Paul Jackson
2006-03-02 7:42 ` Andrew Morton
0 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-02 6:14 UTC (permalink / raw)
To: Andrew Morton; +Cc: greg, ebiederm, linux-kernel, yanmin.zhang, neilb
> It'd be interesting to see if just the data structure expansion:
Nice guess.
It still crashes on boot.
Details:
1) Working in your 2.6.16-rc5-mm1 stack.
2) Commented out gregkh-driver-put_device-might_sleep.patch
3) With gregkh-driver-empty_release_functions_are_broken.patch on top
4) Then push your "just the data structure expansion" patch on top of that
5) If CONFIG_DEBUG_SPINLOCK and CONFIG_DEBUG_SPINLOCK_SLEEP are enabled:
dies trying to boot
6) But if these SPINLOCK debug options are not enabled:
boots fine
The boot panic this time looks like the others at first glance:
========================== begin ==========================
Uncompressing Linux... donei(3|0)/Scsi(Pun2,Lun0)/HD(Part8,Sig461D0E2E-AD03-4C5ELoading file initrd...done
Linux version 2.6.16-rc5 (pj@jackhammer) (gcc version 3.3.3 (SuSE Linux)) #35 SMP PREEMPT Wed Mar 1 22:01:20 PST 2006
EFI v1.10 by INTEL: SALsystab=0x230027c9070 ACPI 2.0=0x230027c9840
Number of logical nodes in system = 2
Number of memory chunks in system = 2
Initial ramdisk at: 0xe00002bc39fa3000 (4386320 bytes)
SAL 2.9: SGI SN2 version 4.32
SAL Platform features: ITC_Drift
SAL: AP wakeup using external interrupt vector 0x12
No logical to physical processor mapping available
ACPI: Local APIC address c0000000fee00000
ACPI: Error parsing MADT - no IOSAPIC entries
register_intr: No IOSAPIC for GSI 52
4 CPUs available, 4 CPUs total
Increasing MCA rendezvous timeout from 20000 to 49000 milliseconds
MCA related initialization done
SGI SAL version 4.32
Virtual mem_map starts at 0xa0007ffd43c40000
Built 2 zonelists
Kernel command line: BOOT_IMAGE=scsi2:\efi\SuSE\vmlinuz.pj5 root=/dev/sdb6 selinux=0 console=ttySG0 splash=silent thash_entries=2097152 ro
PID hash table entries: 4096 (order: 12, 131072 bytes)
Console: colour dummy device 80x25
Memory: 7567392k/7730224k available (6861k code, 180128k reserved, 3925k data, 368k init)
McKinley Errata 9 workaround not needed; disabling it
Dentry cache hash table entries: 1048576 (order: 9, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 8, 4194304 bytes)
Mount-cache hash table entries: 1024
Boot processor id 0x0/0x8
Brought up 4 CPUs
Total of 4 processors activated (7782.40 BogoMIPS).
migration_cost=7609,38217
checking if image is initramfs... it is
Freeing initrd memory: 4272kB freed
NET: Registered protocol family 16
ACPI: bus type pci registered
Altix IO Topology Information
*****************************
Serial Number:N0000015
PCI SEGMENT PCIBUS NUMBER BRICK RACK:SLOT BUS CONNECTION TOPOLOGY
----------- ------------- --------------------- -------------------
0x0001 0x01 IXbrick 001:27 01 001c24:slab0:widget12:bus0
0x0002 0x01 IXbrick 001:27 02 001c24:slab0:widget12:bus1
0x0003 0x01 IXbrick 001:27 03 001c24:slab0:widget15:bus0
0x0004 0x01 IXbrick 001:27 04 001c24:slab0:widget15:bus1
0x0005 0x01 IXbrick 001:27 05 001c24:slab0:widget13:bus0
0x0006 0x01 IXbrick 001:27 06 001c24:slab0:widget13:bus1
PROM version < 4.50 -- implementing old PROM flush WAR
ACPI: Subsystem revision 20060210
ACPI: SCI (ACPI GSI 52) not registered
ACPI: Interpreter enabled
ACPI: Using IOSAPIC for interrupt routing
SCSI subsystem initialized
perfmon: version 2.0 IRQ 238
perfmon: Itanium 2 PMU detected, 16 PMCs, 18 PMDs, 4 counters (47 bits)
PAL Information Facility v0.5
perfmon: added sampling format default_format
perfmon_default_smpl: default_format v2.0 registered
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 2048 (order 0, 16384 bytes)
SGI XFS with ACLs, realtime, large block/inode numbers, no debug enabled
SGI XFS Quota Management subsystem
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
SGI Altix RTC Timer: v2.1, 20 MHz
EFI Time Services Driver v0.4
Linux agpgart interface v0.101 (c) Dave Jones
sn_console: Console driver init
ttySG0 at I/O 0x0 (irq = 0) is a SGI SN L1
Unable to handle kernel NULL pointer dereference (address 0000000000000058)
swapper[1]: Oops 8813272891392 [1]
Modules linked in:
Pid: 1, CPU 0, comm: swapper
psr : 0000101008026018 ifs : 800000000000040b ip : [<a0000001001ea870>] Not tainted
ip is at sysfs_create_group+0x30/0x2a0
unat: 0000000000000000 pfs : 0000000000000308 rsc : 0000000000000003
rnat: 0000000002000027 bsps: 0000000000000002 pr : 0000000000005649
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a000000100809190 b6 : e000023002310080 b7 : a0000001008091e0
f6 : 1003e0000000000000000 f7 : 1003e20c49ba5e353f7cf
f8 : 1003e0000000000003398 f9 : 1003e000000000000007f
f10 : 1003e0000000000000000 f11 : 1003e0000000000000000
r1 : a000000100c70c70 r2 : 0000000000000058 r3 : a000000100a80fa0
r8 : 0000000000000000 r9 : a000000100c95820 r10 : ffffffffffffffff
r11 : 0000000000000400 r12 : e00002343bd97d50 r13 : e00002343bd90000
r14 : a000000100a831e8 r15 : a000000100c95820 r16 : a000000100a80fa8
r17 : 00000000000003c0 r18 : 0000000000000001 r19 : 0000000000000002
r20 : ffffffffffffffff r21 : 0000000000000000 r22 : 000000000000000e
r23 : a000000100a72058 r24 : a000000100812c40 r25 : a000000100a77648
r26 : a000000100a88b10 r27 : a0000001008f3b88 r28 : e00002bc3a0432f0
r29 : 0000000000000001 r30 : a0000001007cfdc8 r31 : a0000001008091e0
Call Trace:
[<a0000001000132c0>] show_stack+0x40/0xa0
sp=e00002343bd978e0 bsp=e00002343bd91278
[<a000000100013af0>] show_regs+0x7d0/0x800
sp=e00002343bd97ab0 bsp=e00002343bd91228
[<a000000100036df0>] die+0x210/0x320
sp=e00002343bd97ab0 bsp=e00002343bd911d8
[<a00000010005a840>] ia64_do_page_fault+0x900/0xa80
sp=e00002343bd97ad0 bsp=e00002343bd91178
[<a00000010000bd00>] ia64_leave_kernel+0x0/0x290
sp=e00002343bd97b80 bsp=e00002343bd91178
[<a0000001001ea870>] sysfs_create_group+0x30/0x2a0
sp=e00002343bd97d50 bsp=e00002343bd91120
[<a000000100809190>] topology_cpu_callback+0x70/0xc0
sp=e00002343bd97d60 bsp=e00002343bd910f0
[<a000000100809260>] topology_sysfs_init+0x80/0x120
sp=e00002343bd97d60 bsp=e00002343bd910d0
[<a000000100009860>] init+0x580/0x8e0
sp=e00002343bd97d60 bsp=e00002343bd910a8
[<a000000100011780>] kernel_thread_helper+0xe0/0x100
sp=e00002343bd97e30 bsp=e00002343bd91080
[<a000000100009140>] start_kernel_thread+0x20/0x40
sp=e00002343bd97e30 bsp=e00002343bd91080
<0>Kernel panic - not syncing: Attempted to kill init!
=========================== end ===========================
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-02 6:14 ` Paul Jackson
@ 2006-03-02 7:42 ` Andrew Morton
2006-03-02 19:12 ` Paul Jackson
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2006-03-02 7:42 UTC (permalink / raw)
To: Paul Jackson; +Cc: greg, ebiederm, linux-kernel, yanmin.zhang, neilb
Paul Jackson <pj@sgi.com> wrote:
>
> > It'd be interesting to see if just the data structure expansion:
>
> Nice guess.
>
> It still crashes on boot.
>
OK. This is awful. I cannot see it.
If someone passes get_cpu_sysdev() a -ve cpu number then ugly things will
happen, but it's unlikely to be that. Pretty sad coding though.
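The concern about a negative cpu number can be illustrated in userspace. This is only a sketch, assuming the lookup compares a signed index against the upper bound alone, as the remark above suggests. `sysdev_stub`, `cpu_sys_devices_stub` and `get_cpu_sysdev_checked` are hypothetical stand-ins, not the kernel's definitions, and the `NR_CPUS` value is the 1024 quoted elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 1024	/* assumption: the SN2 value quoted in this thread */

/* Hypothetical stand-in for the real per-CPU sysfs device structure. */
struct sysdev_stub {
	int id;
};

/* Hypothetical stand-in for the per-CPU pointer table. */
static struct sysdev_stub *cpu_sys_devices_stub[NR_CPUS];

/*
 * Bounds-checked lookup: rejects negative indexes as well as indexes
 * at or above NR_CPUS, and returns NULL for slots never filled in.
 */
struct sysdev_stub *get_cpu_sysdev_checked(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS)
		return NULL;
	return cpu_sys_devices_stub[cpu];
}
```

Callers still have to test the result, since a slot that was never registered also comes back as NULL.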
> Unable to handle kernel NULL pointer dereference (address 0000000000000058)
> ...
> ip is at sysfs_create_group+0x30/0x2a0
Are you able to determine which pointer deref this is faulting at? I
couldn't find any fields which look like they're 0x58 bytes into anything.
Thanks for persisting with this.
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-02 7:42 ` Andrew Morton
@ 2006-03-02 19:12 ` Paul Jackson
2006-03-02 21:52 ` Andrew Morton
0 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-02 19:12 UTC (permalink / raw)
To: Andrew Morton
Cc: greg, ebiederm, linux-kernel, yanmin.zhang, neilb, steiner,
hawkes
John, Jack,
Adding you to this one. We've seen this before, and
I wasted Andrew's and others' time chasing it again.
I speculate at the bottom of this message that I
should add a panic on the kzalloc that fails if one
has 1024 CPUS, SPINLOCK debug, and just slightly larger
data structures with a new patch than we had before.
Feel free to comment on whether such a panic, or other
remedy would be desirable or not.
===
Andrew wrote:
> OK. This is awful. I cannot see it.
Crap - I should have recognized this problem a day ago.
I wasted your time. Sorry.
The extra data pushed the size of our (big SN2) sysfs_cpus[] array
past the point that it could be kzalloc'd.
The initial failure is in the file:
arch/ia64/kernel/topology.c
function:
topology_init
line:
sysfs_cpus = kzalloc(sizeof(struct ia64_cpu) * NR_CPUS, GFP_KERNEL);
With our large NR_CPUS of 1024, and the additional cost of
the CONFIG_DEBUG_SPINLOCK* debug stuff, and the little bit of
additional data added by this patch, that kzalloc() fails.
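A small sketch of why a few extra bytes per element can tip such an allocation over. The helper `array_fits` is hypothetical. The element sizes and the 128 KiB ceiling used in the example below are illustrative assumptions only; they are not the measured size of struct ia64_cpu or the actual allocator limit on this system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if an array of nmemb elements of elem_size bytes fits in
 * `limit` bytes, 0 otherwise. A product that would overflow size_t is
 * treated as not fitting.
 */
int array_fits(size_t elem_size, size_t nmemb, size_t limit)
{
	if (elem_size != 0 && nmemb > SIZE_MAX / elem_size)
		return 0;
	return elem_size * nmemb <= limit;
}
```

With 1024 elements and an assumed 128 KiB ceiling, a 128-byte element fits exactly (131072 bytes), while growing the element to 136 bytes needs 139264 bytes and no longer fits. That is the shape of the failure described above: the per-CPU structure grows slightly and the multiplied total crosses the limit.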
The final collapse occurs in the file:
fs/sysfs/group.c
function:
sysfs_create_group
line:
BUG_ON(!kobj || !kobj->dentry);
where the read of kobj->dentry touches address 0x58.
The offset of dentry in struct kobject is 0x50, and the offset of that
kobj in the containing struct sys_dev is another 0x8 bytes, resulting
in the failed reference:
Unable to handle kernel NULL pointer dereference (address 0000000000000058)
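The offset arithmetic can be reproduced in userspace with offsetof. The two structures below are hypothetical layouts, padded only so that their offsets match the 0x8 and 0x50 figures quoted in this message; they are not the kernel's struct sys_device or struct kobject.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: dentry sits 0x50 bytes into the object. */
struct kobject_layout {
	char other_fields[0x50];
	void *dentry;
};

/* Hypothetical layout: the embedded kobj starts 0x8 bytes in. */
struct sysdev_layout {
	uint64_t leading_fields;
	struct kobject_layout kobj;
};

/*
 * Address that is touched when code reads base->kobj.dentry.
 * With base == 0 (a NULL containing structure) this is the faulting
 * address reported by the oops.
 */
uintptr_t dentry_fault_address(uintptr_t base)
{
	/* Sanity-check that the stub layouts match the quoted offsets. */
	assert(offsetof(struct sysdev_layout, kobj) == 0x8);
	assert(offsetof(struct kobject_layout, dentry) == 0x50);

	return base + offsetof(struct sysdev_layout, kobj)
		    + offsetof(struct kobject_layout, dentry);
}
```

Working backwards from a small fault address to a sum of member offsets like this is often the quickest way to tell which pointer in a chain was NULL.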
The drivers/base/cpu.c array:
static struct sys_device *cpu_sys_devices[NR_CPUS];
is never filled in, as a result of the above kzalloc() failure, causing
the routine get_cpu_sysdev() in drivers/base/cpu.c to return a NULL
pointer (unknowingly).
I should stare at the code between this point of initial failure and
the point that the house of cards finally collapsed and see if
something should have squeaked sooner.
Though I'm not a guru in this code, so I'm saying a little prayer that
someone else will have a more useful suggestion.
I've added a couple of SGI wizards in this area to the cc list.
I suspect that the short term solution is to proceed without
prejudice to the patch that triggered this:
gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch
while I look at some way, if just a stop gap measure, to complain
earlier in the boot, closer to the scene of the original crime,
so that others hitting this won't waste more time.
Perhaps failing that first kzalloc should cause a complaint,
if not a panic. It would seem that the system is beyond repair
if that kzalloc fails. And since the system hasn't even finished
booting yet, and is for sure trying to boot some larger than tried
before configuration, might just as well announce one's death boldly.
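A userspace analogue of "complain at the scene of the crime", assuming the allocation is one the program cannot continue without. `zalloc_array_or_die` is a hypothetical helper built on calloc() and abort(); the in-kernel equivalent would check the kzalloc() result and call panic(), which is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate a zeroed array of nmemb elements of `size` bytes, or report
 * what could not be allocated and abort immediately, rather than
 * letting a NULL pointer surface much later in unrelated code.
 */
void *zalloc_array_or_die(size_t nmemb, size_t size, const char *what)
{
	void *p = calloc(nmemb, size);

	/* calloc() may legitimately return NULL for a zero-sized request. */
	if (p == NULL && nmemb != 0 && size != 0) {
		fprintf(stderr, "fatal: cannot allocate %s (%zu x %zu bytes)\n",
			what, nmemb, size);
		abort();
	}
	return p;
}
```

The point of naming the object in the message is that the report then identifies the failing table directly, instead of leaving someone to decode a later oops.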
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-02 19:12 ` Paul Jackson
@ 2006-03-02 21:52 ` Andrew Morton
2006-03-03 6:33 ` Paul Jackson
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2006-03-02 21:52 UTC (permalink / raw)
To: Paul Jackson
Cc: greg, ebiederm, linux-kernel, yanmin.zhang, neilb, steiner,
hawkes
Paul Jackson <pj@sgi.com> wrote:
>
> ...
>
> The initial failure is in the file:
>
> arch/ia64/kernel/topology.c
>
> function:
>
> topology_init
>
> line:
>
> sysfs_cpus = kzalloc(sizeof(struct ia64_cpu) * NR_CPUS, GFP_KERNEL);
>
> With our large NR_CPUS of 1024, and the additional cost of
> the CONFIG_DEBUG_SPINLOCK* debug stuff, and the little bit of
> additional data added by this patch, that kzalloc() fails.
>
Oh. Maybe we should put a big fat printk in slab for that.
> I should stare at the code between this point of initial failure and
> the point that the house of cards finally collapsed and see if
> something should have squeaked sooner.
Probably a panic() in your topology_init().
Also the below patch should have been done ages ago.
> I suspect that the short term solution is to proceed without
> prejudice to the patch that triggered this:
>
> gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch
Well yeah, except I find that patch to be independently malodorous ;)
> while I look at some way, if just a stop gap measure, to complain
> earlier in the boot, closer to the scene of the original crime,
> so that others hitting this won't waste more time.
See below.
> Perhaps failing that first kzalloc should cause a complaint,
> if not a panic. It would seem that the system is beyond repair
> if that kzalloc fails. And since the system hasn't even finished
> booting yet, and is for sure trying to boot some larger than tried
> before configuration, might just as well announce one's death boldly.
Yeah, it's dead.
From: Andrew Morton <akpm@osdl.org>
We presently ignore the return values from initcalls. But that can carry
useful debugging information. So print it out if it's non-zero.
Also make that warning message more friendly by printing the name of the
initcall function.
Signed-off-by: Andrew Morton <akpm@osdl.org>
---
init/main.c | 20 ++++++++++++++------
1 files changed, 14 insertions(+), 6 deletions(-)
diff -puN init/main.c~initcall-failure-reporting init/main.c
--- 25/init/main.c~initcall-failure-reporting Thu Mar 2 13:41:02 2006
+++ 25-akpm/init/main.c Thu Mar 2 13:50:53 2006
@@ -565,17 +565,23 @@ static void __init do_initcalls(void)
 	int count = preempt_count();

 	for (call = __initcall_start; call < __initcall_end; call++) {
-		char *msg;
+		char *msg = NULL;
+		char msgbuf[40];
+		int result;

 		if (initcall_debug) {
 			printk(KERN_DEBUG "Calling initcall 0x%p", *call);
-			print_fn_descriptor_symbol(": %s()", (unsigned long) *call);
+			print_fn_descriptor_symbol(": %s()",
+					(unsigned long) *call);
 			printk("\n");
 		}

-		(*call)();
+		result = (*call)();

-		msg = NULL;
+		if (result) {
+			sprintf(msgbuf, "error code %d", result);
+			msg = msgbuf;
+		}
 		if (preempt_count() != count) {
 			msg = "preemption imbalance";
 			preempt_count() = count;
@@ -585,8 +591,10 @@ static void __init do_initcalls(void)
 			local_irq_enable();
 		}
 		if (msg) {
-			printk(KERN_WARNING "error in initcall at 0x%p: "
-				"returned with %s\n", *call, msg);
+			printk(KERN_WARNING "initcall at 0x%p", *call);
+			print_fn_descriptor_symbol(": %s()",
+					(unsigned long) *call);
+			printk(": returned with %s\n", msg);
 		}
 	}

_
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-02 21:52 ` Andrew Morton
@ 2006-03-03 6:33 ` Paul Jackson
2006-03-03 6:44 ` Andrew Morton
0 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2006-03-03 6:33 UTC (permalink / raw)
To: Andrew Morton
Cc: greg, ebiederm, linux-kernel, yanmin.zhang, neilb, steiner,
hawkes
Andrew wrote:
> From: Andrew Morton <akpm@osdl.org>
>
> We presently ignore the return values from initcalls. But that can carry
> useful debugging information. So print it out if it's non-zero.
>
> Also make that warning message more friendly by printing the name of the
> initcall function.
I tried this patch on my sicko kernel, and the following
additional line came out, as expected:
initcall at 0xa0000001007cc4c0: topology_init+0x0/0x280(): returned with error code -12
Looks good.
Acked-by: Paul Jackson <pj@sgi.com>
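For readers decoding the number in that line: kernel functions conventionally return negated errno values, and on Linux 12 is ENOMEM, which is consistent with the failed kzalloc() in topology_init(). A minimal userspace sketch of the decoding follows. `initcall_error_text` is a hypothetical helper, and the 4095 cut-off is an assumption about the range used for errno-style codes.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Map a kernel-style return code (0 or a negated errno value) to a
 * human-readable string using the C library's strerror().
 */
const char *initcall_error_text(int result)
{
	if (result >= 0)
		return "success";
	if (result < -4095)	/* assumed upper bound for errno values */
		return "not an errno-style code";
	return strerror(-result);
}
```

Calling `initcall_error_text(-12)` on a Linux system yields the C library's message for ENOMEM.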
> > I should stare at the code between this point of initial failure and
> > the point that the house of cards finally collapsed and see if
> > something should have squeaked sooner.
>
> Probably a panic() in your topology_init().
Yup - a panic it should be.
I guess that patch should be sent via my friendly ia64 arch maintainer.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree
2006-03-03 6:33 ` Paul Jackson
@ 2006-03-03 6:44 ` Andrew Morton
0 siblings, 0 replies; 43+ messages in thread
From: Andrew Morton @ 2006-03-03 6:44 UTC (permalink / raw)
To: Paul Jackson
Cc: greg, ebiederm, linux-kernel, yanmin.zhang, neilb, steiner,
hawkes
Paul Jackson <pj@sgi.com> wrote:
>
> Andrew wrote:
> > From: Andrew Morton <akpm@osdl.org>
> >
> > We presently ignore the return values from initcalls. But that can carry
> > useful debugging information. So print it out if it's non-zero.
> >
> > Also make that warning message more friendly by printing the name of the
> > initcall function.
>
> I tried this patch on my sicko kernel, and the following
> additional line came out, as expected:
>
> initcall at 0xa0000001007cc4c0: topology_init+0x0/0x280(): returned with error code -12
Yes, I've just been looking at the output. There are quite a few ENODEV's
of course. But it's pretty obvious what's going on from the name of the
function. It does remind you that you have drivers in vmlinux which aren't
doing anything useful.
We'll see how it goes.
end of thread, other threads:[~2006-03-03 6:45 UTC]
Thread overview: 43+ messages
[not found] <200603010120.k211KqVP009559@shell0.pdx.osdl.net>
2006-03-01 2:18 ` + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree Paul Jackson
2006-03-01 2:36 ` Andrew Morton
2006-03-01 3:45 ` Paul Jackson
2006-03-01 4:10 ` Paul Jackson
2006-03-01 5:05 ` Eric W. Biederman
2006-03-01 5:25 ` Paul Jackson
2006-03-01 6:11 ` Eric W. Biederman
2006-03-01 6:15 ` Eric W. Biederman
2006-03-01 7:20 ` [PATCH] proc: Reference couting fix Eric W. Biederman
2006-03-01 7:26 ` [PATCH] proc: task_mmu bug fix Eric W. Biederman
2006-03-01 7:46 ` Andrew Morton
2006-03-01 12:49 ` Eric W. Biederman
2006-03-01 13:14 ` Hugh Dickins
2006-03-01 13:15 ` Rafael J. Wysocki
2006-03-01 18:33 ` Paul Jackson
2006-03-01 7:48 ` + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree Paul Jackson
2006-03-01 8:26 ` Andrew Morton
2006-03-01 8:39 ` Paul Jackson
2006-03-01 9:53 ` Paul Jackson
2006-03-01 10:02 ` Andrew Morton
2006-03-01 10:14 ` Paul Jackson
2006-03-01 10:11 ` Paul Jackson
2006-03-01 10:31 ` Paul Jackson
2006-03-01 19:21 ` Greg KH
2006-03-01 20:58 ` Paul Jackson
2006-03-01 21:30 ` Greg KH
2006-03-01 22:26 ` Andrew Morton
2006-03-01 22:50 ` Greg KH
2006-03-01 23:20 ` Paul Jackson
2006-03-01 23:40 ` Andrew Morton
2006-03-02 0:10 ` Paul Jackson
2006-03-02 0:35 ` Paul Jackson
2006-03-01 23:10 ` Paul Jackson
2006-03-01 23:40 ` Paul Jackson
2006-03-02 4:20 ` Andrew Morton
2006-03-02 6:14 ` Paul Jackson
2006-03-02 7:42 ` Andrew Morton
2006-03-02 19:12 ` Paul Jackson
2006-03-02 21:52 ` Andrew Morton
2006-03-03 6:33 ` Paul Jackson
2006-03-03 6:44 ` Andrew Morton
2006-03-01 4:31 ` Eric W. Biederman
2006-03-01 4:58 ` Paul Jackson