* Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually @ 2006-12-20 0:46 Linas Vepstas 2006-12-20 0:53 ` Benjamin Herrenschmidt 2006-12-20 1:46 ` Peter Bergner 0 siblings, 2 replies; 14+ messages in thread From: Linas Vepstas @ 2006-12-20 0:46 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: linuxppc-dev Hi Ben, Per xchat, here's the update. I'm guessing I'm using a broken compiler, as per chain of evidence below ... I noticed that linux-2.6.20-rc1-git6 crashes on power4 in SMP mode: [ 0.000000] [boot]0020 XICS Init [ 0.000000] i8259 legacy interrupt controller initialized [ 0.000000] [boot]0021 XICS Done [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes) cpu 0x0: Vector: 700 (Program Check) at [c0000000007a3980] pc: c00000000007d574: .debug_mutex_unlock+0x5c/0x118 lr: c000000000468068: .__mutex_unlock_slowpath+0x104/0x198 sp: c0000000007a3c00 msr: 9000000000029032 current = 0xc000000000663690 paca = 0xc000000000663f80 pid = 0, comm = swapper enter ? for help [c0000000007a3c80] c000000000468068 .__mutex_unlock_slowpath+0x104/0x198 [c0000000007a3d20] c000000000231da8 .double_unlock_mutex+0x3c/0x58 [c0000000007a3db0] c00000000023b47c .dotest+0x5c/0x370 [c0000000007a3e50] c00000000023bc0c .locking_selftest+0x47c/0x17fc [c0000000007a3ef0] c0000000005f06ec .start_kernel+0x1e4/0x344 [c0000000007a3f90] c0000000000084c8 .start_here_common+0x54/0x8c 0:mon> However, I also note that the following scrolled by: init/main.c:81:2: warning: #warning gcc-4.1.0 is known to miscompile the kernel. A different compiler version is recommended. and I have not yet tried a different gcc Strangely, linux-2.6.19-git7 crashed with [ 0.000000] [boot]0020 XICS Init [ 0.000000] i8259 legacy interrupt controller initialized [ 0.000000] [boot]0021 XICS Done [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes) System assert at: file: rtas_io_config.c -- line: 195 rio_hub_num: 10 drawer_num: 6 phb_num: 3 buid: 7 which is suspiciously in a similar place. So I am guessing that it is indeed a compiler problem, the compiler passing subroutine arguments in some broken way, or something. Hmm. seems that linux-2.6.20-rc1-git6 on power5 gives me [23178.532001] A-B-B-C-C-A deadlock:failed|failed| ok |failed|fa|[23178.532028] A-B-C-A-B-C deadlock:failed|failed| ok |failed|fa|[23178.532054] A-B-B-C-C-D-D-A deadlock:failed|failed| ok |failed|fa|[23178.532083] A-B-C-D-B-D-D-A deadlock:failed|failed| ok |failed|fa|[23178.532111] A-B-C-D-B-C-D-A deadlock:failed|failed| ok |failed|fa|[23178.532139] double unlock: ok | ok |failed|<0>-------[23178.532171] Kernel BUG at c00000000007d574 [verbose debug info unavailable] cpu 0x0: Vector: 700 (Program Check) at [c0000000007a3980] pc: c00000000007d574: .debug_mutex_unlock+0x5c/0x118 lr: c000000000468068: .__mutex_unlock_slowpath+0x104/0x198 sp: c0000000007a3c00 msr: 8000000000029032 current = 0xc000000000663690 paca = 0xc000000000663f80 pid = 0, comm = swapper enter ? for help [c0000000007a3c80] c000000000468068 .__mutex_unlock_slowpath+0x104/0x198 [c0000000007a3d20] c000000000231da8 .double_unlock_mutex+0x3c/0x58 [c0000000007a3db0] c00000000023b47c .dotest+0x5c/0x370 [c0000000007a3e50] c00000000023bc0c .locking_selftest+0x47c/0x17fc [c0000000007a3ef0] c0000000005f06ec .start_kernel+0x1e4/0x344 [c0000000007a3f90] c0000000000084c8 .start_here_common+0x54/0x8c although linux-2.6.19-git7 worked fine for weeks. At any rate, the warning: #warning gcc-4.1.0 should be converted to a flat-out error. --linas ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 0:46 Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually Linas Vepstas @ 2006-12-20 0:53 ` Benjamin Herrenschmidt 2006-12-20 1:02 ` Linas Vepstas 2006-12-20 1:46 ` Peter Bergner 1 sibling, 1 reply; 14+ messages in thread From: Benjamin Herrenschmidt @ 2006-12-20 0:53 UTC (permalink / raw) To: Linas Vepstas; +Cc: linuxppc-dev On Tue, 2006-12-19 at 18:46 -0600, Linas Vepstas wrote: > Hi Ben, > > Per xchat, here's the update. I'm guessing I'm using a broken > compiler, as per chain of evidence below ... > > I noticed that linux-2.6.20-rc1-git6 crashes on power4 > in SMP mode: Have you tried a different gcc to confirm ? Ben. > [ 0.000000] [boot]0020 XICS Init > [ 0.000000] i8259 legacy interrupt controller initialized > [ 0.000000] [boot]0021 XICS Done > [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes) > cpu 0x0: Vector: 700 (Program Check) at [c0000000007a3980] > pc: c00000000007d574: .debug_mutex_unlock+0x5c/0x118 > lr: c000000000468068: .__mutex_unlock_slowpath+0x104/0x198 > sp: c0000000007a3c00 > msr: 9000000000029032 > current = 0xc000000000663690 > paca = 0xc000000000663f80 > pid = 0, comm = swapper > enter ? for help > [c0000000007a3c80] c000000000468068 .__mutex_unlock_slowpath+0x104/0x198 > [c0000000007a3d20] c000000000231da8 .double_unlock_mutex+0x3c/0x58 > [c0000000007a3db0] c00000000023b47c .dotest+0x5c/0x370 > [c0000000007a3e50] c00000000023bc0c .locking_selftest+0x47c/0x17fc > [c0000000007a3ef0] c0000000005f06ec .start_kernel+0x1e4/0x344 > [c0000000007a3f90] c0000000000084c8 .start_here_common+0x54/0x8c > 0:mon> > > > However, I also note that the following scrolled by: > init/main.c:81:2: warning: #warning gcc-4.1.0 is known to miscompile the > kernel. A different compiler version is recommended. > > and I have not yet tried a different gcc > > Strangely, linux-2.6.19-git7 crashed with > > [ 0.000000] [boot]0020 XICS Init > [ 0.000000] i8259 legacy interrupt controller initialized > [ 0.000000] [boot]0021 XICS Done > [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes) > System assert at: file: rtas_io_config.c -- line: 195 > rio_hub_num: 10 > drawer_num: 6 > phb_num: 3 > buid: 7 > > which is suspiciously in a similar place. So I am guessing > that it is indeed a compiler problem, the compiler passing > subroutine arguments in some broken way, or something. > > Hmm. seems that linux-2.6.20-rc1-git6 on power5 gives me > > [23178.532001] A-B-B-C-C-A deadlock:failed|failed| ok > |failed|fa|[23178.532028] A-B-C-A-B-C > deadlock:failed|failed| ok |failed|fa|[23178.532054] > A-B-B-C-C-D-D-A deadlock:failed|failed| ok |failed|fa|[23178.532083] > A-B-C-D-B-D-D-A deadlock:failed|failed| ok |failed|fa|[23178.532111] > A-B-C-D-B-C-D-A deadlock:failed|failed| ok |failed|fa|[23178.532139] > double unlock: ok | ok |failed|<0>-------[23178.532171] Kernel BUG > at c00000000007d574 [verbose debug info unavailable] > cpu 0x0: Vector: 700 (Program Check) at [c0000000007a3980] > pc: c00000000007d574: .debug_mutex_unlock+0x5c/0x118 > lr: c000000000468068: .__mutex_unlock_slowpath+0x104/0x198 > sp: c0000000007a3c00 > msr: 8000000000029032 > current = 0xc000000000663690 > paca = 0xc000000000663f80 > pid = 0, comm = swapper > enter ? for help > [c0000000007a3c80] c000000000468068 .__mutex_unlock_slowpath+0x104/0x198 > [c0000000007a3d20] c000000000231da8 .double_unlock_mutex+0x3c/0x58 > [c0000000007a3db0] c00000000023b47c .dotest+0x5c/0x370 > [c0000000007a3e50] c00000000023bc0c .locking_selftest+0x47c/0x17fc > [c0000000007a3ef0] c0000000005f06ec .start_kernel+0x1e4/0x344 > [c0000000007a3f90] c0000000000084c8 .start_here_common+0x54/0x8c > > although linux-2.6.19-git7 worked fine for weeks. > > At any rate, the warning: #warning gcc-4.1.0 should be converted > to a flat-out error. > > --linas > > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 0:53 ` Benjamin Herrenschmidt @ 2006-12-20 1:02 ` Linas Vepstas 0 siblings, 0 replies; 14+ messages in thread From: Linas Vepstas @ 2006-12-20 1:02 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: linuxppc-dev On Wed, Dec 20, 2006 at 11:53:04AM +1100, Benjamin Herrenschmidt wrote: > On Tue, 2006-12-19 at 18:46 -0600, Linas Vepstas wrote: > > Hi Ben, > > > > Per xchat, here's the update. I'm guessing I'm using a broken > > compiler, as per chain of evidence below ... > > > > I noticed that linux-2.6.20-rc1-git6 crashes on power4 > > in SMP mode: > > Have you tried a different gcc to confirm ? tommorrow .. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 0:46 Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually Linas Vepstas 2006-12-20 0:53 ` Benjamin Herrenschmidt @ 2006-12-20 1:46 ` Peter Bergner 2006-12-20 21:19 ` Linas Vepstas 1 sibling, 1 reply; 14+ messages in thread From: Peter Bergner @ 2006-12-20 1:46 UTC (permalink / raw) To: Linas Vepstas; +Cc: linuxppc-dev On Tue, 2006-12-19 at 18:46 -0600, Linas Vepstas wrote: > Per xchat, here's the update. I'm guessing I'm using a broken > compiler, as per chain of evidence below ... [snip] > However, I also note that the following scrolled by: > init/main.c:81:2: warning: #warning gcc-4.1.0 is known to miscompile the > kernel. A different compiler version is recommended. It may be due to this GCC bug which Olaf ran into a while back: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24644 You can verify whether you have a broken compiler by compiling the minimal test case I posted in comment #15. If you see r13 being copied into another register and then used, then you have a broken compiler. Peter ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 1:46 ` Peter Bergner @ 2006-12-20 21:19 ` Linas Vepstas 2006-12-20 21:28 ` Benjamin Herrenschmidt 2006-12-23 6:28 ` Alan Modra 0 siblings, 2 replies; 14+ messages in thread From: Linas Vepstas @ 2006-12-20 21:19 UTC (permalink / raw) To: Peter Bergner; +Cc: linuxppc-dev On Tue, Dec 19, 2006 at 07:46:50PM -0600, Peter Bergner wrote: > On Tue, 2006-12-19 at 18:46 -0600, Linas Vepstas wrote: > > Per xchat, here's the update. I'm guessing I'm using a broken > > compiler, as per chain of evidence below ... > [snip] > > However, I also note that the following scrolled by: > > init/main.c:81:2: warning: #warning gcc-4.1.0 is known to miscompile the > > kernel. A different compiler version is recommended. > > It may be due to this GCC bug which Olaf ran into a while back: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24644 > > You can verify whether you have a broken compiler by compiling > the minimal test case I posted in comment #15. If you see r13 > being copied into another register and then used, then you have > a broken compiler. No, that's not it. I'd be surprised, as I was using the SuSE SLES10 gcc-4.1.0-28.4.ppc.rpm compiler, which would have that fix. I'm trying to figure out how to try a different compiler, I'm hoping that 3.3 can still compile new kernels. I'll try to stare at the dump a bit too, now. --linas ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 21:19 ` Linas Vepstas @ 2006-12-20 21:28 ` Benjamin Herrenschmidt 2006-12-20 23:03 ` Mutex debug lock failure [was " Linas Vepstas 2006-12-23 6:28 ` Alan Modra 1 sibling, 1 reply; 14+ messages in thread From: Benjamin Herrenschmidt @ 2006-12-20 21:28 UTC (permalink / raw) To: Linas Vepstas; +Cc: linuxppc-dev On Wed, 2006-12-20 at 15:19 -0600, Linas Vepstas wrote: > On Tue, Dec 19, 2006 at 07:46:50PM -0600, Peter Bergner wrote: > > On Tue, 2006-12-19 at 18:46 -0600, Linas Vepstas wrote: > > > Per xchat, here's the update. I'm guessing I'm using a broken > > > compiler, as per chain of evidence below ... > > [snip] > > > However, I also note that the following scrolled by: > > > init/main.c:81:2: warning: #warning gcc-4.1.0 is known to miscompile the > > > kernel. A different compiler version is recommended. > > > > It may be due to this GCC bug which Olaf ran into a while back: > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24644 > > > > You can verify whether you have a broken compiler by compiling > > the minimal test case I posted in comment #15. If you see r13 > > being copied into another register and then used, then you have > > a broken compiler. > > No, that's not it. I'd be surprised, as I was using the SuSE > SLES10 gcc-4.1.0-28.4.ppc.rpm compiler, which would have that fix. > > I'm trying to figure out how to try a different compiler, > I'm hoping that 3.3 can still compile new kernels. > > I'll try to stare at the dump a bit too, now. I've been using 4.1.2 from debian/ubuntu happily lately. Ben. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Mutex debug lock failure [was Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 21:28 ` Benjamin Herrenschmidt @ 2006-12-20 23:03 ` Linas Vepstas 2006-12-20 23:09 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 14+ messages in thread From: Linas Vepstas @ 2006-12-20 23:03 UTC (permalink / raw) To: Benjamin Herrenschmidt, Ingo Molnar; +Cc: linux-kernel, linuxppc-dev On Thu, Dec 21, 2006 at 08:28:54AM +1100, Benjamin Herrenschmidt wrote: > On Wed, 2006-12-20 at 15:19 -0600, Linas Vepstas wrote: > > On Tue, Dec 19, 2006 at 07:46:50PM -0600, Peter Bergner wrote: > > > > I'm trying to figure out how to try a different compiler, > > I'm hoping that 3.3 can still compile new kernels. > > > > I'll try to stare at the dump a bit too, now. > > I've been using 4.1.2 from debian/ubuntu happily lately. Do you have mutex debugging turned on? It doesn't seem to be a compiler bug. Fails in the same place, when compiled with gcc-3.3 Reading the code, it looks like the test case double_unlock_mutex() fails, because the mutex lock->owner == NULL. whereas, in kernel/mutex-debug.c, it wanted void debug_mutex_unlock(struct mutex *lock) { DEBUG_LOCKS_WARN_ON(lock->owner != current_thread_info()); This is called from lib/locking-selftest.c circa line 490 #include "locking-selftest-mutex.h" GENERATE_TESTCASE(double_unlock_mutex) This may be because include/linux/mutex-debug.h which has __DEBUG_MUTEX_INITIALIZER in it, failes to set .owner to any interesting value. WHile I desperately want to conclude that the mutex lock debug code is broken, it seems to be unchanged fron 2.6.19 (and seems to work fine in 2.6.19-git7), so I am confused. This also doesn't explain the bizarre form of the failure on power4 ... does power4 mishandle twi somehow? Will investigate next. --linas p.s. To recap: above report is for linux-2.6.20-rc1-git6 I get the following on power5: 97742.318323] A-B-B-C-C-D-D-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318352] A-B-C-D-B-D-D-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318380] A-B-C-D-B-C-D-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318408] double unlock: ok | ok |failed|<0>------------[ cut here ]------------ [97742.318440] Kernel BUG at c00000000007d574 [verbose debug info unavailable] cpu 0x0: Vector: 700 (Program Check) at [c0000000007a3980] pc: c00000000007d574: .debug_mutex_unlock+0x5c/0x118 lr: c000000000468068: .__mutex_unlock_slowpath+0x104/0x198 sp: c0000000007a3c00 msr: 8000000000029032 current = 0xc000000000663690 paca = 0xc000000000663f80 pid = 0, comm = swapper enter ? for help [c0000000007a3c80] c000000000468068 .__mutex_unlock_slowpath+0x104/0x198 [c0000000007a3d20] c000000000231da8 .double_unlock_mutex+0x3c/0x58 [c0000000007a3db0] c00000000023b47c .dotest+0x5c/0x370 [c0000000007a3e50] c00000000023bc0c .locking_selftest+0x47c/0x17fc [c0000000007a3ef0] c0000000005f06ec .start_kernel+0x1e4/0x344 [c0000000007a3f90] c0000000000084c8 .start_here_common+0x54/0x8c 0:mon> 0:mon> r 0:mon> di c00000000007d518 c00000000007d518 7c0802a6 mflr r0 c00000000007d51c fbc1fff0 std r30,-16(r1) c00000000007d520 fbe1fff8 std r31,-8(r1) c00000000007d524 ebc2d708 ld r30,-10488(r2) c00000000007d528 7c7f1b78 mr r31,r3 c00000000007d52c f8010010 std r0,16(r1) c00000000007d530 f821ff81 stdu r1,-128(r1) c00000000007d534 e93e8008 ld r9,-32760(r30) r9 = toc ptr debug_locks c00000000007d538 80090000 lwz r0,0(r9) c00000000007d53c 2f800000 cmpwi cr7,r0,0 c00000000007d540 419e00d8 beq cr7,c00000000007d618 # .debug_mutex_unlock+0x100/0x118 if (unlikely(!debug_locks)) return; c00000000007d544 e8030030 ld r0,48(r3) r0 = lock-> owner == NULL c00000000007d548 78290464 rldicr r9,r1,0,49 r9 = bottom 14 bits lopped off of stack pointer, giving current ptr c00000000007d54c 7fa04800 cmpd cr7,r0,r9 c00000000007d550 41be0028 beq cr7,c00000000007d578 # .debug_mutex_unlock+0x60/0x118 c00000000007d554 e93e8000 ld r9,-32768(r30) c00000000007d558 80090000 lwz r0,0(r9) c00000000007d55c 2f800000 cmpwi cr7,r0,0 c00000000007d560 409e0014 bne cr7,c00000000007d574 # .debug_mutex_unlock+0x5c/0x118 c00000000007d564 481b1f4d bl c00000000022f4b0 # .debug_locks_off+0x0/0x64 c00000000007d568 60000000 nop c00000000007d56c 2fa30000 cmpdi cr7,r3,0 c00000000007d570 419e0008 beq cr7,c00000000007d578 # .debug_mutex_unlock+0x60/0x118 c00000000007d574 0fe00000 twi 31,r0,0 crash here void debug_mutex_unlock(struct mutex *lock) { if (unlikely(!debug_locks)) return; DEBUG_LOCKS_WARN_ON(lock->owner != current_thread_info()); DEBUG_LOCKS_WARN_ON(lock->magic != lock); DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next); DEBUG_LOCKS_WARN_ON(lock->owner != current_thread_info()); } r3 is struct mutex, which is 0:mon> d c0000000006c12c0 c0000000006c12c0 0000000100000000 80000000dead4ead |..............N.| c0000000006c12d0 ffffffff00000000 ffffffffffffffff |................| c0000000006c12e0 c0000000006c12e0 c0000000006c12e0 |.....l.......l..| c0000000006c12f0 0000000000000000 0000000000000000 |................| c0000000006c1300 c0000000006c12c0 0000000000000000 |.....l..........| c0000000006c1310 00000000dead4ead ffffffff00000000 |......N.........| c0000000006c1320 ffffffffffffffff c0000000006c1328 |.............l.(| c0000000006c1330 c0000000006c1328 0000000100000000 |.....l.(........| struct mutex { /* 1: unlocked, 0: locked, negative: locked, possible waiters */ atomic_t count; // 1 spinlock_t wait_lock; // 80000000dead4ead ffffffff00000000 ffffffffffffffff struct list_head wait_list; // c0000000006c12e0 c0000000006c12e0 #ifdef CONFIG_DEBUG_MUTEXES struct thread_info *owner; // 0000000000000000 const char *name; // 0000000000000000 void *magic; // c0000000006c12c0 #endif #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; #endif }; typedef struct { raw_spinlock_t raw_lock; // 80000000 #if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) // # CONFIG_PREEMPT is not set unsigned int break_lock; #endif #ifdef CONFIG_DEBUG_SPINLOCK unsigned int magic, owner_cpu; // dead4ead ffffffff void *owner; // ffffffffffffffff #endif #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; #endif } spinlock_t; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Mutex debug lock failure [was Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 23:03 ` Mutex debug lock failure [was " Linas Vepstas @ 2006-12-20 23:09 ` Benjamin Herrenschmidt 2006-12-20 23:46 ` Linas Vepstas 0 siblings, 1 reply; 14+ messages in thread From: Benjamin Herrenschmidt @ 2006-12-20 23:09 UTC (permalink / raw) To: Linas Vepstas; +Cc: Ingo Molnar, linux-kernel, linuxppc-dev > This also doesn't explain the bizarre form of the failure on > power4 ... does power4 mishandle twi somehow? Will investigate > next. Or you hit it before you have a console ? Ben. > --linas > > p.s. To recap: above report is for linux-2.6.20-rc1-git6 > > I get the following on power5: > > 97742.318323] A-B-B-C-C-D-D-A deadlock:failed|failed| ok |failed|failed|failed| > [97742.318352] A-B-C-D-B-D-D-A deadlock:failed|failed| ok |failed|failed|failed| > [97742.318380] A-B-C-D-B-C-D-A deadlock:failed|failed| ok |failed|failed|failed| > [97742.318408] double unlock: ok | ok |failed|<0>------------[ cut here ]------------ > [97742.318440] Kernel BUG at c00000000007d574 [verbose debug info > unavailable] > cpu 0x0: Vector: 700 (Program Check) at [c0000000007a3980] > pc: c00000000007d574: .debug_mutex_unlock+0x5c/0x118 > lr: c000000000468068: .__mutex_unlock_slowpath+0x104/0x198 > sp: c0000000007a3c00 > msr: 8000000000029032 > current = 0xc000000000663690 > paca = 0xc000000000663f80 > pid = 0, comm = swapper > enter ? for help > [c0000000007a3c80] c000000000468068 .__mutex_unlock_slowpath+0x104/0x198 > [c0000000007a3d20] c000000000231da8 .double_unlock_mutex+0x3c/0x58 > [c0000000007a3db0] c00000000023b47c .dotest+0x5c/0x370 > [c0000000007a3e50] c00000000023bc0c .locking_selftest+0x47c/0x17fc > [c0000000007a3ef0] c0000000005f06ec .start_kernel+0x1e4/0x344 > [c0000000007a3f90] c0000000000084c8 .start_here_common+0x54/0x8c > 0:mon> > 0:mon> r > > 0:mon> di c00000000007d518 > c00000000007d518 7c0802a6 mflr r0 > c00000000007d51c fbc1fff0 std r30,-16(r1) > c00000000007d520 fbe1fff8 std r31,-8(r1) > c00000000007d524 ebc2d708 ld r30,-10488(r2) > c00000000007d528 7c7f1b78 mr r31,r3 > c00000000007d52c f8010010 std r0,16(r1) > c00000000007d530 f821ff81 stdu r1,-128(r1) > c00000000007d534 e93e8008 ld r9,-32760(r30) > r9 = toc ptr debug_locks > c00000000007d538 80090000 lwz r0,0(r9) > c00000000007d53c 2f800000 cmpwi cr7,r0,0 > c00000000007d540 419e00d8 beq cr7,c00000000007d618 # .debug_mutex_unlock+0x100/0x118 > if (unlikely(!debug_locks)) return; > > c00000000007d544 e8030030 ld r0,48(r3) > r0 = lock-> owner == NULL > c00000000007d548 78290464 rldicr r9,r1,0,49 > r9 = bottom 14 bits lopped off of stack pointer, giving current ptr > > c00000000007d54c 7fa04800 cmpd cr7,r0,r9 > c00000000007d550 41be0028 beq cr7,c00000000007d578 # .debug_mutex_unlock+0x60/0x118 > c00000000007d554 e93e8000 ld r9,-32768(r30) > c00000000007d558 80090000 lwz r0,0(r9) > c00000000007d55c 2f800000 cmpwi cr7,r0,0 > c00000000007d560 409e0014 bne cr7,c00000000007d574 # .debug_mutex_unlock+0x5c/0x118 > c00000000007d564 481b1f4d bl c00000000022f4b0 # .debug_locks_off+0x0/0x64 > c00000000007d568 60000000 nop > c00000000007d56c 2fa30000 cmpdi cr7,r3,0 > c00000000007d570 419e0008 beq cr7,c00000000007d578 # .debug_mutex_unlock+0x60/0x118 > c00000000007d574 0fe00000 twi 31,r0,0 > > crash here > > void debug_mutex_unlock(struct mutex *lock) > { > if (unlikely(!debug_locks)) > return; > > DEBUG_LOCKS_WARN_ON(lock->owner != current_thread_info()); > DEBUG_LOCKS_WARN_ON(lock->magic != lock); > DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next); > DEBUG_LOCKS_WARN_ON(lock->owner != current_thread_info()); > } > > r3 is struct mutex, which is > > 0:mon> d c0000000006c12c0 > c0000000006c12c0 0000000100000000 80000000dead4ead |..............N.| > c0000000006c12d0 ffffffff00000000 ffffffffffffffff |................| > c0000000006c12e0 c0000000006c12e0 c0000000006c12e0 |.....l.......l..| > c0000000006c12f0 0000000000000000 0000000000000000 |................| > c0000000006c1300 c0000000006c12c0 0000000000000000 |.....l..........| > c0000000006c1310 00000000dead4ead ffffffff00000000 |......N.........| > c0000000006c1320 ffffffffffffffff c0000000006c1328 |.............l.(| > c0000000006c1330 c0000000006c1328 0000000100000000 |.....l.(........| > > struct mutex { > /* 1: unlocked, 0: locked, negative: locked, possible waiters */ > atomic_t count; // 1 > spinlock_t wait_lock; // 80000000dead4ead ffffffff00000000 ffffffffffffffff > struct list_head wait_list; // c0000000006c12e0 c0000000006c12e0 > #ifdef CONFIG_DEBUG_MUTEXES > struct thread_info *owner; // 0000000000000000 > const char *name; // 0000000000000000 > void *magic; // c0000000006c12c0 > #endif > #ifdef CONFIG_DEBUG_LOCK_ALLOC > struct lockdep_map dep_map; > #endif > }; > > typedef struct { > raw_spinlock_t raw_lock; // 80000000 > #if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) // # CONFIG_PREEMPT > is not set > unsigned int break_lock; > #endif > #ifdef CONFIG_DEBUG_SPINLOCK > unsigned int magic, owner_cpu; // dead4ead ffffffff > void *owner; // ffffffffffffffff > #endif > #ifdef CONFIG_DEBUG_LOCK_ALLOC > struct lockdep_map dep_map; > #endif > } spinlock_t; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Mutex debug lock failure [was Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 23:09 ` Benjamin Herrenschmidt @ 2006-12-20 23:46 ` Linas Vepstas 2006-12-21 0:36 ` Anton Blanchard 0 siblings, 1 reply; 14+ messages in thread From: Linas Vepstas @ 2006-12-20 23:46 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Ingo Molnar, linux-kernel, linuxppc-dev On Thu, Dec 21, 2006 at 10:09:55AM +1100, Benjamin Herrenschmidt wrote: > > > This also doesn't explain the bizarre form of the failure on > > power4 ... does power4 mishandle twi somehow? Will investigate > > next. > > Or you hit it before you have a console ? Ahh... I was about to claim I have a console, but on closer examination: someone has been messing with consoles? I don't get it ... The power4 has ony a serial connection. the power5 has a graphics card in it, unused. The power5 has "Console: colour dummy device 80x25" kick in, at which point the time goes crazy as well. There may be multiple bugs here. The weird crash for 2.6.19-git7 on power4 looks like this: Looking for displays opening PHB /pci@400000000110... done opening PHB /pci@400000000111... done instantiating rtas at 0x000000002fd10000 ... done 0000000000000000 : boot cpu 0000000000000000 copying OF device tree ... Building dt strings... Building dt structure... Device tree strings 0x00000000043a0000 -> 0x00000000043a135c Device tree struct 0x00000000043b0000 -> 0x00000000043c0000 Calling quiesce ... returning from prom_init [ 0.000000] Starting Linux PPC64 #0 SMP Tue Dec 5 17:20:17 CST 2006 [ 0.000000] ----------------------------------------------------- [ 0.000000] ppc64_pft_size = 0x0 [ 0.000000] physicalMemorySize = 0x400000000 [ 0.000000] ppc64_caches.dcache_line_size = 0x80 [ 0.000000] ppc64_caches.icache_line_size = 0x80 [ 0.000000] htab_address = 0xc0000003e0000000 [ 0.000000] htab_hash_mask = 0x1fffff [ 0.000000] ----------------------------------------------------- [ 0.000000] Linux version 2.6.19-git7linas (root@venn) (gcc version 4.1.0 (SUSE Linux)) #0 SMP Tue Dec 5 17:20:17 CST 2006 [ 0.000000] [boot]0012 Setup Arch [ 0.000000] No ramdisk, default root is /dev/sda2 [ 0.000000] EEH: PCI Enhanced I/O Error Handling Enabled [ 0.000000] PPC64 nvram contains 81920 bytes [ 0.000000] Zone PFN ranges: [ 0.000000] DMA 0 -> 262144 [ 0.000000] Normal 262144 -> 262144 [ 0.000000] early_node_map[1] active PFN ranges [ 0.000000] 0: 0 -> 262144 [ 0.000000] [boot]0015 Setup Done [ 0.000000] Built 1 zonelists. Total pages: 261888 [ 0.000000] Kernel command line: root=/dev/sda3 console=ttyS0 selinux=0 elevator=cfq kdb=on [ 0.000000] [boot]0020 XICS Init [ 0.000000] i8259 legacy interrupt controller initialized [ 0.000000] [boot]0021 XICS Done [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes) System assert at: file: rtas_io_config.c -- line: 195 rio_hub_num: 10 drawer_num: 6 phb_num: 3 buid: 7 which should be compared to 2.6.20-rc1-git6, build with the same .config, on power4: Looking for displays opening PHB /pci@400000000110... done opening PHB /pci@400000000111... done instantiating rtas at 0x000000002fd10000 ... done 0000000000000000 : boot cpu 0000000000000000 copying OF device tree ... Building dt strings... Building dt structure... Device tree strings 0x0000000004380000 -> 0x000000000438135c Device tree struct 0x0000000004390000 -> 0x00000000043a0000 Calling quiesce ... returning from prom_init [ 0.000000] Starting Linux PPC64 #0 SMP Tue Dec 19 17:42:44 CST 2006 [ 0.000000] ----------------------------------------------------- [ 0.000000] ppc64_pft_size = 0x0 [ 0.000000] physicalMemorySize = 0x400000000 [ 0.000000] ppc64_caches.dcache_line_size = 0x80 [ 0.000000] ppc64_caches.icache_line_size = 0x80 [ 0.000000] htab_address = 0xc0000003e0000000 [ 0.000000] htab_hash_mask = 0x1fffff [ 0.000000] ----------------------------------------------------- [ 0.000000] Linux version 2.6.20-rc1-git6linas (root@venn) (gcc version 4.1.0 (SUSE Linux)) #0 SMP Tue Dec 19 17:42:44 CST 2006 [ 0.000000] [boot]0012 Setup Arch [ 0.000000] No ramdisk, default root is /dev/sda2 [ 0.000000] EEH: PCI Enhanced I/O Error Handling Enabled [ 0.000000] PPC64 nvram contains 81920 bytes [ 0.000000] Zone PFN ranges: [ 0.000000] DMA 0 -> 262144 [ 0.000000] Normal 262144 -> 262144 [ 0.000000] early_node_map[1] active PFN ranges [ 0.000000] 0: 0 -> 262144 [ 0.000000] [boot]0015 Setup Done [ 0.000000] Built 1 zonelists. Total pages: 261888 [ 0.000000] Kernel command line: root=/dev/sda3 console=ttyS0 selinux=0 elevator=cfq kdb=on [ 0.000000] [boot]0020 XICS Init [ 0.000000] i8259 legacy interrupt controller initialized [ 0.000000] [boot]0021 XICS Done [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes) cpu 0x0: Vector: 700 (Program Check) at [c0000000007a3980] pc: c00000000007d574: .debug_mutex_unlock+0x5c/0x118 lr: c000000000468068: .__mutex_unlock_slowpath+0x104/0x198 sp: c0000000007a3c00 msr: 9000000000029032 current = 0xc000000000663690 paca = 0xc000000000663f80 pid = 0, comm = swapper enter ? for help [c0000000007a3c80] c000000000468068 .__mutex_unlock_slowpath+0x104/0x198 [c0000000007a3d20] c000000000231da8 .double_unlock_mutex+0x3c/0x58 [c0000000007a3db0] c00000000023b47c .dotest+0x5c/0x370 [c0000000007a3e50] c00000000023bc0c .locking_selftest+0x47c/0x17fc [c0000000007a3ef0] c0000000005f06ec .start_kernel+0x1e4/0x344 [c0000000007a3f90] c0000000000084c8 .start_here_common+0x54/0x8c 0:mon> And, exactly the same kernel, on power5: Looking for displays found display : /pci@800000020000002/pci@2,2/pci@1/display@0, opening ... done instantiating rtas at 0x0000000007710000 ... done 0000000000000000 : boot cpu 0000000000000000 0000000000000002 : starting cpu hw idx 0000000000000002... done copying OF device tree ... Building dt strings... Building dt structure... Device tree strings 0x0000000002580000 -> 0x000000000258141e Device tree struct 0x0000000002590000 -> 0x00000000025a0000 [ 0.000000] Using pSeries machine description [ 0.000000] Partition configured for 4 cpus. [ 0.000000] Starting Linux PPC64 #0 SMP Tue Dec 19 17:42:44 CST 2006 [ 0.000000] ----------------------------------------------------- [ 0.000000] ppc64_pft_size = 0x19 [ 0.000000] physicalMemorySize = 0x72000000 [ 0.000000] ppc64_caches.dcache_line_size = 0x80 [ 0.000000] ppc64_caches.icache_line_size = 0x80 [ 0.000000] htab_address = 0x0000000000000000 [ 0.000000] htab_hash_mask = 0x3ffff [ 0.000000] ----------------------------------------------------- [ 0.000000] Linux version 2.6.20-rc1-git6linas (root@venn) (gcc version 4.1.0 (SUSE Linux)) #0 SMP Tue Dec 19 17:42:446[ 0.000000] [boot]0012 Setup Arch [ 0.000000] No ramdisk, default root is /dev/sda2 [ 0.000000] EEH: PCI Enhanced I/O Error Handling Enabled [ 0.000000] PPC64 nvram contains 7168 bytes [ 0.000000] Zone PFN ranges: [ 0.000000] DMA 0 -> 29184 [ 0.000000] Normal 29184 -> 29184 [ 0.000000] early_node_map[1] active PFN ranges [ 0.000000] 0: 0 -> 29184 [ 0.000000] [boot]0015 Setup Done [ 0.000000] Built 1 zonelists. Total pages: 29156 [ 0.000000] Kernel command line: root=/dev/sda9 console=hvsi0 selinux=0 elevator=cfq rtas_msgs=1 [ 0.000000] [boot]0020 XICS Init [ 0.000000] [boot]0021 XICS Done [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes) [97741.568069] Console: colour dummy device 80x25 [97742.318175] ------------------------ [97742.318182] | Locking API testsuite: [97742.318189] ---------------------------------------------------------------------------- [97742.318200] | spin |wlock |rlock |mutex | wsem | rsem | [97742.318211] -------------------------------------------------------------------------- [97742.318222] A-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318246] A-B-B-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318271] A-B-B-C-C-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318297] A-B-C-A-B-C deadlock:failed|failed| ok |failed|failed|failed| [97742.318323] A-B-B-C-C-D-D-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318352] A-B-C-D-B-D-D-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318380] A-B-C-D-B-C-D-A deadlock:failed|failed| ok |failed|failed|failed| [97742.318408] double unlock: ok | ok |failed|<0>------------[ cut here ]------------ [97742.318440] Kernel BUG at c00000000007d574 [verbose debug info unavailable] cpu 0x0: Vector: 700 (Program Check) at [c0000000007a3980] pc: c00000000007d574: .debug_mutex_unlock+0x5c/0x118 lr: c000000000468068: .__mutex_unlock_slowpath+0x104/0x198 sp: c0000000007a3c00 msr: 8000000000029032 current = 0xc000000000663690 paca = 0xc000000000663f80 pid = 0, comm = swapper enter ? for help [c0000000007a3c80] c000000000468068 .__mutex_unlock_slowpath+0x104/0x198 [c0000000007a3d20] c000000000231da8 .double_unlock_mutex+0x3c/0x58 [c0000000007a3db0] c00000000023b47c .dotest+0x5c/0x370 [c0000000007a3e50] c00000000023bc0c .locking_selftest+0x47c/0x17fc [c0000000007a3ef0] c0000000005f06ec .start_kernel+0x1e4/0x344 [c0000000007a3f90] c0000000000084c8 .start_here_common+0x54/0x8c 0:mon> r ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Mutex debug lock failure [was Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 23:46 ` Linas Vepstas @ 2006-12-21 0:36 ` Anton Blanchard 2006-12-21 1:03 ` Linas Vepstas 0 siblings, 1 reply; 14+ messages in thread From: Anton Blanchard @ 2006-12-21 0:36 UTC (permalink / raw) To: Linas Vepstas; +Cc: Ingo Molnar, linux-kernel, linuxppc-dev On Wed, Dec 20, 2006 at 05:46:47PM -0600, Linas Vepstas wrote: > System assert at: file: rtas_io_config.c -- line: 195 > rio_hub_num: 10 > drawer_num: 6 > phb_num: 3 > buid: 7 Looks like a firmware assert. Did you pass in something dodgy to a config read/write op? Maybe a bad buid? Anton ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Mutex debug lock failure [was Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-21 0:36 ` Anton Blanchard @ 2006-12-21 1:03 ` Linas Vepstas 2006-12-21 14:41 ` Ingo Molnar 0 siblings, 1 reply; 14+ messages in thread From: Linas Vepstas @ 2006-12-21 1:03 UTC (permalink / raw) To: Anton Blanchard; +Cc: Ingo Molnar, linux-kernel, linuxppc-dev On Thu, Dec 21, 2006 at 11:36:59AM +1100, Anton Blanchard wrote: > > On Wed, Dec 20, 2006 at 05:46:47PM -0600, Linas Vepstas wrote: > > > System assert at: file: rtas_io_config.c -- line: 195 > > rio_hub_num: 10 > > drawer_num: 6 > > phb_num: 3 > > buid: 7 > > Looks like a firmware assert. Did you pass in something dodgy to a > config read/write op? Maybe a bad buid? Same kernel runs fine on power5. Although it does have patches applied, those very same patches boot fine when applied to a slightly older kernel (2.6.19-rc4). I haven't been messing with buids or pci config space (at least not intentionaly). I'll try again with an unpatched, unmodified kernel. --linas ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Mutex debug lock failure [was Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-21 1:03 ` Linas Vepstas @ 2006-12-21 14:41 ` Ingo Molnar 2006-12-21 21:12 ` Linas Vepstas 0 siblings, 1 reply; 14+ messages in thread From: Ingo Molnar @ 2006-12-21 14:41 UTC (permalink / raw) To: Linas Vepstas; +Cc: linuxppc-dev, mingo, Anton Blanchard, linux-kernel On Wed, 2006-12-20 at 19:03 -0600, Linas Vepstas wrote: > Same kernel runs fine on power5. Although it does have patches > applied, those very same patches boot fine when applied to a slightly > older kernel (2.6.19-rc4). I haven't been messing with buids or > pci config space (at least not intentionaly). > > I'll try again with an unpatched, unmodified kernel. there have been a number of fixes to lockdep recently - could you try the kernel/lockdep.c file from latest -mm, does that fail too? one possibility would be a chain-hash collision. Ingo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Mutex debug lock failure [was Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-21 14:41 ` Ingo Molnar @ 2006-12-21 21:12 ` Linas Vepstas 0 siblings, 0 replies; 14+ messages in thread From: Linas Vepstas @ 2006-12-21 21:12 UTC (permalink / raw) To: Ingo Molnar; +Cc: linuxppc-dev, mingo, Anton Blanchard, linux-kernel On Thu, Dec 21, 2006 at 03:41:39PM +0100, Ingo Molnar wrote: > On Wed, 2006-12-20 at 19:03 -0600, Linas Vepstas wrote: > > Same kernel runs fine on power5. Although it does have patches > > applied, those very same patches boot fine when applied to a slightly > > older kernel (2.6.19-rc4). I haven't been messing with buids or > > pci config space (at least not intentionaly). > > > > I'll try again with an unpatched, unmodified kernel. > > there have been a number of fixes to lockdep recently - could you try > the kernel/lockdep.c file from latest -mm, does that fail too? > > one possibility would be a chain-hash collision. I see the same problem on linux-2.6.20-rc1-mm1 The patch below fixes this, although I don't understand why this has become an issue just now: Index: linux-2.6.20-rc1-mm1/kernel/mutex.c =================================================================== --- linux-2.6.20-rc1-mm1.orig/kernel/mutex.c 2006-12-19 16:19:34.000000000 -0600 +++ linux-2.6.20-rc1-mm1/kernel/mutex.c 2006-12-21 14:31:33.000000000 -0600 @@ -249,7 +249,7 @@ __mutex_unlock_common_slowpath(atomic_t wake_up_process(waiter->task); } - debug_mutex_clear_owner(lock); + // debug_mutex_clear_owner(lock); spin_unlock_mutex(&lock->wait_lock, flags); } It obvious that this is the proximal cause of the failure of the double_unlock_mutex() mutex self-test. However, both the double-unlock test, and this clear_owner() call, are in linux-2.6.19-git7, which doesn't fail this test. So I conclude that __mutex_unlock_common_slowpath() is never taken in 2.6.19 but is always taken on 2.6.20-rc1 (in particular, is taken during the double-unlock test). I don't know why that would be. It might be wise to add a test to make sure the slowpath is taken only when it should be taken? Its sort of scary to think that it might be always taken, and that no one notices the problem... I'm gonna be out until after Christmas. -- and so, Merry Christmas! --linas ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually 2006-12-20 21:19 ` Linas Vepstas 2006-12-20 21:28 ` Benjamin Herrenschmidt @ 2006-12-23 6:28 ` Alan Modra 1 sibling, 0 replies; 14+ messages in thread From: Alan Modra @ 2006-12-23 6:28 UTC (permalink / raw) To: Linas Vepstas; +Cc: linuxppc-dev On Wed, Dec 20, 2006 at 03:19:31PM -0600, Linas Vepstas wrote: > On Tue, Dec 19, 2006 at 07:46:50PM -0600, Peter Bergner wrote: > > On Tue, 2006-12-19 at 18:46 -0600, Linas Vepstas wrote: > > > Per xchat, here's the update. I'm guessing I'm using a broken > > > compiler, as per chain of evidence below ... > > [snip] > > > However, I also note that the following scrolled by: > > > init/main.c:81:2: warning: #warning gcc-4.1.0 is known to miscompile the > > > kernel. A different compiler version is recommended. > > > > It may be due to this GCC bug which Olaf ran into a while back: > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24644 > > > > You can verify whether you have a broken compiler by compiling > > the minimal test case I posted in comment #15. If you see r13 > > being copied into another register and then used, then you have > > a broken compiler. > > No, that's not it. I'd be surprised, as I was using the SuSE > SLES10 gcc-4.1.0-28.4.ppc.rpm compiler, which would have that fix. Hmm, this looks like at problem Paul forwarded on to me, originally reported by Hugh Dickins <hugh@veritas.com>. In his email, Hugh said: > I spent too long looking in the wrong direction (head_64.S and entry_64.S), > then noticed this in generic_file_aio_read from "objdump -rd mm/filemap.o": > 3b54: 7d a5 6b 78 mr r5,r13 > 3b58: 38 c0 00 00 li r6,0 > 3b5c: 7c 09 03 a6 mtctr r0 > 3b60: 38 e0 00 00 li r7,0 > 3b64: 39 00 00 00 li r8,0 > 3b68: eb a3 00 20 ld r29,32(r3) > 3b6c: 48 00 00 48 b 3bb4 <.generic_file_aio_read+0xa4> > 3b70: e9 49 00 08 ld r10,8(r9) > 3b74: 7c e7 52 14 add r7,r7,r10 > 3b78: 7c e9 53 79 or. r9,r7,r10 > 3b7c: 41 c0 01 88 blt- 3d04 <.generic_file_aio_read+0x1f4> > 3b80: e9 25 01 a0 ld r9,416(r5) > > So, if the task is preempted and rescheduled on a different cpu in between > the first and the last line, r5 will be looking at a different paca_struct > from the one we're now on, and pick up the wrong __current. (Well, there's > a branch in the middle there, which then branches back: so the flow isn't > quite as I've shown, but the effect is the same.) > > That's compiled on SuSE 10.1, gcc 4.1.0-25 (with CONFIG_CC_OPTIMIZE_FOR_SIZE, > but I've since checked that the same kind of thing happens without). In most > places it does use the expected 416(r13) for current, but occasionally via an > intermediate register as here: why it should choose to do it that way I don't > know, but assume it's some subtle and legitimate optimization. It looks as > if YDL 4.1's older gcc 3.4.4-2 does not do it that way. I don't know if SuSE's 4.1.0-25 has the PR24644 fix, or whether that fix cures the mm/filemap.c problem. I do know that a 4.1.2 20061121 compiler I happened to have lying around made copies of r13 on 2.6.17 mm/filemap.c, even with local_paca made volatile. The following workaround allowed me to compile a kernel without any silly r13 copies. #define get_paca() ({__asm__ __volatile__ ("#paca %0" : "=r" (local_paca)); local_paca;}) The asm tells gcc that local_paca is changed in some unspecified way just before each access. Explicitly making r13 volatile like this should avoid the fuzzy gcc semantics of volatile global register variables. -- Alan Modra IBM OzLabs - Linux Technology Centre ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2006-12-23 6:28 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-12-20 0:46 Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually Linas Vepstas 2006-12-20 0:53 ` Benjamin Herrenschmidt 2006-12-20 1:02 ` Linas Vepstas 2006-12-20 1:46 ` Peter Bergner 2006-12-20 21:19 ` Linas Vepstas 2006-12-20 21:28 ` Benjamin Herrenschmidt 2006-12-20 23:03 ` Mutex debug lock failure [was " Linas Vepstas 2006-12-20 23:09 ` Benjamin Herrenschmidt 2006-12-20 23:46 ` Linas Vepstas 2006-12-21 0:36 ` Anton Blanchard 2006-12-21 1:03 ` Linas Vepstas 2006-12-21 14:41 ` Ingo Molnar 2006-12-21 21:12 ` Linas Vepstas 2006-12-23 6:28 ` Alan Modra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).