* pseries softreset on cpus in 32bit mode @ 2006-05-22 16:41 Olaf Hering 2006-05-22 18:46 ` Olaf Hering ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Olaf Hering @ 2006-05-22 16:41 UTC (permalink / raw) To: linuxppc-dev

Consider a simple app like this, which is placed as '/init' in an initrd
cpio archive:

hello32.c

#include <stdio.h>
int main(void) {
        printf("foobar\n");
        asm("li 31,0; b .\n");
        return 0;
}

It will keep one cpu busy, and in 32bit mode.
If a soft-reset is triggered, this cpu remains in 32bit mode (I think)
when system_reset_fwnmi() is invoked. Then bad_stack is called via
STD_EXCEPTION_COMMON() and EXCEPTION_PROLOG_COMMON() because the 32bit
stack pointer is > 0 and the cpu was in usermode.
Finally panic is called, which doesn't make much sense in this context.
machine_check_fwnmi likely has the same issue.
One bug is that something trashes regs->nip; it ends up as 0x3200 or similar.

I'm not really sure what is supposed to happen. Clearly a softreset
should not panic with a bad stack pointer.

This is on a JS20, but a large p550 dies the same way.

Linux version 2.6.17-rc4-g353b28ba (olaf@pomegranate) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Mon May 22 18:37:06 CEST 2006 [boot]0012 Setup Arch Top of RAM: 0x1e000000, Total RAM: 0x1e000000 Memory hole size: 0MB PPC64 nvram contains 16384 bytes Using default idle loop [boot]0015 Setup Done Built 1 zonelists Kernel command line: root=/dev/hda2 xmon=on quiet panic=1 foobar Bad kernel stack pointer ffa57ac0 at 3200 Oops: Bad kernel stack pointer, sig: 6 [#1] SMP NR_CPUS=128 NUMA Modules linked in: NIP: 0000000000003200 LR: 0000000010000338 CTR: 0000000000032DDC REGS: c000000007a5ed40 TRAP: c000000007a5ef10 Not tainted (2.6.17-rc4-g353b28ba) MSR: 0000000040001032 <ME,IR,DR> CR: 20000042 XER: 200FFFFF TASK = c00000001dfdb7e0[1] 'init' THREAD: c00000000ffcc000 CPU: 1 GPR00: 0000000010000338 00000000FFA57AC0 000000001009B470 0000000007ACEFF8 GPR04: 000000001002487C 0000000040000042 0000000000004000 000000001000B0E0 GPR08: 000000000000F932 0000000000000000 0000000000000000 0000000000000000 GPR12: 00000000200FFFFF C00000000052D100 C000000000442820 4000000002010000 GPR16: C000000000440ED8 0000000000000000 00000000000413DB 00000000004FA998 GPR20: 000000000250AC08 00000000004FAC08 000000000183FE00 00000000004420C0 GPR24: 000000000052CF00 0000000010000C70 0000000010000BF0 0000000000000000 GPR28: 0000000000000000 0000000010090000 00000000005123D8 0000000000000000 NIP [0000000000003200] 0x3200 LR [0000000010000338] 0x10000338 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ^ permalink raw reply [flat|nested] 11+ messages in thread
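The stack check that leads to bad_stack here can be sketched in C. This is only an illustration of the "cmpdi cr1,r1,0; bge- cr1,bad_stack" test in EXCEPTION_PROLOG_COMMON, not kernel code: the function name is hypothetical, and the sketch ignores the INT_FRAME_SIZE subtraction that precedes the compare. The compare treats r1 as a signed 64bit value, so any pointer without the top bit set (such as a 32bit user stack pointer) is rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper modelling "cmpdi cr1,r1,0; bge- cr1,bad_stack".
 * A signed 64bit value is >= 0 exactly when bit 63 is clear, so the
 * test is written on the sign bit to stay fully defined in C.
 */
bool stack_pointer_is_rejected(uint64_t r1)
{
        return (r1 >> 63) == 0;
}
```

With the values from the oops above, the user stack pointer 0x00000000ffa57ac0 is rejected, while a kernel address such as the REGS pointer c000000007a5ed40 would pass.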
* Re: pseries softreset on cpus in 32bit mode 2006-05-22 16:41 pseries softreset on cpus in 32bit mode Olaf Hering @ 2006-05-22 18:46 ` Olaf Hering 2006-05-23 13:07 ` [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware Olaf Hering 2006-07-19 8:34 ` [PATCH] force 64bit mode in fwnmi handlers to workaround firmware bugs Olaf Hering 2 siblings, 0 replies; 11+ messages in thread From: Olaf Hering @ 2006-05-22 18:46 UTC (permalink / raw) To: linuxppc-dev On Mon, May 22, Olaf Hering wrote: > > Consider a simple app like this, which is placed as '/init' in an initrd > cpio archive: > > hello32.c > > #include <stdio.h> > int main(void) { > printf("foobar\n"); > asm("li 31,0; b .\n"); > return 0; > } Modified the userland app to init some registers with a fixed value, and ran a kernel with the debug patch below. It gets into bad_stack from decrementer_common = c000000000003400. 3200 is coming from c000000000003200 <system_reset_common>: So what does that mean? Should a softreset disable interrupts? Linux version 2.6.17-rc4-g353b28ba-dirty (olaf@pomegranate) (gcc version 4.1.0 (SUSE Linux)) #13 SMP Mon May 22 20:38:56 CEST 2006 [boot]0012 Setup Arch Top of RAM: 0x1e000000, Total RAM: 0x1e000000 Memory hole size: 0MB PPC64 nvram contains 16384 bytes Using default idle loop [boot]0015 Setup Done Built 1 zonelists Kernel command line: root=/dev/hda2 xmon=on quiet foobar Bad kernel stack pointer ffad3ad0 at 3200 cpu 0x1: Vector: c000000007a5ef10 at [c000000007a5ed40] pc: 0000000000003200 lr: 0000000010000338 sp: ffad3ad0 msr: 40001032 current = 0xc00000000ffc67e0 paca = 0xc00000000053a100 pid = 1, comm = init enter ? 
for help 1:mon> r R00 = 0000000010000338 R16 = c0000000004470e8 R01 = 00000000ffad3ad0 R17 = 0000000000000000 R02 = 000000001009c470 R18 = 00000000000413cd R03 = 0000000007aceff8 R19 = 0000000000507ab8 R04 = 000000001002489c R20 = 0000000000000042 R05 = 0000000040000042 R21 = 0000000000000042 R06 = 0000000000004000 R22 = 0000000000000042 R07 = 000000001000b100 R23 = 0000000000000042 R08 = 000000000000f932 R24 = 0000000000000042 R09 = 0000000000000000 R25 = 0000000000000042 R10 = 0000000000000000 R26 = 0000000000000042 R11 = 0000000000000000 R27 = 0000000000000042 R12 = 00000000200fffff R28 = 0000000000000042 R13 = c00000000053a100 R29 = 0000000000003420 R14 = c000000000448a38 R30 = 0000000000003200 R15 = 4000000002010000 R31 = 0000000010000368 pc = 0000000000003200 lr = 0000000010000338 msr = 0000000040001032 cr = 20000042 ctr = 0000000000032ddc xer = 00000000200fffff trap = c000000007a5ef10 1:mon> Index: linux-2.6/arch/powerpc/kernel/head_64.S =================================================================== --- linux-2.6.orig/arch/powerpc/kernel/head_64.S +++ linux-2.6/arch/powerpc/kernel/head_64.S @@ -269,7 +269,12 @@ exception_marker: subi r1,r1,INT_FRAME_SIZE; /* alloc frame on kernel stack */ \ beq- 1f; \ ld r1,PACAKSAVE(r13); /* kernel stack to use */ \ -1: cmpdi cr1,r1,0; /* check if r1 is in userspace */ \ +1: \ + cmpdi cr1,r29,0x42; \ + bne cr1,2f; \ + li r29,2f@l; \ +2: \ + cmpdi cr1,r1,0; /* check if r1 is in userspace */ \ bge- cr1,bad_stack; /* abort if it is */ \ std r9,_CCR(r1); /* save CR in stackframe */ \ std r11,_NIP(r1); /* save SRR0 in stackframe */ \ @@ -600,6 +605,7 @@ slb_miss_user_pseries: system_reset_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ + mfspr r31,SPRN_SRR0 EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common) .globl machine_check_fwnmi @@ -842,6 +848,7 @@ bad_stack: std r9,_CCR(r1) std r10,GPR1(r1) std r11,_NIP(r1) + mr r30,r11 std r12,_MSR(r1) mfspr r11,SPRN_DAR mfspr r12,SPRN_DSISR ^ permalink raw reply 
* [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware 2006-05-22 16:41 pseries softreset on cpus in 32bit mode Olaf Hering 2006-05-22 18:46 ` Olaf Hering @ 2006-05-23 13:07 ` Olaf Hering 2006-05-26 11:30 ` Paul Mackerras 2006-06-09 8:11 ` Paul Mackerras 2006-07-19 8:34 ` [PATCH] force 64bit mode in fwnmi handlers to workaround firmware bugs Olaf Hering 2 siblings, 2 replies; 11+ messages in thread From: Olaf Hering @ 2006-05-23 13:07 UTC (permalink / raw) To: linuxppc-dev

On Mon, May 22, Olaf Hering wrote:

> NIP [0000000000003200] 0x3200
> LR [0000000010000338] 0x10000338

The reason is that system_reset_fwnmi is called in 32bit mode. Forcing
64bit mode fixes the corrupt NIP for me. JS20 and p690 are affected;
p550 and JS21 seem to work.

According to this change for EXCEPTION_PROLOG_COMMON, I still get into
decrementer_common, but it's not fatal anymore because the cpu is now in
64bit mode and the stack is forced to PACAKSAVE(r13).

        subi    r1,r1,INT_FRAME_SIZE;   /* alloc frame on kernel stack */ \
        beq-    1f; \
        ld      r1,PACAKSAVE(r13);      /* kernel stack to use */ \
-1:     cmpdi   cr1,r1,0;               /* check if r1 is in userspace */ \
+1: \
+       cmpdi   cr1,r29,0x42; \
+       bne     cr1,2f; \
+       li      r29,2f@l; \
+2: \
+       cmpdi   cr1,r1,0;               /* check if r1 is in userspace */ \
        bge-    cr1,bad_stack;          /* abort if it is */ \
        std     r9,_CCR(r1);            /* save CR in stackframe */ \
        std     r11,_NIP(r1);           /* save SRR0 in stackframe */ \

Index: linux-2.6/arch/powerpc/kernel/head_64.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/head_64.S
+++ linux-2.6/arch/powerpc/kernel/head_64.S
@@ -211,6 +211,29 @@ exception_marker:
        ori     reg,reg,(label)@l;      /* virt addr of handler ...
*/ #endif +#define EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(area, label) \ + mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \ + std r9,area+EX_R9(r13); /* save r9 - r12 */ \ + std r10,area+EX_R10(r13); \ + std r11,area+EX_R11(r13); \ + std r12,area+EX_R12(r13); \ + mfspr r9,SPRN_SPRG1; \ + std r9,area+EX_R13(r13); \ + mfcr r9; \ + clrrdi r12,r13,32; /* get high part of &label */ \ + mfmsr r10; \ + li r11,5; /* MSR_SF_LG|MSR_ISF_LG */ \ + rldicr r11,r11,61,2; /* (5 << 61) */ \ + or r10,r10,r11; \ + mfspr r11,SPRN_SRR0; /* save SRR0 */ \ + LOAD_HANDLER(r12,label) \ + ori r10,r10,MSR_IR|MSR_DR|MSR_RI; \ + mtspr SPRN_SRR0,r12; \ + mfspr r12,SPRN_SRR1; /* and SRR1 */ \ + mtspr SPRN_SRR1,r10; \ + rfid; \ + b . /* prevent speculative execution */ + #define EXCEPTION_PROLOG_PSERIES(area, label) \ mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \ std r9,area+EX_R9(r13); /* save r9 - r12 */ \ @@ -600,14 +623,14 @@ slb_miss_user_pseries: system_reset_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ - EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common) + EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(PACA_EXGEN, system_reset_common) .globl machine_check_fwnmi .align 7 machine_check_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ - EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common) + EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(PACA_EXMC, machine_check_common) #ifdef CONFIG_PPC_ISERIES /*** ISeries-LPAR interrupt handlers ***/ ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware 2006-05-23 13:07 ` [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware Olaf Hering @ 2006-05-26 11:30 ` Paul Mackerras 2006-05-26 12:33 ` Olaf Hering 2006-05-27 11:34 ` Olaf Hering 2006-06-09 8:11 ` Paul Mackerras 1 sibling, 2 replies; 11+ messages in thread From: Paul Mackerras @ 2006-05-26 11:30 UTC (permalink / raw) To: Olaf Hering; +Cc: linuxppc-dev

Olaf Hering writes:

> According to this change for EXCEPTION_PROLOG_COMMON, I still get into
> decrementer_common, but it's not fatal anymore because the cpu is now in
> 64bit mode and the stack is forced to PACAKSAVE(r13).
>
>         subi    r1,r1,INT_FRAME_SIZE;   /* alloc frame on kernel stack */ \
>         beq-    1f; \
>         ld      r1,PACAKSAVE(r13);      /* kernel stack to use */ \
> -1:     cmpdi   cr1,r1,0;               /* check if r1 is in userspace */ \
> +1: \
> +       cmpdi   cr1,r29,0x42; \

Ummm, what's r29 supposed to have in it here?

> +       bne     cr1,2f; \
> +       li      r29,2f@l; \

And why are we setting it?

Does it look like the SRR0 and SRR1 values are correct when we get
this problem occurring? Is it just the MSR that is bogus?

Paul.

* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware 2006-05-26 11:30 ` Paul Mackerras @ 2006-05-26 12:33 ` Olaf Hering 2006-05-26 12:40 ` Paul Mackerras 2006-05-27 11:34 ` Olaf Hering 1 sibling, 1 reply; 11+ messages in thread From: Olaf Hering @ 2006-05-26 12:33 UTC (permalink / raw) To: Paul Mackerras; +Cc: linuxppc-dev

On Fri, May 26, Paul Mackerras wrote:

> Olaf Hering writes:
>
> > According to this change for EXCEPTION_PROLOG_COMMON, I still get into
> > decrementer_common, but it's not fatal anymore because the cpu is now in
> > 64bit mode and the stack is forced to PACAKSAVE(r13).
> >
> >         subi    r1,r1,INT_FRAME_SIZE;   /* alloc frame on kernel stack */ \
> >         beq-    1f; \
> >         ld      r1,PACAKSAVE(r13);      /* kernel stack to use */ \
> > -1:     cmpdi   cr1,r1,0;               /* check if r1 is in userspace */ \
> > +1: \
> > +       cmpdi   cr1,r29,0x42; \
>
> Ummm, what's r29 supposed to have in it here?

0x42, from the spinning hello32 app. I filled all regs from r14 to r31
with a fixed value, and used these regs for debugging in the reset
handler.

> > +       bne     cr1,2f; \
> > +       li      r29,2f@l; \
>
> And why are we setting it?

Just a debug thing to find out how I got into bad_stack. I expected
system_reset_common to call bad_stack, but it was decrementer_common.
No idea how that happened: MSR_EE is off and the timer interrupt has the
lowest priority, so it should not trigger.

> Does it look like the SRR0 and SRR1 values are correct when we get
> this problem occurring? Is it just the MSR that is bogus?

The MSR indicates 32bit mode: it is 0x1002 on entry and 0x1032 before
rfid. I haven't dumped the SRR1 content; SRR0 points to the hello32 'b .'
instruction.

It seems the JS20 and POWER4 firmware calls fwnmi on all cpus, while
JS21 (and probably POWER5) does it only on one cpu. The other cpus will
be stopped with an IPI, trap 501.
According to some firmware guys, the OS is supposed to force 64bit mode
on reset.

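To make the MSR values in this mail easier to read, here is a small decoding sketch in C. It is an illustration only: the function and table names are hypothetical, and the bit positions are assumptions taken from the ppc64 MSR definitions (SF = bit 63 and ISF = bit 61 match the "5 << 61" comment in the patch, and ME, IR, DR match the "<ME,IR,DR>" decoding of 0x40001032 in the first oops). Only the flags discussed in this thread are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct msr_flag {
        uint64_t mask;
        const char *name;
};

/* Subset of MSR bits relevant to this thread (assumed layout, see above). */
static const struct msr_flag msr_flags[] = {
        { 1ULL << 63, "SF"  },  /* 64bit mode */
        { 1ULL << 61, "ISF" },  /* 64bit mode on interrupt entry */
        { 0x8000,     "EE"  },  /* external interrupts enabled */
        { 0x4000,     "PR"  },  /* problem state (user mode) */
        { 0x1000,     "ME"  },  /* machine check enabled */
        { 0x0020,     "IR"  },  /* instruction relocation */
        { 0x0010,     "DR"  },  /* data relocation */
        { 0x0002,     "RI"  },  /* recoverable interrupt */
};

/* Hypothetical helper: is the cpu in 64bit mode according to this MSR? */
int msr_is_64bit(uint64_t msr)
{
        return (msr & (1ULL << 63)) != 0;
}

/* Write the names of the set flags, space separated, into buf. */
char *describe_msr(uint64_t msr, char *buf, size_t len)
{
        size_t used = 0;

        assert(buf != NULL && len > 0);
        buf[0] = '\0';
        for (size_t i = 0; i < sizeof(msr_flags) / sizeof(msr_flags[0]); i++) {
                if (!(msr & msr_flags[i].mask))
                        continue;
                int n = snprintf(buf + used, len - used, "%s%s",
                                 used ? " " : "", msr_flags[i].name);
                if (n < 0 || (size_t)n >= len - used)
                        break;  /* output truncated, stop here */
                used += (size_t)n;
        }
        return buf;
}
```

Under these assumptions, 0x1002 decodes to "ME RI" and 0x1032 to "ME IR DR RI"; neither has SF set, which is the 32bit mode described above.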
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware 2006-05-26 12:33 ` Olaf Hering @ 2006-05-26 12:40 ` Paul Mackerras 2006-05-26 12:48 ` Olaf Hering 0 siblings, 1 reply; 11+ messages in thread From: Paul Mackerras @ 2006-05-26 12:40 UTC (permalink / raw) To: Olaf Hering; +Cc: linuxppc-dev

Olaf Hering writes:

> According to some firmware guys, the OS is supposed to force 64bit mode
> on reset.

Who? I'll get a platform architect to drop on them from a great
height. :)

Paul.

* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware 2006-05-26 12:40 ` Paul Mackerras @ 2006-05-26 12:48 ` Olaf Hering 0 siblings, 0 replies; 11+ messages in thread From: Olaf Hering @ 2006-05-26 12:48 UTC (permalink / raw) To: Paul Mackerras; +Cc: linuxppc-dev

On Fri, May 26, Paul Mackerras wrote:

> Olaf Hering writes:
>
> > According to some firmware guys, the OS is supposed to force 64bit mode
> > on reset.
>
> Who? I'll get a platform architect to drop on them from a great
> height. :)

See LTC22581 for the full story.

* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware 2006-05-26 11:30 ` Paul Mackerras 2006-05-26 12:33 ` Olaf Hering @ 2006-05-27 11:34 ` Olaf Hering 1 sibling, 0 replies; 11+ messages in thread From: Olaf Hering @ 2006-05-27 11:34 UTC (permalink / raw) To: Paul Mackerras; +Cc: linuxppc-dev

On Fri, May 26, Paul Mackerras wrote:

> Olaf Hering writes:
>
> > According to this change for EXCEPTION_PROLOG_COMMON, I still get into
> > decrementer_common, but it's not fatal anymore because the cpu is now in
> > 64bit mode and the stack is forced to PACAKSAVE(r13).
> >
> >         subi    r1,r1,INT_FRAME_SIZE;   /* alloc frame on kernel stack */ \
> >         beq-    1f; \
> >         ld      r1,PACAKSAVE(r13);      /* kernel stack to use */ \
> > -1:     cmpdi   cr1,r1,0;               /* check if r1 is in userspace */ \
> > +1: \
> > +       cmpdi   cr1,r29,0x42; \
>
> Ummm, what's r29 supposed to have in it here?
>
> > +       bne     cr1,2f; \
> > +       li      r29,2f@l; \
>
> And why are we setting it?
>
> Does it look like the SRR0 and SRR1 values are correct when we get
> this problem occurring? Is it just the MSR that is bogus?

I looked into this again; my debug patch was wrong. I can't rely on
hello32 for the marker value: r29 has to be set in the system_reset path.
This is the register dump. It shows that the cpus are in 32bit mode, and
that system_reset_fwnmi calls system_reset_common correctly.

0:mon> r R00 = 000000001000036c R16 = 0000000000000042 R01 = 00000000ffe96ad0 R17 = 0000000000000042 R02 = 000000001009b470 R18 = 0000000000000042 R03 = 0000000000000009 R19 = 0000000000000042 R04 = 000000001002228c R20 = 0000000000000042 R05 = 0000000040042082 R21 = 0000000000000042 R06 = 0000000000004000 R22 = 0000000000000042 R07 = 0000000010008af0 R23 = 0000000000000042 R08 = 0000000000000000 R24 = 0000000000000042 R09 = 0000000000000000 R25 = 0000000000000042 R10 = 8000000000001032 R26 = 0000000000000042 R11 = 00000000ffe96a50 R27 = 0000000000000042 R12 = 0000000020000082 R28 = 0000000000000042 R13 = 000000001009a410 R29 = 0000000000003220 R14 = 0000000000000042 R30 = 0000000000001002 R15 = 0000000000000042 R31 = a000000000001032 pc = 00000000100003b4 lr = 000000001000036c msr = 000000000000d032 cr = 20000082 ctr = 0000000000032ddc xer = 00000000000fffff trap = 100 0:mon> c1 1:mon> r R00 = 000000001000036c R16 = 0000000000000042 R01 = 00000000ffe96ad0 R17 = 0000000000000042 R02 = 000000001009b470 R18 = 0000000000000042 R03 = 0000000000000007 R19 = 0000000000000042 R04 = 000000001002228c R20 = 0000000000000042 R05 = 0000000040022082 R21 = 0000000000000042 R06 = 0000000000004000 R22 = 0000000000000042 R07 = 0000000010008af0 R23 = 0000000000000042 R08 = 0000000000000000 R24 = 0000000000000042 R09 = 0000000000000000 R25 = 0000000000000042 R10 = 8000000000001032 R26 = 0000000000000042 R11 = 00000000ffe96a50 R27 = 0000000000000042 R12 = 0000000020000082 R28 = 0000000000000042 R13 = 000000001009a410 R29 = 0000000000003220 R14 = 0000000000000042 R30 = 0000000000001002 R15 = 0000000000000042 R31 = a000000000001032 pc = 00000000100003b4 lr = 000000001000036c msr = 000000000000d032 cr = 20000082 ctr = 0000000000032ddc xer = 00000000000fffff trap = 100 Index: linux-2.6/arch/powerpc/kernel/head_64.S =================================================================== --- linux-2.6.orig/arch/powerpc/kernel/head_64.S +++ linux-2.6/arch/powerpc/kernel/head_64.S @@ -211,6 
+211,31 @@ exception_marker: ori reg,reg,(label)@l; /* virt addr of handler ... */ #endif +#define EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(area, label) \ + mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \ + std r9,area+EX_R9(r13); /* save r9 - r12 */ \ + std r10,area+EX_R10(r13); \ + std r11,area+EX_R11(r13); \ + std r12,area+EX_R12(r13); \ + mfspr r9,SPRN_SPRG1; \ + std r9,area+EX_R13(r13); \ + mfcr r9; \ + clrrdi r12,r13,32; /* get high part of &label */ \ + mfmsr r10; \ + mr r30,r10; \ + li r11,5; /* MSR_SF_LG|MSR_ISF_LG */ \ + rldicr r11,r11,61,2; /* (5 << 61) */ \ + or r10,r10,r11; \ + mfspr r11,SPRN_SRR0; /* save SRR0 */ \ + LOAD_HANDLER(r12,label) \ + ori r10,r10,MSR_IR|MSR_DR|MSR_RI; \ + mtspr SPRN_SRR0,r12; \ + mfspr r12,SPRN_SRR1; /* and SRR1 */ \ + mr r31,r10; \ + mtspr SPRN_SRR1,r10; \ + rfid; \ + b . /* prevent speculative execution */ + #define EXCEPTION_PROLOG_PSERIES(area, label) \ mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \ std r9,area+EX_R9(r13); /* save r9 - r12 */ \ @@ -269,7 +294,12 @@ exception_marker: subi r1,r1,INT_FRAME_SIZE; /* alloc frame on kernel stack */ \ beq- 1f; \ ld r1,PACAKSAVE(r13); /* kernel stack to use */ \ -1: cmpdi cr1,r1,0; /* check if r1 is in userspace */ \ +1: \ + cmpdi cr1,r29,0x42; \ + bne cr1,2f; \ + li r29,2f@l; \ +2: ; \ + cmpdi cr1,r1,0; /* check if r1 is in userspace */ \ bge- cr1,bad_stack; /* abort if it is */ \ std r9,_CCR(r1); /* save CR in stackframe */ \ std r11,_NIP(r1); /* save SRR0 in stackframe */ \ @@ -600,14 +630,15 @@ slb_miss_user_pseries: system_reset_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ - EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common) + li r29,0x42 + EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(PACA_EXGEN, system_reset_common) .globl machine_check_fwnmi .align 7 machine_check_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ - EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common) + 
EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(PACA_EXMC, machine_check_common) #ifdef CONFIG_PPC_ISERIES /*** ISeries-LPAR interrupt handlers ***/ ^ permalink raw reply [flat|nested] 11+ messages in thread
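The three instructions that force 64bit mode in this version of the patch (li r11,5; rldicr r11,r11,61,2; or r10,r10,r11) can be checked with a few lines of C. This is a sketch, not kernel code: rotl64(), rldicr() and new_srr1() are hypothetical helper names, rldicr() models only the rotate-then-clear-right behaviour used here, and the value 0x32 for MSR_IR|MSR_DR|MSR_RI is an assumption consistent with the 0x1002 to 0x1032 change reported earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64bit value left by n bits (0 <= n < 64). */
uint64_t rotl64(uint64_t v, unsigned n)
{
        assert(n < 64);
        return n ? (v << n) | (v >> (64 - n)) : v;
}

/*
 * Model of "rldicr ra,rs,sh,me": rotate left by sh, then keep only
 * IBM bits 0..me, i.e. the me+1 most significant bits.
 */
uint64_t rldicr(uint64_t rs, unsigned sh, unsigned me)
{
        assert(me < 64);
        return rotl64(rs, sh) & (~0ULL << (63 - me));
}

/* New SRR1 value as built by the macro from the MSR read on entry. */
uint64_t new_srr1(uint64_t msr_on_entry)
{
        uint64_t r11 = rldicr(5, 61, 2);        /* 5 << 61: SF and ISF */
        uint64_t r10 = msr_on_entry | r11;      /* or r10,r10,r11 */

        return r10 | 0x32;      /* ori r10,r10,MSR_IR|MSR_DR|MSR_RI (assumed 0x32) */
}
```

For the entry MSR captured in R30 of the dump above (0x1002), this yields 0xa000000000001032, which is the value captured in R31.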
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware 2006-05-23 13:07 ` [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware Olaf Hering 2006-05-26 11:30 ` Paul Mackerras @ 2006-06-09 8:11 ` Paul Mackerras 2006-06-09 9:04 ` Olaf Hering 1 sibling, 1 reply; 11+ messages in thread From: Paul Mackerras @ 2006-06-09 8:11 UTC (permalink / raw) To: Olaf Hering; +Cc: linuxppc-dev

Olaf Hering writes:

> The reason is that system_reset_fwnmi is called in 32bit mode. Forcing
> 64bit mode fixes the corrupt NIP for me. JS20 and p690 are affected;
> p550 and JS21 seem to work.

What was the LTC bugzilla number for this again?

Paul.

* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware 2006-06-09 8:11 ` Paul Mackerras @ 2006-06-09 9:04 ` Olaf Hering 0 siblings, 0 replies; 11+ messages in thread From: Olaf Hering @ 2006-06-09 9:04 UTC (permalink / raw) To: Paul Mackerras; +Cc: linuxppc-dev

On Fri, Jun 09, Paul Mackerras wrote:

> Olaf Hering writes:
>
> > The reason is that system_reset_fwnmi is called in 32bit mode. Forcing
> > 64bit mode fixes the corrupt NIP for me. JS20 and p690 are affected;
> > p550 and JS21 seem to work.
>
> What was the LTC bugzilla number for this again?

LTC22581. The last comments there confirm that the firmware leaves the
cpu in 32bit mode.

* [PATCH] force 64bit mode in fwnmi handlers to workaround firmware bugs 2006-05-22 16:41 pseries softreset on cpus in 32bit mode Olaf Hering 2006-05-22 18:46 ` Olaf Hering 2006-05-23 13:07 ` [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware Olaf Hering @ 2006-07-19 8:34 ` Olaf Hering 2 siblings, 0 replies; 11+ messages in thread From: Olaf Hering @ 2006-07-19 8:34 UTC (permalink / raw) To: linuxppc-dev, Paul Mackerras

The firmware of POWER4 and JS20 systems does not switch the cpu to 64bit
mode when the registered system_reset and machine_check handlers get
called. If a 32bit process runs on that cpu at the time of the event,
the cpu remains in 32bit mode. xmon and kdump cannot deal with that; the
result is an error like 'Bad kernel stack pointer fff2aad0 at 3200'.
xmon just loses some register info, but booting the kdump kernel
usually fails.

Neither handler is a hot path. Duplicate the EXCEPTION_PROLOG_PSERIES
macro and add two instructions to switch to 64bit mode:

        li      r11,5;
        rldimi  r10,r11,61,0;

Signed-off-by: Olaf Hering <olh@suse.de>

---
 arch/powerpc/kernel/head_64.S | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

Index: linux-2.6.18-rc2/arch/powerpc/kernel/head_64.S
===================================================================
--- linux-2.6.18-rc2.orig/arch/powerpc/kernel/head_64.S
+++ linux-2.6.18-rc2/arch/powerpc/kernel/head_64.S
@@ -191,6 +191,37 @@ exception_marker:
        ori     reg,reg,(label)@l;      /* virt addr of handler ... */
 #endif
 
+/*
+ * Equal to EXCEPTION_PROLOG_PSERIES, except that it forces 64bit mode.
+ * The firmware calls the registered system_reset_fwnmi and
+ * machine_check_fwnmi handlers in 32bit mode if the cpu happens to run
+ * a 32bit application at the time of the event.
+ * This firmware bug is present on POWER4 and JS20.
+ */
+#define EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(area, label) \
+       mfspr   r13,SPRN_SPRG3;         /* get paca address into r13 */ \
+       std     r9,area+EX_R9(r13);     /* save r9 - r12 */ \
+       std     r10,area+EX_R10(r13); \
+       std     r11,area+EX_R11(r13); \
+       std     r12,area+EX_R12(r13); \
+       mfspr   r9,SPRN_SPRG1; \
+       std     r9,area+EX_R13(r13); \
+       mfcr    r9; \
+       clrrdi  r12,r13,32;             /* get high part of &label */ \
+       mfmsr   r10; \
+       /* force 64bit mode */ \
+       li      r11,5;                  /* MSR_SF_LG|MSR_ISF_LG */ \
+       rldimi  r10,r11,61,0;           /* insert into top 3 bits */ \
+       /* done 64bit mode */ \
+       mfspr   r11,SPRN_SRR0;          /* save SRR0 */ \
+       LOAD_HANDLER(r12,label) \
+       ori     r10,r10,MSR_IR|MSR_DR|MSR_RI; \
+       mtspr   SPRN_SRR0,r12; \
+       mfspr   r12,SPRN_SRR1;          /* and SRR1 */ \
+       mtspr   SPRN_SRR1,r10; \
+       rfid; \
+       b       .       /* prevent speculative execution */
+
 #define EXCEPTION_PROLOG_PSERIES(area, label) \
        mfspr   r13,SPRN_SPRG3;         /* get paca address into r13 */ \
        std     r9,area+EX_R9(r13);     /* save r9 - r12 */ \
@@ -604,14 +635,14 @@ slb_miss_user_pseries:
 system_reset_fwnmi:
        HMT_MEDIUM
        mtspr   SPRN_SPRG1,r13          /* save r13 */
-       EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common)
+       EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(PACA_EXGEN, system_reset_common)
 
        .globl machine_check_fwnmi
        .align 7
 machine_check_fwnmi:
        HMT_MEDIUM
        mtspr   SPRN_SPRG1,r13          /* save r13 */
-       EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
+       EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(PACA_EXMC, machine_check_common)
 
 #ifdef CONFIG_PPC_ISERIES
 /*** ISeries-LPAR interrupt handlers ***/

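The final patch replaces the rldicr/or pair with a single rldimi. A minimal C sketch of what "li r11,5; rldimi r10,r11,61,0" computes follows; the helper names are hypothetical, and rldimi() here only models the non-wrapping mask case (mb <= 63 - sh) that the patch uses. One visible difference from the earlier "or" variant is that the insert overwrites all of the top three MSR bits, so bit 62 is cleared as well.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64bit value left by n bits (0 <= n < 64). */
uint64_t rotl64(uint64_t v, unsigned n)
{
        assert(n < 64);
        return n ? (v << n) | (v >> (64 - n)) : v;
}

/*
 * Model of "rldimi ra,rs,sh,mb": rotate rs left by sh and insert the
 * result into ra under a mask covering IBM bits mb..63-sh. In LSB-0
 * terms the mask starts at bit sh and is (63 - sh - mb + 1) bits wide.
 */
uint64_t rldimi(uint64_t ra, uint64_t rs, unsigned sh, unsigned mb)
{
        assert(sh < 64 && mb <= 63 - sh);

        unsigned width = (63 - sh) - mb + 1;
        uint64_t field = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
        uint64_t mask = field << sh;

        return (rotl64(rs, sh) & mask) | (ra & ~mask);
}

/* MSR after "li r11,5; rldimi r10,r11,61,0", with r10 = MSR on entry. */
uint64_t force_64bit(uint64_t msr)
{
        return rldimi(msr, 5, 61, 0);
}
```

For example, force_64bit(0x1002) gives 0xa000000000001002, and a value that already had bit 62 set, such as 0x4000000000001002, gives the same result.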