* pseries softreset on cpus in 32bit mode
@ 2006-05-22 16:41 Olaf Hering
2006-05-22 18:46 ` Olaf Hering
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Olaf Hering @ 2006-05-22 16:41 UTC (permalink / raw)
To: linuxppc-dev
Consider a simple app like this, which is placed as '/init' in an initrd
cpio archive:
hello32.c

#include <stdio.h>
int main(void) {
	printf("foobar\n");
	/* load 0 into r31, then branch to self forever */
	asm("li 31,0; b .\n");
	return 0;
}
It will keep one cpu busy, and in 32bit mode. If a soft-reset is
triggered, this cpu remains in 32bit mode (I think) when
system_reset_fwnmi() is invoked. Then bad_stack is called via
STD_EXCEPTION_COMMON() and EXCEPTION_PROLOG_COMMON() because the 32bit
stack pointer is > 0 and the cpu was in usermode. Finally panic is
called, which doesn't make much sense in this context.
machine_check_fwnmi likely has the same issue.
One bug is that something trashes regs->nip; it ends up as 0x3200 or similar.
I'm not really sure what is supposed to happen. Clearly a softreset
should not panic with bad stack pointer.
This is on a JS20, but a large p550 dies the same way.
Linux version 2.6.17-rc4-g353b28ba (olaf@pomegranate) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Mon May 22 18:37:06 CEST 2006
[boot]0012 Setup Arch
Top of RAM: 0x1e000000, Total RAM: 0x1e000000
Memory hole size: 0MB
PPC64 nvram contains 16384 bytes
Using default idle loop
[boot]0015 Setup Done
Built 1 zonelists
Kernel command line: root=/dev/hda2 xmon=on quiet panic=1
foobar
Bad kernel stack pointer ffa57ac0 at 3200
Oops: Bad kernel stack pointer, sig: 6 [#1]
SMP NR_CPUS=128 NUMA
Modules linked in:
NIP: 0000000000003200 LR: 0000000010000338 CTR: 0000000000032DDC
REGS: c000000007a5ed40 TRAP: c000000007a5ef10 Not tainted (2.6.17-rc4-g353b28ba)
MSR: 0000000040001032 <ME,IR,DR> CR: 20000042 XER: 200FFFFF
TASK = c00000001dfdb7e0[1] 'init' THREAD: c00000000ffcc000 CPU: 1
GPR00: 0000000010000338 00000000FFA57AC0 000000001009B470 0000000007ACEFF8
GPR04: 000000001002487C 0000000040000042 0000000000004000 000000001000B0E0
GPR08: 000000000000F932 0000000000000000 0000000000000000 0000000000000000
GPR12: 00000000200FFFFF C00000000052D100 C000000000442820 4000000002010000
GPR16: C000000000440ED8 0000000000000000 00000000000413DB 00000000004FA998
GPR20: 000000000250AC08 00000000004FAC08 000000000183FE00 00000000004420C0
GPR24: 000000000052CF00 0000000010000C70 0000000010000BF0 0000000000000000
GPR28: 0000000000000000 0000000010090000 00000000005123D8 0000000000000000
NIP [0000000000003200] 0x3200
LR [0000000010000338] 0x10000338
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
* Re: pseries softreset on cpus in 32bit mode
2006-05-22 16:41 pseries softreset on cpus in 32bit mode Olaf Hering
@ 2006-05-22 18:46 ` Olaf Hering
2006-05-23 13:07 ` [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware Olaf Hering
2006-07-19 8:34 ` [PATCH] force 64bit mode in fwnmi handlers to workaround firmware bugs Olaf Hering
2 siblings, 0 replies; 11+ messages in thread
From: Olaf Hering @ 2006-05-22 18:46 UTC (permalink / raw)
To: linuxppc-dev
On Mon, May 22, Olaf Hering wrote:
>
> Consider a simple app like this, which is placed as '/init' in an initrd
> cpio archive:
>
> hello32.c
>
> #include <stdio.h>
> int main(void) {
> printf("foobar\n");
> asm("li 31,0; b .\n");
> return 0;
> }
I modified the userland app to initialize some registers with a fixed value,
and ran a kernel with the debug patch below. It gets into bad_stack from
decrementer_common = c000000000003400. The 3200 is coming from
c000000000003200 <system_reset_common>.
So what does that mean? Should a softreset disable interrupts?
Linux version 2.6.17-rc4-g353b28ba-dirty (olaf@pomegranate) (gcc version 4.1.0 (SUSE Linux)) #13 SMP Mon May 22 20:38:56 CEST 2006
[boot]0012 Setup Arch
Top of RAM: 0x1e000000, Total RAM: 0x1e000000
Memory hole size: 0MB
PPC64 nvram contains 16384 bytes
Using default idle loop
[boot]0015 Setup Done
Built 1 zonelists
Kernel command line: root=/dev/hda2 xmon=on quiet
foobar
Bad kernel stack pointer ffad3ad0 at 3200
cpu 0x1: Vector: c000000007a5ef10 at [c000000007a5ed40]
pc: 0000000000003200
lr: 0000000010000338
sp: ffad3ad0
msr: 40001032
current = 0xc00000000ffc67e0
paca = 0xc00000000053a100
pid = 1, comm = init
enter ? for help
1:mon> r
R00 = 0000000010000338 R16 = c0000000004470e8
R01 = 00000000ffad3ad0 R17 = 0000000000000000
R02 = 000000001009c470 R18 = 00000000000413cd
R03 = 0000000007aceff8 R19 = 0000000000507ab8
R04 = 000000001002489c R20 = 0000000000000042
R05 = 0000000040000042 R21 = 0000000000000042
R06 = 0000000000004000 R22 = 0000000000000042
R07 = 000000001000b100 R23 = 0000000000000042
R08 = 000000000000f932 R24 = 0000000000000042
R09 = 0000000000000000 R25 = 0000000000000042
R10 = 0000000000000000 R26 = 0000000000000042
R11 = 0000000000000000 R27 = 0000000000000042
R12 = 00000000200fffff R28 = 0000000000000042
R13 = c00000000053a100 R29 = 0000000000003420
R14 = c000000000448a38 R30 = 0000000000003200
R15 = 4000000002010000 R31 = 0000000010000368
pc = 0000000000003200
lr = 0000000010000338
msr = 0000000040001032 cr = 20000042
ctr = 0000000000032ddc xer = 00000000200fffff trap = c000000007a5ef10
1:mon>
Index: linux-2.6/arch/powerpc/kernel/head_64.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/head_64.S
+++ linux-2.6/arch/powerpc/kernel/head_64.S
@@ -269,7 +269,12 @@ exception_marker:
subi r1,r1,INT_FRAME_SIZE; /* alloc frame on kernel stack */ \
beq- 1f; \
ld r1,PACAKSAVE(r13); /* kernel stack to use */ \
-1: cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
+1: \
+ cmpdi cr1,r29,0x42; \
+ bne cr1,2f; \
+ li r29,2f@l; \
+2: \
+ cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
bge- cr1,bad_stack; /* abort if it is */ \
std r9,_CCR(r1); /* save CR in stackframe */ \
std r11,_NIP(r1); /* save SRR0 in stackframe */ \
@@ -600,6 +605,7 @@ slb_miss_user_pseries:
system_reset_fwnmi:
HMT_MEDIUM
mtspr SPRN_SPRG1,r13 /* save r13 */
+ mfspr r31,SPRN_SRR0
EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common)
.globl machine_check_fwnmi
@@ -842,6 +848,7 @@ bad_stack:
std r9,_CCR(r1)
std r10,GPR1(r1)
std r11,_NIP(r1)
+ mr r30,r11
std r12,_MSR(r1)
mfspr r11,SPRN_DAR
mfspr r12,SPRN_DSISR
* [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware
2006-05-22 16:41 pseries softreset on cpus in 32bit mode Olaf Hering
2006-05-22 18:46 ` Olaf Hering
@ 2006-05-23 13:07 ` Olaf Hering
2006-05-26 11:30 ` Paul Mackerras
2006-06-09 8:11 ` Paul Mackerras
2006-07-19 8:34 ` [PATCH] force 64bit mode in fwnmi handlers to workaround firmware bugs Olaf Hering
2 siblings, 2 replies; 11+ messages in thread
From: Olaf Hering @ 2006-05-23 13:07 UTC (permalink / raw)
To: linuxppc-dev
On Mon, May 22, Olaf Hering wrote:
> NIP [0000000000003200] 0x3200
> LR [0000000010000338] 0x10000338
The reason is that system_reset_fwnmi is called in 32bit mode. Forcing
64bit mode fixes the corrupt NIP for me. JS20 and p690 are affected,
seems to work on p550 and JS21.
According to this change for EXCEPTION_PROLOG_COMMON, I still get into
decrementer_common, but it's not fatal anymore because the cpu is now in
64bit mode and the stack is forced to PACAKSAVE(r13).
subi r1,r1,INT_FRAME_SIZE; /* alloc frame on kernel stack */ \
beq- 1f; \
ld r1,PACAKSAVE(r13); /* kernel stack to use */ \
-1: cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
+1: \
+ cmpdi cr1,r29,0x42; \
+ bne cr1,2f; \
+ li r29,2f@l; \
+2: \
+ cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
bge- cr1,bad_stack; /* abort if it is */ \
std r9,_CCR(r1); /* save CR in stackframe */ \
std r11,_NIP(r1); /* save SRR0 in stackframe */ \
Index: linux-2.6/arch/powerpc/kernel/head_64.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/head_64.S
+++ linux-2.6/arch/powerpc/kernel/head_64.S
@@ -211,6 +211,29 @@ exception_marker:
ori reg,reg,(label)@l; /* virt addr of handler ... */
#endif
+#define EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(area, label) \
+ mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \
+ std r9,area+EX_R9(r13); /* save r9 - r12 */ \
+ std r10,area+EX_R10(r13); \
+ std r11,area+EX_R11(r13); \
+ std r12,area+EX_R12(r13); \
+ mfspr r9,SPRN_SPRG1; \
+ std r9,area+EX_R13(r13); \
+ mfcr r9; \
+ clrrdi r12,r13,32; /* get high part of &label */ \
+ mfmsr r10; \
+ li r11,5; /* MSR_SF_LG|MSR_ISF_LG */ \
+ rldicr r11,r11,61,2; /* (5 << 61) */ \
+ or r10,r10,r11; \
+ mfspr r11,SPRN_SRR0; /* save SRR0 */ \
+ LOAD_HANDLER(r12,label) \
+ ori r10,r10,MSR_IR|MSR_DR|MSR_RI; \
+ mtspr SPRN_SRR0,r12; \
+ mfspr r12,SPRN_SRR1; /* and SRR1 */ \
+ mtspr SPRN_SRR1,r10; \
+ rfid; \
+ b . /* prevent speculative execution */
+
#define EXCEPTION_PROLOG_PSERIES(area, label) \
mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \
std r9,area+EX_R9(r13); /* save r9 - r12 */ \
@@ -600,14 +623,14 @@ slb_miss_user_pseries:
system_reset_fwnmi:
HMT_MEDIUM
mtspr SPRN_SPRG1,r13 /* save r13 */
- EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common)
+ EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(PACA_EXGEN, system_reset_common)
.globl machine_check_fwnmi
.align 7
machine_check_fwnmi:
HMT_MEDIUM
mtspr SPRN_SPRG1,r13 /* save r13 */
- EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
+ EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(PACA_EXMC, machine_check_common)
#ifdef CONFIG_PPC_ISERIES
/*** ISeries-LPAR interrupt handlers ***/
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware
2006-05-23 13:07 ` [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware Olaf Hering
@ 2006-05-26 11:30 ` Paul Mackerras
2006-05-26 12:33 ` Olaf Hering
2006-05-27 11:34 ` Olaf Hering
2006-06-09 8:11 ` Paul Mackerras
1 sibling, 2 replies; 11+ messages in thread
From: Paul Mackerras @ 2006-05-26 11:30 UTC (permalink / raw)
To: Olaf Hering; +Cc: linuxppc-dev
Olaf Hering writes:
> According to this change for EXCEPTION_PROLOG_COMMON, I still get into
> decrementer_common, but it's not fatal anymore because the cpu is now in
> 64bit mode and the stack is forced to PACAKSAVE(r13).
>
> subi r1,r1,INT_FRAME_SIZE; /* alloc frame on kernel stack */ \
> beq- 1f; \
> ld r1,PACAKSAVE(r13); /* kernel stack to use */ \
> -1: cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
> +1: \
> + cmpdi cr1,r29,0x42; \
Ummm, what's r29 supposed to have in it here?
> + bne cr1,2f; \
> + li r29,2f@l; \
And why are we setting it?
Does it look like the SRR0 and SRR1 values are correct when we get
this problem occurring? Is it just the MSR that is bogus?
Paul.
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware
2006-05-26 11:30 ` Paul Mackerras
@ 2006-05-26 12:33 ` Olaf Hering
2006-05-26 12:40 ` Paul Mackerras
2006-05-27 11:34 ` Olaf Hering
1 sibling, 1 reply; 11+ messages in thread
From: Olaf Hering @ 2006-05-26 12:33 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
On Fri, May 26, Paul Mackerras wrote:
> Olaf Hering writes:
>
> > According to this change for EXCEPTION_PROLOG_COMMON, I still get into
> > decrementer_common, but it's not fatal anymore because the cpu is now in
> > 64bit mode and the stack is forced to PACAKSAVE(r13).
> >
> > subi r1,r1,INT_FRAME_SIZE; /* alloc frame on kernel stack */ \
> > beq- 1f; \
> > ld r1,PACAKSAVE(r13); /* kernel stack to use */ \
> > -1: cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
> > +1: \
> > + cmpdi cr1,r29,0x42; \
>
> Ummm, what's r29 supposed to have in it here?
0x42, from the spinning hello32 app. I filled all regs from r14 to r31
with a fixed value, and used these regs for debugging in the reset
handler.
> > + bne cr1,2f; \
> > + li r29,2f@l; \
>
> And why are we setting it?
Just a debug thing to find out how I got into bad_stack. I expected that
system_reset_common calls bad_stack, but it was decrementer_common. No
idea how that happened: MSR_EE is off and the timer interrupt has the lowest
priority, so it should not trigger.
> Does it look like the SRR0 and SRR1 values are correct when we get
> this problem occurring? Is it just the MSR that is bogus?
The MSR indicates 32bit mode: it is 0x1002 on entry and 0x1032 before
rfid. I haven't dumped the SRR1 content; SRR0 points to the hello32 'b .'
instruction.
It seems the JS20 and POWER4 firmware calls fwnmi on all cpus, while
JS21 (and probably POWER5) does it only on one cpu. The other cpus will
be stopped with an IPI, trap 501.
According to some firmware guys, the OS is supposed to force 64bit mode
on reset.
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware
2006-05-26 12:33 ` Olaf Hering
@ 2006-05-26 12:40 ` Paul Mackerras
2006-05-26 12:48 ` Olaf Hering
0 siblings, 1 reply; 11+ messages in thread
From: Paul Mackerras @ 2006-05-26 12:40 UTC (permalink / raw)
To: Olaf Hering; +Cc: linuxppc-dev
Olaf Hering writes:
> According to some firmware guys, the OS is supposed to force 64bit mode
> on reset.
Who? I'll get a platform architect to drop on them from a great
height. :)
Paul.
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware
2006-05-26 12:40 ` Paul Mackerras
@ 2006-05-26 12:48 ` Olaf Hering
0 siblings, 0 replies; 11+ messages in thread
From: Olaf Hering @ 2006-05-26 12:48 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
On Fri, May 26, Paul Mackerras wrote:
> Olaf Hering writes:
>
> > According to some firmware guys, the OS is supposed to force 64bit mode
> > on reset.
>
> Who? I'll get a platform architect to drop on them from a great
> height. :)
LTC22581 for the full story.
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware
2006-05-26 11:30 ` Paul Mackerras
2006-05-26 12:33 ` Olaf Hering
@ 2006-05-27 11:34 ` Olaf Hering
1 sibling, 0 replies; 11+ messages in thread
From: Olaf Hering @ 2006-05-27 11:34 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
On Fri, May 26, Paul Mackerras wrote:
> Olaf Hering writes:
>
> > According to this change for EXCEPTION_PROLOG_COMMON, I still get into
> > decrementer_common, but it's not fatal anymore because the cpu is now in
> > 64bit mode and the stack is forced to PACAKSAVE(r13).
> >
> > subi r1,r1,INT_FRAME_SIZE; /* alloc frame on kernel stack */ \
> > beq- 1f; \
> > ld r1,PACAKSAVE(r13); /* kernel stack to use */ \
> > -1: cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
> > +1: \
> > + cmpdi cr1,r29,0x42; \
>
> Ummm, what's r29 supposed to have in it here?
>
> > + bne cr1,2f; \
> > + li r29,2f@l; \
>
> And why are we setting it?
>
> Does it look like the SRR0 and SRR1 values are correct when we get
> this problem occurring? Is it just the MSR that is bogus?
I looked into this again; my debug patch was wrong. I can't rely on
hello32, r29 has to be set in the system_reset path. This is the
register dump. It shows that the cpus are in 32bit mode, and that
system_reset_fwnmi calls system_reset_common correctly.
0:mon> r
R00 = 000000001000036c R16 = 0000000000000042
R01 = 00000000ffe96ad0 R17 = 0000000000000042
R02 = 000000001009b470 R18 = 0000000000000042
R03 = 0000000000000009 R19 = 0000000000000042
R04 = 000000001002228c R20 = 0000000000000042
R05 = 0000000040042082 R21 = 0000000000000042
R06 = 0000000000004000 R22 = 0000000000000042
R07 = 0000000010008af0 R23 = 0000000000000042
R08 = 0000000000000000 R24 = 0000000000000042
R09 = 0000000000000000 R25 = 0000000000000042
R10 = 8000000000001032 R26 = 0000000000000042
R11 = 00000000ffe96a50 R27 = 0000000000000042
R12 = 0000000020000082 R28 = 0000000000000042
R13 = 000000001009a410 R29 = 0000000000003220
R14 = 0000000000000042 R30 = 0000000000001002
R15 = 0000000000000042 R31 = a000000000001032
pc = 00000000100003b4
lr = 000000001000036c
msr = 000000000000d032 cr = 20000082
ctr = 0000000000032ddc xer = 00000000000fffff trap = 100
0:mon> c1
1:mon> r
R00 = 000000001000036c R16 = 0000000000000042
R01 = 00000000ffe96ad0 R17 = 0000000000000042
R02 = 000000001009b470 R18 = 0000000000000042
R03 = 0000000000000007 R19 = 0000000000000042
R04 = 000000001002228c R20 = 0000000000000042
R05 = 0000000040022082 R21 = 0000000000000042
R06 = 0000000000004000 R22 = 0000000000000042
R07 = 0000000010008af0 R23 = 0000000000000042
R08 = 0000000000000000 R24 = 0000000000000042
R09 = 0000000000000000 R25 = 0000000000000042
R10 = 8000000000001032 R26 = 0000000000000042
R11 = 00000000ffe96a50 R27 = 0000000000000042
R12 = 0000000020000082 R28 = 0000000000000042
R13 = 000000001009a410 R29 = 0000000000003220
R14 = 0000000000000042 R30 = 0000000000001002
R15 = 0000000000000042 R31 = a000000000001032
pc = 00000000100003b4
lr = 000000001000036c
msr = 000000000000d032 cr = 20000082
ctr = 0000000000032ddc xer = 00000000000fffff trap = 100
Index: linux-2.6/arch/powerpc/kernel/head_64.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/head_64.S
+++ linux-2.6/arch/powerpc/kernel/head_64.S
@@ -211,6 +211,31 @@ exception_marker:
ori reg,reg,(label)@l; /* virt addr of handler ... */
#endif
+#define EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(area, label) \
+ mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \
+ std r9,area+EX_R9(r13); /* save r9 - r12 */ \
+ std r10,area+EX_R10(r13); \
+ std r11,area+EX_R11(r13); \
+ std r12,area+EX_R12(r13); \
+ mfspr r9,SPRN_SPRG1; \
+ std r9,area+EX_R13(r13); \
+ mfcr r9; \
+ clrrdi r12,r13,32; /* get high part of &label */ \
+ mfmsr r10; \
+ mr r30,r10; \
+ li r11,5; /* MSR_SF_LG|MSR_ISF_LG */ \
+ rldicr r11,r11,61,2; /* (5 << 61) */ \
+ or r10,r10,r11; \
+ mfspr r11,SPRN_SRR0; /* save SRR0 */ \
+ LOAD_HANDLER(r12,label) \
+ ori r10,r10,MSR_IR|MSR_DR|MSR_RI; \
+ mtspr SPRN_SRR0,r12; \
+ mfspr r12,SPRN_SRR1; /* and SRR1 */ \
+ mr r31,r10; \
+ mtspr SPRN_SRR1,r10; \
+ rfid; \
+ b . /* prevent speculative execution */
+
#define EXCEPTION_PROLOG_PSERIES(area, label) \
mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \
std r9,area+EX_R9(r13); /* save r9 - r12 */ \
@@ -269,7 +294,12 @@ exception_marker:
subi r1,r1,INT_FRAME_SIZE; /* alloc frame on kernel stack */ \
beq- 1f; \
ld r1,PACAKSAVE(r13); /* kernel stack to use */ \
-1: cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
+1: \
+ cmpdi cr1,r29,0x42; \
+ bne cr1,2f; \
+ li r29,2f@l; \
+2: ; \
+ cmpdi cr1,r1,0; /* check if r1 is in userspace */ \
bge- cr1,bad_stack; /* abort if it is */ \
std r9,_CCR(r1); /* save CR in stackframe */ \
std r11,_NIP(r1); /* save SRR0 in stackframe */ \
@@ -600,14 +630,15 @@ slb_miss_user_pseries:
system_reset_fwnmi:
HMT_MEDIUM
mtspr SPRN_SPRG1,r13 /* save r13 */
- EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common)
+ li r29,0x42
+ EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(PACA_EXGEN, system_reset_common)
.globl machine_check_fwnmi
.align 7
machine_check_fwnmi:
HMT_MEDIUM
mtspr SPRN_SPRG1,r13 /* save r13 */
- EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
+ EXCEPTION_PROLOG_PSERIES_BROKEN_POWER4_FIRMWARE(PACA_EXMC, machine_check_common)
#ifdef CONFIG_PPC_ISERIES
/*** ISeries-LPAR interrupt handlers ***/
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware
2006-05-23 13:07 ` [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware Olaf Hering
2006-05-26 11:30 ` Paul Mackerras
@ 2006-06-09 8:11 ` Paul Mackerras
2006-06-09 9:04 ` Olaf Hering
1 sibling, 1 reply; 11+ messages in thread
From: Paul Mackerras @ 2006-06-09 8:11 UTC (permalink / raw)
To: Olaf Hering; +Cc: linuxppc-dev
Olaf Hering writes:
> The reason is that system_reset_fwnmi is called in 32bit mode. Forcing
> 64bit mode fixes the corrupt NIP for me. JS20 and p690 are affected,
> seems to work on p550 and JS21.
What was the LTC bugzilla number for this again?
Paul.
* Re: [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware
2006-06-09 8:11 ` Paul Mackerras
@ 2006-06-09 9:04 ` Olaf Hering
0 siblings, 0 replies; 11+ messages in thread
From: Olaf Hering @ 2006-06-09 9:04 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
On Fri, Jun 09, Paul Mackerras wrote:
> Olaf Hering writes:
>
> > The reason is that system_reset_fwnmi is called in 32bit mode. Forcing
> > 64bit mode fixes the corrupt NIP for me. JS20 and p690 are affected,
> > seems to work on p550 and JS21.
>
> What was the LTC bugzilla number for this again?
LTC22581, the last comments confirm that firmware leaves the cpu in
32bit mode.
* [PATCH] force 64bit mode in fwnmi handlers to workaround firmware bugs
2006-05-22 16:41 pseries softreset on cpus in 32bit mode Olaf Hering
2006-05-22 18:46 ` Olaf Hering
2006-05-23 13:07 ` [PATCH] force 64bit mode in system_reset_fwnmi for broken POWER4 firmware Olaf Hering
@ 2006-07-19 8:34 ` Olaf Hering
2 siblings, 0 replies; 11+ messages in thread
From: Olaf Hering @ 2006-07-19 8:34 UTC (permalink / raw)
To: linuxppc-dev, Paul Mackerras
The firmware of POWER4 and JS20 systems does not switch the cpu to 64bit
mode when the registered system_reset and machine_check handlers get called.
If a 32bit process runs on that cpu at the time of the event, the cpu
remains in 32bit mode. xmon and kdump cannot deal with it; the result is
an error like 'Bad kernel stack pointer fff2aad0 at 3200'.
xmon just loses some register info, but booting the kdump kernel usually fails.
Neither handler is a hot path. Duplicate the EXCEPTION_PROLOG_PSERIES macro
and add two instructions to switch to 64bit:
li r11,5;
rldimi r10,r11,61,0;
Signed-off-by: Olaf Hering <olh@suse.de>
---
arch/powerpc/kernel/head_64.S | 35 +++++++++++++++++++++++++++++++++--
1 file changed, 33 insertions(+), 2 deletions(-)
Index: linux-2.6.18-rc2/arch/powerpc/kernel/head_64.S
===================================================================
--- linux-2.6.18-rc2.orig/arch/powerpc/kernel/head_64.S
+++ linux-2.6.18-rc2/arch/powerpc/kernel/head_64.S
@@ -191,6 +191,37 @@ exception_marker:
ori reg,reg,(label)@l; /* virt addr of handler ... */
#endif
+/*
+ * Equal to EXCEPTION_PROLOG_PSERIES, except that it forces 64bit mode.
+ * The firmware calls the registered system_reset_fwnmi and
+ * machine_check_fwnmi handlers in 32bit mode if the cpu happens to run
+ * a 32bit application at the time of the event.
+ * This firmware bug is present on POWER4 and JS20.
+ */
+#define EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(area, label) \
+ mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \
+ std r9,area+EX_R9(r13); /* save r9 - r12 */ \
+ std r10,area+EX_R10(r13); \
+ std r11,area+EX_R11(r13); \
+ std r12,area+EX_R12(r13); \
+ mfspr r9,SPRN_SPRG1; \
+ std r9,area+EX_R13(r13); \
+ mfcr r9; \
+ clrrdi r12,r13,32; /* get high part of &label */ \
+ mfmsr r10; \
+ /* force 64bit mode */ \
+ li r11,5; /* MSR_SF_LG|MSR_ISF_LG */ \
+ rldimi r10,r11,61,0; /* insert into top 3 bits */ \
+ /* done 64bit mode */ \
+ mfspr r11,SPRN_SRR0; /* save SRR0 */ \
+ LOAD_HANDLER(r12,label) \
+ ori r10,r10,MSR_IR|MSR_DR|MSR_RI; \
+ mtspr SPRN_SRR0,r12; \
+ mfspr r12,SPRN_SRR1; /* and SRR1 */ \
+ mtspr SPRN_SRR1,r10; \
+ rfid; \
+ b . /* prevent speculative execution */
+
#define EXCEPTION_PROLOG_PSERIES(area, label) \
mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \
std r9,area+EX_R9(r13); /* save r9 - r12 */ \
@@ -604,14 +635,14 @@ slb_miss_user_pseries:
system_reset_fwnmi:
HMT_MEDIUM
mtspr SPRN_SPRG1,r13 /* save r13 */
- EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common)
+ EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(PACA_EXGEN, system_reset_common)
.globl machine_check_fwnmi
.align 7
machine_check_fwnmi:
HMT_MEDIUM
mtspr SPRN_SPRG1,r13 /* save r13 */
- EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
+ EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(PACA_EXMC, machine_check_common)
#ifdef CONFIG_PPC_ISERIES
/*** ISeries-LPAR interrupt handlers ***/