* [PATCH v2 01/14] powerpc/32s: Only build hash code when CONFIG_PPC_BOOK3S_604 is selected
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
It is now possible to only build book3s/32 kernel for
CPUs without hash table.
Opt out hash related code when CONFIG_PPC_BOOK3S_604 is not selected.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_book3s_32.S | 12 ++++++++++++
arch/powerpc/mm/book3s32/Makefile | 4 +++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 858fbc8b19f3..54140f4927e5 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -295,6 +295,7 @@ MachineCheck:
DO_KVM 0x300
DataAccess:
#ifdef CONFIG_VMAP_STACK
+#ifdef CONFIG_PPC_BOOK3S_604
BEGIN_MMU_FTR_SECTION
mtspr SPRN_SPRG_SCRATCH2,r10
mfspr r10, SPRN_SPRG_THREAD
@@ -311,12 +312,14 @@ BEGIN_MMU_FTR_SECTION
MMU_FTR_SECTION_ELSE
b 1f
ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
+#endif
1: EXCEPTION_PROLOG_0 handle_dar_dsisr=1
EXCEPTION_PROLOG_1
b handle_page_fault_tramp_1
#else /* CONFIG_VMAP_STACK */
EXCEPTION_PROLOG handle_dar_dsisr=1
get_and_save_dar_dsisr_on_stack r4, r5, r11
+#ifdef CONFIG_PPC_BOOK3S_604
BEGIN_MMU_FTR_SECTION
andis. r0, r5, (DSISR_BAD_FAULT_32S | DSISR_DABRMATCH)@h
bne handle_page_fault_tramp_2 /* if not, try to put a PTE */
@@ -324,8 +327,11 @@ BEGIN_MMU_FTR_SECTION
bl hash_page
b handle_page_fault_tramp_1
MMU_FTR_SECTION_ELSE
+#endif
b handle_page_fault_tramp_2
+#ifdef CONFIG_PPC_BOOK3S_604
ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
+#endif
#endif /* CONFIG_VMAP_STACK */
/* Instruction access exception. */
@@ -341,12 +347,14 @@ InstructionAccess:
mfspr r11, SPRN_SRR1 /* check whether user or kernel */
stw r11, SRR1(r10)
mfcr r10
+#ifdef CONFIG_PPC_BOOK3S_604
BEGIN_MMU_FTR_SECTION
andis. r11, r11, SRR1_ISI_NOPT@h /* no pte found? */
bne hash_page_isi
.Lhash_page_isi_cont:
mfspr r11, SPRN_SRR1 /* check whether user or kernel */
END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
+#endif
andi. r11, r11, MSR_PR
EXCEPTION_PROLOG_1
@@ -357,9 +365,11 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
beq 1f /* if so, try to put a PTE */
li r3,0 /* into the hash table */
mr r4,r12 /* SRR0 is fault address */
+#ifdef CONFIG_PPC_BOOK3S_604
BEGIN_MMU_FTR_SECTION
bl hash_page
END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
+#endif
#endif /* CONFIG_VMAP_STACK */
1: mr r4,r12
andis. r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
@@ -692,6 +702,7 @@ handle_page_fault_tramp_2:
EXC_XFER_LITE(0x300, handle_page_fault)
#ifdef CONFIG_VMAP_STACK
+#ifdef CONFIG_PPC_BOOK3S_604
.macro save_regs_thread thread
stw r0, THR0(\thread)
stw r3, THR3(\thread)
@@ -763,6 +774,7 @@ fast_hash_page_return:
mfspr r11, SPRN_SPRG_SCRATCH1
mfspr r10, SPRN_SPRG_SCRATCH0
rfi
+#endif /* CONFIG_PPC_BOOK3S_604 */
stack_overflow:
vmap_stack_overflow_exception
diff --git a/arch/powerpc/mm/book3s32/Makefile b/arch/powerpc/mm/book3s32/Makefile
index 3f972db17761..446d9de88ce4 100644
--- a/arch/powerpc/mm/book3s32/Makefile
+++ b/arch/powerpc/mm/book3s32/Makefile
@@ -6,4 +6,6 @@ ifdef CONFIG_KASAN
CFLAGS_mmu.o += -DDISABLE_BRANCH_PROFILING
endif
-obj-y += mmu.o hash_low.o mmu_context.o tlb.o nohash_low.o
+obj-y += mmu.o mmu_context.o
+obj-$(CONFIG_PPC_BOOK3S_603) += nohash_low.o
+obj-$(CONFIG_PPC_BOOK3S_604) += hash_low.o tlb.o
--
2.25.0
^ permalink raw reply related
* [PATCH v2 00/14] powerpc/32: Reduce head complexity and re-activate MMU earlier
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
This series is a first step on the way to C syscall/exception entry/exit.
This series aims at reducing exception/syscall prologs complexity.
It also brings earlier MMU re-activation.
This series is based on Nick's v6 series "powerpc: interrupt wrappers".
It takes benefit of the removal of traps arguments (patches 2-7 of that series).
I have squashed those patches as second patch of my series in order to
please test robots. My series cleanly applies on top of entire Nick's series.
v2 has been reworked in order to apply to all PPC32, including BOOKE and 40x.
At the time being, we have two pathes in the prologs: one for
when we have VMAP stack and one when we don't.
When VMAP stack is supported, there is special prolog code to
allow accessing stack with MMU on.
That code that accesses VM stack with MMU on is also able to access
linear memory, so it can also access non VM stack with MMU on.
CONFIG_VMAP_STACK as been on by default on 6xx and 8xx for several
kernel releases now, so it is known to work.
On the 8xx, null_syscall runs in 292 cycles with VMAP_STACK and in
296 cycles without VMAP stack.
On the 832x, null_syscall runs in 224 cycles with VMAP_STACK and in
213 cycles without VMAP stack.
By removing the old non VMAP stack code, and using the same prolog
regardless of the activation of VMAP stacks, we make the code a lot
simplier and reduce the number of test cases.
BOOKE has MMU always on, so there is no change needed for that.
To allow removal of the old non VMAP stack code, only 40x need
to get adapted to support earlier MMU activation. That's what
patches 3-8 are for.
Once this is done, we easily go one step further and re-activate
Instruction translation at the same time as data translation.
At the end, null_syscall runs in 286 cycles on the 8xx and in 216
cycles on the 832x
Changes in v2:
- Implemented early MMU activation also on 40x
- Added BOOKE in the loop
- Removed the patches that replace r11 by r1 all over the place (too
much churn for very small benefit for now)
Christophe Leroy (14):
powerpc/32s: Only build hash code when CONFIG_PPC_BOOK3S_604 is
selected
NOT TO BE MERGED - Squash of patches 2-7 of v6 series "powerpc:
interrupt wrappers"
powerpc/40x: Don't use SPRN_SPRG_SCRATCH0/1 in TLB miss handlers
powerpc/40x: Change CRITICAL_EXCEPTION_PROLOG macro to a gas macro
powerpc/40x: Save SRR0/SRR1 and r10/r11 earlier in critical exception
powerpc/40x: Reorder a few instructions in critical exception prolog
powerpc/40x: Prepare for enabling MMU in critical exception prolog
powerpc/40x: Prepare normal exception handler for enabling MMU early
powerpc/32: Preserve cr1 in exception prolog stack check
powerpc/32: Use LOAD_REG_IMMEDIATE() to load MSR values
powerpc/32: Always enable data translation in exception prolog
powerpc/32: Enable instruction translation at the same time as data
translation
powerpc/32: Remove msr argument in EXC_XFER_TEMPLATE()
powerpc/32: Use fast instructions to change MSR EE/RI when available
arch/powerpc/include/asm/asm-prototypes.h | 4 +-
arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1 +
arch/powerpc/include/asm/bug.h | 7 +-
arch/powerpc/include/asm/debug.h | 3 +-
arch/powerpc/include/asm/hw_irq.h | 46 +++++
arch/powerpc/include/asm/processor.h | 4 +-
arch/powerpc/kernel/asm-offsets.c | 2 -
arch/powerpc/kernel/entry_32.S | 128 ++++----------
arch/powerpc/kernel/exceptions-64e.S | 5 +-
arch/powerpc/kernel/exceptions-64s.S | 164 +++++------------
arch/powerpc/kernel/fpu.S | 2 -
arch/powerpc/kernel/head_32.h | 167 ++++--------------
arch/powerpc/kernel/head_40x.S | 161 +++++++++--------
arch/powerpc/kernel/head_8xx.S | 26 +--
arch/powerpc/kernel/head_book3s_32.S | 52 ++----
arch/powerpc/kernel/head_booke.h | 20 +--
arch/powerpc/kernel/idle_6xx.S | 12 +-
arch/powerpc/kernel/idle_e500.S | 4 +-
arch/powerpc/kernel/process.c | 7 +-
arch/powerpc/kernel/traps.c | 2 +-
arch/powerpc/kernel/vector.S | 2 -
arch/powerpc/mm/book3s32/Makefile | 4 +-
arch/powerpc/mm/book3s32/hash_low.S | 14 --
arch/powerpc/mm/book3s64/hash_utils.c | 79 ++++++---
arch/powerpc/mm/book3s64/slb.c | 11 +-
arch/powerpc/mm/fault.c | 18 +-
arch/powerpc/platforms/8xx/machine_check.c | 2 +-
27 files changed, 354 insertions(+), 593 deletions(-)
--
2.25.0
^ permalink raw reply
* [PATCH v2 09/14] powerpc/32: Preserve cr1 in exception prolog stack check
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
THREAD_ALIGN_SHIFT = THREAD_SHIFT + 1 = PAGE_SHIFT + 1
Maximum PAGE_SHIFT is 18 for 256k pages so
THREAD_ALIGN_SHIFT is 19 at the maximum.
No need to clobber cr1, it can be preserved when moving r1
into CR when we check stack overflow.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_32.h | 2 +-
arch/powerpc/kernel/head_book3s_32.S | 6 ------
2 files changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 1f5e64c70ddb..a7087930a844 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -55,7 +55,7 @@
lwz r1,TASK_STACK-THREAD(r1)
addi r1, r1, THREAD_SIZE - INT_FRAME_SIZE
1:
- mtcrf 0x7f, r1
+ mtcrf 0x3f, r1
bt 32 - THREAD_ALIGN_SHIFT, stack_overflow
#else
subi r11, r1, INT_FRAME_SIZE /* use r1 if kernel */
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 281de00c2ea4..6543cb410a8d 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -278,12 +278,6 @@ MachineCheck:
7: EXCEPTION_PROLOG_2
addi r3,r1,STACK_FRAME_OVERHEAD
#ifdef CONFIG_PPC_CHRP
-#ifdef CONFIG_VMAP_STACK
- mfspr r4, SPRN_SPRG_THREAD
- tovirt(r4, r4)
- lwz r4, RTAS_SP(r4)
- cmpwi cr1, r4, 0
-#endif
beq cr1, machine_check_tramp
twi 31, 0, 0
#else
--
2.25.0
^ permalink raw reply related
* [PATCH v2 10/14] powerpc/32: Use LOAD_REG_IMMEDIATE() to load MSR values
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
40x MSR value doesn't fit on 15 bits.
LOAD_REG_IMMEDIATE() in places that will be used also
with 40x in the next patch.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_32.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index a7087930a844..48c0096fecc1 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -69,7 +69,7 @@
.macro EXCEPTION_PROLOG_2 handle_dar_dsisr=0
#ifdef CONFIG_VMAP_STACK
- li r11, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
+ LOAD_REG_IMMEDIATE(r11, MSR_KERNEL & ~(MSR_IR | MSR_RI)) /* can take DTLB miss */
mtmsr r11
isync
mfspr r11, SPRN_SPRG_SCRATCH2
@@ -134,7 +134,7 @@
lwz r1,TASK_STACK-THREAD(r12)
beq- 99f
addi r1, r1, THREAD_SIZE - INT_FRAME_SIZE
- li r10, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
+ LOAD_REG_IMMEDIATE(r10, MSR_KERNEL & ~(MSR_IR | MSR_RI)) /* can take DTLB miss */
mtmsr r10
isync
tovirt(r12, r12)
--
2.25.0
^ permalink raw reply related
* [PATCH v2 13/14] powerpc/32: Remove msr argument in EXC_XFER_TEMPLATE()
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
msr argument is not used anymore, remove it.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_32.h | 8 +++-----
arch/powerpc/kernel/head_40x.S | 5 +----
2 files changed, 4 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 7209e63db366..98ed5b928642 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -213,7 +213,7 @@
addi r3,r1,STACK_FRAME_OVERHEAD; \
xfer(n, hdlr)
-#define EXC_XFER_TEMPLATE(hdlr, trap, msr, tfer, ret) \
+#define EXC_XFER_TEMPLATE(hdlr, trap, tfer, ret) \
li r10,trap; \
stw r10,_TRAP(r11); \
bl tfer; \
@@ -221,12 +221,10 @@
.long ret
#define EXC_XFER_STD(n, hdlr) \
- EXC_XFER_TEMPLATE(hdlr, n, MSR_KERNEL, transfer_to_handler_full, \
- ret_from_except_full)
+ EXC_XFER_TEMPLATE(hdlr, n, transfer_to_handler_full, ret_from_except_full)
#define EXC_XFER_LITE(n, hdlr) \
- EXC_XFER_TEMPLATE(hdlr, n+1, MSR_KERNEL, transfer_to_handler, \
- ret_from_except)
+ EXC_XFER_TEMPLATE(hdlr, n + 1, transfer_to_handler, ret_from_except)
.macro vmap_stack_overflow_exception
#ifdef CONFIG_SMP
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 96ec800ebf7f..8cec630cccab 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -184,8 +184,7 @@ _ENTRY(saved_ksp_limit)
START_EXCEPTION(n, label); \
CRITICAL_EXCEPTION_PROLOG; \
addi r3,r1,STACK_FRAME_OVERHEAD; \
- EXC_XFER_TEMPLATE(hdlr, n+2, (MSR_KERNEL & ~(MSR_ME|MSR_DE|MSR_CE)), \
- crit_transfer_to_handler, ret_from_crit_exc)
+ EXC_XFER_TEMPLATE(hdlr, n+2, crit_transfer_to_handler, ret_from_crit_exc)
/*
* 0x0100 - Critical Interrupt Exception
@@ -495,7 +494,6 @@ _ENTRY(saved_ksp_limit)
2: mfspr r4,SPRN_DBSR
addi r3,r1,STACK_FRAME_OVERHEAD
EXC_XFER_TEMPLATE(DebugException, 0x2002, \
- (MSR_KERNEL & ~(MSR_ME|MSR_DE|MSR_CE)), \
crit_transfer_to_handler, ret_from_crit_exc)
/* Programmable Interval Timer (PIT) Exception. (from 0x1000) */
@@ -517,7 +515,6 @@ WDTException:
CRITICAL_EXCEPTION_PROLOG;
addi r3,r1,STACK_FRAME_OVERHEAD;
EXC_XFER_TEMPLATE(WatchdogException, 0x1020+2,
- (MSR_KERNEL & ~(MSR_ME|MSR_DE|MSR_CE)),
crit_transfer_to_handler, ret_from_crit_exc)
/* Other PowerPC processors, namely those derived from the 6xx-series
--
2.25.0
^ permalink raw reply related
* [PATCH v2 05/14] powerpc/40x: Save SRR0/SRR1 and r10/r11 earlier in critical exception
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
In order to be able to switch MMU on in exception prolog, save
SRR0 and SRR1 earlier.
Also save r10 and r11 into stack earlier to better match with the
normal exception prolog.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/entry_32.S | 9 ---------
arch/powerpc/kernel/head_40x.S | 8 ++++++++
2 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index b102b40c4988..715a8b1aafc6 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -107,15 +107,6 @@ _ASM_NOKPROBE_SYMBOL(crit_transfer_to_handler)
#ifdef CONFIG_40x
.globl crit_transfer_to_handler
crit_transfer_to_handler:
- lwz r0,crit_r10@l(0)
- stw r0,GPR10(r11)
- lwz r0,crit_r11@l(0)
- stw r0,GPR11(r11)
- mfspr r0,SPRN_SRR0
- stw r0,crit_srr0@l(0)
- mfspr r0,SPRN_SRR1
- stw r0,crit_srr1@l(0)
-
/* set the stack limit to the current stack */
mfspr r8,SPRN_SPRG_THREAD
lwz r0,KSP_LIMIT(r8)
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 595a4cf83391..6394040bbaba 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -103,6 +103,10 @@ _ENTRY(saved_ksp_limit)
.macro CRITICAL_EXCEPTION_PROLOG
stw r10,crit_r10@l(0) /* save two registers to work with */
stw r11,crit_r11@l(0)
+ mfspr r10,SPRN_SRR0
+ mfspr r11,SPRN_SRR1
+ stw r10,crit_srr0@l(0)
+ stw r11,crit_srr1@l(0)
mfcr r10 /* save CR in r10 for now */
mfspr r11,SPRN_SRR3 /* check whether user or kernel */
andi. r11,r11,MSR_PR
@@ -120,6 +124,10 @@ _ENTRY(saved_ksp_limit)
stw r9,GPR9(r11)
mflr r10
stw r10,_LINK(r11)
+ lwz r10,crit_r10@l(0)
+ lwz r12,crit_r11@l(0)
+ stw r10,GPR10(r11)
+ stw r12,GPR11(r11)
mfspr r12,SPRN_DEAR /* save DEAR and ESR in the frame */
stw r12,_DEAR(r11) /* since they may have had stuff */
mfspr r9,SPRN_ESR /* in them at the point where the */
--
2.25.0
^ permalink raw reply related
* [PATCH v2 07/14] powerpc/40x: Prepare for enabling MMU in critical exception prolog
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
In order the enable MMU early in exception prolog, implement
CONFIG_VMAP_STACK principles in critical exception prolog.
There is no intention to use CONFIG_VMAP_STACK on 40x,
but related code will be used to enable MMU early in exception
in a later patch.
Also address (critirq_ctx-PAGE_OFFSET) directly instead of
using tophys() in order to win one instruction.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_40x.S | 42 +++++++++++++++++++++++++++++++---
1 file changed, 39 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index b7caaa09c860..c0582ad84117 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -89,6 +89,14 @@ _ENTRY(crit_srr0)
.space 4
_ENTRY(crit_srr1)
.space 4
+#ifdef CONFIG_VMAP_STACK
+_ENTRY(crit_r1)
+ .space 4
+_ENTRY(crit_dear)
+ .space 4
+_ENTRY(crit_esr)
+ .space 4
+#endif
_ENTRY(saved_ksp_limit)
.space 4
@@ -107,32 +115,60 @@ _ENTRY(saved_ksp_limit)
mfspr r11,SPRN_SRR1
stw r10,crit_srr0@l(0)
stw r11,crit_srr1@l(0)
+#ifdef CONFIG_VMAP_STACK
+ mfspr r10,SPRN_DEAR
+ mfspr r11,SPRN_ESR
+ stw r10,crit_dear@l(0)
+ stw r11,crit_esr@l(0)
+#endif
mfcr r10 /* save CR in r10 for now */
mfspr r11,SPRN_SRR3 /* check whether user or kernel */
andi. r11,r11,MSR_PR
- lis r11,critirq_ctx@ha
- tophys(r11,r11)
- lwz r11,critirq_ctx@l(r11)
+ lis r11,(critirq_ctx-PAGE_OFFSET)@ha
+ lwz r11,(critirq_ctx-PAGE_OFFSET)@l(r11)
beq 1f
/* COMING FROM USER MODE */
mfspr r11,SPRN_SPRG_THREAD /* if from user, start at top of */
lwz r11,TASK_STACK-THREAD(r11) /* this thread's kernel stack */
+#ifdef CONFIG_VMAP_STACK
+1: stw r1,crit_r1@l(0)
+ addi r1,r11,THREAD_SIZE-INT_FRAME_SIZE /* Alloc an excpt frm */
+ LOAD_REG_IMMEDIATE(r11,MSR_KERNEL & ~(MSR_IR | MSR_RI))
+ mtmsr r11
+ isync
+ lwz r11,crit_r1@l(0)
+ stw r11,GPR1(r1)
+ stw r11,0(r1)
+ mr r11,r1
+#else
1: addi r11,r11,THREAD_SIZE-INT_FRAME_SIZE /* Alloc an excpt frm */
tophys(r11,r11)
stw r1,GPR1(r11)
stw r1,0(r11)
tovirt(r1,r11)
+#endif
stw r10,_CCR(r11) /* save various registers */
stw r12,GPR12(r11)
stw r9,GPR9(r11)
mflr r10
stw r10,_LINK(r11)
+#ifdef CONFIG_VMAP_STACK
+ lis r9,PAGE_OFFSET@ha
+ lwz r10,crit_r10@l(r9)
+ lwz r12,crit_r11@l(r9)
+#else
lwz r10,crit_r10@l(0)
lwz r12,crit_r11@l(0)
+#endif
stw r10,GPR10(r11)
stw r12,GPR11(r11)
+#ifdef CONFIG_VMAP_STACK
+ lwz r12,crit_dear@l(r9)
+ lwz r9,crit_esr@l(r9)
+#else
mfspr r12,SPRN_DEAR /* save DEAR and ESR in the frame */
mfspr r9,SPRN_ESR /* in them at the point where the */
+#endif
stw r12,_DEAR(r11) /* since they may have had stuff */
stw r9,_ESR(r11) /* exception was taken */
mfspr r12,SPRN_SRR2
--
2.25.0
^ permalink raw reply related
* [PATCH v2 11/14] powerpc/32: Always enable data translation in exception prolog
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
If the code can use a stack in vm area, it can also use a
stack in linear space.
Simplify code by removing old non VMAP stack code on PPC32.
That means the data translation is now re-enabled early in
exception prolog in all cases, not only when using VMAP stacks.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/processor.h | 4 +-
arch/powerpc/kernel/asm-offsets.c | 2 -
arch/powerpc/kernel/entry_32.S | 26 ++-----
arch/powerpc/kernel/fpu.S | 2 -
arch/powerpc/kernel/head_32.h | 106 +--------------------------
arch/powerpc/kernel/head_40x.S | 25 -------
arch/powerpc/kernel/head_8xx.S | 19 +----
arch/powerpc/kernel/head_book3s_32.S | 38 +---------
arch/powerpc/kernel/head_booke.h | 2 -
arch/powerpc/kernel/idle_6xx.S | 12 +--
arch/powerpc/kernel/idle_e500.S | 4 +-
arch/powerpc/kernel/vector.S | 2 -
arch/powerpc/mm/book3s32/hash_low.S | 14 ----
13 files changed, 17 insertions(+), 239 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 8acc3590c971..33f1a2587ead 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -148,11 +148,9 @@ struct thread_struct {
#ifdef CONFIG_PPC_RTAS
unsigned long rtas_sp; /* stack pointer for when in RTAS */
#endif
-#endif
#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_PPC_KUAP)
unsigned long kuap; /* opened segments for user access */
#endif
-#ifdef CONFIG_VMAP_STACK
unsigned long srr0;
unsigned long srr1;
unsigned long dar;
@@ -161,7 +159,7 @@ struct thread_struct {
unsigned long r0, r3, r4, r5, r6, r8, r9, r11;
unsigned long lr, ctr;
#endif
-#endif
+#endif /* CONFIG_PPC32 */
/* Debug Registers */
struct debug_reg debug;
#ifdef CONFIG_PPC_FPU_REGS
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index b12d7c049bfe..a01bc9e8508b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -132,7 +132,6 @@ int main(void)
OFFSET(KSP_VSID, thread_struct, ksp_vsid);
#else /* CONFIG_PPC64 */
OFFSET(PGDIR, thread_struct, pgdir);
-#ifdef CONFIG_VMAP_STACK
OFFSET(SRR0, thread_struct, srr0);
OFFSET(SRR1, thread_struct, srr1);
OFFSET(DAR, thread_struct, dar);
@@ -149,7 +148,6 @@ int main(void)
OFFSET(THLR, thread_struct, lr);
OFFSET(THCTR, thread_struct, ctr);
#endif
-#endif
#ifdef CONFIG_SPE
OFFSET(THREAD_EVR0, thread_struct, evr[0]);
OFFSET(THREAD_ACC, thread_struct, acc);
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 859a13d34bdc..8a0e5885b731 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -141,7 +141,7 @@ transfer_to_handler:
stw r12,_CTR(r11)
stw r2,_XER(r11)
mfspr r12,SPRN_SPRG_THREAD
- tovirt_vmstack r12, r12
+ tovirt(r12, r12)
beq 2f /* if from user, fix up THREAD.regs */
addi r2, r12, -THREAD
addi r11,r1,STACK_FRAME_OVERHEAD
@@ -162,7 +162,6 @@ transfer_to_handler:
li r12,-1 /* clear all pending debug events */
mtspr SPRN_DBSR,r12
lis r11,global_dbcr0@ha
- tophys_novmstack r11,r11
addi r11,r11,global_dbcr0@l
#ifdef CONFIG_SMP
lwz r9,TASK_CPU(r2)
@@ -199,8 +198,7 @@ transfer_to_handler:
transfer_to_handler_cont:
3:
mflr r9
- tovirt_novmstack r2, r2 /* set r2 to current */
- tovirt_vmstack r9, r9
+ tovirt(r9, r9)
lwz r11,0(r9) /* virtual address of handler */
lwz r9,4(r9) /* where to go when done */
#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PERF_EVENTS)
@@ -215,8 +213,7 @@ transfer_to_handler_cont:
* To speed up the syscall path where interrupts stay on, let's check
* first if we are changing the MSR value at all.
*/
- tophys_novmstack r12, r1
- lwz r12,_MSR(r12)
+ lwz r12,_MSR(r1)
andi. r12,r12,MSR_EE
bne 1f
@@ -308,9 +305,6 @@ stack_ovf:
lis r9,StackOverflow@ha
addi r9,r9,StackOverflow@l
LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)
-#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PERF_EVENTS)
- mtspr SPRN_NRI, r0
-#endif
mtspr SPRN_SRR0,r9
mtspr SPRN_SRR1,r10
rfi
@@ -1327,7 +1321,6 @@ _GLOBAL(enter_rtas)
lis r6,1f@ha /* physical return address for rtas */
addi r6,r6,1f@l
tophys(r6,r6)
- tophys_novmstack r7, r1
lwz r8,RTASENTRY(r4)
lwz r4,RTASBASE(r4)
mfmsr r9
@@ -1336,22 +1329,19 @@ _GLOBAL(enter_rtas)
mtmsr r0 /* disable interrupts so SRR0/1 don't get trashed */
li r9,MSR_KERNEL & ~(MSR_IR|MSR_DR)
mtlr r6
- stw r7, THREAD + RTAS_SP(r2)
+ stw r1, THREAD + RTAS_SP(r2)
mtspr SPRN_SRR0,r8
mtspr SPRN_SRR1,r9
rfi
-1: tophys_novmstack r9, r1
-#ifdef CONFIG_VMAP_STACK
+1:
li r0, MSR_KERNEL & ~MSR_IR /* can take DTLB miss */
mtmsr r0
isync
-#endif
- lwz r8,INT_FRAME_SIZE+4(r9) /* get return address */
- lwz r9,8(r9) /* original msr value */
+ lwz r8,INT_FRAME_SIZE+4(r1) /* get return address */
+ lwz r9,8(r1) /* original msr value */
addi r1,r1,INT_FRAME_SIZE
li r0,0
- tophys_novmstack r7, r2
- stw r0, THREAD + RTAS_SP(r7)
+ stw r0, THREAD + RTAS_SP(r2)
mtspr SPRN_SRR0,r8
mtspr SPRN_SRR1,r9
rfi /* return to caller */
diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index 3ff9a8fafa46..2c57ece6671c 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -92,9 +92,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
/* enable use of FP after return */
#ifdef CONFIG_PPC32
mfspr r5,SPRN_SPRG_THREAD /* current task's THREAD (phys) */
-#ifdef CONFIG_VMAP_STACK
tovirt(r5, r5)
-#endif
lwz r4,THREAD_FPEXC_MODE(r5)
ori r9,r9,MSR_FP /* enable FP for current */
or r9,r9,r4
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 48c0096fecc1..5ba4c6391107 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -19,7 +19,6 @@
.macro EXCEPTION_PROLOG_0 handle_dar_dsisr=0
mtspr SPRN_SPRG_SCRATCH0,r10
mtspr SPRN_SPRG_SCRATCH1,r11
-#ifdef CONFIG_VMAP_STACK
mfspr r10, SPRN_SPRG_THREAD
.if \handle_dar_dsisr
#ifdef CONFIG_40x
@@ -37,17 +36,13 @@
.endif
mfspr r11, SPRN_SRR0
stw r11, SRR0(r10)
-#endif
mfspr r11, SPRN_SRR1 /* check whether user or kernel */
-#ifdef CONFIG_VMAP_STACK
stw r11, SRR1(r10)
-#endif
mfcr r10
andi. r11, r11, MSR_PR
.endm
-.macro EXCEPTION_PROLOG_1 for_rtas=0
-#ifdef CONFIG_VMAP_STACK
+.macro EXCEPTION_PROLOG_1
mtspr SPRN_SPRG_SCRATCH2,r1
subi r1, r1, INT_FRAME_SIZE /* use r1 if kernel */
beq 1f
@@ -55,20 +50,13 @@
lwz r1,TASK_STACK-THREAD(r1)
addi r1, r1, THREAD_SIZE - INT_FRAME_SIZE
1:
+#ifdef CONFIG_VMAP_STACK
mtcrf 0x3f, r1
bt 32 - THREAD_ALIGN_SHIFT, stack_overflow
-#else
- subi r11, r1, INT_FRAME_SIZE /* use r1 if kernel */
- beq 1f
- mfspr r11,SPRN_SPRG_THREAD
- lwz r11,TASK_STACK-THREAD(r11)
- addi r11, r11, THREAD_SIZE - INT_FRAME_SIZE
-1: tophys(r11, r11)
#endif
.endm
.macro EXCEPTION_PROLOG_2 handle_dar_dsisr=0
-#ifdef CONFIG_VMAP_STACK
LOAD_REG_IMMEDIATE(r11, MSR_KERNEL & ~(MSR_IR | MSR_RI)) /* can take DTLB miss */
mtmsr r11
isync
@@ -76,11 +64,6 @@
stw r11,GPR1(r1)
stw r11,0(r1)
mr r11, r1
-#else
- stw r1,GPR1(r11)
- stw r1,0(r11)
- tovirt(r1, r11) /* set new kernel sp */
-#endif
stw r10,_CCR(r11) /* save registers */
stw r12,GPR12(r11)
stw r9,GPR9(r11)
@@ -90,7 +73,6 @@
stw r12,GPR11(r11)
mflr r10
stw r10,_LINK(r11)
-#ifdef CONFIG_VMAP_STACK
mfspr r12, SPRN_SPRG_THREAD
tovirt(r12, r12)
.if \handle_dar_dsisr
@@ -101,18 +83,10 @@
.endif
lwz r9, SRR1(r12)
lwz r12, SRR0(r12)
-#else
- mfspr r12,SPRN_SRR0
- mfspr r9,SPRN_SRR1
-#endif
#ifdef CONFIG_40x
rlwinm r9,r9,0,14,12 /* clear MSR_WE (necessary?) */
#else
-#ifdef CONFIG_VMAP_STACK
li r10, MSR_KERNEL & ~MSR_IR /* can take exceptions */
-#else
- li r10,MSR_KERNEL & ~(MSR_IR|MSR_DR) /* can take exceptions */
-#endif
mtmsr r10 /* (except for mach check in rtas) */
#endif
stw r0,GPR0(r11)
@@ -126,7 +100,6 @@
.macro SYSCALL_ENTRY trapno
mfspr r12,SPRN_SPRG_THREAD
mfspr r9, SPRN_SRR1
-#ifdef CONFIG_VMAP_STACK
mfspr r11, SPRN_SRR0
mtctr r11
andi. r11, r9, MSR_PR
@@ -141,23 +114,9 @@
stw r11,GPR1(r1)
stw r11,0(r1)
mr r11, r1
-#else
- andi. r11, r9, MSR_PR
- lwz r11,TASK_STACK-THREAD(r12)
- beq- 99f
- addi r11, r11, THREAD_SIZE - INT_FRAME_SIZE
- tophys(r11, r11)
- stw r1,GPR1(r11)
- stw r1,0(r11)
- tovirt(r1, r11) /* set new kernel sp */
-#endif
mflr r10
stw r10, _LINK(r11)
-#ifdef CONFIG_VMAP_STACK
mfctr r10
-#else
- mfspr r10,SPRN_SRR0
-#endif
stw r10,_NIP(r11)
mfcr r10
rlwinm r10,r10,0,4,2 /* Clear SO bit in CR */
@@ -165,11 +124,7 @@
#ifdef CONFIG_40x
rlwinm r9,r9,0,14,12 /* clear MSR_WE (necessary?) */
#else
-#ifdef CONFIG_VMAP_STACK
LOAD_REG_IMMEDIATE(r10, MSR_KERNEL & ~MSR_IR) /* can take exceptions */
-#else
- LOAD_REG_IMMEDIATE(r10, MSR_KERNEL & ~(MSR_IR|MSR_DR)) /* can take exceptions */
-#endif
mtmsr r10 /* (except for mach check in rtas) */
#endif
lis r10,STACK_FRAME_REGS_MARKER@ha /* exception frame marker */
@@ -198,7 +153,6 @@
li r12,-1 /* clear all pending debug events */
mtspr SPRN_DBSR,r12
lis r11,global_dbcr0@ha
- tophys_novmstack(r11,r11)
addi r11,r11,global_dbcr0@l
lwz r12,0(r11)
mtspr SPRN_DBCR0,r12
@@ -208,7 +162,6 @@
#endif
3:
- tovirt_novmstack r2, r2 /* set r2 to current */
lis r11, transfer_to_syscall@h
ori r11, r11, transfer_to_syscall@l
#ifdef CONFIG_TRACE_IRQFLAGS
@@ -234,59 +187,6 @@
99: b ret_from_kernel_syscall
.endm
-.macro save_dar_dsisr_on_stack reg1, reg2, sp
-#ifndef CONFIG_VMAP_STACK
-#ifdef CONFIG_40x
- mfspr \reg1, SPRN_DEAR
- mfspr \reg2, SPRN_ESR
-#else
- mfspr \reg1, SPRN_DAR
- mfspr \reg2, SPRN_DSISR
-#endif
- stw \reg1, _DAR(\sp)
- stw \reg2, _DSISR(\sp)
-#endif
-.endm
-
-.macro get_and_save_dar_dsisr_on_stack reg1, reg2, sp
-#ifdef CONFIG_VMAP_STACK
- lwz \reg1, _DAR(\sp)
- lwz \reg2, _DSISR(\sp)
-#else
- save_dar_dsisr_on_stack \reg1, \reg2, \sp
-#endif
-.endm
-
-.macro tovirt_vmstack dst, src
-#ifdef CONFIG_VMAP_STACK
- tovirt(\dst, \src)
-#else
- .ifnc \dst, \src
- mr \dst, \src
- .endif
-#endif
-.endm
-
-.macro tovirt_novmstack dst, src
-#ifndef CONFIG_VMAP_STACK
- tovirt(\dst, \src)
-#else
- .ifnc \dst, \src
- mr \dst, \src
- .endif
-#endif
-.endm
-
-.macro tophys_novmstack dst, src
-#ifndef CONFIG_VMAP_STACK
- tophys(\dst, \src)
-#else
- .ifnc \dst, \src
- mr \dst, \src
- .endif
-#endif
-.endm
-
/*
* Note: code which follows this uses cr0.eq (set if from kernel),
* r11, r12 (SRR0), and r9 (SRR1).
@@ -334,7 +234,6 @@
ret_from_except)
.macro vmap_stack_overflow_exception
-#ifdef CONFIG_VMAP_STACK
#ifdef CONFIG_SMP
mfspr r1, SPRN_SPRG_THREAD
lwz r1, TASK_CPU - THREAD(r1)
@@ -353,7 +252,6 @@
SAVE_NVGPRS(r11)
addi r3, r1, STACK_FRAME_OVERHEAD
EXC_XFER_STD(0, stack_overflow_exception)
-#endif
.endm
#endif /* __HEAD_32_H__ */
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index ef0bdc20849f..6b23cbd7f9c5 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -89,14 +89,12 @@ _ENTRY(crit_srr0)
.space 4
_ENTRY(crit_srr1)
.space 4
-#ifdef CONFIG_VMAP_STACK
_ENTRY(crit_r1)
.space 4
_ENTRY(crit_dear)
.space 4
_ENTRY(crit_esr)
.space 4
-#endif
_ENTRY(saved_ksp_limit)
.space 4
@@ -115,12 +113,10 @@ _ENTRY(saved_ksp_limit)
mfspr r11,SPRN_SRR1
stw r10,crit_srr0@l(0)
stw r11,crit_srr1@l(0)
-#ifdef CONFIG_VMAP_STACK
mfspr r10,SPRN_DEAR
mfspr r11,SPRN_ESR
stw r10,crit_dear@l(0)
stw r11,crit_esr@l(0)
-#endif
mfcr r10 /* save CR in r10 for now */
mfspr r11,SPRN_SRR3 /* check whether user or kernel */
andi. r11,r11,MSR_PR
@@ -130,7 +126,6 @@ _ENTRY(saved_ksp_limit)
/* COMING FROM USER MODE */
mfspr r11,SPRN_SPRG_THREAD /* if from user, start at top of */
lwz r11,TASK_STACK-THREAD(r11) /* this thread's kernel stack */
-#ifdef CONFIG_VMAP_STACK
1: stw r1,crit_r1@l(0)
addi r1,r11,THREAD_SIZE-INT_FRAME_SIZE /* Alloc an excpt frm */
LOAD_REG_IMMEDIATE(r11,MSR_KERNEL & ~(MSR_IR | MSR_RI))
@@ -140,35 +135,18 @@ _ENTRY(saved_ksp_limit)
stw r11,GPR1(r1)
stw r11,0(r1)
mr r11,r1
-#else
-1: addi r11,r11,THREAD_SIZE-INT_FRAME_SIZE /* Alloc an excpt frm */
- tophys(r11,r11)
- stw r1,GPR1(r11)
- stw r1,0(r11)
- tovirt(r1,r11)
-#endif
stw r10,_CCR(r11) /* save various registers */
stw r12,GPR12(r11)
stw r9,GPR9(r11)
mflr r10
stw r10,_LINK(r11)
-#ifdef CONFIG_VMAP_STACK
lis r9,PAGE_OFFSET@ha
lwz r10,crit_r10@l(r9)
lwz r12,crit_r11@l(r9)
-#else
- lwz r10,crit_r10@l(0)
- lwz r12,crit_r11@l(0)
-#endif
stw r10,GPR10(r11)
stw r12,GPR11(r11)
-#ifdef CONFIG_VMAP_STACK
lwz r12,crit_dear@l(r9)
lwz r9,crit_esr@l(r9)
-#else
- mfspr r12,SPRN_DEAR /* save DEAR and ESR in the frame */
- mfspr r9,SPRN_ESR /* in them at the point where the */
-#endif
stw r12,_DEAR(r11) /* since they may have had stuff */
stw r9,_ESR(r11) /* exception was taken */
mfspr r12,SPRN_SRR2
@@ -224,7 +202,6 @@ _ENTRY(saved_ksp_limit)
*/
START_EXCEPTION(0x0300, DataStorage)
EXCEPTION_PROLOG handle_dar_dsisr=1
- save_dar_dsisr_on_stack r4, r5, r11
EXC_XFER_LITE(0x300, handle_page_fault)
/*
@@ -244,14 +221,12 @@ _ENTRY(saved_ksp_limit)
/* 0x0600 - Alignment Exception */
START_EXCEPTION(0x0600, Alignment)
EXCEPTION_PROLOG handle_dar_dsisr=1
- save_dar_dsisr_on_stack r4, r5, r11
addi r3,r1,STACK_FRAME_OVERHEAD
EXC_XFER_STD(0x600, alignment_exception)
/* 0x0700 - Program Exception */
START_EXCEPTION(0x0700, ProgramCheck)
EXCEPTION_PROLOG handle_dar_dsisr=1
- save_dar_dsisr_on_stack r4, r5, r11
addi r3,r1,STACK_FRAME_OVERHEAD
EXC_XFER_STD(0x700, program_check_exception)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 7869db974185..78c76e5cb46c 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -124,7 +124,6 @@ instruction_counter:
. = 0x200
MachineCheck:
EXCEPTION_PROLOG handle_dar_dsisr=1
- save_dar_dsisr_on_stack r4, r5, r11
li r6, RPN_PATTERN
mtspr SPRN_DAR, r6 /* Tag DAR, to be used in DTLB Error */
addi r3,r1,STACK_FRAME_OVERHEAD
@@ -137,7 +136,6 @@ MachineCheck:
. = 0x600
Alignment:
EXCEPTION_PROLOG handle_dar_dsisr=1
- save_dar_dsisr_on_stack r4, r5, r11
li r6, RPN_PATTERN
mtspr SPRN_DAR, r6 /* Tag DAR, to be used in DTLB Error */
addi r3,r1,STACK_FRAME_OVERHEAD
@@ -333,21 +331,16 @@ DataTLBError:
cmpwi cr1, r11, RPN_PATTERN
beq- cr1, FixupDAR /* must be a buggy dcbX, icbi insn. */
DARFixed:/* Return from dcbx instruction bug workaround */
-#ifdef CONFIG_VMAP_STACK
li r11, RPN_PATTERN
mtspr SPRN_DAR, r11 /* Tag DAR, to be used in DTLB Error */
-#endif
EXCEPTION_PROLOG_1
EXCEPTION_PROLOG_2 handle_dar_dsisr=1
- get_and_save_dar_dsisr_on_stack r4, r5, r11
+ lwz r4, _DAR(r11)
+ lwz r5, _DSISR(r11)
andis. r10,r5,DSISR_NOHPTE@h
beq+ .Ldtlbie
tlbie r4
.Ldtlbie:
-#ifndef CONFIG_VMAP_STACK
- li r10,RPN_PATTERN
- mtspr SPRN_DAR,r10 /* Tag DAR, to be used in DTLB Error */
-#endif
/* 0x300 is DataAccess exception, needed by bad_page_fault() */
EXC_XFER_LITE(0x300, handle_page_fault)
@@ -364,10 +357,6 @@ do_databreakpoint:
addi r3,r1,STACK_FRAME_OVERHEAD
mfspr r4,SPRN_BAR
stw r4,_DAR(r11)
-#ifndef CONFIG_VMAP_STACK
- mfspr r5,SPRN_DSISR
- stw r5,_DSISR(r11)
-#endif
EXC_XFER_STD(0x1c00, do_break)
. = 0x1c00
@@ -510,14 +499,10 @@ FixupDAR:/* Entry point for dcbx workaround. */
152:
mfdar r11
mtctr r11 /* restore ctr reg from DAR */
-#ifdef CONFIG_VMAP_STACK
mfspr r11, SPRN_SPRG_THREAD
stw r10, DAR(r11)
mfspr r10, SPRN_DSISR
stw r10, DSISR(r11)
-#else
- mtdar r10 /* save fault EA to DAR */
-#endif
mfspr r10,SPRN_M_TW
b DARFixed /* Go back to normal TLB handling */
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 6543cb410a8d..bec57305225d 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -288,7 +288,6 @@ MachineCheck:
. = 0x300
DO_KVM 0x300
DataAccess:
-#ifdef CONFIG_VMAP_STACK
#ifdef CONFIG_PPC_BOOK3S_604
BEGIN_MMU_FTR_SECTION
mtspr SPRN_SPRG_SCRATCH2,r10
@@ -310,29 +309,11 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
1: EXCEPTION_PROLOG_0 handle_dar_dsisr=1
EXCEPTION_PROLOG_1
b handle_page_fault_tramp_1
-#else /* CONFIG_VMAP_STACK */
- EXCEPTION_PROLOG handle_dar_dsisr=1
- get_and_save_dar_dsisr_on_stack r4, r5, r11
-#ifdef CONFIG_PPC_BOOK3S_604
-BEGIN_MMU_FTR_SECTION
- andis. r0, r5, (DSISR_BAD_FAULT_32S | DSISR_DABRMATCH)@h
- bne handle_page_fault_tramp_2 /* if not, try to put a PTE */
- rlwinm r3, r5, 32 - 15, 21, 21 /* DSISR_STORE -> _PAGE_RW */
- bl hash_page
- b handle_page_fault_tramp_1
-MMU_FTR_SECTION_ELSE
-#endif
- b handle_page_fault_tramp_2
-#ifdef CONFIG_PPC_BOOK3S_604
-ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
-#endif
-#endif /* CONFIG_VMAP_STACK */
/* Instruction access exception. */
. = 0x400
DO_KVM 0x400
InstructionAccess:
-#ifdef CONFIG_VMAP_STACK
mtspr SPRN_SPRG_SCRATCH0,r10
mtspr SPRN_SPRG_SCRATCH1,r11
mfspr r10, SPRN_SPRG_THREAD
@@ -353,18 +334,6 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
EXCEPTION_PROLOG_1
EXCEPTION_PROLOG_2
-#else /* CONFIG_VMAP_STACK */
- EXCEPTION_PROLOG
- andis. r0,r9,SRR1_ISI_NOPT@h /* no pte found? */
- beq 1f /* if so, try to put a PTE */
- li r3,0 /* into the hash table */
- mr r4,r12 /* SRR0 is fault address */
-#ifdef CONFIG_PPC_BOOK3S_604
-BEGIN_MMU_FTR_SECTION
- bl hash_page
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
-#endif
-#endif /* CONFIG_VMAP_STACK */
andis. r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
stw r5, _DSISR(r11)
stw r12, _DAR(r11)
@@ -378,7 +347,6 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
DO_KVM 0x600
Alignment:
EXCEPTION_PROLOG handle_dar_dsisr=1
- save_dar_dsisr_on_stack r4, r5, r11
addi r3,r1,STACK_FRAME_OVERHEAD
b alignment_exception_tramp
@@ -686,18 +654,13 @@ alignment_exception_tramp:
EXC_XFER_STD(0x600, alignment_exception)
handle_page_fault_tramp_1:
-#ifdef CONFIG_VMAP_STACK
EXCEPTION_PROLOG_2 handle_dar_dsisr=1
-#endif
lwz r5, _DSISR(r11)
- /* fall through */
-handle_page_fault_tramp_2:
andis. r0, r5, DSISR_DABRMATCH@h
bne- 1f
EXC_XFER_LITE(0x300, handle_page_fault)
1: EXC_XFER_STD(0x300, do_break)
-#ifdef CONFIG_VMAP_STACK
#ifdef CONFIG_PPC_BOOK3S_604
.macro save_regs_thread thread
stw r0, THR0(\thread)
@@ -772,6 +735,7 @@ fast_hash_page_return:
rfi
#endif /* CONFIG_PPC_BOOK3S_604 */
+#ifdef CONFIG_VMAP_STACK
stack_overflow:
vmap_stack_overflow_exception
#endif
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 0fbdacc7fab7..5d0402c48ebc 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -144,7 +144,6 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
li r12,-1 /* clear all pending debug events */
mtspr SPRN_DBSR,r12
lis r11,global_dbcr0@ha
- tophys(r11,r11)
addi r11,r11,global_dbcr0@l
#ifdef CONFIG_SMP
lwz r10, TASK_CPU(r2)
@@ -158,7 +157,6 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
stw r12,4(r11)
3:
- tovirt(r2, r2) /* set r2 to current */
lis r11, transfer_to_syscall@h
ori r11, r11, transfer_to_syscall@l
#ifdef CONFIG_TRACE_IRQFLAGS
diff --git a/arch/powerpc/kernel/idle_6xx.S b/arch/powerpc/kernel/idle_6xx.S
index 69df840f7253..153366e178c4 100644
--- a/arch/powerpc/kernel/idle_6xx.S
+++ b/arch/powerpc/kernel/idle_6xx.S
@@ -145,9 +145,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
/*
* Return from NAP/DOZE mode, restore some CPU specific registers,
- * we are called with DR/IR still off and r2 containing physical
- * address of current. R11 points to the exception frame (physical
- * address). We have to preserve r10.
+ * R11 points to the exception frame. We have to preserve r10.
*/
_GLOBAL(power_save_ppc32_restore)
lwz r9,_LINK(r11) /* interrupted in ppc6xx_idle: */
@@ -166,11 +164,7 @@ BEGIN_FTR_SECTION
mfspr r9,SPRN_HID0
andis. r9,r9,HID0_NAP@h
beq 1f
-#ifdef CONFIG_VMAP_STACK
addis r9, r11, nap_save_msscr0@ha
-#else
- addis r9,r11,(nap_save_msscr0-KERNELBASE)@ha
-#endif
lwz r9,nap_save_msscr0@l(r9)
mtspr SPRN_MSSCR0, r9
sync
@@ -178,11 +172,7 @@ BEGIN_FTR_SECTION
1:
END_FTR_SECTION_IFSET(CPU_FTR_NAP_DISABLE_L2_PR)
BEGIN_FTR_SECTION
-#ifdef CONFIG_VMAP_STACK
addis r9, r11, nap_save_hid1@ha
-#else
- addis r9,r11,(nap_save_hid1-KERNELBASE)@ha
-#endif
lwz r9,nap_save_hid1@l(r9)
mtspr SPRN_HID1, r9
END_FTR_SECTION_IFSET(CPU_FTR_DUAL_PLL_750FX)
diff --git a/arch/powerpc/kernel/idle_e500.S b/arch/powerpc/kernel/idle_e500.S
index 72c85b6f3898..7795727e7f08 100644
--- a/arch/powerpc/kernel/idle_e500.S
+++ b/arch/powerpc/kernel/idle_e500.S
@@ -74,8 +74,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_CAN_NAP)
/*
* Return from NAP/DOZE mode, restore some CPU specific registers,
- * r2 containing physical address of current.
- * r11 points to the exception frame (physical address).
+ * r2 containing address of current.
+ * r11 points to the exception frame.
* We have to preserve r10.
*/
_GLOBAL(power_save_ppc32_restore)
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index 801dc28fdcca..f5a52f444e36 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -67,9 +67,7 @@ _GLOBAL(load_up_altivec)
#ifdef CONFIG_PPC32
mfspr r5,SPRN_SPRG_THREAD /* current task's THREAD (phys) */
oris r9,r9,MSR_VEC@h
-#ifdef CONFIG_VMAP_STACK
tovirt(r5, r5)
-#endif
#else
ld r4,PACACURRENT(r13)
addi r5,r4,THREAD /* Get THREAD */
diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S
index 0e6dc830c38b..fb4233a5bdf7 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -140,10 +140,6 @@ _GLOBAL(hash_page)
bne- .Lretry /* retry if someone got there first */
mfsrin r3,r4 /* get segment reg for segment */
-#ifndef CONFIG_VMAP_STACK
- mfctr r0
- stw r0,_CTR(r11)
-#endif
bl create_hpte /* add the hash table entry */
#ifdef CONFIG_SMP
@@ -152,17 +148,7 @@ _GLOBAL(hash_page)
li r0,0
stw r0, (mmu_hash_lock - PAGE_OFFSET)@l(r8)
#endif
-
-#ifdef CONFIG_VMAP_STACK
b fast_hash_page_return
-#else
- /* Return from the exception */
- lwz r5,_CTR(r11)
- mtctr r5
- lwz r0,GPR0(r11)
- lwz r8,GPR8(r11)
- b fast_exception_return
-#endif
#ifdef CONFIG_SMP
.Lhash_page_out:
--
2.25.0
^ permalink raw reply related
* [PATCH v2 12/14] powerpc/32: Enable instruction translation at the same time as data translation
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
On 40x and 8xx, kernel text is pinned.
On book3s/32, kernel text is mapped by BATs.
Enable instruction translation at the same time as data translation, it
makes things simpler.
In syscall handler, MSR_RI can also be set at the same time because
srr0/srr1 are already saved and r1 is set properly.
On booke, translation is always on, so at the end all PPC32
have translation on early.
Also update comment in power_save_ppc32_restore().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/entry_32.S | 44 ++++++++++----------------------
arch/powerpc/kernel/head_32.h | 39 ++++++++++++----------------
arch/powerpc/kernel/head_40x.S | 10 +++++---
arch/powerpc/kernel/head_booke.h | 7 ++---
4 files changed, 39 insertions(+), 61 deletions(-)
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 8a0e5885b731..6e70c6fdfe8a 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -198,12 +198,8 @@ transfer_to_handler:
transfer_to_handler_cont:
3:
mflr r9
- tovirt(r9, r9)
lwz r11,0(r9) /* virtual address of handler */
lwz r9,4(r9) /* where to go when done */
-#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PERF_EVENTS)
- mtspr SPRN_NRI, r0
-#endif
#ifdef CONFIG_TRACE_IRQFLAGS
/*
* When tracing IRQ state (lockdep) we enable the MMU before we call
@@ -219,13 +215,9 @@ transfer_to_handler_cont:
/* MSR isn't changing, just transition directly */
#endif
- mtspr SPRN_SRR0,r11
- mtspr SPRN_SRR1,r10
+ mtctr r11
mtlr r9
- rfi /* jump to handler, enable MMU */
-#ifdef CONFIG_40x
- b . /* Prevent prefetch past rfi */
-#endif
+ bctr /* jump to handler */
#if defined (CONFIG_PPC_BOOK3S_32) || defined(CONFIG_E500)
4: rlwinm r12,r12,0,~_TLF_NAPPING
@@ -245,21 +237,7 @@ _ASM_NOKPROBE_SYMBOL(transfer_to_handler)
_ASM_NOKPROBE_SYMBOL(transfer_to_handler_cont)
#ifdef CONFIG_TRACE_IRQFLAGS
-1: /* MSR is changing, re-enable MMU so we can notify lockdep. We need to
- * keep interrupts disabled at this point otherwise we might risk
- * taking an interrupt before we tell lockdep they are enabled.
- */
- lis r12,reenable_mmu@h
- ori r12,r12,reenable_mmu@l
- LOAD_REG_IMMEDIATE(r0, MSR_KERNEL)
- mtspr SPRN_SRR0,r12
- mtspr SPRN_SRR1,r0
- rfi
-#ifdef CONFIG_40x
- b . /* Prevent prefetch past rfi */
-#endif
-
-reenable_mmu:
+1:
/*
* We save a bunch of GPRs,
* r3 can be different from GPR3(r1) at this point, r9 and r11
@@ -1334,16 +1312,20 @@ _GLOBAL(enter_rtas)
mtspr SPRN_SRR1,r9
rfi
1:
- li r0, MSR_KERNEL & ~MSR_IR /* can take DTLB miss */
- mtmsr r0
- isync
+ lis r8, 1f@h
+ ori r8, r8, 1f@l
+ LOAD_REG_IMMEDIATE(r9,MSR_KERNEL)
+ mtspr SPRN_SRR0,r8
+ mtspr SPRN_SRR1,r9
+ rfi /* Reactivate MMU translation */
+1:
lwz r8,INT_FRAME_SIZE+4(r1) /* get return address */
lwz r9,8(r1) /* original msr value */
addi r1,r1,INT_FRAME_SIZE
li r0,0
stw r0, THREAD + RTAS_SP(r2)
- mtspr SPRN_SRR0,r8
- mtspr SPRN_SRR1,r9
- rfi /* return to caller */
+ mtlr r8
+ mtmsr r9
+ blr /* return to caller */
_ASM_NOKPROBE_SYMBOL(enter_rtas)
#endif /* CONFIG_PPC_RTAS */
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 5ba4c6391107..7209e63db366 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -57,10 +57,14 @@
.endm
.macro EXCEPTION_PROLOG_2 handle_dar_dsisr=0
- LOAD_REG_IMMEDIATE(r11, MSR_KERNEL & ~(MSR_IR | MSR_RI)) /* can take DTLB miss */
- mtmsr r11
- isync
+ LOAD_REG_IMMEDIATE(r11, MSR_KERNEL & ~MSR_RI) /* re-enable MMU */
+ mtspr SPRN_SRR1, r11
+ lis r11, 1f@h
+ ori r11, r11, 1f@l
+ mtspr SPRN_SRR0, r11
mfspr r11, SPRN_SPRG_SCRATCH2
+ rfi
+1:
stw r11,GPR1(r1)
stw r11,0(r1)
mr r11, r1
@@ -86,7 +90,7 @@
#ifdef CONFIG_40x
rlwinm r9,r9,0,14,12 /* clear MSR_WE (necessary?) */
#else
- li r10, MSR_KERNEL & ~MSR_IR /* can take exceptions */
+ li r10, MSR_KERNEL /* can take exceptions */
mtmsr r10 /* (except for mach check in rtas) */
#endif
stw r0,GPR0(r11)
@@ -107,9 +111,13 @@
lwz r1,TASK_STACK-THREAD(r12)
beq- 99f
addi r1, r1, THREAD_SIZE - INT_FRAME_SIZE
- LOAD_REG_IMMEDIATE(r10, MSR_KERNEL & ~(MSR_IR | MSR_RI)) /* can take DTLB miss */
- mtmsr r10
- isync
+ LOAD_REG_IMMEDIATE(r10, MSR_KERNEL) /* can take exceptions */
+ mtspr SPRN_SRR1, r10
+ lis r10, 1f@h
+ ori r10, r10, 1f@l
+ mtspr SPRN_SRR0, r10
+ rfi
+1:
tovirt(r12, r12)
stw r11,GPR1(r1)
stw r11,0(r1)
@@ -123,9 +131,6 @@
stw r10,_CCR(r11) /* save registers */
#ifdef CONFIG_40x
rlwinm r9,r9,0,14,12 /* clear MSR_WE (necessary?) */
-#else
- LOAD_REG_IMMEDIATE(r10, MSR_KERNEL & ~MSR_IR) /* can take exceptions */
- mtmsr r10 /* (except for mach check in rtas) */
#endif
lis r10,STACK_FRAME_REGS_MARKER@ha /* exception frame marker */
stw r2,GPR2(r11)
@@ -162,8 +167,6 @@
#endif
3:
- lis r11, transfer_to_syscall@h
- ori r11, r11, transfer_to_syscall@l
#ifdef CONFIG_TRACE_IRQFLAGS
/*
* If MSR is changing we need to keep interrupts disabled at this point
@@ -175,15 +178,8 @@
#else
LOAD_REG_IMMEDIATE(r10, MSR_KERNEL | MSR_EE)
#endif
-#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PERF_EVENTS)
- mtspr SPRN_NRI, r0
-#endif
- mtspr SPRN_SRR1,r10
- mtspr SPRN_SRR0,r11
- rfi /* jump to handler, enable MMU */
-#ifdef CONFIG_40x
- b . /* Prevent prefetch past rfi */
-#endif
+ mtmsr r10
+ b transfer_to_syscall /* jump to handler */
99: b ret_from_kernel_syscall
.endm
@@ -220,7 +216,6 @@
#define EXC_XFER_TEMPLATE(hdlr, trap, msr, tfer, ret) \
li r10,trap; \
stw r10,_TRAP(r11); \
- LOAD_REG_IMMEDIATE(r10, msr); \
bl tfer; \
.long hdlr; \
.long ret
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 6b23cbd7f9c5..96ec800ebf7f 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -128,9 +128,13 @@ _ENTRY(saved_ksp_limit)
lwz r11,TASK_STACK-THREAD(r11) /* this thread's kernel stack */
1: stw r1,crit_r1@l(0)
addi r1,r11,THREAD_SIZE-INT_FRAME_SIZE /* Alloc an excpt frm */
- LOAD_REG_IMMEDIATE(r11,MSR_KERNEL & ~(MSR_IR | MSR_RI))
- mtmsr r11
- isync
+ LOAD_REG_IMMEDIATE(r11, MSR_KERNEL & ~(MSR_ME|MSR_DE|MSR_CE)) /* re-enable MMU */
+ mtspr SPRN_SRR1, r11
+ lis r11, 1f@h
+ ori r11, r11, 1f@l
+ mtspr SPRN_SRR0, r11
+ rfi
+1:
lwz r11,crit_r1@l(0)
stw r11,GPR1(r1)
stw r11,0(r1)
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 5d0402c48ebc..2e3cb1cc42fb 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -157,8 +157,6 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
stw r12,4(r11)
3:
- lis r11, transfer_to_syscall@h
- ori r11, r11, transfer_to_syscall@l
#ifdef CONFIG_TRACE_IRQFLAGS
/*
* If MSR is changing we need to keep interrupts disabled at this point
@@ -172,9 +170,8 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
lis r10, (MSR_KERNEL | MSR_EE)@h
ori r10, r10, (MSR_KERNEL | MSR_EE)@l
#endif
- mtspr SPRN_SRR1,r10
- mtspr SPRN_SRR0,r11
- rfi /* jump to handler, enable MMU */
+ mtmsr r10
+ b transfer_to_syscall /* jump to handler */
99: b ret_from_kernel_syscall
.endm
--
2.25.0
^ permalink raw reply related
* [PATCH v2 08/14] powerpc/40x: Prepare normal exception handler for enabling MMU early
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
Ensure normal exception handler are able to manage stuff with
MMU enabled. For that we use CONFIG_VMAP_STACK related code
allthough there is no intention to really activate CONFIG_VMAP_STACK
on powerpc 40x for the moment.
40x uses SPRN_DEAR instead of SPRN_DAR and SPRN_ESR instead of
SPRN_DSISR. Take it into account in common macros.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/entry_32.S | 2 +-
arch/powerpc/kernel/head_32.h | 15 ++++++++++++++-
arch/powerpc/kernel/head_40x.S | 17 ++++++-----------
3 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 715a8b1aafc6..859a13d34bdc 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -162,7 +162,7 @@ transfer_to_handler:
li r12,-1 /* clear all pending debug events */
mtspr SPRN_DBSR,r12
lis r11,global_dbcr0@ha
- tophys(r11,r11)
+ tophys_novmstack r11,r11
addi r11,r11,global_dbcr0@l
#ifdef CONFIG_SMP
lwz r9,TASK_CPU(r2)
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index a2f72c966baf..1f5e64c70ddb 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -22,9 +22,17 @@
#ifdef CONFIG_VMAP_STACK
mfspr r10, SPRN_SPRG_THREAD
.if \handle_dar_dsisr
+#ifdef CONFIG_40x
+ mfspr r11, SPRN_DEAR
+#else
mfspr r11, SPRN_DAR
+#endif
stw r11, DAR(r10)
+#ifdef CONFIG_40x
+ mfspr r11, SPRN_ESR
+#else
mfspr r11, SPRN_DSISR
+#endif
stw r11, DSISR(r10)
.endif
mfspr r11, SPRN_SRR0
@@ -190,7 +198,7 @@
li r12,-1 /* clear all pending debug events */
mtspr SPRN_DBSR,r12
lis r11,global_dbcr0@ha
- tophys(r11,r11)
+ tophys_novmstack(r11,r11)
addi r11,r11,global_dbcr0@l
lwz r12,0(r11)
mtspr SPRN_DBCR0,r12
@@ -228,8 +236,13 @@
.macro save_dar_dsisr_on_stack reg1, reg2, sp
#ifndef CONFIG_VMAP_STACK
+#ifdef CONFIG_40x
+ mfspr \reg1, SPRN_DEAR
+ mfspr \reg2, SPRN_ESR
+#else
mfspr \reg1, SPRN_DAR
mfspr \reg2, SPRN_DSISR
+#endif
stw \reg1, _DAR(\sp)
stw \reg2, _DSISR(\sp)
#endif
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index c0582ad84117..ef0bdc20849f 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -223,11 +223,8 @@ _ENTRY(saved_ksp_limit)
* if they can't resolve the lightweight TLB fault.
*/
START_EXCEPTION(0x0300, DataStorage)
- EXCEPTION_PROLOG
- mfspr r5, SPRN_ESR /* Grab the ESR, save it */
- stw r5, _ESR(r11)
- mfspr r4, SPRN_DEAR /* Grab the DEAR, save it */
- stw r4, _DEAR(r11)
+ EXCEPTION_PROLOG handle_dar_dsisr=1
+ save_dar_dsisr_on_stack r4, r5, r11
EXC_XFER_LITE(0x300, handle_page_fault)
/*
@@ -246,17 +243,15 @@ _ENTRY(saved_ksp_limit)
/* 0x0600 - Alignment Exception */
START_EXCEPTION(0x0600, Alignment)
- EXCEPTION_PROLOG
- mfspr r4,SPRN_DEAR /* Grab the DEAR and save it */
- stw r4,_DEAR(r11)
+ EXCEPTION_PROLOG handle_dar_dsisr=1
+ save_dar_dsisr_on_stack r4, r5, r11
addi r3,r1,STACK_FRAME_OVERHEAD
EXC_XFER_STD(0x600, alignment_exception)
/* 0x0700 - Program Exception */
START_EXCEPTION(0x0700, ProgramCheck)
- EXCEPTION_PROLOG
- mfspr r4,SPRN_ESR /* Grab the ESR and save it */
- stw r4,_ESR(r11)
+ EXCEPTION_PROLOG handle_dar_dsisr=1
+ save_dar_dsisr_on_stack r4, r5, r11
addi r3,r1,STACK_FRAME_OVERHEAD
EXC_XFER_STD(0x700, program_check_exception)
--
2.25.0
^ permalink raw reply related
* [PATCH v2 06/14] powerpc/40x: Reorder a few instructions in critical exception prolog
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
In order to ease preparation for CONFIG_VMAP_STACK, reorder
a few instruction, especially save r1 into stack frame earlier.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_40x.S | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 6394040bbaba..b7caaa09c860 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -119,6 +119,9 @@ _ENTRY(saved_ksp_limit)
lwz r11,TASK_STACK-THREAD(r11) /* this thread's kernel stack */
1: addi r11,r11,THREAD_SIZE-INT_FRAME_SIZE /* Alloc an excpt frm */
tophys(r11,r11)
+ stw r1,GPR1(r11)
+ stw r1,0(r11)
+ tovirt(r1,r11)
stw r10,_CCR(r11) /* save various registers */
stw r12,GPR12(r11)
stw r9,GPR9(r11)
@@ -129,14 +132,11 @@ _ENTRY(saved_ksp_limit)
stw r10,GPR10(r11)
stw r12,GPR11(r11)
mfspr r12,SPRN_DEAR /* save DEAR and ESR in the frame */
- stw r12,_DEAR(r11) /* since they may have had stuff */
mfspr r9,SPRN_ESR /* in them at the point where the */
+ stw r12,_DEAR(r11) /* since they may have had stuff */
stw r9,_ESR(r11) /* exception was taken */
mfspr r12,SPRN_SRR2
- stw r1,GPR1(r11)
mfspr r9,SPRN_SRR3
- stw r1,0(r11)
- tovirt(r1,r11)
rlwinm r9,r9,0,14,12 /* clear MSR_WE (necessary?) */
stw r0,GPR0(r11)
lis r10, STACK_FRAME_REGS_MARKER@ha /* exception frame marker */
--
2.25.0
^ permalink raw reply related
* [PATCH v2 14/14] powerpc/32: Use fast instructions to change MSR EE/RI when available
From: Christophe Leroy @ 2021-01-22 10:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <cover.1611309841.git.christophe.leroy@csgroup.eu>
Booke and 40x have wrtee and wrteei to quickly change MSR EE.
8xx has registers SPRN_NRI, SPRN_EID and SPRN_EIE for changing
MSR EE and RI.
Use them in syscall and exception handler when possible.
On an 8xx, it reduces the null_syscall test by 6 cycles (Two
instances are changed in this patch, meaning we win 3 cycles
per place).
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/hw_irq.h | 46 +++++++++++++++++++++++++++++++
arch/powerpc/kernel/entry_32.S | 26 ++++++-----------
arch/powerpc/kernel/head_32.h | 13 +++++----
arch/powerpc/kernel/head_booke.h | 9 ++----
4 files changed, 66 insertions(+), 28 deletions(-)
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 0363734ff56e..899aa457e143 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -368,6 +368,52 @@ static inline void may_hard_irq_enable(void) { }
#define ARCH_IRQ_INIT_FLAGS IRQ_NOREQUEST
+#else /* __ASSEMBLY__ */
+
+.macro __hard_irq_enable tmp
+#if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
+ wrteei 1
+#elif defined(CONFIG_PPC_8xx)
+ mtspr SPRN_EIE, r2 /* RI=1, EE=1 */
+#else
+ LOAD_REG_IMMEDIATE(\tmp, MSR_KERNEL | MSR_EE)
+ mtmsr \tmp
+#endif
+.endm
+
+.macro __hard_irq_disable tmp
+#if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
+ wrteei 0
+#elif defined(CONFIG_PPC_8xx)
+ mtspr SPRN_EID, r2 /* RI=1, EE=0 */
+#else
+ LOAD_REG_IMMEDIATE(\tmp, MSR_KERNEL)
+ mtmsr \tmp
+#endif
+.endm
+
+.macro __hard_EE_RI_disable tmp
+#if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
+ wrteei 0
+#elif defined(CONFIG_PPC_8xx)
+ mtspr SPRN_NRI, r2 /* RI=0, EE=0 */
+#else
+ LOAD_REG_IMMEDIATE(\tmp, MSR_KERNEL & ~MSR_RI)
+ mtmsr \tmp
+#endif
+.endm
+
+.macro __hard_RI_enable tmp
+#if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
+ /* nop */
+#elif defined(CONFIG_PPC_8xx)
+ mtspr SPRN_EID, r2 /* RI=1, EE=0 */
+#else
+ LOAD_REG_IMMEDIATE(\tmp, MSR_KERNEL)
+ mtmsr \tmp
+#endif
+.endm
+
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_HW_IRQ_H */
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 6e70c6fdfe8a..a9c3974cb95d 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -303,8 +303,7 @@ trace_syscall_entry_irq_off:
bl trace_hardirqs_on
/* Now enable for real */
- LOAD_REG_IMMEDIATE(r10, MSR_KERNEL | MSR_EE)
- mtmsr r10
+ __hard_irq_enable r10
REST_GPR(0, r1)
REST_4GPRS(3, r1)
@@ -373,9 +372,8 @@ ret_from_syscall:
#endif
mr r6,r3
/* disable interrupts so current_thread_info()->flags can't change */
- LOAD_REG_IMMEDIATE(r10,MSR_KERNEL) /* doesn't include MSR_EE */
+ __hard_irq_disable r10
/* Note: We don't bother telling lockdep about it */
- mtmsr r10
lwz r9,TI_FLAGS(r2)
li r8,-MAX_ERRNO
andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
@@ -529,8 +527,7 @@ syscall_exit_work:
/* Re-enable interrupts. There is no need to trace that with
* lockdep as we are supposed to have IRQs on at this point
*/
- ori r10,r10,MSR_EE
- mtmsr r10
+ __hard_irq_enable r10
/* Save NVGPRS if they're not saved already */
lwz r4,_TRAP(r1)
@@ -812,8 +809,7 @@ ret_from_except:
* can't change between when we test it and when we return
* from the interrupt. */
/* Note: We don't bother telling lockdep about it */
- LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)
- mtmsr r10 /* disable interrupts */
+ __hard_irq_disable r10
lwz r3,_MSR(r1) /* Returning to user mode? */
andi. r0,r3,MSR_PR
@@ -971,8 +967,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_NEED_PAIRED_STWCX)
* can restart the exception exit path at the label
* exc_exit_restart below. -- paulus
*/
- LOAD_REG_IMMEDIATE(r10,MSR_KERNEL & ~MSR_RI)
- mtmsr r10 /* clear the RI bit */
+ __hard_EE_RI_disable r10
+
.globl exc_exit_restart
exc_exit_restart:
lwz r12,_NIP(r1)
@@ -1206,26 +1202,22 @@ do_work: /* r10 contains MSR_KERNEL here */
do_resched: /* r10 contains MSR_KERNEL here */
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_on
- mfmsr r10
#endif
- ori r10,r10,MSR_EE
- mtmsr r10 /* hard-enable interrupts */
+ __hard_irq_enable r10
bl schedule
recheck:
/* Note: And we don't tell it we are disabling them again
* neither. Those disable/enable cycles used to peek at
* TI_FLAGS aren't advertised.
*/
- LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)
- mtmsr r10 /* disable interrupts */
+ __hard_irq_disable r10
lwz r9,TI_FLAGS(r2)
andi. r0,r9,_TIF_NEED_RESCHED
bne- do_resched
andi. r0,r9,_TIF_USER_WORK_MASK
beq restore_user
do_user_signal: /* r10 contains MSR_KERNEL here */
- ori r10,r10,MSR_EE
- mtmsr r10 /* hard-enable interrupts */
+ __hard_irq_enable r10
/* save r13-r31 in the exception frame, if not already done */
lwz r3,_TRAP(r1)
andi. r0,r3,1
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 98ed5b928642..69fded26c024 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -3,6 +3,7 @@
#define __HEAD_32_H__
#include <asm/ptrace.h> /* for STACK_FRAME_REGS_MARKER */
+#include <asm/hw_irq.h>
/*
* Exception entry code. This code runs with address translation
@@ -89,10 +90,8 @@
lwz r12, SRR0(r12)
#ifdef CONFIG_40x
rlwinm r9,r9,0,14,12 /* clear MSR_WE (necessary?) */
-#else
- li r10, MSR_KERNEL /* can take exceptions */
- mtmsr r10 /* (except for mach check in rtas) */
#endif
+ __hard_RI_enable r10
stw r0,GPR0(r11)
lis r10,STACK_FRAME_REGS_MARKER@ha /* exception frame marker */
addi r10,r10,STACK_FRAME_REGS_MARKER@l
@@ -173,12 +172,16 @@
* otherwise we might risk taking an interrupt before we tell lockdep
* they are enabled.
*/
+#ifdef CONFIG_40x
+ wrtee r9
+#else
LOAD_REG_IMMEDIATE(r10, MSR_KERNEL)
rlwimi r10, r9, 0, MSR_EE
+ mtmsr r10
+#endif
#else
- LOAD_REG_IMMEDIATE(r10, MSR_KERNEL | MSR_EE)
+ __hard_irq_enable r10
#endif
- mtmsr r10
b transfer_to_syscall /* jump to handler */
99: b ret_from_kernel_syscall
.endm
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 2e3cb1cc42fb..f9da3ea9e7aa 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -5,6 +5,7 @@
#include <asm/ptrace.h> /* for STACK_FRAME_REGS_MARKER */
#include <asm/kvm_asm.h>
#include <asm/kvm_booke_hv_asm.h>
+#include <asm/hw_irq.h>
#ifdef __ASSEMBLY__
@@ -163,14 +164,10 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
* otherwise we might risk taking an interrupt before we tell lockdep
* they are enabled.
*/
- lis r10, MSR_KERNEL@h
- ori r10, r10, MSR_KERNEL@l
- rlwimi r10, r9, 0, MSR_EE
+ wrtee r9
#else
- lis r10, (MSR_KERNEL | MSR_EE)@h
- ori r10, r10, (MSR_KERNEL | MSR_EE)@l
+ __hard_irq_enable r10
#endif
- mtmsr r10
b transfer_to_syscall /* jump to handler */
99: b ret_from_kernel_syscall
.endm
--
2.25.0
^ permalink raw reply related
* Re: [PATCH v2] powerpc/64/signal: balance return predictor stack in signal trampoline
From: Florian Weimer @ 2021-01-22 11:27 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: musl, libc-alpha, linuxppc-dev, Alan Modra
In-Reply-To: <20200511101952.1463138-1-npiggin@gmail.com>
* Nicholas Piggin:
> diff --git a/arch/powerpc/kernel/vdso64/sigtramp.S b/arch/powerpc/kernel/vdso64/sigtramp.S
> index a8cc0409d7d2..bbf68cd01088 100644
> --- a/arch/powerpc/kernel/vdso64/sigtramp.S
> +++ b/arch/powerpc/kernel/vdso64/sigtramp.S
> @@ -6,6 +6,7 @@
> * Copyright (C) 2004 Benjamin Herrenschmuidt (benh@kernel.crashing.org), IBM Corp.
> * Copyright (C) 2004 Alan Modra (amodra@au.ibm.com)), IBM Corp.
> */
> +#include <asm/cache.h> /* IFETCH_ALIGN_BYTES */
> #include <asm/processor.h>
> #include <asm/ppc_asm.h>
> #include <asm/unistd.h>
> @@ -14,21 +15,17 @@
>
> .text
>
> -/* The nop here is a hack. The dwarf2 unwind routines subtract 1 from
> - the return address to get an address in the middle of the presumed
> - call instruction. Since we don't have a call here, we artificially
> - extend the range covered by the unwind info by padding before the
> - real start. */
> - nop
> .balign 8
> + .balign IFETCH_ALIGN_BYTES
> V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
> -.Lsigrt_start = . - 4
> +.Lsigrt_start:
> + bctrl /* call the handler */
> addi r1, r1, __SIGNAL_FRAMESIZE
> li r0,__NR_rt_sigreturn
> sc
> .Lsigrt_end:
> V_FUNCTION_END(__kernel_sigtramp_rt64)
> -/* The ".balign 8" above and the following zeros mimic the old stack
> +/* The .balign 8 above and the following zeros mimic the old stack
> trampoline layout. The last magic value is the ucontext pointer,
> chosen in such a way that older libgcc unwind code returns a zero
> for a sigcontext pointer. */
As far as I understand it, this breaks cancellation handling on musl and
future glibc because it is necessary to look at the signal delivery
location to see if a system call sequence has result in an action, and
that location is no longer in user code after this change.
We have a glibc test in preparation of our change, and it started
failing:
Linux 5.10 breaks sigcontext_get_pc on powerpc64
<https://sourceware.org/bugzilla/show_bug.cgi?id=27223>
Isn't it possible to avoid the return predictor desynchronization by
adding the appropriate hint?
Thanks,
Florian
--
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
^ permalink raw reply
* [PATCH v4 1/2] powerpc/mce: Reduce the size of event arrays
From: Ganesh Goudar @ 2021-01-22 12:32 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Ganesh Goudar, mahesh, npiggin
Maximum recursive depth of MCE is 4, Considering the maximum depth
allowed reduce the size of event to 10 from 100. This saves us ~19kB
of memory and has no fatal consequences.
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
v4: This patch is a fragment of the orignal patch which is
split into two.
---
arch/powerpc/include/asm/mce.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index e6c27ae843dc..7d8b6679ec68 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -204,7 +204,7 @@ struct mce_error_info {
bool ignore_event;
};
-#define MAX_MC_EVT 100
+#define MAX_MC_EVT 10
/* Release flags for get_mce_event() */
#define MCE_EVENT_RELEASE true
--
2.26.2
^ permalink raw reply related
* [PATCH v4 2/2] powerpc/mce: Remove per cpu variables from MCE handlers
From: Ganesh Goudar @ 2021-01-22 12:32 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Ganesh Goudar, mahesh, npiggin
In-Reply-To: <20210122123244.34033-1-ganeshgr@linux.ibm.com>
Access to per-cpu variables requires translation to be enabled on
pseries machine running in hash mmu mode, Since part of MCE handler
runs in realmode and part of MCE handling code is shared between ppc
architectures pseries and powernv, it becomes difficult to manage
these variables differently on different architectures, So have
these variables in paca instead of having them as per-cpu variables
to avoid complications.
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
v2: Dynamically allocate memory for machine check event info
v3: Remove check for hash mmu lpar, use memblock_alloc_try_nid
to allocate memory.
v4: Spliting the patch into two.
---
arch/powerpc/include/asm/mce.h | 18 +++++++
arch/powerpc/include/asm/paca.h | 4 ++
arch/powerpc/kernel/mce.c | 79 ++++++++++++++++++------------
arch/powerpc/kernel/setup-common.c | 2 +-
4 files changed, 70 insertions(+), 33 deletions(-)
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 7d8b6679ec68..331d944280b8 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -206,6 +206,17 @@ struct mce_error_info {
#define MAX_MC_EVT 10
+struct mce_info {
+ int mce_nest_count;
+ struct machine_check_event mce_event[MAX_MC_EVT];
+ /* Queue for delayed MCE events. */
+ int mce_queue_count;
+ struct machine_check_event mce_event_queue[MAX_MC_EVT];
+ /* Queue for delayed MCE UE events. */
+ int mce_ue_count;
+ struct machine_check_event mce_ue_event_queue[MAX_MC_EVT];
+};
+
/* Release flags for get_mce_event() */
#define MCE_EVENT_RELEASE true
#define MCE_EVENT_DONTRELEASE false
@@ -234,4 +245,11 @@ long __machine_check_early_realmode_p8(struct pt_regs *regs);
long __machine_check_early_realmode_p9(struct pt_regs *regs);
long __machine_check_early_realmode_p10(struct pt_regs *regs);
#endif /* CONFIG_PPC_BOOK3S_64 */
+
+#ifdef CONFIG_PPC_BOOK3S_64
+void mce_init(void);
+#else
+static inline void mce_init(void) { };
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
#endif /* __ASM_PPC64_MCE_H__ */
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 9454d29ff4b4..38e0c55e845d 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -29,6 +29,7 @@
#include <asm/hmi.h>
#include <asm/cpuidle.h>
#include <asm/atomic.h>
+#include <asm/mce.h>
#include <asm-generic/mmiowb_types.h>
@@ -273,6 +274,9 @@ struct paca_struct {
#ifdef CONFIG_MMIOWB
struct mmiowb_state mmiowb_state;
#endif
+#ifdef CONFIG_PPC_BOOK3S_64
+ struct mce_info *mce_info;
+#endif /* CONFIG_PPC_BOOK3S_64 */
} ____cacheline_aligned;
extern void copy_mm_to_paca(struct mm_struct *mm);
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 9f3e133b57b7..6ec5c68997ed 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -17,22 +17,13 @@
#include <linux/irq_work.h>
#include <linux/extable.h>
#include <linux/ftrace.h>
+#include <linux/memblock.h>
#include <asm/machdep.h>
#include <asm/mce.h>
#include <asm/nmi.h>
-static DEFINE_PER_CPU(int, mce_nest_count);
-static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event);
-
-/* Queue for delayed MCE events. */
-static DEFINE_PER_CPU(int, mce_queue_count);
-static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event_queue);
-
-/* Queue for delayed MCE UE events. */
-static DEFINE_PER_CPU(int, mce_ue_count);
-static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT],
- mce_ue_event_queue);
+#include "setup.h"
static void machine_check_process_queued_event(struct irq_work *work);
static void machine_check_ue_irq_work(struct irq_work *work);
@@ -103,9 +94,10 @@ void save_mce_event(struct pt_regs *regs, long handled,
struct mce_error_info *mce_err,
uint64_t nip, uint64_t addr, uint64_t phys_addr)
{
- int index = __this_cpu_inc_return(mce_nest_count) - 1;
- struct machine_check_event *mce = this_cpu_ptr(&mce_event[index]);
+ int index = local_paca->mce_info->mce_nest_count++;
+ struct machine_check_event *mce;
+ mce = &local_paca->mce_info->mce_event[index];
/*
* Return if we don't have enough space to log mce event.
* mce_nest_count may go beyond MAX_MC_EVT but that's ok,
@@ -191,7 +183,7 @@ void save_mce_event(struct pt_regs *regs, long handled,
*/
int get_mce_event(struct machine_check_event *mce, bool release)
{
- int index = __this_cpu_read(mce_nest_count) - 1;
+ int index = local_paca->mce_info->mce_nest_count - 1;
struct machine_check_event *mc_evt;
int ret = 0;
@@ -201,7 +193,7 @@ int get_mce_event(struct machine_check_event *mce, bool release)
/* Check if we have MCE info to process. */
if (index < MAX_MC_EVT) {
- mc_evt = this_cpu_ptr(&mce_event[index]);
+ mc_evt = &local_paca->mce_info->mce_event[index];
/* Copy the event structure and release the original */
if (mce)
*mce = *mc_evt;
@@ -211,7 +203,7 @@ int get_mce_event(struct machine_check_event *mce, bool release)
}
/* Decrement the count to free the slot. */
if (release)
- __this_cpu_dec(mce_nest_count);
+ local_paca->mce_info->mce_nest_count--;
return ret;
}
@@ -233,13 +225,14 @@ static void machine_check_ue_event(struct machine_check_event *evt)
{
int index;
- index = __this_cpu_inc_return(mce_ue_count) - 1;
+ index = local_paca->mce_info->mce_ue_count++;
/* If queue is full, just return for now. */
if (index >= MAX_MC_EVT) {
- __this_cpu_dec(mce_ue_count);
+ local_paca->mce_info->mce_ue_count--;
return;
}
- memcpy(this_cpu_ptr(&mce_ue_event_queue[index]), evt, sizeof(*evt));
+ memcpy(&local_paca->mce_info->mce_ue_event_queue[index],
+ evt, sizeof(*evt));
/* Queue work to process this event later. */
irq_work_queue(&mce_ue_event_irq_work);
@@ -256,13 +249,14 @@ void machine_check_queue_event(void)
if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
return;
- index = __this_cpu_inc_return(mce_queue_count) - 1;
+ index = local_paca->mce_info->mce_queue_count++;
/* If queue is full, just return for now. */
if (index >= MAX_MC_EVT) {
- __this_cpu_dec(mce_queue_count);
+ local_paca->mce_info->mce_queue_count--;
return;
}
- memcpy(this_cpu_ptr(&mce_event_queue[index]), &evt, sizeof(evt));
+ memcpy(&local_paca->mce_info->mce_event_queue[index],
+ &evt, sizeof(evt));
/* Queue irq work to process this event later. */
irq_work_queue(&mce_event_process_work);
@@ -289,9 +283,9 @@ static void machine_process_ue_event(struct work_struct *work)
int index;
struct machine_check_event *evt;
- while (__this_cpu_read(mce_ue_count) > 0) {
- index = __this_cpu_read(mce_ue_count) - 1;
- evt = this_cpu_ptr(&mce_ue_event_queue[index]);
+ while (local_paca->mce_info->mce_ue_count > 0) {
+ index = local_paca->mce_info->mce_ue_count - 1;
+ evt = &local_paca->mce_info->mce_ue_event_queue[index];
blocking_notifier_call_chain(&mce_notifier_list, 0, evt);
#ifdef CONFIG_MEMORY_FAILURE
/*
@@ -304,7 +298,7 @@ static void machine_process_ue_event(struct work_struct *work)
*/
if (evt->error_type == MCE_ERROR_TYPE_UE) {
if (evt->u.ue_error.ignore_event) {
- __this_cpu_dec(mce_ue_count);
+ local_paca->mce_info->mce_ue_count--;
continue;
}
@@ -320,7 +314,7 @@ static void machine_process_ue_event(struct work_struct *work)
"was generated\n");
}
#endif
- __this_cpu_dec(mce_ue_count);
+ local_paca->mce_info->mce_ue_count--;
}
}
/*
@@ -338,17 +332,17 @@ static void machine_check_process_queued_event(struct irq_work *work)
* For now just print it to console.
* TODO: log this error event to FSP or nvram.
*/
- while (__this_cpu_read(mce_queue_count) > 0) {
- index = __this_cpu_read(mce_queue_count) - 1;
- evt = this_cpu_ptr(&mce_event_queue[index]);
+ while (local_paca->mce_info->mce_queue_count > 0) {
+ index = local_paca->mce_info->mce_queue_count - 1;
+ evt = &local_paca->mce_info->mce_event_queue[index];
if (evt->error_type == MCE_ERROR_TYPE_UE &&
evt->u.ue_error.ignore_event) {
- __this_cpu_dec(mce_queue_count);
+ local_paca->mce_info->mce_queue_count--;
continue;
}
machine_check_print_event_info(evt, false, false);
- __this_cpu_dec(mce_queue_count);
+ local_paca->mce_info->mce_queue_count--;
}
}
@@ -741,3 +735,24 @@ long hmi_exception_realmode(struct pt_regs *regs)
return 1;
}
+
+void __init mce_init(void)
+{
+ struct mce_info *mce_info;
+ u64 limit;
+ int i;
+
+ limit = min(ppc64_bolted_size(), ppc64_rma_size);
+ for_each_possible_cpu(i) {
+ mce_info = memblock_alloc_try_nid(sizeof(*mce_info),
+ __alignof__(*mce_info),
+ MEMBLOCK_LOW_LIMIT,
+ limit, cpu_to_node(i));
+ if (!mce_info)
+ goto err;
+ paca_ptrs[i]->mce_info = mce_info;
+ }
+ return;
+err:
+ panic("Failed to allocate memory for MCE event data\n");
+}
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 71f38e9248be..17dc451f0e45 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -916,7 +916,6 @@ void __init setup_arch(char **cmdline_p)
/* On BookE, setup per-core TLB data structures. */
setup_tlb_core_data();
#endif
-
/* Print various info about the machine that has been gathered so far. */
print_system_info();
@@ -938,6 +937,7 @@ void __init setup_arch(char **cmdline_p)
exc_lvl_early_init();
emergency_stack_init();
+ mce_init();
smp_release_cpus();
initmem_init();
--
2.26.2
^ permalink raw reply related
* Re: [PATCH] powerpc/prom: Fix "ibm,arch-vec-5-platform-support" scan
From: Fabiano Rosas @ 2021-01-22 13:25 UTC (permalink / raw)
To: Cédric Le Goater, linuxppc-dev; +Cc: Cédric Le Goater, stable
In-Reply-To: <20210122075029.797013-1-clg@kaod.org>
Cédric Le Goater <clg@kaod.org> writes:
> The "ibm,arch-vec-5-platform-support" property is a list of pairs of
> bytes representing the options and values supported by the platform
> firmware. At boot time, Linux scans this list and activates the
> available features it recognizes : Radix and XIVE.
>
> A recent change modified the number of entries to loop on and 8 bytes,
> 4 pairs of { options, values } entries are always scanned. This is
> fine on KVM but not on PowerVM which can advertises less. As a
> consequence on this platform, Linux reads extra entries pointing to
> random data, interprets these as available features and tries to
> activate them, leading to a firmware crash in
> ibm,client-architecture-support.
>
> Fix that by using the property length of "ibm,arch-vec-5-platform-support".
>
> Cc: stable@vger.kernel.org # v4.20+
> Fixes: ab91239942a9 ("powerpc/prom: Remove VLA in prom_check_platform_support()")
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
> ---
> arch/powerpc/kernel/prom_init.c | 12 ++++--------
> 1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index e9d4eb6144e1..ccf77b985c8f 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -1331,14 +1331,10 @@ static void __init prom_check_platform_support(void)
> if (prop_len > sizeof(vec))
> prom_printf("WARNING: ibm,arch-vec-5-platform-support longer than expected (len: %d)\n",
> prop_len);
> - prom_getprop(prom.chosen, "ibm,arch-vec-5-platform-support",
> - &vec, sizeof(vec));
> - for (i = 0; i < sizeof(vec); i += 2) {
> - prom_debug("%d: index = 0x%x val = 0x%x\n", i / 2
> - , vec[i]
> - , vec[i + 1]);
> - prom_parse_platform_support(vec[i], vec[i + 1],
> - &supported);
> + prom_getprop(prom.chosen, "ibm,arch-vec-5-platform-support", &vec, sizeof(vec));
> + for (i = 0; i < prop_len; i += 2) {
> + prom_debug("%d: index = 0x%x val = 0x%x\n", i / 2, vec[i], vec[i + 1]);
> + prom_parse_platform_support(vec[i], vec[i + 1], &supported);
> }
> }
^ permalink raw reply
* Re: [musl] Re: [PATCH v2] powerpc/64/signal: balance return predictor stack in signal trampoline
From: Rich Felker @ 2021-01-22 14:44 UTC (permalink / raw)
To: Florian Weimer
Cc: musl, libc-alpha, linuxppc-dev, Nicholas Piggin, Alan Modra
In-Reply-To: <87im7pp5yl.fsf@oldenburg.str.redhat.com>
On Fri, Jan 22, 2021 at 12:27:14PM +0100, Florian Weimer wrote:
> * Nicholas Piggin:
>
> > diff --git a/arch/powerpc/kernel/vdso64/sigtramp.S b/arch/powerpc/kernel/vdso64/sigtramp.S
> > index a8cc0409d7d2..bbf68cd01088 100644
> > --- a/arch/powerpc/kernel/vdso64/sigtramp.S
> > +++ b/arch/powerpc/kernel/vdso64/sigtramp.S
> > @@ -6,6 +6,7 @@
> > * Copyright (C) 2004 Benjamin Herrenschmuidt (benh@kernel.crashing.org), IBM Corp.
> > * Copyright (C) 2004 Alan Modra (amodra@au.ibm.com)), IBM Corp.
> > */
> > +#include <asm/cache.h> /* IFETCH_ALIGN_BYTES */
> > #include <asm/processor.h>
> > #include <asm/ppc_asm.h>
> > #include <asm/unistd.h>
> > @@ -14,21 +15,17 @@
> >
> > .text
> >
> > -/* The nop here is a hack. The dwarf2 unwind routines subtract 1 from
> > - the return address to get an address in the middle of the presumed
> > - call instruction. Since we don't have a call here, we artificially
> > - extend the range covered by the unwind info by padding before the
> > - real start. */
> > - nop
> > .balign 8
> > + .balign IFETCH_ALIGN_BYTES
> > V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
> > -.Lsigrt_start = . - 4
> > +.Lsigrt_start:
> > + bctrl /* call the handler */
> > addi r1, r1, __SIGNAL_FRAMESIZE
> > li r0,__NR_rt_sigreturn
> > sc
> > .Lsigrt_end:
> > V_FUNCTION_END(__kernel_sigtramp_rt64)
> > -/* The ".balign 8" above and the following zeros mimic the old stack
> > +/* The .balign 8 above and the following zeros mimic the old stack
> > trampoline layout. The last magic value is the ucontext pointer,
> > chosen in such a way that older libgcc unwind code returns a zero
> > for a sigcontext pointer. */
>
> As far as I understand it, this breaks cancellation handling on musl and
> future glibc because it is necessary to look at the signal delivery
> location to see if a system call sequence has result in an action, and
> that location is no longer in user code after this change.
>
> We have a glibc test in preparation of our change, and it started
> failing:
>
> Linux 5.10 breaks sigcontext_get_pc on powerpc64
> <https://sourceware.org/bugzilla/show_bug.cgi?id=27223>
>
> Isn't it possible to avoid the return predictor desynchronization by
> adding the appropriate hint?
Maybe I'm missing something but I don't see how this would break musl;
we just inspect the PC in the mcontext, which I don't see any changes
to and which should still point to the next instruction of the
interrupted context. I don't have a test environment though so I'll
have to wait for feedback from ppc users to be sure. Are there any
further details on how it's breaking glibc?
Rich
^ permalink raw reply
* Re: [PATCH v2] powerpc/64/signal: balance return predictor stack in signal trampoline
From: Raoni Fassina Firmino @ 2021-01-22 18:13 UTC (permalink / raw)
To: Florian Weimer
Cc: libc-alpha, musl, linuxppc-dev, Nicholas Piggin, Alan Modra
In-Reply-To: <87im7pp5yl.fsf@oldenburg.str.redhat.com>
On Fri, Jan 22, 2021 at 12:27:14PM +0100, AL glibc-alpha wrote:
> * Nicholas Piggin:
>
> > diff --git a/arch/powerpc/kernel/vdso64/sigtramp.S b/arch/powerpc/kernel/vdso64/sigtramp.S
> > index a8cc0409d7d2..bbf68cd01088 100644
> > --- a/arch/powerpc/kernel/vdso64/sigtramp.S
> > +++ b/arch/powerpc/kernel/vdso64/sigtramp.S
> > @@ -6,6 +6,7 @@
> > * Copyright (C) 2004 Benjamin Herrenschmuidt (benh@kernel.crashing.org), IBM Corp.
> > * Copyright (C) 2004 Alan Modra (amodra@au.ibm.com)), IBM Corp.
> > */
> > +#include <asm/cache.h> /* IFETCH_ALIGN_BYTES */
> > #include <asm/processor.h>
> > #include <asm/ppc_asm.h>
> > #include <asm/unistd.h>
> > @@ -14,21 +15,17 @@
> >
> > .text
> >
> > -/* The nop here is a hack. The dwarf2 unwind routines subtract 1 from
> > - the return address to get an address in the middle of the presumed
> > - call instruction. Since we don't have a call here, we artificially
> > - extend the range covered by the unwind info by padding before the
> > - real start. */
> > - nop
> > .balign 8
> > + .balign IFETCH_ALIGN_BYTES
> > V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
> > -.Lsigrt_start = . - 4
> > +.Lsigrt_start:
> > + bctrl /* call the handler */
> > addi r1, r1, __SIGNAL_FRAMESIZE
> > li r0,__NR_rt_sigreturn
> > sc
> > .Lsigrt_end:
> > V_FUNCTION_END(__kernel_sigtramp_rt64)
> > -/* The ".balign 8" above and the following zeros mimic the old stack
> > +/* The .balign 8 above and the following zeros mimic the old stack
> > trampoline layout. The last magic value is the ucontext pointer,
> > chosen in such a way that older libgcc unwind code returns a zero
> > for a sigcontext pointer. */
>
> As far as I understand it, this breaks cancellation handling on musl and
> future glibc because it is necessary to look at the signal delivery
> location to see if a system call sequence has result in an action, and
> that location is no longer in user code after this change.
>
> We have a glibc test in preparation of our change, and it started
> failing:
>
> Linux 5.10 breaks sigcontext_get_pc on powerpc64
> <https://sourceware.org/bugzilla/show_bug.cgi?id=27223>
>
> Isn't it possible to avoid the return predictor desynchronization by
> adding the appropriate hint?
I also caught this regression, I believe it was introduced in the kernel
5.9.
I don't know enough to comment on Florian suggestion, but I am working
on some possible fixes:
On the kernel side we can keep `__kernel_sigtramp_rt64` in the original
place after `bctrl` and add a new symbol so the kernel can jump to the
right place before `bctrl`. This would ensure backward compatibility.
On the other side, this change exposed how fragile `backtrace()` is to
any changes in the trampoline code, which the libc has no control over
in this case. So maybe there is something that can be improved in how
backtrace decides that the return-address is to the trampoline. My fist
option is to test a range, after `__kernel_sigtramp_rt64` so see if the
address is inside the routine. This would be better if we can know the
size of the function, I know that the vdso.so has this info in the elf
but I don't know if it is exposed to the glibc.
As Nicholas mentioned in his patch, GDB seems to keep working just fine,
it is seems that GDB uses some heuristics to match the surround code of
the return address to identify that it is the trampoline code. So maybe
other option is to do something similar.
o/
Raoni Fassina
^ permalink raw reply
* Re: [musl] Re: [PATCH v2] powerpc/64/signal: balance return predictor stack in signal trampoline
From: Raoni Fassina Firmino @ 2021-01-22 18:19 UTC (permalink / raw)
To: Rich Felker
Cc: Florian Weimer, libc-alpha, Alan Modra, musl, Nicholas Piggin,
linuxppc-dev
In-Reply-To: <20210122144402.GP23432@brightrain.aerifal.cx>
On Fri, Jan 22, 2021 at 09:44:05AM -0500, Rich Felker wrote:
> Maybe I'm missing something but I don't see how this would break musl;
> we just inspect the PC in the mcontext, which I don't see any changes
> to and which should still point to the next instruction of the
> interrupted context. I don't have a test environment though so I'll
> have to wait for feedback from ppc users to be sure. Are there any
> further details on how it's breaking glibc?
For glibc, backtrace() compares the return-address from each stack frame
to the value of `__kernel_sigtramp_rt64` to identify the frame with the
mcontext information, but now the return-address is not the start of the
routine, but the middle of it, so it fails to catch this special frame.
o/
Raoni Fassina
^ permalink raw reply
* Re: [musl] Re: [PATCH v2] powerpc/64/signal: balance return predictor stack in signal trampoline
From: Rich Felker @ 2021-01-22 18:31 UTC (permalink / raw)
To: Florian Weimer, musl, libc-alpha, linuxppc-dev, Nicholas Piggin,
Alan Modra
In-Reply-To: <20210122181922.pcxyomeg5xcf2umu@work-tp>
On Fri, Jan 22, 2021 at 03:19:22PM -0300, Raoni Fassina Firmino wrote:
> On Fri, Jan 22, 2021 at 09:44:05AM -0500, Rich Felker wrote:
> > Maybe I'm missing something but I don't see how this would break musl;
> > we just inspect the PC in the mcontext, which I don't see any changes
> > to and which should still point to the next instruction of the
> > interrupted context. I don't have a test environment though so I'll
> > have to wait for feedback from ppc users to be sure. Are there any
> > further details on how it's breaking glibc?
>
> For glibc, backtrace() compares the return-address from each stack frame
> to the value of `__kernel_sigtramp_rt64` to identify the frame with the
> mcontext information, but now the return-address is not the start of the
> routine, but the middle of it, so it fails to catch this special frame.
Is there a reason it's backtracing rather than just looking at the
interrupted context (pointed to by the third argument to the signal
handler)?
Rich
^ permalink raw reply
* Re: [musl] Re: [PATCH v2] powerpc/64/signal: balance return predictor stack in signal trampoline
From: Raoni Fassina Firmino @ 2021-01-22 18:50 UTC (permalink / raw)
To: Rich Felker
Cc: Florian Weimer, libc-alpha, Alan Modra, musl, Nicholas Piggin,
linuxppc-dev
In-Reply-To: <20210122183127.GQ23432@brightrain.aerifal.cx>
On Fri, Jan 22, 2021 at 01:31:27PM -0500, Rich Felker wrote:
> On Fri, Jan 22, 2021 at 03:19:22PM -0300, Raoni Fassina Firmino wrote:
> > On Fri, Jan 22, 2021 at 09:44:05AM -0500, Rich Felker wrote:
> > > Maybe I'm missing something but I don't see how this would break musl;
> > > we just inspect the PC in the mcontext, which I don't see any changes
> > > to and which should still point to the next instruction of the
> > > interrupted context. I don't have a test environment though so I'll
> > > have to wait for feedback from ppc users to be sure. Are there any
> > > further details on how it's breaking glibc?
> >
> > For glibc, backtrace() compares the return-address from each stack frame
> > to the value of `__kernel_sigtramp_rt64` to identify the frame with the
> > mcontext information, but now the return-address is not the start of the
> > routine, but the middle of it, so it fails to catch this special frame.
>
> Is there a reason it's backtracing rather than just looking at the
> interrupted context (pointed to by the third argument to the signal
> handler)?
The regression is exposed in the backtrace() routine. More precisely,
when the backtrace() is called from inside a signal handling. What I
described is the way backtrace() uses to identify this special
situation. What is failling in glibc is the test for this.
o/
Raoni Fassina
^ permalink raw reply
* Re: [PATCH 1/2] ima: Free IMA measurement buffer on error
From: Thiago Jung Bauermann @ 2021-01-22 22:30 UTC (permalink / raw)
To: Lakshmi Ramasubramanian
Cc: sashal, dmitry.kasatkin, linux-kernel, zohar, tyhicks, ebiederm,
gregkh, linux-integrity, linuxppc-dev
In-Reply-To: <20210121173003.18324-1-nramas@linux.microsoft.com>
Hi Lakshmi,
Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
> IMA allocates kernel virtual memory to carry forward the measurement
> list, from the current kernel to the next kernel on kexec system call,
> in ima_add_kexec_buffer() function. In error code paths this memory
> is not freed resulting in memory leak.
>
> Free the memory allocated for the IMA measurement list in
> the error code paths in ima_add_kexec_buffer() function.
>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list")
> ---
> security/integrity/ima/ima_kexec.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
> index 121de3e04af2..212145008a01 100644
> --- a/security/integrity/ima/ima_kexec.c
> +++ b/security/integrity/ima/ima_kexec.c
> @@ -119,12 +119,14 @@ void ima_add_kexec_buffer(struct kimage *image)
> ret = kexec_add_buffer(&kbuf);
> if (ret) {
> pr_err("Error passing over kexec measurement buffer.\n");
> + vfree(kexec_buffer);
> return;
> }
This is a good catch.
>
> ret = arch_ima_add_kexec_buffer(image, kbuf.mem, kexec_segment_size);
> if (ret) {
> pr_err("Error passing over kexec measurement buffer.\n");
> + vfree(kexec_buffer);
> return;
> }
But this would cause problems, because the buffer is still there in the
kimage and would cause kimage_load_segment() to access invalid memory.
There's no function to undo a kexec_add_buffer() to avoid this problem,
so I'd suggest just accepting the leak in this case. Fortunately, the
current implementations of arch_ima_add_kexec_buffer() are very simple
and cannot fail, so this is a theoretical problem.
--
Thiago Jung Bauermann
IBM Linux Technology Center
^ permalink raw reply
* Re: [PATCH 2/2] ima: Free IMA measurement buffer after kexec syscall
From: Thiago Jung Bauermann @ 2021-01-22 22:31 UTC (permalink / raw)
To: Lakshmi Ramasubramanian
Cc: sashal, dmitry.kasatkin, linux-kernel, zohar, tyhicks, ebiederm,
gregkh, linux-integrity, linuxppc-dev
In-Reply-To: <20210121173003.18324-2-nramas@linux.microsoft.com>
Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
> IMA allocates kernel virtual memory to carry forward the measurement
> list, from the current kernel to the next kernel on kexec system call,
> in ima_add_kexec_buffer() function. This buffer is not freed before
> completing the kexec system call resulting in memory leak.
>
> Add ima_buffer field in "struct kimage" to store the virtual address
> of the buffer allocated for the IMA measurement list.
> Free the memory allocated for the IMA measurement list in
> kimage_file_post_load_cleanup() function.
>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list")
Good catch.
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
--
Thiago Jung Bauermann
IBM Linux Technology Center
^ permalink raw reply
* Re: [PATCH] lib/sstep: Fix incorrect return from analyze_instr()
From: Michael Ellerman @ 2021-01-23 0:33 UTC (permalink / raw)
To: Ananth N Mavinakayanahalli, linuxppc-dev
Cc: naveen.n.rao, ravi.bangoria, paulus, sandipan
In-Reply-To: <161124771457.333703.14641179082577500423.stgit@thinktux.local>
Ananth N Mavinakayanahalli <ananth@linux.ibm.com> writes:
> We currently just percolate the return value from analyze_instr()
> to the caller of emulate_step(), especially if it is a -1.
>
> For one particular case (opcode = 4) for instructions that
> aren't currently emulated, we are returning 'should not be
> single-stepped' while we should have returned 0 which says
> 'did not emulate, may have to single-step'.
>
> Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
> Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
> arch/powerpc/lib/sstep.c | 49 +++++++++++++++++++++++++---------------------
> 1 file changed, 27 insertions(+), 22 deletions(-)
>
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 5a425a4a1d88..a3a0373843cd 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1445,34 +1445,39 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>
> #ifdef __powerpc64__
> case 4:
> - if (!cpu_has_feature(CPU_FTR_ARCH_300))
> - return -1;
> -
> - switch (word & 0x3f) {
> - case 48: /* maddhd */
> - asm volatile(PPC_MADDHD(%0, %1, %2, %3) :
> - "=r" (op->val) : "r" (regs->gpr[ra]),
> - "r" (regs->gpr[rb]), "r" (regs->gpr[rc]));
> - goto compute_done;
> + /*
> + * There are very many instructions with this primary opcode
> + * introduced in the ISA as early as v2.03. However, the ones
> + * we currently emulate were all introduced with ISA 3.0
> + */
> + if (cpu_has_feature(CPU_FTR_ARCH_300)) {
> + switch (word & 0x3f) {
> + case 48: /* maddhd */
> + asm volatile(PPC_MADDHD(%0, %1, %2, %3) :
> + "=r" (op->val) : "r" (regs->gpr[ra]),
> + "r" (regs->gpr[rb]), "r" (regs->gpr[rc]));
> + goto compute_done;
Indenting everything makes this patch harder to read, and I think makes
the resulting code harder to read too. We already have two levels of
switch here, and we're inside a ~1700 line function, so keeping things
simple is important I think.
Doesn't this achieve the same result?
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index bf7a7d62ae8b..d631baaf1da2 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1443,8 +1443,10 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
#ifdef __powerpc64__
case 4:
- if (!cpu_has_feature(CPU_FTR_ARCH_300))
- return -1;
+ if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
+ op->type = UNKNOWN;
+ return 0;
+ }
switch (word & 0x3f) {
case 48: /* maddhd */
@@ -1470,7 +1472,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
* There are other instructions from ISA 3.0 with the same
* primary opcode which do not have emulation support yet.
*/
- return -1;
+ op->type = UNKNOWN;
+ return 0;
#endif
case 7: /* mulli */
cheers
^ permalink raw reply related
* Re: [PATCH 6/6] powerpc/rtas: constrain user region allocation to RMA
From: Alexey Kardashevskiy @ 2021-01-23 1:54 UTC (permalink / raw)
To: Nathan Lynch, Michael Ellerman, linuxppc-dev
Cc: tyreld, brking, ajd, aneesh.kumar
In-Reply-To: <87pn1ywbs9.fsf@linux.ibm.com>
On 22/01/2021 02:27, Nathan Lynch wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>>> On 16/01/2021 02:38, Nathan Lynch wrote:
>>>>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>>>>> On 15/01/2021 09:00, Nathan Lynch wrote:
>>>>>>> Memory locations passed as arguments from the OS to RTAS usually need
>>>>>>> to be addressable in 32-bit mode and must reside in the Real Mode
>>>>>>> Area. On PAPR guests, the RMA starts at logical address 0 and is the
>>>>>>> first logical memory block reported in the LPAR’s device tree.
>>>>>>>
>>>>>>> On powerpc targets with RTAS, Linux makes available to user space a
>>>>>>> region of memory suitable for arguments to be passed to RTAS via
>>>>>>> sys_rtas(). This region (rtas_rmo_buf) is allocated via the memblock
>>>>>>> API during boot in order to ensure that it satisfies the requirements
>>>>>>> described above.
>>>>>>>
>>>>>>> With radix MMU, the upper limit supplied to the memblock allocation
>>>>>>> can exceed the bounds of the first logical memory block, since
>>>>>>> ppc64_rma_size is ULONG_MAX and RTAS_INSTANTIATE_MAX is 1GB. (512MB is
>>>>>>> a common size of the first memory block according to a small sample of
>>>>>>> LPARs I have checked.) This leads to failures when user space invokes
>>>>>>> an RTAS function that uses a work area, such as
>>>>>>> ibm,configure-connector.
>>>>>>>
>>>>>>> Alter the determination of the upper limit for rtas_rmo_buf's
>>>>>>> allocation to consult the device tree directly, ensuring placement
>>>>>>> within the RMA regardless of the MMU in use.
>>>>>>
>>>>>> Can we tie this with RTAS (which also needs to be in RMA) and simply add
>>>>>> extra 64K in prom_instantiate_rtas() and advertise this address
>>>>>> (ALIGH_UP(rtas-base + rtas-size, PAGE_SIZE)) to the user space? We do
>>>>>> not need this RMO area before that point.
>>>>>
>>>>> Can you explain more about what advantage that would bring? I'm not
>>>>> seeing it. It's a more significant change than what I've written
>>>>> here.
>>>>
>>>>
>>>> We already allocate space for RTAS and (like RMO) it needs to be in RMA,
>>>> and RMO is useless without RTAS. We can reuse RTAS allocation code for
>>>> RMO like this:
>>>
>>> When you say RMO I assume you are referring to rtas_rmo_buf? (I don't
>>> think it is well-named.)
>> ...
>>
>> RMO (Real mode offset) is the old term we used to use to refer to what
>> is now called the RMA (Real mode area). There are still many references
>> to RMO in Linux, but they almost certainly all refer to what we now call
>> the RMA.
>
> Yes... but I think in this discussion Alexey was using RMO to stand in
> for rtas_rmo_buf, which was what I was trying to clarify.
Correct. Thanks for the clarification, appreciated.
>
>>>> May be store in the FDT as "linux,rmo-base" next to "linux,rtas-base",
>>>> for clarity, as sharing symbols between prom and main kernel is a bit
>>>> tricky.
>>>>
>>>> The benefit is that we do not do the same thing (== find 64K in RMA)
>>>> in 2 different ways and if the RMO allocated my way is broken - we'll
>>>> know it much sooner as RTAS itself will break too.
>>>
>>> Implementation details aside... I'll grant that combining the
>>> allocations into one in prom_init reduces some duplication in the sense
>>> that both are subject to the same constraints (mostly - the RTAS data
>>> area must not cross a 256MB boundary, while the user region may). But
>>> they really are distinct concerns. The RTAS private data area is
>>> specified in the platform architecture, the OS is obligated to allocate
>>> it and pass it to instantiate-rtas, etc etc. However the user region
>>> (rtas_rmo_buf) is purely a Linux construct which is there to support
>>> sys_rtas.
>>>
>>> Now, there are multiple sites in the kernel proper that must allocate
>>> memory suitable for passing to RTAS. Obviously there is value in
>>> consolidating the logic for that purpose in one place, so I'll work on
>>> adding that in v2. OK?
>>
>> I don't think we want to move any allocations into prom_init.c unless we
>> have to.
>>
>> It's best thought of as a trampoline, that runs before the kernel
>> proper, to transition from live OF to a flat DT environment. One thing
>> that must be done as part of that is instantiating RTAS, because it's
>> basically a runtime copy of the live OF. But any other allocs are for
>> Linux to handle later, IMHO.
>
> Agreed.
Then the only comment I have left is may be use of_address_to_resource()
+ resource_size() instead of of_n_addr_cells()/of_n_size_cells() (like
pseries_memory_block_size()). And now I shut up :) Thanks,
--
Alexey
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox