* [v3 0/7] powerpc: implement machine check safe memcpy
@ 2019-07-05 21:26 Santosh Sivaraj
2019-07-05 21:26 ` [v3 1/7] powerpc/mce: Make machine_check_ue_event() static Santosh Sivaraj
` (6 more replies)
0 siblings, 7 replies; 18+ messages in thread
From: Santosh Sivaraj @ 2019-07-05 21:26 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
During a memcpy from a pmem device, if a machine check exception is
generated, we end up in a panic. For an fsdax read, this should
only result in -EIO. Avoid the panic by implementing memcpy_mcsafe.
Before this patch series:
```
bash-4.4# mount -o dax /dev/pmem0 /mnt/pmem/
[ 7621.714094] Disabling lock debugging due to kernel taint
[ 7621.714099] MCE: CPU0: machine check (Severe) Host UE Load/Store [Not recovered]
[ 7621.714104] MCE: CPU0: NIP: [c000000000088978] memcpy_power7+0x418/0x7e0
[ 7621.714107] MCE: CPU0: Hardware error
[ 7621.714112] opal: Hardware platform error: Unrecoverable Machine Check exception
[ 7621.714118] CPU: 0 PID: 1368 Comm: mount Tainted: G M 5.2.0-rc5-00239-g241e39004581 #50
[ 7621.714123] NIP: c000000000088978 LR: c0000000008e16f8 CTR: 00000000000001de
[ 7621.714129] REGS: c0000000fffbfd70 TRAP: 0200 Tainted: G M (5.2.0-rc5-00239-g241e39004581)
[ 7621.714131] MSR: 9000000002209033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 24428840 XER: 00040000
[ 7621.714160] CFAR: c0000000000889a8 DAR: deadbeefdeadbeef DSISR: 00008000 IRQMASK: 0
[ 7621.714171] GPR00: 000000000e000000 c0000000f0b8b1e0 c0000000012cf100 c0000000ed8e1100
[ 7621.714186] GPR04: c000020000001100 0000000000010000 0000000000000200 03fffffff1272000
[ 7621.714201] GPR08: 0000000080000000 0000000000000010 0000000000000020 0000000000000030
[ 7621.714216] GPR12: 0000000000000040 00007fffb8c6d390 0000000000000050 0000000000000060
[ 7621.714232] GPR16: 0000000000000070 0000000000000000 0000000000000001 c0000000f0b8b960
[ 7621.714247] GPR20: 0000000000000001 c0000000f0b8b940 0000000000000001 0000000000010000
[ 7621.714262] GPR24: c000000001382560 c00c0000003b6380 c00c0000003b6380 0000000000010000
[ 7621.714277] GPR28: 0000000000000000 0000000000010000 c000020000000000 0000000000010000
[ 7621.714294] NIP [c000000000088978] memcpy_power7+0x418/0x7e0
[ 7621.714298] LR [c0000000008e16f8] pmem_do_bvec+0xf8/0x430
... <snip> ...
```
After this patch series:
```
bash-4.4# mount -o dax /dev/pmem0 /mnt/pmem/
[25302.883978] Buffer I/O error on dev pmem0, logical block 0, async page read
[25303.020816] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25303.021236] EXT4-fs (pmem0): Can't read superblock on 2nd try
[25303.152515] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25303.284031] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25304.084100] UDF-fs: bad mount option "dax" or missing value
mount: /mnt/pmem: wrong fs type, bad option, bad superblock on /dev/pmem0, missing codepage or helper program, or other error.
```
The MCE is injected on a pmem address using Mambo. The last patch, which adds
a nop, is only for testing on Mambo, where r13 is not restored upon hitting
vector 0x200. The memcpy code can be further optimized with VMX, and GAS macros
can be used to improve code reusability; I will send those as a separate series.
---
Change-log:
v3:
* Drop patch which enables DR/IR for external modules
* Drop notifier call chain, we don't want to do that in real mode
* Return remaining bytes from memcpy_mcsafe correctly
* We no longer restore r13 for simulator tests, rather use a nop at
vector 0x200 [workaround for simulator; not to be merged]
v2:
* Don't set RI bit explicitly [mahesh]
* Re-ordered series to get r13 workaround as the last patch
---
Balbir Singh (2):
powerpc/mce: Bug fixes for MCE handling in kernel space
powerpc/memcpy: Add memcpy_mcsafe for pmem
Reza Arbab (2):
powerpc/mce: Make machine_check_ue_event() static
powerpc/64s: save r13 in MCE handler (simulator workaround)
Santosh Sivaraj (3):
powerpc/mce: Handle UE event for memcpy_mcsafe
powerpc/memcpy_mcsafe: return remaining bytes
powerpc: add machine check safe copy_to_user
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/mce.h | 7 +-
arch/powerpc/include/asm/string.h | 2 +
arch/powerpc/include/asm/uaccess.h | 12 ++
arch/powerpc/kernel/exceptions-64s.S | 1 +
arch/powerpc/kernel/mce.c | 11 +-
arch/powerpc/kernel/mce_power.c | 41 +++--
arch/powerpc/lib/Makefile | 2 +-
arch/powerpc/lib/memcpy_mcsafe_64.S | 239 +++++++++++++++++++++++++++
arch/powerpc/platforms/pseries/ras.c | 6 +-
10 files changed, 302 insertions(+), 20 deletions(-)
create mode 100644 arch/powerpc/lib/memcpy_mcsafe_64.S
--
2.20.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [v3 1/7] powerpc/mce: Make machine_check_ue_event() static
2019-07-05 21:26 [v3 0/7] powerpc: implement machine check safe memcpy Santosh Sivaraj
@ 2019-07-05 21:26 ` Santosh Sivaraj
2019-07-06 5:18 ` Nicholas Piggin
2019-07-05 21:26 ` [v3 2/7] powerpc/mce: Bug fixes for MCE handling in kernel space Santosh Sivaraj
` (5 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Santosh Sivaraj @ 2019-07-05 21:26 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
From: Reza Arbab <arbab@linux.ibm.com>
The function doesn't get used outside this file, so make it static.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
---
arch/powerpc/kernel/mce.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index b18df633eae9..e78c4f18ea0a 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -33,7 +33,7 @@ static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT],
mce_ue_event_queue);
static void machine_check_process_queued_event(struct irq_work *work);
-void machine_check_ue_event(struct machine_check_event *evt);
+static void machine_check_ue_event(struct machine_check_event *evt);
static void machine_process_ue_event(struct work_struct *work);
static struct irq_work mce_event_process_work = {
@@ -203,7 +203,7 @@ void release_mce_event(void)
/*
* Queue up the MCE event which then can be handled later.
*/
-void machine_check_ue_event(struct machine_check_event *evt)
+static void machine_check_ue_event(struct machine_check_event *evt)
{
int index;
--
2.20.1
* [v3 2/7] powerpc/mce: Bug fixes for MCE handling in kernel space
2019-07-05 21:26 [v3 0/7] powerpc: implement machine check safe memcpy Santosh Sivaraj
2019-07-05 21:26 ` [v3 1/7] powerpc/mce: Make machine_check_ue_event() static Santosh Sivaraj
@ 2019-07-05 21:26 ` Santosh Sivaraj
2019-07-06 9:44 ` Nicholas Piggin
2019-07-05 21:26 ` [v3 3/7] powerpc/memcpy: Add memcpy_mcsafe for pmem Santosh Sivaraj
` (4 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Santosh Sivaraj @ 2019-07-05 21:26 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
From: Balbir Singh <bsingharora@gmail.com>
The code currently assumes PAGE_SHIFT as the shift value of
the pfn. This works correctly (mostly) for user space pages,
but the correct thing to do is:
1. Extract the shift value returned via the pte-walk APIs.
2. Use that shift value to access the instruction address.
Note that the final physical address still uses PAGE_SHIFT for
computation. handle_ierror() is not modified, and handle_derror()
is modified just to extract the correct instruction address.
This is largely due to __find_linux_pte() returning pfns
shifted by pdshift. The code is now more generic and can
handle any shift value returned.
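As an illustration (a userspace sketch, not kernel code), the corrected address reconstruction uses the shift returned by the page-table walk instead of PAGE_SHIFT:

```c
#include <stdint.h>

/* Rebuild a real address from a pfn using the page walk's shift, which
 * may be larger than PAGE_SHIFT for hugepages (e.g. 24 for 16MB pages).
 * The pfn is assumed to be shifted by the same shift value, matching
 * what addr_to_pfn() computes from the pte in this series. */
static uint64_t real_addr(uint64_t pfn, unsigned int shift, uint64_t ea)
{
	return (pfn << shift) + (ea & ((1ULL << shift) - 1));
}
```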
Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[arbab@linux.ibm.com: Fixup pseries_do_memory_failure()]
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
---
arch/powerpc/include/asm/mce.h | 3 ++-
arch/powerpc/kernel/mce_power.c | 26 ++++++++++++++++----------
arch/powerpc/platforms/pseries/ras.c | 6 ++++--
3 files changed, 22 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index a4c6a74ad2fb..94888a7025b3 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -209,7 +209,8 @@ extern void release_mce_event(void);
extern void machine_check_queue_event(void);
extern void machine_check_print_event_info(struct machine_check_event *evt,
bool user_mode, bool in_guest);
-unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
+unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr,
+ unsigned int *shift);
#ifdef CONFIG_PPC_BOOK3S_64
void flush_and_reload_slb(void);
#endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index e39536aad30d..04666c0b40a8 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -23,7 +23,8 @@
* Convert an address related to an mm to a PFN. NOTE: we are in real
* mode, we could potentially race with page table updates.
*/
-unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
+unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr,
+ unsigned int *shift)
{
pte_t *ptep;
unsigned long flags;
@@ -36,13 +37,15 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
local_irq_save(flags);
if (mm == current->mm)
- ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
+ ptep = find_current_mm_pte(mm->pgd, addr, NULL, shift);
else
- ptep = find_init_mm_pte(addr, NULL);
+ ptep = find_init_mm_pte(addr, shift);
local_irq_restore(flags);
if (!ptep || pte_special(*ptep))
return ULONG_MAX;
- return pte_pfn(*ptep);
+ if (!*shift)
+ *shift = PAGE_SHIFT;
+ return (pte_val(*ptep) & PTE_RPN_MASK) >> *shift;
}
/* flush SLBs and reload */
@@ -358,15 +361,16 @@ static int mce_find_instr_ea_and_pfn(struct pt_regs *regs, uint64_t *addr,
unsigned long pfn, instr_addr;
struct instruction_op op;
struct pt_regs tmp = *regs;
+ unsigned int shift;
- pfn = addr_to_pfn(regs, regs->nip);
+ pfn = addr_to_pfn(regs, regs->nip, &shift);
if (pfn != ULONG_MAX) {
- instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
+ instr_addr = (pfn << shift) + (regs->nip & ((1 << shift) - 1));
instr = *(unsigned int *)(instr_addr);
if (!analyse_instr(&op, &tmp, instr)) {
- pfn = addr_to_pfn(regs, op.ea);
+ pfn = addr_to_pfn(regs, op.ea, &shift);
*addr = op.ea;
- *phys_addr = (pfn << PAGE_SHIFT);
+ *phys_addr = (pfn << shift);
return 0;
}
/*
@@ -442,12 +446,14 @@ static int mce_handle_ierror(struct pt_regs *regs,
if (mce_err->sync_error &&
table[i].error_type == MCE_ERROR_TYPE_UE) {
unsigned long pfn;
+ unsigned int shift;
if (get_paca()->in_mce < MAX_MCE_DEPTH) {
- pfn = addr_to_pfn(regs, regs->nip);
+ pfn = addr_to_pfn(regs, regs->nip,
+ &shift);
if (pfn != ULONG_MAX) {
*phys_addr =
- (pfn << PAGE_SHIFT);
+ (pfn << shift);
}
}
}
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index f16fdd0f71f7..5e43283d3300 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -740,12 +740,14 @@ static void pseries_do_memory_failure(struct pt_regs *regs,
paddr = be64_to_cpu(mce_log->logical_address);
} else if (mce_log->sub_err_type & UE_EFFECTIVE_ADDR_PROVIDED) {
unsigned long pfn;
+ unsigned int shift;
pfn = addr_to_pfn(regs,
- be64_to_cpu(mce_log->effective_address));
+ be64_to_cpu(mce_log->effective_address),
+ &shift);
if (pfn == ULONG_MAX)
return;
- paddr = pfn << PAGE_SHIFT;
+ paddr = pfn << shift;
} else {
return;
}
--
2.20.1
* [v3 3/7] powerpc/memcpy: Add memcpy_mcsafe for pmem
2019-07-05 21:26 [v3 0/7] powerpc: implement machine check safe memcpy Santosh Sivaraj
2019-07-05 21:26 ` [v3 1/7] powerpc/mce: Make machine_check_ue_event() static Santosh Sivaraj
2019-07-05 21:26 ` [v3 2/7] powerpc/mce: Bug fixes for MCE handling in kernel space Santosh Sivaraj
@ 2019-07-05 21:26 ` Santosh Sivaraj
2019-07-06 6:12 ` Christophe Leroy
2019-07-05 21:26 ` [v3 4/7] powerpc/mce: Handle UE event for memcpy_mcsafe Santosh Sivaraj
` (3 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Santosh Sivaraj @ 2019-07-05 21:26 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
From: Balbir Singh <bsingharora@gmail.com>
The pmem layer uses memcpy_mcsafe so that a machine check
exception encountered during the memcpy is converted into
a return value on failure instead of a panic.
This patch largely borrows from the copyuser_power7
logic and does not add the VMX optimizations, largely
to keep the patch simple. If needed, those optimizations
can be folded in later.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
[arbab@linux.ibm.com: Added symbol export]
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
---
arch/powerpc/include/asm/string.h | 2 +
arch/powerpc/lib/Makefile | 2 +-
arch/powerpc/lib/memcpy_mcsafe_64.S | 215 ++++++++++++++++++++++++++++
3 files changed, 218 insertions(+), 1 deletion(-)
create mode 100644 arch/powerpc/lib/memcpy_mcsafe_64.S
diff --git a/arch/powerpc/include/asm/string.h b/arch/powerpc/include/asm/string.h
index 9bf6dffb4090..b72692702f35 100644
--- a/arch/powerpc/include/asm/string.h
+++ b/arch/powerpc/include/asm/string.h
@@ -53,7 +53,9 @@ void *__memmove(void *to, const void *from, __kernel_size_t n);
#ifndef CONFIG_KASAN
#define __HAVE_ARCH_MEMSET32
#define __HAVE_ARCH_MEMSET64
+#define __HAVE_ARCH_MEMCPY_MCSAFE
+extern int memcpy_mcsafe(void *dst, const void *src, __kernel_size_t sz);
extern void *__memset16(uint16_t *, uint16_t v, __kernel_size_t);
extern void *__memset32(uint32_t *, uint32_t v, __kernel_size_t);
extern void *__memset64(uint64_t *, uint64_t v, __kernel_size_t);
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index c55f9c27bf79..529d6536eb4a 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -39,7 +39,7 @@ obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \
memcpy_power7.o
obj64-y += copypage_64.o copyuser_64.o mem_64.o hweight_64.o \
- memcpy_64.o pmem.o
+ memcpy_64.o pmem.o memcpy_mcsafe_64.o
obj64-$(CONFIG_SMP) += locks.o
obj64-$(CONFIG_ALTIVEC) += vmx-helper.o
diff --git a/arch/powerpc/lib/memcpy_mcsafe_64.S b/arch/powerpc/lib/memcpy_mcsafe_64.S
new file mode 100644
index 000000000000..50f865db0338
--- /dev/null
+++ b/arch/powerpc/lib/memcpy_mcsafe_64.S
@@ -0,0 +1,215 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) IBM Corporation, 2011
+ * Derived from copyuser_power7.s by Anton Blanchard <anton@au.ibm.com>
+ * Author - Balbir Singh <bsingharora@gmail.com>
+ */
+#include <asm/ppc_asm.h>
+#include <asm/errno.h>
+#include <asm/export.h>
+
+ .macro err1
+100:
+ EX_TABLE(100b,.Ldo_err1)
+ .endm
+
+ .macro err2
+200:
+ EX_TABLE(200b,.Ldo_err2)
+ .endm
+
+.Ldo_err2:
+ ld r22,STK_REG(R22)(r1)
+ ld r21,STK_REG(R21)(r1)
+ ld r20,STK_REG(R20)(r1)
+ ld r19,STK_REG(R19)(r1)
+ ld r18,STK_REG(R18)(r1)
+ ld r17,STK_REG(R17)(r1)
+ ld r16,STK_REG(R16)(r1)
+ ld r15,STK_REG(R15)(r1)
+ ld r14,STK_REG(R14)(r1)
+ addi r1,r1,STACKFRAMESIZE
+.Ldo_err1:
+ li r3,-EFAULT
+ blr
+
+
+_GLOBAL(memcpy_mcsafe)
+ cmpldi r5,16
+ blt .Lshort_copy
+
+.Lcopy:
+ /* Get the source 8B aligned */
+ neg r6,r4
+ mtocrf 0x01,r6
+ clrldi r6,r6,(64-3)
+
+ bf cr7*4+3,1f
+err1; lbz r0,0(r4)
+ addi r4,r4,1
+err1; stb r0,0(r3)
+ addi r3,r3,1
+
+1: bf cr7*4+2,2f
+err1; lhz r0,0(r4)
+ addi r4,r4,2
+err1; sth r0,0(r3)
+ addi r3,r3,2
+
+2: bf cr7*4+1,3f
+err1; lwz r0,0(r4)
+ addi r4,r4,4
+err1; stw r0,0(r3)
+ addi r3,r3,4
+
+3: sub r5,r5,r6
+ cmpldi r5,128
+ blt 5f
+
+ mflr r0
+ stdu r1,-STACKFRAMESIZE(r1)
+ std r14,STK_REG(R14)(r1)
+ std r15,STK_REG(R15)(r1)
+ std r16,STK_REG(R16)(r1)
+ std r17,STK_REG(R17)(r1)
+ std r18,STK_REG(R18)(r1)
+ std r19,STK_REG(R19)(r1)
+ std r20,STK_REG(R20)(r1)
+ std r21,STK_REG(R21)(r1)
+ std r22,STK_REG(R22)(r1)
+ std r0,STACKFRAMESIZE+16(r1)
+
+ srdi r6,r5,7
+ mtctr r6
+
+ /* Now do cacheline (128B) sized loads and stores. */
+ .align 5
+4:
+err2; ld r0,0(r4)
+err2; ld r6,8(r4)
+err2; ld r7,16(r4)
+err2; ld r8,24(r4)
+err2; ld r9,32(r4)
+err2; ld r10,40(r4)
+err2; ld r11,48(r4)
+err2; ld r12,56(r4)
+err2; ld r14,64(r4)
+err2; ld r15,72(r4)
+err2; ld r16,80(r4)
+err2; ld r17,88(r4)
+err2; ld r18,96(r4)
+err2; ld r19,104(r4)
+err2; ld r20,112(r4)
+err2; ld r21,120(r4)
+ addi r4,r4,128
+err2; std r0,0(r3)
+err2; std r6,8(r3)
+err2; std r7,16(r3)
+err2; std r8,24(r3)
+err2; std r9,32(r3)
+err2; std r10,40(r3)
+err2; std r11,48(r3)
+err2; std r12,56(r3)
+err2; std r14,64(r3)
+err2; std r15,72(r3)
+err2; std r16,80(r3)
+err2; std r17,88(r3)
+err2; std r18,96(r3)
+err2; std r19,104(r3)
+err2; std r20,112(r3)
+err2; std r21,120(r3)
+ addi r3,r3,128
+ bdnz 4b
+
+ clrldi r5,r5,(64-7)
+
+ ld r14,STK_REG(R14)(r1)
+ ld r15,STK_REG(R15)(r1)
+ ld r16,STK_REG(R16)(r1)
+ ld r17,STK_REG(R17)(r1)
+ ld r18,STK_REG(R18)(r1)
+ ld r19,STK_REG(R19)(r1)
+ ld r20,STK_REG(R20)(r1)
+ ld r21,STK_REG(R21)(r1)
+ ld r22,STK_REG(R22)(r1)
+ addi r1,r1,STACKFRAMESIZE
+
+ /* Up to 127B to go */
+5: srdi r6,r5,4
+ mtocrf 0x01,r6
+
+6: bf cr7*4+1,7f
+err1; ld r0,0(r4)
+err1; ld r6,8(r4)
+err1; ld r7,16(r4)
+err1; ld r8,24(r4)
+err1; ld r9,32(r4)
+err1; ld r10,40(r4)
+err1; ld r11,48(r4)
+err1; ld r12,56(r4)
+ addi r4,r4,64
+err1; std r0,0(r3)
+err1; std r6,8(r3)
+err1; std r7,16(r3)
+err1; std r8,24(r3)
+err1; std r9,32(r3)
+err1; std r10,40(r3)
+err1; std r11,48(r3)
+err1; std r12,56(r3)
+ addi r3,r3,64
+
+ /* Up to 63B to go */
+7: bf cr7*4+2,8f
+err1; ld r0,0(r4)
+err1; ld r6,8(r4)
+err1; ld r7,16(r4)
+err1; ld r8,24(r4)
+ addi r4,r4,32
+err1; std r0,0(r3)
+err1; std r6,8(r3)
+err1; std r7,16(r3)
+err1; std r8,24(r3)
+ addi r3,r3,32
+
+ /* Up to 31B to go */
+8: bf cr7*4+3,9f
+err1; ld r0,0(r4)
+err1; ld r6,8(r4)
+ addi r4,r4,16
+err1; std r0,0(r3)
+err1; std r6,8(r3)
+ addi r3,r3,16
+
+9: clrldi r5,r5,(64-4)
+
+ /* Up to 15B to go */
+.Lshort_copy:
+ mtocrf 0x01,r5
+ bf cr7*4+0,12f
+err1; lwz r0,0(r4) /* Less chance of a reject with word ops */
+err1; lwz r6,4(r4)
+ addi r4,r4,8
+err1; stw r0,0(r3)
+err1; stw r6,4(r3)
+ addi r3,r3,8
+
+12: bf cr7*4+1,13f
+err1; lwz r0,0(r4)
+ addi r4,r4,4
+err1; stw r0,0(r3)
+ addi r3,r3,4
+
+13: bf cr7*4+2,14f
+err1; lhz r0,0(r4)
+ addi r4,r4,2
+err1; sth r0,0(r3)
+ addi r3,r3,2
+
+14: bf cr7*4+3,15f
+err1; lbz r0,0(r4)
+err1; stb r0,0(r3)
+
+15: li r3,0
+ blr
+
+EXPORT_SYMBOL_GPL(memcpy_mcsafe);
--
2.20.1
* [v3 4/7] powerpc/mce: Handle UE event for memcpy_mcsafe
2019-07-05 21:26 [v3 0/7] powerpc: implement machine check safe memcpy Santosh Sivaraj
` (2 preceding siblings ...)
2019-07-05 21:26 ` [v3 3/7] powerpc/memcpy: Add memcpy_mcsafe for pmem Santosh Sivaraj
@ 2019-07-05 21:26 ` Santosh Sivaraj
2019-07-06 9:53 ` Nicholas Piggin
2019-07-05 21:26 ` [v3 5/7] powerpc/memcpy_mcsafe: return remaining bytes Santosh Sivaraj
` (2 subsequent siblings)
6 siblings, 1 reply; 18+ messages in thread
From: Santosh Sivaraj @ 2019-07-05 21:26 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
If we take a UE on one of the instructions with a fixup entry, set nip
to continue execution at the fixup entry. Do not process the event
further or print it.
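The recovery flow can be modeled with a minimal userspace sketch of an exception-table lookup (hypothetical types; the kernel uses search_exception_tables() and extable_fixup() as in the diff below):

```c
#include <stddef.h>
#include <stdint.h>

struct extable_entry {
	uintptr_t insn;   /* address of the faulting instruction */
	uintptr_t fixup;  /* address to resume execution at */
};

/* If nip matches a table entry, return the fixup address and flag the
 * MCE event to be ignored; otherwise return 0 (unhandled). */
static uintptr_t handle_ue(const struct extable_entry *tbl, size_t n,
			   uintptr_t nip, int *ignore_event)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].insn == nip) {
			*ignore_event = 1;
			return tbl[i].fixup;
		}
	}
	return 0;
}
```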
Based-on-patch-by: Reza Arbab <arbab@linux.ibm.com>
Cc: Reza Arbab <arbab@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
---
arch/powerpc/include/asm/mce.h | 4 +++-
arch/powerpc/kernel/mce.c | 7 ++++++-
arch/powerpc/kernel/mce_power.c | 15 +++++++++++++--
3 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 94888a7025b3..f74257eb013b 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -122,7 +122,8 @@ struct machine_check_event {
enum MCE_UeErrorType ue_error_type:8;
u8 effective_address_provided;
u8 physical_address_provided;
- u8 reserved_1[5];
+ u8 ignore_event;
+ u8 reserved_1[4];
u64 effective_address;
u64 physical_address;
u8 reserved_2[8];
@@ -193,6 +194,7 @@ struct mce_error_info {
enum MCE_Initiator initiator:8;
enum MCE_ErrorClass error_class:8;
bool sync_error;
+ bool ignore_event;
};
#define MAX_MC_EVT 100
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index e78c4f18ea0a..94f2bb307537 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -144,7 +144,9 @@ void save_mce_event(struct pt_regs *regs, long handled,
if (phys_addr != ULONG_MAX) {
mce->u.ue_error.physical_address_provided = true;
mce->u.ue_error.physical_address = phys_addr;
- machine_check_ue_event(mce);
+ mce->u.ue_error.ignore_event = mce_err->ignore_event;
+ if (!mce->u.ue_error.ignore_event)
+ machine_check_ue_event(mce);
}
}
return;
@@ -230,6 +232,9 @@ void machine_check_queue_event(void)
if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
return;
+ if (evt.error_type == MCE_ERROR_TYPE_UE && evt.u.ue_error.ignore_event)
+ return;
+
index = __this_cpu_inc_return(mce_queue_count) - 1;
/* If queue is full, just return for now. */
if (index >= MAX_MC_EVT) {
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index 04666c0b40a8..db4aa98a23ac 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -11,6 +11,7 @@
#include <linux/types.h>
#include <linux/ptrace.h>
+#include <linux/extable.h>
#include <asm/mmu.h>
#include <asm/mce.h>
#include <asm/machdep.h>
@@ -18,6 +19,7 @@
#include <asm/pte-walk.h>
#include <asm/sstep.h>
#include <asm/exception-64s.h>
+#include <asm/extable.h>
/*
* Convert an address related to an mm to a PFN. NOTE: we are in real
@@ -565,9 +567,18 @@ static int mce_handle_derror(struct pt_regs *regs,
return 0;
}
-static long mce_handle_ue_error(struct pt_regs *regs)
+static long mce_handle_ue_error(struct pt_regs *regs,
+ struct mce_error_info *mce_err)
{
long handled = 0;
+ const struct exception_table_entry *entry;
+
+ entry = search_exception_tables(regs->nip);
+ if (entry) {
+ mce_err->ignore_event = true;
+ regs->nip = extable_fixup(entry);
+ return 1;
+ }
/*
* On specific SCOM read via MMIO we may get a machine check
@@ -600,7 +611,7 @@ static long mce_handle_error(struct pt_regs *regs,
&phys_addr);
if (!handled && mce_err.error_type == MCE_ERROR_TYPE_UE)
- handled = mce_handle_ue_error(regs);
+ handled = mce_handle_ue_error(regs, &mce_err);
save_mce_event(regs, handled, &mce_err, regs->nip, addr, phys_addr);
--
2.20.1
* [v3 5/7] powerpc/memcpy_mcsafe: return remaining bytes
2019-07-05 21:26 [v3 0/7] powerpc: implement machine check safe memcpy Santosh Sivaraj
` (3 preceding siblings ...)
2019-07-05 21:26 ` [v3 4/7] powerpc/mce: Handle UE event for memcpy_mcsafe Santosh Sivaraj
@ 2019-07-05 21:26 ` Santosh Sivaraj
2019-07-06 6:16 ` Christophe Leroy
2019-07-05 21:26 ` [v3 6/7] powerpc: add machine check safe copy_to_user Santosh Sivaraj
2019-07-05 21:26 ` [v3 7/7] powerpc/64s: save r13 in MCE handler (simulator workaround) Santosh Sivaraj
6 siblings, 1 reply; 18+ messages in thread
From: Santosh Sivaraj @ 2019-07-05 21:26 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
memcpy_mcsafe currently returns -EFAULT on a machine check exception. Change
it to return the number of remaining bytes that need to be copied, so that the
machine check safe copy_to_user can maintain the same behavior as copy_to_user.
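The new return convention can be sketched in userspace C (fail_at is a fault-injection knob standing in for a UE; it is not part of the real interface):

```c
#include <stddef.h>
#include <string.h>

/* Simulated memcpy_mcsafe() with the patched semantics: returns the
 * number of bytes NOT copied (0 on full success) instead of -EFAULT. */
static size_t memcpy_mcsafe_sim(void *dst, const void *src, size_t n,
				size_t fail_at)
{
	size_t copied = fail_at < n ? fail_at : n;

	memcpy(dst, src, copied);
	return n - copied;
}
```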
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
---
arch/powerpc/lib/memcpy_mcsafe_64.S | 142 ++++++++++++++++------------
1 file changed, 83 insertions(+), 59 deletions(-)
diff --git a/arch/powerpc/lib/memcpy_mcsafe_64.S b/arch/powerpc/lib/memcpy_mcsafe_64.S
index 50f865db0338..4d8a3d315992 100644
--- a/arch/powerpc/lib/memcpy_mcsafe_64.S
+++ b/arch/powerpc/lib/memcpy_mcsafe_64.S
@@ -30,11 +30,25 @@
ld r14,STK_REG(R14)(r1)
addi r1,r1,STACKFRAMESIZE
.Ldo_err1:
- li r3,-EFAULT
+ /* Do a byte by byte copy to get the exact remaining size */
+ mtctr r7
+100: EX_TABLE(100b, .Ldone)
+46:
+err1; lbz r0,0(r4)
+ addi r4,r4,1
+err1; stb r0,0(r3)
+ addi r3,r3,1
+ bdnz 46b
+ li r3,0
+ blr
+
+.Ldone:
+ mfctr r3
blr
_GLOBAL(memcpy_mcsafe)
+ mr r7,r5
cmpldi r5,16
blt .Lshort_copy
@@ -49,18 +63,21 @@ err1; lbz r0,0(r4)
addi r4,r4,1
err1; stb r0,0(r3)
addi r3,r3,1
+ subi r7,r7,1
1: bf cr7*4+2,2f
err1; lhz r0,0(r4)
addi r4,r4,2
err1; sth r0,0(r3)
addi r3,r3,2
+ subi r7,r7,2
2: bf cr7*4+1,3f
err1; lwz r0,0(r4)
addi r4,r4,4
err1; stw r0,0(r3)
addi r3,r3,4
+ subi r7,r7,4
3: sub r5,r5,r6
cmpldi r5,128
@@ -87,43 +104,69 @@ err1; stw r0,0(r3)
4:
err2; ld r0,0(r4)
err2; ld r6,8(r4)
-err2; ld r7,16(r4)
-err2; ld r8,24(r4)
-err2; ld r9,32(r4)
-err2; ld r10,40(r4)
-err2; ld r11,48(r4)
-err2; ld r12,56(r4)
-err2; ld r14,64(r4)
-err2; ld r15,72(r4)
-err2; ld r16,80(r4)
-err2; ld r17,88(r4)
-err2; ld r18,96(r4)
-err2; ld r19,104(r4)
-err2; ld r20,112(r4)
-err2; ld r21,120(r4)
+err2; ld r8,16(r4)
+err2; ld r9,24(r4)
+err2; ld r10,32(r4)
+err2; ld r11,40(r4)
+err2; ld r12,48(r4)
+err2; ld r14,56(r4)
+err2; ld r15,64(r4)
+err2; ld r16,72(r4)
+err2; ld r17,80(r4)
+err2; ld r18,88(r4)
+err2; ld r19,96(r4)
+err2; ld r20,104(r4)
+err2; ld r21,112(r4)
+err2; ld r22,120(r4)
addi r4,r4,128
err2; std r0,0(r3)
err2; std r6,8(r3)
-err2; std r7,16(r3)
-err2; std r8,24(r3)
-err2; std r9,32(r3)
-err2; std r10,40(r3)
-err2; std r11,48(r3)
-err2; std r12,56(r3)
-err2; std r14,64(r3)
-err2; std r15,72(r3)
-err2; std r16,80(r3)
-err2; std r17,88(r3)
-err2; std r18,96(r3)
-err2; std r19,104(r3)
-err2; std r20,112(r3)
-err2; std r21,120(r3)
+err2; std r8,16(r3)
+err2; std r9,24(r3)
+err2; std r10,32(r3)
+err2; std r11,40(r3)
+err2; std r12,48(r3)
+err2; std r14,56(r3)
+err2; std r15,64(r3)
+err2; std r16,72(r3)
+err2; std r17,80(r3)
+err2; std r18,88(r3)
+err2; std r19,96(r3)
+err2; std r20,104(r3)
+err2; std r21,112(r3)
+err2; std r22,120(r3)
addi r3,r3,128
+ subi r7,r7,128
bdnz 4b
clrldi r5,r5,(64-7)
- ld r14,STK_REG(R14)(r1)
+ /* Up to 127B to go */
+5: srdi r6,r5,4
+ mtocrf 0x01,r6
+
+6: bf cr7*4+1,7f
+err2; ld r0,0(r4)
+err2; ld r6,8(r4)
+err2; ld r8,16(r4)
+err2; ld r9,24(r4)
+err2; ld r10,32(r4)
+err2; ld r11,40(r4)
+err2; ld r12,48(r4)
+err2; ld r14,56(r4)
+ addi r4,r4,64
+err2; std r0,0(r3)
+err2; std r6,8(r3)
+err2; std r8,16(r3)
+err2; std r9,24(r3)
+err2; std r10,32(r3)
+err2; std r11,40(r3)
+err2; std r12,48(r3)
+err2; std r14,56(r3)
+ addi r3,r3,64
+ subi r7,r7,64
+
+7: ld r14,STK_REG(R14)(r1)
ld r15,STK_REG(R15)(r1)
ld r16,STK_REG(R16)(r1)
ld r17,STK_REG(R17)(r1)
@@ -134,42 +177,19 @@ err2; std r21,120(r3)
ld r22,STK_REG(R22)(r1)
addi r1,r1,STACKFRAMESIZE
- /* Up to 127B to go */
-5: srdi r6,r5,4
- mtocrf 0x01,r6
-
-6: bf cr7*4+1,7f
-err1; ld r0,0(r4)
-err1; ld r6,8(r4)
-err1; ld r7,16(r4)
-err1; ld r8,24(r4)
-err1; ld r9,32(r4)
-err1; ld r10,40(r4)
-err1; ld r11,48(r4)
-err1; ld r12,56(r4)
- addi r4,r4,64
-err1; std r0,0(r3)
-err1; std r6,8(r3)
-err1; std r7,16(r3)
-err1; std r8,24(r3)
-err1; std r9,32(r3)
-err1; std r10,40(r3)
-err1; std r11,48(r3)
-err1; std r12,56(r3)
- addi r3,r3,64
-
/* Up to 63B to go */
-7: bf cr7*4+2,8f
+ bf cr7*4+2,8f
err1; ld r0,0(r4)
err1; ld r6,8(r4)
-err1; ld r7,16(r4)
-err1; ld r8,24(r4)
+err1; ld r8,16(r4)
+err1; ld r9,24(r4)
addi r4,r4,32
err1; std r0,0(r3)
err1; std r6,8(r3)
-err1; std r7,16(r3)
-err1; std r8,24(r3)
+err1; std r8,16(r3)
+err1; std r9,24(r3)
addi r3,r3,32
+ subi r7,r7,32
/* Up to 31B to go */
8: bf cr7*4+3,9f
@@ -179,6 +199,7 @@ err1; ld r6,8(r4)
err1; std r0,0(r3)
err1; std r6,8(r3)
addi r3,r3,16
+ subi r7,r7,16
9: clrldi r5,r5,(64-4)
@@ -192,18 +213,21 @@ err1; lwz r6,4(r4)
err1; stw r0,0(r3)
err1; stw r6,4(r3)
addi r3,r3,8
+ subi r7,r7,8
12: bf cr7*4+1,13f
err1; lwz r0,0(r4)
addi r4,r4,4
err1; stw r0,0(r3)
addi r3,r3,4
+ subi r7,r7,4
13: bf cr7*4+2,14f
err1; lhz r0,0(r4)
addi r4,r4,2
err1; sth r0,0(r3)
addi r3,r3,2
+ subi r7,r7,2
14: bf cr7*4+3,15f
err1; lbz r0,0(r4)
--
2.20.1
* [v3 6/7] powerpc: add machine check safe copy_to_user
2019-07-05 21:26 [v3 0/7] powerpc: implement machine check safe memcpy Santosh Sivaraj
` (4 preceding siblings ...)
2019-07-05 21:26 ` [v3 5/7] powerpc/memcpy_mcsafe: return remaining bytes Santosh Sivaraj
@ 2019-07-05 21:26 ` Santosh Sivaraj
2019-07-06 6:19 ` Christophe Leroy
2019-07-05 21:26 ` [v3 7/7] powerpc/64s: save r13 in MCE handler (simulator workaround) Santosh Sivaraj
6 siblings, 1 reply; 18+ messages in thread
From: Santosh Sivaraj @ 2019-07-05 21:26 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
Use the memcpy_mcsafe() implementation to define copy_to_user_mcsafe().
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
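For illustration, a userspace model of the caller-visible semantics (hypothetical names; fail_at injects a UE on the source and stands in for the hardware fault):

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Model of copy_to_user_mcsafe(): like copy_to_user(), it returns the
 * number of bytes left uncopied, 0 on full success. */
static size_t copy_to_user_mcsafe_sim(void *to, const void *from, size_t n,
				      size_t fail_at)
{
	size_t copied = fail_at < n ? fail_at : n;

	memcpy(to, from, copied);
	return n - copied;
}

/* Typical caller pattern: any shortfall is reported as -EFAULT. */
static long read_sim(void *ubuf, const void *kbuf, size_t n, size_t fail_at)
{
	if (copy_to_user_mcsafe_sim(ubuf, kbuf, n, fail_at))
		return -EFAULT;
	return (long)n;
}
```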
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/uaccess.h | 12 ++++++++++++
2 files changed, 13 insertions(+)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8c1c636308c8..a173b392c272 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -134,6 +134,7 @@ config PPC
select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && !RELOCATABLE && !HIBERNATION)
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE if PPC64
+ select ARCH_HAS_UACCESS_MCSAFE if PPC64
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAS_ZONE_DEVICE if PPC_BOOK3S_64
select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 76f34346b642..f8fcaab4c5bc 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -386,6 +386,18 @@ static inline unsigned long raw_copy_to_user(void __user *to,
return ret;
}
+static __always_inline unsigned long __must_check
+copy_to_user_mcsafe(void __user *to, const void *from, unsigned long n)
+{
+ if (likely(check_copy_size(from, n, true))) {
+ allow_write_to_user(to, n);
+ n = memcpy_mcsafe(to, from, n);
+ prevent_write_to_user(to, n);
+ }
+
+ return n;
+}
+
extern unsigned long __clear_user(void __user *addr, unsigned long size);
static inline unsigned long clear_user(void __user *addr, unsigned long size)
--
2.20.1
* [v3 7/7] powerpc/64s: save r13 in MCE handler (simulator workaround)
2019-07-05 21:26 [v3 0/7] powerpc: implement machine check safe memcpy Santosh Sivaraj
` (5 preceding siblings ...)
2019-07-05 21:26 ` [v3 6/7] powerpc: add machine check safe copy_to_user Santosh Sivaraj
@ 2019-07-05 21:26 ` Santosh Sivaraj
2019-07-06 9:56 ` Nicholas Piggin
6 siblings, 1 reply; 18+ messages in thread
From: Santosh Sivaraj @ 2019-07-05 21:26 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
From: Reza Arbab <arbab@linux.ibm.com>
Testing my memcpy_mcsafe() work in progress with an injected UE, I get
an error like this immediately after the function returns:
BUG: Unable to handle kernel data access at 0x7fff84dec8f8
Faulting instruction address: 0xc0080000009c00b0
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: mce(O+) vmx_crypto crc32c_vpmsum
CPU: 0 PID: 1375 Comm: modprobe Tainted: G O 5.1.0-rc6 #267
NIP: c0080000009c00b0 LR: c0080000009c00a8 CTR: c000000000095f90
REGS: c0000000ee197790 TRAP: 0300 Tainted: G O (5.1.0-rc6)
MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88002826 XER: 00040000
CFAR: c000000000095f8c DAR: 00007fff84dec8f8 DSISR: 40000000 IRQMASK: 0
GPR00: 000000006c6c6568 c0000000ee197a20 c0080000009c8400 fffffffffffffff2
GPR04: c0080000009c02e0 0000000000000006 0000000000000000 c000000003c834c8
GPR08: 0080000000000000 776a6681b7fb5100 0000000000000000 c0080000009c01c8
GPR12: c000000000095f90 00007fff84debc00 000000004d071440 0000000000000000
GPR16: 0000000100000601 c0080000009e0000 c000000000c98dd8 c000000000c98d98
GPR20: c000000003bba970 c0080000009c04d0 c0080000009c0618 c0000000001e5820
GPR24: 0000000000000000 0000000000000100 0000000000000001 c000000003bba958
GPR28: c0080000009c02e8 c0080000009c0318 c0080000009c02e0 0000000000000000
NIP [c0080000009c00b0] cause_ue+0xa8/0xe8 [mce]
LR [c0080000009c00a8] cause_ue+0xa0/0xe8 [mce]
After debugging we see that the first instruction at vector 200 is skipped by
the simulator, due to which r13 is not saved. Adding a nop at 0x200 fixes the
issue.
(This commit is needed for testing this series. This should not be taken
into the tree)
---
arch/powerpc/kernel/exceptions-64s.S | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 73ba246ca11d..8e43abb2a744 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -255,6 +255,7 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
* some code path might still want to branch into the original
* vector
*/
+ nop
SET_SCRATCH0(r13) /* save r13 */
EXCEPTION_PROLOG_0(PACA_EXMC)
BEGIN_FTR_SECTION
--
2.20.1
* Re: [v3 1/7] powerpc/mce: Make machine_check_ue_event() static
2019-07-05 21:26 ` [v3 1/7] powerpc/mce: Make machine_check_ue_event() static Santosh Sivaraj
@ 2019-07-06 5:18 ` Nicholas Piggin
0 siblings, 0 replies; 18+ messages in thread
From: Nicholas Piggin @ 2019-07-06 5:18 UTC (permalink / raw)
To: linuxppc-dev, Santosh Sivaraj
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Chandan Rajendra, Reza Arbab
Santosh Sivaraj's on July 6, 2019 7:26 am:
> From: Reza Arbab <arbab@linux.ibm.com>
>
> The function doesn't get used outside this file, so make it static.
>
> Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
* Re: [v3 3/7] powerpc/memcpy: Add memcpy_mcsafe for pmem
2019-07-05 21:26 ` [v3 3/7] powerpc/memcpy: Add memcpy_mcsafe for pmem Santosh Sivaraj
@ 2019-07-06 6:12 ` Christophe Leroy
0 siblings, 0 replies; 18+ messages in thread
From: Christophe Leroy @ 2019-07-06 6:12 UTC (permalink / raw)
To: Santosh Sivaraj, linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
On 05/07/2019 at 23:26, Santosh Sivaraj wrote:
> From: Balbir Singh <bsingharora@gmail.com>
>
> The pmem infrastructure uses memcpy_mcsafe in the pmem
> layer so as to convert machine check exceptions into
> a return value on failure in case a machine check
> exception is encountered during the memcpy.
>
> This patch largely borrows from the copyuser_power7
> logic and does not add the VMX optimizations, largely
> to keep the patch simple. If needed those optimizations
> can be folded in.
As this patch largely borrows from copyuser_power7(), could we use a gas
macro to share the source code between the two functions and limit
source code duplication?
Christophe
>
> Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> Acked-by: Nicholas Piggin <npiggin@gmail.com>
> [arbab@linux.ibm.com: Added symbol export]
> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
> ---
> arch/powerpc/include/asm/string.h | 2 +
> arch/powerpc/lib/Makefile | 2 +-
> arch/powerpc/lib/memcpy_mcsafe_64.S | 215 ++++++++++++++++++++++++++++
> 3 files changed, 218 insertions(+), 1 deletion(-)
> create mode 100644 arch/powerpc/lib/memcpy_mcsafe_64.S
>
> diff --git a/arch/powerpc/include/asm/string.h b/arch/powerpc/include/asm/string.h
> index 9bf6dffb4090..b72692702f35 100644
> --- a/arch/powerpc/include/asm/string.h
> +++ b/arch/powerpc/include/asm/string.h
> @@ -53,7 +53,9 @@ void *__memmove(void *to, const void *from, __kernel_size_t n);
> #ifndef CONFIG_KASAN
> #define __HAVE_ARCH_MEMSET32
> #define __HAVE_ARCH_MEMSET64
> +#define __HAVE_ARCH_MEMCPY_MCSAFE
>
> +extern int memcpy_mcsafe(void *dst, const void *src, __kernel_size_t sz);
> extern void *__memset16(uint16_t *, uint16_t v, __kernel_size_t);
> extern void *__memset32(uint32_t *, uint32_t v, __kernel_size_t);
> extern void *__memset64(uint64_t *, uint64_t v, __kernel_size_t);
> diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
> index c55f9c27bf79..529d6536eb4a 100644
> --- a/arch/powerpc/lib/Makefile
> +++ b/arch/powerpc/lib/Makefile
> @@ -39,7 +39,7 @@ obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \
> memcpy_power7.o
>
> obj64-y += copypage_64.o copyuser_64.o mem_64.o hweight_64.o \
> - memcpy_64.o pmem.o
> + memcpy_64.o pmem.o memcpy_mcsafe_64.o
>
> obj64-$(CONFIG_SMP) += locks.o
> obj64-$(CONFIG_ALTIVEC) += vmx-helper.o
> diff --git a/arch/powerpc/lib/memcpy_mcsafe_64.S b/arch/powerpc/lib/memcpy_mcsafe_64.S
> new file mode 100644
> index 000000000000..50f865db0338
> --- /dev/null
> +++ b/arch/powerpc/lib/memcpy_mcsafe_64.S
> @@ -0,0 +1,215 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) IBM Corporation, 2011
> + * Derived from copyuser_power7.s by Anton Blanchard <anton@au.ibm.com>
> + * Author - Balbir Singh <bsingharora@gmail.com>
> + */
> +#include <asm/ppc_asm.h>
> +#include <asm/errno.h>
> +#include <asm/export.h>
> +
> + .macro err1
> +100:
> + EX_TABLE(100b,.Ldo_err1)
> + .endm
> +
> + .macro err2
> +200:
> + EX_TABLE(200b,.Ldo_err2)
> + .endm
> +
> +.Ldo_err2:
> + ld r22,STK_REG(R22)(r1)
> + ld r21,STK_REG(R21)(r1)
> + ld r20,STK_REG(R20)(r1)
> + ld r19,STK_REG(R19)(r1)
> + ld r18,STK_REG(R18)(r1)
> + ld r17,STK_REG(R17)(r1)
> + ld r16,STK_REG(R16)(r1)
> + ld r15,STK_REG(R15)(r1)
> + ld r14,STK_REG(R14)(r1)
> + addi r1,r1,STACKFRAMESIZE
> +.Ldo_err1:
> + li r3,-EFAULT
> + blr
> +
> +
> +_GLOBAL(memcpy_mcsafe)
> + cmpldi r5,16
> + blt .Lshort_copy
> +
> +.Lcopy:
> + /* Get the source 8B aligned */
> + neg r6,r4
> + mtocrf 0x01,r6
> + clrldi r6,r6,(64-3)
> +
> + bf cr7*4+3,1f
> +err1; lbz r0,0(r4)
> + addi r4,r4,1
> +err1; stb r0,0(r3)
> + addi r3,r3,1
> +
> +1: bf cr7*4+2,2f
> +err1; lhz r0,0(r4)
> + addi r4,r4,2
> +err1; sth r0,0(r3)
> + addi r3,r3,2
> +
> +2: bf cr7*4+1,3f
> +err1; lwz r0,0(r4)
> + addi r4,r4,4
> +err1; stw r0,0(r3)
> + addi r3,r3,4
> +
> +3: sub r5,r5,r6
> + cmpldi r5,128
> + blt 5f
> +
> + mflr r0
> + stdu r1,-STACKFRAMESIZE(r1)
> + std r14,STK_REG(R14)(r1)
> + std r15,STK_REG(R15)(r1)
> + std r16,STK_REG(R16)(r1)
> + std r17,STK_REG(R17)(r1)
> + std r18,STK_REG(R18)(r1)
> + std r19,STK_REG(R19)(r1)
> + std r20,STK_REG(R20)(r1)
> + std r21,STK_REG(R21)(r1)
> + std r22,STK_REG(R22)(r1)
> + std r0,STACKFRAMESIZE+16(r1)
> +
> + srdi r6,r5,7
> + mtctr r6
> +
> + /* Now do cacheline (128B) sized loads and stores. */
> + .align 5
> +4:
> +err2; ld r0,0(r4)
> +err2; ld r6,8(r4)
> +err2; ld r7,16(r4)
> +err2; ld r8,24(r4)
> +err2; ld r9,32(r4)
> +err2; ld r10,40(r4)
> +err2; ld r11,48(r4)
> +err2; ld r12,56(r4)
> +err2; ld r14,64(r4)
> +err2; ld r15,72(r4)
> +err2; ld r16,80(r4)
> +err2; ld r17,88(r4)
> +err2; ld r18,96(r4)
> +err2; ld r19,104(r4)
> +err2; ld r20,112(r4)
> +err2; ld r21,120(r4)
> + addi r4,r4,128
> +err2; std r0,0(r3)
> +err2; std r6,8(r3)
> +err2; std r7,16(r3)
> +err2; std r8,24(r3)
> +err2; std r9,32(r3)
> +err2; std r10,40(r3)
> +err2; std r11,48(r3)
> +err2; std r12,56(r3)
> +err2; std r14,64(r3)
> +err2; std r15,72(r3)
> +err2; std r16,80(r3)
> +err2; std r17,88(r3)
> +err2; std r18,96(r3)
> +err2; std r19,104(r3)
> +err2; std r20,112(r3)
> +err2; std r21,120(r3)
> + addi r3,r3,128
> + bdnz 4b
> +
> + clrldi r5,r5,(64-7)
> +
> + ld r14,STK_REG(R14)(r1)
> + ld r15,STK_REG(R15)(r1)
> + ld r16,STK_REG(R16)(r1)
> + ld r17,STK_REG(R17)(r1)
> + ld r18,STK_REG(R18)(r1)
> + ld r19,STK_REG(R19)(r1)
> + ld r20,STK_REG(R20)(r1)
> + ld r21,STK_REG(R21)(r1)
> + ld r22,STK_REG(R22)(r1)
> + addi r1,r1,STACKFRAMESIZE
> +
> + /* Up to 127B to go */
> +5: srdi r6,r5,4
> + mtocrf 0x01,r6
> +
> +6: bf cr7*4+1,7f
> +err1; ld r0,0(r4)
> +err1; ld r6,8(r4)
> +err1; ld r7,16(r4)
> +err1; ld r8,24(r4)
> +err1; ld r9,32(r4)
> +err1; ld r10,40(r4)
> +err1; ld r11,48(r4)
> +err1; ld r12,56(r4)
> + addi r4,r4,64
> +err1; std r0,0(r3)
> +err1; std r6,8(r3)
> +err1; std r7,16(r3)
> +err1; std r8,24(r3)
> +err1; std r9,32(r3)
> +err1; std r10,40(r3)
> +err1; std r11,48(r3)
> +err1; std r12,56(r3)
> + addi r3,r3,64
> +
> + /* Up to 63B to go */
> +7: bf cr7*4+2,8f
> +err1; ld r0,0(r4)
> +err1; ld r6,8(r4)
> +err1; ld r7,16(r4)
> +err1; ld r8,24(r4)
> + addi r4,r4,32
> +err1; std r0,0(r3)
> +err1; std r6,8(r3)
> +err1; std r7,16(r3)
> +err1; std r8,24(r3)
> + addi r3,r3,32
> +
> + /* Up to 31B to go */
> +8: bf cr7*4+3,9f
> +err1; ld r0,0(r4)
> +err1; ld r6,8(r4)
> + addi r4,r4,16
> +err1; std r0,0(r3)
> +err1; std r6,8(r3)
> + addi r3,r3,16
> +
> +9: clrldi r5,r5,(64-4)
> +
> + /* Up to 15B to go */
> +.Lshort_copy:
> + mtocrf 0x01,r5
> + bf cr7*4+0,12f
> +err1; lwz r0,0(r4) /* Less chance of a reject with word ops */
> +err1; lwz r6,4(r4)
> + addi r4,r4,8
> +err1; stw r0,0(r3)
> +err1; stw r6,4(r3)
> + addi r3,r3,8
> +
> +12: bf cr7*4+1,13f
> +err1; lwz r0,0(r4)
> + addi r4,r4,4
> +err1; stw r0,0(r3)
> + addi r3,r3,4
> +
> +13: bf cr7*4+2,14f
> +err1; lhz r0,0(r4)
> + addi r4,r4,2
> +err1; sth r0,0(r3)
> + addi r3,r3,2
> +
> +14: bf cr7*4+3,15f
> +err1; lbz r0,0(r4)
> +err1; stb r0,0(r3)
> +
> +15: li r3,0
> + blr
> +
> +EXPORT_SYMBOL_GPL(memcpy_mcsafe);
>
* Re: [v3 5/7] powerpc/memcpy_mcsafe: return remaining bytes
2019-07-05 21:26 ` [v3 5/7] powerpc/memcpy_mcsafe: return remaining bytes Santosh Sivaraj
@ 2019-07-06 6:16 ` Christophe Leroy
2019-07-06 10:05 ` Nicholas Piggin
0 siblings, 1 reply; 18+ messages in thread
From: Christophe Leroy @ 2019-07-06 6:16 UTC (permalink / raw)
To: Santosh Sivaraj, linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
On 05/07/2019 at 23:26, Santosh Sivaraj wrote:
> memcpy_mcsafe currently returns -EFAULT on a machine check exception; change
> it to return the remaining bytes that need to be copied, so that machine
> check safe copy_to_user can maintain the same behavior as copy_to_user.
AFAIU, this behaviour is the expected behaviour for memcpy_mcsafe(). Why
implement a different behaviour in patch 3 and then change it here?
Can't memcpy_mcsafe() return the remaining bytes as expected from patch 3?
Christophe
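For reference, the byte-by-byte fixup the patch below introduces (.Ldo_err1 retries the copy with a CTR countdown so that mfctr yields the exact remaining count) can be modeled in C. The fault_at parameter and the names are illustrative assumptions:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of the .Ldo_err1 fixup path: after a fault in the wide copy,
 * retry byte by byte with a countdown (r7/ctr in the asm) so the exact
 * number of uncopied bytes can be returned.
 */
static size_t fixup_byte_copy(unsigned char *dst, const unsigned char *src,
			      size_t remaining, size_t fault_at)
{
	size_t i;

	for (i = 0; i < remaining; i++) {
		if (fault_at == 0)
			return remaining - i;	/* mfctr r3: bytes left */
		dst[i] = src[i];
		fault_at--;			/* bytes until simulated UE */
	}
	return 0;				/* li r3,0: full success */
}
```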
>
> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
> ---
> arch/powerpc/lib/memcpy_mcsafe_64.S | 142 ++++++++++++++++------------
> 1 file changed, 83 insertions(+), 59 deletions(-)
>
> diff --git a/arch/powerpc/lib/memcpy_mcsafe_64.S b/arch/powerpc/lib/memcpy_mcsafe_64.S
> index 50f865db0338..4d8a3d315992 100644
> --- a/arch/powerpc/lib/memcpy_mcsafe_64.S
> +++ b/arch/powerpc/lib/memcpy_mcsafe_64.S
> @@ -30,11 +30,25 @@
> ld r14,STK_REG(R14)(r1)
> addi r1,r1,STACKFRAMESIZE
> .Ldo_err1:
> - li r3,-EFAULT
> + /* Do a byte by byte copy to get the exact remaining size */
> + mtctr r7
> +100: EX_TABLE(100b, .Ldone)
> +46:
> +err1; lbz r0,0(r4)
> + addi r4,r4,1
> +err1; stb r0,0(r3)
> + addi r3,r3,1
> + bdnz 46b
> + li r3,0
> + blr
> +
> +.Ldone:
> + mfctr r3
> blr
>
>
> _GLOBAL(memcpy_mcsafe)
> + mr r7,r5
> cmpldi r5,16
> blt .Lshort_copy
>
> @@ -49,18 +63,21 @@ err1; lbz r0,0(r4)
> addi r4,r4,1
> err1; stb r0,0(r3)
> addi r3,r3,1
> + subi r7,r7,1
>
> 1: bf cr7*4+2,2f
> err1; lhz r0,0(r4)
> addi r4,r4,2
> err1; sth r0,0(r3)
> addi r3,r3,2
> + subi r7,r7,2
>
> 2: bf cr7*4+1,3f
> err1; lwz r0,0(r4)
> addi r4,r4,4
> err1; stw r0,0(r3)
> addi r3,r3,4
> + subi r7,r7,4
>
> 3: sub r5,r5,r6
> cmpldi r5,128
> @@ -87,43 +104,69 @@ err1; stw r0,0(r3)
> 4:
> err2; ld r0,0(r4)
> err2; ld r6,8(r4)
> -err2; ld r7,16(r4)
> -err2; ld r8,24(r4)
> -err2; ld r9,32(r4)
> -err2; ld r10,40(r4)
> -err2; ld r11,48(r4)
> -err2; ld r12,56(r4)
> -err2; ld r14,64(r4)
> -err2; ld r15,72(r4)
> -err2; ld r16,80(r4)
> -err2; ld r17,88(r4)
> -err2; ld r18,96(r4)
> -err2; ld r19,104(r4)
> -err2; ld r20,112(r4)
> -err2; ld r21,120(r4)
> +err2; ld r8,16(r4)
> +err2; ld r9,24(r4)
> +err2; ld r10,32(r4)
> +err2; ld r11,40(r4)
> +err2; ld r12,48(r4)
> +err2; ld r14,56(r4)
> +err2; ld r15,64(r4)
> +err2; ld r16,72(r4)
> +err2; ld r17,80(r4)
> +err2; ld r18,88(r4)
> +err2; ld r19,96(r4)
> +err2; ld r20,104(r4)
> +err2; ld r21,112(r4)
> +err2; ld r22,120(r4)
> addi r4,r4,128
> err2; std r0,0(r3)
> err2; std r6,8(r3)
> -err2; std r7,16(r3)
> -err2; std r8,24(r3)
> -err2; std r9,32(r3)
> -err2; std r10,40(r3)
> -err2; std r11,48(r3)
> -err2; std r12,56(r3)
> -err2; std r14,64(r3)
> -err2; std r15,72(r3)
> -err2; std r16,80(r3)
> -err2; std r17,88(r3)
> -err2; std r18,96(r3)
> -err2; std r19,104(r3)
> -err2; std r20,112(r3)
> -err2; std r21,120(r3)
> +err2; std r8,16(r3)
> +err2; std r9,24(r3)
> +err2; std r10,32(r3)
> +err2; std r11,40(r3)
> +err2; std r12,48(r3)
> +err2; std r14,56(r3)
> +err2; std r15,64(r3)
> +err2; std r16,72(r3)
> +err2; std r17,80(r3)
> +err2; std r18,88(r3)
> +err2; std r19,96(r3)
> +err2; std r20,104(r3)
> +err2; std r21,112(r3)
> +err2; std r22,120(r3)
> addi r3,r3,128
> + subi r7,r7,128
> bdnz 4b
>
> clrldi r5,r5,(64-7)
>
> - ld r14,STK_REG(R14)(r1)
> + /* Up to 127B to go */
> +5: srdi r6,r5,4
> + mtocrf 0x01,r6
> +
> +6: bf cr7*4+1,7f
> +err2; ld r0,0(r4)
> +err2; ld r6,8(r4)
> +err2; ld r8,16(r4)
> +err2; ld r9,24(r4)
> +err2; ld r10,32(r4)
> +err2; ld r11,40(r4)
> +err2; ld r12,48(r4)
> +err2; ld r14,56(r4)
> + addi r4,r4,64
> +err2; std r0,0(r3)
> +err2; std r6,8(r3)
> +err2; std r8,16(r3)
> +err2; std r9,24(r3)
> +err2; std r10,32(r3)
> +err2; std r11,40(r3)
> +err2; std r12,48(r3)
> +err2; std r14,56(r3)
> + addi r3,r3,64
> + subi r7,r7,64
> +
> +7: ld r14,STK_REG(R14)(r1)
> ld r15,STK_REG(R15)(r1)
> ld r16,STK_REG(R16)(r1)
> ld r17,STK_REG(R17)(r1)
> @@ -134,42 +177,19 @@ err2; std r21,120(r3)
> ld r22,STK_REG(R22)(r1)
> addi r1,r1,STACKFRAMESIZE
>
> - /* Up to 127B to go */
> -5: srdi r6,r5,4
> - mtocrf 0x01,r6
> -
> -6: bf cr7*4+1,7f
> -err1; ld r0,0(r4)
> -err1; ld r6,8(r4)
> -err1; ld r7,16(r4)
> -err1; ld r8,24(r4)
> -err1; ld r9,32(r4)
> -err1; ld r10,40(r4)
> -err1; ld r11,48(r4)
> -err1; ld r12,56(r4)
> - addi r4,r4,64
> -err1; std r0,0(r3)
> -err1; std r6,8(r3)
> -err1; std r7,16(r3)
> -err1; std r8,24(r3)
> -err1; std r9,32(r3)
> -err1; std r10,40(r3)
> -err1; std r11,48(r3)
> -err1; std r12,56(r3)
> - addi r3,r3,64
> -
> /* Up to 63B to go */
> -7: bf cr7*4+2,8f
> + bf cr7*4+2,8f
> err1; ld r0,0(r4)
> err1; ld r6,8(r4)
> -err1; ld r7,16(r4)
> -err1; ld r8,24(r4)
> +err1; ld r8,16(r4)
> +err1; ld r9,24(r4)
> addi r4,r4,32
> err1; std r0,0(r3)
> err1; std r6,8(r3)
> -err1; std r7,16(r3)
> -err1; std r8,24(r3)
> +err1; std r8,16(r3)
> +err1; std r9,24(r3)
> addi r3,r3,32
> + subi r7,r7,32
>
> /* Up to 31B to go */
> 8: bf cr7*4+3,9f
> @@ -179,6 +199,7 @@ err1; ld r6,8(r4)
> err1; std r0,0(r3)
> err1; std r6,8(r3)
> addi r3,r3,16
> + subi r7,r7,16
>
> 9: clrldi r5,r5,(64-4)
>
> @@ -192,18 +213,21 @@ err1; lwz r6,4(r4)
> err1; stw r0,0(r3)
> err1; stw r6,4(r3)
> addi r3,r3,8
> + subi r7,r7,8
>
> 12: bf cr7*4+1,13f
> err1; lwz r0,0(r4)
> addi r4,r4,4
> err1; stw r0,0(r3)
> addi r3,r3,4
> + subi r7,r7,4
>
> 13: bf cr7*4+2,14f
> err1; lhz r0,0(r4)
> addi r4,r4,2
> err1; sth r0,0(r3)
> addi r3,r3,2
> + subi r7,r7,2
>
> 14: bf cr7*4+3,15f
> err1; lbz r0,0(r4)
>
* Re: [v3 6/7] powerpc: add machine check safe copy_to_user
2019-07-05 21:26 ` [v3 6/7] powerpc: add machine check safe copy_to_user Santosh Sivaraj
@ 2019-07-06 6:19 ` Christophe Leroy
0 siblings, 0 replies; 18+ messages in thread
From: Christophe Leroy @ 2019-07-06 6:19 UTC (permalink / raw)
To: Santosh Sivaraj, linuxppc-dev
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Nicholas Piggin,
Chandan Rajendra, Reza Arbab
On 05/07/2019 at 23:26, Santosh Sivaraj wrote:
> Use memcpy_mcsafe() implementation to define copy_to_user_mcsafe()
>
> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
> ---
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/include/asm/uaccess.h | 12 ++++++++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 8c1c636308c8..a173b392c272 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -134,6 +134,7 @@ config PPC
> select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && !RELOCATABLE && !HIBERNATION)
> select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
> select ARCH_HAS_UACCESS_FLUSHCACHE if PPC64
> + select ARCH_HAS_UACCESS_MCSAFE if PPC64
> select ARCH_HAS_UBSAN_SANITIZE_ALL
> select ARCH_HAS_ZONE_DEVICE if PPC_BOOK3S_64
> select ARCH_HAVE_NMI_SAFE_CMPXCHG
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 76f34346b642..f8fcaab4c5bc 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -386,6 +386,18 @@ static inline unsigned long raw_copy_to_user(void __user *to,
> return ret;
> }
>
> +static __always_inline unsigned long __must_check
> +copy_to_user_mcsafe(void __user *to, const void *from, unsigned long n)
> +{
> + if (likely(check_copy_size(from, n, true))) {
> + allow_write_to_user(to, n);
> + n = memcpy_mcsafe(to, from, n);
memcpy_mcsafe() doesn't expect a __user pointer. I think at least a cast
is needed.
And should we perform an access_ok() verification too?
Christophe
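To illustrate the suggestion, a user-space sketch of the kind of range check access_ok() performs — the address-range constant and all names here are assumptions, not the kernel's implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Notional top of the user address range, for illustration only. */
#define MODEL_USER_END 0x0000400000000000UL

/*
 * Model of an access_ok()-style check: reject a copy whose range falls
 * outside the user address space before attempting it.
 */
static int model_access_ok(unsigned long addr, size_t n)
{
	/* both the start and the length must stay within the user range */
	return addr <= MODEL_USER_END && n <= MODEL_USER_END - addr;
}
```

Note the overflow-safe form: comparing `n` against `MODEL_USER_END - addr` avoids computing `addr + n`, which could wrap.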
> + prevent_write_to_user(to, n);
> + }
> +
> + return n;
> +}
> +
> extern unsigned long __clear_user(void __user *addr, unsigned long size);
>
> static inline unsigned long clear_user(void __user *addr, unsigned long size)
>
* Re: [v3 2/7] powerpc/mce: Bug fixes for MCE handling in kernel space
2019-07-05 21:26 ` [v3 2/7] powerpc/mce: Bug fixes for MCE handling in kernel space Santosh Sivaraj
@ 2019-07-06 9:44 ` Nicholas Piggin
0 siblings, 0 replies; 18+ messages in thread
From: Nicholas Piggin @ 2019-07-06 9:44 UTC (permalink / raw)
To: linuxppc-dev, Santosh Sivaraj
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Chandan Rajendra, Reza Arbab
Santosh Sivaraj's on July 6, 2019 7:26 am:
> From: Balbir Singh <bsingharora@gmail.com>
>
> The code currently assumes PAGE_SHIFT as the shift value of
> the pfn,
This comment doesn't really make sense on its own. Linux pfns
are always units of page shift, so if it's not that then it's
not a pfn.
I think you want the information from the last paragraph moved up
here, because that explains exactly what the problem is.
> this works correctly (mostly) for user space pages,
> but the correct thing to do is
>
> 1. Extract the shift value returned via the pte-walk API's
> 2. Use the shift value to access the instruction address.
>
> Note, the final physical address still use PAGE_SHIFT for
> computation. handle_ierror() is not modified and handle_derror()
> is modified just for extracting the correct instruction
> address.
>
> This is largely due to __find_linux_pte() returning pfn's
> shifted by pdshift. The code is much more generic and can
> handle shift values returned.
>
> Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")
>
> Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> [arbab@linux.ibm.com: Fixup pseries_do_memory_failure()]
> Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
> ---
> arch/powerpc/include/asm/mce.h | 3 ++-
> arch/powerpc/kernel/mce_power.c | 26 ++++++++++++++++----------
> arch/powerpc/platforms/pseries/ras.c | 6 ++++--
> 3 files changed, 22 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
> index a4c6a74ad2fb..94888a7025b3 100644
> --- a/arch/powerpc/include/asm/mce.h
> +++ b/arch/powerpc/include/asm/mce.h
> @@ -209,7 +209,8 @@ extern void release_mce_event(void);
> extern void machine_check_queue_event(void);
> extern void machine_check_print_event_info(struct machine_check_event *evt,
> bool user_mode, bool in_guest);
> -unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
> +unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr,
> + unsigned int *shift);
> #ifdef CONFIG_PPC_BOOK3S_64
> void flush_and_reload_slb(void);
> #endif /* CONFIG_PPC_BOOK3S_64 */
> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> index e39536aad30d..04666c0b40a8 100644
> --- a/arch/powerpc/kernel/mce_power.c
> +++ b/arch/powerpc/kernel/mce_power.c
> @@ -23,7 +23,8 @@
> * Convert an address related to an mm to a PFN. NOTE: we are in real
> * mode, we could potentially race with page table updates.
> */
> -unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
> +unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr,
> + unsigned int *shift)
> {
> pte_t *ptep;
> unsigned long flags;
addr_to_pfn is too generic a name for what it does I think. I would
call it ppc_addr_to_pfn maybe. But it has other weirdness too, it's
assuming the kernel can't fault on user addresses AFAIKS.
> @@ -36,13 +37,15 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
>
> local_irq_save(flags);
> if (mm == current->mm)
> - ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
> + ptep = find_current_mm_pte(mm->pgd, addr, NULL, shift);
> else
> - ptep = find_init_mm_pte(addr, NULL);
> + ptep = find_init_mm_pte(addr, shift);
> local_irq_restore(flags);
> if (!ptep || pte_special(*ptep))
> return ULONG_MAX;
> - return pte_pfn(*ptep);
> + if (!*shift)
> + *shift = PAGE_SHIFT;
> + return (pte_val(*ptep) & PTE_RPN_MASK) >> *shift;
> }
>
> /* flush SLBs and reload */
AFAIKS moving this weirdness into the callers is pointless; fix it up
into a proper Linux pfn in addr_to_pfn.
The duplication and divergence of machine check code between powernv
and pseries is bad, I'm looking through it and from start to end it's
trying to do basically the same thing in slightly different ways, and
we've still got a lot of bugs. I wonder if we shouldn't try to fix it
all and merge them to a single implementation...
Actually your code does not touch the core of it too significantly so
it looks like we don't have to delay your stuff in order to do that.
This is a good fix though, it should go as a fix at the top of the
series.
Thanks,
Nick
* Re: [v3 4/7] powerpc/mce: Handle UE event for memcpy_mcsafe
2019-07-05 21:26 ` [v3 4/7] powerpc/mce: Handle UE event for memcpy_mcsafe Santosh Sivaraj
@ 2019-07-06 9:53 ` Nicholas Piggin
2019-07-08 8:23 ` Mahesh Jagannath Salgaonkar
0 siblings, 1 reply; 18+ messages in thread
From: Nicholas Piggin @ 2019-07-06 9:53 UTC (permalink / raw)
To: linuxppc-dev, Santosh Sivaraj
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Chandan Rajendra, Reza Arbab
Santosh Sivaraj's on July 6, 2019 7:26 am:
> If we take a UE on one of the instructions with a fixup entry, set nip
> to continue execution at the fixup entry. Do not process the event
> further or print it.
Minor nit, but can you instead add a field in the mce data structure that
describes the property of the event, and then the code that intends to
ignore such events can test for it (with an appropriate comment).
So it would be has_fixup_handler or similar. Then queue_event can have
the logic
/*
* Don't report this machine check because the caller has a fixup
* handler which will do the appropriate error handling and reporting.
*/
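In C, the suggested field and the test at queue time might look like this minimal sketch (the structure, field, and function names are assumptions, not the final kernel code):

```c
#include <assert.h>

/*
 * Record on the event whether an exception-table fixup handler
 * recovered it, and test that flag where events are queued for
 * reporting.
 */
struct model_mce_event {
	int severity;
	int has_fixup_handler;	/* set when NIP was redirected to a fixup */
};

static int model_should_report(const struct model_mce_event *evt)
{
	/*
	 * Don't report this machine check because the caller has a fixup
	 * handler which will do the appropriate error handling and
	 * reporting.
	 */
	return !evt->has_fixup_handler;
}
```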
> @@ -565,9 +567,18 @@ static int mce_handle_derror(struct pt_regs *regs,
> return 0;
> }
>
> -static long mce_handle_ue_error(struct pt_regs *regs)
> +static long mce_handle_ue_error(struct pt_regs *regs,
> + struct mce_error_info *mce_err)
> {
> long handled = 0;
> + const struct exception_table_entry *entry;
> +
> + entry = search_exception_tables(regs->nip);
Uh oh, this searches module exception tables too... we can't do that
in real mode, can we?
Thanks,
Nick
* Re: [v3 7/7] powerpc/64s: save r13 in MCE handler (simulator workaround)
2019-07-05 21:26 ` [v3 7/7] powerpc/64s: save r13 in MCE handler (simulator workaround) Santosh Sivaraj
@ 2019-07-06 9:56 ` Nicholas Piggin
2019-07-08 15:11 ` Reza Arbab
0 siblings, 1 reply; 18+ messages in thread
From: Nicholas Piggin @ 2019-07-06 9:56 UTC (permalink / raw)
To: linuxppc-dev, Santosh Sivaraj
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Chandan Rajendra, Reza Arbab
Santosh Sivaraj's on July 6, 2019 7:26 am:
> From: Reza Arbab <arbab@linux.ibm.com>
>
> Testing my memcpy_mcsafe() work in progress with an injected UE, I get
> an error like this immediately after the function returns:
>
> BUG: Unable to handle kernel data access at 0x7fff84dec8f8
> Faulting instruction address: 0xc0080000009c00b0
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
> Modules linked in: mce(O+) vmx_crypto crc32c_vpmsum
> CPU: 0 PID: 1375 Comm: modprobe Tainted: G O 5.1.0-rc6 #267
> NIP: c0080000009c00b0 LR: c0080000009c00a8 CTR: c000000000095f90
> REGS: c0000000ee197790 TRAP: 0300 Tainted: G O (5.1.0-rc6)
> MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88002826 XER: 00040000
> CFAR: c000000000095f8c DAR: 00007fff84dec8f8 DSISR: 40000000 IRQMASK: 0
> GPR00: 000000006c6c6568 c0000000ee197a20 c0080000009c8400 fffffffffffffff2
> GPR04: c0080000009c02e0 0000000000000006 0000000000000000 c000000003c834c8
> GPR08: 0080000000000000 776a6681b7fb5100 0000000000000000 c0080000009c01c8
> GPR12: c000000000095f90 00007fff84debc00 000000004d071440 0000000000000000
> GPR16: 0000000100000601 c0080000009e0000 c000000000c98dd8 c000000000c98d98
> GPR20: c000000003bba970 c0080000009c04d0 c0080000009c0618 c0000000001e5820
> GPR24: 0000000000000000 0000000000000100 0000000000000001 c000000003bba958
> GPR28: c0080000009c02e8 c0080000009c0318 c0080000009c02e0 0000000000000000
> NIP [c0080000009c00b0] cause_ue+0xa8/0xe8 [mce]
> LR [c0080000009c00a8] cause_ue+0xa0/0xe8 [mce]
>
> After debugging we see that the first instruction at vector 200 is skipped by
> the simulator, due to which r13 is not saved. Adding a nop at 0x200 fixes the
> issue.
>
> (This commit is needed for testing this series. This should not be taken
> into the tree)
Would be good if this was testable in simulator upstream, did you
report it? What does cause_ue do? exc_mce in mambo seems to do the
right thing AFAIKS.
Thanks,
Nick
> ---
> arch/powerpc/kernel/exceptions-64s.S | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 73ba246ca11d..8e43abb2a744 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -255,6 +255,7 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
> * some code path might still want to branch into the original
> * vector
> */
> + nop
> SET_SCRATCH0(r13) /* save r13 */
> EXCEPTION_PROLOG_0(PACA_EXMC)
> BEGIN_FTR_SECTION
> --
> 2.20.1
>
>
* Re: [v3 5/7] powerpc/memcpy_mcsafe: return remaining bytes
2019-07-06 6:16 ` Christophe Leroy
@ 2019-07-06 10:05 ` Nicholas Piggin
0 siblings, 0 replies; 18+ messages in thread
From: Nicholas Piggin @ 2019-07-06 10:05 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev, Santosh Sivaraj
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Chandan Rajendra, Reza Arbab
Christophe Leroy's on July 6, 2019 4:16 pm:
>
>
> On 05/07/2019 at 23:26, Santosh Sivaraj wrote:
>> memcpy_mcsafe currently returns -EFAULT on a machine check exception; change
>> it to return the remaining bytes that need to be copied, so that machine
>> check safe copy_to_user can maintain the same behavior as copy_to_user.
>
> AFAIU, this behaviour is the expected behaviour for memcpy_mcsafe(). Why
> implement a different behaviour in patch 3 and then change it here?
> Can't memcpy_mcsafe() return the remaining bytes as expected from patch 3?
+1
This approach might be useful if you copy an existing implementation
exactly and then make some changes incrementally, but patch 3 is
already modified.
Thanks,
Nick
* Re: [v3 4/7] powerpc/mce: Handle UE event for memcpy_mcsafe
2019-07-06 9:53 ` Nicholas Piggin
@ 2019-07-08 8:23 ` Mahesh Jagannath Salgaonkar
0 siblings, 0 replies; 18+ messages in thread
From: Mahesh Jagannath Salgaonkar @ 2019-07-08 8:23 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev, Santosh Sivaraj
Cc: Aneesh Kumar K.V, Mahesh Salgaonkar, Chandan Rajendra, Reza Arbab
On 7/6/19 3:23 PM, Nicholas Piggin wrote:
> Santosh Sivaraj's on July 6, 2019 7:26 am:
>> If we take a UE on one of the instructions with a fixup entry, set nip
>> to continue execution at the fixup entry. Do not process the event
>> further or print it.
>
> Minor nit, but can you instead add a field in the mce data structure that
> describes the property of the event, and then the code that intends to
> ignore such events can test for it (with an appropriate comment).
>
> So it would be has_fixup_handler or similar. Then queue_event can have
> the logic
>
> /*
> * Don't report this machine check because the caller has a fixup
> * handler which will do the appropriate error handling and reporting.
> */
>
>
>> @@ -565,9 +567,18 @@ static int mce_handle_derror(struct pt_regs *regs,
>> return 0;
>> }
>>
>> -static long mce_handle_ue_error(struct pt_regs *regs)
>> +static long mce_handle_ue_error(struct pt_regs *regs,
>> + struct mce_error_info *mce_err)
>> {
>> long handled = 0;
>> + const struct exception_table_entry *entry;
>> +
>> + entry = search_exception_tables(regs->nip);
>
> Uh oh, this searches module exception tables too... we can't do that
> in real mode, can we?
Yeah, we cannot do that in real mode. Should we directly call
search_extable() for the kernel exception table?
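For illustration, a lookup restricted to a single sorted in-kernel table — safe in real mode, unlike walking module tables — could be modeled like this in user-space C. The structure and names are illustrative; the kernel's search_extable() similarly binary-searches one sorted table:

```c
#include <assert.h>
#include <stddef.h>

struct model_extable_entry {
	unsigned long insn;	/* faulting instruction address */
	unsigned long fixup;	/* address to resume execution at */
};

/*
 * Binary search over a table sorted by instruction address; returns
 * NULL when no fixup entry covers the faulting address.
 */
static const struct model_extable_entry *
model_search_extable(const struct model_extable_entry *base, size_t num,
		     unsigned long value)
{
	size_t lo = 0, hi = num;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (base[mid].insn == value)
			return &base[mid];
		if (base[mid].insn < value)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;	/* no fixup: not recoverable this way */
}
```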
>
> Thanks,
> Nick
>
* Re: [v3 7/7] powerpc/64s: save r13 in MCE handler (simulator workaround)
2019-07-06 9:56 ` Nicholas Piggin
@ 2019-07-08 15:11 ` Reza Arbab
0 siblings, 0 replies; 18+ messages in thread
From: Reza Arbab @ 2019-07-08 15:11 UTC (permalink / raw)
To: Nicholas Piggin
Cc: Santosh Sivaraj, Aneesh Kumar K.V, Mahesh Salgaonkar,
Chandan Rajendra, linuxppc-dev
On Sat, Jul 06, 2019 at 07:56:39PM +1000, Nicholas Piggin wrote:
>Santosh Sivaraj's on July 6, 2019 7:26 am:
>> From: Reza Arbab <arbab@linux.ibm.com>
>>
>> Testing my memcpy_mcsafe() work in progress with an injected UE, I get
>> an error like this immediately after the function returns:
>>
>> BUG: Unable to handle kernel data access at 0x7fff84dec8f8
>> Faulting instruction address: 0xc0080000009c00b0
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
>> Modules linked in: mce(O+) vmx_crypto crc32c_vpmsum
>> CPU: 0 PID: 1375 Comm: modprobe Tainted: G O 5.1.0-rc6 #267
>> NIP: c0080000009c00b0 LR: c0080000009c00a8 CTR: c000000000095f90
>> REGS: c0000000ee197790 TRAP: 0300 Tainted: G O (5.1.0-rc6)
>> MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88002826 XER: 00040000
>> CFAR: c000000000095f8c DAR: 00007fff84dec8f8 DSISR: 40000000 IRQMASK: 0
>> GPR00: 000000006c6c6568 c0000000ee197a20 c0080000009c8400 fffffffffffffff2
>> GPR04: c0080000009c02e0 0000000000000006 0000000000000000 c000000003c834c8
>> GPR08: 0080000000000000 776a6681b7fb5100 0000000000000000 c0080000009c01c8
>> GPR12: c000000000095f90 00007fff84debc00 000000004d071440 0000000000000000
>> GPR16: 0000000100000601 c0080000009e0000 c000000000c98dd8 c000000000c98d98
>> GPR20: c000000003bba970 c0080000009c04d0 c0080000009c0618 c0000000001e5820
>> GPR24: 0000000000000000 0000000000000100 0000000000000001 c000000003bba958
>> GPR28: c0080000009c02e8 c0080000009c0318 c0080000009c02e0 0000000000000000
>> NIP [c0080000009c00b0] cause_ue+0xa8/0xe8 [mce]
>> LR [c0080000009c00a8] cause_ue+0xa0/0xe8 [mce]
>>
>> After debugging we see that the first instruction at vector 200 is skipped by
>> the simulator, due to which r13 is not saved. Adding a nop at 0x200 fixes the
>> issue.
>>
>> (This commit is needed for testing this series. This should not be taken
>> into the tree)
>
>Would be good if this was testable in simulator upstream, did you
>report it? What does cause_ue do? exc_mce in mambo seems to do the
>right thing AFAIKS.
I think I posted this earlier, but cause_ue() is just a test function
telling me where to set up the error injection:
static noinline void cause_ue(void)
{
	static const char src[] = "hello";
	char dst[10];
	int rc;

	/* During the pause, break into mambo and run the following */
	pr_info("inject_mce_ue_on_addr 0x%px\n", src);
	pause(10);

	rc = memcpy_mcsafe(dst, src, sizeof(src));
	pr_info("memcpy_mcsafe() returns %d\n", rc);
	if (!rc)
		pr_info("dst=\"%s\"\n", dst);
}
Can't speak for the others, but I haven't chased this upstream. I didn't
know it was a simulator issue.
--
Reza Arbab
Thread overview: 18+ messages
2019-07-05 21:26 [v3 0/7] powerpc: implement machine check safe memcpy Santosh Sivaraj
2019-07-05 21:26 ` [v3 1/7] powerpc/mce: Make machine_check_ue_event() static Santosh Sivaraj
2019-07-06 5:18 ` Nicholas Piggin
2019-07-05 21:26 ` [v3 2/7] powerpc/mce: Bug fixes for MCE handling in kernel space Santosh Sivaraj
2019-07-06 9:44 ` Nicholas Piggin
2019-07-05 21:26 ` [v3 3/7] powerpc/memcpy: Add memcpy_mcsafe for pmem Santosh Sivaraj
2019-07-06 6:12 ` Christophe Leroy
2019-07-05 21:26 ` [v3 4/7] powerpc/mce: Handle UE event for memcpy_mcsafe Santosh Sivaraj
2019-07-06 9:53 ` Nicholas Piggin
2019-07-08 8:23 ` Mahesh Jagannath Salgaonkar
2019-07-05 21:26 ` [v3 5/7] powerpc/memcpy_mcsafe: return remaining bytes Santosh Sivaraj
2019-07-06 6:16 ` Christophe Leroy
2019-07-06 10:05 ` Nicholas Piggin
2019-07-05 21:26 ` [v3 6/7] powerpc: add machine check safe copy_to_user Santosh Sivaraj
2019-07-06 6:19 ` Christophe Leroy
2019-07-05 21:26 ` [v3 7/7] powerpc/64s: save r13 in MCE handler (simulator workaround) Santosh Sivaraj
2019-07-06 9:56 ` Nicholas Piggin
2019-07-08 15:11 ` Reza Arbab