* [PATCH v6 12/22] powerpc/book3s64/pkeys: Reset userspace AMR correctly on exec
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, Sandipan Das
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
On fork, the child inherits the AMR/IAMR values from the parent; on exec, we
should switch to the default_amr/default_iamr values. Also, avoid changing the
AMR register value from within the kernel, because the kernel now runs with AMR
values that differ from the userspace ones.
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
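A minimal sketch of the fork/exec behaviour described in the commit message, as a userspace model rather than kernel code. The `model_*` names and `struct model_regs` are hypothetical; the two default values are the initializers visible in the pkeys.c hunk below, and the real defaults may be adjusted further during early boot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for default_amr/default_iamr (initial values only). */
static const uint64_t model_default_amr  = ~UINT64_C(0);
static const uint64_t model_default_iamr = UINT64_C(0x5555555555555555);

/* Hypothetical model of the two pt_regs fields touched by the patch. */
struct model_regs {
	uint64_t amr;
	uint64_t iamr;
};

/* fork: the child starts with a copy of the parent's key registers. */
static inline struct model_regs model_fork(const struct model_regs *parent)
{
	assert(parent);
	return *parent;
}

/* exec: the new image starts from the defaults, whatever was inherited. */
static inline void model_exec(struct model_regs *regs)
{
	assert(regs);
	regs->amr = model_default_amr;
	regs->iamr = model_default_iamr;
}
```

Only the saved register image is updated here, which mirrors the patch setting `current->thread.regs->amr` instead of writing the SPR directly.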
arch/powerpc/include/asm/book3s/64/pkeys.h | 2 ++
arch/powerpc/kernel/process.c | 6 +++++-
arch/powerpc/mm/book3s64/pkeys.c | 16 ++--------------
3 files changed, 9 insertions(+), 15 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/pkeys.h b/arch/powerpc/include/asm/book3s/64/pkeys.h
index b7d9f4267bcd..3b8640498f5b 100644
--- a/arch/powerpc/include/asm/book3s/64/pkeys.h
+++ b/arch/powerpc/include/asm/book3s/64/pkeys.h
@@ -6,6 +6,8 @@
#include <asm/book3s/64/hash-pkey.h>
extern u64 __ro_after_init default_uamor;
+extern u64 __ro_after_init default_amr;
+extern u64 __ro_after_init default_iamr;
static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
{
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 733680de0ba4..98f7e9ec766f 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1542,6 +1542,11 @@ void arch_setup_new_exec(void)
current->thread.regs = regs - 1;
}
+#ifdef CONFIG_PPC_MEM_KEYS
+ current->thread.regs->amr = default_amr;
+ current->thread.regs->iamr = default_iamr;
+#endif
+
}
#else
void arch_setup_new_exec(void)
@@ -1902,7 +1907,6 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
current->thread.load_tm = 0;
#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
- thread_pkey_regs_init(&current->thread);
}
EXPORT_SYMBOL(start_thread);
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 640f090b9f9d..f47d11f2743d 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -28,8 +28,8 @@ static u32 initial_allocation_mask __ro_after_init;
* Even if we allocate keys with sys_pkey_alloc(), we need to make sure
* other thread still find the access denied using the same keys.
*/
-static u64 default_amr = ~0x0UL;
-static u64 default_iamr = 0x5555555555555555UL;
+u64 default_amr __ro_after_init = ~0x0UL;
+u64 default_iamr __ro_after_init = 0x5555555555555555UL;
u64 default_uamor __ro_after_init;
/*
* Key used to implement PROT_EXEC mmap. Denies READ/WRITE
@@ -388,18 +388,6 @@ void thread_pkey_regs_restore(struct thread_struct *new_thread,
write_iamr(new_thread->iamr);
}
-void thread_pkey_regs_init(struct thread_struct *thread)
-{
- if (!mmu_has_feature(MMU_FTR_PKEY))
- return;
-
- thread->amr = default_amr;
- thread->iamr = default_iamr;
-
- write_amr(default_amr);
- write_iamr(default_iamr);
-}
-
int execute_only_pkey(struct mm_struct *mm)
{
return mm->context.execute_only_pkey;
--
2.28.0
* [PATCH v6 13/22] powerpc/ptrace-view: Use pt_regs values instead of thread_struct based one.
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
thread.amr/iamr/uamor will be removed in a later patch, so read and write the
values through pt_regs instead.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
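The pkey_set() hunk below keeps the same masked merge as before and only changes where the result is stored. A small standalone sketch of that merge, with a hypothetical function name and plain integers standing in for the kernel variables:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bits set in uamor are the ones userspace may change, so they are
 * taken from new_amr. All other bits belong to keys the kernel is
 * using and are preserved from old_amr.
 */
static inline uint64_t model_merge_user_amr(uint64_t old_amr, uint64_t new_amr,
					    uint64_t uamor)
{
	return (new_amr & uamor) | (old_amr & ~uamor);
}
```

With `uamor == 0` the tracer cannot change anything, and with `uamor == ~0` the new value replaces the old one entirely.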
arch/powerpc/kernel/ptrace/ptrace-view.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index 7e6478e7ed07..bdbe8cfdafc7 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -470,12 +470,12 @@ static int pkey_active(struct task_struct *target, const struct user_regset *reg
static int pkey_get(struct task_struct *target, const struct user_regset *regset,
struct membuf to)
{
- BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));
if (!arch_pkeys_enabled())
return -ENODEV;
- membuf_write(&to, &target->thread.amr, 2 * sizeof(unsigned long));
+ membuf_store(&to, target->thread.regs->amr);
+ membuf_store(&to, target->thread.regs->iamr);
return membuf_store(&to, default_uamor);
}
@@ -508,7 +508,8 @@ static int pkey_set(struct task_struct *target, const struct user_regset *regset
* Pick the AMR values for the keys that kernel is using. This
* will be indicated by the ~default_uamor bits.
*/
- target->thread.amr = (new_amr & default_uamor) | (target->thread.amr & ~default_uamor);
+ target->thread.regs->amr = (new_amr & default_uamor) |
+ (target->thread.regs->amr & ~default_uamor);
return 0;
}
--
2.28.0
* [PATCH v6 14/22] powerpc/book3s64/pkeys: Don't update SPRN_AMR when in kernel mode.
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, Sandipan Das
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
Now that the kernel correctly stores and restores the userspace AMR/IAMR values,
avoid manipulating AMR and IAMR from the kernel on behalf of userspace.
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
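A rough userspace sketch of what init_amr() computes once it works on the saved value instead of the SPR. The `model_*` names are hypothetical, and the shift formula is an assumption (key 0 in the two most significant bits of a 64-bit register, two bits per key), not something this patch defines.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_AMR_BITS_PER_PKEY 2

/* Assumed layout: key 0 occupies bits 63:62, key 31 occupies bits 1:0. */
static inline int model_pkeyshift(int pkey)
{
	assert(pkey >= 0 && pkey < 32);
	return 64 - (pkey + 1) * MODEL_AMR_BITS_PER_PKEY;
}

/*
 * Replace the two AMR bits of one key in a saved AMR image and return
 * the new image. No SPR is touched; the caller stores the result back
 * into the saved register state.
 */
static inline uint64_t model_init_amr(uint64_t saved_amr, int pkey,
				      uint8_t init_bits)
{
	int shift = model_pkeyshift(pkey);
	uint64_t new_bits = ((uint64_t)init_bits & 0x3) << shift;
	uint64_t old_amr = saved_amr & ~(UINT64_C(0x3) << shift);

	return old_amr | new_bits;
}
```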
arch/powerpc/include/asm/book3s/64/kup.h | 21 +++++++++
arch/powerpc/include/asm/processor.h | 4 --
arch/powerpc/kernel/process.c | 4 --
arch/powerpc/kernel/traps.c | 6 ---
arch/powerpc/mm/book3s64/pkeys.c | 57 +++++-------------------
5 files changed, 31 insertions(+), 61 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index 4dbb2d53fd8f..47270596215b 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -175,6 +175,27 @@ DECLARE_STATIC_KEY_FALSE(uaccess_flush_key);
#include <asm/mmu.h>
#include <asm/ptrace.h>
+/*
+ * For kernel thread that doesn't have thread.regs return
+ * default AMR/IAMR values.
+ */
+static inline u64 current_thread_amr(void)
+{
+ if (current->thread.regs)
+ return current->thread.regs->amr;
+ return AMR_KUAP_BLOCKED;
+}
+
+static inline u64 current_thread_iamr(void)
+{
+ if (current->thread.regs)
+ return current->thread.regs->iamr;
+ return AMR_KUEP_BLOCKED;
+}
+#endif /* CONFIG_PPC_PKEY */
+
+#ifdef CONFIG_PPC_KUAP
+
static inline void kuap_restore_user_amr(struct pt_regs *regs)
{
if (!mmu_has_feature(MMU_FTR_PKEY))
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index c61c859b51a8..c3df3a420c92 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -230,10 +230,6 @@ struct thread_struct {
struct thread_vr_state ckvr_state; /* Checkpointed VR state */
unsigned long ckvrsave; /* Checkpointed VRSAVE */
#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
-#ifdef CONFIG_PPC_MEM_KEYS
- unsigned long amr;
- unsigned long iamr;
-#endif
#ifdef CONFIG_KVM_BOOK3S_32_HANDLER
void* kvm_shadow_vcpu; /* KVM internal data */
#endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 98f7e9ec766f..5ffdac46a187 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -589,7 +589,6 @@ static void save_all(struct task_struct *tsk)
__giveup_spe(tsk);
msr_check_and_clear(msr_all_available);
- thread_pkey_regs_save(&tsk->thread);
}
void flush_all_to_thread(struct task_struct *tsk)
@@ -1160,8 +1159,6 @@ static inline void save_sprs(struct thread_struct *t)
t->tar = mfspr(SPRN_TAR);
}
#endif
-
- thread_pkey_regs_save(t);
}
static inline void restore_sprs(struct thread_struct *old_thread,
@@ -1202,7 +1199,6 @@ static inline void restore_sprs(struct thread_struct *old_thread,
mtspr(SPRN_TIDR, new_thread->tidr);
#endif
- thread_pkey_regs_restore(new_thread, old_thread);
}
struct task_struct *__switch_to(struct task_struct *prev,
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 5006dcbe1d9f..419028d53fd6 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -347,12 +347,6 @@ static bool exception_common(int signr, struct pt_regs *regs, int code,
current->thread.trap_nr = code;
- /*
- * Save all the pkey registers AMR/IAMR/UAMOR. Eg: Core dumps need
- * to capture the content, if the task gets killed.
- */
- thread_pkey_regs_save(&current->thread);
-
return true;
}
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index f47d11f2743d..f747d66cc87d 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -273,30 +273,17 @@ void __init setup_kuap(bool disabled)
}
#endif
-static inline u64 read_amr(void)
+static inline void update_current_thread_amr(u64 value)
{
- return mfspr(SPRN_AMR);
+ current->thread.regs->amr = value;
}
-static inline void write_amr(u64 value)
-{
- mtspr(SPRN_AMR, value);
-}
-
-static inline u64 read_iamr(void)
-{
- if (!likely(pkey_execute_disable_supported))
- return 0x0UL;
-
- return mfspr(SPRN_IAMR);
-}
-
-static inline void write_iamr(u64 value)
+static inline void update_current_thread_iamr(u64 value)
{
if (!likely(pkey_execute_disable_supported))
return;
- mtspr(SPRN_IAMR, value);
+ current->thread.regs->iamr = value;
}
#ifdef CONFIG_PPC_MEM_KEYS
@@ -311,17 +298,17 @@ void pkey_mm_init(struct mm_struct *mm)
static inline void init_amr(int pkey, u8 init_bits)
{
u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
- u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
+ u64 old_amr = current_thread_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
- write_amr(old_amr | new_amr_bits);
+ update_current_thread_amr(old_amr | new_amr_bits);
}
static inline void init_iamr(int pkey, u8 init_bits)
{
u64 new_iamr_bits = (((u64)init_bits & 0x1UL) << pkeyshift(pkey));
- u64 old_iamr = read_iamr() & ~((u64)(0x1ul) << pkeyshift(pkey));
+ u64 old_iamr = current_thread_iamr() & ~((u64)(0x1ul) << pkeyshift(pkey));
- write_iamr(old_iamr | new_iamr_bits);
+ update_current_thread_iamr(old_iamr | new_iamr_bits);
}
/*
@@ -364,30 +351,6 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
return 0;
}
-void thread_pkey_regs_save(struct thread_struct *thread)
-{
- if (!mmu_has_feature(MMU_FTR_PKEY))
- return;
-
- /*
- * TODO: Skip saving registers if @thread hasn't used any keys yet.
- */
- thread->amr = read_amr();
- thread->iamr = read_iamr();
-}
-
-void thread_pkey_regs_restore(struct thread_struct *new_thread,
- struct thread_struct *old_thread)
-{
- if (!mmu_has_feature(MMU_FTR_PKEY))
- return;
-
- if (old_thread->amr != new_thread->amr)
- write_amr(new_thread->amr);
- if (old_thread->iamr != new_thread->iamr)
- write_iamr(new_thread->iamr);
-}
-
int execute_only_pkey(struct mm_struct *mm)
{
return mm->context.execute_only_pkey;
@@ -436,9 +399,9 @@ static bool pkey_access_permitted(int pkey, bool write, bool execute)
pkey_shift = pkeyshift(pkey);
if (execute)
- return !(read_iamr() & (IAMR_EX_BIT << pkey_shift));
+ return !(current_thread_iamr() & (IAMR_EX_BIT << pkey_shift));
- amr = read_amr();
+ amr = current_thread_amr();
if (write)
return !(amr & (AMR_WR_BIT << pkey_shift));
--
2.28.0
* [PATCH v6 15/22] powerpc/book3s64/kuap: Restrict access to userspace based on userspace AMR
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, Sandipan Das
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
If an application has configured address protection such that read/write is
denied using a pkey, then even the kernel should receive a fault when accessing
that address. This patch uses the user AMR value stored in pt_regs.amr to
achieve that.
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
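A small sketch of the value that allow_user_access() passes to set_kuap() after this patch, written as a pure function so it can be checked in userspace. The enum and `model_*` names are hypothetical; the two block constants are the ones in use at this point in the series (they change in patch 17).

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_KUAP_BLOCK_READ  UINT64_C(0x4000000000000000)
#define MODEL_KUAP_BLOCK_WRITE UINT64_C(0x8000000000000000)

enum model_dir { MODEL_READ, MODEL_WRITE, MODEL_READ_WRITE };

/*
 * Start from the thread's own AMR, so keys that userspace denied stay
 * denied, then additionally block the direction the kernel does not need.
 */
static inline uint64_t model_kuap_value(uint64_t thread_amr, enum model_dir dir)
{
	switch (dir) {
	case MODEL_READ:
		return thread_amr | MODEL_KUAP_BLOCK_WRITE;
	case MODEL_WRITE:
		return thread_amr | MODEL_KUAP_BLOCK_READ;
	default:
		assert(dir == MODEL_READ_WRITE);
		return thread_amr;
	}
}
```

When pkeys are not available the patch leaves `thread_amr` at 0, which reduces this to the previous behaviour.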
arch/powerpc/include/asm/book3s/64/kup.h | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index 47270596215b..4a3d0d601745 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -312,14 +312,20 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
static __always_inline void allow_user_access(void __user *to, const void __user *from,
unsigned long size, unsigned long dir)
{
+ unsigned long thread_amr = 0;
+
// This is written so we can resolve to a single case at build time
BUILD_BUG_ON(!__builtin_constant_p(dir));
+
+ if (mmu_has_feature(MMU_FTR_PKEY))
+ thread_amr = current_thread_amr();
+
if (dir == KUAP_READ)
- set_kuap(AMR_KUAP_BLOCK_WRITE);
+ set_kuap(thread_amr | AMR_KUAP_BLOCK_WRITE);
else if (dir == KUAP_WRITE)
- set_kuap(AMR_KUAP_BLOCK_READ);
+ set_kuap(thread_amr | AMR_KUAP_BLOCK_READ);
else if (dir == KUAP_READ_WRITE)
- set_kuap(0);
+ set_kuap(thread_amr);
else
BUILD_BUG();
}
--
2.28.0
* [PATCH v6 16/22] powerpc/book3s64/kuap: Improve error reporting with KUAP
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
With hash translation, use DSISR_KEYFAULT to identify a wrong access.
With radix, we look at the AMR value and the type of fault.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
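The decision made by the new book3s64 bad_kuap_fault() can be sketched as a pure function. This is a model only: the feature checks are passed in as booleans, the WARN() is left out, the `model_*` names are hypothetical, and the DSISR_KEYFAULT value is an assumption about asm/reg.h rather than something shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_RADIX_KUAP_BLOCK_READ  UINT64_C(0x4000000000000000)
#define MODEL_RADIX_KUAP_BLOCK_WRITE UINT64_C(0x8000000000000000)
#define MODEL_DSISR_KEYFAULT         0x00200000UL /* assumed value */

static inline bool model_bad_kuap_fault(bool kuap_enabled, bool radix,
					uint64_t saved_amr, bool is_write,
					unsigned long error_code)
{
	if (!kuap_enabled)
		return false;

	if (radix) {
		/* Radix: only the key 0 bits of the saved AMR matter. */
		uint64_t bit = is_write ? MODEL_RADIX_KUAP_BLOCK_WRITE
					: MODEL_RADIX_KUAP_BLOCK_READ;
		return (saved_amr & bit) != 0;
	}

	/* Hash: the hardware reports a key fault in the error code. */
	return (error_code & MODEL_DSISR_KEYFAULT) != 0;
}
```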
arch/powerpc/include/asm/book3s/32/kup.h | 4 +--
arch/powerpc/include/asm/book3s/64/kup.h | 27 ++++++++++++++++----
arch/powerpc/include/asm/kup.h | 4 +--
arch/powerpc/include/asm/nohash/32/kup-8xx.h | 4 +--
arch/powerpc/mm/fault.c | 2 +-
5 files changed, 29 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index 32fd4452e960..b18cd931e325 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -177,8 +177,8 @@ static inline void restore_user_access(unsigned long flags)
allow_user_access(to, to, end - addr, KUAP_READ_WRITE);
}
-static inline bool
-bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
+static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
+ bool is_write, unsigned long error_code)
{
unsigned long begin = regs->kuap & 0xf0000000;
unsigned long end = regs->kuap << 28;
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index 4a3d0d601745..2922c442a218 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -301,12 +301,29 @@ static inline void set_kuap(unsigned long value)
isync();
}
-static inline bool
-bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
+#define RADIX_KUAP_BLOCK_READ UL(0x4000000000000000)
+#define RADIX_KUAP_BLOCK_WRITE UL(0x8000000000000000)
+
+static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
+ bool is_write, unsigned long error_code)
{
- return WARN(mmu_has_feature(MMU_FTR_KUAP) &&
- (regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : AMR_KUAP_BLOCK_READ)),
- "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
+ if (!mmu_has_feature(MMU_FTR_KUAP))
+ return false;
+
+ if (radix_enabled()) {
+ /*
+ * Will be a storage protection fault.
+ * Only check the details of AMR[0]
+ */
+ return WARN((regs->kuap & (is_write ? RADIX_KUAP_BLOCK_WRITE : RADIX_KUAP_BLOCK_READ)),
+ "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
+ }
+ /*
+ * We don't want to WARN here because userspace can setup
+ * keys such that a kernel access to user address can cause
+ * fault
+ */
+ return !!(error_code & DSISR_KEYFAULT);
}
static __always_inline void allow_user_access(void __user *to, const void __user *from,
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index a06e50b68d40..952be0414f43 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -59,8 +59,8 @@ void setup_kuap(bool disabled);
#else
static inline void setup_kuap(bool disabled) { }
-static inline bool
-bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
+static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
+ bool is_write, unsigned long error_code)
{
return false;
}
diff --git a/arch/powerpc/include/asm/nohash/32/kup-8xx.h b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
index 567cdc557402..7bdd9e5b63ed 100644
--- a/arch/powerpc/include/asm/nohash/32/kup-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
@@ -60,8 +60,8 @@ static inline void restore_user_access(unsigned long flags)
mtspr(SPRN_MD_AP, flags);
}
-static inline bool
-bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
+static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
+ bool is_write, unsigned long error_code)
{
return WARN(!((regs->kuap ^ MD_APG_KUAP) & 0xff000000),
"Bug: fault blocked by AP register !");
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 0add963a849b..c91621df0c61 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -227,7 +227,7 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
// Read/write fault in a valid region (the exception table search passed
// above), but blocked by KUAP is bad, it can never succeed.
- if (bad_kuap_fault(regs, address, is_write))
+ if (bad_kuap_fault(regs, address, is_write, error_code))
return true;
// What's left? Kernel fault on user in well defined regions (extable
--
2.28.0
* [PATCH v6 17/22] powerpc/book3s64/kuap: Use Key 3 to implement KUAP with hash translation.
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, Sandipan Das
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
Radix uses AMR key 0 and hash translation uses AMR key 3.
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
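One way to read the new constants: start from "block this access for every key" and clear the bit for key 3, so only key 3 stays accessible. The sketch below checks that reading in userspace. The `model_*` names are hypothetical, and the bit layout (two bits per key, key 0 most significant, read bit 0x1, write bit 0x2) is an assumption taken from the AMR_* definitions in pkeys.c.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_AMR_RD_BIT UINT64_C(0x1)
#define MODEL_AMR_WR_BIT UINT64_C(0x2)

/* Assumed layout: key 0 occupies bits 63:62, key 31 occupies bits 1:0. */
static inline int model_pkeyshift(int pkey)
{
	assert(pkey >= 0 && pkey < 32);
	return 64 - (pkey + 1) * 2;
}

/* Block reads for every key except the given one. */
static inline uint64_t model_block_read_except(int pkey)
{
	return UINT64_C(0x5555555555555555) &
	       ~(MODEL_AMR_RD_BIT << model_pkeyshift(pkey));
}

/* Block writes for every key except the given one. */
static inline uint64_t model_block_write_except(int pkey)
{
	return UINT64_C(0xaaaaaaaaaaaaaaaa) &
	       ~(MODEL_AMR_WR_BIT << model_pkeyshift(pkey));
}
```

For key 3 these two helpers reproduce the AMR_KUAP_BLOCK_READ and AMR_KUAP_BLOCK_WRITE values introduced in the hunk below.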
arch/powerpc/include/asm/book3s/64/kup.h | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index 2922c442a218..b8861cc2b6c7 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -5,11 +5,10 @@
#include <linux/const.h>
#include <asm/reg.h>
-#define AMR_KUAP_BLOCK_READ UL(0x4000000000000000)
-#define AMR_KUAP_BLOCK_WRITE UL(0x8000000000000000)
+#define AMR_KUAP_BLOCK_READ UL(0x5455555555555555)
+#define AMR_KUAP_BLOCK_WRITE UL(0xa8aaaaaaaaaaaaaa)
#define AMR_KUEP_BLOCKED (1UL << 62)
#define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
-#define AMR_KUAP_SHIFT 62
#ifdef __ASSEMBLY__
@@ -62,8 +61,8 @@
#ifdef CONFIG_PPC_KUAP_DEBUG
BEGIN_MMU_FTR_SECTION_NESTED(67)
mfspr \gpr1, SPRN_AMR
- li \gpr2, (AMR_KUAP_BLOCKED >> AMR_KUAP_SHIFT)
- sldi \gpr2, \gpr2, AMR_KUAP_SHIFT
+ /* Prevent access to userspace using any key values */
+ LOAD_REG_IMMEDIATE(\gpr2, AMR_KUAP_BLOCKED)
999: tdne \gpr1, \gpr2
EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE)
END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_KUAP, 67)
--
2.28.0
* [PATCH v6 18/22] powerpc/book3s64/kuep: Use Key 3 to implement KUEP with hash translation.
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, Sandipan Das
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
Radix uses IAMR key 0 and hash translation uses IAMR key 3.
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/book3s/64/kup.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index b8861cc2b6c7..7026d1b5d0c6 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -7,7 +7,7 @@
#define AMR_KUAP_BLOCK_READ UL(0x5455555555555555)
#define AMR_KUAP_BLOCK_WRITE UL(0xa8aaaaaaaaaaaaaa)
-#define AMR_KUEP_BLOCKED (1UL << 62)
+#define AMR_KUEP_BLOCKED UL(0x5455555555555555)
#define AMR_KUAP_BLOCKED (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
#ifdef __ASSEMBLY__
--
2.28.0
* [PATCH v6 19/22] powerpc/book3s64/hash/kuap: Enable kuap on hash
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, Sandipan Das
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/mm/book3s64/pkeys.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index f747d66cc87d..84f8664ffc47 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -257,7 +257,12 @@ void __init setup_kuep(bool disabled)
#ifdef CONFIG_PPC_KUAP
void __init setup_kuap(bool disabled)
{
- if (disabled || !early_radix_enabled())
+ if (disabled)
+ return;
+ /*
+ * On hash if PKEY feature is not enabled, disable KUAP too.
+ */
+ if (!early_radix_enabled() && !early_mmu_has_feature(MMU_FTR_PKEY))
return;
if (smp_processor_id() == boot_cpuid) {
--
2.28.0
* [PATCH v6 20/22] powerpc/book3s64/hash/kuep: Enable KUEP on hash
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, Sandipan Das
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/mm/book3s64/pkeys.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 84f8664ffc47..f029e7bf5ca2 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -236,7 +236,12 @@ void __init pkey_early_init_devtree(void)
#ifdef CONFIG_PPC_KUEP
void __init setup_kuep(bool disabled)
{
- if (disabled || !early_radix_enabled())
+ if (disabled)
+ return;
+ /*
+ * On hash if PKEY feature is not enabled, disable KUAP too.
+ */
+ if (!early_radix_enabled() && !early_mmu_has_feature(MMU_FTR_PKEY))
return;
if (smp_processor_id() == boot_cpuid) {
--
2.28.0
* [PATCH v6 21/22] powerpc/book3s64/hash/kup: Don't hardcode kup key
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
Make the KUAP/KUEP key a variable, and also check whether the platform
limits the max key such that we can't use the key for KUAP/KUEP.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
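A userspace sketch of the key reservation added to pkey_early_init_devtree() below. The struct and `model_*` names are hypothetical, the state is passed in explicitly instead of living in globals, and the shift formula is an assumption about pkeyshift().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: key 0 occupies bits 63:62, key 31 occupies bits 1:0. */
static inline int model_pkeyshift(int pkey)
{
	return 64 - (pkey + 1) * 2;
}

struct model_pkey_state {
	int kup_key;
	uint32_t reserved_allocation_mask;
	uint64_t default_amr, default_iamr, default_uamor;
};

static inline void model_reserve_kup_key(struct model_pkey_state *s, int num_pkey)
{
	assert(s->kup_key >= 0 && s->kup_key < 32);

	if (num_pkey <= s->kup_key) {
		/* Not enough keys on this platform: KUAP/KUEP stay off. */
		s->kup_key = -1;
		return;
	}

	/* Keep the key away from userspace allocation ... */
	s->reserved_allocation_mask |= UINT32_C(1) << s->kup_key;
	/* ... allow it in the default AMR/IAMR, and lock it in UAMOR. */
	s->default_amr   &= ~(UINT64_C(0x3) << model_pkeyshift(s->kup_key));
	s->default_iamr  &= ~(UINT64_C(0x1) << model_pkeyshift(s->kup_key));
	s->default_uamor &= ~(UINT64_C(0x3) << model_pkeyshift(s->kup_key));
}
```

setup_kuap() and setup_kuep() then only need to test `kup_key == -1`, which is what the later hunks do.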
.../powerpc/include/asm/book3s/64/hash-pkey.h | 22 +-------
arch/powerpc/include/asm/book3s/64/pkeys.h | 1 +
arch/powerpc/mm/book3s64/pkeys.c | 53 ++++++++++++++++---
3 files changed, 49 insertions(+), 27 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-pkey.h b/arch/powerpc/include/asm/book3s/64/hash-pkey.h
index 9f44e208f036..ff9907c72ee3 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-pkey.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-pkey.h
@@ -2,9 +2,7 @@
#ifndef _ASM_POWERPC_BOOK3S_64_HASH_PKEY_H
#define _ASM_POWERPC_BOOK3S_64_HASH_PKEY_H
-/* We use key 3 for KERNEL */
-#define HASH_DEFAULT_KERNEL_KEY (HPTE_R_KEY_BIT0 | HPTE_R_KEY_BIT1)
-
+u64 pte_to_hpte_pkey_bits(u64 pteflags, unsigned long flags);
static inline u64 hash__vmflag_to_pte_pkey_bits(u64 vm_flags)
{
return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
@@ -14,24 +12,6 @@ static inline u64 hash__vmflag_to_pte_pkey_bits(u64 vm_flags)
((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT4 : 0x0UL));
}
-static inline u64 pte_to_hpte_pkey_bits(u64 pteflags, unsigned long flags)
-{
- unsigned long pte_pkey;
-
- pte_pkey = (((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL));
-
- if (mmu_has_feature(MMU_FTR_KUAP) || mmu_has_feature(MMU_FTR_KUEP)) {
- if ((pte_pkey == 0) && (flags & HPTE_USE_KERNEL_KEY))
- return HASH_DEFAULT_KERNEL_KEY;
- }
-
- return pte_pkey;
-}
-
static inline u16 hash__pte_to_pkey_bits(u64 pteflags)
{
return (((pteflags & H_PTE_PKEY_BIT4) ? 0x10 : 0x0UL) |
diff --git a/arch/powerpc/include/asm/book3s/64/pkeys.h b/arch/powerpc/include/asm/book3s/64/pkeys.h
index 3b8640498f5b..a2b6c4a7275f 100644
--- a/arch/powerpc/include/asm/book3s/64/pkeys.h
+++ b/arch/powerpc/include/asm/book3s/64/pkeys.h
@@ -8,6 +8,7 @@
extern u64 __ro_after_init default_uamor;
extern u64 __ro_after_init default_amr;
extern u64 __ro_after_init default_iamr;
+extern int kup_key;
static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
{
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index f029e7bf5ca2..204e4598b45c 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -37,7 +37,10 @@ u64 default_uamor __ro_after_init;
*/
static int execute_only_key = 2;
static bool pkey_execute_disable_supported;
-
+/*
+ * key used to implement KUAP/KUEP with hash translation.
+ */
+int kup_key = 3;
#define AMR_BITS_PER_PKEY 2
#define AMR_RD_BIT 0x1UL
@@ -185,6 +188,25 @@ void __init pkey_early_init_devtree(void)
default_uamor &= ~(0x3ul << pkeyshift(execute_only_key));
}
+ if (unlikely(num_pkey <= kup_key)) {
+ /*
+ * Insufficient number of keys to support
+ * KUAP/KUEP feature.
+ */
+ kup_key = -1;
+ } else {
+ /* handle key which is used by kernel for KAUP */
+ reserved_allocation_mask |= (0x1 << kup_key);
+ /*
+ * Mark access for kup_key in default amr so that
+ * we continue to operate with that AMR in
+ * copy_to/from_user().
+ */
+ default_amr &= ~(0x3ul << pkeyshift(kup_key));
+ default_iamr &= ~(0x1ul << pkeyshift(kup_key));
+ default_uamor &= ~(0x3ul << pkeyshift(kup_key));
+ }
+
/*
* Allow access for only key 0. And prevent any other modification.
*/
@@ -205,9 +227,6 @@ void __init pkey_early_init_devtree(void)
reserved_allocation_mask |= (0x1 << 1);
default_uamor &= ~(0x3ul << pkeyshift(1));
- /* handle key 3 which is used by kernel for KAUP */
- reserved_allocation_mask |= (0x1 << 3);
- default_uamor &= ~(0x3ul << pkeyshift(3));
/*
* Prevent the usage of OS reserved keys. Update UAMOR
@@ -236,7 +255,7 @@ void __init pkey_early_init_devtree(void)
#ifdef CONFIG_PPC_KUEP
void __init setup_kuep(bool disabled)
{
- if (disabled)
+ if (disabled || kup_key == -1)
return;
/*
* On hash if PKEY feature is not enabled, disable KUAP too.
@@ -262,7 +281,7 @@ void __init setup_kuep(bool disabled)
#ifdef CONFIG_PPC_KUAP
void __init setup_kuap(bool disabled)
{
- if (disabled)
+ if (disabled || kup_key == -1)
return;
/*
* On hash if PKEY feature is not enabled, disable KUAP too.
@@ -458,4 +477,26 @@ void arch_dup_pkeys(struct mm_struct *oldmm, struct mm_struct *mm)
mm->context.execute_only_pkey = oldmm->context.execute_only_pkey;
}
+u64 pte_to_hpte_pkey_bits(u64 pteflags, unsigned long flags)
+{
+ unsigned long pte_pkey;
+
+ pte_pkey = (((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL));
+
+ if (mmu_has_feature(MMU_FTR_KUAP) || mmu_has_feature(MMU_FTR_KUEP)) {
+ if ((pte_pkey == 0) &&
+ (flags & HPTE_USE_KERNEL_KEY) && (kup_key != -1)) {
+ u64 vm_flag = pkey_to_vmflag_bits(kup_key);
+ u64 pte_flag = hash__vmflag_to_pte_pkey_bits(vm_flag);
+ return pte_to_hpte_pkey_bits(pte_flag, 0);
+ }
+ }
+
+ return pte_pkey;
+}
+
#endif /* CONFIG_PPC_MEM_KEYS */
--
2.28.0
* [PATCH v6 22/22] powerpc/book3s64/pkeys: Optimize FTR_KUAP and FTR_KUEP disabled case
From: Aneesh Kumar K.V @ 2020-11-25 5:16 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V
In-Reply-To: <20201125051634.509286-1-aneesh.kumar@linux.ibm.com>
If FTR_KUAP is disabled, the kernel will continue to run with the same AMR
value with which it was entered. Hence there is a high chance that
we can return to userspace without restoring the AMR value. This also helps
when applications are not using the pkey feature: different applications
will then have the same AMR values, so we can avoid restoring the AMR in
that case too.
Also avoid the isync() if it is not really needed.
Do the same for IAMR.
null-syscall benchmark results:

With smap/smep disabled:
Without patch:
	957.95 ns    2778.17 cycles
With patch:
	858.38 ns    2489.30 cycles

With smap/smep enabled:
Without patch:
	1017.26 ns    2950.36 cycles
With patch:
	1021.51 ns    2962.44 cycles
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
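The restore decision in the C version of kuap_restore_user_amr() can be sketched as a pure function that reports which writes would happen. This is a model only: feature bits and register values are passed in as arguments, and the struct and `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct model_restore {
	bool amr;   /* mtspr(SPRN_AMR, ...) would be issued */
	bool iamr;  /* mtspr(SPRN_IAMR, ...) would be issued */
	bool isync; /* the leading isync() would be issued */
};

static inline struct model_restore
model_plan_restore(bool kuap, bool kuep,
		   uint64_t cur_amr, uint64_t cur_iamr,
		   uint64_t saved_amr, uint64_t saved_iamr)
{
	struct model_restore r;

	/*
	 * With KUAP/KUEP enabled the kernel ran with its own value, so the
	 * user value always has to be written back. Otherwise the register
	 * still holds what userspace had unless the saved value changed.
	 */
	r.amr = kuap || cur_amr != saved_amr;
	r.iamr = kuep || cur_iamr != saved_iamr;
	r.isync = r.amr || r.iamr;
	return r;
}
```

In the common case of both features disabled and unchanged values, nothing is written and the isync is skipped, which is where the benchmark improvement above would come from.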
arch/powerpc/include/asm/book3s/64/kup.h | 61 +++++++++++++++++++++---
arch/powerpc/kernel/entry_64.S | 2 +-
arch/powerpc/kernel/syscall_64.c | 12 +++--
3 files changed, 65 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index 7026d1b5d0c6..e063e439b0a8 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -12,28 +12,54 @@
#ifdef __ASSEMBLY__
-.macro kuap_restore_user_amr gpr1
+.macro kuap_restore_user_amr gpr1, gpr2
#if defined(CONFIG_PPC_PKEY)
BEGIN_MMU_FTR_SECTION_NESTED(67)
+ b 100f // skip_restore_amr
+ END_MMU_FTR_SECTION_NESTED_IFCLR(MMU_FTR_PKEY, 67)
/*
* AMR and IAMR are going to be different when
* returning to userspace.
*/
ld \gpr1, STACK_REGS_AMR(r1)
+
+ /*
+ * If kuap feature is not enabled, do the mtspr
+ * only if AMR value is different.
+ */
+ BEGIN_MMU_FTR_SECTION_NESTED(68)
+ mfspr \gpr2, SPRN_AMR
+ cmpd \gpr1, \gpr2
+ beq 99f
+ END_MMU_FTR_SECTION_NESTED_IFCLR(MMU_FTR_KUAP, 68)
+
isync
mtspr SPRN_AMR, \gpr1
+99:
/*
* Restore IAMR only when returning to userspace
*/
ld \gpr1, STACK_REGS_IAMR(r1)
+
+ /*
+ * If kuep feature is not enabled, do the mtspr
+ * only if IAMR value is different.
+ */
+ BEGIN_MMU_FTR_SECTION_NESTED(69)
+ mfspr \gpr2, SPRN_IAMR
+ cmpd \gpr1, \gpr2
+ beq 100f
+ END_MMU_FTR_SECTION_NESTED_IFCLR(MMU_FTR_KUEP, 69)
+
+ isync
mtspr SPRN_IAMR, \gpr1
+100: //skip_restore_amr
/* No isync required, see kuap_restore_user_amr() */
- END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_PKEY, 67)
#endif
.endm
-.macro kuap_restore_kernel_amr gpr1, gpr2
+.macro kuap_restore_kernel_amr gpr1, gpr2
#if defined(CONFIG_PPC_PKEY)
BEGIN_MMU_FTR_SECTION_NESTED(67)
@@ -197,18 +223,41 @@ static inline u64 current_thread_iamr(void)
static inline void kuap_restore_user_amr(struct pt_regs *regs)
{
+ bool restore_amr = false, restore_iamr = false;
+ unsigned long amr, iamr;
+
if (!mmu_has_feature(MMU_FTR_PKEY))
return;
- isync();
- mtspr(SPRN_AMR, regs->amr);
- mtspr(SPRN_IAMR, regs->iamr);
+ if (!mmu_has_feature(MMU_FTR_KUAP)) {
+ amr = mfspr(SPRN_AMR);
+ if (amr != regs->amr)
+ restore_amr = true;
+ } else
+ restore_amr = true;
+
+ if (!mmu_has_feature(MMU_FTR_KUEP)) {
+ iamr = mfspr(SPRN_IAMR);
+ if (iamr != regs->iamr)
+ restore_iamr = true;
+ } else
+ restore_iamr = true;
+
+
+ if (restore_amr || restore_iamr) {
+ isync();
+ if (restore_amr)
+ mtspr(SPRN_AMR, regs->amr);
+ if (restore_iamr)
+ mtspr(SPRN_IAMR, regs->iamr);
+ }
/*
* No isync required here because we are about to rfi
* back to previous context before any user accesses
* would be made, which is a CSI.
*/
}
+
static inline void kuap_restore_kernel_amr(struct pt_regs *regs,
unsigned long amr)
{
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index e49291594c68..a68517e99fd2 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -675,7 +675,7 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return)
bne- .Lrestore_nvgprs
.Lfast_user_interrupt_return_amr:
- kuap_restore_user_amr r3
+ kuap_restore_user_amr r3, r4
.Lfast_user_interrupt_return:
ld r11,_NIP(r1)
ld r12,_MSR(r1)
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index 60c57609d316..681f9afafc6f 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -38,6 +38,7 @@ notrace long system_call_exception(long r3, long r4, long r5,
#ifdef CONFIG_PPC_PKEY
if (mmu_has_feature(MMU_FTR_PKEY)) {
unsigned long amr, iamr;
+ bool flush_needed = false;
/*
* When entering from userspace we mostly have the AMR/IAMR
* different from kernel default values. Hence don't compare.
@@ -46,11 +47,16 @@ notrace long system_call_exception(long r3, long r4, long r5,
iamr = mfspr(SPRN_IAMR);
regs->amr = amr;
regs->iamr = iamr;
- if (mmu_has_feature(MMU_FTR_KUAP))
+ if (mmu_has_feature(MMU_FTR_KUAP)) {
mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
- if (mmu_has_feature(MMU_FTR_KUEP))
+ flush_needed = true;
+ }
+ if (mmu_has_feature(MMU_FTR_KUEP)) {
mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
- isync();
+ flush_needed = true;
+ }
+ if (flush_needed)
+ isync();
} else
#endif
kuap_check_amr();
--
2.28.0
^ permalink raw reply related
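The restore logic added to kuap_restore_user_amr() above can be read as a small decision table: write an SPR unconditionally when the matching KUAP/KUEP feature is on, otherwise write it only if the saved value differs, and issue a single isync before whichever writes happen. The following is a minimal user-space sketch of that table, not kernel code. The SPRs are modelled as plain variables, the feature bits are passed in as booleans instead of being read through mmu_has_feature(), and `struct cpu_model`, `fake_isync()`, `fake_mtspr()` and `restore_user_amr()` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: two SPRs plus counters for the work performed. */
struct cpu_model {
	uint64_t amr;		/* modelled SPRN_AMR contents */
	uint64_t iamr;		/* modelled SPRN_IAMR contents */
	int mtspr_count;	/* number of SPR writes issued */
	int isync_count;	/* number of isync instructions issued */
};

static void fake_isync(struct cpu_model *cpu)
{
	cpu->isync_count++;
}

static void fake_mtspr(struct cpu_model *cpu, uint64_t *spr, uint64_t val)
{
	*spr = val;
	cpu->mtspr_count++;
}

/*
 * Mirrors the shape of the patched function: with KUAP/KUEP enabled the
 * kernel ran with a blocked value, so the user value must always be
 * written back; without them, skip the write when nothing changed.
 */
static void restore_user_amr(struct cpu_model *cpu, bool has_kuap,
			     bool has_kuep, uint64_t saved_amr,
			     uint64_t saved_iamr)
{
	bool restore_amr = has_kuap || cpu->amr != saved_amr;
	bool restore_iamr = has_kuep || cpu->iamr != saved_iamr;

	if (restore_amr || restore_iamr) {
		fake_isync(cpu);	/* one isync covers both writes */
		if (restore_amr)
			fake_mtspr(cpu, &cpu->amr, saved_amr);
		if (restore_iamr)
			fake_mtspr(cpu, &cpu->iamr, saved_iamr);
	}
}
```

With both features off and unchanged values, the model issues no isync and no SPR write, which is the saving the patch is after. The same "set a flag, synchronise once" shape appears in the system_call_exception() hunk with flush_needed.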
* Re: [PATCH net 1/2] ibmvnic: Ensure that SCRQ entry reads are correctly ordered
From: Michael Ellerman @ 2020-11-25 5:43 UTC (permalink / raw)
To: Thomas Falcon, netdev
Cc: cforno12, ljp, ricklind, dnbanerg, tlfalcon, drt, brking, sukadev,
linuxppc-dev
In-Reply-To: <1606238776-30259-2-git-send-email-tlfalcon@linux.ibm.com>
Thomas Falcon <tlfalcon@linux.ibm.com> writes:
> Ensure that received Subordinate Command-Response Queue (SCRQ)
> entries are properly read in order by the driver. These queues
> are used in the ibmvnic device to process RX buffer and TX completion
> descriptors. dma_rmb barriers have been added after checking for a
> pending descriptor to ensure the correct descriptor entry is checked
> and after reading the SCRQ descriptor to ensure the entire
> descriptor is read before processing.
>
> Fixes: 032c5e828 ("Driver for IBM System i/p VNIC protocol")
> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
> ---
> drivers/net/ethernet/ibm/ibmvnic.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 2aa40b2..489ed5e 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2403,6 +2403,8 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
>
> if (!pending_scrq(adapter, adapter->rx_scrq[scrq_num]))
> break;
> + /* ensure that we do not prematurely exit the polling loop */
> + dma_rmb();
I'd be happier if these comments were more specific about which read(s)
they are ordering vs which other read(s).
I'm sure it's obvious to you, but it may not be to a future author,
and/or after the code has been refactored over time.
> next = ibmvnic_next_scrq(adapter, adapter->rx_scrq[scrq_num]);
> rx_buff =
> (struct ibmvnic_rx_buff *)be64_to_cpu(next->
> @@ -3098,6 +3100,9 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
> unsigned int pool = scrq->pool_index;
> int num_entries = 0;
>
> + /* ensure that the correct descriptor entry is read */
> + dma_rmb();
> +
> next = ibmvnic_next_scrq(adapter, scrq);
> for (i = 0; i < next->tx_comp.num_comps; i++) {
> if (next->tx_comp.rcs[i]) {
> @@ -3498,6 +3503,9 @@ static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *adapter,
> }
> spin_unlock_irqrestore(&scrq->lock, flags);
>
> + /* ensure that the entire SCRQ descriptor is read */
> + dma_rmb();
> +
> return entry;
> }
cheers
^ permalink raw reply
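As an illustration of the kind of comment asked for above (naming which read is ordered against which), here is a minimal user-space sketch. It is not the ibmvnic code: `struct demo_desc` and `demo_poll()` are hypothetical, and a C11 acquire fence stands in for dma_rmb(), which is only a rough analogue since dma_rmb() orders reads of DMA-coherent memory written by a device.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor: the producer fills the body, then sets valid. */
struct demo_desc {
	_Atomic uint8_t valid;	/* written last by the producer */
	uint32_t payload;	/* descriptor body */
};

/* Returns true and copies the payload out if a descriptor is pending. */
static bool demo_poll(struct demo_desc *d, uint32_t *out)
{
	/* Read 1: the flag saying the rest of the descriptor is ready. */
	if (!atomic_load_explicit(&d->valid, memory_order_relaxed))
		return false;

	/*
	 * Order read 1 (d->valid) before read 2 (d->payload). Without
	 * this barrier the CPU may use a payload value fetched before
	 * the flag was observed as set, i.e. a stale descriptor body.
	 */
	atomic_thread_fence(memory_order_acquire);

	/* Read 2: the descriptor body. */
	*out = d->payload;

	/* Hand the slot back to the producer. */
	atomic_store_explicit(&d->valid, 0, memory_order_relaxed);
	return true;
}
```

The point of the sketch is the comment on the fence: it names the earlier read and the later read explicitly, so the barrier still makes sense after the surrounding code is refactored.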
* Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms
From: Christophe Leroy @ 2020-11-25 6:36 UTC (permalink / raw)
To: Michael Ellerman, Arnd Bergmann
Cc: Kate Stewart, Mark Rutland, Desnes A. Nunes do Rosario,
Geert Uytterhoeven, open list:DOCUMENTATION,
ALSA Development Mailing List, dri-devel, Jaroslav Kysela,
Richard Fontana, Paul Mackerras, Miquel Raynal,
Mauro Carvalho Chehab, Fabio Estevam, Sasha Levin,
Stephen Rothwell, Jonathan Corbet, Masahiro Yamada, YueHaibing,
Michal Simek, Krzysztof Kozlowski, Allison Randal, Leonardo Bras,
DTML, Andrew Donnellan, Bartlomiej Zolnierkiewicz, Marc Zyngier,
Alistair Popple, Nicholas Piggin, Alexios Zavras, Mark Brown, git,
Linux Fbdev development list, Jonathan Cameron, Thomas Gleixner,
Andy Shevchenko, Linux ARM, Christophe Leroy, Enrico Weigelt,
Michal Simek, Wei Hu, Christian Lamparter, Greg Kroah-Hartman,
Nick Desaulniers, Takashi Iwai, linux-kernel@vger.kernel.org,
Armijn Hemel, Rob Herring, linuxppc-dev, David S. Miller,
Thiago Jung Bauermann
In-Reply-To: <33b873a8-ded2-4866-fb70-c336fb325923@csgroup.eu>
On 21/05/2020 at 12:38, Christophe Leroy wrote:
>
>
> On 21/05/2020 at 09:02, Michael Ellerman wrote:
>> Arnd Bergmann <arnd@arndb.de> writes:
>>> +On Wed, Apr 8, 2020 at 2:04 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>>> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
>>>>> On Fri, 2020-04-03 at 15:59 +1100, Michael Ellerman wrote:
>>>>>> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
>>>>> IBM still put 40x cores inside POWER chips no ?
>>>>
>>>> Oh yeah that's true. I guess most folks don't know that, or that they
>>>> run RHEL on them.
>>>
>>> Is there a reason for not having those dts files in mainline then?
>>> If nothing else, it would document what machines are still being
>>> used with future kernels.
>>
>> Sorry that part was a joke :D Those chips don't run Linux.
>>
>
> Nice to know :)
>
> What's the plan then, do we still want to keep 40x in the kernel ?
>
> If yes, is it ok to drop the oldies anyway as done in my series
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=172630 ?
>
> (Note that this series will conflict with my series on hugepages on 8xx due to the
> PTE_ATOMIC_UPDATES stuff. I can rebase the 40x modernisation series on top of the 8xx hugepages
> series if it is worth it)
>
Do we still want to keep 40x in the kernel ? We don't even have a running 40x QEMU machine as far as
I know.
I'm asking because I'd like to drop the non CONFIG_VMAP_STACK code to simplify and ease stuff (code
that works with vmalloc'ed stacks also works with stacks in linear memory), but I can't do it
because 40x doesn't have VMAP_STACK and, should I implement it for 40x, I have no means to test it.
So it would ease things if we could drop 40x completely, unless someone there has a 40x platform to
test stuff.
Thanks
Christophe
^ permalink raw reply
* [PATCH v1 1/8] powerpc/32s: Always map kernel text and rodata with BATs
From: Christophe Leroy @ 2020-11-25 7:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Since commit 2b279c0348af ("powerpc/32s: Allow mapping with BATs with
DEBUG_PAGEALLOC"), there is no real situation where mapping without
BATs is required.
In order to simplify memory handling, always map kernel text
and rodata with BATs even when "nobats" kernel parameter is set.
Also fix the 603 TLB miss exception handlers, which no longer
need the kernel page table when DEBUG_PAGEALLOC is set.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_book3s_32.S | 4 ++--
arch/powerpc/mm/book3s32/mmu.c | 8 +++-----
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index a0dda2a1f2df..27767f3e7ec1 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -453,13 +453,13 @@ InstructionTLBMiss:
*/
/* Get PTE (linux-style) and check access */
mfspr r3,SPRN_IMISS
-#if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC)
+#ifdef CONFIG_MODULES
lis r1, TASK_SIZE@h /* check if kernel address */
cmplw 0,r1,r3
#endif
mfspr r2, SPRN_SPRG_PGDIR
li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC
-#if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC)
+#ifdef CONFIG_MODULES
bgt- 112f
lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */
addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index a59e7ec98180..5c60dcade90a 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -157,11 +157,9 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
unsigned long done;
unsigned long border = (unsigned long)__init_begin - PAGE_OFFSET;
- if (__map_without_bats) {
- pr_debug("RAM mapped without BATs\n");
- return base;
- }
- if (debug_pagealloc_enabled()) {
+
+ if (debug_pagealloc_enabled() || __map_without_bats) {
+ pr_debug_once("Read-Write memory mapped without BATs\n");
if (base >= border)
return base;
if (top >= border)
--
2.25.0
^ permalink raw reply related
* [PATCH v1 4/8] powerpc/32s: Don't use SPRN_SPRG_PGDIR in hash_page
From: Christophe Leroy @ 2020-11-25 7:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>
SPRN_SPRG_PGDIR is there mainly to speed up SW TLB miss handlers
for powerpc 603.
We need to free SPRN_SPRG2 to reduce the mess with CONFIG_VMAP_STACK.
In hash_page(), reading PGDIR from thread_struct will be in the noise
performance-wise.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/mm/book3s32/hash_low.S | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S
index 48415c857d80..aca353d1c5f4 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -65,13 +65,14 @@ _GLOBAL(hash_page)
/* Get PTE (linux-style) and check access */
lis r0, TASK_SIZE@h /* check if kernel address */
cmplw 0,r4,r0
+ mfspr r8,SPRN_SPRG_THREAD /* current task's THREAD (phys) */
ori r3,r3,_PAGE_USER|_PAGE_PRESENT /* test low addresses as user */
- mfspr r5, SPRN_SPRG_PGDIR /* phys page-table root */
+ lwz r5,PGDIR(r8) /* virt page-table root */
blt+ 112f /* assume user more likely */
- lis r5, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */
- addi r5 ,r5 ,(swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */
+ lis r5,swapper_pg_dir@ha /* if kernel address, use */
+ addi r5,r5,swapper_pg_dir@l /* kernel page table */
rlwimi r3,r9,32-12,29,29 /* MSR_PR -> _PAGE_USER */
-112:
+112: tophys(r5, r5)
#ifndef CONFIG_PTE_64BIT
rlwimi r5,r4,12,20,29 /* insert top 10 bits of address */
lwz r8,0(r5) /* get pmd entry */
--
2.25.0
^ permalink raw reply related
* [PATCH v1 3/8] powerpc/32s: Fix an FTR_SECTION_ELSE
From: Christophe Leroy @ 2020-11-25 7:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>
An FTR_SECTION_ELSE is in the middle of
BEGIN_MMU_FTR_SECTION/ALT_MMU_FTR_SECTION_END_IFSET.
Change it to MMU_FTR_SECTION_ELSE.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_book3s_32.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 27767f3e7ec1..236a95d163be 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -332,7 +332,7 @@ BEGIN_MMU_FTR_SECTION
rlwinm r3, r5, 32 - 15, 21, 21 /* DSISR_STORE -> _PAGE_RW */
bl hash_page
b handle_page_fault_tramp_1
-FTR_SECTION_ELSE
+MMU_FTR_SECTION_ELSE
b handle_page_fault_tramp_2
ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
#endif /* CONFIG_VMAP_STACK */
--
2.25.0
^ permalink raw reply related
* [PATCH v1 2/8] powerpc/32s: Don't hash_preload() kernel text
From: Christophe Leroy @ 2020-11-25 7:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>
We now always map kernel text with BATs. There is no longer any need
to preload the hash table with kernel text addresses, nor to ensure
that those entries are never evicted.
This is more or less a revert of commit ee4f2ea48674 ("[POWERPC] Fix
32-bit mm operations when not using BATs")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/mm/book3s32/hash_low.S | 18 +-----------------
arch/powerpc/mm/book3s32/mmu.c | 2 +-
arch/powerpc/mm/mmu_decl.h | 2 --
arch/powerpc/mm/pgtable_32.c | 4 ----
4 files changed, 2 insertions(+), 24 deletions(-)
diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S
index b2c912e517b9..48415c857d80 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -411,30 +411,14 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
* and we know there is a definite (although small) speed
* advantage to putting the PTE in the primary PTEG, we always
* put the PTE in the primary PTEG.
- *
- * In addition, we skip any slot that is mapping kernel text in
- * order to avoid a deadlock when not using BAT mappings if
- * trying to hash in the kernel hash code itself after it has
- * already taken the hash table lock. This works in conjunction
- * with pre-faulting of the kernel text.
- *
- * If the hash table bucket is full of kernel text entries, we'll
- * lockup here but that shouldn't happen
*/
-1: lis r4, (next_slot - PAGE_OFFSET)@ha /* get next evict slot */
+ lis r4, (next_slot - PAGE_OFFSET)@ha /* get next evict slot */
lwz r6, (next_slot - PAGE_OFFSET)@l(r4)
addi r6,r6,HPTE_SIZE /* search for candidate */
andi. r6,r6,7*HPTE_SIZE
stw r6,next_slot@l(r4)
add r4,r3,r6
- LDPTE r0,HPTE_SIZE/2(r4) /* get PTE second word */
- clrrwi r0,r0,12
- lis r6,etext@h
- ori r6,r6,etext@l /* get etext */
- tophys(r6,r6)
- cmpl cr0,r0,r6 /* compare and try again */
- blt 1b
#ifndef CONFIG_SMP
/* Store PTE in PTEG */
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 5c60dcade90a..23f60e97196e 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -302,7 +302,7 @@ void __init setbat(int index, unsigned long virt, phys_addr_t phys,
/*
* Preload a translation in the hash table
*/
-void hash_preload(struct mm_struct *mm, unsigned long ea)
+static void hash_preload(struct mm_struct *mm, unsigned long ea)
{
pmd_t *pmd;
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 1b6d39e9baed..0ad6d476d01d 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -91,8 +91,6 @@ void print_system_hash_info(void);
#ifdef CONFIG_PPC32
-void hash_preload(struct mm_struct *mm, unsigned long ea);
-
extern void mapin_ram(void);
extern void setbat(int index, unsigned long virt, phys_addr_t phys,
unsigned int size, pgprot_t prot);
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 079159e97bca..6e0083e7f008 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -112,10 +112,6 @@ static void __init __mapin_ram_chunk(unsigned long offset, unsigned long top)
ktext = ((char *)v >= _stext && (char *)v < etext) ||
((char *)v >= _sinittext && (char *)v < _einittext);
map_kernel_page(v, p, ktext ? PAGE_KERNEL_TEXT : PAGE_KERNEL);
-#ifdef CONFIG_PPC_BOOK3S_32
- if (ktext)
- hash_preload(&init_mm, v);
-#endif
v += PAGE_SIZE;
p += PAGE_SIZE;
}
--
2.25.0
^ permalink raw reply related
* [PATCH v1 5/8] powerpc/603: Use SPRN_SDR1 to store the pgdir phys address
From: Christophe Leroy @ 2020-11-25 7:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>
On the 603, SDR1 is not used.
In order to free SPRN_SPRG2, use SPRN_SDR1 to store the pgdir
phys addr.
But only some bits of SDR1 can be used (0xffff01ff).
As the pgdir is 4k aligned, rotate it by 4 bits to the left.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/reg.h | 1 -
arch/powerpc/kernel/head_book3s_32.S | 31 +++++++++++++++++++++-------
2 files changed, 24 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index f877a576b338..a37ce826f6f6 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1203,7 +1203,6 @@
#ifdef CONFIG_PPC_BOOK3S_32
#define SPRN_SPRG_SCRATCH0 SPRN_SPRG0
#define SPRN_SPRG_SCRATCH1 SPRN_SPRG1
-#define SPRN_SPRG_PGDIR SPRN_SPRG2
#define SPRN_SPRG_603_LRU SPRN_SPRG4
#endif
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 236a95d163be..51eef7b82f9c 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -457,8 +457,9 @@ InstructionTLBMiss:
lis r1, TASK_SIZE@h /* check if kernel address */
cmplw 0,r1,r3
#endif
- mfspr r2, SPRN_SPRG_PGDIR
+ mfspr r2, SPRN_SDR1
li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC
+ rlwinm r2, r2, 28, 0xfffff000
#ifdef CONFIG_MODULES
bgt- 112f
lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */
@@ -519,8 +520,9 @@ DataLoadTLBMiss:
mfspr r3,SPRN_DMISS
lis r1, TASK_SIZE@h /* check if kernel address */
cmplw 0,r1,r3
- mfspr r2, SPRN_SPRG_PGDIR
+ mfspr r2, SPRN_SDR1
li r1, _PAGE_PRESENT | _PAGE_ACCESSED
+ rlwinm r2, r2, 28, 0xfffff000
bgt- 112f
lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */
addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */
@@ -595,8 +597,9 @@ DataStoreTLBMiss:
mfspr r3,SPRN_DMISS
lis r1, TASK_SIZE@h /* check if kernel address */
cmplw 0,r1,r3
- mfspr r2, SPRN_SPRG_PGDIR
+ mfspr r2, SPRN_SDR1
li r1, _PAGE_RW | _PAGE_DIRTY | _PAGE_PRESENT | _PAGE_ACCESSED
+ rlwinm r2, r2, 28, 0xfffff000
bgt- 112f
lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */
addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */
@@ -889,9 +892,12 @@ __secondary_start:
tophys(r4,r2)
addi r4,r4,THREAD /* phys address of our thread_struct */
mtspr SPRN_SPRG_THREAD,r4
+BEGIN_MMU_FTR_SECTION
lis r4, (swapper_pg_dir - PAGE_OFFSET)@h
ori r4, r4, (swapper_pg_dir - PAGE_OFFSET)@l
- mtspr SPRN_SPRG_PGDIR, r4
+ rlwinm r4, r4, 4, 0xffff01ff
+ mtspr SPRN_SDR1, r4
+END_MMU_FTR_SECTION_IFCLR(MMU_FTR_HPTE_TABLE)
/* enable MMU and jump to start_secondary */
li r4,MSR_KERNEL
@@ -931,11 +937,13 @@ load_up_mmu:
tlbia /* Clear all TLB entries */
sync /* wait for tlbia/tlbie to finish */
TLBSYNC /* ... on all CPUs */
+BEGIN_MMU_FTR_SECTION
/* Load the SDR1 register (hash table base & size) */
lis r6,_SDR1@ha
tophys(r6,r6)
lwz r6,_SDR1@l(r6)
mtspr SPRN_SDR1,r6
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
/* Load the BAT registers with the values set up by MMU_init. */
lis r3,BATS@ha
@@ -991,9 +999,12 @@ start_here:
tophys(r4,r2)
addi r4,r4,THREAD /* init task's THREAD */
mtspr SPRN_SPRG_THREAD,r4
+BEGIN_MMU_FTR_SECTION
lis r4, (swapper_pg_dir - PAGE_OFFSET)@h
ori r4, r4, (swapper_pg_dir - PAGE_OFFSET)@l
- mtspr SPRN_SPRG_PGDIR, r4
+ rlwinm r4, r4, 4, 0xffff01ff
+ mtspr SPRN_SDR1, r4
+END_MMU_FTR_SECTION_IFCLR(MMU_FTR_HPTE_TABLE)
/* stack */
lis r1,init_thread_union@ha
@@ -1073,16 +1084,22 @@ _ENTRY(switch_mmu_context)
li r0,NUM_USER_SEGMENTS
mtctr r0
- lwz r4, MM_PGD(r4)
#ifdef CONFIG_BDI_SWITCH
/* Context switch the PTE pointer for the Abatron BDI2000.
* The PGDIR is passed as second argument.
*/
+ lwz r4, MM_PGD(r4)
lis r5, abatron_pteptrs@ha
stw r4, abatron_pteptrs@l + 0x4(r5)
+#endif
+BEGIN_MMU_FTR_SECTION
+#ifndef CONFIG_BDI_SWITCH
+ lwz r4, MM_PGD(r4)
#endif
tophys(r4, r4)
- mtspr SPRN_SPRG_PGDIR, r4
+ rlwinm r4, r4, 4, 0xffff01ff
+ mtspr SPRN_SDR1, r4
+END_MMU_FTR_SECTION_IFCLR(MMU_FTR_HPTE_TABLE)
li r4,0
isync
3:
--
2.25.0
^ permalink raw reply related
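The SDR1 encoding used in patch 5/8 above is easy to check by hand. The pgdir physical address is 4k aligned, so its low 12 bits are zero; rotating left by 4 moves the top nibble into bits that the 0xffff01ff mask keeps, and rotating left by 28 (equivalently right by 4) restores the address. A minimal C sketch of the two rlwinm operations follows. The helper names `rotl32()`, `pgdir_to_sdr1()` and `sdr1_to_pgdir()` are hypothetical and only model the arithmetic, not the register access.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned int n)
{
	n &= 31;
	return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Models: rlwinm rD, rS, 4, 0xffff01ff (value stored in SDR1). */
static uint32_t pgdir_to_sdr1(uint32_t pgdir_phys)
{
	return rotl32(pgdir_phys, 4) & 0xffff01ffu;
}

/* Models: rlwinm rD, rS, 28, 0xfffff000 (value read back in handlers). */
static uint32_t sdr1_to_pgdir(uint32_t sdr1)
{
	return rotl32(sdr1, 28) & 0xfffff000u;
}
```

For example, a pgdir at 0x12345000 is stored as 0x23450001 and decodes back to 0x12345000. Because bits 4 to 15 of the rotated value are always zero for a 4k-aligned address, the 0xffff01ff mask discards nothing.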
* [PATCH v1 6/8] powerpc/32: Simplify EXCEPTION_PROLOG_1 macro
From: Christophe Leroy @ 2020-11-25 7:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>
Make code more readable with a clear CONFIG_VMAP_STACK
section and a clear non CONFIG_VMAP_STACK section.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_32.h | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 7c767765071d..5e3393122d29 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -46,18 +46,16 @@
mfspr r1,SPRN_SPRG_THREAD
lwz r1,TASK_STACK-THREAD(r1)
addi r1, r1, THREAD_SIZE - INT_FRAME_SIZE
+1:
+ mtcrf 0x7f, r1
+ bt 32 - THREAD_ALIGN_SHIFT, stack_overflow
#else
subi r11, r1, INT_FRAME_SIZE /* use r1 if kernel */
beq 1f
mfspr r11,SPRN_SPRG_THREAD
lwz r11,TASK_STACK-THREAD(r11)
addi r11, r11, THREAD_SIZE - INT_FRAME_SIZE
-#endif
-1:
- tophys_novmstack r11, r11
-#ifdef CONFIG_VMAP_STACK
- mtcrf 0x7f, r1
- bt 32 - THREAD_ALIGN_SHIFT, stack_overflow
+1: tophys(r11, r11)
#endif
.endm
--
2.25.0
^ permalink raw reply related
* [PATCH v1 7/8] powerpc/32s: Use SPRN_SPRG_SCRATCH2 in DSI prolog
From: Christophe Leroy @ 2020-11-25 7:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>
Use SPRN_SPRG_SCRATCH2 as an alternative scratch register in
the early part of DSI prolog in order to avoid clobbering
SPRN_SPRG_SCRATCH0/1 used by other prologs.
The 603 doesn't like a jump from DataLoadTLBMiss to the 10 nops
that are now in the beginning of DSI exception as a result of
the feature section. To work around this, add a jump as alternative.
It also avoids fetching 10 nops for nothing.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/reg.h | 1 +
arch/powerpc/kernel/head_book3s_32.S | 24 ++++++++----------------
2 files changed, 9 insertions(+), 16 deletions(-)
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a37ce826f6f6..acd334ee3936 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1203,6 +1203,7 @@
#ifdef CONFIG_PPC_BOOK3S_32
#define SPRN_SPRG_SCRATCH0 SPRN_SPRG0
#define SPRN_SPRG_SCRATCH1 SPRN_SPRG1
+#define SPRN_SPRG_SCRATCH2 SPRN_SPRG2
#define SPRN_SPRG_603_LRU SPRN_SPRG4
#endif
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 51eef7b82f9c..22d670263222 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -288,9 +288,9 @@ MachineCheck:
DO_KVM 0x300
DataAccess:
#ifdef CONFIG_VMAP_STACK
- mtspr SPRN_SPRG_SCRATCH0,r10
- mfspr r10, SPRN_SPRG_THREAD
BEGIN_MMU_FTR_SECTION
+ mtspr SPRN_SPRG_SCRATCH2,r10
+ mfspr r10, SPRN_SPRG_THREAD
stw r11, THR11(r10)
mfspr r10, SPRN_DSISR
mfcr r11
@@ -304,19 +304,11 @@ BEGIN_MMU_FTR_SECTION
.Lhash_page_dsi_cont:
mtcr r11
lwz r11, THR11(r10)
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
- mtspr SPRN_SPRG_SCRATCH1,r11
- mfspr r11, SPRN_DAR
- stw r11, DAR(r10)
- mfspr r11, SPRN_DSISR
- stw r11, DSISR(r10)
- mfspr r11, SPRN_SRR0
- stw r11, SRR0(r10)
- mfspr r11, SPRN_SRR1 /* check whether user or kernel */
- stw r11, SRR1(r10)
- mfcr r10
- andi. r11, r11, MSR_PR
-
+ mfspr r10, SPRN_SPRG_SCRATCH2
+MMU_FTR_SECTION_ELSE
+ b 1f
+ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
+1: EXCEPTION_PROLOG_0 handle_dar_dsisr=1
EXCEPTION_PROLOG_1
b handle_page_fault_tramp_1
#else /* CONFIG_VMAP_STACK */
@@ -760,7 +752,7 @@ fast_hash_page_return:
/* DSI */
mtcr r11
lwz r11, THR11(r10)
- mfspr r10, SPRN_SPRG_SCRATCH0
+ mfspr r10, SPRN_SPRG_SCRATCH2
RFI
1: /* ISI */
--
2.25.0
^ permalink raw reply related
* [PATCH v1 8/8] powerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs
From: Christophe Leroy @ 2020-11-25 7:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu>
Use SPRN_SPRG_SCRATCH2 as a third scratch register in
exception prologs in order to simplify them and avoid
data going back and forth from/to CR.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_32.h | 22 +++++++---------------
1 file changed, 7 insertions(+), 15 deletions(-)
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 5e3393122d29..a1ee1e12241e 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -40,7 +40,7 @@
.macro EXCEPTION_PROLOG_1 for_rtas=0
#ifdef CONFIG_VMAP_STACK
- mr r11, r1
+ mtspr SPRN_SPRG_SCRATCH2,r1
subi r1, r1, INT_FRAME_SIZE /* use r1 if kernel */
beq 1f
mfspr r1,SPRN_SPRG_THREAD
@@ -61,15 +61,10 @@
.macro EXCEPTION_PROLOG_2 handle_dar_dsisr=0
#ifdef CONFIG_VMAP_STACK
- mtcr r10
- li r10, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
- mtmsr r10
+ li r11, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
+ mtmsr r11
isync
-#else
- stw r10,_CCR(r11) /* save registers */
-#endif
- mfspr r10, SPRN_SPRG_SCRATCH0
-#ifdef CONFIG_VMAP_STACK
+ mfspr r11, SPRN_SPRG_SCRATCH2
stw r11,GPR1(r1)
stw r11,0(r1)
mr r11, r1
@@ -78,14 +73,12 @@
stw r1,0(r11)
tovirt(r1, r11) /* set new kernel sp */
#endif
+ stw r10,_CCR(r11) /* save registers */
stw r12,GPR12(r11)
stw r9,GPR9(r11)
- stw r10,GPR10(r11)
-#ifdef CONFIG_VMAP_STACK
- mfcr r10
- stw r10, _CCR(r11)
-#endif
+ mfspr r10,SPRN_SPRG_SCRATCH0
mfspr r12,SPRN_SPRG_SCRATCH1
+ stw r10,GPR10(r11)
stw r12,GPR11(r11)
mflr r10
stw r10,_LINK(r11)
@@ -99,7 +92,6 @@
stw r10, _DSISR(r11)
.endif
lwz r9, SRR1(r12)
- andi. r10, r9, MSR_PR
lwz r12, SRR0(r12)
#else
mfspr r12,SPRN_SRR0
--
2.25.0
^ permalink raw reply related
* Re: [PATCH v4] dt-bindings: misc: convert fsl,qoriq-mc from txt to YAML
From: Ioana Ciornei @ 2020-11-25 7:17 UTC (permalink / raw)
To: Laurentiu Tudor
Cc: devicetree@vger.kernel.org, corbet@lwn.net,
netdev@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, Leo Li, robh+dt@kernel.org,
Ionut-robert Aron, kuba@kernel.org, linuxppc-dev@lists.ozlabs.org,
davem@davemloft.net, linux-arm-kernel@lists.infradead.org
In-Reply-To: <20201123090035.15734-1-laurentiu.tudor@nxp.com>
On Mon, Nov 23, 2020 at 11:00:35AM +0200, Laurentiu Tudor wrote:
> From: Ionut-robert Aron <ionut-robert.aron@nxp.com>
>
> Convert fsl,qoriq-mc to YAML in order to automate the verification
> process of dts files. In addition, update MAINTAINERS accordingly
> and, while at it, add some missing files.
>
> Signed-off-by: Ionut-robert Aron <ionut-robert.aron@nxp.com>
> [laurentiu.tudor@nxp.com: update MAINTAINERS, updates & fixes in schema]
> Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
> ---
> Changes in v4:
> - use $ref to point to fsl,qoriq-mc-dpmac binding
>
> Changes in v3:
> - dropped duplicated "fsl,qoriq-mc-dpmac" schema and replaced with
> reference to it
> - fixed a dt_binding_check warning
>
> Changes in v2:
> - fixed errors reported by yamllint
> - dropped multiple unnecessary quotes
> - used schema instead of text in description
> - added constraints on dpmac reg property
>
> .../devicetree/bindings/misc/fsl,qoriq-mc.txt | 196 ------------------
> .../bindings/misc/fsl,qoriq-mc.yaml | 186 +++++++++++++++++
> .../ethernet/freescale/dpaa2/overview.rst | 5 +-
> MAINTAINERS | 4 +-
> 4 files changed, 193 insertions(+), 198 deletions(-)
> delete mode 100644 Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> create mode 100644 Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml
>
> diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> deleted file mode 100644
> index 7b486d4985dc..000000000000
> --- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> +++ /dev/null
> @@ -1,196 +0,0 @@
> -* Freescale Management Complex
> -
> -The Freescale Management Complex (fsl-mc) is a hardware resource
> -manager that manages specialized hardware objects used in
> -network-oriented packet processing applications. After the fsl-mc
> -block is enabled, pools of hardware resources are available, such as
> -queues, buffer pools, I/O interfaces. These resources are building
> -blocks that can be used to create functional hardware objects/devices
> -such as network interfaces, crypto accelerator instances, L2 switches,
> -etc.
> -
> -For an overview of the DPAA2 architecture and fsl-mc bus see:
> -Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> -
> -As described in the above overview, all DPAA2 objects in a DPRC share the
> -same hardware "isolation context" and a 10-bit value called an ICID
> -(isolation context id) is expressed by the hardware to identify
> -the requester.
> -
> -The generic 'iommus' property is insufficient to describe the relationship
> -between ICIDs and IOMMUs, so an iommu-map property is used to define
> -the set of possible ICIDs under a root DPRC and how they map to
> -an IOMMU.
> -
> -For generic IOMMU bindings, see
> -Documentation/devicetree/bindings/iommu/iommu.txt.
> -
> -For arm-smmu binding, see:
> -Documentation/devicetree/bindings/iommu/arm,smmu.yaml.
> -
> -The MSI writes are accompanied by sideband data which is derived from the ICID.
> -The msi-map property is used to associate the devices with both the ITS
> -controller and the sideband data which accompanies the writes.
> -
> -For generic MSI bindings, see
> -Documentation/devicetree/bindings/interrupt-controller/msi.txt.
> -
> -For GICv3 and GIC ITS bindings, see:
> -Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml.
> -
> -Required properties:
> -
> - - compatible
> - Value type: <string>
> - Definition: Must be "fsl,qoriq-mc". A Freescale Management Complex
> - compatible with this binding must have Block Revision
> - Registers BRR1 and BRR2 at offset 0x0BF8 and 0x0BFC in
> - the MC control register region.
> -
> - - reg
> - Value type: <prop-encoded-array>
> - Definition: A standard property. Specifies one or two regions
> - defining the MC's registers:
> -
> - -the first region is the command portal for the
> - this machine and must always be present
> -
> - -the second region is the MC control registers. This
> - region may not be present in some scenarios, such
> - as in the device tree presented to a virtual machine.
> -
> - - ranges
> - Value type: <prop-encoded-array>
> - Definition: A standard property. Defines the mapping between the child
> - MC address space and the parent system address space.
> -
> - The MC address space is defined by 3 components:
> - <region type> <offset hi> <offset lo>
> -
> - Valid values for region type are
> - 0x0 - MC portals
> - 0x1 - QBMAN portals
> -
> - - #address-cells
> - Value type: <u32>
> - Definition: Must be 3. (see definition in 'ranges' property)
> -
> - - #size-cells
> - Value type: <u32>
> - Definition: Must be 1.
> -
> -Sub-nodes:
> -
> - The fsl-mc node may optionally have dpmac sub-nodes that describe
> - the relationship between the Ethernet MACs which belong to the MC
> - and the Ethernet PHYs on the system board.
> -
> - The dpmac nodes must be under a node named "dpmacs" which contains
> - the following properties:
> -
> - - #address-cells
> - Value type: <u32>
> - Definition: Must be present if dpmac sub-nodes are defined and must
> - have a value of 1.
> -
> - - #size-cells
> - Value type: <u32>
> - Definition: Must be present if dpmac sub-nodes are defined and must
> - have a value of 0.
> -
> - These nodes must have the following properties:
> -
> - - compatible
> - Value type: <string>
> - Definition: Must be "fsl,qoriq-mc-dpmac".
> -
> - - reg
> - Value type: <prop-encoded-array>
> - Definition: Specifies the id of the dpmac.
> -
> - - phy-handle
> - Value type: <phandle>
> - Definition: Specifies the phandle to the PHY device node associated
> - with the this dpmac.
> -Optional properties:
> -
> -- iommu-map: Maps an ICID to an IOMMU and associated iommu-specifier
> - data.
> -
> - The property is an arbitrary number of tuples of
> - (icid-base,iommu,iommu-base,length).
> -
> - Any ICID i in the interval [icid-base, icid-base + length) is
> - associated with the listed IOMMU, with the iommu-specifier
> - (i - icid-base + iommu-base).
> -
> -- msi-map: Maps an ICID to a GIC ITS and associated msi-specifier
> - data.
> -
> - The property is an arbitrary number of tuples of
> - (icid-base,gic-its,msi-base,length).
> -
> - Any ICID in the interval [icid-base, icid-base + length) is
> - associated with the listed GIC ITS, with the msi-specifier
> - (i - icid-base + msi-base).
> -
> -Deprecated properties:
> -
> - - msi-parent
> - Value type: <phandle>
> - Definition: Describes the MSI controller node handling message
> - interrupts for the MC. When there is no translation
> - between the ICID and deviceID this property can be used
> - to describe the MSI controller used by the devices on the
> - mc-bus.
> - The use of this property for mc-bus is deprecated. Please
> - use msi-map.
> -
> -Example:
> -
> - smmu: iommu@5000000 {
> - compatible = "arm,mmu-500";
> - #iommu-cells = <1>;
> - stream-match-mask = <0x7C00>;
> - ...
> - };
> -
> - gic: interrupt-controller@6000000 {
> - compatible = "arm,gic-v3";
> - ...
> - }
> - its: gic-its@6020000 {
> - compatible = "arm,gic-v3-its";
> - msi-controller;
> - ...
> - };
> -
> - fsl_mc: fsl-mc@80c000000 {
> - compatible = "fsl,qoriq-mc";
> - reg = <0x00000008 0x0c000000 0 0x40>, /* MC portal base */
> - <0x00000000 0x08340000 0 0x40000>; /* MC control reg */
> - /* define map for ICIDs 23-64 */
> - iommu-map = <23 &smmu 23 41>;
> - /* define msi map for ICIDs 23-64 */
> - msi-map = <23 &its 23 41>;
> - #address-cells = <3>;
> - #size-cells = <1>;
> -
> - /*
> - * Region type 0x0 - MC portals
> - * Region type 0x1 - QBMAN portals
> - */
> - ranges = <0x0 0x0 0x0 0x8 0x0c000000 0x4000000
> - 0x1 0x0 0x0 0x8 0x18000000 0x8000000>;
> -
> - dpmacs {
> - #address-cells = <1>;
> - #size-cells = <0>;
> -
> - dpmac@1 {
> - compatible = "fsl,qoriq-mc-dpmac";
> - reg = <1>;
> - phy-handle = <&mdio0_phy0>;
> - }
> - }
> - };
> diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml
> new file mode 100644
> index 000000000000..f45e21872e4f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml
> @@ -0,0 +1,186 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +# Copyright 2020 NXP
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/misc/fsl,qoriq-mc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +maintainers:
> + - Laurentiu Tudor <laurentiu.tudor@nxp.com>
> +
> +title: Freescale Management Complex
> +
> +description: |
> + The Freescale Management Complex (fsl-mc) is a hardware resource
> + manager that manages specialized hardware objects used in
> + network-oriented packet processing applications. After the fsl-mc
> + block is enabled, pools of hardware resources are available, such as
> + queues, buffer pools, I/O interfaces. These resources are building
> + blocks that can be used to create functional hardware objects/devices
> + such as network interfaces, crypto accelerator instances, L2 switches,
> + etc.
> +
> + For an overview of the DPAA2 architecture and fsl-mc bus see:
> + Documentation/networking/device_drivers/freescale/dpaa2/overview.rst
> +
> + As described in the above overview, all DPAA2 objects in a DPRC share the
> + same hardware "isolation context" and a 10-bit value called an ICID
> + (isolation context id) is expressed by the hardware to identify
> + the requester.
> +
> + The generic 'iommus' property is insufficient to describe the relationship
> + between ICIDs and IOMMUs, so an iommu-map property is used to define
> + the set of possible ICIDs under a root DPRC and how they map to
> + an IOMMU.
> +
> + For generic IOMMU bindings, see:
> + Documentation/devicetree/bindings/iommu/iommu.txt.
> +
> + For arm-smmu binding, see:
> + Documentation/devicetree/bindings/iommu/arm,smmu.yaml.
> +
> + MC firmware binary images can be found here:
> + https://github.com/NXP/qoriq-mc-binary
> +
> +properties:
> + compatible:
> + const: fsl,qoriq-mc
> + description:
> + A Freescale Management Complex compatible with this binding must have
> + Block Revision Registers BRR1 and BRR2 at offset 0x0BF8 and 0x0BFC in
> + the MC control register region.
> +
> + reg:
> + minItems: 1
> + items:
> + - description: the command portal for this machine
> + - description:
> + MC control registers. This region may not be present in some
> + scenarios, such as in the device tree presented to a virtual
> + machine.
> +
> + ranges:
> + description: |
> + A standard property. Defines the mapping between the child MC address
> + space and the parent system address space.
> +
> + The MC address space is defined by 3 components:
> + <region type> <offset hi> <offset lo>
> +
> + Valid values for region type are:
> + 0x0 - MC portals
> + 0x1 - QBMAN portals
> +
> + '#address-cells':
> + const: 3
> +
> + '#size-cells':
> + const: 1
> +
> + dpmacs:
> + type: object
> + description:
> + The fsl-mc node may optionally have dpmac sub-nodes that describe the
> + relationship between the Ethernet MACs which belong to the MC and the
> + Ethernet PHYs on the system board.
> +
> + properties:
> + '#address-cells':
> + const: 1
> +
> + '#size-cells':
> + const: 0
> +
> + patternProperties:
> + "^(dpmac@[0-9a-f]+)|(ethernet@[0-9a-f]+)$":
> + type: object
> +
> + $ref: /schemas/net/fsl,qoriq-mc-dpmac.yaml#
> +
> + iommu-map:
> + description: |
> + Maps an ICID to an IOMMU and associated iommu-specifier data.
> +
> + The property is an arbitrary number of tuples of
> + (icid-base, iommu, iommu-base, length).
> +
> + Any ICID i in the interval [icid-base, icid-base + length) is
> + associated with the listed IOMMU, with the iommu-specifier
> + (i - icid-base + iommu-base).
> +
> + msi-map:
> + description: |
> + Maps an ICID to a GIC ITS and associated msi-specifier data.
> +
> + The property is an arbitrary number of tuples of
> + (icid-base, gic-its, msi-base, length).
> +
> + Any ICID i in the interval [icid-base, icid-base + length) is
> + associated with the listed GIC ITS, with the msi-specifier
> + (i - icid-base + msi-base).
> +
> + msi-parent:
> + deprecated: true
> + description:
> + Points to the MSI controller node handling message interrupts for the MC.
> +
> +required:
> + - compatible
> + - reg
> + - iommu-map
> + - msi-map
> + - ranges
> + - '#address-cells'
> + - '#size-cells'
> +
> +additionalProperties: false
> +
> +examples:
> + - |
> + soc {
> + #address-cells = <2>;
> + #size-cells = <2>;
> +
> + smmu: iommu@5000000 {
> + compatible = "arm,mmu-500";
> + #global-interrupts = <1>;
> + #iommu-cells = <1>;
> + reg = <0 0x5000000 0 0x800000>;
> + stream-match-mask = <0x7c00>;
> + interrupts = <0 13 4>,
> + <0 146 4>, <0 147 4>,
> + <0 148 4>, <0 149 4>,
> + <0 150 4>, <0 151 4>,
> + <0 152 4>, <0 153 4>;
> + };
> +
> + fsl_mc: fsl-mc@80c000000 {
> + compatible = "fsl,qoriq-mc";
> + reg = <0x00000008 0x0c000000 0 0x40>, /* MC portal base */
> + <0x00000000 0x08340000 0 0x40000>; /* MC control reg */
> + /* define map for ICIDs 23-64 */
> + iommu-map = <23 &smmu 23 41>;
> + /* define msi map for ICIDs 23-64 */
> + msi-map = <23 &its 23 41>;
> + #address-cells = <3>;
> + #size-cells = <1>;
> +
> + /*
> + * Region type 0x0 - MC portals
> + * Region type 0x1 - QBMAN portals
> + */
> + ranges = <0x0 0x0 0x0 0x8 0x0c000000 0x4000000
> + 0x1 0x0 0x0 0x8 0x18000000 0x8000000>;
> +
> + dpmacs {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + ethernet@1 {
> + compatible = "fsl,qoriq-mc-dpmac";
> + reg = <1>;
> + phy-handle = <&mdio0_phy0>;
> + };
> + };
> + };
> + };
> diff --git a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> index d638b5a8aadd..b3261c5871cc 100644
> --- a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> +++ b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> @@ -28,6 +28,9 @@ interfaces, an L2 switch, or accelerator instances.
> The MC provides memory-mapped I/O command interfaces (MC portals)
> which DPAA2 software drivers use to operate on DPAA2 objects.
>
> +MC firmware binary images can be found here:
> +https://github.com/NXP/qoriq-mc-binary
> +
> The diagram below shows an overview of the DPAA2 resource management
> architecture::
>
> @@ -338,7 +341,7 @@ Key functions include:
> a bind of the root DPRC to the DPRC driver
>
> The binding for the MC-bus device-tree node can be consulted at
> -*Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt*.
> +*Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml*.
> The sysfs bind/unbind interfaces for the MC-bus can be consulted at
> *Documentation/ABI/testing/sysfs-bus-fsl-mc*.
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b516bb34a8d5..e0ce6e2b663c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14409,9 +14409,11 @@ M: Stuart Yoder <stuyoder@gmail.com>
> M: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> L: linux-kernel@vger.kernel.org
> S: Maintained
> -F: Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> +F: Documentation/devicetree/bindings/misc/fsl,dpaa2-console.yaml
> +F: Documentation/devicetree/bindings/misc/fsl,qoriq-mc.yaml
> F: Documentation/networking/device_drivers/ethernet/freescale/dpaa2/overview.rst
> F: drivers/bus/fsl-mc/
> +F: include/linux/fsl/mc.h
>
> QT1010 MEDIA DRIVER
> M: Antti Palosaari <crope@iki.fi>
> --
> 2.17.1
>
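The iommu-map and msi-map descriptions in the quoted binding both reduce to the same interval lookup. A minimal sketch of that arithmetic follows, assuming a flattened table with the phandle cell dropped; the struct and function names are hypothetical and do not come from the kernel's OF code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One (icid-base, <phandle>, out-base, length) tuple from iommu-map or
 * msi-map. The phandle is omitted; this is an illustrative layout only.
 */
struct icid_map_entry {
	uint32_t icid_base;
	uint32_t out_base;	/* iommu-base or msi-base */
	uint32_t length;
};

/*
 * Look up icid in the map. On a hit, store (icid - icid_base + out_base)
 * in *out and return true. Return false if no tuple covers the ICID.
 */
static bool icid_map_lookup(const struct icid_map_entry *map, size_t n,
			    uint32_t icid, uint32_t *out)
{
	for (size_t i = 0; i < n; i++) {
		/* Half-open interval [icid_base, icid_base + length). */
		if (icid >= map[i].icid_base &&
		    icid - map[i].icid_base < map[i].length) {
			*out = icid - map[i].icid_base + map[i].out_base;
			return true;
		}
	}
	return false;
}
```

With the example tuple <23 &smmu 23 41>, this lookup maps ICIDs 23 through 63 to themselves. Under the half-open rule, ICID 64 is not covered.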
^ permalink raw reply
* [PATCH V2] powerpc/perf: Exclude kernel samples while counting events in user space.
From: Athira Rajeev @ 2020-11-25 7:26 UTC (permalink / raw)
To: mpe; +Cc: maddy, linuxppc-dev
The perf event attribute supports the exclude_kernel flag
to avoid sampling/profiling in supervisor state (kernel).
Based on this event attr flag, a Monitor Mode Control Register
bit is set to freeze on supervisor state. But sometimes (due
to a hardware limitation), the Sampled Instruction Address
Register (SIAR) locks on to a kernel address even when
freeze on supervisor is set. This patch adds a check to
drop those samples.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
Changes in v2:
- Initial patch was sent along with series:
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=209195
Moving this patch as separate since this change is applicable
for all PMU platforms.
arch/powerpc/perf/core-book3s.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 08643cb..40aa117 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2122,6 +2122,17 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
perf_event_update_userpage(event);
/*
+ * Due to hardware limitation, sometimes SIAR could
+ * lock on to kernel address even with freeze on
+ * supervisor state (kernel) is set in MMCR2.
+ * Check attr.exclude_kernel and address
+ * to drop the sample in these cases.
+ */
+ if (event->attr.exclude_kernel && record)
+ if (is_kernel_addr(mfspr(SPRN_SIAR)))
+ record = 0;
+
+ /*
* Finally record data if requested.
*/
if (record) {
--
1.8.3.1
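The check added by the patch above can be modelled in user space as a pure function. This is only a sketch: the helper names are hypothetical, and the address test assumes the 64-bit Book3S convention that kernel addresses start at PAGE_OFFSET 0xc000000000000000. In the kernel the address comes from mfspr(SPRN_SIAR) and the test is is_kernel_addr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: kernel addresses are at or above this PAGE_OFFSET value. */
#define SKETCH_PAGE_OFFSET 0xc000000000000000ULL

/* Hypothetical stand-in for the kernel's is_kernel_addr(). */
static bool sketch_is_kernel_addr(uint64_t addr)
{
	return addr >= SKETCH_PAGE_OFFSET;
}

/*
 * Return the updated "record" flag: a sample that would have been
 * recorded is dropped when the event excludes the kernel but the
 * sampled address (siar) still points into kernel space.
 */
static int sketch_filter_sample(bool exclude_kernel, int record, uint64_t siar)
{
	if (exclude_kernel && record && sketch_is_kernel_addr(siar))
		return 0;
	return record;
}
```

A sample is dropped only when all three conditions hold. Events without exclude_kernel keep their kernel samples, and user-space addresses are never filtered.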
^ permalink raw reply related
* Re: [PATCH 1/2] genirq: add an affinity parameter to irq_create_mapping()
From: Laurent Vivier @ 2020-11-25 7:30 UTC (permalink / raw)
To: Thomas Gleixner, linux-kernel
Cc: Michael S . Tsirkin, linux-pci, linux-block, Paul Mackerras,
Marc Zyngier, linuxppc-dev, Christoph Hellwig
In-Reply-To: <87h7pel7ng.fsf@nanos.tec.linutronix.de>
On 24/11/2020 23:19, Thomas Gleixner wrote:
> On Tue, Nov 24 2020 at 21:03, Laurent Vivier wrote:
>> This parameter is needed to pass it to irq_domain_alloc_descs().
>>
>> This seems to have been missed by
>> 06ee6d571f0e ("genirq: Add affinity hint to irq allocation")
>
> No, this has not been missed at all. There was and is no reason to do
> this.
>
>> This is needed to implement proper support for multiqueue with
>> pseries.
>
> And because pseries needs this _all_ callers need to be changed?
>
>> 123 files changed, 171 insertions(+), 146 deletions(-)
>
> Lots of churn for nothing. 99% of the callers will never need that.
>
> What's wrong with simply adding an interface which takes that parameter,
> make the existing one an inline wrapper and leave the rest alone?
Nothing. I'll do it that way.
Thank you for your comment.
Laurent
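The shape Thomas suggests could look roughly like the sketch below. It is a self-contained mock, not kernel code: the struct bodies are stand-ins, the name irq_create_mapping_affinity() is a hypothetical choice for the new interface, and the returned number is a placeholder rather than a real virq allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types so the sketch compiles outside the kernel. */
struct irq_domain { int last_had_affinity; };
struct irq_affinity_desc { int dummy; };

/*
 * New interface taking the affinity parameter (hypothetical name).
 * The body only records whether an affinity was passed and returns
 * a placeholder mapping number.
 */
static unsigned int irq_create_mapping_affinity(struct irq_domain *domain,
		unsigned long hwirq,
		const struct irq_affinity_desc *affinity)
{
	domain->last_had_affinity = (affinity != NULL);
	return (unsigned int)hwirq + 1;
}

/* Existing interface kept as an inline wrapper; callers stay unchanged. */
static inline unsigned int irq_create_mapping(struct irq_domain *domain,
					      unsigned long hwirq)
{
	return irq_create_mapping_affinity(domain, hwirq, NULL);
}
```

Only the callers that need an affinity hint, such as the pseries multiqueue case, switch to the new function. Every other call site keeps using irq_create_mapping() untouched.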
^ permalink raw reply
* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
From: Michael Ellerman @ 2020-11-25 8:12 UTC (permalink / raw)
To: Namhyung Kim
Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, linuxppc-dev,
linux-kernel, Stephane Eranian, Paul Mackerras,
Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Gabriel Marin,
Liang, Kan
In-Reply-To: <CAM9d7cg8kYMyPHQK_rhEiYQaSddqqt93=pLVNKJm8Y6F=if9ow@mail.gmail.com>
Namhyung Kim <namhyung@kernel.org> writes:
> Hello,
>
> On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Namhyung Kim <namhyung@kernel.org> writes:
>> > Hi Peter and Kan,
>> >
>> > (Adding PPC folks)
>> >
>> > On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org> wrote:
>> >>
>> >> Hello,
>> >>
>> >> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
>> >> > > On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
>> >> > >
>> >> > >> - When the large PEBS was introduced (9c964efa4330), the sched_task() should
>> >> > >> be invoked to flush the PEBS buffer in each context switch. However, The
>> >> > >> perf_sched_events in account_event() is not updated accordingly. The
>> >> > >> perf_event_task_sched_* never be invoked for a pure per-CPU context. Only
>> >> > >> per-task event works.
>> >> > >> At that time, the perf_pmu_sched_task() is outside of
>> >> > >> perf_event_context_sched_in/out. It means that perf has to double
>> >> > >> perf_pmu_disable() for per-task event.
>> >> > >
>> >> > >> - The patch 1 tries to fix broken per-CPU events. The CPU context cannot be
>> >> > >> retrieved from the task->perf_event_ctxp. So it has to be tracked in the
>> >> > >> sched_cb_list. Yes, the code is very similar to the original codes, but it
>> >> > >> is actually the new code for per-CPU events. The optimization for per-task
>> >> > >> events is still kept.
>> >> > >> For the case, which has both a CPU context and a task context, yes, the
>> >> > >> __perf_pmu_sched_task() in this patch is not invoked. Because the
>> >> > >> sched_task() only need to be invoked once in a context switch. The
>> >> > >> sched_task() will be eventually invoked in the task context.
>> >> > >
>> >> > > The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and
>> >> > > only set that for large pebs. Are you sure the other users (Intel LBR
>> >> > > and PowerPC BHRB) don't need it?
>> >> >
>> >> > I didn't set it for LBR, because the perf_sched_events is always enabled
>> >> > for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
>> >> > for LBR.
>> >> >
>> >> > if (has_branch_stack(event))
>> >> > inc = true;
>> >> >
>> >> > >
>> >> > > If they indeed do not require the pmu::sched_task() callback for CPU
>> >> > > events, then I still think the whole perf_sched_cb_{inc,dec}() interface
>> >> >
>> >> > No, LBR requires the pmu::sched_task() callback for CPU events.
>> >> >
>> >> > Now, The LBR registers have to be reset in sched in even for CPU events.
>> >> >
>> >> > To fix the shorter LBR callstack issue for CPU events, we also need to
>> >> > save/restore LBRs in pmu::sched_task().
>> >> > https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
>> >> >
>> >> > > is confusing at best.
>> >> > >
>> >> > > Can't we do something like this instead?
>> >> > >
>> >> > I think the below patch may have two issues.
>> >> > - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) now.
>> >> > - We may disable the large PEBS later if not all PEBS events support
>> >> > large PEBS. The PMU need a way to notify the generic code to decrease
>> >> > the nr_sched_task.
>> >>
>> >> Any updates on this? I've reviewed and tested Kan's patches
>> >> and they all look good.
>> >>
>> >> Maybe we can talk to PPC folks to confirm the BHRB case?
>> >
>> > Can we move this forward? I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB
>> > for PowerPC too. But it'd be nice if ppc folks can confirm the change.
>>
>> Sorry I've read the whole thread, but I'm still not entirely sure I
>> understand the question.
>
> Thanks for your time and sorry about not being clear enough.
>
> We found per-cpu events are not calling pmu::sched_task()
> on context switches. So PERF_ATTACH_SCHED_CB was
> added to indicate the core logic that it needs to invoke the
> callback.
OK. TBH I've never thought of using branch stack with a per-cpu event,
but I guess you can do it.
I think the same logic applies as for LBR: we need to read the BHRB entries
in the context of the task that they were recorded for.
> The patch 3/3 added the flag to PPC (for BHRB) with other
> changes (I think it should be split like in the patch 2/3) and
> want to get ACKs from the PPC folks.
If you post a new version with Maddy's comments addressed then he or I
can ack it.
cheers
^ permalink raw reply