* Re: [PATCH v3 3/3] powerpc/uaccess: simplify the get_fs() set_fs() logic
From: Christoph Hellwig @ 2020-08-06 9:17 UTC
To: Christophe Leroy; +Cc: linuxppc-dev, Paul Mackerras, linux-kernel
In-Reply-To: <cf39cb8e42cffe323393b8cecdc59a7230298eab.1596702117.git.christophe.leroy@csgroup.eu>
Do you urgently need this? My plan for 5.10 is to rebase and submit
the remaining bits of this branch:
http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/set_fs-removal
which will kill off set_fs/get_fs entirely.
* [PATCH] powerpc/book3s64/radix: Make radix_mem_block_size 64bit
From: Aneesh Kumar K.V @ 2020-08-06 8:14 UTC
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V
Similar to commit 89c140bbaeee ("pseries: Fix 64 bit logical memory block panic"),
make sure the different variables tracking lmb_size are updated
to be 64 bit.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
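Illustration only, not part of the patch: a minimal user-space sketch of the
truncation that widening these variables guards against. The helper names and
the 4 GiB test value are assumptions made for this example; the kernel fields
concerned are lmb_size and radix_mem_block_size.

```c
#include <stdint.h>

/*
 * Hypothetical helpers: model storing a memory block size in a
 * 32-bit field versus a 64-bit field.
 */
static uint32_t store_block_size_u32(uint64_t size)
{
	return (uint32_t)size;	/* only the low 32 bits survive */
}

static uint64_t store_block_size_u64(uint64_t size)
{
	return size;		/* full value preserved */
}
```

In this model a block size of exactly 4 GiB (1ULL << 32) becomes 0 in the
32-bit field, while sizes below 4 GiB are unaffected.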
arch/powerpc/include/asm/book3s/64/mmu.h | 2 +-
arch/powerpc/include/asm/drmem.h | 2 +-
arch/powerpc/mm/book3s64/radix_pgtable.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 55442d45c597..1a0c9d09950f 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -85,7 +85,7 @@ extern unsigned int mmu_base_pid;
/*
* memory block size used with radix translation.
*/
-extern unsigned int __ro_after_init radix_mem_block_size;
+extern unsigned long __ro_after_init radix_mem_block_size;
#define PRTB_SIZE_SHIFT (mmu_pid_bits + 4)
#define PRTB_ENTRIES (1ul << mmu_pid_bits)
diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index 17ccc6474ab6..07c158c5f939 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -21,7 +21,7 @@ struct drmem_lmb {
struct drmem_lmb_info {
struct drmem_lmb *lmbs;
int n_lmbs;
- u32 lmb_size;
+ u64 lmb_size;
};
extern struct drmem_lmb_info *drmem_info;
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 28c784976bed..ca76d9d6372a 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -34,7 +34,7 @@
unsigned int mmu_pid_bits;
unsigned int mmu_base_pid;
-unsigned int radix_mem_block_size __ro_after_init;
+unsigned long radix_mem_block_size __ro_after_init;
static __ref void *early_alloc_pgtable(unsigned long size, int nid,
unsigned long region_start, unsigned long region_end)
--
2.26.2
* Re: [PATCH] powerpc/signal: Move and simplify get_clean_sp()
From: Christoph Hellwig @ 2020-08-06 9:25 UTC
To: Christophe Leroy; +Cc: Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <04169f40c09682ce5747518268ca84285bc17fbc.1596703345.git.christophe.leroy@csgroup.eu>
On Thu, Aug 06, 2020 at 08:50:20AM +0000, Christophe Leroy wrote:
> get_clean_sp() is only used in kernel/signal.c . Move it there.
>
> And GCC is smart enough to reduce the function when on PPC32, no
> need of a special PPC32 simple version.
What about just open coding it in the only caller, which would seem even
cleaner?
* [PATCH] powerpc/32s: Fix assembler warning about r0
From: Christophe Leroy @ 2020-08-06 6:01 UTC
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
The assembler says:
arch/powerpc/kernel/head_32.S:1095: Warning: invalid register expression
It's objecting to the use of r0 as the RA argument. That's because
when RA = 0 the literal value 0 is used, rather than the content of
r0, making the use of r0 in the source potentially confusing.
Fix it by using a literal 0; the generated code is identical.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
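Illustration only, not part of the patch: a small C model of how a D-form
store such as stw forms its effective address, which is why r0 and a literal 0
in the RA position encode the same thing. The function name and the gpr[]
array are hypothetical stand-ins for the register file.

```c
#include <stdint.h>

/*
 * Hypothetical model of D-form effective-address calculation:
 * EA = (RA == 0 ? 0 : GPR[RA]) + sign-extended 16-bit displacement.
 * With RA = 0 the content of r0 is never read.
 */
static uint32_t dform_ea(const uint32_t gpr[32], unsigned int ra, int16_t d)
{
	uint32_t base = (ra == 0) ? 0 : gpr[ra];

	return base + (uint32_t)(int32_t)d;
}
```

Under this model, "stw r5, 0xf0(0)" stores to address 0xf0 whatever r0 holds,
which matches the behaviour the commit message describes.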
arch/powerpc/kernel/head_32.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index f3ab94d73936..5624db0e09a1 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -1092,7 +1092,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
*/
lis r5, abatron_pteptrs@h
ori r5, r5, abatron_pteptrs@l
- stw r5, 0xf0(r0) /* This much match your Abatron config */
+ stw r5, 0xf0(0) /* This much match your Abatron config */
lis r6, swapper_pg_dir@h
ori r6, r6, swapper_pg_dir@l
tophys(r5, r5)
--
2.25.0
* [PATCH v3 3/3] powerpc/uaccess: simplify the get_fs() set_fs() logic
From: Christophe Leroy @ 2020-08-06 8:23 UTC
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <a6e62627d25fb7ae9b91d8bf553e707689e37498.1596702117.git.christophe.leroy@csgroup.eu>
On powerpc, we only have USER_DS and KERNEL_DS.
Today, this is managed as an 'unsigned long' data space limit
against which the passed address is compared, plus a bit
in the thread_info flags that is set whenever the limit is modified
to enable the verification in addr_limit_user_check().
The limit is either the last address of user space when USER_DS is
set, or the last address of the address space when KERNEL_DS is set.
In both cases, the limit is a compile-time constant.
get_fs() returns the limit, which is part of the thread_struct.
set_fs() updates the limit, then sets the TIF_FSCHECK flag.
addr_limit_user_check() checks the flag and, if it is set, checks that
the limit is the user limit, then clears the TIF_FSCHECK flag.
In addition, when the flag is set the syscall exit work is invoked.
This exit work is heavy compared to normal syscall exit as it goes
through the normal exception exit instead of the fast syscall exit.
Rename the TIF_FSCHECK flag to TIF_UACCESS_KERNEL, which tells
whether KERNEL_DS or USER_DS is set. Redefine mm_segment_t as
a struct holding a bool that is either false (for USER_DS) or true (for
KERNEL_DS). When TIF_UACCESS_KERNEL is set, the limit is ~0UL.
Otherwise it is TASK_SIZE_USER (resp. TASK_SIZE_USER64 on PPC64). When
KERNEL_DS is set, there is no range to check. Define TIF_FSCHECK as an
alias to TIF_UACCESS_KERNEL.
On exit, invoke exit work when the bit is set, i.e. when KERNEL_DS
is set. addr_limit_user_check() will clear the bit and kill the
user process.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v3: Rebased, taking into account the removal of segment_eq() and comments from mpe
---
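Illustration only, not part of the patch: a user-space sketch of the scheme
described above, where a single flag replaces the stored limit. The static
variable stands in for the TIF_UACCESS_KERNEL bit, the USER_ADDR_MAX value is
an assumption, and access_ok_model() is a hypothetical name. The range check
is written as a subtraction so that it cannot wrap for a large size; that is a
choice made for this sketch, and the diff below remains the reference for the
actual kernel code.

```c
#include <stdbool.h>

#define USER_ADDR_MAX	0xc0000000UL	/* illustrative stand-in for TASK_SIZE */

typedef struct {
	bool uaccess_kernel;
} mm_segment_t;

#define KERNEL_DS	((mm_segment_t) { true })
#define USER_DS		((mm_segment_t) { false })

/* Stand-in for the TIF_UACCESS_KERNEL bit in the thread_info flags. */
static bool tif_uaccess_kernel;

static mm_segment_t get_fs(void)
{
	return (mm_segment_t) { tif_uaccess_kernel };
}

static void set_fs(mm_segment_t fs)
{
	tif_uaccess_kernel = fs.uaccess_kernel;
}

/* The limit is derived from the flag instead of being stored. */
static unsigned long user_addr_max(void)
{
	return get_fs().uaccess_kernel ? ~0UL : USER_ADDR_MAX - 1;
}

/* Hypothetical range check: no range to check when KERNEL_DS is set. */
static bool access_ok_model(unsigned long addr, unsigned long size)
{
	if (get_fs().uaccess_kernel)
		return true;
	if (addr >= USER_ADDR_MAX)
		return false;
	return size <= USER_ADDR_MAX - addr;
}
```

With USER_DS set, user_addr_max() yields USER_ADDR_MAX - 1 and any range
reaching USER_ADDR_MAX or beyond is rejected; with KERNEL_DS set, every range
is accepted.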
arch/powerpc/include/asm/processor.h | 5 +---
arch/powerpc/include/asm/thread_info.h | 9 ++++---
arch/powerpc/include/asm/uaccess.h | 35 +++++++++++++-------------
arch/powerpc/lib/sstep.c | 2 +-
4 files changed, 25 insertions(+), 26 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index ed0d633ab5aa..86a9c4395b99 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -84,7 +84,7 @@ void start_thread(struct pt_regs *regs, unsigned long fdptr, unsigned long sp);
void release_thread(struct task_struct *);
typedef struct {
- unsigned long seg;
+ bool uaccess_kernel;
} mm_segment_t;
#define TS_FPR(i) fp_state.fpr[i][TS_FPROFFSET]
@@ -148,7 +148,6 @@ struct thread_struct {
unsigned long ksp_vsid;
#endif
struct pt_regs *regs; /* Pointer to saved register state */
- mm_segment_t addr_limit; /* for get_fs() validation */
#ifdef CONFIG_BOOKE
/* BookE base exception scratch space; align on cacheline */
unsigned long normsave[8] ____cacheline_aligned;
@@ -295,7 +294,6 @@ struct thread_struct {
#define INIT_THREAD { \
.ksp = INIT_SP, \
.ksp_limit = INIT_SP_LIMIT, \
- .addr_limit = KERNEL_DS, \
.pgdir = swapper_pg_dir, \
.fpexc_mode = MSR_FE0 | MSR_FE1, \
SPEFSCR_INIT \
@@ -303,7 +301,6 @@ struct thread_struct {
#else
#define INIT_THREAD { \
.ksp = INIT_SP, \
- .addr_limit = KERNEL_DS, \
.fpexc_mode = 0, \
}
#endif
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index ca6c97025704..123232a63ee7 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -69,7 +69,7 @@ struct thread_info {
#define INIT_THREAD_INFO(tsk) \
{ \
.preempt_count = INIT_PREEMPT_COUNT, \
- .flags = 0, \
+ .flags = _TIF_UACCESS_KERNEL, \
}
#define THREAD_SIZE_ORDER (THREAD_SHIFT - PAGE_SHIFT)
@@ -90,7 +90,8 @@ void arch_setup_new_exec(void);
#define TIF_SYSCALL_TRACE 0 /* syscall trace active */
#define TIF_SIGPENDING 1 /* signal pending */
#define TIF_NEED_RESCHED 2 /* rescheduling necessary */
-#define TIF_FSCHECK 3 /* Check FS is USER_DS on return */
+#define TIF_UACCESS_KERNEL 3 /* KERNEL_DS is set */
+#define TIF_FSCHECK TIF_UACCESS_KERNEL
#define TIF_SYSCALL_EMU 4 /* syscall emulation active */
#define TIF_RESTORE_TM 5 /* need to restore TM FP/VEC/VSX */
#define TIF_PATCH_PENDING 6 /* pending live patching update */
@@ -130,7 +131,7 @@ void arch_setup_new_exec(void);
#define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT)
#define _TIF_EMULATE_STACK_STORE (1<<TIF_EMULATE_STACK_STORE)
#define _TIF_NOHZ (1<<TIF_NOHZ)
-#define _TIF_FSCHECK (1<<TIF_FSCHECK)
+#define _TIF_UACCESS_KERNEL (1 << TIF_UACCESS_KERNEL)
#define _TIF_SYSCALL_EMU (1<<TIF_SYSCALL_EMU)
#define _TIF_SYSCALL_DOTRACE (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
_TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \
@@ -139,7 +140,7 @@ void arch_setup_new_exec(void);
#define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
_TIF_NOTIFY_RESUME | _TIF_UPROBE | \
_TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
- _TIF_FSCHECK)
+ _TIF_UACCESS_KERNEL)
#define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR)
/* Bits in local_flags */
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 00699903f1ef..8567bec6f939 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -15,48 +15,49 @@
*
* For historical reasons, these macros are grossly misnamed.
*
- * The fs/ds values are now the highest legal address in the "segment".
+ * The fs/ds values are now a bool which tells the "segment" is user or kernel.
* This simplifies the checking in the routines below.
*/
#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
-#define KERNEL_DS MAKE_MM_SEG(~0UL)
-#ifdef __powerpc64__
-/* We use TASK_SIZE_USER64 as TASK_SIZE is not constant */
-#define USER_DS MAKE_MM_SEG(TASK_SIZE_USER64 - 1)
-#else
-#define USER_DS MAKE_MM_SEG(TASK_SIZE - 1)
-#endif
+#define KERNEL_DS MAKE_MM_SEG(true)
+#define USER_DS MAKE_MM_SEG(false)
-#define get_fs() (current->thread.addr_limit)
+#define get_fs() (MAKE_MM_SEG(test_thread_flag(TIF_UACCESS_KERNEL)))
static inline void set_fs(mm_segment_t fs)
{
- current->thread.addr_limit = fs;
- /* On user-mode return check addr_limit (fs) is correct */
- set_thread_flag(TIF_FSCHECK);
+ update_thread_flag(TIF_UACCESS_KERNEL, fs.uaccess_kernel);
}
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-#define user_addr_max() (get_fs().seg)
+#define uaccess_kernel() (get_fs().uaccess_kernel)
+#define user_addr_max() (get_fs().uaccess_kernel ? ~0UL : USER_ADDR_MAX - 1)
#ifdef __powerpc64__
+
+#define USER_ADDR_MAX TASK_SIZE_USER64
+
/*
* This check is sufficient because there is a large enough
* gap between user addresses and the kernel addresses
*/
#define __access_ok(addr, size, segment) \
- (((addr) <= (segment).seg) && ((size) <= (segment).seg))
+ segment.uaccess_kernel ? \
+ 1 : (addr) < USER_ADDR_MAX && ((size) < USER_ADDR_MAX)
#else
+#define USER_ADDR_MAX TASK_SIZE
+
static inline int __access_ok(unsigned long addr, unsigned long size,
mm_segment_t seg)
{
- if (addr > seg.seg)
+ if (seg.uaccess_kernel)
+ return 1;
+ if (addr >= USER_ADDR_MAX)
return 0;
- return (size == 0 || size - 1 <= seg.seg - addr);
+ return addr + size <= USER_ADDR_MAX;
}
#endif
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index caee8cc77e19..e10b642566ba 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -112,7 +112,7 @@ static nokprobe_inline long address_ok(struct pt_regs *regs,
return 1;
if (__access_ok(ea, 1, USER_DS))
/* Access overlaps the end of the user region */
- regs->dar = USER_DS.seg;
+ regs->dar = USER_ADDR_MAX;
else
regs->dar = ea;
return 0;
--
2.25.0
* [PATCH v3 2/3] uaccess: remove segment_eq
From: Christophe Leroy @ 2020-08-06 8:23 UTC
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <a6e62627d25fb7ae9b91d8bf553e707689e37498.1596702117.git.christophe.leroy@csgroup.eu>
From: Christoph Hellwig <hch@lst.de>
segment_eq is only used to implement uaccess_kernel. Just open code
uaccess_kernel in the arch uaccess headers and remove one layer of
indirection.
Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
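Illustration only, not part of the patch: a user-space sketch of the layer of
indirection being removed, for an architecture whose mm_segment_t is a struct
with a seg member. The current_limit variable, the USER_DS value and the name
uaccess_kernel_old() are assumptions made for this example.

```c
#include <stdbool.h>

typedef struct {
	unsigned long seg;
} mm_segment_t;

#define KERNEL_DS	((mm_segment_t) { ~0UL })
#define USER_DS		((mm_segment_t) { 0x7fffffffUL })	/* illustrative limit */

/* Stand-in for the per-thread addr_limit. */
static mm_segment_t current_limit;

#define get_fs()		(current_limit)

/* Before: a generic wrapper built on an arch-provided segment_eq(). */
#define segment_eq(a, b)	((a).seg == (b).seg)
#define uaccess_kernel_old()	segment_eq(get_fs(), KERNEL_DS)

/* After: the arch header open-codes the comparison directly. */
#define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
```

Both forms expand to the same comparison, so they agree for any value of the
limit in this model.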
arch/alpha/include/asm/uaccess.h | 2 +-
arch/arc/include/asm/segment.h | 3 +--
arch/arm/include/asm/uaccess.h | 4 ++--
arch/arm64/include/asm/uaccess.h | 2 +-
arch/csky/include/asm/segment.h | 2 +-
arch/h8300/include/asm/segment.h | 2 +-
arch/ia64/include/asm/uaccess.h | 2 +-
arch/m68k/include/asm/segment.h | 2 +-
arch/microblaze/include/asm/uaccess.h | 2 +-
arch/mips/include/asm/uaccess.h | 2 +-
arch/nds32/include/asm/uaccess.h | 2 +-
arch/nios2/include/asm/uaccess.h | 2 +-
arch/openrisc/include/asm/uaccess.h | 2 +-
arch/parisc/include/asm/uaccess.h | 2 +-
arch/powerpc/include/asm/uaccess.h | 3 +--
arch/riscv/include/asm/uaccess.h | 4 +---
arch/s390/include/asm/uaccess.h | 2 +-
arch/sh/include/asm/segment.h | 3 +--
arch/sparc/include/asm/uaccess_32.h | 2 +-
arch/sparc/include/asm/uaccess_64.h | 2 +-
arch/x86/include/asm/uaccess.h | 2 +-
arch/xtensa/include/asm/uaccess.h | 2 +-
include/asm-generic/uaccess.h | 4 ++--
include/linux/uaccess.h | 2 --
24 files changed, 25 insertions(+), 32 deletions(-)
diff --git a/arch/alpha/include/asm/uaccess.h b/arch/alpha/include/asm/uaccess.h
index 1fe2b56cb861..1b6f25efa247 100644
--- a/arch/alpha/include/asm/uaccess.h
+++ b/arch/alpha/include/asm/uaccess.h
@@ -20,7 +20,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
/*
* Is a address valid? This does a straightforward calculation rather
diff --git a/arch/arc/include/asm/segment.h b/arch/arc/include/asm/segment.h
index 6a2a5be5026d..871f8ab11bfd 100644
--- a/arch/arc/include/asm/segment.h
+++ b/arch/arc/include/asm/segment.h
@@ -14,8 +14,7 @@ typedef unsigned long mm_segment_t;
#define KERNEL_DS MAKE_MM_SEG(0)
#define USER_DS MAKE_MM_SEG(TASK_SIZE)
-
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
#endif /* __ASSEMBLY__ */
#endif /* __ASMARC_SEGMENT_H */
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 98c6b91be4a8..b19c9bec1f7a 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -76,7 +76,7 @@ static inline void set_fs(mm_segment_t fs)
modify_domain(DOMAIN_KERNEL, fs ? DOMAIN_CLIENT : DOMAIN_MANAGER);
}
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
/* We use 33-bit arithmetic here... */
#define __range_ok(addr, size) ({ \
@@ -263,7 +263,7 @@ extern int __put_user_8(void *, unsigned long long);
*/
#define USER_DS KERNEL_DS
-#define segment_eq(a, b) (1)
+#define uaccess_kernel() (true)
#define __addr_ok(addr) ((void)(addr), 1)
#define __range_ok(addr, size) ((void)(addr), 0)
#define get_fs() (KERNEL_DS)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index bc5c7b091152..fcb8174de505 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -49,7 +49,7 @@ static inline void set_fs(mm_segment_t fs)
CONFIG_ARM64_UAO));
}
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
/*
* Test whether a block of memory is a valid user space address.
diff --git a/arch/csky/include/asm/segment.h b/arch/csky/include/asm/segment.h
index db2640d5f575..79ede9b1a646 100644
--- a/arch/csky/include/asm/segment.h
+++ b/arch/csky/include/asm/segment.h
@@ -13,6 +13,6 @@ typedef struct {
#define USER_DS ((mm_segment_t) { 0x80000000UL })
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif /* __ASM_CSKY_SEGMENT_H */
diff --git a/arch/h8300/include/asm/segment.h b/arch/h8300/include/asm/segment.h
index a407978f9f9f..37950725d9b9 100644
--- a/arch/h8300/include/asm/segment.h
+++ b/arch/h8300/include/asm/segment.h
@@ -33,7 +33,7 @@ static inline mm_segment_t get_fs(void)
return USER_DS;
}
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif /* __ASSEMBLY__ */
diff --git a/arch/ia64/include/asm/uaccess.h b/arch/ia64/include/asm/uaccess.h
index 8aa473a4b0f4..179243c3dfc7 100644
--- a/arch/ia64/include/asm/uaccess.h
+++ b/arch/ia64/include/asm/uaccess.h
@@ -50,7 +50,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
/*
* When accessing user memory, we need to make sure the entire area really is in
diff --git a/arch/m68k/include/asm/segment.h b/arch/m68k/include/asm/segment.h
index c6686559e9b7..2b5e68a71ef7 100644
--- a/arch/m68k/include/asm/segment.h
+++ b/arch/m68k/include/asm/segment.h
@@ -52,7 +52,7 @@ static inline void set_fs(mm_segment_t val)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
#endif
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif /* __ASSEMBLY__ */
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 6723c56ec378..304b04ffea2f 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -41,7 +41,7 @@
# define get_fs() (current_thread_info()->addr_limit)
# define set_fs(val) (current_thread_info()->addr_limit = (val))
-# define segment_eq(a, b) ((a).seg == (b).seg)
+# define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#ifndef CONFIG_MMU
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 62b298c50905..61fc01f177a6 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -72,7 +72,7 @@ extern u64 __ua_limit;
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
/*
* eva_kernel_access() - determine whether kernel memory access on an EVA system
diff --git a/arch/nds32/include/asm/uaccess.h b/arch/nds32/include/asm/uaccess.h
index 3a9219f53ee0..010ba5f1d7dd 100644
--- a/arch/nds32/include/asm/uaccess.h
+++ b/arch/nds32/include/asm/uaccess.h
@@ -44,7 +44,7 @@ static inline void set_fs(mm_segment_t fs)
current_thread_info()->addr_limit = fs;
}
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
#define __range_ok(addr, size) (size <= get_fs() && addr <= (get_fs() -size))
diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h
index e83f831a76f9..a741abbed6fb 100644
--- a/arch/nios2/include/asm/uaccess.h
+++ b/arch/nios2/include/asm/uaccess.h
@@ -30,7 +30,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(seg) (current_thread_info()->addr_limit = (seg))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define __access_ok(addr, len) \
(((signed long)(((long)get_fs().seg) & \
diff --git a/arch/openrisc/include/asm/uaccess.h b/arch/openrisc/include/asm/uaccess.h
index 17c24f14615f..48b691530d3e 100644
--- a/arch/openrisc/include/asm/uaccess.h
+++ b/arch/openrisc/include/asm/uaccess.h
@@ -43,7 +43,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
/* Ensure that the range from addr to addr+size is all within the process'
* address space
diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h
index ebbb9ffe038c..ed2cd4fb479b 100644
--- a/arch/parisc/include/asm/uaccess.h
+++ b/arch/parisc/include/asm/uaccess.h
@@ -14,7 +14,7 @@
#define KERNEL_DS ((mm_segment_t){0})
#define USER_DS ((mm_segment_t){1})
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 64c04ab09112..00699903f1ef 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -38,8 +38,7 @@ static inline void set_fs(mm_segment_t fs)
set_thread_flag(TIF_FSCHECK);
}
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (get_fs().seg)
#ifdef __powerpc64__
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 8ce9d607b53d..1bdf663e5932 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -62,11 +62,9 @@ static inline void set_fs(mm_segment_t fs)
current_thread_info()->addr_limit = fs;
}
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (get_fs().seg)
-
/**
* access_ok: - Checks if a user space pointer is valid
* @addr: User space pointer to start of block to check
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index 324438889fe1..f09444d6aeab 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -32,7 +32,7 @@
#define USER_DS_SACF (3)
#define get_fs() (current->thread.mm_segment)
-#define segment_eq(a,b) (((a) & 2) == ((b) & 2))
+#define uaccess_kernel() ((get_fs() & 2) == KERNEL_DS)
void set_fs(mm_segment_t fs);
diff --git a/arch/sh/include/asm/segment.h b/arch/sh/include/asm/segment.h
index 33d1d28057cb..02e54a3335d6 100644
--- a/arch/sh/include/asm/segment.h
+++ b/arch/sh/include/asm/segment.h
@@ -24,8 +24,7 @@ typedef struct {
#define USER_DS KERNEL_DS
#endif
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
diff --git a/arch/sparc/include/asm/uaccess_32.h b/arch/sparc/include/asm/uaccess_32.h
index d6d8413eca83..0a2d3ebc4bb8 100644
--- a/arch/sparc/include/asm/uaccess_32.h
+++ b/arch/sparc/include/asm/uaccess_32.h
@@ -28,7 +28,7 @@
#define get_fs() (current->thread.current_ds)
#define set_fs(val) ((current->thread.current_ds) = (val))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
/* We have there a nice not-mapped page at PAGE_OFFSET - PAGE_SIZE, so that this test
* can be fairly lightweight.
diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
index bf9d330073b2..698cf69f74e9 100644
--- a/arch/sparc/include/asm/uaccess_64.h
+++ b/arch/sparc/include/asm/uaccess_64.h
@@ -32,7 +32,7 @@
#define get_fs() ((mm_segment_t){(current_thread_info()->current_ds)})
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define set_fs(val) \
do { \
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 18dfa07d3ef0..dd3261f9f4ea 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -33,7 +33,7 @@ static inline void set_fs(mm_segment_t fs)
set_thread_flag(TIF_FSCHECK);
}
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (current->thread.addr_limit.seg)
/*
diff --git a/arch/xtensa/include/asm/uaccess.h b/arch/xtensa/include/asm/uaccess.h
index e57f0d0a88d8..b9758119feca 100644
--- a/arch/xtensa/include/asm/uaccess.h
+++ b/arch/xtensa/include/asm/uaccess.h
@@ -35,7 +35,7 @@
#define get_fs() (current->thread.current_ds)
#define set_fs(val) (current->thread.current_ds = (val))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define __kernel_ok (uaccess_kernel())
#define __user_ok(addr, size) \
diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index e935318804f8..ba68ee4dabfa 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -86,8 +86,8 @@ static inline void set_fs(mm_segment_t fs)
}
#endif
-#ifndef segment_eq
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#ifndef uaccess_kernel
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif
#define access_ok(addr, size) __access_ok((unsigned long)(addr),(size))
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 0a76ddc07d59..5c62d0c6f15b 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -6,8 +6,6 @@
#include <linux/sched.h>
#include <linux/thread_info.h>
-#define uaccess_kernel() segment_eq(get_fs(), KERNEL_DS)
-
#include <asm/uaccess.h>
/*
--
2.25.0
* [PATCH v3 1/3] syscalls: use uaccess_kernel in addr_limit_user_check
From: Christophe Leroy @ 2020-08-06 8:23 UTC
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
From: Christoph Hellwig <hch@lst.de>
Patch series "clean up address limit helpers", v2.
In preparation for eventually phasing out direct use of set_fs(), this
series removes the segment_eq() arch helper that is only used to implement
or duplicate the uaccess_kernel() API, and then adds descriptive helpers
to force the kernel address limit.
This patch (of 6):
Use the uaccess_kernel helper instead of duplicating it.
Link: http://lkml.kernel.org/r/20200714105505.935079-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200710135706.537715-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200710135706.537715-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
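Illustration only, not part of the patch: a sketch comparing the old and new
conditions used in addr_limit_user_check(). The function names and segment
values are hypothetical. In this model the two conditions agree whenever the
limit is exactly USER_DS or KERNEL_DS, which are the only two values the
series description mentions.

```c
#include <stdbool.h>

typedef struct {
	unsigned long seg;
} mm_segment_t;

#define KERNEL_DS	((mm_segment_t) { ~0UL })
#define USER_DS		((mm_segment_t) { 0x7fffffffUL })	/* illustrative limit */

/* Old condition: the limit is anything other than USER_DS. */
static bool limit_is_bad_old(mm_segment_t fs)
{
	return !(fs.seg == USER_DS.seg);
}

/* New condition: the limit is KERNEL_DS, i.e. uaccess_kernel() is true. */
static bool limit_is_bad_new(mm_segment_t fs)
{
	return fs.seg == KERNEL_DS.seg;
}
```

A true result corresponds to the case where the real helper reports the
invalid address limit and sends SIGKILL.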
include/linux/syscalls.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b951a87da987..e933a43d4a69 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -263,7 +263,7 @@ static inline void addr_limit_user_check(void)
return;
#endif
- if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
+ if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
"Invalid address limit on user-mode return"))
force_sig(SIGKILL);
--
2.25.0
* Re: [PATCH] powerpc/40x: Fix assembler warning about r0
From: Christophe Leroy @ 2020-08-06 8:26 UTC
To: Michael Ellerman, linuxppc-dev
In-Reply-To: <87o8noy0sc.fsf@mpe.ellerman.id.au>
On 06/08/2020 at 04:18, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> On 22/07/2020 at 04:24, Michael Ellerman wrote:
>>> The assembler says:
>>> arch/powerpc/kernel/head_40x.S:623: Warning: invalid register expression
>>
>> I get exactly the same with head_32.S, for the exact same reason.
>
> Ah yep, I see it. I mostly build pmac32_defconfig which doesn't have
> BDI_SWITCH enabled.
>
> Send a patch? :)
Done.
>
> Do we still need the BDI_SWITCH code? Is it likely anyone still has one,
> that works?
I have three (one for 83xx and two for 8xx) and they work, although I'm
using them only for U-Boot and for very early Linux boot
debugging (the last time I used one with Linux was when implementing KASAN).
Christophe
* Re: [PATCH v2] powerpc/uaccess: simplify the get_fs() set_fs() logic
From: Christophe Leroy @ 2020-08-06 8:29 UTC
To: Michael Ellerman, Christophe Leroy, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <87mu3nyh3w.fsf@mpe.ellerman.id.au>
On 25/07/2020 at 13:22, Michael Ellerman wrote:
> Hi Christophe,
>
> Unfortunately this would collide messily with "uaccess: remove
> segment_eq" in linux-next, so I'll ask you to do a respin based on that,
> some comments below.
Done, sent as v3, together with the 2 patches from linux-next needed to
get it to build and boot.
>
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> On powerpc, we only have USER_DS and KERNEL_DS
>>
>> Today, this is managed as an 'unsigned long' data space limit
>> which is used to compare the passed address with, plus a bit
>> in the thread_info flags that is set whenever modifying the limit
>> to enable the verification in addr_limit_user_check()
>>
>> The limit is either the last address of user space when USER_DS is
>> set, and the last address of address space when KERNEL_DS is set.
>> In both cases, the limit is a compiletime constant.
>>
>> get_fs() returns the limit, which is part of thread_info struct
>> set_fs() updates the limit then set the TI_FSCHECK flag.
>> addr_limit_user_check() check the flag, and if it is set it checks
>> the limit is the user limit, then unsets the TI_FSCHECK flag.
>>
>> In addition, when the flag is set the syscall exit work is involved.
>> This exit work is heavy compared to normal syscall exit as it goes
>> through normal exception exit instead of the fast syscall exit.
>>
>> Rename this TI_FSCHECK flag to TIF_KERNEL_DS flag which tells whether
>> KERNEL_DS or USER_DS is set. Get mm_segment_t be redifined as a bool
>> struct that is either false (for USER_DS) or true (for KERNEL_DS).
>> When TIF_KERNEL_DS is set, the limit is ~0UL. Otherwise it is
>> TASK_SIZE_USER (resp TASK_SIZE_USER64 on PPC64). When KERNEL_DS is
>> set, there is no range to check. Define TI_FSCHECK as an alias to
>> TIF_KERNEL_DS.
>
> I'd rather avoid the "DS" name any more than we have to. Maybe it means
> "data space" but that's not a very common term.
I thought it was a reference to the ds/fs/gs ... segment registers of
the 8086?
>
> The generic helper these days is called uaccess_kernel(), which returns
> true when uaccess routines are allowed to access the kernel.
>
> So calling it TIF_UACCESS_KERNEL would work I think?
ok
>
> The bool could be called uaccess_kernel.
> And END_OF_USER_DS could be USER_ADDR_MAX.
ok
>
>> On exit, involve exit work when the bit is set, i.e. when KERNEL_DS
>> is set. addr_limit_user_check() will clear the bit and kill the
>> user process.
>
> I guess this is safe. The check was added to make sure we never return
> to userspace with KERNEL_DS set, but using the actual TIF flag to
> determine the address limit should be equally safe, and avoid the
> overhead of the check in the good case.
That's the purpose indeed, yes.
christophe
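
For readers following the thread, the scheme under discussion can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: struct thread_state, addr_limit(), range_ok() and exit_check()
are hypothetical names, and the USER_ADDR_MAX value is a placeholder rather
than the real TASK_SIZE_USER. The point is that the limit is derived from a
single flag instead of being stored, and that the exit-time check only has
to test and clear that flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for TASK_SIZE_USER; the real value is platform specific. */
#define USER_ADDR_MAX ((uintptr_t)0x7fffffffUL)

/* Hypothetical per-thread state: one flag replaces the stored limit. */
struct thread_state {
	bool uaccess_kernel;	/* plays the role of the proposed TIF bit */
};

/* The limit is computed from the flag, so it never needs to be stored. */
static uintptr_t addr_limit(const struct thread_state *t)
{
	assert(t != NULL);
	return t->uaccess_kernel ? UINTPTR_MAX : USER_ADDR_MAX;
}

/* Overflow-safe check that [addr, addr + size) ends at or below the limit. */
static bool range_ok(const struct thread_state *t, uintptr_t addr, size_t size)
{
	uintptr_t limit = addr_limit(t);

	if (addr > limit)
		return false;
	return size == 0 || size - 1 <= limit - addr;
}

/*
 * Models the check on return to userspace: clears the flag and reports
 * whether it was still set, in which case the caller would kill the task.
 */
static bool exit_check(struct thread_state *t)
{
	bool leaked = t->uaccess_kernel;

	t->uaccess_kernel = false;
	return leaked;
}
```

In this model the common case (flag clear) pays nothing extra on exit,
which matches the motivation given for avoiding the heavy exit path.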
^ permalink raw reply
* [PATCH] powerpc/hwirq: Remove stale forward irq_chip declaration
From: Christophe Leroy @ 2020-08-06 12:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Since the commit identified below, the forward declaration of
struct irq_chip is useless (it was struct hw_interrupt_type at that time).
Remove it, together with the associated comment.
Fixes: c0ad90a32fb6 ("[PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/hw_irq.h | 6 ------
1 file changed, 6 deletions(-)
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 3a0db7b0b46e..538698facb80 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -372,12 +372,6 @@ static inline void may_hard_irq_enable(void) { }
#define ARCH_IRQ_INIT_FLAGS IRQ_NOREQUEST
-/*
- * interrupt-retrigger: should we handle this via lost interrupts and IPIs
- * or should we not care like we do now ? --BenH.
- */
-struct irq_chip;
-
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_HW_IRQ_H */
--
2.25.0
^ permalink raw reply related
* [PATCH] powerpc/irq: Drop forward declaration of struct irqaction
From: Christophe Leroy @ 2020-08-06 12:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Since the commit identified below, the forward declaration of
struct irqaction is useless. Drop it.
Fixes: b709c0832824 ("ppc64: move stack switching up in interrupt processing")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/irq.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
index 814dfab7e392..4f983ca4030a 100644
--- a/arch/powerpc/include/asm/irq.h
+++ b/arch/powerpc/include/asm/irq.h
@@ -35,7 +35,6 @@ static __inline__ int irq_canonicalize(int irq)
extern int distribute_irqs;
-struct irqaction;
struct pt_regs;
#define __ARCH_HAS_DO_SOFTIRQ
--
2.25.0
^ permalink raw reply related
* [PATCH 1/2] powerpc/fpu: Drop cvt_fd() and cvt_df()
From: Christophe Leroy @ 2020-08-06 12:20 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Those two functions have been unused since the commit identified below.
Drop them.
Fixes: 31bfdb036f12 ("powerpc: Use instruction emulation infrastructure to handle alignment faults")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/processor.h | 2 --
arch/powerpc/kernel/fpu.S | 15 ---------------
2 files changed, 17 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 5c20b6d509ae..9ebcb2f095db 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -425,8 +425,6 @@ extern void flush_instruction_cache(void);
extern void hard_reset_now(void);
extern void poweroff_now(void);
extern int fix_alignment(struct pt_regs *);
-extern void cvt_fd(float *from, double *to);
-extern void cvt_df(double *from, float *to);
extern void _nmask_and_or_msr(unsigned long nmask, unsigned long or_val);
#ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index 4ae39db70044..825893d4cb59 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -134,18 +134,3 @@ _GLOBAL(save_fpu)
mffs fr0
stfd fr0,FPSTATE_FPSCR(r6)
blr
-
-/*
- * These are used in the alignment trap handler when emulating
- * single-precision loads and stores.
- */
-
-_GLOBAL(cvt_fd)
- lfs 0,0(r3)
- stfd 0,0(r4)
- blr
-
-_GLOBAL(cvt_df)
- lfd 0,0(r3)
- stfs 0,0(r4)
- blr
--
2.25.0
^ permalink raw reply related
* [PATCH 2/2] powerpc: drop hard_reset_now() and poweroff_now() declaration
From: Christophe Leroy @ 2020-08-06 12:20 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <d5641ada199b8dd2af16ad00a66084cf974f2704.1596716418.git.christophe.leroy@csgroup.eu>
Those functions have never existed. Drop their declarations.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/processor.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 9ebcb2f095db..2b4dc10230da 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -422,8 +422,6 @@ extern void power9_idle_type(unsigned long stop_psscr_val,
unsigned long stop_psscr_mask);
extern void flush_instruction_cache(void);
-extern void hard_reset_now(void);
-extern void poweroff_now(void);
extern int fix_alignment(struct pt_regs *);
extern void _nmask_and_or_msr(unsigned long nmask, unsigned long or_val);
--
2.25.0
^ permalink raw reply related
* Re: [PATCH V5 0/4] powerpc/perf: Add support for perf extended regs in powerpc
From: Arnaldo Carvalho de Melo @ 2020-08-06 12:20 UTC (permalink / raw)
To: Athira Rajeev
Cc: Ravi Bangoria, Michael Neuling, maddy, kjain, linuxppc-dev,
Jiri Olsa, Jiri Olsa
In-Reply-To: <CA3D75F3-5F63-425B-A3C1-00C181E41108@linux.vnet.ibm.com>
Em Fri, Jul 31, 2020 at 11:04:14PM +0530, Athira Rajeev escreveu:
>
>
> > On 31-Jul-2020, at 1:20 AM, Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Thu, Jul 30, 2020 at 01:24:40PM +0530, Athira Rajeev wrote:
> >>
> >>
> >>> On 27-Jul-2020, at 10:46 PM, Athira Rajeev <atrajeev@linux.vnet.ibm.com> wrote:
> >>>
> >>> Patch set to add support for perf extended register capability in
> >>> powerpc. The capability flag PERF_PMU_CAP_EXTENDED_REGS is used to
> >>> indicate the PMU which supports extended registers. The generic code
> >>> defines the mask of extended registers as 0 for unsupported architectures.
> >>>
> >>> Patches 1 and 2 are the kernel side changes needed to include
> >>> base support for extended regs in powerpc and in power10.
> >>> Patches 3 and 4 are the perf tools side changes needed to support the
> >>> extended registers.
> >>>
> >>
> >> Hi Arnaldo, Jiri
> >>
> >> please let me know if you have any comments/suggestions on this patch series to add support for perf extended regs.
> >
> > hi,
> > can't really tell for powerpc, but in general
> > perf tool changes look ok
> >
>
> Hi Jiri,
> Thanks for checking the patchset.
So I'd say you submit a v6, split into the kernel part, that probably
should go via the PPC arch tree, and I can pick the tooling part, ok?
- Arnaldo
^ permalink raw reply
* Re: [PATCH 1/2] sched/topology: Allow archs to override cpu_smt_mask
From: Michael Ellerman @ 2020-08-06 12:25 UTC (permalink / raw)
To: peterz
Cc: Gautham R Shenoy, Michael Neuling, Vincent Guittot,
Srikar Dronamraju, Rik van Riel, linuxppc-dev, LKML,
Valentin Schneider, Thomas Gleixner, Mel Gorman, Ingo Molnar,
Dietmar Eggemann
In-Reply-To: <20200806085429.GX2674@hirez.programming.kicks-ass.net>
peterz@infradead.org writes:
> On Thu, Aug 06, 2020 at 03:32:25PM +1000, Michael Ellerman wrote:
>
>> That brings with it a bunch of problems, such as existing software that
>> has been developed/configured for Power8 and expects to see SMT8.
>>
>> We also allow LPARs to be live migrated from Power8 to Power9 (and back), so
>> maintaining the illusion of SMT8 is considered a requirement to make that work.
>
> So how does that work if the kernel booted on P9 and demuxed the SMT8
> into 2xSMT4? If you migrate that state onto a P8 with actual SMT8 you're
> toast again.
The SMT mask would be inaccurate on the P8, rather than the current case
where it's inaccurate on the P9.
Which would be our preference, because the backward migration case is
not common AIUI.
Or am I missing a reason we'd be even more toast than that?
Under PowerVM the kernel does know it's being migrated, so we could
actually update the mask, but I'm not sure if that's really feasible.
>> Yeah I agree the naming is confusing.
>>
>> Let's call them "SMT4 cores" and "SMT8 cores"?
>
> Works for me, thanks!
>
>> The problem is we are already lying to userspace, because firmware lies to us.
>>
>> ie. the firmware on these systems shows us an SMT8 core, and so current kernels
>> show SMT8 to userspace. I don't think we can realistically change that fact now,
>> as these systems are already out in the field.
>>
>> What this patch tries to do is undo some of the mess, and at least give the
>> scheduler the right information.
>
> What a mess... I think it depends on what you do with that P9 to P8
> migration case. Does it make sense to have a "p8_compat" boot arg for
> the case where you want LPAR migration back onto P8 systems -- in which
> case it simply takes the firmware's word as gospel and doesn't untangle
> things, because it can actually land on a P8.
We already get told by firmware that we're running in "p8 compat" mode,
because we have to pretend to userspace that it's running on a P8. So we
could use that as a signal to leave things alone.
But my understanding is most LPARs don't get migrated back and forth,
they'll start life on a P8 and only get migrated to a P9 once when the
customer gets a P9. They might then run for a long time (months to
years) on the P9 in P8 compat mode, not because they ever want to
migrate back to a real P8, but because the software in the LPAR is still
expecting to be on a P8.
I'm not a real expert on all the Enterprisey stuff though, so someone
else might be able to give us a better picture.
But the point of mentioning the migration stuff was mainly just to
explain why we feel we need to present SMT8 to userspace even on P9.
cheers
^ permalink raw reply
* Re: [PATCH v2] powerpc/uaccess: simplify the get_fs() set_fs() logic
From: Michael Ellerman @ 2020-08-06 12:33 UTC (permalink / raw)
To: Christophe Leroy, Christophe Leroy, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <2d0b777d-f1aa-08a1-f287-47ac68efbd99@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 25/07/2020 à 13:22, Michael Ellerman a écrit :
>> Hi Christophe,
>>
>> Unfortunately this would collide messily with "uaccess: remove
>> segment_eq" in linux-next, so I'll ask you to do a respin based on that,
>> some comments below.
>
> Done, sent as v3, together with the 2 patches from linux-next needed to
> get it to build and boot.
Thanks.
>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>> On powerpc, we only have USER_DS and KERNEL_DS
>>>
>>> Today, this is managed as an 'unsigned long' data space limit
>>> which is used to compare the passed address with, plus a bit
>>> in the thread_info flags that is set whenever modifying the limit
>>> to enable the verification in addr_limit_user_check()
>>>
>>> The limit is either the last address of user space when USER_DS is
>>> set, or the last address of address space when KERNEL_DS is set.
>>> In both cases, the limit is a compile-time constant.
>>>
>>> get_fs() returns the limit, which is part of the thread_info struct.
>>> set_fs() updates the limit then sets the TI_FSCHECK flag.
>>> addr_limit_user_check() checks the flag, and if it is set it checks
>>> the limit is the user limit, then unsets the TI_FSCHECK flag.
>>>
>>> In addition, when the flag is set the syscall exit work is involved.
>>> This exit work is heavy compared to normal syscall exit as it goes
>>> through normal exception exit instead of the fast syscall exit.
>>>
>>> Rename this TI_FSCHECK flag to TIF_KERNEL_DS flag which tells whether
>>> KERNEL_DS or USER_DS is set. Let mm_segment_t be redefined as a bool
>>> struct that is either false (for USER_DS) or true (for KERNEL_DS).
>>> When TIF_KERNEL_DS is set, the limit is ~0UL. Otherwise it is
>>> TASK_SIZE_USER (resp TASK_SIZE_USER64 on PPC64). When KERNEL_DS is
>>> set, there is no range to check. Define TI_FSCHECK as an alias to
>>> TIF_KERNEL_DS.
>>
>> I'd rather avoid the "DS" name any more than we have to. Maybe it means
>> "data space" but that's not a very common term.
>
> I thought it was a reference to the ds/fs/gs ... segment registers in
> the 8086 ?
Yes.
In your changelog you used "data space limit", so I thought you were
trying to retrospectively redefine the "DS" acronym to mean that.
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/40x: Fix assembler warning about r0
From: Michael Ellerman @ 2020-08-06 12:33 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev
In-Reply-To: <7faebe28-07f9-b5df-6b6d-a25342e2bcad@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 06/08/2020 à 04:18, Michael Ellerman a écrit :
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> Le 22/07/2020 à 04:24, Michael Ellerman a écrit :
>>>> The assembler says:
>>>> arch/powerpc/kernel/head_40x.S:623: Warning: invalid register expression
>>>
>>> I get exactly the same with head_32.S, for the exact same reason.
>>
>> Ah yep, I see it. I mostly build pmac32_defconfig which doesn't have
>> BDI_SWITCH enabled.
>>
>> Send a patch? :)
>
> Done.
>
>>
>> Do we still need the BDI_SWITCH code? Is it likely anyone still has one,
>> that works?
>
> I have three (one for 83xx and two for 8xx) and they work, although I'm
> using them only for U-Boot and for very very very early Linux boot
> debugging (last time I used one with Linux was when implementing KASAN).
OK, happy to keep the code around if it works and is being used, even
just a little bit.
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/book3s64/radix: Make radix_mem_block_size 64bit
From: Michael Ellerman @ 2020-08-06 12:34 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev; +Cc: Aneesh Kumar K.V
In-Reply-To: <20200806081415.208546-1-aneesh.kumar@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> Similar to commit: 89c140bbaeee ("pseries: Fix 64 bit logical memory block panic")
> make sure the different variables tracking lmb_size are updated
> to be 64 bit.
That commit went to all stable releases, should this one also?
cheers
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> index 55442d45c597..1a0c9d09950f 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> @@ -85,7 +85,7 @@ extern unsigned int mmu_base_pid;
> /*
> * memory block size used with radix translation.
> */
> -extern unsigned int __ro_after_init radix_mem_block_size;
> +extern unsigned long __ro_after_init radix_mem_block_size;
>
> #define PRTB_SIZE_SHIFT (mmu_pid_bits + 4)
> #define PRTB_ENTRIES (1ul << mmu_pid_bits)
> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
> index 17ccc6474ab6..07c158c5f939 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -21,7 +21,7 @@ struct drmem_lmb {
> struct drmem_lmb_info {
> struct drmem_lmb *lmbs;
> int n_lmbs;
> - u32 lmb_size;
> + u64 lmb_size;
> };
>
> extern struct drmem_lmb_info *drmem_info;
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 28c784976bed..ca76d9d6372a 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -34,7 +34,7 @@
>
> unsigned int mmu_pid_bits;
> unsigned int mmu_base_pid;
> -unsigned int radix_mem_block_size __ro_after_init;
> +unsigned long radix_mem_block_size __ro_after_init;
>
> static __ref void *early_alloc_pgtable(unsigned long size, int nid,
> unsigned long region_start, unsigned long region_end)
> --
> 2.26.2
^ permalink raw reply
* [RFC PATCH 1/3] powerpc/mem: Store the dt_root_size/addr cell values for later usage
From: Aneesh Kumar K.V @ 2020-08-06 12:36 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Nathan Lynch, Aneesh Kumar K.V, Hari Bathini
dt_root_addr_cells and dt_root_size_cells are __initdata variables.
So make a copy of the same which can be used post init.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/drmem.h | 2 ++
arch/powerpc/kernel/prom.c | 7 +++++++
arch/powerpc/mm/numa.c | 1 +
3 files changed, 10 insertions(+)
diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index 07c158c5f939..1f0eaf432755 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -123,4 +123,6 @@ static inline void lmb_clear_nid(struct drmem_lmb *lmb)
}
#endif
+extern int mem_addr_cells, mem_size_cells;
+
#endif /* _ASM_POWERPC_LMB_H */
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index d8a2fb87ba0c..9a1701e85747 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -73,6 +73,7 @@ u64 ppc64_rma_size;
#endif
static phys_addr_t first_memblock_size;
static int __initdata boot_cpu_count;
+int mem_addr_cells, mem_size_cells;
static int __init early_parse_mem(char *p)
{
@@ -536,6 +537,12 @@ static int __init early_init_dt_scan_memory_ppc(unsigned long node,
const char *uname,
int depth, void *data)
{
+ /*
+ * Make a copy from __initdata variable
+ */
+ mem_addr_cells = dt_root_addr_cells;
+ mem_size_cells = dt_root_size_cells;
+
#ifdef CONFIG_PPC_PSERIES
if (depth == 1 &&
strcmp(uname, "ibm,dynamic-reconfiguration-memory") == 0) {
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 058fee9a0835..77d41d9775d2 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -368,6 +368,7 @@ static void __init get_n_mem_cells(int *n_addr_cells, int *n_size_cells)
of_node_put(memory);
}
+/* dt_mem_next_cell is __init */
static unsigned long read_n_cells(int n, const __be32 **buf)
{
unsigned long result = 0;
--
2.26.2
^ permalink raw reply related
* [RFC PATCH 2/3] powerpc/numa: Use global variable instead of fetching again
From: Aneesh Kumar K.V @ 2020-08-06 12:36 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Nathan Lynch, Aneesh Kumar K.V, Hari Bathini
In-Reply-To: <20200806123604.248361-1-aneesh.kumar@linux.ibm.com>
Use mem_addr_cells/mem_size_cells instead of fetching the values
again from device tree.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/mm/numa.c | 36 ++++++++++--------------------------
1 file changed, 10 insertions(+), 26 deletions(-)
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 77d41d9775d2..c420872acd61 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -52,7 +52,6 @@ EXPORT_SYMBOL(node_to_cpumask_map);
EXPORT_SYMBOL(node_data);
static int min_common_depth;
-static int n_mem_addr_cells, n_mem_size_cells;
static int form1_affinity;
#define MAX_DISTANCE_REF_POINTS 4
@@ -355,19 +354,6 @@ static int __init find_min_common_depth(void)
return -1;
}
-static void __init get_n_mem_cells(int *n_addr_cells, int *n_size_cells)
-{
- struct device_node *memory = NULL;
-
- memory = of_find_node_by_type(memory, "memory");
- if (!memory)
- panic("numa.c: No memory nodes found!");
-
- *n_addr_cells = of_n_addr_cells(memory);
- *n_size_cells = of_n_size_cells(memory);
- of_node_put(memory);
-}
-
/* dt_mem_next_cell is __init */
static unsigned long read_n_cells(int n, const __be32 **buf)
{
@@ -639,12 +625,12 @@ static inline int __init read_usm_ranges(const __be32 **usm)
* a counter followed by that many (base, size) duple.
* read the counter from linux,drconf-usable-memory
*/
- return read_n_cells(n_mem_size_cells, usm);
+ return read_n_cells(mem_size_cells, usm);
}
/*
* Extract NUMA information from the ibm,dynamic-reconfiguration-memory
- * node. This assumes n_mem_{addr,size}_cells have been set.
+ * node. This assumes mem_{addr,size}_cells have been set.
*/
static int __init numa_setup_drmem_lmb(struct drmem_lmb *lmb,
const __be32 **usm,
@@ -677,8 +663,8 @@ static int __init numa_setup_drmem_lmb(struct drmem_lmb *lmb,
do {
if (is_kexec_kdump) {
- base = read_n_cells(n_mem_addr_cells, usm);
- size = read_n_cells(n_mem_size_cells, usm);
+ base = read_n_cells(mem_addr_cells, usm);
+ size = read_n_cells(mem_size_cells, usm);
}
nid = of_drconf_to_nid_single(lmb);
@@ -741,8 +727,6 @@ static int __init parse_numa_properties(void)
node_set_online(nid);
}
- get_n_mem_cells(&n_mem_addr_cells, &n_mem_size_cells);
-
for_each_node_by_type(memory, "memory") {
unsigned long start;
unsigned long size;
@@ -759,11 +743,11 @@ static int __init parse_numa_properties(void)
continue;
/* ranges in cell */
- ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells);
+ ranges = (len >> 2) / (mem_addr_cells + mem_size_cells);
new_range:
/* these are order-sensitive, and modify the buffer pointer */
- start = read_n_cells(n_mem_addr_cells, &memcell_buf);
- size = read_n_cells(n_mem_size_cells, &memcell_buf);
+ start = read_n_cells(mem_addr_cells, &memcell_buf);
+ size = read_n_cells(mem_size_cells, &memcell_buf);
/*
* Assumption: either all memory nodes or none will
@@ -1042,11 +1026,11 @@ static int hot_add_node_scn_to_nid(unsigned long scn_addr)
continue;
/* ranges in cell */
- ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells);
+ ranges = (len >> 2) / (mem_addr_cells + mem_size_cells);
while (ranges--) {
- start = read_n_cells(n_mem_addr_cells, &memcell_buf);
- size = read_n_cells(n_mem_size_cells, &memcell_buf);
+ start = read_n_cells(mem_addr_cells, &memcell_buf);
+ size = read_n_cells(mem_size_cells, &memcell_buf);
if ((scn_addr < start) || (scn_addr >= (start + size)))
continue;
--
2.26.2
^ permalink raw reply related
* [RFC PATCH 3/3] powerpc/lmb-size: Use addr #size-cells value when fetching lmb-size
From: Aneesh Kumar K.V @ 2020-08-06 12:36 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Nathan Lynch, Aneesh Kumar K.V, Hari Bathini
In-Reply-To: <20200806123604.248361-1-aneesh.kumar@linux.ibm.com>
Make it consistent with other usages.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/mm/book3s64/radix_pgtable.c | 7 ++++---
arch/powerpc/platforms/pseries/hotplug-memory.c | 10 ++++++----
2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index ca76d9d6372a..a48e6618a27b 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -497,7 +497,7 @@ static int __init probe_memory_block_size(unsigned long node, const char *uname,
depth, void *data)
{
unsigned long *mem_block_size = (unsigned long *)data;
- const __be64 *prop;
+ const __be32 *prop;
int len;
if (depth != 1)
@@ -507,13 +507,14 @@ static int __init probe_memory_block_size(unsigned long node, const char *uname,
return 0;
prop = of_get_flat_dt_prop(node, "ibm,lmb-size", &len);
- if (!prop || len < sizeof(__be64))
+
+ if (!prop || len < dt_root_size_cells * sizeof(__be32))
/*
* Nothing in the device tree
*/
*mem_block_size = MIN_MEMORY_BLOCK_SIZE;
else
- *mem_block_size = be64_to_cpup(prop);
+ *mem_block_size = of_read_number(prop, dt_root_size_cells);
return 1;
}
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 5d545b78111f..aba23ef8dfdd 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -30,12 +30,14 @@ unsigned long pseries_memory_block_size(void)
np = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
if (np) {
- const __be64 *size;
+ int len;
+ const __be32 *prop;
- size = of_get_property(np, "ibm,lmb-size", NULL);
- if (size)
- memblock_size = be64_to_cpup(size);
+ prop = of_get_property(np, "ibm,lmb-size", &len);
+ if (prop && len >= mem_size_cells * sizeof(__be32))
+ memblock_size = of_read_number(prop, mem_size_cells);
of_node_put(np);
+
} else if (machine_is(pseries)) {
/* This fallback really only applies to pseries */
unsigned int memzero_size = 0;
--
2.26.2
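
The idea behind reading "ibm,lmb-size" according to #size-cells can be
sketched outside the kernel as follows. This is a userspace stand-in, not
the kernel's of_read_number(): the names cell_be32() and read_cells() are
hypothetical, and the property is modelled as a raw byte buffer. It shows
the two things the patch changes: the length check scales with the number
of cells, and the value is assembled from 32-bit big-endian cells rather
than assumed to be a single 64-bit quantity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell, independent of host endianness. */
static uint32_t cell_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Combine n_cells consecutive cells, most significant first.
 * Sets *ok to 0 and returns 0 if the property is missing or too short.
 */
static uint64_t read_cells(const unsigned char *prop, size_t len,
			   int n_cells, int *ok)
{
	uint64_t val = 0;
	int i;

	/* Only one or two cells fit in a 64-bit result. */
	assert(ok != NULL && n_cells >= 1 && n_cells <= 2);

	if (prop == NULL || len < (size_t)n_cells * 4) {
		*ok = 0;
		return 0;
	}
	for (i = 0; i < n_cells; i++)
		val = (val << 32) | cell_be32(prop + 4 * i);
	*ok = 1;
	return val;
}
```

A caller would fall back to a default block size when *ok is 0, in the
same spirit as the MIN_MEMORY_BLOCK_SIZE fallback in the patch.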
^ permalink raw reply related
* Re: [PATCH] ASoC: fsl-asoc-card: Get "extal" clock rate by clk_get_rate
From: Mark Brown @ 2020-08-06 12:37 UTC (permalink / raw)
To: Shengjiu Wang
Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, lgirdwood,
perex, nicoleotsuka, festevam, linux-kernel
In-Reply-To: <1596699585-27429-1-git-send-email-shengjiu.wang@nxp.com>
On Thu, Aug 06, 2020 at 03:39:45PM +0800, Shengjiu Wang wrote:
> } else if (of_node_name_eq(cpu_np, "esai")) {
> + struct clk *esai_clk = clk_get(&cpu_pdev->dev, "extal");
> +
> + if (!IS_ERR(esai_clk)) {
> + priv->cpu_priv.sysclk_freq[TX] = clk_get_rate(esai_clk);
> + priv->cpu_priv.sysclk_freq[RX] = clk_get_rate(esai_clk);
> + clk_put(esai_clk);
> + }
This should handle probe deferral. Also if this clock is in use
shouldn't we be enabling it? It looks like it's intended to be a
crystal so it's probably forced on all the time but sometimes there's
power control for crystals, or perhaps someone might do something
unusual with the hardware.
^ permalink raw reply
* Re: [PATCH v2 2/2] powerpc/pseries: new lparcfg key/value pair: partition_affinity_score
From: Michael Ellerman @ 2020-08-06 12:44 UTC (permalink / raw)
To: Tyrel Datwyler, Scott Cheloha, linuxppc-dev; +Cc: Nathan Lynch
In-Reply-To: <bc9858c9-7d55-88c0-1f85-157af48e1d8c@linux.ibm.com>
Tyrel Datwyler <tyreld@linux.ibm.com> writes:
> On 7/27/20 11:46 AM, Scott Cheloha wrote:
>> The H_GetPerformanceCounterInfo (GPCI) PHYP hypercall has a subcall,
>> Affinity_Domain_Info_By_Partition, which returns, among other things,
>> a "partition affinity score" for a given LPAR. This score, a value on
>> [0-100], represents the processor-memory affinity for the LPAR in
>> question. A score of 0 indicates the worst possible affinity while a
>> score of 100 indicates perfect affinity. The score can be used to
>> reason about performance.
>>
>> This patch adds the score for the local LPAR to the lparcfg procfile
>> under a new 'partition_affinity_score' key.
>>
>> Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
>
> I was hoping Michael would chime in the first time around on this patch series
> about adding another key/value pair to lparcfg.
That guy is so unreliable.
I don't love adding new stuff in lparcfg, but given the file already
exists and there's no prospect of removing it, it's probably not worth
the effort to put the new field anywhere else.
My other query with this was how on earth anyone is meant to interpret
the metric. ie. if my metric is 50, what does that mean? If it's 90
should I worry?
Which makes me realise we have no documentation for lparcfg in the
kernel at all.
So it would be nice to have it mentioned somewhere in Documentation,
even if it just points to the manpage in powerpc-ibm-utils.
cheers
> So, barring a NACK from mpe:
>
> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
>
>> ---
>> arch/powerpc/platforms/pseries/lparcfg.c | 35 ++++++++++++++++++++++++
>> 1 file changed, 35 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/lparcfg.c b/arch/powerpc/platforms/pseries/lparcfg.c
>> index b8d28ab88178..e278390ab28d 100644
>> --- a/arch/powerpc/platforms/pseries/lparcfg.c
>> +++ b/arch/powerpc/platforms/pseries/lparcfg.c
>> @@ -136,6 +136,39 @@ static unsigned int h_get_ppp(struct hvcall_ppp_data *ppp_data)
>> return rc;
>> }
>>
>> +static void show_gpci_data(struct seq_file *m)
>> +{
>> + struct hv_gpci_request_buffer *buf;
>> + unsigned int affinity_score;
>> + long ret;
>> +
>> + buf = kmalloc(sizeof(*buf), GFP_KERNEL);
>> + if (buf == NULL)
>> + return;
>> +
>> + /*
>> + * Show the local LPAR's affinity score.
>> + *
>> + * 0xB1 selects the Affinity_Domain_Info_By_Partition subcall.
>> + * The score is at byte 0xB in the output buffer.
>> + */
>> + memset(&buf->params, 0, sizeof(buf->params));
>> + buf->params.counter_request = cpu_to_be32(0xB1);
>> + buf->params.starting_index = cpu_to_be32(-1); /* local LPAR */
>> + buf->params.counter_info_version_in = 0x5; /* v5+ for score */
>> + ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO, virt_to_phys(buf),
>> + sizeof(*buf));
>> + if (ret != H_SUCCESS) {
>> + pr_debug("hcall failed: H_GET_PERF_COUNTER_INFO: %ld, %x\n",
>> + ret, be32_to_cpu(buf->params.detail_rc));
>> + goto out;
>> + }
>> + affinity_score = buf->bytes[0xB];
>> + seq_printf(m, "partition_affinity_score=%u\n", affinity_score);
>> +out:
>> + kfree(buf);
>> +}
>> +
>> static unsigned h_pic(unsigned long *pool_idle_time,
>> unsigned long *num_procs)
>> {
>> @@ -487,6 +520,8 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
>> partition_active_processors * 100);
>> }
>>
>> + show_gpci_data(m);
>> +
>> seq_printf(m, "partition_active_processors=%d\n",
>> partition_active_processors);
>>
>>
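
As an illustration of how a userspace consumer might pick up the new key,
here is a small sketch that scans lparcfg-style "key=value" text. The
function name parse_affinity_score() is hypothetical and the input is
passed as a string rather than read from the procfile. It only extracts
the number; it does not answer the question of how a given score should
be interpreted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look for "partition_affinity_score=" at the start of any line.
 * Returns the score (0..100), or -1 if the key is absent or malformed.
 */
static int parse_affinity_score(const char *text)
{
	static const char key[] = "partition_affinity_score=";
	const size_t klen = sizeof(key) - 1;
	const char *line = text;

	assert(text != NULL);

	while (line != NULL && *line != '\0') {
		unsigned int score;

		if (strncmp(line, key, klen) == 0) {
			/* The patch documents the range as [0-100]. */
			if (sscanf(line + klen, "%u", &score) == 1 &&
			    score <= 100)
				return (int)score;
			return -1;
		}
		/* Advance to the character after the next newline. */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return -1;
}
```

Returning -1 for a missing key lets the same code run on kernels without
the patch, or where the hcall failed and the line was not printed.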
^ permalink raw reply
* Re: [PATCH] powerpc/book3s64/radix: Make radix_mem_block_size 64bit
From: Aneesh Kumar K.V @ 2020-08-06 12:44 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev
In-Reply-To: <874kpgymty.fsf@mpe.ellerman.id.au>
Michael Ellerman <mpe@ellerman.id.au> writes:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>> Similar to commit: 89c140bbaeee ("pseries: Fix 64 bit logical memory block panic")
>> make sure the different variables tracking lmb_size are updated
>> to be 64 bit.
>
> That commit went to all stable releases, should this one also?
>
radix_mem_block_size got added recently and it is not yet upstream. But
the drmem_lmb_info change can be a stable candidate. We also need the
change below. Shall I split this into two patches?
modified arch/powerpc/include/asm/drmem.h
@@ -67,7 +67,7 @@ struct of_drconf_cell_v2 {
#define DRCONF_MEM_RESERVED 0x00000080
#define DRCONF_MEM_HOTREMOVABLE 0x00000100
-static inline u32 drmem_lmb_size(void)
+static inline u64 drmem_lmb_size(void)
{
return drmem_info->lmb_size;
}
-aneesh
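
Why the field and accessor widths matter can be shown with a tiny
standalone sketch. The struct and function names here are hypothetical,
and the 4 GiB value is just an example of a size that needs more than
32 bits; only the u32 versus u64 distinction comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Field width before the change: the upper 32 bits are silently dropped. */
struct lmb_info_old {
	uint32_t lmb_size;
};

/* Field width after the change. */
struct lmb_info_new {
	uint64_t lmb_size;
};

/* Store a size in the old layout and read it back. */
static uint64_t roundtrip_old(uint64_t size)
{
	struct lmb_info_old info;

	assert(size != 0);		/* a zero block size is not meaningful */
	info.lmb_size = (uint32_t)size;	/* truncates modulo 2^32 */
	return info.lmb_size;
}

/* Store a size in the new layout and read it back. */
static uint64_t roundtrip_new(uint64_t size)
{
	struct lmb_info_new info;

	assert(size != 0);
	info.lmb_size = size;
	return info.lmb_size;
}
```

With the old layout a 4 GiB block size reads back as 0, while sizes below
4 GiB are unaffected, which is why the problem only shows up with large
blocks. A u32 return type on the accessor would truncate in the same way
even once the field itself is widened.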
^ permalink raw reply
* [PATCH] powerpc/perf: Account for interrupts during PMC overflow for an invalid SIAR check
From: Athira Rajeev @ 2020-08-06 12:46 UTC (permalink / raw)
To: mpe; +Cc: aik, maddy, linuxppc-dev
The performance monitor interrupt handler checks whether any counter has
overflowed and calls `record_and_restart` in core-book3s, which invokes
`perf_event_overflow` to record the sample information.
Apart from creating the sample, perf_event_overflow also does the interrupt
and period checks via perf_event_account_interrupt.
Currently we record information only if the SIAR valid bit is set
(using the `siar_valid` check), and hence the interrupt check is done only
in that case.
But it is possible that we do sampling for some events that do not
generate a valid SIAR, and then there is no chance to disable the event
if the number of interrupts exceeds max_samples_per_tick. This leads to a
soft lockup.
Fix this by adding perf_event_account_interrupt in the invalid SIAR
code path for a sampling event, i.e. if SIAR is invalid, just do the
interrupt check and don't record the sample information.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
arch/powerpc/perf/core-book3s.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 01d7028..626e587 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2101,6 +2101,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
if (perf_event_overflow(event, &data, regs))
power_pmu_stop(event, 0);
+ } else if (period) {
+ /* Account for interrupt in case of invalid siar */
+ if (perf_event_account_interrupt(event))
+ power_pmu_stop(event, 0);
}
}
--
1.8.3.1
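
The effect of the change can be modelled with a simplified userspace
sketch. Everything here is a stand-in: struct fake_event,
account_interrupt() and overflow() are hypothetical, and the real
throttling logic in the perf core is more involved than a single counter.
The sketch only shows why an event that never produces a valid SIAR was
never throttled before, and is throttled once the accounting is done on
that path too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-event state relevant to throttling. */
struct fake_event {
	unsigned int interrupts;	/* interrupts accounted in this tick */
	unsigned int max_per_tick;	/* stands in for max_samples_per_tick */
	unsigned int samples;		/* samples actually recorded */
	bool stopped;			/* set once the event is throttled */
};

/* Returns true when the event exceeded its budget and should be stopped. */
static bool account_interrupt(struct fake_event *e)
{
	return ++e->interrupts > e->max_per_tick;
}

/* One PMC overflow; 'fixed' selects the behaviour with the patch applied. */
static void overflow(struct fake_event *e, bool siar_valid, bool fixed)
{
	assert(e != NULL);

	if (e->stopped)
		return;

	if (siar_valid) {
		e->samples++;			/* record the sample ... */
		if (account_interrupt(e))	/* ... and do the throttle check */
			e->stopped = true;
	} else if (fixed) {
		if (account_interrupt(e))	/* no sample, but still throttle */
			e->stopped = true;
	}
	/* Without the fix, an invalid SIAR is neither recorded nor counted. */
}
```

Feeding the model a burst of overflows with siar_valid false leaves the
event running forever when fixed is false, and stops it after the budget
is exceeded when fixed is true.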
^ permalink raw reply related