* [PATCH] powerpc/pseries: Fix update of LPAR security flavor after LPM
From: Laurent Dufour @ 2021-08-05 15:23 UTC (permalink / raw)
To: mpe, benh, paulus; +Cc: linuxppc-dev, linux-kernel, stable
After LPM, when migrating from a system with security mitigation enabled to
a system with mitigation disabled, the security flavor exposed in /proc is
not correctly set back to 0.
Do not assume the value of the security flavor is set to 0 when entering
init_cpu_char_feature_flags(), so when called after a LPM, the value is set
correctly even if the mitigation are not turned off.
Fixes: 6ce56e1ac380 ("powerpc/pseries: export LPAR security flavor in
lparcfg")
Cc: stable@vger.kernel.org # 5.13.x
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
arch/powerpc/platforms/pseries/setup.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 6b0886668465..0dfaa6ab44cc 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -539,9 +539,10 @@ static void init_cpu_char_feature_flags(struct h_cpu_char_result *result)
* H_CPU_BEHAV_FAVOUR_SECURITY_H could be set only if
* H_CPU_BEHAV_FAVOUR_SECURITY is.
*/
- if (!(result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY))
+ if (!(result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY)) {
security_ftr_clear(SEC_FTR_FAVOUR_SECURITY);
- else if (result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY_H)
+ pseries_security_flavor = 0;
+ } else if (result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY_H)
pseries_security_flavor = 1;
else
pseries_security_flavor = 2;
--
2.32.0
^ permalink raw reply related
* [Bug 213961] Oops while loading radeon driver
From: bugzilla-daemon @ 2021-08-05 11:43 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <bug-213961-206035@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=213961
Elimar Riesebieter (riesebie@lxtec.de) changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|PPC-32 |Video(Other)
Product|Platform Specific/Hardware |Drivers
Regression|No |Yes
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
^ permalink raw reply
* Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
From: Xianting Tian @ 2021-08-05 11:14 UTC (permalink / raw)
To: Jiri Slaby, gregkh, amit, arnd, osandov
Cc: linuxppc-dev, linux-kernel, virtualization
In-Reply-To: <f1b92c7d-0eaf-4eac-ecd2-fbb74fb63b52@kernel.org>
在 2021/8/5 下午4:09, Jiri Slaby 写道:
> On 05. 08. 21, 9:58, Jiri Slaby wrote:
>> Hi,
>>
>> On 04. 08. 21, 4:54, Xianting Tian wrote:
>>> @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno,
>>> int data,
>>> hp->outbuf_size = outbuf_size;
>>> hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
>
> This deserves cleanup too. Why is "outbuf" not "char outbuf[0]
> __ALIGNED__" at the end of the structure? The allocation would be
> easier (using struct_size()) and this line would be gone completely.
I will make the cleanup in v4.
>
>>> + /*
>>> + * hvc_con_outbuf is guaranteed to be aligned at least to the
>>> + * size(N_OUTBUF) by kmalloc().
>>> + */
>>> + hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL);
>>> + if (!hp->hvc_con_outbuf)
>>> + return ERR_PTR(-ENOMEM);
>>
>> This leaks hp, right?
>
> Actually, why don't you make
> char c[N_OUTBUF] __ALIGNED__;
>
> part of struct hvc_struct directly?
thanks, it a good idea, I will change it in v4.
>
>> BTW your 2 patches are still not threaded, that is hard to follow.
>>
>>> +
>>> + spin_lock_init(&hp->hvc_con_lock);
>>> +
>>> tty_port_init(&hp->port);
>>> hp->port.ops = &hvc_port_ops;
>>
>> thanks,
^ permalink raw reply
* [Bug 213961] Oops while loading radeon driver
From: bugzilla-daemon @ 2021-08-05 11:00 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <bug-213961-206035@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=213961
--- Comment #10 from Christophe Leroy (christophe.leroy@csgroup.eu) ---
(In reply to Elimar Riesebieter from comment #9)
> Well, 5.13.8 just runs fine, though.
Yes, recent changes to Radeon appear for the first time in v5.14-rc1
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
^ permalink raw reply
* [Bug 213961] Oops while loading radeon driver
From: bugzilla-daemon @ 2021-08-05 10:52 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <bug-213961-206035@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=213961
--- Comment #9 from Elimar Riesebieter (riesebie@lxtec.de) ---
Well, 5.13.8 just runs fine, though.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
^ permalink raw reply
* Re: [RFC PATCH 3/4] powerpc: Optimize register usage for dear register
From: Christophe Leroy @ 2021-08-05 10:09 UTC (permalink / raw)
To: sxwjean, linuxppc-dev
Cc: ravi.bangoria, Xiongwei Song, oleg, npiggin, linux-kernel,
efremov, paulus, aneesh.kumar, peterx, akpm, sandipan
In-Reply-To: <20210726143053.532839-3-sxwjean@me.com>
Le 26/07/2021 à 16:30, sxwjean@me.com a écrit :
> From: Xiongwei Song <sxwjean@gmail.com>
>
> Create an anonymous union for dar and dear regsiters, we can reference
> dear to get the effective address when CONFIG_4xx=y or CONFIG_BOOKE=y.
> Otherwise, reference dar. This makes code more clear.
Same comment here as for patch 1.
>
> Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
> ---
> arch/powerpc/include/asm/ptrace.h | 5 ++++-
> arch/powerpc/include/uapi/asm/ptrace.h | 5 ++++-
> arch/powerpc/kernel/process.c | 2 +-
> arch/powerpc/kernel/ptrace/ptrace.c | 2 ++
> arch/powerpc/kernel/traps.c | 5 ++++-
> arch/powerpc/mm/fault.c | 2 +-
> 6 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
> index c252d04b1206..fa725e3238c2 100644
> --- a/arch/powerpc/include/asm/ptrace.h
> +++ b/arch/powerpc/include/asm/ptrace.h
> @@ -43,7 +43,10 @@ struct pt_regs
> unsigned long mq;
> #endif
> unsigned long trap;
> - unsigned long dar;
> + union {
> + unsigned long dar;
> + unsigned long dear;
> + };
> union {
> unsigned long dsisr;
> unsigned long esr;
> diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
> index e357288b5f34..9ae150fb4c4b 100644
> --- a/arch/powerpc/include/uapi/asm/ptrace.h
> +++ b/arch/powerpc/include/uapi/asm/ptrace.h
> @@ -52,7 +52,10 @@ struct pt_regs
> unsigned long trap; /* Reason for being here */
> /* N.B. for critical exceptions on 4xx, the dar and dsisr
> fields are overloaded to hold srr0 and srr1. */
> - unsigned long dar; /* Fault registers */
> + union {
> + unsigned long dar; /* Fault registers */
> + unsigned long dear;
> + };
> union {
> unsigned long dsisr; /* on Book-S used for DSISR */
> unsigned long esr; /* on 4xx/Book-E used for ESR */
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index f74af8f9133c..50436b52c213 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1499,7 +1499,7 @@ static void __show_regs(struct pt_regs *regs)
> trap == INTERRUPT_DATA_STORAGE ||
> trap == INTERRUPT_ALIGNMENT) {
> if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
> - pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->esr);
> + pr_cont("DEAR: "REG" ESR: "REG" ", regs->dear, regs->esr);
> else
> pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, regs->dsisr);
> }
> diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
> index 00789ad2c4a3..969dca8b0718 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace.c
> @@ -373,6 +373,8 @@ void __init pt_regs_check(void)
> offsetof(struct user_pt_regs, trap));
> BUILD_BUG_ON(offsetof(struct pt_regs, dar) !=
> offsetof(struct user_pt_regs, dar));
> + BUILD_BUG_ON(offsetof(struct pt_regs, dear) !=
> + offsetof(struct user_pt_regs, dear));
> BUILD_BUG_ON(offsetof(struct pt_regs, dsisr) !=
> offsetof(struct user_pt_regs, dsisr));
> BUILD_BUG_ON(offsetof(struct pt_regs, esr) !=
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 2164f5705a0b..0796630d3d23 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1609,7 +1609,10 @@ DEFINE_INTERRUPT_HANDLER(alignment_exception)
> }
> bad:
> if (user_mode(regs))
> - _exception(sig, regs, code, regs->dar);
> + if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
> + _exception(sig, regs, code, regs->dear);
> + else
> + _exception(sig, regs, code, regs->dar);
> else
> bad_page_fault(regs, sig);
> }
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 62953d4e7c93..3db6b39a1178 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -542,7 +542,7 @@ static __always_inline void __do_page_fault(struct pt_regs *regs)
> long err;
>
> if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
> - err = ___do_page_fault(regs, regs->dar, regs->esr);
> + err = ___do_page_fault(regs, regs->dear, regs->esr);
> else
> err = ___do_page_fault(regs, regs->dar, regs->dsisr);
>
>
^ permalink raw reply
* Re: [RFC PATCH 1/4] powerpc: Optimize register usage for esr register
From: Christophe Leroy @ 2021-08-05 10:06 UTC (permalink / raw)
To: sxwjean, linuxppc-dev
Cc: ravi.bangoria, Xiongwei Song, oleg, npiggin, linux-kernel,
efremov, paulus, aneesh.kumar, peterx, akpm, sandipan
In-Reply-To: <20210726143053.532839-1-sxwjean@me.com>
Le 26/07/2021 à 16:30, sxwjean@me.com a écrit :
> From: Xiongwei Song <sxwjean@gmail.com>
>
> Create an anonymous union for dsisr and esr regsiters, we can reference
> esr to get the exception detail when CONFIG_4xx=y or CONFIG_BOOKE=y.
> Otherwise, reference dsisr. This makes code more clear.
I'm not sure it is worth doing that.
What is the point in doing the following when you know that regs->esr and regs->dsisr are exactly
the same:
> - err = ___do_page_fault(regs, regs->dar, regs->dsisr);
> + if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
> + err = ___do_page_fault(regs, regs->dar, regs->esr);
> + else
> + err = ___do_page_fault(regs, regs->dar, regs->dsisr);
> +
Or even
> - int is_write = page_fault_is_write(regs->dsisr);
> + unsigned long err_reg;
> + int is_write;
> +
> + if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
> + err_reg = regs->esr;
> + else
> + err_reg = regs->dsisr;
> +
> + is_write = page_fault_is_write(err_reg);
Artificially growing the code for that makes no sense to me.
To avoid anbiguity, maybe the best would be to rename regs->dsisr to something like regs->sr , so
that we know it represents the status register, which is DSISR or ESR depending on the platform.
>
> Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
> ---
> arch/powerpc/include/asm/ptrace.h | 5 ++++-
> arch/powerpc/include/uapi/asm/ptrace.h | 5 ++++-
> arch/powerpc/kernel/process.c | 2 +-
> arch/powerpc/kernel/ptrace/ptrace.c | 2 ++
> arch/powerpc/kernel/traps.c | 2 +-
> arch/powerpc/mm/fault.c | 16 ++++++++++++++--
> arch/powerpc/platforms/44x/machine_check.c | 4 ++--
> arch/powerpc/platforms/4xx/machine_check.c | 2 +-
> 8 files changed, 29 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
> index 3e5d470a6155..c252d04b1206 100644
> --- a/arch/powerpc/include/asm/ptrace.h
> +++ b/arch/powerpc/include/asm/ptrace.h
> @@ -44,7 +44,10 @@ struct pt_regs
> #endif
> unsigned long trap;
> unsigned long dar;
> - unsigned long dsisr;
> + union {
> + unsigned long dsisr;
> + unsigned long esr;
> + };
> unsigned long result;
> };
> };
> diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
> index 7004cfea3f5f..e357288b5f34 100644
> --- a/arch/powerpc/include/uapi/asm/ptrace.h
> +++ b/arch/powerpc/include/uapi/asm/ptrace.h
> @@ -53,7 +53,10 @@ struct pt_regs
> /* N.B. for critical exceptions on 4xx, the dar and dsisr
> fields are overloaded to hold srr0 and srr1. */
> unsigned long dar; /* Fault registers */
> - unsigned long dsisr; /* on 4xx/Book-E used for ESR */
> + union {
> + unsigned long dsisr; /* on Book-S used for DSISR */
> + unsigned long esr; /* on 4xx/Book-E used for ESR */
> + };
> unsigned long result; /* Result of a system call */
> };
>
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 185beb290580..f74af8f9133c 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1499,7 +1499,7 @@ static void __show_regs(struct pt_regs *regs)
> trap == INTERRUPT_DATA_STORAGE ||
> trap == INTERRUPT_ALIGNMENT) {
> if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
> - pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->dsisr);
> + pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->esr);
> else
> pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, regs->dsisr);
> }
> diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
> index 0a0a33eb0d28..00789ad2c4a3 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace.c
> @@ -375,6 +375,8 @@ void __init pt_regs_check(void)
> offsetof(struct user_pt_regs, dar));
> BUILD_BUG_ON(offsetof(struct pt_regs, dsisr) !=
> offsetof(struct user_pt_regs, dsisr));
> + BUILD_BUG_ON(offsetof(struct pt_regs, esr) !=
> + offsetof(struct user_pt_regs, esr));
> BUILD_BUG_ON(offsetof(struct pt_regs, result) !=
> offsetof(struct user_pt_regs, result));
>
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index dfbce527c98e..2164f5705a0b 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -562,7 +562,7 @@ static inline int check_io_access(struct pt_regs *regs)
> #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> /* On 4xx, the reason for the machine check or program exception
> is in the ESR. */
> -#define get_reason(regs) ((regs)->dsisr)
> +#define get_reason(regs) ((regs)->esr)
> #define REASON_FP ESR_FP
> #define REASON_ILLEGAL (ESR_PIL | ESR_PUO)
> #define REASON_PRIVILEGED ESR_PPR
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index a8d0ce85d39a..62953d4e7c93 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -541,7 +541,11 @@ static __always_inline void __do_page_fault(struct pt_regs *regs)
> {
> long err;
>
> - err = ___do_page_fault(regs, regs->dar, regs->dsisr);
> + if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
> + err = ___do_page_fault(regs, regs->dar, regs->esr);
> + else
> + err = ___do_page_fault(regs, regs->dar, regs->dsisr);
> +
> if (unlikely(err))
> bad_page_fault(regs, err);
> }
> @@ -567,7 +571,15 @@ NOKPROBE_SYMBOL(hash__do_page_fault);
> */
> static void __bad_page_fault(struct pt_regs *regs, int sig)
> {
> - int is_write = page_fault_is_write(regs->dsisr);
> + unsigned long err_reg;
> + int is_write;
> +
> + if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
> + err_reg = regs->esr;
> + else
> + err_reg = regs->dsisr;
> +
> + is_write = page_fault_is_write(err_reg);
>
> /* kernel has accessed a bad area */
>
> diff --git a/arch/powerpc/platforms/44x/machine_check.c b/arch/powerpc/platforms/44x/machine_check.c
> index a5c898bb9bab..5d19daacd78a 100644
> --- a/arch/powerpc/platforms/44x/machine_check.c
> +++ b/arch/powerpc/platforms/44x/machine_check.c
> @@ -11,7 +11,7 @@
>
> int machine_check_440A(struct pt_regs *regs)
> {
> - unsigned long reason = regs->dsisr;
> + unsigned long reason = regs->esr;
>
> printk("Machine check in kernel mode.\n");
> if (reason & ESR_IMCP){
> @@ -48,7 +48,7 @@ int machine_check_440A(struct pt_regs *regs)
> #ifdef CONFIG_PPC_47x
> int machine_check_47x(struct pt_regs *regs)
> {
> - unsigned long reason = regs->dsisr;
> + unsigned long reason = regs->esr;
> u32 mcsr;
>
> printk(KERN_ERR "Machine check in kernel mode.\n");
> diff --git a/arch/powerpc/platforms/4xx/machine_check.c b/arch/powerpc/platforms/4xx/machine_check.c
> index a71c29892a91..a905da1d6f41 100644
> --- a/arch/powerpc/platforms/4xx/machine_check.c
> +++ b/arch/powerpc/platforms/4xx/machine_check.c
> @@ -10,7 +10,7 @@
>
> int machine_check_4xx(struct pt_regs *regs)
> {
> - unsigned long reason = regs->dsisr;
> + unsigned long reason = regs->esr;
>
> if (reason & ESR_IMCP) {
> printk("Instruction");
>
^ permalink raw reply
* Re: [PATCH] powerpc/kprobes: Fix kprobe Oops happens in booke
From: Christophe Leroy @ 2021-08-05 9:51 UTC (permalink / raw)
To: Pu Lehui, oleg, mpe, benh, paulus, naveen.n.rao, mhiramat, peterz,
npiggin, ruscur
Cc: zhangjinhao2, xukuohai, linuxppc-dev, linux-kernel
In-Reply-To: <20210804143735.148547-1-pulehui@huawei.com>
Le 04/08/2021 à 16:37, Pu Lehui a écrit :
> When using kprobe on powerpc booke series processor, Oops happens
> as show bellow:
>
> [ 35.861352] Oops: Exception in kernel mode, sig: 5 [#1]
> [ 35.861676] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
> [ 35.861905] Modules linked in:
> [ 35.862144] CPU: 0 PID: 76 Comm: sh Not tainted 5.14.0-rc3-00060-g7e96bf476270 #18
> [ 35.862610] NIP: c0b96470 LR: c00107b4 CTR: c0161c80
> [ 35.862805] REGS: c387fe70 TRAP: 0700 Not tainted (5.14.0-rc3-00060-g7e96bf476270)
> [ 35.863198] MSR: 00029002 <CE,EE,ME> CR: 24022824 XER: 20000000
> [ 35.863577]
> [ 35.863577] GPR00: c0015218 c387ff20 c313e300 c387ff50 00000004 40000002 40000000 0a1a2cce
> [ 35.863577] GPR08: 00000000 00000004 00000000 59764000 24022422 102490c2 00000000 00000000
> [ 35.863577] GPR16: 00000000 00000000 00000040 10240000 10240000 10240000 10240000 10220000
> [ 35.863577] GPR24: ffffffff 10240000 00000000 00000000 bfc655e8 00000800 c387ff50 00000000
> [ 35.865367] NIP [c0b96470] schedule+0x0/0x130
> [ 35.865606] LR [c00107b4] interrupt_exit_user_prepare_main+0xf4/0x100
> [ 35.865974] Call Trace:
> [ 35.866142] [c387ff20] [c0053224] irq_exit+0x114/0x120 (unreliable)
> [ 35.866472] [c387ff40] [c0015218] interrupt_return+0x14/0x13c
> [ 35.866728] --- interrupt: 900 at 0x100af3dc
> [ 35.866963] NIP: 100af3dc LR: 100de020 CTR: 00000000
> [ 35.867177] REGS: c387ff50 TRAP: 0900 Not tainted (5.14.0-rc3-00060-g7e96bf476270)
> [ 35.867488] MSR: 0002f902 <CE,EE,PR,FP,ME> CR: 20022422 XER: 20000000
> [ 35.867808]
> [ 35.867808] GPR00: c001509c bfc65570 1024b4d0 00000000 100de020 20022422 bfc655a8 100af3dc
> [ 35.867808] GPR08: 0002f902 00000000 00000000 00000000 72656773 102490c2 00000000 00000000
> [ 35.867808] GPR16: 00000000 00000000 00000040 10240000 10240000 10240000 10240000 10220000
> [ 35.867808] GPR24: ffffffff 10240000 00000000 00000000 bfc655e8 10245910 ffffffff 00000001
> [ 35.869406] NIP [100af3dc] 0x100af3dc
> [ 35.869578] LR [100de020] 0x100de020
> [ 35.869751] --- interrupt: 900
> [ 35.870001] Instruction dump:
> [ 35.870283] 40c20010 815e0518 714a0100 41e2fd04 39200000 913e00c0 3b1e0450 4bfffd80
> [ 35.870666] 0fe00000 92a10024 4bfff1a9 60000000 <7fe00008> 7c0802a6 93e1001c 7c5f1378
> [ 35.871339] ---[ end trace 23ff848139efa9b9 ]---
>
> There is no real mode for booke arch and the MMU translation is
> always on. The corresponding MSR_IS/MSR_DS bit in booke is used
> to switch the address space, but not for real mode judgment.
Can you explain more the link between that explanation and the Oops itself ?
>
> Fixes: 21f8b2fa3ca5 ("powerpc/kprobes: Ignore traps that happened in real mode")
> Signed-off-by: Pu Lehui <pulehui@huawei.com>
> ---
> arch/powerpc/include/asm/ptrace.h | 6 ++++++
> arch/powerpc/kernel/kprobes.c | 5 +----
> 2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
> index 3e5d470a6155..4aec1a97024b 100644
> --- a/arch/powerpc/include/asm/ptrace.h
> +++ b/arch/powerpc/include/asm/ptrace.h
> @@ -187,6 +187,12 @@ static inline unsigned long frame_pointer(struct pt_regs *regs)
> #define user_mode(regs) (((regs)->msr & MSR_PR) != 0)
> #endif
>
> +#ifdef CONFIG_BOOKE
> +#define real_mode(regs) 0
> +#else
> +#define real_mode(regs) (!((regs)->msr & MSR_IR) || !((regs)->msr & MSR_DR))
> +#endif
> +
You don't need an #ifdef stuff here, you can base your testing on IS_ENABLED(CONFIG_BOOKE)
> #define force_successful_syscall_return() \
> do { \
> set_thread_flag(TIF_NOERROR); \
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index cbc28d1a2e1b..fac9a5974718 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -289,10 +289,7 @@ int kprobe_handler(struct pt_regs *regs)
> unsigned int *addr = (unsigned int *)regs->nip;
> struct kprobe_ctlblk *kcb;
>
> - if (user_mode(regs))
> - return 0;
> -
> - if (!(regs->msr & MSR_IR) || !(regs->msr & MSR_DR))
> + if (user_mode(regs) || real_mode(regs))
> return 0;
>
> /*
>
^ permalink raw reply
* Re: [PATCH v5 7/8] powerpc/64s: Initialize and use a temporary mm for patching on Radix
From: Christophe Leroy @ 2021-08-05 9:48 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev
Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja
In-Reply-To: <20210713053113.4632-8-cmr@linux.ibm.com>
Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> When code patching a STRICT_KERNEL_RWX kernel the page containing the
> address to be patched is temporarily mapped as writeable. Currently, a
> per-cpu vmalloc patch area is used for this purpose. While the patch
> area is per-cpu, the temporary page mapping is inserted into the kernel
> page tables for the duration of patching. The mapping is exposed to CPUs
> other than the patching CPU - this is undesirable from a hardening
> perspective. Use a temporary mm instead which keeps the mapping local to
> the CPU doing the patching.
>
> Use the `poking_init` init hook to prepare a temporary mm and patching
> address. Initialize the temporary mm by copying the init mm. Choose a
> randomized patching address inside the temporary mm userspace address
> space. The patching address is randomized between PAGE_SIZE and
> DEFAULT_MAP_WINDOW-PAGE_SIZE.
>
> Bits of entropy with 64K page size on BOOK3S_64:
>
> bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
>
> PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
> bits of entropy = log2(128TB / 64K)
> bits of entropy = 31
>
> The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
> operates - by default the space above DEFAULT_MAP_WINDOW is not
> available. Currently the Hash MMU does not use a temporary mm so
> technically this upper limit isn't necessary; however, a larger
> randomization range does not further "harden" this overall approach and
> future work may introduce patching with a temporary mm on Hash as well.
>
> Randomization occurs only once during initialization at boot for each
> possible CPU in the system.
>
> Introduce two new functions, map_patch() and unmap_patch(), to
> respectively create and remove the temporary mapping with write
> permissions at patching_addr. Map the page with PAGE_KERNEL to set
> EAA[0] for the PTE which ignores the AMR (so no need to unlock/lock
> KUAP) according to PowerISA v3.0b Figure 35 on Radix.
>
> Based on x86 implementation:
>
> commit 4fc19708b165
> ("x86/alternatives: Initialize temporary mm for patching")
>
> and:
>
> commit b3fd8e83ada0
> ("x86/alternatives: Use temporary mm for text poking")
>
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
>
> ---
>
> v5: * Only support Book3s64 Radix MMU for now.
> * Use a per-cpu datastructure to hold the patching_addr and
> patching_mm to avoid the need for a synchronization lock/mutex.
>
> v4: * In the previous series this was two separate patches: one to init
> the temporary mm in poking_init() (unused in powerpc at the time)
> and the other to use it for patching (which removed all the
> per-cpu vmalloc code). Now that we use poking_init() in the
> existing per-cpu vmalloc approach, that separation doesn't work
> as nicely anymore so I just merged the two patches into one.
> * Preload the SLB entry and hash the page for the patching_addr
> when using Hash on book3s64 to avoid taking an SLB and Hash fault
> during patching. The previous implementation was a hack which
> changed current->mm to allow the SLB and Hash fault handlers to
> work with the temporary mm since both of those code-paths always
> assume mm == current->mm.
> * Also (hmm - seeing a trend here) with the book3s64 Hash MMU we
> have to manage the mm->context.active_cpus counter and mm cpumask
> since they determine (via mm_is_thread_local()) if the TLB flush
> in pte_clear() is local or not - it should always be local when
> we're using the temporary mm. On book3s64's Radix MMU we can
> just call local_flush_tlb_mm().
> * Use HPTE_USE_KERNEL_KEY on Hash to avoid costly lock/unlock of
> KUAP.
> ---
> arch/powerpc/lib/code-patching.c | 132 +++++++++++++++++++++++++++++--
> 1 file changed, 125 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 9f2eba9b70ee4..027dabd42b8dd 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -11,6 +11,7 @@
> #include <linux/cpuhotplug.h>
> #include <linux/slab.h>
> #include <linux/uaccess.h>
> +#include <linux/random.h>
>
> #include <asm/tlbflush.h>
> #include <asm/page.h>
> @@ -103,6 +104,7 @@ static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
>
> static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
> +static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
>
> #if IS_BUILTIN(CONFIG_LKDTM)
> unsigned long read_cpu_patching_addr(unsigned int cpu)
> @@ -133,6 +135,51 @@ static int text_area_cpu_down(unsigned int cpu)
> return 0;
> }
>
> +static __always_inline void __poking_init_temp_mm(void)
> +{
> + int cpu;
> + spinlock_t *ptl; /* for protecting pte table */
> + pte_t *ptep;
> + struct mm_struct *patching_mm;
> + unsigned long patching_addr;
> +
> + for_each_possible_cpu(cpu) {
> + /*
> + * Some parts of the kernel (static keys for example) depend on
> + * successful code patching. Code patching under
> + * STRICT_KERNEL_RWX requires this setup - otherwise we cannot
> + * patch at all. We use BUG_ON() here and later since an early
> + * failure is preferred to buggy behavior and/or strange
> + * crashes later.
> + */
> + patching_mm = copy_init_mm();
> + BUG_ON(!patching_mm);
Read https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on and
https://github.com/linuxppc/issues/issues/88
Avoid BUG_ON()s thanks.
> +
> + per_cpu(cpu_patching_mm, cpu) = patching_mm;
> +
> + /*
> + * Choose a randomized, page-aligned address from the range:
> + * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE] The lower
> + * address bound is PAGE_SIZE to avoid the zero-page. The
> + * upper address bound is DEFAULT_MAP_WINDOW - PAGE_SIZE to
> + * stay under DEFAULT_MAP_WINDOW with the Book3s64 Hash MMU.
> + */
> + patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK)
> + % (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
% should be at the end of first line and the second line alignment should match open parenthesis in
first line.
> +
> + per_cpu(cpu_patching_addr, cpu) = patching_addr;
> +
> + /*
> + * PTE allocation uses GFP_KERNEL which means we need to
> + * pre-allocate the PTE here because we cannot do the
> + * allocation during patching when IRQs are disabled.
> + */
> + ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> + BUG_ON(!ptep);
Avoid BUG_ON() please
> + pte_unmap_unlock(ptep, ptl);
> + }
> +}
> +
> /*
> * Although BUG_ON() is rude, in this case it should only happen if ENOMEM, and
> * we judge it as being preferable to a kernel that will crash later when
> @@ -140,6 +187,11 @@ static int text_area_cpu_down(unsigned int cpu)
> */
> void __init poking_init(void)
> {
> + if (radix_enabled()) {
> + __poking_init_temp_mm();
> + return;
> + }
> +
> BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> "powerpc/text_poke:online", text_area_cpu_up,
> text_area_cpu_down));
> @@ -213,30 +265,96 @@ static inline int unmap_patch_area(void)
> return -EINVAL;
> }
>
> +struct patch_mapping {
> + spinlock_t *ptl; /* for protecting pte table */
> + pte_t *ptep;
> + struct temp_mm temp_mm;
> +};
> +
> +/*
> + * This can be called for kernel text or a module.
> + */
> +static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
> +{
> + struct page *page;
> + pte_t pte;
> + pgprot_t pgprot;
> + struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> + unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> +
> + if (is_vmalloc_or_module_addr(addr))
> + page = vmalloc_to_page(addr);
> + else
> + page = virt_to_page(addr);
> +
> + patch_mapping->ptep = get_locked_pte(patching_mm, patching_addr,
> + &patch_mapping->ptl);
Not sure you need to split this line, checkpatch now allows 100 chars per line.
> + if (unlikely(!patch_mapping->ptep)) {
> + pr_warn("map patch: failed to allocate pte for patching\n");
That's a lot better than all above BUG_ONs
> + return -1;
> + }
> +
> + pgprot = PAGE_KERNEL;
> + pte = mk_pte(page, pgprot);
> + pte = pte_mkdirty(pte);
I'm sure you can do
pte = pte_mkdirty(mk_pte(page, PAGE_KERNEL));
And indeed PAGE_KERNEL already includes _PAGE_DIRTY, so all you should need is
pte = mk_pte(page, PAGE_KERNEL);
Or even just
set_pte_at(patching_mm, patching_addr, patch_mapping->ptep, mk_pte(page, PAGE_KERNEL));
> + set_pte_at(patching_mm, patching_addr, patch_mapping->ptep, pte);
> +
> + init_temp_mm(&patch_mapping->temp_mm, patching_mm);
> + use_temporary_mm(&patch_mapping->temp_mm);
> +
> + return 0;
> +}
> +
> +static void unmap_patch(struct patch_mapping *patch_mapping)
> +{
> + struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> + unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> +
> + pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
> +
> + local_flush_tlb_mm(patching_mm);
> +
> + pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
> +
> + unuse_temporary_mm(&patch_mapping->temp_mm);
Shouldn't you stop using it before unmapping/unlocking it ?
> +}
> +
> static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
> {
> int err, rc = 0;
> u32 *patch_addr = NULL;
> unsigned long flags;
> + struct patch_mapping patch_mapping;
>
> /*
> - * During early early boot patch_instruction is called
> - * when text_poke_area is not ready, but we still need
> - * to allow patching. We just do the plain old patching
> + * During early early boot patch_instruction is called when the
> + * patching_mm/text_poke_area is not ready, but we still need to allow
> + * patching. We just do the plain old patching.
> */
> - if (!this_cpu_read(text_poke_area))
> - return raw_patch_instruction(addr, instr);
> + if (radix_enabled()) {
> + if (!this_cpu_read(cpu_patching_mm))
> + return raw_patch_instruction(addr, instr);
> + } else {
> + if (!this_cpu_read(text_poke_area))
> + return raw_patch_instruction(addr, instr);
> + }
>
> local_irq_save(flags);
>
> - err = map_patch_area(addr);
> + if (radix_enabled())
> + err = map_patch(addr, &patch_mapping);
Maybe call it map_patch_mm() or map_patch_mapping() ?
> + else
> + err = map_patch_area(addr);
> if (err)
> goto out;
>
> patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
> rc = __patch_instruction(addr, instr, patch_addr);
>
> - err = unmap_patch_area();
> + if (radix_enabled())
> + unmap_patch(&patch_mapping);
No err ? Would be better if it could return something, allthough always 0.
And same comment about naming.
> + else
> + err = unmap_patch_area();
>
> out:
> local_irq_restore(flags);
>
^ permalink raw reply
* Re: [PATCH v5 6/8] powerpc: Rework and improve STRICT_KERNEL_RWX patching
From: Christophe Leroy @ 2021-08-05 9:34 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev
Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja
In-Reply-To: <20210713053113.4632-7-cmr@linux.ibm.com>
Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> Rework code-patching with STRICT_KERNEL_RWX to prepare for the next
> patch which uses a temporary mm for patching under the Book3s64 Radix
> MMU. Make improvements by adding a WARN_ON when the patchsite doesn't
> match after patching and return the error from __patch_instruction()
> properly.
>
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
>
> ---
>
> v5: * New to series.
> ---
> arch/powerpc/lib/code-patching.c | 51 +++++++++++++++++---------------
> 1 file changed, 27 insertions(+), 24 deletions(-)
>
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 3122d8e4cc013..9f2eba9b70ee4 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -102,11 +102,12 @@ static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
> }
>
> static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> +static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
>
> #if IS_BUILTIN(CONFIG_LKDTM)
> unsigned long read_cpu_patching_addr(unsigned int cpu)
> {
> - return (unsigned long)(per_cpu(text_poke_area, cpu))->addr;
> + return per_cpu(cpu_patching_addr, cpu);
> }
> #endif
>
> @@ -121,6 +122,7 @@ static int text_area_cpu_up(unsigned int cpu)
> return -1;
> }
> this_cpu_write(text_poke_area, area);
> + this_cpu_write(cpu_patching_addr, (unsigned long)area->addr);
>
> return 0;
> }
> @@ -146,7 +148,7 @@ void __init poking_init(void)
> /*
> * This can be called for kernel text or a module.
> */
> -static int map_patch_area(void *addr, unsigned long text_poke_addr)
> +static int map_patch_area(void *addr)
> {
> unsigned long pfn;
> int err;
> @@ -156,17 +158,20 @@ static int map_patch_area(void *addr, unsigned long text_poke_addr)
> else
> pfn = __pa_symbol(addr) >> PAGE_SHIFT;
>
> - err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
> + err = map_kernel_page(__this_cpu_read(cpu_patching_addr),
> + (pfn << PAGE_SHIFT), PAGE_KERNEL);
>
> - pr_devel("Mapped addr %lx with pfn %lx:%d\n", text_poke_addr, pfn, err);
> + pr_devel("Mapped addr %lx with pfn %lx:%d\n",
> + __this_cpu_read(cpu_patching_addr), pfn, err);
> if (err)
> return -1;
>
> return 0;
> }
>
> -static inline int unmap_patch_area(unsigned long addr)
> +static inline int unmap_patch_area(void)
> {
> + unsigned long addr = __this_cpu_read(cpu_patching_addr);
> pte_t *ptep;
> pmd_t *pmdp;
> pud_t *pudp;
> @@ -175,23 +180,23 @@ static inline int unmap_patch_area(unsigned long addr)
>
> pgdp = pgd_offset_k(addr);
> if (unlikely(!pgdp))
> - return -EINVAL;
> + goto out_err;
>
> p4dp = p4d_offset(pgdp, addr);
> if (unlikely(!p4dp))
> - return -EINVAL;
> + goto out_err;
>
> pudp = pud_offset(p4dp, addr);
> if (unlikely(!pudp))
> - return -EINVAL;
> + goto out_err;
>
> pmdp = pmd_offset(pudp, addr);
> if (unlikely(!pmdp))
> - return -EINVAL;
> + goto out_err;
>
> ptep = pte_offset_kernel(pmdp, addr);
> if (unlikely(!ptep))
> - return -EINVAL;
> + goto out_err;
>
> pr_devel("clearing mm %p, pte %p, addr %lx\n", &init_mm, ptep, addr);
>
> @@ -202,15 +207,17 @@ static inline int unmap_patch_area(unsigned long addr)
> flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>
> return 0;
> +
> +out_err:
> + pr_warn("failed to unmap %lx\n", addr);
> + return -EINVAL;
Can you keep that in the caller of unmap_patch_area() instead of all those goto stuff ?
> }
>
> static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
> {
> - int err;
> + int err, rc = 0;
> u32 *patch_addr = NULL;
> unsigned long flags;
> - unsigned long text_poke_addr;
> - unsigned long kaddr = (unsigned long)addr;
>
> /*
> * During early early boot patch_instruction is called
> @@ -222,24 +229,20 @@ static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
>
> local_irq_save(flags);
>
> - text_poke_addr = (unsigned long)__this_cpu_read(text_poke_area)->addr;
> - if (map_patch_area(addr, text_poke_addr)) {
> - err = -1;
> + err = map_patch_area(addr);
> + if (err)
> goto out;
> - }
> -
> - patch_addr = (u32 *)(text_poke_addr + (kaddr & ~PAGE_MASK));
>
> - __patch_instruction(addr, instr, patch_addr);
> + patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
> + rc = __patch_instruction(addr, instr, patch_addr);
>
> - err = unmap_patch_area(text_poke_addr);
> - if (err)
> - pr_warn("failed to unmap %lx\n", text_poke_addr);
> + err = unmap_patch_area();
>
> out:
> local_irq_restore(flags);
> + WARN_ON(!ppc_inst_equal(ppc_inst_read(addr), instr));
Why adding that WARN_ON(), what could make that happen that is worth a WARN_ON() ?
Patching is quite a critical fast path, I'm not sure we want to afford too many checks during
patching, we want it quick at first.
>
> - return err;
> + return rc ? rc : err;
> }
> #else /* !CONFIG_STRICT_KERNEL_RWX */
>
>
^ permalink raw reply
* Re: [PATCH v15 7/9] powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
From: Laurent Vivier @ 2021-08-05 9:34 UTC (permalink / raw)
To: Jordan Niethe, linuxppc-dev
Cc: ajd, aneesh.kumar, Greg Kurz, npiggin, cmr, kvm-ppc, naveen.n.rao,
David Gibson, dja
In-Reply-To: <20210609013431.9805-8-jniethe5@gmail.com>
Hi,
On 09/06/2021 03:34, Jordan Niethe wrote:
> From: Russell Currey <ruscur@russell.cc>
>
> To enable strict module RWX on powerpc, set:
>
> CONFIG_STRICT_MODULE_RWX=y
>
> You should also have CONFIG_STRICT_KERNEL_RWX=y set to have any real
> security benefit.
>
> ARCH_HAS_STRICT_MODULE_RWX is set to require ARCH_HAS_STRICT_KERNEL_RWX.
> This is due to a quirk in arch/Kconfig and arch/powerpc/Kconfig that
> makes STRICT_MODULE_RWX *on by default* in configurations where
> STRICT_KERNEL_RWX is *unavailable*.
>
> Since this doesn't make much sense, and module RWX without kernel RWX
> doesn't make much sense, having the same dependencies as kernel RWX
> works around this problem.
>
> Book3s/32 603 and 604 core processors are not able to write protect
> kernel pages so do not set ARCH_HAS_STRICT_MODULE_RWX for Book3s/32.
>
> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> [jpn: - predicate on !PPC_BOOK3S_604
> - make module_alloc() use PAGE_KERNEL protection]
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
> v10: - Predicate on !PPC_BOOK3S_604
> - Make module_alloc() use PAGE_KERNEL protection
> v11: - Neaten up
> v13: Use strict_kernel_rwx_enabled()
> v14: Make changes to module_alloc() its own commit
> v15: - Force STRICT_KERNEL_RWX if STRICT_MODULE_RWX is selected
> - Predicate on !PPC_BOOK3S_32 instead
> ---
> arch/powerpc/Kconfig | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index abfe2e9225fa..72f307f1796b 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -142,6 +142,7 @@ config PPC
> select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
> select ARCH_HAS_SET_MEMORY
> select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
> + select ARCH_HAS_STRICT_MODULE_RWX if ARCH_HAS_STRICT_KERNEL_RWX && !PPC_BOOK3S_32
> select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
> select ARCH_HAS_UACCESS_FLUSHCACHE
> select ARCH_HAS_UBSAN_SANITIZE_ALL
> @@ -267,6 +268,7 @@ config PPC
> select PPC_DAWR if PPC64
> select RTC_LIB
> select SPARSE_IRQ
> + select STRICT_KERNEL_RWX if STRICT_MODULE_RWX
> select SYSCTL_EXCEPTION_TRACE
> select THREAD_INFO_IN_TASK
> select VIRT_TO_BUS if !PPC64
>
since this patch is merged my VM is experiencing a crash at boot (20% of the time):
[ 8.496850] kernel tried to execute exec-protected page (c008000004073278) - exploit
attempt? (uid: 0)
[ 8.496921] BUG: Unable to handle kernel instruction fetch
[ 8.496954] Faulting instruction address: 0xc008000004073278
[ 8.496994] Oops: Kernel access of bad area, sig: 11 [#1]
[ 8.497028] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 8.497071] Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks xfs
libcrc32c virtio_net net_failover virtio_blk vmx_crypto failover dm_mirror dm_region_hash
dm_log dm_mod
[ 8.497186] CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
[ 8.497228] Workqueue: events control_work_handler [virtio_console]
[ 8.497272] NIP: c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
[ 8.497320] REGS: c00000002e4ef7e0 TRAP: 0400 Not tainted (5.14.0-rc4+)
[ 8.497361] MSR: 800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24002822
XER: 200400cf
[ 8.497426] CFAR: c0000000001e9e44 IRQMASK: 1
[ 8.497426] GPR00: c008000004073278 c00000002e4efa80 c000000002a26b00 c000000042c39520
[ 8.497426] GPR04: 0000000000000001 0000000000000000 0000000000000000 00000000000000ff
[ 8.497426] GPR08: 0000000000000001 c000000042c39520 0000000000000001 c008000004076008
[ 8.497426] GPR12: c0000000001e9de0 c0000001fffccb00 c00000000018ba88 c00000002c91d400
[ 8.497426] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 8.497426] GPR20: 0000000000000000 0000000000000000 0000000000000000 c008000004080340
[ 8.497426] GPR24: c0080000040a01e8 0000000000000000 0000000000000000 c00000002e0975c0
[ 8.497426] GPR28: c00000002ce72940 c000000042c39520 0000000000000048 0000000000000038
[ 8.497891] NIP [c008000004073278] fill_queue+0xf0/0x210 [virtio_console]
[ 8.497934] LR [c008000004073278] fill_queue+0xf0/0x210 [virtio_console]
[ 8.497976] Call Trace:
[ 8.497993] [c00000002e4efa80] [c00800000407323c] fill_queue+0xb4/0x210
[virtio_console] (unreliable)
[ 8.498052] [c00000002e4efae0] [c008000004073a90] add_port+0x1a8/0x470 [virtio_console]
[ 8.498102] [c00000002e4efbb0] [c0080000040750f4] control_work_handler+0xbc/0x1e8
[virtio_console]
[ 8.498160] [c00000002e4efc60] [c00000000017f4f0] process_one_work+0x290/0x590
[ 8.498212] [c00000002e4efd00] [c00000000017f878] worker_thread+0x88/0x620
[ 8.498256] [c00000002e4efda0] [c00000000018bc14] kthread+0x194/0x1a0
[ 8.498299] [c00000002e4efe10] [c00000000000cf54] ret_from_kernel_thread+0x5c/0x64
[ 8.498349] Instruction dump:
[ 8.498374] 7da96b78 a14d0c8a 419c00b0 2f8a0000 419eff88 b32d0c8a 7c0004ac 4bffff7c
[ 8.498430] 60000000 60000000 7fa3eb78 48002d95 <e8410018> 38600000 480025e1 e8410018
[ 8.498485] ---[ end trace 16ee10903290b647 ]---
[ 8.501433]
[ 9.502601] Kernel panic - not syncing: Fatal exception
add_port+0x1a8/0x470 :
1420
1421 /* We can safely ignore ENOSPC because it means
1422 * the queue already has buffers. Buffers are removed
1423 * only by virtcons_remove(), not by unplug_port()
1424 */
->1425 err = fill_queue(port->in_vq, &port->inbuf_lock);
1426 if (err < 0 && err != -ENOSPC) {
1427 dev_err(port->dev, "Error allocating inbufs\n");
1428 goto free_device;
1429 }
fill_queue+0x90/0x210 :
1326 static int fill_queue(struct virtqueue *vq, spinlock_t *lock)
1327 {
1328 struct port_buffer *buf;
1329 int nr_added_bufs;
1330 int ret;
1331
1332 nr_added_bufs = 0;
1333 do {
1334 buf = alloc_buf(vq->vdev, PAGE_SIZE, 0);
1335 if (!buf)
1336 return -ENOMEM;
1337
->1338 spin_lock_irq(lock);
I'm using an upstream kernel (5.14-rc4, 251a1524293d) in the VM.
My host is a RHEL 8.5/POWER9: qemu-kvm-6.0.0-21 and kernel-4.18.0-325
My qemu command line is:
/usr/libexec/qemu-kvm \
-M pseries,accel=kvm \
-nographic -nodefaults \
-device virtio-serial-pci \
-device virtconsole \
-device virtio-net-pci,mac=9a:2b:2c:2d:2e:2f,netdev=hostnet0 \
-blockdev
node-name=disk1,file.driver=file,driver=qcow2,file.driver=file,file.filename=disk.qcow2 \
-netdev bridge,id=hostnet0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
-device virtio-blk-pci,id=image1,drive=disk1 \
-m 8192 \
-smp 4 \
-serial mon:stdio
Do we need something in qemu/kvm to support STRICT_MODULE_RWX ?
Thanks,
Laurent
^ permalink raw reply
* Re: [PATCH v5 5/8] powerpc/64s: Introduce temporary mm for Radix MMU
From: Christophe Leroy @ 2021-08-05 9:27 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev
Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja
In-Reply-To: <20210713053113.4632-6-cmr@linux.ibm.com>
Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> x86 supports the notion of a temporary mm which restricts access to
> temporary PTEs to a single CPU. A temporary mm is useful for situations
> where a CPU needs to perform sensitive operations (such as patching a
> STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
> said mappings to other CPUs. Another benefit is that other CPU TLBs do
> not need to be flushed when the temporary mm is torn down.
>
> Mappings in the temporary mm can be set in the userspace portion of the
> address-space.
>
> Interrupts must be disabled while the temporary mm is in use. HW
> breakpoints, which may have been set by userspace as watchpoints on
> addresses now within the temporary mm, are saved and disabled when
> loading the temporary mm. The HW breakpoints are restored when unloading
> the temporary mm. All HW breakpoints are indiscriminately disabled while
> the temporary mm is in use.
Can you explain more about that breakpoint stuff ? Why is it a special case here at all ? Isn't it
the same when you switch from one user task to another one ? x86 commit doesn't say anythink about
breakpoints.
>
> Based on x86 implementation:
>
> commit cefa929c034e
> ("x86/mm: Introduce temporary mm structs")
>
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
>
> ---
>
> v5: * Drop support for using a temporary mm on Book3s64 Hash MMU.
>
> v4: * Pass the prev mm instead of NULL to switch_mm_irqs_off() when
> using/unusing the temp mm as suggested by Jann Horn to keep
> the context.active counter in-sync on mm/nohash.
> * Disable SLB preload in the temporary mm when initializing the
> temp_mm struct.
> * Include asm/debug.h header to fix build issue with
> ppc44x_defconfig.
> ---
> arch/powerpc/include/asm/debug.h | 1 +
> arch/powerpc/kernel/process.c | 5 +++
> arch/powerpc/lib/code-patching.c | 56 ++++++++++++++++++++++++++++++++
> 3 files changed, 62 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
> index 86a14736c76c3..dfd82635ea8b3 100644
> --- a/arch/powerpc/include/asm/debug.h
> +++ b/arch/powerpc/include/asm/debug.h
> @@ -46,6 +46,7 @@ static inline int debugger_fault_handler(struct pt_regs *regs) { return 0; }
> #endif
>
> void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk);
> +void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk);
> bool ppc_breakpoint_available(void);
> #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> extern void do_send_trap(struct pt_regs *regs, unsigned long address,
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 185beb2905801..a0776200772e8 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -865,6 +865,11 @@ static inline int set_breakpoint_8xx(struct arch_hw_breakpoint *brk)
> return 0;
> }
>
> +void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> +{
> + memcpy(brk, this_cpu_ptr(¤t_brk[nr]), sizeof(*brk));
> +}
> +
> void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> {
> memcpy(this_cpu_ptr(¤t_brk[nr]), brk, sizeof(*brk));
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 54b6157d44e95..3122d8e4cc013 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -17,6 +17,9 @@
> #include <asm/code-patching.h>
> #include <asm/setup.h>
> #include <asm/inst.h>
> +#include <asm/mmu_context.h>
> +#include <asm/debug.h>
> +#include <asm/tlb.h>
>
> static int __patch_instruction(u32 *exec_addr, struct ppc_inst instr, u32 *patch_addr)
> {
> @@ -45,6 +48,59 @@ int raw_patch_instruction(u32 *addr, struct ppc_inst instr)
> }
>
> #ifdef CONFIG_STRICT_KERNEL_RWX
> +
> +struct temp_mm {
> + struct mm_struct *temp;
> + struct mm_struct *prev;
> + struct arch_hw_breakpoint brk[HBP_NUM_MAX];
> +};
> +
> +static inline void init_temp_mm(struct temp_mm *temp_mm, struct mm_struct *mm)
> +{
> + /* We currently only support temporary mm on the Book3s64 Radix MMU */
> + WARN_ON(!radix_enabled());
> +
> + temp_mm->temp = mm;
> + temp_mm->prev = NULL;
> + memset(&temp_mm->brk, 0, sizeof(temp_mm->brk));
> +}
> +
> +static inline void use_temporary_mm(struct temp_mm *temp_mm)
> +{
> + lockdep_assert_irqs_disabled();
> +
> + temp_mm->prev = current->active_mm;
> + switch_mm_irqs_off(temp_mm->prev, temp_mm->temp, current);
> +
> + WARN_ON(!mm_is_thread_local(temp_mm->temp));
> +
> + if (ppc_breakpoint_available()) {
> + struct arch_hw_breakpoint null_brk = {0};
> + int i = 0;
> +
> + for (; i < nr_wp_slots(); ++i) {
> + __get_breakpoint(i, &temp_mm->brk[i]);
> + if (temp_mm->brk[i].type != 0)
> + __set_breakpoint(i, &null_brk);
> + }
> + }
> +}
> +
> +static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
not sure about the naming.
Maybe start_using_temp_mm() and stop_using_temp_mm() would be more explicit.
> +{
> + lockdep_assert_irqs_disabled();
> +
> + switch_mm_irqs_off(temp_mm->temp, temp_mm->prev, current);
> +
> + if (ppc_breakpoint_available()) {
> + int i = 0;
> +
> + for (; i < nr_wp_slots(); ++i)
> + if (temp_mm->brk[i].type != 0)
> + __set_breakpoint(i, &temp_mm->brk[i]);
> + }
> +}
> +
> static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>
> #if IS_BUILTIN(CONFIG_LKDTM)
>
You'll probably get a bisecting hasard with those unused 'static inline' functions in a .c file
because that patch alone probably fails build.
^ permalink raw reply
* [PATCH v2 3/3] powerpc/mce: Modify the real address error logging messages
From: Ganesh Goudar @ 2021-08-05 9:20 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: mikey, Ganesh Goudar, mahesh, npiggin
In-Reply-To: <20210805092025.272871-1-ganeshgr@linux.ibm.com>
To avoid ambiguity, modify the strings in real address error
logging messages to "foreign/control memory" from "foreign",
Since the error discriptions in P9 user manual and P10 user
manual are different for same type of errors.
P9 User Manual for MCE:
DSISR:59 Host real address to foreign space during translation.
DSISR:60 Host real address to foreign space on a load or store
access.
P10 User Manual for MCE:
DSISR:59 D-side tablewalk used a host real address in the
control memory address range.
DSISR:60 D-side operand access to control memory address space.
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
v2: No changes in this patch.
---
arch/powerpc/kernel/mce.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 47a683cd00d2..f3ef480bb739 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -388,14 +388,14 @@ void machine_check_print_event_info(struct machine_check_event *evt,
static const char *mc_ra_types[] = {
"Indeterminate",
"Instruction fetch (bad)",
- "Instruction fetch (foreign)",
+ "Instruction fetch (foreign/control memory)",
"Page table walk ifetch (bad)",
- "Page table walk ifetch (foreign)",
+ "Page table walk ifetch (foreign/control memory)",
"Load (bad)",
"Store (bad)",
"Page table walk Load/Store (bad)",
- "Page table walk Load/Store (foreign)",
- "Load/Store (foreign)",
+ "Page table walk Load/Store (foreign/control memory)",
+ "Load/Store (foreign/control memory)",
};
static const char *mc_link_types[] = {
"Indeterminate",
--
2.31.1
^ permalink raw reply related
* [PATCH v2 2/3] selftests/powerpc: Add test for real address error handling
From: Ganesh Goudar @ 2021-08-05 9:20 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: mikey, Ganesh Goudar, mahesh, npiggin
In-Reply-To: <20210805092025.272871-1-ganeshgr@linux.ibm.com>
Add test for real address or control memory address access
error handling, using NX-GZIP engine.
The error is injected by accessing the control memory address
using illegal instruction, on successful handling the process
attempting to access control memory address using illegal
instruction receives SIGBUS.
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
v2: Fix build error.
---
tools/testing/selftests/powerpc/Makefile | 3 +-
tools/testing/selftests/powerpc/mce/Makefile | 6 +++
.../selftests/powerpc/mce/inject-ra-err.c | 42 +++++++++++++++++++
.../selftests/powerpc/mce/inject-ra-err.sh | 18 ++++++++
tools/testing/selftests/powerpc/mce/vas-api.h | 1 +
5 files changed, 69 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/powerpc/mce/Makefile
create mode 100644 tools/testing/selftests/powerpc/mce/inject-ra-err.c
create mode 100755 tools/testing/selftests/powerpc/mce/inject-ra-err.sh
create mode 120000 tools/testing/selftests/powerpc/mce/vas-api.h
diff --git a/tools/testing/selftests/powerpc/Makefile b/tools/testing/selftests/powerpc/Makefile
index 0830e63818c1..4830372d7416 100644
--- a/tools/testing/selftests/powerpc/Makefile
+++ b/tools/testing/selftests/powerpc/Makefile
@@ -31,7 +31,8 @@ SUB_DIRS = alignment \
vphn \
math \
ptrace \
- security
+ security \
+ mce
endif
diff --git a/tools/testing/selftests/powerpc/mce/Makefile b/tools/testing/selftests/powerpc/mce/Makefile
new file mode 100644
index 000000000000..0f537ce86370
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mce/Makefile
@@ -0,0 +1,6 @@
+#SPDX-License-Identifier: GPL-2.0-or-later
+
+TEST_PROGS := inject-ra-err.sh
+TEST_GEN_FILES := inject-ra-err
+
+include ../../lib.mk
diff --git a/tools/testing/selftests/powerpc/mce/inject-ra-err.c b/tools/testing/selftests/powerpc/mce/inject-ra-err.c
new file mode 100644
index 000000000000..05ab11cec3da
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mce/inject-ra-err.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include "vas-api.h"
+
+int main(void)
+{
+ int fd, ret;
+ int *paste_addr;
+ struct vas_tx_win_open_attr attr;
+ char *devname = "/dev/crypto/nx-gzip";
+
+ memset(&attr, 0, sizeof(attr));
+ attr.version = 1;
+ attr.vas_id = 0;
+
+ fd = open(devname, O_RDWR);
+ if (fd < 0) {
+ fprintf(stderr, "Failed to open device %s\n", devname);
+ return -errno;
+ }
+ ret = ioctl(fd, VAS_TX_WIN_OPEN, &attr);
+ if (ret < 0) {
+ fprintf(stderr, "ioctl() n %d, error %d\n", ret, errno);
+ ret = -errno;
+ goto out;
+ }
+ paste_addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0ULL);
+ /* The following assignment triggers exception */
+ *paste_addr = 1;
+ ret = 0;
+out:
+ close(fd);
+ return ret;
+}
diff --git a/tools/testing/selftests/powerpc/mce/inject-ra-err.sh b/tools/testing/selftests/powerpc/mce/inject-ra-err.sh
new file mode 100755
index 000000000000..3633cdc651a1
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mce/inject-ra-err.sh
@@ -0,0 +1,18 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+if [[ ! -w /dev/crypto/nx-gzip ]]; then
+ echo "WARN: Can't access /dev/crypto/nx-gzip, skipping"
+ exit 0
+fi
+
+timeout 5 ./inject-ra-err
+
+# 128 + 7 (SIGBUS) = 135, 128 is a exit code with special meaning.
+if [ $? -ne 135 ]; then
+ echo "FAILED: Real address or Control memory access error not handled"
+ exit $?
+fi
+
+echo "OK: Real address or Control memory access error is handled"
+exit 0
diff --git a/tools/testing/selftests/powerpc/mce/vas-api.h b/tools/testing/selftests/powerpc/mce/vas-api.h
new file mode 120000
index 000000000000..1455c1bcd351
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mce/vas-api.h
@@ -0,0 +1 @@
+../../../../../arch/powerpc/include/uapi/asm/vas-api.h
\ No newline at end of file
--
2.31.1
^ permalink raw reply related
* [PATCH v2 1/3] powerpc/pseries: Parse control memory access error
From: Ganesh Goudar @ 2021-08-05 9:20 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: mikey, Ganesh Goudar, mahesh, npiggin
Add support to parse and log control memory access
error for pseries.
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
v2: No changes in this patch.
---
arch/powerpc/platforms/pseries/ras.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 167f2e1b8d39..608c35cad0c3 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -80,6 +80,7 @@ struct pseries_mc_errorlog {
#define MC_ERROR_TYPE_TLB 0x04
#define MC_ERROR_TYPE_D_CACHE 0x05
#define MC_ERROR_TYPE_I_CACHE 0x07
+#define MC_ERROR_TYPE_CTRL_MEM_ACCESS 0x08
/* RTAS pseries MCE error sub types */
#define MC_ERROR_UE_INDETERMINATE 0
@@ -103,6 +104,9 @@ struct pseries_mc_errorlog {
#define MC_ERROR_TLB_MULTIHIT 2
#define MC_ERROR_TLB_INDETERMINATE 3
+#define MC_ERROR_CTRL_MEM_ACCESS_PTABLE_WALK 0
+#define MC_ERROR_CTRL_MEM_ACCESS_OP_ACCESS 1
+
static inline u8 rtas_mc_error_sub_type(const struct pseries_mc_errorlog *mlog)
{
switch (mlog->error_type) {
@@ -112,6 +116,8 @@ static inline u8 rtas_mc_error_sub_type(const struct pseries_mc_errorlog *mlog)
case MC_ERROR_TYPE_ERAT:
case MC_ERROR_TYPE_TLB:
return (mlog->sub_err_type & 0x03);
+ case MC_ERROR_TYPE_CTRL_MEM_ACCESS:
+ return (mlog->sub_err_type & 0x70) >> 4;
default:
return 0;
}
@@ -699,6 +705,21 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
case MC_ERROR_TYPE_I_CACHE:
mce_err.error_type = MCE_ERROR_TYPE_ICACHE;
break;
+ case MC_ERROR_TYPE_CTRL_MEM_ACCESS:
+ mce_err.error_type = MCE_ERROR_TYPE_RA;
+ if (mce_log->sub_err_type & 0x80)
+ eaddr = be64_to_cpu(mce_log->effective_address);
+ switch (err_sub_type) {
+ case MC_ERROR_CTRL_MEM_ACCESS_PTABLE_WALK:
+ mce_err.u.ra_error_type =
+ MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE_FOREIGN;
+ break;
+ case MC_ERROR_CTRL_MEM_ACCESS_OP_ACCESS:
+ mce_err.u.ra_error_type =
+ MCE_RA_ERROR_LOAD_STORE_FOREIGN;
+ break;
+ }
+ break;
case MC_ERROR_TYPE_UNKNOWN:
default:
mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
--
2.31.1
^ permalink raw reply related
* Re: [PATCH v5 8/8] lkdtm/powerpc: Fix code patching hijack test
From: Christophe Leroy @ 2021-08-05 9:18 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev
Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja
In-Reply-To: <20210713053113.4632-9-cmr@linux.ibm.com>
Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> Code patching on powerpc with a STRICT_KERNEL_RWX uses a userspace
> address in a temporary mm on Radix now. Use __put_user() to avoid write
> failures due to KUAP when attempting a "hijack" on the patching address.
> __put_user() also works with the non-userspace, vmalloc-based patching
> address on non-Radix MMUs.
It is not really clean to use __put_user() on non user address, allthought it works by change.
I think it would be better to do something like
if (is_kernel_addr(addr))
copy_to_kernel_nofault(...);
else
copy_to_user_nofault(...);
>
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> ---
> drivers/misc/lkdtm/perms.c | 9 ---------
> 1 file changed, 9 deletions(-)
>
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 41e87e5f9cc86..da6a34a0a49fb 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -262,16 +262,7 @@ static inline u32 lkdtm_read_patch_site(void)
> /* Returns True if the write succeeds */
> static inline bool lkdtm_try_write(u32 data, u32 *addr)
> {
> -#ifdef CONFIG_PPC
> - __put_kernel_nofault(addr, &data, u32, err);
> - return true;
> -
> -err:
> - return false;
> -#endif
> -#ifdef CONFIG_X86_64
> return !__put_user(data, addr);
> -#endif
> }
>
> static int lkdtm_patching_cpu(void *data)
>
^ permalink raw reply
* Re: [PATCH 1/3] arch: Export machine_restart() instances so they can be called from modules
From: Thomas Bogendoerfer @ 2021-08-05 9:16 UTC (permalink / raw)
To: Lee Jones
Cc: Rich Felker, Greg Kroah-Hartman, Catalin Marinas, Paul Walmsley,
Sebastian Reichel, James E . J . Bottomley, Max Filippov, Guo Ren,
linux-csky, sparclinux, linux-hexagon, linux-riscv, Will Deacon,
Thomas Gleixner, Anton Ivanov, Jonas Bonn, linux-s390, Brian Cain,
Helge Deller, linux-sh, Ley Foon Tan, Christian Borntraeger,
Ingo Molnar, Geert Uytterhoeven, linux-snps-arc, Jeff Dike,
uclinux-h8-devel, linux-xtensa, Albert Ou, Vasily Gorbik,
Heiko Carstens, linux-um, Stefan Kristiansson, linux-m68k,
openrisc, Borislav Petkov, John Crispin, Stafford Horne,
linux-arm-kernel, Chris Zankel, Michal Simek, linux-mips,
Yoshinori Sato, linux-parisc, Vineet Gupta, linux-kernel,
Palmer Dabbelt, Richard Weinberger, linuxppc-dev,
David S . Miller
In-Reply-To: <20210805075032.723037-2-lee.jones@linaro.org>
On Thu, Aug 05, 2021 at 08:50:30AM +0100, Lee Jones wrote:
> A recent attempt to convert the Power Reset Restart driver to tristate
> failed because of the following compile error (reported once merged by
> Stephen Rothwell via Linux Next):
>
> ERROR: "machine_restart" [drivers/power/reset/restart-poweroff.ko] undefined!
>
> This error occurs since some of the machine_restart() instances are
> not currently exported for use in modules. This patch aims to rectify
> that.
>
> Cc: Vineet Gupta <vgupta@synopsys.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Guo Ren <guoren@kernel.org>
> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> Cc: Brian Cain <bcain@codeaurora.org>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: Michal Simek <monstr@monstr.eu>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: John Crispin <john@phrozen.org>
> Cc: Ley Foon Tan <ley.foon.tan@intel.com>
> Cc: Jonas Bonn <jonas@southpole.se>
> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
> Cc: Stafford Horne <shorne@gmail.com>
> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
> Cc: Helge Deller <deller@gmx.de>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Paul Walmsley <paul.walmsley@sifive.com>
> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> Cc: Albert Ou <aou@eecs.berkeley.edu>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Rich Felker <dalias@libc.org>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Jeff Dike <jdike@addtoit.com>
> Cc: Richard Weinberger <richard@nod.at>
> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Chris Zankel <chris@zankel.net>
> Cc: Max Filippov <jcmvbkbc@gmail.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Sebastian Reichel <sre@kernel.org>
> Cc: linux-snps-arc@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-csky@vger.kernel.org
> Cc: uclinux-h8-devel@lists.sourceforge.jp
> Cc: linux-hexagon@vger.kernel.org
> Cc: linux-m68k@lists.linux-m68k.org
> Cc: linux-mips@vger.kernel.org
> Cc: openrisc@lists.librecores.org
> Cc: linux-parisc@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-riscv@lists.infradead.org
> Cc: linux-s390@vger.kernel.org
> Cc: linux-sh@vger.kernel.org
> Cc: sparclinux@vger.kernel.org
> Cc: linux-um@lists.infradead.org
> Cc: linux-xtensa@linux-xtensa.org
> Signed-off-by: Lee Jones <lee.jones@linaro.org>
> ---
>
> The 2 patches this change supports have the required Acks already.
>
> NB: If it's safe to omit some of these, let me know and I'll revise the patch.
>
> [...]
> arch/mips/kernel/reset.c | 1 +
> arch/mips/lantiq/falcon/reset.c | 1 +
> arch/mips/sgi-ip27/ip27-reset.c | 1 +
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply
* Re: [PATCH v2] arch: vdso: remove if-conditionals of $(c-gettimeofday-y)
From: Thomas Bogendoerfer @ 2021-08-05 9:12 UTC (permalink / raw)
To: Masahiro Yamada
Cc: Albert Ou, Russell King, linuxppc-dev, linux-kernel, linux-csky,
linux-mips, Paul Walmsley, Palmer Dabbelt, linux-arm-kernel,
Andy Lutomirski, Catalin Marinas, Guo Ren, Thomas Gleixner,
Vincenzo Frascino, Will Deacon, linux-riscv, Paul Mackerras
In-Reply-To: <20210731060020.12913-1-masahiroy@kernel.org>
On Sat, Jul 31, 2021 at 03:00:20PM +0900, Masahiro Yamada wrote:
> arm, arm64, csky, mips, powerpc always select GENERIC_GETTIMEOFDAY,
> hence $(gettimeofday-y) never becomes empty.
>
> riscv conditionally selects GENERIC_GETTIMEOFDAY when MMU=y && 64BIT=y,
> but arch/riscv/kernel/vdso/vgettimeofday.o is built only under that
> condition. So, you can always define CFLAGS_vgettimeofday.o
>
> Remove all the meaningless conditionals.
>
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> ---
>
> Changes in v2:
> - Fix csky as well
>
> [..]
> arch/mips/vdso/Makefile | 2 --
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply
* Re: [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping
From: Christophe Leroy @ 2021-08-05 9:13 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev
Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja
In-Reply-To: <20210713053113.4632-3-cmr@linux.ibm.com>
Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> When live patching with STRICT_KERNEL_RWX the CPU doing the patching
> must temporarily remap the page(s) containing the patch site with +W
> permissions. While this temporary mapping is in use, another CPU could
> write to the same mapping and maliciously alter kernel text. Implement a
> LKDTM test to attempt to exploit such an opening during code patching.
> The test is implemented on powerpc and requires LKDTM built into the
> kernel (building LKDTM as a module is insufficient).
>
> The LKDTM "hijack" test works as follows:
>
> 1. A CPU executes an infinite loop to patch an instruction. This is
> the "patching" CPU.
> 2. Another CPU attempts to write to the address of the temporary
> mapping used by the "patching" CPU. This other CPU is the
> "hijacker" CPU. The hijack either fails with a fault/error or
> succeeds, in which case some kernel text is now overwritten.
>
> The virtual address of the temporary patch mapping is provided via an
> LKDTM-specific accessor to the hijacker CPU. This test assumes a
> hypothetical situation where this address was leaked previously.
>
> How to run the test:
>
> mount -t debugfs none /sys/kernel/debug
> (echo HIJACK_PATCH > /sys/kernel/debug/provoke-crash/DIRECT)
>
> A passing test indicates that it is not possible to overwrite kernel
> text from another CPU by using the temporary mapping established by
> a CPU for patching.
>
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
>
> ---
>
> v5: * Use `u32*` instead of `struct ppc_inst*` based on new series in
> upstream.
>
> v4: * Separate the powerpc and x86_64 bits into individual patches.
> * Use __put_kernel_nofault() when attempting to hijack the mapping
> * Use raw_smp_processor_id() to avoid triggering the BUG() when
> calling smp_processor_id() in preemptible code - the only thing
> that matters is that one of the threads is bound to a different
> CPU - we are not using smp_processor_id() to access any per-cpu
> data or similar where preemption should be disabled.
> * Rework the patching_cpu() kthread stop condition to avoid:
> https://lwn.net/Articles/628628/
> ---
> drivers/misc/lkdtm/core.c | 1 +
> drivers/misc/lkdtm/lkdtm.h | 1 +
> drivers/misc/lkdtm/perms.c | 134 +++++++++++++++++++++++++++++++++++++
> 3 files changed, 136 insertions(+)
>
> diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
> index 8024b6a5cc7fc..fbcb95eda337b 100644
> --- a/drivers/misc/lkdtm/core.c
> +++ b/drivers/misc/lkdtm/core.c
> @@ -147,6 +147,7 @@ static const struct crashtype crashtypes[] = {
> CRASHTYPE(WRITE_RO),
> CRASHTYPE(WRITE_RO_AFTER_INIT),
> CRASHTYPE(WRITE_KERN),
> + CRASHTYPE(HIJACK_PATCH),
> CRASHTYPE(REFCOUNT_INC_OVERFLOW),
> CRASHTYPE(REFCOUNT_ADD_OVERFLOW),
> CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW),
> diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
> index 99f90d3e5e9cb..87e7e6136d962 100644
> --- a/drivers/misc/lkdtm/lkdtm.h
> +++ b/drivers/misc/lkdtm/lkdtm.h
> @@ -62,6 +62,7 @@ void lkdtm_EXEC_USERSPACE(void);
> void lkdtm_EXEC_NULL(void);
> void lkdtm_ACCESS_USERSPACE(void);
> void lkdtm_ACCESS_NULL(void);
> +void lkdtm_HIJACK_PATCH(void);
>
> /* refcount.c */
> void lkdtm_REFCOUNT_INC_OVERFLOW(void);
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 2dede2ef658f3..39e7456852229 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -9,6 +9,7 @@
> #include <linux/vmalloc.h>
> #include <linux/mman.h>
> #include <linux/uaccess.h>
> +#include <linux/kthread.h>
> #include <asm/cacheflush.h>
>
> /* Whether or not to fill the target memory area with do_nothing(). */
> @@ -222,6 +223,139 @@ void lkdtm_ACCESS_NULL(void)
> pr_err("FAIL: survived bad write\n");
> }
>
> +#if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> + defined(CONFIG_PPC))
I think this test shouldn't be limited to CONFIG_PPC and shouldn't be limited to
CONFIG_STRICT_KERNEL_RWX. It should be there all the time.
Also why limiting it to IS_BUILTIN(CONFIG_LKDTM) ?
> +/*
> + * This is just a dummy location to patch-over.
> + */
> +static void patching_target(void)
> +{
> + return;
> +}
> +
> +#include <asm/code-patching.h>
> +const u32 *patch_site = (const u32 *)&patching_target;
> +
> +static inline int lkdtm_do_patch(u32 data)
> +{
> + return patch_instruction((u32 *)patch_site, ppc_inst(data));
> +}
> +
> +static inline u32 lkdtm_read_patch_site(void)
> +{
> + return READ_ONCE(*patch_site);
> +}
> +
> +/* Returns True if the write succeeds */
> +static inline bool lkdtm_try_write(u32 data, u32 *addr)
> +{
> + __put_kernel_nofault(addr, &data, u32, err);
> + return true;
> +
> +err:
> + return false;
> +}
> +
> +static int lkdtm_patching_cpu(void *data)
> +{
> + int err = 0;
> + u32 val = 0xdeadbeef;
> +
> + pr_info("starting patching_cpu=%d\n", raw_smp_processor_id());
> +
> + do {
> + err = lkdtm_do_patch(val);
> + } while (lkdtm_read_patch_site() == val && !err && !kthread_should_stop());
> +
> + if (err)
> + pr_warn("XFAIL: patch_instruction returned error: %d\n", err);
> +
> + while (!kthread_should_stop()) {
> + set_current_state(TASK_INTERRUPTIBLE);
> + schedule();
> + }
> +
> + return err;
> +}
> +
> +void lkdtm_HIJACK_PATCH(void)
> +{
> + struct task_struct *patching_kthrd;
> + int patching_cpu, hijacker_cpu, attempts;
> + unsigned long addr;
> + bool hijacked;
> + const u32 bad_data = 0xbad00bad;
> + const u32 original_insn = lkdtm_read_patch_site();
> +
> + if (!IS_ENABLED(CONFIG_SMP)) {
> + pr_err("XFAIL: this test requires CONFIG_SMP\n");
> + return;
> + }
> +
> + if (num_online_cpus() < 2) {
> + pr_warn("XFAIL: this test requires at least two cpus\n");
> + return;
> + }
> +
> + hijacker_cpu = raw_smp_processor_id();
> + patching_cpu = cpumask_any_but(cpu_online_mask, hijacker_cpu);
> +
> + patching_kthrd = kthread_create_on_node(&lkdtm_patching_cpu, NULL,
> + cpu_to_node(patching_cpu),
> + "lkdtm_patching_cpu");
> + kthread_bind(patching_kthrd, patching_cpu);
> + wake_up_process(patching_kthrd);
> +
> + addr = offset_in_page(patch_site) | read_cpu_patching_addr(patching_cpu);
> +
> + pr_info("starting hijacker_cpu=%d\n", hijacker_cpu);
> + for (attempts = 0; attempts < 100000; ++attempts) {
> + /* Try to write to the other CPU's temp patch mapping */
> + hijacked = lkdtm_try_write(bad_data, (u32 *)addr);
> +
> + if (hijacked) {
> + if (kthread_stop(patching_kthrd)) {
> + pr_info("hijack attempts: %d\n", attempts);
> + pr_err("XFAIL: error stopping patching cpu\n");
> + return;
> + }
> + break;
> + }
> + }
> + pr_info("hijack attempts: %d\n", attempts);
> +
> + if (hijacked) {
> + if (lkdtm_read_patch_site() == bad_data)
> + pr_err("overwrote kernel text\n");
> + /*
> + * There are window conditions where the hijacker cpu manages to
> + * write to the patch site but the site gets overwritten again by
> + * the patching cpu. We still consider that a "successful" hijack
> + * since the hijacker cpu did not fault on the write.
> + */
> + pr_err("FAIL: wrote to another cpu's patching area\n");
> + } else {
> + kthread_stop(patching_kthrd);
> + }
> +
> + /* Restore the original data to be able to run the test again */
> + lkdtm_do_patch(original_insn);
> +}
> +
> +#else
> +
> +void lkdtm_HIJACK_PATCH(void)
> +{
> + if (!IS_ENABLED(CONFIG_PPC))
> + pr_err("XFAIL: this test only runs on powerpc\n");
> + if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> + pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
> + if (!IS_BUILTIN(CONFIG_LKDTM))
> + pr_err("XFAIL: this test requires CONFIG_LKDTM=y (not =m!)\n");
> +}
> +
> +#endif
> +
> void __init lkdtm_perms_init(void)
> {
> /* Make sure we can write to __ro_after_init values during __init */
>
^ permalink raw reply
* Re: [PATCH v5 4/8] lkdtm/x86_64: Add test to hijack a patch mapping
From: Christophe Leroy @ 2021-08-05 9:09 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev
Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja
In-Reply-To: <20210713053113.4632-5-cmr@linux.ibm.com>
Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> A previous commit implemented an LKDTM test on powerpc to exploit the
> temporary mapping established when patching code with STRICT_KERNEL_RWX
> enabled. Extend the test to work on x86_64 as well.
>
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
> ---
> drivers/misc/lkdtm/perms.c | 26 ++++++++++++++++++++++----
> 1 file changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 39e7456852229..41e87e5f9cc86 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -224,7 +224,7 @@ void lkdtm_ACCESS_NULL(void)
> }
>
> #if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> - defined(CONFIG_PPC))
> + (defined(CONFIG_PPC) || defined(CONFIG_X86_64)))
> /*
> * This is just a dummy location to patch-over.
> */
> @@ -233,12 +233,25 @@ static void patching_target(void)
> return;
> }
>
> -#include <asm/code-patching.h>
> const u32 *patch_site = (const u32 *)&patching_target;
>
> +#ifdef CONFIG_PPC
> +#include <asm/code-patching.h>
> +#endif
> +
> +#ifdef CONFIG_X86_64
> +#include <asm/text-patching.h>
> +#endif
> +
> static inline int lkdtm_do_patch(u32 data)
> {
> +#ifdef CONFIG_PPC
> return patch_instruction((u32 *)patch_site, ppc_inst(data));
> +#endif
> +#ifdef CONFIG_X86_64
> + text_poke((void *)patch_site, &data, sizeof(u32));
> + return 0;
> +#endif
> }
>
> static inline u32 lkdtm_read_patch_site(void)
> @@ -249,11 +262,16 @@ static inline u32 lkdtm_read_patch_site(void)
> /* Returns True if the write succeeds */
> static inline bool lkdtm_try_write(u32 data, u32 *addr)
> {
> +#ifdef CONFIG_PPC
> __put_kernel_nofault(addr, &data, u32, err);
> return true;
>
> err:
> return false;
> +#endif
> +#ifdef CONFIG_X86_64
> + return !__put_user(data, addr);
> +#endif
> }
>
> static int lkdtm_patching_cpu(void *data)
> @@ -346,8 +364,8 @@ void lkdtm_HIJACK_PATCH(void)
>
> void lkdtm_HIJACK_PATCH(void)
> {
> - if (!IS_ENABLED(CONFIG_PPC))
> - pr_err("XFAIL: this test only runs on powerpc\n");
> + if (!IS_ENABLED(CONFIG_PPC) && !IS_ENABLED(CONFIG_X86_64))
> + pr_err("XFAIL: this test only runs on powerpc and x86_64\n");
> if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
> if (!IS_BUILTIN(CONFIG_LKDTM))
>
Instead of spreading arch specific stuff into LKDTM, wouldn't it make sence to define common a
common API ? Because the day another arch like arm64 implements it own approach, do we add specific
functions again and again into LKDTM ?
Also, I find it odd to define tests only when they can succeed. For other tests like
ACCESS_USERSPACE, they are there all the time, regardless of whether we have selected
CONFIG_PPC_KUAP or not. I think it should be the same here, have it all there time, if
CONFIG_STRICT_KERNEL_RWX is selected the test succeeds otherwise it fails, but it is always there.
Christophe
^ permalink raw reply
* Re: [PATCH v5 0/8] Use per-CPU temporary mappings for patching on Radix MMU
From: Christophe Leroy @ 2021-08-05 9:03 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev
Cc: keescook, peterz, x86, npiggin, linux-hardening, tglx, dja
In-Reply-To: <20210713053113.4632-1-cmr@linux.ibm.com>
Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> When compiled with CONFIG_STRICT_KERNEL_RWX, the kernel must create
> temporary mappings when patching itself. These mappings temporarily
> override the strict RWX text protections to permit a write. Currently,
> powerpc allocates a per-CPU VM area for patching. Patching occurs as
> follows:
>
> 1. Map page in per-CPU VM area w/ PAGE_KERNEL protection
> 2. Patch text
> 3. Remove the temporary mapping
>
> While the VM area is per-CPU, the mapping is actually inserted into the
> kernel page tables. Presumably, this could allow another CPU to access
> the normally write-protected text - either malicously or accidentally -
> via this same mapping if the address of the VM area is known. Ideally,
> the mapping should be kept local to the CPU doing the patching [0].
>
> x86 introduced "temporary mm" structs which allow the creation of mappings
> local to a particular CPU [1]. This series intends to bring the notion of a
> temporary mm to powerpc's Book3s64 Radix MMU and harden it by using such a
> mapping for patching a kernel with strict RWX permissions.
>
> The first four patches implement an LKDTM test "proof-of-concept" which
> exploits the potential vulnerability (ie. the temporary mapping during patching
> is exposed in the kernel page tables and accessible by other CPUs) using a
> simple brute-force approach. This test is implemented for both powerpc and
> x86_64. The test passes on powerpc Radix with this new series, fails on
> upstream powerpc, passes on upstream x86_64, and fails on an older (ancient)
> x86_64 tree without the x86_64 temporary mm patches. The remaining patches add
> support for and use a temporary mm for code patching on powerpc with the Radix
> MMU.
I think four first patches (together with last one) are quite independent from the heart of the
series itself which is patches 5, 6, 7. Maybe you should split that series it two series ? After all
those selftests are nice to have but are not absolutely necessary, that would help getting forward I
think.
>
> Tested boot, ftrace, and repeated LKDTM "hijack":
> - QEMU+KVM (host: POWER9 Blackbird): Radix MMU w/ KUAP
> - QEMU+KVM (host: POWER9 Blackbird): Hash MMU
>
> Tested repeated LKDTM "hijack":
> - QEMU+KVM (host: AMD desktop): x86_64 upstream
> - QEMU+KVM (host: AMD desktop): x86_64 w/o percpu temp mm to
> verify the LKDTM "hijack" test fails
>
> Tested boot and ftrace:
> - QEMU+TCG: ppc44x (bamboo)
> - QEMU+TCG: g5 (mac99)
>
> I also tested with various extra config options enabled as suggested in
> section 12) in Documentation/process/submit-checklist.rst.
>
> v5: * Only support Book3s64 Radix MMU for now. There are some issues with
> the previous implementation on the Hash MMU as pointed out by Nick
> Piggin. Fixing these is not trivial so we only support the Radix MMU
> for now. I tried using realmode (no data translation) to patch with
> Hash to at least avoid exposing the patch mapping to other CPUs but
> this doesn't reliably work either since we cannot access vmalloc'd
> space in realmode.
So you now accept to have two different mode depending on the platform ?
As far as I remember I commented some time ago that non SMP didn't need that feature and you were
reluctant to have two different implementations. What made you change your mind ? (just curious).
> * Use percpu variables for the patching_mm and patching_addr. This
> avoids the need for synchronization mechanisms entirely. Thanks to
> Peter Zijlstra for pointing out text_mutex which unfortunately didn't
> work out without larger code changes in powerpc. Also thanks to Daniel
> Axtens for comments about using percpu variables for the *percpu* temp
> mm things off list.
>
> v4: * It's time to revisit this series again since @jpn and @mpe fixed
> our known STRICT_*_RWX bugs on powerpc/64s.
> * Rebase on linuxppc/next:
> commit ee1bc694fbaec ("powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n")
> * Completely rework how map_patch() works on book3s64 Hash MMU
> * Split the LKDTM x86_64 and powerpc bits into separate patches
> * Annotate commit messages with changes from v3 instead of
> listing them here completely out-of context...
>
> v3: * Rebase on linuxppc/next: commit 9123e3a74ec7 ("Linux 5.9-rc1")
> * Move temporary mm implementation into code-patching.c where it
> belongs
> * Implement LKDTM hijacker test on x86_64 (on IBM time oof) Do
> * not use address zero for the patching address in the
> temporary mm (thanks @dja for pointing this out!)
> * Wrap the LKDTM test w/ CONFIG_SMP as suggested by Christophe
> Leroy
> * Comments to clarify PTE pre-allocation and patching addr
> selection
>
> v2: * Rebase on linuxppc/next:
> commit 105fb38124a4 ("powerpc/8xx: Modify ptep_get()")
> * Always dirty pte when mapping patch
> * Use `ppc_inst_len` instead of `sizeof` on instructions
> * Declare LKDTM patching addr accessor in header where it belongs
>
> v1: * Rebase on linuxppc/next (4336b9337824)
> * Save and restore second hw watchpoint
> * Use new ppc_inst_* functions for patching check and in LKDTM test
>
> rfc-v2: * Many fixes and improvements mostly based on extensive feedback
> and testing by Christophe Leroy (thanks!).
> * Make patching_mm and patching_addr static and move
> '__ro_after_init' to after the variable name (more common in
> other parts of the kernel)
> * Use 'asm/debug.h' header instead of 'asm/hw_breakpoint.h' to
> fix PPC64e compile
> * Add comment explaining why we use BUG_ON() during the init
> call to setup for patching later
> * Move ptep into patch_mapping to avoid walking page tables a
> second time when unmapping the temporary mapping
> * Use KUAP under non-radix, also manually dirty the PTE for patch
> mapping on non-BOOK3S_64 platforms
> * Properly return any error from __patch_instruction
> * Do not use 'memcmp' where a simple comparison is appropriate
> * Simplify expression for patch address by removing pointer maths
> * Add LKDTM test
>
> [0]: https://github.com/linuxppc/issues/issues/224
> [1]: https://lore.kernel.org/kernel-hardening/20190426232303.28381-1-nadav.amit@gmail.com/
>
> Christopher M. Riedl (8):
> powerpc: Add LKDTM accessor for patching addr
> lkdtm/powerpc: Add test to hijack a patch mapping
> x86_64: Add LKDTM accessor for patching addr
> lkdtm/x86_64: Add test to hijack a patch mapping
> powerpc/64s: Introduce temporary mm for Radix MMU
> powerpc: Rework and improve STRICT_KERNEL_RWX patching
> powerpc/64s: Initialize and use a temporary mm for patching on Radix
> lkdtm/powerpc: Fix code patching hijack test
>
> arch/powerpc/include/asm/code-patching.h | 4 +
> arch/powerpc/include/asm/debug.h | 1 +
> arch/powerpc/kernel/process.c | 5 +
> arch/powerpc/lib/code-patching.c | 240 ++++++++++++++++++++---
> arch/x86/include/asm/text-patching.h | 4 +
> arch/x86/kernel/alternative.c | 7 +
> drivers/misc/lkdtm/core.c | 1 +
> drivers/misc/lkdtm/lkdtm.h | 1 +
> drivers/misc/lkdtm/perms.c | 143 ++++++++++++++
> 9 files changed, 378 insertions(+), 28 deletions(-)
>
^ permalink raw reply
* Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
From: Xianting Tian @ 2021-08-05 8:23 UTC (permalink / raw)
To: Greg KH
Cc: arnd, amit, Jiri Slaby, linux-kernel, virtualization,
linuxppc-dev, osandov
In-Reply-To: <YQue3tK98e6fAqwP@kroah.com>
在 2021/8/5 下午4:18, Greg KH 写道:
> On Thu, Aug 05, 2021 at 04:08:46PM +0800, Xianting Tian wrote:
>> 在 2021/8/5 下午3:58, Jiri Slaby 写道:
>>> Hi,
>>>
>>> On 04. 08. 21, 4:54, Xianting Tian wrote:
>>>> @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno,
>>>> int data,
>>>> hp->outbuf_size = outbuf_size;
>>>> hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
>>>> + /*
>>>> + * hvc_con_outbuf is guaranteed to be aligned at least to the
>>>> + * size(N_OUTBUF) by kmalloc().
>>>> + */
>>>> + hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL);
>>>> + if (!hp->hvc_con_outbuf)
>>>> + return ERR_PTR(-ENOMEM);
>>> This leaks hp, right?
>>>
>>> BTW your 2 patches are still not threaded, that is hard to follow.
>> yes, thanks, I found the bug, I am preparing to do this in v4.
>>
>> It is the first time I send series patches(number >1), I checked the method
>> for sending series patch on LKML.org, I should send '0/2' which is the
>> history info for series patches.
> Please use 'git send-email' to send the full series all at once,
> otherwise it is hard to make the emails threaded "by hand" if you do not
> do so.
I got it, thanks for your guide:)
>
> thanks,
>
> greg k-h
^ permalink raw reply
* Re: [PATCH 3/3] powerpc/bpf: Reallocate BPF registers to volatile registers when possible on PPC64
From: Christophe Leroy @ 2021-08-05 8:21 UTC (permalink / raw)
To: Jordan Niethe, linuxppc-dev; +Cc: naveen.n.rao
In-Reply-To: <20210727065539.299598-3-jniethe5@gmail.com>
Le 27/07/2021 à 08:55, Jordan Niethe a écrit :
> Implement commit 40272035e1d0 ("powerpc/bpf: Reallocate BPF registers to
> volatile registers when possible on PPC32") for PPC64.
>
> When the BPF routine doesn't call any function, the non volatile
> registers can be reallocated to volatile registers in order to avoid
> having to save them/restore on the stack. To keep track of which
> registers can be reallocated to make sure registers are set seen when
> used.
Maybe you could try and do as on PPC32, try to use r0 as much as possible instead of TMP regs.
r0 needs to be used carefully because for some instructions (ex: addi, lwz, etc) r0 means 0 instead
of register 0, but it would help freeing one more register in several cases.
>
> Before this patch, the test #359 ADD default X is:
> 0: nop
> 4: nop
> 8: std r27,-40(r1)
> c: std r28,-32(r1)
> 10: xor r8,r8,r8
> 14: rotlwi r8,r8,0
> 18: xor r28,r28,r28
> 1c: rotlwi r28,r28,0
> 20: mr r27,r3
> 24: li r8,66
> 28: add r8,r8,r28
> 2c: rotlwi r8,r8,0
> 30: ld r27,-40(r1)
> 34: ld r28,-32(r1)
> 38: mr r3,r8
> 3c: blr
>
> After this patch, the same test has become:
> 0: nop
> 4: nop
> 8: xor r8,r8,r8
> c: rotlwi r8,r8,0
> 10: xor r5,r5,r5
> 14: rotlwi r5,r5,0
> 18: mr r4,r3
> 1c: li r8,66
> 20: add r8,r8,r5
> 24: rotlwi r8,r8,0
> 28: mr r3,r8
> 2c: blr
>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
> arch/powerpc/net/bpf_jit64.h | 2 ++
> arch/powerpc/net/bpf_jit_comp64.c | 60 +++++++++++++++++++++++++------
> 2 files changed, 52 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
> index 89b625d9342b..e20521bf77bf 100644
> --- a/arch/powerpc/net/bpf_jit64.h
> +++ b/arch/powerpc/net/bpf_jit64.h
> @@ -70,6 +70,7 @@ const int b2p[MAX_BPF_JIT_REG + 2] = {
> */
> #define PPC_BPF_LL(ctx, r, base, i) do { \
> if ((i) % 4) { \
> + bpf_set_seen_register(ctx, bpf_to_ppc(ctx, TMP_REG_2));\
> EMIT(PPC_RAW_LI(bpf_to_ppc(ctx, TMP_REG_2), (i)));\
> EMIT(PPC_RAW_LDX(r, base, \
> bpf_to_ppc(ctx, TMP_REG_2))); \
> @@ -78,6 +79,7 @@ const int b2p[MAX_BPF_JIT_REG + 2] = {
> } while(0)
> #define PPC_BPF_STL(ctx, r, base, i) do { \
> if ((i) % 4) { \
> + bpf_set_seen_register(ctx, bpf_to_ppc(ctx, TMP_REG_2));\
> EMIT(PPC_RAW_LI(bpf_to_ppc(ctx, TMP_REG_2), (i)));\
> EMIT(PPC_RAW_STDX(r, base, \
> bpf_to_ppc(ctx, TMP_REG_2))); \
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index f7a668c1e364..287e0322bbf3 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -66,6 +66,24 @@ static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
>
> void bpf_jit_realloc_regs(struct codegen_context *ctx)
> {
> + if (ctx->seen & SEEN_FUNC)
> + return;
> +
> + while (ctx->seen & SEEN_NVREG_MASK &&
> + (ctx->seen & SEEN_VREG_MASK) != SEEN_VREG_MASK) {
> + int old = 32 - fls(ctx->seen & SEEN_NVREG_MASK);
> + int new = 32 - fls(~ctx->seen & SEEN_VREG_MASK);
> + int i;
> +
> + for (i = BPF_REG_0; i <= TMP_REG_2; i++) {
> + if (ctx->b2p[i] != old)
> + continue;
> + ctx->b2p[i] = new;
> + bpf_set_seen_register(ctx, new);
> + bpf_clear_seen_register(ctx, old);
> + break;
> + }
> + }
This function is not very different from the one for PPC32. Maybe we could cook a common function.
> }
>
> void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
> @@ -106,10 +124,9 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
> * If we haven't created our own stack frame, we save these
> * in the protected zone below the previous stack frame
> */
> - for (i = BPF_REG_6; i <= BPF_REG_10; i++)
> - if (bpf_is_seen_register(ctx, bpf_to_ppc(ctx, i)))
> - PPC_BPF_STL(ctx, bpf_to_ppc(ctx, i), 1,
> - bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ctx, i)));
> + for (i = BPF_PPC_NVR_MIN; i <= 31; i++)
> + if (bpf_is_seen_register(ctx, i))
> + PPC_BPF_STL(ctx, i, 1, bpf_jit_stack_offsetof(ctx, i));
>
> /* Setup frame pointer to point to the bpf stack area */
> if (bpf_is_seen_register(ctx, bpf_to_ppc(ctx, BPF_REG_FP)))
> @@ -122,10 +139,9 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx
> int i;
>
> /* Restore NVRs */
> - for (i = BPF_REG_6; i <= BPF_REG_10; i++)
> - if (bpf_is_seen_register(ctx, bpf_to_ppc(ctx, i)))
> - PPC_BPF_LL(ctx, bpf_to_ppc(ctx, i), 1,
> - bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ctx, i)));
> + for (i = BPF_PPC_NVR_MIN; i <= 31; i++)
> + if (bpf_is_seen_register(ctx, i))
> + PPC_BPF_LL(ctx, i, 1, bpf_jit_stack_offsetof(ctx, i));
>
> /* Tear down our stack frame */
> if (bpf_has_stack_frame(ctx)) {
> @@ -223,6 +239,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32
> * if (index >= array->map.max_entries)
> * goto out;
> */
> + bpf_set_seen_register(ctx, bpf_to_ppc(ctx, TMP_REG_1));
> EMIT(PPC_RAW_LWZ(bpf_to_ppc(ctx, TMP_REG_1), b2p_bpf_array,
> offsetof(struct bpf_array, map.max_entries)));
> EMIT(PPC_RAW_RLWINM(b2p_index, b2p_index, 0, 0, 31));
> @@ -318,9 +335,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> * optimization but everything else should work without
> * any issues.
> */
> - if (dst_reg >= BPF_PPC_NVR_MIN && dst_reg < 32)
> + if (dst_reg >= 3 && dst_reg < 32)
> bpf_set_seen_register(ctx, dst_reg);
> - if (src_reg >= BPF_PPC_NVR_MIN && src_reg < 32)
> + if (src_reg >= 3 && src_reg < 32)
> bpf_set_seen_register(ctx, src_reg);
>
> switch (code) {
> @@ -345,6 +362,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> if (imm >= -32768 && imm < 32768)
> EMIT(PPC_RAW_ADDI(dst_reg, dst_reg, IMM_L(imm)));
> else {
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> EMIT(PPC_RAW_ADD(dst_reg, dst_reg, tmp1_reg));
> }
> @@ -362,6 +380,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> if (imm >= -32768 && imm < 32768)
> EMIT(PPC_RAW_MULI(dst_reg, dst_reg, IMM_L(imm)));
> else {
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> if (BPF_CLASS(code) == BPF_ALU)
> EMIT(PPC_RAW_MULW(dst_reg, dst_reg, tmp1_reg));
> @@ -372,6 +391,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_ALU | BPF_DIV | BPF_X: /* (u32) dst /= (u32) src */
> case BPF_ALU | BPF_MOD | BPF_X: /* (u32) dst %= (u32) src */
> if (BPF_OP(code) == BPF_MOD) {
> + bpf_set_seen_register(ctx, tmp1_reg);
> EMIT(PPC_RAW_DIVWU(tmp1_reg, dst_reg, src_reg));
> EMIT(PPC_RAW_MULW(tmp1_reg, src_reg, tmp1_reg));
> EMIT(PPC_RAW_SUB(dst_reg, dst_reg, tmp1_reg));
> @@ -381,6 +401,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_ALU64 | BPF_DIV | BPF_X: /* dst /= src */
> case BPF_ALU64 | BPF_MOD | BPF_X: /* dst %= src */
> if (BPF_OP(code) == BPF_MOD) {
> + bpf_set_seen_register(ctx, tmp1_reg);
> EMIT(PPC_RAW_DIVDU(tmp1_reg, dst_reg, src_reg));
> EMIT(PPC_RAW_MULD(tmp1_reg, src_reg, tmp1_reg));
> EMIT(PPC_RAW_SUB(dst_reg, dst_reg, tmp1_reg));
> @@ -396,10 +417,12 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> else if (imm == 1)
> goto bpf_alu32_trunc;
>
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> switch (BPF_CLASS(code)) {
> case BPF_ALU:
> if (BPF_OP(code) == BPF_MOD) {
> + bpf_set_seen_register(ctx, tmp2_reg);
> EMIT(PPC_RAW_DIVWU(tmp2_reg, dst_reg, tmp1_reg));
> EMIT(PPC_RAW_MULW(tmp1_reg, tmp1_reg, tmp2_reg));
> EMIT(PPC_RAW_SUB(dst_reg, dst_reg, tmp1_reg));
> @@ -408,6 +431,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> break;
> case BPF_ALU64:
> if (BPF_OP(code) == BPF_MOD) {
> + bpf_set_seen_register(ctx, tmp2_reg);
> EMIT(PPC_RAW_DIVDU(tmp2_reg, dst_reg, tmp1_reg));
> EMIT(PPC_RAW_MULD(tmp1_reg, tmp1_reg, tmp2_reg));
> EMIT(PPC_RAW_SUB(dst_reg, dst_reg, tmp1_reg));
> @@ -434,6 +458,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> EMIT(PPC_RAW_ANDI(dst_reg, dst_reg, IMM_L(imm)));
> else {
> /* Sign-extended */
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> EMIT(PPC_RAW_AND(dst_reg, dst_reg, tmp1_reg));
> }
> @@ -446,6 +471,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_ALU64 | BPF_OR | BPF_K:/* dst = dst | imm */
> if (imm < 0 && BPF_CLASS(code) == BPF_ALU64) {
> /* Sign-extended */
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> EMIT(PPC_RAW_OR(dst_reg, dst_reg, tmp1_reg));
> } else {
> @@ -463,6 +489,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_ALU64 | BPF_XOR | BPF_K: /* dst ^= imm */
> if (imm < 0 && BPF_CLASS(code) == BPF_ALU64) {
> /* Sign-extended */
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> EMIT(PPC_RAW_XOR(dst_reg, dst_reg, tmp1_reg));
> } else {
> @@ -562,6 +589,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> if (BPF_SRC(code) == BPF_FROM_LE)
> goto emit_clear;
> #endif
> + bpf_set_seen_register(ctx, tmp1_reg);
> switch (imm) {
> case 16:
> /* Rotate 8 bits left & mask with 0x0000ff00 */
> @@ -625,6 +653,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_STX | BPF_MEM | BPF_B: /* *(u8 *)(dst + off) = src */
> case BPF_ST | BPF_MEM | BPF_B: /* *(u8 *)(dst + off) = imm */
> if (BPF_CLASS(code) == BPF_ST) {
> + bpf_set_seen_register(ctx, tmp1_reg);
> EMIT(PPC_RAW_LI(tmp1_reg, imm));
> src_reg = tmp1_reg;
> }
> @@ -633,6 +662,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_STX | BPF_MEM | BPF_H: /* (u16 *)(dst + off) = src */
> case BPF_ST | BPF_MEM | BPF_H: /* (u16 *)(dst + off) = imm */
> if (BPF_CLASS(code) == BPF_ST) {
> + bpf_set_seen_register(ctx, tmp1_reg);
> EMIT(PPC_RAW_LI(tmp1_reg, imm));
> src_reg = tmp1_reg;
> }
> @@ -641,6 +671,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_STX | BPF_MEM | BPF_W: /* *(u32 *)(dst + off) = src */
> case BPF_ST | BPF_MEM | BPF_W: /* *(u32 *)(dst + off) = imm */
> if (BPF_CLASS(code) == BPF_ST) {
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> src_reg = tmp1_reg;
> }
> @@ -649,6 +680,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_STX | BPF_MEM | BPF_DW: /* (u64 *)(dst + off) = src */
> case BPF_ST | BPF_MEM | BPF_DW: /* *(u64 *)(dst + off) = imm */
> if (BPF_CLASS(code) == BPF_ST) {
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> src_reg = tmp1_reg;
> }
> @@ -669,6 +701,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> /* *(u32 *)(dst + off) += src */
>
> /* Get EA into TMP_REG_1 */
> + bpf_set_seen_register(ctx, tmp1_reg);
> + bpf_set_seen_register(ctx, tmp2_reg);
> EMIT(PPC_RAW_ADDI(tmp1_reg, dst_reg, off));
> tmp_idx = ctx->idx * 4;
> /* load value from memory into TMP_REG_2 */
> @@ -689,6 +723,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> }
> /* *(u64 *)(dst + off) += src */
>
> + bpf_set_seen_register(ctx, tmp1_reg);
> + bpf_set_seen_register(ctx, tmp2_reg);
> EMIT(PPC_RAW_ADDI(tmp1_reg, dst_reg, off));
> tmp_idx = ctx->idx * 4;
> EMIT(PPC_RAW_LDARX(tmp2_reg, 0, tmp1_reg, 0));
> @@ -870,6 +906,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> break;
> case BPF_JMP | BPF_JSET | BPF_X:
> case BPF_JMP32 | BPF_JSET | BPF_X:
> + bpf_set_seen_register(ctx, tmp1_reg);
> if (BPF_CLASS(code) == BPF_JMP) {
> EMIT(PPC_RAW_AND_DOT(tmp1_reg, dst_reg, src_reg));
> } else {
> @@ -903,6 +940,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> EMIT(PPC_RAW_CMPLDI(dst_reg, imm));
> } else {
> /* sign-extending load */
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> /* ... but unsigned comparison */
> if (is_jmp32)
> @@ -933,6 +971,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> else
> EMIT(PPC_RAW_CMPDI(dst_reg, imm));
> } else {
> + bpf_set_seen_register(ctx, tmp1_reg);
> PPC_LI32(tmp1_reg, imm);
> if (is_jmp32)
> EMIT(PPC_RAW_CMPW(dst_reg, tmp1_reg));
> @@ -944,6 +983,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
> case BPF_JMP | BPF_JSET | BPF_K:
> case BPF_JMP32 | BPF_JSET | BPF_K:
> /* andi does not sign-extend the immediate */
> + bpf_set_seen_register(ctx, tmp1_reg);
> if (imm >= 0 && imm < 32768)
> /* PPC_ANDI is _only/always_ dot-form */
> EMIT(PPC_RAW_ANDI(tmp1_reg, dst_reg, imm));
>
^ permalink raw reply
* Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
From: Xianting Tian @ 2021-08-05 8:08 UTC (permalink / raw)
To: Jiri Slaby, gregkh, amit, arnd, osandov
Cc: linuxppc-dev, linux-kernel, virtualization
In-Reply-To: <0f26a1c3-53e8-9282-69e8-8d81a9cafc59@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 1210 bytes --]
在 2021/8/5 下午3:58, Jiri Slaby 写道:
> Hi,
>
> On 04. 08. 21, 4:54, Xianting Tian wrote:
>> @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno,
>> int data,
>> hp->outbuf_size = outbuf_size;
>> hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
>> + /*
>> + * hvc_con_outbuf is guaranteed to be aligned at least to the
>> + * size(N_OUTBUF) by kmalloc().
>> + */
>> + hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL);
>> + if (!hp->hvc_con_outbuf)
>> + return ERR_PTR(-ENOMEM);
>
> This leaks hp, right?
>
> BTW your 2 patches are still not threaded, that is hard to follow.
yes, thanks, I found the bug, I am preparing to do this in v4.
It is the first time I send series patches(number >1), I checked the
method for sending series patch on LKML.org, I should send '0/2' which
is the history info for series patches.
I will add 0/2 in v4, sorry again for this:(
beside avove things, the solution in this patch is for you? thanks
>
>> +
>> + spin_lock_init(&hp->hvc_con_lock);
>> +
>> tty_port_init(&hp->port);
>> hp->port.ops = &hvc_port_ops;
>
> thanks,
[-- Attachment #2: Type: text/html, Size: 3407 bytes --]
^ permalink raw reply
* [PATCH linux-next] powerpc/tm: remove duplicate include in tm-poison.c
From: cgel.zte @ 2021-08-05 6:52 UTC (permalink / raw)
To: mpe
Cc: yong.yiran, Zeal Robot, linuxppc-dev, linux-kernel, paulus,
linux-kselftest, shuah
From: yong yiran <yong.yiran@zte.com.cn>
'inttypes.h' included in 'tm-poison.c' is duplicated.
Remove all but the first include of inttypes.h from tm-poison.c.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: yong yiran <yong.yiran@zte.com.cn>
---
tools/testing/selftests/powerpc/tm/tm-poison.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/powerpc/tm/tm-poison.c b/tools/testing/selftests/powerpc/tm/tm-poison.c
index 29e5f26af7b9..27c083a03d1f 100644
--- a/tools/testing/selftests/powerpc/tm/tm-poison.c
+++ b/tools/testing/selftests/powerpc/tm/tm-poison.c
@@ -20,7 +20,6 @@
#include <sched.h>
#include <sys/types.h>
#include <signal.h>
-#include <inttypes.h>
#include "tm.h"
--
2.25.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox