LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH kernel] powerpc/iommu: Annotate nested lock for lockdep
From: Alexey Kardashevskiy @ 2021-02-23  5:13 UTC (permalink / raw)
  To: Frederic Barrat, linuxppc-dev; +Cc: kvm-ppc
In-Reply-To: <49b1f5cb-107c-296f-c339-13e627a73d6d@linux.ibm.com>



On 18/02/2021 23:59, Frederic Barrat wrote:
> 
> 
> On 16/02/2021 04:20, Alexey Kardashevskiy wrote:
>> The IOMMU table is divided into pools for concurrent mappings and each
>> pool has a separate spinlock. When taking the ownership of an IOMMU group
>> to pass through a device to a VM, we lock these spinlocks which triggers
>> a false negative warning in lockdep (below).
>>
>> This fixes it by annotating the large pool's spinlock as a nest lock.
>>
>> ===
>> WARNING: possible recursive locking detected
>> 5.11.0-le_syzkaller_a+fstn1 #100 Not tainted
>> --------------------------------------------
>> qemu-system-ppc/4129 is trying to acquire lock:
>> c0000000119bddb0 (&(p->lock)/1){....}-{2:2}, at: 
>> iommu_take_ownership+0xac/0x1e0
>>
>> but task is already holding lock:
>> c0000000119bdd30 (&(p->lock)/1){....}-{2:2}, at: 
>> iommu_take_ownership+0xac/0x1e0
>>
>> other info that might help us debug this:
>>   Possible unsafe locking scenario:
>>
>>         CPU0
>>         ----
>>    lock(&(p->lock)/1);
>>    lock(&(p->lock)/1);
>> ===
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>   arch/powerpc/kernel/iommu.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>> index 557a09dd5b2f..2ee642a6731a 100644
>> --- a/arch/powerpc/kernel/iommu.c
>> +++ b/arch/powerpc/kernel/iommu.c
>> @@ -1089,7 +1089,7 @@ int iommu_take_ownership(struct iommu_table *tbl)
>>       spin_lock_irqsave(&tbl->large_pool.lock, flags);
>>       for (i = 0; i < tbl->nr_pools; i++)
>> -        spin_lock(&tbl->pools[i].lock);
>> +        spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
> 
> 
> We have the same pattern and therefore should have the same problem in 
> iommu_release_ownership().
> 
> But as I understand, we're hacking our way around lockdep here, since 
> conceptually, those locks are independent. I was wondering why it seems 
> to fix it by worrying only about the large pool lock.

This is the other way around - we telling the lockdep not to worry about 
small pool locks if the nest lock (==large pool lock) is locked. The 
warning is printed when a nested lock is detected and the lockdep checks 
if there is a nest for this nested lock at check_deadlock().


> That loop can take 
> many locks (up to 4 with current config). However, if the dma window is 
> less than 1GB, we would only have one, so it would make sense for 
> lockdep to stop complaining.

Why would it stop if the large pool is always there?

> Is it what happened? In which case, this 
> patch doesn't really fix it. Or I'm missing something :-)

I tried with 1 or 2 small pools, no difference at all. I might also be 
missing something here too :)


> 
>    Fred
> 
> 
> 
>>       iommu_table_release_pages(tbl);
>>

-- 
Alexey

^ permalink raw reply

* [PATCH] powerpc/perf: Fix handling of privilege level checks in perf interrupt context
From: Athira Rajeev @ 2021-02-23  6:31 UTC (permalink / raw)
  To: mpe; +Cc: maddy, peterz, omosnace, acme, jolsa, linuxppc-dev, kan.liang

Running "perf mem record" in powerpc platforms with selinux enabled
resulted in soft lockup's. Below call-trace was seen in the logs:

CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2
NIP:  c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000
REGS: c000007fffab7d60 TRAP: 0100   Not tainted  (5.11.0-rc7+)
<<>>
NIP [c000000000dff3d4] _raw_spin_lock_irqsave+0x94/0x120
LR [c000000000dff3d0] _raw_spin_lock_irqsave+0x90/0x120
Call Trace:
[c00000000fd471a0] [c00000000fd47260] 0xc00000000fd47260 (unreliable)
[c00000000fd471e0] [c000000000b5fbbc] skb_queue_tail+0x3c/0x90
[c00000000fd47220] [c000000000296edc] audit_log_end+0x6c/0x180
[c00000000fd47260] [c0000000006a3f20] common_lsm_audit+0xb0/0xe0
[c00000000fd472a0] [c00000000066c664] slow_avc_audit+0xa4/0x110
[c00000000fd47320] [c00000000066cff4] avc_has_perm+0x1c4/0x260
[c00000000fd47430] [c00000000066e064] selinux_perf_event_open+0x74/0xd0
[c00000000fd47450] [c000000000669888] security_perf_event_open+0x68/0xc0
[c00000000fd47490] [c00000000013d788] record_and_restart+0x6e8/0x7f0
[c00000000fd476c0] [c00000000013dabc] perf_event_interrupt+0x22c/0x560
[c00000000fd477d0] [c00000000002d0fc] performance_monitor_exception+0x4c/0x60
[c00000000fd477f0] [c00000000000b378] performance_monitor_common_virt+0x1c8/0x1d0
interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120
NIP:  c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0
REGS: c00000000fd47860 TRAP: 0f00   Not tainted  (5.11.0-rc7+)
<<>>
NIP [c000000000dff378] _raw_spin_lock_irqsave+0x38/0x120
LR [c000000000b5fbbc] skb_queue_tail+0x3c/0x90
interrupt: f00
[c00000000fd47b00] [0000000000000038] 0x38 (unreliable)
[c00000000fd47b40] [c00000000aae6200] 0xc00000000aae6200
[c00000000fd47b80] [c000000000296edc] audit_log_end+0x6c/0x180
[c00000000fd47bc0] [c00000000029f494] audit_log_exit+0x344/0xf80
[c00000000fd47d10] [c0000000002a2b00] __audit_syscall_exit+0x2c0/0x320
[c00000000fd47d60] [c000000000032878] do_syscall_trace_leave+0x148/0x200
[c00000000fd47da0] [c00000000003d5b4] syscall_exit_prepare+0x324/0x390
[c00000000fd47e10] [c00000000000d76c] system_call_common+0xfc/0x27c

The above trace shows that while the CPU was handling a performance
monitor exception, there was a call to "security_perf_event_open"
function. In powerpc core-book3s, this function is called from
'perf_allow_kernel' check during recording of data address in the sample
via perf_get_data_addr().

Commit da97e18458fb ("perf_event: Add support for LSM and SELinux checks")
introduced security enhancements to perf. As part of this commit, the new
security hook for perf_event_open was added in all places where perf
paranoid check was previously used. In powerpc core-book3s code, originally
had paranoid checks in 'perf_get_data_addr' and 'power_pmu_bhrb_read'. So
'perf_paranoid_kernel' checks were replaced with 'perf_allow_kernel' in
these pmu helper functions as well.

The intention of paranoid checks in core-book3s is to verify privilege
access before capturing some of the sample data. Along with paranoid
checks, 'perf_allow_kernel' also does a 'security_perf_event_open'. Since
these functions are accessed while recording sample, we end up in calling
selinux_perf_event_open in PMI context. Some of the security functions
use spinlock like sidtab_sid2str_put(). If a perf interrupt hits under
a spin lock and if we end up in calling selinux hook functions in PMI
handler, this could cause a dead lock.

Since the purpose of this security hook is to control access to
perf_event_open, it is not right to call this in interrupt context.
But in case of powerpc PMU, we need the privilege checks for specific
samples from branch history ring buffer and sampling register values.
Reference commits:
Commit cd1231d7035f ("powerpc/perf: Prevent kernel address leak via
perf_get_data_addr()")
Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to
userspace via BHRB buffer")

As a fix, patch caches 'perf_allow_kernel' value in event_init in
'pmu_private' field of perf_event. The cached value is used in the
PMI code path.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 4b4319d8..9e9f67f 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -189,6 +189,11 @@ static inline unsigned long perf_ip_adjust(struct pt_regs *regs)
 	return 0;
 }

+static bool event_allow_kernel(struct perf_event *event)
+{
+	return (bool)event->pmu_private;
+}
+
 /*
  * The user wants a data address recorded.
  * If we're not doing instruction sampling, give them the SDAR
@@ -222,7 +227,7 @@ static inline void perf_get_data_addr(struct perf_event *event, struct pt_regs *
 	if (!(mmcra & MMCRA_SAMPLE_ENABLE) || sdar_valid)
 		*addrp = mfspr(SPRN_SDAR);

-	if (is_kernel_addr(mfspr(SPRN_SDAR)) && perf_allow_kernel(&event->attr) != 0)
+	if (is_kernel_addr(mfspr(SPRN_SDAR)) && !event_allow_kernel(event))
 		*addrp = 0;
 }

@@ -507,7 +512,7 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
 			 * addresses, hence include a check before filtering code
 			 */
 			if (!(ppmu->flags & PPMU_ARCH_31) &&
-				is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
+			    is_kernel_addr(addr) && !event_allow_kernel(event))
 				continue;

 			/* Branches are read most recent first (ie. mfbhrb 0 is
@@ -2049,6 +2054,13 @@ static int power_pmu_event_init(struct perf_event *event)
 	if (err)
 		return -EINVAL;

+	/*
+	 * We (ab)use pmu_private to cache the result of perf_allow_kernel(). We
+	 * need access to that result at interrupt time, but can't call
+	 * perf_allow_kernel() directly from interrupt context.
+	 */
+	event->pmu_private = (void *)(long)(perf_allow_kernel(&event->attr) == 0);
+
 	event->hw.config = events[n];
 	event->hw.event_base = cflags[n];
 	event->hw.last_period = event->hw.sample_period;
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH v2] selftests/powerpc: Fix L1D flushing tests for Power10
From: Russell Currey @ 2021-02-23  7:02 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Russell Currey, dja

The rfi_flush and entry_flush selftests work by using the PM_LD_MISS_L1
perf event to count L1D misses.  The value of this event has changed
over time:

- Power7 uses 0x400f0
- Power8 and Power9 use both 0x400f0 and 0x3e054
- Power10 uses only 0x3e054

Rather than relying on raw values, configure perf to count L1D read
misses in the most explicit way available.

This fixes the selftests to work on systems without 0x400f0 as
PM_LD_MISS_L1, and should change no behaviour for systems that the tests
already worked on.

The only potential downside is that referring to a specific perf event
requires PMU support implemented in the kernel for that platform.

Signed-off-by: Russell Currey <ruscur@russell.cc>
---
v2: Move away from raw events as suggested by mpe

 tools/testing/selftests/powerpc/security/entry_flush.c | 2 +-
 tools/testing/selftests/powerpc/security/flush_utils.h | 4 ++++
 tools/testing/selftests/powerpc/security/rfi_flush.c   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/powerpc/security/entry_flush.c b/tools/testing/selftests/powerpc/security/entry_flush.c
index 78cf914fa321..68ce377b205e 100644
--- a/tools/testing/selftests/powerpc/security/entry_flush.c
+++ b/tools/testing/selftests/powerpc/security/entry_flush.c
@@ -53,7 +53,7 @@ int entry_flush_test(void)
 
 	entry_flush = entry_flush_orig;
 
-	fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+	fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
 	FAIL_IF(fd < 0);
 
 	p = (char *)memalign(zero_size, CACHELINE_SIZE);
diff --git a/tools/testing/selftests/powerpc/security/flush_utils.h b/tools/testing/selftests/powerpc/security/flush_utils.h
index 07a5eb301466..7a3d60292916 100644
--- a/tools/testing/selftests/powerpc/security/flush_utils.h
+++ b/tools/testing/selftests/powerpc/security/flush_utils.h
@@ -9,6 +9,10 @@
 
 #define CACHELINE_SIZE 128
 
+#define PERF_L1D_READ_MISS_CONFIG	((PERF_COUNT_HW_CACHE_L1D) | 		\
+					(PERF_COUNT_HW_CACHE_OP_READ << 8) |	\
+					(PERF_COUNT_HW_CACHE_RESULT_MISS << 16))
+
 void syscall_loop(char *p, unsigned long iterations,
 		  unsigned long zero_size);
 
diff --git a/tools/testing/selftests/powerpc/security/rfi_flush.c b/tools/testing/selftests/powerpc/security/rfi_flush.c
index 7565fd786640..f73484a6470f 100644
--- a/tools/testing/selftests/powerpc/security/rfi_flush.c
+++ b/tools/testing/selftests/powerpc/security/rfi_flush.c
@@ -54,7 +54,7 @@ int rfi_flush_test(void)
 
 	rfi_flush = rfi_flush_orig;
 
-	fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+	fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
 	FAIL_IF(fd < 0);
 
 	p = (char *)memalign(zero_size, CACHELINE_SIZE);
-- 
2.30.1


^ permalink raw reply related

* [PATCH] powerpc/chrp: Make hydra_init() static
From: Geert Uytterhoeven @ 2021-02-23  9:53 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Oliver O'Halloran
  Cc: Geert Uytterhoeven, linuxppc-dev, linux-kernel

Commit 407d418f2fd4c20a ("powerpc/chrp: Move PHB discovery") moved the
sole call to hydra_init() to the source file where it is defined, so it
can be made static.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
Compile-tested only.  My LongTrail died in 2004.
---
 arch/powerpc/include/asm/hydra.h  | 2 --
 arch/powerpc/platforms/chrp/pci.c | 3 +--
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/hydra.h b/arch/powerpc/include/asm/hydra.h
index ae02eb53d6ef8221..d024447283a0cf3c 100644
--- a/arch/powerpc/include/asm/hydra.h
+++ b/arch/powerpc/include/asm/hydra.h
@@ -94,8 +94,6 @@ extern volatile struct Hydra __iomem *Hydra;
 #define HYDRA_INT_EXT7		18	/* Power Off Request */
 #define HYDRA_INT_SPARE		19
 
-extern int hydra_init(void);
-
 #endif /* __KERNEL__ */
 
 #endif /* _ASMPPC_HYDRA_H */
diff --git a/arch/powerpc/platforms/chrp/pci.c b/arch/powerpc/platforms/chrp/pci.c
index 8c421dc78b28334b..76e6256cb0a788f4 100644
--- a/arch/powerpc/platforms/chrp/pci.c
+++ b/arch/powerpc/platforms/chrp/pci.c
@@ -131,8 +131,7 @@ static struct pci_ops rtas_pci_ops =
 
 volatile struct Hydra __iomem *Hydra = NULL;
 
-int __init
-hydra_init(void)
+static int __init hydra_init(void)
 {
 	struct device_node *np;
 	struct resource r;
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH] powerpc/perf: Fix handling of privilege level checks in perf interrupt context
From: Ondrej Mosnacek @ 2021-02-23 10:36 UTC (permalink / raw)
  To: atrajeev
  Cc: maddy, Peter Zijlstra, Arnaldo Carvalho de Melo, SElinux list,
	Linux Security Module list, Jiri Olsa, linuxppc-dev, kan.liang
In-Reply-To: <1614061909-1734-1-git-send-email-atrajeev@linux.vnet.ibm.com>

(CC'ing LSM and SELinux lists; the initial message can be found here:
https://lore.kernel.org/linuxppc-dev/1614061909-1734-1-git-send-email-atrajeev@linux.vnet.ibm.com/T/)

On Tue, Feb 23, 2021 at 7:32 AM Athira Rajeev
<atrajeev@linux.vnet.ibm.com> wrote:
>
> Running "perf mem record" in powerpc platforms with selinux enabled
> resulted in soft lockup's. Below call-trace was seen in the logs:
>
> CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2
> NIP:  c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000
> REGS: c000007fffab7d60 TRAP: 0100   Not tainted  (5.11.0-rc7+)
> <<>>
> NIP [c000000000dff3d4] _raw_spin_lock_irqsave+0x94/0x120
> LR [c000000000dff3d0] _raw_spin_lock_irqsave+0x90/0x120
> Call Trace:
> [c00000000fd471a0] [c00000000fd47260] 0xc00000000fd47260 (unreliable)
> [c00000000fd471e0] [c000000000b5fbbc] skb_queue_tail+0x3c/0x90
> [c00000000fd47220] [c000000000296edc] audit_log_end+0x6c/0x180
> [c00000000fd47260] [c0000000006a3f20] common_lsm_audit+0xb0/0xe0
> [c00000000fd472a0] [c00000000066c664] slow_avc_audit+0xa4/0x110
> [c00000000fd47320] [c00000000066cff4] avc_has_perm+0x1c4/0x260
> [c00000000fd47430] [c00000000066e064] selinux_perf_event_open+0x74/0xd0
> [c00000000fd47450] [c000000000669888] security_perf_event_open+0x68/0xc0
> [c00000000fd47490] [c00000000013d788] record_and_restart+0x6e8/0x7f0
> [c00000000fd476c0] [c00000000013dabc] perf_event_interrupt+0x22c/0x560
> [c00000000fd477d0] [c00000000002d0fc] performance_monitor_exception+0x4c/0x60
> [c00000000fd477f0] [c00000000000b378] performance_monitor_common_virt+0x1c8/0x1d0
> interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120
> NIP:  c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0
> REGS: c00000000fd47860 TRAP: 0f00   Not tainted  (5.11.0-rc7+)
> <<>>
> NIP [c000000000dff378] _raw_spin_lock_irqsave+0x38/0x120
> LR [c000000000b5fbbc] skb_queue_tail+0x3c/0x90
> interrupt: f00
> [c00000000fd47b00] [0000000000000038] 0x38 (unreliable)
> [c00000000fd47b40] [c00000000aae6200] 0xc00000000aae6200
> [c00000000fd47b80] [c000000000296edc] audit_log_end+0x6c/0x180
> [c00000000fd47bc0] [c00000000029f494] audit_log_exit+0x344/0xf80
> [c00000000fd47d10] [c0000000002a2b00] __audit_syscall_exit+0x2c0/0x320
> [c00000000fd47d60] [c000000000032878] do_syscall_trace_leave+0x148/0x200
> [c00000000fd47da0] [c00000000003d5b4] syscall_exit_prepare+0x324/0x390
> [c00000000fd47e10] [c00000000000d76c] system_call_common+0xfc/0x27c
>
> The above trace shows that while the CPU was handling a performance
> monitor exception, there was a call to "security_perf_event_open"
> function. In powerpc core-book3s, this function is called from
> 'perf_allow_kernel' check during recording of data address in the sample
> via perf_get_data_addr().
>
> Commit da97e18458fb ("perf_event: Add support for LSM and SELinux checks")
> introduced security enhancements to perf. As part of this commit, the new
> security hook for perf_event_open was added in all places where perf
> paranoid check was previously used. In powerpc core-book3s code, originally
> had paranoid checks in 'perf_get_data_addr' and 'power_pmu_bhrb_read'. So
> 'perf_paranoid_kernel' checks were replaced with 'perf_allow_kernel' in
> these pmu helper functions as well.
>
> The intention of paranoid checks in core-book3s is to verify privilege
> access before capturing some of the sample data. Along with paranoid
> checks, 'perf_allow_kernel' also does a 'security_perf_event_open'. Since
> these functions are accessed while recording sample, we end up in calling
> selinux_perf_event_open in PMI context. Some of the security functions
> use spinlock like sidtab_sid2str_put(). If a perf interrupt hits under
> a spin lock and if we end up in calling selinux hook functions in PMI
> handler, this could cause a dead lock.
>
> Since the purpose of this security hook is to control access to
> perf_event_open, it is not right to call this in interrupt context.
> But in case of powerpc PMU, we need the privilege checks for specific
> samples from branch history ring buffer and sampling register values.
> Reference commits:
> Commit cd1231d7035f ("powerpc/perf: Prevent kernel address leak via
> perf_get_data_addr()")
> Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to
> userspace via BHRB buffer")
>
> As a fix, patch caches 'perf_allow_kernel' value in event_init in
> 'pmu_private' field of perf_event. The cached value is used in the
> PMI code path.
>
> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/core-book3s.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 4b4319d8..9e9f67f 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -189,6 +189,11 @@ static inline unsigned long perf_ip_adjust(struct pt_regs *regs)
>         return 0;
>  }
>
> +static bool event_allow_kernel(struct perf_event *event)
> +{
> +       return (bool)event->pmu_private;
> +}
> +
>  /*
>   * The user wants a data address recorded.
>   * If we're not doing instruction sampling, give them the SDAR
> @@ -222,7 +227,7 @@ static inline void perf_get_data_addr(struct perf_event *event, struct pt_regs *
>         if (!(mmcra & MMCRA_SAMPLE_ENABLE) || sdar_valid)
>                 *addrp = mfspr(SPRN_SDAR);
>
> -       if (is_kernel_addr(mfspr(SPRN_SDAR)) && perf_allow_kernel(&event->attr) != 0)
> +       if (is_kernel_addr(mfspr(SPRN_SDAR)) && !event_allow_kernel(event))
>                 *addrp = 0;
>  }
>
> @@ -507,7 +512,7 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
>                          * addresses, hence include a check before filtering code
>                          */
>                         if (!(ppmu->flags & PPMU_ARCH_31) &&
> -                               is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
> +                           is_kernel_addr(addr) && !event_allow_kernel(event))
>                                 continue;
>
>                         /* Branches are read most recent first (ie. mfbhrb 0 is
> @@ -2049,6 +2054,13 @@ static int power_pmu_event_init(struct perf_event *event)
>         if (err)
>                 return -EINVAL;
>
> +       /*
> +        * We (ab)use pmu_private to cache the result of perf_allow_kernel(). We
> +        * need access to that result at interrupt time, but can't call
> +        * perf_allow_kernel() directly from interrupt context.
> +        */
> +       event->pmu_private = (void *)(long)(perf_allow_kernel(&event->attr) == 0);

I don't think you need this. Unless I'm missing something, you can
simply use "event->attr.exclude_kernel" in place of
"!event_allow_kernel(event)". If it is set, then there must have been
a successful perf_allow_kernel() check in perf_event_open(2) before
the event was created. power_pmu_event_init() would be called shortly
after via perf_event_alloc() -> perf_init_event(), so I don't think
this additional check would add much value.

> +
>         event->hw.config = events[n];
>         event->hw.event_base = cflags[n];
>         event->hw.last_period = event->hw.sample_period;
> --
> 1.8.3.1
>

-- 
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.


^ permalink raw reply

* Re: [PATCH] powerpc/perf: Fix handling of privilege level checks in perf interrupt context
From: Peter Zijlstra @ 2021-02-23 11:03 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: maddy, omosnace, acme, jolsa, linuxppc-dev, kan.liang
In-Reply-To: <1614061909-1734-1-git-send-email-atrajeev@linux.vnet.ibm.com>

On Tue, Feb 23, 2021 at 01:31:49AM -0500, Athira Rajeev wrote:
> Running "perf mem record" in powerpc platforms with selinux enabled
> resulted in soft lockup's. Below call-trace was seen in the logs:
> 
> CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2
> NIP:  c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000
> REGS: c000007fffab7d60 TRAP: 0100   Not tainted  (5.11.0-rc7+)
> <<>>
> NIP [c000000000dff3d4] _raw_spin_lock_irqsave+0x94/0x120
> LR [c000000000dff3d0] _raw_spin_lock_irqsave+0x90/0x120
> Call Trace:
> [c00000000fd471a0] [c00000000fd47260] 0xc00000000fd47260 (unreliable)
> [c00000000fd471e0] [c000000000b5fbbc] skb_queue_tail+0x3c/0x90
> [c00000000fd47220] [c000000000296edc] audit_log_end+0x6c/0x180
> [c00000000fd47260] [c0000000006a3f20] common_lsm_audit+0xb0/0xe0
> [c00000000fd472a0] [c00000000066c664] slow_avc_audit+0xa4/0x110
> [c00000000fd47320] [c00000000066cff4] avc_has_perm+0x1c4/0x260
> [c00000000fd47430] [c00000000066e064] selinux_perf_event_open+0x74/0xd0
> [c00000000fd47450] [c000000000669888] security_perf_event_open+0x68/0xc0
> [c00000000fd47490] [c00000000013d788] record_and_restart+0x6e8/0x7f0
> [c00000000fd476c0] [c00000000013dabc] perf_event_interrupt+0x22c/0x560
> [c00000000fd477d0] [c00000000002d0fc] performance_monitor_exception+0x4c/0x60
> [c00000000fd477f0] [c00000000000b378] performance_monitor_common_virt+0x1c8/0x1d0
> interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120
> NIP:  c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0
> REGS: c00000000fd47860 TRAP: 0f00   Not tainted  (5.11.0-rc7+)
> <<>>
> NIP [c000000000dff378] _raw_spin_lock_irqsave+0x38/0x120
> LR [c000000000b5fbbc] skb_queue_tail+0x3c/0x90
> interrupt: f00
> [c00000000fd47b00] [0000000000000038] 0x38 (unreliable)
> [c00000000fd47b40] [c00000000aae6200] 0xc00000000aae6200
> [c00000000fd47b80] [c000000000296edc] audit_log_end+0x6c/0x180
> [c00000000fd47bc0] [c00000000029f494] audit_log_exit+0x344/0xf80
> [c00000000fd47d10] [c0000000002a2b00] __audit_syscall_exit+0x2c0/0x320
> [c00000000fd47d60] [c000000000032878] do_syscall_trace_leave+0x148/0x200
> [c00000000fd47da0] [c00000000003d5b4] syscall_exit_prepare+0x324/0x390
> [c00000000fd47e10] [c00000000000d76c] system_call_common+0xfc/0x27c
> 
> The above trace shows that while the CPU was handling a performance
> monitor exception, there was a call to "security_perf_event_open"
> function. In powerpc core-book3s, this function is called from
> 'perf_allow_kernel' check during recording of data address in the sample
> via perf_get_data_addr().
> 
> Commit da97e18458fb ("perf_event: Add support for LSM and SELinux checks")
> introduced security enhancements to perf. As part of this commit, the new
> security hook for perf_event_open was added in all places where perf
> paranoid check was previously used. In powerpc core-book3s code, originally
> had paranoid checks in 'perf_get_data_addr' and 'power_pmu_bhrb_read'. So
> 'perf_paranoid_kernel' checks were replaced with 'perf_allow_kernel' in
> these pmu helper functions as well.
> 
> The intention of paranoid checks in core-book3s is to verify privilege
> access before capturing some of the sample data. Along with paranoid
> checks, 'perf_allow_kernel' also does a 'security_perf_event_open'. Since
> these functions are accessed while recording sample, we end up in calling
> selinux_perf_event_open in PMI context. Some of the security functions
> use spinlock like sidtab_sid2str_put(). If a perf interrupt hits under
> a spin lock and if we end up in calling selinux hook functions in PMI
> handler, this could cause a dead lock.
> 
> Since the purpose of this security hook is to control access to
> perf_event_open, it is not right to call this in interrupt context.
> But in case of powerpc PMU, we need the privilege checks for specific
> samples from branch history ring buffer and sampling register values.

I'm confused... why would you need those checks at event time? Either
the event has perf_event_attr::exclude_kernel and it then isn't allowed
to expose kernel addresses, or it doesn't and it is.

There should never be an event-time question of permission like this. If
you allow creation of an event, you're allowing the data it generates.

^ permalink raw reply

* Re: [PATCH] powerpc/perf: Fix handling of privilege level checks in perf interrupt context
From: Michael Ellerman @ 2021-02-23 12:54 UTC (permalink / raw)
  To: Peter Zijlstra, Athira Rajeev
  Cc: maddy, omosnace, acme, jolsa, linuxppc-dev, kan.liang
In-Reply-To: <YDTg99xsVolE+sv+@hirez.programming.kicks-ass.net>

Peter Zijlstra <peterz@infradead.org> writes:
> On Tue, Feb 23, 2021 at 01:31:49AM -0500, Athira Rajeev wrote:
>> Running "perf mem record" in powerpc platforms with selinux enabled
>> resulted in soft lockup's. Below call-trace was seen in the logs:
...
>> 
>> Since the purpose of this security hook is to control access to
>> perf_event_open, it is not right to call this in interrupt context.
>> But in case of powerpc PMU, we need the privilege checks for specific
>> samples from branch history ring buffer and sampling register values.
>
> I'm confused... why would you need those checks at event time? Either
> the event has perf_event_attr::exclude_kernel and it then isn't allowed
> to expose kernel addresses, or it doesn't and it is.

Well one of us is confused that's for sure ^_^

I missed/forgot that we had that logic in open.

I think the reason we got here is that in the past we didn't have the
event in the low-level routines where we want to check,
power_pmu_bhrb_read() and perf_get_data_addr(), so we hacked in a
perf_paranoid_kernel() check. Which was wrong.

Then Joel's patch plumbed the event through and switched those paranoid
checks to perf_allow_kernel().

Anyway, we'll just switch those to exclude_kernel checks.

> There should never be an event-time question of permission like this. If
> you allow creation of an event, you're allowing the data it generates.

Ack.

cheers

^ permalink raw reply

* Re: [PATCH for 5.10] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
From: Christophe Leroy @ 2021-02-23 14:39 UTC (permalink / raw)
  To: Greg KH; +Cc: linuxppc-dev, fedora.dm0, stable, linux-kernel
In-Reply-To: <YCqFn/4YuT+445xW@kroah.com>



Le 15/02/2021 à 15:30, Greg KH a écrit :
> On Fri, Feb 12, 2021 at 08:57:14AM +0000, Christophe Leroy wrote:
>> This is backport of 3642eb21256a ("powerpc/32: Preserve cr1 in
>> exception prolog stack check to fix build error") for kernel 5.10
>>
>> It fixes the build failure on v5.10 reported by kernel test robot
>> and by David Michael.
>>
>> This fix is not in Linux tree yet, it is in next branch in powerpc tree.
> 
> Then there's nothing I can do about it until that happens :(
> 

It now is in Linus' tree, see 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3642eb21256a317ac14e9ed560242c6d20cf06d9

Thanks
Christophe

^ permalink raw reply

* Re: [PATCH v6 06/10] powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext()
From: Christophe Leroy @ 2021-02-23 17:12 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
In-Reply-To: <20210221012401.22328-7-cmr@codefail.de>



Le 21/02/2021 à 02:23, Christopher M. Riedl a écrit :
> Previously setup_sigcontext() performed a costly KUAP switch on every
> uaccess operation. These repeated uaccess switches cause a significant
> drop in signal handling performance.
> 
> Rewrite setup_sigcontext() to assume that a userspace write access window
> is open by replacing all uaccess functions with their 'unsafe' versions.
> Modify the callers to first open, call unsafe_setup_sigcontext() and
> then close the uaccess window.

Do you plan to also convert setup_tm_sigcontexts() ?
It would allow to then remove copy_fpr_to_user() and copy_ckfpr_to_user() and maybe other functions too.

Christophe

> 
> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>   arch/powerpc/kernel/signal_64.c | 71 ++++++++++++++++++++-------------
>   1 file changed, 44 insertions(+), 27 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index bd8d210c9115..3faaa736ed62 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -101,9 +101,13 @@ static void prepare_setup_sigcontext(struct task_struct *tsk)
>    * Set up the sigcontext for the signal frame.
>    */
>   
> -static long setup_sigcontext(struct sigcontext __user *sc,
> -		struct task_struct *tsk, int signr, sigset_t *set,
> -		unsigned long handler, int ctx_has_vsx_region)
> +#define unsafe_setup_sigcontext(sc, tsk, signr, set, handler,		\
> +				ctx_has_vsx_region, e)			\
> +	unsafe_op_wrap(__unsafe_setup_sigcontext(sc, tsk, signr, set,	\
> +			handler, ctx_has_vsx_region), e)
> +static long notrace __unsafe_setup_sigcontext(struct sigcontext __user *sc,
> +					struct task_struct *tsk, int signr, sigset_t *set,
> +					unsigned long handler, int ctx_has_vsx_region)
>   {
>   	/* When CONFIG_ALTIVEC is set, we _always_ setup v_regs even if the
>   	 * process never used altivec yet (MSR_VEC is zero in pt_regs of
> @@ -118,20 +122,19 @@ static long setup_sigcontext(struct sigcontext __user *sc,
>   #endif
>   	struct pt_regs *regs = tsk->thread.regs;
>   	unsigned long msr = regs->msr;
> -	long err = 0;
>   	/* Force usr to alway see softe as 1 (interrupts enabled) */
>   	unsigned long softe = 0x1;
>   
>   	BUG_ON(tsk != current);
>   
>   #ifdef CONFIG_ALTIVEC
> -	err |= __put_user(v_regs, &sc->v_regs);
> +	unsafe_put_user(v_regs, &sc->v_regs, efault_out);
>   
>   	/* save altivec registers */
>   	if (tsk->thread.used_vr) {
>   		/* Copy 33 vec registers (vr0..31 and vscr) to the stack */
> -		err |= __copy_to_user(v_regs, &tsk->thread.vr_state,
> -				      33 * sizeof(vector128));
> +		unsafe_copy_to_user(v_regs, &tsk->thread.vr_state,
> +				    33 * sizeof(vector128), efault_out);
>   		/* set MSR_VEC in the MSR value in the frame to indicate that sc->v_reg)
>   		 * contains valid data.
>   		 */
> @@ -140,12 +143,12 @@ static long setup_sigcontext(struct sigcontext __user *sc,
>   	/* We always copy to/from vrsave, it's 0 if we don't have or don't
>   	 * use altivec.
>   	 */
> -	err |= __put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
> +	unsafe_put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33], efault_out);
>   #else /* CONFIG_ALTIVEC */
> -	err |= __put_user(0, &sc->v_regs);
> +	unsafe_put_user(0, &sc->v_regs, efault_out);
>   #endif /* CONFIG_ALTIVEC */
>   	/* copy fpr regs and fpscr */
> -	err |= copy_fpr_to_user(&sc->fp_regs, tsk);
> +	unsafe_copy_fpr_to_user(&sc->fp_regs, tsk, efault_out);
>   
>   	/*
>   	 * Clear the MSR VSX bit to indicate there is no valid state attached
> @@ -160,24 +163,27 @@ static long setup_sigcontext(struct sigcontext __user *sc,
>   	 */
>   	if (tsk->thread.used_vsr && ctx_has_vsx_region) {
>   		v_regs += ELF_NVRREG;
> -		err |= copy_vsx_to_user(v_regs, tsk);
> +		unsafe_copy_vsx_to_user(v_regs, tsk, efault_out);
>   		/* set MSR_VSX in the MSR value in the frame to
>   		 * indicate that sc->vs_reg) contains valid data.
>   		 */
>   		msr |= MSR_VSX;
>   	}
>   #endif /* CONFIG_VSX */
> -	err |= __put_user(&sc->gp_regs, &sc->regs);
> +	unsafe_put_user(&sc->gp_regs, &sc->regs, efault_out);
>   	WARN_ON(!FULL_REGS(regs));
> -	err |= __copy_to_user(&sc->gp_regs, regs, GP_REGS_SIZE);
> -	err |= __put_user(msr, &sc->gp_regs[PT_MSR]);
> -	err |= __put_user(softe, &sc->gp_regs[PT_SOFTE]);
> -	err |= __put_user(signr, &sc->signal);
> -	err |= __put_user(handler, &sc->handler);
> +	unsafe_copy_to_user(&sc->gp_regs, regs, GP_REGS_SIZE, efault_out);
> +	unsafe_put_user(msr, &sc->gp_regs[PT_MSR], efault_out);
> +	unsafe_put_user(softe, &sc->gp_regs[PT_SOFTE], efault_out);
> +	unsafe_put_user(signr, &sc->signal, efault_out);
> +	unsafe_put_user(handler, &sc->handler, efault_out);
>   	if (set != NULL)
> -		err |=  __put_user(set->sig[0], &sc->oldmask);
> +		unsafe_put_user(set->sig[0], &sc->oldmask, efault_out);
>   
> -	return err;
> +	return 0;
> +
> +efault_out:
> +	return -EFAULT;
>   }
>   
>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> @@ -670,12 +676,15 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
>   
>   	if (old_ctx != NULL) {
>   		prepare_setup_sigcontext(current);
> -		if (!access_ok(old_ctx, ctx_size)
> -		    || setup_sigcontext(&old_ctx->uc_mcontext, current, 0, NULL, 0,
> -					ctx_has_vsx_region)
> -		    || __copy_to_user(&old_ctx->uc_sigmask,
> -				      &current->blocked, sizeof(sigset_t)))
> +		if (!user_write_access_begin(old_ctx, ctx_size))
>   			return -EFAULT;
> +
> +		unsafe_setup_sigcontext(&old_ctx->uc_mcontext, current, 0, NULL,
> +					0, ctx_has_vsx_region, efault_out);
> +		unsafe_copy_to_user(&old_ctx->uc_sigmask, &current->blocked,
> +				    sizeof(sigset_t), efault_out);
> +
> +		user_write_access_end();
>   	}
>   	if (new_ctx == NULL)
>   		return 0;
> @@ -704,6 +713,10 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
>   	/* This returns like rt_sigreturn */
>   	set_thread_flag(TIF_RESTOREALL);
>   	return 0;
> +
> +efault_out:
> +	user_write_access_end();
> +	return -EFAULT;
>   }
>   
>   
> @@ -854,9 +867,13 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
>   	} else {
>   		err |= __put_user(0, &frame->uc.uc_link);
>   		prepare_setup_sigcontext(tsk);
> -		err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
> -					NULL, (unsigned long)ksig->ka.sa.sa_handler,
> -					1);
> +		if (!user_write_access_begin(&frame->uc.uc_mcontext,
> +					     sizeof(frame->uc.uc_mcontext)))
> +			return -EFAULT;
> +		err |= __unsafe_setup_sigcontext(&frame->uc.uc_mcontext, tsk,
> +						ksig->sig, NULL,
> +						(unsigned long)ksig->ka.sa.sa_handler, 1);
> +		user_write_access_end();
>   	}
>   	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
>   	if (err)
> 

^ permalink raw reply

* Re: [PATCH v6 01/10] powerpc/uaccess: Add unsafe_copy_from_user
From: Christophe Leroy @ 2021-02-23 17:15 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev; +Cc: Daniel Axtens
In-Reply-To: <20210221012401.22328-2-cmr@codefail.de>



Le 21/02/2021 à 02:23, Christopher M. Riedl a écrit :
> Just wrap __copy_tofrom_user() for the usual 'unsafe' pattern which
> accepts a label to goto on error.
> 
> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> Reviewed-by: Daniel Axtens <dja@axtens.net>
> ---
>   arch/powerpc/include/asm/uaccess.h | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 78e2a3990eab..33b2de642120 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -487,6 +487,9 @@ user_write_access_begin(const void __user *ptr, size_t len)
>   #define unsafe_put_user(x, p, e) \
>   	__unsafe_put_user_goto((__typeof__(*(p)))(x), (p), sizeof(*(p)), e)
>   
> +#define unsafe_copy_from_user(d, s, l, e) \
> +	unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
> +

Could we perform same as unsafe_copy_to_user() instead of calling an external function which is 
banned in principle inside uaccess blocks ?


>   #define unsafe_copy_to_user(d, s, l, e) \
>   do {									\
>   	u8 __user *_dst = (u8 __user *)(d);				\
> 

^ permalink raw reply

* Re: [PATCH v6 07/10] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext()
From: Christophe Leroy @ 2021-02-23 17:36 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
In-Reply-To: <20210221012401.22328-8-cmr@codefail.de>



Le 21/02/2021 à 02:23, Christopher M. Riedl a écrit :
> Previously restore_sigcontext() performed a costly KUAP switch on every
> uaccess operation. These repeated uaccess switches cause a significant
> drop in signal handling performance.
> 
> Rewrite restore_sigcontext() to assume that a userspace read access
> window is open by replacing all uaccess functions with their 'unsafe'
> versions. Modify the callers to first open, call
> unsafe_restore_sigcontext(), and then close the uaccess window.
> 
> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>   arch/powerpc/kernel/signal_64.c | 68 ++++++++++++++++++++-------------
>   1 file changed, 41 insertions(+), 27 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 3faaa736ed62..76b525261f61 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -326,14 +326,14 @@ static long setup_tm_sigcontexts(struct sigcontext __user *sc,
>   /*
>    * Restore the sigcontext from the signal frame.
>    */
> -
> -static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
> -			      struct sigcontext __user *sc)
> +#define unsafe_restore_sigcontext(tsk, set, sig, sc, e) \
> +	unsafe_op_wrap(__unsafe_restore_sigcontext(tsk, set, sig, sc), e)

unsafe_op_wrap() was not initially meant to be used outside of uaccess.h

In the begining, it has been copied from include/linux/uaccess.h and was used
for unsafe_put_user(), unsafe_get_user() and unsafe_copy_to_user(). After other changes, only 
unsafe_get_user() is still using it and I'm going to drop unsafe_op_wrap() soon.

I'd prefer if you can do the same as unsafe_save_general_regs() and others in signal_32.c

> +static long notrace __unsafe_restore_sigcontext(struct task_struct *tsk, sigset_t *set,
> +						int sig, struct sigcontext __user *sc)
>   {
>   #ifdef CONFIG_ALTIVEC
>   	elf_vrreg_t __user *v_regs;
>   #endif
> -	unsigned long err = 0;
>   	unsigned long save_r13 = 0;
>   	unsigned long msr;
>   	struct pt_regs *regs = tsk->thread.regs;
> @@ -348,27 +348,28 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
>   		save_r13 = regs->gpr[13];
>   
>   	/* copy the GPRs */
> -	err |= __copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr));
> -	err |= __get_user(regs->nip, &sc->gp_regs[PT_NIP]);
> +	unsafe_copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr),
> +			      efault_out);

I think it would be better to keep the above on a single line for readability.
Nowadays we tolerate 100 chars lines for cases like this one.

> +	unsafe_get_user(regs->nip, &sc->gp_regs[PT_NIP], efault_out);
>   	/* get MSR separately, transfer the LE bit if doing signal return */
> -	err |= __get_user(msr, &sc->gp_regs[PT_MSR]);
> +	unsafe_get_user(msr, &sc->gp_regs[PT_MSR], efault_out);
>   	if (sig)
>   		regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
> -	err |= __get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3]);
> -	err |= __get_user(regs->ctr, &sc->gp_regs[PT_CTR]);
> -	err |= __get_user(regs->link, &sc->gp_regs[PT_LNK]);
> -	err |= __get_user(regs->xer, &sc->gp_regs[PT_XER]);
> -	err |= __get_user(regs->ccr, &sc->gp_regs[PT_CCR]);
> +	unsafe_get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3], efault_out);
> +	unsafe_get_user(regs->ctr, &sc->gp_regs[PT_CTR], efault_out);
> +	unsafe_get_user(regs->link, &sc->gp_regs[PT_LNK], efault_out);
> +	unsafe_get_user(regs->xer, &sc->gp_regs[PT_XER], efault_out);
> +	unsafe_get_user(regs->ccr, &sc->gp_regs[PT_CCR], efault_out);
>   	/* Don't allow userspace to set SOFTE */
>   	set_trap_norestart(regs);
> -	err |= __get_user(regs->dar, &sc->gp_regs[PT_DAR]);
> -	err |= __get_user(regs->dsisr, &sc->gp_regs[PT_DSISR]);
> -	err |= __get_user(regs->result, &sc->gp_regs[PT_RESULT]);
> +	unsafe_get_user(regs->dar, &sc->gp_regs[PT_DAR], efault_out);
> +	unsafe_get_user(regs->dsisr, &sc->gp_regs[PT_DSISR], efault_out);
> +	unsafe_get_user(regs->result, &sc->gp_regs[PT_RESULT], efault_out);
>   
>   	if (!sig)
>   		regs->gpr[13] = save_r13;
>   	if (set != NULL)
> -		err |=  __get_user(set->sig[0], &sc->oldmask);
> +		unsafe_get_user(set->sig[0], &sc->oldmask, efault_out);
>   
>   	/*
>   	 * Force reload of FP/VEC.
> @@ -378,29 +379,28 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
>   	regs->msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1 | MSR_VEC | MSR_VSX);
>   
>   #ifdef CONFIG_ALTIVEC
> -	err |= __get_user(v_regs, &sc->v_regs);
> -	if (err)
> -		return err;
> +	unsafe_get_user(v_regs, &sc->v_regs, efault_out);
>   	if (v_regs && !access_ok(v_regs, 34 * sizeof(vector128)))
>   		return -EFAULT;
>   	/* Copy 33 vec registers (vr0..31 and vscr) from the stack */
>   	if (v_regs != NULL && (msr & MSR_VEC) != 0) {
> -		err |= __copy_from_user(&tsk->thread.vr_state, v_regs,
> -					33 * sizeof(vector128));
> +		unsafe_copy_from_user(&tsk->thread.vr_state, v_regs,
> +				      33 * sizeof(vector128), efault_out);
>   		tsk->thread.used_vr = true;
>   	} else if (tsk->thread.used_vr) {
>   		memset(&tsk->thread.vr_state, 0, 33 * sizeof(vector128));
>   	}
>   	/* Always get VRSAVE back */
>   	if (v_regs != NULL)
> -		err |= __get_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
> +		unsafe_get_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33],
> +				efault_out);

Same, would be better on a single line I think.

>   	else
>   		tsk->thread.vrsave = 0;
>   	if (cpu_has_feature(CPU_FTR_ALTIVEC))
>   		mtspr(SPRN_VRSAVE, tsk->thread.vrsave);
>   #endif /* CONFIG_ALTIVEC */
>   	/* restore floating point */
> -	err |= copy_fpr_from_user(tsk, &sc->fp_regs);
> +	unsafe_copy_fpr_from_user(tsk, &sc->fp_regs, efault_out);
>   #ifdef CONFIG_VSX
>   	/*
>   	 * Get additional VSX data. Update v_regs to point after the
> @@ -409,14 +409,17 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
>   	 */
>   	v_regs += ELF_NVRREG;
>   	if ((msr & MSR_VSX) != 0) {
> -		err |= copy_vsx_from_user(tsk, v_regs);
> +		unsafe_copy_vsx_from_user(tsk, v_regs, efault_out);
>   		tsk->thread.used_vsr = true;
>   	} else {
>   		for (i = 0; i < 32 ; i++)
>   			tsk->thread.fp_state.fpr[i][TS_VSRLOWOFFSET] = 0;
>   	}
>   #endif
> -	return err;
> +	return 0;
> +
> +efault_out:
> +	return -EFAULT;
>   }
>   
>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> @@ -707,8 +710,14 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
>   	if (__copy_from_user(&set, &new_ctx->uc_sigmask, sizeof(set)))
>   		do_exit(SIGSEGV);
>   	set_current_blocked(&set);
> -	if (restore_sigcontext(current, NULL, 0, &new_ctx->uc_mcontext))
> +
> +	if (!user_read_access_begin(new_ctx, ctx_size))
> +		return -EFAULT;
> +	if (__unsafe_restore_sigcontext(current, NULL, 0, &new_ctx->uc_mcontext)) {
> +		user_read_access_end();
>   		do_exit(SIGSEGV);
> +	}
> +	user_read_access_end();
>   
>   	/* This returns like rt_sigreturn */
>   	set_thread_flag(TIF_RESTOREALL);
> @@ -811,8 +820,13 @@ SYSCALL_DEFINE0(rt_sigreturn)
>   		 * causing a TM bad thing.
>   		 */
>   		current->thread.regs->msr &= ~MSR_TS_MASK;
> -		if (restore_sigcontext(current, NULL, 1, &uc->uc_mcontext))
> +		if (!user_read_access_begin(&uc->uc_mcontext, sizeof(uc->uc_mcontext)))
> +			return -EFAULT;
> +		if (__unsafe_restore_sigcontext(current, NULL, 1, &uc->uc_mcontext)) {
> +			user_read_access_end();
>   			goto badframe;
> +		}
> +		user_read_access_end();
>   	}
>   
>   	if (restore_altstack(&uc->uc_stack))
> 

^ permalink raw reply

* [PASEMI] Nemo board doesn't boot anymore because of moving pas_pci_init
From: Christian Zigotzky @ 2021-02-23 21:43 UTC (permalink / raw)
  To: linuxppc-dev, Olof Johansson, Michael Ellerman
  Cc: Darren Stevens, R.T.Dickinson

Hello,

The Nemo board [1] with a P.A. Semi PA6T SoC doesn't boot anymore 
because of moving "pas_pci_init" to the device tree adoption [2] in the 
latest PowerPC updates 5.12-1 [3].

Unfortunately the Nemo board doesn't have it in its device tree. I 
reverted this commit and after that the Nemo board boots without any 
problems.

What do you think about this ifdef?

#ifdef CONFIG_PPC_PASEMI_NEMO
         /*
          * Check for the Nemo motherboard here, if we are running on one
          * then pas_pci_init()
          */
         if (of_machine_is_compatible("pasemi,nemo")) {
                 pas_pci_init();
         }
#endif

Thanks,
Christian

[1] https://en.wikipedia.org/wiki/AmigaOne_X1000
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/arch/powerpc/platforms/pasemi/setup.c?id=b12b47249688915e987a9a2a393b522f86f6b7ab
[3] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b12b47249688915e987a9a2a393b522f86f6b7ab

^ permalink raw reply

* Re: [PASEMI] Nemo board doesn't boot anymore because of moving pas_pci_init
From: Olof Johansson @ 2021-02-23 22:28 UTC (permalink / raw)
  To: Christian Zigotzky; +Cc: Darren Stevens, linuxppc-dev, R.T.Dickinson
In-Reply-To: <13741214-bafc-1ee5-4157-854c14dae17c@xenosoft.de>

Hi,

On Tue, Feb 23, 2021 at 1:43 PM Christian Zigotzky
<chzigotzky@xenosoft.de> wrote:
>
> Hello,
>
> The Nemo board [1] with a P.A. Semi PA6T SoC doesn't boot anymore
> because of moving "pas_pci_init" to the device tree adoption [2] in the
> latest PowerPC updates 5.12-1 [3].
>
> Unfortunately the Nemo board doesn't have it in its device tree. I
> reverted this commit and after that the Nemo board boots without any
> problems.
>
> What do you think about this ifdef?
>
> #ifdef CONFIG_PPC_PASEMI_NEMO
>          /*
>           * Check for the Nemo motherboard here, if we are running on one
>           * then pas_pci_init()
>           */
>          if (of_machine_is_compatible("pasemi,nemo")) {
>                  pas_pci_init();
>          }
> #endif

This is not a proper fix for the problem. Someone will need to debug
what on the pas_pci_init() codepath still needs to happen early in the
boot, even if the main PCI setup happens later.


-Olof

^ permalink raw reply

* Re: [PATCH] ibmveth: Switch to using the new API kobj_to_dev()
From: Jakub Kicinski @ 2021-02-23 23:46 UTC (permalink / raw)
  To: Yang Li; +Cc: cforno12, linux-kernel, netdev, paulus, linuxppc-dev, davem
In-Reply-To: <1613980941-45992-1-git-send-email-yang.lee@linux.alibaba.com>

On Mon, 22 Feb 2021 16:02:21 +0800 Yang Li wrote:
> fixed the following coccicheck:
> ./drivers/net/ethernet/ibm/ibmveth.c:1805:51-52: WARNING opportunity for
> kobj_to_dev()
> 
> Reported-by: Abaci Robot <abaci@linux.alibaba.com>
> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>

# Form letter - net-next is closed

We have already sent the networking pull request for 5.12 and therefore
net-next is closed for new drivers, features, code refactoring and
optimizations. We are currently accepting bug fixes only.

Please repost when net-next reopens after 5.12-rc1 is cut.

Look out for the announcement on the mailing list or check:
http://vger.kernel.org/~davem/net-next.html

RFC patches sent for review only are obviously welcome at any time.

^ permalink raw reply

* Re: [PATCH v2 1/2] ima: Free IMA measurement buffer on error
From: Petr Vorel @ 2021-02-23 23:33 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: sashal, Greg KH, linux-kernel, Mimi Zohar, tyhicks, ebiederm,
	dmitry.kasatkin, linux-integrity, linuxppc-dev, bauerman
In-Reply-To: <b8573374-86d0-f679-6c9f-a61b2bc6f7ea@linux.microsoft.com>

Hi all,

<snip>
> > > > <formletter>

> > > > This is not the correct way to submit patches for inclusion in the
> > > > stable kernel tree.  Please read:
> > > >       https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > > > for how to do this properly.

> > > > </formletter>


> > > Thanks for the info Greg.

> > > I will re-submit the two patches in the proper format.

> > No need.  I'm testing these patches now.  I'm not exactly sure what the
> > problem is.  Stable wasn't Cc'ed.  Is it that you sent the patch
> > directly to Greg or added "Fixes"?

> I had not Cced stable, but had "Fixes" tag in the patch.

> Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list")

> The problem is that the buffer allocated for forwarding the IMA measurement
> list is not freed - at the end of the kexec call and also in an error path.
> Please see the patch description for

> [PATCH v2 2/2] ima: Free IMA measurement buffer after kexec syscall

> IMA allocates kernel virtual memory to carry forward the measurement
> list, from the current kernel to the next kernel on kexec system call,
> in ima_add_kexec_buffer() function.  This buffer is not freed before
> completing the kexec system call resulting in memory leak.

> thanks,
>  -lakshmi

Mimi, Lakshmi, it looks like these two fixes haven't been submitted to stable kernels.
Could you please submit them?

Thanks a lot!

Kind regards,
Petr

^ permalink raw reply

* Re: [PATCH v2] selftests/powerpc: Fix L1D flushing tests for Power10
From: Daniel Axtens @ 2021-02-24  0:42 UTC (permalink / raw)
  To: Russell Currey, linuxppc-dev
In-Reply-To: <20210223070227.2916871-1-ruscur@russell.cc>

Russell Currey <ruscur@russell.cc> writes:

> The rfi_flush and entry_flush selftests work by using the PM_LD_MISS_L1
> perf event to count L1D misses.  The value of this event has changed
> over time:
>
> - Power7 uses 0x400f0
> - Power8 and Power9 use both 0x400f0 and 0x3e054
> - Power10 uses only 0x3e054
>
> Rather than relying on raw values, configure perf to count L1D read
> misses in the most explicit way available.
>
> This fixes the selftests to work on systems without 0x400f0 as
> PM_LD_MISS_L1, and should change no behaviour for systems that the tests
> already worked on.
>
> The only potential downside is that referring to a specific perf event
> requires PMU support implemented in the kernel for that platform.

So, IIUC:

 - if you used raw events and ran the binary on a P10 system using an
   older kernel that did not support P10 PMU events, you would still get
   valid results
 - but, if the HW perf event changes again, we have the same issue with
   failing tests

vs

 - if you use symbolic events and run the binary on a P10 system using
   an older kernel that does not support P10 PMU events, you would not
   be able to run the test.
 - but, if the HW perf event changes again, the test will continue to
   work. 


This seems like the correct tradeoff to make.

Having spent some quality time with these tests in the past:

Acked-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> ---
> v2: Move away from raw events as suggested by mpe
>
>  tools/testing/selftests/powerpc/security/entry_flush.c | 2 +-
>  tools/testing/selftests/powerpc/security/flush_utils.h | 4 ++++
>  tools/testing/selftests/powerpc/security/rfi_flush.c   | 2 +-
>  3 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/powerpc/security/entry_flush.c b/tools/testing/selftests/powerpc/security/entry_flush.c
> index 78cf914fa321..68ce377b205e 100644
> --- a/tools/testing/selftests/powerpc/security/entry_flush.c
> +++ b/tools/testing/selftests/powerpc/security/entry_flush.c
> @@ -53,7 +53,7 @@ int entry_flush_test(void)
>  
>  	entry_flush = entry_flush_orig;
>  
> -	fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
> +	fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
>  	FAIL_IF(fd < 0);
>  
>  	p = (char *)memalign(zero_size, CACHELINE_SIZE);
> diff --git a/tools/testing/selftests/powerpc/security/flush_utils.h b/tools/testing/selftests/powerpc/security/flush_utils.h
> index 07a5eb301466..7a3d60292916 100644
> --- a/tools/testing/selftests/powerpc/security/flush_utils.h
> +++ b/tools/testing/selftests/powerpc/security/flush_utils.h
> @@ -9,6 +9,10 @@
>  
>  #define CACHELINE_SIZE 128
>  
> +#define PERF_L1D_READ_MISS_CONFIG	((PERF_COUNT_HW_CACHE_L1D) | 		\
> +					(PERF_COUNT_HW_CACHE_OP_READ << 8) |	\
> +					(PERF_COUNT_HW_CACHE_RESULT_MISS << 16))
> +
>  void syscall_loop(char *p, unsigned long iterations,
>  		  unsigned long zero_size);
>  
> diff --git a/tools/testing/selftests/powerpc/security/rfi_flush.c b/tools/testing/selftests/powerpc/security/rfi_flush.c
> index 7565fd786640..f73484a6470f 100644
> --- a/tools/testing/selftests/powerpc/security/rfi_flush.c
> +++ b/tools/testing/selftests/powerpc/security/rfi_flush.c
> @@ -54,7 +54,7 @@ int rfi_flush_test(void)
>  
>  	rfi_flush = rfi_flush_orig;
>  
> -	fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
> +	fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
>  	FAIL_IF(fd < 0);
>  
>  	p = (char *)memalign(zero_size, CACHELINE_SIZE);
> -- 
> 2.30.1

^ permalink raw reply

* Re: [PASEMI] Nemo board doesn't boot anymore because of moving pas_pci_init
From: Michael Ellerman @ 2021-02-24  0:55 UTC (permalink / raw)
  To: Olof Johansson, Christian Zigotzky
  Cc: Darren Stevens, linuxppc-dev, R.T.Dickinson
In-Reply-To: <CAOesGMgtAXPQRThhkF5QR25R+F68F5C_HSUvFPW0Wk1DcpCwvA@mail.gmail.com>

Olof Johansson <olof@lixom.net> writes:
> Hi,
>
> On Tue, Feb 23, 2021 at 1:43 PM Christian Zigotzky
> <chzigotzky@xenosoft.de> wrote:
>>
>> Hello,
>>
>> The Nemo board [1] with a P.A. Semi PA6T SoC doesn't boot anymore
>> because of moving "pas_pci_init" to the device tree adoption [2] in the
>> latest PowerPC updates 5.12-1 [3].
>>
>> Unfortunately the Nemo board doesn't have it in its device tree. I
>> reverted this commit and after that the Nemo board boots without any
>> problems.
>>
>> What do you think about this ifdef?
>>
>> #ifdef CONFIG_PPC_PASEMI_NEMO
>>          /*
>>           * Check for the Nemo motherboard here, if we are running on one
>>           * then pas_pci_init()
>>           */
>>          if (of_machine_is_compatible("pasemi,nemo")) {
>>                  pas_pci_init();
>>          }
>> #endif
>
> This is not a proper fix for the problem. Someone will need to debug
> what on the pas_pci_init() codepath still needs to happen early in the
> boot, even if the main PCI setup happens later.

I looked but don't see anything 100% obvious.

Possibly it's the call to isa_bridge_find_early()?

static int __init pas_add_bridge(struct device_node *dev)
{
	...
	/*
	 * Scan for an isa bridge. This is needed to find the SB600 on the nemo
	 * and does nothing on machines without one.
	 */
	isa_bridge_find_early(hose);

	return 0;
}


cheers

^ permalink raw reply

* Re: [PATCH v19 01/13] kexec: Move ELF fields to struct kimage
From: Thiago Jung Bauermann @ 2021-02-24  1:03 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
	frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
	linux-arm-kernel, catalin.marinas, serge, devicetree,
	pasha.tatashin, will, sfr, prsriva, hsinyi, allison,
	christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
	linux-kernel, james.morse, gregkh, joe, linux-integrity,
	linuxppc-dev
In-Reply-To: <20210221174930.27324-2-nramas@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> ELF related fields elf_headers, elf_headers_sz, and elf_load_addr are
> defined in architecture specific 'struct kimage_arch' for x86, powerpc,
> and arm64.  The name of these fields are different in these
> architectures that makes it hard to have a common code for setting up
> the device tree for kexec system call.
>
> Move the ELF fields to 'struct kimage' defined in include/linux/kexec.h
> so common code can use it.
>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Rob Herring <robh@kernel.org>
> Fixes: 33488dc4d61f ("of: Add a common kexec FDT setup function")

This Fixes tag should be removed. It is referencing a patch from the
future (later in the series), and the commit id is meaningless.

> Reported-by: kernel test robot <lkp@intel.com>
> ---
>  include/linux/kexec.h | 5 +++++
>  1 file changed, 5 insertions(+)

With that fixed:

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH v19 02/13] arm64: Use ELF fields defined in 'struct kimage'
From: Thiago Jung Bauermann @ 2021-02-24  1:07 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
	frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
	linux-arm-kernel, catalin.marinas, serge, devicetree,
	pasha.tatashin, will, sfr, prsriva, hsinyi, allison,
	christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
	linux-kernel, james.morse, gregkh, joe, linux-integrity,
	linuxppc-dev
In-Reply-To: <20210221174930.27324-3-nramas@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> ELF related fields elf_headers, elf_headers_sz, and elf_headers_mem
> have been moved from 'struct kimage_arch' to 'struct kimage' as
> elf_headers, elf_headers_sz, and elf_load_addr respectively.
>
> Use the ELF fields defined in 'struct kimage'.
>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Rob Herring <robh@kernel.org>
> Fixes: 33488dc4d61f ("of: Add a common kexec FDT setup function")

This Fixes tag should be removed. It is referencing a patch from the
future (later in the series), and the commit id is meaningless.

> Reported-by: kernel test robot <lkp@intel.com>
> ---
>  arch/arm64/include/asm/kexec.h         |  4 ----
>  arch/arm64/kernel/machine_kexec_file.c | 18 +++++++++---------
>  2 files changed, 9 insertions(+), 13 deletions(-)

With that fixed:

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH v19 03/13] powerpc: Use ELF fields defined in 'struct kimage'
From: Thiago Jung Bauermann @ 2021-02-24  1:07 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
	frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
	linux-arm-kernel, catalin.marinas, serge, devicetree,
	pasha.tatashin, will, sfr, prsriva, hsinyi, allison,
	christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
	linux-kernel, james.morse, gregkh, joe, linux-integrity,
	linuxppc-dev
In-Reply-To: <20210221174930.27324-4-nramas@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> ELF related fields elf_headers, elf_headers_sz, and elfcorehdr_addr
> have been moved from 'struct kimage_arch' to 'struct kimage' as
> elf_headers, elf_headers_sz, and elf_load_addr respectively.
>
> Use the ELF fields defined in 'struct kimage'.
>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Rob Herring <robh@kernel.org>
> Fixes: 33488dc4d61f ("of: Add a common kexec FDT setup function")

This Fixes tag should be removed. It is referencing a patch from the
future (later in the series), and the commit id is meaningless.

> Reported-by: kernel test robot <lkp@intel.com>
> ---
>  arch/powerpc/include/asm/kexec.h  |  4 ----
>  arch/powerpc/kexec/file_load.c    |  6 +++---
>  arch/powerpc/kexec/file_load_64.c | 14 +++++++-------
>  3 files changed, 10 insertions(+), 14 deletions(-)

With that fixed:

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH v19 04/13] x86: Use ELF fields defined in 'struct kimage'
From: Thiago Jung Bauermann @ 2021-02-24  1:08 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
	frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
	linux-arm-kernel, catalin.marinas, serge, devicetree,
	pasha.tatashin, will, sfr, prsriva, hsinyi, allison,
	christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
	linux-kernel, james.morse, gregkh, joe, linux-integrity,
	linuxppc-dev
In-Reply-To: <20210221174930.27324-5-nramas@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> ELF related fields elf_headers, elf_headers_sz, and elf_load_addr
> have been moved from 'struct kimage_arch' to 'struct kimage'.
>
> Use the ELF fields defined in 'struct kimage'.
>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Rob Herring <robh@kernel.org>
> Fixes: 33488dc4d61f ("of: Add a common kexec FDT setup function")

Same thing. :-)

> Reported-by: kernel test robot <lkp@intel.com>
> ---
>  arch/x86/include/asm/kexec.h       |  5 -----
>  arch/x86/kernel/crash.c            | 14 +++++++-------
>  arch/x86/kernel/kexec-bzimage64.c  |  2 +-
>  arch/x86/kernel/machine_kexec_64.c |  4 ++--
>  4 files changed, 10 insertions(+), 15 deletions(-)

With that fixed:

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH v19 04/13] x86: Use ELF fields defined in 'struct kimage'
From: Thiago Jung Bauermann @ 2021-02-24  1:13 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
	frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
	linux-arm-kernel, catalin.marinas, serge, devicetree,
	pasha.tatashin, will, sfr, prsriva, hsinyi, allison,
	christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
	linux-kernel, james.morse, gregkh, joe, linux-integrity,
	linuxppc-dev
In-Reply-To: <20210221174930.27324-5-nramas@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> ELF related fields elf_headers, elf_headers_sz, and elf_load_addr
> have been moved from 'struct kimage_arch' to 'struct kimage'.
>
> Use the ELF fields defined in 'struct kimage'.
>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Rob Herring <robh@kernel.org>
> Fixes: 33488dc4d61f ("of: Add a common kexec FDT setup function")

Ditto.

> Reported-by: kernel test robot <lkp@intel.com>
> ---
>  arch/x86/include/asm/kexec.h       |  5 -----
>  arch/x86/kernel/crash.c            | 14 +++++++-------
>  arch/x86/kernel/kexec-bzimage64.c  |  2 +-
>  arch/x86/kernel/machine_kexec_64.c |  4 ++--
>  4 files changed, 10 insertions(+), 15 deletions(-)

With that fixed:

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH v19 05/13] of: Add a common kexec FDT setup function
From: Thiago Jung Bauermann @ 2021-02-24  1:20 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
	frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
	linux-arm-kernel, catalin.marinas, serge, devicetree,
	pasha.tatashin, will, sfr, prsriva, hsinyi, allison,
	christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
	linux-kernel, james.morse, gregkh, joe, linux-integrity,
	linuxppc-dev
In-Reply-To: <20210221174930.27324-6-nramas@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> From: Rob Herring <robh@kernel.org>
>
> Both arm64 and powerpc do essentially the same FDT /chosen setup for
> kexec.  The differences are either omissions that arm64 should have
> or additional properties that will be ignored.  The setup code can be
> combined and shared by both powerpc and arm64.
>
> The differences relative to the arm64 version:
>  - If /chosen doesn't exist, it will be created (should never happen).
>  - Any old dtb and initrd reserved memory will be released.
>  - The new initrd and elfcorehdr are marked reserved.
>  - "linux,booted-from-kexec" is set.
>
> The differences relative to the powerpc version:
>  - "kaslr-seed" and "rng-seed" may be set.
>  - "linux,elfcorehdr" is set.
>  - Any existing "linux,usable-memory-range" is removed.
>
> Combine the code for setting up the /chosen node in the FDT and updating
> the memory reservation for kexec, for powerpc and arm64, in
> of_kexec_alloc_and_setup_fdt() and move it to "drivers/of/kexec.c".
>
> Signed-off-by: Rob Herring <robh@kernel.org>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Fixes: 33488dc4d61f ("of: Add a common kexec FDT setup function")

A patch cannot fix itself. The world would be a much better place if it
could. :-)

> Reported-by: kernel test robot <lkp@intel.com>
> ---
>  drivers/of/Makefile |   6 +
>  drivers/of/kexec.c  | 265 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/of.h  |   5 +
>  3 files changed, 276 insertions(+)
>  create mode 100644 drivers/of/kexec.c

With that fixed:

Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH v19 05/13] of: Add a common kexec FDT setup function
From: Lakshmi Ramasubramanian @ 2021-02-24  1:57 UTC (permalink / raw)
  To: Thiago Jung Bauermann
  Cc: mark.rutland, tao.li, zohar, paulus, vincenzo.frascino,
	frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
	linux-arm-kernel, catalin.marinas, serge, devicetree,
	pasha.tatashin, will, sfr, prsriva, hsinyi, allison,
	christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
	linux-kernel, james.morse, gregkh, joe, linux-integrity,
	linuxppc-dev
In-Reply-To: <874ki2w9ak.fsf@manicouagan.localdomain>

On 2/23/21 5:20 PM, Thiago Jung Bauermann wrote:
> 
> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
> 
>> From: Rob Herring <robh@kernel.org>
>>
>> Both arm64 and powerpc do essentially the same FDT /chosen setup for
>> kexec.  The differences are either omissions that arm64 should have
>> or additional properties that will be ignored.  The setup code can be
>> combined and shared by both powerpc and arm64.
>>
>> The differences relative to the arm64 version:
>>   - If /chosen doesn't exist, it will be created (should never happen).
>>   - Any old dtb and initrd reserved memory will be released.
>>   - The new initrd and elfcorehdr are marked reserved.
>>   - "linux,booted-from-kexec" is set.
>>
>> The differences relative to the powerpc version:
>>   - "kaslr-seed" and "rng-seed" may be set.
>>   - "linux,elfcorehdr" is set.
>>   - Any existing "linux,usable-memory-range" is removed.
>>
>> Combine the code for setting up the /chosen node in the FDT and updating
>> the memory reservation for kexec, for powerpc and arm64, in
>> of_kexec_alloc_and_setup_fdt() and move it to "drivers/of/kexec.c".
>>
>> Signed-off-by: Rob Herring <robh@kernel.org>
>> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
>> Fixes: 33488dc4d61f ("of: Add a common kexec FDT setup function")
> 
> A patch cannot fix itself. The world would be a much better place if it
> could. :-)

:)

> 
>> Reported-by: kernel test robot <lkp@intel.com>
>> ---
>>   drivers/of/Makefile |   6 +
>>   drivers/of/kexec.c  | 265 ++++++++++++++++++++++++++++++++++++++++++++
>>   include/linux/of.h  |   5 +
>>   3 files changed, 276 insertions(+)
>>   create mode 100644 drivers/of/kexec.c
> 
> With that fixed:
> 
> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>

Thanks for reviewing the patches Thiago.

  -lakshmi



^ permalink raw reply

* Re: [PASEMI] Nemo board doesn't boot anymore because of moving pas_pci_init
From: Oliver O'Halloran @ 2021-02-24  2:17 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Olof Johansson, Darren Stevens, linuxppc-dev, R.T.Dickinson,
	Christian Zigotzky
In-Reply-To: <877dmythcr.fsf@mpe.ellerman.id.au>

On Wed, Feb 24, 2021 at 11:55 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Olof Johansson <olof@lixom.net> writes:
> > Hi,
> >
> > On Tue, Feb 23, 2021 at 1:43 PM Christian Zigotzky
> > <chzigotzky@xenosoft.de> wrote:
> >>
> >> Hello,
> >>
> >> The Nemo board [1] with a P.A. Semi PA6T SoC doesn't boot anymore
> >> because of moving "pas_pci_init" to the device tree adoption [2] in the
> >> latest PowerPC updates 5.12-1 [3].
> >>
> >> Unfortunately the Nemo board doesn't have it in its device tree. I
> >> reverted this commit and after that the Nemo board boots without any
> >> problems.
> >>
> >> What do you think about this ifdef?
> >>
> >> #ifdef CONFIG_PPC_PASEMI_NEMO
> >>          /*
> >>           * Check for the Nemo motherboard here, if we are running on one
> >>           * then pas_pci_init()
> >>           */
> >>          if (of_machine_is_compatible("pasemi,nemo")) {
> >>                  pas_pci_init();
> >>          }
> >> #endif
> >
> > This is not a proper fix for the problem. Someone will need to debug
> > what on the pas_pci_init() codepath still needs to happen early in the
> > boot, even if the main PCI setup happens later.
>
> I looked but don't see anything 100% obvious.
>
> Possibly it's the call to isa_bridge_find_early()?

Looks like it. I think the problem stems from the use of the PIO
helpers (mainly outb()) in i8259_init() which is called from
nemo_init_IRQ(). The PIO helpers require the ISA space to be mapped
and io_isa_base to be set since they take a PIO register address
rather than an MMIO address. It looks like there's a few other legacy
embedded platforms that might have the same problem.

I guess the real fix would be to decouple the ISA base address
discovery from the PHB discovery. That should be doable since it's all
discovered via DT anyway and we only support one ISA address range,
but it's a bit of work.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox