LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: KVM on POWER8 host lock up since 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
From: Nicholas Piggin @ 2020-10-19  4:50 UTC (permalink / raw)
  To: Michal Suchánek; +Cc: ro, linuxppc-dev, Hari Bathini
In-Reply-To: <1603066878.gtbyofrzyo.astroid@bobo.none>

Excerpts from Nicholas Piggin's message of October 19, 2020 11:00 am:
> Excerpts from Michal Suchánek's message of October 17, 2020 6:14 am:
>> On Mon, Sep 07, 2020 at 11:13:47PM +1000, Nicholas Piggin wrote:
>>> Excerpts from Michael Ellerman's message of August 31, 2020 8:50 pm:
>>> > Michal Suchánek <msuchanek@suse.de> writes:
>>> >> On Mon, Aug 31, 2020 at 11:14:18AM +1000, Nicholas Piggin wrote:
>>> >>> Excerpts from Michal Suchánek's message of August 31, 2020 6:11 am:
>>> >>> > Hello,
>>> >>> > 
>>> >>> > on POWER8 KVM hosts lock up since commit 10d91611f426 ("powerpc/64s:
>>> >>> > Reimplement book3s idle code in C").
>>> >>> > 
>>> >>> > The symptom is host locking up completely after some hours of KVM
>>> >>> > workload with messages like
>>> >>> > 
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 71
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 71
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>>> >>> > 
>>> >>> > printed before the host locks up.
>>> >>> > 
>>> >>> > The machines run sandboxed builds which is a mixed workload resulting in
>>> >>> > IO/single core/mutiple core load over time and there are periods of no
>>> >>> > activity and no VMS runnig as well. The VMs are shortlived so VM
>>> >>> > setup/terdown is somewhat excercised as well.
>>> >>> > 
>>> >>> > POWER9 with the new guest entry fast path does not seem to be affected.
>>> >>> > 
>>> >>> > Reverted the patch and the followup idle fixes on top of 5.2.14 and
>>> >>> > re-applied commit a3f3072db6ca ("powerpc/powernv/idle: Restore IAMR
>>> >>> > after idle") which gives same idle code as 5.1.16 and the kernel seems
>>> >>> > stable.
>>> >>> > 
>>> >>> > Config is attached.
>>> >>> > 
>>> >>> > I cannot easily revert this commit, especially if I want to use the same
>>> >>> > kernel on POWER8 and POWER9 - many of the POWER9 fixes are applicable
>>> >>> > only to the new idle code.
>>> >>> > 
>>> >>> > Any idea what can be the problem?
>>> >>> 
>>> >>> So hwthread_state is never getting back to to HWTHREAD_IN_IDLE on
>>> >>> those threads. I wonder what they are doing. POWER8 doesn't have a good
>>> >>> NMI IPI and I don't know if it supports pdbg dumping registers from the
>>> >>> BMC unfortunately.
>>> >>
>>> >> It may be possible to set up fadump with a later kernel version that
>>> >> supports it on powernv and dump the whole kernel.
>>> > 
>>> > Your firmware won't support it AFAIK.
>>> > 
>>> > You could try kdump, but if we have CPUs stuck in KVM then there's a
>>> > good chance it won't work :/
>>> 
>>> I haven't had any luck yet reproducing this still. Testing with sub 
>>> cores of various different combinations, etc. I'll keep trying though.
>> 
>> Hello,
>> 
>> I tried running some KVM guests to simulate the workload and what I get
>> is guests failing to start with a rcu stall. Tried both 5.3 and 5.9
>> kernel and qemu 4.2.1 and 5.1.0
>> 
>> To start some guests I run
>> 
>> for i in $(seq 0 9) ; do /opt/qemu/bin/qemu-system-ppc64 -m 2048 -accel kvm -smp 8 -kernel /boot/vmlinux -initrd /boot/initrd -nodefaults -nographic -serial mon:telnet::444$i,server,wait & done
>> 
>> To simulate some workload I run
>> 
>> xz -zc9T0 < /dev/zero > /dev/null &
>> while true; do
>>     killall -STOP xz; sleep 1; killall -CONT xz; sleep 1;
>> done &
>> 
>> on the host and add a job that executes this to the ramdisk. However, most
>> guests never get to the point where the job is executed.
>> 
>> Any idea what might be the problem?
> 
> I would say try without pv queued spin locks (but if the same thing is 
> happening with 5.3 then it must be something else I guess). 
> 
> I'll try to test a similar setup on a POWER8 here.

Couldn't reproduce the guest hang, they seem to run fine even with 
queued spinlocks. Might have a different .config.

I might have got a lockup in the host (although different symptoms than 
the original report). I'll look into that a bit further.

Thanks,
Nick

^ permalink raw reply

* [PATCH 2/2] powerpc/smp: Use GFP_ATOMIC while allocating tmp mask
From: Srikar Dronamraju @ 2020-10-19  4:27 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Gautham R Shenoy, Srikar Dronamraju, Qian Cai, LKML,
	Valentin Schneider, Peter Zijlstra, linuxppc-dev, Ingo Molnar
In-Reply-To: <20201019042716.106234-1-srikar@linux.vnet.ibm.com>

Qian Cai reported a regression where CPU Hotplug fails with the latest
powerpc/next

BUG: sleeping function called from invalid context at mm/slab.h:494
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/88
no locks held by swapper/88/0.
irq event stamp: 18074448
hardirqs last  enabled at (18074447): [<c0000000001a2a7c>] tick_nohz_idle_enter+0x9c/0x110
hardirqs last disabled at (18074448): [<c000000000106798>] do_idle+0x138/0x3b0
do_idle at kernel/sched/idle.c:253 (discriminator 1)
softirqs last  enabled at (18074440): [<c0000000000bbec4>] irq_enter_rcu+0x94/0xa0
softirqs last disabled at (18074439): [<c0000000000bbea0>] irq_enter_rcu+0x70/0xa0
CPU: 88 PID: 0 Comm: swapper/88 Tainted: G        W         5.9.0-rc8-next-20201007 #1
Call Trace:
[c00020000a4bfcf0] [c000000000649e98] dump_stack+0xec/0x144 (unreliable)
[c00020000a4bfd30] [c0000000000f6c34] ___might_sleep+0x2f4/0x310
[c00020000a4bfdb0] [c000000000354f94] slab_pre_alloc_hook.constprop.82+0x124/0x190
[c00020000a4bfe00] [c00000000035e9e8] __kmalloc_node+0x88/0x3a0
slab_alloc_node at mm/slub.c:2817
(inlined by) __kmalloc_node at mm/slub.c:4013
[c00020000a4bfe80] [c0000000006494d8] alloc_cpumask_var_node+0x38/0x80
kmalloc_node at include/linux/slab.h:577
(inlined by) alloc_cpumask_var_node at lib/cpumask.c:116
[c00020000a4bfef0] [c00000000003eedc] start_secondary+0x27c/0x800
update_mask_by_l2 at arch/powerpc/kernel/smp.c:1267
(inlined by) add_cpu_to_masks at arch/powerpc/kernel/smp.c:1387
(inlined by) start_secondary at arch/powerpc/kernel/smp.c:1420
[c00020000a4bff90] [c00000000000c468] start_secondary_resume+0x10/0x14

Allocating a temporary mask while performing a CPU Hotplug operation
with CONFIG_CPUMASK_OFFSTACK enabled, leads to calling a sleepable
function from a atomic context. Fix this by allocating the temporary
mask with GFP_ATOMIC flag. Also instead of having to allocate twice,
allocate the mask in the caller so that we only have to allocate once.
If the allocation fails, assume the mask to be same as sibling mask, which
will make the scheduler to drop this domain for this CPU.

Fixes: 70a94089d7f7 ("powerpc/smp: Optimize update_coregroup_mask")
Fixes: 3ab33d6dc3e9 ("powerpc/smp: Optimize update_mask_by_l2")
Reported-by: Qian Cai <cai@redhat.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Qian Cai <cai@redhat.com>
---
Changelog v1->v2:
https://lore.kernel.org/linuxppc-dev/20201008034240.34059-1-srikar@linux.vnet.ibm.com/t/#u
Updated 2nd patch based on comments from Michael Ellerman
- Remove the WARN_ON.
- Handle allocation failures in a more subtle fashion
- Allocate in the caller so that we allocate once.

 arch/powerpc/kernel/smp.c | 57 +++++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index a864b9b3228c..028479e9b66b 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1257,38 +1257,33 @@ static struct device_node *cpu_to_l2cache(int cpu)
 	return cache;
 }
 
-static bool update_mask_by_l2(int cpu)
+static bool update_mask_by_l2(int cpu, cpumask_var_t *mask)
 {
 	struct cpumask *(*submask_fn)(int) = cpu_sibling_mask;
 	struct device_node *l2_cache, *np;
-	cpumask_var_t mask;
 	int i;
 
 	if (has_big_cores)
 		submask_fn = cpu_smallcore_mask;
 
 	l2_cache = cpu_to_l2cache(cpu);
-	if (!l2_cache) {
-		/*
-		 * If no l2cache for this CPU, assume all siblings to share
-		 * cache with this CPU.
-		 */
+	if (!l2_cache || !*mask) {
+		/* Assume only core siblings share cache with this CPU */
 		for_each_cpu(i, submask_fn(cpu))
 			set_cpus_related(cpu, i, cpu_l2_cache_mask);
 
 		return false;
 	}
 
-	alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
-	cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
+	cpumask_and(*mask, cpu_online_mask, cpu_cpu_mask(cpu));
 
 	/* Update l2-cache mask with all the CPUs that are part of submask */
 	or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
 
 	/* Skip all CPUs already part of current CPU l2-cache mask */
-	cpumask_andnot(mask, mask, cpu_l2_cache_mask(cpu));
+	cpumask_andnot(*mask, *mask, cpu_l2_cache_mask(cpu));
 
-	for_each_cpu(i, mask) {
+	for_each_cpu(i, *mask) {
 		/*
 		 * when updating the marks the current CPU has not been marked
 		 * online, but we need to update the cache masks
@@ -1298,15 +1293,14 @@ static bool update_mask_by_l2(int cpu)
 		/* Skip all CPUs already part of current CPU l2-cache */
 		if (np == l2_cache) {
 			or_cpumasks_related(cpu, i, submask_fn, cpu_l2_cache_mask);
-			cpumask_andnot(mask, mask, submask_fn(i));
+			cpumask_andnot(*mask, *mask, submask_fn(i));
 		} else {
-			cpumask_andnot(mask, mask, cpu_l2_cache_mask(i));
+			cpumask_andnot(*mask, *mask, cpu_l2_cache_mask(i));
 		}
 
 		of_node_put(np);
 	}
 	of_node_put(l2_cache);
-	free_cpumask_var(mask);
 
 	return true;
 }
@@ -1349,40 +1343,46 @@ static inline void add_cpu_to_smallcore_masks(int cpu)
 	}
 }
 
-static void update_coregroup_mask(int cpu)
+static void update_coregroup_mask(int cpu, cpumask_var_t *mask)
 {
 	struct cpumask *(*submask_fn)(int) = cpu_sibling_mask;
-	cpumask_var_t mask;
 	int coregroup_id = cpu_to_coregroup_id(cpu);
 	int i;
 
-	alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
-	cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
-
 	if (shared_caches)
 		submask_fn = cpu_l2_cache_mask;
 
+	if (!*mask) {
+		/* Assume only siblings are part of this CPU's coregroup */
+		for_each_cpu(i, submask_fn(cpu))
+			set_cpus_related(cpu, i, cpu_coregroup_mask);
+
+		return;
+	}
+
+	cpumask_and(*mask, cpu_online_mask, cpu_cpu_mask(cpu));
+
 	/* Update coregroup mask with all the CPUs that are part of submask */
 	or_cpumasks_related(cpu, cpu, submask_fn, cpu_coregroup_mask);
 
 	/* Skip all CPUs already part of coregroup mask */
-	cpumask_andnot(mask, mask, cpu_coregroup_mask(cpu));
+	cpumask_andnot(*mask, *mask, cpu_coregroup_mask(cpu));
 
-	for_each_cpu(i, mask) {
+	for_each_cpu(i, *mask) {
 		/* Skip all CPUs not part of this coregroup */
 		if (coregroup_id == cpu_to_coregroup_id(i)) {
 			or_cpumasks_related(cpu, i, submask_fn, cpu_coregroup_mask);
-			cpumask_andnot(mask, mask, submask_fn(i));
+			cpumask_andnot(*mask, *mask, submask_fn(i));
 		} else {
-			cpumask_andnot(mask, mask, cpu_coregroup_mask(i));
+			cpumask_andnot(*mask, *mask, cpu_coregroup_mask(i));
 		}
 	}
-	free_cpumask_var(mask);
 }
 
 static void add_cpu_to_masks(int cpu)
 {
 	int first_thread = cpu_first_thread_sibling(cpu);
+	cpumask_var_t mask;
 	int i;
 
 	/*
@@ -1396,10 +1396,15 @@ static void add_cpu_to_masks(int cpu)
 			set_cpus_related(i, cpu, cpu_sibling_mask);
 
 	add_cpu_to_smallcore_masks(cpu);
-	update_mask_by_l2(cpu);
+
+	/* In CPU-hotplug path, hence use GFP_ATOMIC */
+	alloc_cpumask_var_node(&mask, GFP_ATOMIC, cpu_to_node(cpu));
+	update_mask_by_l2(cpu, &mask);
 
 	if (has_coregroup_support())
-		update_coregroup_mask(cpu);
+		update_coregroup_mask(cpu, &mask);
+
+	free_cpumask_var(mask);
 }
 
 /* Activate a secondary processor. */
-- 
2.18.2


^ permalink raw reply related

* [PATCH 1/2] powerpc/smp: Remove unnecessary variable
From: Srikar Dronamraju @ 2020-10-19  4:27 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Gautham R Shenoy, Srikar Dronamraju, Peter Zijlstra,
	LKML, Valentin Schneider, Qian Cai, linuxppc-dev, Ingo Molnar
In-Reply-To: <20201019042716.106234-1-srikar@linux.vnet.ibm.com>

Commit 3ab33d6dc3e9 ("powerpc/smp: Optimize update_mask_by_l2")
introduced submask_fn in update_mask_by_l2 to track the right submask.
However commit f6606cfdfbcd ("powerpc/smp: Dont assume l2-cache to be
superset of sibling") introduced sibling_mask in update_mask_by_l2 to
track the same submask. Remove sibling_mask in favour of submask_fn.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Qian Cai <cai@redhat.com>
---
 arch/powerpc/kernel/smp.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 8d1c401f4617..a864b9b3228c 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1264,18 +1264,16 @@ static bool update_mask_by_l2(int cpu)
 	cpumask_var_t mask;
 	int i;
 
+	if (has_big_cores)
+		submask_fn = cpu_smallcore_mask;
+
 	l2_cache = cpu_to_l2cache(cpu);
 	if (!l2_cache) {
-		struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
-
 		/*
 		 * If no l2cache for this CPU, assume all siblings to share
 		 * cache with this CPU.
 		 */
-		if (has_big_cores)
-			sibling_mask = cpu_smallcore_mask;
-
-		for_each_cpu(i, sibling_mask(cpu))
+		for_each_cpu(i, submask_fn(cpu))
 			set_cpus_related(cpu, i, cpu_l2_cache_mask);
 
 		return false;
@@ -1284,9 +1282,6 @@ static bool update_mask_by_l2(int cpu)
 	alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
 	cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
 
-	if (has_big_cores)
-		submask_fn = cpu_smallcore_mask;
-
 	/* Update l2-cache mask with all the CPUs that are part of submask */
 	or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
 
-- 
2.18.2


^ permalink raw reply related

* [PATCH v2 0/2] Fixes for coregroup
From: Srikar Dronamraju @ 2020-10-19  4:27 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Gautham R Shenoy, Srikar Dronamraju, Peter Zijlstra,
	LKML, Valentin Schneider, Qian Cai, linuxppc-dev, Ingo Molnar

These patches fixes problems introduced by the coregroup patches.
The first patch we remove a redundant variable.
Second patch allows to boot with CONFIG_CPUMASK_OFFSTACK enabled.

Changelog v1->v2:
https://lore.kernel.org/linuxppc-dev/20201008034240.34059-1-srikar@linux.vnet.ibm.com/t/#u
1. 1st patch was not part of previous posting.
2. Updated 2nd patch based on comments from Michael Ellerman

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Qian Cai <cai@redhat.com>

Srikar Dronamraju (2):
  powerpc/smp: Remove unnecessary variable
  powerpc/smp: Use GFP_ATOMIC while allocating tmp mask

 arch/powerpc/kernel/smp.c | 70 +++++++++++++++++++--------------------
 1 file changed, 35 insertions(+), 35 deletions(-)

-- 
2.18.2


^ permalink raw reply

* Re: KVM on POWER8 host lock up since 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
From: Nicholas Piggin @ 2020-10-19  1:00 UTC (permalink / raw)
  To: Michal Suchánek; +Cc: ro, linuxppc-dev, Hari Bathini
In-Reply-To: <20201016201410.GH29778@kitsune.suse.cz>

Excerpts from Michal Suchánek's message of October 17, 2020 6:14 am:
> On Mon, Sep 07, 2020 at 11:13:47PM +1000, Nicholas Piggin wrote:
>> Excerpts from Michael Ellerman's message of August 31, 2020 8:50 pm:
>> > Michal Suchánek <msuchanek@suse.de> writes:
>> >> On Mon, Aug 31, 2020 at 11:14:18AM +1000, Nicholas Piggin wrote:
>> >>> Excerpts from Michal Suchánek's message of August 31, 2020 6:11 am:
>> >>> > Hello,
>> >>> > 
>> >>> > on POWER8 KVM hosts lock up since commit 10d91611f426 ("powerpc/64s:
>> >>> > Reimplement book3s idle code in C").
>> >>> > 
>> >>> > The symptom is host locking up completely after some hours of KVM
>> >>> > workload with messages like
>> >>> > 
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 71
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 71
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>> >>> > 
>> >>> > printed before the host locks up.
>> >>> > 
>> >>> > The machines run sandboxed builds which is a mixed workload resulting in
>> >>> > IO/single core/mutiple core load over time and there are periods of no
>> >>> > activity and no VMS runnig as well. The VMs are shortlived so VM
>> >>> > setup/terdown is somewhat excercised as well.
>> >>> > 
>> >>> > POWER9 with the new guest entry fast path does not seem to be affected.
>> >>> > 
>> >>> > Reverted the patch and the followup idle fixes on top of 5.2.14 and
>> >>> > re-applied commit a3f3072db6ca ("powerpc/powernv/idle: Restore IAMR
>> >>> > after idle") which gives same idle code as 5.1.16 and the kernel seems
>> >>> > stable.
>> >>> > 
>> >>> > Config is attached.
>> >>> > 
>> >>> > I cannot easily revert this commit, especially if I want to use the same
>> >>> > kernel on POWER8 and POWER9 - many of the POWER9 fixes are applicable
>> >>> > only to the new idle code.
>> >>> > 
>> >>> > Any idea what can be the problem?
>> >>> 
>> >>> So hwthread_state is never getting back to to HWTHREAD_IN_IDLE on
>> >>> those threads. I wonder what they are doing. POWER8 doesn't have a good
>> >>> NMI IPI and I don't know if it supports pdbg dumping registers from the
>> >>> BMC unfortunately.
>> >>
>> >> It may be possible to set up fadump with a later kernel version that
>> >> supports it on powernv and dump the whole kernel.
>> > 
>> > Your firmware won't support it AFAIK.
>> > 
>> > You could try kdump, but if we have CPUs stuck in KVM then there's a
>> > good chance it won't work :/
>> 
>> I haven't had any luck yet reproducing this still. Testing with sub 
>> cores of various different combinations, etc. I'll keep trying though.
> 
> Hello,
> 
> I tried running some KVM guests to simulate the workload and what I get
> is guests failing to start with a rcu stall. Tried both 5.3 and 5.9
> kernel and qemu 4.2.1 and 5.1.0
> 
> To start some guests I run
> 
> for i in $(seq 0 9) ; do /opt/qemu/bin/qemu-system-ppc64 -m 2048 -accel kvm -smp 8 -kernel /boot/vmlinux -initrd /boot/initrd -nodefaults -nographic -serial mon:telnet::444$i,server,wait & done
> 
> To simulate some workload I run
> 
> xz -zc9T0 < /dev/zero > /dev/null &
> while true; do
>     killall -STOP xz; sleep 1; killall -CONT xz; sleep 1;
> done &
> 
> on the host and add a job that executes this to the ramdisk. However, most
> guests never get to the point where the job is executed.
> 
> Any idea what might be the problem?

I would say try without pv queued spin locks (but if the same thing is 
happening with 5.3 then it must be something else I guess). 

I'll try to test a similar setup on a POWER8 here.

Thanks,
Nick

> 
> In the past I was able to boot guests quite realiably.
> 
> This is boot log of one of the VMs
> 
> Trying ::1...
> Connected to localhost.
> Escape character is '^]'.
> 
> 
> SLOF **********************************************************************
> QEMU Starting
>  Build Date = Jul 17 2020 11:15:24
>  FW Version = git-e18ddad8516ff2cf
>  Press "s" to enter Open Firmware.
> 
> Populating /vdevice methods
> Populating /vdevice/vty@71000000
> Populating /vdevice/nvram@71000001
> Populating /pci@800000020000000
> No NVRAM common partition, re-initializing...
> Scanning USB 
> Using default console: /vdevice/vty@71000000
> Detected RAM kernel at 400000 (27c8620 bytes) 
>      
>   Welcome to Open Firmware
> 
>   Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
>   This program and the accompanying materials are made available
>   under the terms of the BSD License available at
>   http://www.opensource.org/licenses/bsd-license.php
> 
> Booting from memory...
> OF stdout device is: /vdevice/vty@71000000
> Preparing to boot Linux version 5.9.0-1.g11733e1-default (geeko@buildhost) (gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c], GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.34.0.20200325-1) #1 SMP Sun Oct 11 22:20:46 UTC 2020 (11733e1)
> Detected machine type: 0000000000000101
> command line:  
> Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
> Calling ibm,client-architecture-support... done
> memory layout at init:
>   memory_limit : 0000000000000000 (16 MB aligned)
>   alloc_bottom : 0000000003810000
>   alloc_top    : 0000000030000000
>   alloc_top_hi : 0000000080000000
>   rmo_top      : 0000000030000000
>   ram_top      : 0000000080000000
> instantiating rtas at 0x000000002fff0000... done
> prom_hold_cpus: skipped
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x0000000003820000 -> 0x0000000003820a2c
> Device tree struct  0x0000000003830000 -> 0x0000000003840000
> Quiescing Open Firmware ...
> Booting Linux via __start() @ 0x0000000000400000 ...
> [    0.000000] hash-mmu: Page sizes from device-tree:
> [    0.000000] hash-mmu: base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
> [    0.000000] hash-mmu: base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
> [    0.000000] Using 1TB segments
> [    0.000000] hash-mmu: Initializing hash mmu with SLB
> [    0.000000] Linux version 5.9.0-1.g11733e1-default (geeko@buildhost) (gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c], GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.34.0.20200325-1) #1 SMP Sun Oct 11 22:20:46 UTC 2020 (11733e1)
> [    0.000000] Found initrd at 0xc000000002be0000:0xc000000003804784
> [    0.000000] Using pSeries machine description
> [    0.000000] printk: bootconsole [udbg0] enabled
> [    0.000000] Partition configured for 8 cpus.
> [    0.000000] CPU maps initialized for 1 thread per core
> [    0.000000] -----------------------------------------------------
> [    0.000000] phys_mem_size     = 0x80000000
> [    0.000000] dcache_bsize      = 0x80
> [    0.000000] icache_bsize      = 0x80
> [    0.000000] cpu_features      = 0x000002eb8f4d91a7
> [    0.000000]   possible        = 0x000ffbfbcf5fb1a7
> [    0.000000]   always          = 0x00000003800081a1
> [    0.000000] cpu_user_features = 0xdc0065c2 0xae000000
> [    0.000000] mmu_features      = 0x78006001
> [    0.000000] firmware_features = 0x00000085455a445f
> [    0.000000] vmalloc start     = 0xc008000000000000
> [    0.000000] IO start          = 0xc00a000000000000
> [    0.000000] vmemmap start     = 0xc00c000000000000
> [    0.000000] hash-mmu: ppc64_pft_size    = 0x18
> [    0.000000] hash-mmu: htab_hash_mask    = 0x1ffff
> [    0.000000] -----------------------------------------------------
> [    0.000000] numa:   NODE_DATA [mem 0x7ffa8900-0x7ffaffff]
> [    0.000000] rfi-flush: fallback displacement flush available
> [    0.000000] rfi-flush: ori type flush available
> [    0.000000] rfi-flush: mttrig type flush available
> [    0.000000] count-cache-flush: hardware flush enabled.
> [    0.000000] link-stack-flush: software flush enabled.
> [    0.000000] stf-barrier: hwsync barrier available
> [    0.000000] PCI host bridge /pci@800000020000000  ranges:
> [    0.000000]   IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
> [    0.000000]  MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000 
> [    0.000000]  MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000 
> [    0.000000] PCI: OF: PROBE_ONLY disabled
> [    0.000000] PPC64 nvram contains 65536 bytes
> [    0.000000] PV qspinlock hash table entries: 4096 (order: 0, 65536 bytes, linear)
> [    0.000000] barrier-nospec: using ORI speculation barrier
> [    0.000000] Zone ranges:
> [    0.000000]   Normal   [mem 0x0000000000000000-0x000000007fffffff]
> [    0.000000]   Device   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000000000-0x000000007fffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
> [    0.000000] percpu: Embedded 11 pages/cpu s629400 r0 d91496 u1048576
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32736
> [    0.000000] Policy zone: Normal
> [    0.000000] Kernel command line: 
> [    0.000000] Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes, linear)
> [    0.000000] Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes, linear)
> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.000000] Memory: 0K/2097152K available (15104K kernel code, 1984K rwdata, 7040K rodata, 5824K init, 10721K bss, 131968K reserved, 0K cma-reserved)
> [    0.000000] random: get_random_u64 called from kmem_cache_open+0x3c/0x330 with crng_init=0
> [    0.000000] SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
> [    0.000000] ftrace: allocating 37469 entries in 14 pages
> [    0.000000] ftrace: allocated 14 pages with 3 groups
> [    0.000000] rcu: Hierarchical RCU implementation.
> [    0.000000] rcu: 	RCU event tracing is enabled.
> [    0.000000] rcu: 	RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=8.
> [    0.000000] 	Trampoline variant of Tasks RCU enabled.
> [    0.000000] 	Rude variant of Tasks RCU enabled.
> [    0.000000] 	Tracing variant of Tasks RCU enabled.
> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
> [    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
> [    0.000000] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> [    0.000001] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
> [    0.001107] clocksource: timebase mult[1f40000] shift[24] registered
> [    0.001819] Console: colour dummy device 80x25
> [    0.002296] printk: console [hvc0] enabled
> [    0.002296] printk: console [hvc0] enabled
> [    0.002762] printk: bootconsole [udbg0] disabled
> [    0.002762] printk: bootconsole [udbg0] disabled
> [    0.003323] pid_max: default: 32768 minimum: 301
> [    0.003435] LSM: Security Framework initializing
> [    0.003529] AppArmor: AppArmor initialized
> [    0.003597] Mount-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
> [    0.003660] Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
> [    0.004286] EEH: pSeries platform initialized
> [    0.004329] POWER8 performance monitor hardware support registered
> [    0.004380] power8-pmu: PMAO restore workaround active.
> [    0.004434] rcu: Hierarchical SRCU implementation.
> [    0.004660] smp: Bringing up secondary CPUs ...
> [    0.007882] smp: Brought up 1 node, 8 CPUs
> [    0.007927] numa: Node 0 CPUs: 0-7
> [    0.007958] Using standard scheduler topology
> [    0.009002] node 0 deferred pages initialised in 0ms
> [    0.009491] devtmpfs: initialized
> [    0.018942] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
> [    0.019031] futex hash table entries: 2048 (order: 2, 262144 bytes, linear)
> [    0.019229] pinctrl core: initialized pinctrl subsystem
> [    0.019602] NET: Registered protocol family 16
> [    0.019805] audit: initializing netlink subsys (disabled)
> [    0.020018] audit: type=2000 audit(1602878417.010:1): state=initialized audit_enabled=0 res=1
> [    0.020091] thermal_sys: Registered thermal governor 'fair_share'
> [    0.020092] thermal_sys: Registered thermal governor 'bang_bang'
> [    0.020180] thermal_sys: Registered thermal governor 'step_wise'
> [    0.020228] thermal_sys: Registered thermal governor 'user_space'
> [    0.020292] cpuidle: using governor ladder
> [    0.020372] cpuidle: using governor menu
> [    0.020521] pstore: Registered nvram as persistent store backend
> Linux ppc64le
> #1 SMP Sun Oct 1[    0.024636] PCI: Probing PCI hardware
> [    0.024713] PCI host bridge to bus 0000:00
> [    0.024746] pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff] (bus address [0x0000-0xffff])
> [    0.024814] pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
> [    0.024891] pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff]
> [    0.024950] pci_bus 0000:00: root bus resource [bus 00-ff]
> [    0.026710] IOMMU table initialized, virtual merging enabled
> [    0.026769] pci_bus 0000:00: resource 4 [io  0x10000-0x1ffff]
> [    0.026820] pci_bus 0000:00: resource 5 [mem 0x200080000000-0x2000ffffffff]
> [    0.026869] pci_bus 0000:00: resource 6 [mem 0x210000000000-0x21ffffffffff]
> [    0.026918] EEH: No capable adapters found: recovery disabled.
> [    0.031232] iommu: Default domain type: Translated 
> [    0.031358] vgaarb: loaded
> [    0.031439] pps_core: LinuxPPS API ver. 1 registered
> [    0.031479] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
> [    0.031548] PTP clock support registered
> [    0.031585] EDAC MC: Ver: 3.0.0
> [    0.031861] NetLabel: Initializing
> [    0.031893] NetLabel:  domain hash size = 128
> [    0.031932] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
> [    0.031992] NetLabel:  unlabeled traffic allowed by default
> [    0.045283] clocksource: Switched to clocksource timebase
> [    0.061403] VFS: Disk quotas dquot_6.6.0
> [    0.061497] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
> [    0.061581] hugetlbfs: disabling because there are no supported hugepage sizes
> [    0.061727] AppArmor: AppArmor Filesystem Enabled
> [    0.063692] random: fast init done
> [    0.065139] NET: Registered protocol family 2
> [    0.065324] tcp_listen_portaddr_hash hash table entries: 4096 (order: 0, 65536 bytes, linear)
> [    0.065401] TCP established hash table entries: 16384 (order: 1, 131072 bytes, linear)
> [    0.065482] TCP bind hash table entries: 16384 (order: 2, 262144 bytes, linear)
> [    0.065564] TCP: Hash tables configured (established 16384 bind 16384)
> [    0.065658] MPTCP token hash table entries: 4096 (order: 0, 98304 bytes, linear)
> [    0.065729] UDP hash table entries: 2048 (order: 0, 65536 bytes, linear)
> [    0.065785] UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes, linear)
> [    0.065893] NET: Registered protocol family 1
> [    0.065937] NET: Registered protocol family 44
> [    0.065979] PCI: CLS 0 bytes, default 128
> [    0.066053] Trying to unpack rootfs image as initramfs...
> [    0.095306] rcu: INFO: rcu_sched self-detected stall on CPU
> [    0.095492] rcu: 	4-...!: (1 GPs behind) idle=0d6/0/0x1 softirq=6/7 fqs=0 
> [    0.095647] 	(t=13508 jiffies g=-1187 q=215)
> [    0.095769] rcu: rcu_sched kthread starved for 13508 jiffies! g-1187 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
> [    0.096015] rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> [    0.096233] rcu: RCU grace-period kthread stack dump:
> [    0.096360] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000808
> [    0.096551] Call Trace:
> [    0.096621] [c000000007aab820] [c000000007aab890] 0xc000000007aab890 (unreliable)
> [    0.096812] [c000000007aaba00] [c00000000001c25c] __switch_to+0x11c/0x200
> [    0.096971] [c000000007aaba60] [c000000000ea50ac] __schedule+0x23c/0x800
> [    0.097136] [c000000007aabb30] [c000000000ea56e4] schedule+0x74/0x140
> [    0.097294] [c000000007aabb60] [c000000000eab6b0] schedule_timeout+0xa0/0x1c0
> [    0.097483] [c000000007aabc30] [c00000000021f00c] rcu_gp_fqs_loop+0x48c/0x610
> [    0.097674] [c000000007aabcf0] [c000000000222f40] rcu_gp_kthread+0x200/0x290
> [    0.097859] [c000000007aabdb0] [c000000000190e60] kthread+0x190/0x1a0
> [    0.098018] [c000000007aabe20] [c00000000000d3d0] ret_from_kernel_thread+0x5c/0x6c
> [    0.098211] Sending NMI from CPU 4 to CPUs 0:
> [    6.719253] CPU 0 didn't respond to backtrace IPI, inspecting paca.
> [    6.722181] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 1 (swapper/0)
> [    6.722297] Back trace of paca->saved_r1 (0xc000000007a8f140) (possibly stale):
> [    6.722412] Call Trace:
> [    6.722489] [c000000007a8f140] [0000000000000075] 0x75 (unreliable)
> [    6.722581] [c000000007a8f1a0] [c000000007a8f280] 0xc000000007a8f280
> [    6.722680] [c000000007a8f250] [c000000000254e30] tick_do_update_jiffies64.part.0+0x100/0x1f0
> [    6.722785] [c000000007a8f290] [c00000000025505c] tick_sched_timer+0x13c/0x140
> [    6.722855] [c000000007a8f2d0] [c00000000023ac50] __run_hrtimer+0xb0/0x360
> [    6.722916] [c000000007a8f320] [c00000000023afd4] __hrtimer_run_queues+0xd4/0x1a0
> [    6.722986] [c000000007a8f380] [c00000000023c144] hrtimer_interrupt+0x124/0x300
> [    6.723064] [c000000007a8f430] [c0000000000272b4] timer_interrupt+0x104/0x2d0
> [    6.723141] [c000000007a8f490] [c000000000009aa0] decrementer_common_virt+0x190/0x1a0
> [    6.723227] --- interrupt: 900 at lzma_main+0x1d4/0x2a0
> [    6.723227]     LR = lzma_main+0xcc/0x2a0
> [    6.723317] [c000000007a8f790] [c0000000008dec00] lzma_main+0x1f0/0x2a0 (unreliable)
> [    6.723393] [c000000007a8f7d0] [c0000000008dee90] lzma2_lzma+0x1e0/0x390
> [    6.723448] [c000000007a8f820] [c0000000008df150] xz_dec_lzma2_run+0x110/0x6e0
> [    6.723523] [c000000007a8f8c0] [c0000000008dce58] dec_block+0x218/0x2b0
> [    6.723578] [c000000007a8f920] [c0000000008dd390] dec_main+0x1a0/0x590
> [    6.723645] [c000000007a8f9b0] [c0000000008dd7d4] xz_dec_run+0x54/0x180
> [    6.723714] [c000000007a8f9f0] [c00000000160333c] unxz+0x1e8/0x408
> [    6.723786] [c000000007a8faa0] [c0000000015b6f58] unpack_to_rootfs+0x1e4/0x378
> [    6.723851] [c000000007a8fb50] [c0000000015b7264] populate_rootfs+0x98/0x184
> [    6.723926] [c000000007a8fbd0] [c000000000011ee0] do_one_initcall+0x60/0x2b0
> [    6.723992] [c000000007a8fca0] [c0000000015b4f34] do_initcalls+0x140/0x18c
> [    6.724062] [c000000007a8fd50] [c0000000015b520c] kernel_init_freeable+0x1f0/0x25c
> [    6.724129] [c000000007a8fdb0] [c000000000012534] kernel_init+0x2c/0x168
> [    6.724183] [c000000007a8fe20] [c00000000000d3d0] ret_from_kernel_thread+0x5c/0x6c
> [    6.724315] Sending NMI from CPU 4 to CPUs 1:
> [    6.724569] CPU 1 didn't respond to backtrace IPI, inspecting paca.
> [    6.724582] NMI backtrace for cpu 1
> [    6.724640] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 0 (swapper/1)
> [    6.724801] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-1.g11733e1-default #1 openSUSE Tumbleweed (unreleased)
> [    6.724838] Back trace of paca->saved_r1 (0xc000000007aff720) (possibly stale):
> [    6.724841] Call Trace:
> [    6.724957] NIP:  c0000000000fb260 LR: c0000000000fdf58 CTR: c00000000012e014
> [    6.725027] Sending NMI from CPU 4 to CPUs 2:
> [    6.725049] REGS: c000000007ad7b00 TRAP: 0500   Not tainted  (5.9.0-1.g11733e1-default)
> [    6.725052] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 48000224  XER: 00000000
> [    6.725188] CPU 2 didn't respond to backtrace IPI, inspecting paca.
> [    6.725194] NMI backtrace for cpu 2
> [    6.725200] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.9.0-1.g11733e1-default #1 openSUSE Tumbleweed (unreleased)
> [    6.725201] NIP:  c0000000000fb260 LR: c0000000000fdf58 CTR: c00000000012e014
> [    6.725203] REGS: c000000007af7b00 TRAP: 0500   Not tainted  (5.9.0-1.g11733e1-default)
> [    6.725203] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 48000024  XER: 00000000
> [    6.725208] CFAR: c0000000001e8fbc IRQMASK: 0 
> [    6.725208] GPR00: 0000000028000024 c000000007af7d90 c000000001cd4a00 0000000000000000 
> [    6.725208] GPR04: c00000007f612c30 0000000000000000 00000000001570d0 000000000014de44 
> [    6.725208] GPR08: 000000007df60000 c00000003fffd000 0000000000000001 00000000ffffffff 
> [    6.725208] GPR12: c0000000000fdec0 c00000003fffd600 0000000000000000 0000000000000000 
> [    6.725208] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    6.725208] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 
> [    6.725208] GPR24: 0000000000000002 0000000000000000 0000000000000000 c000000001d17640 
> [    6.725208] GPR28: 0000000000000001 c000000007a49e00 c0000000016a7190 c0000000016a7188 
> [    6.725233] NIP [c0000000000fb260] plpar_hcall_norets+0x1c/0x28
> [    6.725235] LR [c0000000000fdf58] pseries_lpar_idle+0x98/0x190
> [    6.725236] Call Trace:
> [    6.725238] [c000000007af7d90] [c000000007a49e00] 0xc000000007a49e00 (unreliable)
> [    6.725241] [c000000007af7e00] [c00000000001da40] arch_cpu_idle+0x50/0x180
> [    6.725266] [c000000007af7e30] [c000000000eacd84] default_idle_call+0x84/0x208
> [    6.725268] CFAR: c0000000001e9170 
> [    6.725272] IRQMASK: 0 
> [    6.725272] GPR00: 0000000028000224 c000000007ad7d90 c000000001cd4a00 0000000000000000 
> [    6.725272] GPR04: c00000007f512c30 0000000000000000 0000000000101b60 00000000000fa76c 
> [    6.725272] GPR08: 000000007de60000 c00000003fffe800 0000000000000001 00000000ffffffff 
> [    6.725272] GPR12: c0000000000fdec0 c00000003fffee00 0000000000000000 0000000000000000 
> [    6.725272] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    6.725272] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 
> [    6.725272] GPR24: 0000000000000001 0000000000000000 0000000000000000 c000000001d17640 
> [    6.725272] GPR28: 0000000000000001 c000000007a42780 c0000000016a7190 c0000000016a7188 
> [    6.725289] NIP [c0000000000fb260] plpar_hcall_norets+0x1c/0x28
> [    6.725291] LR [c0000000000fdf58] pseries_lpar_idle+0x98/0x190
> [    6.725292] Call Trace:
> [    6.725295] [c000000007ad7d90] [c000000007a42780] 0xc000000007a42780 (unreliable)
> [    6.725298] [c000000007ad7e00] [c00000000001da40] arch_cpu_idle+0x50/0x180
> [    6.725302] [c000000007ad7e30] [c000000000eacd84] default_idle_call+0x84/0x208
> [    6.725305] [c000000007ad7e70] [c0000000001afd6c] cpuidle_idle_call+0x20c/0x2d0
> [    6.725306] [c000000007ad7ec0] [c0000000001aff20] do_idle+0xf0/0x1d0
> [    6.725308] [c000000007ad7f10] [c0000000001b024c] cpu_startup_entry+0x3c/0x40
> [    6.725310] [c000000007ad7f40] [c00000000005f038] start_secondary+0x248/0x250
> [    6.725312] [c000000007ad7f90] [c00000000000c454] start_secondary_prolog+0x10/0x14
> [    6.725312] Instruction dump:
> [    6.725337] 7c0803a6 3884fff8 78630100 78840020 4bfffeb0 3c4c01be 384297bc 7c421378 
> [    6.725340] 7c000026 90010008 60000000 44000022 <80010008> 7c0ff120 4e800020 7c0802a6 
> [    6.725357] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 0 (swapper/2)
> [    6.725517] [c000000007af7e70] [c0000000001afd6c] cpuidle_idle_call+0x20c/0x2d0
> [    6.725556] Back trace of paca->saved_r1 (0x0000000000000000) (possibly stale):
> [    6.725558] Call Trace:
> [    6.725658] [c000000007af7ec0] [c0000000001aff20] do_idle+0xf0/0x1d0
> [    6.725740] Sending NMI from CPU 4 to CPUs 3:
> [    6.725806] [c000000007af7f10] [c0000000001b024c] cpu_startup_entry+0x3c/0x40
> [    6.728644] [c000000007af7f40] [c00000000005f038] start_secondary+0x248/0x250
> [    6.728707] [c000000007af7f90] [c00000000000c454] start_secondary_prolog+0x10/0x14
> [    6.728769] Instruction dump:
> [    6.728805] 7c0803a6 3884fff8 78630100 78840020 4bfffeb0 3c4c01be 384297bc 7c421378 
> [    6.728869] 7c000026 90010008 60000000 44000022 <80010008> 7c0ff120 4e800020 7c0802a6 
> [   12.209770] CPU 3 didn't respond to backtrace IPI, inspecting paca.
> [   12.210353] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 0 (swapper/3)
> [   12.210436] Back trace of paca->saved_r1 (0xc000000007a83760) (possibly stale):
> [   12.210546] Call Trace:
> [   12.210578] [c000000007a83760] [c000000007a837b0] 0xc000000007a837b0 (unreliable)
> [   12.210655] [c000000007a83800] [c0000000001e9574] __pv_queued_spin_lock_slowpath+0x2f4/0x420
> [   12.210739] [c000000007a83850] [c000000000ead414] _raw_spin_lock_irqsave+0xa4/0xc0
> [   12.210814] [c000000007a83880] [c000000000220cc8] rcu_report_qs_rdp.constprop.0+0x38/0x170
> [   12.210892] [c000000007a838c0] [c000000000221334] rcu_core+0xa4/0x2a0
> [   12.210946] [c000000007a83910] [c000000000eadb40] __do_softirq+0x160/0x3f8
> [   12.211001] [c000000007a83a00] [c000000000162cf8] irq_exit+0xd8/0x140
> [   12.211066] [c000000007a83a30] [c0000000000272d8] timer_interrupt+0x128/0x2d0
> [   12.211143] [c000000007a83a90] [c000000000009aa0] decrementer_common_virt+0x190/0x1a0
> [   12.211222] --- interrupt: 900 at plpar_hcall_norets+0x1c/0x28
> [   12.211222]     LR = pseries_lpar_idle+0x98/0x190
> [   12.211307] [c000000007a83d90] [c000000007a4ed00] 0xc000000007a4ed00 (unreliable)
> [   12.211385] [c000000007a83e00] [c00000000001da40] arch_cpu_idle+0x50/0x180
> [   12.211441] [c000000007a83e30] [c000000000eacd84] default_idle_call+0x84/0x208
> [   12.211518] [c000000007a83e70] [c0000000001afd6c] cpuidle_idle_call+0x20c/0x2d0
> [   12.211584] [c000000007a83ec0] [c0000000001aff20] do_idle+0xf0/0x1d0
> [   12.211650] [c000000007a83f10] [c0000000001b024c] cpu_startup_entry+0x3c/0x40
> [   12.211717] [c000000007a83f40] [c00000000005f038] start_secondary+0x248/0x250
> [   12.211792] [c000000007a83f90] [c00000000000c454] start_secondary_prolog+0x10/0x14
> [   12.211879] NMI backtrace for cpu 4
> [   12.211932] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.9.0-1.g11733e1-default #1 openSUSE Tumbleweed (unreleased)
> [   12.212022] Call Trace:
> [   12.212055] [c000000007abb4f0] [c0000000008f5de0] dump_stack+0xc4/0x114 (unreliable)
> [   12.212121] [c000000007abb540] [c00000000090316c] nmi_cpu_backtrace+0xac/0x100
> [   12.212187] [c000000007abb5b0] [c00000000090336c] nmi_trigger_cpumask_backtrace+0x1ac/0x1f0
> [   12.212271] [c000000007abb650] [c000000000072e38] arch_trigger_cpumask_backtrace+0x28/0x40
> [   12.212338] [c000000007abb670] [c00000000022479c] rcu_dump_cpu_stacks+0x10c/0x168
> [   12.212416] [c000000007abb6c0] [c00000000021b4d0] print_cpu_stall+0x1d0/0x2a0
> [   12.212492] [c000000007abb770] [c00000000021d998] check_cpu_stall+0x148/0x340
> [   12.212557] [c000000007abb7a0] [c00000000021dbe0] rcu_pending+0x50/0x150
> [   12.212623] [c000000007abb7f0] [c0000000002233f4] rcu_sched_clock_irq+0x84/0x240
> [   12.212699] [c000000007abb830] [c000000000239b28] update_process_times+0x48/0xd0
> [   12.212765] [c000000007abb860] [c0000000002546bc] tick_sched_handle+0x3c/0xd0
> [   12.212842] [c000000007abb890] [c000000000254fb0] tick_sched_timer+0x90/0x140
> [   12.212919] [c000000007abb8d0] [c00000000023ac50] __run_hrtimer+0xb0/0x360
> [   12.212973] [c000000007abb920] [c00000000023afd4] __hrtimer_run_queues+0xd4/0x1a0
> [   12.213050] [c000000007abb980] [c00000000023c144] hrtimer_interrupt+0x124/0x300
> [   12.213128] [c000000007abba30] [c0000000000272b4] timer_interrupt+0x104/0x2d0
> [   12.213203] [c000000007abba90] [c000000000009aa0] decrementer_common_virt+0x190/0x1a0
> [   12.213269] --- interrupt: 900 at plpar_hcall_norets+0x1c/0x28
> [   12.213269]     LR = pseries_lpar_idle+0x98/0x190
> [   12.213361] [c000000007abbd90] [c000000007a44f00] 0xc000000007a44f00 (unreliable)
> [   12.213437] [c000000007abbe00] [c00000000001da40] arch_cpu_idle+0x50/0x180
> [   12.213502] [c000000007abbe30] [c000000000eacd84] default_idle_call+0x84/0x208
> [   12.213567] [c000000007abbe70] [c0000000001afd6c] cpuidle_idle_call+0x20c/0x2d0
> [   12.213645] [c000000007abbec0] [c0000000001aff20] do_idle+0xf0/0x1d0
> [   12.213707] [c000000007abbf10] [c0000000001b0248] cpu_startup_entry+0x38/0x40
> [   12.213775] [c000000007abbf40] [c00000000005f038] start_secondary+0x248/0x250
> [   12.213840] [c000000007abbf90] [c00000000000c454] start_secondary_prolog+0x10/0x14
> [   12.213917] Sending NMI from CPU 4 to CPUs 5:
> [   17.888326] CPU 5 didn't respond to backtrace IPI, inspecting paca.
> [   17.888947] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 0 (swapper/5)
> [   17.889011] Back trace of paca->saved_r1 (0xc000000007ac7720) (possibly stale):
> [   17.889108] Call Trace:
> [   17.889141] [c000000007ac7720] [c0000000001c1684] load_balance+0x1c4/0xaa0 (unreliable)
> [   17.889215] [c000000007ac7800] [c0000000001e9434] __pv_queued_spin_lock_slowpath+0x1b4/0x420
> [   17.889303] [c000000007ac7850] [c000000000ead414] _raw_spin_lock_irqsave+0xa4/0xc0
> [   17.889383] [c000000007ac7880] [c000000000220cc8] rcu_report_qs_rdp.constprop.0+0x38/0x170
> [   17.889463] [c000000007ac78c0] [c000000000221334] rcu_core+0xa4/0x2a0
> [   17.889519] [c000000007ac7910] [c000000000eadb40] __do_softirq+0x160/0x3f8
> [   17.889582] [c000000007ac7a00] [c000000000162cf8] irq_exit+0xd8/0x140
> [   17.889641] [c000000007ac7a30] [c0000000000272d8] timer_interrupt+0x128/0x2d0
> [   17.889712] [c000000007ac7a90] [c000000000009aa0] decrementer_common_virt+0x190/0x1a0
> [   17.889795] --- interrupt: 900 at plpar_hcall_norets+0x1c/0x28
> [   17.889795]     LR = pseries_lpar_idle+0x98/0x190
> [   17.889899] [c000000007ac7d90] [c000000007a5da00] 0xc000000007a5da00 (unreliable)
> [   17.889979] [c000000007ac7e00] [c00000000001da40] arch_cpu_idle+0x50/0x180
> [   17.890044] [c000000007ac7e30] [c000000000eacd84] default_idle_call+0x84/0x208
> [   17.890123] [c000000007ac7e70] [c0000000001afd6c] cpuidle_idle_call+0x20c/0x2d0
> [   17.890201] [c000000007ac7ec0] [c0000000001aff20] do_idle+0xf0/0x1d0
> [   17.890258] [c000000007ac7f10] [c0000000001b0248] cpu_startup_entry+0x38/0x40
> [   17.890337] [c000000007ac7f40] [c00000000005f038] start_secondary+0x248/0x250
> [   17.890402] [c000000007ac7f90] [c00000000000c454] start_secondary_prolog+0x10/0x14
> [   17.890492] Sending NMI from CPU 4 to CPUs 6:
> [   17.890573] NMI backtrace for cpu 6
> [   17.890574] CPU 6 didn't respond to backtrace IPI, inspecting paca.
> [   17.890576] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 0 (swapper/6)
> [   17.890666] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.9.0-1.g11733e1-default #1 openSUSE Tumbleweed (unreleased)
> [   17.890719] Back trace of paca->saved_r1 (0x0000000000000000) (possibly stale):
> [   17.890720] Call Trace:
> [   17.890791] NIP:  c0000000000fb260 LR: c0000000000fdf58 CTR: c00000000012e014
> [   17.890885] Sending NMI from CPU 4 to CPUs 7:
> [   17.890955] REGS: c000000007ae7b00 TRAP: 0500   Not tainted  (5.9.0-1.g11733e1-default)
> [   17.890956] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 48000024  XER: 00000000
> [   17.891234] CFAR: c0000000000276bc IRQMASK: 0 
> [   17.891234] GPR00: 0000000028000024 c000000007ae7d90 c000000001cd4a00 0000000000000000 
> [   17.891234] GPR04: c000000007ae7e00 0000000000000000 c000000001d03a00 000000000018ed08 
> [   17.891234] GPR08: 000000007e360000 c00000003fff7400 0000000000000001 0000000000000006 
> [   17.891234] GPR12: c0000000000fdec0 c00000003fff7800 0000000000000000 0000000000000000 
> [   17.891234] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [   17.891234] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 
> [   17.891234] GPR24: 0000000000000006 0000000000000000 0000000000000000 c000000001d17640 
> [   17.891234] GPR28: 0000000000000001 c000000007a40000 c0000000016a7190 c0000000016a7188 
> [   17.891807] NIP [c0000000000fb260] plpar_hcall_norets+0x1c/0x28
> [   17.891863] LR [c0000000000fdf58] pseries_lpar_idle+0x98/0x190
> [   17.891930] Call Trace:
> [   17.891954] [c000000007ae7d90] [c000000007a40000] 0xc000000007a40000 (unreliable)
> [   17.892034] [c000000007ae7e00] [c00000000001da40] arch_cpu_idle+0x50/0x180
> [   17.892091] [c000000007ae7e30] [c000000000eacd84] default_idle_call+0x84/0x208
> [   17.892172] [c000000007ae7e70] [c0000000001afd6c] cpuidle_idle_call+0x20c/0x2d0
> [   17.892240] [c000000007ae7ec0] [c0000000001aff20] do_idle+0xf0/0x1d0
> [   17.892307] [c000000007ae7f10] [c0000000001b0248] cpu_startup_entry+0x38/0x40
> [   17.892374] [c000000007ae7f40] [c00000000005f038] start_secondary+0x248/0x250
> [   17.892452] [c000000007ae7f90] [c00000000000c454] start_secondary_prolog+0x10/0x14
> [   17.892529] Instruction dump:
> [   17.892566] 7c0803a6 3884fff8 78630100 78840020 4bfffeb0 3c4c01be 384297bc 7c421378 
> [   17.892647] 7c000026 90010008 60000000 44000022 <80010008> 7c0ff120 4e800020 7c0802a6 
> [   24.118936] CPU 7 didn't respond to backtrace IPI, inspecting paca.
> [   24.120313] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 0 (swapper/7)
> [   24.120568] Back trace of paca->saved_r1 (0xc000000007aa3760) (possibly stale):
> [   24.120834] Call Trace:
> [   24.120927] [c000000007aa3800] [c0000000001e9574] __pv_queued_spin_lock_slowpath+0x2f4/0x420
> [   24.121220] [c000000007aa3850] [c000000000ead414] _raw_spin_lock_irqsave+0xa4/0xc0
> [   24.121473] [c000000007aa3880] [c000000000220cc8] rcu_report_qs_rdp.constprop.0+0x38/0x170
> [   24.121724] [c000000007aa38c0] [c000000000221334] rcu_core+0xa4/0x2a0
> [   24.121933] [c000000007aa3910] [c000000000eadb40] __do_softirq+0x160/0x3f8
> [   24.122144] [c000000007aa3a00] [c000000000162cf8] irq_exit+0xd8/0x140
> [   24.122354] [c000000007aa3a30] [c0000000000272d8] timer_interrupt+0x128/0x2d0
> [   24.122606] [c000000007aa3a90] [c000000000009aa0] decrementer_common_virt+0x190/0x1a0
> [   24.122858] --- interrupt: 900 at plpar_hcall_norets+0x1c/0x28
> [   24.122858]     LR = pseries_lpar_idle+0x98/0x190
> [   24.123191] [c000000007aa3d90] [c000000007a71600] 0xc000000007a71600 (unreliable)
> [   24.123443] [c000000007aa3e00] [c00000000001da40] arch_cpu_idle+0x50/0x180
> [   24.123654] [c000000007aa3e30] [c000000000eacd84] default_idle_call+0x84/0x208
> [   24.123908] [c000000007aa3e70] [c0000000001afd6c] cpuidle_idle_call+0x20c/0x2d0
> [   24.124159] [c000000007aa3ec0] [c0000000001aff20] do_idle+0xf0/0x1d0
> [   24.124367] [c000000007aa3f10] [c0000000001b0248] cpu_startup_entry+0x38/0x40
> [   24.124619] [c000000007aa3f40] [c00000000005f038] start_secondary+0x248/0x250
> [   24.124869] [c000000007aa3f90] [c00000000000c454] start_secondary_prolog+0x10/0x14
> 
> 
> qemu and overall the problem is the same - the boot stops early.
> 

^ permalink raw reply

* [Bug 209733] Starting new KVM virtual machines on PPC64 starts to hang after box is up for a while
From: bugzilla-daemon @ 2020-10-18 23:13 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <bug-209733-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=209733

Cameron (cam@neo-zeon.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|PPC-64                      |kvm
            Version|2.5                         |unspecified
            Product|Platform Specific/Hardware  |Virtualization

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply

* [Bug 209733] New: Starting new KVM virtual machines on PPC64 starts to hang after box is up for a while
From: bugzilla-daemon @ 2020-10-18 23:09 UTC (permalink / raw)
  To: linuxppc-dev

https://bugzilla.kernel.org/show_bug.cgi?id=209733

            Bug ID: 209733
           Summary: Starting new KVM virtual machines on PPC64 starts to
                    hang after box is up for a while
           Product: Platform Specific/Hardware
           Version: 2.5
    Kernel Version: >=5.8
          Hardware: PPC-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: PPC-64
          Assignee: platform_ppc-64@kernel-bugs.osdl.org
          Reporter: cam@neo-zeon.de
        Regression: No

Issue occurs with 5.8.14, 5.8.16, and 5.9.1.  Does NOT occur with 5.7.x. I
suspect it occurs with all of 5.8, but I haven't confirmed this yet.

After the box has been up for a "while", starting new VM's fails. Completely
shutting down existing VM's and then starting them back up will also fail in
the same way.

What is a while? Could be 2 days, might be 9. I'll update as the pattern
becomes more clear.

libvirt is generally used, but when running kvm manually with strace, kvm
always gets stuck here:
ioctl(11, KVM_PPC_ALLOCATE_HTAB, 0x7fffea0bade4

Maybe the kernel is trying to find the memory needed to allocate the Hashed
Page Table but is unable to do so? Maybe there's a memory leak?

Before this issue starts occurring, I have confirmed I am able to run the exact
same kvm command manually:
sudo -u libvirt-qemu qemu-system-ppc64 -enable-kvm -m 8192 -nographic -vga none
-drive file=/var/lib/libvirt/images/test.qcow2,format=qcow2 -mem-prealloc -smp
4

Nothing in dmesg, nothing useful in the logs.

This box's configuration:
Debian 10 stable
2x 18 core POWER9 (144 threads)
512g physical memory
Raptor Talos II motherboard
radix MMU disabled

Unfortunately, I cannot test the affected box with the Radix MMU enabled
because I have some important VM's that won't run unless it is disabled.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply

* [PATCH AUTOSEL 4.4 24/33] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Sasha Levin @ 2020-10-18 19:27 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Tyrel Datwyler, Martin K . Petersen, linux-scsi,
	Jing Xiangfeng, linuxppc-dev
In-Reply-To: <20201018192728.4056577-1-sashal@kernel.org>

From: Jing Xiangfeng <jingxiangfeng@huawei.com>

[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ]

Fix to return error code PTR_ERR() from the error handling case instead of
0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 0526a47e30a3f..db80ab8335dfb 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4790,6 +4790,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	if (IS_ERR(vhost->work_thread)) {
 		dev_err(dev, "Couldn't create kernel thread: %ld\n",
 			PTR_ERR(vhost->work_thread));
+		rc = PTR_ERR(vhost->work_thread);
 		goto free_host_mem;
 	}
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH AUTOSEL 4.9 32/41] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Sasha Levin @ 2020-10-18 19:26 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Tyrel Datwyler, Martin K . Petersen, linux-scsi,
	Jing Xiangfeng, linuxppc-dev
In-Reply-To: <20201018192635.4056198-1-sashal@kernel.org>

From: Jing Xiangfeng <jingxiangfeng@huawei.com>

[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ]

Fix to return error code PTR_ERR() from the error handling case instead of
0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 54dea767dfde9..04b3ac17531db 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4804,6 +4804,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	if (IS_ERR(vhost->work_thread)) {
 		dev_err(dev, "Couldn't create kernel thread: %ld\n",
 			PTR_ERR(vhost->work_thread));
+		rc = PTR_ERR(vhost->work_thread);
 		goto free_host_mem;
 	}
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH AUTOSEL 4.14 39/52] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Sasha Levin @ 2020-10-18 19:25 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Tyrel Datwyler, Martin K . Petersen, linux-scsi,
	Jing Xiangfeng, linuxppc-dev
In-Reply-To: <20201018192530.4055730-1-sashal@kernel.org>

From: Jing Xiangfeng <jingxiangfeng@huawei.com>

[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ]

Fix to return error code PTR_ERR() from the error handling case instead of
0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 34612add3829f..dbacd9830d3df 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4797,6 +4797,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	if (IS_ERR(vhost->work_thread)) {
 		dev_err(dev, "Couldn't create kernel thread: %ld\n",
 			PTR_ERR(vhost->work_thread));
+		rc = PTR_ERR(vhost->work_thread);
 		goto free_host_mem;
 	}
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH AUTOSEL 4.19 42/56] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Sasha Levin @ 2020-10-18 19:24 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Tyrel Datwyler, Martin K . Petersen, linux-scsi,
	Jing Xiangfeng, linuxppc-dev
In-Reply-To: <20201018192417.4055228-1-sashal@kernel.org>

From: Jing Xiangfeng <jingxiangfeng@huawei.com>

[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ]

Fix to return error code PTR_ERR() from the error handling case instead of
0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 71d53bb239e25..090ab377f65e5 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4795,6 +4795,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	if (IS_ERR(vhost->work_thread)) {
 		dev_err(dev, "Couldn't create kernel thread: %ld\n",
 			PTR_ERR(vhost->work_thread));
+		rc = PTR_ERR(vhost->work_thread);
 		goto free_host_mem;
 	}
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.4 55/80] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Sasha Levin @ 2020-10-18 19:22 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Tyrel Datwyler, Martin K . Petersen, linux-scsi,
	Jing Xiangfeng, linuxppc-dev
In-Reply-To: <20201018192231.4054535-1-sashal@kernel.org>

From: Jing Xiangfeng <jingxiangfeng@huawei.com>

[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ]

Fix to return error code PTR_ERR() from the error handling case instead of
0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index df897df5cafee..8a76284b59b08 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4788,6 +4788,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	if (IS_ERR(vhost->work_thread)) {
 		dev_err(dev, "Couldn't create kernel thread: %ld\n",
 			PTR_ERR(vhost->work_thread));
+		rc = PTR_ERR(vhost->work_thread);
 		goto free_host_mem;
 	}
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.8 071/101] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Sasha Levin @ 2020-10-18 19:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Tyrel Datwyler, Martin K . Petersen, linux-scsi,
	Jing Xiangfeng, linuxppc-dev
In-Reply-To: <20201018192026.4053674-1-sashal@kernel.org>

From: Jing Xiangfeng <jingxiangfeng@huawei.com>

[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ]

Fix to return error code PTR_ERR() from the error handling case instead of
0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 635f6f9cffc40..ef91f3d01f989 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4928,6 +4928,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	if (IS_ERR(vhost->work_thread)) {
 		dev_err(dev, "Couldn't create kernel thread: %ld\n",
 			PTR_ERR(vhost->work_thread));
+		rc = PTR_ERR(vhost->work_thread);
 		goto free_host_mem;
 	}
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.9 076/111] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Sasha Levin @ 2020-10-18 19:17 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, Tyrel Datwyler, Martin K . Petersen, linux-scsi,
	Jing Xiangfeng, linuxppc-dev
In-Reply-To: <20201018191807.4052726-1-sashal@kernel.org>

From: Jing Xiangfeng <jingxiangfeng@huawei.com>

[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ]

Fix to return error code PTR_ERR() from the error handling case instead of
0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index ea7c8930592dc..70daa0605082d 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4928,6 +4928,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	if (IS_ERR(vhost->work_thread)) {
 		dev_err(dev, "Couldn't create kernel thread: %ld\n",
 			PTR_ERR(vhost->work_thread));
+		rc = PTR_ERR(vhost->work_thread);
 		goto free_host_mem;
 	}
 
-- 
2.25.1


^ permalink raw reply related

* [PATCH v2 2/2] powerpc/44x: Don't support 47x code and non 47x code at the same time
From: Christophe Leroy @ 2020-10-18 17:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <822833ce3dc10634339818f7d1ab616edf63b0c6.1603041883.git.christophe.leroy@csgroup.eu>

440/460 variants and 470 variants are not compatible, no
need to make code supporting both and using MMU features.

Just use CONFIG_PPC_47x to decide what to build.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v2: Move outside #ifdef CONFIG_PPC_47x a label "1:" used by 44x
---
 arch/powerpc/kernel/entry_32.S   | 11 +++--------
 arch/powerpc/mm/nohash/tlb_low.S | 29 +++++++----------------------
 2 files changed, 10 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 8cdc8bcde703..a425360deabb 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -439,15 +439,13 @@ syscall_exit_cont:
 	andis.	r10,r0,DBCR0_IDM@h
 	bnel-	load_dbcr0
 #endif
-#ifdef CONFIG_44x
-BEGIN_MMU_FTR_SECTION
+#ifdef CONFIG_PPC_47x
 	lis	r4,icache_44x_need_flush@ha
 	lwz	r5,icache_44x_need_flush@l(r4)
 	cmplwi	cr0,r5,0
 	bne-	2f
+#endif /* CONFIG_PPC_47x */
 1:
-END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_47x)
-#endif /* CONFIG_44x */
 BEGIN_FTR_SECTION
 	lwarx	r7,0,r1
 END_FTR_SECTION_IFSET(CPU_FTR_NEED_PAIRED_STWCX)
@@ -948,10 +946,7 @@ restore_kuap:
 
 	/* interrupts are hard-disabled at this point */
 restore:
-#ifdef CONFIG_44x
-BEGIN_MMU_FTR_SECTION
-	b	1f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
+#if defined(CONFIG_44x) && !defined(CONFIG_PPC_47x)
 	lis	r4,icache_44x_need_flush@ha
 	lwz	r5,icache_44x_need_flush@l(r4)
 	cmplwi	cr0,r5,0
diff --git a/arch/powerpc/mm/nohash/tlb_low.S b/arch/powerpc/mm/nohash/tlb_low.S
index eaeee402f96e..68797e072f55 100644
--- a/arch/powerpc/mm/nohash/tlb_low.S
+++ b/arch/powerpc/mm/nohash/tlb_low.S
@@ -92,36 +92,25 @@ _GLOBAL(__tlbil_va)
 	tlbsx.	r6,0,r3
 	bne	10f
 	sync
-BEGIN_MMU_FTR_SECTION
-	b	2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
+#ifndef CONFIG_PPC_47x
 	/* On 440 There are only 64 TLB entries, so r3 < 64, which means bit
 	 * 22, is clear.  Since 22 is the V bit in the TLB_PAGEID, loading this
 	 * value will invalidate the TLB entry.
 	 */
 	tlbwe	r6,r6,PPC44x_TLB_PAGEID
-	isync
-10:	wrtee	r10
-	blr
-2:
-#ifdef CONFIG_PPC_47x
+#else
 	oris	r7,r6,0x8000	/* specify way explicitly */
 	clrrwi	r4,r3,12	/* get an EPN for the hashing with V = 0 */
 	ori	r4,r4,PPC47x_TLBE_SIZE
 	tlbwe   r4,r7,0		/* write it */
+#endif /* !CONFIG_PPC_47x */
 	isync
-	wrtee	r10
+10:	wrtee	r10
 	blr
-#else /* CONFIG_PPC_47x */
-1:	trap
-	EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,0;
-#endif /* !CONFIG_PPC_47x */
 
 _GLOBAL(_tlbil_all)
 _GLOBAL(_tlbil_pid)
-BEGIN_MMU_FTR_SECTION
-	b	2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
+#ifndef CONFIG_PPC_47x
 	li	r3,0
 	sync
 
@@ -136,8 +125,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
 
 	isync
 	blr
-2:
-#ifdef CONFIG_PPC_47x
+#else
 	/* 476 variant. There's not simple way to do this, hopefully we'll
 	 * try to limit the amount of such full invalidates
 	 */
@@ -179,11 +167,8 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
 	b	1b		/* Then loop */
 1:	isync			/* Sync shadows */
 	wrtee	r11
-#else /* CONFIG_PPC_47x */
-1:	trap
-	EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,0;
-#endif /* !CONFIG_PPC_47x */
 	blr
+#endif /* !CONFIG_PPC_47x */
 
 #ifdef CONFIG_PPC_47x
 
-- 
2.25.0


^ permalink raw reply related

* [PATCH v2 1/2] powerpc/44x: Don't support 440 when CONFIG_PPC_47x is set
From: Christophe Leroy @ 2020-10-18 17:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

As stated in platform/44x/Kconfig, CONFIG_PPC_47x is not
compatible with 440 and 460 variants.

This is confirmed in asm/cache.h as L1_CACHE_SHIFT is different
for 47x, meaning a kernel built for 47x will not run correctly
on a 440.

In cputable, opt out all 440 and 460 variants when CONFIG_PPC_47x
is set. Also add a default match dedicated to 470.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/cputable.h |  9 +++++----
 arch/powerpc/include/asm/mmu.h      |  7 +++----
 arch/powerpc/kernel/cputable.c      | 29 +++++++++++++++++++++--------
 3 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index d88bcb79f16d..4a0ddf66bd4a 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -523,11 +523,10 @@ enum {
 #ifdef CONFIG_40x
 	    CPU_FTRS_40X |
 #endif
-#ifdef CONFIG_44x
-	    CPU_FTRS_44X | CPU_FTRS_440x6 |
-#endif
 #ifdef CONFIG_PPC_47x
 	    CPU_FTRS_47X | CPU_FTR_476_DD2 |
+#elif defined(CONFIG_44x)
+	    CPU_FTRS_44X | CPU_FTRS_440x6 |
 #endif
 #ifdef CONFIG_E200
 	    CPU_FTRS_E200 |
@@ -596,7 +595,9 @@ enum {
 #ifdef CONFIG_40x
 	    CPU_FTRS_40X &
 #endif
-#ifdef CONFIG_44x
+#ifdef CONFIG_PPC_47x
+	    CPU_FTRS_47X &
+#elif defined(CONFIG_44x)
 	    CPU_FTRS_44X & CPU_FTRS_440x6 &
 #endif
 #ifdef CONFIG_E200
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index a1769c0426f2..bf5d3b5291f1 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -162,15 +162,14 @@ enum {
 #ifdef CONFIG_40x
 		MMU_FTR_TYPE_40x |
 #endif
-#ifdef CONFIG_44x
+#ifdef CONFIG_PPC_47x
+		MMU_FTR_TYPE_47x | MMU_FTR_USE_TLBIVAX_BCAST | MMU_FTR_LOCK_BCAST_INVAL |
+#elif defined(CONFIG_44x)
 		MMU_FTR_TYPE_44x |
 #endif
 #if defined(CONFIG_E200) || defined(CONFIG_E500)
 		MMU_FTR_TYPE_FSL_E | MMU_FTR_BIG_PHYS | MMU_FTR_USE_TLBILX |
 #endif
-#ifdef CONFIG_PPC_47x
-		MMU_FTR_TYPE_47x | MMU_FTR_USE_TLBIVAX_BCAST | MMU_FTR_LOCK_BCAST_INVAL |
-#endif
 #ifdef CONFIG_PPC_BOOK3S_32
 		MMU_FTR_USE_HIGH_BATS |
 #endif
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 492c0b36aff6..cf80e6c8ed5e 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -1533,6 +1533,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
 
 #endif /* CONFIG_40x */
 #ifdef CONFIG_44x
+#ifndef CONFIG_PPC_47x
 	{
 		.pvr_mask		= 0xf0000fff,
 		.pvr_value		= 0x40000850,
@@ -1815,7 +1816,19 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.machine_check		= machine_check_440A,
 		.platform		= "ppc440",
 	},
-#ifdef CONFIG_PPC_47x
+	{	/* default match */
+		.pvr_mask		= 0x00000000,
+		.pvr_value		= 0x00000000,
+		.cpu_name		= "(generic 44x PPC)",
+		.cpu_features		= CPU_FTRS_44X,
+		.cpu_user_features	= COMMON_USER_BOOKE,
+		.mmu_features		= MMU_FTR_TYPE_44x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc440",
+	}
+#else /* CONFIG_PPC_47x */
 	{ /* 476 DD2 core */
 		.pvr_mask		= 0xffffffff,
 		.pvr_value		= 0x11a52080,
@@ -1872,19 +1885,19 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.machine_check		= machine_check_47x,
 		.platform		= "ppc470",
 	},
-#endif /* CONFIG_PPC_47x */
 	{	/* default match */
 		.pvr_mask		= 0x00000000,
 		.pvr_value		= 0x00000000,
-		.cpu_name		= "(generic 44x PPC)",
-		.cpu_features		= CPU_FTRS_44X,
+		.cpu_name		= "(generic 47x PPC)",
+		.cpu_features		= CPU_FTRS_47X,
 		.cpu_user_features	= COMMON_USER_BOOKE,
-		.mmu_features		= MMU_FTR_TYPE_44x,
+		.mmu_features		= MMU_FTR_TYPE_47x,
 		.icache_bsize		= 32,
-		.dcache_bsize		= 32,
-		.machine_check		= machine_check_4xx,
-		.platform		= "ppc440",
+		.dcache_bsize		= 128,
+		.machine_check		= machine_check_47x,
+		.platform		= "ppc470",
 	}
+#endif /* CONFIG_PPC_47x */
 #endif /* CONFIG_44x */
 #ifdef CONFIG_E200
 	{	/* e200z5 */
-- 
2.25.0


^ permalink raw reply related

* Re: [PATCH v5 21/23] powerpc/book3s64/hash/kuep: Enable KUEP on hash
From: Sandipan Das @ 2020-10-18  4:03 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe
In-Reply-To: <20200827040931.297759-22-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/mm/book3s64/pkeys.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH v5 20/23] powerpc/book3s64/hash/kuap: Enable kuap on hash
From: Sandipan Das @ 2020-10-18  4:02 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe
In-Reply-To: <20200827040931.297759-21-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/mm/book3s64/pkeys.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH v5 19/23] powerpc/book3s64/kuep: Use Key 3 to implement KUEP with hash translation.
From: Sandipan Das @ 2020-10-18  4:01 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <20200827040931.297759-20-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> Radix use IAMR Key 0 and hash translation use IAMR key 3.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/kup.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH v5 18/23] powerpc/book3s64/kuap: Use Key 3 to implement KUAP with hash translation.
From: Sandipan Das @ 2020-10-18  4:00 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <20200827040931.297759-19-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> Radix use AMR Key 0 and hash translation use AMR key 3.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/kup.h | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH v5 16/23] powerpc/book3s64/kuap: Restrict access to userspace based on userspace AMR
From: Sandipan Das @ 2020-10-18  4:00 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <20200827040931.297759-17-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> If an application has configured address protection such that read/write is
> denied using pkey even the kernel should receive a FAULT on accessing the same.
> 
> This patch use user AMR value stored in pt_regs.kuap to achieve the same.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/kup.h | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH v5 15/23] powerpc/book3s64/pkeys: Don't update SPRN_AMR when in kernel mode.
From: Sandipan Das @ 2020-10-18  3:52 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <20200827040931.297759-16-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> Now that kernel correctly store/restore userspace AMR/IAMR values, avoid
> manipulating AMR and IAMR from the kernel on behalf of userspace.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/kup.h | 18 ++++++++
>  arch/powerpc/include/asm/processor.h     |  4 --
>  arch/powerpc/kernel/process.c            |  4 --
>  arch/powerpc/kernel/traps.c              |  6 ---
>  arch/powerpc/mm/book3s64/pkeys.c         | 57 +++++-------------------
>  5 files changed, 28 insertions(+), 61 deletions(-)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH v5 12/23] powerpc/book3s64/pkeys: Inherit correctly on fork.
From: Sandipan Das @ 2020-10-18  3:50 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <20200827040931.297759-13-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> Child thread.kuap value is inherited from the parent in copy_thread_tls. We still
> need to make sure when the child returns from a fork in the kernel we start with the kernel
> default AMR value.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/kernel/process.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH v5 13/23] powerpc/book3s64/pkeys: Reset userspace AMR correctly on exec
From: Sandipan Das @ 2020-10-18  3:50 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <20200827040931.297759-14-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> On fork, we inherit from the parent and on exec, we should switch to default_amr values.
> 
> Also, avoid changing the AMR register value within the kernel. The kernel now runs with
> different AMR values.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pkeys.h |  2 ++
>  arch/powerpc/kernel/process.c              |  6 +++++-
>  arch/powerpc/mm/book3s64/pkeys.c           | 16 ++--------------
>  3 files changed, 9 insertions(+), 15 deletions(-)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply

* Re: [PATCH v5 11/23] powerpc/book3s64/pkeys: Store/restore userspace AMR/IAMR correctly on entry and exit from kernel
From: Sandipan Das @ 2020-10-18  3:49 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <20200827040931.297759-12-aneesh.kumar@linux.ibm.com>



On 27/08/20 9:39 am, Aneesh Kumar K.V wrote:
> This prepare kernel to operate with a different value than userspace AMR/IAMR.
> For this, AMR/IAMR need to be saved and restored on entry and return from the
> kernel.
> 
> With KUAP we modify kernel AMR when accessing user address from the kernel
> via copy_to/from_user interfaces. We don't need to modify IAMR value in
> similar fashion.
> 
> If MMU_FTR_PKEY is enabled we need to save AMR/IAMR in pt_regs on entering
> kernel from userspace. If not we can assume that AMR/IAMR is not modified
> from userspace.
> 
> We need to save AMR if we have MMU_FTR_KUAP feature enabled and we are
> interrupted within kernel. This is required so that if we get interrupted
> within copy_to/from_user we continue with the right AMR value.
> 
> If we hae MMU_FTR_KUEP enabled we need to restore IAMR on return to userspace
> beause kernel will be running with a different IAMR value.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/kup.h | 177 ++++++++++++++++++++---
>  arch/powerpc/include/asm/ptrace.h        |   4 +-
>  arch/powerpc/kernel/asm-offsets.c        |   2 +
>  arch/powerpc/kernel/entry_64.S           |   6 +-
>  arch/powerpc/kernel/exceptions-64s.S     |   4 +-
>  arch/powerpc/kernel/syscall_64.c         |  30 +++-
>  6 files changed, 192 insertions(+), 31 deletions(-)
> 

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox