* Re: [powerpc][5.13.0-rc7] Kernel warning (kernel/sched/fair.c:401) while running LTP tests
From: Vincent Guittot @ 2021-06-21 9:50 UTC
To: Odin Ugedal; +Cc: Sachin Sant, Peter Zijlstra, linuxppc-dev, open list
In-Reply-To: <CAFpoUr3g5t3Z0BtW4-jnYomc3cdY=V5=Zt94-C+fHOjGWa107w@mail.gmail.com>
On Mon, 21 Jun 2021 at 11:39, Odin Ugedal <odin@uged.al> wrote:
>
> On Mon, 21 Jun 2021 at 08:33, Sachin Sant <sachinp@linux.vnet.ibm.com> wrote:
> >
> > While running LTP tests (cfs_bandwidth01) against 5.13.0-rc7 kernel on a powerpc box
> > following warning is seen
> >
> > [ 6611.331827] ------------[ cut here ]------------
> > [ 6611.331855] rq->tmp_alone_branch != &rq->leaf_cfs_rq_list
> > [ 6611.331862] WARNING: CPU: 8 PID: 0 at kernel/sched/fair.c:401 unthrottle_cfs_rq+0x4cc/0x590
> > [ 6611.331883] Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache netfs tun brd overlay vfat fat btrfs blake2b_generic xor zstd_compress raid6_pq xfs loop sctp ip6_udp_tunnel udp_tunnel libcrc32c dm_mod bonding rfkill sunrpc pseries_rng xts vmx_crypto sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse [last unloaded: init_module]
> > [ 6611.331957] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G OE 5.13.0-rc6-gcba5e97280f5 #1
> > [ 6611.331968] NIP: c0000000001b7aac LR: c0000000001b7aa8 CTR: c000000000722d30
> > [ 6611.331976] REGS: c00000000274f3a0 TRAP: 0700 Tainted: G OE (5.13.0-rc6-gcba5e97280f5)
> > [ 6611.331985] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48000224 XER: 00000005
> > [ 6611.332002] CFAR: c00000000014ca20 IRQMASK: 1
> > [ 6611.332002] GPR00: c0000000001b7aa8 c00000000274f640 c000000001abaf00 000000000000002d
> > [ 6611.332002] GPR04: 00000000ffff7fff c00000000274f300 0000000000000027 c000000efdb07e08
> > [ 6611.332002] GPR08: 0000000000000023 0000000000000001 0000000000000027 c000000001976680
> > [ 6611.332002] GPR12: 0000000000000000 c000000effc0be80 c000000ef07b3f90 000000001eefe200
> > [ 6611.332002] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 6611.332002] GPR20: 0000000000000001 c000000000fa6c08 c000000000fa6030 0000000000000001
> > [ 6611.332002] GPR24: 0000000000000000 0000000000000000 c000000efde12380 0000000000000001
> > [ 6611.332002] GPR28: 0000000000000001 0000000000000000 c000000efde12400 0000000000000000
> > [ 6611.332094] NIP [c0000000001b7aac] unthrottle_cfs_rq+0x4cc/0x590
> > [ 6611.332104] LR [c0000000001b7aa8] unthrottle_cfs_rq+0x4c8/0x590
> > [ 6611.332113] Call Trace:
> > [ 6611.332116] [c00000000274f640] [c0000000001b7aa8] unthrottle_cfs_rq+0x4c8/0x590 (unreliable)
> > [ 6611.332128] [c00000000274f6e0] [c0000000001b7e38] distribute_cfs_runtime+0x1d8/0x280
> > [ 6611.332139] [c00000000274f7b0] [c0000000001b81d0] sched_cfs_period_timer+0x140/0x330
> > [ 6611.332149] [c00000000274f870] [c00000000022a03c] __hrtimer_run_queues+0x17c/0x380
> > [ 6611.332158] [c00000000274f8f0] [c00000000022ac68] hrtimer_interrupt+0x128/0x2f0
> > [ 6611.332168] [c00000000274f9a0] [c00000000002940c] timer_interrupt+0x13c/0x370
> > [ 6611.332179] [c00000000274fa00] [c000000000009c04] decrementer_common_virt+0x1a4/0x1b0
> > [ 6611.332189] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x24
> > [ 6611.332199] NIP: c0000000000f6af8 LR: c000000000a05f68 CTR: 0000000000000000
> > [ 6611.332206] REGS: c00000000274fa70 TRAP: 0900 Tainted: G OE (5.13.0-rc6-gcba5e97280f5)
> > [ 6611.332214] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28000224 XER: 00000000
> > [ 6611.332234] CFAR: 0000000000000c00 IRQMASK: 0
> > [ 6611.332234] GPR00: 0000000000000000 c00000000274fd10 c000000001abaf00 0000000000000000
> > [ 6611.332234] GPR04: 00000000000000c0 0000000000000080 0001a91c68b80fa1 00000000000003dc
> > [ 6611.332234] GPR08: 000000000001f400 0000000000000001 0000000000000000 0000000000000000
> > [ 6611.332234] GPR12: 0000000000000000 c000000effc0be80 c000000ef07b3f90 000000001eefe200
> > [ 6611.332234] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 6611.332234] GPR20: 0000000000000001 0000000000000002 0000000000000010 c0000000019fe2f8
> > [ 6611.332234] GPR24: 0000000000000001 00000603517d757e 0000000000000000 0000000000000000
> > [ 6611.332234] GPR28: 0000000000000001 0000000000000000 c000000001231f90 c000000001231f98
> > [ 6611.332323] NIP [c0000000000f6af8] plpar_hcall_norets_notrace+0x18/0x24
> > [ 6611.332332] LR [c000000000a05f68] check_and_cede_processor+0x48/0x60
> > [ 6611.332340] --- interrupt: 900
> > [ 6611.332345] [c00000000274fd10] [c000000efdb92380] 0xc000000efdb92380 (unreliable)
> > [ 6611.332355] [c00000000274fd70] [c000000000a063bc] dedicated_cede_loop+0x9c/0x1b0
> > [ 6611.332364] [c00000000274fdc0] [c000000000a02b04] cpuidle_enter_state+0x2e4/0x4e0
> > [ 6611.332375] [c00000000274fe20] [c000000000a02da0] cpuidle_enter+0x50/0x70
> > [ 6611.332385] [c00000000274fe60] [c0000000001a883c] call_cpuidle+0x4c/0x80
> > [ 6611.332393] [c00000000274fe80] [c0000000001a8ee0] do_idle+0x380/0x3e0
> > [ 6611.332402] [c00000000274ff00] [c0000000001a91bc] cpu_startup_entry+0x3c/0x40
> > [ 6611.332411] [c00000000274ff30] [c000000000063ff8] start_secondary+0x298/0x2b0
> > [ 6611.332421] [c00000000274ff90] [c00000000000c754] start_secondary_prolog+0x10/0x14
> > [ 6611.332430] Instruction dump:
> > [ 6611.332435] 4bfffc44 3d22fff6 8929f328 2f890000 409efea4 39200001 3d42fff6 3c62ff4f
> > [ 6611.332451] 3863bcd8 992af328 4bf94f15 60000000 <0fe00000> 4bfffe80 7f6407b4 7f43d378
> > [ 6611.332466] ---[ end trace 1346f865cd1cae91 ]---
> >
> > 5.13.0-rc6 was good. Bisect points to following patch
> >
> > commit a7b359fc6a37
> > sched/fair: Correctly insert cfs_rq's to list on unthrottle
> >
> > The test runs to completion(without this warning) if the patch is reverted.
> >
> > Thanks
> > -Sachin
> >
>
> Hi,
>
> Thanks for the report! I have a theory about what is possibly causing
> this, so I will try to reproduce it and see if my assumptions are
> correct.
This means that a child's load was not null and the child was inserted,
whereas the parent's load was null. This should not happen unless the
propagation failed somewhere.
>
>
> Odin
* Re: arch/powerpc/kvm/book3s_hv_nested.c:264:6: error: stack frame size of 2304 bytes in function 'kvmhv_enter_nested_guest'
From: Michael Ellerman @ 2021-06-21 9:46 UTC
To: Nathan Chancellor, Nicholas Piggin, Arnd Bergmann,
kernel test robot
Cc: kbuild-all, Kees Cook, clang-built-linux, linux-kernel, kvm-ppc,
Linux Memory Management List, Andrew Morton, linuxppc-dev
In-Reply-To: <e6167885-30e5-d149-bcde-3e9ad9f5d381@kernel.org>
Nathan Chancellor <nathan@kernel.org> writes:
> On 6/20/2021 4:59 PM, Nicholas Piggin wrote:
>> Excerpts from kernel test robot's message of April 3, 2021 8:47 pm:
>>> tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>> head: d93a0d43e3d0ba9e19387be4dae4a8d5b175a8d7
>>> commit: 97e4910232fa1f81e806aa60c25a0450276d99a2 linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*
>>> date: 3 weeks ago
>>> config: powerpc64-randconfig-r006-20210403 (attached as .config)
>>> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 0fe8af94688aa03c01913c2001d6a1a911f42ce6)
>>> reproduce (this is a W=1 build):
>>> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>> chmod +x ~/bin/make.cross
>>> # install powerpc64 cross compiling tool for clang build
>>> # apt-get install binutils-powerpc64-linux-gnu
>>> # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97e4910232fa1f81e806aa60c25a0450276d99a2
>>> git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>> git fetch --no-tags linus master
>>> git checkout 97e4910232fa1f81e806aa60c25a0450276d99a2
>>> # save the attached .config to linux build tree
>>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>>>
>>> If you fix the issue, kindly add following tag as appropriate
>>> Reported-by: kernel test robot <lkp@intel.com>
>>>
>>> All errors (new ones prefixed by >>):
>>>
>>>>> arch/powerpc/kvm/book3s_hv_nested.c:264:6: error: stack frame size of 2304 bytes in function 'kvmhv_enter_nested_guest' [-Werror,-Wframe-larger-than=]
>>> long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
>>> ^
>>> 1 error generated.
>>>
>>>
>>> vim +/kvmhv_enter_nested_guest +264 arch/powerpc/kvm/book3s_hv_nested.c
>>
>> Not much changed here recently. It's not that big a concern because it's
>> only called in the KVM ioctl path, not in any deep IO paths or anything,
>> and doesn't recurse. Might be a bit of inlining or stack spilling put it
>> over the edge.
>
> It appears to be the fact that LLVM's PowerPC backend does not emit
> efficient byteswap assembly:
>
> https://github.com/ClangBuiltLinux/linux/issues/1292
>
> https://bugs.llvm.org/show_bug.cgi?id=49610
>
>> powerpc does make it an error though, would be good to avoid that so the
>> robot doesn't keep tripping over.
>
> Marking byteswap_pt_regs as 'noinline_for_stack' drastically reduces the
> stack usage. If that is an acceptable solution, I can send it along
> tomorrow.
Yeah that should be OK. Can you post the before/after disassembly when
you post the patch?
It should just be two extra function calls, which shouldn't be enough
overhead to be measurable.
cheers
* Re: [PATCH v3] lockdown,selinux: fix wrong subject in some SELinux lockdown checks
From: Steffen Klassert @ 2021-06-21 8:35 UTC
To: Ondrej Mosnacek
Cc: linux-efi, linux-pci, linux-cxl, Herbert Xu, x86, James Morris,
linux-acpi, Ingo Molnar, linux-serial, linux-pm, selinux,
Steven Rostedt, Casey Schaufler, Paul Moore, netdev,
Stephen Smalley, kexec, linux-kernel, linux-security-module,
linux-fsdevel, bpf, linuxppc-dev, David S . Miller
In-Reply-To: <20210616085118.1141101-1-omosnace@redhat.com>
On Wed, Jun 16, 2021 at 10:51:18AM +0200, Ondrej Mosnacek wrote:
> Commit 59438b46471a ("security,lockdown,selinux: implement SELinux
> lockdown") added an implementation of the locked_down LSM hook to
> SELinux, with the aim to restrict which domains are allowed to perform
> operations that would breach lockdown.
>
> However, in several places the security_locked_down() hook is called in
> situations where the current task isn't doing any action that would
> directly breach lockdown, leading to SELinux checks that are basically
> bogus.
>
> To fix this, add an explicit struct cred pointer argument to
> security_lockdown() and define NULL as a special value to pass instead
> of current_cred() in such situations. LSMs that take the subject
> credentials into account can then fall back to some default or ignore
> such calls altogether. In the SELinux lockdown hook implementation, use
> SECINITSID_KERNEL in case the cred argument is NULL.
>
> Most of the callers are updated to pass current_cred() as the cred
> pointer, thus maintaining the same behavior. The following callers are
> modified to pass NULL as the cred pointer instead:
> 1. arch/powerpc/xmon/xmon.c
> Seems to be some interactive debugging facility. It appears that
> the lockdown hook is called from interrupt context here, so it
> should be more appropriate to request a global lockdown decision.
> 2. fs/tracefs/inode.c:tracefs_create_file()
> Here the call is used to prevent creating new tracefs entries when
> the kernel is locked down. Assumes that locking down is one-way -
> i.e. if the hook returns non-zero once, it will never return zero
> again, thus no point in creating these files. Also, the hook is
> often called by a module's init function when it is loaded by
> userspace, where it doesn't make much sense to do a check against
> the current task's creds, since the task itself doesn't actually
> use the tracing functionality (i.e. doesn't breach lockdown), just
> indirectly makes some new tracepoints available to whoever is
> authorized to use them.
> 3. net/xfrm/xfrm_user.c:copy_to_user_*()
> Here a cryptographic secret is redacted based on the value returned
> from the hook. There are two possible actions that may lead here:
> a) A netlink message XFRM_MSG_GETSA with NLM_F_DUMP set - here the
> task context is relevant, since the dumped data is sent back to
> the current task.
> b) When adding/deleting/updating an SA via XFRM_MSG_xxxSA, the
> dumped SA is broadcasted to tasks subscribed to XFRM events -
> here the current task context is not relevant as it doesn't
> represent the tasks that could potentially see the secret.
> It doesn't seem worth it to try to keep using the current task's
> context in the a) case, since the eventual data leak can be
> circumvented anyway via b), plus there is no way for the task to
> indicate that it doesn't care about the actual key value, so the
> check could generate a lot of "false alert" denials with SELinux.
> Thus, let's pass NULL instead of current_cred() here faute de
> mieux.
>
> Improvements-suggested-by: Casey Schaufler <casey@schaufler-ca.com>
> Improvements-suggested-by: Paul Moore <paul@paul-moore.com>
> Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown")
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
For the xfrm part:
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
* [PATCH v8 6/6] KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM
From: Bharata B Rao @ 2021-06-21 8:50 UTC
To: kvm-ppc, linuxppc-dev
Cc: farosas, aneesh.kumar, npiggin, Bharata B Rao, david
In-Reply-To: <20210621085003.904767-1-bharata@linux.ibm.com>
In the nested KVM case, replace H_TLB_INVALIDATE with the new hcall
H_RPT_INVALIDATE if available. The availability of this hcall
is determined from the "hcall-rpt-invalidate" string in the
ibm,hypertas-functions DT property.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
arch/powerpc/kvm/book3s_64_mmu_radix.c | 27 +++++++++++++++++++++-----
arch/powerpc/kvm/book3s_hv_nested.c | 12 ++++++++++--
2 files changed, 32 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index d909c069363e..b5905ae4377c 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -21,6 +21,7 @@
#include <asm/pte-walk.h>
#include <asm/ultravisor.h>
#include <asm/kvm_book3s_uvmem.h>
+#include <asm/plpar_wrappers.h>
/*
* Supported radix tree geometry.
@@ -318,9 +319,19 @@ void kvmppc_radix_tlbie_page(struct kvm *kvm, unsigned long addr,
}
psi = shift_to_mmu_psize(pshift);
- rb = addr | (mmu_get_ap(psi) << PPC_BITLSHIFT(58));
- rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(0, 0, 1),
- lpid, rb);
+
+ if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE)) {
+ rb = addr | (mmu_get_ap(psi) << PPC_BITLSHIFT(58));
+ rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(0, 0, 1),
+ lpid, rb);
+ } else {
+ rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+ H_RPTI_TYPE_NESTED |
+ H_RPTI_TYPE_TLB,
+ psize_to_rpti_pgsize(psi),
+ addr, addr + psize);
+ }
+
if (rc)
pr_err("KVM: TLB page invalidation hcall failed, rc=%ld\n", rc);
}
@@ -334,8 +345,14 @@ static void kvmppc_radix_flush_pwc(struct kvm *kvm, unsigned int lpid)
return;
}
- rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(1, 0, 1),
- lpid, TLBIEL_INVAL_SET_LPID);
+ if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE))
+ rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(1, 0, 1),
+ lpid, TLBIEL_INVAL_SET_LPID);
+ else
+ rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+ H_RPTI_TYPE_NESTED |
+ H_RPTI_TYPE_PWC, H_RPTI_PAGE_ALL,
+ 0, -1UL);
if (rc)
pr_err("KVM: TLB PWC invalidation hcall failed, rc=%ld\n", rc);
}
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 056d3df68de1..d78efb5f5bb3 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -19,6 +19,7 @@
#include <asm/pgalloc.h>
#include <asm/pte-walk.h>
#include <asm/reg.h>
+#include <asm/plpar_wrappers.h>
static struct patb_entry *pseries_partition_tb;
@@ -467,8 +468,15 @@ static void kvmhv_flush_lpid(unsigned int lpid)
return;
}
- rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(2, 0, 1),
- lpid, TLBIEL_INVAL_SET_LPID);
+ if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE))
+ rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(2, 0, 1),
+ lpid, TLBIEL_INVAL_SET_LPID);
+ else
+ rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+ H_RPTI_TYPE_NESTED |
+ H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC |
+ H_RPTI_TYPE_PAT,
+ H_RPTI_PAGE_ALL, 0, -1UL);
if (rc)
pr_err("KVM: TLB LPID invalidation hcall failed, rc=%ld\n", rc);
}
--
2.31.1
* [PATCH v8 5/6] KVM: PPC: Book3S HV: Add KVM_CAP_PPC_RPT_INVALIDATE capability
From: Bharata B Rao @ 2021-06-21 8:50 UTC
To: kvm-ppc, linuxppc-dev
Cc: farosas, aneesh.kumar, npiggin, Bharata B Rao, david
In-Reply-To: <20210621085003.904767-1-bharata@linux.ibm.com>
Now that we have H_RPT_INVALIDATE fully implemented, enable
support for it via the KVM_CAP_PPC_RPT_INVALIDATE KVM capability.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
Documentation/virt/kvm/api.rst | 18 ++++++++++++++++++
arch/powerpc/kvm/powerpc.c | 3 +++
include/uapi/linux/kvm.h | 1 +
3 files changed, 22 insertions(+)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 7fcb2fd38f42..9977e845633f 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6362,6 +6362,24 @@ default.
See Documentation/x86/sgx/2.Kernel-internals.rst for more details.
+7.26 KVM_CAP_PPC_RPT_INVALIDATE
+-------------------------------
+
+:Capability: KVM_CAP_PPC_RPT_INVALIDATE
+:Architectures: ppc
+:Type: vm
+
+This capability indicates that the kernel is capable of handling
+H_RPT_INVALIDATE hcall.
+
+In order to enable the use of H_RPT_INVALIDATE in the guest,
+user space might have to advertise it for the guest. For example,
+IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is
+present in the "ibm,hypertas-functions" device-tree property.
+
+This capability is enabled for hypervisors on platforms like POWER9
+that support radix MMU.
+
8. Other capabilities.
======================
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index a2a68a958fa0..be33b5321a76 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -682,6 +682,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = !!(hv_enabled && kvmppc_hv_ops->enable_dawr1 &&
!kvmppc_hv_ops->enable_dawr1(NULL));
break;
+ case KVM_CAP_PPC_RPT_INVALIDATE:
+ r = 1;
+ break;
#endif
default:
r = 0;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 79d9c44d1ad7..9016e96de971 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1083,6 +1083,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_SGX_ATTRIBUTE 196
#define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
#define KVM_CAP_PTP_KVM 198
+#define KVM_CAP_PPC_RPT_INVALIDATE 199
#ifdef KVM_CAP_IRQ_ROUTING
--
2.31.1
* [PATCH v8 4/6] KVM: PPC: Book3S HV: Nested support in H_RPT_INVALIDATE
From: Bharata B Rao @ 2021-06-21 8:50 UTC
To: kvm-ppc, linuxppc-dev
Cc: farosas, aneesh.kumar, npiggin, Bharata B Rao, david
In-Reply-To: <20210621085003.904767-1-bharata@linux.ibm.com>
Enable support for process-scoped invalidations from nested
guests and partition-scoped invalidations for nested guests.
Process-scoped invalidations for any level of nested guests
are handled by implementing H_RPT_INVALIDATE handler in the
nested guest exit path in L0.
Partition-scoped invalidation requests are forwarded to the
right nested guest, handled there and passed down to L0
for eventual handling.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[Nested guest partition-scoped invalidation changes]
---
.../include/asm/book3s/64/tlbflush-radix.h | 4 +
arch/powerpc/include/asm/kvm_book3s.h | 3 +
arch/powerpc/kvm/book3s_hv.c | 59 ++++++++-
arch/powerpc/kvm/book3s_hv_nested.c | 117 ++++++++++++++++++
arch/powerpc/mm/book3s64/radix_tlb.c | 4 -
5 files changed, 180 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 8b33601cdb9d..a46fd37ad552 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -4,6 +4,10 @@
#include <asm/hvcall.h>
+#define RIC_FLUSH_TLB 0
+#define RIC_FLUSH_PWC 1
+#define RIC_FLUSH_ALL 2
+
struct vm_area_struct;
struct mm_struct;
struct mmu_gather;
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index e6b53c6e21e3..caaa0f592d8e 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -307,6 +307,9 @@ void kvmhv_set_ptbl_entry(unsigned int lpid, u64 dw0, u64 dw1);
void kvmhv_release_all_nested(struct kvm *kvm);
long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu);
long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu);
+long do_h_rpt_invalidate_pat(struct kvm_vcpu *vcpu, unsigned long lpid,
+ unsigned long type, unsigned long pg_sizes,
+ unsigned long start, unsigned long end);
int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu,
u64 time_limit, unsigned long lpcr);
void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 7e6da4687d88..3d5b8ba3786d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -925,6 +925,34 @@ static int kvmppc_get_yield_count(struct kvm_vcpu *vcpu)
return yield_count;
}
+/*
+ * H_RPT_INVALIDATE hcall handler for nested guests.
+ *
+ * Handles only nested process-scoped invalidation requests in L0.
+ */
+static int kvmppc_nested_h_rpt_invalidate(struct kvm_vcpu *vcpu)
+{
+ unsigned long type = kvmppc_get_gpr(vcpu, 6);
+ unsigned long pid, pg_sizes, start, end;
+
+ /*
+ * The partition-scoped invalidations aren't handled here in L0.
+ */
+ if (type & H_RPTI_TYPE_NESTED)
+ return RESUME_HOST;
+
+ pid = kvmppc_get_gpr(vcpu, 4);
+ pg_sizes = kvmppc_get_gpr(vcpu, 7);
+ start = kvmppc_get_gpr(vcpu, 8);
+ end = kvmppc_get_gpr(vcpu, 9);
+
+ do_h_rpt_invalidate_prt(pid, vcpu->arch.nested->shadow_lpid,
+ type, pg_sizes, start, end);
+
+ kvmppc_set_gpr(vcpu, 3, H_SUCCESS);
+ return RESUME_GUEST;
+}
+
static long kvmppc_h_rpt_invalidate(struct kvm_vcpu *vcpu,
unsigned long id, unsigned long target,
unsigned long type, unsigned long pg_sizes,
@@ -938,10 +966,18 @@ static long kvmppc_h_rpt_invalidate(struct kvm_vcpu *vcpu,
/*
* Partition-scoped invalidation for nested guests.
- * Not yet supported
*/
- if (type & H_RPTI_TYPE_NESTED)
- return H_P3;
+ if (type & H_RPTI_TYPE_NESTED) {
+ if (!nesting_enabled(vcpu->kvm))
+ return H_FUNCTION;
+
+ /* Support only cores as target */
+ if (target != H_RPTI_TARGET_CMMU)
+ return H_P2;
+
+ return do_h_rpt_invalidate_pat(vcpu, id, type, pg_sizes,
+ start, end);
+ }
/*
* Process-scoped invalidation for L1 guests.
@@ -1629,6 +1665,23 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
if (!xics_on_xive())
kvmppc_xics_rm_complete(vcpu, 0);
break;
+ case BOOK3S_INTERRUPT_SYSCALL:
+ {
+ unsigned long req = kvmppc_get_gpr(vcpu, 3);
+
+ /*
+ * The H_RPT_INVALIDATE hcalls issued by nested
+ * guests for process-scoped invalidations when
+ * GTSE=0, are handled here in L0.
+ */
+ if (req == H_RPT_INVALIDATE) {
+ r = kvmppc_nested_h_rpt_invalidate(vcpu);
+ break;
+ }
+
+ r = RESUME_HOST;
+ break;
+ }
default:
r = RESUME_HOST;
break;
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 60724f674421..056d3df68de1 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -1214,6 +1214,123 @@ long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu)
return H_SUCCESS;
}
+static long do_tlb_invalidate_nested_tlb(struct kvm_vcpu *vcpu,
+ unsigned long lpid,
+ unsigned long page_size,
+ unsigned long ap,
+ unsigned long start,
+ unsigned long end)
+{
+ unsigned long addr = start;
+ int ret;
+
+ do {
+ ret = kvmhv_emulate_tlbie_tlb_addr(vcpu, lpid, ap,
+ get_epn(addr));
+ if (ret)
+ return ret;
+ addr += page_size;
+ } while (addr < end);
+
+ return ret;
+}
+
+static long do_tlb_invalidate_nested_all(struct kvm_vcpu *vcpu,
+ unsigned long lpid, unsigned long ric)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_nested_guest *gp;
+
+ gp = kvmhv_get_nested(kvm, lpid, false);
+ if (gp) {
+ kvmhv_emulate_tlbie_lpid(vcpu, gp, ric);
+ kvmhv_put_nested(gp);
+ }
+ return H_SUCCESS;
+}
+
+/*
+ * Number of pages above which we invalidate the entire LPID rather than
+ * flush individual pages.
+ */
+static unsigned long tlb_range_flush_page_ceiling __read_mostly = 33;
+
+/*
+ * Performs partition-scoped invalidations for nested guests
+ * as part of H_RPT_INVALIDATE hcall.
+ */
+long do_h_rpt_invalidate_pat(struct kvm_vcpu *vcpu, unsigned long lpid,
+ unsigned long type, unsigned long pg_sizes,
+ unsigned long start, unsigned long end)
+{
+ struct kvm_nested_guest *gp;
+ long ret;
+ unsigned long psize, ap;
+
+ /*
+ * If L2 lpid isn't valid, we need to return H_PARAMETER.
+ *
+ * However, nested KVM issues a L2 lpid flush call when creating
+ * partition table entries for L2. This happens even before the
+ * corresponding shadow lpid is created in HV which happens in
+ * H_ENTER_NESTED call. Since we can't differentiate this case from
+ * the invalid case, we ignore such flush requests and return success.
+ */
+ gp = kvmhv_find_nested(vcpu->kvm, lpid);
+ if (!gp)
+ return H_SUCCESS;
+
+ /*
+ * A flush all request can be handled by a full lpid flush only.
+ */
+ if ((type & H_RPTI_TYPE_NESTED_ALL) == H_RPTI_TYPE_NESTED_ALL)
+ return do_tlb_invalidate_nested_all(vcpu, lpid, RIC_FLUSH_ALL);
+
+ /*
+ * We don't need to handle a PWC flush like process table here,
+ * because intermediate partition scoped table in nested guest doesn't
+ * really have PWC. Only level we have PWC is in L0 and for nested
+ * invalidate at L0 we always do kvm_flush_lpid() which does
+ * radix__flush_all_lpid(). For range invalidate at any level, we
+ * are not removing the higher level page tables and hence there is
+ * no PWC invalidate needed.
+ *
+ * if (type & H_RPTI_TYPE_PWC) {
+ * ret = do_tlb_invalidate_nested_all(vcpu, lpid, RIC_FLUSH_PWC);
+ * if (ret)
+ * return H_P4;
+ * }
+ */
+
+ if (start == 0 && end == -1)
+ return do_tlb_invalidate_nested_all(vcpu, lpid, RIC_FLUSH_TLB);
+
+ if (type & H_RPTI_TYPE_TLB) {
+ struct mmu_psize_def *def;
+ bool flush_lpid;
+ unsigned long nr_pages;
+
+ for (psize = 0; psize < MMU_PAGE_COUNT; psize++) {
+ def = &mmu_psize_defs[psize];
+ if (!(pg_sizes & def->h_rpt_pgsize))
+ continue;
+
+ nr_pages = (end - start) >> def->shift;
+ flush_lpid = nr_pages > tlb_range_flush_page_ceiling;
+ if (flush_lpid)
+ return do_tlb_invalidate_nested_all(vcpu, lpid,
+ RIC_FLUSH_TLB);
+
+ ret = do_tlb_invalidate_nested_tlb(vcpu, lpid,
+ (1UL << def->shift),
+ ap, start, end);
+ if (ret)
+ return H_P4;
+ }
+ }
+ return H_SUCCESS;
+}
+
/* Used to convert a nested guest real address to a L1 guest real address */
static int kvmhv_translate_addr_nested(struct kvm_vcpu *vcpu,
struct kvm_nested_guest *gp,
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index cdd98b9e7b15..4f38cf34ea40 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -20,10 +20,6 @@
#include "internal.h"
-#define RIC_FLUSH_TLB 0
-#define RIC_FLUSH_PWC 1
-#define RIC_FLUSH_ALL 2
-
/*
* tlbiel instruction for radix, set invalidation
* i.e., r=1 and is=01 or is=10 or is=11
--
2.31.1
* [PATCH v8 0/6] Support for H_RPT_INVALIDATE in PowerPC KVM
From: Bharata B Rao @ 2021-06-21 8:49 UTC
To: kvm-ppc, linuxppc-dev
Cc: farosas, aneesh.kumar, npiggin, Bharata B Rao, david
This patchset adds support for the new hcall H_RPT_INVALIDATE
and replaces the nested tlb flush calls with this new hcall
if support for the same exists.
Changes in v8:
-------------
- Used tlb_single_page_flush_ceiling in the process-scoped range
flush routine to switch to full PID invalidation if
the number of pages is above the threshold
- Moved iterating over page sizes into the actual routine that
handles the eventual flushing thereby limiting the page size
iteration only to range based flushing
- Converted the #if 0 section into a comment section to keep
checkpatch from complaining.
- Used a threshold in the partition-scoped range flushing
to switch to full LPID invalidation
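
The threshold decision described in the changelog above can be sketched in
plain C as follows. This is only an illustration, assuming the ceiling of 33
pages that patch 4/6 assigns to tlb_range_flush_page_ceiling; the helper name
should_flush_all() and the RANGE_FLUSH_PAGE_CEILING macro are hypothetical and
do not appear in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed ceiling: patch 4/6 sets tlb_range_flush_page_ceiling to 33. */
#define RANGE_FLUSH_PAGE_CEILING 33ULL

/*
 * Hypothetical helper: return true when a range flush over [start, end)
 * with pages of size (1 << page_shift) covers more pages than the ceiling,
 * so that a full invalidation is preferred over per-page flushes.
 */
static bool should_flush_all(uint64_t start, uint64_t end,
			     unsigned int page_shift)
{
	uint64_t nr_pages;

	assert(end > start && page_shift < 64);

	nr_pages = (end - start) >> page_shift;
	return nr_pages > RANGE_FLUSH_PAGE_CEILING;
}
```

With 4K pages (shift 12), a range of 33 pages stays below the ceiling and is
flushed page by page, while a range of 34 pages switches to the full flush.
The same 34 x 4K range counted in 64K pages is only 2 pages, so it stays a
range flush.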
v7: https://lore.kernel.org/linuxppc-dev/20210505154642.178702-1-bharata@linux.ibm.com/
Aneesh Kumar K.V (1):
KVM: PPC: Book3S HV: Fix comments of H_RPT_INVALIDATE arguments
Bharata B Rao (5):
powerpc/book3s64/radix: Add H_RPT_INVALIDATE pgsize encodings to
mmu_psize_def
KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE
KVM: PPC: Book3S HV: Nested support in H_RPT_INVALIDATE
KVM: PPC: Book3S HV: Add KVM_CAP_PPC_RPT_INVALIDATE capability
KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM
Documentation/virt/kvm/api.rst | 18 ++
arch/powerpc/include/asm/book3s/64/mmu.h | 1 +
.../include/asm/book3s/64/tlbflush-radix.h | 4 +
arch/powerpc/include/asm/hvcall.h | 4 +-
arch/powerpc/include/asm/kvm_book3s.h | 3 +
arch/powerpc/include/asm/mmu_context.h | 9 +
arch/powerpc/kvm/book3s_64_mmu_radix.c | 27 ++-
arch/powerpc/kvm/book3s_hv.c | 89 +++++++++
arch/powerpc/kvm/book3s_hv_nested.c | 129 ++++++++++++-
arch/powerpc/kvm/powerpc.c | 3 +
arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +
arch/powerpc/mm/book3s64/radix_tlb.c | 176 +++++++++++++++++-
include/uapi/linux/kvm.h | 1 +
13 files changed, 456 insertions(+), 13 deletions(-)
--
2.31.1
* [PATCH v8 3/6] KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE
From: Bharata B Rao @ 2021-06-21 8:50 UTC
To: kvm-ppc, linuxppc-dev
Cc: farosas, aneesh.kumar, npiggin, Bharata B Rao, david
In-Reply-To: <20210621085003.904767-1-bharata@linux.ibm.com>
H_RPT_INVALIDATE does two types of TLB invalidations:
1. Process-scoped invalidations for guests when LPCR[GTSE]=0.
This is currently not used in KVM as GTSE is not usually
disabled in KVM.
2. Partition-scoped invalidations that an L1 hypervisor does on
behalf of an L2 guest. This is currently handled
by the H_TLB_INVALIDATE hcall, which this new hcall replaces.
This commit enables process-scoped invalidations for L1 guests.
Support for process-scoped and partition-scoped invalidations
from/for nested guests will be added separately.
Process-scoped tlbie invalidations from L1 and nested guests
need the RS register for the TLBIE instruction to contain both PID and
LPID. This patch introduces primitives that execute the tlbie
instruction with both PID and LPID set in preparation for the
H_RPT_INVALIDATE hcall.
A description of H_RPT_INVALIDATE follows:
int64 /* H_Success: Return code on successful completion */
/* H_Busy - repeat the call with the same */
/* H_Parameter, H_P2, H_P3, H_P4, H_P5 : Invalid
parameters */
hcall(const uint64 H_RPT_INVALIDATE, /* Invalidate RPT
translation
lookaside information */
uint64 id, /* PID/LPID to invalidate */
uint64 target, /* Invalidation target */
uint64 type, /* Type of lookaside information */
uint64 pg_sizes, /* Page sizes */
uint64 start, /* Start of Effective Address (EA)
range (inclusive) */
uint64 end) /* End of EA range (exclusive) */
Invalidation targets (target)
-----------------------------
Core MMU 0x01 /* All virtual processors in the
partition */
Core local MMU 0x02 /* Current virtual processor */
Nest MMU 0x04 /* All nest/accelerator agents
in use by the partition */
A combination of the above can be specified,
except core and core local.
Type of translation to invalidate (type)
---------------------------------------
NESTED 0x0001 /* invalidate nested guest partition-scope */
TLB 0x0002 /* Invalidate TLB */
PWC 0x0004 /* Invalidate Page Walk Cache */
PRT 0x0008 /* Invalidate caching of Process Table
Entries if NESTED is clear */
PAT 0x0008 /* Invalidate caching of Partition Table
Entries if NESTED is set */
A combination of the above can be specified.
Page size mask (pages)
----------------------
4K 0x01
64K 0x02
2M 0x04
1G 0x08
All sizes (-1UL)
A combination of the above can be specified.
All page sizes can be selected with -1.
Semantics: Invalidate radix tree lookaside information
matching the parameters given.
* Return H_P2, H_P3 or H_P4 if target, type, or pageSizes parameters
are different from the defined values.
* Return H_PARAMETER if NESTED is set and pid is not a valid nested
LPID allocated to this partition
* Return H_P5 if (start, end) doesn't form a valid range. Start and
end should be a valid Quadrant address and end > start.
* Return H_NotSupported if the partition is not running in radix
translation mode.
* May invalidate more translation information than requested.
* If start = 0 and end = -1, set the range to cover all valid
addresses. Else start and end should be aligned to 4kB (lower 12
bits clear).
* If NESTED is clear, then invalidate process scoped lookaside
information. Else pid specifies a nested LPID, and the invalidation
is performed on nested guest partition table and nested guest
partition scope real addresses.
* If pid = 0 and NESTED is clear, then valid addresses are quadrant 3
and quadrant 0 spaces. Else valid addresses are quadrant 0.
* Pages which are fully covered by the range are to be invalidated.
Those which are partially covered are considered outside
invalidation range, which allows a caller to optimally invalidate
ranges that may contain mixed page sizes.
* Return H_SUCCESS on success.
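As a rough sketch of the argument checks listed above (not the code in
this patch, which only checks the range order and the NESTED bit), the
rules could be expressed as follows. The RPT_* names and return codes
are hypothetical stand-ins for the real H_* values. Treating an empty
target or type mask as invalid, and returning the P5 code for a
misaligned range, are assumptions; the quadrant checks are omitted.

```c
/* Hypothetical return codes; the real H_* values live in hvcall.h. */
enum rpt_rc { RPT_SUCCESS, RPT_P2, RPT_P3, RPT_P4, RPT_P5 };

#define RPT_TARGET_CMMU		0x01UL	/* Core MMU */
#define RPT_TARGET_CMMU_LOCAL	0x02UL	/* Core local MMU */
#define RPT_TARGET_MASK		0x07UL	/* Core | Core local | Nest */
#define RPT_TYPE_MASK		0x0fUL	/* NESTED | TLB | PWC | PRT/PAT */
#define RPT_PGSIZE_MASK		0x0fUL	/* 4K | 64K | 2M | 1G */
#define RPT_PGSIZE_ALL		(~0UL)	/* All sizes (-1UL) */

enum rpt_rc rpt_check_args(unsigned long target, unsigned long type,
			   unsigned long pg_sizes,
			   unsigned long start, unsigned long end)
{
	/* target: defined bits only, never core and core local together */
	if (target == 0 || (target & ~RPT_TARGET_MASK))
		return RPT_P2;
	if ((target & RPT_TARGET_CMMU) && (target & RPT_TARGET_CMMU_LOCAL))
		return RPT_P2;

	/* type: defined bits only */
	if (type == 0 || (type & ~RPT_TYPE_MASK))
		return RPT_P3;

	/* pg_sizes: either "all sizes" or a subset of the defined bits */
	if (pg_sizes != RPT_PGSIZE_ALL &&
	    (pg_sizes == 0 || (pg_sizes & ~RPT_PGSIZE_MASK)))
		return RPT_P4;

	/* start = 0, end = -1 selects every valid address */
	if (start == 0 && end == ~0UL)
		return RPT_SUCCESS;

	/* otherwise require end > start and 4kB alignment of both */
	if (end <= start || ((start | end) & 0xfffUL))
		return RPT_P5;

	return RPT_SUCCESS;
}
```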
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
arch/powerpc/include/asm/mmu_context.h | 9 ++
arch/powerpc/kvm/book3s_hv.c | 36 ++++++
arch/powerpc/mm/book3s64/radix_tlb.c | 172 +++++++++++++++++++++++++
3 files changed, 217 insertions(+)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 4bc45d3ed8b0..b44f291fc909 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -124,8 +124,17 @@ static inline bool need_extra_context(struct mm_struct *mm, unsigned long ea)
#if defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE) && defined(CONFIG_PPC_RADIX_MMU)
extern void radix_kvm_prefetch_workaround(struct mm_struct *mm);
+void do_h_rpt_invalidate_prt(unsigned long pid, unsigned long lpid,
+ unsigned long type, unsigned long pg_sizes,
+ unsigned long start, unsigned long end);
#else
static inline void radix_kvm_prefetch_workaround(struct mm_struct *mm) { }
+static inline void do_h_rpt_invalidate_prt(unsigned long pid,
+ unsigned long lpid,
+ unsigned long type,
+ unsigned long pg_sizes,
+ unsigned long start,
+ unsigned long end) { }
#endif
extern void switch_cop(struct mm_struct *next);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index bc0813644666..7e6da4687d88 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -76,6 +76,7 @@
#include <asm/kvm_book3s_uvmem.h>
#include <asm/ultravisor.h>
#include <asm/dtl.h>
+#include <asm/plpar_wrappers.h>
#include "book3s.h"
@@ -924,6 +925,32 @@ static int kvmppc_get_yield_count(struct kvm_vcpu *vcpu)
return yield_count;
}
+static long kvmppc_h_rpt_invalidate(struct kvm_vcpu *vcpu,
+ unsigned long id, unsigned long target,
+ unsigned long type, unsigned long pg_sizes,
+ unsigned long start, unsigned long end)
+{
+ if (!kvm_is_radix(vcpu->kvm))
+ return H_UNSUPPORTED;
+
+ if (end < start)
+ return H_P5;
+
+ /*
+ * Partition-scoped invalidation for nested guests.
+ * Not yet supported
+ */
+ if (type & H_RPTI_TYPE_NESTED)
+ return H_P3;
+
+ /*
+ * Process-scoped invalidation for L1 guests.
+ */
+ do_h_rpt_invalidate_prt(id, vcpu->kvm->arch.lpid,
+ type, pg_sizes, start, end);
+ return H_SUCCESS;
+}
+
int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
{
unsigned long req = kvmppc_get_gpr(vcpu, 3);
@@ -1132,6 +1159,14 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
*/
ret = kvmppc_h_svm_init_abort(vcpu->kvm);
break;
+ case H_RPT_INVALIDATE:
+ ret = kvmppc_h_rpt_invalidate(vcpu, kvmppc_get_gpr(vcpu, 4),
+ kvmppc_get_gpr(vcpu, 5),
+ kvmppc_get_gpr(vcpu, 6),
+ kvmppc_get_gpr(vcpu, 7),
+ kvmppc_get_gpr(vcpu, 8),
+ kvmppc_get_gpr(vcpu, 9));
+ break;
default:
return RESUME_HOST;
@@ -1178,6 +1213,7 @@ static int kvmppc_hcall_impl_hv(unsigned long cmd)
case H_XIRR_X:
#endif
case H_PAGE_INIT:
+ case H_RPT_INVALIDATE:
return 1;
}
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 409e61210789..cdd98b9e7b15 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -130,6 +130,21 @@ static __always_inline void __tlbie_pid(unsigned long pid, unsigned long ric)
trace_tlbie(0, 0, rb, rs, ric, prs, r);
}
+static __always_inline void __tlbie_pid_lpid(unsigned long pid,
+ unsigned long lpid,
+ unsigned long ric)
+{
+ unsigned long rb, rs, prs, r;
+
+ rb = PPC_BIT(53); /* IS = 1 */
+ rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~(PPC_BITMASK(0, 31)));
+ prs = 1; /* process scoped */
+ r = 1; /* radix format */
+
+ asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
+ : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
+ trace_tlbie(0, 0, rb, rs, ric, prs, r);
+}
static __always_inline void __tlbie_lpid(unsigned long lpid, unsigned long ric)
{
unsigned long rb,rs,prs,r;
@@ -190,6 +205,23 @@ static __always_inline void __tlbie_va(unsigned long va, unsigned long pid,
trace_tlbie(0, 0, rb, rs, ric, prs, r);
}
+static __always_inline void __tlbie_va_lpid(unsigned long va, unsigned long pid,
+ unsigned long lpid,
+ unsigned long ap, unsigned long ric)
+{
+ unsigned long rb, rs, prs, r;
+
+ rb = va & ~(PPC_BITMASK(52, 63));
+ rb |= ap << PPC_BITLSHIFT(58);
+ rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~(PPC_BITMASK(0, 31)));
+ prs = 1; /* process scoped */
+ r = 1; /* radix format */
+
+ asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
+ : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
+ trace_tlbie(0, 0, rb, rs, ric, prs, r);
+}
+
static __always_inline void __tlbie_lpid_va(unsigned long va, unsigned long lpid,
unsigned long ap, unsigned long ric)
{
@@ -235,6 +267,22 @@ static inline void fixup_tlbie_va_range(unsigned long va, unsigned long pid,
}
}
+static inline void fixup_tlbie_va_range_lpid(unsigned long va,
+ unsigned long pid,
+ unsigned long lpid,
+ unsigned long ap)
+{
+ if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
+ asm volatile("ptesync" : : : "memory");
+ __tlbie_pid_lpid(0, lpid, RIC_FLUSH_TLB);
+ }
+
+ if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
+ asm volatile("ptesync" : : : "memory");
+ __tlbie_va_lpid(va, pid, lpid, ap, RIC_FLUSH_TLB);
+ }
+}
+
static inline void fixup_tlbie_pid(unsigned long pid)
{
/*
@@ -254,6 +302,25 @@ static inline void fixup_tlbie_pid(unsigned long pid)
}
}
+static inline void fixup_tlbie_pid_lpid(unsigned long pid, unsigned long lpid)
+{
+ /*
+ * We can use any address for the invalidation, pick one which is
+ * probably unused as an optimisation.
+ */
+ unsigned long va = ((1UL << 52) - 1);
+
+ if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
+ asm volatile("ptesync" : : : "memory");
+ __tlbie_pid_lpid(0, lpid, RIC_FLUSH_TLB);
+ }
+
+ if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
+ asm volatile("ptesync" : : : "memory");
+ __tlbie_va_lpid(va, pid, lpid, mmu_get_ap(MMU_PAGE_64K),
+ RIC_FLUSH_TLB);
+ }
+}
static inline void fixup_tlbie_lpid_va(unsigned long va, unsigned long lpid,
unsigned long ap)
@@ -344,6 +411,31 @@ static inline void _tlbie_pid(unsigned long pid, unsigned long ric)
asm volatile("eieio; tlbsync; ptesync": : :"memory");
}
+static inline void _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
+ unsigned long ric)
+{
+ asm volatile("ptesync" : : : "memory");
+
+ /*
+ * Work around the fact that the "ric" argument to __tlbie_pid_lpid
+ * must be a compile-time constant to match the "i" constraint
+ * in the asm statement.
+ */
+ switch (ric) {
+ case RIC_FLUSH_TLB:
+ __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
+ fixup_tlbie_pid_lpid(pid, lpid);
+ break;
+ case RIC_FLUSH_PWC:
+ __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
+ break;
+ case RIC_FLUSH_ALL:
+ default:
+ __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_ALL);
+ fixup_tlbie_pid_lpid(pid, lpid);
+ }
+ asm volatile("eieio; tlbsync; ptesync" : : : "memory");
+}
struct tlbiel_pid {
unsigned long pid;
unsigned long ric;
@@ -469,6 +561,20 @@ static inline void __tlbie_va_range(unsigned long start, unsigned long end,
fixup_tlbie_va_range(addr - page_size, pid, ap);
}
+static inline void __tlbie_va_range_lpid(unsigned long start, unsigned long end,
+ unsigned long pid, unsigned long lpid,
+ unsigned long page_size,
+ unsigned long psize)
+{
+ unsigned long addr;
+ unsigned long ap = mmu_get_ap(psize);
+
+ for (addr = start; addr < end; addr += page_size)
+ __tlbie_va_lpid(addr, pid, lpid, ap, RIC_FLUSH_TLB);
+
+ fixup_tlbie_va_range_lpid(addr - page_size, pid, lpid, ap);
+}
+
static __always_inline void _tlbie_va(unsigned long va, unsigned long pid,
unsigned long psize, unsigned long ric)
{
@@ -549,6 +655,18 @@ static inline void _tlbie_va_range(unsigned long start, unsigned long end,
asm volatile("eieio; tlbsync; ptesync": : :"memory");
}
+static inline void _tlbie_va_range_lpid(unsigned long start, unsigned long end,
+ unsigned long pid, unsigned long lpid,
+ unsigned long page_size,
+ unsigned long psize, bool also_pwc)
+{
+ asm volatile("ptesync" : : : "memory");
+ if (also_pwc)
+ __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
+ __tlbie_va_range_lpid(start, end, pid, lpid, page_size, psize);
+ asm volatile("eieio; tlbsync; ptesync" : : : "memory");
+}
+
static inline void _tlbiel_va_range_multicast(struct mm_struct *mm,
unsigned long start, unsigned long end,
unsigned long pid, unsigned long page_size,
@@ -1381,4 +1499,58 @@ extern void radix_kvm_prefetch_workaround(struct mm_struct *mm)
}
}
EXPORT_SYMBOL_GPL(radix_kvm_prefetch_workaround);
+
+/*
+ * Performs process-scoped invalidations for a given LPID
+ * as part of H_RPT_INVALIDATE hcall.
+ */
+void do_h_rpt_invalidate_prt(unsigned long pid, unsigned long lpid,
+ unsigned long type, unsigned long pg_sizes,
+ unsigned long start, unsigned long end)
+{
+ unsigned long psize, nr_pages;
+ struct mmu_psize_def *def;
+ bool flush_pid;
+
+ /*
+ * A H_RPTI_TYPE_ALL request implies RIC=3, hence
+ * do a single IS=1 based flush.
+ */
+ if ((type & H_RPTI_TYPE_ALL) == H_RPTI_TYPE_ALL) {
+ _tlbie_pid_lpid(pid, lpid, RIC_FLUSH_ALL);
+ return;
+ }
+
+ if (type & H_RPTI_TYPE_PWC)
+ _tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
+
+ /* Full PID flush */
+ if (start == 0 && end == -1)
+ return _tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
+
+ /* Do range invalidation for all the valid page sizes */
+ for (psize = 0; psize < MMU_PAGE_COUNT; psize++) {
+ def = &mmu_psize_defs[psize];
+ if (!(pg_sizes & def->h_rpt_pgsize))
+ continue;
+
+ nr_pages = (end - start) >> def->shift;
+ flush_pid = nr_pages > tlb_single_page_flush_ceiling;
+
+ /*
+ * If the number of pages spanning the range is above
+ * the ceiling, convert the request into a full PID flush.
+ * And since PID flush takes out all the page sizes, there
+ * is no need to consider remaining page sizes.
+ */
+ if (flush_pid) {
+ _tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
+ return;
+ }
+ _tlbie_va_range_lpid(start, end, pid, lpid,
+ (1UL << def->shift), psize, false);
+ }
+}
+EXPORT_SYMBOL_GPL(do_h_rpt_invalidate_prt);
+
#endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
--
2.31.1
* [PATCH v8 1/6] KVM: PPC: Book3S HV: Fix comments of H_RPT_INVALIDATE arguments
From: Bharata B Rao @ 2021-06-21 8:49 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev
Cc: farosas, aneesh.kumar, npiggin, Bharata B Rao, david
In-Reply-To: <20210621085003.904767-1-bharata@linux.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
The type values H_RPTI_TYPE_PRT and H_RPTI_TYPE_PAT indicate
invalidating the caching of process and partition scoped entries
respectively.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
arch/powerpc/include/asm/hvcall.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index e3b29eda8074..7e4b2cef40c2 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -413,9 +413,9 @@
#define H_RPTI_TYPE_NESTED 0x0001 /* Invalidate nested guest partition-scope */
#define H_RPTI_TYPE_TLB 0x0002 /* Invalidate TLB */
#define H_RPTI_TYPE_PWC 0x0004 /* Invalidate Page Walk Cache */
-/* Invalidate Process Table Entries if H_RPTI_TYPE_NESTED is clear */
+/* Invalidate caching of Process Table Entries if H_RPTI_TYPE_NESTED is clear */
#define H_RPTI_TYPE_PRT 0x0008
-/* Invalidate Partition Table Entries if H_RPTI_TYPE_NESTED is set */
+/* Invalidate caching of Partition Table Entries if H_RPTI_TYPE_NESTED is set */
#define H_RPTI_TYPE_PAT 0x0008
#define H_RPTI_TYPE_ALL (H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC | \
H_RPTI_TYPE_PRT)
--
2.31.1
* [PATCH v8 2/6] powerpc/book3s64/radix: Add H_RPT_INVALIDATE pgsize encodings to mmu_psize_def
From: Bharata B Rao @ 2021-06-21 8:49 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev
Cc: farosas, aneesh.kumar, npiggin, Bharata B Rao, david
In-Reply-To: <20210621085003.904767-1-bharata@linux.ibm.com>
Add a field to mmu_psize_def to store the page size encodings
of H_RPT_INVALIDATE hcall. Initialize this while scanning the radix
AP encodings. This will be used when invalidating with required
page size encoding in the hcall.
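A minimal userspace sketch of how such a per-page-size encoding might
be consulted is shown below. The psize_idx enum and psize_defs table
are hypothetical stand-ins for the kernel's MMU_PAGE_* indices and
mmu_psize_defs[]; the mask bits (4K 0x01, 64K 0x02, 2M 0x04, 1G 0x08)
are the ones listed in the H_RPT_INVALIDATE description.

```c
/* Hypothetical stand-ins for the kernel's MMU_PAGE_* indices. */
enum psize_idx { PSIZE_4K, PSIZE_64K, PSIZE_2M, PSIZE_1G, PSIZE_COUNT };

struct psize_def {
	unsigned int shift;		/* log2 of the page size */
	unsigned long h_rpt_pgsize;	/* bit in the hcall page size mask */
};

static const struct psize_def psize_defs[PSIZE_COUNT] = {
	[PSIZE_4K]  = { 12, 0x01 },
	[PSIZE_64K] = { 16, 0x02 },
	[PSIZE_2M]  = { 21, 0x04 },
	[PSIZE_1G]  = { 30, 0x08 },
};

/*
 * For every page size selected in pg_sizes, add up the number of
 * whole pages of that size spanned by (end - start).
 */
unsigned long count_selected_pages(unsigned long pg_sizes,
				   unsigned long start, unsigned long end)
{
	unsigned long total = 0;

	for (int i = 0; i < PSIZE_COUNT; i++) {
		/* Skip page sizes the caller did not ask for. */
		if (!(pg_sizes & psize_defs[i].h_rpt_pgsize))
			continue;
		total += (end - start) >> psize_defs[i].shift;
	}
	return total;
}
```

Storing the encoding next to the shift lets a loop over all page sizes
test the caller's mask without a separate lookup per iteration.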
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
arch/powerpc/include/asm/book3s/64/mmu.h | 1 +
arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +++++
2 files changed, 6 insertions(+)
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index eace8c3f7b0a..c02f42d1031e 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -19,6 +19,7 @@ struct mmu_psize_def {
int penc[MMU_PAGE_COUNT]; /* HPTE encoding */
unsigned int tlbiel; /* tlbiel supported for that page size */
unsigned long avpnm; /* bits to mask out in AVPN in the HPTE */
+ unsigned long h_rpt_pgsize; /* H_RPT_INVALIDATE page size encoding */
union {
unsigned long sllp; /* SLB L||LP (exact mask to use in slbmte) */
unsigned long ap; /* Ap encoding used by PowerISA 3.0 */
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 5fef8db3b463..637db10d841e 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -486,6 +486,7 @@ static int __init radix_dt_scan_page_sizes(unsigned long node,
def = &mmu_psize_defs[idx];
def->shift = shift;
def->ap = ap;
+ def->h_rpt_pgsize = psize_to_rpti_pgsize(idx);
}
/* needed ? */
@@ -560,9 +561,13 @@ void __init radix__early_init_devtree(void)
*/
mmu_psize_defs[MMU_PAGE_4K].shift = 12;
mmu_psize_defs[MMU_PAGE_4K].ap = 0x0;
+ mmu_psize_defs[MMU_PAGE_4K].h_rpt_pgsize =
+ psize_to_rpti_pgsize(MMU_PAGE_4K);
mmu_psize_defs[MMU_PAGE_64K].shift = 16;
mmu_psize_defs[MMU_PAGE_64K].ap = 0x5;
+ mmu_psize_defs[MMU_PAGE_64K].h_rpt_pgsize =
+ psize_to_rpti_pgsize(MMU_PAGE_64K);
}
/*
--
2.31.1
* [PATCH 1/2] powerpc/prom_init: Convert prom_strcpy() into prom_strscpy_pad()
From: Michael Ellerman @ 2021-06-21 6:49 UTC (permalink / raw)
To: linuxppc-dev
In a subsequent patch we'd like to have something like a strscpy_pad()
implementation usable in prom_init.c.
Currently we have a strcpy() implementation with only one caller, so
convert it into strscpy_pad() and update the caller.
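For readers who want to try the semantics outside the kernel, here is
a userspace sketch that mirrors the logic of the function added in the
diff below. The name strscpy_pad_sketch is hypothetical; it returns
the number of characters copied, or -E2BIG when the source had to be
truncated, and always leaves the destination nul terminated and padded.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>

/* Userspace sketch mirroring prom_strscpy_pad() from the patch. */
ssize_t strscpy_pad_sketch(char *dest, const char *src, size_t n)
{
	ssize_t rc;
	size_t i;

	if (n == 0 || n > INT_MAX)
		return -E2BIG;

	/* Copy up to n bytes, stopping at the source's nul. */
	for (i = 0; i < n && src[i] != '\0'; i++)
		dest[i] = src[i];

	rc = (ssize_t)i;

	/* All n bytes used: drop the last character to fit the nul. */
	if (i == n) {
		i--;
		rc = -E2BIG;
	}

	/* Terminate and zero-fill the rest of the buffer. */
	for (; i < n; i++)
		dest[i] = '\0';

	return rc;
}
```

With a 4-byte buffer and the source "phandle", the buffer ends up
holding "pha" plus the terminator and the call returns -E2BIG.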
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/kernel/prom_init.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 523b31685c4c..c18d55f8b951 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -242,13 +242,31 @@ static int __init prom_strcmp(const char *cs, const char *ct)
return 0;
}
-static char __init *prom_strcpy(char *dest, const char *src)
+static ssize_t __init prom_strscpy_pad(char *dest, const char *src, size_t n)
{
- char *tmp = dest;
+ ssize_t rc;
+ size_t i;
- while ((*dest++ = *src++) != '\0')
- /* nothing */;
- return tmp;
+ if (n == 0 || n > INT_MAX)
+ return -E2BIG;
+
+ // Copy up to n bytes
+ for (i = 0; i < n && src[i] != '\0'; i++)
+ dest[i] = src[i];
+
+ rc = i;
+
+ // If we copied all n then we have run out of space for the nul
+ if (rc == n) {
+ // Rewind by one character to ensure nul termination
+ i--;
+ rc = -E2BIG;
+ }
+
+ for (; i < n; i++)
+ dest[i] = '\0';
+
+ return rc;
}
static int __init prom_strncmp(const char *cs, const char *ct, size_t count)
@@ -2701,7 +2719,7 @@ static void __init flatten_device_tree(void)
/* Add "phandle" in there, we'll need it */
namep = make_room(&mem_start, &mem_end, 16, 1);
- prom_strcpy(namep, "phandle");
+ prom_strscpy_pad(namep, "phandle", sizeof("phandle"));
mem_start = (unsigned long)namep + prom_strlen(namep) + 1;
/* Build string array */
--
2.25.1
* [PATCH 2/2] powerpc/prom_init: Pass linux_banner to firmware via option vector 7
From: Michael Ellerman @ 2021-06-21 6:49 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <20210621064938.2021419-1-mpe@ellerman.id.au>
Pass the value of linux_banner to firmware via option vector 7.
Option vector 7 is described in "LoPAR" Linux on Power Architecture
Reference v2.9, in table B.7 on page 824:
An ASCII character formatted null terminated string that describes
the client operating system. The string shall be human readable and
may be displayed on the console.
The string can be up to 256 bytes total, including the nul terminator.
linux_banner contains lots of information, and should make it possible
to identify the exact kernel version that is running:
const char linux_banner[] =
"Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@"
LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n";
For example:
Linux version 4.15.0-144-generic (buildd@bos02-ppc64el-018) (gcc
version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #148-Ubuntu SMP Sat May 8
02:32:13 UTC 2021 (Ubuntu 4.15.0-144.148-generic 4.15.18)
It's also printed at boot to the console/dmesg, which should make it
possible to correlate what firmware receives with the console/dmesg on
the machine.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
NB. linux_banner is already allowed by prom_init_check.sh
LoPAR: https://openpowerfoundation.org/?resource_lib=linux-on-power-architecture-reference-a-papr-linux-subset-review-draft
---
arch/powerpc/kernel/prom_init.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index c18d55f8b951..7343076b261c 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -27,6 +27,7 @@
#include <linux/initrd.h>
#include <linux/bitops.h>
#include <linux/pgtable.h>
+#include <linux/printk.h>
#include <asm/prom.h>
#include <asm/rtas.h>
#include <asm/page.h>
@@ -944,6 +945,10 @@ struct option_vector6 {
u8 os_name;
} __packed;
+struct option_vector7 {
+ u8 os_id[256];
+} __packed;
+
struct ibm_arch_vec {
struct { u32 mask, val; } pvrs[14];
@@ -966,6 +971,9 @@ struct ibm_arch_vec {
u8 vec6_len;
struct option_vector6 vec6;
+
+ u8 vec7_len;
+ struct option_vector7 vec7;
} __packed;
static const struct ibm_arch_vec ibm_architecture_vec_template __initconst = {
@@ -1112,6 +1120,9 @@ static const struct ibm_arch_vec ibm_architecture_vec_template __initconst = {
.secondary_pteg = 0,
.os_name = OV6_LINUX,
},
+
+ /* option vector 7: OS Identification */
+ .vec7_len = VECTOR_LENGTH(sizeof(struct option_vector7)),
};
static struct ibm_arch_vec __prombss ibm_architecture_vec ____cacheline_aligned;
@@ -1340,6 +1351,10 @@ static void __init prom_check_platform_support(void)
memcpy(&ibm_architecture_vec, &ibm_architecture_vec_template,
sizeof(ibm_architecture_vec));
+ prom_strscpy_pad(ibm_architecture_vec.vec7.os_id, linux_banner, 256);
+ // Ensure nul termination
+ ibm_architecture_vec.vec7.os_id[255] = '\0';
+
if (prop_len > 1) {
int i;
u8 vec[8];
--
2.25.1
* [powerpc][5.13.0-rc7] Kernel warning (kernel/sched/fair.c:401) while running LTP tests
From: Sachin Sant @ 2021-06-21 6:32 UTC (permalink / raw)
To: linux-kernel; +Cc: peterz, odin, linuxppc-dev
While running LTP tests (cfs_bandwidth01) against a 5.13.0-rc7 kernel on a powerpc box,
the following warning is seen:
[ 6611.331827] ------------[ cut here ]------------
[ 6611.331855] rq->tmp_alone_branch != &rq->leaf_cfs_rq_list
[ 6611.331862] WARNING: CPU: 8 PID: 0 at kernel/sched/fair.c:401 unthrottle_cfs_rq+0x4cc/0x590
[ 6611.331883] Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache netfs tun brd overlay vfat fat btrfs blake2b_generic xor zstd_compress raid6_pq xfs loop sctp ip6_udp_tunnel udp_tunnel libcrc32c dm_mod bonding rfkill sunrpc pseries_rng xts vmx_crypto sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse [last unloaded: init_module]
[ 6611.331957] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G OE 5.13.0-rc6-gcba5e97280f5 #1
[ 6611.331968] NIP: c0000000001b7aac LR: c0000000001b7aa8 CTR: c000000000722d30
[ 6611.331976] REGS: c00000000274f3a0 TRAP: 0700 Tainted: G OE (5.13.0-rc6-gcba5e97280f5)
[ 6611.331985] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48000224 XER: 00000005
[ 6611.332002] CFAR: c00000000014ca20 IRQMASK: 1
[ 6611.332002] GPR00: c0000000001b7aa8 c00000000274f640 c000000001abaf00 000000000000002d
[ 6611.332002] GPR04: 00000000ffff7fff c00000000274f300 0000000000000027 c000000efdb07e08
[ 6611.332002] GPR08: 0000000000000023 0000000000000001 0000000000000027 c000000001976680
[ 6611.332002] GPR12: 0000000000000000 c000000effc0be80 c000000ef07b3f90 000000001eefe200
[ 6611.332002] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 6611.332002] GPR20: 0000000000000001 c000000000fa6c08 c000000000fa6030 0000000000000001
[ 6611.332002] GPR24: 0000000000000000 0000000000000000 c000000efde12380 0000000000000001
[ 6611.332002] GPR28: 0000000000000001 0000000000000000 c000000efde12400 0000000000000000
[ 6611.332094] NIP [c0000000001b7aac] unthrottle_cfs_rq+0x4cc/0x590
[ 6611.332104] LR [c0000000001b7aa8] unthrottle_cfs_rq+0x4c8/0x590
[ 6611.332113] Call Trace:
[ 6611.332116] [c00000000274f640] [c0000000001b7aa8] unthrottle_cfs_rq+0x4c8/0x590 (unreliable)
[ 6611.332128] [c00000000274f6e0] [c0000000001b7e38] distribute_cfs_runtime+0x1d8/0x280
[ 6611.332139] [c00000000274f7b0] [c0000000001b81d0] sched_cfs_period_timer+0x140/0x330
[ 6611.332149] [c00000000274f870] [c00000000022a03c] __hrtimer_run_queues+0x17c/0x380
[ 6611.332158] [c00000000274f8f0] [c00000000022ac68] hrtimer_interrupt+0x128/0x2f0
[ 6611.332168] [c00000000274f9a0] [c00000000002940c] timer_interrupt+0x13c/0x370
[ 6611.332179] [c00000000274fa00] [c000000000009c04] decrementer_common_virt+0x1a4/0x1b0
[ 6611.332189] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x24
[ 6611.332199] NIP: c0000000000f6af8 LR: c000000000a05f68 CTR: 0000000000000000
[ 6611.332206] REGS: c00000000274fa70 TRAP: 0900 Tainted: G OE (5.13.0-rc6-gcba5e97280f5)
[ 6611.332214] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28000224 XER: 00000000
[ 6611.332234] CFAR: 0000000000000c00 IRQMASK: 0
[ 6611.332234] GPR00: 0000000000000000 c00000000274fd10 c000000001abaf00 0000000000000000
[ 6611.332234] GPR04: 00000000000000c0 0000000000000080 0001a91c68b80fa1 00000000000003dc
[ 6611.332234] GPR08: 000000000001f400 0000000000000001 0000000000000000 0000000000000000
[ 6611.332234] GPR12: 0000000000000000 c000000effc0be80 c000000ef07b3f90 000000001eefe200
[ 6611.332234] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 6611.332234] GPR20: 0000000000000001 0000000000000002 0000000000000010 c0000000019fe2f8
[ 6611.332234] GPR24: 0000000000000001 00000603517d757e 0000000000000000 0000000000000000
[ 6611.332234] GPR28: 0000000000000001 0000000000000000 c000000001231f90 c000000001231f98
[ 6611.332323] NIP [c0000000000f6af8] plpar_hcall_norets_notrace+0x18/0x24
[ 6611.332332] LR [c000000000a05f68] check_and_cede_processor+0x48/0x60
[ 6611.332340] --- interrupt: 900
[ 6611.332345] [c00000000274fd10] [c000000efdb92380] 0xc000000efdb92380 (unreliable)
[ 6611.332355] [c00000000274fd70] [c000000000a063bc] dedicated_cede_loop+0x9c/0x1b0
[ 6611.332364] [c00000000274fdc0] [c000000000a02b04] cpuidle_enter_state+0x2e4/0x4e0
[ 6611.332375] [c00000000274fe20] [c000000000a02da0] cpuidle_enter+0x50/0x70
[ 6611.332385] [c00000000274fe60] [c0000000001a883c] call_cpuidle+0x4c/0x80
[ 6611.332393] [c00000000274fe80] [c0000000001a8ee0] do_idle+0x380/0x3e0
[ 6611.332402] [c00000000274ff00] [c0000000001a91bc] cpu_startup_entry+0x3c/0x40
[ 6611.332411] [c00000000274ff30] [c000000000063ff8] start_secondary+0x298/0x2b0
[ 6611.332421] [c00000000274ff90] [c00000000000c754] start_secondary_prolog+0x10/0x14
[ 6611.332430] Instruction dump:
[ 6611.332435] 4bfffc44 3d22fff6 8929f328 2f890000 409efea4 39200001 3d42fff6 3c62ff4f
[ 6611.332451] 3863bcd8 992af328 4bf94f15 60000000 <0fe00000> 4bfffe80 7f6407b4 7f43d378
[ 6611.332466] ---[ end trace 1346f865cd1cae91 ]---
5.13.0-rc6 was good. Bisect points to the following patch:
commit a7b359fc6a37
sched/fair: Correctly insert cfs_rq's to list on unthrottle
The test runs to completion (without this warning) if the patch is reverted.
Thanks
-Sachin
* Re: arch/powerpc/kvm/book3s_hv_nested.c:264:6: error: stack frame size of 2304 bytes in function 'kvmhv_enter_nested_guest'
From: Nathan Chancellor @ 2021-06-21 5:53 UTC (permalink / raw)
To: Nicholas Piggin, Arnd Bergmann, kernel test robot
Cc: kbuild-all, Kees Cook, clang-built-linux, linux-kernel, kvm-ppc,
Linux Memory Management List, Andrew Morton, linuxppc-dev
In-Reply-To: <1624232938.d90brlmh3p.astroid@bobo.none>
On 6/20/2021 4:59 PM, Nicholas Piggin wrote:
> Excerpts from kernel test robot's message of April 3, 2021 8:47 pm:
>> tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>> head: d93a0d43e3d0ba9e19387be4dae4a8d5b175a8d7
>> commit: 97e4910232fa1f81e806aa60c25a0450276d99a2 linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*
>> date: 3 weeks ago
>> config: powerpc64-randconfig-r006-20210403 (attached as .config)
>> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 0fe8af94688aa03c01913c2001d6a1a911f42ce6)
>> reproduce (this is a W=1 build):
>> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>> chmod +x ~/bin/make.cross
>> # install powerpc64 cross compiling tool for clang build
>> # apt-get install binutils-powerpc64-linux-gnu
>> # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97e4910232fa1f81e806aa60c25a0450276d99a2
>> git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>> git fetch --no-tags linus master
>> git checkout 97e4910232fa1f81e806aa60c25a0450276d99a2
>> # save the attached .config to linux build tree
>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>>
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot <lkp@intel.com>
>>
>> All errors (new ones prefixed by >>):
>>
>>>> arch/powerpc/kvm/book3s_hv_nested.c:264:6: error: stack frame size of 2304 bytes in function 'kvmhv_enter_nested_guest' [-Werror,-Wframe-larger-than=]
>> long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
>> ^
>> 1 error generated.
>>
>>
>> vim +/kvmhv_enter_nested_guest +264 arch/powerpc/kvm/book3s_hv_nested.c
>
> Not much changed here recently. It's not that big a concern because it's
> only called in the KVM ioctl path, not in any deep IO paths or anything,
> and doesn't recurse. Might be a bit of inlining or stack spilling put it
> over the edge.
It appears to be the fact that LLVM's PowerPC backend does not emit
efficient byteswap assembly:
https://github.com/ClangBuiltLinux/linux/issues/1292
https://bugs.llvm.org/show_bug.cgi?id=49610
> powerpc does make it an error though, would be good to avoid that so the
> robot doesn't keep tripping over.
Marking byteswap_pt_regs as 'noinline_for_stack' drastically reduces the
stack usage. If that is an acceptable solution, I can send it along
tomorrow.
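To illustrate the idea (this is a sketch, not the actual patch): in
plain GCC/Clang C, keeping a helper out of line looks roughly like the
following, on the assumption that noinline_for_stack amounts to the
compiler's noinline attribute. The struct and function names are
hypothetical.

```c
#include <stdint.h>

#define NREGS 32

/* Hypothetical stand-in for a register save area. */
struct regs_sketch {
	uint64_t gpr[NREGS];
};

/*
 * Kept out of line so that any temporaries the compiler needs for
 * the swaps live in this function's own frame instead of inflating
 * the caller's frame.
 */
__attribute__((noinline)) void byteswap_regs_sketch(struct regs_sketch *r)
{
	for (int i = 0; i < NREGS; i++)
		r->gpr[i] = __builtin_bswap64(r->gpr[i]);
}
```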
Cheers,
Nathan
> Thanks,
> Nick
>
>
>>
>> afe75049303f75 Ravi Bangoria 2020-12-16 263
>> 360cae313702cd Paul Mackerras 2018-10-08 @264 long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
>> 360cae313702cd Paul Mackerras 2018-10-08 265 {
>> 360cae313702cd Paul Mackerras 2018-10-08 266 long int err, r;
>> 360cae313702cd Paul Mackerras 2018-10-08 267 struct kvm_nested_guest *l2;
>> 360cae313702cd Paul Mackerras 2018-10-08 268 struct pt_regs l2_regs, saved_l1_regs;
>> afe75049303f75 Ravi Bangoria 2020-12-16 269 struct hv_guest_state l2_hv = {0}, saved_l1_hv;
>> 360cae313702cd Paul Mackerras 2018-10-08 270 struct kvmppc_vcore *vc = vcpu->arch.vcore;
>> 360cae313702cd Paul Mackerras 2018-10-08 271 u64 hv_ptr, regs_ptr;
>> 360cae313702cd Paul Mackerras 2018-10-08 272 u64 hdec_exp;
>> 360cae313702cd Paul Mackerras 2018-10-08 273 s64 delta_purr, delta_spurr, delta_ic, delta_vtb;
>> 360cae313702cd Paul Mackerras 2018-10-08 274 u64 mask;
>> 360cae313702cd Paul Mackerras 2018-10-08 275 unsigned long lpcr;
>> 360cae313702cd Paul Mackerras 2018-10-08 276
>> 360cae313702cd Paul Mackerras 2018-10-08 277 if (vcpu->kvm->arch.l1_ptcr == 0)
>> 360cae313702cd Paul Mackerras 2018-10-08 278 return H_NOT_AVAILABLE;
>> 360cae313702cd Paul Mackerras 2018-10-08 279
>> 360cae313702cd Paul Mackerras 2018-10-08 280 /* copy parameters in */
>> 360cae313702cd Paul Mackerras 2018-10-08 281 hv_ptr = kvmppc_get_gpr(vcpu, 4);
>> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 282 regs_ptr = kvmppc_get_gpr(vcpu, 5);
>> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 283 vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
>> afe75049303f75 Ravi Bangoria 2020-12-16 284 err = kvmhv_read_guest_state_and_regs(vcpu, &l2_hv, &l2_regs,
>> afe75049303f75 Ravi Bangoria 2020-12-16 285 hv_ptr, regs_ptr);
>> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 286 srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>> 360cae313702cd Paul Mackerras 2018-10-08 287 if (err)
>> 360cae313702cd Paul Mackerras 2018-10-08 288 return H_PARAMETER;
>> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 289
>> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 290 if (kvmppc_need_byteswap(vcpu))
>> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 291 byteswap_hv_regs(&l2_hv);
>> afe75049303f75 Ravi Bangoria 2020-12-16 292 if (l2_hv.version > HV_GUEST_STATE_VERSION)
>> 360cae313702cd Paul Mackerras 2018-10-08 293 return H_P2;
>> 360cae313702cd Paul Mackerras 2018-10-08 294
>> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 295 if (kvmppc_need_byteswap(vcpu))
>> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 296 byteswap_pt_regs(&l2_regs);
>> 9d0b048da788c1 Suraj Jitindar Singh 2018-10-08 297 if (l2_hv.vcpu_token >= NR_CPUS)
>> 9d0b048da788c1 Suraj Jitindar Singh 2018-10-08 298 return H_PARAMETER;
>> 9d0b048da788c1 Suraj Jitindar Singh 2018-10-08 299
>> 360cae313702cd Paul Mackerras 2018-10-08 300 /* translate lpid */
>> 360cae313702cd Paul Mackerras 2018-10-08 301 l2 = kvmhv_get_nested(vcpu->kvm, l2_hv.lpid, true);
>> 360cae313702cd Paul Mackerras 2018-10-08 302 if (!l2)
>> 360cae313702cd Paul Mackerras 2018-10-08 303 return H_PARAMETER;
>> 360cae313702cd Paul Mackerras 2018-10-08 304 if (!l2->l1_gr_to_hr) {
>> 360cae313702cd Paul Mackerras 2018-10-08 305 mutex_lock(&l2->tlb_lock);
>> 360cae313702cd Paul Mackerras 2018-10-08 306 kvmhv_update_ptbl_cache(l2);
>> 360cae313702cd Paul Mackerras 2018-10-08 307 mutex_unlock(&l2->tlb_lock);
>> 360cae313702cd Paul Mackerras 2018-10-08 308 }
>> 360cae313702cd Paul Mackerras 2018-10-08 309
>> 360cae313702cd Paul Mackerras 2018-10-08 310 /* save l1 values of things */
>> 360cae313702cd Paul Mackerras 2018-10-08 311 vcpu->arch.regs.msr = vcpu->arch.shregs.msr;
>> 360cae313702cd Paul Mackerras 2018-10-08 312 saved_l1_regs = vcpu->arch.regs;
>> 360cae313702cd Paul Mackerras 2018-10-08 313 kvmhv_save_hv_regs(vcpu, &saved_l1_hv);
>> 360cae313702cd Paul Mackerras 2018-10-08 314
>> 360cae313702cd Paul Mackerras 2018-10-08 315 /* convert TB values/offsets to host (L0) values */
>> 360cae313702cd Paul Mackerras 2018-10-08 316 hdec_exp = l2_hv.hdec_expiry - vc->tb_offset;
>> 360cae313702cd Paul Mackerras 2018-10-08 317 vc->tb_offset += l2_hv.tb_offset;
>> 360cae313702cd Paul Mackerras 2018-10-08 318
>> 360cae313702cd Paul Mackerras 2018-10-08 319 /* set L1 state to L2 state */
>> 360cae313702cd Paul Mackerras 2018-10-08 320 vcpu->arch.nested = l2;
>> 360cae313702cd Paul Mackerras 2018-10-08 321 vcpu->arch.nested_vcpu_id = l2_hv.vcpu_token;
>> 360cae313702cd Paul Mackerras 2018-10-08 322 vcpu->arch.regs = l2_regs;
>> 360cae313702cd Paul Mackerras 2018-10-08 323 vcpu->arch.shregs.msr = vcpu->arch.regs.msr;
>> 360cae313702cd Paul Mackerras 2018-10-08 324 mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD |
>> 360cae313702cd Paul Mackerras 2018-10-08 325 LPCR_LPES | LPCR_MER;
>> 360cae313702cd Paul Mackerras 2018-10-08 326 lpcr = (vc->lpcr & ~mask) | (l2_hv.lpcr & mask);
>> 73937deb4b2d7f Suraj Jitindar Singh 2018-10-08 327 sanitise_hv_regs(vcpu, &l2_hv);
>> 360cae313702cd Paul Mackerras 2018-10-08 328 restore_hv_regs(vcpu, &l2_hv);
>> 360cae313702cd Paul Mackerras 2018-10-08 329
>> 360cae313702cd Paul Mackerras 2018-10-08 330 vcpu->arch.ret = RESUME_GUEST;
>> 360cae313702cd Paul Mackerras 2018-10-08 331 vcpu->arch.trap = 0;
>> 360cae313702cd Paul Mackerras 2018-10-08 332 do {
>> 360cae313702cd Paul Mackerras 2018-10-08 333 if (mftb() >= hdec_exp) {
>> 360cae313702cd Paul Mackerras 2018-10-08 334 vcpu->arch.trap = BOOK3S_INTERRUPT_HV_DECREMENTER;
>> 360cae313702cd Paul Mackerras 2018-10-08 335 r = RESUME_HOST;
>> 360cae313702cd Paul Mackerras 2018-10-08 336 break;
>> 360cae313702cd Paul Mackerras 2018-10-08 337 }
>> 8c99d34578628b Tianjia Zhang 2020-04-27 338 r = kvmhv_run_single_vcpu(vcpu, hdec_exp, lpcr);
>> 360cae313702cd Paul Mackerras 2018-10-08 339 } while (is_kvmppc_resume_guest(r));
>> 360cae313702cd Paul Mackerras 2018-10-08 340
>> 360cae313702cd Paul Mackerras 2018-10-08 341 /* save L2 state for return */
>> 360cae313702cd Paul Mackerras 2018-10-08 342 l2_regs = vcpu->arch.regs;
>> 360cae313702cd Paul Mackerras 2018-10-08 343 l2_regs.msr = vcpu->arch.shregs.msr;
>> 360cae313702cd Paul Mackerras 2018-10-08 344 delta_purr = vcpu->arch.purr - l2_hv.purr;
>> 360cae313702cd Paul Mackerras 2018-10-08 345 delta_spurr = vcpu->arch.spurr - l2_hv.spurr;
>> 360cae313702cd Paul Mackerras 2018-10-08 346 delta_ic = vcpu->arch.ic - l2_hv.ic;
>> 360cae313702cd Paul Mackerras 2018-10-08 347 delta_vtb = vc->vtb - l2_hv.vtb;
>> 360cae313702cd Paul Mackerras 2018-10-08 348 save_hv_return_state(vcpu, vcpu->arch.trap, &l2_hv);
>> 360cae313702cd Paul Mackerras 2018-10-08 349
>> 360cae313702cd Paul Mackerras 2018-10-08 350 /* restore L1 state */
>> 360cae313702cd Paul Mackerras 2018-10-08 351 vcpu->arch.nested = NULL;
>> 360cae313702cd Paul Mackerras 2018-10-08 352 vcpu->arch.regs = saved_l1_regs;
>> 360cae313702cd Paul Mackerras 2018-10-08 353 vcpu->arch.shregs.msr = saved_l1_regs.msr & ~MSR_TS_MASK;
>> 360cae313702cd Paul Mackerras 2018-10-08 354 /* set L1 MSR TS field according to L2 transaction state */
>> 360cae313702cd Paul Mackerras 2018-10-08 355 if (l2_regs.msr & MSR_TS_MASK)
>> 360cae313702cd Paul Mackerras 2018-10-08 356 vcpu->arch.shregs.msr |= MSR_TS_S;
>> 360cae313702cd Paul Mackerras 2018-10-08 357 vc->tb_offset = saved_l1_hv.tb_offset;
>> 360cae313702cd Paul Mackerras 2018-10-08 358 restore_hv_regs(vcpu, &saved_l1_hv);
>> 360cae313702cd Paul Mackerras 2018-10-08 359 vcpu->arch.purr += delta_purr;
>> 360cae313702cd Paul Mackerras 2018-10-08 360 vcpu->arch.spurr += delta_spurr;
>> 360cae313702cd Paul Mackerras 2018-10-08 361 vcpu->arch.ic += delta_ic;
>> 360cae313702cd Paul Mackerras 2018-10-08 362 vc->vtb += delta_vtb;
>> 360cae313702cd Paul Mackerras 2018-10-08 363
>> 360cae313702cd Paul Mackerras 2018-10-08 364 kvmhv_put_nested(l2);
>> 360cae313702cd Paul Mackerras 2018-10-08 365
>> 360cae313702cd Paul Mackerras 2018-10-08 366 /* copy l2_hv_state and regs back to guest */
>> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 367 if (kvmppc_need_byteswap(vcpu)) {
>> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 368 byteswap_hv_regs(&l2_hv);
>> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 369 byteswap_pt_regs(&l2_regs);
>> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 370 }
>> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 371 vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
>> afe75049303f75 Ravi Bangoria 2020-12-16 372 err = kvmhv_write_guest_state_and_regs(vcpu, &l2_hv, &l2_regs,
>> afe75049303f75 Ravi Bangoria 2020-12-16 373 hv_ptr, regs_ptr);
>> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 374 srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>> 360cae313702cd Paul Mackerras 2018-10-08 375 if (err)
>> 360cae313702cd Paul Mackerras 2018-10-08 376 return H_AUTHORITY;
>> 360cae313702cd Paul Mackerras 2018-10-08 377
>> 360cae313702cd Paul Mackerras 2018-10-08 378 if (r == -EINTR)
>> 360cae313702cd Paul Mackerras 2018-10-08 379 return H_INTERRUPT;
>> 360cae313702cd Paul Mackerras 2018-10-08 380
>> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 381 if (vcpu->mmio_needed) {
>> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 382 kvmhv_nested_mmio_needed(vcpu, regs_ptr);
>> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 383 return H_TOO_HARD;
>> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 384 }
>> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 385
>> 360cae313702cd Paul Mackerras 2018-10-08 386 return vcpu->arch.trap;
>> 360cae313702cd Paul Mackerras 2018-10-08 387 }
>> 360cae313702cd Paul Mackerras 2018-10-08 388
>>
>> :::::: The code at line 264 was first introduced by commit
>> :::::: 360cae313702cdd0b90f82c261a8302fecef030a KVM: PPC: Book3S HV: Nested guest entry via hypercall
>>
>> :::::: TO: Paul Mackerras <paulus@ozlabs.org>
>> :::::: CC: Michael Ellerman <mpe@ellerman.id.au>
>>
>> ---
>> 0-DAY CI Kernel Test Service, Intel Corporation
>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
>>
* Re: [PATCH 0/2] powerpc/perf: Add instruction and data address registers to extended regs
From: Nageswara Sastry @ 2021-06-21 4:09 UTC (permalink / raw)
To: Athira Rajeev, mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev
In-Reply-To: <1624200360-1429-1-git-send-email-atrajeev@linux.vnet.ibm.com>
On 20/06/21 8:15 pm, Athira Rajeev wrote:
> Patch set adds PMU registers namely Sampled Instruction Address Register
> (SIAR) and Sampled Data Address Register (SDAR) as part of extended regs
> in PowerPC. These registers provide the instruction/data address, and
> adding them to extended regs helps with debugging.
>
> Patch 1/2 adds SIAR and SDAR as part of the extended regs mask.
> Patch 2/2 includes perf tools side changes to add the SPRs to
> sample_reg_mask to use with -I? option.
>
> Athira Rajeev (2):
> powerpc/perf: Expose instruction and data address registers as part of
> extended regs
> tools/perf: Add perf tools support to expose instruction and data
> address registers as part of extended regs
Tested with the following scenarios on P9, P10 - PowerVM environment
1. perf record -I? - shows added - sdar, siar
2. perf record -I <workload> and perf report -D - shows added - sdar,
siar with and without counts.
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
> arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++++++-----
> arch/powerpc/perf/perf_regs.c | 4 ++++
> tools/arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++++++-----
> tools/perf/arch/powerpc/include/perf_regs.h | 2 ++
> tools/perf/arch/powerpc/util/perf_regs.c | 2 ++
> 5 files changed, 22 insertions(+), 10 deletions(-)
>
--
Thanks and Regards
R.Nageswara Sastry
* [powerpc:next-test] BUILD SUCCESS 41075908e941f30636a607e841c08d7941966e1b
From: kernel test robot @ 2021-06-21 3:48 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: 41075908e941f30636a607e841c08d7941966e1b powerpc: Enable KFENCE on BOOK3S/64
elapsed time: 723m
configs tested: 132
configs skipped: 3
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
powerpc chrp32_defconfig
arm lpd270_defconfig
riscv rv32_defconfig
powerpc makalu_defconfig
m68k defconfig
mips loongson2k_defconfig
powerpc amigaone_defconfig
powerpc asp8347_defconfig
powerpc sbc8548_defconfig
powerpc tqm8560_defconfig
powerpc mpc8315_rdb_defconfig
powerpc mpc5200_defconfig
h8300 allyesconfig
sh se7724_defconfig
m68k amcore_defconfig
m68k mvme147_defconfig
mips qi_lb60_defconfig
riscv nommu_k210_sdcard_defconfig
mips allyesconfig
powerpc warp_defconfig
powerpc mpc83xx_defconfig
arm multi_v5_defconfig
arm sunxi_defconfig
arm zeus_defconfig
sh sh03_defconfig
powerpc ppc64e_defconfig
powerpc mpc7448_hpc2_defconfig
ia64 bigsur_defconfig
sh sh7710voipgw_defconfig
sh espt_defconfig
powerpc fsp2_defconfig
arc hsdk_defconfig
csky defconfig
powerpc kilauea_defconfig
arm eseries_pxa_defconfig
arm tct_hammer_defconfig
sparc64 defconfig
riscv defconfig
nios2 alldefconfig
powerpc mpc8540_ads_defconfig
xtensa smp_lx200_defconfig
arm lpc32xx_defconfig
powerpc stx_gp3_defconfig
sh rsk7203_defconfig
arm aspeed_g5_defconfig
powerpc mpc8560_ads_defconfig
arm mvebu_v7_defconfig
sh kfr2r09-romimage_defconfig
m68k m5475evb_defconfig
sh r7780mp_defconfig
powerpc arches_defconfig
arm dove_defconfig
x86_64 allnoconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
nds32 defconfig
nios2 allyesconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
s390 allmodconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a001-20210620
i386 randconfig-a002-20210620
i386 randconfig-a003-20210620
i386 randconfig-a006-20210620
i386 randconfig-a005-20210620
i386 randconfig-a004-20210620
x86_64 randconfig-a012-20210620
x86_64 randconfig-a016-20210620
x86_64 randconfig-a015-20210620
x86_64 randconfig-a014-20210620
x86_64 randconfig-a013-20210620
x86_64 randconfig-a011-20210620
i386 randconfig-a011-20210620
i386 randconfig-a014-20210620
i386 randconfig-a013-20210620
i386 randconfig-a015-20210620
i386 randconfig-a012-20210620
i386 randconfig-a016-20210620
x86_64 randconfig-a002-20210621
x86_64 randconfig-a001-20210621
x86_64 randconfig-a005-20210621
x86_64 randconfig-a003-20210621
x86_64 randconfig-a004-20210621
x86_64 randconfig-a006-20210621
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv allmodconfig
um x86_64_defconfig
um i386_defconfig
um kunit_defconfig
x86_64 allyesconfig
x86_64 rhel-8.3-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 rhel-8.3-kbuiltin
x86_64 kexec
clang tested configs:
x86_64 randconfig-b001-20210620
x86_64 randconfig-b001-20210621
x86_64 randconfig-a002-20210620
x86_64 randconfig-a001-20210620
x86_64 randconfig-a005-20210620
x86_64 randconfig-a003-20210620
x86_64 randconfig-a004-20210620
x86_64 randconfig-a006-20210620
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [RESEND PATCH v4 08/11] powerpc: Initialize and use a temporary mm for patching
From: Daniel Axtens @ 2021-06-21 3:19 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook
In-Reply-To: <20210506043452.9674-9-cmr@linux.ibm.com>
Hi Chris,
> + /*
> + * Choose a randomized, page-aligned address from the range:
> + * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE]
> + * The lower address bound is PAGE_SIZE to avoid the zero-page.
> + * The upper address bound is DEFAULT_MAP_WINDOW - PAGE_SIZE to stay
> + * under DEFAULT_MAP_WINDOW with the Book3s64 Hash MMU.
> + */
> + patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK)
> + % (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
I checked and poking_init() comes after the functions that init the RNG,
so this should be fine. The maths - while a bit fiddly to reason about -
does check out.
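For anyone who wants to convince themselves of the maths, here is a small
userspace sketch of the same expression. The page size (64K) and map window
(1 << 47) are illustrative assumptions, and every EX_/example_ name is made up
for this sketch rather than taken from the patch. Because the masked random
value and the modulus are both multiples of the page size, the remainder is
page-aligned too, so the result lands in
[PAGE_SIZE, DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE], which is inside the range the
comment documents.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; assumptions for this sketch, not from the patch. */
#define EX_PAGE_SHIFT		16
#define EX_PAGE_SIZE		((uint64_t)1 << EX_PAGE_SHIFT)
#define EX_PAGE_MASK		(~(EX_PAGE_SIZE - 1))
#define EX_DEFAULT_MAP_WINDOW	((uint64_t)1 << 47)

/* Same arithmetic as the quoted hunk, with the random value passed in. */
static uint64_t example_patching_addr(uint64_t rnd)
{
	uint64_t span = EX_DEFAULT_MAP_WINDOW - 2 * EX_PAGE_SIZE;
	uint64_t addr = EX_PAGE_SIZE + ((rnd & EX_PAGE_MASK) % span);

	/* Page-aligned: aligned value modulo an aligned span stays aligned. */
	assert((addr & ~EX_PAGE_MASK) == 0);
	/* Never the zero page. */
	assert(addr >= EX_PAGE_SIZE);
	/* The remainder is at most span - PAGE_SIZE, so this bound holds. */
	assert(addr <= EX_DEFAULT_MAP_WINDOW - 2 * EX_PAGE_SIZE);
	return addr;
}
```

Calling example_patching_addr() with 0, UINT64_MAX, or any other value trips
none of the asserts.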
> +
> + /*
> + * PTE allocation uses GFP_KERNEL which means we need to pre-allocate
> + * the PTE here. We cannot do the allocation during patching with IRQs
> + * disabled (ie. "atomic" context).
> + */
> + ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> + BUG_ON(!ptep);
> + pte_unmap_unlock(ptep, ptl);
> +}
>
> #if IS_BUILTIN(CONFIG_LKDTM)
> unsigned long read_cpu_patching_addr(unsigned int cpu)
> {
> - return (unsigned long)(per_cpu(text_poke_area, cpu))->addr;
> + return patching_addr;
> }
> #endif
>
> -static int text_area_cpu_up(unsigned int cpu)
> +struct patch_mapping {
> + spinlock_t *ptl; /* for protecting pte table */
> + pte_t *ptep;
> + struct temp_mm temp_mm;
> +};
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
> +
> +static inline int hash_prefault_mapping(pgprot_t pgprot)
> {
> - struct vm_struct *area;
> + int err;
>
> - area = get_vm_area(PAGE_SIZE, VM_ALLOC);
> - if (!area) {
> - WARN_ONCE(1, "Failed to create text area for cpu %d\n",
> - cpu);
> - return -1;
> - }
> - this_cpu_write(text_poke_area, area);
> + if (radix_enabled())
> + return 0;
>
> - return 0;
> -}
> + err = slb_allocate_user(patching_mm, patching_addr);
> + if (err)
> + pr_warn("map patch: failed to allocate slb entry\n");
>
Here if slb_allocate_user() fails, you'll print a warning and then fall
through to the rest of the function. You do return err, but there's a
later call to hash_page_mm() that also sets err. Can slb_allocate_user()
fail while hash_page_mm() succeeds, and would that be a problem?
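To make the concern concrete, here is a minimal userspace sketch of the two
shapes. The ex_ functions are hypothetical stand-ins for slb_allocate_user()
and hash_page_mm(); they only model "returns 0 on success, nonzero on
failure" and say nothing about when the real functions can fail.

```c
#include <stdio.h>

/* Hypothetical stand-in for a setup step: 0 on success, nonzero on failure. */
typedef int (*ex_step_fn)(void);

static int ex_step_ok(void)   { return 0; }
static int ex_step_fail(void) { return -1; }

/* Shape of the quoted code: the second assignment overwrites the first. */
static int ex_prefault_overwrite(ex_step_fn first, ex_step_fn second)
{
	int err;

	err = first();
	if (err)
		fprintf(stderr, "first step failed\n");

	err = second();		/* a failure from first() is lost here */
	if (err)
		fprintf(stderr, "second step failed\n");

	return err;
}

/* One possible alternative: stop at the first failure. */
static int ex_prefault_early_return(ex_step_fn first, ex_step_fn second)
{
	int err = first();

	if (err) {
		fprintf(stderr, "first step failed\n");
		return err;
	}

	err = second();
	if (err)
		fprintf(stderr, "second step failed\n");

	return err;
}
```

With ex_step_fail followed by ex_step_ok, the first variant returns 0 and the
second returns -1. Whether the early return is the right behaviour for the
real code depends on the answer to the question above.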
> -static int text_area_cpu_down(unsigned int cpu)
> -{
> - free_vm_area(this_cpu_read(text_poke_area));
> - return 0;
> + err = hash_page_mm(patching_mm, patching_addr, pgprot_val(pgprot), 0,
> + HPTE_USE_KERNEL_KEY);
> + if (err)
> + pr_warn("map patch: failed to insert hashed page\n");
> +
> + /* See comment in switch_slb() in mm/book3s64/slb.c */
> + isync();
> +
The comment reads:
/*
* Synchronize slbmte preloads with possible subsequent user memory
* address accesses by the kernel (user mode won't happen until
* rfid, which is safe).
*/
isync();
I have to say having read the description of isync I'm not 100% sure why
that's enough (don't we also need stores to complete?) but I'm happy to
take commit 5434ae74629a ("powerpc/64s/hash: Add a SLB preload cache")
on trust here!
I think it does make sense for you to have that barrier here: you are
potentially about to start poking at the memory mapped through that SLB
entry so you should make sure you're fully synchronised.
> + return err;
> }
>
> + init_temp_mm(&patch_mapping->temp_mm, patching_mm);
> + use_temporary_mm(&patch_mapping->temp_mm);
>
> - pmdp = pmd_offset(pudp, addr);
> - if (unlikely(!pmdp))
> - return -EINVAL;
> + /*
> + * On Book3s64 with the Hash MMU we have to manually insert the SLB
> + * entry and HPTE to prevent taking faults on the patching_addr later.
> + */
> + return(hash_prefault_mapping(pgprot));
Hmm, `return hash_prefault_mapping(pgprot);` or
`return (hash_prefault_mapping(pgprot));` maybe?
Kind regards,
Daniel
* Re: [RESEND PATCH v4 05/11] powerpc/64s: Add ability to skip SLB preload
From: Daniel Axtens @ 2021-06-21 3:13 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev; +Cc: tglx, x86, linux-hardening, keescook
In-Reply-To: <20210506043452.9674-6-cmr@linux.ibm.com>
"Christopher M. Riedl" <cmr@linux.ibm.com> writes:
> Switching to a different mm with Hash translation causes SLB entries to
> be preloaded from the current thread_info. This reduces SLB faults, for
> example when threads share a common mm but operate on different address
> ranges.
>
> Preloading entries from the thread_info struct may not always be
> appropriate - such as when switching to a temporary mm. Introduce a new
> boolean in mm_context_t to skip the SLB preload entirely. Also move the
> SLB preload code into a separate function since switch_slb() is already
> quite long. The default behavior (preloading SLB entries from the
> current thread_info struct) remains unchanged.
>
> Signed-off-by: Christopher M. Riedl <cmr@linux.ibm.com>
>
> ---
>
> v4: * New to series.
> ---
> arch/powerpc/include/asm/book3s/64/mmu.h | 3 ++
> arch/powerpc/include/asm/mmu_context.h | 13 ++++++
> arch/powerpc/mm/book3s64/mmu_context.c | 2 +
> arch/powerpc/mm/book3s64/slb.c | 56 ++++++++++++++----------
> 4 files changed, 50 insertions(+), 24 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> index eace8c3f7b0a1..b23a9dcdee5af 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> @@ -130,6 +130,9 @@ typedef struct {
> u32 pkey_allocation_map;
> s16 execute_only_pkey; /* key holding execute-only protection */
> #endif
> +
> + /* Do not preload SLB entries from thread_info during switch_slb() */
> + bool skip_slb_preload;
> } mm_context_t;
>
> static inline u16 mm_ctx_user_psize(mm_context_t *ctx)
> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 4bc45d3ed8b0e..264787e90b1a1 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -298,6 +298,19 @@ static inline int arch_dup_mmap(struct mm_struct *oldmm,
> return 0;
> }
>
> +#ifdef CONFIG_PPC_BOOK3S_64
> +
> +static inline void skip_slb_preload_mm(struct mm_struct *mm)
> +{
> + mm->context.skip_slb_preload = true;
> +}
> +
> +#else
> +
> +static inline void skip_slb_preload_mm(struct mm_struct *mm) {}
> +
> +#endif /* CONFIG_PPC_BOOK3S_64 */
> +
> #include <asm-generic/mmu_context.h>
>
> #endif /* __KERNEL__ */
> diff --git a/arch/powerpc/mm/book3s64/mmu_context.c b/arch/powerpc/mm/book3s64/mmu_context.c
> index c10fc8a72fb37..3479910264c59 100644
> --- a/arch/powerpc/mm/book3s64/mmu_context.c
> +++ b/arch/powerpc/mm/book3s64/mmu_context.c
> @@ -202,6 +202,8 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
> atomic_set(&mm->context.active_cpus, 0);
> atomic_set(&mm->context.copros, 0);
>
> + mm->context.skip_slb_preload = false;
> +
> return 0;
> }
>
> diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
> index c91bd85eb90e3..da0836cb855af 100644
> --- a/arch/powerpc/mm/book3s64/slb.c
> +++ b/arch/powerpc/mm/book3s64/slb.c
> @@ -441,10 +441,39 @@ static void slb_cache_slbie_user(unsigned int index)
> asm volatile("slbie %0" : : "r" (slbie_data));
> }
>
> +static void preload_slb_entries(struct task_struct *tsk, struct mm_struct *mm)
Should this be explicitly inline or even __always_inline? I'm thinking
switch_slb is probably a fairly hot path on hash?
> +{
> + struct thread_info *ti = task_thread_info(tsk);
> + unsigned char i;
> +
> + /*
> + * We gradually age out SLBs after a number of context switches to
> + * reduce reload overhead of unused entries (like we do with FP/VEC
> + * reload). Each time we wrap 256 switches, take an entry out of the
> + * SLB preload cache.
> + */
> + tsk->thread.load_slb++;
> + if (!tsk->thread.load_slb) {
> + unsigned long pc = KSTK_EIP(tsk);
> +
> + preload_age(ti);
> + preload_add(ti, pc);
> + }
> +
> + for (i = 0; i < ti->slb_preload_nr; i++) {
> + unsigned char idx;
> + unsigned long ea;
> +
> + idx = (ti->slb_preload_tail + i) % SLB_PRELOAD_NR;
> + ea = (unsigned long)ti->slb_preload_esid[idx] << SID_SHIFT;
> +
> + slb_allocate_user(mm, ea);
> + }
> +}
> +
> /* Flush all user entries from the segment table of the current processor. */
> void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
> {
> - struct thread_info *ti = task_thread_info(tsk);
> unsigned char i;
>
> /*
> @@ -502,29 +531,8 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>
> copy_mm_to_paca(mm);
>
> - /*
> - * We gradually age out SLBs after a number of context switches to
> - * reduce reload overhead of unused entries (like we do with FP/VEC
> - * reload). Each time we wrap 256 switches, take an entry out of the
> - * SLB preload cache.
> - */
> - tsk->thread.load_slb++;
> - if (!tsk->thread.load_slb) {
> - unsigned long pc = KSTK_EIP(tsk);
> -
> - preload_age(ti);
> - preload_add(ti, pc);
> - }
> -
> - for (i = 0; i < ti->slb_preload_nr; i++) {
> - unsigned char idx;
> - unsigned long ea;
> -
> - idx = (ti->slb_preload_tail + i) % SLB_PRELOAD_NR;
> - ea = (unsigned long)ti->slb_preload_esid[idx] << SID_SHIFT;
> -
> - slb_allocate_user(mm, ea);
> - }
> + if (!mm->context.skip_slb_preload)
> + preload_slb_entries(tsk, mm);
Should this be wrapped in likely()?
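Roughly what I have in mind, as a standalone sketch. The ex_ names are made up
for illustration; the two macros mirror how the kernel's likely()/unlikely()
hints are built on the GCC/Clang __builtin_expect builtin when branch
profiling is off. The hint only affects code layout, not behaviour.

```c
#include <stdbool.h>

/* Branch hints built on the GCC/Clang builtin; sketch names, not kernel ones. */
#define ex_likely(x)	__builtin_expect(!!(x), 1)
#define ex_unlikely(x)	__builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the relevant bit of mm_context_t. */
struct ex_ctx {
	bool skip_slb_preload;
	int preload_calls;	/* counts calls, just to make the sketch observable */
};

static void ex_preload_slb_entries(struct ex_ctx *ctx)
{
	ctx->preload_calls++;
}

static void ex_switch_slb(struct ex_ctx *ctx)
{
	/* Preloading is the common case; skipping is the rare temporary-mm case. */
	if (ex_likely(!ctx->skip_slb_preload))
		ex_preload_slb_entries(ctx);
}
```

Whether it makes a measurable difference on a real switch_slb() is something
only a benchmark can answer.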
>
> /*
> * Synchronize slbmte preloads with possible subsequent user memory
Right below this comment is the isync. It seems to be specifically
concerned with synchronising preloaded slbs. Do you need it if you are
skipping SLB preloads?
It's probably not a big deal to have an extra isync in the fairly rare
path when we're skipping preloads, but I thought I'd check.
Kind regards,
Daniel
> --
> 2.26.1
* Re: arch/powerpc/kvm/book3s_hv_nested.c:264:6: error: stack frame size of 2304 bytes in function 'kvmhv_enter_nested_guest'
From: Nicholas Piggin @ 2021-06-20 23:59 UTC (permalink / raw)
To: Arnd Bergmann, kernel test robot
Cc: kbuild-all, Kees Cook, clang-built-linux, linux-kernel, kvm-ppc,
Nathan Chancellor, Linux Memory Management List, Andrew Morton,
linuxppc-dev
In-Reply-To: <202104031853.vDT0Qjqj-lkp@intel.com>
Excerpts from kernel test robot's message of April 3, 2021 8:47 pm:
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> head: d93a0d43e3d0ba9e19387be4dae4a8d5b175a8d7
> commit: 97e4910232fa1f81e806aa60c25a0450276d99a2 linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*
> date: 3 weeks ago
> config: powerpc64-randconfig-r006-20210403 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 0fe8af94688aa03c01913c2001d6a1a911f42ce6)
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install powerpc64 cross compiling tool for clang build
> # apt-get install binutils-powerpc64-linux-gnu
> # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97e4910232fa1f81e806aa60c25a0450276d99a2
> git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git fetch --no-tags linus master
> git checkout 97e4910232fa1f81e806aa60c25a0450276d99a2
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
>>> arch/powerpc/kvm/book3s_hv_nested.c:264:6: error: stack frame size of 2304 bytes in function 'kvmhv_enter_nested_guest' [-Werror,-Wframe-larger-than=]
> long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
> ^
> 1 error generated.
>
>
> vim +/kvmhv_enter_nested_guest +264 arch/powerpc/kvm/book3s_hv_nested.c
Not much has changed here recently. It's not that big a concern because it's
only called in the KVM ioctl path, not in any deep IO paths or anything,
and it doesn't recurse. It might be a bit of inlining or stack spilling that
put it over the edge.
powerpc does make it an error though, so it would be good to avoid that so the
robot doesn't keep tripping over it.
Thanks,
Nick
>
> afe75049303f75 Ravi Bangoria 2020-12-16 263
> 360cae313702cd Paul Mackerras 2018-10-08 @264 long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
> 360cae313702cd Paul Mackerras 2018-10-08 265 {
> 360cae313702cd Paul Mackerras 2018-10-08 266 long int err, r;
> 360cae313702cd Paul Mackerras 2018-10-08 267 struct kvm_nested_guest *l2;
> 360cae313702cd Paul Mackerras 2018-10-08 268 struct pt_regs l2_regs, saved_l1_regs;
> afe75049303f75 Ravi Bangoria 2020-12-16 269 struct hv_guest_state l2_hv = {0}, saved_l1_hv;
> 360cae313702cd Paul Mackerras 2018-10-08 270 struct kvmppc_vcore *vc = vcpu->arch.vcore;
> 360cae313702cd Paul Mackerras 2018-10-08 271 u64 hv_ptr, regs_ptr;
> 360cae313702cd Paul Mackerras 2018-10-08 272 u64 hdec_exp;
> 360cae313702cd Paul Mackerras 2018-10-08 273 s64 delta_purr, delta_spurr, delta_ic, delta_vtb;
> 360cae313702cd Paul Mackerras 2018-10-08 274 u64 mask;
> 360cae313702cd Paul Mackerras 2018-10-08 275 unsigned long lpcr;
> 360cae313702cd Paul Mackerras 2018-10-08 276
> 360cae313702cd Paul Mackerras 2018-10-08 277 if (vcpu->kvm->arch.l1_ptcr == 0)
> 360cae313702cd Paul Mackerras 2018-10-08 278 return H_NOT_AVAILABLE;
> 360cae313702cd Paul Mackerras 2018-10-08 279
> 360cae313702cd Paul Mackerras 2018-10-08 280 /* copy parameters in */
> 360cae313702cd Paul Mackerras 2018-10-08 281 hv_ptr = kvmppc_get_gpr(vcpu, 4);
> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 282 regs_ptr = kvmppc_get_gpr(vcpu, 5);
> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 283 vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
> afe75049303f75 Ravi Bangoria 2020-12-16 284 err = kvmhv_read_guest_state_and_regs(vcpu, &l2_hv, &l2_regs,
> afe75049303f75 Ravi Bangoria 2020-12-16 285 hv_ptr, regs_ptr);
> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 286 srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
> 360cae313702cd Paul Mackerras 2018-10-08 287 if (err)
> 360cae313702cd Paul Mackerras 2018-10-08 288 return H_PARAMETER;
> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 289
> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 290 if (kvmppc_need_byteswap(vcpu))
> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 291 byteswap_hv_regs(&l2_hv);
> afe75049303f75 Ravi Bangoria 2020-12-16 292 if (l2_hv.version > HV_GUEST_STATE_VERSION)
> 360cae313702cd Paul Mackerras 2018-10-08 293 return H_P2;
> 360cae313702cd Paul Mackerras 2018-10-08 294
> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 295 if (kvmppc_need_byteswap(vcpu))
> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 296 byteswap_pt_regs(&l2_regs);
> 9d0b048da788c1 Suraj Jitindar Singh 2018-10-08 297 if (l2_hv.vcpu_token >= NR_CPUS)
> 9d0b048da788c1 Suraj Jitindar Singh 2018-10-08 298 return H_PARAMETER;
> 9d0b048da788c1 Suraj Jitindar Singh 2018-10-08 299
> 360cae313702cd Paul Mackerras 2018-10-08 300 /* translate lpid */
> 360cae313702cd Paul Mackerras 2018-10-08 301 l2 = kvmhv_get_nested(vcpu->kvm, l2_hv.lpid, true);
> 360cae313702cd Paul Mackerras 2018-10-08 302 if (!l2)
> 360cae313702cd Paul Mackerras 2018-10-08 303 return H_PARAMETER;
> 360cae313702cd Paul Mackerras 2018-10-08 304 if (!l2->l1_gr_to_hr) {
> 360cae313702cd Paul Mackerras 2018-10-08 305 mutex_lock(&l2->tlb_lock);
> 360cae313702cd Paul Mackerras 2018-10-08 306 kvmhv_update_ptbl_cache(l2);
> 360cae313702cd Paul Mackerras 2018-10-08 307 mutex_unlock(&l2->tlb_lock);
> 360cae313702cd Paul Mackerras 2018-10-08 308 }
> 360cae313702cd Paul Mackerras 2018-10-08 309
> 360cae313702cd Paul Mackerras 2018-10-08 310 /* save l1 values of things */
> 360cae313702cd Paul Mackerras 2018-10-08 311 vcpu->arch.regs.msr = vcpu->arch.shregs.msr;
> 360cae313702cd Paul Mackerras 2018-10-08 312 saved_l1_regs = vcpu->arch.regs;
> 360cae313702cd Paul Mackerras 2018-10-08 313 kvmhv_save_hv_regs(vcpu, &saved_l1_hv);
> 360cae313702cd Paul Mackerras 2018-10-08 314
> 360cae313702cd Paul Mackerras 2018-10-08 315 /* convert TB values/offsets to host (L0) values */
> 360cae313702cd Paul Mackerras 2018-10-08 316 hdec_exp = l2_hv.hdec_expiry - vc->tb_offset;
> 360cae313702cd Paul Mackerras 2018-10-08 317 vc->tb_offset += l2_hv.tb_offset;
> 360cae313702cd Paul Mackerras 2018-10-08 318
> 360cae313702cd Paul Mackerras 2018-10-08 319 /* set L1 state to L2 state */
> 360cae313702cd Paul Mackerras 2018-10-08 320 vcpu->arch.nested = l2;
> 360cae313702cd Paul Mackerras 2018-10-08 321 vcpu->arch.nested_vcpu_id = l2_hv.vcpu_token;
> 360cae313702cd Paul Mackerras 2018-10-08 322 vcpu->arch.regs = l2_regs;
> 360cae313702cd Paul Mackerras 2018-10-08 323 vcpu->arch.shregs.msr = vcpu->arch.regs.msr;
> 360cae313702cd Paul Mackerras 2018-10-08 324 mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD |
> 360cae313702cd Paul Mackerras 2018-10-08 325 LPCR_LPES | LPCR_MER;
> 360cae313702cd Paul Mackerras 2018-10-08 326 lpcr = (vc->lpcr & ~mask) | (l2_hv.lpcr & mask);
> 73937deb4b2d7f Suraj Jitindar Singh 2018-10-08 327 sanitise_hv_regs(vcpu, &l2_hv);
> 360cae313702cd Paul Mackerras 2018-10-08 328 restore_hv_regs(vcpu, &l2_hv);
> 360cae313702cd Paul Mackerras 2018-10-08 329
> 360cae313702cd Paul Mackerras 2018-10-08 330 vcpu->arch.ret = RESUME_GUEST;
> 360cae313702cd Paul Mackerras 2018-10-08 331 vcpu->arch.trap = 0;
> 360cae313702cd Paul Mackerras 2018-10-08 332 do {
> 360cae313702cd Paul Mackerras 2018-10-08 333 if (mftb() >= hdec_exp) {
> 360cae313702cd Paul Mackerras 2018-10-08 334 vcpu->arch.trap = BOOK3S_INTERRUPT_HV_DECREMENTER;
> 360cae313702cd Paul Mackerras 2018-10-08 335 r = RESUME_HOST;
> 360cae313702cd Paul Mackerras 2018-10-08 336 break;
> 360cae313702cd Paul Mackerras 2018-10-08 337 }
> 8c99d34578628b Tianjia Zhang 2020-04-27 338 r = kvmhv_run_single_vcpu(vcpu, hdec_exp, lpcr);
> 360cae313702cd Paul Mackerras 2018-10-08 339 } while (is_kvmppc_resume_guest(r));
> 360cae313702cd Paul Mackerras 2018-10-08 340
> 360cae313702cd Paul Mackerras 2018-10-08 341 /* save L2 state for return */
> 360cae313702cd Paul Mackerras 2018-10-08 342 l2_regs = vcpu->arch.regs;
> 360cae313702cd Paul Mackerras 2018-10-08 343 l2_regs.msr = vcpu->arch.shregs.msr;
> 360cae313702cd Paul Mackerras 2018-10-08 344 delta_purr = vcpu->arch.purr - l2_hv.purr;
> 360cae313702cd Paul Mackerras 2018-10-08 345 delta_spurr = vcpu->arch.spurr - l2_hv.spurr;
> 360cae313702cd Paul Mackerras 2018-10-08 346 delta_ic = vcpu->arch.ic - l2_hv.ic;
> 360cae313702cd Paul Mackerras 2018-10-08 347 delta_vtb = vc->vtb - l2_hv.vtb;
> 360cae313702cd Paul Mackerras 2018-10-08 348 save_hv_return_state(vcpu, vcpu->arch.trap, &l2_hv);
> 360cae313702cd Paul Mackerras 2018-10-08 349
> 360cae313702cd Paul Mackerras 2018-10-08 350 /* restore L1 state */
> 360cae313702cd Paul Mackerras 2018-10-08 351 vcpu->arch.nested = NULL;
> 360cae313702cd Paul Mackerras 2018-10-08 352 vcpu->arch.regs = saved_l1_regs;
> 360cae313702cd Paul Mackerras 2018-10-08 353 vcpu->arch.shregs.msr = saved_l1_regs.msr & ~MSR_TS_MASK;
> 360cae313702cd Paul Mackerras 2018-10-08 354 /* set L1 MSR TS field according to L2 transaction state */
> 360cae313702cd Paul Mackerras 2018-10-08 355 if (l2_regs.msr & MSR_TS_MASK)
> 360cae313702cd Paul Mackerras 2018-10-08 356 vcpu->arch.shregs.msr |= MSR_TS_S;
> 360cae313702cd Paul Mackerras 2018-10-08 357 vc->tb_offset = saved_l1_hv.tb_offset;
> 360cae313702cd Paul Mackerras 2018-10-08 358 restore_hv_regs(vcpu, &saved_l1_hv);
> 360cae313702cd Paul Mackerras 2018-10-08 359 vcpu->arch.purr += delta_purr;
> 360cae313702cd Paul Mackerras 2018-10-08 360 vcpu->arch.spurr += delta_spurr;
> 360cae313702cd Paul Mackerras 2018-10-08 361 vcpu->arch.ic += delta_ic;
> 360cae313702cd Paul Mackerras 2018-10-08 362 vc->vtb += delta_vtb;
> 360cae313702cd Paul Mackerras 2018-10-08 363
> 360cae313702cd Paul Mackerras 2018-10-08 364 kvmhv_put_nested(l2);
> 360cae313702cd Paul Mackerras 2018-10-08 365
> 360cae313702cd Paul Mackerras 2018-10-08 366 /* copy l2_hv_state and regs back to guest */
> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 367 if (kvmppc_need_byteswap(vcpu)) {
> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 368 byteswap_hv_regs(&l2_hv);
> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 369 byteswap_pt_regs(&l2_regs);
> 10b5022db7861a Suraj Jitindar Singh 2018-10-08 370 }
> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 371 vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
> afe75049303f75 Ravi Bangoria 2020-12-16 372 err = kvmhv_write_guest_state_and_regs(vcpu, &l2_hv, &l2_regs,
> afe75049303f75 Ravi Bangoria 2020-12-16 373 hv_ptr, regs_ptr);
> 1508c22f112ce1 Alexey Kardashevskiy 2020-06-09 374 srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
> 360cae313702cd Paul Mackerras 2018-10-08 375 if (err)
> 360cae313702cd Paul Mackerras 2018-10-08 376 return H_AUTHORITY;
> 360cae313702cd Paul Mackerras 2018-10-08 377
> 360cae313702cd Paul Mackerras 2018-10-08 378 if (r == -EINTR)
> 360cae313702cd Paul Mackerras 2018-10-08 379 return H_INTERRUPT;
> 360cae313702cd Paul Mackerras 2018-10-08 380
> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 381 if (vcpu->mmio_needed) {
> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 382 kvmhv_nested_mmio_needed(vcpu, regs_ptr);
> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 383 return H_TOO_HARD;
> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 384 }
> 873db2cd9a6d7f Suraj Jitindar Singh 2018-12-14 385
> 360cae313702cd Paul Mackerras 2018-10-08 386 return vcpu->arch.trap;
> 360cae313702cd Paul Mackerras 2018-10-08 387 }
> 360cae313702cd Paul Mackerras 2018-10-08 388
>
> :::::: The code at line 264 was first introduced by commit
> :::::: 360cae313702cdd0b90f82c261a8302fecef030a KVM: PPC: Book3S HV: Nested guest entry via hypercall
>
> :::::: TO: Paul Mackerras <paulus@ozlabs.org>
> :::::: CC: Michael Ellerman <mpe@ellerman.id.au>
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
>
^ permalink raw reply
* Re: [PATCH] watchdog: Remove MV64x60 watchdog driver
From: gituser @ 2021-06-20 13:30 UTC (permalink / raw)
To: Guenter Roeck
Cc: linux-watchdog, linux-kernel, netdev, Wim Van Sebroeck,
linuxppc-dev, Sebastian Hesselbarth
In-Reply-To: <20210607112950.GB314533@roeck-us.net>
Hi All,
On Mon, Jun 07, 2021 at 04:29:50AM -0700, Guenter Roeck wrote:
> On Mon, Jun 07, 2021 at 11:43:26AM +1000, Michael Ellerman wrote:
> > Guenter Roeck <linux@roeck-us.net> writes:
> > > On 5/17/21 4:17 AM, Michael Ellerman wrote:
> > >> Guenter Roeck <linux@roeck-us.net> writes:
> > >>> On 3/18/21 10:25 AM, Christophe Leroy wrote:
> > >>>> Commit 92c8c16f3457 ("powerpc/embedded6xx: Remove C2K board support")
> > >>>> removed the last selector of CONFIG_MV64X60.
> > >>>>
> > >>>> Therefore CONFIG_MV64X60_WDT cannot be selected anymore and
> > >>>> can be removed.
> > >>>>
> > >>>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> > >>>
> > >>> Reviewed-by: Guenter Roeck <linux@roeck-us.net>
> > >>>
> > >>>> ---
> > >>>> drivers/watchdog/Kconfig | 4 -
> > >>>> drivers/watchdog/Makefile | 1 -
> > >>>> drivers/watchdog/mv64x60_wdt.c | 324 ---------------------------------
> > >>>> include/linux/mv643xx.h | 8 -
> > >>>> 4 files changed, 337 deletions(-)
> > >>>> delete mode 100644 drivers/watchdog/mv64x60_wdt.c
> > >>
> > >> I assumed this would go via the watchdog tree, but seems like I
> > >> misinterpreted.
> > >>
> > >
> > > Wim didn't send a pull request this time around.
> > >
> > > Guenter
> > >
> > >> Should I take this via the powerpc tree for v5.14 ?
> >
> > I still don't see this in the watchdog tree, should I take it?
> >
> It is in my personal watchdog-next tree, but afaics Wim hasn't picked any
> of it up yet. Wim ?
Picking it up right now.
Kind regards,
Wim.
^ permalink raw reply
* Re: [PATCH v2 2/9] powerpc: Add Microwatt device tree
From: Paul Mackerras @ 2021-06-20 12:08 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: linuxppc-dev
In-Reply-To: <20210619142616.GW5077@gate.crashing.org>
On Sat, Jun 19, 2021 at 09:26:16AM -0500, Segher Boessenkool wrote:
> On Fri, Jun 18, 2021 at 01:44:16PM +1000, Paul Mackerras wrote:
> > Microwatt currently runs with MSR[HV] = 0,
>
> That isn't compliant though? If your implementation does not have LPAR
> it must set MSR[HV]=1 always.
True - but if I actually do that, Linux starts trying to use hrfid
(for example in masked_Hinterrupt), which Microwatt doesn't have.
Something for Nick to fix. :)
Paul.
^ permalink raw reply
* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.13-6 tag
From: pr-tracker-bot @ 2021-06-20 16:49 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, atrajeev, Linus Torvalds, linux-kernel
In-Reply-To: <87lf752zk9.fsf@mpe.ellerman.id.au>
The pull request you sent on Sun, 20 Jun 2021 09:40:38 +1000:
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.13-6
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b84a7c286cecf0604a5f8bd5dfcd5e1ca7233e15
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
^ permalink raw reply
* [PATCH 0/2] powerpc/perf: Add instruction and data address registers to extended regs
From: Athira Rajeev @ 2021-06-20 14:45 UTC (permalink / raw)
To: mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev, rnsastry
This patch set adds two PMU registers, the Sampled Instruction Address Register
(SIAR) and the Sampled Data Address Register (SDAR), to the extended regs
on PowerPC. These registers provide the sampled instruction and data addresses,
and exposing them as extended regs helps with debugging.
Patch 1/2 adds SIAR and SDAR to the extended regs mask.
Patch 2/2 makes the perf tools side changes that add the SPRs to
sample_reg_mask for use with the -I? option.
Athira Rajeev (2):
powerpc/perf: Expose instruction and data address registers as part of
extended regs
tools/perf: Add perf tools support to expose instruction and data
address registers as part of extended regs
arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++++++-----
arch/powerpc/perf/perf_regs.c | 4 ++++
tools/arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++++++-----
tools/perf/arch/powerpc/include/perf_regs.h | 2 ++
tools/perf/arch/powerpc/util/perf_regs.c | 2 ++
5 files changed, 22 insertions(+), 10 deletions(-)
--
1.8.3.1
^ permalink raw reply
* [PATCH 1/2] powerpc/perf: Expose instruction and data address registers as part of extended regs
From: Athira Rajeev @ 2021-06-20 14:45 UTC (permalink / raw)
To: mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev, rnsastry
In-Reply-To: <1624200360-1429-1-git-send-email-atrajeev@linux.vnet.ibm.com>
This patch adds support for including the Sampled Instruction Address Register
(SIAR) and Sampled Data Address Register (SDAR) SPRs in the extended
registers. Update the definitions of PERF_REG_PMU_MASK_300, PERF_REG_PMU_MASK_31
and PERF_REG_EXTENDED_MAX to include these SPRs.
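
As a rough check of the mask change described above, the following sketch
compares the old 0xfff and new 0x3fff masks. It is only an illustration, not
kernel code: the EX_* names and the helper functions are hypothetical, and the
base index 45 for MMCR0 is an assumption. Only the relative order of the
registers (MMCR0 through PMC6, then SDAR, then SIAR) is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the enum in perf_regs.h. The absolute index
 * of MMCR0 (45) is an assumption for illustration; only the relative
 * order MMCR0 ... PMC6, SDAR, SIAR is taken from the patch.
 */
enum {
	EX_REG_MMCR0 = 45,                /* assumed base index */
	EX_REG_PMC6  = EX_REG_MMCR0 + 11, /* 12th register counting from MMCR0 */
	EX_REG_SDAR,                      /* new: directly after PMC6 */
	EX_REG_SIAR,                      /* new: last extended register */
};

/* Old and new ARCH_31 masks, written the same way as in the patch. */
#define EX_MASK_OLD (0xfffULL << EX_REG_MMCR0)
#define EX_MASK_NEW (0x3fffULL << EX_REG_MMCR0)

/* Count the set bits, i.e. the number of registers the mask covers. */
static inline int ex_count_regs(uint64_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Return 1 if register index idx is selected by mask, else 0. */
static inline int ex_reg_in_mask(uint64_t mask, int idx)
{
	assert(idx >= 0 && idx < 64); /* shifting by 64 or more is undefined */
	return (int)((mask >> idx) & 1ULL);
}
```

Under these assumptions the old mask covers 12 registers and the new one 14,
and the only bits that differ are the ones for SDAR and SIAR, which matches
the updated comment for PERF_REG_PMU_MASK_31 in the diff below.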
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++++++-----
arch/powerpc/perf/perf_regs.c | 4 ++++
2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
index 578b3ee..cf5eee5 100644
--- a/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -61,6 +61,8 @@ enum perf_event_powerpc_regs {
PERF_REG_POWERPC_PMC4,
PERF_REG_POWERPC_PMC5,
PERF_REG_POWERPC_PMC6,
+ PERF_REG_POWERPC_SDAR,
+ PERF_REG_POWERPC_SIAR,
/* Max regs without the extended regs */
PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
};
@@ -72,16 +74,16 @@ enum perf_event_powerpc_regs {
/*
* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300
- * includes 9 SPRS from MMCR0 to PMC6 excluding the
+ * includes 11 SPRS from MMCR0 to SIAR excluding the
* unsupported SPRS in PERF_EXCLUDE_REG_EXT_300.
*/
-#define PERF_REG_PMU_MASK_300 ((0xfffULL << PERF_REG_POWERPC_MMCR0) - PERF_EXCLUDE_REG_EXT_300)
+#define PERF_REG_PMU_MASK_300 ((0x3fffULL << PERF_REG_POWERPC_MMCR0) - PERF_EXCLUDE_REG_EXT_300)
/*
* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31
- * includes 12 SPRs from MMCR0 to PMC6.
+ * includes 14 SPRs from MMCR0 to SIAR.
*/
-#define PERF_REG_PMU_MASK_31 (0xfffULL << PERF_REG_POWERPC_MMCR0)
+#define PERF_REG_PMU_MASK_31 (0x3fffULL << PERF_REG_POWERPC_MMCR0)
-#define PERF_REG_EXTENDED_MAX (PERF_REG_POWERPC_PMC6 + 1)
+#define PERF_REG_EXTENDED_MAX (PERF_REG_POWERPC_SIAR + 1)
#endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index b931eed..51d31b6 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -90,7 +90,11 @@ static u64 get_ext_regs_value(int idx)
return mfspr(SPRN_SIER2);
case PERF_REG_POWERPC_SIER3:
return mfspr(SPRN_SIER3);
+ case PERF_REG_POWERPC_SDAR:
+ return mfspr(SPRN_SDAR);
#endif
+ case PERF_REG_POWERPC_SIAR:
+ return mfspr(SPRN_SIAR);
default: return 0;
}
}
--
1.8.3.1
^ permalink raw reply related
* [PATCH 2/2] tools/perf: Add perf tools support to expose instruction and data address registers as part of extended regs
From: Athira Rajeev @ 2021-06-20 14:46 UTC (permalink / raw)
To: mpe, acme, jolsa; +Cc: kjain, maddy, linuxppc-dev, rnsastry
In-Reply-To: <1624200360-1429-1-git-send-email-atrajeev@linux.vnet.ibm.com>
This patch enables presenting the Sampled Instruction Address Register (SIAR)
and Sampled Data Address Register (SDAR) SPRs as part of the extended registers
in the perf tool. Add these SPRs to sample_reg_mask on the tool side (for use
with the -I? option).
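
To illustrate what adding an entry to the tool-side table buys, here is a
minimal sketch of a name-to-mask lookup over a table shaped like the one this
patch extends. It is a simplified stand-in, not the perf tool's actual code:
the ex_* and EX_* names are hypothetical, and the register indices are
assumed values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for one entry of the tool's sample register table. */
struct ex_sample_reg {
	const char *name;
	uint64_t mask;
};

/* Assumed indices, for illustration only. */
enum { EX_REG_PMC6 = 56, EX_REG_SDAR, EX_REG_SIAR };

/* Build an entry from a register name and its index; NULL name ends the table. */
#define EX_SMPL_REG(n, b) { .name = #n, .mask = 1ULL << (b) }
#define EX_SMPL_REG_END   { .name = NULL, .mask = 0 }

static const struct ex_sample_reg ex_sample_reg_masks[] = {
	EX_SMPL_REG(pmc6, EX_REG_PMC6),
	EX_SMPL_REG(sdar, EX_REG_SDAR), /* entry of the kind added by this patch */
	EX_SMPL_REG(siar, EX_REG_SIAR), /* entry of the kind added by this patch */
	EX_SMPL_REG_END
};

/* Map a user-supplied register name to its mask bit; 0 means unknown. */
static inline uint64_t ex_reg_name_to_mask(const char *name)
{
	const struct ex_sample_reg *r;

	assert(name != NULL);
	for (r = ex_sample_reg_masks; r->name != NULL; r++) {
		if (strcmp(r->name, name) == 0)
			return r->mask;
	}
	return 0;
}
```

With a table like this, a name such as "siar" resolves to a single mask bit,
while a name that has no entry resolves to 0 and can be rejected by the
caller.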
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++++++-----
tools/perf/arch/powerpc/include/perf_regs.h | 2 ++
tools/perf/arch/powerpc/util/perf_regs.c | 2 ++
3 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index 578b3ee..cf5eee5 100644
--- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -61,6 +61,8 @@ enum perf_event_powerpc_regs {
PERF_REG_POWERPC_PMC4,
PERF_REG_POWERPC_PMC5,
PERF_REG_POWERPC_PMC6,
+ PERF_REG_POWERPC_SDAR,
+ PERF_REG_POWERPC_SIAR,
/* Max regs without the extended regs */
PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
};
@@ -72,16 +74,16 @@ enum perf_event_powerpc_regs {
/*
* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300
- * includes 9 SPRS from MMCR0 to PMC6 excluding the
+ * includes 11 SPRS from MMCR0 to SIAR excluding the
* unsupported SPRS in PERF_EXCLUDE_REG_EXT_300.
*/
-#define PERF_REG_PMU_MASK_300 ((0xfffULL << PERF_REG_POWERPC_MMCR0) - PERF_EXCLUDE_REG_EXT_300)
+#define PERF_REG_PMU_MASK_300 ((0x3fffULL << PERF_REG_POWERPC_MMCR0) - PERF_EXCLUDE_REG_EXT_300)
/*
* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31
- * includes 12 SPRs from MMCR0 to PMC6.
+ * includes 14 SPRs from MMCR0 to SIAR.
*/
-#define PERF_REG_PMU_MASK_31 (0xfffULL << PERF_REG_POWERPC_MMCR0)
+#define PERF_REG_PMU_MASK_31 (0x3fffULL << PERF_REG_POWERPC_MMCR0)
-#define PERF_REG_EXTENDED_MAX (PERF_REG_POWERPC_PMC6 + 1)
+#define PERF_REG_EXTENDED_MAX (PERF_REG_POWERPC_SIAR + 1)
#endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
index 04e5dc0..93339d1 100644
--- a/tools/perf/arch/powerpc/include/perf_regs.h
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -77,6 +77,8 @@
[PERF_REG_POWERPC_PMC4] = "pmc4",
[PERF_REG_POWERPC_PMC5] = "pmc5",
[PERF_REG_POWERPC_PMC6] = "pmc6",
+ [PERF_REG_POWERPC_SDAR] = "sdar",
+ [PERF_REG_POWERPC_SIAR] = "siar",
};
static inline const char *__perf_reg_name(int id)
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index 8116a25..8d07a78 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -74,6 +74,8 @@
SMPL_REG(pmc4, PERF_REG_POWERPC_PMC4),
SMPL_REG(pmc5, PERF_REG_POWERPC_PMC5),
SMPL_REG(pmc6, PERF_REG_POWERPC_PMC6),
+ SMPL_REG(sdar, PERF_REG_POWERPC_SDAR),
+ SMPL_REG(siar, PERF_REG_POWERPC_SIAR),
SMPL_REG_END
};
--
1.8.3.1
^ permalink raw reply related