* [v3 0/6] KVM: arm64: implement vcpu_is_preempted check
@ 2023-01-17 10:29 Usama Arif
2023-01-17 10:29 ` [v3 1/6] KVM: arm64: Document PV-lock interface Usama Arif
` (6 more replies)
0 siblings, 7 replies; 10+ messages in thread
From: Usama Arif @ 2023-01-17 10:29 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will, maz,
steven.price, mark.rutland, bagasdotme, pbonzini
Cc: fam.zheng, liangma, punit.agrawal, Usama Arif
This patchset adds support for vcpu_is_preempted on arm64, which allows the guest
to check whether a vCPU has been scheduled out; this is useful to know in case the
vCPU was holding a lock. vcpu_is_preempted is well integrated into core kernel code
and can be used to improve performance in locking (owner_on_cpu usage in
mutex_spin_on_owner, mutex_can_spin_on_owner, rtmutex_spin_on_owner and osq_lock)
and scheduling (available_idle_cpu, which is used in several places in
kernel/sched/fair.c, e.g. in wake_affine to determine which CPU can run soonest).
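
For reference, this is roughly the core helper that consumes the check on the
mutex/rwsem optimistic-spinning paths (simplified from owner_on_cpu(); shown
purely as an illustration, not as part of this series):

  /*
   * Without an arch implementation of vcpu_is_preempted(), this always
   * reports "not preempted" on arm64, so spinners keep burning cycles
   * behind a lock holder whose vCPU the host has scheduled out.
   */
  static inline bool owner_on_cpu(struct task_struct *owner)
  {
          return READ_ONCE(owner->on_cpu) &&
                 !vcpu_is_preempted(task_cpu(owner));
  }
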
This patchset shows a significant improvement on overcommitted hosts (vCPUs > pCPUs),
as busy-waiting on preempted vCPUs reduces performance.
If merged, vcpu_is_preempted could also be used to optimize IPI performance (along
with directed yield to the target IPI vCPU), similar to how it is done on x86
(https://lore.kernel.org/all/1560255830-8656-2-git-send-email-wanpengli@tencent.com/)
All the results in the experiments below were obtained on an AWS r6g.metal instance,
which has 64 pCPUs.
The following table shows the index results of UnixBench running on a 128 vCPU VM
with (6.0+vcpu_is_preempted) and without (6.0 base) the patchset.
TestName                                  6.0 base   6.0+vcpu_is_preempted   % improvement for vcpu_is_preempted
Dhrystone 2 using register variables      187761     191274.7                1.871368389
Double-Precision Whetstone                96743.6    98414.4                 1.727039308
Execl Throughput                          689.3      10426                   1412.548963
File Copy 1024 bufsize 2000 maxblocks     549.5      3165                    475.978162
File Copy 256 bufsize 500 maxblocks       400.7      2084.7                  420.2645371
File Copy 4096 bufsize 8000 maxblocks     894.3      5003.2                  459.4543218
Pipe Throughput                           76819.5    78601.5                 2.319723508
Pipe-based Context Switching              3444.8     13414.5                 289.4130283
Process Creation                          301.1      293.4                   -2.557289937
Shell Scripts (1 concurrent)              1248.1     28300.6                 2167.494592
Shell Scripts (8 concurrent)              781.2      26222.3                 3256.669227
System Call Overhead                      3426       3729.4                  8.855808523

System Benchmarks Index Score             3053       11534                   277.7923354
This shows a 278% overall improvement with these patches.
The biggest improvement is in the shell scripts benchmark, which forks a lot of
processes. Forking takes the mmap rwsem for writing, and a large chunk of time in
the base kernel is spent contending for it. This can be seen in one of the call
stacks from the perf output of the shell scripts benchmark on the base kernel
(pseudo-NMI was enabled for the perf numbers below):
    - 33.79% el0_svc
       - 33.43% do_el0_svc
          - 33.43% el0_svc_common.constprop.3
             - 33.30% invoke_syscall
                - 17.27% __arm64_sys_clone
                   - 17.27% __do_sys_clone
                      - 17.26% kernel_clone
                         - 16.73% copy_process
                            - 11.91% dup_mm
                               - 11.82% dup_mmap
                                  - 9.15% down_write
                                     - 8.87% rwsem_down_write_slowpath
                                        - 8.48% osq_lock
Just under 50% of the total time in the shell script benchmarks ends up being
spent in osq_lock in the base kernel:
Children    Self   Command   Shared Object      Symbol
  17.19%  10.71%   sh        [kernel.kallsyms]  [k] osq_lock
   6.17%   4.04%   sort      [kernel.kallsyms]  [k] osq_lock
   4.20%   2.60%   multi.    [kernel.kallsyms]  [k] osq_lock
   3.77%   2.47%   grep      [kernel.kallsyms]  [k] osq_lock
   3.50%   2.24%   expr      [kernel.kallsyms]  [k] osq_lock
   3.41%   2.23%   od        [kernel.kallsyms]  [k] osq_lock
   3.36%   2.15%   rm        [kernel.kallsyms]  [k] osq_lock
   3.28%   2.12%   tee       [kernel.kallsyms]  [k] osq_lock
   3.16%   2.02%   wc        [kernel.kallsyms]  [k] osq_lock
   0.21%   0.13%   looper    [kernel.kallsyms]  [k] osq_lock
   0.01%   0.00%   Run       [kernel.kallsyms]  [k] osq_lock
and this comes down to less than 1% in total with the 6.0+vcpu_is_preempted kernel:
Children    Self   Command   Shared Object      Symbol
   0.26%   0.21%   sh        [kernel.kallsyms]  [k] osq_lock
   0.10%   0.08%   multi.    [kernel.kallsyms]  [k] osq_lock
   0.04%   0.04%   sort      [kernel.kallsyms]  [k] osq_lock
   0.02%   0.01%   grep      [kernel.kallsyms]  [k] osq_lock
   0.02%   0.02%   od        [kernel.kallsyms]  [k] osq_lock
   0.01%   0.01%   tee       [kernel.kallsyms]  [k] osq_lock
   0.01%   0.00%   expr      [kernel.kallsyms]  [k] osq_lock
   0.01%   0.01%   looper    [kernel.kallsyms]  [k] osq_lock
   0.00%   0.00%   wc        [kernel.kallsyms]  [k] osq_lock
   0.00%   0.00%   rm        [kernel.kallsyms]  [k] osq_lock
To make sure there is no change in performance when vCPUs < pCPUs, UnixBench
was also run on a 32 vCPU VM. The kernel with vcpu_is_preempted implemented
performed 0.9% better overall than the base kernel, and the individual benchmarks
were within +/-2% of 6.0 base.
Hence the patches have no negative effect when vCPUs < pCPUs.
The respective QEMU change to test this is at
https://github.com/uarif1/qemu/commit/2da2c2927ae8de8f03f439804a0dad9cf68501b6.
Looking forward to your response!
Thanks,
Usama
---
v2->v3
- Updated the patchset from 6.0 to 6.2-rc3
- Made pv_lock_init an early_initcall
- Improved documentation
- Changed pvlock_vcpu_state to aligned struct
- Minor improvements
RFC->v2
- Fixed table and code referencing in pvlock documentation
- Switched to using a single hypercall similar to ptp_kvm and made check
for has_kvm_pvlock simpler
Usama Arif (6):
KVM: arm64: Document PV-lock interface
KVM: arm64: Add SMCCC paravirtualised lock calls
KVM: arm64: Support pvlock preempted via shared structure
KVM: arm64: Provide VCPU attributes for PV lock
KVM: arm64: Support the VCPU preemption check
KVM: selftests: add tests for PV lock specific hypercall
Documentation/virt/kvm/arm/hypercalls.rst | 3 +
Documentation/virt/kvm/arm/index.rst | 1 +
Documentation/virt/kvm/arm/pvlock.rst | 54 +++++++++
Documentation/virt/kvm/devices/vcpu.rst | 25 ++++
arch/arm64/include/asm/kvm_host.h | 25 ++++
arch/arm64/include/asm/paravirt.h | 2 +
arch/arm64/include/asm/pvlock-abi.h | 15 +++
arch/arm64/include/asm/spinlock.h | 16 ++-
arch/arm64/include/uapi/asm/kvm.h | 3 +
arch/arm64/kernel/paravirt.c | 113 ++++++++++++++++++
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 8 ++
arch/arm64/kvm/guest.c | 9 ++
arch/arm64/kvm/hypercalls.c | 8 ++
arch/arm64/kvm/pvlock.c | 100 ++++++++++++++++
include/linux/arm-smccc.h | 8 ++
include/uapi/linux/kvm.h | 2 +
tools/arch/arm64/include/uapi/asm/kvm.h | 1 +
tools/include/linux/arm-smccc.h | 8 ++
.../selftests/kvm/aarch64/hypercalls.c | 2 +
20 files changed, 403 insertions(+), 2 deletions(-)
create mode 100644 Documentation/virt/kvm/arm/pvlock.rst
create mode 100644 arch/arm64/include/asm/pvlock-abi.h
create mode 100644 arch/arm64/kvm/pvlock.c
--
2.25.1
* [v3 1/6] KVM: arm64: Document PV-lock interface
2023-01-17 10:29 [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
@ 2023-01-17 10:29 ` Usama Arif
2023-01-18 13:29 ` Bagas Sanjaya
2023-01-17 10:29 ` [v3 2/6] KVM: arm64: Add SMCCC paravirtualised lock calls Usama Arif
` (5 subsequent siblings)
6 siblings, 1 reply; 10+ messages in thread
From: Usama Arif @ 2023-01-17 10:29 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will, maz,
steven.price, mark.rutland, bagasdotme, pbonzini
Cc: fam.zheng, liangma, punit.agrawal, Usama Arif
Introduce a paravirtualization interface for KVM/arm64 that lets the guest
discover whether a VCPU is currently running or not.
The PV lock structure of the guest is allocated by user space.
A hypercall interface is provided for the guest to interrogate the
location of the shared memory structures.
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
Documentation/virt/kvm/arm/index.rst | 1 +
Documentation/virt/kvm/arm/pvlock.rst | 54 +++++++++++++++++++++++++
Documentation/virt/kvm/devices/vcpu.rst | 25 ++++++++++++
3 files changed, 80 insertions(+)
create mode 100644 Documentation/virt/kvm/arm/pvlock.rst
diff --git a/Documentation/virt/kvm/arm/index.rst b/Documentation/virt/kvm/arm/index.rst
index e84848432158..b8499dc00a6a 100644
--- a/Documentation/virt/kvm/arm/index.rst
+++ b/Documentation/virt/kvm/arm/index.rst
@@ -10,4 +10,5 @@ ARM
hyp-abi
hypercalls
pvtime
+ pvlock
ptp_kvm
diff --git a/Documentation/virt/kvm/arm/pvlock.rst b/Documentation/virt/kvm/arm/pvlock.rst
new file mode 100644
index 000000000000..1b9ff7d8a385
--- /dev/null
+++ b/Documentation/virt/kvm/arm/pvlock.rst
@@ -0,0 +1,54 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Paravirtualized lock support for arm64
+======================================
+
+KVM/arm64 provides a hypervisor service call for paravirtualized guests to
+determine whether a VCPU is currently running or not.
+
+A new SMCCC compatible hypercall is defined:
+
+* ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID: 0xC6000002
+
+ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID
+
+  ============= ======== ==========================================
+  Function ID:  (uint32) 0xC6000002
+  Return value: (int64)  IPA of the pv lock data structure for this
+                         VCPU. On failure:
+                         NOT_SUPPORTED (-1)
+  ============= ======== ==========================================
+
+The IPA returned by PV_LOCK_PREEMPTED should be mapped by the guest as normal
+memory with inner and outer write back caching attributes, in the inner
+shareable domain.
+
+PV_LOCK_PREEMPTED returns the structure for the calling VCPU.
+
+PV lock state
+-------------
+
+The structure pointed to by the PV_LOCK_PREEMPTED hypercall is as follows:
+
++-----------+-------------+-------------+---------------------------------+
+| Field     | Byte Length | Byte Offset | Description                     |
++===========+=============+=============+=================================+
+| preempted | 8           | 0           | Used to indicate if the VCPU    |
+|           |             |             | which owns this struct is       |
+|           |             |             | running or not.                 |
+|           |             |             | A non-zero value means the VCPU |
+|           |             |             | has been scheduled out. A zero  |
+|           |             |             | value means the VCPU has been   |
+|           |             |             | scheduled in.                   |
++-----------+-------------+-------------+---------------------------------+
+
+The preempted field will be updated to 0 by the hypervisor prior to scheduling
+a VCPU. When the VCPU is scheduled out, the preempted field will be updated
+to 1 by the hypervisor.
+
+The structure will be present within a reserved region of the normal memory
+given to the guest. The guest should not attempt to write into this memory.
+There is a structure per VCPU of the guest.
+
+For the user space interface see
+:ref:`Documentation/virt/kvm/devices/vcpu.rst <kvm_arm_vcpu_pvlock_ctrl>`.
\ No newline at end of file
diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 31f14ec4a65b..0f999919ba92 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -265,3 +265,28 @@ From the destination VMM process:
7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
respective value derived in the previous step.
+
+.. _kvm_arm_vcpu_pvlock_ctrl:
+
+5. GROUP: KVM_ARM_VCPU_PVLOCK_CTRL
+==================================
+
+:Architectures: ARM64
+
+5.1 ATTRIBUTE: KVM_ARM_VCPU_PVLOCK_IPA
+--------------------------------------
+
+:Parameters: 64-bit base address
+
+Returns:
+
+ ======= ======================================
+ -ENXIO  PV lock not implemented
+ -EEXIST Base address already set for this VCPU
+ -EINVAL Base address not 64 byte aligned
+ ======= ======================================
+
+Specifies the base address of the pv lock structure for this VCPU. The
+base address must be 64 byte aligned and exist within a valid guest memory
+region. See Documentation/virt/kvm/arm/pvlock.rst for more information
+including the layout of the pv lock structure.
--
2.25.1
* [v3 2/6] KVM: arm64: Add SMCCC paravirtualised lock calls
2023-01-17 10:29 [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
2023-01-17 10:29 ` [v3 1/6] KVM: arm64: Document PV-lock interface Usama Arif
@ 2023-01-17 10:29 ` Usama Arif
2023-01-17 10:29 ` [v3 3/6] KVM: arm64: Support pvlock preempted via shared structure Usama Arif
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Usama Arif @ 2023-01-17 10:29 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will, maz,
steven.price, mark.rutland, bagasdotme, pbonzini
Cc: fam.zheng, liangma, punit.agrawal, Usama Arif
Add a new SMCCC compatible hypercall for PV lock features:
ARM_SMCCC_KVM_FUNC_PV_LOCK: 0xC6000002
Also add the header file which defines the ABI for the paravirtualized
lock features we're about to add.
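
For context, the new function ID decomposes as follows under the SMCCC
function-ID encoding; this is just an illustrative expansion of the
ARM_SMCCC_CALL_VAL() invocation added below, not part of the diff:

  /*
   * fast call            -> bit 31     = 0x80000000
   * SMC64 convention     -> bit 30     = 0x40000000
   * vendor hyp owner (6) -> bits 29:24 = 0x06000000
   * function number (2)  -> bits 15:0  = 0x00000002
   *                                      ----------
   *   ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID = 0xC6000002
   */
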
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
arch/arm64/include/asm/pvlock-abi.h | 15 +++++++++++++++
include/linux/arm-smccc.h | 8 ++++++++
tools/include/linux/arm-smccc.h | 8 ++++++++
3 files changed, 31 insertions(+)
create mode 100644 arch/arm64/include/asm/pvlock-abi.h
diff --git a/arch/arm64/include/asm/pvlock-abi.h b/arch/arm64/include/asm/pvlock-abi.h
new file mode 100644
index 000000000000..e12c8ec05178
--- /dev/null
+++ b/arch/arm64/include/asm/pvlock-abi.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright(c) 2019 Huawei Technologies Co., Ltd
+ * Author: Zengruan Ye <yezengruan@huawei.com>
+ * Usama Arif <usama.arif@bytedance.com>
+ */
+
+#ifndef __ASM_PVLOCK_ABI_H
+#define __ASM_PVLOCK_ABI_H
+
+struct pvlock_vcpu_state {
+ __le64 preempted;
+} __aligned(64);
+
+#endif
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index 220c8c60e021..104c10035b10 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -112,6 +112,7 @@
/* KVM "vendor specific" services */
#define ARM_SMCCC_KVM_FUNC_FEATURES 0
#define ARM_SMCCC_KVM_FUNC_PTP 1
+#define ARM_SMCCC_KVM_FUNC_PV_LOCK 2
#define ARM_SMCCC_KVM_FUNC_FEATURES_2 127
#define ARM_SMCCC_KVM_NUM_FUNCS 128
@@ -151,6 +152,13 @@
ARM_SMCCC_OWNER_STANDARD_HYP, \
0x21)
+/* Paravirtualised lock calls */
+#define ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+ ARM_SMCCC_SMC_64, \
+ ARM_SMCCC_OWNER_VENDOR_HYP, \
+ ARM_SMCCC_KVM_FUNC_PV_LOCK)
+
/* TRNG entropy source calls (defined by ARM DEN0098) */
#define ARM_SMCCC_TRNG_VERSION \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
diff --git a/tools/include/linux/arm-smccc.h b/tools/include/linux/arm-smccc.h
index 63ce9bebccd3..c21e539c0228 100644
--- a/tools/include/linux/arm-smccc.h
+++ b/tools/include/linux/arm-smccc.h
@@ -111,6 +111,7 @@
/* KVM "vendor specific" services */
#define ARM_SMCCC_KVM_FUNC_FEATURES 0
#define ARM_SMCCC_KVM_FUNC_PTP 1
+#define ARM_SMCCC_KVM_FUNC_PV_LOCK 2
#define ARM_SMCCC_KVM_FUNC_FEATURES_2 127
#define ARM_SMCCC_KVM_NUM_FUNCS 128
@@ -150,6 +151,13 @@
ARM_SMCCC_OWNER_STANDARD_HYP, \
0x21)
+/* Paravirtualised lock calls */
+#define ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+ ARM_SMCCC_SMC_64, \
+ ARM_SMCCC_OWNER_VENDOR_HYP, \
+ ARM_SMCCC_KVM_FUNC_PV_LOCK)
+
/* TRNG entropy source calls (defined by ARM DEN0098) */
#define ARM_SMCCC_TRNG_VERSION \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
--
2.25.1
* [v3 3/6] KVM: arm64: Support pvlock preempted via shared structure
2023-01-17 10:29 [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
2023-01-17 10:29 ` [v3 1/6] KVM: arm64: Document PV-lock interface Usama Arif
2023-01-17 10:29 ` [v3 2/6] KVM: arm64: Add SMCCC paravirtualised lock calls Usama Arif
@ 2023-01-17 10:29 ` Usama Arif
2023-01-17 10:29 ` [v3 4/6] KVM: arm64: Provide VCPU attributes for PV lock Usama Arif
` (3 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Usama Arif @ 2023-01-17 10:29 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will, maz,
steven.price, mark.rutland, bagasdotme, pbonzini
Cc: fam.zheng, liangma, punit.agrawal, Usama Arif
Implement the service call for configuring a shared structure between a
VCPU and the hypervisor, through which the hypervisor can tell the guest
whether the VCPU is running or not.
The preempted field is zero if the VCPU is not preempted.
Any other value means the VCPU has been preempted.
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
Documentation/virt/kvm/arm/hypercalls.rst | 3 ++
arch/arm64/include/asm/kvm_host.h | 18 ++++++++++
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 8 +++++
arch/arm64/kvm/hypercalls.c | 8 +++++
arch/arm64/kvm/pvlock.c | 43 +++++++++++++++++++++++
tools/arch/arm64/include/uapi/asm/kvm.h | 1 +
8 files changed, 83 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/kvm/pvlock.c
diff --git a/Documentation/virt/kvm/arm/hypercalls.rst b/Documentation/virt/kvm/arm/hypercalls.rst
index 3e23084644ba..872a16226ace 100644
--- a/Documentation/virt/kvm/arm/hypercalls.rst
+++ b/Documentation/virt/kvm/arm/hypercalls.rst
@@ -127,6 +127,9 @@ The pseudo-firmware bitmap register are as follows:
Bit-1: KVM_REG_ARM_VENDOR_HYP_BIT_PTP:
The bit represents the Precision Time Protocol KVM service.
+ Bit-2: KVM_REG_ARM_VENDOR_HYP_BIT_PV_LOCK:
+ The bit represents the Paravirtualized lock service.
+
Errors:
======= =============================================================
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 35a159d131b5..1d1acc5be8d7 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -501,6 +501,11 @@ struct kvm_vcpu_arch {
u64 last_steal;
gpa_t base;
} steal;
+
+ /* Guest PV lock state */
+ struct {
+ gpa_t base;
+ } pv_lock;
};
/*
@@ -924,6 +929,19 @@ static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch)
return (vcpu_arch->steal.base != GPA_INVALID);
}
+static inline void kvm_arm_pvlock_preempted_init(struct kvm_vcpu_arch *vcpu_arch)
+{
+ vcpu_arch->pv_lock.base = GPA_INVALID;
+}
+
+static inline bool kvm_arm_is_pvlock_preempted_ready(struct kvm_vcpu_arch *vcpu_arch)
+{
+ return (vcpu_arch->pv_lock.base != GPA_INVALID);
+}
+
+gpa_t kvm_init_pvlock(struct kvm_vcpu *vcpu);
+void kvm_update_pvlock_preempted(struct kvm_vcpu *vcpu, u64 preempted);
+
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index a7a857f1784d..34dd6df3f8eb 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -366,6 +366,7 @@ enum {
enum {
KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT = 0,
KVM_REG_ARM_VENDOR_HYP_BIT_PTP = 1,
+ KVM_REG_ARM_VENDOR_HYP_BIT_PV_LOCK = 2,
#ifdef __KERNEL__
KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_COUNT,
#endif
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 5e33c2d4645a..e1f711885916 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -10,7 +10,7 @@ include $(srctree)/virt/kvm/Makefile.kvm
obj-$(CONFIG_KVM) += kvm.o
obj-$(CONFIG_KVM) += hyp/
-kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
+kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o pvlock.o \
inject_fault.o va_layout.o handle_exit.o \
guest.o debug.o reset.o sys_regs.o stacktrace.o \
vgic-sys-reg-v3.o fpsimd.o pkvm.o \
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 9c5573bc4614..5808e6695f75 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -357,6 +357,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
kvm_arm_pvtime_vcpu_init(&vcpu->arch);
+ kvm_arm_pvlock_preempted_init(&vcpu->arch);
+
vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
err = kvm_vgic_vcpu_init(vcpu);
@@ -432,6 +434,10 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (vcpu_has_ptrauth(vcpu))
vcpu_ptrauth_disable(vcpu);
+
+ if (kvm_arm_is_pvlock_preempted_ready(&vcpu->arch))
+ kvm_update_pvlock_preempted(vcpu, 0);
+
kvm_arch_vcpu_load_debug_state_flags(vcpu);
if (!cpumask_test_cpu(smp_processor_id(), vcpu->kvm->arch.supported_cpus))
@@ -445,6 +451,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
if (has_vhe())
kvm_vcpu_put_sysregs_vhe(vcpu);
kvm_timer_vcpu_put(vcpu);
+ if (kvm_arm_is_pvlock_preempted_ready(&vcpu->arch))
+ kvm_update_pvlock_preempted(vcpu, 1);
kvm_vgic_put(vcpu);
kvm_vcpu_pmu_restore_host(vcpu);
kvm_arm_vmid_clear_active();
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index c9f401fa01a9..ec85b4b2a272 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -116,6 +116,9 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu, u32 func_id)
case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
return test_bit(KVM_REG_ARM_VENDOR_HYP_BIT_PTP,
&smccc_feat->vendor_hyp_bmap);
+ case ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID:
+ return test_bit(KVM_REG_ARM_VENDOR_HYP_BIT_PV_LOCK,
+ &smccc_feat->vendor_hyp_bmap);
default:
return kvm_hvc_call_default_allowed(func_id);
}
@@ -201,6 +204,11 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
if (gpa != GPA_INVALID)
val[0] = gpa;
break;
+ case ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID:
+ gpa = kvm_init_pvlock(vcpu);
+ if (gpa != GPA_INVALID)
+ val[0] = gpa;
+ break;
case ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID:
val[0] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0;
val[1] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1;
diff --git a/arch/arm64/kvm/pvlock.c b/arch/arm64/kvm/pvlock.c
new file mode 100644
index 000000000000..228d24bfe281
--- /dev/null
+++ b/arch/arm64/kvm/pvlock.c
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright(c) 2019 Huawei Technologies Co., Ltd
+ * Author: Zengruan Ye <yezengruan@huawei.com>
+ * Usama Arif <usama.arif@bytedance.com>
+ */
+
+#include <linux/arm-smccc.h>
+#include <linux/kvm_host.h>
+
+#include <asm/pvlock-abi.h>
+
+#include <kvm/arm_hypercalls.h>
+
+gpa_t kvm_init_pvlock(struct kvm_vcpu *vcpu)
+{
+ struct pvlock_vcpu_state init_values = {};
+ struct kvm *kvm = vcpu->kvm;
+ u64 base = vcpu->arch.pv_lock.base;
+ int idx;
+
+ if (base == GPA_INVALID)
+ return base;
+
+ idx = srcu_read_lock(&kvm->srcu);
+ kvm_write_guest(kvm, base, &init_values, sizeof(init_values));
+ srcu_read_unlock(&kvm->srcu, idx);
+
+ return base;
+}
+
+void kvm_update_pvlock_preempted(struct kvm_vcpu *vcpu, u64 preempted)
+{
+ int idx;
+ u64 offset;
+ struct kvm *kvm = vcpu->kvm;
+ u64 base = vcpu->arch.pv_lock.base;
+
+ idx = srcu_read_lock(&kvm->srcu);
+ offset = offsetof(struct pvlock_vcpu_state, preempted);
+ kvm_put_guest(kvm, base + offset, cpu_to_le64(preempted));
+ srcu_read_unlock(&kvm->srcu, idx);
+}
diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
index 316917b98707..bd05ece5c590 100644
--- a/tools/arch/arm64/include/uapi/asm/kvm.h
+++ b/tools/arch/arm64/include/uapi/asm/kvm.h
@@ -365,6 +365,7 @@ enum {
enum {
KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT = 0,
KVM_REG_ARM_VENDOR_HYP_BIT_PTP = 1,
+ KVM_REG_ARM_VENDOR_HYP_BIT_PV_LOCK = 2,
#ifdef __KERNEL__
KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_COUNT,
#endif
--
2.25.1
* [v3 4/6] KVM: arm64: Provide VCPU attributes for PV lock
2023-01-17 10:29 [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
` (2 preceding siblings ...)
2023-01-17 10:29 ` [v3 3/6] KVM: arm64: Support pvlock preempted via shared structure Usama Arif
@ 2023-01-17 10:29 ` Usama Arif
2023-01-17 10:29 ` [v3 5/6] KVM: arm64: Support the VCPU preemption check Usama Arif
` (2 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Usama Arif @ 2023-01-17 10:29 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will, maz,
steven.price, mark.rutland, bagasdotme, pbonzini
Cc: fam.zheng, liangma, punit.agrawal, Usama Arif
Allow user space to inform the KVM host where in the physical memory
map the paravirtualized lock structures should be located.
User space can set an attribute on the VCPU providing the IPA base
address of the PV lock structure for that VCPU. This must be
repeated for every VCPU in the VM.
The address is given in terms of the physical address visible to
the guest and must be 64 byte aligned. The guest will discover the
address via a hypercall.
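
A minimal sketch of the user space side is below; it assumes a vcpu fd obtained
via the usual KVM ioctls and headers that carry the new PVLOCK defines from this
series, and is only an illustration (the real QEMU change is linked from the
cover letter):

  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* 'ipa' must be a 64-byte aligned guest physical address in a memslot. */
  static int set_pvlock_base(int vcpu_fd, uint64_t ipa)
  {
          struct kvm_device_attr attr = {
                  .group = KVM_ARM_VCPU_PVLOCK_CTRL,
                  .attr  = KVM_ARM_VCPU_PVLOCK_IPA,
                  .addr  = (uint64_t)&ipa,
          };

          /* On failure errno is ENXIO/EEXIST/EINVAL as documented in patch 1. */
          return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
  }
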
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
arch/arm64/include/asm/kvm_host.h | 7 ++++
arch/arm64/include/uapi/asm/kvm.h | 2 ++
arch/arm64/kvm/guest.c | 9 +++++
arch/arm64/kvm/pvlock.c | 57 +++++++++++++++++++++++++++++++
include/uapi/linux/kvm.h | 2 ++
5 files changed, 77 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 1d1acc5be8d7..5041b27dfcf2 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -913,6 +913,13 @@ int kvm_arm_pvtime_get_attr(struct kvm_vcpu *vcpu,
int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu,
struct kvm_device_attr *attr);
+int kvm_arm_pvlock_set_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr);
+int kvm_arm_pvlock_get_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr);
+int kvm_arm_pvlock_has_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr);
+
extern unsigned int kvm_arm_vmid_bits;
int kvm_arm_vmid_alloc_init(void);
void kvm_arm_vmid_alloc_free(void);
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 34dd6df3f8eb..d67816f193a6 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -413,6 +413,8 @@ enum {
#define KVM_ARM_VCPU_TIMER_IRQ_PTIMER 1
#define KVM_ARM_VCPU_PVTIME_CTRL 2
#define KVM_ARM_VCPU_PVTIME_IPA 0
+#define KVM_ARM_VCPU_PVLOCK_CTRL 3
+#define KVM_ARM_VCPU_PVLOCK_IPA 0
/* KVM_IRQ_LINE irq field index values */
#define KVM_ARM_IRQ_VCPU2_SHIFT 28
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5626ddb540ce..a1b0f39f24ba 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -959,6 +959,9 @@ int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
case KVM_ARM_VCPU_PVTIME_CTRL:
ret = kvm_arm_pvtime_set_attr(vcpu, attr);
break;
+ case KVM_ARM_VCPU_PVLOCK_CTRL:
+ ret = kvm_arm_pvlock_set_attr(vcpu, attr);
+ break;
default:
ret = -ENXIO;
break;
@@ -982,6 +985,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
case KVM_ARM_VCPU_PVTIME_CTRL:
ret = kvm_arm_pvtime_get_attr(vcpu, attr);
break;
+ case KVM_ARM_VCPU_PVLOCK_CTRL:
+ ret = kvm_arm_pvlock_get_attr(vcpu, attr);
+ break;
default:
ret = -ENXIO;
break;
@@ -1005,6 +1011,9 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
case KVM_ARM_VCPU_PVTIME_CTRL:
ret = kvm_arm_pvtime_has_attr(vcpu, attr);
break;
+ case KVM_ARM_VCPU_PVLOCK_CTRL:
+ ret = kvm_arm_pvlock_has_attr(vcpu, attr);
+ break;
default:
ret = -ENXIO;
break;
diff --git a/arch/arm64/kvm/pvlock.c b/arch/arm64/kvm/pvlock.c
index 228d24bfe281..cdd9749efd33 100644
--- a/arch/arm64/kvm/pvlock.c
+++ b/arch/arm64/kvm/pvlock.c
@@ -41,3 +41,60 @@ void kvm_update_pvlock_preempted(struct kvm_vcpu *vcpu, u64 preempted)
kvm_put_guest(kvm, base + offset, cpu_to_le64(preempted));
srcu_read_unlock(&kvm->srcu, idx);
}
+
+int kvm_arm_pvlock_set_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ u64 __user *user = (u64 __user *)attr->addr;
+ struct kvm *kvm = vcpu->kvm;
+ u64 ipa;
+ int ret = 0;
+ int idx;
+
+ if (attr->attr != KVM_ARM_VCPU_PVLOCK_IPA)
+ return -ENXIO;
+
+ if (get_user(ipa, user))
+ return -EFAULT;
+ if (!IS_ALIGNED(ipa, 64))
+ return -EINVAL;
+ if (vcpu->arch.pv_lock.base != GPA_INVALID)
+ return -EEXIST;
+
+ /* Check the address is in a valid memslot */
+ idx = srcu_read_lock(&kvm->srcu);
+ if (kvm_is_error_hva(gfn_to_hva(kvm, ipa >> PAGE_SHIFT)))
+ ret = -EINVAL;
+ srcu_read_unlock(&kvm->srcu, idx);
+
+ if (!ret)
+ vcpu->arch.pv_lock.base = ipa;
+
+ return ret;
+}
+
+int kvm_arm_pvlock_get_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ u64 __user *user = (u64 __user *)attr->addr;
+ u64 ipa;
+
+ if (attr->attr != KVM_ARM_VCPU_PVLOCK_IPA)
+ return -ENXIO;
+
+ ipa = vcpu->arch.pv_lock.base;
+
+ if (put_user(ipa, user))
+ return -EFAULT;
+ return 0;
+}
+
+int kvm_arm_pvlock_has_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ switch (attr->attr) {
+ case KVM_ARM_VCPU_PVLOCK_IPA:
+ return 0;
+ }
+ return -ENXIO;
+}
\ No newline at end of file
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 55155e262646..0d76b7034002 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1427,6 +1427,8 @@ enum kvm_device_type {
#define KVM_DEV_TYPE_XIVE KVM_DEV_TYPE_XIVE
KVM_DEV_TYPE_ARM_PV_TIME,
#define KVM_DEV_TYPE_ARM_PV_TIME KVM_DEV_TYPE_ARM_PV_TIME
+ KVM_DEV_TYPE_ARM_PV_LOCK,
+#define KVM_DEV_TYPE_ARM_PV_LOCK KVM_DEV_TYPE_ARM_PV_LOCK
KVM_DEV_TYPE_MAX,
};
--
2.25.1
* [v3 5/6] KVM: arm64: Support the VCPU preemption check
2023-01-17 10:29 [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
` (3 preceding siblings ...)
2023-01-17 10:29 ` [v3 4/6] KVM: arm64: Provide VCPU attributes for PV lock Usama Arif
@ 2023-01-17 10:29 ` Usama Arif
2023-01-17 10:29 ` [v3 6/6] KVM: selftests: add tests for PV lock specific hypercall Usama Arif
2023-02-14 16:06 ` [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
6 siblings, 0 replies; 10+ messages in thread
From: Usama Arif @ 2023-01-17 10:29 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will, maz,
steven.price, mark.rutland, bagasdotme, pbonzini
Cc: fam.zheng, liangma, punit.agrawal, Usama Arif
Support the vcpu_is_preempted() functionality under KVM/arm64. This will
enhance lock performance on overcommitted hosts (more runnable VCPUs
than physical CPUs in the system), as busy-waiting on preempted VCPUs
hurts system performance far more than yielding early.
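
For reference, this is roughly the fragment of osq_lock() that dominated the
perf profiles in the cover letter, simplified here purely for illustration; the
vcpu_is_preempted() call is what now resolves to the PV check on KVM guests:

  /* Simplified spin-wait from kernel/locking/osq_lock.c (illustration only) */
  while (!READ_ONCE(node->locked)) {
          /*
           * Stop busy-waiting if we need to reschedule, or if the CPU we
           * are queued behind has had its vCPU scheduled out by the host.
           */
          if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
                  goto unqueue;

          cpu_relax();
  }
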
Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
arch/arm64/include/asm/paravirt.h | 2 +
arch/arm64/include/asm/spinlock.h | 16 ++++-
arch/arm64/kernel/paravirt.c | 113 ++++++++++++++++++++++++++++++
3 files changed, 130 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h
index 9aa193e0e8f2..4ccb4356c56b 100644
--- a/arch/arm64/include/asm/paravirt.h
+++ b/arch/arm64/include/asm/paravirt.h
@@ -19,10 +19,12 @@ static inline u64 paravirt_steal_clock(int cpu)
}
int __init pv_time_init(void);
+int __init pv_lock_init(void);
#else
#define pv_time_init() do {} while (0)
+#define pv_lock_init() do {} while (0)
#endif // CONFIG_PARAVIRT
diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index 0525c0b089ed..7023efa4de96 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -10,7 +10,20 @@
/* See include/linux/spinlock.h */
#define smp_mb__after_spinlock() smp_mb()
+#define vcpu_is_preempted vcpu_is_preempted
+
+#ifdef CONFIG_PARAVIRT
+#include <linux/static_call_types.h>
+
+bool dummy_vcpu_is_preempted(int cpu);
+DECLARE_STATIC_CALL(pv_vcpu_is_preempted, dummy_vcpu_is_preempted);
+static inline bool vcpu_is_preempted(int cpu)
+{
+ return static_call(pv_vcpu_is_preempted)(cpu);
+}
+
+#else
/*
* Changing this will break osq_lock() thanks to the call inside
* smp_cond_load_relaxed().
@@ -18,10 +31,11 @@
* See:
* https://lore.kernel.org/lkml/20200110100612.GC2827@hirez.programming.kicks-ass.net
*/
-#define vcpu_is_preempted vcpu_is_preempted
static inline bool vcpu_is_preempted(int cpu)
{
return false;
}
+#endif /* CONFIG_PARAVIRT */
+
#endif /* __ASM_SPINLOCK_H */
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c
index aa718d6a9274..c56d701db1bb 100644
--- a/arch/arm64/kernel/paravirt.c
+++ b/arch/arm64/kernel/paravirt.c
@@ -20,8 +20,10 @@
#include <linux/types.h>
#include <linux/static_call.h>
+#include <asm/hypervisor.h>
#include <asm/paravirt.h>
#include <asm/pvclock-abi.h>
+#include <asm/pvlock-abi.h>
#include <asm/smp_plat.h>
struct static_key paravirt_steal_enabled;
@@ -38,7 +40,12 @@ struct pv_time_stolen_time_region {
struct pvclock_vcpu_stolen_time __rcu *kaddr;
};
+struct pv_lock_state_region {
+ struct pvlock_vcpu_state __rcu *kaddr;
+};
+
static DEFINE_PER_CPU(struct pv_time_stolen_time_region, stolen_time_region);
+static DEFINE_PER_CPU(struct pv_lock_state_region, lock_state_region);
static bool steal_acc = true;
static int __init parse_no_stealacc(char *arg)
@@ -174,3 +181,109 @@ int __init pv_time_init(void)
return 0;
}
+
+static bool native_vcpu_is_preempted(int cpu)
+{
+ return false;
+}
+
+DEFINE_STATIC_CALL(pv_vcpu_is_preempted, native_vcpu_is_preempted);
+
+static bool para_vcpu_is_preempted(int cpu)
+{
+ struct pv_lock_state_region *reg;
+ __le64 preempted_le;
+
+ reg = per_cpu_ptr(&lock_state_region, cpu);
+ if (!reg->kaddr) {
+ pr_warn_once("PV lock enabled but not configured for cpu %d\n",
+ cpu);
+ return false;
+ }
+
+ preempted_le = le64_to_cpu(READ_ONCE(reg->kaddr->preempted));
+
+ return !!(preempted_le);
+}
+
+static int pvlock_vcpu_state_dying_cpu(unsigned int cpu)
+{
+ struct pv_lock_state_region *reg;
+
+ reg = this_cpu_ptr(&lock_state_region);
+ if (!reg->kaddr)
+ return 0;
+
+ memunmap(reg->kaddr);
+ memset(reg, 0, sizeof(*reg));
+
+ return 0;
+}
+
+static int init_pvlock_vcpu_state(unsigned int cpu)
+{
+ struct pv_lock_state_region *reg;
+ struct arm_smccc_res res;
+
+ reg = this_cpu_ptr(&lock_state_region);
+
+ arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID, &res);
+
+ if (res.a0 == SMCCC_RET_NOT_SUPPORTED) {
+ pr_warn("Failed to init PV lock data structure\n");
+ return -EINVAL;
+ }
+
+ reg->kaddr = memremap(res.a0,
+ sizeof(struct pvlock_vcpu_state),
+ MEMREMAP_WB);
+
+ if (!reg->kaddr) {
+ pr_warn("Failed to map PV lock data structure\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static int kvm_arm_init_pvlock(void)
+{
+ int ret;
+
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "hypervisor/arm/pvlock:online",
+ init_pvlock_vcpu_state,
+ pvlock_vcpu_state_dying_cpu);
+ if (ret < 0) {
+ pr_warn("PV-lock init failed\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static bool has_kvm_pvlock(void)
+{
+ return kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_PV_LOCK);
+}
+
+int __init pv_lock_init(void)
+{
+ int ret;
+
+ if (is_hyp_mode_available())
+ return 0;
+
+ if (!has_kvm_pvlock())
+ return 0;
+
+ ret = kvm_arm_init_pvlock();
+ if (ret)
+ return ret;
+
+ static_call_update(pv_vcpu_is_preempted, para_vcpu_is_preempted);
+ pr_info("using PV-lock preempted\n");
+
+ return 0;
+}
+early_initcall(pv_lock_init);
--
2.25.1
* [v3 6/6] KVM: selftests: add tests for PV lock specific hypercall
2023-01-17 10:29 [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
` (4 preceding siblings ...)
2023-01-17 10:29 ` [v3 5/6] KVM: arm64: Support the VCPU preemption check Usama Arif
@ 2023-01-17 10:29 ` Usama Arif
2023-02-14 16:06 ` [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
6 siblings, 0 replies; 10+ messages in thread
From: Usama Arif @ 2023-01-17 10:29 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will, maz,
steven.price, mark.rutland, bagasdotme, pbonzini
Cc: fam.zheng, liangma, punit.agrawal, Usama Arif
This is a vendor specific hypercall.
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
tools/testing/selftests/kvm/aarch64/hypercalls.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/kvm/aarch64/hypercalls.c b/tools/testing/selftests/kvm/aarch64/hypercalls.c
index bef1499fb465..375bcc4126d5 100644
--- a/tools/testing/selftests/kvm/aarch64/hypercalls.c
+++ b/tools/testing/selftests/kvm/aarch64/hypercalls.c
@@ -78,6 +78,8 @@ static const struct test_hvc_info hvc_info[] = {
TEST_HVC_INFO(ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID,
ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID),
TEST_HVC_INFO(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, 0),
+ TEST_HVC_INFO(ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID,
+ ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID),
TEST_HVC_INFO(ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID, KVM_PTP_VIRT_COUNTER),
};
--
2.25.1
* Re: [v3 1/6] KVM: arm64: Document PV-lock interface
2023-01-17 10:29 ` [v3 1/6] KVM: arm64: Document PV-lock interface Usama Arif
@ 2023-01-18 13:29 ` Bagas Sanjaya
0 siblings, 0 replies; 10+ messages in thread
From: Bagas Sanjaya @ 2023-01-18 13:29 UTC (permalink / raw)
To: Usama Arif, linux-kernel, linux-arm-kernel, kvmarm, kvm,
linux-doc, virtualization, linux, yezengruan, catalin.marinas,
will, maz, steven.price, mark.rutland, pbonzini
Cc: fam.zheng, liangma, punit.agrawal
On Tue, Jan 17, 2023 at 10:29:25AM +0000, Usama Arif wrote:
> Introduce a paravirtualization interface for KVM/arm64 to obtain whether
> the VCPU is currently running or not.
>
> The PV lock structure of the guest is allocated by user space.
>
> A hypercall interface is provided for the guest to interrogate the
> location of the shared memory structures.
>
The doc LGTM, thanks.
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
--
An old man doll... just what I always wanted! - Clara
* Re: [v3 0/6] KVM: arm64: implement vcpu_is_preempted check
2023-01-17 10:29 [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
` (5 preceding siblings ...)
2023-01-17 10:29 ` [v3 6/6] KVM: selftests: add tests for PV lock specific hypercall Usama Arif
@ 2023-02-14 16:06 ` Usama Arif
2023-02-14 16:49 ` Marc Zyngier
6 siblings, 1 reply; 10+ messages in thread
From: Usama Arif @ 2023-02-14 16:06 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will, maz,
steven.price, mark.rutland, bagasdotme, pbonzini
Cc: fam.zheng, liangma, punit.agrawal
On 17/01/2023 10:29, Usama Arif wrote:
> This patchset adds support for vcpu_is_preempted on arm64, which allows the guest
> to check whether a vCPU has been scheduled out; this is useful to know in case the
> vCPU was holding a lock. vcpu_is_preempted is well integrated into core kernel code
> and can be used to improve performance in locking (owner_on_cpu usage in
> mutex_spin_on_owner, mutex_can_spin_on_owner, rtmutex_spin_on_owner and osq_lock)
> and scheduling (available_idle_cpu, which is used in several places in
> kernel/sched/fair.c, e.g. in wake_affine to determine which CPU can run soonest).
>
> This patchset shows a significant improvement on overcommitted hosts (vCPUs > pCPUs),
> as busy-waiting on preempted vCPUs reduces performance.
>
Hi,
Just wanted to check if there are any comments for this?
Thanks,
Usama
> If merged, vcpu_is_preempted could also be used to optimize IPI performance (along
> with directed yield to the target IPI vCPU), similar to how it is done on x86
> (https://lore.kernel.org/all/1560255830-8656-2-git-send-email-wanpengli@tencent.com/)
>
> All the results in the below experiments are done on an aws r6g.metal instance
> which has 64 pCPUs.
>
> The following table shows the index results of UnixBench running on a 128 vCPU VM
> with (6.0+vcpu_is_preempted) and without (6.0 base) the patchset.
> TestName 6.0 base 6.0+vcpu_is_preempted % improvement for vcpu_is_preempted
> Dhrystone 2 using register variables 187761 191274.7 1.871368389
> Double-Precision Whetstone 96743.6 98414.4 1.727039308
> Execl Throughput 689.3 10426 1412.548963
> File Copy 1024 bufsize 2000 maxblocks 549.5 3165 475.978162
> File Copy 256 bufsize 500 maxblocks 400.7 2084.7 420.2645371
> File Copy 4096 bufsize 8000 maxblocks 894.3 5003.2 459.4543218
> Pipe Throughput 76819.5 78601.5 2.319723508
> Pipe-based Context Switching 3444.8 13414.5 289.4130283
> Process Creation 301.1 293.4 -2.557289937
> Shell Scripts (1 concurrent) 1248.1 28300.6 2167.494592
> Shell Scripts (8 concurrent) 781.2 26222.3 3256.669227
> System Call Overhead 3426 3729.4 8.855808523
>
> System Benchmarks Index Score 3053 11534 277.7923354
>
> This shows a 278% overall improvement using these patches.
>
> The biggest improvement is in the shell scripts benchmark, which forks a lot of
> processes. Forking takes the mmap rwsem for writing, and a large chunk of time in
> the base kernel is spent contending for it. This can be seen in one of the call
> stacks from the perf output of the shell scripts benchmark on the base kernel
> (pseudo-NMI was enabled for the perf numbers below):
> - 33.79% el0_svc
> - 33.43% do_el0_svc
> - 33.43% el0_svc_common.constprop.3
> - 33.30% invoke_syscall
> - 17.27% __arm64_sys_clone
> - 17.27% __do_sys_clone
> - 17.26% kernel_clone
> - 16.73% copy_process
> - 11.91% dup_mm
> - 11.82% dup_mmap
> - 9.15% down_write
> - 8.87% rwsem_down_write_slowpath
> - 8.48% osq_lock
>
> Just under 50% of the total time in the shell script benchmarks ends up being
> spent in osq_lock in the base kernel:
> Children Self Command Shared Object Symbol
> 17.19% 10.71% sh [kernel.kallsyms] [k] osq_lock
> 6.17% 4.04% sort [kernel.kallsyms] [k] osq_lock
> 4.20% 2.60% multi. [kernel.kallsyms] [k] osq_lock
> 3.77% 2.47% grep [kernel.kallsyms] [k] osq_lock
> 3.50% 2.24% expr [kernel.kallsyms] [k] osq_lock
> 3.41% 2.23% od [kernel.kallsyms] [k] osq_lock
> 3.36% 2.15% rm [kernel.kallsyms] [k] osq_lock
> 3.28% 2.12% tee [kernel.kallsyms] [k] osq_lock
> 3.16% 2.02% wc [kernel.kallsyms] [k] osq_lock
> 0.21% 0.13% looper [kernel.kallsyms] [k] osq_lock
> 0.01% 0.00% Run [kernel.kallsyms] [k] osq_lock
>
> and this comes down to less than 1% total with 6.0+vcpu_is_preempted kernel:
> Children Self Command Shared Object Symbol
> 0.26% 0.21% sh [kernel.kallsyms] [k] osq_lock
> 0.10% 0.08% multi. [kernel.kallsyms] [k] osq_lock
> 0.04% 0.04% sort [kernel.kallsyms] [k] osq_lock
> 0.02% 0.01% grep [kernel.kallsyms] [k] osq_lock
> 0.02% 0.02% od [kernel.kallsyms] [k] osq_lock
> 0.01% 0.01% tee [kernel.kallsyms] [k] osq_lock
> 0.01% 0.00% expr [kernel.kallsyms] [k] osq_lock
> 0.01% 0.01% looper [kernel.kallsyms] [k] osq_lock
> 0.00% 0.00% wc [kernel.kallsyms] [k] osq_lock
> 0.00% 0.00% rm [kernel.kallsyms] [k] osq_lock
>
> To make sure there is no change in performance when vCPUs < pCPUs, UnixBench
> was also run on a 32 vCPU VM. The kernel with vcpu_is_preempted implemented
> performed 0.9% better overall than the base kernel, and the individual benchmarks
> were within +/-2% of 6.0 base.
> Hence the patches have no negative effect when vCPUs < pCPUs.
>
> The respective QEMU change to test this is at
> https://github.com/uarif1/qemu/commit/2da2c2927ae8de8f03f439804a0dad9cf68501b6.
>
> Looking forward to your response!
> Thanks,
> Usama
> ---
> v2->v3
> - Updated the patchset from 6.0 to 6.2-rc3
> - Made pv_lock_init an early_initcall
> - Improved documentation
> - Changed pvlock_vcpu_state to aligned struct
> - Minor improvements
>
> RFC->v2
> - Fixed table and code referencing in pvlock documentation
> - Switched to using a single hypercall similar to ptp_kvm and made check
> for has_kvm_pvlock simpler
>
> Usama Arif (6):
> KVM: arm64: Document PV-lock interface
> KVM: arm64: Add SMCCC paravirtualised lock calls
> KVM: arm64: Support pvlock preempted via shared structure
> KVM: arm64: Provide VCPU attributes for PV lock
> KVM: arm64: Support the VCPU preemption check
> KVM: selftests: add tests for PV lock specific hypercall
>
> Documentation/virt/kvm/arm/hypercalls.rst | 3 +
> Documentation/virt/kvm/arm/index.rst | 1 +
> Documentation/virt/kvm/arm/pvlock.rst | 54 +++++++++
> Documentation/virt/kvm/devices/vcpu.rst | 25 ++++
> arch/arm64/include/asm/kvm_host.h | 25 ++++
> arch/arm64/include/asm/paravirt.h | 2 +
> arch/arm64/include/asm/pvlock-abi.h | 15 +++
> arch/arm64/include/asm/spinlock.h | 16 ++-
> arch/arm64/include/uapi/asm/kvm.h | 3 +
> arch/arm64/kernel/paravirt.c | 113 ++++++++++++++++++
> arch/arm64/kvm/Makefile | 2 +-
> arch/arm64/kvm/arm.c | 8 ++
> arch/arm64/kvm/guest.c | 9 ++
> arch/arm64/kvm/hypercalls.c | 8 ++
> arch/arm64/kvm/pvlock.c | 100 ++++++++++++++++
> include/linux/arm-smccc.h | 8 ++
> include/uapi/linux/kvm.h | 2 +
> tools/arch/arm64/include/uapi/asm/kvm.h | 1 +
> tools/include/linux/arm-smccc.h | 8 ++
> .../selftests/kvm/aarch64/hypercalls.c | 2 +
> 20 files changed, 403 insertions(+), 2 deletions(-)
> create mode 100644 Documentation/virt/kvm/arm/pvlock.rst
> create mode 100644 arch/arm64/include/asm/pvlock-abi.h
> create mode 100644 arch/arm64/kvm/pvlock.c
>
* Re: [v3 0/6] KVM: arm64: implement vcpu_is_preempted check
2023-02-14 16:06 ` [v3 0/6] KVM: arm64: implement vcpu_is_preempted check Usama Arif
@ 2023-02-14 16:49 ` Marc Zyngier
0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2023-02-14 16:49 UTC (permalink / raw)
To: Usama Arif
Cc: linux-kernel, linux-arm-kernel, kvmarm, kvm, linux-doc,
virtualization, linux, yezengruan, catalin.marinas, will,
steven.price, mark.rutland, bagasdotme, pbonzini, fam.zheng,
liangma, punit.agrawal
On Tue, 14 Feb 2023 16:06:26 +0000,
Usama Arif <usama.arif@bytedance.com> wrote:
>
>
>
> On 17/01/2023 10:29, Usama Arif wrote:
> > This patchset adds support for vcpu_is_preempted in arm64, which allows the guest
> > to check if a vcpu was scheduled out, which is useful to know incase it was
> > holding a lock. vcpu_is_preempted is well integrated in core kernel code and can
> > be used to improve performance in locking (owner_on_cpu usage in mutex_spin_on_owner,
> > mutex_can_spin_on_owner, rtmutex_spin_on_owner and osq_lock) and scheduling
> > (available_idle_cpu which is used in several places in kernel/sched/fair.c
> > for e.g. in wake_affine to determine which CPU can run soonest).
> >
> > This patchset shows significant improvement on overcommitted hosts (vCPUs > pCPUS),
> > as waiting for preempted vCPUs reduces performance.
> >
>
> Hi,
>
> Just wanted to check if there are any comments for this?
Not a lot, I'm afraid. My concerns with this thing are still the same:
- it is KVM-specific
- it doesn't work with nested virtualisation
- its correctness is unproven on arm64
I'm also not going to entertain any of this without the core arm64
maintainers saying that they will enable this.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.