From: Marc Zyngier <maz@kernel.org>
To: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org
Cc: James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Oliver Upton <oliver.upton@linux.dev>,
Zenghui Yu <yuzenghui@huawei.com>,
Joey Gouly <joey.gouly@arm.com>,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
Xu Zhao <zhaoxu.35@bytedance.com>,
Eric Auger <eric.auger@redhat.com>
Subject: [PATCH v3 08/11] KVM: arm64: Build MPIDR to vcpu index cache at runtime
Date: Wed, 27 Sep 2023 10:09:08 +0100 [thread overview]
Message-ID: <20230927090911.3355209-9-maz@kernel.org> (raw)
In-Reply-To: <20230927090911.3355209-1-maz@kernel.org>
The MPIDR_EL1 register contains a unique value that identifies
the CPU. The only problem with it is that it is stupidly large
(32 bits, once the useless stuff is removed).
Trying to obtain a vcpu from an MPIDR value is a fairly common,
yet costly operation: we iterate over all the vcpus until we
find the correct one. While this is cheap for small VMs, it is
pretty expensive on large ones, specially if you are trying to
get to the one that's at the end of the list...
In order to help with this, it is important to realise that
the MPIDR values are actually structured, and that implementations
tend to use a small number of significant bits in the 32bit space.
We can use this fact to our advantage by computing a small hash
table that uses the "compression" of the significant MPIDR bits
as an index, giving us the vcpu index as a result.
Given that the MPIDR values can be supplied by userspace, and
that an evil VMM could decide to make *all* bits significant,
resulting in a 4G-entry table, we only use this method if the
resulting table fits in a single page. Otherwise, we fallback
to the good old iterative method.
Nothing uses that table just yet, but keep your eyes peeled.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Tested-by: Joey Gouly <joey.gouly@arm.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/include/asm/kvm_host.h | 28 ++++++++++++++++
arch/arm64/kvm/arm.c | 54 +++++++++++++++++++++++++++++++
2 files changed, 82 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index af06ccb7ee34..4b5a57a60b6a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -202,6 +202,31 @@ struct kvm_protected_vm {
struct kvm_hyp_memcache teardown_mc;
};
+struct kvm_mpidr_data {
+ u64 mpidr_mask;
+ DECLARE_FLEX_ARRAY(u16, cmpidr_to_idx);
+};
+
+static inline u16 kvm_mpidr_index(struct kvm_mpidr_data *data, u64 mpidr)
+{
+ unsigned long mask = data->mpidr_mask;
+ u64 aff = mpidr & MPIDR_HWID_BITMASK;
+ int nbits, bit, bit_idx = 0;
+ u16 index = 0;
+
+ /*
+ * If this looks like RISC-V's BEXT or x86's PEXT
+ * instructions, it isn't by accident.
+ */
+ nbits = fls(mask);
+ for_each_set_bit(bit, &mask, nbits) {
+ index |= (aff & BIT(bit)) >> (bit - bit_idx);
+ bit_idx++;
+ }
+
+ return index;
+}
+
struct kvm_arch {
struct kvm_s2_mmu mmu;
@@ -248,6 +273,9 @@ struct kvm_arch {
/* VM-wide vCPU feature set */
DECLARE_BITMAP(vcpu_features, KVM_VCPU_MAX_FEATURES);
+ /* MPIDR to vcpu index mapping, optional */
+ struct kvm_mpidr_data *mpidr_data;
+
/*
* VM-wide PMU filter, implemented as a bitmap and big enough for
* up to 2^10 events (ARMv8.0) or 2^16 events (ARMv8.1+).
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 9379a1227501..b02e28f76083 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -205,6 +205,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
if (is_protected_kvm_enabled())
pkvm_destroy_hyp_vm(kvm);
+ kfree(kvm->arch.mpidr_data);
kvm_destroy_vcpus(kvm);
kvm_unshare_hyp(kvm, kvm + 1);
@@ -578,6 +579,57 @@ static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
return vcpu_get_flag(vcpu, VCPU_INITIALIZED);
}
+static void kvm_init_mpidr_data(struct kvm *kvm)
+{
+ struct kvm_mpidr_data *data = NULL;
+ unsigned long c, mask, nr_entries;
+ u64 aff_set = 0, aff_clr = ~0UL;
+ struct kvm_vcpu *vcpu;
+
+ mutex_lock(&kvm->arch.config_lock);
+
+ if (kvm->arch.mpidr_data || atomic_read(&kvm->online_vcpus) == 1)
+ goto out;
+
+ kvm_for_each_vcpu(c, vcpu, kvm) {
+ u64 aff = kvm_vcpu_get_mpidr_aff(vcpu);
+ aff_set |= aff;
+ aff_clr &= aff;
+ }
+
+ /*
+ * A significant bit can be either 0 or 1, and will only appear in
+ * aff_set. Use aff_clr to weed out the useless stuff.
+ */
+ mask = aff_set ^ aff_clr;
+ nr_entries = BIT_ULL(hweight_long(mask));
+
+ /*
+ * Don't let userspace fool us. If we need more than a single page
+ * to describe the compressed MPIDR array, just fall back to the
+ * iterative method. Single vcpu VMs do not need this either.
+ */
+ if (struct_size(data, cmpidr_to_idx, nr_entries) <= PAGE_SIZE)
+ data = kzalloc(struct_size(data, cmpidr_to_idx, nr_entries),
+ GFP_KERNEL_ACCOUNT);
+
+ if (!data)
+ goto out;
+
+ data->mpidr_mask = mask;
+
+ kvm_for_each_vcpu(c, vcpu, kvm) {
+ u64 aff = kvm_vcpu_get_mpidr_aff(vcpu);
+ u16 index = kvm_mpidr_index(data, aff);
+
+ data->cmpidr_to_idx[index] = c;
+ }
+
+ kvm->arch.mpidr_data = data;
+out:
+ mutex_unlock(&kvm->arch.config_lock);
+}
+
/*
* Handle both the initialisation that is being done when the vcpu is
* run for the first time, as well as the updates that must be
@@ -601,6 +653,8 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
if (likely(vcpu_has_run_once(vcpu)))
return 0;
+ kvm_init_mpidr_data(kvm);
+
kvm_arm_vcpu_init_debug(vcpu);
if (likely(irqchip_in_kernel(kvm))) {
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2023-09-27 9:10 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-27 9:09 [PATCH v3 00/11] KVM: arm64: Accelerate lookup of vcpus by MPIDR values (and other fixes) Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 01/11] KVM: arm64: vgic: Make kvm_vgic_inject_irq() take a vcpu pointer Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 02/11] KVM: arm64: vgic-its: Treat the collection target address as a vcpu_id Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 03/11] KVM: arm64: vgic-v3: Refactor GICv3 SGI generation Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 04/11] KVM: arm64: vgic-v2: Use cpuid from userspace as vcpu_id Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 05/11] KVM: arm64: vgic: Use vcpu_idx for the debug information Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 06/11] KVM: arm64: Use vcpu_idx for invalidation tracking Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 07/11] KVM: arm64: Simplify kvm_vcpu_get_mpidr_aff() Marc Zyngier
2023-09-27 9:09 ` Marc Zyngier [this message]
2023-09-27 9:09 ` [PATCH v3 09/11] KVM: arm64: Fast-track kvm_mpidr_to_vcpu() when mpidr_data is available Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 10/11] KVM: arm64: vgic-v3: Optimize affinity-based SGI injection Marc Zyngier
2023-09-27 9:09 ` [PATCH v3 11/11] KVM: arm64: Clarify the ordering requirements for vcpu/RD creation Marc Zyngier
2023-09-30 18:27 ` [PATCH v3 00/11] KVM: arm64: Accelerate lookup of vcpus by MPIDR values (and other fixes) Oliver Upton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230927090911.3355209-9-maz@kernel.org \
--to=maz@kernel.org \
--cc=eric.auger@redhat.com \
--cc=james.morse@arm.com \
--cc=joey.gouly@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=oliver.upton@linux.dev \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=suzuki.poulose@arm.com \
--cc=yuzenghui@huawei.com \
--cc=zhaoxu.35@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).