* [PATCH v2 0/6] x86/kvm: add boot parameters for max vcpu configs
@ 2021-09-03 13:08 Juergen Gross
2021-09-03 13:08 ` [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits Juergen Gross
2021-09-03 13:08 ` [PATCH v2 6/6] x86/kvm: add boot parameter for setting max number of vcpus per guest Juergen Gross
0 siblings, 2 replies; 10+ messages in thread
From: Juergen Gross @ 2021-09-03 13:08 UTC (permalink / raw)
To: kvm, x86, linux-kernel, linux-doc, linux-arm-kernel
Cc: maz, ehabkost, Juergen Gross, Paolo Bonzini, Sean Christopherson,
Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
Jonathan Corbet, James Morse, Alexandru Elisei, Suzuki K Poulose,
Catalin Marinas, Will Deacon, kvmarm
In order to support even huge numbers of vcpus per guest with a single
kernel, some arrays should be sized dynamically. The easiest way to do
that is to add boot parameters for the maximum number of vcpus and to
calculate the maximum vcpu-id from that, using either the host topology
or a topology hint given via another boot parameter.
This patch series does that for x86. The same scheme can easily be
adapted to other architectures, but I don't want to do that in the
first iteration.
In the long term I'd suggest having a per-guest setting of the two
parameters, allowing some memory to be spared for smaller guests. OTOH
this would require new ioctl()s and corresponding qemu modifications,
so I have left those out for now.
I've verified that the series doesn't break normal guest operation and
that the new parameters are effective on x86. For Arm64 I did a compile
test only.
Changes in V2:
- removed old patch 1, as already applied
- patch 1 (old patch 2) only for reference, as the patch is already in
the kvm tree
- switch patch 2 (old patch 3) to calculate vcpu-id
- added patch 4
Juergen Gross (6):
x86/kvm: remove non-x86 stuff from arch/x86/kvm/ioapic.h
x86/kvm: add boot parameter for adding vcpu-id bits
x86/kvm: introduce per cpu vcpu masks
kvm: use kvfree() in kvm_arch_free_vm()
kvm: allocate vcpu pointer array separately
x86/kvm: add boot parameter for setting max number of vcpus per guest
.../admin-guide/kernel-parameters.txt | 25 ++++++
arch/arm64/include/asm/kvm_host.h | 1 -
arch/arm64/kvm/arm.c | 23 ++++--
arch/x86/include/asm/kvm_host.h | 26 +++++--
arch/x86/kvm/hyperv.c | 25 ++++--
arch/x86/kvm/ioapic.c | 12 ++-
arch/x86/kvm/ioapic.h | 8 +-
arch/x86/kvm/irq_comm.c | 9 ++-
arch/x86/kvm/x86.c | 78 ++++++++++++++++++-
include/linux/kvm_host.h | 26 ++++++-
10 files changed, 198 insertions(+), 35 deletions(-)
--
2.26.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits
2021-09-03 13:08 [PATCH v2 0/6] x86/kvm: add boot parameters for max vcpu configs Juergen Gross
@ 2021-09-03 13:08 ` Juergen Gross
2021-09-03 13:43 ` Vitaly Kuznetsov
` (2 more replies)
2021-09-03 13:08 ` [PATCH v2 6/6] x86/kvm: add boot parameter for setting max number of vcpus per guest Juergen Gross
1 sibling, 3 replies; 10+ messages in thread
From: Juergen Gross @ 2021-09-03 13:08 UTC (permalink / raw)
To: kvm, x86, linux-doc, linux-kernel
Cc: maz, ehabkost, Juergen Gross, Jonathan Corbet, Paolo Bonzini,
Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H. Peter Anvin
Today the maximum vcpu-id of a kvm guest's vcpu on x86 systems is set
via a #define in a header file.
In order to support higher vcpu-ids without generally increasing the
memory consumption of guests on the host (some guest structures contain
arrays sized by KVM_MAX_VCPU_ID), add a boot parameter for adding some
bits to the vcpu-id. Additional bits are needed as the vcpu-id is
constructed via bit-wise concatenation of socket-id, core-id, etc.
As those ids' maximum values are not always a power of 2, the vcpu-ids
are sparse.
The additional number of bits needed is basically the number of
topology levels with a non-power-of-2 maximum value, excluding the top
most level.
By default the new parameter takes the correct setting from the host's
topology.
Calculating the maximum vcpu-id dynamically requires allocating the
arrays that are sized by KVM_MAX_VCPU_ID dynamically.
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- switch to specifying additional bits (based on comment by Vitaly
Kuznetsov)
Signed-off-by: Juergen Gross <jgross@suse.com>
---
.../admin-guide/kernel-parameters.txt | 18 ++++++++++++
arch/x86/include/asm/kvm_host.h | 4 ++-
arch/x86/kvm/ioapic.c | 12 +++++++-
arch/x86/kvm/ioapic.h | 4 +--
arch/x86/kvm/x86.c | 29 +++++++++++++++++++
5 files changed, 63 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 84dc5790741b..37e194299311 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2435,6 +2435,24 @@
feature (tagged TLBs) on capable Intel chips.
Default is 1 (enabled)
+ kvm.vcpu_id_add_bits=
+ [KVM,X86] The vcpu-ids of guests are sparse, as they
+ are constructed by bit-wise concatenation of the ids of
+ the different topology levels (sockets, cores, threads).
+
+ This parameter specifies how many additional bits the
+ maximum vcpu-id needs compared to the maximum number of
+ vcpus.
+
+ Normally this value is the number of topology levels
+ without the threads level and without the highest
+ level.
+
+ The special value -1 can be used to support guests
+ with the same topology as the host.
+
+ Default: -1
+
l1d_flush= [X86,INTEL]
Control mitigation for L1D based snooping vulnerability.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index af6ce8d4c86a..3513edee8e22 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -39,7 +39,7 @@
#define KVM_MAX_VCPUS 288
#define KVM_SOFT_MAX_VCPUS 240
-#define KVM_MAX_VCPU_ID 1023
+#define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
/* memory slots that are not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 3
@@ -1588,6 +1588,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
extern u64 kvm_default_tsc_scaling_ratio;
/* bus lock detection supported? */
extern bool kvm_has_bus_lock_exit;
+/* maximum vcpu-id */
+unsigned int kvm_max_vcpu_id(void);
extern u64 kvm_mce_cap_supported;
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index ff005fe738a4..52e8ea90c914 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -685,11 +685,21 @@ static const struct kvm_io_device_ops ioapic_mmio_ops = {
int kvm_ioapic_init(struct kvm *kvm)
{
struct kvm_ioapic *ioapic;
+ size_t sz;
int ret;
- ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL_ACCOUNT);
+ sz = sizeof(struct kvm_ioapic) +
+ sizeof(*ioapic->rtc_status.dest_map.map) *
+ BITS_TO_LONGS(KVM_MAX_VCPU_ID + 1) +
+ sizeof(*ioapic->rtc_status.dest_map.vectors) *
+ (KVM_MAX_VCPU_ID + 1);
+ ioapic = kzalloc(sz, GFP_KERNEL_ACCOUNT);
if (!ioapic)
return -ENOMEM;
+ ioapic->rtc_status.dest_map.map = (void *)(ioapic + 1);
+ ioapic->rtc_status.dest_map.vectors =
+ (void *)(ioapic->rtc_status.dest_map.map +
+ BITS_TO_LONGS(KVM_MAX_VCPU_ID + 1));
spin_lock_init(&ioapic->lock);
INIT_DELAYED_WORK(&ioapic->eoi_inject, kvm_ioapic_eoi_inject_work);
kvm->arch.vioapic = ioapic;
diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index bbd4a5d18b5d..623a3c5afad7 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -39,13 +39,13 @@ struct kvm_vcpu;
struct dest_map {
/* vcpu bitmap where IRQ has been sent */
- DECLARE_BITMAP(map, KVM_MAX_VCPU_ID + 1);
+ unsigned long *map;
/*
* Vector sent to a given vcpu, only valid when
* the vcpu's bit in map is set
*/
- u8 vectors[KVM_MAX_VCPU_ID + 1];
+ u8 *vectors;
};
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e5d5c5ed7dd4..6b6f38f0b617 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -78,6 +78,7 @@
#include <asm/intel_pt.h>
#include <asm/emulate_prefix.h>
#include <asm/sgx.h>
+#include <asm/topology.h>
#include <clocksource/hyperv_timer.h>
#define CREATE_TRACE_POINTS
@@ -184,6 +185,34 @@ module_param(force_emulation_prefix, bool, S_IRUGO);
int __read_mostly pi_inject_timer = -1;
module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
+static int __read_mostly vcpu_id_add_bits = -1;
+module_param(vcpu_id_add_bits, int, S_IRUGO);
+
+unsigned int kvm_max_vcpu_id(void)
+{
+ int n_bits = fls(KVM_MAX_VCPUS - 1);
+
+ if (vcpu_id_add_bits < -1 || vcpu_id_add_bits > (32 - n_bits)) {
+ pr_err("Invalid value of vcpu_id_add_bits=%d parameter!\n",
+ vcpu_id_add_bits);
+ vcpu_id_add_bits = -1;
+ }
+
+ if (vcpu_id_add_bits >= 0) {
+ n_bits += vcpu_id_add_bits;
+ } else {
+ n_bits++; /* One additional bit for core level. */
+ if (topology_max_die_per_package() > 1)
+ n_bits++; /* One additional bit for die level. */
+ }
+
+ if (!n_bits)
+ n_bits = 1;
+
+ return (1U << n_bits) - 1;
+}
+EXPORT_SYMBOL_GPL(kvm_max_vcpu_id);
+
/*
* Restoring the host value for MSRs that are only consumed when running in
* usermode, e.g. SYSCALL MSRs and TSC_AUX, can be deferred until the CPU
--
2.26.2
* [PATCH v2 6/6] x86/kvm: add boot parameter for setting max number of vcpus per guest
2021-09-03 13:08 [PATCH v2 0/6] x86/kvm: add boot parameters for max vcpu configs Juergen Gross
2021-09-03 13:08 ` [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits Juergen Gross
@ 2021-09-03 13:08 ` Juergen Gross
2021-09-06 0:45 ` Yao Yuan
1 sibling, 1 reply; 10+ messages in thread
From: Juergen Gross @ 2021-09-03 13:08 UTC (permalink / raw)
To: kvm, x86, linux-doc, linux-kernel
Cc: maz, ehabkost, Juergen Gross, Jonathan Corbet, Paolo Bonzini,
Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H. Peter Anvin
Today the maximum number of vcpus of a kvm guest is set via a #define
in a header file.
In order to support higher vcpu numbers for guests without generally
increasing the memory consumption of guests on the host, especially on
very large systems, add a boot parameter for specifying the number of
allowed vcpus for guests.
The default will still be the current setting of 288. The value 0 has
the special meaning of limiting the number of possible vcpus to the
number of possible cpus of the host.
Signed-off-by: Juergen Gross <jgross@suse.com>
---
Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
arch/x86/include/asm/kvm_host.h | 5 ++++-
arch/x86/kvm/x86.c | 9 ++++++++-
3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 37e194299311..b9641c9989ef 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2435,6 +2435,13 @@
feature (tagged TLBs) on capable Intel chips.
Default is 1 (enabled)
+ kvm.max_vcpus= [KVM,X86] Set the maximum allowed number of vcpus per
+ guest. The special value 0 sets the limit to the number
+ of physical cpus possible on the host (including not
+ yet hotplugged cpus). Higher values will result in
+ slightly higher memory consumption per guest.
+ Default: 288
+
kvm.vcpu_id_add_bits=
[KVM,X86] The vcpu-ids of guests are sparse, as they
are constructed by bit-wise concatenation of the ids of
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6c28d0800208..a4ab387b0e1c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -38,7 +38,8 @@
#define __KVM_HAVE_ARCH_VCPU_DEBUGFS
-#define KVM_MAX_VCPUS 288
+#define KVM_DEFAULT_MAX_VCPUS 288
+#define KVM_MAX_VCPUS max_vcpus
#define KVM_SOFT_MAX_VCPUS 240
#define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
/* memory slots that are not exposed to userspace */
@@ -1588,6 +1589,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
extern u64 kvm_default_tsc_scaling_ratio;
/* bus lock detection supported? */
extern bool kvm_has_bus_lock_exit;
+/* maximum number of vcpus per guest */
+extern unsigned int max_vcpus;
/* maximum vcpu-id */
unsigned int kvm_max_vcpu_id(void);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ff142b6dd00c..49c3d91c559e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -188,9 +188,13 @@ module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
static int __read_mostly vcpu_id_add_bits = -1;
module_param(vcpu_id_add_bits, int, S_IRUGO);
+unsigned int __read_mostly max_vcpus = KVM_DEFAULT_MAX_VCPUS;
+module_param(max_vcpus, uint, S_IRUGO);
+EXPORT_SYMBOL_GPL(max_vcpus);
+
unsigned int kvm_max_vcpu_id(void)
{
- int n_bits = fls(KVM_MAX_VCPUS - 1);
+ int n_bits = fls(max_vcpus - 1);
if (vcpu_id_add_bits < -1 || vcpu_id_add_bits > (32 - n_bits)) {
pr_err("Invalid value of vcpu_id_add_bits=%d parameter!\n",
@@ -11033,6 +11037,9 @@ int kvm_arch_hardware_setup(void *opaque)
if (boot_cpu_has(X86_FEATURE_XSAVES))
rdmsrl(MSR_IA32_XSS, host_xss);
+ if (max_vcpus == 0)
+ max_vcpus = num_possible_cpus();
+
kvm_pcpu_vcpu_mask = __alloc_percpu(KVM_VCPU_MASK_SZ,
sizeof(unsigned long));
kvm_hv_vp_bitmap = __alloc_percpu(KVM_HV_VPMAP_SZ, sizeof(u64));
--
2.26.2
* Re: [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits
2021-09-03 13:08 ` [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits Juergen Gross
@ 2021-09-03 13:43 ` Vitaly Kuznetsov
2021-09-03 13:53 ` Juergen Gross
2021-09-03 19:48 ` Eduardo Habkost
2021-09-28 16:41 ` Paolo Bonzini
2 siblings, 1 reply; 10+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-03 13:43 UTC (permalink / raw)
To: Juergen Gross, kvm, x86, linux-doc, linux-kernel
Cc: maz, ehabkost, Juergen Gross, Jonathan Corbet, Paolo Bonzini,
Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin
Juergen Gross <jgross@suse.com> writes:
> Today the maximum vcpu-id of a kvm guest's vcpu on x86 systems is set
> via a #define in a header file.
>
> In order to support higher vcpu-ids without generally increasing the
> memory consumption of guests on the host (some guest structures contain
> arrays sized by KVM_MAX_VCPU_ID) add a boot parameter for adding some
> bits to the vcpu-id. Additional bits are needed as the vcpu-id is
> constructed via bit-wise concatenation of socket-id, core-id, etc.
> As those ids' maximum values are not always a power of 2, the vcpu-ids
> are sparse.
>
> The additional number of bits needed is basically the number of
> topology levels with a non-power-of-2 maximum value, excluding the top
> most level.
>
> The default value of the new parameter will be to take the correct
> setting from the host's topology.
>
> Calculating the maximum vcpu-id dynamically requires to allocate the
> arrays using KVM_MAX_VCPU_ID as the size dynamically.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> V2:
> - switch to specifying additional bits (based on comment by Vitaly
> Kuznetsov)
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> .../admin-guide/kernel-parameters.txt | 18 ++++++++++++
> arch/x86/include/asm/kvm_host.h | 4 ++-
> arch/x86/kvm/ioapic.c | 12 +++++++-
> arch/x86/kvm/ioapic.h | 4 +--
> arch/x86/kvm/x86.c | 29 +++++++++++++++++++
> 5 files changed, 63 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 84dc5790741b..37e194299311 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2435,6 +2435,24 @@
> feature (tagged TLBs) on capable Intel chips.
> Default is 1 (enabled)
>
> + kvm.vcpu_id_add_bits=
> + [KVM,X86] The vcpu-ids of guests are sparse, as they
> + are constructed by bit-wise concatenation of the ids of
> + the different topology levels (sockets, cores, threads).
> +
> + This parameter specifies how many additional bits the
> + maximum vcpu-id needs compared to the maximum number of
> + vcpus.
> +
> + Normally this value is the number of topology levels
> + without the threads level and without the highest
> + level.
> +
> + The special value -1 can be used to support guests
> + with the same topology as the host.
> +
> + Default: -1
> +
> l1d_flush= [X86,INTEL]
> Control mitigation for L1D based snooping vulnerability.
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index af6ce8d4c86a..3513edee8e22 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -39,7 +39,7 @@
>
> #define KVM_MAX_VCPUS 288
> #define KVM_SOFT_MAX_VCPUS 240
> -#define KVM_MAX_VCPU_ID 1023
> +#define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
> /* memory slots that are not exposed to userspace */
> #define KVM_PRIVATE_MEM_SLOTS 3
>
> @@ -1588,6 +1588,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
> extern u64 kvm_default_tsc_scaling_ratio;
> /* bus lock detection supported? */
> extern bool kvm_has_bus_lock_exit;
> +/* maximum vcpu-id */
> +unsigned int kvm_max_vcpu_id(void);
>
> extern u64 kvm_mce_cap_supported;
>
> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
> index ff005fe738a4..52e8ea90c914 100644
> --- a/arch/x86/kvm/ioapic.c
> +++ b/arch/x86/kvm/ioapic.c
> @@ -685,11 +685,21 @@ static const struct kvm_io_device_ops ioapic_mmio_ops = {
> int kvm_ioapic_init(struct kvm *kvm)
> {
> struct kvm_ioapic *ioapic;
> + size_t sz;
> int ret;
>
> - ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL_ACCOUNT);
> + sz = sizeof(struct kvm_ioapic) +
> + sizeof(*ioapic->rtc_status.dest_map.map) *
> + BITS_TO_LONGS(KVM_MAX_VCPU_ID + 1) +
> + sizeof(*ioapic->rtc_status.dest_map.vectors) *
> + (KVM_MAX_VCPU_ID + 1);
> + ioapic = kzalloc(sz, GFP_KERNEL_ACCOUNT);
> if (!ioapic)
> return -ENOMEM;
> + ioapic->rtc_status.dest_map.map = (void *)(ioapic + 1);
> + ioapic->rtc_status.dest_map.vectors =
> + (void *)(ioapic->rtc_status.dest_map.map +
> + BITS_TO_LONGS(KVM_MAX_VCPU_ID + 1));
> spin_lock_init(&ioapic->lock);
> INIT_DELAYED_WORK(&ioapic->eoi_inject, kvm_ioapic_eoi_inject_work);
> kvm->arch.vioapic = ioapic;
> diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
> index bbd4a5d18b5d..623a3c5afad7 100644
> --- a/arch/x86/kvm/ioapic.h
> +++ b/arch/x86/kvm/ioapic.h
> @@ -39,13 +39,13 @@ struct kvm_vcpu;
>
> struct dest_map {
> /* vcpu bitmap where IRQ has been sent */
> - DECLARE_BITMAP(map, KVM_MAX_VCPU_ID + 1);
> + unsigned long *map;
>
> /*
> * Vector sent to a given vcpu, only valid when
> * the vcpu's bit in map is set
> */
> - u8 vectors[KVM_MAX_VCPU_ID + 1];
> + u8 *vectors;
> };
>
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e5d5c5ed7dd4..6b6f38f0b617 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -78,6 +78,7 @@
> #include <asm/intel_pt.h>
> #include <asm/emulate_prefix.h>
> #include <asm/sgx.h>
> +#include <asm/topology.h>
> #include <clocksource/hyperv_timer.h>
>
> #define CREATE_TRACE_POINTS
> @@ -184,6 +185,34 @@ module_param(force_emulation_prefix, bool, S_IRUGO);
> int __read_mostly pi_inject_timer = -1;
> module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
>
> +static int __read_mostly vcpu_id_add_bits = -1;
> +module_param(vcpu_id_add_bits, int, S_IRUGO);
> +
> +unsigned int kvm_max_vcpu_id(void)
> +{
> + int n_bits = fls(KVM_MAX_VCPUS - 1);
> +
> + if (vcpu_id_add_bits < -1 || vcpu_id_add_bits > (32 - n_bits)) {
> + pr_err("Invalid value of vcpu_id_add_bits=%d parameter!\n",
> + vcpu_id_add_bits);
> + vcpu_id_add_bits = -1;
> + }
> +
> + if (vcpu_id_add_bits >= 0) {
> + n_bits += vcpu_id_add_bits;
> + } else {
> + n_bits++; /* One additional bit for core level. */
> + if (topology_max_die_per_package() > 1)
> + n_bits++; /* One additional bit for die level. */
This assumes topology_max_die_per_package() can not be greater than 2,
or 1 additional bit may not suffice, right?
> + }
> +
> + if (!n_bits)
> + n_bits = 1;
Nitpick: AFAIU n_bits can't be zero here as KVM_MAX_VCPUS is still
static. The last patch of the series, however, makes it possible when
max_vcpus = 1 and vcpu_id_add_bits = 0. With this, I'd suggest to move
the check to the last patch.
> +
> + return (1U << n_bits) - 1;
> +}
> +EXPORT_SYMBOL_GPL(kvm_max_vcpu_id);
> +
> /*
> * Restoring the host value for MSRs that are only consumed when running in
> * usermode, e.g. SYSCALL MSRs and TSC_AUX, can be deferred until the CPU
--
Vitaly
* Re: [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits
2021-09-03 13:43 ` Vitaly Kuznetsov
@ 2021-09-03 13:53 ` Juergen Gross
0 siblings, 0 replies; 10+ messages in thread
From: Juergen Gross @ 2021-09-03 13:53 UTC (permalink / raw)
To: Vitaly Kuznetsov, kvm, x86, linux-doc, linux-kernel
Cc: maz, ehabkost, Jonathan Corbet, Paolo Bonzini,
Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin
On 03.09.21 15:43, Vitaly Kuznetsov wrote:
> Juergen Gross <jgross@suse.com> writes:
>
>> Today the maximum vcpu-id of a kvm guest's vcpu on x86 systems is set
>> via a #define in a header file.
>>
>> In order to support higher vcpu-ids without generally increasing the
>> memory consumption of guests on the host (some guest structures contain
>> arrays sized by KVM_MAX_VCPU_ID) add a boot parameter for adding some
>> bits to the vcpu-id. Additional bits are needed as the vcpu-id is
>> constructed via bit-wise concatenation of socket-id, core-id, etc.
>> As those ids' maximum values are not always a power of 2, the vcpu-ids
>> are sparse.
>>
>> The additional number of bits needed is basically the number of
>> topology levels with a non-power-of-2 maximum value, excluding the top
>> most level.
>>
>> The default value of the new parameter will be to take the correct
>> setting from the host's topology.
>>
>> Calculating the maximum vcpu-id dynamically requires to allocate the
>> arrays using KVM_MAX_VCPU_ID as the size dynamically.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> V2:
>> - switch to specifying additional bits (based on comment by Vitaly
>> Kuznetsov)
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> .../admin-guide/kernel-parameters.txt | 18 ++++++++++++
>> arch/x86/include/asm/kvm_host.h | 4 ++-
>> arch/x86/kvm/ioapic.c | 12 +++++++-
>> arch/x86/kvm/ioapic.h | 4 +--
>> arch/x86/kvm/x86.c | 29 +++++++++++++++++++
>> 5 files changed, 63 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 84dc5790741b..37e194299311 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -2435,6 +2435,24 @@
>> feature (tagged TLBs) on capable Intel chips.
>> Default is 1 (enabled)
>>
>> + kvm.vcpu_id_add_bits=
>> + [KVM,X86] The vcpu-ids of guests are sparse, as they
>> + are constructed by bit-wise concatenation of the ids of
>> + the different topology levels (sockets, cores, threads).
>> +
>> + This parameter specifies how many additional bits the
>> + maximum vcpu-id needs compared to the maximum number of
>> + vcpus.
>> +
>> + Normally this value is the number of topology levels
>> + without the threads level and without the highest
>> + level.
>> +
>> + The special value -1 can be used to support guests
>> + with the same topology as the host.
>> +
>> + Default: -1
>> +
>> l1d_flush= [X86,INTEL]
>> Control mitigation for L1D based snooping vulnerability.
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index af6ce8d4c86a..3513edee8e22 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -39,7 +39,7 @@
>>
>> #define KVM_MAX_VCPUS 288
>> #define KVM_SOFT_MAX_VCPUS 240
>> -#define KVM_MAX_VCPU_ID 1023
>> +#define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
>> /* memory slots that are not exposed to userspace */
>> #define KVM_PRIVATE_MEM_SLOTS 3
>>
>> @@ -1588,6 +1588,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
>> extern u64 kvm_default_tsc_scaling_ratio;
>> /* bus lock detection supported? */
>> extern bool kvm_has_bus_lock_exit;
>> +/* maximum vcpu-id */
>> +unsigned int kvm_max_vcpu_id(void);
>>
>> extern u64 kvm_mce_cap_supported;
>>
>> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
>> index ff005fe738a4..52e8ea90c914 100644
>> --- a/arch/x86/kvm/ioapic.c
>> +++ b/arch/x86/kvm/ioapic.c
>> @@ -685,11 +685,21 @@ static const struct kvm_io_device_ops ioapic_mmio_ops = {
>> int kvm_ioapic_init(struct kvm *kvm)
>> {
>> struct kvm_ioapic *ioapic;
>> + size_t sz;
>> int ret;
>>
>> - ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL_ACCOUNT);
>> + sz = sizeof(struct kvm_ioapic) +
>> + sizeof(*ioapic->rtc_status.dest_map.map) *
>> + BITS_TO_LONGS(KVM_MAX_VCPU_ID + 1) +
>> + sizeof(*ioapic->rtc_status.dest_map.vectors) *
>> + (KVM_MAX_VCPU_ID + 1);
>> + ioapic = kzalloc(sz, GFP_KERNEL_ACCOUNT);
>> if (!ioapic)
>> return -ENOMEM;
>> + ioapic->rtc_status.dest_map.map = (void *)(ioapic + 1);
>> + ioapic->rtc_status.dest_map.vectors =
>> + (void *)(ioapic->rtc_status.dest_map.map +
>> + BITS_TO_LONGS(KVM_MAX_VCPU_ID + 1));
>> spin_lock_init(&ioapic->lock);
>> INIT_DELAYED_WORK(&ioapic->eoi_inject, kvm_ioapic_eoi_inject_work);
>> kvm->arch.vioapic = ioapic;
>> diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
>> index bbd4a5d18b5d..623a3c5afad7 100644
>> --- a/arch/x86/kvm/ioapic.h
>> +++ b/arch/x86/kvm/ioapic.h
>> @@ -39,13 +39,13 @@ struct kvm_vcpu;
>>
>> struct dest_map {
>> /* vcpu bitmap where IRQ has been sent */
>> - DECLARE_BITMAP(map, KVM_MAX_VCPU_ID + 1);
>> + unsigned long *map;
>>
>> /*
>> * Vector sent to a given vcpu, only valid when
>> * the vcpu's bit in map is set
>> */
>> - u8 vectors[KVM_MAX_VCPU_ID + 1];
>> + u8 *vectors;
>> };
>>
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index e5d5c5ed7dd4..6b6f38f0b617 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -78,6 +78,7 @@
>> #include <asm/intel_pt.h>
>> #include <asm/emulate_prefix.h>
>> #include <asm/sgx.h>
>> +#include <asm/topology.h>
>> #include <clocksource/hyperv_timer.h>
>>
>> #define CREATE_TRACE_POINTS
>> @@ -184,6 +185,34 @@ module_param(force_emulation_prefix, bool, S_IRUGO);
>> int __read_mostly pi_inject_timer = -1;
>> module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
>>
>> +static int __read_mostly vcpu_id_add_bits = -1;
>> +module_param(vcpu_id_add_bits, int, S_IRUGO);
>> +
>> +unsigned int kvm_max_vcpu_id(void)
>> +{
>> + int n_bits = fls(KVM_MAX_VCPUS - 1);
>> +
>> + if (vcpu_id_add_bits < -1 || vcpu_id_add_bits > (32 - n_bits)) {
>> + pr_err("Invalid value of vcpu_id_add_bits=%d parameter!\n",
>> + vcpu_id_add_bits);
>> + vcpu_id_add_bits = -1;
>> + }
>> +
>> + if (vcpu_id_add_bits >= 0) {
>> + n_bits += vcpu_id_add_bits;
>> + } else {
>> + n_bits++; /* One additional bit for core level. */
>> + if (topology_max_die_per_package() > 1)
>> + n_bits++; /* One additional bit for die level. */
>
> This assumes topology_max_die_per_package() can not be greater than 2,
> or 1 additional bit may not suffice, right?
No. Each topology level can add at most one additional bit. This
mechanism assumes that each level consumes no more bits than necessary,
so with e.g. a core count of 18 per die, 5 bits are used and not more.
>
>> + }
>> +
>> + if (!n_bits)
>> + n_bits = 1;
>
> Nitpick: AFAIU n_bits can't be zero here as KVM_MAX_VCPUS is still
> static. The last patch of the series, however, makes it possible when
> max_vcpus = 1 and vcpu_id_add_bits = 0. With this, I'd suggest to move
> the check to the last patch.
This is true only if no downstream has a patch setting KVM_MAX_VCPUS to
1. I'd rather be safe than sorry here, especially as it would be very
easy to miss this dependency.
Juergen
* Re: [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits
2021-09-03 13:08 ` [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits Juergen Gross
2021-09-03 13:43 ` Vitaly Kuznetsov
@ 2021-09-03 19:48 ` Eduardo Habkost
2021-09-06 4:46 ` Juergen Gross
2021-09-28 16:41 ` Paolo Bonzini
2 siblings, 1 reply; 10+ messages in thread
From: Eduardo Habkost @ 2021-09-03 19:48 UTC (permalink / raw)
To: Juergen Gross
Cc: kvm, x86, linux-doc, linux-kernel, maz, Jonathan Corbet,
Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H. Peter Anvin
On Fri, Sep 03, 2021 at 03:08:03PM +0200, Juergen Gross wrote:
> Today the maximum vcpu-id of a kvm guest's vcpu on x86 systems is set
> via a #define in a header file.
>
> In order to support higher vcpu-ids without generally increasing the
> memory consumption of guests on the host (some guest structures contain
> arrays sized by KVM_MAX_VCPU_ID) add a boot parameter for adding some
> bits to the vcpu-id. Additional bits are needed as the vcpu-id is
> constructed via bit-wise concatenation of socket-id, core-id, etc.
> As those ids' maximum values are not always a power of 2, the vcpu-ids
> are sparse.
>
> The additional number of bits needed is basically the number of
> topology levels with a non-power-of-2 maximum value, excluding the top
> most level.
>
> The default value of the new parameter will be to take the correct
> setting from the host's topology.
Having the default depend on the host topology makes the host
behaviour unpredictable (which might be a problem when migrating
VMs from another host with a different topology). Can't we just
default to 2?
>
> Calculating the maximum vcpu-id dynamically requires to allocate the
> arrays using KVM_MAX_VCPU_ID as the size dynamically.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> V2:
> - switch to specifying additional bits (based on comment by Vitaly
> Kuznetsov)
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
[...]
> #define KVM_MAX_VCPUS 288
> #define KVM_SOFT_MAX_VCPUS 240
> -#define KVM_MAX_VCPU_ID 1023
> +#define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
[...]
> +unsigned int kvm_max_vcpu_id(void)
> +{
> + int n_bits = fls(KVM_MAX_VCPUS - 1);
> +
> + if (vcpu_id_add_bits < -1 || vcpu_id_add_bits > (32 - n_bits)) {
> + pr_err("Invalid value of vcpu_id_add_bits=%d parameter!\n",
> + vcpu_id_add_bits);
> + vcpu_id_add_bits = -1;
> + }
> +
> + if (vcpu_id_add_bits >= 0) {
> + n_bits += vcpu_id_add_bits;
> + } else {
> + n_bits++; /* One additional bit for core level. */
> + if (topology_max_die_per_package() > 1)
> + n_bits++; /* One additional bit for die level. */
> + }
> +
> + if (!n_bits)
> + n_bits = 1;
> +
> + return (1U << n_bits) - 1;
The largest possible VCPU ID is not KVM_MAX_VCPU_ID,
it's (KVM_MAX_VCPU_ID - 1). This is enforced by
kvm_vm_ioctl_create_vcpu().
That would mean KVM_MAX_VCPU_ID should be (1 << n_bits) instead
of ((1 << n_bits) - 1), wouldn't it?
> +}
> +EXPORT_SYMBOL_GPL(kvm_max_vcpu_id);
> +
> /*
> * Restoring the host value for MSRs that are only consumed when running in
> * usermode, e.g. SYSCALL MSRs and TSC_AUX, can be deferred until the CPU
> --
> 2.26.2
>
--
Eduardo
* Re: [PATCH v2 6/6] x86/kvm: add boot parameter for setting max number of vcpus per guest
2021-09-03 13:08 ` [PATCH v2 6/6] x86/kvm: add boot parameter for setting max number of vcpus per guest Juergen Gross
@ 2021-09-06 0:45 ` Yao Yuan
2021-09-06 4:47 ` Juergen Gross
0 siblings, 1 reply; 10+ messages in thread
From: Yao Yuan @ 2021-09-06 0:45 UTC (permalink / raw)
To: Juergen Gross
Cc: kvm, x86, linux-doc, linux-kernel, maz, ehabkost, Jonathan Corbet,
Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H. Peter Anvin
On Fri, Sep 03, 2021 at 03:08:07PM +0200, Juergen Gross wrote:
> Today the maximum number of vcpus of a kvm guest is set via a #define
> in a header file.
>
> In order to support higher vcpu numbers for guests without generally
> increasing the memory consumption of guests on the host, especially on
> very large systems, add a boot parameter for specifying the number of
> allowed vcpus for guests.
>
> The default will still be the current setting of 288. The value 0 has
> the special meaning to limit the number of possible vcpus to the
> number of possible cpus of the host.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
> arch/x86/include/asm/kvm_host.h | 5 ++++-
> arch/x86/kvm/x86.c | 9 ++++++++-
> 3 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 37e194299311..b9641c9989ef 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2435,6 +2435,13 @@
> feature (tagged TLBs) on capable Intel chips.
> Default is 1 (enabled)
>
> + kvm.max_vcpus= [KVM,X86] Set the maximum allowed number of vcpus per
> + guest. The special value 0 sets the limit to the number
> + of physical cpus possible on the host (including not
> + yet hotplugged cpus). Higher values will result in
> + slightly higher memory consumption per guest.
> + Default: 288
> +
> kvm.vcpu_id_add_bits=
> [KVM,X86] The vcpu-ids of guests are sparse, as they
> are constructed by bit-wise concatenation of the ids of
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6c28d0800208..a4ab387b0e1c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -38,7 +38,8 @@
>
> #define __KVM_HAVE_ARCH_VCPU_DEBUGFS
>
> -#define KVM_MAX_VCPUS 288
> +#define KVM_DEFAULT_MAX_VCPUS 288
> +#define KVM_MAX_VCPUS max_vcpus
> #define KVM_SOFT_MAX_VCPUS 240
> #define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
> /* memory slots that are not exposed to userspace */
> @@ -1588,6 +1589,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
> extern u64 kvm_default_tsc_scaling_ratio;
> /* bus lock detection supported? */
> extern bool kvm_has_bus_lock_exit;
> +/* maximum number of vcpus per guest */
> +extern unsigned int max_vcpus;
> /* maximum vcpu-id */
> unsigned int kvm_max_vcpu_id(void);
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ff142b6dd00c..49c3d91c559e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -188,9 +188,13 @@ module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
> static int __read_mostly vcpu_id_add_bits = -1;
> module_param(vcpu_id_add_bits, int, S_IRUGO);
>
> +unsigned int __read_mostly max_vcpus = KVM_DEFAULT_MAX_VCPUS;
> +module_param(max_vcpus, uint, S_IRUGO);
> +EXPORT_SYMBOL_GPL(max_vcpus);
> +
> unsigned int kvm_max_vcpu_id(void)
> {
> - int n_bits = fls(KVM_MAX_VCPUS - 1);
> + int n_bits = fls(max_vcpus - 1);
A question here: the parameter "vcpu_id_add_bits" also depends
on "max_vcpus", but we can't calculate "vcpu_id_add_bits" from
"max_vcpus" because KVM has no topological knowledge to determine
the bits needed for each socket/core/thread level, right?
>
> if (vcpu_id_add_bits < -1 || vcpu_id_add_bits > (32 - n_bits)) {
> pr_err("Invalid value of vcpu_id_add_bits=%d parameter!\n",
> @@ -11033,6 +11037,9 @@ int kvm_arch_hardware_setup(void *opaque)
> if (boot_cpu_has(X86_FEATURE_XSAVES))
> rdmsrl(MSR_IA32_XSS, host_xss);
>
> + if (max_vcpus == 0)
> + max_vcpus = num_possible_cpus();
> +
> kvm_pcpu_vcpu_mask = __alloc_percpu(KVM_VCPU_MASK_SZ,
> sizeof(unsigned long));
> kvm_hv_vp_bitmap = __alloc_percpu(KVM_HV_VPMAP_SZ, sizeof(u64));
> --
> 2.26.2
>
* Re: [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits
2021-09-03 19:48 ` Eduardo Habkost
@ 2021-09-06 4:46 ` Juergen Gross
0 siblings, 0 replies; 10+ messages in thread
From: Juergen Gross @ 2021-09-06 4:46 UTC (permalink / raw)
To: Eduardo Habkost
Cc: kvm, x86, linux-doc, linux-kernel, maz, Jonathan Corbet,
Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H. Peter Anvin
On 03.09.21 21:48, Eduardo Habkost wrote:
> On Fri, Sep 03, 2021 at 03:08:03PM +0200, Juergen Gross wrote:
>> Today the maximum vcpu-id of a kvm guest's vcpu on x86 systems is set
>> via a #define in a header file.
>>
>> In order to support higher vcpu-ids without generally increasing the
>> memory consumption of guests on the host (some guest structures contain
>> arrays sized by KVM_MAX_VCPU_ID) add a boot parameter for adding some
>> bits to the vcpu-id. Additional bits are needed as the vcpu-id is
>> constructed via bit-wise concatenation of socket-id, core-id, etc.
>> As those ids' maximum values are not always a power of 2, the vcpu-ids
>> are sparse.
>>
>> The additional number of bits needed is basically the number of
>> topology levels with a non-power-of-2 maximum value, excluding the top
>> most level.
>>
>> The default value of the new parameter will be to take the correct
>> setting from the host's topology.
>
> Having the default depend on the host topology makes the host
> behaviour unpredictable (which might be a problem when migrating
> VMs from another host with a different topology). Can't we just
> default to 2?
Okay, fine with me.
>
>>
>> Calculating the maximum vcpu-id dynamically requires the arrays sized
>> by KVM_MAX_VCPU_ID to be allocated dynamically.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> V2:
>> - switch to specifying additional bits (based on comment by Vitaly
>> Kuznetsov)
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
> [...]
>> #define KVM_MAX_VCPUS 288
>> #define KVM_SOFT_MAX_VCPUS 240
>> -#define KVM_MAX_VCPU_ID 1023
>> +#define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
> [...]
>> +unsigned int kvm_max_vcpu_id(void)
>> +{
>> + int n_bits = fls(KVM_MAX_VCPUS - 1);
>> +
>> + if (vcpu_id_add_bits < -1 || vcpu_id_add_bits > (32 - n_bits)) {
>> + pr_err("Invalid value of vcpu_id_add_bits=%d parameter!\n",
>> + vcpu_id_add_bits);
>> + vcpu_id_add_bits = -1;
>> + }
>> +
>> + if (vcpu_id_add_bits >= 0) {
>> + n_bits += vcpu_id_add_bits;
>> + } else {
>> + n_bits++; /* One additional bit for core level. */
>> + if (topology_max_die_per_package() > 1)
>> + n_bits++; /* One additional bit for die level. */
>> + }
>> +
>> + if (!n_bits)
>> + n_bits = 1;
>> +
>> + return (1U << n_bits) - 1;
>
> The largest possible VCPU ID is not KVM_MAX_VCPU_ID,
> it's (KVM_MAX_VCPU_ID - 1). This is enforced by
> kvm_vm_ioctl_create_vcpu().
>
> That would mean KVM_MAX_VCPU_ID should be (1 << n_bits) instead
> of ((1 << n_bits) - 1), wouldn't it?
Oh, indeed. I have been fooled by the IMO bad naming of this macro.
The current value 1023 suggests I'm not the only one who has been fooled.
Shouldn't it be named "KVM_MAX_VCPU_IDS" instead?
Juergen
* Re: [PATCH v2 6/6] x86/kvm: add boot parameter for setting max number of vcpus per guest
2021-09-06 0:45 ` Yao Yuan
@ 2021-09-06 4:47 ` Juergen Gross
0 siblings, 0 replies; 10+ messages in thread
From: Juergen Gross @ 2021-09-06 4:47 UTC (permalink / raw)
To: Yao Yuan
Cc: kvm, x86, linux-doc, linux-kernel, maz, ehabkost, Jonathan Corbet,
Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, H. Peter Anvin
On 06.09.21 02:45, Yao Yuan wrote:
> On Fri, Sep 03, 2021 at 03:08:07PM +0200, Juergen Gross wrote:
>> Today the maximum number of vcpus of a kvm guest is set via a #define
>> in a header file.
>>
>> In order to support higher vcpu numbers for guests without generally
>> increasing the memory consumption of guests on the host, especially on
>> very large systems, add a boot parameter for specifying the number of
>> allowed vcpus for guests.
>>
>> The default will still be the current setting of 288. The value 0 has
>> the special meaning to limit the number of possible vcpus to the
>> number of possible cpus of the host.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
>> arch/x86/include/asm/kvm_host.h | 5 ++++-
>> arch/x86/kvm/x86.c | 9 ++++++++-
>> 3 files changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 37e194299311..b9641c9989ef 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -2435,6 +2435,13 @@
>> feature (tagged TLBs) on capable Intel chips.
>> Default is 1 (enabled)
>>
>> + kvm.max_vcpus= [KVM,X86] Set the maximum allowed number of vcpus per
>> + guest. The special value 0 sets the limit to the number
>> + of physical cpus possible on the host (including not
>> + yet hotplugged cpus). Higher values will result in
>> + slightly higher memory consumption per guest.
>> + Default: 288
>> +
>> kvm.vcpu_id_add_bits=
>> [KVM,X86] The vcpu-ids of guests are sparse, as they
>> are constructed by bit-wise concatenation of the ids of
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 6c28d0800208..a4ab387b0e1c 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -38,7 +38,8 @@
>>
>> #define __KVM_HAVE_ARCH_VCPU_DEBUGFS
>>
>> -#define KVM_MAX_VCPUS 288
>> +#define KVM_DEFAULT_MAX_VCPUS 288
>> +#define KVM_MAX_VCPUS max_vcpus
>> #define KVM_SOFT_MAX_VCPUS 240
>> #define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
>> /* memory slots that are not exposed to userspace */
>> @@ -1588,6 +1589,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
>> extern u64 kvm_default_tsc_scaling_ratio;
>> /* bus lock detection supported? */
>> extern bool kvm_has_bus_lock_exit;
>> +/* maximum number of vcpus per guest */
>> +extern unsigned int max_vcpus;
>> /* maximum vcpu-id */
>> unsigned int kvm_max_vcpu_id(void);
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index ff142b6dd00c..49c3d91c559e 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -188,9 +188,13 @@ module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
>> static int __read_mostly vcpu_id_add_bits = -1;
>> module_param(vcpu_id_add_bits, int, S_IRUGO);
>>
>> +unsigned int __read_mostly max_vcpus = KVM_DEFAULT_MAX_VCPUS;
>> +module_param(max_vcpus, uint, S_IRUGO);
>> +EXPORT_SYMBOL_GPL(max_vcpus);
>> +
>> unsigned int kvm_max_vcpu_id(void)
>> {
>> - int n_bits = fls(KVM_MAX_VCPUS - 1);
>> + int n_bits = fls(max_vcpus - 1);
>
> A question here: the parameter "vcpu_id_add_bits" also depends
> on "max_vcpus", but we can't calculate "vcpu_id_add_bits" from
> "max_vcpus" because KVM has no topological knowledge to determine
> the bits needed for each socket/core/thread level, right?
Correct.
Juergen
* Re: [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits
2021-09-03 13:08 ` [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits Juergen Gross
2021-09-03 13:43 ` Vitaly Kuznetsov
2021-09-03 19:48 ` Eduardo Habkost
@ 2021-09-28 16:41 ` Paolo Bonzini
2 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2021-09-28 16:41 UTC (permalink / raw)
To: Juergen Gross, kvm, x86, linux-doc, linux-kernel
Cc: maz, ehabkost, Jonathan Corbet, Sean Christopherson,
Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin
On 03/09/21 15:08, Juergen Gross wrote:
> + if (vcpu_id_add_bits >= 0) {
> + n_bits += vcpu_id_add_bits;
> + } else {
> + n_bits++; /* One additional bit for core level. */
> + if (topology_max_die_per_package() > 1)
> + n_bits++; /* One additional bit for die level. */
This needs to be unconditional since it is always possible to emulate a
multiple-die-per-package topology for a guest, even if the host has just
one.
Paolo
> + }
> +
> + if (!n_bits)
end of thread, other threads: [~2021-09-28 16:41 UTC | newest]
Thread overview: 10+ messages
2021-09-03 13:08 [PATCH v2 0/6] x86/kvm: add boot parameters for max vcpu configs Juergen Gross
2021-09-03 13:08 ` [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits Juergen Gross
2021-09-03 13:43 ` Vitaly Kuznetsov
2021-09-03 13:53 ` Juergen Gross
2021-09-03 19:48 ` Eduardo Habkost
2021-09-06 4:46 ` Juergen Gross
2021-09-28 16:41 ` Paolo Bonzini
2021-09-03 13:08 ` [PATCH v2 6/6] x86/kvm: add boot parameter for setting max number of vcpus per guest Juergen Gross
2021-09-06 0:45 ` Yao Yuan
2021-09-06 4:47 ` Juergen Gross