* [PATCH v4] KVM: s390: Use ESCA instead of BSCA at VM init
@ 2025-06-02 16:34 Christoph Schlameuss
2025-06-03 8:48 ` Janosch Frank
0 siblings, 1 reply; 3+ messages in thread
From: Christoph Schlameuss @ 2025-06-02 16:34 UTC (permalink / raw)
To: kvm
Cc: linux-s390, Christian Borntraeger, Janosch Frank,
Claudio Imbrenda, David Hildenbrand, Heiko Carstens,
Vasily Gorbik, Alexander Gordeev, Sven Schnelle, Thomas Huth,
Christoph Schlameuss
All modern IBM Z and LinuxONE machines offer support for the
Extended System Control Area (ESCA). The ESCA has been available since
the z114/z196, released in 2010.
KVM needs to allocate and manage the SCA for guest VMs. Prior to this
change the SCA was set up as a Basic SCA (BSCA), supporting a maximum
of 64 vCPUs, when initializing the VM. With the addition of the 65th
vCPU the SCA had to be converted to an ESCA.

Instead of allocating a BSCA and upgrading it for PV or when adding the
65th vCPU, we can always allocate an ESCA directly upon VM creation.
This simplifies the code in multiple places and completely removes the
need to convert an existing SCA.

In cases where the ESCA is not supported (z10 and earlier), the use of
SCA entries, and with that SIGP interpretation, is disabled for VMs.
This increases the number of exits from the VM in multiprocessor
scenarios and thus decreases performance.
The same is true for VSIE, where SIGP is currently disabled and thus no
SCA entries are used.

The only downside of the change is that we now always allocate 4 pages
for a 248-vCPU ESCA per VM instead of a single page for the BSCA.
In return we can delete a number of checks and special cases that
depended on the SCA type, as well as the whole BSCA-to-ESCA conversion.

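As a rough, back-of-the-envelope illustration of that trade-off (a sketch
only; the entry and header sizes are assumed from the SCA layouts in
kvm_host_types.h rather than restated in this patch):

/* Hypothetical size comparison; struct sizes assumed, not taken from this patch. */
#include <stdio.h>

#define PAGE_SIZE			4096UL
#define KVM_S390_BSCA_CPU_SLOTS		64UL
#define KVM_S390_ESCA_CPU_SLOTS		248UL

int main(void)
{
	/* ~64-byte BSCA header with 32-byte entries; ~256-byte ESCA header with 64-byte entries */
	unsigned long bsca = 64 + 32 * KVM_S390_BSCA_CPU_SLOTS;
	unsigned long esca = 256 + 64 * KVM_S390_ESCA_CPU_SLOTS;

	printf("bsca: ~%lu bytes -> %lu page(s)\n", bsca, (bsca + PAGE_SIZE - 1) / PAGE_SIZE);
	printf("esca: ~%lu bytes -> %lu page(s)\n", esca, (esca + PAGE_SIZE - 1) / PAGE_SIZE);
	return 0;
}
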
With that behavior change kvm->arch.sca no longer references a
bsca_block; it always points to an esca_block instead.
By typing the sca field as struct esca_block * we can simplify access
to the SCA and get rid of some helpers while making the code clearer.

KVM_MAX_VCPUS is also moved to kvm_host_types.h to allow using it in
future type definitions.

Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
---
Changes in v4:
- Squash patches into single patch
- Revert KVM_CAP_MAX_VCPUS to return KVM_CAP_MAX_VCPU_ID (255) again
- Link to v3: https://lore.kernel.org/r/20250522-rm-bsca-v3-0-51d169738fcf@linux.ibm.com
Changes in v3:
- do not enable sigp for guests when kvm_s390_use_sca_entries() is false
- consistently use kvm_s390_use_sca_entries() instead of sclp.has_sigpif
- Link to v2: https://lore.kernel.org/r/20250519-rm-bsca-v2-0-e3ea53dd0394@linux.ibm.com
Changes in v2:
- properly apply checkpatch --strict (Thanks Claudio)
- some small comment wording changes
- rebased
- Link to v1: https://lore.kernel.org/r/20250514-rm-bsca-v1-0-6c2b065a8680@linux.ibm.com
---
arch/s390/include/asm/kvm_host.h | 7 +-
arch/s390/include/asm/kvm_host_types.h | 2 +
arch/s390/kvm/gaccess.c | 10 +-
arch/s390/kvm/interrupt.c | 71 ++++----------
arch/s390/kvm/kvm-s390.c | 167 ++++++---------------------------
arch/s390/kvm/kvm-s390.h | 9 +-
6 files changed, 58 insertions(+), 208 deletions(-)
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index cb89e54ada257eb4fdfe840ff37b2ea639c2d1cb..2a2b557357c8e40c82022eb338c3e98aa8f03a2b 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -27,8 +27,6 @@
#include <asm/isc.h>
#include <asm/guarded_storage.h>
-#define KVM_MAX_VCPUS 255
-
#define KVM_INTERNAL_MEM_SLOTS 1
/*
@@ -631,9 +629,8 @@ struct kvm_s390_pv {
struct mmu_notifier mmu_notifier;
};
-struct kvm_arch{
- void *sca;
- int use_esca;
+struct kvm_arch {
+ struct esca_block *sca;
rwlock_t sca_lock;
debug_info_t *dbf;
struct kvm_s390_float_interrupt float_int;
diff --git a/arch/s390/include/asm/kvm_host_types.h b/arch/s390/include/asm/kvm_host_types.h
index 1394d3fb648f1e46dba2c513ed26e5dfd275fad4..9697db9576f6c39a6689251f85b4b974c344769a 100644
--- a/arch/s390/include/asm/kvm_host_types.h
+++ b/arch/s390/include/asm/kvm_host_types.h
@@ -6,6 +6,8 @@
#include <linux/atomic.h>
#include <linux/types.h>
+#define KVM_MAX_VCPUS 256
+
#define KVM_S390_BSCA_CPU_SLOTS 64
#define KVM_S390_ESCA_CPU_SLOTS 248
diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index f6fded15633ad87f6b02c2c42aea35a3c9164253..ee37d397d9218a4d33c7a33bd877d0b974ca9003 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -112,7 +112,7 @@ int ipte_lock_held(struct kvm *kvm)
int rc;
read_lock(&kvm->arch.sca_lock);
- rc = kvm_s390_get_ipte_control(kvm)->kh != 0;
+ rc = kvm->arch.sca->ipte_control.kh != 0;
read_unlock(&kvm->arch.sca_lock);
return rc;
}
@@ -129,7 +129,7 @@ static void ipte_lock_simple(struct kvm *kvm)
goto out;
retry:
read_lock(&kvm->arch.sca_lock);
- ic = kvm_s390_get_ipte_control(kvm);
+ ic = &kvm->arch.sca->ipte_control;
old = READ_ONCE(*ic);
do {
if (old.k) {
@@ -154,7 +154,7 @@ static void ipte_unlock_simple(struct kvm *kvm)
if (kvm->arch.ipte_lock_count)
goto out;
read_lock(&kvm->arch.sca_lock);
- ic = kvm_s390_get_ipte_control(kvm);
+ ic = &kvm->arch.sca->ipte_control;
old = READ_ONCE(*ic);
do {
new = old;
@@ -172,7 +172,7 @@ static void ipte_lock_siif(struct kvm *kvm)
retry:
read_lock(&kvm->arch.sca_lock);
- ic = kvm_s390_get_ipte_control(kvm);
+ ic = &kvm->arch.sca->ipte_control;
old = READ_ONCE(*ic);
do {
if (old.kg) {
@@ -192,7 +192,7 @@ static void ipte_unlock_siif(struct kvm *kvm)
union ipte_control old, new, *ic;
read_lock(&kvm->arch.sca_lock);
- ic = kvm_s390_get_ipte_control(kvm);
+ ic = &kvm->arch.sca->ipte_control;
old = READ_ONCE(*ic);
do {
new = old;
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 60c360c18690f6b94e8483dab2c25f016451204b..95a876ff7aca9c632c3e361275da6781ec070c07 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -51,21 +51,11 @@ static int sca_ext_call_pending(struct kvm_vcpu *vcpu, int *src_id)
BUG_ON(!kvm_s390_use_sca_entries());
read_lock(&vcpu->kvm->arch.sca_lock);
- if (vcpu->kvm->arch.use_esca) {
- struct esca_block *sca = vcpu->kvm->arch.sca;
- union esca_sigp_ctrl sigp_ctrl =
- sca->cpu[vcpu->vcpu_id].sigp_ctrl;
+ struct esca_block *sca = vcpu->kvm->arch.sca;
+ union esca_sigp_ctrl sigp_ctrl = sca->cpu[vcpu->vcpu_id].sigp_ctrl;
- c = sigp_ctrl.c;
- scn = sigp_ctrl.scn;
- } else {
- struct bsca_block *sca = vcpu->kvm->arch.sca;
- union bsca_sigp_ctrl sigp_ctrl =
- sca->cpu[vcpu->vcpu_id].sigp_ctrl;
-
- c = sigp_ctrl.c;
- scn = sigp_ctrl.scn;
- }
+ c = sigp_ctrl.c;
+ scn = sigp_ctrl.scn;
read_unlock(&vcpu->kvm->arch.sca_lock);
if (src_id)
@@ -80,33 +70,17 @@ static int sca_inject_ext_call(struct kvm_vcpu *vcpu, int src_id)
BUG_ON(!kvm_s390_use_sca_entries());
read_lock(&vcpu->kvm->arch.sca_lock);
- if (vcpu->kvm->arch.use_esca) {
- struct esca_block *sca = vcpu->kvm->arch.sca;
- union esca_sigp_ctrl *sigp_ctrl =
- &(sca->cpu[vcpu->vcpu_id].sigp_ctrl);
- union esca_sigp_ctrl new_val = {0}, old_val;
-
- old_val = READ_ONCE(*sigp_ctrl);
- new_val.scn = src_id;
- new_val.c = 1;
- old_val.c = 0;
-
- expect = old_val.value;
- rc = cmpxchg(&sigp_ctrl->value, old_val.value, new_val.value);
- } else {
- struct bsca_block *sca = vcpu->kvm->arch.sca;
- union bsca_sigp_ctrl *sigp_ctrl =
- &(sca->cpu[vcpu->vcpu_id].sigp_ctrl);
- union bsca_sigp_ctrl new_val = {0}, old_val;
+ struct esca_block *sca = vcpu->kvm->arch.sca;
+ union esca_sigp_ctrl *sigp_ctrl = &sca->cpu[vcpu->vcpu_id].sigp_ctrl;
+ union esca_sigp_ctrl new_val = {0}, old_val;
- old_val = READ_ONCE(*sigp_ctrl);
- new_val.scn = src_id;
- new_val.c = 1;
- old_val.c = 0;
+ old_val = READ_ONCE(*sigp_ctrl);
+ new_val.scn = src_id;
+ new_val.c = 1;
+ old_val.c = 0;
- expect = old_val.value;
- rc = cmpxchg(&sigp_ctrl->value, old_val.value, new_val.value);
- }
+ expect = old_val.value;
+ rc = cmpxchg(&sigp_ctrl->value, old_val.value, new_val.value);
read_unlock(&vcpu->kvm->arch.sca_lock);
if (rc != expect) {
@@ -123,19 +97,10 @@ static void sca_clear_ext_call(struct kvm_vcpu *vcpu)
return;
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_ECALL_PEND);
read_lock(&vcpu->kvm->arch.sca_lock);
- if (vcpu->kvm->arch.use_esca) {
- struct esca_block *sca = vcpu->kvm->arch.sca;
- union esca_sigp_ctrl *sigp_ctrl =
- &(sca->cpu[vcpu->vcpu_id].sigp_ctrl);
+ struct esca_block *sca = vcpu->kvm->arch.sca;
+ union esca_sigp_ctrl *sigp_ctrl = &sca->cpu[vcpu->vcpu_id].sigp_ctrl;
- WRITE_ONCE(sigp_ctrl->value, 0);
- } else {
- struct bsca_block *sca = vcpu->kvm->arch.sca;
- union bsca_sigp_ctrl *sigp_ctrl =
- &(sca->cpu[vcpu->vcpu_id].sigp_ctrl);
-
- WRITE_ONCE(sigp_ctrl->value, 0);
- }
+ WRITE_ONCE(sigp_ctrl->value, 0);
read_unlock(&vcpu->kvm->arch.sca_lock);
}
@@ -1223,7 +1188,7 @@ int kvm_s390_ext_call_pending(struct kvm_vcpu *vcpu)
{
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
- if (!sclp.has_sigpif)
+ if (!kvm_s390_use_sca_entries())
return test_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs);
return sca_ext_call_pending(vcpu, NULL);
@@ -1547,7 +1512,7 @@ static int __inject_extcall(struct kvm_vcpu *vcpu, struct kvm_s390_irq *irq)
if (kvm_get_vcpu_by_id(vcpu->kvm, src_id) == NULL)
return -EINVAL;
- if (sclp.has_sigpif && !kvm_s390_pv_cpu_get_handle(vcpu))
+ if (kvm_s390_use_sca_entries() && !kvm_s390_pv_cpu_get_handle(vcpu))
return sca_inject_ext_call(vcpu, src_id);
if (test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs))
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 3f3175193fd7a7a26658eb2e2533d8037447a0b4..4f6c31b452be49d7dd731a4b953c676bf48ee6cb 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -271,7 +271,6 @@ debug_info_t *kvm_s390_dbf_uv;
/* forward declarations */
static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
unsigned long end);
-static int sca_switch_to_extended(struct kvm *kvm);
static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
{
@@ -631,11 +630,13 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_NR_VCPUS:
case KVM_CAP_MAX_VCPUS:
case KVM_CAP_MAX_VCPU_ID:
- r = KVM_S390_BSCA_CPU_SLOTS;
+ /*
+ * Return the same value for KVM_CAP_MAX_VCPUS and
+ * KVM_CAP_MAX_VCPU_ID to pass the kvm_create_max_vcpus selftest.
+ */
+ r = KVM_S390_ESCA_CPU_SLOTS;
if (!kvm_s390_use_sca_entries())
- r = KVM_MAX_VCPUS;
- else if (sclp.has_esca && sclp.has_64bscao)
- r = KVM_S390_ESCA_CPU_SLOTS;
+ r = KVM_MAX_VCPUS - 1;
if (ext == KVM_CAP_NR_VCPUS)
r = min_t(unsigned int, num_online_cpus(), r);
break;
@@ -1930,13 +1931,11 @@ static int kvm_s390_get_cpu_model(struct kvm *kvm, struct kvm_device_attr *attr)
* Updates the Multiprocessor Topology-Change-Report bit to signal
* the guest with a topology change.
* This is only relevant if the topology facility is present.
- *
- * The SCA version, bsca or esca, doesn't matter as offset is the same.
*/
static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
{
union sca_utility new, old;
- struct bsca_block *sca;
+ struct esca_block *sca;
read_lock(&kvm->arch.sca_lock);
sca = kvm->arch.sca;
@@ -1967,7 +1966,7 @@ static int kvm_s390_get_topo_change_indication(struct kvm *kvm,
return -ENXIO;
read_lock(&kvm->arch.sca_lock);
- topo = ((struct bsca_block *)kvm->arch.sca)->utility.mtcr;
+ topo = kvm->arch.sca->utility.mtcr;
read_unlock(&kvm->arch.sca_lock);
return put_user(topo, (u8 __user *)attr->addr);
@@ -2666,14 +2665,6 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd)
if (kvm_s390_pv_is_protected(kvm))
break;
- /*
- * FMT 4 SIE needs esca. As we never switch back to bsca from
- * esca, we need no cleanup in the error cases below
- */
- r = sca_switch_to_extended(kvm);
- if (r)
- break;
-
r = s390_disable_cow_sharing();
if (r)
break;
@@ -3314,10 +3305,7 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
static void sca_dispose(struct kvm *kvm)
{
- if (kvm->arch.use_esca)
- free_pages_exact(kvm->arch.sca, sizeof(struct esca_block));
- else
- free_page((unsigned long)(kvm->arch.sca));
+ free_pages_exact(kvm->arch.sca, sizeof(*kvm->arch.sca));
kvm->arch.sca = NULL;
}
@@ -3331,10 +3319,9 @@ void kvm_arch_free_vm(struct kvm *kvm)
int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
{
- gfp_t alloc_flags = GFP_KERNEL_ACCOUNT;
- int i, rc;
+ gfp_t alloc_flags = GFP_KERNEL_ACCOUNT | __GFP_ZERO;
char debug_name[16];
- static unsigned long sca_offset;
+ int i, rc;
rc = -EINVAL;
#ifdef CONFIG_KVM_S390_UCONTROL
@@ -3356,17 +3343,12 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (!sclp.has_64bscao)
alloc_flags |= GFP_DMA;
rwlock_init(&kvm->arch.sca_lock);
- /* start with basic SCA */
- kvm->arch.sca = (struct bsca_block *) get_zeroed_page(alloc_flags);
- if (!kvm->arch.sca)
- goto out_err;
mutex_lock(&kvm_lock);
- sca_offset += 16;
- if (sca_offset + sizeof(struct bsca_block) > PAGE_SIZE)
- sca_offset = 0;
- kvm->arch.sca = (struct bsca_block *)
- ((char *) kvm->arch.sca + sca_offset);
+
+ kvm->arch.sca = alloc_pages_exact(sizeof(*kvm->arch.sca), alloc_flags);
mutex_unlock(&kvm_lock);
+ if (!kvm->arch.sca)
+ goto out_err;
sprintf(debug_name, "kvm-%u", current->pid);
@@ -3548,17 +3530,10 @@ static void sca_del_vcpu(struct kvm_vcpu *vcpu)
if (!kvm_s390_use_sca_entries())
return;
read_lock(&vcpu->kvm->arch.sca_lock);
- if (vcpu->kvm->arch.use_esca) {
- struct esca_block *sca = vcpu->kvm->arch.sca;
+ struct esca_block *sca = vcpu->kvm->arch.sca;
- clear_bit_inv(vcpu->vcpu_id, (unsigned long *) sca->mcn);
- sca->cpu[vcpu->vcpu_id].sda = 0;
- } else {
- struct bsca_block *sca = vcpu->kvm->arch.sca;
-
- clear_bit_inv(vcpu->vcpu_id, (unsigned long *) &sca->mcn);
- sca->cpu[vcpu->vcpu_id].sda = 0;
- }
+ clear_bit_inv(vcpu->vcpu_id, (unsigned long *)sca->mcn);
+ sca->cpu[vcpu->vcpu_id].sda = 0;
read_unlock(&vcpu->kvm->arch.sca_lock);
}
@@ -3573,105 +3548,23 @@ static void sca_add_vcpu(struct kvm_vcpu *vcpu)
return;
}
read_lock(&vcpu->kvm->arch.sca_lock);
- if (vcpu->kvm->arch.use_esca) {
- struct esca_block *sca = vcpu->kvm->arch.sca;
- phys_addr_t sca_phys = virt_to_phys(sca);
-
- sca->cpu[vcpu->vcpu_id].sda = virt_to_phys(vcpu->arch.sie_block);
- vcpu->arch.sie_block->scaoh = sca_phys >> 32;
- vcpu->arch.sie_block->scaol = sca_phys & ESCA_SCAOL_MASK;
- vcpu->arch.sie_block->ecb2 |= ECB2_ESCA;
- set_bit_inv(vcpu->vcpu_id, (unsigned long *) sca->mcn);
- } else {
- struct bsca_block *sca = vcpu->kvm->arch.sca;
- phys_addr_t sca_phys = virt_to_phys(sca);
-
- sca->cpu[vcpu->vcpu_id].sda = virt_to_phys(vcpu->arch.sie_block);
- vcpu->arch.sie_block->scaoh = sca_phys >> 32;
- vcpu->arch.sie_block->scaol = sca_phys;
- set_bit_inv(vcpu->vcpu_id, (unsigned long *) &sca->mcn);
- }
+ struct esca_block *sca = vcpu->kvm->arch.sca;
+ phys_addr_t sca_phys = virt_to_phys(sca);
+
+ sca->cpu[vcpu->vcpu_id].sda = virt_to_phys(vcpu->arch.sie_block);
+ vcpu->arch.sie_block->scaoh = sca_phys >> 32;
+ vcpu->arch.sie_block->scaol = sca_phys & ESCA_SCAOL_MASK;
+ vcpu->arch.sie_block->ecb2 |= ECB2_ESCA;
+ set_bit_inv(vcpu->vcpu_id, (unsigned long *)sca->mcn);
read_unlock(&vcpu->kvm->arch.sca_lock);
}
-/* Basic SCA to Extended SCA data copy routines */
-static inline void sca_copy_entry(struct esca_entry *d, struct bsca_entry *s)
-{
- d->sda = s->sda;
- d->sigp_ctrl.c = s->sigp_ctrl.c;
- d->sigp_ctrl.scn = s->sigp_ctrl.scn;
-}
-
-static void sca_copy_b_to_e(struct esca_block *d, struct bsca_block *s)
-{
- int i;
-
- d->ipte_control = s->ipte_control;
- d->mcn[0] = s->mcn;
- for (i = 0; i < KVM_S390_BSCA_CPU_SLOTS; i++)
- sca_copy_entry(&d->cpu[i], &s->cpu[i]);
-}
-
-static int sca_switch_to_extended(struct kvm *kvm)
-{
- struct bsca_block *old_sca = kvm->arch.sca;
- struct esca_block *new_sca;
- struct kvm_vcpu *vcpu;
- unsigned long vcpu_idx;
- u32 scaol, scaoh;
- phys_addr_t new_sca_phys;
-
- if (kvm->arch.use_esca)
- return 0;
-
- new_sca = alloc_pages_exact(sizeof(*new_sca), GFP_KERNEL_ACCOUNT | __GFP_ZERO);
- if (!new_sca)
- return -ENOMEM;
-
- new_sca_phys = virt_to_phys(new_sca);
- scaoh = new_sca_phys >> 32;
- scaol = new_sca_phys & ESCA_SCAOL_MASK;
-
- kvm_s390_vcpu_block_all(kvm);
- write_lock(&kvm->arch.sca_lock);
-
- sca_copy_b_to_e(new_sca, old_sca);
-
- kvm_for_each_vcpu(vcpu_idx, vcpu, kvm) {
- vcpu->arch.sie_block->scaoh = scaoh;
- vcpu->arch.sie_block->scaol = scaol;
- vcpu->arch.sie_block->ecb2 |= ECB2_ESCA;
- }
- kvm->arch.sca = new_sca;
- kvm->arch.use_esca = 1;
-
- write_unlock(&kvm->arch.sca_lock);
- kvm_s390_vcpu_unblock_all(kvm);
-
- free_page((unsigned long)old_sca);
-
- VM_EVENT(kvm, 2, "Switched to ESCA (0x%p -> 0x%p)",
- old_sca, kvm->arch.sca);
- return 0;
-}
-
static int sca_can_add_vcpu(struct kvm *kvm, unsigned int id)
{
- int rc;
-
- if (!kvm_s390_use_sca_entries()) {
- if (id < KVM_MAX_VCPUS)
- return true;
- return false;
- }
- if (id < KVM_S390_BSCA_CPU_SLOTS)
- return true;
- if (!sclp.has_esca || !sclp.has_64bscao)
- return false;
-
- rc = kvm->arch.use_esca ? 0 : sca_switch_to_extended(kvm);
+ if (!kvm_s390_use_sca_entries())
+ return id < KVM_MAX_VCPUS;
- return rc == 0 && id < KVM_S390_ESCA_CPU_SLOTS;
+ return id < KVM_S390_ESCA_CPU_SLOTS;
}
/* needs disabled preemption to protect from TOD sync and vcpu_load/put */
@@ -3917,7 +3810,7 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->eca |= ECA_IB;
if (sclp.has_siif)
vcpu->arch.sie_block->eca |= ECA_SII;
- if (sclp.has_sigpif)
+ if (kvm_s390_use_sca_entries())
vcpu->arch.sie_block->eca |= ECA_SIGPI;
if (test_kvm_facility(vcpu->kvm, 129)) {
vcpu->arch.sie_block->eca |= ECA_VX;
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 8d3bbb2dd8d27802bbde2a7bd1378033ad614b8e..0c5e8ae07b77648d554668cc0536607545636a68 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -528,13 +528,6 @@ void kvm_s390_prepare_debug_exit(struct kvm_vcpu *vcpu);
int kvm_s390_handle_per_ifetch_icpt(struct kvm_vcpu *vcpu);
int kvm_s390_handle_per_event(struct kvm_vcpu *vcpu);
-/* support for Basic/Extended SCA handling */
-static inline union ipte_control *kvm_s390_get_ipte_control(struct kvm *kvm)
-{
- struct bsca_block *sca = kvm->arch.sca; /* SCA version doesn't matter */
-
- return &sca->ipte_control;
-}
static inline int kvm_s390_use_sca_entries(void)
{
/*
@@ -542,7 +535,7 @@ static inline int kvm_s390_use_sca_entries(void)
* might use the entries. By not setting the entries and keeping them
* invalid, hardware will not access them but intercept.
*/
- return sclp.has_sigpif;
+ return sclp.has_sigpif && sclp.has_esca;
}
void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
struct mcck_volatile_info *mcck_info);
---
base-commit: 0ff41df1cb268fc69e703a08a57ee14ae967d0ca
change-id: 20250513-rm-bsca-ab1e8649aca7
Best regards,
--
Christoph Schlameuss <schlameuss@linux.ibm.com>
* Re: [PATCH v4] KVM: s390: Use ESCA instead of BSCA at VM init
2025-06-02 16:34 [PATCH v4] KVM: s390: Use ESCA instead of BSCA at VM init Christoph Schlameuss
@ 2025-06-03 8:48 ` Janosch Frank
2025-06-03 15:15 ` Christoph Schlameuss
0 siblings, 1 reply; 3+ messages in thread
From: Janosch Frank @ 2025-06-03 8:48 UTC (permalink / raw)
To: Christoph Schlameuss, kvm
Cc: linux-s390, Christian Borntraeger, Claudio Imbrenda,
David Hildenbrand, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Sven Schnelle, Thomas Huth
On 6/2/25 6:34 PM, Christoph Schlameuss wrote:
> All modern IBM Z and LinuxONE machines offer support for the
> Extended System Control Area (ESCA). The ESCA has been available since
> the z114/z196, released in 2010.
> KVM needs to allocate and manage the SCA for guest VMs. Prior to this
> change the SCA was set up as a Basic SCA (BSCA), supporting a maximum
> of 64 vCPUs, when initializing the VM. With the addition of the 65th
> vCPU the SCA had to be converted to an ESCA.
>
> Instead of allocating a BSCA and upgrading it for PV or when adding the
> 65th vCPU, we can always allocate an ESCA directly upon VM creation.
> This simplifies the code in multiple places and completely removes the
> need to convert an existing SCA.
>
> In cases where the ESCA is not supported (z10 and earlier), the use of
> SCA entries, and with that SIGP interpretation, is disabled for VMs.
> This increases the number of exits from the VM in multiprocessor
> scenarios and thus decreases performance.
> The same is true for VSIE, where SIGP is currently disabled and thus no
> SCA entries are used.
>
> The only downside of the change is that we now always allocate 4 pages
> for a 248-vCPU ESCA per VM instead of a single page for the BSCA.
> In return we can delete a number of checks and special cases that
> depended on the SCA type, as well as the whole BSCA-to-ESCA conversion.
>
> With that behavior change kvm->arch.sca no longer references a
> bsca_block; it always points to an esca_block instead.
> By typing the sca field as struct esca_block * we can simplify access
> to the SCA and get rid of some helpers while making the code clearer.
>
> KVM_MAX_VCPUS is also moved to kvm_host_types.h to allow using it in
> future type definitions.
>
> Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
> ---
> Changes in v4:
> - Squash patches into single patch
> - Revert KVM_CAP_MAX_VCPUS to return KVM_CAP_MAX_VCPU_ID (255) again
> - Link to v3: https://lore.kernel.org/r/20250522-rm-bsca-v3-0-51d169738fcf@linux.ibm.com
>
> Changes in v3:
> - do not enable sigp for guests when kvm_s390_use_sca_entries() is false
> - consistently use kvm_s390_use_sca_entries() instead of sclp.has_sigpif
> - Link to v2: https://lore.kernel.org/r/20250519-rm-bsca-v2-0-e3ea53dd0394@linux.ibm.com
>
> Changes in v2:
> - properly apply checkpatch --strict (Thanks Claudio)
> - some small comment wording changes
> - rebased
> - Link to v1: https://lore.kernel.org/r/20250514-rm-bsca-v1-0-6c2b065a8680@linux.ibm.com
> ---
> arch/s390/include/asm/kvm_host.h | 7 +-
> arch/s390/include/asm/kvm_host_types.h | 2 +
> arch/s390/kvm/gaccess.c | 10 +-
> arch/s390/kvm/interrupt.c | 71 ++++----------
> arch/s390/kvm/kvm-s390.c | 167 ++++++---------------------------
> arch/s390/kvm/kvm-s390.h | 9 +-
> 6 files changed, 58 insertions(+), 208 deletions(-)
>
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index cb89e54ada257eb4fdfe840ff37b2ea639c2d1cb..2a2b557357c8e40c82022eb338c3e98aa8f03a2b 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -27,8 +27,6 @@
> #include <asm/isc.h>
> #include <asm/guarded_storage.h>
>
> -#define KVM_MAX_VCPUS 255
> -
> #define KVM_INTERNAL_MEM_SLOTS 1
>
> /*
> @@ -631,9 +629,8 @@ struct kvm_s390_pv {
> struct mmu_notifier mmu_notifier;
> };
>
> -struct kvm_arch{
> - void *sca;
> - int use_esca;
> +struct kvm_arch {
> + struct esca_block *sca;
> rwlock_t sca_lock;
> debug_info_t *dbf;
> struct kvm_s390_float_interrupt float_int;
> diff --git a/arch/s390/include/asm/kvm_host_types.h b/arch/s390/include/asm/kvm_host_types.h
> index 1394d3fb648f1e46dba2c513ed26e5dfd275fad4..9697db9576f6c39a6689251f85b4b974c344769a 100644
> --- a/arch/s390/include/asm/kvm_host_types.h
> +++ b/arch/s390/include/asm/kvm_host_types.h
> @@ -6,6 +6,8 @@
> #include <linux/atomic.h>
> #include <linux/types.h>
>
> +#define KVM_MAX_VCPUS 256
Why are we doing the whole 256 - 1 game?
> +
> #define KVM_S390_BSCA_CPU_SLOTS 64
Can't you remove that now?
> #define KVM_S390_ESCA_CPU_SLOTS 248
>
> diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
> index f6fded15633ad87f6b02c2c42aea35a3c9164253..ee37d397d9218a4d33c7a33bd877d0b974ca9003 100644
> --- a/arch/s390/kvm/gaccess.c
> +++ b/arch/s390/kvm/gaccess.c
> @@ -112,7 +112,7 @@ int ipte_lock_held(struct kvm *kvm)
> int rc;
>
> read_lock(&kvm->arch.sca_lock);
> - rc = kvm_s390_get_ipte_control(kvm)->kh != 0;
> + rc = kvm->arch.sca->ipte_control.kh != 0;
> read_unlock(&kvm->arch.sca_lock);
> return rc;
> }
[...]
> -static int sca_switch_to_extended(struct kvm *kvm);
>
> static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
> {
> @@ -631,11 +630,13 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_NR_VCPUS:
> case KVM_CAP_MAX_VCPUS:
> case KVM_CAP_MAX_VCPU_ID:
> - r = KVM_S390_BSCA_CPU_SLOTS;
> + /*
> + * Return the same value for KVM_CAP_MAX_VCPUS and
> + * KVM_CAP_MAX_VCPU_ID to pass the kvm_create_max_vcpus selftest.
> + */
> + r = KVM_S390_ESCA_CPU_SLOTS;
We're not doing this to pass the test, we're doing this to adhere to the
KVM API. Yes, the API document explains it with one indirection but it
is in there.
The whole KVM_CAP_MAX_VCPU_ID problem will pop up in the future since we
can't change the cap's name. We'll have to live with it.
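For reference, a minimal userspace sketch (illustrative only, no error
handling) of how both limits are queried through KVM_CHECK_EXTENSION on the
VM fd:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, 0);

	/* With this patch both caps report the same limit on machines with ESCA. */
	printf("KVM_CAP_MAX_VCPUS:   %d\n", ioctl(vm, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS));
	printf("KVM_CAP_MAX_VCPU_ID: %d\n", ioctl(vm, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPU_ID));
	return 0;
}
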
* Re: [PATCH v4] KVM: s390: Use ESCA instead of BSCA at VM init
2025-06-03 8:48 ` Janosch Frank
@ 2025-06-03 15:15 ` Christoph Schlameuss
0 siblings, 0 replies; 3+ messages in thread
From: Christoph Schlameuss @ 2025-06-03 15:15 UTC (permalink / raw)
To: Janosch Frank, kvm
Cc: linux-s390, Christian Borntraeger, Claudio Imbrenda,
David Hildenbrand, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Sven Schnelle, Thomas Huth
On Tue Jun 3, 2025 at 10:48 AM CEST, Janosch Frank wrote:
> On 6/2/25 6:34 PM, Christoph Schlameuss wrote:
>> All modern IBM Z and LinuxONE machines offer support for the
>> Extended System Control Area (ESCA). The ESCA has been available since
>> the z114/z196, released in 2010.
>> KVM needs to allocate and manage the SCA for guest VMs. Prior to this
>> change the SCA was set up as a Basic SCA (BSCA), supporting a maximum
>> of 64 vCPUs, when initializing the VM. With the addition of the 65th
>> vCPU the SCA had to be converted to an ESCA.
>>
>> Instead of allocating a BSCA and upgrading it for PV or when adding the
>> 65th vCPU, we can always allocate an ESCA directly upon VM creation.
>> This simplifies the code in multiple places and completely removes the
>> need to convert an existing SCA.
>>
>> In cases where the ESCA is not supported (z10 and earlier), the use of
>> SCA entries, and with that SIGP interpretation, is disabled for VMs.
>> This increases the number of exits from the VM in multiprocessor
>> scenarios and thus decreases performance.
>> The same is true for VSIE, where SIGP is currently disabled and thus no
>> SCA entries are used.
>>
>> The only downside of the change is that we now always allocate 4 pages
>> for a 248-vCPU ESCA per VM instead of a single page for the BSCA.
>> In return we can delete a number of checks and special cases that
>> depended on the SCA type, as well as the whole BSCA-to-ESCA conversion.
>>
>> With that behavior change kvm->arch.sca no longer references a
>> bsca_block; it always points to an esca_block instead.
>> By typing the sca field as struct esca_block * we can simplify access
>> to the SCA and get rid of some helpers while making the code clearer.
>>
>> KVM_MAX_VCPUS is also moved to kvm_host_types.h to allow using it in
>> future type definitions.
>>
>> Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
>> ---
>> Changes in v4:
>> - Squash patches into single patch
>> - Revert KVM_CAP_MAX_VCPUS to return KVM_CAP_MAX_VCPU_ID (255) again
>> - Link to v3: https://lore.kernel.org/r/20250522-rm-bsca-v3-0-51d169738fcf@linux.ibm.com
>>
>> Changes in v3:
>> - do not enable sigp for guests when kvm_s390_use_sca_entries() is false
>> - consistently use kvm_s390_use_sca_entries() instead of sclp.has_sigpif
>> - Link to v2: https://lore.kernel.org/r/20250519-rm-bsca-v2-0-e3ea53dd0394@linux.ibm.com
>>
>> Changes in v2:
>> - properly apply checkpatch --strict (Thanks Claudio)
>> - some small comment wording changes
>> - rebased
>> - Link to v1: https://lore.kernel.org/r/20250514-rm-bsca-v1-0-6c2b065a8680@linux.ibm.com
>> ---
>> arch/s390/include/asm/kvm_host.h | 7 +-
>> arch/s390/include/asm/kvm_host_types.h | 2 +
>> arch/s390/kvm/gaccess.c | 10 +-
>> arch/s390/kvm/interrupt.c | 71 ++++----------
>> arch/s390/kvm/kvm-s390.c | 167 ++++++---------------------------
>> arch/s390/kvm/kvm-s390.h | 9 +-
>> 6 files changed, 58 insertions(+), 208 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
>> index cb89e54ada257eb4fdfe840ff37b2ea639c2d1cb..2a2b557357c8e40c82022eb338c3e98aa8f03a2b 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -27,8 +27,6 @@
>> #include <asm/isc.h>
>> #include <asm/guarded_storage.h>
>>
>> -#define KVM_MAX_VCPUS 255
>> -
>> #define KVM_INTERNAL_MEM_SLOTS 1
>>
>> /*
>> @@ -631,9 +629,8 @@ struct kvm_s390_pv {
>> struct mmu_notifier mmu_notifier;
>> };
>>
>> -struct kvm_arch{
>> - void *sca;
>> - int use_esca;
>> +struct kvm_arch {
>> + struct esca_block *sca;
>> rwlock_t sca_lock;
>> debug_info_t *dbf;
>> struct kvm_s390_float_interrupt float_int;
>> diff --git a/arch/s390/include/asm/kvm_host_types.h b/arch/s390/include/asm/kvm_host_types.h
>> index 1394d3fb648f1e46dba2c513ed26e5dfd275fad4..9697db9576f6c39a6689251f85b4b974c344769a 100644
>> --- a/arch/s390/include/asm/kvm_host_types.h
>> +++ b/arch/s390/include/asm/kvm_host_types.h
>> @@ -6,6 +6,8 @@
>> #include <linux/atomic.h>
>> #include <linux/types.h>
>>
>> +#define KVM_MAX_VCPUS 256
>
> Why are we doing the whole 256 - 1 game?
>
I guess that was just me trying to force it to have the proper number there. But
you are right, that is moot. I will revert that.
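Presumably the follow-up keeps the define in kvm_host_types.h but restores
the previous value that this patch removes from kvm_host.h:

#define KVM_MAX_VCPUS 255
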
>> +
>> #define KVM_S390_BSCA_CPU_SLOTS 64
>
> Can't you remove that now?
>
Sadly no. That is still needed along with struct bsca_block to have bsca
support in vsie sigp.
>> #define KVM_S390_ESCA_CPU_SLOTS 248
>>
>> diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
>> index f6fded15633ad87f6b02c2c42aea35a3c9164253..ee37d397d9218a4d33c7a33bd877d0b974ca9003 100644
>> --- a/arch/s390/kvm/gaccess.c
>> +++ b/arch/s390/kvm/gaccess.c
>> @@ -112,7 +112,7 @@ int ipte_lock_held(struct kvm *kvm)
>> int rc;
>>
>> read_lock(&kvm->arch.sca_lock);
>> - rc = kvm_s390_get_ipte_control(kvm)->kh != 0;
>> + rc = kvm->arch.sca->ipte_control.kh != 0;
>> read_unlock(&kvm->arch.sca_lock);
>> return rc;
>> }
>
> [...]
>
>> -static int sca_switch_to_extended(struct kvm *kvm);
>>
>> static void kvm_clock_sync_scb(struct kvm_s390_sie_block *scb, u64 delta)
>> {
>> @@ -631,11 +630,13 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> case KVM_CAP_NR_VCPUS:
>> case KVM_CAP_MAX_VCPUS:
>> case KVM_CAP_MAX_VCPU_ID:
>> - r = KVM_S390_BSCA_CPU_SLOTS;
>> + /*
>> + * Return the same value for KVM_CAP_MAX_VCPUS and
>> + * KVM_CAP_MAX_VCPU_ID to pass the kvm_create_max_vcpus selftest.
>> + */
>> + r = KVM_S390_ESCA_CPU_SLOTS;
>
> We're not doing this to pass the test, we're doing this to adhere to the
> KVM API. Yes, the API document explains it with one indirection but it
> is in there.
>
> The whole KVM_CAP_MAX_VCPU_ID problem will pop up in the future since we
> can't change the cap's name. We'll have to live with it.
Let me just clarify the comment then. But hopefully that comment will be helpful
to the next one trying this.
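One possible rewording of that comment, sketched here only to capture the
intent of the discussion (the exact phrasing of the API requirement should
be checked against Documentation/virt/kvm/api.rst):

	case KVM_CAP_NR_VCPUS:
	case KVM_CAP_MAX_VCPUS:
	case KVM_CAP_MAX_VCPU_ID:
		/*
		 * Every vCPU needs a distinct vCPU id, so the value reported
		 * for KVM_CAP_MAX_VCPU_ID cannot be smaller than the one
		 * reported for KVM_CAP_MAX_VCPUS. Report the same limit for
		 * both.
		 */
		r = KVM_S390_ESCA_CPU_SLOTS;
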