* [PATCH v1 0/8] KVM: s390: Misc fixes
@ 2026-07-02 15:23 Claudio Imbrenda
2026-07-02 15:23 ` [PATCH v1 1/8] KVM: s390: vsie: Avoid potential deadlock with real spaces Claudio Imbrenda
` (7 more replies)
0 siblings, 8 replies; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:23 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
Fix a bunch of small issues that came up during the previous round of fixes.
They are mostly extremely unlikely races, but they should be fixed
nonetheless.
Claudio Imbrenda (8):
KVM: s390: vsie: Avoid potential deadlock with real spaces
KVM: s390: Fix unlikely NULL gmap dereference
KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC
KVM: s390: Fix return value of kvm_s390_set_cmma_bits()
KVM: s390: Fix race in __do_essa()
KVM: s390: cmma: Fix dirty tracking when removing memslot
KVM: s390: Fix dat_crste_walk_range() early return
KVM: s390: Improve kvm_s390_vm_stop_migration()
arch/s390/kvm/dat.c | 2 ++
arch/s390/kvm/gmap.c | 7 ++++++-
arch/s390/kvm/kvm-s390.c | 30 +++++++++++++++++++++++-------
arch/s390/kvm/priv.c | 5 +++--
4 files changed, 34 insertions(+), 10 deletions(-)
--
2.55.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v1 1/8] KVM: s390: vsie: Avoid potential deadlock with real spaces
2026-07-02 15:23 [PATCH v1 0/8] KVM: s390: Misc fixes Claudio Imbrenda
@ 2026-07-02 15:23 ` Claudio Imbrenda
2026-07-02 15:24 ` [PATCH v1 2/8] KVM: s390: Fix unlikely NULL gmap dereference Claudio Imbrenda
` (6 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:23 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
The natural lock ordering is mmu_lock -> children_lock, but in
gmap_create_shadow() the reverse order is used when handling shadowing
of real address spaces.
Convert the inner locking of kvm->mmu_lock to a trylock; return -EAGAIN
if the lock is busy, and let the caller try again.
This path is not expected to be taken in real-life scenarios, so its
performance is not important.
Fixes: a2c17f9270cc ("KVM: s390: New gmap code")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
arch/s390/kvm/gmap.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kvm/gmap.c b/arch/s390/kvm/gmap.c
index 298fbaecec28..8abb4f55b306 100644
--- a/arch/s390/kvm/gmap.c
+++ b/arch/s390/kvm/gmap.c
@@ -1374,8 +1374,13 @@ struct gmap *gmap_create_shadow(struct kvm_s390_mmu_cache *mc, struct gmap *pare
/* Only allow one real-space gmap shadow. */
list_for_each_entry(sg, &parent->children, list) {
if (sg->guest_asce.r) {
- scoped_guard(write_lock, &parent->kvm->mmu_lock)
+ if (write_trylock(&parent->kvm->mmu_lock)) {
gmap_unshadow(sg);
+ write_unlock(&parent->kvm->mmu_lock);
+ } else {
+ gmap_put(new);
+ return ERR_PTR(-EAGAIN);
+ }
break;
}
}
--
2.55.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
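For readers following along, the trylock conversion in patch 1/8 can be sketched outside the kernel. This is only a userspace illustration under stated assumptions, not the kernel code: `toy_lock`, `toy_trylock()`, `toy_unlock()` and `unshadow_with_trylock()` are hypothetical names, and a C11 `atomic_flag` stands in for kvm->mmu_lock. The point is that code already holding the inner lock never blocks on the outer one; it backs off with -EAGAIN instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel lock such as kvm->mmu_lock. */
struct toy_lock {
	atomic_flag held;
};

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_trylock(struct toy_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Models the inner step of the patched loop. The caller is assumed to
 * hold the "inner" lock (children_lock in the patch) already, so taking
 * the "outer" lock here would invert the natural order. Instead of
 * blocking, try once and report -EAGAIN if the outer lock is busy.
 */
static int unshadow_with_trylock(struct toy_lock *outer, int *unshadowed)
{
	assert(outer && unshadowed);

	if (!toy_trylock(outer))
		return -EAGAIN;		/* caller backs off and retries */

	(*unshadowed)++;		/* stands in for gmap_unshadow(sg) */
	toy_unlock(outer);
	return 0;
}
```

A caller would typically release everything it holds on -EAGAIN and start over, which is what the commit message means by letting the caller try again.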
* [PATCH v1 2/8] KVM: s390: Fix unlikely NULL gmap dereference
2026-07-02 15:23 [PATCH v1 0/8] KVM: s390: Misc fixes Claudio Imbrenda
2026-07-02 15:23 ` [PATCH v1 1/8] KVM: s390: vsie: Avoid potential deadlock with real spaces Claudio Imbrenda
@ 2026-07-02 15:24 ` Claudio Imbrenda
2026-07-02 15:43 ` sashiko-bot
2026-07-02 15:24 ` [PATCH v1 3/8] KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC Claudio Imbrenda
` (5 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:24 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
When creating a new vCPU, kvm_vm_ioctl_create_vcpu() will call
kvm_arch_vcpu_postcreate() after the file descriptor for the new vCPU
has been created. The new file descriptor has not been returned yet,
but a malicious userspace program could try to guess it.
If a malicious userspace program manages to start the newly created vCPU
before kvm_arch_vcpu_postcreate() is called, __vcpu_run() will try to
dereference vcpu->arch.gmap and trigger a NULL pointer dereference.
Fix this by moving the initialization of vcpu->arch.gmap into
kvm_arch_vcpu_create(), which is called before the file descriptor for
the vCPU is created.
Fixes: dafd032a15f8 ("KVM: s390: move vcpu specific initalization to a later point")
Fixes: e38c884df921 ("KVM: s390: Switch to new gmap")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
arch/s390/kvm/kvm-s390.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 23c817595e28..4a6d903e3523 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3604,10 +3604,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->epdx = vcpu->kvm->arch.epdx;
preempt_enable();
mutex_unlock(&vcpu->kvm->lock);
- if (!kvm_is_ucontrol(vcpu->kvm)) {
- vcpu->arch.gmap = vcpu->kvm->arch.gmap;
+ if (!kvm_is_ucontrol(vcpu->kvm))
sca_add_vcpu(vcpu);
- }
if (test_kvm_facility(vcpu->kvm, 74) || vcpu->kvm->arch.user_instr0 ||
vcpu->kvm->arch.user_operexec)
vcpu->arch.sie_block->ictl |= ICTL_OPEREXC;
@@ -3850,6 +3848,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
vcpu->arch.gmap = gmap_new_child(vcpu->kvm->arch.gmap, -1UL);
if (!vcpu->arch.gmap)
goto out_free_sie_block;
+ } else {
+ vcpu->arch.gmap = vcpu->kvm->arch.gmap;
}
VM_EVENT(vcpu->kvm, 3, "create cpu %d at 0x%p, sie block at 0x%p",
--
2.55.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
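The ordering argument in patch 2/8 (initialize the field before the object can be reached) can be sketched in userspace. This is an illustration under assumptions, not the kernel code: `toy_vcpu`, `published_vcpu`, `toy_vcpu_create()` and `toy_vcpu_run()` are hypothetical names, and an atomic pointer store stands in for the file descriptor becoming visible to userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_vcpu; gmap points at VM state. */
struct toy_vcpu {
	const int *gmap;
};

/* Stands in for "the vCPU file descriptor exists and can be used". */
static struct toy_vcpu *_Atomic published_vcpu;

/*
 * Everything a runner dereferences is set up first; only then is the
 * object published. The release store pairs with the acquire load in
 * toy_vcpu_run(), so a runner that sees the pointer also sees gmap.
 */
static void toy_vcpu_create(struct toy_vcpu *vcpu, const int *vm_gmap)
{
	assert(vcpu && vm_gmap);

	vcpu->gmap = vm_gmap;
	atomic_store_explicit(&published_vcpu, vcpu, memory_order_release);
}

/* Returns the value behind gmap, or -ENOENT if nothing is published. */
static int toy_vcpu_run(void)
{
	struct toy_vcpu *vcpu = atomic_load_explicit(&published_vcpu,
						     memory_order_acquire);

	if (!vcpu)
		return -ENOENT;
	return *vcpu->gmap;	/* never NULL once published */
}
```

If the assignment to `gmap` were moved after the publishing store, a concurrent runner could observe the object with `gmap` still NULL, which is the window the patch description talks about.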
* [PATCH v1 3/8] KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC
2026-07-02 15:23 [PATCH v1 0/8] KVM: s390: Misc fixes Claudio Imbrenda
2026-07-02 15:23 ` [PATCH v1 1/8] KVM: s390: vsie: Avoid potential deadlock with real spaces Claudio Imbrenda
2026-07-02 15:24 ` [PATCH v1 2/8] KVM: s390: Fix unlikely NULL gmap dereference Claudio Imbrenda
@ 2026-07-02 15:24 ` Claudio Imbrenda
2026-07-02 15:42 ` sashiko-bot
2026-07-02 17:08 ` Eric Farman
2026-07-02 15:24 ` [PATCH v1 4/8] KVM: s390: Fix return value of kvm_s390_set_cmma_bits() Claudio Imbrenda
` (4 subsequent siblings)
7 siblings, 2 replies; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:24 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
Due to a missing barrier, it was theoretically possible that a vCPU
being created concurrently with enabling KVM_CAP_S390_USER_OPEREXEC
would end up with the capability not enabled.
Fix by using WRITE_ONCE(), which is enough on s390.
Fixes: 8e8678e740ec ("KVM: s390: Add capability that forwards operation exceptions")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
arch/s390/kvm/kvm-s390.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 4a6d903e3523..784f7d9c79c7 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -955,7 +955,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
break;
case KVM_CAP_S390_USER_OPEREXEC:
VM_EVENT(kvm, 3, "%s", "ENABLE: CAP_S390_USER_OPEREXEC");
- kvm->arch.user_operexec = 1;
+ WRITE_ONCE(kvm->arch.user_operexec, 1);
icpt_operexc_on_all_vcpus(kvm);
r = 0;
break;
--
2.55.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v1 4/8] KVM: s390: Fix return value of kvm_s390_set_cmma_bits()
2026-07-02 15:23 [PATCH v1 0/8] KVM: s390: Misc fixes Claudio Imbrenda
` (2 preceding siblings ...)
2026-07-02 15:24 ` [PATCH v1 3/8] KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC Claudio Imbrenda
@ 2026-07-02 15:24 ` Claudio Imbrenda
2026-07-02 15:37 ` sashiko-bot
2026-07-02 15:24 ` [PATCH v1 5/8] KVM: s390: Fix race in __do_essa() Claudio Imbrenda
` (3 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:24 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
If kvm_s390_set_cmma_bits() is asked to set CMMA values outside of a
memslot, PGM_ADDRESSING (5) is returned, instead of a negative error
value.
Fix by returning -EFAULT whenever the return value would be > 0, which
is consistent with the behaviour before the gmap rewrite.
Fixes: e38c884df921 ("KVM: s390: Switch to new gmap")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
arch/s390/kvm/kvm-s390.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 784f7d9c79c7..512c81eee068 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2384,7 +2384,7 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
set_bit(GMAP_FLAG_USES_CMM, &kvm->arch.gmap->flags);
- return r;
+ return r <= 0 ? r : -EFAULT;
}
/**
--
2.55.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
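The return-value mapping in patch 4/8 is small enough to check in isolation. In the sketch below, `toy_to_errno()` and `TOY_PGM_ADDRESSING` are hypothetical names; the value 5 is taken from the commit message, and the expression is the one from the diff.

```c
#include <assert.h>
#include <errno.h>

/* Value quoted in the commit message for PGM_ADDRESSING. */
#define TOY_PGM_ADDRESSING 5

static_assert(TOY_PGM_ADDRESSING > 0, "program interruption codes are positive");

/*
 * Zero and negative errno values pass through unchanged; any positive
 * value (a program interruption code) is reported as -EFAULT.
 */
static int toy_to_errno(int r)
{
	return r <= 0 ? r : -EFAULT;
}
```

With this mapping, userspace only ever sees 0 or a negative errno from the ioctl, never a raw program interruption code.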
* [PATCH v1 5/8] KVM: s390: Fix race in __do_essa()
2026-07-02 15:23 [PATCH v1 0/8] KVM: s390: Misc fixes Claudio Imbrenda
` (3 preceding siblings ...)
2026-07-02 15:24 ` [PATCH v1 4/8] KVM: s390: Fix return value of kvm_s390_set_cmma_bits() Claudio Imbrenda
@ 2026-07-02 15:24 ` Claudio Imbrenda
2026-07-02 15:34 ` sashiko-bot
2026-07-02 15:24 ` [PATCH v1 6/8] KVM: s390: cmma: Fix dirty tracking when removing memslot Claudio Imbrenda
` (2 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:24 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
An unlikely race between __do_essa() and kvm_s390_vm_start_migration(),
kvm_s390_vm_stop_migration(), or dat_get_cmma() was possible.
Fix by locking kvm->slots_lock. Since this is not a hot path, the
overhead of an additional mutex is negligible.
Fixes: e38c884df921 ("KVM: s390: Switch to new gmap")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
arch/s390/kvm/priv.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index ad0ddc433a73..b0f3a47b1829 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -1260,8 +1260,9 @@ static int handle_essa(struct kvm_vcpu *vcpu)
/* Retry the ESSA instruction */
kvm_s390_retry_instr(vcpu);
} else {
- scoped_guard(read_lock, &vcpu->kvm->mmu_lock)
- i = __do_essa(vcpu, orc);
+ scoped_guard(mutex, &vcpu->kvm->slots_lock)
+ scoped_guard(read_lock, &vcpu->kvm->mmu_lock)
+ i = __do_essa(vcpu, orc);
if (i < 0)
return i;
/* Account for the possible extra cbrl entry */
--
2.55.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v1 6/8] KVM: s390: cmma: Fix dirty tracking when removing memslot
2026-07-02 15:23 [PATCH v1 0/8] KVM: s390: Misc fixes Claudio Imbrenda
` (4 preceding siblings ...)
2026-07-02 15:24 ` [PATCH v1 5/8] KVM: s390: Fix race in __do_essa() Claudio Imbrenda
@ 2026-07-02 15:24 ` Claudio Imbrenda
2026-07-02 15:48 ` sashiko-bot
2026-07-02 15:24 ` [PATCH v1 7/8] KVM: s390: Fix dat_crste_walk_range() early return Claudio Imbrenda
2026-07-02 15:24 ` [PATCH v1 8/8] KVM: s390: Improve kvm_s390_vm_stop_migration() Claudio Imbrenda
7 siblings, 1 reply; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:24 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
When a memslot is removed, all ptes that mapped the slot are cleared or
even deallocated. If this happens while the system is in migration
mode, and if cmma-dirty pages are removed, the cmma-dirty counter will
not reflect reality.
Fix by appropriately decrementing the cmma-dirty counter when removing
a memslot.
Fixes: e38c884df921 ("KVM: s390: Switch to new gmap")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
arch/s390/kvm/kvm-s390.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 512c81eee068..2ac3a9ac4698 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5792,11 +5792,19 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
return 0;
}
+static long cmma_d_count_pte(union pte *ptep, gfn_t gfn, gfn_t next, struct dat_walk *walk)
+{
+ if (pgste_of(ptep)->cmma_d)
+ atomic64_dec(walk->priv);
+ return 0;
+}
+
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
+ const struct dat_walk_ops ops = { .pte_entry = cmma_d_count_pte, };
struct kvm_s390_mmu_cache *mc = NULL;
int rc = 0;
@@ -5810,6 +5818,12 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
}
scoped_guard(write_lock, &kvm->mmu_lock) {
+ if (kvm->arch.migration_mode && kvm->arch.use_cmma) {
+ _dat_walk_gfn_range(old->base_gfn, old->base_gfn + old->npages,
+ kvm->arch.gmap->asce, &ops, DAT_WALK_IGN_HOLES,
+ &kvm->arch.cmma_dirty_pages);
+ }
+
switch (change) {
case KVM_MR_DELETE:
rc = dat_delete_slot(mc, kvm->arch.gmap->asce, old->base_gfn, old->npages);
--
2.55.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v1 7/8] KVM: s390: Fix dat_crste_walk_range() early return
2026-07-02 15:23 [PATCH v1 0/8] KVM: s390: Misc fixes Claudio Imbrenda
` (5 preceding siblings ...)
2026-07-02 15:24 ` [PATCH v1 6/8] KVM: s390: cmma: Fix dirty tracking when removing memslot Claudio Imbrenda
@ 2026-07-02 15:24 ` Claudio Imbrenda
2026-07-02 15:24 ` [PATCH v1 8/8] KVM: s390: Improve kvm_s390_vm_stop_migration() Claudio Imbrenda
7 siblings, 0 replies; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:24 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
If a walk entry handler for a lower level returns a non-zero value,
dat_crste_walk_range() will not return immediately, but instead loop
again and move to the next entry.
This means that some entries are potentially skipped, and the early
return is ignored. Skipped entries might lead to all kinds of issues,
given that the caller expects them not to be skipped. Early return is
often used to interrupt a walk when rescheduling is needed; if it is
ignored, it can lead to stalls.
Fix by breaking from the loop immediately if the walk to a lower level
returned non-zero.
Fixes: 2db149a0a6c5 ("KVM: s390: KVM page table management functions: walks")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
arch/s390/kvm/dat.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/s390/kvm/dat.c b/arch/s390/kvm/dat.c
index 5f1960ec982d..ed4259d17629 100644
--- a/arch/s390/kvm/dat.c
+++ b/arch/s390/kvm/dat.c
@@ -570,6 +570,8 @@ static long dat_crste_walk_range(gfn_t start, gfn_t end, struct crst_table *tabl
else if (walk->ops->pte_entry)
rc = dat_pte_walk_range(max(start, cur), min(end, next),
dereference_pmd(crste.pmd), walk);
+ if (rc)
+ break;
}
}
return rc;
--
2.55.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
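The effect described in patch 7/8 can be reproduced with a toy two-level walk. This is a sketch under assumptions, not the dat.c code: all `toy_*` names are hypothetical, tables are modelled as contiguous index ranges, and the `break_on_rc` flag selects between the behaviour before and after the fix.

```c
#include <assert.h>
#include <stdbool.h>

typedef long (*toy_entry_fn)(int idx, void *priv);

struct toy_walk_stats {
	int visited;	/* number of handler invocations */
	int stop_at;	/* index at which the handler asks to stop */
};

/* Handler that requests an early return at one particular entry. */
static long toy_stop_handler(int idx, void *priv)
{
	struct toy_walk_stats *s = priv;

	s->visited++;
	return idx == s->stop_at ? 1 : 0;
}

/* Lower-level walk: stops at the first non-zero handler result. */
static long toy_walk_lower(int start, int end, toy_entry_fn fn, void *priv)
{
	for (int i = start; i < end; i++) {
		long rc = fn(i, priv);

		if (rc)
			return rc;
	}
	return 0;
}

/*
 * Upper-level walk. Without the break, a non-zero rc from a lower
 * level is overwritten by the next iteration: the rest of that lower
 * table is skipped and the walk carries on regardless.
 */
static long toy_walk_upper(int tables, int per_table, toy_entry_fn fn,
			   void *priv, bool break_on_rc)
{
	long rc = 0;

	assert(fn);
	for (int t = 0; t < tables; t++) {
		rc = toy_walk_lower(t * per_table, (t + 1) * per_table, fn, priv);
		if (break_on_rc && rc)
			break;
	}
	return rc;
}
```

With three tables of four entries and a handler that stops at index 2, the variant without the break skips entry 3, keeps walking the remaining tables, and returns 0; the variant with the break stops after three visits and propagates the handler's return value.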
* [PATCH v1 8/8] KVM: s390: Improve kvm_s390_vm_stop_migration()
2026-07-02 15:23 [PATCH v1 0/8] KVM: s390: Misc fixes Claudio Imbrenda
` (6 preceding siblings ...)
2026-07-02 15:24 ` [PATCH v1 7/8] KVM: s390: Fix dat_crste_walk_range() early return Claudio Imbrenda
@ 2026-07-02 15:24 ` Claudio Imbrenda
7 siblings, 0 replies; 15+ messages in thread
From: Claudio Imbrenda @ 2026-07-02 15:24 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
There is no need to clear cmma-dirty state if the VM is not using CMMA.
Skip the CMMA-related code if CMMA is not in use.
Fixes: 6cfd47f91f6a ("KVM: s390: Fix cmma dirty tracking")
Fixes: 190df4a212a7 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
arch/s390/kvm/kvm-s390.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 2ac3a9ac4698..deec673a9206 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1280,8 +1280,10 @@ static int kvm_s390_vm_stop_migration(struct kvm *kvm)
* PGSTEs might have cmma_d set.
*/
WRITE_ONCE(kvm->arch.migration_mode, 0);
- if (kvm->arch.use_cmma)
- kvm_s390_sync_request_broadcast(kvm, KVM_REQ_STOP_MIGRATION);
+ if (!kvm->arch.use_cmma)
+ return 0;
+
+ kvm_s390_sync_request_broadcast(kvm, KVM_REQ_STOP_MIGRATION);
/* Clear cmma_d on all existing PGSTEs and set cmma_dirty_pages to 0. */
gmap_set_cmma_all_clean(kvm->arch.gmap);
atomic64_set(&kvm->arch.cmma_dirty_pages, 0);
--
2.55.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v1 5/8] KVM: s390: Fix race in __do_essa()
2026-07-02 15:24 ` [PATCH v1 5/8] KVM: s390: Fix race in __do_essa() Claudio Imbrenda
@ 2026-07-02 15:34 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-07-02 15:34 UTC (permalink / raw)
To: Claudio Imbrenda
Cc: Heiko Carstens, linux-s390, kvm, Vasily Gorbik, Alexander Gordeev
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Acquiring `kvm->slots_lock` inside an SRCU read-side critical section causes an ABBA deadlock.
--
commit e40855b84f2ee647bcced37d340b4d6805640050
Author: Claudio Imbrenda <imbrenda@linux.ibm.com>
KVM: s390: Fix race in __do_essa()
This commit addresses a race condition between __do_essa() and VM migration
or dat_get_cmma() by acquiring kvm->slots_lock before calling __do_essa().
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -1260,8 +1260,9 @@ static int handle_essa(struct kvm_vcpu *vcpu)
> /* Retry the ESSA instruction */
> kvm_s390_retry_instr(vcpu);
> } else {
> - scoped_guard(read_lock, &vcpu->kvm->mmu_lock)
> - i = __do_essa(vcpu, orc);
> + scoped_guard(mutex, &vcpu->kvm->slots_lock)
> + scoped_guard(read_lock, &vcpu->kvm->mmu_lock)
> + i = __do_essa(vcpu, orc);
[Severity: High]
Does acquiring kvm->slots_lock here introduce an ABBA deadlock?
Looking at handle_essa(), it executes within an SRCU read-side critical
section, as verified by the lockdep assertion at the start of the function:
static int handle_essa(struct kvm_vcpu *vcpu)
{
lockdep_assert_held(&vcpu->kvm->srcu);
...
If a concurrent memslot update acquires kvm->slots_lock and then calls
synchronize_srcu(&kvm->srcu), it will wait on the SRCU lock held by this
thread, while this thread blocks waiting for kvm->slots_lock.
> if (i < 0)
> return i;
> /* Account for the possible extra cbrl entry */
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260702152406.204782-1-imbrenda@linux.ibm.com?part=5
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 4/8] KVM: s390: Fix return value of kvm_s390_set_cmma_bits()
2026-07-02 15:24 ` [PATCH v1 4/8] KVM: s390: Fix return value of kvm_s390_set_cmma_bits() Claudio Imbrenda
@ 2026-07-02 15:37 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-07-02 15:37 UTC (permalink / raw)
To: Claudio Imbrenda
Cc: kvm, Alexander Gordeev, Heiko Carstens, linux-s390, Vasily Gorbik
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Leaked positive error codes (e.g., PGM_ADDRESSING) to userspace in KVM storage key ioctls.
--
commit 9e2ad138adcf0c89db7df7095fa7b5199afab279
Author: Claudio Imbrenda <imbrenda@linux.ibm.com>
KVM: s390: Fix return value of kvm_s390_set_cmma_bits()
If kvm_s390_set_cmma_bits() is asked to set CMMA values outside of a
memslot, PGM_ADDRESSING (5) is returned, instead of a negative error
value. Fix by returning -EFAULT whenever the return value would be > 0.
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2384,7 +2384,7 @@ static int kvm_s390_set_cmma_bits(struct kvm *kvm,
>
> set_bit(GMAP_FLAG_USES_CMM, &kvm->arch.gmap->flags);
>
> - return r;
> + return r <= 0 ? r : -EFAULT;
[Severity: High]
This isn't a bug introduced by this patch, but since you are fixing the
leaked PGM_ADDRESSING error code here, should we also fix the same issue
in the storage key ioctls?
If userspace calls KVM_S390_GET_SKEYS or KVM_S390_SET_SKEYS with an
out-of-bounds guest address, do they also return the positive error code
directly?
arch/s390/kvm/kvm-s390.c:kvm_s390_get_skeys() {
...
r = dat_get_storage_key(kvm->arch.gmap->asce, args->start_gfn + i, keys + i);
if (r)
break;
...
return r;
}
arch/s390/kvm/kvm-s390.c:kvm_s390_set_skeys() {
...
r = dat_set_storage_key(mc, kvm->arch.gmap->asce, args->start_gfn + i, keys[i], 0);
if (r)
break;
...
return r;
}
Furthermore, for KVM_S390_KEYOP, does it falsely report success and assign
the error code to the output key?
arch/s390/kvm/kvm-s390.c:kvm_arch_vm_ioctl() {
case KVM_S390_KEYOP: {
...
r = kvm_s390_keyop(mc, kvm, kop.operation, kop.guest_addr, skey);
...
if (r < 0)
break;
kop.key = r;
r = 0;
...
}
Should these functions also be updated to return -EFAULT for positive errors,
just like kvm_s390_set_cmma_bits() does here?
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260702152406.204782-1-imbrenda@linux.ibm.com?part=4
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/8] KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC
2026-07-02 15:24 ` [PATCH v1 3/8] KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC Claudio Imbrenda
@ 2026-07-02 15:42 ` sashiko-bot
2026-07-02 17:08 ` Eric Farman
1 sibling, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-07-02 15:42 UTC (permalink / raw)
To: Claudio Imbrenda
Cc: linux-s390, kvm, Heiko Carstens, Vasily Gorbik, Alexander Gordeev
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues:
- [High] The WRITE_ONCE() fix for KVM_CAP_S390_USER_OPEREXEC is architecturally insufficient to prevent hardware Store-Load reordering, leaving the concurrency race exploitable.
- [High] KVM_CAP_S390_USER_INSTR0 suffers from the identical concurrency vulnerability as USER_OPEREXEC, but was completely missed by the patch.
--
commit d2c961d1e461c8babbad1df33fbe8ed3bddfddba
Author: Claudio Imbrenda <imbrenda@linux.ibm.com>
KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC
This commit attempts to fix a race condition when enabling
KVM_CAP_S390_USER_OPEREXEC concurrently with vCPU creation. It uses
WRITE_ONCE() to ensure the capability flag is set before updating all
vCPUs.
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 4a6d903e35231..784f7d9c79c74 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
[ ... ]
> @@ -955,7 +955,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> break;
> case KVM_CAP_S390_USER_OPEREXEC:
> VM_EVENT(kvm, 3, "%s", "ENABLE: CAP_S390_USER_OPEREXEC");
> - kvm->arch.user_operexec = 1;
> + WRITE_ONCE(kvm->arch.user_operexec, 1);
> icpt_operexc_on_all_vcpus(kvm);
[Severity: High]
This isn't a bug introduced by this patch, but is WRITE_ONCE() sufficient here
to prevent hardware Store-Load reordering?
On s390x, hardware Store-Load reordering is permitted. When Thread A enables
the capability, it stores to user_operexec and then loads online_vcpus inside
icpt_operexc_on_all_vcpus():
Thread A:
WRITE_ONCE(kvm->arch.user_operexec, 1);
// icpt_operexc_on_all_vcpus()
kvm_for_each_vcpu(i, vcpu, kvm) { ... }
Thread B concurrently creates a vCPU, incrementing online_vcpus and then
loading user_operexec in kvm_arch_vcpu_postcreate():
Thread B:
atomic_inc(&kvm->online_vcpus);
if (... || vcpu->kvm->arch.user_operexec)
vcpu->arch.sie_block->ictl |= ICTL_OPEREXC;
Without a full memory barrier (smp_mb()), Thread A's CPU can execute the load
of online_vcpus before the store to user_operexec is globally visible. Thread
A might miss the new vCPU, and Thread B might read user_operexec == 0. Does
this require an explicit smp_mb() between the store and the loop?
[Severity: High]
This is a pre-existing issue, but does KVM_CAP_S390_USER_INSTR0 suffer from the
exact same concurrency vulnerability?
In the same kvm_vm_ioctl_enable_cap() function, KVM_CAP_S390_USER_INSTR0 uses
the identical pattern:
kvm->arch.user_instr0 = 1;
icpt_operexc_on_all_vcpus(kvm);
Since kvm_arch_vcpu_postcreate() also reads user_instr0:
if (test_kvm_facility(vcpu->kvm, 74) || vcpu->kvm->arch.user_instr0 ||
vcpu->kvm->arch.user_operexec)
vcpu->arch.sie_block->ictl |= ICTL_OPEREXC;
Could a concurrent vCPU creation fail to enable the ICTL_OPEREXC intercept if
it reads user_instr0 == 0 while the enablement loop misses the new vCPU?
> r = 0;
> break;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260702152406.204782-1-imbrenda@linux.ibm.com?part=3
^ permalink raw reply [flat|nested] 15+ messages in thread
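The pattern the review above asks about (each side stores one variable, then loads the other) is the classic store-buffering case. The following userspace sketch shows that shape with C11 atomics, using a sequentially consistent fence as a rough analogue of smp_mb(). It is an illustration only: the `toy_*` names are hypothetical, and it makes no claim about which barriers the actual KVM paths already contain or need.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int toy_cap_enabled;	/* stands in for user_operexec */
static atomic_int toy_online_vcpus;	/* stands in for online_vcpus */

/*
 * Enabling side: set the flag, then look at the vCPUs that exist.
 * Returns the number of vCPUs observed.
 */
static int toy_enable_cap(void)
{
	atomic_store_explicit(&toy_cap_enabled, 1, memory_order_relaxed);
	/* Full fence: the store above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&toy_online_vcpus, memory_order_relaxed);
}

/*
 * Creating side: make the vCPU visible, then check the flag.
 * Returns the flag value observed.
 */
static int toy_create_vcpu(void)
{
	int prev = atomic_fetch_add_explicit(&toy_online_vcpus, 1,
					     memory_order_relaxed);

	assert(prev >= 0);
	/* Full fence: the increment above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&toy_cap_enabled, memory_order_relaxed);
}
```

With a full fence on both sides, the C11 model rules out the outcome where the enabling side sees no vCPU and the creating side sees the flag clear. If either fence is missing, that outcome is allowed by the model, which is the scenario the review describes.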
* Re: [PATCH v1 2/8] KVM: s390: Fix unlikely NULL gmap dereference
2026-07-02 15:24 ` [PATCH v1 2/8] KVM: s390: Fix unlikely NULL gmap dereference Claudio Imbrenda
@ 2026-07-02 15:43 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-07-02 15:43 UTC (permalink / raw)
To: Claudio Imbrenda
Cc: Heiko Carstens, kvm, Alexander Gordeev, Vasily Gorbik, linux-s390
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
Pre-existing issues:
- [Critical] The patch incompletely fixes the race condition between vCPU creation and `KVM_RUN`. While it fixes the `gmap` NULL pointer dereference, it leaves the initialization of other critical `sie_block` fields (`sca_add_vcpu`, `epoch`, `epdx`, `ictl`) in `kvm_arch_vcpu_postcreate()`, which can run concurrently with `KVM_RUN` and corrupt host memory.
- [High] Calling `gmap_remove_child()` without holding `children_lock` in `kvm_arch_vcpu_create()` error path.
- [High] Concurrent creation of vCPUs with the same `vcpu_id` clears the valid vCPU's SCA entry.
--
commit 8b61d8766d5d3ccd96fcdcaf7ad4379e712b4282
Author: Claudio Imbrenda <imbrenda@linux.ibm.com>
KVM: s390: Fix unlikely NULL gmap dereference
When creating a new vCPU, kvm_vm_ioctl_create_vcpu() will call
kvm_arch_vcpu_postcreate() after the file descriptor for the new vCPU
has been created. The new file descriptor has not been returned yet,
but a malicious userspace program could try to guess it.
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 23c817595e28d..4a6d903e35231 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3604,10 +3604,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
> vcpu->arch.sie_block->epdx = vcpu->kvm->arch.epdx;
> preempt_enable();
> mutex_unlock(&vcpu->kvm->lock);
> - if (!kvm_is_ucontrol(vcpu->kvm)) {
> - vcpu->arch.gmap = vcpu->kvm->arch.gmap;
> + if (!kvm_is_ucontrol(vcpu->kvm))
> sca_add_vcpu(vcpu);
[Severity: High]
This isn't a bug introduced by this patch, but does leaving sca_add_vcpu() and
other sie_block initialization in kvm_arch_vcpu_postcreate() leave a race
condition open?
The commit message notes that malicious userspace can start the newly created
vCPU before kvm_arch_vcpu_postcreate() finishes. If kvm_vm_ioctl_create_vcpu()
drops vcpu->mutex and kvm->lock before calling kvm_arch_vcpu_postcreate(),
can KVM_RUN execute concurrently?
If KVM_RUN enters the guest before sca_add_vcpu() completes:
arch/s390/kvm/kvm-s390.c:sca_add_vcpu() {
...
vcpu->arch.sie_block->scaoh = sca_phys >> 32;
vcpu->arch.sie_block->scaol = sca_phys & ESCA_SCAOL_MASK;
vcpu->arch.sie_block->ecb2 |= ECB2_ESCA;
...
}
Will the hardware use physical address 0 as the System Control Area base and
overwrite host absolute physical address 0 when updating inter-processor
states?
> if (test_kvm_facility(vcpu->kvm, 74) || vcpu->kvm->arch.user_instr0 ||
> vcpu->kvm->arch.user_operexec)
> vcpu->arch.sie_block->ictl |= ICTL_OPEREXC;
[ ... ]
> @@ -3850,6 +3848,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> vcpu->arch.gmap = gmap_new_child(vcpu->kvm->arch.gmap, -1UL);
> if (!vcpu->arch.gmap)
> goto out_free_sie_block;
> + } else {
> + vcpu->arch.gmap = vcpu->kvm->arch.gmap;
> }
>
> VM_EVENT(vcpu->kvm, 3, "create cpu %d at 0x%p, sie block at 0x%p",
[Severity: High]
This is a pre-existing issue, but does the error handling path for ucontrol
vCPUs in kvm_arch_vcpu_create() safely remove the gmap child?
If kvm_s390_vcpu_setup() fails, the out_ucontrol_uninit path calls
gmap_remove_child(vcpu->arch.gmap). Looking at gmap_remove_child() in
arch/s390/kvm/gmap.c, it modifies child->list and has a lockdep assertion
for child->parent->children_lock:
arch/s390/kvm/gmap.c:gmap_remove_child() {
lockdep_assert_held(&child->parent->children_lock);
list_del(&child->list);
...
}
Since kvm_arch_vcpu_create() does not acquire children_lock before jumping to
out_ucontrol_uninit, could this cause list corruption if concurrent creation
failures race?
[Severity: High]
This is also a pre-existing issue, but is there a risk of clearing a valid
SCA entry if userspace concurrently creates two vCPUs with the same vcpu_id?
Duplicate ID detection occurs in kvm_vm_ioctl_create_vcpu() after
kvm_arch_vcpu_create() succeeds. If the ID check fails, it eventually calls
kvm_arch_vcpu_destroy(), which unconditionally calls sca_del_vcpu():
arch/s390/kvm/kvm-s390.c:sca_del_vcpu() {
...
sca->cpu[vcpu->vcpu_id].sda = 0;
...
}
If a valid vCPU already exists with this vcpu_id, does the failed creation
thread wipe out the valid vCPU's SCA interrupt delivery structures?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260702152406.204782-1-imbrenda@linux.ibm.com?part=2
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 6/8] KVM: s390: cmma: Fix dirty tracking when removing memslot
2026-07-02 15:24 ` [PATCH v1 6/8] KVM: s390: cmma: Fix dirty tracking when removing memslot Claudio Imbrenda
@ 2026-07-02 15:48 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-07-02 15:48 UTC (permalink / raw)
To: Claudio Imbrenda
Cc: linux-s390, kvm, Vasily Gorbik, Heiko Carstens, Alexander Gordeev
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Double decrement and underflow of cmma_dirty_pages counter when recreating unaligned memslots.
--
commit cfdf82bf92088bc4e4a719ccbb0ad71b6708ed76
Author: Claudio Imbrenda <imbrenda@linux.ibm.com>
KVM: s390: cmma: Fix dirty tracking when removing memslot
This commit fixes the dirty tracking counter when a memslot is removed.
It introduces a dat_walk_ops callback to appropriately decrement the
cmma-dirty counter for any cmma-dirty pages being removed.
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 512c81eee0684..2ac3a9ac46988 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
[ ... ]
> @@ -5792,11 +5792,19 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
> return 0;
> }
>
> +static long cmma_d_count_pte(union pte *ptep, gfn_t gfn, gfn_t next, struct dat_walk *walk)
> +{
> + if (pgste_of(ptep)->cmma_d)
> + atomic64_dec(walk->priv);
[Severity: High]
Does this leave the cmma_d bit set in the PGSTE, which could lead to a double
decrement later?
If a memslot that isn't fully 1MB-aligned is removed or moved,
dat_delete_slot() replaces the PTEs with hole tokens but preserves the PGSTEs.
When userspace creates a new memslot over the same GFN range, dat_create_slot()
replaces the hole tokens with empty PTEs, leaving the stale cmma_d == 1 bits
intact.
During migration, could dat_get_cmma() visit these newly recreated empty PTEs,
see cmma_d == 1, clear the bit, and decrement the counter a second time?
This would cause the cmma_dirty_pages counter to underflow. Since this value is
returned to userspace as an unsigned 64-bit integer, an underflow could cause
QEMU's live migration loop to never converge.
Should we clear the cmma_d bit here after decrementing the counter?
> + return 0;
> +}
> +
> void kvm_arch_commit_memory_region(struct kvm *kvm,
> struct kvm_memory_slot *old,
> const struct kvm_memory_slot *new,
> enum kvm_mr_change change)
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260702152406.204782-1-imbrenda@linux.ibm.com?part=6
^ permalink raw reply [flat|nested] 15+ messages in thread
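The question raised in the review above (should the cmma_d bit be cleared when the counter is decremented?) comes down to making the accounting idempotent. A minimal userspace sketch of that idea follows; `toy_pgste` and `toy_account_removed()` are hypothetical names, the bit is a plain bool, and the caller is assumed to serialize access to it (in the patch the walk runs under mmu_lock held for writing). It illustrates the reviewer's suggestion, not the code in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PGSTE carrying a cmma-dirty bit. */
struct toy_pgste {
	bool cmma_d;
};

/*
 * Decrement the dirty counter at most once per dirty entry: the bit is
 * cleared together with the decrement, so a later pass over the same
 * entry finds it clean and leaves the counter alone.
 */
static void toy_account_removed(struct toy_pgste *pgste,
				atomic_llong *dirty_pages)
{
	assert(pgste && dirty_pages);

	if (pgste->cmma_d) {
		pgste->cmma_d = false;
		atomic_fetch_sub(dirty_pages, 1);
	}
}
```

Calling the helper twice on the same entry decrements the counter once, which avoids the underflow scenario described in the review.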
* Re: [PATCH v1 3/8] KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC
2026-07-02 15:24 ` [PATCH v1 3/8] KVM: s390: Fix unlikely race with KVM_CAP_S390_USER_OPEREXEC Claudio Imbrenda
2026-07-02 15:42 ` sashiko-bot
@ 2026-07-02 17:08 ` Eric Farman
1 sibling, 0 replies; 15+ messages in thread
From: Eric Farman @ 2026-07-02 17:08 UTC (permalink / raw)
To: Claudio Imbrenda, linux-kernel
Cc: kvm, linux-s390, borntraeger, frankja, david, seiden, nrb,
schlameuss, gra
On Thu, 2026-07-02 at 17:24 +0200, Claudio Imbrenda wrote:
> Due to a missing barrier, it was theoretically possible that a vCPU
> being created concurrently with enabling KVM_CAP_S390_USER_OPEREXEC
> would end up with the capability not enabled.
>
> Fix by using WRITE_ONCE(), which is enough on s390.
>
> Fixes: 8e8678e740ec ("KVM: s390: Add capability that forwards operation exceptions")
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
> arch/s390/kvm/kvm-s390.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Thanks, Claudio. FWIW,
Reviewed-by: Eric Farman <farman@linux.ibm.com>
>
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 4a6d903e3523..784f7d9c79c7 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -955,7 +955,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> break;
> case KVM_CAP_S390_USER_OPEREXEC:
> VM_EVENT(kvm, 3, "%s", "ENABLE: CAP_S390_USER_OPEREXEC");
> - kvm->arch.user_operexec = 1;
> + WRITE_ONCE(kvm->arch.user_operexec, 1);
> icpt_operexc_on_all_vcpus(kvm);
> r = 0;
> break;
^ permalink raw reply [flat|nested] 15+ messages in thread