Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Claudio Imbrenda" <imbrenda@linux.ibm.com>
Cc: "Alexander Gordeev" <agordeev@linux.ibm.com>,
	kvm@vger.kernel.org, "Heiko Carstens" <hca@linux.ibm.com>,
	"Vasily Gorbik" <gor@linux.ibm.com>,
	linux-s390@vger.kernel.org
Subject: Re: [PATCH v5 7/9] KVM: s390: Fix cmma dirty tracking
Date: Mon, 22 Jun 2026 16:27:11 +0000	[thread overview]
Message-ID: <20260622162711.F20A21F00A3A@smtp.kernel.org> (raw)
In-Reply-To: <20260622160739.202044-8-imbrenda@linux.ibm.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

New issues:
- [High] A stalled VCPU thread in `handle_essa()` racing with `kvm_s390_vm_start_migration()` double-counts `cmma_dirty_pages`, causing an infinite loop in userspace.

Pre-existing issues:
- [Critical] Uninitialized kernel memory from `vmalloc` is leaked to userspace via the CMMA state buffer.
- [High] CMMA state of newly mapped pages can be silently lost during the start of migration due to a TOCTOU race with `gmap_set_cmma_all_dirty` running while `migration_mode == 0`.
--

commit d55126b5a3d84f06909a4a559b0185411825d1f2
Author: Claudio Imbrenda <imbrenda@linux.ibm.com>

KVM: s390: Fix cmma dirty tracking

This commit updates migration mode initialization to accurately count pages
with allocated PTEs instead of counting all pages in memslots. It also fixes
dat_get_cmma() to wrap around when searching for cmma-dirty pages.

> diff --git a/arch/s390/kvm/dat.c b/arch/s390/kvm/dat.c
> index cffac7782c4bf..0ad4ebc80ebaf 100644
> --- a/arch/s390/kvm/dat.c
> +++ b/arch/s390/kvm/dat.c
> @@ -1253,6 +1253,9 @@ int dat_get_cmma(union asce asce, gfn_t *start, unsigned int *count, u8 *values,
>  	};
>  
>  	_dat_walk_gfn_range(*start, asce_end(asce), asce, &ops, DAT_WALK_IGN_HOLES, &state);
> +	/* If no dirty pages were found, wrap around and continue searching */
> +	if (*start && state.start == -1)
> +		_dat_walk_gfn_range(0, *start, asce, &ops, DAT_WALK_IGN_HOLES, &state);

[Severity: Critical]
This is a pre-existing issue, but does the values buffer leak uninitialized
kernel memory to userspace?

In kvm_s390_get_cmma_bits(), the values buffer is allocated via vmalloc()
without zeroing:

arch/s390/kvm/kvm-s390.c:kvm_s390_get_cmma_bits() {
    ...
    values = vmalloc(args->count);
    if (!values)
        return -ENOMEM;
    ...
}

When dat_get_cmma() iterates over guest memory, __dat_get_cmma_pte() skips
unmapped holes and non-dirty pages (cmma_d == 0). This leaves uninitialized
gaps in the state->values array between the first and last dirty pages in the
batch.

Since the entire array up to args->count is later copied to userspace via
copy_to_user(), could this expose sensitive heap data?

[ ... ]
> diff --git a/arch/s390/kvm/gmap.c b/arch/s390/kvm/gmap.c
> index e6e786811db86..a9a37fdeb809b 100644
> --- a/arch/s390/kvm/gmap.c
> +++ b/arch/s390/kvm/gmap.c
> @@ -1075,7 +1075,13 @@ int gmap_protect_rmap(struct kvm_s390_mmu_cache *mc, struct gmap *sg, gfn_t p_gf
>  
>  static long __set_cmma_dirty_pte(union pte *ptep, gfn_t gfn, gfn_t next, struct dat_walk *walk)
>  {
> -	__atomic64_or(PGSTE_CMMA_D_BIT, &pgste_of(ptep)->val);
> +	union pgste pgste;
> +
> +	pgste = pgste_get_lock(ptep);
> +	pgste.cmma_d = 1;
> +	pgste_set_unlock(ptep, pgste);
> +	atomic64_inc(walk->priv);

[Severity: High]
Can this cause double-counting of cmma_dirty_pages if a VCPU thread in
handle_essa() races with kvm_s390_vm_start_migration()?

If a VCPU locklessly reads migration_mode == 1 in handle_essa(), stalls, and
userspace stops then restarts migration:
kvm_s390_vm_start_migration() resets cmma_dirty_pages = 0 and starts
gmap_set_cmma_all_dirty().

The stalled VCPU thread could then resume, enter __do_essa(), set cmma_d = 1,
and increment cmma_dirty_pages. When __set_cmma_dirty_pte() reaches the same
page later, it lacks a check for !pgste.cmma_d, blindly setting cmma_d = 1
and incrementing cmma_dirty_pages again.

Would this over-counting cause userspace (like QEMU) to spin infinitely when
calling the KVM_GET_CMMA_BITS ioctl, as it decrements its remaining counter
expecting more pages than actually exist?

[ ... ]
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 93141a68e0dde..cdd8b41d24ed2 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
[ ... ]
>  	/* mark all the pages as dirty */
> +	atomic64_set(&kvm->arch.cmma_dirty_pages, 0);
>  	gmap_set_cmma_all_dirty(kvm->arch.gmap);
> -	atomic64_set(&kvm->arch.cmma_dirty_pages, ram_pages);
> -	kvm->arch.migration_mode = 1;
> +	WRITE_ONCE(kvm->arch.migration_mode, 1);

[Severity: High]
This isn't a bug introduced by this patch, but does kvm_s390_vm_start_migration()
have a race condition because gmap_set_cmma_all_dirty() runs entirely while
migration_mode == 0?

During gmap_set_cmma_all_dirty(), if a VCPU faults in a new page at an address
already scanned by the walker, the PTE is created with cmma_d = 0. Since
migration_mode is still 0, the VCPU can execute ESSA instructions which
hardware handles natively.

This would modify the CMMA state in the PGSTE but leave cmma_d = 0. When the
migration scan finishes and sets migration_mode = 1, does the newly dirtied
page remain untracked, causing guest state corruption across live migration?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260622160739.202044-1-imbrenda@linux.ibm.com?part=7

  reply	other threads:[~2026-06-22 16:27 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-22 16:07 [PATCH v5 0/9] KVM: s390: A bunch of gmap-related fixes Claudio Imbrenda
2026-06-22 16:07 ` [PATCH v5 1/9] s390/mm: Fix handling of _PAGE_UNUSED pte bit Claudio Imbrenda
2026-06-22 16:07 ` [PATCH v5 2/9] KVM: s390: Fix dat_peek_cmma() overflow Claudio Imbrenda
2026-06-22 16:18   ` Christian Borntraeger
2026-06-22 16:23   ` sashiko-bot
2026-06-22 16:07 ` [PATCH v5 3/9] KVM: s390: Do not set special large pages dirty Claudio Imbrenda
2026-06-22 16:27   ` sashiko-bot
2026-06-22 16:07 ` [PATCH v5 4/9] KVM: s390: Fix code typo in gmap_protect_asce_top_level() Claudio Imbrenda
2026-06-22 16:24   ` sashiko-bot
2026-06-22 16:07 ` [PATCH v5 5/9] KVM: s390: Fix handle_{sske,pfmf} under memory pressure Claudio Imbrenda
2026-06-22 16:07 ` [PATCH v5 6/9] KVM: s390: Fix locking in kvm_s390_set_mem_control() Claudio Imbrenda
2026-06-22 16:19   ` sashiko-bot
2026-06-22 16:07 ` [PATCH v5 7/9] KVM: s390: Fix cmma dirty tracking Claudio Imbrenda
2026-06-22 16:27   ` sashiko-bot [this message]
2026-06-22 16:07 ` [PATCH v5 8/9] KVM: s390: selftests: Fix cmma selftest Claudio Imbrenda
2026-06-22 16:07 ` [PATCH v5 9/9] KVM: s390: Return failure in case of failure in kvm_s390_set_cmma_bits() Claudio Imbrenda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260622162711.F20A21F00A3A@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=agordeev@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox