From: sashiko-bot@kernel.org
To: "Claudio Imbrenda" <imbrenda@linux.ibm.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v1 3/3] KVM: s390: Allow for 2G hugepages
Date: Wed, 03 Jun 2026 16:32:48 +0000 [thread overview]
Message-ID: <20260603163249.744341F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260603154758.234768-4-imbrenda@linux.ibm.com>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [Critical] Enabling 2G hugepages exposes a hardcoded `_SEGMENT_MASK` in `_crste_fc1`, injecting invalid entries into hardware page tables.
Pre-existing issues:
- [High] Concurrent page faults can insert 1M or 2G huge pages into a PV guest gmap while `KVM_PV_ENABLE` runs, because `gmap_split_huge_pages` and `gmap_link` both acquire `mmu_lock` in read mode without sufficient serialization.
--
commit 8d8980fda5d60990dd59ce7e857350fb4c8cedbd
Author: Claudio Imbrenda <imbrenda@linux.ibm.com>
KVM: s390: Allow for 2G hugepages
Change gmap_2g_allowed() to perform the necessary checks to allow for
2G hugepages to be used, instead of unconditionally returning false. The
GMAP_FLAG_ALLOW_HPAGE_2G flag is now taken into account when evaluating
whether a 2G hugepage can back the faulting address.
> diff --git a/arch/s390/kvm/gmap.c b/arch/s390/kvm/gmap.c
> index c9e348149ba12..c5a8132eaa22b 100644
> --- a/arch/s390/kvm/gmap.c
> +++ b/arch/s390/kvm/gmap.c
[ ... ]
> @@ -635,10 +635,27 @@ int gmap_try_fixup_minor(struct gmap *gmap, struct guest_fault *fault)
> return rc;
> }
>
> +/**
> + * gmap_2g_allowed() - Check whether a 2G hugepage is allowed.
> + * @gmap: The gmap of the guest.
> + * @f: Describes the fault that is being resolved.
> + * @slot: The memslot the faulting address belongs to.
> + *
> + * The function checks whether the GMAP_FLAG_ALLOW_HPAGE_2G flag is set for
> + * @gmap, whether the offset of the address in the 2G virtual frame is the
> + * same as the offset in the physical 2G frame, and finally whether the whole
> + * 2G page would fit in the given memslot.
> + *
> + * Return: true if a 2G hugepage is allowed to back the faulting address, false
> + * otherwise.
> + */
> static inline bool gmap_2g_allowed(struct gmap *gmap, struct guest_fault *f,
> struct kvm_memory_slot *slot)
> {
> - return false;
> + return test_bit(GMAP_FLAG_ALLOW_HPAGE_2G, &gmap->flags) &&
[Severity: Critical]
Enabling 2G hugepages here exposes a hardcoded _SEGMENT_MASK in _crste_fc1().
When 2G hugepages proceed to _gmap_link(), it calls _crste_fc1() with an
unaligned f->pfn and a Region-3 table type.
arch/s390/kvm/dat.h:_crste_fc1() {
union crste res = { .val = PFN_PHYS(pfn) & _SEGMENT_MASK };
...
}
The constructor hardcodes _SEGMENT_MASK, which fails to zero out bits 20-30
for Region-3 hugepages. For a Region-3 large page entry, the hardware requires
these bits to be strictly zero. Writing the unaligned 2MB-to-2GB offset into
these reserved bits causes the hardware MMU to trigger a Translation
Specification Exception when it walks the page table.
Should _crste_fc1() be updated to use _REGION3_MASK for Region-3 entries to
prevent guest crashes?
[Severity: High]
This isn't a bug introduced by this patch, but adding 2G hugepages to this
path exacerbates a preexisting race with PV initialization.
Concurrent page faults can insert huge pages into a PV guest gmap while
kvm_s390_pv_init_vm() runs. The initialization code clears the flags and calls
gmap_split_huge_pages():
arch/s390/kvm/pv.c:kvm_s390_pv_init_vm() {
...
clear_bit(GMAP_FLAG_ALLOW_HPAGE_1M, &kvm->arch.gmap->flags);
clear_bit(GMAP_FLAG_ALLOW_HPAGE_2G, &kvm->arch.gmap->flags);
gmap_split_huge_pages(kvm->arch.gmap);
...
}
However, gmap_split_huge_pages() only takes a read lock on kvm->mmu_lock.
If a vCPU thread evaluates test_bit() in gmap_2g_allowed() to true before
the flag is cleared, and is then preempted, the splitting function can
walk past the faulting GFN.
When the vCPU thread resumes, it can atomically insert the 2G huge page
into the gmap via _gmap_link(), as that also only holds mmu_lock in read
mode.
Does this race allow a PV guest to map a hugepage and break memory
isolation invariants?
> + !((f->gfn ^ f->pfn) & ~_REGION3_FR_MASK) &&
> + slot->base_gfn <= ALIGN_DOWN(f->gfn, _PAGES_PER_REGION3) &&
> + slot->base_gfn + slot->npages >= ALIGN(f->gfn + 1, _PAGES_PER_REGION3);
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603154758.234768-1-imbrenda@linux.ibm.com?part=3
next prev parent reply other threads:[~2026-06-03 16:32 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-03 15:47 [PATCH v1 0/3] KVM: s390: Add support for 2G hugepages Claudio Imbrenda
2026-06-03 15:47 ` [PATCH v1 1/3] KVM: s390: Add module parameter to fence " Claudio Imbrenda
2026-06-08 12:50 ` Steffen Eiden
2026-06-03 15:47 ` [PATCH v1 2/3] KVM: s390: Add capability to support " Claudio Imbrenda
2026-06-03 16:12 ` sashiko-bot
2026-06-08 12:51 ` Steffen Eiden
2026-06-03 15:47 ` [PATCH v1 3/3] KVM: s390: Allow for " Claudio Imbrenda
2026-06-03 16:32 ` sashiko-bot [this message]
2026-06-08 12:53 ` Steffen Eiden
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260603163249.744341F00893@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=imbrenda@linux.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.