Re: [PATCH v2 1/2] KVM: riscv: Check hugetlb block mappings against memslot bounds

Kernel KVM virtualization development
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Jinyu Tang" <tjytimi@163.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v2 1/2] KVM: riscv: Check hugetlb block mappings against memslot bounds
Date: Thu, 04 Jun 2026 14:52:49 +0000	[thread overview]
Message-ID: <20260604145249.81EDC1F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260604142602.3582602-2-tjytimi@163.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

New issues:
- [Low] Duplicate case value in `hugetlb_mapping_size` switch statement breaks compilation on 32-bit RISC-V.

Pre-existing issues:
- [Critical] Missing dirty page tracking for writes to huge pages when KVM dirty ring is enabled.
- [High] Infinite page fault loop during THP stage-2 mapping update when a 4KB mapping already exists.
--

commit 1677f7a5702904834be9a413dad018d5713ace8a
Author: Jinyu Tang <tjytimi@163.com>

KVM: riscv: Check hugetlb block mappings against memslot bounds

This commit fixes an issue where KVM on RISC-V could install hugetlb block
mappings that exceed memslot boundaries or map the wrong host pages. It
introduces a helper to check mapping alignments and sizes, falling back to
smaller page sizes when necessary. It also prevents hugetlb mappings from
being incorrectly promoted back to huge pages by the THP path.

> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index 2d3def024270c..0adf017a2fa4a 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
[ ... ]
> @@ -404,7 +406,7 @@ static unsigned long transparent_hugepage_adjust(struct kvm *kvm,
>  	 * sure that the HVA and GPA are sufficiently aligned and that the
>  	 * block map is contained within the memslot.
>  	 */
> -	if (fault_supports_gstage_huge_mapping(memslot, hva)) {
> +	if (fault_supports_gstage_huge_mapping(memslot, hva, PMD_SIZE)) {
>  		int sz;
>  
>  		sz = get_hva_mapping_size(kvm, hva);

[Severity: High]
This is a pre-existing issue, but does aligning the faulting gpa down to the
PMD boundary cause an infinite page fault loop if a 4KB mapping already
exists?

In transparent_hugepage_adjust(), the gpa is aligned down further along in
this block:
    *gpa &= PMD_MASK;

When this aligned gpa is passed to kvm_riscv_gstage_map_page(), the fast
path matches the aligned address:

arch/riscv/kvm/gstage.c:kvm_riscv_gstage_map_page() {
    ...
    } else if (ALIGN_DOWN(PFN_PHYS(pte_pfn(ptep_get(ptep))), page_size)
               == hpa) {
        kvm_riscv_gstage_update_pte_prot(gstage, ptep_level, gpa,
                                         ptep, prot);
        return 0;
    }

This updates the protection of the 0th 4KB PTE in the 2MB block instead of
the actual faulting 4KB PTE at the unaligned offset. Does this leave the
faulting PTE unmodified, causing the guest to infinitely retry and hang the
vCPU?

[ ... ]
> @@ -421,12 +423,33 @@ static unsigned long transparent_hugepage_adjust(struct kvm *kvm,
>  	return PAGE_SIZE;
>  }
>  
> +static unsigned long hugetlb_mapping_size(struct kvm_memory_slot *memslot,
> +					  unsigned long hva,
> +					  unsigned long map_size)
> +{
> +	switch (map_size) {
> +	case PUD_SIZE:
> +		if (fault_supports_gstage_huge_mapping(memslot, hva, PUD_SIZE))
> +			return PUD_SIZE;
> +		fallthrough;
> +	case PMD_SIZE:

[Severity: Low]
Will this switch statement break the build on 32-bit RISC-V?

On 32-bit RISC-V, PMD is folded into PUD, which makes PUD_SIZE exactly equal
to PMD_SIZE (both are 4MB). Since C requires all case values in a switch
block to be unique constants, evaluating both macros without
#ifndef __PAGETABLE_PMD_FOLDED guards will trigger a duplicate case value
compiler error.

> +		if (fault_supports_gstage_huge_mapping(memslot, hva, PMD_SIZE))
> +			return PMD_SIZE;
> +		fallthrough;
> +	case PAGE_SIZE:
> +		return PAGE_SIZE;
> +	default:
> +		return map_size;
> +	}
> +}
[ ... ]
> @@ -511,8 +541,12 @@ int kvm_riscv_mmu_map(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
>  	if (mmu_invalidate_retry(kvm, mmu_seq))
>  		goto out_unlock;
>  
> -	/* Check if we are backed by a THP and thus use block mapping if possible */
> -	if (!logging && (vma_pagesize == PAGE_SIZE))
> +	/*
> +	 * Check if we are backed by a THP and thus use block mapping if
> +	 * possible. Hugetlb mappings already selected their target size above,
> +	 * so do not promote them through the THP helper.
> +	 */
> +	if (!logging && !is_hugetlb && vma_pagesize == PAGE_SIZE)
>  		vma_pagesize = transparent_hugepage_adjust(kvm, memslot, hva, &hfn, &gpa);
>  
>  	if (writable) {

[Severity: Critical]
This is a pre-existing issue, but does checking memslot->dirty_bitmap to
determine logging status miss dirty pages when the KVM dirty ring is enabled?

In kvm_riscv_mmu_map(), the logging flag is set like this:
    bool logging = (memslot->dirty_bitmap &&
                    !(memslot->flags & KVM_MEM_READONLY)) ? true : false;

If the dirty ring is used without a backup bitmap, memslot->dirty_bitmap
is NULL, making logging false. This allows transparent_hugepage_adjust()
to run and potentially install a writable huge page mapping.

If a fault occurs on a huge page, KVM maps the entire block as writable but
only logs the single faulting 4K gfn:

arch/riscv/kvm/mmu.c:kvm_riscv_mmu_map() {
    ...
    if (writable) {
        mark_page_dirty_in_slot(kvm, memslot, gfn);
        ret = kvm_riscv_gstage_map_page(..., vma_pagesize, ...);
    }
    ...
}

Would subsequent guest writes to any other pages in the 2MB block bypass the
dirty ring entirely since the hardware stage-2 PTE is already writable,
resulting in stale memory during live migration?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260604142602.3582602-1-tjytimi@163.com?part=1

next prev parent reply	other threads:[~2026-06-04 14:52 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-04 14:26 [PATCH v2 0/2] KVM: riscv: Fix hugetlb G-stage block mapping eligibility Jinyu Tang
2026-06-04 14:26 ` [PATCH v2 1/2] KVM: riscv: Check hugetlb block mappings against memslot bounds Jinyu Tang
2026-06-04 14:52   ` sashiko-bot [this message]
2026-06-04 14:26 ` [PATCH v2 2/2] KVM: selftests: Add a hugetlb memslot alignment test mode Jinyu Tang
2026-06-04 14:42   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260604145249.81EDC1F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=tjytimi@163.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox