Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Wang Yechao" <wang.yechao255@zte.com.cn>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v3 2/5] RISC-V: KVM: Split huge pages when dirty logging is enabled
Date: Tue, 23 Jun 2026 12:33:42 +0000	[thread overview]
Message-ID: <20260623123343.2912F1F00A3A@smtp.kernel.org> (raw)
In-Reply-To: <20260623121332.78514-3-wang.yechao255@zte.com.cn>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Dropping `mmu_lock` to top up the per-VM `pgd_split_page_cache` introduces a race with `mmu_notifier_release()`, leading to a Use-After-Free.
- [High] Missing smp_wmb() before linking a newly populated page table to the parent PTE allows hardware walkers to observe uninitialized memory.
--

commit 4aa84b3974bf84dfd6f8f8c1dd2e7b3b61d34417
Author: Wang Yechao <wang.yechao255@zte.com.cn>
Subject: RISC-V: KVM: Split huge pages when dirty logging is enabled

Split huge pages eagerly when enabling dirty logging. The goal is to
avoid doing it while faulting on write-protected pages, which
negatively impacts guest performance.

> diff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c
> index c4c3b79567f10..1e980b4a08dc5 100644
> --- a/arch/riscv/kvm/gstage.c
> +++ b/arch/riscv/kvm/gstage.c

[ ... ]

> @@ -337,7 +338,7 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gstage,
>  
>  		next_ptep = kvm_mmu_memory_cache_alloc(pcache);
>  		if (!next_ptep)
> -			return -ENOMEM;
> +			return need_flush;
>  
>  		for (i = 0; i < PTRS_PER_PTE; i++) {
>  			child_pte = make_child_pte(huge_pte, i, child_page_size);
> @@ -349,11 +350,13 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gstage,
>  
>  		if (flush)
>  			gstage_tlb_flush(gstage, current_level, addr);

[Severity: High]
Can hardware page table walkers observe an uninitialized or empty page
directory here?

Because RISC-V utilizes a weakly ordered memory model (RVWMO), the CPU
might reorder the stores to the next_ptep array after the store that links
next_ptep into the parent ptep.

Should we add an smp_wmb() before updating the parent PTE to ensure the
child PTEs are fully visible first?

> +		else
> +			need_flush = true;
>  
>  		current_level--;
>  	}

[ ... ]

> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index d69fba7d8e9a6..479f7e9bb8f81 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c

[ ... ]

> @@ -97,6 +97,60 @@ void kvm_riscv_mmu_iounmap(struct kvm *kvm, gpa_t gpa, unsigned long size)
>  					    size >> PAGE_SHIFT);
>  }
>  
> +static bool need_topup_split_caches_or_resched(struct kvm_mmu_memory_cache *cache,
> +					       int count)
> +{
> +	if (need_resched())
> +		return true;
> +
> +	return kvm_mmu_memory_cache_nr_free_objects(cache) < count;
> +}
> +
> +static bool mmu_split_huge_pages(struct kvm_gstage *gstage,
> +				 phys_addr_t start, phys_addr_t end)
> +{
> +	struct kvm *kvm = gstage->kvm;
> +	struct kvm_mmu_memory_cache *pcache = &kvm->arch.pgd_split_page_cache;
> +	phys_addr_t addr = ALIGN(start, PMD_SIZE);
> +	phys_addr_t last_flush_gfn = addr >> PAGE_SHIFT;
> +	int count = gstage->pgd_levels;
> +	bool flush = false;
> +	int ret;
> +
> +	lockdep_assert_held_write(&kvm->mmu_lock);
> +
> +	while (addr < end) {
> +		if (need_topup_split_caches_or_resched(pcache, count)) {
> +			if (flush) {
> +				kvm_flush_remote_tlbs_range(kvm, last_flush_gfn,
> +					  (addr >> PAGE_SHIFT) - last_flush_gfn);
> +				last_flush_gfn = addr >> PAGE_SHIFT;
> +				flush = false;
> +			}
> +
> +			write_unlock(&kvm->mmu_lock);
> +			cond_resched();
> +
> +			ret = kvm_mmu_topup_memory_cache(pcache, count);

[Severity: Critical]
Does dropping mmu_lock here introduce a race with VM teardown?

If the VMM process terminates concurrently, mmu_notifier_release() could
trigger kvm_riscv_mmu_free_pgd(), which explicitly frees
kvm->arch.pgd_split_page_cache without taking mmu_lock.

Since we call kvm_mmu_topup_memory_cache() on this shared per-VM cache
without holding mmu_lock, could this lead to a use-after-free or memory
corruption on the cache objects array?

> +			if (ret) {
> +				kvm_err("Failed to toup split page cache\n");
> +				write_lock(&kvm->mmu_lock);
> +				return flush;
> +			}
> +			write_lock(&kvm->mmu_lock);
> +		}
> +

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260623121332.78514-1-wang.yechao255@zte.com.cn?part=2

  reply	other threads:[~2026-06-23 12:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23 12:13 [PATCH v3 0/5] Implement Eager Page Splitting for RISC-V Wang Yechao
2026-06-23 12:13 ` [PATCH v3 1/5] RISC-V: KVM: Add the split page cache for ioctl context Wang Yechao
2026-06-23 12:34   ` sashiko-bot
2026-06-23 12:13 ` [PATCH v3 2/5] RISC-V: KVM: Split huge pages when dirty logging is enabled Wang Yechao
2026-06-23 12:33   ` sashiko-bot [this message]
2026-06-23 12:13 ` [PATCH v3 3/5] RISC-V: KVM: Remove redundant TLB flush operations Wang Yechao
2026-06-23 12:13 ` [PATCH v3 4/5] RISC-V: KVM: Split huge pages during KVM_CLEAR_DIRTY_LOG Wang Yechao
2026-06-23 12:33   ` sashiko-bot
2026-06-23 12:13 ` [PATCH v3 5/5] RISC-V: KVM: Add the eager_page_split module parameter Wang Yechao
2026-06-23 12:31   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623123343.2912F1F00A3A@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=wang.yechao255@zte.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox