All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Wang Yechao" <wang.yechao255@zte.com.cn>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v3 2/5] RISC-V: KVM: Split huge pages when dirty logging is enabled
Date: Tue, 23 Jun 2026 12:33:42 +0000	[thread overview]
Message-ID: <20260623123343.2912F1F00A3A@smtp.kernel.org> (raw)
In-Reply-To: <20260623121332.78514-3-wang.yechao255@zte.com.cn>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Dropping `mmu_lock` to top up the per-VM `pgd_split_page_cache` introduces a race with `mmu_notifier_release()`, leading to a Use-After-Free.
- [High] Missing smp_wmb() before linking a newly populated page table to the parent PTE allows hardware walkers to observe uninitialized memory.
--

commit 4aa84b3974bf84dfd6f8f8c1dd2e7b3b61d34417
Author: Wang Yechao <wang.yechao255@zte.com.cn>
Subject: RISC-V: KVM: Split huge pages when dirty logging is enabled

Split huge pages eagerly when enabling dirty logging. The goal is to
avoid doing it while faulting on write-protected pages, which
negatively impacts guest performance.

> diff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c
> index c4c3b79567f10..1e980b4a08dc5 100644
> --- a/arch/riscv/kvm/gstage.c
> +++ b/arch/riscv/kvm/gstage.c

[ ... ]

> @@ -337,7 +338,7 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gstage,
>  
>  		next_ptep = kvm_mmu_memory_cache_alloc(pcache);
>  		if (!next_ptep)
> -			return -ENOMEM;
> +			return need_flush;
>  
>  		for (i = 0; i < PTRS_PER_PTE; i++) {
>  			child_pte = make_child_pte(huge_pte, i, child_page_size);
> @@ -349,11 +350,13 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gstage,
>  
>  		if (flush)
>  			gstage_tlb_flush(gstage, current_level, addr);

[Severity: High]
Can hardware page table walkers observe an uninitialized or empty page
directory here?

Because RISC-V utilizes a weakly ordered memory model (RVWMO), the CPU
might reorder the stores to the next_ptep array after the store that links
next_ptep into the parent ptep.

Should we add an smp_wmb() before updating the parent PTE to ensure the
child PTEs are fully visible first?

> +		else
> +			need_flush = true;
>  
>  		current_level--;
>  	}

[ ... ]

> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index d69fba7d8e9a6..479f7e9bb8f81 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c

[ ... ]

> @@ -97,6 +97,60 @@ void kvm_riscv_mmu_iounmap(struct kvm *kvm, gpa_t gpa, unsigned long size)
>  					    size >> PAGE_SHIFT);
>  }
>  
> +static bool need_topup_split_caches_or_resched(struct kvm_mmu_memory_cache *cache,
> +					       int count)
> +{
> +	if (need_resched())
> +		return true;
> +
> +	return kvm_mmu_memory_cache_nr_free_objects(cache) < count;
> +}
> +
> +static bool mmu_split_huge_pages(struct kvm_gstage *gstage,
> +				 phys_addr_t start, phys_addr_t end)
> +{
> +	struct kvm *kvm = gstage->kvm;
> +	struct kvm_mmu_memory_cache *pcache = &kvm->arch.pgd_split_page_cache;
> +	phys_addr_t addr = ALIGN(start, PMD_SIZE);
> +	phys_addr_t last_flush_gfn = addr >> PAGE_SHIFT;
> +	int count = gstage->pgd_levels;
> +	bool flush = false;
> +	int ret;
> +
> +	lockdep_assert_held_write(&kvm->mmu_lock);
> +
> +	while (addr < end) {
> +		if (need_topup_split_caches_or_resched(pcache, count)) {
> +			if (flush) {
> +				kvm_flush_remote_tlbs_range(kvm, last_flush_gfn,
> +					  (addr >> PAGE_SHIFT) - last_flush_gfn);
> +				last_flush_gfn = addr >> PAGE_SHIFT;
> +				flush = false;
> +			}
> +
> +			write_unlock(&kvm->mmu_lock);
> +			cond_resched();
> +
> +			ret = kvm_mmu_topup_memory_cache(pcache, count);

[Severity: Critical]
Does dropping mmu_lock here introduce a race with VM teardown?

If the VMM process terminates concurrently, mmu_notifier_release() could
trigger kvm_riscv_mmu_free_pgd(), which explicitly frees
kvm->arch.pgd_split_page_cache without taking mmu_lock.

Since we call kvm_mmu_topup_memory_cache() on this shared per-VM cache
without holding mmu_lock, could this lead to a use-after-free or memory
corruption on the cache objects array?

> +			if (ret) {
> +				kvm_err("Failed to toup split page cache\n");
> +				write_lock(&kvm->mmu_lock);
> +				return flush;
> +			}
> +			write_lock(&kvm->mmu_lock);
> +		}
> +

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260623121332.78514-1-wang.yechao255@zte.com.cn?part=2

  reply	other threads:[~2026-06-23 12:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23 12:13 [PATCH v3 0/5] Implement Eager Page Splitting for RISC-V Wang Yechao
2026-06-23 12:13 ` [PATCH v3 1/5] RISC-V: KVM: Add the split page cache for ioctl context Wang Yechao
2026-06-23 12:34   ` sashiko-bot
2026-06-23 12:13 ` [PATCH v3 2/5] RISC-V: KVM: Split huge pages when dirty logging is enabled Wang Yechao
2026-06-23 12:33   ` sashiko-bot [this message]
2026-06-23 12:13 ` [PATCH v3 3/5] RISC-V: KVM: Remove redundant TLB flush operations Wang Yechao
2026-06-23 12:13 ` [PATCH v3 4/5] RISC-V: KVM: Split huge pages during KVM_CLEAR_DIRTY_LOG Wang Yechao
2026-06-23 12:33   ` sashiko-bot
2026-06-23 12:13 ` [PATCH v3 5/5] RISC-V: KVM: Add the eager_page_split module parameter Wang Yechao
2026-06-23 12:31   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623123343.2912F1F00A3A@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=wang.yechao255@zte.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.