* [PATCH] KVM: arm64: Reallocate the nested_mmus array under the mmu_lock
@ 2026-06-04 18:30 Hyunwoo Kim
  2026-06-04 22:27 ` Oliver Upton
  0 siblings, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-06-04 18:30 UTC (permalink / raw)
  To: maz, oupton, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, christoffer.dall
  Cc: linux-arm-kernel, kvmarm, imv4bel

Code that walks kvm->arch.nested_mmus[] holds kvm->mmu_lock. By contrast,
kvm_vcpu_init_nested() reallocates the array and frees the old buffer while
holding only kvm->arch.config_lock, so a walker can reference the freed
array.

Allocate the new array outside the lock, as the allocation can sleep, and
do only the copy and the pointer swap under the mmu_lock. After the swap no
walker can reach the old buffer, so free it once the lock has been
released.

Fixes: 4f128f8e1aaac ("KVM: arm64: nv: Support multiple nested Stage-2 mmu structures")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
 arch/arm64/kvm/nested.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 38f672e940878..6f7bc9a9992e0 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -89,21 +89,28 @@ int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu)
 	 * again, and there is no reason to affect the whole VM for this.
 	 */
 	num_mmus = atomic_read(&kvm->online_vcpus) * S2_MMU_PER_VCPU;
-	tmp = kvrealloc(kvm->arch.nested_mmus,
-			size_mul(sizeof(*kvm->arch.nested_mmus), num_mmus),
-			GFP_KERNEL_ACCOUNT | __GFP_ZERO);
-	if (!tmp)
-		return -ENOMEM;
 
-	swap(kvm->arch.nested_mmus, tmp);
+	if (num_mmus > kvm->arch.nested_mmus_size) {
+		tmp = kvcalloc(num_mmus, sizeof(*tmp), GFP_KERNEL_ACCOUNT);
+		if (!tmp)
+			return -ENOMEM;
 
-	/*
-	 * If we went through a realocation, adjust the MMU back-pointers in
-	 * the previously initialised kvm_pgtable structures.
-	 */
-	if (kvm->arch.nested_mmus != tmp)
-		for (int i = 0; i < kvm->arch.nested_mmus_size; i++)
-			kvm->arch.nested_mmus[i].pgt->mmu = &kvm->arch.nested_mmus[i];
+		write_lock(&kvm->mmu_lock);
+
+		if (kvm->arch.nested_mmus_size) {
+			memcpy(tmp, kvm->arch.nested_mmus,
+			       size_mul(sizeof(*tmp), kvm->arch.nested_mmus_size));
+
+			for (int i = 0; i < kvm->arch.nested_mmus_size; i++)
+				tmp[i].pgt->mmu = &tmp[i];
+		}
+
+		swap(kvm->arch.nested_mmus, tmp);
+
+		write_unlock(&kvm->mmu_lock);
+
+		kvfree(tmp);
+	}
 
 	for (int i = kvm->arch.nested_mmus_size; !ret && i < num_mmus; i++)
 		ret = init_nested_s2_mmu(kvm, &kvm->arch.nested_mmus[i]);
-- 
2.43.0




* Re: [PATCH] KVM: arm64: Reallocate the nested_mmus array under the mmu_lock
  2026-06-04 18:30 [PATCH] KVM: arm64: Reallocate the nested_mmus array under the mmu_lock Hyunwoo Kim
@ 2026-06-04 22:27 ` Oliver Upton
  2026-06-04 22:58   ` Sean Christopherson
  2026-06-05  5:35   ` Hyunwoo Kim
  0 siblings, 2 replies; 5+ messages in thread
From: Oliver Upton @ 2026-06-04 22:27 UTC (permalink / raw)
  To: Hyunwoo Kim
  Cc: maz, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, christoffer.dall, linux-arm-kernel, kvmarm

Hi,

The shortlog is very confusing, since "allocate behind $LOCK" is usually
something alarming. Maybe instead:

  KVM: arm64: Reassign nested_mmus array behind mmu_lock

On Fri, Jun 05, 2026 at 03:30:00AM +0900, Hyunwoo Kim wrote:
> Code that walks kvm->arch.nested_mmus[] holds kvm->mmu_lock. By contrast,
> kvm_vcpu_init_nested() reallocates the array and frees the old buffer while
> holding only kvm->arch.config_lock, so a walker can reference the freed
> array.

It wouldn't hurt to share slightly more information here. Are you
dealing with a concurrent MMU notifier?

> Allocate the new array outside the lock, as the allocation can sleep, and
> do only the copy and the pointer swap under the mmu_lock. After the swap no
> walker can reach the old buffer, so free it once the lock has been
> released.
> 
> Fixes: 4f128f8e1aaac ("KVM: arm64: nv: Support multiple nested Stage-2 mmu structures")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>

The diff itself LGTM

Reviewed-by: Oliver Upton <oupton@kernel.org>

Thanks,
Oliver

> ---
>  arch/arm64/kvm/nested.c | 33 ++++++++++++++++++++-------------
>  1 file changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 38f672e940878..6f7bc9a9992e0 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -89,21 +89,28 @@ int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu)
>  	 * again, and there is no reason to affect the whole VM for this.
>  	 */
>  	num_mmus = atomic_read(&kvm->online_vcpus) * S2_MMU_PER_VCPU;
> -	tmp = kvrealloc(kvm->arch.nested_mmus,
> -			size_mul(sizeof(*kvm->arch.nested_mmus), num_mmus),
> -			GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> -	if (!tmp)
> -		return -ENOMEM;
>  
> -	swap(kvm->arch.nested_mmus, tmp);
> +	if (num_mmus > kvm->arch.nested_mmus_size) {
> +		tmp = kvcalloc(num_mmus, sizeof(*tmp), GFP_KERNEL_ACCOUNT);
> +		if (!tmp)
> +			return -ENOMEM;
>  
> -	/*
> -	 * If we went through a realocation, adjust the MMU back-pointers in
> -	 * the previously initialised kvm_pgtable structures.
> -	 */
> -	if (kvm->arch.nested_mmus != tmp)
> -		for (int i = 0; i < kvm->arch.nested_mmus_size; i++)
> -			kvm->arch.nested_mmus[i].pgt->mmu = &kvm->arch.nested_mmus[i];
> +		write_lock(&kvm->mmu_lock);
> +
> +		if (kvm->arch.nested_mmus_size) {
> +			memcpy(tmp, kvm->arch.nested_mmus,
> +			       size_mul(sizeof(*tmp), kvm->arch.nested_mmus_size));
> +
> +			for (int i = 0; i < kvm->arch.nested_mmus_size; i++)
> +				tmp[i].pgt->mmu = &tmp[i];
> +		}
> +
> +		swap(kvm->arch.nested_mmus, tmp);
> +
> +		write_unlock(&kvm->mmu_lock);
> +
> +		kvfree(tmp);
> +	}
>  
>  	for (int i = kvm->arch.nested_mmus_size; !ret && i < num_mmus; i++)
>  		ret = init_nested_s2_mmu(kvm, &kvm->arch.nested_mmus[i]);
> -- 
> 2.43.0
> 



* Re: [PATCH] KVM: arm64: Reallocate the nested_mmus array under the mmu_lock
  2026-06-04 22:27 ` Oliver Upton
@ 2026-06-04 22:58   ` Sean Christopherson
  2026-06-05  5:35   ` Hyunwoo Kim
  1 sibling, 0 replies; 5+ messages in thread
From: Sean Christopherson @ 2026-06-04 22:58 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Hyunwoo Kim, maz, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, christoffer.dall, linux-arm-kernel, kvmarm

On Thu, Jun 04, 2026, Oliver Upton wrote:
> The shortlog is very confusing, since "allocate behind $LOCK" is usually
> something alarming. Maybe instead:
> 
>   KVM: arm64: Reassign nested_mmus array behind mmu_lock

+1 from the peanut gallery.  After reading the shortlog, I was about to grab my
pitchfork :-)



* Re: [PATCH] KVM: arm64: Reallocate the nested_mmus array under the mmu_lock
  2026-06-04 22:27 ` Oliver Upton
  2026-06-04 22:58   ` Sean Christopherson
@ 2026-06-05  5:35   ` Hyunwoo Kim
  2026-06-05  7:51     ` Marc Zyngier
  1 sibling, 1 reply; 5+ messages in thread
From: Hyunwoo Kim @ 2026-06-05  5:35 UTC (permalink / raw)
  To: Oliver Upton
  Cc: maz, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, christoffer.dall, linux-arm-kernel, kvmarm,
	imv4bel

On Thu, Jun 04, 2026 at 03:27:16PM -0700, Oliver Upton wrote:
> Hi,
> 
> The shortlog is very confusing, since "allocate behind $LOCK" is usually
> something alarming. Maybe instead:
> 
>   KVM: arm64: Reassign nested_mmus array behind mmu_lock

heh, that's confusing indeed. I'll change it that way.

> 
> On Fri, Jun 05, 2026 at 03:30:00AM +0900, Hyunwoo Kim wrote:
> > Code that walks kvm->arch.nested_mmus[] holds kvm->mmu_lock. By contrast,
> > kvm_vcpu_init_nested() reallocates the array and frees the old buffer while
> > holding only kvm->arch.config_lock, so a walker can reference the freed
> > array.
> 
> It wouldn't hurt to share slightly more information here. Are you
> dealing with a concurrent MMU notifier?

Yes. The MMU notifier path also walks nested_mmus[] under mmu_lock.
kvm_vcpu_init_nested() holds only config_lock, so if a notifier fires
during vCPU init, it races with the array realloc and free.

Here's the reworked changelog. Should I send v2?

  kvm->arch.nested_mmus[] is walked under kvm->mmu_lock, including from the
  MMU notifier path (kvm_unmap_gfn_range() -> kvm_nested_s2_unmap()), which
  can run at any time. kvm_vcpu_init_nested() reallocates the array and frees
  the old buffer while holding only kvm->arch.config_lock, so such a walker
  can reference the freed array.

  Allocate the new array outside of mmu_lock, as the allocation can sleep.
  Under the lock, copy the existing entries, fix up the back pointers and
  reassign the array. Free the old buffer after dropping the lock, as
  kvfree() can sleep as well.

> 
> > Allocate the new array outside the lock, as the allocation can sleep, and
> > do only the copy and the pointer swap under the mmu_lock. After the swap no
> > walker can reach the old buffer, so free it once the lock has been
> > released.
> > 
> > Fixes: 4f128f8e1aaac ("KVM: arm64: nv: Support multiple nested Stage-2 mmu structures")
> > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> 
> The diff itself LGTM
> 
> Reviewed-by: Oliver Upton <oupton@kernel.org>

Thanks for the review.

> 
> Thanks,
> Oliver
> 
> > ---
> >  arch/arm64/kvm/nested.c | 33 ++++++++++++++++++++-------------
> >  1 file changed, 20 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> > index 38f672e940878..6f7bc9a9992e0 100644
> > --- a/arch/arm64/kvm/nested.c
> > +++ b/arch/arm64/kvm/nested.c
> > @@ -89,21 +89,28 @@ int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu)
> >  	 * again, and there is no reason to affect the whole VM for this.
> >  	 */
> >  	num_mmus = atomic_read(&kvm->online_vcpus) * S2_MMU_PER_VCPU;
> > -	tmp = kvrealloc(kvm->arch.nested_mmus,
> > -			size_mul(sizeof(*kvm->arch.nested_mmus), num_mmus),
> > -			GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> > -	if (!tmp)
> > -		return -ENOMEM;
> >  
> > -	swap(kvm->arch.nested_mmus, tmp);
> > +	if (num_mmus > kvm->arch.nested_mmus_size) {
> > +		tmp = kvcalloc(num_mmus, sizeof(*tmp), GFP_KERNEL_ACCOUNT);
> > +		if (!tmp)
> > +			return -ENOMEM;
> >  
> > -	/*
> > -	 * If we went through a realocation, adjust the MMU back-pointers in
> > -	 * the previously initialised kvm_pgtable structures.
> > -	 */
> > -	if (kvm->arch.nested_mmus != tmp)
> > -		for (int i = 0; i < kvm->arch.nested_mmus_size; i++)
> > -			kvm->arch.nested_mmus[i].pgt->mmu = &kvm->arch.nested_mmus[i];
> > +		write_lock(&kvm->mmu_lock);
> > +
> > +		if (kvm->arch.nested_mmus_size) {
> > +			memcpy(tmp, kvm->arch.nested_mmus,
> > +			       size_mul(sizeof(*tmp), kvm->arch.nested_mmus_size));
> > +
> > +			for (int i = 0; i < kvm->arch.nested_mmus_size; i++)
> > +				tmp[i].pgt->mmu = &tmp[i];
> > +		}
> > +
> > +		swap(kvm->arch.nested_mmus, tmp);
> > +
> > +		write_unlock(&kvm->mmu_lock);
> > +
> > +		kvfree(tmp);
> > +	}
> >  
> >  	for (int i = kvm->arch.nested_mmus_size; !ret && i < num_mmus; i++)
> >  		ret = init_nested_s2_mmu(kvm, &kvm->arch.nested_mmus[i]);
> > -- 
> > 2.43.0
> > 


Best regards,
Hyunwoo Kim


* Re: [PATCH] KVM: arm64: Reallocate the nested_mmus array under the mmu_lock
  2026-06-05  5:35   ` Hyunwoo Kim
@ 2026-06-05  7:51     ` Marc Zyngier
  0 siblings, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2026-06-05  7:51 UTC (permalink / raw)
  To: Hyunwoo Kim
  Cc: Oliver Upton, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, christoffer.dall, linux-arm-kernel, kvmarm

On Fri, 05 Jun 2026 06:35:20 +0100,
Hyunwoo Kim <imv4bel@gmail.com> wrote:
> 
> On Thu, Jun 04, 2026 at 03:27:16PM -0700, Oliver Upton wrote:
> > Hi,
> > 
> > The shortlog is very confusing, since "allocate behind $LOCK" is usually
> > something alarming. Maybe instead:
> > 
> >   KVM: arm64: Reassign nested_mmus array behind mmu_lock
> 
> heh, that's confusing indeed. I'll change it that way.
> 
> > 
> > On Fri, Jun 05, 2026 at 03:30:00AM +0900, Hyunwoo Kim wrote:
> > > Code that walks kvm->arch.nested_mmus[] holds kvm->mmu_lock. By contrast,
> > > kvm_vcpu_init_nested() reallocates the array and frees the old buffer while
> > > holding only kvm->arch.config_lock, so a walker can reference the freed
> > > array.
> > 
> > It wouldn't hurt to share slightly more information here. Are you
> > dealing with a concurrent MMU notifier?
> 
> Yes. The MMU notifier path also walks nested_mmus[] under mmu_lock.
> kvm_vcpu_init_nested() holds only config_lock, so if a notifier fires
> during vCPU init, it races with the array realloc and free.
> 
> Here's the reworked changelog. Should I send v2?
> 
>   kvm->arch.nested_mmus[] is walked under kvm->mmu_lock, including from the
>   MMU notifier path (kvm_unmap_gfn_range() -> kvm_nested_s2_unmap()), which
>   can run at any time. kvm_vcpu_init_nested() reallocates the array and frees
>   the old buffer while holding only kvm->arch.config_lock, so such a walker
>   can reference the freed array.
> 
>   Allocate the new array outside of mmu_lock, as the allocation can sleep.
>   Under the lock, copy the existing entries, fix up the back pointers and
>   reassign the array. Free the old buffer after dropping the lock, as
>   kvfree() can sleep as well.

That's significantly better. Please send a v2 with this.

Thanks,

	M.

-- 
Jazz isn't dead. It just smells funny.


Thread overview: 5+ messages
2026-06-04 18:30 [PATCH] KVM: arm64: Reallocate the nested_mmus array under the mmu_lock Hyunwoo Kim
2026-06-04 22:27 ` Oliver Upton
2026-06-04 22:58   ` Sean Christopherson
2026-06-05  5:35   ` Hyunwoo Kim
2026-06-05  7:51     ` Marc Zyngier
