The Linux Kernel Mailing List
* [PATCH] KVM: arm64: account pKVM reclaim against the VM mm
@ 2026-06-21 21:31 Bradley Morgan
  2026-06-22  8:32 ` Marc Zyngier
  2026-06-22  8:32 ` Fuad Tabba
  0 siblings, 2 replies; 5+ messages in thread
From: Bradley Morgan @ 2026-06-21 21:31 UTC
  To: Marc Zyngier, Oliver Upton
  Cc: Fuad Tabba, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, linux-arm-kernel,
	kvmarm, linux-kernel, Bradley Morgan

Protected guest faults charge long term pins to the VM's mm. Teardown
can run later from file release, where current->mm may be unrelated.

Drop the charge from kvm->mm instead.

Fixes: 4e6e03f9eadd ("KVM: arm64: Hook up reclaim hypercall to pkvm_pgtable_stage2_destroy()")
Signed-off-by: Bradley Morgan <include@grrlz.net>
---
 arch/arm64/kvm/pkvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index 053e4f733e4b..428723b1b0f5 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -352,7 +352,7 @@ static int __pkvm_pgtable_stage2_reclaim(struct kvm_pgtable *pgt, u64 start, u64
 		page = pfn_to_page(mapping->pfn);
 		WARN_ON_ONCE(mapping->nr_pages != 1);
 		unpin_user_pages_dirty_lock(&page, 1, true);
-		account_locked_vm(current->mm, 1, false);
+		account_locked_vm(kvm->mm, 1, false);
 		pkvm_mapping_remove(mapping, &pgt->pkvm_mappings);
 		kfree(mapping);
 	}
-- 
2.53.0



* Re: [PATCH] KVM: arm64: account pKVM reclaim against the VM mm
  2026-06-21 21:31 [PATCH] KVM: arm64: account pKVM reclaim against the VM mm Bradley Morgan
@ 2026-06-22  8:32 ` Marc Zyngier
  2026-06-22  8:32 ` Fuad Tabba
  1 sibling, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2026-06-22  8:32 UTC
  To: Will Deacon, Bradley Morgan
  Cc: Oliver Upton, Fuad Tabba, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, linux-arm-kernel,
	kvmarm, linux-kernel

On Sun, 21 Jun 2026 22:31:55 +0100,
Bradley Morgan <include@grrlz.net> wrote:
> 
> Protected guest faults charge long term pins to the VM's mm. Teardown
> can run later from file release, where current->mm may be unrelated.
>
> Drop the charge from kvm->mm instead.
> 
> Fixes: 4e6e03f9eadd ("KVM: arm64: Hook up reclaim hypercall to pkvm_pgtable_stage2_destroy()")
> Signed-off-by: Bradley Morgan <include@grrlz.net>
> ---
>  arch/arm64/kvm/pkvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
> index 053e4f733e4b..428723b1b0f5 100644
> --- a/arch/arm64/kvm/pkvm.c
> +++ b/arch/arm64/kvm/pkvm.c
> @@ -352,7 +352,7 @@ static int __pkvm_pgtable_stage2_reclaim(struct kvm_pgtable *pgt, u64 start, u64
>  		page = pfn_to_page(mapping->pfn);
>  		WARN_ON_ONCE(mapping->nr_pages != 1);
>  		unpin_user_pages_dirty_lock(&page, 1, true);
> -		account_locked_vm(current->mm, 1, false);
> +		account_locked_vm(kvm->mm, 1, false);
>  		pkvm_mapping_remove(mapping, &pgt->pkvm_mappings);
>  		kfree(mapping);
>  	}

Seems correct to me, as the final mmdrop(kvm->mm) occurs after S2
teardown.

Will, what do you think?

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH] KVM: arm64: account pKVM reclaim against the VM mm
  2026-06-21 21:31 [PATCH] KVM: arm64: account pKVM reclaim against the VM mm Bradley Morgan
  2026-06-22  8:32 ` Marc Zyngier
@ 2026-06-22  8:32 ` Fuad Tabba
  2026-06-22  9:16   ` Marc Zyngier
  1 sibling, 1 reply; 5+ messages in thread
From: Fuad Tabba @ 2026-06-22  8:32 UTC
  To: Bradley Morgan
  Cc: Marc Zyngier, Oliver Upton, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kvmarm, linux-kernel

On Sun, 21 Jun 2026 at 22:32, Bradley Morgan <include@grrlz.net> wrote:
>
> Protected guest faults charge long term pins to the VM's mm. Teardown
> can run later from file release, where current->mm may be unrelated.
>
> Drop the charge from kvm->mm instead.
>
> Fixes: 4e6e03f9eadd ("KVM: arm64: Hook up reclaim hypercall to pkvm_pgtable_stage2_destroy()")
> Signed-off-by: Bradley Morgan <include@grrlz.net>

Reproduced by creating a protected VM, running the vCPU to fault in a
page, then forking and having the child close the last fd reference.
Without the fix, the parent's VmLck leaks (the reclaim decrements the
child's mm, which is freed on exit). With the fix the parent's VmLck
returns to zero.
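
As a rough illustration of the leak described above, here is a small
userspace model. It is only a sketch: struct mm_model, struct vm_model
and the model_*() helpers are hypothetical stand-ins invented for this
example, not kernel code, and only the locked-page counter is modelled.

```c
/*
 * Hypothetical userspace model of the accounting mismatch.
 * None of these names exist in the kernel.
 */
struct mm_model {
	unsigned long locked_vm;	/* pages charged as locked (VmLck) */
};

struct vm_model {
	struct mm_model *mm;		/* mm that created the VM (kvm->mm) */
	unsigned long pinned_pages;	/* pages pinned by guest faults */
};

/* Guest fault: pin one page and charge the VM's own mm. */
static void model_fault(struct vm_model *vm)
{
	vm->pinned_pages++;
	vm->mm->locked_vm++;
}

/*
 * Teardown before the fix: the charge is dropped from whichever mm
 * happens to run the release (current->mm in the patch context).
 */
static void model_reclaim_buggy(struct vm_model *vm, struct mm_model *current_mm)
{
	while (vm->pinned_pages) {
		vm->pinned_pages--;
		current_mm->locked_vm--;	/* wrong mm if the releaser differs */
	}
}

/* Teardown with the fix: the charge is dropped from the VM's mm. */
static void model_reclaim_fixed(struct vm_model *vm)
{
	while (vm->pinned_pages) {
		vm->pinned_pages--;
		vm->mm->locked_vm--;
	}
}
```

Calling model_fault() once and then model_reclaim_buggy() with a
different mm leaves the creator's locked_vm at 1, while
model_reclaim_fixed() brings it back to 0 regardless of who performs
the release.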

One minor observation: account_locked_vm() also passes `current` as
the task pointer to __account_locked_vm(), but on the decrement path
that is only used in the pr_debug log line, so it is technically wrong
but functionally harmless.

Reviewed-by: Fuad Tabba <fuad.tabba@linux.dev>
Tested-by: Fuad Tabba <fuad.tabba@linux.dev>

Cheers,
/fuad

> ---
>  arch/arm64/kvm/pkvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
> index 053e4f733e4b..428723b1b0f5 100644
> --- a/arch/arm64/kvm/pkvm.c
> +++ b/arch/arm64/kvm/pkvm.c
> @@ -352,7 +352,7 @@ static int __pkvm_pgtable_stage2_reclaim(struct kvm_pgtable *pgt, u64 start, u64
>                 page = pfn_to_page(mapping->pfn);
>                 WARN_ON_ONCE(mapping->nr_pages != 1);
>                 unpin_user_pages_dirty_lock(&page, 1, true);
> -               account_locked_vm(current->mm, 1, false);
> +               account_locked_vm(kvm->mm, 1, false);
>                 pkvm_mapping_remove(mapping, &pgt->pkvm_mappings);
>                 kfree(mapping);
>         }
> --
> 2.53.0
>


* Re: [PATCH] KVM: arm64: account pKVM reclaim against the VM mm
  2026-06-22  8:32 ` Fuad Tabba
@ 2026-06-22  9:16   ` Marc Zyngier
  2026-06-22 14:49     ` Bradley Morgan
  0 siblings, 1 reply; 5+ messages in thread
From: Marc Zyngier @ 2026-06-22  9:16 UTC
  To: Fuad Tabba
  Cc: Bradley Morgan, Oliver Upton, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kvmarm, linux-kernel

On Mon, 22 Jun 2026 09:32:45 +0100,
Fuad Tabba <fuad.tabba@linux.dev> wrote:
> 
> On Sun, 21 Jun 2026 at 22:32, Bradley Morgan <include@grrlz.net> wrote:
> >
> > Protected guest faults charge long term pins to the VM's mm. Teardown
> > can run later from file release, where current->mm may be unrelated.
> >
> > Drop the charge from kvm->mm instead.
> >
> > Fixes: 4e6e03f9eadd ("KVM: arm64: Hook up reclaim hypercall to pkvm_pgtable_stage2_destroy()")
> > Signed-off-by: Bradley Morgan <include@grrlz.net>
> 
> Reproduced by creating a protected VM, running the vCPU to fault in a
> page, then forking and having the child close the last fd reference.
> Without the fix, the parent's VmLck leaks (the reclaim decrements the
> child's mm, which is freed on exit). With the fix the parent's VmLck
> returns to zero.
> 
> One minor observation: account_locked_vm() also passes `current` as
> the task pointer to __account_locked_vm(), but on the decrement path
> that is only used in the pr_debug log line, so it is technically wrong
> but functionally harmless.

I don't think this is wrong. Awkward, maybe. It is just that the
rlimit check and the accounting may be different contexts, and the
pr_debug() call covers both inc and dec.
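
To make the inc/dec split concrete, here is a simplified userspace
model. It is written from the description in this thread, not copied
from the kernel; struct task_model, struct mm_acct_model and
model_account_locked_vm() are hypothetical names, and the limit is
expressed directly in pages as an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the task whose rlimit is consulted. */
struct task_model {
	unsigned long memlock_limit_pages;
};

/* Hypothetical stand-in for the mm being charged. */
struct mm_acct_model {
	unsigned long locked_vm;
};

/*
 * Model of the helper's shape: the task is consulted for the limit
 * only when charging; uncharging never looks at it. The real helper
 * additionally logs the task via pr_debug() in both directions, which
 * is omitted here.
 */
static int model_account_locked_vm(struct mm_acct_model *mm,
				   unsigned long pages, bool inc,
				   const struct task_model *task,
				   bool bypass_rlim)
{
	if (inc) {
		if (!bypass_rlim &&
		    mm->locked_vm + pages > task->memlock_limit_pages)
			return -ENOMEM;
		mm->locked_vm += pages;
	} else {
		mm->locked_vm -= pages;
	}
	return 0;
}
```

In this model, passing an unrelated task on the decrement path changes
nothing about the resulting counter, which matches the point that the
task only matters for the limit check and the debug output.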

>
> Reviewed-by: Fuad Tabba <fuad.tabba@linux.dev>
> Tested-by: Fuad Tabba <fuad.tabba@linux.dev>

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH] KVM: arm64: account pKVM reclaim against the VM mm
  2026-06-22  9:16   ` Marc Zyngier
@ 2026-06-22 14:49     ` Bradley Morgan
  0 siblings, 0 replies; 5+ messages in thread
From: Bradley Morgan @ 2026-06-22 14:49 UTC
  To: Marc Zyngier, Fuad Tabba
  Cc: Oliver Upton, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, linux-arm-kernel,
	kvmarm, linux-kernel

On June 22, 2026 10:16:18 AM GMT+01:00, Marc Zyngier <maz@kernel.org>
wrote:
>On Mon, 22 Jun 2026 09:32:45 +0100,
>Fuad Tabba <fuad.tabba@linux.dev> wrote:
>> 
>> On Sun, 21 Jun 2026 at 22:32, Bradley Morgan <include@grrlz.net> wrote:
>> >
>> > Protected guest faults charge long term pins to the VM's mm. Teardown
>> > can run later from file release, where current->mm may be unrelated.
>> >
>> > Drop the charge from kvm->mm instead.
>> >
>> > Fixes: 4e6e03f9eadd ("KVM: arm64: Hook up reclaim hypercall to pkvm_pgtable_stage2_destroy()")
>> > Signed-off-by: Bradley Morgan <include@grrlz.net>
>> 
>> Reproduced by creating a protected VM, running the vCPU to fault in a
>> page, then forking and having the child close the last fd reference.
>> Without the fix, the parent's VmLck leaks (the reclaim decrements the
>> child's mm, which is freed on exit). With the fix the parent's VmLck
>> returns to zero.
>> 
>> One minor observation: account_locked_vm() also passes `current` as
>> the task pointer to __account_locked_vm(), but on the decrement path
>> that is only used in the pr_debug log line, so it is technically wrong
>> but functionally harmless.

I agree with Marc here. Awkward, maybe, but not wrong.

I tested it on my Pixel 7! :)

>I don't think this is wrong. Awkward, maybe. It is just that the
>rlimit check and the accounting may be different contexts, and the
>pr_debug() call covers both inc and dec.
>
>>
>> Reviewed-by: Fuad Tabba <fuad.tabba@linux.dev>
>> Tested-by: Fuad Tabba <fuad.tabba@linux.dev>

Thanks for the review! :)

Cheers!

>Thanks,
>
>	M.
>
>


end of thread

Thread overview: 5+ messages
2026-06-21 21:31 [PATCH] KVM: arm64: account pKVM reclaim against the VM mm Bradley Morgan
2026-06-22  8:32 ` Marc Zyngier
2026-06-22  8:32 ` Fuad Tabba
2026-06-22  9:16   ` Marc Zyngier
2026-06-22 14:49     ` Bradley Morgan