public inbox for linux-arm-kernel@lists.infradead.org
* [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
@ 2026-05-06 10:50 Alexandru Elisei
  2026-05-06 12:44 ` Sean Christopherson
  0 siblings, 1 reply; 3+ messages in thread
From: Alexandru Elisei @ 2026-05-06 10:50 UTC (permalink / raw)
  To: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	linux-arm-kernel, kvmarm
  Cc: tabba, David.Hildenbrand

The documentation for KVM_EXIT_MEMORY_FAULT states:

'Note!  KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
it accompanies a return code of '-1', not '0'!  errno will always be set to
EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
should assume kvm_run.exit_reason is stale/undefined for all other error
numbers'.

where a return code of '-1' is special because according to man 2 ioctl:

'On error, -1 is returned, and errno is set to indicate the error'.

Putting the two together means that the ioctl KVM_RUN must 1) complete with
an error and 2) that error must be either EFAULT or EHWPOISON for
userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.

On a kvm_gmem_get_pfn() error, gmem_abort() prepares the
KVM_EXIT_MEMORY_FAULT exit_reason and propagates the error back to
userspace. kvm_gmem_get_pfn() does not massage the error code, and if the
error is not -EFAULT or -EHWPOISON, userspace implementing the ABI fails to
detect the memory fault exit.

Things get more complicated with kvm_handle_vncr_abort().
kvm_translate_vncr(), similar to gmem_abort(), prepares the VCPU to exit
with KVM_EXIT_MEMORY_FAULT and propagates the error code from
kvm_gmem_get_pfn(). Then kvm_handle_vncr_abort() does a number of things
based on this specific error code:

- If it's -EAGAIN, KVM resumes the guest. Note that KVM, when handling a
  *host* fault on a guest_memfd backed VMA, retries the fault handling if
  kvm_gmem_get_pfn() returns -EAGAIN.
- If it's -ENOMEM, -EFAULT, -EIO or -EHWPOISON, it returns to userspace
  with 0 (success), meaning that, according to the documentation, userspace
  will not detect the memory fault exit.
- If it's -EINVAL, -ENOENT, -EACCES, KVM injects a synchronous exception
  back to the guest.
- If it's -EPERM, KVM injects a permission fault.
- If the error code is something else, KVM resumes the guest.
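
For reference, the dispatch listed above can be summarised by the sketch
below. It covers only the guest_memfd case as described in this message.
The enum and the function name are made up for illustration and do not
exist in the kernel.

```c
#include <assert.h>
#include <errno.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; fallback for non-Linux libcs */
#endif

/* EACCES and EPERM are distinct errnos and take different paths below. */
static_assert(EACCES != EPERM, "EACCES and EPERM must be distinct");

/* Hypothetical names: what KVM does next for a given error code. */
enum sketch_vncr_action {
	SKETCH_RESUME_GUEST,
	SKETCH_EXIT_TO_USERSPACE_WITH_0,
	SKETCH_INJECT_SYNC_EXCEPTION,
	SKETCH_INJECT_PERM_FAULT,
};

/* err is the negative errno propagated from kvm_gmem_get_pfn(). */
enum sketch_vncr_action sketch_classify_vncr_error(int err)
{
	switch (err) {
	case -ENOMEM:
	case -EFAULT:
	case -EIO:
	case -EHWPOISON:
		/* KVM_RUN returns 0 even though exit_reason was prepared. */
		return SKETCH_EXIT_TO_USERSPACE_WITH_0;
	case -EINVAL:
	case -ENOENT:
	case -EACCES:
		return SKETCH_INJECT_SYNC_EXCEPTION;
	case -EPERM:
		return SKETCH_INJECT_PERM_FAULT;
	case -EAGAIN:
	default:
		/* -EAGAIN and any unlisted error both resume the guest. */
		return SKETCH_RESUME_GUEST;
	}
}
```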

Bring a measure of order to all of this by implementing the documented
behaviour. -EAGAIN is treated as an error, similar to the
__kvm_faultin_pfn() behaviour for an anonymous VMA.
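
The clamp applied in gmem_abort() and kvm_translate_vncr() can be sketched
as below. The helper name is hypothetical: the patch open-codes the switch
in both places and adds no such function.

```c
#include <assert.h>
#include <errno.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; fallback for non-Linux libcs */
#endif

/* Kernel code returns errno values negated, so they must be positive. */
static_assert(EFAULT > 0 && EHWPOISON > 0, "errno constants are positive");

/*
 * Hypothetical helper: map any error from kvm_gmem_get_pfn() onto one of
 * the two codes the documentation ties to KVM_EXIT_MEMORY_FAULT.
 */
int sketch_memory_fault_errno(int ret)
{
	switch (ret) {
	case -EFAULT:
	case -EHWPOISON:
		return ret;	/* already a documented code, pass through */
	default:
		return -EFAULT;	/* everything else, -EAGAIN included */
	}
}
```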

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---

This has the potential to break userspace, hence the RFC tag.

I went back and forth on the fix. I cannot test any of this and I have no
context around the usage of guest_memfd. In the end I settled on strictly
implementing the documented behaviour.

I'm also not sure what userspace is supposed to do to fix up the fault if
kvm_gmem_get_pfn() returns -EAGAIN.

Someone with more knowledge please chime in!

 arch/arm64/kvm/mmu.c    |  8 +++++++-
 arch/arm64/kvm/nested.c | 24 +++++++++++++-----------
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d089c107d9b7..ea6c96818fc6 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1610,7 +1610,13 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
 	if (ret) {
 		kvm_prepare_memory_fault_exit(s2fd->vcpu, s2fd->fault_ipa, PAGE_SIZE,
 					      write_fault, exec_fault, false);
-		return ret;
+		switch (ret) {
+		case -EFAULT:
+		case -EHWPOISON:
+			return ret;
+		default:
+			return -EFAULT;
+		}
 	}
 
 	if (!(s2fd->memslot->flags & KVM_MEM_READONLY))
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 883b6c1008fb..ef426c94daff 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1320,8 +1320,14 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
 		ret = kvm_gmem_get_pfn(vcpu->kvm, memslot, gfn, &pfn, &page, NULL);
 		if (ret) {
 			kvm_prepare_memory_fault_exit(vcpu, vt->wr.pa, PAGE_SIZE,
-					      write_fault, false, false);
-			return ret;
+						      write_fault, false, false);
+			switch (ret) {
+			case -EFAULT:
+			case -EHWPOISON:
+				return ret;
+			default:
+				return -EFAULT;
+			}
 		}
 	}
 
@@ -1401,23 +1407,19 @@ int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu)
 
 		switch (ret) {
 		case -EAGAIN:
+		case -ENOMEM:
 			/* Let's try again... */
 			break;
-		case -ENOMEM:
+		case -EFAULT:
+		case -EHWPOISON:
 			/*
 			 * For guest_memfd, this indicates that it failed to
 			 * create a folio to back the memory. Inform userspace.
 			 */
 			if (is_gmem)
-				return 0;
-			/* Otherwise, let's try again... */
-			break;
-		case -EFAULT:
-		case -EIO:
-		case -EHWPOISON:
-			if (is_gmem)
-				return 0;
+				return ret;
 			fallthrough;
+		case -EIO:
 		case -EINVAL:
 		case -ENOENT:
 		case -EACCES:

base-commit: 7fd2df204f342fc17d1a0bfcd474b24232fb0f32
-- 
2.54.0




* Re: [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
  2026-05-06 10:50 [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation Alexandru Elisei
@ 2026-05-06 12:44 ` Sean Christopherson
  2026-05-06 13:39   ` Alexandru Elisei
  0 siblings, 1 reply; 3+ messages in thread
From: Sean Christopherson @ 2026-05-06 12:44 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	linux-arm-kernel, kvmarm, tabba, David.Hildenbrand

On Wed, May 06, 2026, Alexandru Elisei wrote:
> The documentation for KVM_EXIT_MEMORY_FAULT states:
> 
> 'Note!  KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
> it accompanies a return code of '-1', not '0'!  errno will always be set to
> EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
> should assume kvm_run.exit_reason is stale/undefined for all other error
> numbers'.
> 
> where a return code of '-1' is special because according to man 2 ioctl:
> 
> 'On error, -1 is returned, and errno is set to indicate the error'.
> 
> Putting the two together means that the ioctl KVM_RUN must 1) complete with
> an error and 2) that error must be either EFAULT or EHWPOISON for
> userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.

Yes and no.  The key escape valve we (very deliberately) gave ourselves is this:

  userspace should assume kvm_run.exit_reason is stale/undefined for all other
  error numbers.

As arm64 already does, that clause allows KVM to "speculatively" set exit_reason
to KVM_EXIT_MEMORY_FAULT.  Which is by design.  The userspace flow is intended
to be "if KVM_RUN returns EFAULT or EHWPOISON, then check for KVM_EXIT_MEMORY_FAULT
to see if KVM provided more information about why the EFAULT/EHWPOISON error was
returned".
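
A minimal userspace sketch of that flow, for illustration only. The struct
and the constant are stand-ins so that the snippet builds without
<linux/kvm.h>; real code would use struct kvm_run and
KVM_EXIT_MEMORY_FAULT from the uAPI header, and the function name is
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; fallback for non-Linux libcs */
#endif

/* Stand-in; take the real value from <linux/kvm.h>. */
enum { SKETCH_EXIT_MEMORY_FAULT = 39 };

/* Stand-in for the few struct kvm_run fields this sketch cares about. */
struct sketch_run {
	uint32_t exit_reason;
	struct {
		uint64_t flags;
		uint64_t gpa;
		uint64_t size;
	} memory_fault;
};

/*
 * ret and err are the return value and errno of the KVM_RUN ioctl.
 * Returns true only when run->memory_fault may be trusted.
 */
bool sketch_memory_fault_valid(int ret, int err, const struct sketch_run *run)
{
	assert(run != NULL);

	/* The documented flow only looks at memory_fault on error returns. */
	if (ret != -1)
		return false;

	/* For any other errno, exit_reason is stale/undefined. */
	if (err != EFAULT && err != EHWPOISON)
		return false;

	/* Only now is it meaningful to look at exit_reason. */
	return run->exit_reason == SKETCH_EXIT_MEMORY_FAULT;
}
```

The errno check comes first because exit_reason may have been set
speculatively on a path that ends up returning some other error.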

> On a kvm_gmem_get_pfn() error, gmem_abort() prepares the
> KVM_EXIT_MEMORY_FAULT exit_reason and propagates the error back to
> userspace. kvm_gmem_get_pfn() does not massage the error code, and if the
> error is not -EFAULT or -EHWPOISON, userspace implementing the ABI fails to
> detect the memory fault exit.
> 
> Things get more complicated with kvm_handle_vncr_abort().
> kvm_translate_vncr(), similar to gmem_abort(), prepares the VCPU to exit
> with KVM_EXIT_MEMORY_FAULT and propagates the error code from
> kvm_gmem_get_pfn(). Then kvm_handle_vncr_abort() does a number of things
> based on this specific error code:
> 
> - If it's -EAGAIN, KVM resumes the guest. Note that KVM, when handling a
>   *host* fault on a guest_memfd backed VMA, retries the fault handling if
>   kvm_gmem_get_pfn() returns -EAGAIN.

Totally fine.

> - If it's -ENOMEM, -EFAULT, -EIO or -EHWPOISON, it returns to userspace
>   with 0 (success), meaning that, according to the documentation, userspace
>   will not detect the memory fault exit.

Also totally fine, and working as intended.  KVM_EXIT_MEMORY_FAULT is provided
for scenarios where (a) the issue is likely related to the GPA and (b) userspace
can remedy the underlying issue using the information provided in kvm_run.memory_fault.

ENOMEM doesn't meet (a), and EIO doesn't meet (b) (and probably not (a) in the
vast majority of cases either).

> - If it's -EINVAL, -ENOENT, -EACCES, KVM injects a synchronous exception
>   back to the guest.
> - If it's -EPERM, KVM injects a permission fault.
> - If the error code is something else, KVM resumes the guest.

All of these are totally fine.  The fact that KVM "scribbled" kvm_run a bit is
a non-issue, because KVM will fill kvm_run with the correct information on the
next userspace exit, or will exit with an error that doesn't utilize kvm_run (in
which case userspace shouldn't be looking at it), or KVM is buggy somewhere else.



* Re: [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
  2026-05-06 12:44 ` Sean Christopherson
@ 2026-05-06 13:39   ` Alexandru Elisei
  0 siblings, 0 replies; 3+ messages in thread
From: Alexandru Elisei @ 2026-05-06 13:39 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
	linux-arm-kernel, kvmarm, tabba, David.Hildenbrand

Hi Sean,

Thanks for the explanations!

On Wed, May 06, 2026 at 05:44:50AM -0700, Sean Christopherson wrote:
> On Wed, May 06, 2026, Alexandru Elisei wrote:
> > The documentation for KVM_EXIT_MEMORY_FAULT states:
> > 
> > 'Note!  KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
> > it accompanies a return code of '-1', not '0'!  errno will always be set to
> > EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
> > should assume kvm_run.exit_reason is stale/undefined for all other error
> > numbers'.
> > 
> > where a return code of '-1' is special because according to man 2 ioctl:
> > 
> > 'On error, -1 is returned, and errno is set to indicate the error'.
> > 
> > Putting the two together means that the ioctl KVM_RUN must 1) complete with
> > an error and 2) that error must be either EFAULT or EHWPOISON for
> > userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.
> 
> Yes and no.  The key escape valve we (very deliberately) gave ourselves is this:
> 
>   userspace should assume kvm_run.exit_reason is stale/undefined for all other
>   error numbers.
> 
> As arm64 already does, that clause allows KVM to "speculatively" set exit_reason
> to KVM_EXIT_MEMORY_FAULT.  Which is by design.  The userspace flow is intended
> to be "if KVM_RUN returns EFAULT or EHWPOISON, then check for KVM_EXIT_MEMORY_FAULT
> to see if KVM provided more information about why the EFAULT/EHWPOISON error was
> returned".

Hm... In general, "speculatively" populating exit_reason with
KVM_EXIT_MEMORY_FAULT when userspace is not intended to use that information
looks a bit dubious to me. Why do the work if userspace is not supposed to use
the information?

Regarding gmem_abort(). As I see it, if today someone writes userspace that
relies on any of the undocumented error codes propagated from kvm_gmem_get_pfn()
to handle KVM_EXIT_MEMORY_FAULT, that means that KVM can never use those error
codes for any other exit_reason in the future, because that userspace will
break.

I'm sure this was all carefully considered when designing the interface, I was
just curious how this particular problem has been solved.

> 
> > On a kvm_gmem_get_pfn() error, gmem_abort() prepares the
> > KVM_EXIT_MEMORY_FAULT exit_reason and propagates the error back to
> > userspace. kvm_gmem_get_pfn() does not massage the error code, and if the
> > error is not -EFAULT or -EHWPOISON, userspace implementing the ABI fails to
> > detect the memory fault exit.
> > 
> > Things get more complicated with kvm_handle_vncr_abort().
> > kvm_translate_vncr(), similar to gmem_abort(), prepares the VCPU to exit
> > with KVM_EXIT_MEMORY_FAULT and propagates the error code from
> > kvm_gmem_get_pfn(). Then kvm_handle_vncr_abort() does a number of things
> > based on this specific error code:
> > 
> > - If it's -EAGAIN, KVM resumes the guest. Note that KVM, when handling a
> >   *host* fault on a guest_memfd backed VMA, retries the fault handling if
> >   kvm_gmem_get_pfn() returns -EAGAIN.
> 
> Totally fine.
> 
> > - If it's -ENOMEM, -EFAULT, -EIO or -EHWPOISON, it returns to userspace
> >   with 0 (success), meaning that, according to the documentation, userspace
> >   will not detect the memory fault exit.
> 
> Also totally fine, and working as intended.  KVM_EXIT_MEMORY_FAULT is provided
> for scenarios where (a) the issue is likely related to the GPA and (b) userspace
> can remedy the underlying issue using the information provided in kvm_run.memory_fault.

If KVM_RUN always returns 0 when exit_reason = KVM_EXIT_MEMORY_FAULT, which is what
kvm_handle_vncr_abort() does, how will userspace ever be able to handle the
fault?

Thanks,
Alex


