* [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
@ 2026-05-06 10:50 Alexandru Elisei
2026-05-06 12:44 ` Sean Christopherson
0 siblings, 1 reply; 6+ messages in thread
From: Alexandru Elisei @ 2026-05-06 10:50 UTC (permalink / raw)
To: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
linux-arm-kernel, kvmarm
Cc: tabba, David.Hildenbrand
The documentation for KVM_EXIT_MEMORY_FAULT states:
'Note! KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
it accompanies a return code of '-1', not '0'! errno will always be set to
EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
should assume kvm_run.exit_reason is stale/undefined for all other error
numbers'.
where a return code of '-1' is special because according to man 2 ioctl:
'On error, -1 is returned, and errno is set to indicate the error'.
Putting the two together means that the ioctl KVM_RUN must 1) complete with
an error and 2) that error must be either EFAULT or EHWPOISON for
userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.
On a kvm_gmem_get_pfn() error, gmem_abort() prepares the
KVM_EXIT_MEMORY_FAULT exit_reason and propagates the error back to
userspace. kvm_gmem_get_pfn() does not massage the error code, and if the
error is not -EFAULT or -EHWPOISON, userspace implementing the ABI fails to
detect the memory fault exit.
Things get more complicated with kvm_handle_vncr_abort().
kvm_translate_vncr(), similar to gmem_abort(), prepares the VCPU to exit
with KVM_EXIT_MEMORY_FAULT and propagates the error code from
kvm_gmem_get_pfn(). Then kvm_handle_vncr_abort() does a number of things
based on this specific error code:
- If it's -EAGAIN, KVM resumes the guest. Note that KVM, when handling a
*host* fault on a guest_memfd backed VMA, retries the fault handling if
kvm_gmem_get_pfn() returns -EAGAIN.
- If it's -ENOMEM, -EFAULT, -EIO or -EHWPOISON, it returns to userspace
with 0 (success), meaning that, according to the documentation, userspace
will not detect the memory fault exit.
- If it's -EINVAL, -ENOENT, -EACCES, KVM injects a synchronous exception
back to the guest.
- If it's -EPERM, KVM injects a permission fault.
- If the error code is something else, KVM resumes the guest.
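As a rough illustration, the dispatch in the list above can be modelled in
plain C. This is only a sketch of the behaviour as summarised here for the
guest_memfd case, not the code in nested.c; the enum and the function name
are made up for the example.

```c
#include <errno.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value, only needed when building elsewhere */
#endif

/* Hypothetical labels for what KVM does next; not kernel identifiers. */
enum sketch_action {
	SKETCH_RESUME_GUEST,
	SKETCH_EXIT_TO_USER_RET0,	/* exit to userspace, KVM_RUN reports success */
	SKETCH_INJECT_SYNC_EXCEPTION,
	SKETCH_INJECT_PERM_FAULT,
};

/* err is the negative error code propagated from kvm_gmem_get_pfn(). */
static enum sketch_action sketch_vncr_abort_action(int err)
{
	switch (err) {
	case -EAGAIN:
		return SKETCH_RESUME_GUEST;
	case -ENOMEM:
	case -EFAULT:
	case -EIO:
	case -EHWPOISON:
		return SKETCH_EXIT_TO_USER_RET0;
	case -EINVAL:
	case -ENOENT:
	case -EACCES:
		return SKETCH_INJECT_SYNC_EXCEPTION;
	case -EPERM:
		return SKETCH_INJECT_PERM_FAULT;
	default:
		return SKETCH_RESUME_GUEST;
	}
}
```

The second group is the problematic one: the exit reaches userspace with a
return value of 0, so the errno check required by the documentation never
fires.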
Bring a measure of order to all of this by implementing the documented
behaviour. -EAGAIN is treated as an error, similar to the
__kvm_faultin_pfn() behaviour for an anonymous VMA.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
This has the potential to break userspace, hence the RFC tag.
I went back and forth on the fix. I cannot test any of this and I have no
context around the usage of guest_memfd. In the end I settled on strictly
implementing the documented behaviour.
Really not sure what userspace is supposed to do to fixup the fault if
kvm_gmem_get_pfn() returns -EAGAIN either.
Someone with more knowledge please chime in!
arch/arm64/kvm/mmu.c | 8 +++++++-
arch/arm64/kvm/nested.c | 24 +++++++++++++-----------
2 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d089c107d9b7..ea6c96818fc6 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1610,7 +1610,13 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
if (ret) {
kvm_prepare_memory_fault_exit(s2fd->vcpu, s2fd->fault_ipa, PAGE_SIZE,
write_fault, exec_fault, false);
- return ret;
+ switch (ret) {
+ case -EFAULT:
+ case -EHWPOISON:
+ return ret;
+ default:
+ return -EFAULT;
+ }
}
if (!(s2fd->memslot->flags & KVM_MEM_READONLY))
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 883b6c1008fb..ef426c94daff 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1320,8 +1320,14 @@ static int kvm_translate_vncr(struct kvm_vcpu *vcpu, bool *is_gmem)
ret = kvm_gmem_get_pfn(vcpu->kvm, memslot, gfn, &pfn, &page, NULL);
if (ret) {
kvm_prepare_memory_fault_exit(vcpu, vt->wr.pa, PAGE_SIZE,
- write_fault, false, false);
- return ret;
+ write_fault, false, false);
+ switch (ret) {
+ case -EFAULT:
+ case -EHWPOISON:
+ return ret;
+ default:
+ return -EFAULT;
+ }
}
}
@@ -1401,23 +1407,19 @@ int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu)
switch (ret) {
case -EAGAIN:
+ case -ENOMEM:
/* Let's try again... */
break;
- case -ENOMEM:
+ case -EFAULT:
+ case -EHWPOISON:
/*
* For guest_memfd, this indicates that it failed to
* create a folio to back the memory. Inform userspace.
*/
if (is_gmem)
- return 0;
- /* Otherwise, let's try again... */
- break;
- case -EFAULT:
- case -EIO:
- case -EHWPOISON:
- if (is_gmem)
- return 0;
+ return ret;
fallthrough;
+ case -EIO:
case -EINVAL:
case -ENOENT:
case -EACCES:
base-commit: 7fd2df204f342fc17d1a0bfcd474b24232fb0f32
--
2.54.0
* Re: [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
2026-05-06 10:50 [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation Alexandru Elisei
@ 2026-05-06 12:44 ` Sean Christopherson
2026-05-06 13:39 ` Alexandru Elisei, Sean Christopherson
2026-05-07 8:45 ` Alexandru Elisei
0 siblings, 2 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 12:44 UTC (permalink / raw)
To: Alexandru Elisei
Cc: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
linux-arm-kernel, kvmarm, tabba, David.Hildenbrand
On Wed, May 06, 2026, Alexandru Elisei wrote:
> The documentation for KVM_EXIT_MEMORY_FAULT states:
>
> 'Note! KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
> it accompanies a return code of '-1', not '0'! errno will always be set to
> EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
> should assume kvm_run.exit_reason is stale/undefined for all other error
> numbers'.
>
> where a return code of '-1' is special because according to man 2 ioctl:
>
> 'On error, -1 is returned, and errno is set to indicate the error'.
>
> Putting the two together means that the ioctl KVM_RUN must 1) complete with
> an error and 2) that error must be either EFAULT or EHWPOISON for
> userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.
Yes and no. The key escape valve we (very deliberately) gave ourselves is this:
userspace should assume kvm_run.exit_reason is stale/undefined for all other
error numbers.
As arm64 already does, that clause allows KVM to "speculatively" set exit_reason
to KVM_EXIT_MEMORY_FAULT. Which is by design. The userspace flow is intended
to be "if KVM_RUN returns EFAULT or EHWPOISON, then check for KVM_EXIT_MEMORY_FAULT
to see if KVM provided more information about why the EFAULT/EHWPOISON error was
returned".
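A minimal sketch of that userspace flow, assuming the caller has already
captured the return value and errno of ioctl(KVM_RUN). The struct and the
SKETCH_* constants are stand-ins so the example builds without the uapi
headers; they are not the real definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value, only needed when building elsewhere */
#endif

/* Placeholder exit reasons; hypothetical, not taken from <linux/kvm.h>. */
#define SKETCH_EXIT_UNKNOWN		0
#define SKETCH_EXIT_MEMORY_FAULT	1

/* Cut-down stand-in for the parts of struct kvm_run used here. */
struct sketch_run {
	uint32_t exit_reason;
	struct {
		uint64_t gpa;
		uint64_t size;
	} memory_fault;
};

/*
 * ret and err are the return value and errno of the KVM_RUN ioctl.
 * Returns true only when run->memory_fault may be consumed.
 */
static bool sketch_memory_fault_valid(int ret, int err,
				      const struct sketch_run *run)
{
	if (ret != -1)
		return false;
	/* For any other errno, exit_reason is stale/undefined. */
	if (err != EFAULT && err != EHWPOISON)
		return false;
	return run->exit_reason == SKETCH_EXIT_MEMORY_FAULT;
}
```

The ordering matters: errno is checked first, and exit_reason is consulted
only once errno says it can be trusted.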
> On a kvm_gmem_get_pfn() error, gmem_abort() prepares the
> KVM_EXIT_MEMORY_FAULT exit_reason and propagates the error back to
> userspace. kvm_gmem_get_pfn() does not massage the error code, and if the
> error is not -EFAULT or -EHWPOISON, userspace implementing the ABI fails to
> detect the memory fault exit.
>
> Things get more complicated with kvm_handle_vncr_abort().
> kvm_translate_vncr(), similar to gmem_abort(), prepares the VCPU to exit
> with KVM_EXIT_MEMORY_FAULT and propagates the error code from
> kvm_gmem_get_pfn(). Then kvm_handle_vncr_abort() does a number of things
> based on this specific error code:
>
> - If it's -EAGAIN, KVM resumes the guest. Note that KVM, when handling a
> *host* fault on a guest_memfd backed VMA, retries the fault handling if
> kvm_gmem_get_pfn() returns -EAGAIN.
Totally fine.
> - If it's -ENOMEM, -EFAULT, -EIO or -EHWPOISON, it returns to userspace
> with 0 (success), meaning that, according to the documentation, userspace
> will not detect the memory fault exit.
Also totally fine, and working as intended. KVM_EXIT_MEMORY_FAULT is provided
for scenarios where (a) the issue is likely related to the GPA and (b) userspace
can remedy the underlying issue using the information provided in kvm_run.memory_fault.
ENOMEM doesn't meet (a), and EIO doesn't meet (b) (and probably not (a) in the
vast majority of cases either).
> - If it's -EINVAL, -ENOENT, -EACCES, KVM injects a synchronous exception
> back to the guest.
> - If it's -EPERM, KVM injects a permission fault.
> - If the error code is something else, KVM resumes the guest.
All of these are totally fine. The fact that KVM "scribbled" kvm_run a bit is
a non-issue, because KVM will fill kvm_run with the correct information on the
next userspace exit, or will exit with an error that doesn't utilize kvm_run (in
which case userspace shouldn't be looking at it), or KVM is buggy somewhere else.
* Re: [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
2026-05-06 12:44 ` Sean Christopherson
@ 2026-05-06 13:39 ` Alexandru Elisei, Sean Christopherson
2026-05-07 8:45 ` Alexandru Elisei
1 sibling, 0 replies; 6+ messages in thread
From: Alexandru Elisei, Sean Christopherson @ 2026-05-06 13:39 UTC (permalink / raw)
Cc: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
linux-arm-kernel, kvmarm, tabba, David.Hildenbrand
Hi Sean,
Thanks for the explanations!
On Wed, May 06, 2026 at 05:44:50AM -0700, Sean Christopherson wrote:
> On Wed, May 06, 2026, Alexandru Elisei wrote:
> > The documentation for KVM_EXIT_MEMORY_FAULT states:
> >
> > 'Note! KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
> > it accompanies a return code of '-1', not '0'! errno will always be set to
> > EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
> > should assume kvm_run.exit_reason is stale/undefined for all other error
> > numbers'.
> >
> > where a return code of '-1' is special because according to man 2 ioctl:
> >
> > 'On error, -1 is returned, and errno is set to indicate the error'.
> >
> > Putting the two together means that the ioctl KVM_RUN must 1) complete with
> > an error and 2) that error must be either EFAULT or EHWPOISON for
> > userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.
>
> Yes and no. The key escape valve we (very deliberately) gave ourselves is this:
>
> userspace should assume kvm_run.exit_reason is stale/undefined for all other
> error numbers.
>
> As arm64 already does, that clause allows KVM to "speculatively" set exit_reason
> to KVM_EXIT_MEMORY_FAULT. Which is by design. The userspace flow is intended
> to be "if KVM_RUN returns EFAULT or EHWPOISON, then check for KVM_EXIT_MEMORY_FAULT
> to see if KVM provided more information about why the EFAULT/EHWPOISON error was
> returned".
Hm... In general, "speculatively" populating exit_reason with
KVM_EXIT_MEMORY_FAULT when userspace is not intended to use that information
looks a bit dubious to me. Why do the work if userspace is not supposed to use
the information?
Regarding gmem_abort(). As I see it, if today someone writes userspace that
relies on any of the undocumented error codes propagated from kvm_gmem_get_pfn()
to handle KVM_EXIT_MEMORY_FAULT, that means that KVM can never use those error
codes for any other exit_reason in the future, because that userspace will
break.
I'm sure this was all carefully considered when designing the interface, I was
just curious how this particular problem has been solved.
>
> > On a kvm_gmem_get_pfn() error, gmem_abort() prepares the
> > KVM_EXIT_MEMORY_FAULT exit_reason and propagates the error back to
> > userspace. kvm_gmem_get_pfn() does not massage the error code, and if the
> > error is not -EFAULT or -EHWPOISON, userspace implementing the ABI fails to
> > detect the memory fault exit.
> >
> > Things get more complicated with kvm_handle_vncr_abort().
> > kvm_translate_vncr(), similar to gmem_abort(), prepares the VCPU to exit
> > with KVM_EXIT_MEMORY_FAULT and propagates the error code from
> > kvm_gmem_get_pfn(). Then kvm_handle_vncr_abort() does a number of things
> > based on this specific error code:
> >
> > - If it's -EAGAIN, KVM resumes the guest. Note that KVM, when handling a
> > *host* fault on a guest_memfd backed VMA, retries the fault handling if
> > kvm_gmem_get_pfn() returns -EAGAIN.
>
> Totally fine.
>
> > - If it's -ENOMEM, -EFAULT, -EIO or -EHWPOISON, it returns to userspace
> > with 0 (success), meaning that, according to the documentation, userspace
> > will not detect the memory fault exit.
>
> Also totally fine, and working as intended. KVM_EXIT_MEMORY_FAULT is provided
> for scenarios where (a) the issue is likely related to the GPA and (b) userspace
> can remedy the underlying issue using the information provided in kvm_run.memory_fault.
If KVM_RUN always returns 0 when exit_reason = KVM_EXIT_MEMORY_FAULT, which is what
kvm_handle_vncr_abort() does, how will userspace ever be able to handle the
fault?
Thanks,
Alex
* Re: [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
2026-05-06 12:44 ` Sean Christopherson
2026-05-06 13:39 ` Alexandru Elisei, Sean Christopherson
@ 2026-05-07 8:45 ` Alexandru Elisei
2026-05-07 13:33 ` Sean Christopherson
1 sibling, 1 reply; 6+ messages in thread
From: Alexandru Elisei @ 2026-05-07 8:45 UTC (permalink / raw)
To: Sean Christopherson
Cc: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
linux-arm-kernel, kvmarm, tabba, David.Hildenbrand
Hi Sean,
(Resending this because I managed to mess up the headers, sorry for the
duplicate).
Thanks for the explanations!
On Wed, May 06, 2026 at 05:44:50AM -0700, Sean Christopherson wrote:
> On Wed, May 06, 2026, Alexandru Elisei wrote:
> > The documentation for KVM_EXIT_MEMORY_FAULT states:
> >
> > 'Note! KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
> > it accompanies a return code of '-1', not '0'! errno will always be set to
> > EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
> > should assume kvm_run.exit_reason is stale/undefined for all other error
> > numbers'.
> >
> > where a return code of '-1' is special because according to man 2 ioctl:
> >
> > 'On error, -1 is returned, and errno is set to indicate the error'.
> >
> > Putting the two together means that the ioctl KVM_RUN must 1) complete with
> > an error and 2) that error must be either EFAULT or EHWPOISON for
> > userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.
>
> Yes and no. The key escape valve we (very deliberately) gave ourselves is this:
>
> userspace should assume kvm_run.exit_reason is stale/undefined for all other
> error numbers.
>
> As arm64 already does, that clause allows KVM to "speculatively" set exit_reason
> to KVM_EXIT_MEMORY_FAULT. Which is by design. The userspace flow is intended
> to be "if KVM_RUN returns EFAULT or EHWPOISON, then check for KVM_EXIT_MEMORY_FAULT
> to see if KVM provided more information about why the EFAULT/EHWPOISON error was
> returned".
Hm... In general, "speculatively" populating exit_reason with
KVM_EXIT_MEMORY_FAULT when userspace is not intended to use that information
looks a bit dubious to me. Why do the work if userspace is not supposed to use
the information?
Regarding gmem_abort(). As I see it, if today someone writes userspace that
relies on any of the undocumented error codes propagated from kvm_gmem_get_pfn()
to handle KVM_EXIT_MEMORY_FAULT, that means that KVM can never use those error
codes for any other exit_reason in the future, because that userspace will
break.
I'm sure this was all carefully considered when designing the interface, I was
just curious how this particular problem has been solved.
>
> > On a kvm_gmem_get_pfn() error, gmem_abort() prepares the
> > KVM_EXIT_MEMORY_FAULT exit_reason and propagates the error back to
> > userspace. kvm_gmem_get_pfn() does not massage the error code, and if the
> > error is not -EFAULT or -EHWPOISON, userspace implementing the ABI fails to
> > detect the memory fault exit.
> >
> > Things get more complicated with kvm_handle_vncr_abort().
> > kvm_translate_vncr(), similar to gmem_abort(), prepares the VCPU to exit
> > with KVM_EXIT_MEMORY_FAULT and propagates the error code from
> > kvm_gmem_get_pfn(). Then kvm_handle_vncr_abort() does a number of things
> > based on this specific error code:
> >
> > - If it's -EAGAIN, KVM resumes the guest. Note that KVM, when handling a
> > *host* fault on a guest_memfd backed VMA, retries the fault handling if
> > kvm_gmem_get_pfn() returns -EAGAIN.
>
> Totally fine.
>
> > - If it's -ENOMEM, -EFAULT, -EIO or -EHWPOISON, it returns to userspace
> > with 0 (success), meaning that, according to the documentation, userspace
> > will not detect the memory fault exit.
>
> Also totally fine, and working as intended. KVM_EXIT_MEMORY_FAULT is provided
> for scenarios where (a) the issue is likely related to the GPA and (b) userspace
> can remedy the underlying issue using the information provided in kvm_run.memory_fault.
If KVM_RUN always returns 0 when exit_reason = KVM_EXIT_MEMORY_FAULT, which is what
kvm_handle_vncr_abort() does, how will userspace ever be able to handle the
fault?
Thanks,
Alex
* Re: [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
2026-05-07 8:45 ` Alexandru Elisei
@ 2026-05-07 13:33 ` Sean Christopherson
2026-05-08 8:55 ` Alexandru Elisei
0 siblings, 1 reply; 6+ messages in thread
From: Sean Christopherson @ 2026-05-07 13:33 UTC (permalink / raw)
To: Alexandru Elisei
Cc: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
linux-arm-kernel, kvmarm, tabba, David.Hildenbrand
On Thu, May 07, 2026, Alexandru Elisei wrote:
> Hi Sean,
>
> (Resending this because I managed to mess up the headers, sorry for the
> duplicate).
>
> Thanks for the explanations!
>
> On Wed, May 06, 2026 at 05:44:50AM -0700, Sean Christopherson wrote:
> > On Wed, May 06, 2026, Alexandru Elisei wrote:
> > > The documentation for KVM_EXIT_MEMORY_FAULT states:
> > >
> > > 'Note! KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
> > > it accompanies a return code of '-1', not '0'! errno will always be set to
> > > EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
> > > should assume kvm_run.exit_reason is stale/undefined for all other error
> > > numbers'.
> > >
> > > where a return code of '-1' is special because according to man 2 ioctl:
> > >
> > > 'On error, -1 is returned, and errno is set to indicate the error'.
> > >
> > > Putting the two together means that the ioctl KVM_RUN must 1) complete with
> > > an error and 2) that error must be either EFAULT or EHWPOISON for
> > > userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.
> >
> > Yes and no. The key escape valve we (very deliberately) gave ourselves is this:
> >
> > userspace should assume kvm_run.exit_reason is stale/undefined for all other
> > error numbers.
> >
> > As arm64 already does, that clause allows KVM to "speculatively" set exit_reason
> > to KVM_EXIT_MEMORY_FAULT. Which is by design. The userspace flow is intended
> > to be "if KVM_RUN returns EFAULT or EHWPOISON, then check for KVM_EXIT_MEMORY_FAULT
> > to see if KVM provided more information about why the EFAULT/EHWPOISON error was
> > returned".
>
> Hm... In general, "speculatively" populating exit_reason with
> KVM_EXIT_MEMORY_FAULT when userspace is not intended to use that information
> looks a bit dubious to me.
Oh, for sure, it's not exactly ideal.
> Why do the work if userspace is not supposed to use the information?
Because not filling kvm_run when KVM is supposed to (per KVM's contract with
userspace) would be a bug, whereas unnecessarily filling kvm_run is "just" wasted
cycles (and not very many of them). x86 also has multiple flows where it fills
kvm_run "speculatively", e.g. in low(ish) level helpers where it's not known if
KVM will actually exit to userspace.
Overall, for code like this, IMO it also yields less complex KVM code, though
I suppose it can also end up being more confusing for readers.
> Regarding gmem_abort(). As I see it, if today someone writes userspace that
> relies on any of the undocumented error codes propagated from kvm_gmem_get_pfn()
> to handle KVM_EXIT_MEMORY_FAULT, that means that KVM can never use those error
> codes for any other exit_reason in the future, because that userspace will
> break.
Hmm, if we wanted to defend against that, we could scribble kvm_run.exit_reason
on the way out of KVM_RUN, e.g.
diff --git virt/kvm/kvm_main.c virt/kvm/kvm_main.c
index 89489996fbc1..76801d103dd9 100644
--- virt/kvm/kvm_main.c
+++ virt/kvm/kvm_main.c
@@ -4475,6 +4475,10 @@ static long kvm_vcpu_ioctl(struct file *filp,
*/
rseq_virt_userspace_exit();
+ if (vcpu->run->exit_reason == KVM_EXIT_MEMORY_FAULT &&
+ r && r != -EFAULT && r != EHWPOISON)
+ vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
+
trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
break;
}
I don't know that I'm convinced that level of paranoia is worth it though.
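To make the effect of such a check concrete, it can be modelled as a pure
function outside the kernel. A sketch only: the SKETCH_* constants stand in
for the real exit reasons, r is the negative kernel error code, and the
comparison here is written against -EHWPOISON.

```c
#include <errno.h>
#include <stdint.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value, only needed when building elsewhere */
#endif

/* Placeholder exit reasons; hypothetical, not taken from <linux/kvm.h>. */
#define SKETCH_EXIT_UNKNOWN		0
#define SKETCH_EXIT_MEMORY_FAULT	1

/*
 * Returns the exit_reason userspace would observe after the scrub.
 * r is the value KVM_RUN is about to return (0 or a negative error).
 */
static uint32_t sketch_scrub_exit_reason(uint32_t exit_reason, int r)
{
	if (exit_reason == SKETCH_EXIT_MEMORY_FAULT &&
	    r && r != -EFAULT && r != -EHWPOISON)
		return SKETCH_EXIT_UNKNOWN;

	return exit_reason;
}
```

Note that r == 0 leaves the exit reason untouched in this model, matching the
condition in the diff.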
> I'm sure this was all carefully considered when designing the interface, I was
> just curious how this particular problem has been solved.
Heh, I like to think we carefully considered the interface, but thinking of every
possible way userspace can be silly is hard :-)
* Re: [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
2026-05-07 13:33 ` Sean Christopherson
@ 2026-05-08 8:55 ` Alexandru Elisei
0 siblings, 0 replies; 6+ messages in thread
From: Alexandru Elisei @ 2026-05-08 8:55 UTC (permalink / raw)
To: Sean Christopherson
Cc: maz, oupton, joey.gouly, suzuki.poulose, yuzenghui,
linux-arm-kernel, kvmarm, tabba, David.Hildenbrand
Hi Sean,
On Thu, May 07, 2026 at 06:33:05AM -0700, Sean Christopherson wrote:
> On Thu, May 07, 2026, Alexandru Elisei wrote:
> > Hi Sean,
> >
> > (Resending this because I managed to mess up the headers, sorry for the
> > duplicate).
> >
> > Thanks for the explanations!
> >
> > On Wed, May 06, 2026 at 05:44:50AM -0700, Sean Christopherson wrote:
> > > On Wed, May 06, 2026, Alexandru Elisei wrote:
> > > > The documentation for KVM_EXIT_MEMORY_FAULT states:
> > > >
> > > > 'Note! KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
> > > > it accompanies a return code of '-1', not '0'! errno will always be set to
> > > > EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
> > > > should assume kvm_run.exit_reason is stale/undefined for all other error
> > > > numbers'.
> > > >
> > > > where a return code of '-1' is special because according to man 2 ioctl:
> > > >
> > > > 'On error, -1 is returned, and errno is set to indicate the error'.
> > > >
> > > > Putting the two together means that the ioctl KVM_RUN must 1) complete with
> > > > an error and 2) that error must be either EFAULT or EHWPOISON for
> > > > userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.
> > >
> > > Yes and no. The key escape valve we (very deliberately) gave ourselves is this:
> > >
> > > userspace should assume kvm_run.exit_reason is stale/undefined for all other
> > > error numbers.
> > >
> > > As arm64 already does, that clause allows KVM to "speculatively" set exit_reason
> > > to KVM_EXIT_MEMORY_FAULT. Which is by design. The userspace flow is intended
> > > to be "if KVM_RUN returns EFAULT or EHWPOISON, then check for KVM_EXIT_MEMORY_FAULT
> > > to see if KVM provided more information about why the EFAULT/EHWPOISON error was
> > > returned".
> >
> > Hm... In general, "speculatively" populating exit_reason with
> > KVM_EXIT_MEMORY_FAULT when userspace is not intended to use that information
> > looks a bit dubious to me.
>
> Oh, for sure, it's not exactly ideal.
>
> > Why do the work if userspace is not supposed to use the information?
>
> Because not filling kvm_run when KVM is supposed to (per KVM's contract with
> userspace) would be a bug, whereas unnecessarily filling kvm_run is "just" wasted
> cycles (and not very many of them). x86 also has multiple flows where it fills
> kvm_run "speculatively", e.g. in low(ish) level helpers where it's not known if
> KVM will actually exit to userspace.
For arm64, it's not that hard to figure out that 0 from the fault handlers means
a return to guest:
fault handler returns 0 => kvm_handle_guest_abort() massages the 0 into 1 =>
kvm_vcpu_arch_ioctl() resumes loop.
Consequently anything other than 0 from the fault handlers means an exit to
userspace.
Not sure if that proves or disproves my point though :(
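The chain above can be written down as a toy model. This is a sketch under
the assumption that fault handlers return either 0 or a negative error; the
function names are hypothetical and only mirror the roles described, not the
kernel code.

```c
#include <errno.h>

/* Models the abort path turning a handler's 0 into 1 ("resume the guest"). */
static int sketch_guest_abort(int handler_ret)
{
	return handler_ret == 0 ? 1 : handler_ret;
}

/*
 * Models the run loop: it keeps going while the result is positive.
 * handler_rets holds the result of each successive fault handler call.
 * Returns the first non-positive result, or 1 if every fault was handled.
 */
static int sketch_run_loop(const int *handler_rets, int n, int *faults_seen)
{
	int ret = 1;

	*faults_seen = 0;
	for (int i = 0; i < n && ret > 0; i++) {
		ret = sketch_guest_abort(handler_rets[i]);
		(*faults_seen)++;
	}

	return ret;
}
```

In this model the only way out of the loop is a non-zero handler result,
which is the point being made about knowing when an exit will happen.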
>
> Overall, for code like this, IMO it also yields less complex KVM code, though
> I suppose it can also end up being more confusing for readers.
>
> > Regarding gmem_abort(). As I see it, if today someone writes userspace that
> > relies on any of the undocumented error codes propagated from kvm_gmem_get_pfn()
> > to handle KVM_EXIT_MEMORY_FAULT, that means that KVM can never use those error
> > codes for any other exit_reason in the future, because that userspace will
> > break.
>
> Hmm, if we wanted to defend against that, we could scribble kvm_run.exit_reason
> on the way out of KVM_RUN, e.g.
>
> diff --git virt/kvm/kvm_main.c virt/kvm/kvm_main.c
> index 89489996fbc1..76801d103dd9 100644
> --- virt/kvm/kvm_main.c
> +++ virt/kvm/kvm_main.c
> @@ -4475,6 +4475,10 @@ static long kvm_vcpu_ioctl(struct file *filp,
> */
> rseq_virt_userspace_exit();
>
> + if (vcpu->run->exit_reason == KVM_EXIT_MEMORY_FAULT &&
> + r && r != -EFAULT && r != EHWPOISON)
^^^^^^^^^^
-EHWPOISON
> + vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
> +
> trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
> break;
> }
I was thinking something like this, to avoid populating KVM_EXIT_MEMORY_FAULT
information and then overwriting it later (I assume all architectures go
through the helper and don't open code it, haven't checked):
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4c14aee1fb06..6e1eeb511967 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2505,11 +2505,14 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr)
/* Max number of entries allowed for each kvm dirty ring */
#define KVM_DIRTY_RING_MAX_ENTRIES 65536
-static inline void kvm_prepare_memory_fault_exit(struct kvm_vcpu *vcpu,
+static inline void kvm_prepare_memory_fault_exit(struct kvm_vcpu *vcpu, int error,
gpa_t gpa, gpa_t size,
bool is_write, bool is_exec,
bool is_private)
{
+ if (error != -EFAULT && error != -EHWPOISON)
+ return;
+
vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
vcpu->run->memory_fault.gpa = gpa;
vcpu->run->memory_fault.size = size;
arm64 sets exit_reason = KVM_EXIT_UNKNOWN before the run loop. After a
cursory look, I *think* x86 does the same, so the result would be similar
to what you propose, at least for these two architectures. We could also
set exit_reason to KVM_EXIT_UNKNOWN if the error is neither -EFAULT nor
-EHWPOISON, to be sure.
Avoids leaking what memory the guest accesses, for the extra, extra
paranoid.
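A self-contained model of the gated helper, including the optional reset to
unknown. It is a sketch, not the kernel helper: the struct, the SKETCH_*
constants and the function name are stand-ins, and the flag arguments are
left out for brevity.

```c
#include <errno.h>
#include <stdint.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value, only needed when building elsewhere */
#endif

/* Placeholder exit reasons; hypothetical, not taken from <linux/kvm.h>. */
#define SKETCH_EXIT_UNKNOWN		0
#define SKETCH_EXIT_MEMORY_FAULT	1

/* Cut-down stand-in for the parts of struct kvm_run used here. */
struct sketch_run {
	uint32_t exit_reason;
	struct {
		uint64_t gpa;
		uint64_t size;
	} memory_fault;
};

/* error is the negative error code the caller is about to return. */
static void sketch_prepare_memory_fault_exit(struct sketch_run *run, int error,
					     uint64_t gpa, uint64_t size)
{
	if (error != -EFAULT && error != -EHWPOISON) {
		/* Optional extra step: make sure nothing stale is left. */
		run->exit_reason = SKETCH_EXIT_UNKNOWN;
		return;
	}

	run->exit_reason = SKETCH_EXIT_MEMORY_FAULT;
	run->memory_fault.gpa = gpa;
	run->memory_fault.size = size;
}
```

With this shape the fault address is never written for errors that userspace
is told to ignore, at the cost of an extra argument at every call site.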
It would also make at least one person (me) less confused about why
KVM_EXIT_MEMORY_FAULT is populated when userspace is not supposed to
consume it :)
On the other hand, all call sites would need to be modified.
>
> I don't know that I'm convinced that level of paranoia is worth it though.
It's up to you, I don't feel strongly about it. If you do decide to go
ahead with it, whatever approach you choose, I can prepare the patch.
>
> > I'm sure this was all carefully considered when designing the interface, I was
> > just curious how this particular problem has been solved.
>
> Heh, I like to think we carefully considered the interface, but thinking of every
> possible way userspace can be silly is hard :-)
Agreed. That's why I think exposing strictly the minimum necessary
information to userspace is a good defence :)
Thanks,
Alex
Thread overview: 6+ messages
2026-05-06 10:50 [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation Alexandru Elisei
2026-05-06 12:44 ` Sean Christopherson
2026-05-06 13:39 ` Alexandru Elisei, Sean Christopherson
2026-05-07 8:45 ` Alexandru Elisei
2026-05-07 13:33 ` Sean Christopherson
2026-05-08 8:55 ` Alexandru Elisei