Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Alexandru Elisei <alexandru.elisei@arm.com>
To: Sean Christopherson <seanjc@google.com>
Cc: maz@kernel.org, oupton@kernel.org, joey.gouly@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	tabba@google.com, David.Hildenbrand@arm.com
Subject: Re: [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation
Date: Fri, 8 May 2026 09:55:17 +0100	[thread overview]
Message-ID: <af2k9btE5hS6lWHR@raptor> (raw)
In-Reply-To: <afyUkaQ1wtivyH8W@google.com>

Hi Sean,

On Thu, May 07, 2026 at 06:33:05AM -0700, Sean Christopherson wrote:
> On Thu, May 07, 2026, Alexandru Elisei wrote:
> > Hi Sean,
> > 
> > (Resending this because I managed to mess up the headers, sorry for the
> > duplicate).
> > 
> > Thanks for the explanations!
> > 
> > On Wed, May 06, 2026 at 05:44:50AM -0700, Sean Christopherson wrote:
> > > On Wed, May 06, 2026, Alexandru Elisei wrote:
> > > > The documentation for KVM_EXIT_MEMORY_FAULT states:
> > > > 
> > > > 'Note!  KVM_EXIT_MEMORY_FAULT is unique among all KVM exit reasons in that
> > > > it accompanies a return code of '-1', not '0'!  errno will always be set to
> > > > EFAULT or EHWPOISON when KVM exits with KVM_EXIT_MEMORY_FAULT, userspace
> > > > should assume kvm_run.exit_reason is stale/undefined for all other error
> > > > numbers'.
> > > > 
> > > > where a return code of '-1' is special because according to man 2 ioctl:
> > > > 
> > > > 'On error, -1 is returned, and errno is set to indicate the error'.
> > > > 
> > > > Putting the two together means that the ioctl KVM_RUN must 1) complete with
> > > > an error and 2) that error must must be either EFAULT or EHWPOISON for
> > > > userspace to detect a KVM_EXIT_MEMORY_FAULT VCPU exit.
> > > 
> > > Yes and no.  The key escape valve we (very deliberately) gave ourselves is this:
> > > 
> > >   userspace should assume kvm_run.exit_reason is stale/undefined for all other
> > >   error numbers.
> > > 
> > > As arm64 already does, that clause allows KVM to "speculatively" set exit_reason
> > > to KVM_EXIT_MEMORY_FAULT.  Which is by design.  The userspace flow is intended
> > > to be "if KVM_RUN returns EFAULT or EHWPOISON, then check for KVM_EXIT_MEMORY_FAULT
> > > to see if KVM provided more information about why the EFAULT/EHWPOISON error was
> > > returned".
> > 
> > Hm... In general, "speculatively" populating exit_reason with
> > KVM_EXIT_MEMORY_FAULT when userspace is not intended to use that information
> > looks a bit dubious to me.
> 
> Oh, for sure, it's not exactly ideal.
> 
> > Why do the work if userspace is not supposed to use the information?
> 
> Because not filling kvm_run when KVM is supposed to (per KVM's contract with
> userspace) would be a bug, whereas unnecessarily filling kvm_run is "just" wasted
> cycles (and not very many of them).  x86 also has multiple flows where it fills
> kvm_run "speculatively", e.g. in low(ish) level helpers where it's not known if
> KVM will actually exit to userspace.

For arm64, it's not that hard to figure out that 0 from the fault handlers means
a return to guest:

fault handler returns 0 => kvm_handle_guest_abort() massages the 0 into 1 =>
kvm_vcpu_arch_ioctl() resumes loop.

Consequently anything other than 0 from the fault handlers means an exit to
userspace.

Not sure if that proves or disproves my point though :(

> 
> Overall, for code like this, IMO it's also yields less complex KVM code, though
> I suppose it can also end up being more confusing for readers.
> 
> > Regarding gmem_abort(). As I see it, if today someone writes userspace that
> > relies on any of the undocumented error codes propagated from kvm_gmem_get_pfn()
> > to handle KVM_EXIT_MEMORY_FAULT, that means that KVM can never use those error
> > codes for any other exit_reason in the future, because that userspace will
> > break.
> 
> Hmm, if we wanted to defend against that, we could scribble kvm_run.exit_reason
> on the way out of KVM_RUN, e.g.
> 
> diff --git virt/kvm/kvm_main.c virt/kvm/kvm_main.c
> index 89489996fbc1..76801d103dd9 100644
> --- virt/kvm/kvm_main.c
> +++ virt/kvm/kvm_main.c
> @@ -4475,6 +4475,10 @@ static long kvm_vcpu_ioctl(struct file *filp,
>                  */
>                 rseq_virt_userspace_exit();
>  
> +               if (vcpu->run->exit_reason == KVM_EXIT_MEMORY_FAULT &&
> +                   r && r != -EFAULT && r != EHWPOISON)
                                               ^^^^^^^^^^
					       -EHWPOISON

> +                       vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
> +
>                 trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
>                 break;
>         }

I was thinking something like this, to avoid populating KVM_EXIT_MEMORY_FAULT
information and then overwriting it later (I assume all architectures go
through the helper and don't open code it, haven't checked):

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4c14aee1fb06..6e1eeb511967 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2505,11 +2505,14 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr)
 /* Max number of entries allowed for each kvm dirty ring */
 #define  KVM_DIRTY_RING_MAX_ENTRIES  65536

-static inline void kvm_prepare_memory_fault_exit(struct kvm_vcpu *vcpu,
+static inline void kvm_prepare_memory_fault_exit(struct kvm_vcpu *vcpu, int error,
                                                 gpa_t gpa, gpa_t size,
                                                 bool is_write, bool is_exec,
                                                 bool is_private)
 {
+       if (error != -EFAULT && error != -EHWPOISON)
+               return;
+
        vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
        vcpu->run->memory_fault.gpa = gpa;
        vcpu->run->memory_fault.size = size;

arm64 sets exit_reason = KVM_EXIT_UNKNOWN before the run loop. After a
cursory look, I *think* x86 does the same, so the result would be similar
to what you propose, at least for these two architectures. We could also
set exit_reason to unknown if !-EFAULT && !-EHWPOISON to be sure.

Avoids leaking what memory the guest accesses, for the extra, extra
paranoid.

It would also make at least one person (me) less confused about why
KVM_EXIT_MEMORY_FAULT is populated when userspace is not supposed to
consume it :)

On the other hand, all call sites would need to be modified.

> 
> I don't know that I'm convinced that level of paranoia is worth it though.

It's up to you, I don't feel strongly about it. If you do decide to go
ahead with it, whatever approach you choose, I can prepare the patch.

> 
> > I'm sure this was all carefully considered when designing the interface, I was
> > just curious how this particular problem has been solved.
> 
> Heh, I like to think we carefully considered the interface, but thinking of every
> possible way userspace can be silly is hard :-)

Agreed. That's why I think exposing strictly the minimum necessary
information to userspace is a good defence :)

Thanks,
Alex


      reply	other threads:[~2026-05-08  8:55 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-06 10:50 [RFC PATCH] KVM: arm64: Align KVM_EXIT_MEMORY_FAULT error codes with documentation Alexandru Elisei
2026-05-06 12:44 ` Sean Christopherson
2026-05-06 13:39   ` Alexandru Elisei, Sean Christopherson
2026-05-07  8:45   ` Alexandru Elisei
2026-05-07 13:33     ` Sean Christopherson
2026-05-08  8:55       ` Alexandru Elisei [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=af2k9btE5hS6lWHR@raptor \
    --to=alexandru.elisei@arm.com \
    --cc=David.Hildenbrand@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=maz@kernel.org \
    --cc=oupton@kernel.org \
    --cc=seanjc@google.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox