public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Zhao Liu <zhao1.liu@intel.com>,
	Binbin Wu <binbin.wu@linux.intel.com>,
	xiaoyao.li@intel.com,  qemu-devel@nongnu.org,
	michael.roth@amd.com, rick.p.edgecombe@intel.com,
	 isaku.yamahata@intel.com, farrah.chen@intel.com,
	kvm@vger.kernel.org
Subject: Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL
Date: Thu, 12 Dec 2024 14:11:16 -0800	[thread overview]
Message-ID: <Z1tfhPaHruhS3teK@google.com> (raw)
In-Reply-To: <5b8f7d63-ef0a-487f-bf9d-44421691fa85@redhat.com>

On Thu, Dec 12, 2024, Paolo Bonzini wrote:
> On 12/12/24 20:13, Sean Christopherson wrote:
> > On Thu, Dec 12, 2024, Paolo Bonzini wrote:
> > > If ret is less than zero, will stop the VM anyway as
> > > RUN_STATE_INTERNAL_ERROR.
> > > 
> > > If this has to be fixed in QEMU, I think there's no need to set anything
> > > if ret != 0; also because kvm_convert_memory() returns -1 on error and
> > > that's not how the error would be passed to the guest.
> > > 
> > > However, I think the right fix should simply be this in KVM:
> > > 
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 83fe0a78146f..e2118ba93ef6 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -10066,6 +10066,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> > >   		}
> > >   		vcpu->run->exit_reason        = KVM_EXIT_HYPERCALL;
> > > +		vcpu->run->ret                = 0;
> > 
> > 		vcpu->run->hypercall.ret
> > 
> > >   		vcpu->run->hypercall.nr       = KVM_HC_MAP_GPA_RANGE;
> > >   		vcpu->run->hypercall.args[0]  = gpa;
> > >   		vcpu->run->hypercall.args[1]  = npages;
> > > 
> > > While there is arguably a change in behavior of the kernel both with
> > > the patches in kvm-coco-queue and with the above one, _in practice_
> > > the above change is one that userspace will not notice.
> > 
> > I agree that KVM should initialize "ret", but I don't think '0' is the right
> > value.  KVM shouldn't assume userspace will successfully handle the hypercall.
> > What happens if KVM sets vcpu->run->hypercall.ret to a non-zero value, e.g. -KVM_ENOSYS?
> 
> Unfortunately QEMU is never writing vcpu->run->hypercall.ret, so the guest
> sees -KVM_ENOSYS; this is basically the same bug that Binbin is fixing, just
> with a different value passed to the guest.
> 
> In other words, the above one-liner is pulling the "don't break userspace"
> card.

But how is anything breaking userspace?  QEMU needs to opt-in to intercepting
KVM_HC_MAP_GPA_RANGE, and this has been KVM's behavior since commit 0dbb11230437
("KVM: X86: Introduce KVM_HC_MAP_GPA_RANGE hypercall").

Ah, "ret" happens to be deep in the union and KVM zero allocates vcpu->run, so
QEMU gets lucky and "ret" happens to be zero because no other non-fatal userspace
exit on x86 happens to need as many bytes.  Hilarious.

FWIW, if TDX marshalls hypercall state into KVM's "normal" registers, then KVM's
shenanigans with vcpu->run->hypercall.ret might go away?  Though regardless of
what happens on that front, I think it makes to explicitly initialize "ret" to
*something*.

I checked our VMM, and it does the right thing, so I don't have any objection
to explicitly zeroing "ret".  Though it needs a comment explaining that it's a
terrible hack for broken userspace ;-)

  reply	other threads:[~2024-12-12 22:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-12  3:26 [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL Binbin Wu
2024-12-12  3:41 ` Yao Yuan
2024-12-12  3:44 ` Xiaoyao Li
2024-12-12  5:18   ` Binbin Wu
2024-12-12  7:09     ` Xiaoyao Li
2024-12-12  7:24       ` Binbin Wu
2024-12-12  8:07 ` Zhao Liu
2024-12-12 16:03   ` Paolo Bonzini
2024-12-12 19:13     ` Sean Christopherson
2024-12-12 21:28       ` Paolo Bonzini
2024-12-12 22:11         ` Sean Christopherson [this message]
2024-12-13  1:46         ` Binbin Wu
2024-12-13  1:52           ` Binbin Wu
2024-12-13  1:56           ` Binbin Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z1tfhPaHruhS3teK@google.com \
    --to=seanjc@google.com \
    --cc=binbin.wu@linux.intel.com \
    --cc=farrah.chen@intel.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=michael.roth@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=xiaoyao.li@intel.com \
    --cc=zhao1.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox