Re: [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: Tao Su <tao1.su@linux.intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, chao.gao@intel.com,
	xiaoyao.li@intel.com
Subject: Re: [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA
Date: Thu, 25 Jul 2024 10:32:53 +0800	[thread overview]
Message-ID: <ZqG5VaFKdP+G3/vg@linux.bj.intel.com> (raw)
In-Reply-To: <ZqEPrE429UQi9duo@google.com>

On Wed, Jul 24, 2024 at 08:33:48AM -0700, Sean Christopherson wrote:
> On Wed, Jul 24, 2024, Tao Su wrote:
> > When emulating POPA and exiting to userspace for MMIO, reset modified RSP
> > as emulation context may not be reset. POPA may generate more multiple
> > reads, i.e. multiple POPs from the stack, but if stack points to MMIO,
> > KVM needs to emulate multiple MMIO reads.
> > 
> > When one MMIO done, POPA may be re-emulated with EMULTYPE_NO_DECODE set,
> > i.e. ctxt will not be reset, but RSP is modified by previous emulation of
> > current POPA instruction, which eventually leads to emulation error.
> > 
> > The commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if
> > emulator detects exception") provides a detailed analysis of how KVM
> > emulates multiple MMIO reads, and its correctness can be verified in the
> > POPA instruction with this patch.
> 
> I don't see how this can work.  If POPA is reading from MMIO, it will need to
> do 8 distinct emulated MMIO accesses.  Unwinding to the original RSP will allow
> the first MMIO (store to EDI) to succeed, but then the second MMIO (store to ESI)
> will exit back to userspace.  And the second restart will load EDI with the
> result of the MMIO, not ESI.  It will also re-trigger the second MMIO indefinitely.
> 

This can work like commit 0dc902267cb3 description. Since there is a buffer
(ctxt->mem_read), KVM can safely restart the instruction from the beginning.

After the first MMIO (store to EDI) done, vcpu->mmio_read_completed is set and
POPA is re-emulated with EMULTYPE_NO_DECODE. When re-emulating, KVM will try
first MMIO (store to EDI) _again_, but KVM will not exit to userspace in
emulator_read_write() because vcpu->mmio_read_completed is set and then adjust
mc->end in read_emulated().

For the second MMIO (store to ESI), it will exit to userspace and re-emulate.
When re-emulating, KVM can directly read EDI from the buffer (mc->data) and try
second MMIO (store to ESI) _again_, but KVM will not exit to userspace in
emulator_read_write() because vcpu->mmio_read_completed is set and then adjust
mc->end in read_emulated().

For the third MMIO (store to EBP) ...

> To make this work, KVM would need to allow precisely resuming execution where
> it left off.  We can't use MMIO fragments, because unlike MMIO accesses that
> split pages, each memory load is an individual access.
> 

Since all MMIO reads are saved to the buffer (mc->data) and the index (mc->end)
is adjusted by every re-emulation, KVM already supports multiple MMIO reads.

> I don't see any reason to try to make this work.  It's a ridiculously convoluted
> scenario that, AFAIK, has no real world application.
> 

I agree POPA+MMIO is rare but supporting it to be same as hardware is cheap. We
can also use POPA to validate the multiple MMIO reads which has a complex flow (
actually, that's why I tried POPA in kvm-unit-test and then find this issue).

> > Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> > ---
> > For testing, we can add POPA to the emulator case in kvm-unit-test.
> > ---
> >  arch/x86/kvm/emulate.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> > index e72aed25d721..3746fef6ca60 100644
> > --- a/arch/x86/kvm/emulate.c
> > +++ b/arch/x86/kvm/emulate.c
> > @@ -1999,6 +1999,7 @@ static int em_pushf(struct x86_emulate_ctxt *ctxt)
> >  
> >  static int em_popa(struct x86_emulate_ctxt *ctxt)
> >  {
> > +	unsigned long old_esp = reg_read(ctxt, VCPU_REGS_RSP);
> >  	int rc = X86EMUL_CONTINUE;
> >  	int reg = VCPU_REGS_RDI;
> >  	u32 val = 0;
> > @@ -2010,8 +2011,11 @@ static int em_popa(struct x86_emulate_ctxt *ctxt)
> >  		}
> >  
> >  		rc = emulate_pop(ctxt, &val, ctxt->op_bytes);
> > -		if (rc != X86EMUL_CONTINUE)
> > +		if (rc != X86EMUL_CONTINUE) {
> > +			assign_register(reg_rmw(ctxt, VCPU_REGS_RSP),
> > +					old_esp, ctxt->op_bytes);
> >  			break;
> > +		}
> >  		assign_register(reg_rmw(ctxt, reg), val, ctxt->op_bytes);
> >  		--reg;
> >  	}
> > 
> > base-commit: 786c8248dbd33a5a7a07f7c6e55a7bfc68d2ca48
> > -- 
> > 2.34.1
> >

     prev parent reply	other threads:[~2024-07-25  2:37 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-24  4:45 [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA Tao Su
2024-07-24 15:33 ` Sean Christopherson
2024-07-25  2:32   ` Tao Su [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZqG5VaFKdP+G3/vg@linux.bj.intel.com \
    --to=tao1.su@linux.intel.com \
    --cc=chao.gao@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=xiaoyao.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox