* [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA
@ 2024-07-24 4:45 Tao Su
2024-07-24 15:33 ` Sean Christopherson
0 siblings, 1 reply; 3+ messages in thread
From: Tao Su @ 2024-07-24 4:45 UTC (permalink / raw)
To: kvm; +Cc: seanjc, pbonzini, chao.gao, xiaoyao.li, tao1.su
When emulating POPA and exiting to userspace for MMIO, reset modified RSP
as emulation context may not be reset. POPA may generate more multiple
reads, i.e. multiple POPs from the stack, but if stack points to MMIO,
KVM needs to emulate multiple MMIO reads.
When one MMIO done, POPA may be re-emulated with EMULTYPE_NO_DECODE set,
i.e. ctxt will not be reset, but RSP is modified by previous emulation of
current POPA instruction, which eventually leads to emulation error.
The commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if
emulator detects exception") provides a detailed analysis of how KVM
emulates multiple MMIO reads, and its correctness can be verified in the
POPA instruction with this patch.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
---
For testing, we can add POPA to the emulator case in kvm-unit-test.
---
arch/x86/kvm/emulate.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index e72aed25d721..3746fef6ca60 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1999,6 +1999,7 @@ static int em_pushf(struct x86_emulate_ctxt *ctxt)
static int em_popa(struct x86_emulate_ctxt *ctxt)
{
+ unsigned long old_esp = reg_read(ctxt, VCPU_REGS_RSP);
int rc = X86EMUL_CONTINUE;
int reg = VCPU_REGS_RDI;
u32 val = 0;
@@ -2010,8 +2011,11 @@ static int em_popa(struct x86_emulate_ctxt *ctxt)
}
rc = emulate_pop(ctxt, &val, ctxt->op_bytes);
- if (rc != X86EMUL_CONTINUE)
+ if (rc != X86EMUL_CONTINUE) {
+ assign_register(reg_rmw(ctxt, VCPU_REGS_RSP),
+ old_esp, ctxt->op_bytes);
break;
+ }
assign_register(reg_rmw(ctxt, reg), val, ctxt->op_bytes);
--reg;
}
base-commit: 786c8248dbd33a5a7a07f7c6e55a7bfc68d2ca48
--
2.34.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA
2024-07-24 4:45 [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA Tao Su
@ 2024-07-24 15:33 ` Sean Christopherson
2024-07-25 2:32 ` Tao Su
0 siblings, 1 reply; 3+ messages in thread
From: Sean Christopherson @ 2024-07-24 15:33 UTC (permalink / raw)
To: Tao Su; +Cc: kvm, pbonzini, chao.gao, xiaoyao.li
On Wed, Jul 24, 2024, Tao Su wrote:
> When emulating POPA and exiting to userspace for MMIO, reset modified RSP
> as emulation context may not be reset. POPA may generate more multiple
> reads, i.e. multiple POPs from the stack, but if stack points to MMIO,
> KVM needs to emulate multiple MMIO reads.
>
> When one MMIO done, POPA may be re-emulated with EMULTYPE_NO_DECODE set,
> i.e. ctxt will not be reset, but RSP is modified by previous emulation of
> current POPA instruction, which eventually leads to emulation error.
>
> The commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if
> emulator detects exception") provides a detailed analysis of how KVM
> emulates multiple MMIO reads, and its correctness can be verified in the
> POPA instruction with this patch.
I don't see how this can work. If POPA is reading from MMIO, it will need to
do 8 distinct emulated MMIO accesses. Unwinding to the original RSP will allow
the first MMIO (store to EDI) to succeed, but then the second MMIO (store to ESI)
will exit back to userspace. And the second restart will load EDI with the
result of the MMIO, not ESI. It will also re-trigger the second MMIO indefinitely.
To make this work, KVM would need to allow precisely resuming execution where
it left off. We can't use MMIO fragments, because unlike MMIO accesses that
split pages, each memory load is an individual access.
I don't see any reason to try to make this work. It's a ridiculously convoluted
scenario that, AFAIK, has no real world application.
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> ---
> For testing, we can add POPA to the emulator case in kvm-unit-test.
> ---
> arch/x86/kvm/emulate.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index e72aed25d721..3746fef6ca60 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -1999,6 +1999,7 @@ static int em_pushf(struct x86_emulate_ctxt *ctxt)
>
> static int em_popa(struct x86_emulate_ctxt *ctxt)
> {
> + unsigned long old_esp = reg_read(ctxt, VCPU_REGS_RSP);
> int rc = X86EMUL_CONTINUE;
> int reg = VCPU_REGS_RDI;
> u32 val = 0;
> @@ -2010,8 +2011,11 @@ static int em_popa(struct x86_emulate_ctxt *ctxt)
> }
>
> rc = emulate_pop(ctxt, &val, ctxt->op_bytes);
> - if (rc != X86EMUL_CONTINUE)
> + if (rc != X86EMUL_CONTINUE) {
> + assign_register(reg_rmw(ctxt, VCPU_REGS_RSP),
> + old_esp, ctxt->op_bytes);
> break;
> + }
> assign_register(reg_rmw(ctxt, reg), val, ctxt->op_bytes);
> --reg;
> }
>
> base-commit: 786c8248dbd33a5a7a07f7c6e55a7bfc68d2ca48
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA
2024-07-24 15:33 ` Sean Christopherson
@ 2024-07-25 2:32 ` Tao Su
0 siblings, 0 replies; 3+ messages in thread
From: Tao Su @ 2024-07-25 2:32 UTC (permalink / raw)
To: Sean Christopherson; +Cc: kvm, pbonzini, chao.gao, xiaoyao.li
On Wed, Jul 24, 2024 at 08:33:48AM -0700, Sean Christopherson wrote:
> On Wed, Jul 24, 2024, Tao Su wrote:
> > When emulating POPA and exiting to userspace for MMIO, reset modified RSP
> > as emulation context may not be reset. POPA may generate more multiple
> > reads, i.e. multiple POPs from the stack, but if stack points to MMIO,
> > KVM needs to emulate multiple MMIO reads.
> >
> > When one MMIO done, POPA may be re-emulated with EMULTYPE_NO_DECODE set,
> > i.e. ctxt will not be reset, but RSP is modified by previous emulation of
> > current POPA instruction, which eventually leads to emulation error.
> >
> > The commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if
> > emulator detects exception") provides a detailed analysis of how KVM
> > emulates multiple MMIO reads, and its correctness can be verified in the
> > POPA instruction with this patch.
>
> I don't see how this can work. If POPA is reading from MMIO, it will need to
> do 8 distinct emulated MMIO accesses. Unwinding to the original RSP will allow
> the first MMIO (store to EDI) to succeed, but then the second MMIO (store to ESI)
> will exit back to userspace. And the second restart will load EDI with the
> result of the MMIO, not ESI. It will also re-trigger the second MMIO indefinitely.
>
This can work like commit 0dc902267cb3 description. Since there is a buffer
(ctxt->mem_read), KVM can safely restart the instruction from the beginning.
After the first MMIO (store to EDI) done, vcpu->mmio_read_completed is set and
POPA is re-emulated with EMULTYPE_NO_DECODE. When re-emulating, KVM will try
first MMIO (store to EDI) _again_, but KVM will not exit to userspace in
emulator_read_write() because vcpu->mmio_read_completed is set and then adjust
mc->end in read_emulated().
For the second MMIO (store to ESI), it will exit to userspace and re-emulate.
When re-emulating, KVM can directly read EDI from the buffer (mc->data) and try
second MMIO (store to ESI) _again_, but KVM will not exit to userspace in
emulator_read_write() because vcpu->mmio_read_completed is set and then adjust
mc->end in read_emulated().
For the third MMIO (store to EBP) ...
> To make this work, KVM would need to allow precisely resuming execution where
> it left off. We can't use MMIO fragments, because unlike MMIO accesses that
> split pages, each memory load is an individual access.
>
Since all MMIO reads are saved to the buffer (mc->data) and the index (mc->end)
is adjusted by every re-emulation, KVM already supports multiple MMIO reads.
> I don't see any reason to try to make this work. It's a ridiculously convoluted
> scenario that, AFAIK, has no real world application.
>
I agree POPA+MMIO is rare but supporting it to be same as hardware is cheap. We
can also use POPA to validate the multiple MMIO reads which has a complex flow (
actually, that's why I tried POPA in kvm-unit-test and then find this issue).
> > Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> > ---
> > For testing, we can add POPA to the emulator case in kvm-unit-test.
> > ---
> > arch/x86/kvm/emulate.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> > index e72aed25d721..3746fef6ca60 100644
> > --- a/arch/x86/kvm/emulate.c
> > +++ b/arch/x86/kvm/emulate.c
> > @@ -1999,6 +1999,7 @@ static int em_pushf(struct x86_emulate_ctxt *ctxt)
> >
> > static int em_popa(struct x86_emulate_ctxt *ctxt)
> > {
> > + unsigned long old_esp = reg_read(ctxt, VCPU_REGS_RSP);
> > int rc = X86EMUL_CONTINUE;
> > int reg = VCPU_REGS_RDI;
> > u32 val = 0;
> > @@ -2010,8 +2011,11 @@ static int em_popa(struct x86_emulate_ctxt *ctxt)
> > }
> >
> > rc = emulate_pop(ctxt, &val, ctxt->op_bytes);
> > - if (rc != X86EMUL_CONTINUE)
> > + if (rc != X86EMUL_CONTINUE) {
> > + assign_register(reg_rmw(ctxt, VCPU_REGS_RSP),
> > + old_esp, ctxt->op_bytes);
> > break;
> > + }
> > assign_register(reg_rmw(ctxt, reg), val, ctxt->op_bytes);
> > --reg;
> > }
> >
> > base-commit: 786c8248dbd33a5a7a07f7c6e55a7bfc68d2ca48
> > --
> > 2.34.1
> >
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2024-07-25 2:37 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-24 4:45 [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA Tao Su
2024-07-24 15:33 ` Sean Christopherson
2024-07-25 2:32 ` Tao Su
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox