public inbox for kvm@vger.kernel.org
From: bugzilla-daemon@kernel.org
To: kvm@vger.kernel.org
Subject: [Bug 216867] KVM instruction emulation breaks LOCK instruction atomicity when CMPXCHG fails
Date: Tue, 03 Jan 2023 22:38:32 +0000
Message-ID: <bug-216867-28872-dcUCSYBNjg@https.bugzilla.kernel.org/>
In-Reply-To: <bug-216867-28872@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=216867

--- Comment #2 from Eric Li (ercli@ucdavis.edu) ---
On Tue, 2023-01-03 at 22:05 +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=216867
> 
> --- Comment #1 from Sean Christopherson (seanjc@google.com) ---
> On Fri, Dec 30, 2022, bugzilla-daemon@kernel.org wrote:
> 
> > My code performs the following experiment repeatedly on 3 CPUs:
> > 
> > * Initially, "ptr" at address 0xb8000 (VGA memory mapped I/O) is set to 0.
> > * CPU 0 writes 0x12345678 to ptr, then increases counter "count0".
> > * In an infinite loop, CPU 1 tries to exchange ptr with register EAX
> >   (which contains 0) using the XCHG instruction. If CPU 1 sees
> >   0x12345678, it increases counter "count1".
> > * CPU 2's behavior is similar to CPU 1, except it increases counter
> >   "count2" when it sees 0x12345678.
> > 
> > Ideally, after each experiment there should always be count1 + count2 =
> > count0. However, in KVM, there may be count1 + count2 > count0. This is
> > because CPU 0 writes 0x12345678 to ptr once, but CPU 1 and CPU 2 both get
> > 0x12345678 in XCHG. Note that the XCHG instruction always implements the
> > locking protocol.
> > 
> > There is also a deadlock after running the experiment a few times.
> > However, I am not trying to explain it for now.
> 
> Is the suspect deadlock in userspace, the guest, or in the host kernel?
> 

The deadlock happens in the guest. It is due to how my experiment is
implemented. It is not directly related to KVM.
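
For reference, the experiment quoted above can be sketched in user space
with C11 atomics. This is only an illustration, not the guest code from
the report: threads stand in for the three CPUs, an ordinary atomic word
stands in for the word at 0xb8000, and the names (consumer,
run_experiment, ROUNDS) are made up here. It also assumes that CPU 0
waits for ptr to return to 0 before starting the next round, which the
report does not spell out.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAGIC  0x12345678u
#define ROUNDS 10000            /* hypothetical number of experiments */

static _Atomic uint32_t ptr;    /* stands in for the 32-bit word at 0xb8000 */
static atomic_bool stop;
static long count0, count1, count2;

/* CPU 1 and CPU 2: swap 0 into ptr and count each time MAGIC comes back. */
static void *consumer(void *arg)
{
    long *count = arg;

    while (!atomic_load(&stop)) {
        /* atomic_exchange() typically compiles to XCHG on x86 */
        if (atomic_exchange(&ptr, 0) == MAGIC)
            (*count)++;
        else
            sched_yield();
    }
    return NULL;
}

/* CPU 0: publish MAGIC once per round. Returns count1 + count2 - count0. */
static long run_experiment(void)
{
    pthread_t t1, t2;
    int rc;

    count0 = count1 = count2 = 0;
    atomic_store(&ptr, 0);
    atomic_store(&stop, false);

    rc = pthread_create(&t1, NULL, consumer, &count1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, consumer, &count2);
    assert(rc == 0);

    for (int i = 0; i < ROUNDS; i++) {
        atomic_store(&ptr, MAGIC);
        count0++;
        /* Assumption: wait until one consumer has taken the value. */
        while (atomic_load(&ptr) != 0)
            sched_yield();
    }

    atomic_store(&stop, true);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return count1 + count2 - count0;
}
```

Built with something like "cc -pthread", run_experiment() should return 0
when the exchange is truly atomic; a positive result corresponds to the
count1 + count2 > count0 symptom in the report.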

> > Guessed cause:
> > 
> > I guess that KVM emulates the XCHG instruction that accesses 0xb8000.
> > The call stack should be:
> > 
> > ...
> >  x86_emulate_instruction (arch/x86/kvm/x86.c)
> >   x86_emulate_insn (arch/x86/kvm/emulate.c)
> >    writeback (arch/x86/kvm/emulate.c)
> >     segmented_cmpxchg (arch/x86/kvm/emulate.c)
> >      emulator_cmpxchg_emulated (arch/x86/kvm/x86.c, ->cmpxchg_emulated)
> >       emulator_try_cmpxchg_user (arch/x86/kvm/x86.c)
> >        ...
> >         CMPXCHG instruction
> > 
> > Suppose CPU 2 wants to write 0 to ptr using writeback(), expecting ptr
> > to already contain 0x12345678. However, CPU 1 changes the content of ptr
> > to 0. So:
> > * The CMPXCHG instruction fails (clears ZF).
> > * emulator_try_cmpxchg_user() returns 1.
> > * emulator_cmpxchg_emulated() returns X86EMUL_CMPXCHG_FAILED.
> > * segmented_cmpxchg() returns X86EMUL_CMPXCHG_FAILED.
> > * writeback() returns X86EMUL_CMPXCHG_FAILED.
> > * x86_emulate_insn() returns EMULATION_OK.
> > 
> > Thus, I think the root cause of this bug is that x86_emulate_insn()
> > ignores the X86EMUL_CMPXCHG_FAILED error. The correct behavior should be
> > retrying the emulation using the updated value (similar to
> > load-linked/store-conditional).
> 
> KVM does retry the emulation, albeit in a very roundabout and non-robust
> way. On X86EMUL_CMPXCHG_FAILED, x86_emulate_insn() skips the EIP update and
> doesn't writeback GPRs.  x86_emulate_instruction() is flawed and emulates
> single-step, but the "eip" written should be the original RIP, i.e.
> shouldn't advance past the instructions being emulated.  The single-step
> mess should be fixed, but I doubt that's the root cause here.
> 
> 

I see, thanks for the explanation. Now the retrying code looks correct
to me (though I agree that the code could have been written in a better
way).
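
To make the retry behavior concrete, here is a toy model of it. It is a
sketch and not KVM code: struct toy_vcpu, emulate_xchg_once() and
emulate_xchg() are hypothetical names, and the loop in emulate_xchg()
stands in for the guest re-executing the same instruction after a failed
attempt. The point it illustrates is the one described above: a failed
CMPXCHG leaves EAX and the instruction pointer untouched, so the next
attempt starts from a fresh read.

```c
#include <stdatomic.h>
#include <stdint.h>

enum emul_result { EMUL_OK, EMUL_CMPXCHG_FAILED };

/* Hypothetical, minimal vCPU state for "xchg [mem], eax". */
struct toy_vcpu {
    uint32_t eax;
    unsigned ip;
};

/*
 * One emulation attempt. `seen` is the value the emulator read from
 * memory earlier; the write only commits if memory still holds it.
 */
static enum emul_result emulate_xchg_once(struct toy_vcpu *v,
                                          _Atomic uint32_t *mem,
                                          uint32_t seen)
{
    uint32_t expected = seen;

    if (!atomic_compare_exchange_strong(mem, &expected, v->eax))
        return EMUL_CMPXCHG_FAILED;  /* no EAX writeback, no ip advance */

    v->eax = seen;                   /* commit register state */
    v->ip += 1;                      /* advance past the instruction */
    return EMUL_OK;
}

/* Retry until the attempt commits; returns the number of attempts. */
static unsigned emulate_xchg(struct toy_vcpu *v, _Atomic uint32_t *mem)
{
    unsigned attempts = 0;

    for (;;) {
        uint32_t seen = atomic_load(mem);   /* fresh read on every retry */

        attempts++;
        if (emulate_xchg_once(v, mem, seen) == EMUL_OK)
            return attempts;
    }
}
```

Calling emulate_xchg_once() with a stale `seen` value models another CPU
having changed the word between the read and the write: it reports
EMUL_CMPXCHG_FAILED and changes nothing.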

> Is there a memslot for 0xb8000?  I assume not since KVM is emulating (have
> you actually verified that, e.g. with tracepoints?).  KVM's ABI doesn't
> support atomic MMIO operations, i.e. if there's no memslot, KVM will
> effectively drop the LOCK semantics.  If that's indeed what's happening,
> you should see
> 
>   kvm: emulating exchange as write
> 
> in the host dmesg (just once though).
> 

You are right. I see "kvm: emulating exchange as write" when I run the
guest I wrote. Looks like this is the check that causes KVM to drop
LOCK on VGA MMIO:

>       gpa = kvm_mmu_gva_to_gpa_write(vcpu, addr, NULL);
>
>       if (gpa == INVALID_GPA ||
>           (gpa & PAGE_MASK) == APIC_DEFAULT_PHYS_BASE)
>               goto emul_write;

Closing this bug since LOCK on MMIO is not supported by KVM's ABI.
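
For the record, the count1 + count2 > count0 symptom follows directly
once the exchange is emulated as a write. The sketch below is a
deterministic toy model, not KVM code. It assumes the non-atomic path
amounts to a separate read followed by a plain write, and all names in it
(toy_xchg, step_read, step_write, count_magic_seen) are hypothetical.

```c
#include <stdint.h>

#define MAGIC 0x12345678u

/* One in-flight emulated XCHG: register operand plus the value read. */
struct toy_xchg {
    uint32_t reg;   /* EAX: 0 before the exchange */
    uint32_t old;   /* value read from memory */
};

/* Step 1 of the non-atomic emulation: read the memory operand. */
static void step_read(struct toy_xchg *x, const uint32_t *mem)
{
    x->old = *mem;
}

/* Step 2: plain write of the register, then register writeback. */
static void step_write(struct toy_xchg *x, uint32_t *mem)
{
    *mem = x->reg;
    x->reg = x->old;
}

/* Returns how many of the two CPUs end up observing MAGIC. */
static int count_magic_seen(int interleaved)
{
    uint32_t mem = MAGIC;            /* CPU 0 has written MAGIC once */
    struct toy_xchg cpu1 = { 0, 0 };
    struct toy_xchg cpu2 = { 0, 0 };

    if (interleaved) {
        /* Both reads happen before either write. */
        step_read(&cpu1, &mem);
        step_read(&cpu2, &mem);
        step_write(&cpu1, &mem);
        step_write(&cpu2, &mem);
    } else {
        /* Each exchange runs to completion, as LOCK would guarantee. */
        step_read(&cpu1, &mem);
        step_write(&cpu1, &mem);
        step_read(&cpu2, &mem);
        step_write(&cpu2, &mem);
    }
    return (cpu1.reg == MAGIC) + (cpu2.reg == MAGIC);
}
```

With the serialized order only one CPU sees 0x12345678; with the
interleaved order both do, even though CPU 0 wrote it once.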


Thread overview: 6+ messages
2022-12-30  3:02 [Bug 216867] New: KVM instruction emulation breaks LOCK instruction atomicity when CMPXCHG fails bugzilla-daemon
2023-01-03 22:05 ` Sean Christopherson
2023-01-03 22:05 ` [Bug 216867] " bugzilla-daemon
2023-01-03 22:38 ` bugzilla-daemon [this message]
2023-01-03 22:39 ` bugzilla-daemon
2023-01-03 22:48 ` bugzilla-daemon
