From: Gleb Natapov <gleb@redhat.com>
To: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Cc: kvm@vger.kernel.org, Avi Kivity <avi@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: Question on skip_emulated_instructions()
Date: Wed, 7 Apr 2010 18:43:25 +0300 [thread overview]
Message-ID: <20100407154324.GF303@redhat.com> (raw)
In-Reply-To: <r2l87e9effc1004062325jba1026e1v2598333ccfa51964@mail.gmail.com>
On Wed, Apr 07, 2010 at 03:25:10PM +0900, Yoshiaki Tamura wrote:
> 2010/4/6 Gleb Natapov <gleb@redhat.com>:
> > On Tue, Apr 06, 2010 at 01:11:23PM +0900, Yoshiaki Tamura wrote:
> >> Hi.
> >>
> >> When handle_io() is called, rip is currently proceeded *before* actually having
> >> I/O handled by qemu in userland. Upon implementing Kemari for
> >> KVM(http://www.mail-archive.com/kvm@vger.kernel.org/msg25141.html) mainly in
> >> userland qemu, we encountered a problem that synchronizing the content of VCPU
> >> before handling I/O in qemu is too late because rip is already proceeded in KVM,
> >> Although we avoided this issue with temporal hack, I would like to ask a few
> >> question on skip_emulated_instructions.
> >>
> >> 1. Does rip need to be proceeded before having I/O handled by qemu?
> > In current kvm.git rip is proceeded before I/O is handled by qemu only
> > in case of "out" instruction. From architecture point of view I think
> > it's OK since on real HW you can't guaranty that I/O will take effect
> > before instruction pointer is advanced. It is done like that because we
> > want "out" emulation to be real fast so we skip x86 emulator.
>
> Thanks for your reply.
>
> If proceeding rip later doesn't break the behavior of devices or
> introduce slow down, I would like that to be done.
>
Device can not care less about what value rip register currently has.
Why is it matters for you code?
> >> 2. If no, is it possible to divide skip_emulated_instructions(), like
> >> rec_emulated_instructions() to remember to next_rip, and
> >> skip_emulated_instructions() to actually proceed the rip.
> > Currently only emulator can call userspace to do I/O, so after
> > userspace returns after I/O exit, control is handled back to emulator
> > unconditionally. "out" instruction skips emulator, but there is nothing
> > to do after userspace returns, so regular cpu loop is executed. If we
> > want to advance rip only after userspace executed I/O done by "out" we
> > need to distinguish who requested I/O (emulator or kvm_fast_pio_out())
> > and call different code depending on who that was. It can be done by
> > having a callback that (if not null) is called on return from userspace.
>
> Your suggestion is to introduce a callback entry, and instead of
> calling kvm_rip_write(), set it to the entry before calling
> kvm_fast_pio_out(),
> and check the entry upon return from the userspace, correct?
>
Something like that, yes.
> According to the comment in x86.c, when it was "out" instruction
> vcpu->arch.pio.count is set to 0 to skip the emulator.
> To call kvm_fast_pio_out(), "!string" and "!in" must be set.
> If we can check, vcpu->arch.pio.count, "string" and "in" on return
> from the userspace, can't we distinguish who requested I/O, emulator
> or kvm_fast_pio_out()?
>
May be, but callback approach is much cleaner. "string" and "in" can have
stale data for instance.
> >> 3. svm has next_rip but when it is 0, nop is emulated. Can this be modified to
> >> continue without emulating nop when next_rip is 0?
> >>
> > I don't see where nop is emulated if next_rip is 0. As far as I see in
> > case of next_rip==0 an instruction at rip is decoded to figure out its
> > length and then rip is advanced by instruction length. Anyway next_rip
> > is svm thing only.
>
> Sorry. I wasn't understanding the code enough.
>
> static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> {
> ...
> if (!svm->next_rip) {
> if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) !=
> EMULATE_DONE)
> printk(KERN_DEBUG "%s: NOP\n", __func__);
> return;
> }
>
> Since the printk says NOP, I thought emulate_instruction was doing so...
>
> The reason I asked about next_rip is because I was hoping to use this
> entry to advance rip only after userspace executed I/O done by "out",
> like if next_rip is !0,
> call kvm_rip_write(), and introduce next_rip to vmx if it is usable
> because vmx is
> currently using local variable rip.
>
> Yoshi
--
Gleb.
next prev parent reply other threads:[~2010-04-07 15:43 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-06 4:11 Question on skip_emulated_instructions() Yoshiaki Tamura
2010-04-06 10:05 ` Gleb Natapov
2010-04-07 6:25 ` Yoshiaki Tamura
2010-04-07 15:43 ` Gleb Natapov [this message]
2010-04-07 17:21 ` Yoshiaki Tamura
2010-04-07 17:37 ` Avi Kivity
2010-04-08 5:27 ` Yoshiaki Tamura
2010-04-08 5:41 ` Gleb Natapov
2010-04-08 6:18 ` Yoshiaki Tamura
2010-04-08 6:56 ` Gleb Natapov
2010-04-08 7:30 ` Yoshiaki Tamura
2010-04-08 7:37 ` Avi Kivity
2010-04-08 8:30 ` Yoshiaki Tamura
2010-04-08 8:38 ` Avi Kivity
2010-04-08 7:17 ` Avi Kivity
2010-04-08 7:19 ` Gleb Natapov
2010-04-08 8:10 ` Yoshiaki Tamura
2010-04-08 8:40 ` Avi Kivity
2010-04-08 9:14 ` Yoshiaki Tamura
2010-04-08 11:49 ` Avi Kivity
2010-04-08 13:42 ` Yoshiaki Tamura
2010-04-08 13:47 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100407154324.GF303@redhat.com \
--to=gleb@redhat.com \
--cc=avi@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=tamura.yoshiaki@lab.ntt.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox