From: Gleb Natapov <gleb@redhat.com>
To: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Cc: kvm@vger.kernel.org, Avi Kivity <avi@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: Question on skip_emulated_instructions()
Date: Wed, 7 Apr 2010 18:43:25 +0300 [thread overview]
Message-ID: <20100407154324.GF303@redhat.com> (raw)
In-Reply-To: <r2l87e9effc1004062325jba1026e1v2598333ccfa51964@mail.gmail.com>
On Wed, Apr 07, 2010 at 03:25:10PM +0900, Yoshiaki Tamura wrote:
> 2010/4/6 Gleb Natapov <gleb@redhat.com>:
> > On Tue, Apr 06, 2010 at 01:11:23PM +0900, Yoshiaki Tamura wrote:
> >> Hi.
> >>
> >> When handle_io() is called, rip is currently proceeded *before* actually having
> >> I/O handled by qemu in userland. Upon implementing Kemari for
> >> KVM(http://www.mail-archive.com/kvm@vger.kernel.org/msg25141.html) mainly in
> >> userland qemu, we encountered a problem that synchronizing the content of VCPU
> >> before handling I/O in qemu is too late because rip is already proceeded in KVM,
> >> Although we avoided this issue with temporal hack, I would like to ask a few
> >> question on skip_emulated_instructions.
> >>
> >> 1. Does rip need to be proceeded before having I/O handled by qemu?
> > In current kvm.git rip is proceeded before I/O is handled by qemu only
> > in case of "out" instruction. From architecture point of view I think
> > it's OK since on real HW you can't guaranty that I/O will take effect
> > before instruction pointer is advanced. It is done like that because we
> > want "out" emulation to be real fast so we skip x86 emulator.
>
> Thanks for your reply.
>
> If proceeding rip later doesn't break the behavior of devices or
> introduce slow down, I would like that to be done.
>
Device can not care less about what value rip register currently has.
Why is it matters for you code?
> >> 2. If no, is it possible to divide skip_emulated_instructions(), like
> >> rec_emulated_instructions() to remember to next_rip, and
> >> skip_emulated_instructions() to actually proceed the rip.
> > Currently only emulator can call userspace to do I/O, so after
> > userspace returns after I/O exit, control is handled back to emulator
> > unconditionally. "out" instruction skips emulator, but there is nothing
> > to do after userspace returns, so regular cpu loop is executed. If we
> > want to advance rip only after userspace executed I/O done by "out" we
> > need to distinguish who requested I/O (emulator or kvm_fast_pio_out())
> > and call different code depending on who that was. It can be done by
> > having a callback that (if not null) is called on return from userspace.
>
> Your suggestion is to introduce a callback entry, and instead of
> calling kvm_rip_write(), set it to the entry before calling
> kvm_fast_pio_out(),
> and check the entry upon return from the userspace, correct?
>
Something like that, yes.
> According to the comment in x86.c, when it was "out" instruction
> vcpu->arch.pio.count is set to 0 to skip the emulator.
> To call kvm_fast_pio_out(), "!string" and "!in" must be set.
> If we can check, vcpu->arch.pio.count, "string" and "in" on return
> from the userspace, can't we distinguish who requested I/O, emulator
> or kvm_fast_pio_out()?
>
May be, but callback approach is much cleaner. "string" and "in" can have
stale data for instance.
> >> 3. svm has next_rip but when it is 0, nop is emulated. Can this be modified to
> >> continue without emulating nop when next_rip is 0?
> >>
> > I don't see where nop is emulated if next_rip is 0. As far as I see in
> > case of next_rip==0 an instruction at rip is decoded to figure out its
> > length and then rip is advanced by instruction length. Anyway next_rip
> > is svm thing only.
>
> Sorry. I wasn't understanding the code enough.
>
> static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> {
> ...
> if (!svm->next_rip) {
> if (emulate_instruction(vcpu, 0, 0, EMULTYPE_SKIP) !=
> EMULATE_DONE)
> printk(KERN_DEBUG "%s: NOP\n", __func__);
> return;
> }
>
> Since the printk says NOP, I thought emulate_instruction was doing so...
>
> The reason I asked about next_rip is because I was hoping to use this
> entry to advance rip only after userspace executed I/O done by "out",
> like if next_rip is !0,
> call kvm_rip_write(), and introduce next_rip to vmx if it is usable
> because vmx is
> currently using local variable rip.
>
> Yoshi
--
Gleb.
next prev parent reply other threads:[~2010-04-07 15:43 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-06 4:11 Question on skip_emulated_instructions() Yoshiaki Tamura
2010-04-06 10:05 ` Gleb Natapov
2010-04-07 6:25 ` Yoshiaki Tamura
2010-04-07 15:43 ` Gleb Natapov [this message]
2010-04-07 17:21 ` Yoshiaki Tamura
2010-04-07 17:37 ` Avi Kivity
2010-04-08 5:27 ` Yoshiaki Tamura
2010-04-08 5:41 ` Gleb Natapov
2010-04-08 6:18 ` Yoshiaki Tamura
2010-04-08 6:56 ` Gleb Natapov
2010-04-08 7:30 ` Yoshiaki Tamura
2010-04-08 7:37 ` Avi Kivity
2010-04-08 8:30 ` Yoshiaki Tamura
2010-04-08 8:38 ` Avi Kivity
2010-04-08 7:17 ` Avi Kivity
2010-04-08 7:19 ` Gleb Natapov
2010-04-08 8:10 ` Yoshiaki Tamura
2010-04-08 8:40 ` Avi Kivity
2010-04-08 9:14 ` Yoshiaki Tamura
2010-04-08 11:49 ` Avi Kivity
2010-04-08 13:42 ` Yoshiaki Tamura
2010-04-08 13:47 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100407154324.GF303@redhat.com \
--to=gleb@redhat.com \
--cc=avi@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=tamura.yoshiaki@lab.ntt.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.