From: Gleb Natapov <gleb@minantech.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org, ak@linux.intel.com, pbonzini@redhat.com,
xiaoguangrong@linux.vnet.ibm.com, avi@cloudius-systems.com
Subject: Re: [patch 4/5] KVM: MMU: reload request from GET_DIRTY_LOG path
Date: Tue, 1 Jul 2014 09:27:19 +0300 [thread overview]
Message-ID: <20140701062719.GK18167@minantech.com> (raw)
In-Reply-To: <20140630205902.GB26566@amt.cnet>
On Mon, Jun 30, 2014 at 05:59:02PM -0300, Marcelo Tosatti wrote:
> On Fri, Jun 20, 2014 at 01:46:10PM +0300, Gleb Natapov wrote:
> > On Thu, Jun 19, 2014 at 03:40:31PM -0300, Marcelo Tosatti wrote:
> > > On Thu, Jun 19, 2014 at 11:17:19AM +0300, Gleb Natapov wrote:
> > > > On Wed, Jun 18, 2014 at 08:12:07PM -0300, mtosatti@redhat.com wrote:
> > > > > Reload remote vcpus MMU from GET_DIRTY_LOG codepath, before
> > > > > deleting a pinned spte.
> > > > >
> > > > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > > > >
> > > > > ---
> > > > > arch/x86/kvm/mmu.c | 3 +++
> > > > > 1 file changed, 3 insertions(+)
> > > > >
> > > > > Index: kvm.pinned-sptes/arch/x86/kvm/mmu.c
> > > > > ===================================================================
> > > > > --- kvm.pinned-sptes.orig/arch/x86/kvm/mmu.c 2014-06-13 16:50:50.040140594 -0300
> > > > > +++ kvm.pinned-sptes/arch/x86/kvm/mmu.c 2014-06-13 16:51:05.620104451 -0300
> > > > > @@ -1247,6 +1247,9 @@
> > > > > spte &= ~SPTE_MMU_WRITEABLE;
> > > > > spte = spte & ~PT_WRITABLE_MASK;
> > > > >
> > > > > + if (is_pinned_spte(spte))
> > > > > + mmu_reload_pinned_vcpus(kvm);
> > > > > +
> > > > Why write protect it at all? mmu_reload_pinned_vcpus() will unprotected it anyway
> > > > on the next vmentry. Isn't it better to just report all pinned pages as dirty alway.
> > >
> > > That was the initial plan, however its awkward to stop vcpus, execute
> > > get_dirty_log twice, and have pages marked as dirty on the second
> > > execution.
> > Indeed, but I think it may happen today with vhost (or even other devices
> > that emulate dma), so userspace should be able to deal with it already.
> >
> > >
> > > That is, it is in "incorrect" to report pages as dirty when they are
> > > clean.
> > In case of PEBS area we cannot know. CPU writes there directly without
> > KVM even knowing about it so the only sane thing is to assume that after
> > a vmentry PEBS area is dirty. This is what happens with this patch BTW,
> > mmu_reload_pinned_vcpus() will force ->page_fault(FERR_WRITE) which will
> > mark all pinned pages as dirty even if pages are never written to. You
> > can achieve the same by having vcpu->pinned_page_dirty which is set to
> > true on each vmentry and to false on each GET_DIRTY_LOG.
> >
> > >
> > > Moreover, if the number of pinned pages is larger than the dirty
> > > threshold to stop VM and migrate, you'll never migrate. If vcpus are
> > > in HLT and don't VM-enter immediately, the pages should not be refaulted
> > > right away.
> > We should not allow that many pinned pages for security reasons. And
> > having a lot of page to fault in on a next vmentry after each
> > GET_DIRTY_LOG will slow down a guest during migration considerably.
> >
> > >
> > > Do you think the optimization is worthwhile ?
> > >
> > I think it's simpler, but even if we will go with your current approach
> > it should be improved: there is no point sending IPI to all vcpus in
> > spte_write_protect() like you do here since the IPI will be send anyway at
> > the end of write protect because of ptes writable->nonwritable transition,
> > so all you have to do here is to set KVM_REQ_MMU_RELOAD, not need for IPI.
>
> No, you have to make sure the vcpu is out of guest mode.
>
> > In fact this makes me thinking that write protecting pinned page here is
> > incorrect because old translation may not be in TLB and if CPU will try to
> > write PEBS entry after pte is write protected but before MMU is reloaded
> > it will encounter non writable pte and god knows what will happen, SDM
> > does not tell. So spte_write_protect() should be something like that IMO:
>
> The vcpu will never see a read-only spte because the VM-exit (due to
> IPI) guarantees vcpu is outside of guest mode _before_ it is write
> protected.
Right. Now I see why you absolutely have to send IPI in mmu_reload_pinned_vcpus()
before marking pte as read only. And kvm->mmu_lock is what will prevent vcpu from
re-entering guest mode again before pte is marked read only, right?
>
> So i ask you: do you still hold the "current approach should be
> improved" position ?
>
As I said IMO what I proposed is much simpler and not as tricky as what you have here.
It also has an advantage of not slowing down next guest entry after GET_DIRTY_LOG because
it does not require mmu reload and page_faulting in pinned pages.
--
Gleb.
next prev parent reply other threads:[~2014-07-01 6:27 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-18 23:12 [patch 0/5] KVM: support for pinning sptes mtosatti
2014-06-18 23:12 ` [patch 1/5] KVM: x86: add pinned parameter to page_fault methods mtosatti
2014-06-18 23:12 ` [patch 2/5] KVM: MMU: allow pinning spte translations (TDP-only) mtosatti
2014-06-19 7:21 ` Gleb Natapov
2014-06-19 19:22 ` Marcelo Tosatti
2014-06-20 10:09 ` Gleb Natapov
2014-06-30 20:46 ` Marcelo Tosatti
2014-06-30 22:00 ` Andi Kleen
2014-06-19 8:01 ` Avi Kivity
2014-06-19 14:06 ` Andi Kleen
2014-06-19 18:26 ` Marcelo Tosatti
2014-06-22 13:35 ` Avi Kivity
2014-07-09 13:25 ` Marcelo Tosatti
2014-07-02 0:58 ` Nadav Amit
2014-06-18 23:12 ` [patch 3/5] KVM: MMU: notifiers support for pinned sptes mtosatti
2014-06-19 6:48 ` Gleb Natapov
2014-06-19 18:28 ` Marcelo Tosatti
2014-06-20 10:11 ` Gleb Natapov
2014-06-18 23:12 ` [patch 4/5] KVM: MMU: reload request from GET_DIRTY_LOG path mtosatti
2014-06-19 8:17 ` Gleb Natapov
2014-06-19 18:40 ` Marcelo Tosatti
2014-06-20 10:46 ` Gleb Natapov
2014-06-30 20:59 ` Marcelo Tosatti
2014-07-01 6:27 ` Gleb Natapov [this message]
2014-07-01 17:50 ` Marcelo Tosatti
2014-06-18 23:12 ` [patch 5/5] KVM: MMU: pinned sps are not candidates for deletion mtosatti
2014-06-19 1:44 ` [patch 0/5] KVM: support for pinning sptes Andi Kleen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140701062719.GK18167@minantech.com \
--to=gleb@minantech.com \
--cc=ak@linux.intel.com \
--cc=avi@cloudius-systems.com \
--cc=kvm@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=pbonzini@redhat.com \
--cc=xiaoguangrong@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.