From: Gleb Natapov <gleb@minantech.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org, ak@linux.intel.com, pbonzini@redhat.com,
xiaoguangrong@linux.vnet.ibm.com, avi@cloudius-systems.com
Subject: Re: [patch 4/5] KVM: MMU: reload request from GET_DIRTY_LOG path
Date: Fri, 20 Jun 2014 13:46:10 +0300 [thread overview]
Message-ID: <20140620104610.GD20764@minantech.com> (raw)
In-Reply-To: <20140619184031.GC32410@amt.cnet>
On Thu, Jun 19, 2014 at 03:40:31PM -0300, Marcelo Tosatti wrote:
> On Thu, Jun 19, 2014 at 11:17:19AM +0300, Gleb Natapov wrote:
> > On Wed, Jun 18, 2014 at 08:12:07PM -0300, mtosatti@redhat.com wrote:
> > > Reload remote vcpus MMU from GET_DIRTY_LOG codepath, before
> > > deleting a pinned spte.
> > >
> > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > >
> > > ---
> > > arch/x86/kvm/mmu.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > Index: kvm.pinned-sptes/arch/x86/kvm/mmu.c
> > > ===================================================================
> > > --- kvm.pinned-sptes.orig/arch/x86/kvm/mmu.c 2014-06-13 16:50:50.040140594 -0300
> > > +++ kvm.pinned-sptes/arch/x86/kvm/mmu.c 2014-06-13 16:51:05.620104451 -0300
> > > @@ -1247,6 +1247,9 @@
> > > spte &= ~SPTE_MMU_WRITEABLE;
> > > spte = spte & ~PT_WRITABLE_MASK;
> > >
> > > + if (is_pinned_spte(spte))
> > > + mmu_reload_pinned_vcpus(kvm);
> > > +
> > Why write protect it at all? mmu_reload_pinned_vcpus() will unprotected it anyway
> > on the next vmentry. Isn't it better to just report all pinned pages as dirty alway.
>
> That was the initial plan, however its awkward to stop vcpus, execute
> get_dirty_log twice, and have pages marked as dirty on the second
> execution.
Indeed, but I think it may happen today with vhost (or even other devices
that emulate dma), so userspace should be able to deal with it already.
>
> That is, it is in "incorrect" to report pages as dirty when they are
> clean.
In case of PEBS area we cannot know. CPU writes there directly without
KVM even knowing about it so the only sane thing is to assume that after
a vmentry PEBS area is dirty. This is what happens with this patch BTW,
mmu_reload_pinned_vcpus() will force ->page_fault(FERR_WRITE) which will
mark all pinned pages as dirty even if pages are never written to. You
can achieve the same by having vcpu->pinned_page_dirty which is set to
true on each vmentry and to false on each GET_DIRTY_LOG.
>
> Moreover, if the number of pinned pages is larger than the dirty
> threshold to stop VM and migrate, you'll never migrate. If vcpus are
> in HLT and don't VM-enter immediately, the pages should not be refaulted
> right away.
We should not allow that many pinned pages for security reasons. And
having a lot of page to fault in on a next vmentry after each
GET_DIRTY_LOG will slow down a guest during migration considerably.
>
> Do you think the optimization is worthwhile ?
>
I think it's simpler, but even if we will go with your current approach
it should be improved: there is no point sending IPI to all vcpus in
spte_write_protect() like you do here since the IPI will be send anyway at
the end of write protect because of ptes writable->nonwritable transition,
so all you have to do here is to set KVM_REQ_MMU_RELOAD, not need for IPI.
In fact this makes me thinking that write protecting pinned page here is
incorrect because old translation may not be in TLB and if CPU will try to
write PEBS entry after pte is write protected but before MMU is reloaded
it will encounter non writable pte and god knows what will happen, SDM
does not tell. So spte_write_protect() should be something like that IMO:
static bool spte_write_protect(struct kvm *kvm, u64 *sptep, bool pt_protect)
{
u64 spte = *sptep;
if (!is_writable_pte(spte) &&
!(pt_protect && spte_is_locklessly_modifiable(spte)))
return false;
if (!pt_protect && !is_pinned_spte(spte)) {
for_each_vcpu()
kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
return true;
}
rmap_printk("rmap_write_protect: spte %p %llx\n", sptep, *sptep);
if (pt_protect)
spte &= ~SPTE_MMU_WRITEABLE;
spte = spte & ~PT_WRITABLE_MASK;
return mmu_spte_update(sptep, spte);
}
--
Gleb.
next prev parent reply other threads:[~2014-06-20 10:46 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-18 23:12 [patch 0/5] KVM: support for pinning sptes mtosatti
2014-06-18 23:12 ` [patch 1/5] KVM: x86: add pinned parameter to page_fault methods mtosatti
2014-06-18 23:12 ` [patch 2/5] KVM: MMU: allow pinning spte translations (TDP-only) mtosatti
2014-06-19 7:21 ` Gleb Natapov
2014-06-19 19:22 ` Marcelo Tosatti
2014-06-20 10:09 ` Gleb Natapov
2014-06-30 20:46 ` Marcelo Tosatti
2014-06-30 22:00 ` Andi Kleen
2014-06-19 8:01 ` Avi Kivity
2014-06-19 14:06 ` Andi Kleen
2014-06-19 18:26 ` Marcelo Tosatti
2014-06-22 13:35 ` Avi Kivity
2014-07-09 13:25 ` Marcelo Tosatti
2014-07-02 0:58 ` Nadav Amit
2014-06-18 23:12 ` [patch 3/5] KVM: MMU: notifiers support for pinned sptes mtosatti
2014-06-19 6:48 ` Gleb Natapov
2014-06-19 18:28 ` Marcelo Tosatti
2014-06-20 10:11 ` Gleb Natapov
2014-06-18 23:12 ` [patch 4/5] KVM: MMU: reload request from GET_DIRTY_LOG path mtosatti
2014-06-19 8:17 ` Gleb Natapov
2014-06-19 18:40 ` Marcelo Tosatti
2014-06-20 10:46 ` Gleb Natapov [this message]
2014-06-30 20:59 ` Marcelo Tosatti
2014-07-01 6:27 ` Gleb Natapov
2014-07-01 17:50 ` Marcelo Tosatti
2014-06-18 23:12 ` [patch 5/5] KVM: MMU: pinned sps are not candidates for deletion mtosatti
2014-06-19 1:44 ` [patch 0/5] KVM: support for pinning sptes Andi Kleen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140620104610.GD20764@minantech.com \
--to=gleb@minantech.com \
--cc=ak@linux.intel.com \
--cc=avi@cloudius-systems.com \
--cc=kvm@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=pbonzini@redhat.com \
--cc=xiaoguangrong@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.