From: Marcelo Tosatti <mtosatti@redhat.com>
To: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Avi Kivity <avi@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH 11/11] KVM: MMU: improve write flooding detected
Date: Tue, 23 Aug 2011 09:38:18 -0300
Message-ID: <20110823123818.GB4261@amt.cnet>
In-Reply-To: <4E53872B.3070407@cn.fujitsu.com>

On Tue, Aug 23, 2011 at 06:55:39PM +0800, Xiao Guangrong wrote:
> Hi Marcelo,
> 
> On 08/23/2011 04:00 PM, Marcelo Tosatti wrote:
> > On Tue, Aug 16, 2011 at 02:46:47PM +0800, Xiao Guangrong wrote:
> >> Detecting write-flooding does not work well: when we handle a page write, if
> >> the last speculative spte is not accessed, we treat the page as
> >> write-flooded. However, we can create speculative sptes on many paths, such
> >> as pte prefetch and page sync, which means the last speculative spte may not
> >> point to the written page, and the written page can be accessed via other
> >> sptes. So depending on the Accessed bit of the last speculative spte is not
> >> enough.
> > 
> > Yes, a stale last_speculative_spte is possible, but is this fact a
> > noticeable problem in practice?
> > 
> > Was this detected by code inspection?
> > 
> 
> I detected this because I noticed that some shadow pages are zapped by
> write-flooding detection but are accessed again soon afterwards, which causes
> the shadow page to be zapped and allocated again and again (very frequently).
> 
> Another reason is that, in the current code, write-flooding detection is a
> little complex and its code is spread over many places. Actually,
> write-flooding detection is only needed for shadow paging/nested guests, so
> I want to simplify it and wrap its code up.
> 
> >> -	}
> >> +	if (spte && !(*spte & shadow_accessed_mask))
> >> +		sp->write_flooding_count++;
> >> +	else
> >> +		sp->write_flooding_count = 0;
> > 
> > This relies on the sptes being created by speculative means
> > or by pressure on the host clearing the accessed bit for the
> > shadow page to be zapped. 
> > 
> > There is no guarantee that either of these is true for a given
> > spte.
> > 
> > And if the sptes do not have accessed bit set, any nonconsecutive 3 pte
> > updates will zap the page.
> > 
> 
> Please note that we clear 'sp->write_flooding_count' when the page is looked
> up in the shadow page cache (in kvm_mmu_get_page). This means that if any spte
> of sp generates a #PF, the flooding count can be reset.

OK.

> And I think this is not a problem: if an spte without the accessed bit is
> written frequently, it means the guest page table is accessed infrequently, or
> is not accessed at all during the writes. In that case, zapping this shadow
> page is not bad.

Think of the following scenario:

1) page fault, spte with accessed bit is created from gpte at gfnA+indexA.
2) write to gfnA+indexA, spte has accessed bit set, write_flooding_count
is not increased.
3) repeat

So you cannot rely on the accessed bit being cleared to zap the shadow
page, because it might not be cleared in certain scenarios.
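
To make the scenario above concrete, here is a minimal user-space sketch of
the patched check. It is not the kernel code: the struct, the function names,
the use of bit 5 as the accessed mask and the threshold of 3 (taken from the
"3 pte updates" remark earlier in this mail) are all assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real KVM definitions. */
#define ACCESSED_MASK   (1ull << 5) /* assumed: x86 PTE Accessed bit */
#define FLOOD_THRESHOLD 3           /* assumed: "3 pte updates" above */

struct shadow_page {
	int write_flooding_count;
};

/* Models the check from the patch; returns true if sp would be zapped. */
bool pte_write(struct shadow_page *sp, const uint64_t *spte)
{
	assert(sp != NULL);
	if (spte && !(*spte & ACCESSED_MASK))
		sp->write_flooding_count++;
	else
		sp->write_flooding_count = 0;
	return sp->write_flooding_count >= FLOOD_THRESHOLD;
}

/* Steps 1-3: every write is preceded by a fault that sets Accessed. */
bool flood_with_refault(int writes)
{
	struct shadow_page sp = { 0 };
	bool zapped = false;

	for (int i = 0; i < writes; i++) {
		uint64_t spte = ACCESSED_MASK;     /* step 1: fault creates spte */
		if (pte_write(&sp, &spte))         /* step 2: write to the gpte */
			zapped = true;
	}
	return zapped;
}

/* Contrast case: the spte never has its Accessed bit set. */
bool flood_without_access(int writes)
{
	struct shadow_page sp = { 0 };
	uint64_t spte = 0;
	bool zapped = false;

	for (int i = 0; i < writes; i++)
		if (pte_write(&sp, &spte))
			zapped = true;
	return zapped;
}
```

In this model flood_with_refault() never reports a zap no matter how many
writes happen, while flood_without_access() reports one on the third write.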

> Compared with the old way, its advantage is that it is good at zapping
> upper-level shadow pages. For example, in the old way:
> if a gfn is used as a PDE for a task, and later the gfn is freed and used as a
> PTE for a new task, we have two shadow pages in the host, one with
> sp1.level = 2 and the other with sp2.level = 1. So, when we detect
> write-flooding, vcpu->last_pte_updated always points to sp2.pte. As sp2 is
> used by the new task, we always detect that both shadow pages are being used,
> but actually sp1 is not used by the guest anymore.

Makes sense.

> > Back to the first question, what is the motivation for this heuristic
> > change? Do you have any numbers?
> > 
> 
> Yes, I have done a quick test:
> 
> Before this patch:
> 2m56.561
> 2m50.651
> 2m51.220
> 2m52.199
> 2m48.066
> 
> After this patch:
> 2m51.194
> 2m55.980
> 2m50.755
> 2m47.396
> 2m46.807
> 
> It shows the new way is a little better than the old way.

What test is this?
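
For reference, the ten timings quoted above can be summarized with a small
helper. This is only a sketch; the array and function names are made up for
illustration and the values are the quoted m:ss.mmm figures converted to
seconds. By this arithmetic the means are about 171.74 s before and 170.43 s
after, a difference of roughly 1.3 s (under 1%), while each set spans more
than 8 s between its fastest and slowest run.

```c
#include <assert.h>
#include <stddef.h>

/* Timings quoted in this mail, converted from m:ss.mmm to seconds. */
const double before_patch[] = { 176.561, 170.651, 171.220, 172.199, 168.066 };
const double after_patch[]  = { 171.194, 175.980, 170.755, 167.396, 166.807 };

/* Arithmetic mean of n samples. */
double mean_seconds(const double *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Difference between the slowest and fastest sample. */
double range_seconds(const double *v, size_t n)
{
	double lo, hi;

	assert(n > 0);
	lo = hi = v[0];
	for (size_t i = 1; i < n; i++) {
		if (v[i] < lo)
			lo = v[i];
		if (v[i] > hi)
			hi = v[i];
	}
	return hi - lo;
}
```

With only five runs per configuration, a gap this small compared with the
range of each set is hard to tell apart from run-to-run variation.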
