From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH 11/11] KVM: MMU: improve write flooding detected
Date: Tue, 23 Aug 2011 18:55:39 +0800
Message-ID: <4E53872B.3070407@cn.fujitsu.com>
In-Reply-To: <20110823080024.GA2297@amt.cnet>

Hi Marcelo,

On 08/23/2011 04:00 PM, Marcelo Tosatti wrote:
> On Tue, Aug 16, 2011 at 02:46:47PM +0800, Xiao Guangrong wrote:
>> Detecting write-flooding does not work well, when we handle page written, if
>> the last speculative spte is not accessed, we treat the page is
>> write-flooding, however, we can speculative spte on many path, such as pte
>> prefetch, page synced, that means the last speculative spte may be not point
>> to the written page and the written page can be accessed via other sptes, so
>> depends on the Accessed bit of the last speculative spte is not enough
> 
> Yes, a stale last_speculative_spte is possible, but is this fact a
> noticeable problem in practice?
> 
> Was this detected by code inspection?
> 

I noticed this in practice: some shadow pages are zapped by write-flooding
detection but are accessed again soon afterwards, so the same shadow page is
zapped and allocated again and again (very frequently).

Another reason: in the current code, write-flooding detection is somewhat
complex and its code is scattered across many places. Write-flooding detection
is only needed for shadow paging and nested guests, so I want to simplify it
and keep its code in one place.

>> -	}
>> +	if (spte && !(*spte & shadow_accessed_mask))
>> +		sp->write_flooding_count++;
>> +	else
>> +		sp->write_flooding_count = 0;
> 
> This relies on the sptes being created by speculative means
> or by pressure on the host clearing the accessed bit for the
> shadow page to be zapped. 
> 
> There is no guarantee that either of these is true for a given
> spte.
> 
> And if the sptes do not have accessed bit set, any nonconsecutive 3 pte
> updates will zap the page.
> 

Please note that we clear 'sp->write_flooding_count' when the page is looked up
in the shadow page cache (in kvm_mmu_get_page), which means that if any spte of
sp generates a #PF, the flooding count can be reset.

Also, I do not think this is a problem: if an spte without the Accessed bit is
written frequently, the guest page table is either accessed infrequently or not
accessed at all while the writes happen, and in that case zapping the shadow
page is not bad.
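
To make the counting rule concrete, here is a minimal userspace sketch of it.
It is only a model under stated assumptions, not the patch itself: the struct,
the function names and the threshold of 3 (taken from "3 pte updates" above)
are hypothetical stand-ins, and ACCESSED_MASK stands in for
shadow_accessed_mask using the x86 Accessed bit (bit 5).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions, for illustration only. */
#define ACCESSED_MASK   (1ULL << 5) /* x86 page-table Accessed bit */
#define FLOOD_THRESHOLD 3           /* assumed, from "3 pte updates" above */

struct shadow_page {
	int write_flooding_count;
};

/*
 * Model of the check done when a guest write hits the page shadowed by sp.
 * spte is the shadow entry covering the written guest pte (may be NULL).
 * Returns true when the page looks write-flooded and should be zapped.
 */
static bool detect_write_flooding(struct shadow_page *sp, const uint64_t *spte)
{
	assert(sp != NULL);

	if (spte && !(*spte & ACCESSED_MASK))
		sp->write_flooding_count++;   /* written, but not used for translation */
	else
		sp->write_flooding_count = 0; /* entry was used: not flooding */

	return sp->write_flooding_count >= FLOOD_THRESHOLD;
}

/* Model of the reset done when sp is found again in kvm_mmu_get_page. */
static void shadow_page_lookup(struct shadow_page *sp)
{
	assert(sp != NULL);
	sp->write_flooding_count = 0;
}
```

In this model, two writes to non-accessed sptes followed by a lookup leave the
counter at 0, so it takes three further writes with no access in between before
the page is reported as flooded.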

Compared with the old way, the new way is better at zapping upper-level shadow
pages. For example, in the old way:
if a gfn is used as a PDE for a task and is later freed and reused as a PTE for
a new task, we have two shadow pages in the host, one with sp1.level = 2 and
the other with sp2.level = 1. When we detect write-flooding,
vcpu->last_pte_updated always points to sp2.pte. As sp2 is used by the new
task, we always conclude that both shadow pages are being used, but actually
sp1 is not used by the guest anymore.

> Back to the first question, what is the motivation for this heuristic
> change? Do you have any numbers?
> 

Yes, I have done a quick test:

Before this patch:
2m56.561
2m50.651
2m51.220
2m52.199
2m48.066

After this patch:
2m51.194
2m55.980
2m50.755
2m47.396
2m46.807

It shows the new way is a little better than the old way.
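
For reference, the averages of the two sets of runs can be computed with a
small helper like the one below. This is only a sketch: the run times are the
ones listed above converted to seconds, and the helper name and array names are
made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N_RUNS 5

/* Run times from the lists above, converted from XmY.ZZZ to seconds. */
static const double before_patch[N_RUNS] = {
	176.561, 170.651, 171.220, 172.199, 168.066
};
static const double after_patch[N_RUNS] = {
	171.194, 175.980, 170.755, 167.396, 166.807
};

/* Arithmetic mean of n samples; returns 0 for an empty set. */
static double mean_seconds(const double *samples, size_t n)
{
	double sum = 0.0;

	assert(samples != NULL);
	for (size_t i = 0; i < n; i++)
		sum += samples[i];

	return n ? sum / (double)n : 0.0;
}
```

With these inputs the means are about 171.74 s before and 170.43 s after, a
difference of about 1.3 s (under 1%); note that the individual runs of the two
sets overlap.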
