public inbox for kvm@vger.kernel.org
From: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>,
	gleb@redhat.com, kvm@vger.kernel.org
Subject: Re: [PATCH 0/8] KVM: Reduce mmu_lock hold time when zapping mmu pages
Date: Tue, 05 Feb 2013 13:30:03 +0800	[thread overview]
Message-ID: <511098DB.7070400@linux.vnet.ibm.com> (raw)
In-Reply-To: <20130204134235.GB9005@amt.cnet>

On 02/04/2013 09:42 PM, Marcelo Tosatti wrote:
> On Wed, Jan 23, 2013 at 06:44:52PM +0800, Xiao Guangrong wrote:
>> On 01/23/2013 06:12 PM, Takuya Yoshikawa wrote:
>>> This patch set mitigates another mmu_lock hold time issue.  Although
>>> this is not enough and I'm thinking of additional work already, this
>>> alone can reduce the lock hold time to some extent.
>>>
>>
>> It is not worth doing something this complex; usually there are only a few
>> pages on the invalid list.
> 
> I think its a good idea - memory freeing can be done outside mmu_lock
> protection (as long as its bounded). It reduces mmu_lock contention
> overall.

It does not help much since we still need to walk and delete all shadow
pages, rmaps and parent-pte-lists - that still costs a lot of time and is
not good for scalability.

> 
>> The *really* heavy case is kvm_mmu_zap_all(), which can be sped up
>> by using a generation number; this is a todo item in the kvm wiki:
>>
>> http://www.linux-kvm.org/page/TODO: O(1) mmu invalidation using a generation number
>>
>> I am doing this work for some weeks and will post the patch out during these days.
> 
> Can you describe the generation number scheme in more detail, please?

Yes, but currently I use a simpler approach instead of the generation number.

The optimization is: we switch the hashtable and rmaps to new ones, so that
later page faults install shadow pages and rmaps on the new structures, and the
old ones can be freed directly outside of mmu_lock.

More detail:

zap_all_shadow_pages:

hold mmu_lock;
LIST_HEAD(active_list);
LIST_HEAD(pte_list_desc);

/*
 * Prepare the root shadow pages since they cannot be
 * freed directly.
 */
for_each_root_sp(sp, mmu->root_sp_list) {
	prepare_zap(sp);
	/* Delete it from mmu->active_list */
	list_del_init(&sp->link);
}

/* Zap the hashtable and rmap. */
memset(mmu->hashtable, 0, sizeof(mmu->hashtable));
memset(memslot->rmap, 0, memslot->npages * sizeof(*memslot->rmap));

list_replace_init(&mmu->active_sp_list, &active_list);

/* All the pte_list_descs for rmaps and parent lists */
list_replace_init(&mmu->pte_list_desc_list, &pte_list_desc);

/* Reload mmu, let the old shadow pages be zapped. */
kvm_reload_remote_mmus(kvm);

release_mmu_lock;

for_each_sp_on_active_list(sp, active_list)
	kvm_mmu_free_page(sp);

for_each_pte_desc(desc, pte_list_desc)
	mmu_free_pte_list_desc(desc);

The patches are being tested on my box; they work well and speed up
zap_all_shadow_pages by more than 75%.

============
Note: later we can use the generation number to continue to optimize it:
zap_all_shadow_pages:
   generation_number++;
   kvm_reload_remote_mmus(kvm);

And, on unload_mmu path:

hold mmu_lock
   if (kvm->generation_number != generation_number) {
	switch the hashtable and rmaps to the new ones;
	kvm->generation_number = generation_number;
   }
   release mmu_lock

   free the old one

We need to adjust the page-fault and sync-children code so that they do not
install sps into the old shadow page cache.
=============



  reply	other threads:[~2013-02-05  5:30 UTC|newest]

Thread overview: 21+ messages
2013-01-23 10:12 [PATCH 0/8] KVM: Reduce mmu_lock hold time when zapping mmu pages Takuya Yoshikawa
2013-01-23 10:13 ` [PATCH 1/8] KVM: MMU: Fix and clean up for_each_gfn_* macros Takuya Yoshikawa
2013-01-28 12:24   ` Gleb Natapov
2013-01-28 12:29     ` Takuya Yoshikawa
2013-01-23 10:13 ` [PATCH 2/8] KVM: MMU: Use list_for_each_entry_safe in kvm_mmu_commit_zap_page() Takuya Yoshikawa
2013-01-23 10:14 ` [PATCH 3/8] KVM: MMU: Add a parameter to kvm_mmu_prepare_zap_page() to update the next position Takuya Yoshikawa
2013-01-23 10:15 ` [PATCH 4/8] KVM: MMU: Introduce for_each_gfn_indirect_valid_sp_safe macro Takuya Yoshikawa
2013-01-23 10:16 ` [PATCH 5/8] KVM: MMU: Delete hash_link node in kvm_mmu_prepare_zap_page() Takuya Yoshikawa
2013-01-23 10:16 ` [PATCH 6/8] KVM: MMU: Introduce free_zapped_mmu_pages() for freeing mmu pages in a list Takuya Yoshikawa
2013-01-23 10:17 ` [PATCH 7/8] KVM: MMU: Split out free_zapped_mmu_pages() from kvm_mmu_commit_zap_page() Takuya Yoshikawa
2013-01-23 10:18 ` [PATCH 8/8] KVM: MMU: Move free_zapped_mmu_pages() out of the protection of mmu_lock Takuya Yoshikawa
2013-02-04 13:50   ` Marcelo Tosatti
2013-02-05  2:21     ` Takuya Yoshikawa
2013-01-23 10:44 ` [PATCH 0/8] KVM: Reduce mmu_lock hold time when zapping mmu pages Xiao Guangrong
2013-01-23 13:28   ` Takuya Yoshikawa
2013-01-23 13:45     ` Xiao Guangrong
2013-01-23 14:49       ` Takuya Yoshikawa
2013-01-23 15:45         ` Xiao Guangrong
2013-02-04 13:42   ` Marcelo Tosatti
2013-02-05  5:30     ` Xiao Guangrong [this message]
2013-02-04 13:29 ` Marcelo Tosatti
