From: Gleb Natapov <gleb@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: avi.kivity@gmail.com, mtosatti@redhat.com, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH v7 09/11] KVM: MMU: introduce kvm_mmu_prepare_zap_obsolete_page
Date: Thu, 23 May 2013 18:57:22 +0300	[thread overview]
Message-ID: <20130523155722.GJ26157@redhat.com> (raw)
In-Reply-To: <519E13B6.7060807@linux.vnet.ibm.com>

On Thu, May 23, 2013 at 09:03:50PM +0800, Xiao Guangrong wrote:
> On 05/23/2013 08:39 PM, Gleb Natapov wrote:
> > On Thu, May 23, 2013 at 07:13:58PM +0800, Xiao Guangrong wrote:
> >> On 05/23/2013 04:09 PM, Gleb Natapov wrote:
> >>> On Thu, May 23, 2013 at 03:50:16PM +0800, Xiao Guangrong wrote:
> >>>> On 05/23/2013 03:37 PM, Gleb Natapov wrote:
> >>>>> On Thu, May 23, 2013 at 02:31:47PM +0800, Xiao Guangrong wrote:
> >>>>>> On 05/23/2013 02:18 PM, Gleb Natapov wrote:
> >>>>>>> On Thu, May 23, 2013 at 02:13:06PM +0800, Xiao Guangrong wrote:
> >>>>>>>> On 05/23/2013 01:57 PM, Gleb Natapov wrote:
> >>>>>>>>> On Thu, May 23, 2013 at 03:55:58AM +0800, Xiao Guangrong wrote:
> >>>>>>>>>> It is only used to zap obsolete pages. Since an obsolete page
> >>>>>>>>>> will not be used again, we need not spend time finding its unsync
> >>>>>>>>>> children. Also, we delete the page from the shadow page cache so
> >>>>>>>>>> that the page is completely isolated after calling this function.
> >>>>>>>>>>
> >>>>>>>>>> A later patch will use it to collapse TLB flushes.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> >>>>>>>>>> ---
> >>>>>>>>>>  arch/x86/kvm/mmu.c |   46 +++++++++++++++++++++++++++++++++++++++++-----
> >>>>>>>>>>  1 files changed, 41 insertions(+), 5 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >>>>>>>>>> index 9b57faa..e676356 100644
> >>>>>>>>>> --- a/arch/x86/kvm/mmu.c
> >>>>>>>>>> +++ b/arch/x86/kvm/mmu.c
> >>>>>>>>>> @@ -1466,7 +1466,7 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, int nr)
> >>>>>>>>>>  static void kvm_mmu_free_page(struct kvm_mmu_page *sp)
> >>>>>>>>>>  {
> >>>>>>>>>>  	ASSERT(is_empty_shadow_page(sp->spt));
> >>>>>>>>>> -	hlist_del(&sp->hash_link);
> >>>>>>>>>> +	hlist_del_init(&sp->hash_link);
> >>>>>>>>> Why do you need hlist_del_init() here? Why not move it into
> >>>>>>>>
> >>>>>>>> Because otherwise the hlist entry would be deleted twice. We will use
> >>>>>>>> it like this:
> >>>>>>>>
> >>>>>>>> kvm_mmu_prepare_zap_obsolete_page(page, list);
> >>>>>>>> kvm_mmu_commit_zap_page(list);
> >>>>>>>>    kvm_mmu_free_page(page);
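> >>>>>>>>    /* hash_link is already gone at this point, so kvm_mmu_free_page()
> >>>>>>>>       must tolerate a second removal -- hence hlist_del_init() */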
> >>>>>>>>
> >>>>>>>> The first place is kvm_mmu_prepare_zap_obsolete_page(page), which has
> >>>>>>>> already deleted the page from the hash list.
> >>>>>>>>
> >>>>>>>>> kvm_mmu_prepare_zap_page() as we discussed here:
> >>>>>>>>> https://patchwork.kernel.org/patch/2580351/ instead of doing
> >>>>>>>>> it differently for obsolete and non-obsolete pages?
> >>>>>>>>
> >>>>>>>> It can break the hash-list walk: we would have to rescan the
> >>>>>>>> hash list once a page has been prepare-zapped.
> >>>>>>>>
> >>>>>>>> I mentioned it in the changelog:
> >>>>>>>>
> >>>>>>>>   4): drop the patch which deleted the page from the hash list at
> >>>>>>>>       "prepare" time since it can break walks based on the hash list.
> >>>>>>> Can you elaborate on how this can happen?
> >>>>>>
> >>>>>> Here is an example:
> >>>>>>
> >>>>>> int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
> >>>>>> {
> >>>>>> 	struct kvm_mmu_page *sp;
> >>>>>> 	LIST_HEAD(invalid_list);
> >>>>>> 	int r;
> >>>>>>
> >>>>>> 	pgprintk("%s: looking for gfn %llx\n", __func__, gfn);
> >>>>>> 	r = 0;
> >>>>>> 	spin_lock(&kvm->mmu_lock);
> >>>>>> 	for_each_gfn_indirect_valid_sp(kvm, sp, gfn) {
> >>>>>> 		pgprintk("%s: gfn %llx role %x\n", __func__, gfn,
> >>>>>> 			 sp->role.word);
> >>>>>> 		r = 1;
> >>>>>> 		kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
> >>>>>> 	}
> >>>>>> 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
> >>>>>> 	spin_unlock(&kvm->mmu_lock);
> >>>>>>
> >>>>>> 	return r;
> >>>>>> }
> >>>>>>
> >>>>>> It works fine since kvm_mmu_prepare_zap_page() does not touch the hash list.
> >>>>>> If we deleted the hlist entry in kvm_mmu_prepare_zap_page(), this kind of code
> >>>>>> would have to be changed to:
> >>>>>>
> >>>>>> restart:
> >>>>>> 	for_each_gfn_indirect_valid_sp(kvm, sp, gfn) {
> >>>>>> 		pgprintk("%s: gfn %llx role %x\n", __func__, gfn,
> >>>>>> 			 sp->role.word);
> >>>>>> 		r = 1;
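> >>>>>> 		/*
> >>>>>> 		 * Removing sp (or any other page on this hash chain) from
> >>>>>> 		 * the hash invalidates the non-_safe hlist iterator, so the
> >>>>>> 		 * walk has to be restarted from the head.
> >>>>>> 		 */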
> >>>>>> 		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> >>>>>> 			goto restart;
> >>>>>> 	}
> >>>>>> 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
> >>>>>>
> >>>>> Hmm, yes. So let's leave it as is and always commit invalid_list before
> >>>>
> >>>> So, you mean drop this patch and the patch
> >>>> "KVM: MMU: collapse TLB flushes when zap all pages"?
> >>>>
> >>> We still want to add kvm_reload_remote_mmus() to
> >>> kvm_mmu_invalidate_zap_all_pages(). But yes, we disable a nice
> >>> optimization here. So maybe skipping obsolete pages while walking the
> >>> hashtable is a better solution.
> >>
> >> I am willing to do it this way instead, but it looks worse than this
> >> patch:
> >>
> >> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >> index 9b57faa..810410c 100644
> >> --- a/arch/x86/kvm/mmu.c
> >> +++ b/arch/x86/kvm/mmu.c
> >> @@ -1466,7 +1466,7 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, int nr)
> >>  static void kvm_mmu_free_page(struct kvm_mmu_page *sp)
> >>  {
> >>  	ASSERT(is_empty_shadow_page(sp->spt));
> >> -	hlist_del(&sp->hash_link);
> >> +	hlist_del_init(&sp->hash_link);
> > Why not drop this
> > 
> >>  	list_del(&sp->link);
> >>  	free_page((unsigned long)sp->spt);
> >>  	if (!sp->role.direct)
> >> @@ -1648,14 +1648,20 @@ static int kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> >>  static void kvm_mmu_commit_zap_page(struct kvm *kvm,
> >>  				    struct list_head *invalid_list);
> >>
> >> +static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> >> +{
> >> +	return unlikely(sp->mmu_valid_gen != kvm->arch.mmu_valid_gen);
> >> +}
> >> +
> >>  #define for_each_gfn_sp(_kvm, _sp, _gfn)				\
> >>  	hlist_for_each_entry(_sp,					\
> >>  	  &(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)], hash_link) \
> >> -		if ((_sp)->gfn != (_gfn)) {} else
> >> +		if ((_sp)->gfn != (_gfn) || is_obsolete_sp(_kvm, _sp)) {} else
> >>
> >>  #define for_each_gfn_indirect_valid_sp(_kvm, _sp, _gfn)			\
> >>  	for_each_gfn_sp(_kvm, _sp, _gfn)				\
> >> -		if ((_sp)->role.direct || (_sp)->role.invalid) {} else
> >> +		if ((_sp)->role.direct ||				\
> >> +		      (_sp)->role.invalid || is_obsolete_sp(_kvm, _sp)) {} else
> >>
> >>  /* @sp->gfn should be write-protected at the call site */
> >>  static int __kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
> >> @@ -1838,11 +1844,6 @@ static void clear_sp_write_flooding_count(u64 *spte)
> >>  	__clear_sp_write_flooding_count(sp);
> >>  }
> >>
> >> -static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> >> -{
> >> -	return unlikely(sp->mmu_valid_gen != kvm->arch.mmu_valid_gen);
> >> -}
> >> -
> >>  static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> >>  					     gfn_t gfn,
> >>  					     gva_t gaddr,
> >> @@ -2085,11 +2086,15 @@ static int kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> >>
> >>  	if (sp->unsync)
> >>  		kvm_unlink_unsync_page(kvm, sp);
> >> +
> >>  	if (!sp->root_count) {
> >>  		/* Count self */
> >>  		ret++;
> >>  		list_move(&sp->link, invalid_list);
> >>  		kvm_mod_used_mmu_pages(kvm, -1);
> >> +
> >> +		if (unlikely(is_obsolete_sp(kvm, sp)))
> >> +			hlist_del_init(&sp->hash_link);
> > and this.
> > 
> > Since we check for obsolete pages while searching the hashtable, why delete
> > them here?
> 
> In order to zap obsolete pages without a TLB flush, we should delete them from
> the hash list at "prepare" time. Here, we only delete the obsolete pages so
> that the hashtable-walking functions, like kvm_mmu_unprotect_page(), can work
> properly by skipping obsolete pages.
> 
Why do we have to delete them from the hash at "prepare" time? If the hash walk
ignores them, they are as good as deleted, no?
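
(The hlist_del() -> hlist_del_init() change in kvm_mmu_free_page() also seems
to be needed only because of that prepare-time removal. Roughly, from
include/linux/list.h: plain hlist_del() poisons the node, so removing the same
page a second time at free time would dereference a poisoned or NULL pprev,
while hlist_del_init() is a no-op on an already unhashed node:

static inline void hlist_del(struct hlist_node *n)
{
	__hlist_del(n);
	n->next = LIST_POISON1;
	n->pprev = LIST_POISON2;
}

static inline void hlist_del_init(struct hlist_node *n)
{
	if (!hlist_unhashed(n)) {
		__hlist_del(n);
		INIT_HLIST_NODE(n);
	}
}

So if we do not unhash at "prepare" time, neither hunk is needed.)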

> And kvm_mmu_prepare_zap_page() is a recursive function:
> kvm_mmu_prepare_zap_page() -> zap_unsync_children -> kvm_mmu_prepare_zap_page().
> It seems to be the only place where this can be done. For example, the code
> below is not allowed in kvm_zap_obsolete_pages():
> 
> if (kvm_mmu_prepare_zap_page(sp, list))
> 	hlist_del(sp->hlist);
> 
> Or did I miss your suggestion?
My assumption is that we can leave obsolete shadow pages on the hashtable
until commit_zap time.

BTW, is it such a good idea to call kvm_mmu_commit_zap_page() once on all
obsolete pages? We basically loop over all of them under the lock
without a lock break.
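
If it does become a problem, a lock break in the zap loop could look roughly
like this (just a sketch, not part of this series; "restart" stands for
whatever label the active_mmu_pages walk uses to rescan after dropping the
lock):

	if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
		/* flush what has been batched so far, then let waiters in */
		kvm_mmu_commit_zap_page(kvm, &invalid_list);
		cond_resched_lock(&kvm->mmu_lock);
		goto restart;
	}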

--
			Gleb.
