From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH 10/15] KVM: MMU: lockless walking shadow page table
Date: Tue, 21 Jun 2011 02:54:30 +0800
Message-ID: <4DFF9766.3080505@cn.fujitsu.com>
In-Reply-To: <20110620163704.GD17130@amt.cnet>

On 06/21/2011 12:37 AM, Marcelo Tosatti wrote:

>> +	if (atomic_read(&kvm->arch.reader_counter)) {
>> +		free_mmu_pages_unlock_parts(invalid_list);
>> +		sp = list_first_entry(invalid_list, struct kvm_mmu_page, link);
>> +		list_del_init(invalid_list);
>> +		call_rcu(&sp->rcu, free_invalid_pages_rcu);
>> +		return;
>> +	}
> 
> This is probably wrong, the caller wants the page to be zapped by the 
> time the function returns, not scheduled sometime in the future.
> 

The pages will still be freed soon, and KVM does not reuse these pages anymore,
so deferring the free is not too bad, no?
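
To make the idea concrete, here is a minimal userspace sketch of the
"free now if there is no reader, otherwise defer" scheme. It is not the patch
itself: reader_enter(), reader_exit(), zap_page(), flush_deferred() and the
fixed-size deferred[] array are all hypothetical names for illustration, and
the array stands in for call_rcu(), which in the kernel runs the callback
only after a grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_DEFERRED 8

/* Hypothetical stand-in for kvm->arch.reader_counter. */
static atomic_int reader_counter;

/* Hypothetical deferred-free queue; the patch uses call_rcu() instead. */
static void *deferred[MAX_DEFERRED];
static int nr_deferred;

/* A lockless walker announces itself before touching the page table.
 * The default seq_cst ordering plays the role of the explicit barrier. */
static void reader_enter(void) { atomic_fetch_add(&reader_counter, 1); }
static void reader_exit(void)  { atomic_fetch_sub(&reader_counter, 1); }

/* Returns true if the page was freed immediately, false if deferred. */
static bool zap_page(void *page)
{
	if (atomic_load(&reader_counter) > 0) {
		assert(nr_deferred < MAX_DEFERRED);
		deferred[nr_deferred++] = page;
		return false;
	}
	free(page);
	return true;
}

/* Stand-in for the RCU callback: call only when no reader can still
 * hold a pointer into the deferred pages. Returns how many were freed. */
static int flush_deferred(void)
{
	int n = nr_deferred;

	while (nr_deferred > 0)
		free(deferred[--nr_deferred]);
	return n;
}
```

In this sketch a page zapped while a reader is active only sits on the
deferred list until the flush; nothing hands it out again in between.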

>> +
>>  	do {
>>  		sp = list_first_entry(invalid_list, struct kvm_mmu_page, link);
>>  		WARN_ON(!sp->role.invalid || sp->root_count);
>> @@ -2601,6 +2633,35 @@ static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gva_t vaddr,
>>  	return vcpu->arch.nested_mmu.translate_gpa(vcpu, vaddr, access);
>>  }
>>  
>> +int kvm_mmu_walk_shadow_page_lockless(struct kvm_vcpu *vcpu, u64 addr,
>> +				      u64 sptes[4])
>> +{
>> +	struct kvm_shadow_walk_iterator iterator;
>> +	int nr_sptes = 0;
>> +
>> +	rcu_read_lock();
>> +
>> +	atomic_inc(&vcpu->kvm->arch.reader_counter);
>> +	/* Increase the counter before walking shadow page table */
>> +	smp_mb__after_atomic_inc();
>> +
>> +	for_each_shadow_entry(vcpu, addr, iterator) {
>> +		sptes[iterator.level-1] = *iterator.sptep;
>> +		nr_sptes++;
>> +		if (!is_shadow_present_pte(*iterator.sptep))
>> +			break;
>> +	}
> 
> Why is lockless access needed for the MMIO optimization? Note the spte 
> contents are copied to the array here are used for debugging purposes
> only, their contents are potentially stale.
> 

Um, we can use it to check whether an mmio page fault is a real mmio access or
a bug in KVM. I discussed it with Avi:

===============================================
>
> Yes, it is. I just want to detect KVM bugs; it helps us tell whether an "ept misconfig" is
> real MMIO or a bug. I noticed that some "ept misconfig" bugs have been reported before, so I think
> doing this is necessary, and I think it is not too bad: since walking the spte hierarchy is
> lockless, it is really fast.

Okay.  We can later see if it shows up on profiles.
===============================================

And it is really fast; I will attach the 'perf result' when v2 is posted.

Yes, their contents are potentially stale, but we only use them to check for mmio. After all, if we
get a stale spte, we will go through the page fault path to fix it.
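
Roughly, the check on the copied sptes could look like the sketch below. This
is only an illustration under assumptions: the SPTE_PRESENT and SPTE_MMIO_MARK
bit positions, the enum, and classify_mmio_fault() are hypothetical and do not
reflect KVM's real spte encoding. The array is indexed by level - 1, as in
kvm_mmu_walk_shadow_page_lockless().

```c
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define SPTE_PRESENT   (1ULL << 0)
#define SPTE_MMIO_MARK (1ULL << 62)

enum mmio_verdict {
	MMIO_REAL,	/* last spte carries the mmio marker: emulate */
	MMIO_RETRY,	/* stale or empty spte: let the page fault path fix it */
	MMIO_BUG,	/* walk ended on a normal present spte: unexpected */
};

/*
 * The walk starts at top_level and copies nr_sptes entries downwards,
 * so the last entry read lives at index top_level - nr_sptes.
 */
static enum mmio_verdict classify_mmio_fault(const uint64_t sptes[4],
					     int top_level, int nr_sptes)
{
	uint64_t last;

	if (top_level < 1 || top_level > 4 ||
	    nr_sptes < 1 || nr_sptes > top_level)
		return MMIO_RETRY;	/* nothing usable was read */

	last = sptes[top_level - nr_sptes];

	if (!(last & SPTE_PRESENT) && (last & SPTE_MMIO_MARK))
		return MMIO_REAL;
	if (last & SPTE_PRESENT)
		return MMIO_BUG;
	return MMIO_RETRY;
}
```

A stale copy can at worst give MMIO_RETRY or a spurious verdict for one fault;
the sptes are never written through this path, so the page fault path remains
the place where the mapping is actually fixed.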
 



