public inbox for kvm@vger.kernel.org
From: Avi Kivity <avi@redhat.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Rik van Riel <riel@redhat.com>,
	"Dike, Jeffrey G" <jeffrey.g.dike@intel.com>,
	"Yu, Wilfred" <wilfred.yu@intel.com>,
	"Kleen, Andi" <andi.kleen@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Mel Gorman <mel@csn.ul.ie>, LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	KVM list <kvm@vger.kernel.org>
Subject: Re: [RFC] respect the referenced bit of KVM guest pages?
Date: Wed, 05 Aug 2009 11:17:12 +0300
Message-ID: <4A794008.6030204@redhat.com>
In-Reply-To: <4A793B92.9040204@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1455 bytes --]

On 08/05/2009 10:58 AM, Avi Kivity wrote:
> On 08/05/2009 05:40 AM, Wu Fengguang wrote:
>> Greetings,
>>
>> Jeff Dike found that many KVM pages are being refaulted in 2.6.29:
>>
>> "Lots of pages being discarded due to memory pressure only to be
>> faulted back in soon after. These pages are nearly all stack pages.
>> This is not consistent - sometimes there are relatively few such pages
>> and they are spread out between processes."
>>
>> The refaults can be drastically reduced by the following patch, which
>> respects the referenced bit of all anonymous pages (including the KVM
>> pages).
>>
>> However it risks reintroducing the problem addressed by commit 7e9cd4842
>> (fix reclaim scalability problem by ignoring the referenced bit,
>> mainly the pte young bit). I wonder if there are better solutions?
>
> How do you distinguish between kvm pages and non-kvm anonymous pages?  
> More importantly, why should you?
>
> Jeff, do you see the refaults on Nehalem systems?  If so, that's
> likely due to the lack of an accessed bit on EPT pagetables.  It would
> be interesting to compare with Barcelona (which does have one).
>
> If that's indeed the case, we can have the EPT ageing mechanism give 
> pages a bit more time around by using an available bit in the EPT PTEs 
> to return accessed on the first pass and not-accessed on the second.
>

The attached patch implements this.
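
To make the ageing scheme concrete, here is a minimal userspace sketch
of the idea, not the kernel code itself.  It assumes bit 63 of the spte
is the software-available bit (as in the vmx.c hunk of the patch) and
that the bit is set whenever a mapping is established; the helper names
spte_mark_accessed, spte_test_and_clear_young and mask_to_shift are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 63 is ignored by hardware and free for software use. */
#define SOFT_ACCESSED_MASK (1ULL << 63)

/*
 * Position of the lowest set bit in a 64-bit mask, which is the value
 * find_first_bit() yields for a single 64-bit word.  The mask must be
 * non-zero.
 */
static int mask_to_shift(uint64_t mask)
{
	int shift;

	assert(mask != 0);
	for (shift = 0; shift < 64; shift++)
		if ((mask >> shift) & 1)
			return shift;
	return 64;
}

/* Called when a mapping is (re)established: the page starts out young. */
static void spte_mark_accessed(uint64_t *spte)
{
	*spte |= SOFT_ACCESSED_MASK;
}

/*
 * One ageing pass.  Hardware never sets the software bit, so the first
 * pass after a mapping reports "young" and clears the bit, and the next
 * pass reports "old" unless the page was mapped again in between.
 * Comparing against zero keeps bit 63 from being lost when the result
 * is stored in a type narrower than 64 bits.
 */
static bool spte_test_and_clear_young(uint64_t *spte)
{
	bool young = (*spte & SOFT_ACCESSED_MASK) != 0;

	*spte &= ~SOFT_ACCESSED_MASK;
	return young;
}
```

The net effect is that a freshly mapped page survives one reclaim scan
before it is treated as unreferenced, which is the "bit more time
around" described above.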

-- 
error compiling committee.c: too many arguments to function


[-- Attachment #2: ept-emulate-accessed-bit.patch --]
[-- Type: text/x-patch, Size: 2115 bytes --]

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7b53614..310938a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -195,6 +195,7 @@ static u64 __read_mostly shadow_x_mask;	/* mutual exclusive with nx_mask */
 static u64 __read_mostly shadow_user_mask;
 static u64 __read_mostly shadow_accessed_mask;
 static u64 __read_mostly shadow_dirty_mask;
+static int __read_mostly shadow_accessed_shift;
 
 static inline u64 rsvd_bits(int s, int e)
 {
@@ -219,6 +220,8 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
 {
 	shadow_user_mask = user_mask;
 	shadow_accessed_mask = accessed_mask;
+	shadow_accessed_shift
+		= find_first_bit((void *)&shadow_accessed_mask, 64);
 	shadow_dirty_mask = dirty_mask;
 	shadow_nx_mask = nx_mask;
 	shadow_x_mask = x_mask;
@@ -817,11 +820,11 @@ static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp)
 	while (spte) {
 		int _young;
 		u64 _spte = *spte;
 		BUG_ON(!(_spte & PT_PRESENT_MASK));
-		_young = _spte & PT_ACCESSED_MASK;
+		_young = !!(_spte & shadow_accessed_mask);
 		if (_young) {
 			young = 1;
-			clear_bit(PT_ACCESSED_SHIFT, (unsigned long *)spte);
+			clear_bit(shadow_accessed_shift, (unsigned long *)spte);
 		}
 		spte = rmap_next(kvm, rmapp, spte);
 	}
@@ -2572,7 +2575,7 @@ static void kvm_mmu_access_page(struct kvm_vcpu *vcpu, gfn_t gfn)
 	    && shadow_accessed_mask
 	    && !(*spte & shadow_accessed_mask)
 	    && is_shadow_present_pte(*spte))
-		set_bit(PT_ACCESSED_SHIFT, (unsigned long *)spte);
+		set_bit(shadow_accessed_shift, (unsigned long *)spte);
 }
 
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 0ba706e..bc99367 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4029,7 +4029,7 @@ static int __init vmx_init(void)
 		bypass_guest_pf = 0;
 		kvm_mmu_set_base_ptes(VMX_EPT_READABLE_MASK |
 			VMX_EPT_WRITABLE_MASK);
-		kvm_mmu_set_mask_ptes(0ull, 0ull, 0ull, 0ull,
+		kvm_mmu_set_mask_ptes(0ull, 1ull << 63, 0ull, 0ull,
 				VMX_EPT_EXECUTABLE_MASK);
 		kvm_enable_tdp();
 	} else


Thread overview: 3+ messages
     [not found] <20090805024058.GA8886@localhost>
     [not found] ` <4A793B92.9040204@redhat.com>
2009-08-05  8:17   ` Avi Kivity [this message]
2009-08-05 14:33     ` [RFC] respect the referenced bit of KVM guest pages? Rik van Riel
2009-08-05 15:37       ` Avi Kivity
