From mboxrd@z Thu Jan 1 00:00:00 1970
From: "David S. Ahern"
Subject: Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)
Date: Thu, 05 Jun 2008 10:20:32 -0600
Message-ID: <48481250.6060005@cisco.com>
References: <48054518.3000104@cisco.com> <48085485.3090205@qumranet.com> <480C188F.3020101@cisco.com> <480C5C39.4040300@qumranet.com> <480E492B.3060500@cisco.com> <480EEDA0.3080209@qumranet.com> <480F546C.2030608@cisco.com> <481215DE.3000302@cisco.com> <20080428181550.GA3965@dmt> <4816617F.3080403@cisco.com> <4817F30C.6050308@cisco.com> <48184228.2020701@qumranet.com> <481876A9.1010806@cisco.com> <48187903.2070409@qumranet.com> <4826E744.1080107@qumranet.com> <4826F668.6030305@qumranet.com> <48290FC2.4070505@cisco.com> <48294272.5020801@qumranet.com> <482B4D29.7010202@cisco.com> <482C1633.5070302@qumranet.com> <482E5F9C.6000207@cisco.com> <482FCEE1.5040306@qumranet.com> <4830F90A.1020809@cisco.com> <4830FE8D.6010006@cisco.com> <48318E64.8090706@qumranet.com> <4832DDEB.4000100@qumranet.com> <4835EEF5.9010600@cisco.com> <483D391F.7050007@qumranet.com> <483EDCEE.6070307@cisco.com> <4841094A.8090507@qumranet.com> <484422EE.5090501@cisco.com> <4847A5B8.6020503@qumranet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Avi Kivity
Return-path:
Received: from sj-iport-2.cisco.com ([171.71.176.71]:12818 "EHLO sj-iport-2.cisco.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752928AbYFEQUh (ORCPT ); Thu, 5 Jun 2008 12:20:37 -0400
In-Reply-To: <4847A5B8.6020503@qumranet.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Avi Kivity wrote:
> David S. Ahern wrote:
>>> Oh! Only 45K pages were direct, so the other 45K were shared, with
>>> perhaps many ptes. We should count ptes, not pages.
>>>
>>> Can you modify page_referenced() to count the number of ptes mapped (1
>>> for direct pages, nr_chains for indirect pages) and print the total
>>> deltas in active_anon_scan?
>>>
>>
>> Here you go. I've shortened the line lengths to get them to squeeze into
>> 80 columns:
>>
>> anon_scan, all HighMem zone, 187,910 active pages at loop start:
>> count[12] 21462 -> 230, direct 20469, chains 3479, dj 58
>> count[11] 1338 -> 1162, direct 227, chains 26144, dj 59
>> count[8] 29397 -> 5410, direct 26115, chains 27617, dj 117
>> count[4] 35804 -> 25556, direct 31508, chains 82929, dj 256
>> count[3] 2738 -> 2207, direct 2680, chains 58, dj 7
>> count[0] 92580 -> 89509, direct 75024, chains 262834, dj 726
>> (the age number is the index in [])
>>
>
> Where do all those ptes come from? That's 180K pages (most of highmem),
> but with 550K ptes.
>
> The memuser workload doesn't use fork(), so there shouldn't be any
> indirect ptes.
>
> We might try to unshadow the fixmap page; that means we don't have to do
> 4 fixmap pte accesses per pte scanned.
>
> The kernel uses two methods for clearing the accessed bit:
>
> For direct pages:
>
>     if (pte_young(*pte) && ptep_test_and_clear_young(pte))
>         referenced++;
>
> (two accesses)
>
> For indirect pages:
>
>     if (ptep_test_and_clear_young(pte))
>         referenced++;
>
> (one access)
>
> These have to be emulated if we don't shadow the fixmap. With your
> numbers above, that translates to 700K emulations vs 2200K, a 3X
> improvement. I'm not sure it will be sufficient given that we're
> reducing a 10-second kscand scan to a 3-second scan.
>

A 3-second scan is much better, and in comparison to where kvm was when I
started this e-mail thread (as high as 30 seconds per scan) it's a
10-fold improvement.

I took a shot at implementing your suggestion, but evidently I am still
not understanding the shadow implementation. Can you suggest a patch to
try this out?

david