From mboxrd@z Thu Jan 1 00:00:00 1970
From: "David S. Ahern" <daahern@cisco.com>
Subject: Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)
Date: Mon, 19 May 2008 10:25:47 -0600
Message-ID: <4831AA0B.30700@cisco.com>
References: <48054518.3000104@cisco.com> <4805BCF1.6040605@qumranet.com> <4807BD53.6020304@cisco.com> <48085485.3090205@qumranet.com> <480C188F.3020101@cisco.com> <480C5C39.4040300@qumranet.com> <480E492B.3060500@cisco.com> <480EEDA0.3080209@qumranet.com> <480F546C.2030608@cisco.com> <481215DE.3000302@cisco.com> <20080428181550.GA3965@dmt> <4816617F.3080403@cisco.com> <4817F30C.6050308@cisco.com> <48184228.2020701@qumranet.com> <481876A9.1010806@cisco.com> <48187903.2070409@qumranet.com> <4826E744.1080107@qumranet.com> <4826F668.6030305@qumranet.com> <48290FC2.4070505@cisco.com> <48294272.5020801@qumranet.com> <482B4D29.7010202@cisco.com> <482C1633.5070302@qumranet.com> <482E5F9C.6000207@cisco.com> <482FCEE1.5040306@qumranet.com> <4830F90A.1020809@cisco.com> <4830FE8D.6010006@cisco.com> <48318E64.8090706@qumranet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Avi Kivity
Return-path: <kvm-owner@vger.kernel.org>
Received: from sj-iport-3.cisco.com ([171.71.176.72]:2777 "EHLO sj-iport-3.cisco.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753487AbYESQ0J (ORCPT ); Mon, 19 May 2008 12:26:09 -0400
In-Reply-To: <48318E64.8090706@qumranet.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Does the fact that the hugemem kernel works just fine have any bearing
on your options? Or rather, is there something unique about the way
kscand works in the hugemem kernel that makes its performance ok? I
mentioned last month (i.e., without your first patch) that running the
hugemem kernel showed a remarkable improvement in performance compared
to the standard SMP kernel.
Over the weekend I ran a test with your first patch and with the flood
detector at 3 (I have not run a case with the detector at 5), and
performance with the hugemem kernel was even better, in the sense that
1-minute averages of guest system time show no noticeable spikes.

In an earlier post I showed a diff of the config files for the standard
SMP and hugemem kernels. See:
http://article.gmane.org/gmane.comp.emulators.kvm.devel/16944/

david

Avi Kivity wrote:
> David S. Ahern wrote:
>>> [dsa] No. I saw the same problem with the flood count at 5. The
>>> attachment in the last email shows kvm_stat data during a kscand
>>> event. The data was collected with the patch you posted. With the
>>> flood count at 3, the mmu cache/flood counters are in the
>>> 18,000/sec range, pte updates at ~50,000/sec, and writes at
>>> 70,000/sec. With the flood count at 5, mmu_cache/flood drops to 0
>>> and pte updates and writes both hit 180,000+/sec. In both cases
>>> these rates last for 30 seconds or more. I only included data for
>>> the onset as it's pretty flat during the kscand activity.
>>>
>
> It makes sense. We removed a flooding false positive, and introduced a
> false negative.
>
> The guest access sequence is:
> - point the kmap pte at a page table
> - use the new pte to access the page table
>
> Prior to the patch, the mmu didn't see the 'use' part, so it concluded
> the kmap pte would be better off unshadowed. This shows up as a high
> flood count.
>
> After the patch, this no longer happens, so the sequence can repeat
> for long periods. However, the pte that is the result of the 'use'
> part is never accessed, so it should be detected as flooded! But our
> flood detection mechanism looks at one page at a time (per vcpu),
> while there are two pages involved here.
>
> There are (at least) three options available:
> - detect and special-case this scenario
> - change the flood detector to be per page table instead of per vcpu
> - change the flood detector to look at a list of recently used page
>   tables instead of the last page table
>
> I'm having a hard time trying to pick between the second and third
> options.
>