From mboxrd@z Thu Jan 1 00:00:00 1970
From: Raghavendra K T
Subject: Re: [PATCH RFC 0/2] kvm: Better yield_to candidate using preemption notifiers
Date: Tue, 05 Mar 2013 17:54:09 +0530
Message-ID: <5135E3E9.3020608@linux.vnet.ibm.com>
References: <20130304180146.31281.33540.sendpatchset@codeblue.in.ibm.com> <20130305095307.GA11728@hawk.usersys.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Peter Zijlstra, Avi Kivity, Gleb Natapov, Ingo Molnar,
 Marcelo Tosatti, Rik van Riel, Srikar, "H. Peter Anvin",
 "Nikunj A. Dadhania", KVM, Thomas Gleixner, Jiannan Ouyang,
 Chegu Vinod, "Andrew M. Theurer", LKML, Srivatsa Vaddagiri
To: Andrew Jones
Return-path:
In-Reply-To: <20130305095307.GA11728@hawk.usersys.redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On 03/05/2013 03:23 PM, Andrew Jones wrote:
> On Mon, Mar 04, 2013 at 11:31:46PM +0530, Raghavendra K T wrote:
>> This patch series further filters better vcpu candidate to yield to
>> in PLE handler. The main idea is to record the preempted vcpus using
>> preempt notifiers and iterate only those preempted vcpus in the
>> handler. Note that the vcpus which were in spinloop during pause loop
>> exit are already filtered.
>
> The %improvement and patch series look good.
>

Thank you for the review.

>>
>> Thanks Jiannan, Avi for bringing the idea and Gleb, PeterZ for
>> precious suggestions during the discussion.
>> Thanks Srikar for suggesting to avoid rcu lock while checking task state
>> that has improved overcommit cases.
>>
>> There are basically two approaches for the implementation.
>>
>> Method 1: Uses per vcpu preempt flag (this series).
>>
>> Method 2: We keep a bitmap of preempted vcpus. Using this we can easily
>> iterate over preempted vcpus.
>>
>> Note that method 2 needs an extra index variable to identify/map bitmap to
>> vcpu and it also needs static vcpu allocation.
>
> We definitely don't want something that requires static vcpu allocation.
> I think it'd be better to add another counter for the vcpu bit assignment.
>

So do you mean something parallel to online_vcpus?

>>
>> I am also posting Method 2 approach for reference in case it interests.
>
> I guess the interest in Method2 would come from perf numbers. Did you try
> comparing Method1 vs. Method2?
>

Yes, I did. Performance-wise, method 2 is almost equal to method 1. But I
believe that if there is any difference, it may show up only on a guest
with a large number of vcpus. (Currently I only have a 32-core host.)
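
For anyone following the thread without the patches at hand, the two
methods can be sketched roughly as below. This is a userspace
illustration only, not the code from the series: struct vm, struct vcpu,
NR_VCPUS, mark_preempted(), mark_running(), pick_by_flag() and
pick_by_bitmap() are all hypothetical names, and the fixed-size array
stands in for the static vcpu allocation that method 2 would need. In
the kernel the marking would be driven from the preempt notifier
callbacks, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VCPUS 8 /* hypothetical fixed size, for this sketch only */

struct vcpu {
	bool preempted; /* method 1: one flag per vcpu */
};

struct vm {
	struct vcpu vcpus[NR_VCPUS];
	uint64_t preempted_map; /* method 2: bit i set => vcpus[i] preempted */
};

/* Stand-in for what a sched-out notifier would record. */
static void mark_preempted(struct vm *vm, int idx)
{
	assert(idx >= 0 && idx < NR_VCPUS);
	vm->vcpus[idx].preempted = true;
	vm->preempted_map |= UINT64_C(1) << idx;
}

/* Stand-in for what a sched-in notifier would record. */
static void mark_running(struct vm *vm, int idx)
{
	assert(idx >= 0 && idx < NR_VCPUS);
	vm->vcpus[idx].preempted = false;
	vm->preempted_map &= ~(UINT64_C(1) << idx);
}

/* Method 1: visit every vcpu and test its flag. */
static int pick_by_flag(const struct vm *vm, int self)
{
	for (int i = 0; i < NR_VCPUS; i++) {
		if (i != self && vm->vcpus[i].preempted)
			return i;
	}
	return -1; /* no preempted candidate */
}

/* Method 2: look only at set bits; the bit position is the vcpu index. */
static int pick_by_bitmap(const struct vm *vm, int self)
{
	uint64_t map = vm->preempted_map & ~(UINT64_C(1) << self);
	int idx = 0;

	if (map == 0)
		return -1; /* no preempted candidate */
	while (!(map & 1)) {
		map >>= 1;
		idx++;
	}
	return idx;
}
```

Both pickers return the same candidate here. The difference is only in
how much is scanned, which is consistent with any gap being visible
mainly on guests with many vcpus.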