public inbox for linux-kernel@vger.kernel.org
From: Andrew Jones <drjones@redhat.com>
To: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Avi Kivity <avi.kivity@gmail.com>, Gleb Natapov <gleb@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Srikar <srikar@linux.vnet.ibm.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>,
	KVM <kvm@vger.kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
	Jiannan Ouyang <ouyang@cs.pitt.edu>,
	Chegu Vinod <chegu_vinod@hp.com>,
	"Andrew M. Theurer" <habanero@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@gmail.com>
Subject: Re: [PATCH RFC 0/2] kvm: Better yield_to candidate using preemption notifiers
Date: Tue, 5 Mar 2013 10:53:08 +0100
Message-ID: <20130305095307.GA11728@hawk.usersys.redhat.com>
In-Reply-To: <20130304180146.31281.33540.sendpatchset@codeblue.in.ibm.com>

On Mon, Mar 04, 2013 at 11:31:46PM +0530, Raghavendra K T wrote:
> This patch series further refines the choice of vcpu candidate to yield to
> in the PLE handler. The main idea is to record the preempted vcpus using
> preempt notifiers and iterate over only those preempted vcpus in the
> handler. Note that the vcpus which were in a spin loop during pause loop
> exit are already filtered out.

The %improvement and patch series look good.
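
For anyone checking the numbers: the %improve column in the tables quoted
below appears to be the relative change against base, with the sign chosen
so that a positive value is always a gain. A minimal sketch of that
arithmetic follows. The function names are hypothetical and not part of the
series; the sample values in the comments are taken from the 1x rows.

```c
#include <stdio.h>

/*
 * Lower-is-better metric, e.g. kernbench exec time in seconds.
 * A positive result means the patched run was faster.
 */
static double improve_lower_is_better(double base, double patched)
{
	return (base - patched) / base * 100.0;
}

/*
 * Higher-is-better metric, e.g. ebizzy records/sec.
 * A positive result means the patched run did more work.
 */
static double improve_higher_is_better(double base, double patched)
{
	return (patched - base) / base * 100.0;
}

/* Print both 1x rows: kernbench 47.0383 -> 44.2584, ebizzy 5609.2 -> 6263.7. */
static void print_1x_rows(void)
{
	printf("kernbench 1x: %.5f\n",
	       improve_lower_is_better(47.0383, 44.2584));
	printf("ebizzy    1x: %.5f\n",
	       improve_higher_is_better(5609.2, 6263.7));
}
```

Both helpers normalise by the base value, which is why the kernbench and
ebizzy columns can be read the same way despite the metrics pointing in
opposite directions.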

> 
> Thanks to Jiannan and Avi for bringing up the idea, and to Gleb and PeterZ
> for valuable suggestions during the discussion.
> Thanks to Srikar for suggesting that we avoid the rcu lock while checking
> task state, which has improved the overcommit cases.
> 
> There are basically two approaches for the implementation.
> 
> Method 1: Uses per vcpu preempt flag (this series).
> 
> Method 2: We keep a bitmap of preempted vcpus. Using this we can easily
> iterate over preempted vcpus.
> 
> Note that method 2 needs an extra index variable to identify/map bitmap to
> vcpu and it also needs static vcpu allocation.

We definitely don't want something that requires static vcpu allocation.
I think it'd be better to add another counter for the vcpu bit assignment.
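
One possible reading of that suggestion, as a rough userspace sketch only:
each vcpu takes its bit from a dedicated per-VM counter at creation time, so
the bit number does not depend on where the vcpu sits in any array. All
names here (struct vm, next_bit, MAX_VCPUS and the helpers) are hypothetical
stand-ins, and C11 atomics stand in for the kernel's set_bit()/clear_bit().

```c
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_VCPUS	255	/* hypothetical; stands in for KVM_MAX_VCPUS */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS	((MAX_VCPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct vm {
	atomic_int next_bit;	/* counter used only for bit assignment */
	_Atomic unsigned long preempted[BITMAP_WORDS];
};

struct vcpu {
	struct vm *vm;
	int bit;		/* assigned once, at creation */
};

/* Hand out the next bit; returns -1 once the bitmap is exhausted. */
static int vcpu_assign_bit(struct vm *vm, struct vcpu *v)
{
	int bit = atomic_fetch_add(&vm->next_bit, 1);

	if (bit >= MAX_VCPUS)
		return -1;
	v->vm = vm;
	v->bit = bit;
	return bit;
}

/* Would be called from the sched-out notifier when the task is still runnable. */
static void vcpu_mark_preempted(struct vcpu *v)
{
	atomic_fetch_or(&v->vm->preempted[v->bit / BITS_PER_WORD],
			1UL << (v->bit % BITS_PER_WORD));
}

/* Would be called from the sched-in notifier. */
static void vcpu_clear_preempted(struct vcpu *v)
{
	atomic_fetch_and(&v->vm->preempted[v->bit / BITS_PER_WORD],
			 ~(1UL << (v->bit % BITS_PER_WORD)));
}

/* Next set bit at or after 'from', or MAX_VCPUS if there is none. */
static int next_preempted(struct vm *vm, int from)
{
	for (int bit = from; bit < MAX_VCPUS; bit++) {
		unsigned long w = atomic_load(&vm->preempted[bit / BITS_PER_WORD]);

		if (w & (1UL << (bit % BITS_PER_WORD)))
			return bit;
	}
	return MAX_VCPUS;
}
```

The sketch leaves out the reverse mapping from bit number back to vcpu,
which the PLE handler would also need, and it never recycles bits.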

> 
> I am also posting the Method 2 approach for reference in case it is of interest.

I guess the interest in Method 2 would come from perf numbers. Did you try
comparing Method 1 vs. Method 2?
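
To make the comparison concrete, here is a simplified userspace sketch of
how a Method 1 style filter could look: the flag is set on sched-out when
the task is still runnable and cleared on sched-in, mirroring the notifier
hooks in the Method 2 reference patch below. The handler still walks every
vcpu and skips those whose flag is clear, which is the cost a bitmap scan
would avoid. The struct layout and function names are assumptions for
illustration, not code from the series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified vcpu; field names are illustrative only. */
struct vcpu {
	bool preempted;		/* Method 1: per-vcpu preempt flag */
	bool task_running;	/* stands in for current->state == TASK_RUNNING */
};

/* Scheduled out while still runnable means the vcpu was preempted. */
static void sched_out(struct vcpu *v)
{
	if (v->task_running)
		v->preempted = true;
}

/* Back on a cpu, so no longer a yield candidate. */
static void sched_in(struct vcpu *v)
{
	v->preempted = false;
}

/*
 * Method 1 style scan: visit every vcpu, skip the caller and any vcpu
 * whose flag is clear. Returns the index of the first preempted vcpu
 * other than 'me', or -1 if there is none.
 */
static int pick_preempted(const struct vcpu *vcpus, size_t n, size_t me)
{
	for (size_t i = 0; i < n; i++) {
		if (i == me || !vcpus[i].preempted)
			continue;
		return (int)i;
	}
	return -1;
}
```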

> 
> Result: decent improvement for kernbench and ebizzy.
> 
> base = 3.8.0 + undercommit patches 
> patched = base + preempt patches
> 
> Tested on 32 core (no HT) mx3850 machine with 32 vcpu guest 8GB RAM
> 
> --+-----------+-----------+-----------+------------+-----------+
>                kernbench (exec time in sec, lower is better)
> --+-----------+-----------+-----------+------------+-----------+
>       base       stdev       patched       stdev      %improve
> --+-----------+-----------+-----------+------------+-----------+
> 1x    47.0383     4.6977     44.2584     1.2899      5.90986
> 2x    96.0071     7.1873     91.2605     7.3567      4.94401
> 3x   164.0157    10.3613    156.6750    11.4267      4.47561
> 4x   212.5768    23.7326    204.4800    13.2908      3.80888
> --+-----------+-----------+-----------+------------+-----------+
> no ple kernbench 1x result for reference: 46.056133
> 
> --+-----------+-----------+-----------+------------+-----------+
>                ebizzy (records/sec, higher is better)
> --+-----------+-----------+-----------+------------+-----------+
>       base       stdev       patched       stdev      %improve 
> --+-----------+-----------+-----------+------------+-----------+
> 1x  5609.2000    56.9343    6263.7000    64.7097     11.66833
> 2x  2071.9000   108.4829    2653.5000   181.8395     28.07085
> 3x  1557.4167   109.7141    1993.5000   166.3176     28.00043
> 4x  1254.7500    91.2997    1765.5000   237.5410     40.70532
> --+-----------+-----------+-----------+------------+-----------+
> no ple ebizzy 1x result for reference : 7394.9 rec/sec
> 
> Please let me know if you have any suggestions and comments.
> 
> Raghavendra K T (2):
>    kvm: Record the preemption status of vcpus using preempt notifiers
>    kvm: Iterate over only vcpus that are preempted
> 	
> ----
>  include/linux/kvm_host.h | 1 +
>  virt/kvm/kvm_main.c      | 7 +++++++
>  2 files changed, 8 insertions(+)
>  
> Reference patch for Method 2
> ---8<---
> Use preempt bitmap and optimize vcpu iteration using preempt notifiers
> 
> From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> 
> Record the preempted vcpus in a bitmap using preempt notifiers.
> Add the logic to iterate over only the preempted vcpus, making
> vcpu iteration faster.
> Thanks to Jiannan and Avi for initially proposing the patch, and to
> Gleb and Peter for valuable suggestions.
> Thanks to Srikar for suggesting removal of the rcu lock while checking
> task state, which helped reduce overcommit overhead.
> 
> Not-yet-signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> ---
>  include/linux/kvm_host.h |    7 +++++++
>  virt/kvm/kvm_main.c      |   15 ++++++++++++---
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index cad77fe..8c4a2409 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -252,6 +252,7 @@ struct kvm_vcpu {
>  		bool dy_eligible;
>  	} spin_loop;
>  #endif
> +	int idx;
>  	struct kvm_vcpu_arch arch;
>  };
>  
> @@ -385,6 +386,7 @@ struct kvm {
>  	long mmu_notifier_count;
>  #endif
>  	long tlbs_dirty;
> +	DECLARE_BITMAP(preempt_bitmap, KVM_MAX_VCPUS);
>  };
>  
>  #define kvm_err(fmt, ...) \
> @@ -413,6 +415,11 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
>  	     (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
>  	     idx++)
>  
> +#define kvm_for_each_preempted_vcpu(idx, vcpup, kvm, n) \
> +	for (idx = find_first_bit(kvm->preempt_bitmap, KVM_MAX_VCPUS); \
> +	     idx < n && (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
> +	     idx = find_next_bit(kvm->preempt_bitmap, KVM_MAX_VCPUS, idx+1))
> +
>  #define kvm_for_each_memslot(memslot, slots)	\
>  	for (memslot = &slots->memslots[0];	\
>  	      memslot < slots->memslots + KVM_MEM_SLOTS_NUM && memslot->npages;\
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index adc68fe..1db16b3 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1770,10 +1770,12 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  	struct kvm_vcpu *vcpu;
>  	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
>  	int yielded = 0;
> +	int num_vcpus;
>  	int try = 3;
>  	int pass;
>  	int i;
> -
> +
> +	num_vcpus = atomic_read(&kvm->online_vcpus);
>  	kvm_vcpu_set_in_spin_loop(me, true);
>  	/*
>  	 * We boost the priority of a VCPU that is runnable but not
> @@ -1783,7 +1785,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  	 * We approximate round-robin by starting at the last boosted VCPU.
>  	 */
>  	for (pass = 0; pass < 2 && !yielded && try; pass++) {
> -		kvm_for_each_vcpu(i, vcpu, kvm) {
> +		kvm_for_each_preempted_vcpu(i, vcpu, kvm, num_vcpus) {
>  			if (!pass && i <= last_boosted_vcpu) {
>  				i = last_boosted_vcpu;
>  				continue;
> @@ -1878,6 +1880,7 @@ static int create_vcpu_fd(struct kvm_vcpu *vcpu)
>  static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>  {
>  	int r;
> +	int curr_idx;
>  	struct kvm_vcpu *vcpu, *v;
>  
>  	vcpu = kvm_arch_vcpu_create(kvm, id);
> @@ -1916,7 +1919,9 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>  		goto unlock_vcpu_destroy;
>  	}
>  
> -	kvm->vcpus[atomic_read(&kvm->online_vcpus)] = vcpu;
> +	curr_idx = atomic_read(&kvm->online_vcpus);
> +	kvm->vcpus[curr_idx] = vcpu;
> +	vcpu->idx = curr_idx;
>  	smp_wmb();
>  	atomic_inc(&kvm->online_vcpus);
>  
> @@ -2902,6 +2907,7 @@ struct kvm_vcpu *preempt_notifier_to_vcpu(struct preempt_notifier *pn)
>  static void kvm_sched_in(struct preempt_notifier *pn, int cpu)
>  {
>  	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
> +	clear_bit(vcpu->idx, vcpu->kvm->preempt_bitmap);
>  
>  	kvm_arch_vcpu_load(vcpu, cpu);
>  }
> @@ -2911,6 +2917,9 @@ static void kvm_sched_out(struct preempt_notifier *pn,
>  {
>  	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
>  
> +	if (current->state == TASK_RUNNING)
> +		set_bit(vcpu->idx, vcpu->kvm->preempt_bitmap);
> +
>  	kvm_arch_vcpu_put(vcpu);
>  }
>  
> 
