Re: [PATCH RFC 0/2] kvm: Better yield_to candidate using preemption notifiers

From: Andrew Jones <drjones@redhat.com>
To: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Avi Kivity <avi.kivity@gmail.com>, Gleb Natapov <gleb@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Srikar <srikar@linux.vnet.ibm.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>,
	KVM <kvm@vger.kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
	Jiannan Ouyang <ouyang@cs.pitt.edu>,
	Chegu Vinod <chegu_vinod@hp.com>,
	"Andrew M. Theurer" <habanero@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@gmail.com>
Subject: Re: [PATCH RFC 0/2] kvm: Better yield_to candidate using preemption notifiers
Date: Tue, 5 Mar 2013 10:53:08 +0100
Message-ID: <20130305095307.GA11728@hawk.usersys.redhat.com>
In-Reply-To: <20130304180146.31281.33540.sendpatchset@codeblue.in.ibm.com>

On Mon, Mar 04, 2013 at 11:31:46PM +0530, Raghavendra K T wrote:
> This patch series further narrows down the vcpu candidates to yield to
> in the PLE handler. The main idea is to record the preempted vcpus using
> preempt notifiers and iterate over only those preempted vcpus in the
> handler. Note that the vcpus which were in a spin loop during the pause
> loop exit are already filtered out.

The %improvement and patch series look good.

> 
> Thanks to Jiannan and Avi for bringing up the idea, and to Gleb and PeterZ
> for valuable suggestions during the discussion.
> Thanks to Srikar for suggesting that we avoid the rcu lock while checking
> task state, which has improved the overcommit cases.
> 
> There are basically two approaches for the implementation.
> 
> Method 1: Uses per vcpu preempt flag (this series).
> 
> Method 2: We keep a bitmap of preempted vcpus. Using this, we can easily
> iterate over preempted vcpus.
> 
> Note that method 2 needs an extra index variable to identify/map bitmap to
> vcpu and it also needs static vcpu allocation.

We definitely don't want something that requires static vcpu allocation.
I think it'd be better to add another counter for the vcpu bit assignment.
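
To make the counter idea concrete, here is one possible reading of it as a
minimal user-space sketch, not kernel code. The names vm_model, vcpu_model,
next_preempt_bit and MODEL_MAX_VCPUS are hypothetical and do not appear in
the posted patches. The sketch assumes vcpu creation is serialized by a lock
held by the caller, so a plain int is enough for the counter.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_VCPUS 64  /* hypothetical cap, stands in for KVM_MAX_VCPUS */

/* Hypothetical per-VM state: only the bit-assignment counter is modelled. */
struct vm_model {
	int next_preempt_bit;   /* next unused index in the preempt bitmap */
};

/* Hypothetical per-vcpu state: the bitmap index this vcpu was given. */
struct vcpu_model {
	int preempt_bit;
};

/*
 * Assign the next free bitmap index to a newly created vcpu.
 * Assumption: the caller serializes vcpu creation, so no atomics are used.
 * Returns the assigned index, or -1 if the bitmap is already full.
 */
static int vcpu_assign_preempt_bit(struct vm_model *vm, struct vcpu_model *vcpu)
{
	assert(vm != NULL && vcpu != NULL);

	if (vm->next_preempt_bit >= MODEL_MAX_VCPUS)
		return -1;

	vcpu->preempt_bit = vm->next_preempt_bit++;
	return vcpu->preempt_bit;
}
```

With something along these lines the bitmap index is handed out at creation
time from its own counter, rather than being tied to where the vcpu happens
to be stored.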

> 
> I am also posting the Method 2 approach for reference, in case it is of interest.

I guess the interest in Method 2 would come from perf numbers. Did you try
comparing Method 1 vs. Method 2?

> 
> Result: decent improvement for kernbench and ebizzy.
> 
> base = 3.8.0 + undercommit patches 
> patched = base + preempt patches
> 
> Tested on 32 core (no HT) mx3850 machine with 32 vcpu guest 8GB RAM
> 
> --+-----------+-----------+-----------+------------+-----------+
>                kernbench (exec time in sec, lower is better)
> --+-----------+-----------+-----------+------------+-----------+
>       base       stdev       patched       stdev      %improve 
> --+-----------+-----------+-----------+------------+-----------+
> 1x    47.0383     4.6977     44.2584     1.2899      5.90986
> 2x    96.0071     7.1873     91.2605     7.3567      4.94401
> 3x   164.0157    10.3613    156.6750    11.4267      4.47561
> 4x   212.5768    23.7326    204.4800    13.2908      3.80888
> --+-----------+-----------+-----------+------------+-----------+
> no ple kernbench 1x result for reference: 46.056133
> 
> --+-----------+-----------+-----------+------------+-----------+
>                ebizzy (records/sec, higher is better)
> --+-----------+-----------+-----------+------------+-----------+
>       base       stdev       patched       stdev      %improve 
> --+-----------+-----------+-----------+------------+-----------+
> 1x  5609.2000    56.9343    6263.7000    64.7097     11.66833
> 2x  2071.9000   108.4829    2653.5000   181.8395     28.07085
> 3x  1557.4167   109.7141    1993.5000   166.3176     28.00043
> 4x  1254.7500    91.2997    1765.5000   237.5410     40.70532
> --+-----------+-----------+-----------+------------+-----------+
> no ple ebizzy 1x result for reference : 7394.9 rec/sec
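
For anyone re-deriving the %improve columns: they appear to be the change
relative to base, with the sign flipped between the two tables because
kernbench is lower-is-better and ebizzy is higher-is-better. A small sketch
of that arithmetic follows; the helper names are made up for illustration.

```c
#include <assert.h>

/* Percent improvement when a smaller value is better (e.g. exec time). */
static double improve_lower_is_better(double base, double patched)
{
	assert(base > 0.0);
	return (base - patched) / base * 100.0;
}

/* Percent improvement when a larger value is better (e.g. records/sec). */
static double improve_higher_is_better(double base, double patched)
{
	assert(base > 0.0);
	return (patched - base) / base * 100.0;
}
```

For example, the kernbench 1x row gives (47.0383 - 44.2584) / 47.0383 * 100,
and the ebizzy 4x row gives (1765.5 - 1254.75) / 1254.75 * 100, which match
the reported 5.90986 and 40.70532.
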
> 
> Please let me know if you have any suggestions and comments.
> 
> Raghavendra K T (2):
>    kvm: Record the preemption status of vcpus using preempt notifiers
>    kvm: Iterate over only vcpus that are preempted
> 	
> ----
>  include/linux/kvm_host.h | 1 +
>  virt/kvm/kvm_main.c      | 7 +++++++
>  2 files changed, 8 insertions(+)
>  
> Reference patch for Method 2
> ---8<---
> Use preempt bitmap and optimize vcpu iteration using preempt notifiers
> 
> From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> 
> Record the preempted vcpus in a bitmap using preempt notifiers.
> Add the logic for iterating over only the preempted vcpus, thus making
> vcpu iteration fast.
> Thanks to Jiannan and Avi for initially proposing the patch, and to Gleb
> and Peter for valuable suggestions.
> Thanks to Srikar for suggesting removal of the rcu lock while checking
> task state, which helped reduce overcommit overhead.
> 
> Not-yet-signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> ---
>  include/linux/kvm_host.h |    7 +++++++
>  virt/kvm/kvm_main.c      |   15 ++++++++++++---
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index cad77fe..8c4a2409 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -252,6 +252,7 @@ struct kvm_vcpu {
>  		bool dy_eligible;
>  	} spin_loop;
>  #endif
> +	int idx;
>  	struct kvm_vcpu_arch arch;
>  };
>  
> @@ -385,6 +386,7 @@ struct kvm {
>  	long mmu_notifier_count;
>  #endif
>  	long tlbs_dirty;
> +	DECLARE_BITMAP(preempt_bitmap, KVM_MAX_VCPUS);
>  };
>  
>  #define kvm_err(fmt, ...) \
> @@ -413,6 +415,11 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
>  	     (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
>  	     idx++)
>  
> +#define kvm_for_each_preempted_vcpu(idx, vcpup, kvm, n) \
> +	for (idx = find_first_bit(kvm->preempt_bitmap, KVM_MAX_VCPUS); \
> +	     idx < n && (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
> +	     idx = find_next_bit(kvm->preempt_bitmap, KVM_MAX_VCPUS, idx+1))
> +
>  #define kvm_for_each_memslot(memslot, slots)	\
>  	for (memslot = &slots->memslots[0];	\
>  	      memslot < slots->memslots + KVM_MEM_SLOTS_NUM && memslot->npages;\
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index adc68fe..1db16b3 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1770,10 +1770,12 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  	struct kvm_vcpu *vcpu;
>  	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
>  	int yielded = 0;
> +	int num_vcpus;
>  	int try = 3;
>  	int pass;
>  	int i;
> -
> +
> +	num_vcpus = atomic_read(&kvm->online_vcpus);
>  	kvm_vcpu_set_in_spin_loop(me, true);
>  	/*
>  	 * We boost the priority of a VCPU that is runnable but not
> @@ -1783,7 +1785,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  	 * We approximate round-robin by starting at the last boosted VCPU.
>  	 */
>  	for (pass = 0; pass < 2 && !yielded && try; pass++) {
> -		kvm_for_each_vcpu(i, vcpu, kvm) {
> +		kvm_for_each_preempted_vcpu(i, vcpu, kvm, num_vcpus) {
>  			if (!pass && i <= last_boosted_vcpu) {
>  				i = last_boosted_vcpu;
>  				continue;
> @@ -1878,6 +1880,7 @@ static int create_vcpu_fd(struct kvm_vcpu *vcpu)
>  static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>  {
>  	int r;
> +	int curr_idx;
>  	struct kvm_vcpu *vcpu, *v;
>  
>  	vcpu = kvm_arch_vcpu_create(kvm, id);
> @@ -1916,7 +1919,9 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>  		goto unlock_vcpu_destroy;
>  	}
>  
> -	kvm->vcpus[atomic_read(&kvm->online_vcpus)] = vcpu;
> +	curr_idx = atomic_read(&kvm->online_vcpus);
> +	kvm->vcpus[curr_idx] = vcpu;
> +	vcpu->idx = curr_idx;
>  	smp_wmb();
>  	atomic_inc(&kvm->online_vcpus);
>  
> @@ -2902,6 +2907,7 @@ struct kvm_vcpu *preempt_notifier_to_vcpu(struct preempt_notifier *pn)
>  static void kvm_sched_in(struct preempt_notifier *pn, int cpu)
>  {
>  	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
> +	clear_bit(vcpu->idx, vcpu->kvm->preempt_bitmap);
>  
>  	kvm_arch_vcpu_load(vcpu, cpu);
>  }
> @@ -2911,6 +2917,9 @@ static void kvm_sched_out(struct preempt_notifier *pn,
>  {
>  	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
>  
> +	if (current->state == TASK_RUNNING)
> +		set_bit(vcpu->idx, vcpu->kvm->preempt_bitmap);
> +
>  	kvm_arch_vcpu_put(vcpu);
>  }
>  
>
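
For readers who want to play with the Method 2 iteration outside the kernel,
here is a rough user-space model of it. It is only a sketch under stated
assumptions: a single 64-bit word stands in for the preempt bitmap,
MODEL_MAX_VCPUS and all function names are hypothetical, and the bit
operations are not atomic, whereas the patch uses set_bit() and clear_bit().

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_VCPUS 64  /* hypothetical, one 64-bit word of bitmap */

struct preempt_map {
	uint64_t bits;          /* bit i set means vcpu i was preempted */
};

/* Models the sched-out path: the vcpu was still runnable when switched out. */
static void mark_preempted(struct preempt_map *m, int idx)
{
	assert(idx >= 0 && idx < MODEL_MAX_VCPUS);
	m->bits |= (uint64_t)1 << idx;
}

/* Models the sched-in path: the vcpu is running again. */
static void mark_running(struct preempt_map *m, int idx)
{
	assert(idx >= 0 && idx < MODEL_MAX_VCPUS);
	m->bits &= ~((uint64_t)1 << idx);
}

/* First set bit at or after 'from', or MODEL_MAX_VCPUS if there is none. */
static int next_preempted(const struct preempt_map *m, int from)
{
	for (int i = from; i < MODEL_MAX_VCPUS; i++)
		if (m->bits & ((uint64_t)1 << i))
			return i;
	return MODEL_MAX_VCPUS;
}

/*
 * Walk only the preempted vcpus whose index is below 'online', in the same
 * shape as the kvm_for_each_preempted_vcpu() loop in the patch.
 * Stores the visited indices in out[] and returns how many were visited.
 */
static int collect_preempted(const struct preempt_map *m, int online, int *out)
{
	int n = 0;

	for (int i = next_preempted(m, 0); i < online;
	     i = next_preempted(m, i + 1))
		out[n++] = i;
	return n;
}
```

Note that the walk stops at the first set bit at or beyond 'online', which is
why the index used for the bitmap has to stay in step with the vcpu array.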
