From: Srivatsa Vaddagiri <vatsa@in.ibm.com>
To: Avi Kivity <avi@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>, Gleb Natapov <gleb@redhat.com>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, hpa@zytor.com,
mingo@elte.hu, npiggin@suse.de, tglx@linutronix.de,
mtosatti@redhat.com
Subject: Re: [PATCH] use unfair spinlock when running on hypervisor.
Date: Thu, 3 Jun 2010 09:50:51 +0530
Message-ID: <20100603042051.GA5953@linux.vnet.ibm.com>
In-Reply-To: <4C061DAB.6000804@redhat.com>
On Wed, Jun 02, 2010 at 12:00:27PM +0300, Avi Kivity wrote:
>
> There are two separate problems: the more general problem is that
> the hypervisor can put a vcpu to sleep while holding a lock, causing
> other vcpus to spin until the end of their time slice. This can
> only be addressed with hypervisor help.
FYI, I have an early patch ready to address this issue. Basically, I am using
host-kernel memory (mmap'ed into the guest as io-memory via the ivshmem driver)
to hint to the host whenever the guest is in a spin-locked section; the host
scheduler reads this hint to defer preemption.
Guest side:
 static inline void spin_lock(spinlock_t *lock)
 {
 	raw_spin_lock(&lock->rlock);
+	__get_cpu_var(gh_vcpu_ptr)->defer_preempt++;
 }
 static inline void spin_unlock(spinlock_t *lock)
 {
+	__get_cpu_var(gh_vcpu_ptr)->defer_preempt--;
 	raw_spin_unlock(&lock->rlock);
 }
[similar changes to other spinlock variants]
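The guest-side bookkeeping can be sketched in user space as below. This is only an illustration under stated assumptions: the struct layout, the gh_* helper names, the unsigned 32-bit type and the overflow/underflow assertions are hypothetical and not part of the patch. It also assumes the guest initializes the hint word to 0xfeed0000, so that the upper 16 bits carry the tag the host-side hunk checks and the lower 16 bits count nested lock sections; the excerpt does not show where that tag is set.

```c
#include <assert.h>
#include <stdint.h>

#define GH_MAGIC      0xfeed0000u  /* tag value tested by the host-side check */
#define GH_MAGIC_MASK 0xffff0000u
#define GH_COUNT_MASK 0x0000ffffu

/* Hypothetical layout of the per-vcpu hint word shared with the host. */
struct gh_vcpu {
	volatile uint32_t defer_preempt;
};

/* Assumption: the word starts out as the bare tag with a zero count. */
static inline void gh_init(struct gh_vcpu *v)
{
	v->defer_preempt = GH_MAGIC;
}

/* Called after a spinlock is taken: one more nested critical section. */
static inline void gh_lock_enter(struct gh_vcpu *v)
{
	/* Added safety check: the count must not carry into the tag bits. */
	assert((v->defer_preempt & GH_COUNT_MASK) != GH_COUNT_MASK);
	v->defer_preempt++;
}

/* Called before a spinlock is released: one fewer nested section. */
static inline void gh_lock_exit(struct gh_vcpu *v)
{
	/* Added safety check: unbalanced unlock would corrupt the tag. */
	assert((v->defer_preempt & GH_COUNT_MASK) != 0);
	v->defer_preempt--;
}

/* Same predicate the host applies: valid tag and a non-zero count. */
static inline int gh_wants_deferral(const struct gh_vcpu *v)
{
	uint32_t w = v->defer_preempt;

	return (w & GH_MAGIC_MASK) == GH_MAGIC && (w & GH_COUNT_MASK) != 0;
}
```

Because the field is a counter rather than a flag, nested spinlocks keep the hint set until the outermost lock is released.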
Host side:
@@ -860,6 +866,17 @@ check_preempt_tick(struct cfs_rq *cfs_rq
 	ideal_runtime = sched_slice(cfs_rq, curr);
 	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
 	if (delta_exec > ideal_runtime) {
+		if ((sched_feat(DEFER_PREEMPT)) && (rq_of(cfs_rq)->curr->ghptr)) {
+			int defer_preempt = rq_of(cfs_rq)->curr->ghptr->defer_preempt;
+			if (((defer_preempt & 0xFFFF0000) == 0xfeed0000) && ((defer_preempt & 0x0000FFFF) != 0)) {
+				if ((rq_of(cfs_rq)->curr->grace_defer++ < sysctl_sched_preempt_defer_count)) {
+					rq_of(cfs_rq)->defer_preempt++;
+					return;
+				} else
+					rq_of(cfs_rq)->force_preempt++;
+			}
+		}
 		resched_task(rq_of(cfs_rq)->curr);
 		/*
 		 * The current task ran long enough, ensure it doesn't get
[similar changes introduced at other preemption points in sched_fair.c]
Note that the guest can only request that preemption be deferred (not disabled)
via this mechanism. I have seen a good improvement (~15%) in a kernel compile
benchmark with sysctl_sched_preempt_defer_count set to a low value of just 2
(i.e., we can defer preemption by at most two ticks). I intend to clean up and
post the patches pretty soon for comments.
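The bounded-deferral decision can be modelled in user space roughly as follows. This is a sketch, not the scheduler code: struct task_sim, struct rq_stats and should_preempt() are hypothetical stand-ins for the task, runqueue counters and the check in check_preempt_tick(), and defer_limit plays the role of sysctl_sched_preempt_defer_count. The excerpt does not show where grace_defer is reset, so the sketch leaves that to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields the hunk adds to the task. */
struct task_sim {
	const volatile uint32_t *ghptr; /* guest hint word, NULL if none */
	unsigned int grace_defer;       /* deferrals already granted */
};

/* Hypothetical stand-in for the per-runqueue statistics counters. */
struct rq_stats {
	unsigned long defer_preempt;
	unsigned long force_preempt;
};

/*
 * Returns true if the task should be preempted now, false if the
 * preemption is deferred for this tick. Mirrors the hunk: the hint is
 * honoured only while the tag is valid, the count is non-zero and
 * fewer than defer_limit deferrals have been granted.
 */
bool should_preempt(struct task_sim *t, struct rq_stats *st,
		    unsigned int defer_limit)
{
	assert(t && st);

	if (t->ghptr) {
		uint32_t hint = *t->ghptr;

		if ((hint & 0xffff0000u) == 0xfeed0000u &&
		    (hint & 0x0000ffffu) != 0) {
			if (t->grace_defer++ < defer_limit) {
				st->defer_preempt++;
				return false;   /* grant one more tick */
			}
			st->force_preempt++;    /* bound reached */
		}
	}
	return true;
}
```

With defer_limit set to 2, a task that keeps the hint set is deferred on two consecutive checks and preempted on the third, which is why the guest can delay preemption but never disable it.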
One pathological case where this may actually hurt is guest routines like
flush_tlb_others_ipi(), which take a spinlock and then enter a while() loop
waiting for other cpus to ack something. In this case, deferring preemption just
because the guest is in a critical section actually hurts! Hopefully the upper
bound on deferring preemption, and the fact that such routines may not be hit
frequently, should help alleviate such situations.
- vatsa