From: Rik van Riel <riel@redhat.com>
To: Mike Galbraith <efault@gmx.de>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Avi Kivity <avi@redhat.com>,
Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Chris Wright <chrisw@sous-sol.org>
Subject: Re: [RFC -v2 PATCH 2/3] sched: add yield_to function
Date: Thu, 16 Dec 2010 14:49:08 -0500
Message-ID: <4D0A6D34.6070806@redhat.com>
In-Reply-To: <1292306896.7448.157.camel@marge.simson.net>

On 12/14/2010 01:08 AM, Mike Galbraith wrote:
> On Mon, 2010-12-13 at 22:46 -0500, Rik van Riel wrote:
>
>> diff --git a/kernel/sched.c b/kernel/sched.c
>> index dc91a4d..6399641 100644
>> --- a/kernel/sched.c
>> +++ b/kernel/sched.c
>> @@ -5166,6 +5166,46 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
>> return ret;
>> }
>>
>> +/*
>> + * Yield the CPU, giving the remainder of our time slice to task p.
>> + * Typically used to hand CPU time to another thread inside the same
>> + * process, eg. when p holds a resource other threads are waiting for.
>> + * Giving priority to p may help get that resource released sooner.
>> + */
>> +void yield_to(struct task_struct *p)
>> +{
>> +	unsigned long flags;
>> +	struct rq *rq, *p_rq;
>> +
>> +	local_irq_save(flags);
>> +	rq = this_rq();
>> +again:
>> +	p_rq = task_rq(p);
>> +	double_rq_lock(rq, p_rq);
>> +	if (p_rq != task_rq(p)) {
>> +		double_rq_unlock(rq, p_rq);
>> +		goto again;
>> +	}
>> +
>> +	/* We can't yield to a process that doesn't want to run. */
>> +	if (!p->se.on_rq)
>> +		goto out;
>> +
>> +	/*
>> +	 * We can only yield to a runnable task, in the same schedule class
>> +	 * as the current task, if the schedule class implements yield_to_task.
>> +	 */
>> +	if (!task_running(rq, p) && current->sched_class == p->sched_class &&
>> +			current->sched_class->yield_to)
>> +		current->sched_class->yield_to(rq, p);
>> +
>> +out:
>> +	double_rq_unlock(rq, p_rq);
>> +	local_irq_restore(flags);
>> +	yield();
>> +}
>> +EXPORT_SYMBOL_GPL(yield_to);
>
> That part looks ok, except for the yield cross cpu bit. Trying to yield
> a resource you don't have doesn't make much sense to me.

The current task just donated the rest of its timeslice.
Surely that makes it a reasonable idea to call yield, and
get one of the other tasks on the current CPU running for
a bit?

> <ramble>
> slice_remain() measures the distance to your last preemption, which has
> no relationship with entitlement. sched_slice() is not used to issue
> entitlement, it's only a ruler.
>
> You have entitlement on your current runqueue only, that entitlement
> being the instantaneous distance to min_vruntime in a closed and fluid
> system. You can't inject some instantaneous relationship from one
> closed system into another without making the math go kind of fuzzy,
> so you need tight constraints on how fuzzy it can get.
>
> We do that with migrations, inject fuzz. There is no global fair-stick,
> but we invent one by injecting little bits of fuzz. It's constrained by
> chaos and the magnitude constraints of the common engine. The more you
> migrate, the more tightly you couple systems. As long as we stay fairly
> well balanced, we can migrate without the fuzz getting out of hand, and
> end up with a globally ~fair system.
>
> What you're injecting isn't instantaneously irrelevant lag-fuzz, which
> distributed over time becomes relevant. You're inventing entitlement out
> of the void. Likely not a big hairy deal unless you do it frequently,
> but you're doing something completely bogus and seemingly unconstrained.
> </ramble>

I'm open to suggestions on what to do instead.

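For readers following the entitlement argument, here is a minimal
user-space sketch (not kernel code) of why a vruntime only has meaning
relative to its own runqueue. The names rq_model, lag_on() and
renormalize() are hypothetical and appear nowhere in the patch;
renormalize() is only an assumption about how a value could be carried
from one queue to another while keeping its distance to min_vruntime.

```c
#include <stdint.h>

/* Hypothetical model of one runqueue's virtual clock; not a kernel type. */
struct rq_model {
	uint64_t min_vruntime;
};

/*
 * Signed distance from a task's vruntime to its runqueue's min_vruntime.
 * Positive: the task is behind the queue's clock.
 * Negative: the task is ahead of it.
 * The subtraction is done unsigned and then cast, so it survives wraparound.
 */
static int64_t lag_on(const struct rq_model *rq, uint64_t vruntime)
{
	return (int64_t)(rq->min_vruntime - vruntime);
}

/*
 * Assumed helper: re-express a vruntime taken from one queue on another
 * queue so that its distance to min_vruntime stays the same.
 */
static uint64_t renormalize(const struct rq_model *from,
			    const struct rq_model *to, uint64_t vruntime)
{
	return vruntime - from->min_vruntime + to->min_vruntime;
}
```

With queue clocks at 1000 and 5000, the same raw vruntime of 1200 is
200 ahead on the first queue and 3800 behind on the second, so copying
a raw value or a raw delta across queues changes what it means.
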
>> +static void yield_to_fair(struct rq *rq, struct task_struct *p)
>> +{
>> +	struct sched_entity *se = &p->se;
>> +	struct cfs_rq *cfs_rq = cfs_rq_of(se);
>> +	u64 remain = slice_remain(current);
>> +
>> +	dequeue_task(rq, p, 0);
>> +	se->vruntime -= remain;
>> +	if (se->vruntime < cfs_rq->min_vruntime)
>> +		se->vruntime = cfs_rq->min_vruntime;
>
> This has an excellent chance of moving the recipient rightward.. and the
> yielding task didn't yield anything. This may achieve the desired
> result or may just create a nasty latency spike... but it makes no
> arithmetic sense.

Good point, the current task calls yield() in the function
that calls yield_to_fair, but I seem to have lost the code
that penalizes the current task's runtime...
I'll reinstate that.

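As an illustration of the arithmetic under discussion, here is a rough
user-space sketch (not the patch, and not kernel code) of a donation
that charges the donor exactly what the recipient receives and never
moves the recipient rightward. It assumes both tasks sit on the same
runqueue; toy_entity, toy_cfs_rq, vruntime_before() and toy_donate()
are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for sched_entity and cfs_rq; not kernel types. */
struct toy_entity {
	uint64_t vruntime;
};

struct toy_cfs_rq {
	uint64_t min_vruntime;
};

/* Wrap-safe "a is earlier than b" comparison on virtual time. */
static int vruntime_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Move up to `remain` units of virtual time from donor to recipient.
 * The recipient only ever moves left, and never past min_vruntime.
 * The donor is charged exactly what the recipient received.
 * Returns the amount actually transferred.
 */
static uint64_t toy_donate(struct toy_cfs_rq *rq, struct toy_entity *donor,
			   struct toy_entity *recipient, uint64_t remain)
{
	uint64_t target = recipient->vruntime - remain;
	uint64_t given;

	assert(donor != recipient);

	/* Already left of min_vruntime: clamping would push it rightward. */
	if (vruntime_before(recipient->vruntime, rq->min_vruntime))
		return 0;

	if (vruntime_before(target, rq->min_vruntime))
		target = rq->min_vruntime;

	given = recipient->vruntime - target;
	recipient->vruntime = target;
	donor->vruntime += given;
	return given;
}
```

With min_vruntime at 1000, a recipient at 1200 and a request to donate
500, only 200 is transferred: the recipient lands on 1000 and the donor
is pushed right by 200. A recipient already at 900 is left untouched,
whereas the unconditional clamp in the quoted hunk would move it to
1000.
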
--
All rights reversed