From: Avi Kivity <avi@redhat.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>,
Mark Langsdorf <mark.langsdorf@amd.com>,
Mike Galbraith <efault@gmx.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Prevent immediate process rescheduling
Date: Sat, 19 Sep 2009 11:31:41 +0300 [thread overview]
Message-ID: <4AB496ED.7010800@redhat.com> (raw)
In-Reply-To: <1253304203.10538.64.camel@laptop>
On 09/18/2009 11:03 PM, Peter Zijlstra wrote:
> On Fri, 2009-09-18 at 21:54 +0200, Ingo Molnar wrote:
>
>
>>> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
>>> index 652e8bd..4fad08f 100644
>>> --- a/kernel/sched_fair.c
>>> +++ b/kernel/sched_fair.c
>>> @@ -353,11 +353,25 @@ static void __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>> static struct sched_entity *__pick_next_entity(struct cfs_rq *cfs_rq)
>>> {
>>> struct rb_node *left = cfs_rq->rb_leftmost;
>>> + struct sched_entity *se, *curr;
>>>
>>> if (!left)
>>> return NULL;
>>>
>>> - return rb_entry(left, struct sched_entity, run_node);
>>> + se = rb_entry(left, struct sched_entity, run_node);
>>> + curr =¤t->se;
>>> +
>>> + /*
>>> + * Don't select the entity who just tried to schedule away
>>> + * if there's another entity available.
>>> + */
>>> + if (unlikely(se == curr&& cfs_rq->nr_running> 1)) {
>>> + struct rb_node *next_node = rb_next(&curr->run_node);
>>> + if (next_node)
>>> + se = rb_entry(next_node, struct sched_entity, run_node);
>>> + }
>>> +
>>> + return se;
>>> }
>>>
> Really hate this change though,. doesn't seem right to not pick the same
> task again if its runnable. Bad for cache footprint.
>
> The scenario is quite common for stuff like:
>
> CPU0 CPU1
>
> set_task_state(TASK_INTERRUPTIBLE)
>
> if (cond)
> goto out;
> <--- ttwu()
> schedule();
>
>
I agree, yielding should be explicitly requested.
Also, on a heavily overcommitted box an undirected yield might take
quite a long time to find the thread that's holding the lock. I think a
yield_to() will be a lot better:
- we can pick one of the vcpus belonging to the same guest to improve
the probability that the lock actually get released
- we avoid an issue when the other vcpus are on different runqueues (in
which case the current patch does nothing)
- we can fix the accounting by donating vruntime from current to the
yielded-to vcpu
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
next prev parent reply other threads:[~2009-09-19 8:31 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-18 19:49 [PATCH] Prevent immediate process rescheduling Mark Langsdorf
2009-09-18 19:54 ` Ingo Molnar
2009-09-18 20:00 ` Langsdorf, Mark
2009-09-18 20:03 ` Peter Zijlstra
2009-09-19 2:16 ` Mike Galbraith
2009-09-19 9:03 ` Mike Galbraith
2009-09-19 8:31 ` Avi Kivity [this message]
2009-09-20 21:03 ` Langsdorf, Mark
2009-09-20 21:55 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AB496ED.7010800@redhat.com \
--to=avi@redhat.com \
--cc=a.p.zijlstra@chello.nl \
--cc=efault@gmx.de \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.langsdorf@amd.com \
--cc=mingo@elte.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox