From: Avi Kivity <avi@redhat.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>,
Mark Langsdorf <mark.langsdorf@amd.com>,
Mike Galbraith <efault@gmx.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Prevent immediate process rescheduling
Date: Sat, 19 Sep 2009 11:31:41 +0300 [thread overview]
Message-ID: <4AB496ED.7010800@redhat.com> (raw)
In-Reply-To: <1253304203.10538.64.camel@laptop>
On 09/18/2009 11:03 PM, Peter Zijlstra wrote:
> On Fri, 2009-09-18 at 21:54 +0200, Ingo Molnar wrote:
>
>
>>> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
>>> index 652e8bd..4fad08f 100644
>>> --- a/kernel/sched_fair.c
>>> +++ b/kernel/sched_fair.c
>>> @@ -353,11 +353,25 @@ static void __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>> static struct sched_entity *__pick_next_entity(struct cfs_rq *cfs_rq)
>>> {
>>> struct rb_node *left = cfs_rq->rb_leftmost;
>>> + struct sched_entity *se, *curr;
>>>
>>> if (!left)
>>> return NULL;
>>>
>>> - return rb_entry(left, struct sched_entity, run_node);
>>> + se = rb_entry(left, struct sched_entity, run_node);
>>> + curr =¤t->se;
>>> +
>>> + /*
>>> + * Don't select the entity who just tried to schedule away
>>> + * if there's another entity available.
>>> + */
>>> + if (unlikely(se == curr&& cfs_rq->nr_running> 1)) {
>>> + struct rb_node *next_node = rb_next(&curr->run_node);
>>> + if (next_node)
>>> + se = rb_entry(next_node, struct sched_entity, run_node);
>>> + }
>>> +
>>> + return se;
>>> }
>>>
> Really hate this change though,. doesn't seem right to not pick the same
> task again if its runnable. Bad for cache footprint.
>
> The scenario is quite common for stuff like:
>
> CPU0 CPU1
>
> set_task_state(TASK_INTERRUPTIBLE)
>
> if (cond)
> goto out;
> <--- ttwu()
> schedule();
>
>
I agree, yielding should be explicitly requested.
Also, on a heavily overcommitted box an undirected yield might take
quite a long time to find the thread that's holding the lock. I think a
yield_to() will be a lot better:
- we can pick one of the vcpus belonging to the same guest to improve
the probability that the lock actually get released
- we avoid an issue when the other vcpus are on different runqueues (in
which case the current patch does nothing)
- we can fix the accounting by donating vruntime from current to the
yielded-to vcpu
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
next prev parent reply other threads:[~2009-09-19 8:31 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-18 19:49 [PATCH] Prevent immediate process rescheduling Mark Langsdorf
2009-09-18 19:54 ` Ingo Molnar
2009-09-18 20:00 ` Langsdorf, Mark
2009-09-18 20:03 ` Peter Zijlstra
2009-09-19 2:16 ` Mike Galbraith
2009-09-19 9:03 ` Mike Galbraith
2009-09-19 8:31 ` Avi Kivity [this message]
2009-09-20 21:03 ` Langsdorf, Mark
2009-09-20 21:55 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AB496ED.7010800@redhat.com \
--to=avi@redhat.com \
--cc=a.p.zijlstra@chello.nl \
--cc=efault@gmx.de \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.langsdorf@amd.com \
--cc=mingo@elte.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.