From: Milton Miller <miltonm@bga.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yong Zhang <yong.zhang0@gmail.com>,
Borislav Petkov <bp@amd64.org>, Borislav Petkov <bp@alien8.de>,
"mingo@redhat.com" <mingo@redhat.com>,
"hpa@zytor.com" <hpa@zytor.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"markus@trippelsdorf.de" <markus@trippelsdorf.de>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"mingo@elte.hu" <mingo@elte.hu>,
"linux-tip-commits@vger.kernel.org"
<linux-tip-commits@vger.kernel.org>
Subject: Re: [tip:sched/urgent] sched: Fix cross-cpu clock sync on remote wakeups
Date: Fri, 03 Jun 2011 04:57:21 -0500
Message-ID: <schedulerIPIirqEnter@mdm.bga.com>
In-Reply-To: <1307029711.2497.717.camel@laptop>
On Thu, 02 Jun 2011 about 15:48:31 -0000, Peter Zijlstra wrote:
> On Thu, 2011-06-02 at 22:23 +0800, Yong Zhang wrote:
> > On Thu, Jun 02, 2011 at 03:04:26PM +0200, Peter Zijlstra wrote:
> > > irq_enter() -> tick_check_idle() -> tick_check_nohz() ->
> > > tick_nohz_stop_idle() -> sched_clock_idle_wakeup_event()
> > >
> > > should update the thing before we run any isrs, right?
> >
> > Hmmm, you are right.
> >
> > But smp_reschedule_interrupt() doesn't call irq_enter()/irq_exit(),
> > is that correct?
>
> Crap.. you're right. And I bet other archs don't do that either. With
> NO_HZ you really need irq_enter() for pretty much all interrupts so I
> was assuming the resched IPI had it, but its been special and never
> really needed it. If it would wake an idle cpu the idle loop exit would
> deal with it, if it interrupted userspace the thing was running and
> NO_HZ wasn't relevant.
>
> Damn.
>
> And yes, the only reason I didn't see this on my dev box was because we
> do indeed set that sched_clock_stable thing on wsm. And I never noticed
> on my desktop because firefox/X/etc. consuming heaps of CPU isn't weird
> at all.
>
> Adding it to all resched int handlers is of course a possibility but
> would slow down the thing, although with the new code, most users are
> now indeed wakeups (excepting weird and wonderful users like KVM).
[me looks closely at patch and finds early return]
>
> We could of course add it in sched.c since the logic recurses just
> fine.. its not pretty though.. :/
>
> Thoughts?
Many architectures already have an irq_enter() because they have a single
interrupt to the cpu for all external causes, including software; they
do the irq_enter() before reading from the irq controller to learn the
reason for the interrupt. A quick glance at irq_enter() and irq_exit()
shows they will do several things twice when nested, even if that
is safe.
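
To make the nesting cost concrete, here is a toy user-space model, not
kernel code. Every name in it (irq_model, model_irq_enter and so on) is
made up for illustration, and the assumption that the entry and exit
bookkeeping runs on every call, nested or not, is only a stand-in for the
"several things twice" observation above.

```c
#include <assert.h>

/* Toy user-space model; none of these names are kernel APIs. */
struct irq_model {
	int depth;		/* current nesting level */
	int enter_work;		/* times the entry bookkeeping ran */
	int exit_work;		/* times the exit bookkeeping ran */
};

static void model_irq_enter(struct irq_model *m)
{
	m->enter_work++;	/* assumed: runs on every call, nested or not */
	m->depth++;
}

static void model_irq_exit(struct irq_model *m)
{
	assert(m->depth > 0);	/* every exit must pair with an enter */
	m->depth--;
	m->exit_work++;
}

/* Handler that brackets its own body, like the proposed scheduler_ipi(). */
static void model_handler(struct irq_model *m)
{
	model_irq_enter(m);
	/* ... wakeup work would go here ... */
	model_irq_exit(m);
}

/* Arch entry path that already entered irq context before dispatching. */
static void model_arch_entry_nested(struct irq_model *m)
{
	model_irq_enter(m);
	model_handler(m);
	model_irq_exit(m);
}

/* Arch entry path that calls the handler directly. */
static void model_arch_entry_direct(struct irq_model *m)
{
	model_handler(m);
}
```

In this model the nested path pays for the bookkeeping twice per
interrupt and the direct path pays once, which is the trade-off in
question.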

Are there really so many calls with an empty list that it makes
sense to avoid and optimize this on x86 while penalizing the several
architectures with a nested irq_enter() and irq_exit()? When it also
duplicates sched_ttwu_pending() (because it can't be common with the
additional tests)?

We said the perf mon callback (now irq_work) had to be under irq_enter().

Can we get some numbers for how often the two cases occur on a
variety of workloads?

milton
>
> ---
> kernel/sched.c | 18 +++++++++++++++++-
> 1 files changed, 17 insertions(+), 1 deletions(-)
>
>
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 2fe98ed..365ed6b 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -2554,7 +2554,23 @@ static void sched_ttwu_pending(void)
>
>  void scheduler_ipi(void)
>  {
> -	sched_ttwu_pending();
> +	struct rq *rq = this_rq();
> +	struct task_struct *list = xchg(&rq->wake_list, NULL);
> +
> +	if (!list)
> +		return;
> +
> +	irq_enter();
> +	raw_spin_lock(&rq->lock);
> +
> +	while (list) {
> +		struct task_struct *p = list;
> +		list = list->wake_entry;
> +		ttwu_do_activate(rq, p, 0);
> +	}
> +
> +	raw_spin_unlock(&rq->lock);
> +	irq_exit();
>  }
>
>  static void ttwu_queue_remote(struct task_struct *p, int cpu)
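
For readers who want to poke at the early-return shape of the quoted
patch outside the kernel, the sketch below models it in user space with
C11 atomics. It is an illustration only: struct task, struct runqueue,
queue_remote() and model_scheduler_ipi() are hypothetical names,
atomic_exchange() stands in for xchg(), the "bracketed" counter stands in
for the irq_enter()/irq_exit() pair, and the push helper is a generic
lock-free push, not a claim about how ttwu_queue_remote() is written.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and rq. */
struct task {
	struct task *wake_entry;	/* next pending task */
	int activated;			/* set once the task was "woken" */
};

struct runqueue {
	_Atomic(struct task *) wake_list;	/* pending remote wakeups */
	int bracketed;	/* times the costly enter/exit path was taken */
};

static void runqueue_init(struct runqueue *rq)
{
	atomic_init(&rq->wake_list, (struct task *)NULL);
	rq->bracketed = 0;
}

/* Generic lock-free push onto the pending list (assumed, see above). */
static void queue_remote(struct runqueue *rq, struct task *p)
{
	struct task *old = atomic_load(&rq->wake_list);

	do {
		p->wake_entry = old;
	} while (!atomic_compare_exchange_weak(&rq->wake_list, &old, p));
}

/* Returns the number of tasks activated by this "IPI". */
static int model_scheduler_ipi(struct runqueue *rq)
{
	/* Grab the whole list in one atomic step, leaving it empty. */
	struct task *list = atomic_exchange(&rq->wake_list,
					    (struct task *)NULL);
	int n = 0;

	if (!list)
		return 0;	/* early return: skip the costly bracket */

	rq->bracketed++;	/* stands in for irq_enter() ... irq_exit() */
	while (list) {
		struct task *p = list;

		list = list->wake_entry;
		p->activated = 1;	/* stands in for ttwu_do_activate() */
		n++;
	}
	return n;
}
```

Counting how often "bracketed" stays unchanged across calls is the
user-space analogue of the numbers asked for above: how many IPIs find
an empty list versus a populated one.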