From: Ingo Molnar <mingo@elte.hu>
To: David Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: Soft lockup regression from today's sched.git merge.
Date: Wed, 23 Apr 2008 11:40:07 +0200 [thread overview]
Message-ID: <20080423094007.GA23293@elte.hu> (raw)
In-Reply-To: <20080422.224212.32058127.davem@davemloft.net>
* David Miller <davem@davemloft.net> wrote:
> The code is fatally flawed, once __sync_cpu_clock() calls start
> happening, they will happen on every cpu_clock() call.
>
> So like my bisect showed from the get-go, these cpu_clock() changes
> have major problems, so it was quite a mind boggling stretch to stick
> a touch_softlockup_watchdog() call somewhere to try and fix this when
> the guilty change in question didn't touch that area at all.
> :-(
>
> Furthermore, this is an extremely expensive way to ensure monotonic
> per-rq timestamps. A global spinlock taken every 100000 ns on every
> cpu?!?! :-/
>
> At least move any implication of "high speed" from the comments above
> cpu_clock() if we're going to need something like this. I have 128
> cpus, that's 128 grabs of that spinlock every quantum. My next system
> I'm getting will have 256 cpus. The expense of your solution
> increases linearly with the number of cpus, which doesn't scale.
We're glad to have your feedback and will fix any and all bugs in this
code as fast as we can. Let me also defend the code, as there are two
factors you might not be aware of:
Firstly, cpu_clock() is only used in debug code, not in a fastpath.
Using a global lock there was a conscious choice; it's been in -mm for
two months and in linux-next for some time as well. I booted it on a
64-way box with nohz enabled and it wasn't a visible problem in
benchmarks.
The time_sync_thresh thing was indeed an afterthought, and it was indeed
buggy, but harmless to functionality. (That's why it went unnoticed
until you found the thinko - thanks for that.) The patches I sent today
should largely address it.
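To spell out the intent: the idea behind time_sync_thresh is that
cpu_clock() reads a fast per-CPU counter and falls back to the global
lock at most once per threshold interval, to keep the per-CPU values
roughly monotonic against each other. A simplified, single-threaded
model of that scheme (all names below are illustrative, not the actual
kernel/sched_clock.c code):

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 2
static const uint64_t time_sync_thresh = 100000;  /* ns */

static uint64_t prev_global_time;          /* protected by the "global lock" */
static uint64_t per_cpu_clock[NR_CPUS];    /* raw, possibly-drifting counters */
static uint64_t per_cpu_next_sync[NR_CPUS];
static unsigned long global_lock_taken;    /* counts slow-path entries */

static uint64_t cpu_clock(int cpu)
{
	uint64_t now = per_cpu_clock[cpu];

	if (now >= per_cpu_next_sync[cpu]) {
		/*
		 * Slow path, at most once per time_sync_thresh ns:
		 * take the global lock and clamp this CPU's clock to
		 * the global maximum so it never runs backwards
		 * relative to other CPUs.
		 */
		global_lock_taken++;
		if (now < prev_global_time)
			now = prev_global_time;
		else
			prev_global_time = now;
		per_cpu_clock[cpu] = now;
		per_cpu_next_sync[cpu] = now + time_sync_thresh;
	}
	return now;
}
```

A CPU whose raw counter lags behind gets pulled forward to the global
maximum; between sync points, reads stay purely per-CPU and lockless.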
Secondly, and perhaps more importantly, the nohz code _already_ uses a
very heavy lock in the idle path, and always has! It is exactly the
kind of global lock you complain about above, only worse: it's used in
the fastpath.
Every time irq_enter() is done on an idle CPU we do:
| static void tick_do_update_jiffies64(ktime_t now)
| {
| unsigned long ticks = 0;
| ktime_t delta;
|
| /* Reevalute with xtime_lock held */
| write_seqlock(&xtime_lock);
... that xtime_lock is a heavy global seqlock - its write side is just
as heavy as any global spinlock. That's the price we pay for global
jiffies and for shoddy drivers that rely on it.
This has been in the fastpath unconditionally (not in debug code), and
it was not reported as a problem before. The reason is that while this
global lock truly sucks, it burns an _idle_ CPU's time, which mutes
some of its overhead and makes it mostly "invisible" to any real
performance measurement.
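For anyone not familiar with why a seqlock's write side costs as much
as a spinlock: the writer takes an internal lock and makes the sequence
count odd for the duration of the update, while readers retry until
they see a stable even count. A toy single-threaded model (the real
seqlock also needs memory barriers, omitted here; names mirror the
kernel API but this is not the kernel's implementation):

```c
#include <assert.h>
#include <stdint.h>

struct seqlock {
	unsigned seq;   /* odd while a write is in progress */
	int locked;     /* stands in for the writer spinlock */
};

static struct seqlock xtime_lock;   /* the heavy global lock */
static uint64_t jiffies_64;         /* the value it protects */

static void write_seqlock(struct seqlock *sl)
{
	assert(!sl->locked);    /* every writer serializes here */
	sl->locked = 1;
	sl->seq++;              /* now odd: readers will retry */
}

static void write_sequnlock(struct seqlock *sl)
{
	sl->seq++;              /* even again: value is stable */
	sl->locked = 0;
}

static uint64_t get_jiffies_64(void)
{
	unsigned seq;
	uint64_t val;

	do {
		seq = xtime_lock.seq;   /* snapshot the sequence */
		val = jiffies_64;
		/* retry if a writer was or became active */
	} while ((seq & 1) || xtime_lock.seq != seq);
	return val;
}
```

Readers are cheap and lock-free, but every writer - every idle CPU
doing the jiffies update - contends on the same global lock.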
It's still not ideal because it slightly increases irq latency, so it
would be nice to improve on it - but it's nothing new and it's not
caused by these commits. Any ideas how to do it? I guess we could first
check jiffies unlocked - that would eliminate much of the polling that
goes on here. Something like the patch below. Hm?
Ingo
------------------------->
Subject: nohz: reduce jiffies polling overhead
From: Ingo Molnar <mingo@elte.hu>
Date: Wed Apr 23 11:05:09 CEST 2008
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/time/tick-sched.c | 7 +++++++
1 file changed, 7 insertions(+)
Index: linux/kernel/time/tick-sched.c
===================================================================
--- linux.orig/kernel/time/tick-sched.c
+++ linux/kernel/time/tick-sched.c
@@ -48,6 +48,13 @@ static void tick_do_update_jiffies64(kti
unsigned long ticks = 0;
ktime_t delta;
+ /*
+ * Do a quick check without holding xtime_lock:
+ */
+ delta = ktime_sub(now, last_jiffies_update);
+ if (delta.tv64 < tick_period.tv64)
+ return;
+
/* Reevalute with xtime_lock held */
write_seqlock(&xtime_lock);
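The patch is an instance of the classic double-checked pattern: a
cheap, lock-free delta check first, taking the global lock only when a
tick boundary may actually have passed. A standalone sketch of the same
logic (plain int64 nanoseconds standing in for ktime_t; the counter
names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

static int64_t last_jiffies_update;          /* ns of the last update */
static const int64_t tick_period = 1000000;  /* one tick = 1ms (HZ=1000) */
static uint64_t jiffies_64;
static unsigned long lock_taken;             /* counts slow-path entries */

static void do_update_jiffies64(int64_t now)
{
	int64_t delta = now - last_jiffies_update;

	/*
	 * Quick check without holding the lock: if less than one
	 * tick has passed, there is nothing to do and no lock to take.
	 */
	if (delta < tick_period)
		return;

	/*
	 * Slow path - this is where write_seqlock(&xtime_lock) would
	 * go. Re-evaluate under the lock, then advance jiffies.
	 */
	lock_taken++;
	delta = now - last_jiffies_update;
	if (delta >= tick_period) {
		uint64_t ticks = delta / tick_period;

		jiffies_64 += ticks;
		last_jiffies_update += (int64_t)ticks * tick_period;
	}
}
```

Most callers now bail out on the unlocked comparison; only the CPU that
actually crosses a tick boundary pays for the global lock.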