All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: David Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: Soft lockup regression from today's sched.git merge.
Date: Wed, 23 Apr 2008 09:51:40 +0200	[thread overview]
Message-ID: <20080423075140.GA20980@elte.hu> (raw)
In-Reply-To: <20080422.224212.32058127.davem@davemloft.net>


* David Miller <davem@davemloft.net> wrote:

> From: Ingo Molnar <mingo@elte.hu>
> Date: Tue, 22 Apr 2008 11:14:56 +0200
> 
> > thanks for reporting it. I havent seen this false positive happen in a 
> > long time - but then again, PC CPUs are a lot less idle than a 128-CPU 
> > Niagara2 :-/ I'm wondering what the best method would be to provoke a 
> > CPU to stay idle that long - to make sure this bug is fixed.
> 
> I looked more closely at this.
> 
> There is no way the patch in question can work properly.
> 
> The algorithm is, essentialy "if time - prev_cpu_time is large enough, 
> call __sync_cpu_clock()" which if fine, except that nothing ever sets 
> prev_cpu_time.

you are right - and this causes us to hit that global spinlock every 
time cpu_clock() is called. Note that only debugging code uses 
cpu_clock() though (softlockup watchdog, blktrace, etc.) - but you are 
right that the performance bug should be fixed - the patch below fixes 
the bogosity.

the second patch below makes time_sync_thresh available to architecture 
code - that way, if your platform has a guaranteed-synchronous 
sched_clock(), you can set that to a larger value (or just -1LL for 
infinite).

this problem cannot explain the softlockup bug though.

	Ingo

---------------------------->
Subject: sched: fix cpu clock
From: Ingo Molnar <mingo@elte.hu>
Date: Wed Apr 23 09:24:06 CEST 2008

David Miller pointed it out that nothing in cpu_clock() sets
prev_cpu_time. This caused __sync_cpu_clock() to be called
all the time - against the intention of this code.

The result was that in practice we hit a global spinlock every
time cpu_clock() is called - which - even though cpu_clock()
is used for tracing and debugging, is suboptimal.

Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1001,6 +1001,8 @@ unsigned long long notrace cpu_clock(int
 	if (unlikely(delta_time > time_sync_thresh))
 		time = __sync_cpu_clock(time, cpu);
 
+	per_cpu(prev_cpu_time, cpu) = time;
+
 	return time;
 }
 EXPORT_SYMBOL_GPL(cpu_clock);

Subject: sched: make clock sync tunable by architecture code
From: Ingo Molnar <mingo@elte.hu>
Date: Wed Apr 23 09:31:35 CEST 2008

make time_sync_thresh tunable to architecture code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/sched.h |    2 ++
 kernel/sched.c        |    2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -159,6 +159,8 @@ print_cfs_rq(struct seq_file *m, int cpu
 }
 #endif
 
+extern unsigned long long time_sync_thresh;
+
 /*
  * Task state bitmask. NOTE! These bits are also
  * encoded in fs/proc/array.c: get_task_state().
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -924,7 +924,7 @@ static inline u64 global_rt_runtime(void
 	return (u64)sysctl_sched_rt_runtime * NSEC_PER_USEC;
 }
 
-static const unsigned long long time_sync_thresh = 100000;
+unsigned long long time_sync_thresh = 100000;
 
 static DEFINE_PER_CPU(unsigned long long, time_offset);
 static DEFINE_PER_CPU(unsigned long long, prev_cpu_time);

  parent reply	other threads:[~2008-04-23  7:52 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-22  8:59 Soft lockup regression from today's sched.git merge David Miller
2008-04-22  9:14 ` Ingo Molnar
2008-04-22 10:05   ` David Miller
2008-04-22 12:45     ` Peter Zijlstra
2008-05-06 22:41       ` Rafael J. Wysocki
2008-05-06 23:05         ` David Miller
2008-05-07  6:43           ` Ingo Molnar
2008-05-07 18:56             ` Rafael J. Wysocki
2008-04-23  8:50     ` [patch] softlockup: fix false positives on nohz if CPU is 100% idle for more than 60 seconds Ingo Molnar
2008-04-23 10:55       ` David Miller
2008-04-23 12:29         ` David Miller
2008-04-23 13:36           ` Ingo Molnar
2008-04-23 23:23             ` David Miller
2008-04-23  5:42   ` Soft lockup regression from today's sched.git merge David Miller
2008-04-23  7:32     ` Dhaval Giani
2008-04-23  7:51     ` Ingo Molnar [this message]
2008-04-23  9:40     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080423075140.GA20980@elte.hu \
    --to=mingo@elte.hu \
    --cc=a.p.zijlstra@chello.nl \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.