* [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
@ 2009-01-06 16:27 Dimitri Sivanich
2009-01-06 17:12 ` Greg KH
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Dimitri Sivanich @ 2009-01-06 16:27 UTC (permalink / raw)
To: linux-ia64, Tony Luck, Greg KH
Cc: linux-kernel, Peter Zijlstra, Gregory Haskins, Nick Piggin,
Tony Luck, Robin Holt
Turn on CONFIG_HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN.
SGI Altix has unsynchronized itc clocks. This results in rq->clock
occasionally being set to a time in the past by a remote cpu.
Note that it is possible that this problem may exist for other ia64
machines as well, based on the following comment for sched_clock() in
arch/ia64/kernel/head.S:
* Return a CPU-local timestamp in nano-seconds. This timestamp is
* NOT synchronized across CPUs its return value must never be
* compared against the values returned on another CPU. The usage in
* kernel/sched.c ensures that.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
---
Greg, if everyone is OK with this patch, this should also be applied
to all stable trees starting with 2.6.26.
arch/ia64/Kconfig | 1 +
1 file changed, 1 insertion(+)
Index: linux-2.6/arch/ia64/Kconfig
===================================================================
--- linux-2.6.orig/arch/ia64/Kconfig 2009-01-06 10:13:13.051918923 -0600
+++ linux-2.6/arch/ia64/Kconfig 2009-01-06 10:13:44.547856328 -0600
@@ -536,6 +536,7 @@ config IA64_MC_ERR_INJECT
config SGI_SN
def_bool y if (IA64_SGI_SN2 || IA64_GENERIC)
+ select HAVE_UNSTABLE_SCHED_CLOCK
config IA64_ESI
bool "ESI (Extensible SAL Interface) support"
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 16:27 [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems Dimitri Sivanich
@ 2009-01-06 17:12 ` Greg KH
2009-01-06 20:15 ` Luck, Tony
2009-01-15 18:48 ` Greg KH
2 siblings, 0 replies; 19+ messages in thread
From: Greg KH @ 2009-01-06 17:12 UTC (permalink / raw)
To: Dimitri Sivanich
Cc: linux-ia64, Tony Luck, linux-kernel, Peter Zijlstra,
Gregory Haskins, Nick Piggin, Tony Luck, Robin Holt
On Tue, Jan 06, 2009 at 10:27:41AM -0600, Dimitri Sivanich wrote:
> Turn on CONFIG_HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN.
>
> SGI Altix has unsynchronized itc clocks. This results in rq->clock
> occasionally being set to a time in the past by a remote cpu.
>
> Note that it is possible that this problem may exist for other ia64
> machines as well, based on the following comment for sched_clock() in
> arch/ia64/kernel/head.S:
>
> * Return a CPU-local timestamp in nano-seconds. This timestamp is
> * NOT synchronized across CPUs its return value must never be
> * compared against the values returned on another CPU. The usage in
> * kernel/sched.c ensures that.
>
>
> Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
>
> ---
>
> Greg, if everyone is OK with this patch, this should also be applied
> to all stable trees starting with 2.6.26.
That's fine (but 2.6.26 is no longer getting -stable releases).
Just let stable@kernel.org know when this patch is applied in Linus's
tree. You can do this automatically by adding a:
Cc: stable <stable@kernel.org>
to the signed-off-by: area of the patch, and the stable team will be
automatically notified when it goes into Linus's tree.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 16:27 [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems Dimitri Sivanich
2009-01-06 17:12 ` Greg KH
@ 2009-01-06 20:15 ` Luck, Tony
2009-01-06 20:19 ` Robin Holt
2009-01-15 18:48 ` Greg KH
2 siblings, 1 reply; 19+ messages in thread
From: Luck, Tony @ 2009-01-06 20:15 UTC (permalink / raw)
To: Dimitri Sivanich, linux-ia64@vger.kernel.org, Greg KH
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra, Gregory Haskins,
Nick Piggin, Tony Luck, Robin Holt
> Turn on CONFIG_HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN.
>
> SGI Altix has unsynchronized itc clocks. This results in rq->clock
> occasionally being set to a time in the past by a remote cpu.
SGI Altix is definitely the worst offender here. On heterogeneneous
systems with different model cpus across nodes the itc rate may differ
by hundreds of megahertz. Even when all cpus are at the same rated
speed, different nodes are driven from different crystals, so the itc
values will slowly drift apart (Linux discovers this from SAL and so
doesn't bother to synchronize them).
> Note that it is possible that this problem may exist for other ia64
> machines as well, based on the following comment for sched_clock() in
> arch/ia64/kernel/head.S:
>
> * Return a CPU-local timestamp in nano-seconds. This timestamp is
> * NOT synchronized across CPUs its return value must never be
> * compared against the values returned on another CPU. The usage in
> * kernel/sched.c ensures that.
When sched_clock() was first created this was part of its definition.
It meant that sched_clock could be simple & fast because it didn't
need to be synchronized across cpus.
It seems that new uses have grown for it that no longer allow that
flexibility.
All ia64 systems are potentially affected ... but perhaps you might
never see the problem on most because the itc clocks are synced as close
as s/w can get them when cpus are brought on line.
-Tony
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 20:15 ` Luck, Tony
@ 2009-01-06 20:19 ` Robin Holt
2009-01-06 20:34 ` Luck, Tony
0 siblings, 1 reply; 19+ messages in thread
From: Robin Holt @ 2009-01-06 20:19 UTC (permalink / raw)
To: Luck, Tony
Cc: Dimitri Sivanich, linux-ia64@vger.kernel.org, Greg KH,
linux-kernel@vger.kernel.org, Peter Zijlstra, Gregory Haskins,
Nick Piggin, Tony Luck, Robin Holt
On Tue, Jan 06, 2009 at 12:15:34PM -0800, Luck, Tony wrote:
> > Turn on CONFIG_HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN.
> >
> > SGI Altix has unsynchronized itc clocks. This results in rq->clock
> > occasionally being set to a time in the past by a remote cpu.
>
> SGI Altix is definitely the worst offender here. On heterogeneneous
> systems with different model cpus across nodes the itc rate may differ
> by hundreds of megahertz. Even when all cpus are at the same rated
> speed, different nodes are driven from different crystals, so the itc
> values will slowly drift apart (Linux discovers this from SAL and so
> doesn't bother to synchronize them).
>
> > Note that it is possible that this problem may exist for other ia64
> > machines as well, based on the following comment for sched_clock() in
> > arch/ia64/kernel/head.S:
> >
> > * Return a CPU-local timestamp in nano-seconds. This timestamp is
> > * NOT synchronized across CPUs its return value must never be
> > * compared against the values returned on another CPU. The usage in
> > * kernel/sched.c ensures that.
>
> When sched_clock() was first created this was part of its definition.
> It meant that sched_clock could be simple & fast because it didn't
> need to be synchronized across cpus.
>
> It seems that new uses have grown for it that no longer allow that
> flexibility.
>
> All ia64 systems are potentially affected ... but perhaps you might
> never see the problem on most because the itc clocks are synced as close
> as s/w can get them when cpus are brought on line.
Do you want Dimitri to resubmit with this set for all IA64 or leave it
as is?
Thanks,
Robin
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 20:19 ` Robin Holt
@ 2009-01-06 20:34 ` Luck, Tony
2009-01-06 20:57 ` Peter Zijlstra
0 siblings, 1 reply; 19+ messages in thread
From: Luck, Tony @ 2009-01-06 20:34 UTC (permalink / raw)
To: Robin Holt
Cc: Dimitri Sivanich, linux-ia64@vger.kernel.org, Greg KH,
linux-kernel@vger.kernel.org, Peter Zijlstra, Gregory Haskins,
Nick Piggin, Tony Luck
> > All ia64 systems are potentially affected ... but perhaps you might
> > never see the problem on most because the itc clocks are synced as close
> > as s/w can get them when cpus are brought on line.
>
> Do you want Dimitri to resubmit with this set for all IA64 or leave it
> as is?
I'd like to understand the impact of turning on HAVE_UNSTABLE_SCHED_CLOCK
It looks like both the i386_defconfig and x86_64_defconfig choose this,
so at least ia64 will be hitting the well tested code paths
Have the other architectures just not hit this yet? Or do they all have
"stable" sched_clock() functions?
sched_clock() seemed like such a straightforward thing to begin with. A
quick & easy way to measure a time delta ON THE SAME CPU. I'm not at
all sure why it has been co-opted for general time measurement.
-Tony
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 20:34 ` Luck, Tony
@ 2009-01-06 20:57 ` Peter Zijlstra
2009-01-06 22:50 ` Robin Holt
0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2009-01-06 20:57 UTC (permalink / raw)
To: Luck, Tony
Cc: Robin Holt, Dimitri Sivanich, linux-ia64@vger.kernel.org, Greg KH,
linux-kernel@vger.kernel.org, Gregory Haskins, Nick Piggin,
Tony Luck
On Tue, 2009-01-06 at 12:34 -0800, Luck, Tony wrote:
> > > All ia64 systems are potentially affected ... but perhaps you might
> > > never see the problem on most because the itc clocks are synced as close
> > > as s/w can get them when cpus are brought on line.
> >
> > Do you want Dimitri to resubmit with this set for all IA64 or leave it
> > as is?
>
> I'd like to understand the impact of turning on HAVE_UNSTABLE_SCHED_CLOCK
>
> It looks like both the i386_defconfig and x86_64_defconfig choose this,
> so at least ia64 will be hitting the well tested code paths
>
> Have the other architectures just not hit this yet? Or do they all have
> "stable" sched_clock() functions?
>
>
> sched_clock() seemed like such a straightforward thing to begin with. A
> quick & easy way to measure a time delta ON THE SAME CPU. I'm not at
> all sure why it has been co-opted for general time measurement.
It came from the complication of needing to tell a remote cpu's time due
to remote wakeups in the scheduler.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 20:57 ` Peter Zijlstra
@ 2009-01-06 22:50 ` Robin Holt
2009-01-06 23:16 ` Peter Zijlstra
0 siblings, 1 reply; 19+ messages in thread
From: Robin Holt @ 2009-01-06 22:50 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Luck, Tony, Robin Holt, Dimitri Sivanich,
linux-ia64@vger.kernel.org, Greg KH, linux-kernel@vger.kernel.org,
Gregory Haskins, Nick Piggin, Tony Luck
On Tue, Jan 06, 2009 at 09:57:20PM +0100, Peter Zijlstra wrote:
> On Tue, 2009-01-06 at 12:34 -0800, Luck, Tony wrote:
> > > > All ia64 systems are potentially affected ... but perhaps you might
> > > > never see the problem on most because the itc clocks are synced as close
> > > > as s/w can get them when cpus are brought on line.
> > >
> > > Do you want Dimitri to resubmit with this set for all IA64 or leave it
> > > as is?
> >
> > I'd like to understand the impact of turning on HAVE_UNSTABLE_SCHED_CLOCK
> >
> > It looks like both the i386_defconfig and x86_64_defconfig choose this,
> > so at least ia64 will be hitting the well tested code paths
> >
> > Have the other architectures just not hit this yet? Or do they all have
> > "stable" sched_clock() functions?
> >
> >
> > sched_clock() seemed like such a straightforward thing to begin with. A
> > quick & easy way to measure a time delta ON THE SAME CPU. I'm not at
> > all sure why it has been co-opted for general time measurement.
>
> It came from the complication of needing to tell a remote cpu's time due
> to remote wakeups in the scheduler.
But doesn't scheduler tick advance the rq->clock? Why do the others
need to fiddle with a remote runqueue's clock? When that cpu starts
taking ticks again, it will update it's rq->clock field and start the
processes. I guess I am a lot underinformed about the new scheduler
design.
Robin
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 22:50 ` Robin Holt
@ 2009-01-06 23:16 ` Peter Zijlstra
2009-01-07 3:00 ` Nick Piggin
0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2009-01-06 23:16 UTC (permalink / raw)
To: Robin Holt
Cc: Luck, Tony, Dimitri Sivanich, linux-ia64@vger.kernel.org, Greg KH,
linux-kernel@vger.kernel.org, Gregory Haskins, Nick Piggin,
Tony Luck
On Tue, 2009-01-06 at 16:50 -0600, Robin Holt wrote:
> On Tue, Jan 06, 2009 at 09:57:20PM +0100, Peter Zijlstra wrote:
> > On Tue, 2009-01-06 at 12:34 -0800, Luck, Tony wrote:
> > > > > All ia64 systems are potentially affected ... but perhaps you might
> > > > > never see the problem on most because the itc clocks are synced as close
> > > > > as s/w can get them when cpus are brought on line.
> > > >
> > > > Do you want Dimitri to resubmit with this set for all IA64 or leave it
> > > > as is?
> > >
> > > I'd like to understand the impact of turning on HAVE_UNSTABLE_SCHED_CLOCK
> > >
> > > It looks like both the i386_defconfig and x86_64_defconfig choose this,
> > > so at least ia64 will be hitting the well tested code paths
> > >
> > > Have the other architectures just not hit this yet? Or do they all have
> > > "stable" sched_clock() functions?
> > >
> > >
> > > sched_clock() seemed like such a straightforward thing to begin with. A
> > > quick & easy way to measure a time delta ON THE SAME CPU. I'm not at
> > > all sure why it has been co-opted for general time measurement.
> >
> > It came from the complication of needing to tell a remote cpu's time due
> > to remote wakeups in the scheduler.
>
> But doesn't scheduler tick advance the rq->clock? Why do the others
> need to fiddle with a remote runqueue's clock? When that cpu starts
> taking ticks again, it will update it's rq->clock field and start the
> processes. I guess I am a lot underinformed about the new scheduler
> design.
We try to do better than tick based time accounting these days.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 23:16 ` Peter Zijlstra
@ 2009-01-07 3:00 ` Nick Piggin
2009-01-07 3:16 ` Jack Steiner
2009-01-07 7:28 ` Peter Zijlstra
0 siblings, 2 replies; 19+ messages in thread
From: Nick Piggin @ 2009-01-07 3:00 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Robin Holt, Luck, Tony, Dimitri Sivanich,
linux-ia64@vger.kernel.org, Greg KH, linux-kernel@vger.kernel.org,
Gregory Haskins, Tony Luck
On Wed, Jan 07, 2009 at 12:16:03AM +0100, Peter Zijlstra wrote:
> >
> > But doesn't scheduler tick advance the rq->clock? Why do the others
> > need to fiddle with a remote runqueue's clock? When that cpu starts
> > taking ticks again, it will update it's rq->clock field and start the
> > processes. I guess I am a lot underinformed about the new scheduler
> > design.
>
> We try to do better than tick based time accounting these days.
But if you contain the drift to within one tick, it shouldn't be much
problem to just truncate negative deltas I would have thought? The
time between events on different CPUs is pretty fuzzy at the ns level
anyway, I think ;)
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-07 3:00 ` Nick Piggin
@ 2009-01-07 3:16 ` Jack Steiner
2009-01-07 7:28 ` Peter Zijlstra
1 sibling, 0 replies; 19+ messages in thread
From: Jack Steiner @ 2009-01-07 3:16 UTC (permalink / raw)
To: Nick Piggin
Cc: Peter Zijlstra, Robin Holt, Luck, Tony, Dimitri Sivanich,
linux-ia64@vger.kernel.org, Greg KH, linux-kernel@vger.kernel.org,
Gregory Haskins, Tony Luck
On Wed, Jan 07, 2009 at 04:00:30AM +0100, Nick Piggin wrote:
> On Wed, Jan 07, 2009 at 12:16:03AM +0100, Peter Zijlstra wrote:
> > >
> > > But doesn't scheduler tick advance the rq->clock? Why do the others
> > > need to fiddle with a remote runqueue's clock? When that cpu starts
> > > taking ticks again, it will update it's rq->clock field and start the
> > > processes. I guess I am a lot underinformed about the new scheduler
> > > design.
> >
> > We try to do better than tick based time accounting these days.
>
> But if you contain the drift to within one tick, it shouldn't be much
> problem to just truncate negative deltas I would have thought? The
> time between events on different CPUs is pretty fuzzy at the ns level
> anyway, I think ;)
Unfortunately, not possible on SGI IA64 systems. The cpus on different
nodes (blades) are not required to be the same steppings or core
frequencies. Core frequencies within the SSI can differ by hundreds of
MHz (~25%).
(I don't recall if we support systems with mixed madison & montecito
processors. If so, IIRC, the itc frequencies of these differ by 4X
for the same core frequency).
--- jack
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-07 3:00 ` Nick Piggin
2009-01-07 3:16 ` Jack Steiner
@ 2009-01-07 7:28 ` Peter Zijlstra
2009-01-07 7:40 ` Nick Piggin
2009-01-07 9:43 ` Robin Holt
1 sibling, 2 replies; 19+ messages in thread
From: Peter Zijlstra @ 2009-01-07 7:28 UTC (permalink / raw)
To: Nick Piggin
Cc: Robin Holt, Luck, Tony, Dimitri Sivanich,
linux-ia64@vger.kernel.org, Greg KH, linux-kernel@vger.kernel.org,
Gregory Haskins, Tony Luck
On Wed, 2009-01-07 at 04:00 +0100, Nick Piggin wrote:
> On Wed, Jan 07, 2009 at 12:16:03AM +0100, Peter Zijlstra wrote:
> > >
> > > But doesn't scheduler tick advance the rq->clock? Why do the others
> > > need to fiddle with a remote runqueue's clock? When that cpu starts
> > > taking ticks again, it will update it's rq->clock field and start the
> > > processes. I guess I am a lot underinformed about the new scheduler
> > > design.
> >
> > We try to do better than tick based time accounting these days.
>
> But if you contain the drift to within one tick, it shouldn't be much
> problem to just truncate negative deltas I would have thought? The
> time between events on different CPUs is pretty fuzzy at the ns level
> anyway, I think ;)
That's basically what the HAVE_UNSTABLE_SCHED_CLOCK code does. It takes
a tick timestamp and tries to improve on that by using strict per cpu
sched_clock() deltas.
What we do to obtain remote time, is basically calculate local time and
pull remote time fwd if that was behind.
While doing that, it filters out any backward motion and large fwd leaps
so as to stay no worse than a jiffie clock.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-07 7:28 ` Peter Zijlstra
@ 2009-01-07 7:40 ` Nick Piggin
2009-01-07 9:43 ` Robin Holt
1 sibling, 0 replies; 19+ messages in thread
From: Nick Piggin @ 2009-01-07 7:40 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Robin Holt, Luck, Tony, Dimitri Sivanich,
linux-ia64@vger.kernel.org, Greg KH, linux-kernel@vger.kernel.org,
Gregory Haskins, Tony Luck
On Wed, Jan 07, 2009 at 08:28:09AM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-07 at 04:00 +0100, Nick Piggin wrote:
> > On Wed, Jan 07, 2009 at 12:16:03AM +0100, Peter Zijlstra wrote:
> > > >
> > > > But doesn't scheduler tick advance the rq->clock? Why do the others
> > > > need to fiddle with a remote runqueue's clock? When that cpu starts
> > > > taking ticks again, it will update it's rq->clock field and start the
> > > > processes. I guess I am a lot underinformed about the new scheduler
> > > > design.
> > >
> > > We try to do better than tick based time accounting these days.
> >
> > But if you contain the drift to within one tick, it shouldn't be much
> > problem to just truncate negative deltas I would have thought? The
> > time between events on different CPUs is pretty fuzzy at the ns level
> > anyway, I think ;)
>
> That's basically what the HAVE_UNSTABLE_SCHED_CLOCK code does. It takes
> a tick timestamp and tries to improve on that by using strict per cpu
> sched_clock() deltas.
>
> What we do to obtain remote time, is basically calculate local time and
> pull remote time fwd if that was behind.
>
> While doing that, it filters out any backward motion and large fwd leaps
> so as to stay no worse than a jiffie clock.
OK, that's good. I guess the optimisations to remove that code should
have been called HAVE_STABLE_SCHED_CLOCK and have archs turn it on on
a case by case basis.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-07 7:28 ` Peter Zijlstra
2009-01-07 7:40 ` Nick Piggin
@ 2009-01-07 9:43 ` Robin Holt
2009-01-07 9:53 ` Peter Zijlstra
1 sibling, 1 reply; 19+ messages in thread
From: Robin Holt @ 2009-01-07 9:43 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Nick Piggin, Robin Holt, Luck, Tony, Dimitri Sivanich,
linux-ia64@vger.kernel.org, Greg KH, linux-kernel@vger.kernel.org,
Gregory Haskins, Tony Luck
On Wed, Jan 07, 2009 at 08:28:09AM +0100, Peter Zijlstra wrote:
> That's basically what the HAVE_UNSTABLE_SCHED_CLOCK code does. It takes
> a tick timestamp and tries to improve on that by using strict per cpu
> sched_clock() deltas.
>
> What we do to obtain remote time, is basically calculate local time and
> pull remote time fwd if that was behind.
But what happens when the remote clock then takes a single tick and
pulls forward a different remote clock by both your tick and its tick.
Multiply that by 4096 and I am starting to feel that time is likely to
go racing forward in a really random fashion.
> While doing that, it filters out any backward motion and large fwd leaps
> so as to stay no worse than a jiffie clock.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-07 9:43 ` Robin Holt
@ 2009-01-07 9:53 ` Peter Zijlstra
2009-01-07 13:32 ` Dimitri Sivanich
0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2009-01-07 9:53 UTC (permalink / raw)
To: Robin Holt
Cc: Nick Piggin, Luck, Tony, Dimitri Sivanich,
linux-ia64@vger.kernel.org, Greg KH, linux-kernel@vger.kernel.org,
Gregory Haskins, Tony Luck
On Wed, 2009-01-07 at 03:43 -0600, Robin Holt wrote:
> On Wed, Jan 07, 2009 at 08:28:09AM +0100, Peter Zijlstra wrote:
> > That's basically what the HAVE_UNSTABLE_SCHED_CLOCK code does. It takes
> > a tick timestamp and tries to improve on that by using strict per cpu
> > sched_clock() deltas.
> >
> > What we do to obtain remote time, is basically calculate local time and
> > pull remote time fwd if that was behind.
>
> But what happens when the remote clock then takes a single tick and
> pulls forward a different remote clock by both your tick and its tick.
> Multiply that by 4096 and I am starting to feel that time is likely to
> go racing forward in a really random fashion.
I'm having trouble parsing that...
Clock state is kept per-cpu, and locked with a spinlock. When we request
a remote time-stamp we lock both our local and the remote clock state,
compute our local time, and set both the remote and local clock to the
max of local,remote clock.
An interrupt on the remote cpu will spin for its clock state lock before
doing anything - also, the tick interrupt will never bother with remote
clock state.
So I'm not exactly seeing what could go wrong here.
Please read kernel/sched_clock.c, its only 266 lines and could use an
extra eyeball, I'd be glad to answer any questions.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-07 9:53 ` Peter Zijlstra
@ 2009-01-07 13:32 ` Dimitri Sivanich
2009-01-07 15:16 ` Peter Zijlstra
0 siblings, 1 reply; 19+ messages in thread
From: Dimitri Sivanich @ 2009-01-07 13:32 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Robin Holt, Nick Piggin, Luck, Tony, linux-ia64@vger.kernel.org,
Greg KH, linux-kernel@vger.kernel.org, Gregory Haskins, Tony Luck
Peter,
On Wed, Jan 07, 2009 at 10:53:57AM +0100, Peter Zijlstra wrote:
>
> Clock state is kept per-cpu, and locked with a spinlock. When we request
Admittedly I have not looked at this possibility too closely, but my initial concern upon looking at sched_clock_cpu() for the UNSTABLE case was the lock_double_clock() and what sort of contention that might cause on larger systems under certain conditions.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-07 13:32 ` Dimitri Sivanich
@ 2009-01-07 15:16 ` Peter Zijlstra
0 siblings, 0 replies; 19+ messages in thread
From: Peter Zijlstra @ 2009-01-07 15:16 UTC (permalink / raw)
To: Dimitri Sivanich
Cc: Robin Holt, Nick Piggin, Luck, Tony, linux-ia64@vger.kernel.org,
Greg KH, linux-kernel@vger.kernel.org, Gregory Haskins, Tony Luck
On Wed, 2009-01-07 at 07:32 -0600, Dimitri Sivanich wrote:
> Peter,
>
> On Wed, Jan 07, 2009 at 10:53:57AM +0100, Peter Zijlstra wrote:
> >
> > Clock state is kept per-cpu, and locked with a spinlock. When we request
>
> Admittedly I have not looked at this possibility too closely, but my
> initial concern upon looking at sched_clock_cpu() for the UNSTABLE
> case was the lock_double_clock() and what sort of contention that
> might cause on larger systems under certain conditions.
Similar contention would already exist on rq->lock.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-06 16:27 [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems Dimitri Sivanich
2009-01-06 17:12 ` Greg KH
2009-01-06 20:15 ` Luck, Tony
@ 2009-01-15 18:48 ` Greg KH
2009-01-15 19:21 ` Luck, Tony
2 siblings, 1 reply; 19+ messages in thread
From: Greg KH @ 2009-01-15 18:48 UTC (permalink / raw)
To: Dimitri Sivanich
Cc: linux-ia64, Tony Luck, linux-kernel, Peter Zijlstra,
Gregory Haskins, Nick Piggin, Tony Luck, Robin Holt
On Tue, Jan 06, 2009 at 10:27:41AM -0600, Dimitri Sivanich wrote:
> Turn on CONFIG_HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN.
>
> SGI Altix has unsynchronized itc clocks. This results in rq->clock
> occasionally being set to a time in the past by a remote cpu.
>
> Note that it is possible that this problem may exist for other ia64
> machines as well, based on the following comment for sched_clock() in
> arch/ia64/kernel/head.S:
>
> * Return a CPU-local timestamp in nano-seconds. This timestamp is
> * NOT synchronized across CPUs its return value must never be
> * compared against the values returned on another CPU. The usage in
> * kernel/sched.c ensures that.
>
>
> Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
>
> ---
>
> Greg, if everyone is OK with this patch, this should also be applied
> to all stable trees starting with 2.6.26.
What is the status of this going into Linus's tree?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-15 18:48 ` Greg KH
@ 2009-01-15 19:21 ` Luck, Tony
2009-01-22 19:04 ` Greg KH
0 siblings, 1 reply; 19+ messages in thread
From: Luck, Tony @ 2009-01-15 19:21 UTC (permalink / raw)
To: Greg KH, Dimitri Sivanich
Cc: linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org,
Peter Zijlstra, Gregory Haskins, Nick Piggin, Tony Luck,
Robin Holt
> > Greg, if everyone is OK with this patch, this should also be applied
> > to all stable trees starting with 2.6.26.
> What is the status of this going into Linus's tree?
I'm about to ask Linus to pull a modified version that does
a "select HAVE_UNSTABLE_SCHED_CLOCK" for all IA64 systems ...
SGI sched_clock() is a lot less "stable" than most, but all
ia64 systems can hit this as the ar.itc cycle counter is only
s/w synchronized between cpus.
-Tony
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems
2009-01-15 19:21 ` Luck, Tony
@ 2009-01-22 19:04 ` Greg KH
0 siblings, 0 replies; 19+ messages in thread
From: Greg KH @ 2009-01-22 19:04 UTC (permalink / raw)
To: Luck, Tony
Cc: Dimitri Sivanich, linux-ia64@vger.kernel.org,
linux-kernel@vger.kernel.org, Peter Zijlstra, Gregory Haskins,
Nick Piggin, Tony Luck, Robin Holt
On Thu, Jan 15, 2009 at 11:21:05AM -0800, Luck, Tony wrote:
> > > Greg, if everyone is OK with this patch, this should also be applied
> > > to all stable trees starting with 2.6.26.
>
> > What is the status of this going into Linus's tree?
>
> I'm about to ask Linus to pull a modified version that does
> a "select HAVE_UNSTABLE_SCHED_CLOCK" for all IA64 systems ...
> SGI sched_clock() is a lot less "stable" than most, but all
> ia64 systems can hit this as the ar.itc cycle counter is only
> s/w synchronized between cpus.
Thanks, I see it now and will queue it up for the next -stable releases.
greg k-h
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2009-01-22 19:37 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-06 16:27 [PATCH] configure HAVE_UNSTABLE_SCHED_CLOCK for SGI_SN systems Dimitri Sivanich
2009-01-06 17:12 ` Greg KH
2009-01-06 20:15 ` Luck, Tony
2009-01-06 20:19 ` Robin Holt
2009-01-06 20:34 ` Luck, Tony
2009-01-06 20:57 ` Peter Zijlstra
2009-01-06 22:50 ` Robin Holt
2009-01-06 23:16 ` Peter Zijlstra
2009-01-07 3:00 ` Nick Piggin
2009-01-07 3:16 ` Jack Steiner
2009-01-07 7:28 ` Peter Zijlstra
2009-01-07 7:40 ` Nick Piggin
2009-01-07 9:43 ` Robin Holt
2009-01-07 9:53 ` Peter Zijlstra
2009-01-07 13:32 ` Dimitri Sivanich
2009-01-07 15:16 ` Peter Zijlstra
2009-01-15 18:48 ` Greg KH
2009-01-15 19:21 ` Luck, Tony
2009-01-22 19:04 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox