public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
@ 2007-04-11 16:29 Daniel Walker
  2007-04-11 20:31 ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Walker @ 2007-04-11 16:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, johnstul, tglx

The locking of the xtime_lock around the cpu notifier is unessesary now. At one
time the tsc was used after a frequency change for timekeeping, but the re-write
of timekeeping no longer uses the TSC unless the frequency is constant. 

The variables that are changed in this section of code had also once been used
for timekeeping, but not any longer ..

Signed-Off-By: Daniel Walker <dwalker@mvista.com>

---
 arch/i386/kernel/tsc.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

Index: linux-2.6.20/arch/i386/kernel/tsc.c
===================================================================
--- linux-2.6.20.orig/arch/i386/kernel/tsc.c
+++ linux-2.6.20/arch/i386/kernel/tsc.c
@@ -200,13 +200,10 @@ time_cpufreq_notifier(struct notifier_bl
 {
 	struct cpufreq_freqs *freq = data;
 
-	if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE)
-		write_seqlock_irq(&xtime_lock);
-
 	if (!ref_freq) {
 		if (!freq->old){
 			ref_freq = freq->new;
-			goto end;
+			return 0;
 		}
 		ref_freq = freq->old;
 		loops_per_jiffy_ref = cpu_data[freq->cpu].loops_per_jiffy;
@@ -237,9 +234,6 @@ time_cpufreq_notifier(struct notifier_bl
 			}
 		}
 	}
-end:
-	if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE)
-		write_sequnlock_irq(&xtime_lock);
 
 	return 0;
 }
-- 

-- 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-11 16:29 [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier Daniel Walker
@ 2007-04-11 20:31 ` Andrew Morton
  2007-04-11 20:54   ` Daniel Walker
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2007-04-11 20:31 UTC (permalink / raw)
  To: Daniel Walker; +Cc: linux-kernel, johnstul, tglx, Andi Kleen

On Wed, 11 Apr 2007 09:29:04 -0700
Daniel Walker <dwalker@mvista.com> wrote:

> The locking of the xtime_lock around the cpu notifier is unessesary now. At one
> time the tsc was used after a frequency change for timekeeping, but the re-write
> of timekeeping no longer uses the TSC unless the frequency is constant. 
> 
> The variables that are changed in this section of code had also once been used
> for timekeeping, but not any longer ..
> 
> Signed-Off-By: Daniel Walker <dwalker@mvista.com>
> 
> ---
>  arch/i386/kernel/tsc.c |    8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> Index: linux-2.6.20/arch/i386/kernel/tsc.c
> ===================================================================
> --- linux-2.6.20.orig/arch/i386/kernel/tsc.c
> +++ linux-2.6.20/arch/i386/kernel/tsc.c
> @@ -200,13 +200,10 @@ time_cpufreq_notifier(struct notifier_bl
>  {
>  	struct cpufreq_freqs *freq = data;
>  
> -	if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE)
> -		write_seqlock_irq(&xtime_lock);
> -
>  	if (!ref_freq) {
>  		if (!freq->old){
>  			ref_freq = freq->new;
> -			goto end;
> +			return 0;
>  		}
>  		ref_freq = freq->old;
>  		loops_per_jiffy_ref = cpu_data[freq->cpu].loops_per_jiffy;
> @@ -237,9 +234,6 @@ time_cpufreq_notifier(struct notifier_bl
>  			}
>  		}
>  	}
> -end:
> -	if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE)
> -		write_sequnlock_irq(&xtime_lock);
>  
>  	return 0;
>  }

hm.

I've been permadropping Andi's
ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current/patches/sched-clock-share
because it causes a lockup when initscripts start ondemand on my
single-CPU, CONFIG_SMP=n Vaio.

I don't know _why_ it locks up - I traced it down to the
write_seqlock_irq() which you have just removed.  But write_seqlock()
doesn't loop with CONFIG_SMP=n builds, so a hang there is quite mysterious.

Anyway, your patch might make that hang go away.  We'll see.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-11 20:31 ` Andrew Morton
@ 2007-04-11 20:54   ` Daniel Walker
  2007-04-11 21:33     ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Walker @ 2007-04-11 20:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, johnstul, tglx, Andi Kleen

On Wed, 2007-04-11 at 13:31 -0700, Andrew Morton wrote:
> On Wed, 11 Apr 2007 09:29:04 -0700
> Daniel Walker <dwalker@mvista.com> wrote:
> 
> > The locking of the xtime_lock around the cpu notifier is unessesary now. At one
> > time the tsc was used after a frequency change for timekeeping, but the re-write
> > of timekeeping no longer uses the TSC unless the frequency is constant. 
> > 
> > The variables that are changed in this section of code had also once been used
> > for timekeeping, but not any longer ..
> > 
> > Signed-Off-By: Daniel Walker <dwalker@mvista.com>
> > 
> > ---
> >  arch/i386/kernel/tsc.c |    8 +-------
> >  1 file changed, 1 insertion(+), 7 deletions(-)
> > 
> > Index: linux-2.6.20/arch/i386/kernel/tsc.c
> > ===================================================================
> > --- linux-2.6.20.orig/arch/i386/kernel/tsc.c
> > +++ linux-2.6.20/arch/i386/kernel/tsc.c
> > @@ -200,13 +200,10 @@ time_cpufreq_notifier(struct notifier_bl
> >  {
> >  	struct cpufreq_freqs *freq = data;
> >  
> > -	if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE)
> > -		write_seqlock_irq(&xtime_lock);
> > -
> >  	if (!ref_freq) {
> >  		if (!freq->old){
> >  			ref_freq = freq->new;
> > -			goto end;
> > +			return 0;
> >  		}
> >  		ref_freq = freq->old;
> >  		loops_per_jiffy_ref = cpu_data[freq->cpu].loops_per_jiffy;
> > @@ -237,9 +234,6 @@ time_cpufreq_notifier(struct notifier_bl
> >  			}
> >  		}
> >  	}
> > -end:
> > -	if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE)
> > -		write_sequnlock_irq(&xtime_lock);
> >  
> >  	return 0;
> >  }
> 
> hm.
> 
> I've been permadropping Andi's
> ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current/patches/sched-clock-share
> because it causes a lockup when initscripts start ondemand on my
> single-CPU, CONFIG_SMP=n Vaio.
> 
> I don't know _why_ it locks up - I traced it down to the
> write_seqlock_irq() which you have just removed.  But write_seqlock()
> doesn't loop with CONFIG_SMP=n builds, so a hang there is quite mysterious.
> 
> Anyway, your patch might make that hang go away.  We'll see.


I don't know to what extent this is relevant, but it's something I've
noticed ..

>From the patch above ,

+ */
+unsigned long long sched_clock(void)
+{
+	int cpu = get_cpu();
+	struct sc_data *sc = &per_cpu(sc_data, cpu);
+	unsigned long long r;
+
+	if (sc->instable) {
+		/* TBD find a cheaper fallback timer than this */
+		r = ktime_to_ns(ktime_get());
+	} else {
+		get_scheduled_cycles(r);
+		r = ((u64)sc->ns_base) + cycles_2_ns(cpu, r - sc->last_tsc);
+	}
+	put_cpu();
+	return r;
+}

Your VAIO is the "instable" case above I think .. So your using a case
that needs to be implemented still , I guess .. ktime_get() has a
peculiarity of recursively looping on the read seqlock on xtime_lock ..

Here is the call ordering ,

ktime_get()
 ktime_get_ts() -> read_seqretry(&xtime_lock, seq)
  getnstimeofday()
   __get_realtime_clock_ts() -> read_seqretry(&xtime_lock, seq)


I wonder if there is a weird case which case this to loop forever .. But
as said , it's just something I noticed so I don't know if it's
related .

Daniel


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-11 20:54   ` Daniel Walker
@ 2007-04-11 21:33     ` Andrew Morton
  2007-04-12  0:39       ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2007-04-11 21:33 UTC (permalink / raw)
  To: Daniel Walker; +Cc: linux-kernel, johnstul, tglx, Andi Kleen

On Wed, 11 Apr 2007 13:54:41 -0700
Daniel Walker <dwalker@mvista.com> wrote:

> On Wed, 2007-04-11 at 13:31 -0700, Andrew Morton wrote:
> > On Wed, 11 Apr 2007 09:29:04 -0700
> > Daniel Walker <dwalker@mvista.com> wrote:
> > 
> > > The locking of the xtime_lock around the cpu notifier is unessesary now. At one
> > > time the tsc was used after a frequency change for timekeeping, but the re-write
> > > of timekeeping no longer uses the TSC unless the frequency is constant. 
> > > 
> > > The variables that are changed in this section of code had also once been used
> > > for timekeeping, but not any longer ..
> > > 
> > > Signed-Off-By: Daniel Walker <dwalker@mvista.com>
> > > 
> > > ---
> > >  arch/i386/kernel/tsc.c |    8 +-------
> > >  1 file changed, 1 insertion(+), 7 deletions(-)
> > > 
> > > Index: linux-2.6.20/arch/i386/kernel/tsc.c
> > > ===================================================================
> > > --- linux-2.6.20.orig/arch/i386/kernel/tsc.c
> > > +++ linux-2.6.20/arch/i386/kernel/tsc.c
> > > @@ -200,13 +200,10 @@ time_cpufreq_notifier(struct notifier_bl
> > >  {
> > >  	struct cpufreq_freqs *freq = data;
> > >  
> > > -	if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE)
> > > -		write_seqlock_irq(&xtime_lock);
> > > -
> > >  	if (!ref_freq) {
> > >  		if (!freq->old){
> > >  			ref_freq = freq->new;
> > > -			goto end;
> > > +			return 0;
> > >  		}
> > >  		ref_freq = freq->old;
> > >  		loops_per_jiffy_ref = cpu_data[freq->cpu].loops_per_jiffy;
> > > @@ -237,9 +234,6 @@ time_cpufreq_notifier(struct notifier_bl
> > >  			}
> > >  		}
> > >  	}
> > > -end:
> > > -	if (val != CPUFREQ_RESUMECHANGE && val != CPUFREQ_SUSPENDCHANGE)
> > > -		write_sequnlock_irq(&xtime_lock);
> > >  
> > >  	return 0;
> > >  }
> > 
> > hm.
> > 
> > I've been permadropping Andi's
> > ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current/patches/sched-clock-share
> > because it causes a lockup when initscripts start ondemand on my
> > single-CPU, CONFIG_SMP=n Vaio.
> > 
> > I don't know _why_ it locks up - I traced it down to the
> > write_seqlock_irq() which you have just removed.  But write_seqlock()
> > doesn't loop with CONFIG_SMP=n builds, so a hang there is quite mysterious.
> > 
> > Anyway, your patch might make that hang go away.  We'll see.
> 
> 
> I don't know to what extent this is relevant, but it's something I've
> noticed ..
> 
> >From the patch above ,
> 
> + */
> +unsigned long long sched_clock(void)
> +{
> +	int cpu = get_cpu();
> +	struct sc_data *sc = &per_cpu(sc_data, cpu);
> +	unsigned long long r;
> +
> +	if (sc->instable) {
> +		/* TBD find a cheaper fallback timer than this */
> +		r = ktime_to_ns(ktime_get());
> +	} else {
> +		get_scheduled_cycles(r);
> +		r = ((u64)sc->ns_base) + cycles_2_ns(cpu, r - sc->last_tsc);
> +	}
> +	put_cpu();
> +	return r;
> +}
> 
> Your VAIO is the "instable" case above I think .. So your using a case
> that needs to be implemented still , I guess .. ktime_get() has a
> peculiarity of recursively looping on the read seqlock on xtime_lock ..
> 
> Here is the call ordering ,
> 
> ktime_get()
>  ktime_get_ts() -> read_seqretry(&xtime_lock, seq)
>   getnstimeofday()
>    __get_realtime_clock_ts() -> read_seqretry(&xtime_lock, seq)
> 
> 
> I wonder if there is a weird case which case this to loop forever .. But
> as said , it's just something I noticed so I don't know if it's
> related .
> 

hm.

Bear in mind that printk calls sched_clock() for each line of output. 
(with the "time" kernel boot parameter).

If we're doing a read_seqretry() in sched_clock() then bascially any printk
inside the write_seqlock() will cause a lockup.

So in fact, this explains my hang: I was debugging it with printk and I
noticed that the printk before the write_seqlock() came out and the one
after it did not.  Presumably if I wasn't using "time", that hang wouldn't
have happened.

Which means that I still don't have a clue why Andi's patch is locking up
the Vaio.

It's a bad idea to make sched_clock() this complex - we've gone and
degraded kernel debuggability somewhat.

We have provision for fixing this: the architecture can provide its own
printk_clock().  We should do something quick-n-dirty in printk_clock()
which doesn't require any locks.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-11 21:33     ` Andrew Morton
@ 2007-04-12  0:39       ` Andrew Morton
  2007-04-12  9:36         ` Andi Kleen
  2007-04-12 17:17         ` Andi Kleen
  0 siblings, 2 replies; 21+ messages in thread
From: Andrew Morton @ 2007-04-12  0:39 UTC (permalink / raw)
  To: Daniel Walker, linux-kernel, johnstul, tglx, Andi Kleen

On Wed, 11 Apr 2007 14:33:57 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> > Here is the call ordering ,
> > 
> > ktime_get()
> >  ktime_get_ts() -> read_seqretry(&xtime_lock, seq)
> >   getnstimeofday()
> >    __get_realtime_clock_ts() -> read_seqretry(&xtime_lock, seq)
> > 
> > 
> > I wonder if there is a weird case which case this to loop forever .. But
> > as said , it's just something I noticed so I don't know if it's
> > related .
> > 
> 
> hm.
> 
> Bear in mind that printk calls sched_clock() for each line of output. 
> (with the "time" kernel boot parameter).
> 
> If we're doing a read_seqretry() in sched_clock() then bascially any printk
> inside the write_seqlock() will cause a lockup.
> 
> So in fact, this explains my hang: I was debugging it with printk and I
> noticed that the printk before the write_seqlock() came out and the one
> after it did not.  Presumably if I wasn't using "time", that hang wouldn't
> have happened.
> 
> Which means that I still don't have a clue why Andi's patch is locking up
> the Vaio.
> 
> It's a bad idea to make sched_clock() this complex - we've gone and
> degraded kernel debuggability somewhat.
> 
> We have provision for fixing this: the architecture can provide its own
> printk_clock().  We should do something quick-n-dirty in printk_clock()
> which doesn't require any locks.
> 

OK, so I resurrected x86_64-mm-sched-clock-share.patch and
x86_64-mm-sched-clock64.patch.  The x86_64 box hangs on boot when using
netconsole and printk timestamps too.  Removing "time" from the kernel boot
command line prevents that.

This explains why the hang only happens with
x86_64-mm-log-reason-why-tsc-was-marked-unstable.patch applied, too: that
patch must be triggering a printk inside xtime_lock.

Does someone want to cook up a lockless printk_clock() for i386 and x86_64?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12  0:39       ` Andrew Morton
@ 2007-04-12  9:36         ` Andi Kleen
  2007-04-12 16:23           ` Andrew Morton
  2007-04-12 17:17         ` Andi Kleen
  1 sibling, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2007-04-12  9:36 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Daniel Walker, linux-kernel, johnstul, tglx


> OK, so I resurrected x86_64-mm-sched-clock-share.patch and
> x86_64-mm-sched-clock64.patch.  The x86_64 box hangs on boot when using
> netconsole and printk timestamps too.  Removing "time" from the kernel boot
> command line prevents that.

Ah. But ktime_get shouldn't printk. Or did you change that?

> 
> This explains why the hang only happens with
> x86_64-mm-log-reason-why-tsc-was-marked-unstable.patch applied, too: that
> patch must be triggering a printk inside xtime_lock.
> 
> Does someone want to cook up a lockless printk_clock() for i386 and x86_64?

Just use jiffies directly in printk. That's only HZ accurate, but should
be good enough for printk.  

One could use pure monotonic xtime as fallback instead of ktime_get in sched_clock. 
The trouble is  just that they might cause sched_clock to go backwards during 
a temporary instability period (cpufreq change) because the xtime will be 
always a bit behind the TSC and a TSC->xtime conversion will lose time.
At least the scheduler doesn't handle backwards time warp on a CPU gratefully.
Ok I guess it could  return max(last_value_before_instability, xtime) 

-Andi



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12  9:36         ` Andi Kleen
@ 2007-04-12 16:23           ` Andrew Morton
  2007-04-12 16:45             ` Andi Kleen
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2007-04-12 16:23 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Daniel Walker, linux-kernel, johnstul, tglx

On Thu, 12 Apr 2007 11:36:02 +0200 Andi Kleen <ak@novell.com> wrote:

> 
> > OK, so I resurrected x86_64-mm-sched-clock-share.patch and
> > x86_64-mm-sched-clock64.patch.  The x86_64 box hangs on boot when using
> > netconsole and printk timestamps too.  Removing "time" from the kernel boot
> > command line prevents that.
> 
> Ah. But ktime_get shouldn't printk. Or did you change that?

I didn't change anything.

If we change printk() to do a read_seqretry(xtime_lock) (as your patches
apparently do), then any printk() inside write_seqlock(xtime_lock) will
hang.

> > 
> > This explains why the hang only happens with
> > x86_64-mm-log-reason-why-tsc-was-marked-unstable.patch applied, too: that
> > patch must be triggering a printk inside xtime_lock.
> > 
> > Does someone want to cook up a lockless printk_clock() for i386 and x86_64?
> 
> Just use jiffies directly in printk. That's only HZ accurate, but should
> be good enough for printk.  

Bit sad.  printk timestamping was originally implemented as a way of
observing and measuring bootup delays.  It seems pretty popular now and
probably quite a few people like high resolution on it.

> One could use pure monotonic xtime as fallback instead of ktime_get in sched_clock. 
> The trouble is  just that they might cause sched_clock to go backwards during 
> a temporary instability period (cpufreq change) because the xtime will be 
> always a bit behind the TSC and a TSC->xtime conversion will lose time.
> At least the scheduler doesn't handle backwards time warp on a CPU gratefully.
> Ok I guess it could  return max(last_value_before_instability, xtime) 
> 

I wasn't proposing any change in sched_clock().

I was proposing that i386 and x86_64 be given a new, lockless,
high-resolution printk_clock().  Presently x86 uses the default
printk_clock(), which uses sched_clock().  Presumably copying the
pre-x86_64-mm-sched-clock-share.patch version of sched_clock() into
printk_clock() will suffice.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 16:23           ` Andrew Morton
@ 2007-04-12 16:45             ` Andi Kleen
  2007-04-12 17:00               ` Andrew Morton
  2007-04-12 17:41               ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 21+ messages in thread
From: Andi Kleen @ 2007-04-12 16:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Daniel Walker, linux-kernel, johnstul, tglx


> I was proposing that i386 and x86_64 be given a new, lockless,
> high-resolution printk_clock().  Presently x86 uses the default
> printk_clock(), which uses sched_clock().  Presumably copying the
> pre-x86_64-mm-sched-clock-share.patch version of sched_clock() into
> printk_clock() will suffice.

Ok. I think it's better to just fix sched_clock() again than to
add another one.  I can probably
eliminate the ktime_get() and use something based on jiffies. That will
be inaccurate for the instable case of course.

I will do that later today.

-Andi




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 16:45             ` Andi Kleen
@ 2007-04-12 17:00               ` Andrew Morton
  2007-04-12 17:43                 ` Jeremy Fitzhardinge
  2007-04-12 17:41               ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2007-04-12 17:00 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Daniel Walker, linux-kernel, johnstul, tglx

On Thu, 12 Apr 2007 18:45:39 +0200 Andi Kleen <ak@novell.com> wrote:

> 
> > I was proposing that i386 and x86_64 be given a new, lockless,
> > high-resolution printk_clock().  Presently x86 uses the default
> > printk_clock(), which uses sched_clock().  Presumably copying the
> > pre-x86_64-mm-sched-clock-share.patch version of sched_clock() into
> > printk_clock() will suffice.
> 
> Ok. I think it's better to just fix sched_clock() again than to
> add another one.  I can probably
> eliminate the ktime_get() and use something based on jiffies. That will
> be inaccurate for the instable case of course.

hm.  People (ab)use sched_clock() for all sorts of things nowadays.  I wouldn't
do anything to degrade it just on behalf of printk-timestamping.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12  0:39       ` Andrew Morton
  2007-04-12  9:36         ` Andi Kleen
@ 2007-04-12 17:17         ` Andi Kleen
  1 sibling, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2007-04-12 17:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Daniel Walker, linux-kernel, johnstul, tglx


> OK, so I resurrected x86_64-mm-sched-clock-share.patch and
> x86_64-mm-sched-clock64.patch.  The x86_64 box hangs on boot when using
> netconsole and printk timestamps too.  Removing "time" from the kernel boot
> command line prevents that.

It's not worse than before for you CPU. The old code did 

#ifndef CONFIG_NUMA
        if (!cpu_khz || check_tsc_unstable())
#endif
                /* no locking but a rare wrong value is not a big deal */
                return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ);


So if it worked before it still will.

> This explains why the hang only happens with
> x86_64-mm-log-reason-why-tsc-was-marked-unstable.patch applied, too: that
> patch must be triggering a printk inside xtime_lock.
> 
> Does someone want to cook up a lockless printk_clock() for i386 and x86_64?

With the fall back case changed back to jiffies sched_clock will be fully lockless.

-Andi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 16:45             ` Andi Kleen
  2007-04-12 17:00               ` Andrew Morton
@ 2007-04-12 17:41               ` Jeremy Fitzhardinge
  2007-04-12 17:45                 ` Andi Kleen
  1 sibling, 1 reply; 21+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-12 17:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Daniel Walker, linux-kernel, johnstul, tglx

Andi Kleen wrote:
> Ok. I think it's better to just fix sched_clock() again than to
> add another one.  I can probably
> eliminate the ktime_get() and use something based on jiffies. That will
> be inaccurate for the instable case of course.
>
> I will do that later today.

sched_clock seems a bit weird to use.  In the pv_ops world, it only
counts unstolen time, and it is therefore inherently per-cpu.  The
timestamps should be at least system-wide monotonic.

    J

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 17:00               ` Andrew Morton
@ 2007-04-12 17:43                 ` Jeremy Fitzhardinge
  2007-04-12 17:46                   ` Andi Kleen
                                     ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-12 17:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andi Kleen, Daniel Walker, linux-kernel, johnstul, tglx

Andrew Morton wrote:
> hm.  People (ab)use sched_clock() for all sorts of things nowadays.  I wouldn't
> do anything to degrade it just on behalf of printk-timestamping.
>   

printk was the only non-scheduler-ish use I could find.  Are there others?

    J

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 17:41               ` Jeremy Fitzhardinge
@ 2007-04-12 17:45                 ` Andi Kleen
  2007-04-12 19:46                   ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2007-04-12 17:45 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Daniel Walker, linux-kernel, johnstul, tglx

On Thursday 12 April 2007 19:41:51 Jeremy Fitzhardinge wrote:
> Andi Kleen wrote:
> > Ok. I think it's better to just fix sched_clock() again than to
> > add another one.  I can probably
> > eliminate the ktime_get() and use something based on jiffies. That will
> > be inaccurate for the instable case of course.
> >
> > I will do that later today.
> 
> sched_clock seems a bit weird to use.  In the pv_ops world, it only
> counts unstolen time, and it is therefore inherently per-cpu.  The
> timestamps should be at least system-wide monotonic.

Even on real hardware it's also per CPU, although the errors
are usually not big. At least the scheduler deals with that by
only ever comparing time stamps from the same CPU.

If you have big deviations between CPUs then it might cause problems
for non scheduler uses. I guess printk_clock is not critical, but
it might be a little confusing.

-Andi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 17:43                 ` Jeremy Fitzhardinge
@ 2007-04-12 17:46                   ` Andi Kleen
  2007-04-12 17:52                   ` Daniel Walker
  2007-04-12 17:55                   ` Andrew Morton
  2 siblings, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2007-04-12 17:46 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Daniel Walker, linux-kernel, johnstul, tglx

On Thursday 12 April 2007 19:43:03 Jeremy Fitzhardinge wrote:
> Andrew Morton wrote:
> > hm.  People (ab)use sched_clock() for all sorts of things nowadays.  I wouldn't
> > do anything to degrade it just on behalf of printk-timestamping.
> >   
> 
> printk was the only non-scheduler-ish use I could find.  Are there others?

iirc some of the out of tree tracing toolkits used it at least.

-Andi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 17:43                 ` Jeremy Fitzhardinge
  2007-04-12 17:46                   ` Andi Kleen
@ 2007-04-12 17:52                   ` Daniel Walker
  2007-04-12 17:55                   ` Andrew Morton
  2 siblings, 0 replies; 21+ messages in thread
From: Daniel Walker @ 2007-04-12 17:52 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Andi Kleen, linux-kernel, johnstul, tglx

On Thu, 2007-04-12 at 10:43 -0700, Jeremy Fitzhardinge wrote:
> Andrew Morton wrote:
> > hm.  People (ab)use sched_clock() for all sorts of things nowadays.  I wouldn't
> > do anything to degrade it just on behalf of printk-timestamping.
> >   
> 
> printk was the only non-scheduler-ish use I could find.  Are there others?
> 
>     J

There is some usage in block/blktrace.c (which I've not seen before..)

Daniel


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 17:43                 ` Jeremy Fitzhardinge
  2007-04-12 17:46                   ` Andi Kleen
  2007-04-12 17:52                   ` Daniel Walker
@ 2007-04-12 17:55                   ` Andrew Morton
  2007-04-12 18:27                     ` Andi Kleen
  2007-04-12 19:43                     ` Jeremy Fitzhardinge
  2 siblings, 2 replies; 21+ messages in thread
From: Andrew Morton @ 2007-04-12 17:55 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andi Kleen, Daniel Walker, linux-kernel, johnstul, tglx

On Thu, 12 Apr 2007 10:43:03 -0700
Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Andrew Morton wrote:
> > hm.  People (ab)use sched_clock() for all sorts of things nowadays.  I wouldn't
> > do anything to degrade it just on behalf of printk-timestamping.
> >   
> 
> printk was the only non-scheduler-ish use I could find.  Are there others?
> 

blktrace.  I've seen a couple of trace/debug-style things which use
sched_clock for timestamping event collection. I think lttng does exotic
things with TSCs, performing private skew correction, although that might
have changed now.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 17:55                   ` Andrew Morton
@ 2007-04-12 18:27                     ` Andi Kleen
  2007-04-12 19:41                       ` Jeremy Fitzhardinge
  2007-04-12 19:43                     ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2007-04-12 18:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jeremy Fitzhardinge, Daniel Walker, linux-kernel, johnstul, tglx

On Thursday 12 April 2007 19:55:59 Andrew Morton wrote:
> On Thu, 12 Apr 2007 10:43:03 -0700
> Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> 
> > Andrew Morton wrote:
> > > hm.  People (ab)use sched_clock() for all sorts of things nowadays.  I wouldn't
> > > do anything to degrade it just on behalf of printk-timestamping.
> > >   
> > 
> > printk was the only non-scheduler-ish use I could find.  Are there others?
> > 
> 
> blktrace.  I've seen a couple of trace/debug-style things which use
> sched_clock for timestamping event collection. I think lttng does exotic
> things with TSCs, performing private skew correction, although that might
> have changed now.

They should always just store the cpu too and educate the users that
only (cpu, timestamp) pairs make sense to compare.

That said at least my new sched_clock should not normally show 
large non differences between CPUs, so it can be usually ignored, but they can 
happen. I believe some of the already existing sched_clocks() (like the
one used on Altix) have the same property. 

But on VMI/Xen as currently implemented the differences will be large.

-Andi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 18:27                     ` Andi Kleen
@ 2007-04-12 19:41                       ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 21+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-12 19:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Daniel Walker, linux-kernel, johnstul, tglx

Andi Kleen wrote:
> They should always just store the cpu too and educate the users that
> only (cpu, timestamp) pairs make sense to compare.
>
> That said at least my new sched_clock should not normally show 
> large non differences between CPUs, so it can be usually ignored, but they can 
> happen. I believe some of the already existing sched_clocks() (like the
> one used on Altix) have the same property. 
>
> But on VMI/Xen as currently implemented the differences will be large.

Yes, it's pretty much unavoidable.  It seems to me that these
non-scheduler uses of sched_clock should use something else.  Or we
could add a new clock function for the scheduler to use, but
"sched_clock" seems like the best name for it.

    J

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 17:55                   ` Andrew Morton
  2007-04-12 18:27                     ` Andi Kleen
@ 2007-04-12 19:43                     ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 21+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-12 19:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andi Kleen, Daniel Walker, linux-kernel, johnstul, tglx

Andrew Morton wrote:
> blktrace.  I've seen a couple of trace/debug-style things which use
> sched_clock for timestamping event collection. I think lttng does exotic
> things with TSCs, performing private skew correction, although that might
> have changed now.
>   

I'm not worried about things using the tsc directly.  But since we hook
into sched_clock, anyone using it is going to get the pv_ops backend's
behaviour, which is targeted at making the scheduler work properly
rather than all these secondary uses.

    J

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 17:45                 ` Andi Kleen
@ 2007-04-12 19:46                   ` Jeremy Fitzhardinge
  2007-04-12 20:15                     ` Andi Kleen
  0 siblings, 1 reply; 21+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-12 19:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Daniel Walker, linux-kernel, johnstul, tglx

Andi Kleen wrote:
> Even on real hardware it's also per CPU, although the errors
> are usually not big. At least the scheduler deals with that by
> only ever comparing time stamps from the same CPU.
>   

Well, it uses sched_clock to measure how long something has been asleep,
which is inherently non-per-cpu.  But it tries to keep a measure of the
skew between the various runqueue's sched_clocks, so the error doesn't
seem to get too large.

> If you have big deviations between CPUs then it might cause problems
> for non scheduler uses. I guess printk_clock is not critical, but
> it might be a little confusing.

They could be huge differences - unbounded, in fact.  It would make
printk fairly hard to interpret,  I would think.  The only benefit to
using sched_clock in printk is that if you're using it to work out the
startup latencies you won't be confused by stolen time.  But I think
that's a fairly small benefit compared to the disadvantage of not being
able to meaningfully compare the timestamps on two printk messages.

    J

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
  2007-04-12 19:46                   ` Jeremy Fitzhardinge
@ 2007-04-12 20:15                     ` Andi Kleen
  0 siblings, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2007-04-12 20:15 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Daniel Walker, linux-kernel, johnstul, tglx


> They could be huge differences - unbounded, in fact.  It would make
> printk fairly hard to interpret,  I would think.  The only benefit to
> using sched_clock in printk is that if you're using it to work out the
> startup latencies you won't be confused by stolen time.  But I think
> that's a fairly small benefit compared to the disadvantage of not being
> able to meaningfully compare the timestamps on two printk messages.

Ok so the right solution would be a separate printk_clock() that is
implemented as the native sched_clock() even on Xen/VMI.  Should be a SMOP.

-Andi

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2007-04-12 20:15 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-04-11 16:29 [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier Daniel Walker
2007-04-11 20:31 ` Andrew Morton
2007-04-11 20:54   ` Daniel Walker
2007-04-11 21:33     ` Andrew Morton
2007-04-12  0:39       ` Andrew Morton
2007-04-12  9:36         ` Andi Kleen
2007-04-12 16:23           ` Andrew Morton
2007-04-12 16:45             ` Andi Kleen
2007-04-12 17:00               ` Andrew Morton
2007-04-12 17:43                 ` Jeremy Fitzhardinge
2007-04-12 17:46                   ` Andi Kleen
2007-04-12 17:52                   ` Daniel Walker
2007-04-12 17:55                   ` Andrew Morton
2007-04-12 18:27                     ` Andi Kleen
2007-04-12 19:41                       ` Jeremy Fitzhardinge
2007-04-12 19:43                     ` Jeremy Fitzhardinge
2007-04-12 17:41               ` Jeremy Fitzhardinge
2007-04-12 17:45                 ` Andi Kleen
2007-04-12 19:46                   ` Jeremy Fitzhardinge
2007-04-12 20:15                     ` Andi Kleen
2007-04-12 17:17         ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox