[MIPS]clocks_calc_mult_shift() may gen a too big mult value

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [MIPS]clocks_calc_mult_shift() may gen a too big mult value
@ 2011-10-31  9:00 Chen Jie
  2011-10-31  9:20 ` Yong Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Chen Jie @ 2011-10-31  9:00 UTC (permalink / raw)
  To: linux-mips, LKML
  Cc: johnstul, tglx, yanhua, 项宇, zhangfx,
	孙海勇

[-- Attachment #1: Type: text/plain, Size: 965 bytes --]

Hi all,

On MIPS, with maxsec=4, clocks_calc_mult_shift() may generate a very
big mult, which may easily cause timekeeper.mult overflow within
timekeeping jobs.

e.g. when clock freq was 250000500(i.e. mips_hpt_frequency=250000500,
and the CPU Freq will be 250000500*2=500001000), mult will be
0xffffde72

Attachment is a script that calculates mult values for CPU Freq
between 400050000 and 500050000, with 1KHz step. It outputs mult
values greater than 0xf0000000:
CPU Freq:500001000, mult:0xffffde72, shift:30
CPU Freq:500002000, mult:0xffffbce4, shift:30
CPU Freq:500003000, mult:0xffff9b56, shift:30
CPU Freq:500004000, mult:0xffff79c9, shift:30
...

The peak value appears around CPU_freq=500001000.

To avoid this, it may need:
1. Supply a bigger maxsec value?
2. In clocks_calc_mult_shift(), pick next mult/shift pair if mult is
too big? Then maxsec will not be strictly obeyed.
3. Change timekeeper.mult to u64?
4. ...

Any idea?

--
Regards,
- Chen Jie

[-- Attachment #2: mult-test.py --]
[-- Type: text/x-python, Size: 511 bytes --]

#!/bin/env python

def clocks_calc_mult_shift(from_, to_, maxsec):
	sftacc = 32;

	tmp = maxsec * from_ >> 32;
	while tmp:
		tmp >>= 1
		sftacc -= 1

	for sft in xrange(32, 0, -1):
		tmp = to_ << sft;
		tmp += (from_ / 2)
		tmp /= from_
		if ((tmp >> sftacc) == 0):
			break

	mult = tmp
	shift = sft
	return mult, shift 

for i in xrange(400050000, 500050000, 1000):
	mult, shift = clocks_calc_mult_shift(i/2, 1000000000, 4)
	if mult > 0xf0000000:
		print "CPU Freq:%d, mult:0x%x, shift:%d" % (i, mult, shift)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [MIPS]clocks_calc_mult_shift() may gen a too big mult value
  2011-10-31  9:00 [MIPS]clocks_calc_mult_shift() may gen a too big mult value Chen Jie
@ 2011-10-31  9:20 ` Yong Zhang
  2011-10-31 10:48   ` Chen Jie
  0 siblings, 1 reply; 8+ messages in thread
From: Yong Zhang @ 2011-10-31  9:20 UTC (permalink / raw)
  To: Chen Jie
  Cc: linux-mips, LKML, johnstul, tglx, yanhua, 项宇,
	zhangfx, 孙海勇

On Mon, Oct 31, 2011 at 5:00 PM, Chen Jie <chenj@lemote.com> wrote:
> Hi all,
>
> On MIPS, with maxsec=4, clocks_calc_mult_shift() may generate a very
> big mult, which may easily cause timekeeper.mult overflow within
> timekeeping jobs.

Hmmm, why not use clocksource_register_hz()/clocksource_register_khz()
instead? it's more convenient.

Thanks,
Yong

>
> e.g. when clock freq was 250000500(i.e. mips_hpt_frequency=250000500,
> and the CPU Freq will be 250000500*2=500001000), mult will be
> 0xffffde72
>
> Attachment is a script that calculates mult values for CPU Freq
> between 400050000 and 500050000, with 1KHz step. It outputs mult
> values greater than 0xf0000000:
> CPU Freq:500001000, mult:0xffffde72, shift:30
> CPU Freq:500002000, mult:0xffffbce4, shift:30
> CPU Freq:500003000, mult:0xffff9b56, shift:30
> CPU Freq:500004000, mult:0xffff79c9, shift:30
> ...
>
> The peak value appears around CPU_freq=500001000.
>
> To avoid this, it may need:
> 1. Supply a bigger maxsec value?
> 2. In clocks_calc_mult_shift(), pick next mult/shift pair if mult is
> too big? Then maxsec will not be strictly obeyed.
> 3. Change timekeeper.mult to u64?
> 4. ...
>
> Any idea?
>
>
>
> --
> Regards,
> - Chen Jie
>



-- 
Only stand for myself

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [MIPS]clocks_calc_mult_shift() may gen a too big mult value
  2011-10-31  9:20 ` Yong Zhang
@ 2011-10-31 10:48   ` Chen Jie
  2011-10-31 13:03     ` John Stultz
  0 siblings, 1 reply; 8+ messages in thread
From: Chen Jie @ 2011-10-31 10:48 UTC (permalink / raw)
  To: Yong Zhang
  Cc: linux-mips, LKML, johnstul, tglx, yanhua, 项宇,
	zhangfx, 孙海勇

Hi,

2011/10/31 Yong Zhang <yong.zhang0@gmail.com>:
> On Mon, Oct 31, 2011 at 5:00 PM, Chen Jie <chenj@lemote.com> wrote:
>> Hi all,
>>
>> On MIPS, with maxsec=4, clocks_calc_mult_shift() may generate a very
>> big mult, which may easily cause timekeeper.mult overflow within
>> timekeeping jobs.
>
> Hmmm, why not use clocksource_register_hz()/clocksource_register_khz()
> instead? it's more convenient.

Thanks for the suggestion. And sorry for I didn't notice the upstream
code has already hooked to clocksource_register_hz() in csrc-r4k.c
(We're using r4000 clock source)

I'm afraid this still doesn't fix my case. Through
clocksource_register_hz()->__clocksource_register_scale()->__clocksource_updatefreq_scale,
I got a calculated maxsec = (0xffffffff - (0xffffffff>>5))/250000500 =
16        # assume mips_hpt_frequency=250000500

With this maxsec, I got a mult of 0xffffde72, still too big.

Regards,
-- Chen Jie

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [MIPS]clocks_calc_mult_shift() may gen a too big mult value
  2011-10-31 10:48   ` Chen Jie
@ 2011-10-31 13:03     ` John Stultz
  2011-10-31 13:59       ` zhangfx
  0 siblings, 1 reply; 8+ messages in thread
From: John Stultz @ 2011-10-31 13:03 UTC (permalink / raw)
  To: Chen Jie
  Cc: Yong Zhang, linux-mips, LKML, tglx, yanhua, 项宇,
	zhangfx, 孙海勇

On Mon, 2011-10-31 at 18:48 +0800, Chen Jie wrote:
> Hi,
> 
> 2011/10/31 Yong Zhang <yong.zhang0@gmail.com>:
> > On Mon, Oct 31, 2011 at 5:00 PM, Chen Jie <chenj@lemote.com> wrote:
> >> Hi all,
> >>
> >> On MIPS, with maxsec=4, clocks_calc_mult_shift() may generate a very
> >> big mult, which may easily cause timekeeper.mult overflow within
> >> timekeeping jobs.
> >
> > Hmmm, why not use clocksource_register_hz()/clocksource_register_khz()
> > instead? it's more convenient.
> 
> Thanks for the suggestion. And sorry for I didn't notice the upstream
> code has already hooked to clocksource_register_hz() in csrc-r4k.c
> (We're using r4000 clock source)
> 
> I'm afraid this still doesn't fix my case. Through
> clocksource_register_hz()->__clocksource_register_scale()->__clocksource_updatefreq_scale,
> I got a calculated maxsec = (0xffffffff - (0xffffffff>>5))/250000500 =
> 16        # assume mips_hpt_frequency=250000500
> 
> With this maxsec, I got a mult of 0xffffde72, still too big.

Hrmm. Yong Zang is right to suggest clocksource_register_hz(), as the
intention of that code is to try to avoid these sorts of issues.

What is the corresponding shift value you're getting for the value
above?

Could you annotate clocks_calc_mult_shift() a little bit to see where
things might be going wrong?

thanks
-john


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [MIPS]clocks_calc_mult_shift() may gen a too big mult value
  2011-10-31 13:03     ` John Stultz
@ 2011-10-31 13:59       ` zhangfx
  2011-10-31 18:12         ` John Stultz
  0 siblings, 1 reply; 8+ messages in thread
From: zhangfx @ 2011-10-31 13:59 UTC (permalink / raw)
  To: John Stultz
  Cc: Chen Jie, Yong Zhang, linux-mips, LKML, tglx, yanhua,
	项宇, 孙海勇

Dear Sirs,
>> Thanks for the suggestion. And sorry for I didn't notice the upstream
>> code has already hooked to clocksource_register_hz() in csrc-r4k.c
>> (We're using r4000 clock source)
>>
>> I'm afraid this still doesn't fix my case. Through
>> clocksource_register_hz()->__clocksource_register_scale()->__clocksource_updatefreq_scale,
>> I got a calculated maxsec = (0xffffffff - (0xffffffff>>5))/250000500 =
>> 16        # assume mips_hpt_frequency=250000500
>>
>> With this maxsec, I got a mult of 0xffffde72, still too big.
> Hrmm. Yong Zang is right to suggest clocksource_register_hz(), as the
> intention of that code is to try to avoid these sorts of issues.
>
> What is the corresponding shift value you're getting for the value
> above?
>
> Could you annotate clocks_calc_mult_shift() a little bit to see where
> things might be going wrong?
Let me give some real world data:
in one machine with 500MHz freq,
the calculated freq = 500084016, and clocks_calc_mult_shift() give
   mult = 4294245725
   shift = 30

but in the 1785th call to update_wall_time, due to error correction 
algorithm, the mult become 4293964632,
in next update_wall_time, the ntp_error is 0x301c93b7927c, which lead to 
an adj of 20, then mult is overflow:
    mult = 4293964632 + (1<<20) = 45912
with this mult, if anyone call timekeeping_get_ns or others using mult, 
the time concept will be extremely wrong, so some sleep will 
(almost)never return => virtually hang

We are not abosulately sure that the error source is normal, but anyway 
it is a possible for the code to overflow, and it will cause hang.

For this case, the timekeeping_bigadjust should be able to control adj 
to a maximum of around 20 with the lookahead for any error. So if the 
mult is chosen at shift = 29, then mult becomes 4294245725/2, it will 
not be possible to be overflowed.

In short, choosing a mult close to 2^32 is dangerous. But I don't know 
what's the best way to avoid it for general cases, because I don't know 
how big error can be and the adj can be for different systems.

Regards

Yours
Fuxin Zhang

>
> thanks
> -john
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [MIPS]clocks_calc_mult_shift() may gen a too big mult value
  2011-10-31 13:59       ` zhangfx
@ 2011-10-31 18:12         ` John Stultz
  2011-10-31 18:30           ` David Daney
  0 siblings, 1 reply; 8+ messages in thread
From: John Stultz @ 2011-10-31 18:12 UTC (permalink / raw)
  To: zhangfx
  Cc: Chen Jie, Yong Zhang, linux-mips, LKML, tglx, yanhua,
	项宇, 孙海勇

On Mon, 2011-10-31 at 21:59 +0800, zhangfx wrote:
> > Could you annotate clocks_calc_mult_shift() a little bit to see where
> > things might be going wrong?
> Let me give some real world data:
> in one machine with 500MHz freq,
> the calculated freq = 500084016, and clocks_calc_mult_shift() give
>    mult = 4294245725
>    shift = 30
> 
> but in the 1785th call to update_wall_time, due to error correction 
> algorithm, the mult become 4293964632,
> in next update_wall_time, the ntp_error is 0x301c93b7927c, which lead to 
> an adj of 20, then mult is overflow:
>     mult = 4293964632 + (1<<20) = 45912
> with this mult, if anyone call timekeeping_get_ns or others using mult, 
> the time concept will be extremely wrong, so some sleep will 
> (almost)never return => virtually hang
> 
> We are not abosulately sure that the error source is normal, but anyway 
> it is a possible for the code to overflow, and it will cause hang.
> 
> For this case, the timekeeping_bigadjust should be able to control adj 
> to a maximum of around 20 with the lookahead for any error. So if the 
> mult is chosen at shift = 29, then mult becomes 4294245725/2, it will 
> not be possible to be overflowed.
> 
> In short, choosing a mult close to 2^32 is dangerous. But I don't know 
> what's the best way to avoid it for general cases, because I don't know 
> how big error can be and the adj can be for different systems.

Ah. Ok, sorry for being so slow to understand.

So yea, I think you're right that the issue seems to be that for certain
feq values, the resulting mult,shift pair is optimized a little too far,
and we don't leave enough room for ntp adjustments to mult, without the
possibility of overflowing mult.

The following patch should handle it (although I'm at a conf right now,
so I couldn't test it), although I might be trying to be too smart here.
Rather then just checking if mult is larger then 0xf0000000, we try to
quantify the largest valid mult adjustment, and then make sure mult +
that value doesn't overflow. NTP limits adjustments to 500ppm, but the
kernel might have to deal with larger internal adjustments. Probably we
could be safe ruling out larger then 5% adjustments.

So then its just a matter of 1/2^4. So the largest mult adjustment
should be 1 << (cs->shift - 4)

Does this seem reasonable? Let me know if this seems to work for you.

Thomas: does this fix look like its in a reasonable spot? I don't want
to clutter up the calc_mult_shift() code up since this really only
applies to clocksources and not clockevents.

NOT TESTED & NOT FOR INCLUSION (YET)
Signed-off-by: John Stultz <john.stultz@linaro.org>
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index cf52fda..73518d2 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -640,7 +640,7 @@ static void clocksource_enqueue(struct clocksource *cs)
 void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 {
 	u64 sec;
-
+	u32 maxadj;
 	/*
 	 * Calc the maximum number of seconds which we can run before
 	 * wrapping around. For clocksources which have a mask > 32bit
@@ -661,6 +661,22 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 
 	clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
 			       NSEC_PER_SEC / scale, sec * scale);
+
+	/*
+	 * Since mult may be adjusted by ntp, add an extra saftey margin
+	 * for clocksources that have large mults, to avoid overflow.
+	 *
+	 * Assume we won't try to correct for more then 5% adjustments
+	 * (50,000 ppm), which approximates to 1/16 or 1/2^4.
+	 * Thus 1 << (shift - 4) is the largest mult adjustment we'll
+	 * support.
+	 */
+	maxadj = 1 << (shift-4);
+	if ((cs->mult + maxadj < cs->mult) || (cs->mult - maxadj > cs->mult)) {
+		cs->mult >>= 1;
+		cs->shift--;
+	}
+
 	cs->max_idle_ns = clocksource_max_deferment(cs);
 }
 EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [MIPS]clocks_calc_mult_shift() may gen a too big mult value
  2011-10-31 18:12         ` John Stultz
@ 2011-10-31 18:30           ` David Daney
  2011-10-31 19:51             ` John Stultz
  0 siblings, 1 reply; 8+ messages in thread
From: David Daney @ 2011-10-31 18:30 UTC (permalink / raw)
  To: John Stultz, tglx@linutronix.de
  Cc: zhangfx, Chen Jie, Yong Zhang, linux-mips@linux-mips.org, LKML,
	yanhua, 项宇, 孙海勇

On 10/31/2011 11:12 AM, John Stultz wrote:
> On Mon, 2011-10-31 at 21:59 +0800, zhangfx wrote:
[...]
>>
>> In short, choosing a mult close to 2^32 is dangerous. But I don't know
>> what's the best way to avoid it for general cases, because I don't know
>> how big error can be and the adj can be for different systems.
>
> Ah. Ok, sorry for being so slow to understand.
>
> So yea, I think you're right that the issue seems to be that for certain
> feq values, the resulting mult,shift pair is optimized a little too far,
> and we don't leave enough room for ntp adjustments to mult, without the
> possibility of overflowing mult.
>
> The following patch should handle it (although I'm at a conf right now,
> so I couldn't test it), although I might be trying to be too smart here.
> Rather then just checking if mult is larger then 0xf0000000, we try to
> quantify the largest valid mult adjustment, and then make sure mult +
> that value doesn't overflow. NTP limits adjustments to 500ppm, but the
> kernel might have to deal with larger internal adjustments. Probably we
> could be safe ruling out larger then 5% adjustments.
>
> So then its just a matter of 1/2^4. So the largest mult adjustment
> should be 1<<  (cs->shift - 4)
>
> Does this seem reasonable? Let me know if this seems to work for you.
>
> Thomas: does this fix look like its in a reasonable spot? I don't want
> to clutter up the calc_mult_shift() code up since this really only
> applies to clocksources and not clockevents.
>
> NOT TESTED&  NOT FOR INCLUSION (YET)
> Signed-off-by: John Stultz<john.stultz@linaro.org>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index cf52fda..73518d2 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -640,7 +640,7 @@ static void clocksource_enqueue(struct clocksource *cs)
>   void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
>   {
>   	u64 sec;
> -
> +	u32 maxadj;
>   	/*
>   	 * Calc the maximum number of seconds which we can run before
>   	 * wrapping around. For clocksources which have a mask>  32bit
> @@ -661,6 +661,22 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
>
>   	clocks_calc_mult_shift(&cs->mult,&cs->shift, freq,
>   			       NSEC_PER_SEC / scale, sec * scale);
> +
> +	/*
> +	 * Since mult may be adjusted by ntp, add an extra saftey margin
> +	 * for clocksources that have large mults, to avoid overflow.
> +	 *
> +	 * Assume we won't try to correct for more then 5% adjustments

Can we do any better than making assumptions about this?

The current assumption appears to be that only very small adjustments 
will be made, and that didn't workout so well.

Is it possible to put hard constraints on these things, so that it will 
always work?

David Daney


> +	 * (50,000 ppm), which approximates to 1/16 or 1/2^4.
> +	 * Thus 1<<  (shift - 4) is the largest mult adjustment we'll
> +	 * support.
> +	 */
> +	maxadj = 1<<  (shift-4);
> +	if ((cs->mult + maxadj<  cs->mult) || (cs->mult - maxadj>  cs->mult)) {
> +		cs->mult>>= 1;
> +		cs->shift--;
> +	}
> +
>   	cs->max_idle_ns = clocksource_max_deferment(cs);
>   }
>   EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [MIPS]clocks_calc_mult_shift() may gen a too big mult value
  2011-10-31 18:30           ` David Daney
@ 2011-10-31 19:51             ` John Stultz
  0 siblings, 0 replies; 8+ messages in thread
From: John Stultz @ 2011-10-31 19:51 UTC (permalink / raw)
  To: David Daney
  Cc: tglx@linutronix.de, zhangfx, Chen Jie, Yong Zhang,
	linux-mips@linux-mips.org, LKML, yanhua, 项宇,
	孙海勇

On Mon, 2011-10-31 at 11:30 -0700, David Daney wrote:
> On 10/31/2011 11:12 AM, John Stultz wrote:
> > +	/*
> > +	 * Since mult may be adjusted by ntp, add an extra saftey margin
> > +	 * for clocksources that have large mults, to avoid overflow.
> > +	 *
> > +	 * Assume we won't try to correct for more then 5% adjustments
> 
> Can we do any better than making assumptions about this?
> 
> The current assumption appears to be that only very small adjustments 
> will be made, and that didn't workout so well.

s/only very small/no/

The calc_mult_shift function doesn't take any adjustments into account.

> Is it possible to put hard constraints on these things, so that it will 
> always work?

Fair enough. The patch was a bit off the cuff, and you're right that the
assumption is broken: ntp limits the freq adjustment to 500ppm, but the
kernel limits tick adjustments to 10%. Thus 11% adjustments (easier to
remember then 10.05%) would be the hard limit of adjustments from
external interfaces. So I'll need to rework the patch to adapt for that.

The harder part, once we put a hard constraint on the adjustment, is to
enforce that the timekeeping_bigadjust() doesn't push it beyond that,
since its logic uses relative adjustments and doesn't consider the
original mult value. Further, since its fairly opaque code, proving the
constraint itself won't break in edge cases is also needed, and will
take some time.

So yea, you're point is fair. Its just going to take a bit of thoughtful
review before implementing such a hard constraint universally.

So making a good adversary constraint first (11%), and then iteratively
hardening the code it impacts might be a good approach to get there.

thanks
-john

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2011-10-31 19:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-31  9:00 [MIPS]clocks_calc_mult_shift() may gen a too big mult value Chen Jie
2011-10-31  9:20 ` Yong Zhang
2011-10-31 10:48   ` Chen Jie
2011-10-31 13:03     ` John Stultz
2011-10-31 13:59       ` zhangfx
2011-10-31 18:12         ` John Stultz
2011-10-31 18:30           ` David Daney
2011-10-31 19:51             ` John Stultz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox