* [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
@ 2002-08-22 9:50 Yoann Vandoorselaere
2002-08-22 10:21 ` Gabriel Paubert
0 siblings, 1 reply; 14+ messages in thread
From: Yoann Vandoorselaere @ 2002-08-22 9:50 UTC (permalink / raw)
To: cpufreq; +Cc: cpufreq, linux-kernel, benh@kernel.crashing.org
[-- Attachment #1: Type: text/plain, Size: 513 bytes --]
Hi,
The "low_part * mult" multiplication of the old function may overflow a
32bits integer...
This patch both fix the overflow issue (tested with frequencies up to
20Ghz), and make the result of the function lose less precision.
Please apply,
--
Yoann Vandoorselaere, http://www.prelude-ids.org
"Programming is a race between programmers, who try and make more and
more idiot-proof software, and universe, which produces more and more
remarkable idiots. Until now, universe leads the race" -- R. Cook
[-- Attachment #2: cpufreq-overflow.diff --]
[-- Type: text/plain, Size: 747 bytes --]
--- linux-benh/kernel/cpufreq.c 2002-08-21 17:27:52.000000000 +0200
+++ linux-yoann/kernel/cpufreq.c 2002-08-22 11:27:09.000000000 +0200
@@ -78,14 +78,16 @@ static unsigned int cpufreq_
*/
static unsigned long scale(unsigned long old, u_int div, u_int mult)
{
- unsigned long low_part, high_part;
-
- high_part = old / div;
- low_part = (old % div) / 100;
- high_part *= mult;
- low_part = low_part * mult / div;
-
- return high_part + low_part * 100;
+ unsigned long val, carry = 0;
+
+ mult /= 100;
+ div /= 100;
+ val = old / div * mult;
+
+ carry = old % div;
+ carry = carry * mult / div;
+
+ return val + carry;
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 9:50 Yoann Vandoorselaere
@ 2002-08-22 10:21 ` Gabriel Paubert
2002-08-22 13:00 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 14+ messages in thread
From: Gabriel Paubert @ 2002-08-22 10:21 UTC (permalink / raw)
To: Yoann Vandoorselaere
Cc: cpufreq, cpufreq, linux-kernel, benh@kernel.crashing.org
Yoann Vandoorselaere wrote:
> Hi,
>
> The "low_part * mult" multiplication of the old function may overflow a
> 32bits integer...
>
> This patch both fix the overflow issue (tested with frequencies up to
> 20Ghz), and make the result of the function lose less precision.
>
> Please apply,
>
>
>
> ------------------------------------------------------------------------
>
> --- linux-benh/kernel/cpufreq.c 2002-08-21 17:27:52.000000000 +0200
> +++ linux-yoann/kernel/cpufreq.c 2002-08-22 11:27:09.000000000 +0200
> @@ -78,14 +78,16 @@ static unsigned int cpufreq_
> */
> static unsigned long scale(unsigned long old, u_int div, u_int mult)
> {
> - unsigned long low_part, high_part;
> -
> - high_part = old / div;
> - low_part = (old % div) / 100;
> - high_part *= mult;
> - low_part = low_part * mult / div;
> -
> - return high_part + low_part * 100;
> + unsigned long val, carry = 0;
> +
> + mult /= 100;
> + div /= 100;
if(abs(div)<100) div=0;
> + val = old / div * mult;
Now happily divide by zero.
> +
> + carry = old % div;
Again.
> + carry = carry * mult / div;
Again.
> +
> + return val + carry;
> }
>
And I can't see how it can be more precise, you divide the numerator and
denominator of the fraction by 100 and then proceed forgetting
everything about the rest. Basically this looses about 7 bits of precision.
Now altogether I believe that such a function pertains to a per arch
optimized routine.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 13:02 [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation Benjamin Herrenschmidt
@ 2002-08-22 12:12 ` Gabriel Paubert
2002-08-22 14:31 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 14+ messages in thread
From: Gabriel Paubert @ 2002-08-22 12:12 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Yoann Vandoorselaere, cpufreq, cpufreq, linux-kernel
Hi Ben,
> None of the above can happen in the domain of application of this
> function. It's used to scale up/down the loops_per_jiffy value when
> scaling the CPU frequency. Anyway, the above isn't worse than the
> original function. Ideally, we would want 64 bits arithmetics, but we
> decided long ago not to bring the libcc support routines for that in
> the kernel.
Well, first on sane archs which have an easily accessible, fixed
frequency time counter, loops_per_jiffy should never have existed :-)
Second, putting this code there means that one day somebody will
inevitably try to use it outside of its domain of operation (like it
happened for div64 a few months ago when I pointed out that it would not
work for divisors above 65535 or so).
Finally, I agree that we should not import libgcc, but for example on
PPC32 the double lengths shifts (__ashrdi3, __ashldi3, and __lshsldi3)
are implemented somewhere, and the assembly implementation (directly
taken from some appendix in PPC documentation, I just slightly twisted
__ashrdi3 to make it branchless AFAIR) is actually way faster than the
one in libgcc ;-), and less than half the size.
Adding a few subroutines that implement a subset of libgcc's
functionality is necessary for most archs (which functions are needed is
arch, and sometimes compiler's, dependent).
In this case a generic scaling function, while not a standard libgcc/C
library feature has potentially more applications than this simple
cpufreq approximation. But I don't see very much the need for scaling a
long (64 bit on 64 bit archs) value, 32 bit would be sufficient.
Regards,
Gabriel.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 10:21 ` Gabriel Paubert
@ 2002-08-22 13:00 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-22 13:00 UTC (permalink / raw)
To: Gabriel Paubert, Yoann Vandoorselaere; +Cc: cpufreq, cpufreq, linux-kernel
Hi Gabriel !
>if(abs(div)<100) div=0;
>
>> + val = old / div * mult;
>
>Now happily divide by zero.
>
>> +
>> + carry = old % div;
>
>Again.
>
>> + carry = carry * mult / div;
>
>Again.
>
>> +
>> + return val + carry;
>> }
None of the above can happen in the domain of application of this
function. It's used to scale up/down the loops_per_jiffy value when
scaling the CPU frequency. Anyway, the above isn't worse than the
original function. Ideally, we would want 64 bits arithmetics, but
we decided long ago not to bring the libcc support routines for that
in the kernel.
>
>And I can't see how it can be more precise, you divide the numerator and
>denominator of the fraction by 100 and then proceed forgetting
>everything about the rest. Basically this looses about 7 bits of precision.
Which is mostly ok for what we need. I think Yoann didn't mean it's
more precise that what it replace, but rather more precise than his
original implementation that divided by 1000 ;) Anyway, it's not
significantly worse than what we had and won't overflow as easily
which is all we want for this routine now.
>Now altogether I believe that such a function pertains to a per arch
>optimized routine.
Maybe... though in the context of cpufreq, it may not make that much
sense.
Ben.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
@ 2002-08-22 13:02 Benjamin Herrenschmidt
2002-08-22 12:12 ` Gabriel Paubert
0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-22 13:02 UTC (permalink / raw)
To: Gabriel Paubert, Yoann Vandoorselaere; +Cc: cpufreq, cpufreq, linux-kernel
Hi Gabriel !
>if(abs(div)<100) div=0;
>
>> + val = old / div * mult;
>
>Now happily divide by zero.
>
>> +
>> + carry = old % div;
>
>Again.
>
>> + carry = carry * mult / div;
>
>Again.
>
>> +
>> + return val + carry;
>> }
None of the above can happen in the domain of application of this
function. It's used to scale up/down the loops_per_jiffy value when
scaling the CPU frequency. Anyway, the above isn't worse than the
original function. Ideally, we would want 64 bits arithmetics, but
we decided long ago not to bring the libcc support routines for that
in the kernel.
>
>And I can't see how it can be more precise, you divide the numerator and
>denominator of the fraction by 100 and then proceed forgetting
>everything about the rest. Basically this looses about 7 bits of precision.
Which is mostly ok for what we need. I think Yoann didn't mean it's
more precise that what it replace, but rather more precise than his
original implementation that divided by 1000 ;) Anyway, it's not
significantly worse than what we had and won't overflow as easily
which is all we want for this routine now.
>Now altogether I believe that such a function pertains to a per arch
>optimized routine.
Maybe... though in the context of cpufreq, it may not make that much
sense.
Ben.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 12:12 ` Gabriel Paubert
@ 2002-08-22 14:31 ` Benjamin Herrenschmidt
2002-08-22 15:23 ` Gabriel Paubert
0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-22 14:31 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: Yoann Vandoorselaere, cpufreq, cpufreq, linux-kernel
>Well, first on sane archs which have an easily accessible, fixed
>frequency time counter, loops_per_jiffy should never have existed :-)
>
>Second, putting this code there means that one day somebody will
>inevitably try to use it outside of its domain of operation (like it
>happened for div64 a few months ago when I pointed out that it would not
>work for divisors above 65535 or so).
Well... it's clearly located inside kernel/cpufreq.c, so there is
little risk, though it may be worth a big bold comment
>Finally, I agree that we should not import libgcc, but for example on
>PPC32 the double lengths shifts (__ashrdi3, __ashldi3, and __lshsldi3)
>are implemented somewhere, and the assembly implementation (directly
>taken from some appendix in PPC documentation, I just slightly twisted
>__ashrdi3 to make it branchless AFAIR) is actually way faster than the
>one in libgcc ;-), and less than half the size.
>
> Adding a few subroutines that implement a subset of libgcc's
>functionality is necessary for most archs (which functions are needed is
>arch, and sometimes compiler's, dependent).
>
>In this case a generic scaling function, while not a standard libgcc/C
>library feature has potentially more applications than this simple
>cpufreq approximation. But I don't see very much the need for scaling a
>long (64 bit on 64 bit archs) value, 32 bit would be sufficient.
Well... if you can write one, go on then ;) In my case, I'm happy
with Yoann implementation for cpufreq right now. Though I agree that
could ultimately be moved to arch code.
Ben.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 14:31 ` Benjamin Herrenschmidt
@ 2002-08-22 15:23 ` Gabriel Paubert
2002-08-22 15:59 ` Yoann Vandoorselaere
2002-08-22 16:51 ` [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation Dominik Brodowski
0 siblings, 2 replies; 14+ messages in thread
From: Gabriel Paubert @ 2002-08-22 15:23 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Yoann Vandoorselaere, cpufreq, cpufreq, linux-kernel
Benjamin Herrenschmidt wrote:
>>Well, first on sane archs which have an easily accessible, fixed
>>frequency time counter, loops_per_jiffy should never have existed :-)
>>
>>Second, putting this code there means that one day somebody will
>>inevitably try to use it outside of its domain of operation (like it
>>happened for div64 a few months ago when I pointed out that it would not
>>work for divisors above 65535 or so).
>
>
> Well... it's clearly located inside kernel/cpufreq.c, so there is
> little risk, though it may be worth a big bold comment
Hmm, in my experience people hardly ever read detailed comments even
when they are well-written. Perhaps if you called the function
imprecise_scale or coarse_scale, it might ring a bell.
Besides that functions should do one thing and do that *well*[1]. Well,
I'm usually not too dogmatic, but this function breaks the second rule
beyond what I find acceptable.
>>In this case a generic scaling function, while not a standard libgcc/C
>>library feature has potentially more applications than this simple
>>cpufreq approximation. But I don't see very much the need for scaling a
>>long (64 bit on 64 bit archs) value, 32 bit would be sufficient.
>
>
> Well... if you can write one, go on then ;) In my case, I'm happy
> with Yoann implementation for cpufreq right now. Though I agree that
> could ultimately be moved to arch code.
Ok, I'll give it a try this week-end (PPC, i386 and all 64 bit should
archs should be trivial).
Gabriel.
[1] Documentation/CodingStyle, which also claims that functions should
be short and *sweet*. Well, I found the patch far too bitter ;-).
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 15:23 ` Gabriel Paubert
@ 2002-08-22 15:59 ` Yoann Vandoorselaere
2002-08-22 17:22 ` [PATCH]: fix 32bits integer overflow in loops_per_jiffycalculation george anzinger
2002-08-22 16:51 ` [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation Dominik Brodowski
1 sibling, 1 reply; 14+ messages in thread
From: Yoann Vandoorselaere @ 2002-08-22 15:59 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: Benjamin Herrenschmidt, cpufreq, cpufreq, linux-kernel
On Thu, 2002-08-22 at 17:23, Gabriel Paubert wrote:
> Benjamin Herrenschmidt wrote:
> >>Well, first on sane archs which have an easily accessible, fixed
> >>frequency time counter, loops_per_jiffy should never have existed :-)
> >>
> >>Second, putting this code there means that one day somebody will
> >>inevitably try to use it outside of its domain of operation (like it
> >>happened for div64 a few months ago when I pointed out that it would not
> >>work for divisors above 65535 or so).
> >
> >
> > Well... it's clearly located inside kernel/cpufreq.c, so there is
> > little risk, though it may be worth a big bold comment
>
> Hmm, in my experience people hardly ever read detailed comments even
> when they are well-written. Perhaps if you called the function
> imprecise_scale or coarse_scale, it might ring a bell.
>
> Besides that functions should do one thing and do that *well*[1]. Well,
> I'm usually not too dogmatic, but this function breaks the second rule
> beyond what I find acceptable.
At least it report *correct* result (when the old one was returning BS
because of the 32 bits integer overflow). Doing it well require per
architecture support.
> >>In this case a generic scaling function, while not a standard libgcc/C
> >>library feature has potentially more applications than this simple
> >>cpufreq approximation. But I don't see very much the need for scaling a
> >>long (64 bit on 64 bit archs) value, 32 bit would be sufficient.
> >
> >
> > Well... if you can write one, go on then ;) In my case, I'm happy
> > with Yoann implementation for cpufreq right now. Though I agree that
> > could ultimately be moved to arch code.
[...]
> [1] Documentation/CodingStyle, which also claims that functions should
> be short and *sweet*. Well, I found the patch far too bitter ;-).
No wonder why you're loosing contributor with such comportment.
--
Yoann Vandoorselaere, http://www.prelude-ids.org
"Programming is a race between programmers, who try and make more and
more idiot-proof software, and universe, which produces more and more
remarkable idiots. Until now, universe leads the race" -- R. Cook
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 15:23 ` Gabriel Paubert
2002-08-22 15:59 ` Yoann Vandoorselaere
@ 2002-08-22 16:51 ` Dominik Brodowski
2002-08-22 19:35 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 14+ messages in thread
From: Dominik Brodowski @ 2002-08-22 16:51 UTC (permalink / raw)
To: Gabriel Paubert
Cc: Benjamin Herrenschmidt, Yoann Vandoorselaere, cpufreq, cpufreq,
linux-kernel
Hi,
> > Well... it's clearly located inside kernel/cpufreq.c, so there is
> > little risk, though it may be worth a big bold comment
>
> Hmm, in my experience people hardly ever read detailed comments even
> when they are well-written. Perhaps if you called the function
> imprecise_scale or coarse_scale, it might ring a bell.
First of all, it's located in include/linux/cpufreq.h [to be accessible for
arch/i386/kernel/time.c, called cpufreq_scale() which should mean that it is
only meant for CPUFreq and nothing else.
> >>In this case a generic scaling function, while not a standard libgcc/C
> >>library feature has potentially more applications than this simple
> >>cpufreq approximation. But I don't see very much the need for scaling a
> >>long (64 bit on 64 bit archs) value, 32 bit would be sufficient.
> >
> >
> > Well... if you can write one, go on then ;) In my case, I'm happy
> > with Yoann implementation for cpufreq right now. Though I agree that
> > could ultimately be moved to arch code.
>
> Ok, I'll give it a try this week-end (PPC, i386 and all 64 bit should
> archs should be trivial).
IMHO per-arch functions are really not needed. The only architectures which
have CPUFreq drivers by now are ARM and i386. This will change, hopefully;
IMHO it should be enough to include some basic limit checking in
cpufreq_scale().
Dominik
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffycalculation
2002-08-22 15:59 ` Yoann Vandoorselaere
@ 2002-08-22 17:22 ` george anzinger
0 siblings, 0 replies; 14+ messages in thread
From: george anzinger @ 2002-08-22 17:22 UTC (permalink / raw)
To: Yoann Vandoorselaere
Cc: Gabriel Paubert, Benjamin Herrenschmidt, cpufreq, cpufreq,
linux-kernel
[-- Attachment #1: Type: text/plain, Size: 3269 bytes --]
You might find the attached header of some interest. It is
part of the high-res-timers patch and is for i386, but I
expect we will do the same for most archs before we are
done. The notion is to use a power of two scale and to make
it easy to access by keeping the asm out of your face (and
in a neat little header file :) In the patch we use it to
avoid div in all the places that it matters, i.e. we do a
div to set up the conversion constants (e.g. TSC to
nanosecond or TSC to microsecond) once and then use the "sc"
mpy functions to do the conversions.
Enjoy
-g
Yoann Vandoorselaere wrote:
>
> On Thu, 2002-08-22 at 17:23, Gabriel Paubert wrote:
> > Benjamin Herrenschmidt wrote:
> > >>Well, first on sane archs which have an easily accessible, fixed
> > >>frequency time counter, loops_per_jiffy should never have existed :-)
> > >>
> > >>Second, putting this code there means that one day somebody will
> > >>inevitably try to use it outside of its domain of operation (like it
> > >>happened for div64 a few months ago when I pointed out that it would not
> > >>work for divisors above 65535 or so).
> > >
> > >
> > > Well... it's clearly located inside kernel/cpufreq.c, so there is
> > > little risk, though it may be worth a big bold comment
> >
> > Hmm, in my experience people hardly ever read detailed comments even
> > when they are well-written. Perhaps if you called the function
> > imprecise_scale or coarse_scale, it might ring a bell.
> >
> > Besides that functions should do one thing and do that *well*[1]. Well,
> > I'm usually not too dogmatic, but this function breaks the second rule
> > beyond what I find acceptable.
>
> At least it report *correct* result (when the old one was returning BS
> because of the 32 bits integer overflow). Doing it well require per
> architecture support.
>
>
> > >>In this case a generic scaling function, while not a standard libgcc/C
> > >>library feature has potentially more applications than this simple
> > >>cpufreq approximation. But I don't see very much the need for scaling a
> > >>long (64 bit on 64 bit archs) value, 32 bit would be sufficient.
> > >
> > >
> > > Well... if you can write one, go on then ;) In my case, I'm happy
> > > with Yoann implementation for cpufreq right now. Though I agree that
> > > could ultimately be moved to arch code.
>
> [...]
>
> > [1] Documentation/CodingStyle, which also claims that functions should
> > be short and *sweet*. Well, I found the patch far too bitter ;-).
>
> No wonder why you're loosing contributor with such comportment.
>
> --
> Yoann Vandoorselaere, http://www.prelude-ids.org
>
> "Programming is a race between programmers, who try and make more and
> more idiot-proof software, and universe, which produces more and more
> remarkable idiots. Until now, universe leads the race" -- R. Cook
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml
[-- Attachment #2: sc_math.h --]
[-- Type: text/plain, Size: 3271 bytes --]
#ifndef SC_MATH
#define SC_MATH
#define MATH_STR(X) #X
#define MATH_NAME(X) X
/*
* Pre scaling defines
*/
#define SC_32(x) ((long long)x<<32)
#define SC_n(n,x) (((long long)x)<<n)
/*
* This routine preforms the following calculation:
*
* X = (a*b)>>32
* we could, (but don't) also get the part shifted out.
*/
extern inline long mpy_ex32(long a,long b)
{
long edx;
__asm__("imull %2"
:"=a" (a), "=d" (edx)
:"rm" (b),
"0" (a));
return edx;
}
/*
* X = (a/b)<<32 or more precisely x = (a<<32)/b
*/
extern inline long div_ex32(long a, long b)
{
long dum;
__asm__("divl %2"
:"=a" (b), "=d" (dum)
:"r" (b), "0" (0), "1" (a));
return b;
}
/*
* X = (a*b)>>24
* we could, (but don't) also get the part shifted out.
*/
#define mpy_ex24(a,b) mpy_sc_n(24,a,b)
/*
* X = (a/b)<<24 or more precisely x = (a<<24)/b
*/
#define div_ex24(a,b) div_sc_n(24,a,b)
/*
* The routines allow you to do x = (a/b) << N and
* x=(a*b)>>N for values of N from 1 to 32.
*
* These are handy to have to do scaled math.
* Scaled math has two nice features:
* A.) A great deal more precision can be maintained by
* keeping more signifigant bits.
* B.) Often an in line div can be repaced with a mpy
* which is a LOT faster.
*/
#define mpy_sc_n(N,aa,bb) ({long edx,a=aa,b=bb; \
__asm__("imull %2\n\t" \
"shldl $(32-"MATH_STR(N)"),%0,%1" \
:"=a" (a), "=d" (edx)\
:"rm" (b), \
"0" (a)); edx;})
#define div_sc_n(N,aa,bb) ({long dum=aa,dum2,b=bb; \
__asm__("shrdl $(32-"MATH_STR(N)"),%4,%3\n\t" \
"sarl $(32-"MATH_STR(N)"),%4\n\t" \
"divl %2" \
:"=a" (dum2), "=d" (dum) \
:"rm" (b), "0" (0), "1" (dum)); dum2;})
/*
* (long)X = ((long long)divs) / (long)div
* (long)rem = ((long long)divs) % (long)div
*
* Warning, this will do an exception if X overflows.
*/
#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a,b,c)
extern inline long div_ll_X_l_rem(long long divs, long div,long * rem)
{
long dum2;
__asm__( "divl %2"
:"=a" (dum2), "=d" (*rem)
:"rm" (div), "A" (divs));
return dum2;
}
/*
* same as above, but no remainder
*/
extern inline long div_ll_X_l(long long divs, long div)
{
long dum;
return div_ll_X_l_rem(divs,div,&dum);
}
/*
* (long)X = (((long)divh<<32) | (long)divl) / (long)div
* (long)rem = (((long)divh<<32) % (long)divl) / (long)div
*
* Warning, this will do an exception if X overflows.
*/
extern inline long div_h_or_l_X_l_rem(long divh,long divl, long div,long* rem)
{
long dum2;
__asm__( "divl %2"
:"=a" (dum2), "=d" (*rem)
:"rm" (div), "0" (divl),"1" (divh));
return dum2;
}
extern inline long long mpy_l_X_l_ll(long mpy1,long mpy2)
{
long long eax;
__asm__("imull %1\n\t"
:"=A" (eax)
:"rm" (mpy2),
"a" (mpy1));
return eax;
}
extern inline long mpy_1_X_1_h(long mpy1,long mpy2,long *hi)
{
long eax;
__asm__("imull %2\n\t"
:"=a" (eax),"=d" (*hi)
:"rm" (mpy2),
"0" (mpy1));
return eax;
}
#endif
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 19:35 ` Benjamin Herrenschmidt
@ 2002-08-22 17:46 ` Dominik Brodowski
2002-08-22 18:02 ` Yoann Vandoorselaere
2002-08-22 20:00 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 14+ messages in thread
From: Dominik Brodowski @ 2002-08-22 17:46 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Gabriel Paubert, Yoann Vandoorselaere, cpufreq, cpufreq,
linux-kernel
On Thu, Aug 22, 2002 at 09:35:16PM +0200, Benjamin Herrenschmidt wrote:
> >IMHO per-arch functions are really not needed. The only architectures which
> >have CPUFreq drivers by now are ARM and i386. This will change, hopefully;
> >IMHO it should be enough to include some basic limit checking in
> >cpufreq_scale().
>
> In this specific case, we were talking about PPC since the problem
> occured when I implemented cpufreq support to switch the speed
> of the latest powerbooks between 667 and 800Mhz
And the patch from Yoann solves this? Then it might be easiest to use this
for the time being, and switch to George Anzinger's sc_math.h once
high-res-timer is merged.
Dominik
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 17:46 ` Dominik Brodowski
@ 2002-08-22 18:02 ` Yoann Vandoorselaere
2002-08-22 20:00 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 14+ messages in thread
From: Yoann Vandoorselaere @ 2002-08-22 18:02 UTC (permalink / raw)
To: Dominik Brodowski
Cc: Benjamin Herrenschmidt, Gabriel Paubert, cpufreq, cpufreq,
linux-kernel
On Thu, 2002-08-22 at 19:46, Dominik Brodowski wrote:
> On Thu, Aug 22, 2002 at 09:35:16PM +0200, Benjamin Herrenschmidt wrote:
> > >IMHO per-arch functions are really not needed. The only architectures which
> > >have CPUFreq drivers by now are ARM and i386. This will change, hopefully;
> > >IMHO it should be enough to include some basic limit checking in
> > >cpufreq_scale().
> >
> > In this specific case, we were talking about PPC since the problem
> > occured when I implemented cpufreq support to switch the speed
> > of the latest powerbooks between 667 and 800Mhz
>
> And the patch from Yoann solves this?
Yep, the integer overflow resulted in an incorrectly computed
loops_per_jiffy :
Aug 21 19:50:41 titane kernel: adjust_jiffies: prechange cur=667000, new=800000
Aug 21 19:50:41 titane kernel: old loop_per_jiffy = 665.19 (cpufreq_ref_loops=3325952, cpufreq_ref_freq=667000).
Aug 21 19:50:41 titane kernel: new loop_per_jiffy = 669.02 (cpufreq_ref_loops=3325952, cpufreq_ref_freq=667000).
With the patch applied, it work fine :
Aug 22 11:33:40 titane kernel: adjust_jiffies: prechange cur=667000, new=800000
Aug 22 11:33:40 titane kernel: old loop_per_jiffy = 665.19 (cpufreq_ref_loops=3325952, cpufreq_ref_freq=667000).
Aug 22 11:33:40 titane kernel: new loop_per_jiffy = 797.82 (cpufreq_ref_loops=3325952, cpufreq_ref_freq=667000).
--
Yoann Vandoorselaere, http://www.prelude-ids.org
"Programming is a race between programmers, who try and make more and
more idiot-proof software, and universe, which produces more and more
remarkable idiots. Until now, universe leads the race" -- R. Cook
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 16:51 ` [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation Dominik Brodowski
@ 2002-08-22 19:35 ` Benjamin Herrenschmidt
2002-08-22 17:46 ` Dominik Brodowski
0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-22 19:35 UTC (permalink / raw)
To: Dominik Brodowski, Gabriel Paubert
Cc: Yoann Vandoorselaere, cpufreq, cpufreq, linux-kernel
>IMHO per-arch functions are really not needed. The only architectures which
>have CPUFreq drivers by now are ARM and i386. This will change, hopefully;
>IMHO it should be enough to include some basic limit checking in
>cpufreq_scale().
In this specific case, we were talking about PPC since the problem
occured when I implemented cpufreq support to switch the speed
of the latest powerbooks between 667 and 800Mhz
Ben.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation
2002-08-22 17:46 ` Dominik Brodowski
2002-08-22 18:02 ` Yoann Vandoorselaere
@ 2002-08-22 20:00 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2002-08-22 20:00 UTC (permalink / raw)
To: Dominik Brodowski
Cc: Gabriel Paubert, Yoann Vandoorselaere, cpufreq, cpufreq,
linux-kernel
>> In this specific case, we were talking about PPC since the problem
>> occured when I implemented cpufreq support to switch the speed
>> of the latest powerbooks between 667 and 800Mhz
>
>And the patch from Yoann solves this? Then it might be easiest to use this
>for the time being, and switch to George Anzinger's sc_math.h once
>high-res-timer is merged.
Provided Yoann patch doesn't break other machines, that's what I would
do, yes.
Ben.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2002-08-22 17:57 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-08-22 13:02 [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation Benjamin Herrenschmidt
2002-08-22 12:12 ` Gabriel Paubert
2002-08-22 14:31 ` Benjamin Herrenschmidt
2002-08-22 15:23 ` Gabriel Paubert
2002-08-22 15:59 ` Yoann Vandoorselaere
2002-08-22 17:22 ` [PATCH]: fix 32bits integer overflow in loops_per_jiffycalculation george anzinger
2002-08-22 16:51 ` [PATCH]: fix 32bits integer overflow in loops_per_jiffy calculation Dominik Brodowski
2002-08-22 19:35 ` Benjamin Herrenschmidt
2002-08-22 17:46 ` Dominik Brodowski
2002-08-22 18:02 ` Yoann Vandoorselaere
2002-08-22 20:00 ` Benjamin Herrenschmidt
-- strict thread matches above, loose matches on Subject: below --
2002-08-22 9:50 Yoann Vandoorselaere
2002-08-22 10:21 ` Gabriel Paubert
2002-08-22 13:00 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox