* [PATCH] sched, x86: fix overflow in cyc2ns_offset
@ 2012-03-08 23:23 Salman Qazi
2012-03-09 19:23 ` john stultz
0 siblings, 1 reply; 10+ messages in thread
From: Salman Qazi @ 2012-03-08 23:23 UTC (permalink / raw)
To: sqazi, Peter Zijlstra, John Stultz, LKML, Ingo Molnar,
Paul Turner
When a machine boots up, the TSC generally gets reset. However, when
kexec is used to boot into a kernel, the TSC value would be carried
over from the previous kernel. The computation of cycns_offset in
set_cyc2ns_scale is prone to an overflow, if the machine has been up
more than 208 days prior to the kexec. The overflow happens when
we multiply *scale, even though there is enough room to store the
final answer. We fix this issue by decomposing tsc_now into the
quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
performing the multiplication separately on the two components.
Signed-off-by: Salman Qazi <sqazi@google.com>
---
arch/x86/kernel/tsc.c | 12 +++++++++++-
1 files changed, 11 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..ef1dc8e 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -608,6 +608,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
{
unsigned long long tsc_now, ns_now, *offset;
unsigned long flags, *scale;
+ unsigned long long quot;
+ unsigned long long rem;
local_irq_save(flags);
sched_clock_idle_sleep_event();
@@ -620,7 +622,15 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
if (cpu_khz) {
*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
- *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+
+ /*
+ * Avoid premature overflow by splitting into quotient
+ * and remainder. See the comment above __cycles_2_ns
+ */
+ quot = (tsc_now >> CYC2NS_SCALE_FACTOR);
+ rem = tsc_now & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
+ *offset = ns_now - (quot * *scale +
+ ((rem * *scale) >> CYC2NS_SCALE_FACTOR));
}
sched_clock_idle_wakeup_event(0);
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
2012-03-08 23:23 [PATCH] sched, x86: fix overflow in cyc2ns_offset Salman Qazi
@ 2012-03-09 19:23 ` john stultz
2012-03-09 23:58 ` Salman Qazi
0 siblings, 1 reply; 10+ messages in thread
From: john stultz @ 2012-03-09 19:23 UTC (permalink / raw)
To: Salman Qazi; +Cc: Peter Zijlstra, LKML, Ingo Molnar, Paul Turner
On Thu, 2012-03-08 at 15:23 -0800, Salman Qazi wrote:
> When a machine boots up, the TSC generally gets reset. However, when
> kexec is used to boot into a kernel, the TSC value would be carried
> over from the previous kernel. The computation of cycns_offset in
> set_cyc2ns_scale is prone to an overflow, if the machine has been up
> more than 208 days prior to the kexec. The overflow happens when
> we multiply *scale, even though there is enough room to store the
> final answer. We fix this issue by decomposing tsc_now into the
> quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
> performing the multiplication separately on the two components.
>
> Signed-off-by: Salman Qazi <sqazi@google.com>
> ---
> arch/x86/kernel/tsc.c | 12 +++++++++++-
> 1 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index a62c201..ef1dc8e 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -608,6 +608,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
> {
> unsigned long long tsc_now, ns_now, *offset;
> unsigned long flags, *scale;
> + unsigned long long quot;
> + unsigned long long rem;
>
> local_irq_save(flags);
> sched_clock_idle_sleep_event();
> @@ -620,7 +622,15 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
>
> if (cpu_khz) {
> *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
> - *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
> +
> + /*
> + * Avoid premature overflow by splitting into quotient
> + * and remainder. See the comment above __cycles_2_ns
> + */
> + quot = (tsc_now >> CYC2NS_SCALE_FACTOR);
> + rem = tsc_now & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
> + *offset = ns_now - (quot * *scale +
> + ((rem * *scale) >> CYC2NS_SCALE_FACTOR));
> }
This clearly is a needed fix. Thanks for finding it and sending it in.
Although I'm curious if it might be good to encapsulate this code into a
macro that can be reused in both set_cyc2ns_scale and __cycles_2_ns()
(as well as others, I suspect this issue is going to crop up on other
arches at some point too)?
thanks
-john
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] sched, x86: fix overflow in cyc2ns_offset
2012-03-09 19:23 ` john stultz
@ 2012-03-09 23:58 ` Salman Qazi
2012-03-10 0:00 ` Salman Qazi
0 siblings, 1 reply; 10+ messages in thread
From: Salman Qazi @ 2012-03-09 23:58 UTC (permalink / raw)
To: johnstultzjohnstul; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML
When a machine boots up, the TSC generally gets reset. However, when
kexec is used to boot into a kernel, the TSC value would be carried
over from the previous kernel. The computation of cycns_offset in
set_cyc2ns_scale is prone to an overflow, if the machine has been up
more than 208 days prior to the kexec. The overflow happens when
we multiply *scale, even though there is enough room to store the
final answer. We fix this issue by decomposing tsc_now into the
quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
performing the multiplication separately on the two components.
Refactor code to share the calculation with the previous
fix in __cycles_2_ns.
Signed-off-by: Salman Qazi <sqazi@google.com>
---
arch/x86/include/asm/timer.h | 8 ++------
arch/x86/kernel/tsc.c | 3 ++-
include/linux/kernel.h | 13 +++++++++++++
3 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index 431793e..34baa0e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
{
- unsigned long long quot;
- unsigned long long rem;
int cpu = smp_processor_id();
unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
- quot = (cyc >> CYC2NS_SCALE_FACTOR);
- rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
- ns += quot * per_cpu(cyc2ns, cpu) +
- ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
+ ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
+ (1UL << CYC2NS_SCALE_FACTOR));
return ns;
}
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..183c592 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
if (cpu_khz) {
*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
- *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+ *offset = ns_now - mult_frac(tsc_now, *scale,
+ (1UL << CYC2NS_SCALE_FACTOR));
}
sched_clock_idle_wakeup_event(0);
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e834342..d801acb 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -85,6 +85,19 @@
} \
)
+/*
+ * Multiplies an integer by a fraction, while avoiding unnecessary
+ * overflow or loss of precision.
+ */
+#define mult_frac(x, numer, denom)( \
+{ \
+ typeof(x) quot = (x) / (denom); \
+ typeof(x) rem = (x) % (denom); \
+ (quot * (numer)) + ((rem * (numer)) / (denom)); \
+} \
+)
+
+
#define _RET_IP_ (unsigned long)__builtin_return_address(0)
#define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH] sched, x86: fix overflow in cyc2ns_offset
2012-03-09 23:58 ` Salman Qazi
@ 2012-03-10 0:00 ` Salman Qazi
2012-03-10 0:22 ` Paul Turner
2012-03-10 0:25 ` john stultz
0 siblings, 2 replies; 10+ messages in thread
From: Salman Qazi @ 2012-03-10 0:00 UTC (permalink / raw)
To: john stultz; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML
When a machine boots up, the TSC generally gets reset. However, when
kexec is used to boot into a kernel, the TSC value would be carried
over from the previous kernel. The computation of cycns_offset in
set_cyc2ns_scale is prone to an overflow, if the machine has been up
more than 208 days prior to the kexec. The overflow happens when
we multiply *scale, even though there is enough room to store the
final answer. We fix this issue by decomposing tsc_now into the
quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
performing the multiplication separately on the two components.
Refactor code to share the calculation with the previous
fix in __cycles_2_ns.
Signed-off-by: Salman Qazi <sqazi@google.com>
---
arch/x86/include/asm/timer.h | 8 ++------
arch/x86/kernel/tsc.c | 3 ++-
include/linux/kernel.h | 13 +++++++++++++
3 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index 431793e..34baa0e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
{
- unsigned long long quot;
- unsigned long long rem;
int cpu = smp_processor_id();
unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
- quot = (cyc >> CYC2NS_SCALE_FACTOR);
- rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
- ns += quot * per_cpu(cyc2ns, cpu) +
- ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
+ ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
+ (1UL << CYC2NS_SCALE_FACTOR));
return ns;
}
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..183c592 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
if (cpu_khz) {
*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
- *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+ *offset = ns_now - mult_frac(tsc_now, *scale,
+ (1UL << CYC2NS_SCALE_FACTOR));
}
sched_clock_idle_wakeup_event(0);
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e834342..d801acb 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -85,6 +85,19 @@
} \
)
+/*
+ * Multiplies an integer by a fraction, while avoiding unnecessary
+ * overflow or loss of precision.
+ */
+#define mult_frac(x, numer, denom)( \
+{ \
+ typeof(x) quot = (x) / (denom); \
+ typeof(x) rem = (x) % (denom); \
+ (quot * (numer)) + ((rem * (numer)) / (denom)); \
+} \
+)
+
+
#define _RET_IP_ (unsigned long)__builtin_return_address(0)
#define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
2012-03-10 0:00 ` Salman Qazi
@ 2012-03-10 0:22 ` Paul Turner
2012-03-10 0:29 ` Salman Qazi
2012-03-10 0:25 ` john stultz
1 sibling, 1 reply; 10+ messages in thread
From: Paul Turner @ 2012-03-10 0:22 UTC (permalink / raw)
To: Salman Qazi; +Cc: john stultz, Ingo Molnar, Peter Zijlstra, LKML
On Fri, Mar 9, 2012 at 4:00 PM, Salman Qazi <sqazi@google.com> wrote:
> When a machine boots up, the TSC generally gets reset. However, when
> kexec is used to boot into a kernel, the TSC value would be carried
> over from the previous kernel. The computation of cycns_offset in
> set_cyc2ns_scale is prone to an overflow, if the machine has been up
> more than 208 days prior to the kexec. The overflow happens when
> we multiply *scale, even though there is enough room to store the
> final answer. We fix this issue by decomposing tsc_now into the
> quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
> performing the multiplication separately on the two components.
>
> Refactor code to share the calculation with the previous
> fix in __cycles_2_ns.
>
> Signed-off-by: Salman Qazi <sqazi@google.com>
> ---
> arch/x86/include/asm/timer.h | 8 ++------
> arch/x86/kernel/tsc.c | 3 ++-
> include/linux/kernel.h | 13 +++++++++++++
> 3 files changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
> index 431793e..34baa0e 100644
> --- a/arch/x86/include/asm/timer.h
> +++ b/arch/x86/include/asm/timer.h
> @@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
>
> static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
> {
> - unsigned long long quot;
> - unsigned long long rem;
> int cpu = smp_processor_id();
> unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
> - quot = (cyc >> CYC2NS_SCALE_FACTOR);
> - rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
> - ns += quot * per_cpu(cyc2ns, cpu) +
> - ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
> + ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
> + (1UL << CYC2NS_SCALE_FACTOR));
> return ns;
> }
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index a62c201..183c592 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
>
> if (cpu_khz) {
> *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
> - *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
> + *offset = ns_now - mult_frac(tsc_now, *scale,
> + (1UL << CYC2NS_SCALE_FACTOR));
> }
>
> sched_clock_idle_wakeup_event(0);
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index e834342..d801acb 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -85,6 +85,19 @@
> } \
> )
>
> +/*
> + * Multiplies an integer by a fraction, while avoiding unnecessary
> + * overflow or loss of precision.
> + */
> +#define mult_frac(x, numer, denom)( \
> +{ \
> + typeof(x) quot = (x) / (denom); \
> + typeof(x) rem = (x) % (denom); \
> + (quot * (numer)) + ((rem * (numer)) / (denom)); \
> +} \
> +)
Any reason not to make this a static inline so that we get saner typing?
> +
> +
> #define _RET_IP_ (unsigned long)__builtin_return_address(0)
> #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
2012-03-10 0:00 ` Salman Qazi
2012-03-10 0:22 ` Paul Turner
@ 2012-03-10 0:25 ` john stultz
2012-03-10 0:28 ` Salman Qazi
1 sibling, 1 reply; 10+ messages in thread
From: john stultz @ 2012-03-10 0:25 UTC (permalink / raw)
To: Salman Qazi; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML
On Fri, 2012-03-09 at 16:00 -0800, Salman Qazi wrote:
> When a machine boots up, the TSC generally gets reset. However, when
> kexec is used to boot into a kernel, the TSC value would be carried
> over from the previous kernel. The computation of cycns_offset in
> set_cyc2ns_scale is prone to an overflow, if the machine has been up
> more than 208 days prior to the kexec. The overflow happens when
> we multiply *scale, even though there is enough room to store the
> final answer. We fix this issue by decomposing tsc_now into the
> quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
> performing the multiplication separately on the two components.
>
> Refactor code to share the calculation with the previous
> fix in __cycles_2_ns.
Thanks so much for making it more generic and reusable! But one question
below.
> Signed-off-by: Salman Qazi <sqazi@google.com>
> ---
> arch/x86/include/asm/timer.h | 8 ++------
> arch/x86/kernel/tsc.c | 3 ++-
> include/linux/kernel.h | 13 +++++++++++++
> 3 files changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
> index 431793e..34baa0e 100644
> --- a/arch/x86/include/asm/timer.h
> +++ b/arch/x86/include/asm/timer.h
> @@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
>
> static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
> {
> - unsigned long long quot;
> - unsigned long long rem;
> int cpu = smp_processor_id();
> unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
> - quot = (cyc >> CYC2NS_SCALE_FACTOR);
> - rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
> - ns += quot * per_cpu(cyc2ns, cpu) +
> - ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
> + ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
> + (1UL << CYC2NS_SCALE_FACTOR));
> return ns;
> }
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index a62c201..183c592 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
>
> if (cpu_khz) {
> *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
> - *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
> + *offset = ns_now - mult_frac(tsc_now, *scale,
> + (1UL << CYC2NS_SCALE_FACTOR));
> }
>
> sched_clock_idle_wakeup_event(0);
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index e834342..d801acb 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -85,6 +85,19 @@
> } \
> )
>
> +/*
> + * Multiplies an integer by a fraction, while avoiding unnecessary
> + * overflow or loss of precision.
> + */
> +#define mult_frac(x, numer, denom)( \
> +{ \
> + typeof(x) quot = (x) / (denom); \
> + typeof(x) rem = (x) % (denom); \
> + (quot * (numer)) + ((rem * (numer)) / (denom)); \
> +} \
> +)
> +
> +
So... Sorry, why did you change it from the shifted logic?
I was thinking more like:
mult_shifted_fract(x, mult, shift)
{
quot = x >> shift;
rem = x & ((1ULL << shift)-1);
return quot * mult + (rem * mult) >> shift;
}
Is the compiler really smart enough to avoid the divides?
thanks
-john
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
2012-03-10 0:25 ` john stultz
@ 2012-03-10 0:28 ` Salman Qazi
2012-03-10 0:35 ` john stultz
0 siblings, 1 reply; 10+ messages in thread
From: Salman Qazi @ 2012-03-10 0:28 UTC (permalink / raw)
To: john stultz; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML
On Fri, Mar 9, 2012 at 4:25 PM, john stultz <johnstul@us.ibm.com> wrote:
>
> On Fri, 2012-03-09 at 16:00 -0800, Salman Qazi wrote:
> > When a machine boots up, the TSC generally gets reset. However, when
> > kexec is used to boot into a kernel, the TSC value would be carried
> > over from the previous kernel. The computation of cycns_offset in
> > set_cyc2ns_scale is prone to an overflow, if the machine has been up
> > more than 208 days prior to the kexec. The overflow happens when
> > we multiply *scale, even though there is enough room to store the
> > final answer. We fix this issue by decomposing tsc_now into the
> > quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
> > performing the multiplication separately on the two components.
> >
> > Refactor code to share the calculation with the previous
> > fix in __cycles_2_ns.
>
> Thanks so much for making it more generic and reusable! But one question
> below.
>
> > Signed-off-by: Salman Qazi <sqazi@google.com>
> > ---
> > arch/x86/include/asm/timer.h | 8 ++------
> > arch/x86/kernel/tsc.c | 3 ++-
> > include/linux/kernel.h | 13 +++++++++++++
> > 3 files changed, 17 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
> > index 431793e..34baa0e 100644
> > --- a/arch/x86/include/asm/timer.h
> > +++ b/arch/x86/include/asm/timer.h
> > @@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
> >
> > static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
> > {
> > - unsigned long long quot;
> > - unsigned long long rem;
> > int cpu = smp_processor_id();
> > unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
> > - quot = (cyc >> CYC2NS_SCALE_FACTOR);
> > - rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
> > - ns += quot * per_cpu(cyc2ns, cpu) +
> > - ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
> > + ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
> > + (1UL << CYC2NS_SCALE_FACTOR));
> > return ns;
> > }
> >
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index a62c201..183c592 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz,
> > int cpu)
> >
> > if (cpu_khz) {
> > *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
> > - *offset = ns_now - (tsc_now * *scale >>
> > CYC2NS_SCALE_FACTOR);
> > + *offset = ns_now - mult_frac(tsc_now, *scale,
> > + (1UL <<
> > CYC2NS_SCALE_FACTOR));
> > }
> >
> > sched_clock_idle_wakeup_event(0);
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index e834342..d801acb 100644
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -85,6 +85,19 @@
> > } \
> > )
> >
> > +/*
> > + * Multiplies an integer by a fraction, while avoiding unnecessary
> > + * overflow or loss of precision.
> > + */
> > +#define mult_frac(x, numer, denom)( \
> > +{ \
> > + typeof(x) quot = (x) / (denom); \
> > + typeof(x) rem = (x) % (denom); \
> > + (quot * (numer)) + ((rem * (numer)) / (denom)); \
> > +} \
> > +)
> > +
> > +
>
> So... Sorry, why did you change it from the shifted logic?
>
It's more generally applicable. The compiler is smart enough to do
the right thing in the power-of-two
constant case.
> I was thinking more like:
> mult_shifted_fract(x, mult, shift)
> {
> quot = x >> shift;
> rem = x & ((1ULL << shift)-1);
> return quot * mult + (rem * mult) >> shift;
> }
>
> Is the compiler really smart enough to avoid the divides?
Yes, it is. I verified that.
>
> thanks
> -john
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
2012-03-10 0:22 ` Paul Turner
@ 2012-03-10 0:29 ` Salman Qazi
0 siblings, 0 replies; 10+ messages in thread
From: Salman Qazi @ 2012-03-10 0:29 UTC (permalink / raw)
To: Paul Turner; +Cc: john stultz, Ingo Molnar, Peter Zijlstra, LKML
On Fri, Mar 9, 2012 at 4:22 PM, Paul Turner <pjt@google.com> wrote:
> On Fri, Mar 9, 2012 at 4:00 PM, Salman Qazi <sqazi@google.com> wrote:
>> When a machine boots up, the TSC generally gets reset. However, when
>> kexec is used to boot into a kernel, the TSC value would be carried
>> over from the previous kernel. The computation of cycns_offset in
>> set_cyc2ns_scale is prone to an overflow, if the machine has been up
>> more than 208 days prior to the kexec. The overflow happens when
>> we multiply *scale, even though there is enough room to store the
>> final answer. We fix this issue by decomposing tsc_now into the
>> quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
>> performing the multiplication separately on the two components.
>>
>> Refactor code to share the calculation with the previous
>> fix in __cycles_2_ns.
>>
>> Signed-off-by: Salman Qazi <sqazi@google.com>
>> ---
>> arch/x86/include/asm/timer.h | 8 ++------
>> arch/x86/kernel/tsc.c | 3 ++-
>> include/linux/kernel.h | 13 +++++++++++++
>> 3 files changed, 17 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
>> index 431793e..34baa0e 100644
>> --- a/arch/x86/include/asm/timer.h
>> +++ b/arch/x86/include/asm/timer.h
>> @@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
>>
>> static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
>> {
>> - unsigned long long quot;
>> - unsigned long long rem;
>> int cpu = smp_processor_id();
>> unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
>> - quot = (cyc >> CYC2NS_SCALE_FACTOR);
>> - rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
>> - ns += quot * per_cpu(cyc2ns, cpu) +
>> - ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
>> + ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
>> + (1UL << CYC2NS_SCALE_FACTOR));
>> return ns;
>> }
>>
>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>> index a62c201..183c592 100644
>> --- a/arch/x86/kernel/tsc.c
>> +++ b/arch/x86/kernel/tsc.c
>> @@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
>>
>> if (cpu_khz) {
>> *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
>> - *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
>> + *offset = ns_now - mult_frac(tsc_now, *scale,
>> + (1UL << CYC2NS_SCALE_FACTOR));
>> }
>>
>> sched_clock_idle_wakeup_event(0);
>> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
>> index e834342..d801acb 100644
>> --- a/include/linux/kernel.h
>> +++ b/include/linux/kernel.h
>> @@ -85,6 +85,19 @@
>> } \
>> )
>>
>> +/*
>> + * Multiplies an integer by a fraction, while avoiding unnecessary
>> + * overflow or loss of precision.
>> + */
>> +#define mult_frac(x, numer, denom)( \
>> +{ \
>> + typeof(x) quot = (x) / (denom); \
>> + typeof(x) rem = (x) % (denom); \
>> + (quot * (numer)) + ((rem * (numer)) / (denom)); \
>> +} \
>> +)
>
> Any reason not to make this a static inline so that we get saner typing?
Only re-usability. I put it in include/linux/kernel.h with a lot of
other #defines like min , max, clamp, that are generally
applicable.
>
>> +
>> +
>> #define _RET_IP_ (unsigned long)__builtin_return_address(0)
>> #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
>>
>>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
2012-03-10 0:28 ` Salman Qazi
@ 2012-03-10 0:35 ` john stultz
0 siblings, 0 replies; 10+ messages in thread
From: john stultz @ 2012-03-10 0:35 UTC (permalink / raw)
To: Salman Qazi; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML
On Fri, 2012-03-09 at 16:28 -0800, Salman Qazi wrote:
> On Fri, Mar 9, 2012 at 4:25 PM, john stultz <johnstul@us.ibm.com> wrote:
> > So... Sorry, why did you change it from the shifted logic?
> >
>
> It's more generally applicable. The compiler is smart enough to do
> the right thing in the power-of-two
> constant case.
>
> >
> > Is the compiler really smart enough to avoid the divides?
>
> Yes, it is. I verified that.
Ok, great! Thanks again! (Although I probably need to put rewriting the
the big x86/include/asm/timer.h comment on my TODO, since its not as
relevant anymore)
Acked-by: John Stultz <john.stultz@linaro.org>
Ingo/Thomas, this should probably go tip/timers/urgent & be marked for
stable. I can queue it and send you a pull request, but it might be just
easier to just pick up the patch.
thanks
-john
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] sched, x86: fix overflow in cyc2ns_offset
@ 2012-03-10 0:41 Salman Qazi
0 siblings, 0 replies; 10+ messages in thread
From: Salman Qazi @ 2012-03-10 0:41 UTC (permalink / raw)
To: john stultz; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML
When a machine boots up, the TSC generally gets reset. However, when
kexec is used to boot into a kernel, the TSC value would be carried
over from the previous kernel. The computation of cycns_offset in
set_cyc2ns_scale is prone to an overflow, if the machine has been up
more than 208 days prior to the kexec. The overflow happens when
we multiply *scale, even though there is enough room to store the
final answer. We fix this issue by decomposing tsc_now into the
quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
performing the multiplication separately on the two components.
Refactor code to share the calculation with the previous
fix in __cycles_2_ns.
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Salman Qazi <sqazi@google.com>
---
arch/x86/include/asm/timer.h | 8 ++------
arch/x86/kernel/tsc.c | 3 ++-
include/linux/kernel.h | 13 +++++++++++++
3 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index 431793e..34baa0e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
{
- unsigned long long quot;
- unsigned long long rem;
int cpu = smp_processor_id();
unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
- quot = (cyc >> CYC2NS_SCALE_FACTOR);
- rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
- ns += quot * per_cpu(cyc2ns, cpu) +
- ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
+ ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
+ (1UL << CYC2NS_SCALE_FACTOR));
return ns;
}
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..183c592 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
if (cpu_khz) {
*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
- *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+ *offset = ns_now - mult_frac(tsc_now, *scale,
+ (1UL << CYC2NS_SCALE_FACTOR));
}
sched_clock_idle_wakeup_event(0);
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e834342..d801acb 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -85,6 +85,19 @@
} \
)
+/*
+ * Multiplies an integer by a fraction, while avoiding unnecessary
+ * overflow or loss of precision.
+ */
+#define mult_frac(x, numer, denom)( \
+{ \
+ typeof(x) quot = (x) / (denom); \
+ typeof(x) rem = (x) % (denom); \
+ (quot * (numer)) + ((rem * (numer)) / (denom)); \
+} \
+)
+
+
#define _RET_IP_ (unsigned long)__builtin_return_address(0)
#define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; })
^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2012-03-10 0:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-08 23:23 [PATCH] sched, x86: fix overflow in cyc2ns_offset Salman Qazi
2012-03-09 19:23 ` john stultz
2012-03-09 23:58 ` Salman Qazi
2012-03-10 0:00 ` Salman Qazi
2012-03-10 0:22 ` Paul Turner
2012-03-10 0:29 ` Salman Qazi
2012-03-10 0:25 ` john stultz
2012-03-10 0:28 ` Salman Qazi
2012-03-10 0:35 ` john stultz
-- strict thread matches above, loose matches on Subject: below --
2012-03-10 0:41 Salman Qazi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox