public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sched, x86: fix overflow in cyc2ns_offset
@ 2012-03-08 23:23 Salman Qazi
  2012-03-09 19:23 ` john stultz
  0 siblings, 1 reply; 10+ messages in thread
From: Salman Qazi @ 2012-03-08 23:23 UTC (permalink / raw)
  To: sqazi, Peter Zijlstra, John Stultz, LKML, Ingo Molnar,
	Paul Turner

When a machine boots up, the TSC generally gets reset.  However, when
kexec is used to boot into a kernel, the TSC value would be carried
over from the previous kernel.  The computation of cycns_offset in
set_cyc2ns_scale is prone to an overflow, if the machine has been up
more than 208 days prior to the kexec.  The overflow happens when
we multiply *scale, even though there is enough room to store the
final answer.  We fix this issue by decomposing tsc_now into the
quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
performing the multiplication separately on the two components.

Signed-off-by: Salman Qazi <sqazi@google.com>
---
 arch/x86/kernel/tsc.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..ef1dc8e 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -608,6 +608,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 {
 	unsigned long long tsc_now, ns_now, *offset;
 	unsigned long flags, *scale;
+	unsigned long long quot;
+	unsigned long long rem;
 
 	local_irq_save(flags);
 	sched_clock_idle_sleep_event();
@@ -620,7 +622,15 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	if (cpu_khz) {
 		*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
-		*offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+
+		/*
+		 * Avoid premature overflow by splitting into quotient
+		 * and remainder.  See the comment above __cycles_2_ns
+		 */
+		quot = (tsc_now >> CYC2NS_SCALE_FACTOR);
+		rem = tsc_now & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
+		*offset = ns_now - (quot * *scale +
+				    ((rem * *scale) >> CYC2NS_SCALE_FACTOR));
 	}
 
 	sched_clock_idle_wakeup_event(0);


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
  2012-03-08 23:23 [PATCH] sched, x86: fix overflow in cyc2ns_offset Salman Qazi
@ 2012-03-09 19:23 ` john stultz
  2012-03-09 23:58   ` Salman Qazi
  0 siblings, 1 reply; 10+ messages in thread
From: john stultz @ 2012-03-09 19:23 UTC (permalink / raw)
  To: Salman Qazi; +Cc: Peter Zijlstra, LKML, Ingo Molnar, Paul Turner

On Thu, 2012-03-08 at 15:23 -0800, Salman Qazi wrote:
> When a machine boots up, the TSC generally gets reset.  However, when
> kexec is used to boot into a kernel, the TSC value would be carried
> over from the previous kernel.  The computation of cycns_offset in
> set_cyc2ns_scale is prone to an overflow, if the machine has been up
> more than 208 days prior to the kexec.  The overflow happens when
> we multiply *scale, even though there is enough room to store the
> final answer.  We fix this issue by decomposing tsc_now into the
> quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
> performing the multiplication separately on the two components.
> 
> Signed-off-by: Salman Qazi <sqazi@google.com>
> ---
>  arch/x86/kernel/tsc.c |   12 +++++++++++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index a62c201..ef1dc8e 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -608,6 +608,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
>  {
>  	unsigned long long tsc_now, ns_now, *offset;
>  	unsigned long flags, *scale;
> +	unsigned long long quot;
> +	unsigned long long rem;
> 
>  	local_irq_save(flags);
>  	sched_clock_idle_sleep_event();
> @@ -620,7 +622,15 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
> 
>  	if (cpu_khz) {
>  		*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
> -		*offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
> +
> +		/*
> +		 * Avoid premature overflow by splitting into quotient
> +		 * and remainder.  See the comment above __cycles_2_ns
> +		 */
> +		quot = (tsc_now >> CYC2NS_SCALE_FACTOR);
> +		rem = tsc_now & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
> +		*offset = ns_now - (quot * *scale +
> +				    ((rem * *scale) >> CYC2NS_SCALE_FACTOR));
>  	}

This clearly is a needed fix. Thanks for finding it and sending it in.

Although I'm curious if it might be good to encapsulate this code into a
macro that can be reused in both set_cyc2ns_scale and __cycles_2_ns()
(as well as others, I suspect this issue is going to crop up on other
arches at some point too)?

thanks
-john




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] sched, x86: fix overflow in cyc2ns_offset
  2012-03-09 19:23 ` john stultz
@ 2012-03-09 23:58   ` Salman Qazi
  2012-03-10  0:00     ` Salman Qazi
  0 siblings, 1 reply; 10+ messages in thread
From: Salman Qazi @ 2012-03-09 23:58 UTC (permalink / raw)
  To: johnstultzjohnstul; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML

When a machine boots up, the TSC generally gets reset.  However, when
kexec is used to boot into a kernel, the TSC value would be carried
over from the previous kernel.  The computation of cycns_offset in
set_cyc2ns_scale is prone to an overflow, if the machine has been up
more than 208 days prior to the kexec.  The overflow happens when
we multiply *scale, even though there is enough room to store the
final answer.  We fix this issue by decomposing tsc_now into the
quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
performing the multiplication separately on the two components.

Refactor code to share the calculation with the previous
fix in __cycles_2_ns.

Signed-off-by: Salman Qazi <sqazi@google.com>
---
 arch/x86/include/asm/timer.h |    8 ++------
 arch/x86/kernel/tsc.c        |    3 ++-
 include/linux/kernel.h       |   13 +++++++++++++
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index 431793e..34baa0e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
 
 static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
 {
-	unsigned long long quot;
-	unsigned long long rem;
 	int cpu = smp_processor_id();
 	unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
-	quot = (cyc >> CYC2NS_SCALE_FACTOR);
-	rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
-	ns += quot * per_cpu(cyc2ns, cpu) +
-		((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
+	ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
+			(1UL << CYC2NS_SCALE_FACTOR));
 	return ns;
 }
 
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..183c592 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	if (cpu_khz) {
 		*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
-		*offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+		*offset = ns_now - mult_frac(tsc_now, *scale,
+					     (1UL << CYC2NS_SCALE_FACTOR));
 	}
 
 	sched_clock_idle_wakeup_event(0);
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e834342..d801acb 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -85,6 +85,19 @@
 }							\
 )
 
+/*
+ * Multiplies an integer by a fraction, while avoiding unnecessary
+ * overflow or loss of precision.
+ */
+#define mult_frac(x, numer, denom)(			\
+{							\
+	typeof(x) quot = (x) / (denom);			\
+	typeof(x) rem  = (x) % (denom);			\
+	(quot * (numer)) + ((rem * (numer)) / (denom));	\
+}							\
+)
+
+
 #define _RET_IP_		(unsigned long)__builtin_return_address(0)
 #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
 


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH] sched, x86: fix overflow in cyc2ns_offset
  2012-03-09 23:58   ` Salman Qazi
@ 2012-03-10  0:00     ` Salman Qazi
  2012-03-10  0:22       ` Paul Turner
  2012-03-10  0:25       ` john stultz
  0 siblings, 2 replies; 10+ messages in thread
From: Salman Qazi @ 2012-03-10  0:00 UTC (permalink / raw)
  To: john stultz; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML

When a machine boots up, the TSC generally gets reset.  However, when
kexec is used to boot into a kernel, the TSC value would be carried
over from the previous kernel.  The computation of cycns_offset in
set_cyc2ns_scale is prone to an overflow, if the machine has been up
more than 208 days prior to the kexec.  The overflow happens when
we multiply *scale, even though there is enough room to store the
final answer.  We fix this issue by decomposing tsc_now into the
quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
performing the multiplication separately on the two components.

Refactor code to share the calculation with the previous
fix in __cycles_2_ns.

Signed-off-by: Salman Qazi <sqazi@google.com>
---
 arch/x86/include/asm/timer.h |    8 ++------
 arch/x86/kernel/tsc.c        |    3 ++-
 include/linux/kernel.h       |   13 +++++++++++++
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index 431793e..34baa0e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
 
 static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
 {
-	unsigned long long quot;
-	unsigned long long rem;
 	int cpu = smp_processor_id();
 	unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
-	quot = (cyc >> CYC2NS_SCALE_FACTOR);
-	rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
-	ns += quot * per_cpu(cyc2ns, cpu) +
-		((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
+	ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
+			(1UL << CYC2NS_SCALE_FACTOR));
 	return ns;
 }
 
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..183c592 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	if (cpu_khz) {
 		*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
-		*offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+		*offset = ns_now - mult_frac(tsc_now, *scale,
+					     (1UL << CYC2NS_SCALE_FACTOR));
 	}
 
 	sched_clock_idle_wakeup_event(0);
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e834342..d801acb 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -85,6 +85,19 @@
 }							\
 )
 
+/*
+ * Multiplies an integer by a fraction, while avoiding unnecessary
+ * overflow or loss of precision.
+ */
+#define mult_frac(x, numer, denom)(			\
+{							\
+	typeof(x) quot = (x) / (denom);			\
+	typeof(x) rem  = (x) % (denom);			\
+	(quot * (numer)) + ((rem * (numer)) / (denom));	\
+}							\
+)
+
+
 #define _RET_IP_		(unsigned long)__builtin_return_address(0)
 #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
 


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
  2012-03-10  0:00     ` Salman Qazi
@ 2012-03-10  0:22       ` Paul Turner
  2012-03-10  0:29         ` Salman Qazi
  2012-03-10  0:25       ` john stultz
  1 sibling, 1 reply; 10+ messages in thread
From: Paul Turner @ 2012-03-10  0:22 UTC (permalink / raw)
  To: Salman Qazi; +Cc: john stultz, Ingo Molnar, Peter Zijlstra, LKML

On Fri, Mar 9, 2012 at 4:00 PM, Salman Qazi <sqazi@google.com> wrote:
> When a machine boots up, the TSC generally gets reset.  However, when
> kexec is used to boot into a kernel, the TSC value would be carried
> over from the previous kernel.  The computation of cycns_offset in
> set_cyc2ns_scale is prone to an overflow, if the machine has been up
> more than 208 days prior to the kexec.  The overflow happens when
> we multiply *scale, even though there is enough room to store the
> final answer.  We fix this issue by decomposing tsc_now into the
> quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
> performing the multiplication separately on the two components.
>
> Refactor code to share the calculation with the previous
> fix in __cycles_2_ns.
>
> Signed-off-by: Salman Qazi <sqazi@google.com>
> ---
>  arch/x86/include/asm/timer.h |    8 ++------
>  arch/x86/kernel/tsc.c        |    3 ++-
>  include/linux/kernel.h       |   13 +++++++++++++
>  3 files changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
> index 431793e..34baa0e 100644
> --- a/arch/x86/include/asm/timer.h
> +++ b/arch/x86/include/asm/timer.h
> @@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
>
>  static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
>  {
> -       unsigned long long quot;
> -       unsigned long long rem;
>        int cpu = smp_processor_id();
>        unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
> -       quot = (cyc >> CYC2NS_SCALE_FACTOR);
> -       rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
> -       ns += quot * per_cpu(cyc2ns, cpu) +
> -               ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
> +       ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
> +                       (1UL << CYC2NS_SCALE_FACTOR));
>        return ns;
>  }
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index a62c201..183c592 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
>
>        if (cpu_khz) {
>                *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
> -               *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
> +               *offset = ns_now - mult_frac(tsc_now, *scale,
> +                                            (1UL << CYC2NS_SCALE_FACTOR));
>        }
>
>        sched_clock_idle_wakeup_event(0);
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index e834342..d801acb 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -85,6 +85,19 @@
>  }                                                      \
>  )
>
> +/*
> + * Multiplies an integer by a fraction, while avoiding unnecessary
> + * overflow or loss of precision.
> + */
> +#define mult_frac(x, numer, denom)(                    \
> +{                                                      \
> +       typeof(x) quot = (x) / (denom);                 \
> +       typeof(x) rem  = (x) % (denom);                 \
> +       (quot * (numer)) + ((rem * (numer)) / (denom)); \
> +}                                                      \
> +)

Any reason not to make this a static inline so that we get saner typing?

> +
> +
>  #define _RET_IP_               (unsigned long)__builtin_return_address(0)
>  #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
>
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
  2012-03-10  0:00     ` Salman Qazi
  2012-03-10  0:22       ` Paul Turner
@ 2012-03-10  0:25       ` john stultz
  2012-03-10  0:28         ` Salman Qazi
  1 sibling, 1 reply; 10+ messages in thread
From: john stultz @ 2012-03-10  0:25 UTC (permalink / raw)
  To: Salman Qazi; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML

On Fri, 2012-03-09 at 16:00 -0800, Salman Qazi wrote:
> When a machine boots up, the TSC generally gets reset.  However, when
> kexec is used to boot into a kernel, the TSC value would be carried
> over from the previous kernel.  The computation of cycns_offset in
> set_cyc2ns_scale is prone to an overflow, if the machine has been up
> more than 208 days prior to the kexec.  The overflow happens when
> we multiply *scale, even though there is enough room to store the
> final answer.  We fix this issue by decomposing tsc_now into the
> quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
> performing the multiplication separately on the two components.
> 
> Refactor code to share the calculation with the previous
> fix in __cycles_2_ns.

Thanks so much for making it more generic and reusable! But one question
below.

> Signed-off-by: Salman Qazi <sqazi@google.com>
> ---
>  arch/x86/include/asm/timer.h |    8 ++------
>  arch/x86/kernel/tsc.c        |    3 ++-
>  include/linux/kernel.h       |   13 +++++++++++++
>  3 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
> index 431793e..34baa0e 100644
> --- a/arch/x86/include/asm/timer.h
> +++ b/arch/x86/include/asm/timer.h
> @@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
> 
>  static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
>  {
> -	unsigned long long quot;
> -	unsigned long long rem;
>  	int cpu = smp_processor_id();
>  	unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
> -	quot = (cyc >> CYC2NS_SCALE_FACTOR);
> -	rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
> -	ns += quot * per_cpu(cyc2ns, cpu) +
> -		((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
> +	ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
> +			(1UL << CYC2NS_SCALE_FACTOR));
>  	return ns;
>  }
> 
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index a62c201..183c592 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
> 
>  	if (cpu_khz) {
>  		*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
> -		*offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
> +		*offset = ns_now - mult_frac(tsc_now, *scale,
> +					     (1UL << CYC2NS_SCALE_FACTOR));
>  	}
> 
>  	sched_clock_idle_wakeup_event(0);
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index e834342..d801acb 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -85,6 +85,19 @@
>  }							\
>  )
> 
> +/*
> + * Multiplies an integer by a fraction, while avoiding unnecessary
> + * overflow or loss of precision.
> + */
> +#define mult_frac(x, numer, denom)(			\
> +{							\
> +	typeof(x) quot = (x) / (denom);			\
> +	typeof(x) rem  = (x) % (denom);			\
> +	(quot * (numer)) + ((rem * (numer)) / (denom));	\
> +}							\
> +)
> +
> +

So... Sorry, why did you change it from the shifted logic?

I was thinking more like:
mult_shifted_fract(x, mult, shift)
{
	quot = x >> shift;
	rem = x & ((1ULL << shift)-1);
	return quot * mult + (rem * mult) >> shift;
}

Is the compiler really smart enough to avoid the divides?

thanks
-john



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
  2012-03-10  0:25       ` john stultz
@ 2012-03-10  0:28         ` Salman Qazi
  2012-03-10  0:35           ` john stultz
  0 siblings, 1 reply; 10+ messages in thread
From: Salman Qazi @ 2012-03-10  0:28 UTC (permalink / raw)
  To: john stultz; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML

On Fri, Mar 9, 2012 at 4:25 PM, john stultz <johnstul@us.ibm.com> wrote:
>
> On Fri, 2012-03-09 at 16:00 -0800, Salman Qazi wrote:
> > When a machine boots up, the TSC generally gets reset.  However, when
> > kexec is used to boot into a kernel, the TSC value would be carried
> > over from the previous kernel.  The computation of cycns_offset in
> > set_cyc2ns_scale is prone to an overflow, if the machine has been up
> > more than 208 days prior to the kexec.  The overflow happens when
> > we multiply *scale, even though there is enough room to store the
> > final answer.  We fix this issue by decomposing tsc_now into the
> > quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
> > performing the multiplication separately on the two components.
> >
> > Refactor code to share the calculation with the previous
> > fix in __cycles_2_ns.
>
> Thanks so much for making it more generic and reusable! But one question
> below.
>
> > Signed-off-by: Salman Qazi <sqazi@google.com>
> > ---
> >  arch/x86/include/asm/timer.h |    8 ++------
> >  arch/x86/kernel/tsc.c        |    3 ++-
> >  include/linux/kernel.h       |   13 +++++++++++++
> >  3 files changed, 17 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
> > index 431793e..34baa0e 100644
> > --- a/arch/x86/include/asm/timer.h
> > +++ b/arch/x86/include/asm/timer.h
> > @@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
> >
> >  static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
> >  {
> > -     unsigned long long quot;
> > -     unsigned long long rem;
> >       int cpu = smp_processor_id();
> >       unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
> > -     quot = (cyc >> CYC2NS_SCALE_FACTOR);
> > -     rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
> > -     ns += quot * per_cpu(cyc2ns, cpu) +
> > -             ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
> > +     ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
> > +                     (1UL << CYC2NS_SCALE_FACTOR));
> >       return ns;
> >  }
> >
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index a62c201..183c592 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz,
> > int cpu)
> >
> >       if (cpu_khz) {
> >               *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
> > -             *offset = ns_now - (tsc_now * *scale >>
> > CYC2NS_SCALE_FACTOR);
> > +             *offset = ns_now - mult_frac(tsc_now, *scale,
> > +                                          (1UL <<
> > CYC2NS_SCALE_FACTOR));
> >       }
> >
> >       sched_clock_idle_wakeup_event(0);
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index e834342..d801acb 100644
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -85,6 +85,19 @@
> >  }                                                    \
> >  )
> >
> > +/*
> > + * Multiplies an integer by a fraction, while avoiding unnecessary
> > + * overflow or loss of precision.
> > + */
> > +#define mult_frac(x, numer, denom)(                  \
> > +{                                                    \
> > +     typeof(x) quot = (x) / (denom);                 \
> > +     typeof(x) rem  = (x) % (denom);                 \
> > +     (quot * (numer)) + ((rem * (numer)) / (denom)); \
> > +}                                                    \
> > +)
> > +
> > +
>
> So... Sorry, why did you change it from the shifted logic?
>

It's more generally applicable.  The compiler is smart enough to do
the right thing in the power-of-two
constant case.


> I was thinking more like:
> mult_shifted_fract(x, mult, shift)
> {
>        quot = x >> shift;
>        rem = x & ((1ULL << shift)-1);
>        return quot * mult + (rem * mult) >> shift;
> }
>
> Is the compiler really smart enough to avoid the divides?

Yes, it is.  I verified that.

>
> thanks
> -john
>
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
  2012-03-10  0:22       ` Paul Turner
@ 2012-03-10  0:29         ` Salman Qazi
  0 siblings, 0 replies; 10+ messages in thread
From: Salman Qazi @ 2012-03-10  0:29 UTC (permalink / raw)
  To: Paul Turner; +Cc: john stultz, Ingo Molnar, Peter Zijlstra, LKML

On Fri, Mar 9, 2012 at 4:22 PM, Paul Turner <pjt@google.com> wrote:
> On Fri, Mar 9, 2012 at 4:00 PM, Salman Qazi <sqazi@google.com> wrote:
>> When a machine boots up, the TSC generally gets reset.  However, when
>> kexec is used to boot into a kernel, the TSC value would be carried
>> over from the previous kernel.  The computation of cycns_offset in
>> set_cyc2ns_scale is prone to an overflow, if the machine has been up
>> more than 208 days prior to the kexec.  The overflow happens when
>> we multiply *scale, even though there is enough room to store the
>> final answer.  We fix this issue by decomposing tsc_now into the
>> quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
>> performing the multiplication separately on the two components.
>>
>> Refactor code to share the calculation with the previous
>> fix in __cycles_2_ns.
>>
>> Signed-off-by: Salman Qazi <sqazi@google.com>
>> ---
>>  arch/x86/include/asm/timer.h |    8 ++------
>>  arch/x86/kernel/tsc.c        |    3 ++-
>>  include/linux/kernel.h       |   13 +++++++++++++
>>  3 files changed, 17 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
>> index 431793e..34baa0e 100644
>> --- a/arch/x86/include/asm/timer.h
>> +++ b/arch/x86/include/asm/timer.h
>> @@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
>>
>>  static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
>>  {
>> -       unsigned long long quot;
>> -       unsigned long long rem;
>>        int cpu = smp_processor_id();
>>        unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
>> -       quot = (cyc >> CYC2NS_SCALE_FACTOR);
>> -       rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
>> -       ns += quot * per_cpu(cyc2ns, cpu) +
>> -               ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
>> +       ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
>> +                       (1UL << CYC2NS_SCALE_FACTOR));
>>        return ns;
>>  }
>>
>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>> index a62c201..183c592 100644
>> --- a/arch/x86/kernel/tsc.c
>> +++ b/arch/x86/kernel/tsc.c
>> @@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
>>
>>        if (cpu_khz) {
>>                *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
>> -               *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
>> +               *offset = ns_now - mult_frac(tsc_now, *scale,
>> +                                            (1UL << CYC2NS_SCALE_FACTOR));
>>        }
>>
>>        sched_clock_idle_wakeup_event(0);
>> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
>> index e834342..d801acb 100644
>> --- a/include/linux/kernel.h
>> +++ b/include/linux/kernel.h
>> @@ -85,6 +85,19 @@
>>  }                                                      \
>>  )
>>
>> +/*
>> + * Multiplies an integer by a fraction, while avoiding unnecessary
>> + * overflow or loss of precision.
>> + */
>> +#define mult_frac(x, numer, denom)(                    \
>> +{                                                      \
>> +       typeof(x) quot = (x) / (denom);                 \
>> +       typeof(x) rem  = (x) % (denom);                 \
>> +       (quot * (numer)) + ((rem * (numer)) / (denom)); \
>> +}                                                      \
>> +)
>
> Any reason not to make this a static inline so that we get saner typing?

Only re-usability.  I put it in include/linux/kernel.h with a lot of
other #defines like min , max, clamp, that are generally
applicable.

>
>> +
>> +
>>  #define _RET_IP_               (unsigned long)__builtin_return_address(0)
>>  #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
>>
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched, x86: fix overflow in cyc2ns_offset
  2012-03-10  0:28         ` Salman Qazi
@ 2012-03-10  0:35           ` john stultz
  0 siblings, 0 replies; 10+ messages in thread
From: john stultz @ 2012-03-10  0:35 UTC (permalink / raw)
  To: Salman Qazi; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML

On Fri, 2012-03-09 at 16:28 -0800, Salman Qazi wrote:
> On Fri, Mar 9, 2012 at 4:25 PM, john stultz <johnstul@us.ibm.com> wrote:
> > So... Sorry, why did you change it from the shifted logic?
> >
> 
> It's more generally applicable.  The compiler is smart enough to do
> the right thing in the power-of-two
> constant case.
>
> >
> > Is the compiler really smart enough to avoid the divides?
> 
> Yes, it is.  I verified that.

Ok, great! Thanks again! (Although I probably need to put rewriting the
the big x86/include/asm/timer.h comment on my TODO, since its not as
relevant anymore)

Acked-by: John Stultz <john.stultz@linaro.org>

Ingo/Thomas, this should probably go tip/timers/urgent & be marked for
stable. I can queue it and send you a pull request, but it might be just
easier to just pick up the patch.

thanks
-john


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] sched, x86: fix overflow in cyc2ns_offset
@ 2012-03-10  0:41 Salman Qazi
  0 siblings, 0 replies; 10+ messages in thread
From: Salman Qazi @ 2012-03-10  0:41 UTC (permalink / raw)
  To: john stultz; +Cc: Ingo Molnar, Paul Turner, Peter Zijlstra, LKML

When a machine boots up, the TSC generally gets reset.  However, when
kexec is used to boot into a kernel, the TSC value would be carried
over from the previous kernel.  The computation of cycns_offset in
set_cyc2ns_scale is prone to an overflow, if the machine has been up
more than 208 days prior to the kexec.  The overflow happens when
we multiply *scale, even though there is enough room to store the
final answer.  We fix this issue by decomposing tsc_now into the
quotient and remainder of division by CYC2NS_SCALE_FACTOR and then
performing the multiplication separately on the two components.

Refactor code to share the calculation with the previous
fix in __cycles_2_ns.

Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Salman Qazi <sqazi@google.com>
---
 arch/x86/include/asm/timer.h |    8 ++------
 arch/x86/kernel/tsc.c        |    3 ++-
 include/linux/kernel.h       |   13 +++++++++++++
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index 431793e..34baa0e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
 
 static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
 {
-	unsigned long long quot;
-	unsigned long long rem;
 	int cpu = smp_processor_id();
 	unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
-	quot = (cyc >> CYC2NS_SCALE_FACTOR);
-	rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
-	ns += quot * per_cpu(cyc2ns, cpu) +
-		((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
+	ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
+			(1UL << CYC2NS_SCALE_FACTOR));
 	return ns;
 }
 
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..183c592 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	if (cpu_khz) {
 		*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
-		*offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+		*offset = ns_now - mult_frac(tsc_now, *scale,
+					     (1UL << CYC2NS_SCALE_FACTOR));
 	}
 
 	sched_clock_idle_wakeup_event(0);
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e834342..d801acb 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -85,6 +85,19 @@
 }							\
 )
 
+/*
+ * Multiplies an integer by a fraction, while avoiding unnecessary
+ * overflow or loss of precision.
+ */
+#define mult_frac(x, numer, denom)(			\
+{							\
+	typeof(x) quot = (x) / (denom);			\
+	typeof(x) rem  = (x) % (denom);			\
+	(quot * (numer)) + ((rem * (numer)) / (denom));	\
+}							\
+)
+
+
 #define _RET_IP_		(unsigned long)__builtin_return_address(0)
 #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
 


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-03-10  0:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-08 23:23 [PATCH] sched, x86: fix overflow in cyc2ns_offset Salman Qazi
2012-03-09 19:23 ` john stultz
2012-03-09 23:58   ` Salman Qazi
2012-03-10  0:00     ` Salman Qazi
2012-03-10  0:22       ` Paul Turner
2012-03-10  0:29         ` Salman Qazi
2012-03-10  0:25       ` john stultz
2012-03-10  0:28         ` Salman Qazi
2012-03-10  0:35           ` john stultz
  -- strict thread matches above, loose matches on Subject: below --
2012-03-10  0:41 Salman Qazi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox