All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Paul Turner <pjt@google.com>,
	Richard Cochran <richardcochran@gmail.com>,
	Prarit Bhargava <prarit@redhat.com>,
	Aaron Jacobs <jacobsa@google.com>, Andrew Hunter <ahh@google.com>,
	John Stultz <john.stultz@linaro.org>,
	Ben Hutchings <ben@decadent.org.uk>
Subject: [PATCH 3.16 26/26] jiffies: Fix timeval conversion to jiffies
Date: Tue,  7 Oct 2014 16:19:42 -0700	[thread overview]
Message-ID: <20141007231849.265808421@linuxfoundation.org> (raw)
In-Reply-To: <20141007231848.487609292@linuxfoundation.org>

3.16-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrew Hunter <ahh@google.com>

commit d78c9300c51d6ceed9f6d078d4e9366f259de28c upstream.

timeval_to_jiffies tried to round a timeval up to an integral number
of jiffies, but the logic for doing so was incorrect: intervals
corresponding to exactly N jiffies would become N+1. This manifested
itself particularly repeatedly stopping/starting an itimer:

setitimer(ITIMER_PROF, &val, NULL);
setitimer(ITIMER_PROF, NULL, &val);

would add a full tick to val, _even if it was exactly representable in
terms of jiffies_ (say, the result of a previous rounding.)  Doing
this repeatedly would cause unbounded growth in val.  So fix the math.

Here's what was wrong with the conversion: we essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:

jiffies = (usec * x) >> USEC_JIFFIE_SC

and rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value; that is, instead of dividing by (1 usec/ 1 jiffie), we
effectively divided by (1 usec/1 jiffie)-epsilon (rounding
down). This would normally be fine, but we want to round timeouts up,
and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
would be fine if our division was exact, but dividing this by the
slightly smaller factor was equivalent to adding just _over_ 1 to the
final result (instead of just _under_ 1, as desired.)

In particular, with HZ=1000, we consistently computed that 10000 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC.

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec->nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.

Tested: the following program:

int main() {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 10000;
  setitimer(ITIMER_PROF, &initial, NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i < 10; ++i) {
    struct itimerval prev;
    setitimer(ITIMER_PROF, &zero, &prev);
    /* on old kernels, this goes up by TICK_USEC every iteration */
    printf("previous value: %ld %ld %ld %ld\n",
           prev.it_interval.tv_sec, prev.it_interval.tv_usec,
           prev.it_value.tv_sec, prev.it_value.tv_usec);
    setitimer(ITIMER_PROF, &prev, NULL);
  }
    return 0;
}


Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Paul Turner <pjt@google.com>
Reported-by: Aaron Jacobs <jacobsa@google.com>
Signed-off-by: Andrew Hunter <ahh@google.com>
[jstultz: Tweaked to apply to 3.17-rc]
Signed-off-by: John Stultz <john.stultz@linaro.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/jiffies.h |   12 ----------
 kernel/time.c           |   54 ++++++++++++++++++++++++++----------------------
 2 files changed, 30 insertions(+), 36 deletions(-)

--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
 #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
 #endif
 #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
-#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
 #define SEC_CONVERSION ((unsigned long)((((u64)NSEC_PER_SEC << SEC_JIFFIE_SC) +\
                                 TICK_NSEC -1) / (u64)TICK_NSEC))
 
 #define NSEC_CONVERSION ((unsigned long)((((u64)1 << NSEC_JIFFIE_SC) +\
                                         TICK_NSEC -1) / (u64)TICK_NSEC))
-#define USEC_CONVERSION  \
-                    ((unsigned long)((((u64)NSEC_PER_USEC << USEC_JIFFIE_SC) +\
-                                        TICK_NSEC -1) / (u64)TICK_NSEC))
-/*
- * USEC_ROUND is used in the timeval to jiffie conversion.  See there
- * for more details.  It is the scaled resolution rounding value.  Note
- * that it is a 64-bit value.  Since, when it is applied, we are already
- * in jiffies (albit scaled), it is nothing but the bits we will shift
- * off.
- */
-#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1)
 /*
  * The maximum jiffie value is (MAX_INT >> 1).  Here we translate that
  * into seconds.  The 64-bit case will overflow if we are not careful,
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies);
  * that a remainder subtract here would not do the right thing as the
  * resolution values don't fall on second boundries.  I.e. the line:
  * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding.
+ * Note that due to the small error in the multiplier here, this
+ * rounding is incorrect for sufficiently large values of tv_nsec, but
+ * well formed timespecs should have tv_nsec < NSEC_PER_SEC, so we're
+ * OK.
  *
  * Rather, we just shift the bits off the right.
  *
  * The >> (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec
  * value to a scaled second value.
  */
-unsigned long
-timespec_to_jiffies(const struct timespec *value)
+static unsigned long
+__timespec_to_jiffies(unsigned long sec, long nsec)
 {
-	unsigned long sec = value->tv_sec;
-	long nsec = value->tv_nsec + TICK_NSEC - 1;
+	nsec = nsec + TICK_NSEC - 1;
 
 	if (sec >= MAX_SEC_IN_JIFFIES){
 		sec = MAX_SEC_IN_JIFFIES;
@@ -517,6 +520,13 @@ timespec_to_jiffies(const struct timespe
 		 (NSEC_JIFFIE_SC - SEC_JIFFIE_SC))) >> SEC_JIFFIE_SC;
 
 }
+
+unsigned long
+timespec_to_jiffies(const struct timespec *value)
+{
+	return __timespec_to_jiffies(value->tv_sec, value->tv_nsec);
+}
+
 EXPORT_SYMBOL(timespec_to_jiffies);
 
 void
@@ -533,31 +543,27 @@ jiffies_to_timespec(const unsigned long
 }
 EXPORT_SYMBOL(jiffies_to_timespec);
 
-/* Same for "timeval"
+/*
+ * We could use a similar algorithm to timespec_to_jiffies (with a
+ * different multiplier for usec instead of nsec). But this has a
+ * problem with rounding: we can't exactly add TICK_NSEC - 1 to the
+ * usec value, since it's not necessarily integral.
+ *
+ * We could instead round in the intermediate scaled representation
+ * (i.e. in units of 1/2^(large scale) jiffies) but that's also
+ * perilous: the scaling introduces a small positive error, which
+ * combined with a division-rounding-upward (i.e. adding 2^(scale) - 1
+ * units to the intermediate before shifting) leads to accidental
+ * overflow and overestimates.
  *
- * Well, almost.  The problem here is that the real system resolution is
- * in nanoseconds and the value being converted is in micro seconds.
- * Also for some machines (those that use HZ = 1024, in-particular),
- * there is a LARGE error in the tick size in microseconds.
-
- * The solution we use is to do the rounding AFTER we convert the
- * microsecond part.  Thus the USEC_ROUND, the bits to be shifted off.
- * Instruction wise, this should cost only an additional add with carry
- * instruction above the way it was done above.
+ * At the cost of one additional multiplication by a constant, just
+ * use the timespec implementation.
  */
 unsigned long
 timeval_to_jiffies(const struct timeval *value)
 {
-	unsigned long sec = value->tv_sec;
-	long usec = value->tv_usec;
-
-	if (sec >= MAX_SEC_IN_JIFFIES){
-		sec = MAX_SEC_IN_JIFFIES;
-		usec = 0;
-	}
-	return (((u64)sec * SEC_CONVERSION) +
-		(((u64)usec * USEC_CONVERSION + USEC_ROUND) >>
-		 (USEC_JIFFIE_SC - SEC_JIFFIE_SC))) >> SEC_JIFFIE_SC;
+	return __timespec_to_jiffies(value->tv_sec,
+				     value->tv_usec * NSEC_PER_USEC);
 }
 EXPORT_SYMBOL(timeval_to_jiffies);
 



  parent reply	other threads:[~2014-10-07 23:36 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-07 23:19 [PATCH 3.16 00/26] 3.16.5-stable review Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 01/26] udf: Avoid infinite loop when processing indirect ICBs Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 02/26] ASoC: ssm2602: do not hardcode type to SSM2602 Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 03/26] ASoC: core: fix possible ZERO_SIZE_PTR pointer dereferencing error Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 04/26] perf: fix perf bug in fork() Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 05/26] mm: memcontrol: do not iterate uninitialized memcgs Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 06/26] mm: migrate: Close race between migration completion and mprotect Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 07/26] i2c: qup: Fix order of runtime pm initialization Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 08/26] i2c: rk3x: fix 0 length write transfers Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 10/26] cpufreq: integrator: fix integrator_cpufreq_remove return type Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 11/26] cpufreq: pcc-cpufreq: Fix wait_event() under spinlock Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 12/26] md/raid5: disable DISCARD by default due to safety concerns Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 13/26] drm/i915: Flush the PTEs after updating them before suspend Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 14/26] Fix problem recognizing symlinks Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 15/26] init/Kconfig: Fix HAVE_FUTEX_CMPXCHG to not break up the EXPERT menu Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 16/26] ring-buffer: Fix infinite spin in reading buffer Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 17/26] uas: Only complain about missing sg if all other checks succeed Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 18/26] uas: Log a warning when we cannot use uas because the hcd lacks streams Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 19/26] uas: Disable uas on ASM1051 devices Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 20/26] uas: Add missing le16_to_cpu calls to asm1051 / asm1053 usb-id check Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 23/26] mm, thp: move invariant bug check out of loop in __split_huge_page_map Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 24/26] mm: numa: Do not mark PTEs pte_numa when splitting huge pages Greg Kroah-Hartman
2014-10-07 23:19 ` [PATCH 3.16 25/26] media: vb2: fix VBI/poll regression Greg Kroah-Hartman
2014-10-07 23:19 ` Greg Kroah-Hartman [this message]
2014-10-08  2:47 ` [PATCH 3.16 00/26] 3.16.5-stable review Guenter Roeck
2014-10-08  4:48   ` Greg Kroah-Hartman
2014-10-08 12:39     ` Satoru Takeuchi
2014-10-08 13:32       ` Greg Kroah-Hartman
2014-10-08 15:49         ` Guenter Roeck
2014-10-08 16:34           ` Greg Kroah-Hartman
2014-10-08 20:05 ` Shuah Khan
2014-10-08 22:48   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141007231849.265808421@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ahh@google.com \
    --cc=ben@decadent.org.uk \
    --cc=jacobsa@google.com \
    --cc=john.stultz@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pjt@google.com \
    --cc=prarit@redhat.com \
    --cc=richardcochran@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.