* [RFC][PATCH 0/4] new human-time soft-timer subsystem
@ 2005-07-14 20:26 Nishanth Aravamudan
2005-07-14 20:28 ` [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs Nishanth Aravamudan
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-14 20:26 UTC (permalink / raw)
To: linux-kernel
On 14.07.2005 [12:18:41 -0700], john stultz wrote:
<snip>
> Nish has some code, which I hope he'll be sending out shortly that
> does just this, converting the soft-timer subsystem to use absolute
> time instead of ticks for expiration. I feel it both simplifies the
> code and makes it easier to change the timer interrupt frequency
> while the system is running.
Here's the set of patches John promised :)
1/4: add jiffies conversion helper functions
2/4: core human-time modifications to soft-timer subsystem
3/4: add new human-time schedule_timeout() functions
4/4: rework sys_nanosleep() to use schedule_timeout_nsecs()
The individual patches have more details, but the gist is this:
We no longer use jiffies (the variable) as the basis for determining
what "time" a timer should expire or when it should be added. Instead,
we use a new function, do_monotonic_clock(), which is simply a wrapper
for getnstimeofday(). That is to say, we use uptime in nanoseconds. But,
to avoid modifying the existing soft-timer algorithm, we convert the
64-bit nanosecond value to "timerinterval" units. These units are simply
2^TIMERINTERVAL_BITS nanoseconds in length (thus determined at compile
time).
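As a rough user-space illustration of the unit (a sketch only, not code
from the patches): the helper names below are made up for the example,
and TIMERINTERVAL_BITS is assumed to be 19, the value patch 2/4 uses.

```c
#include <stdint.h>

/* Assumed value, taken from patch 2/4 of this series. */
#define TIMERINTERVAL_BITS 19

/* Length of one timerinterval in nanoseconds: 2^19 = 524288 ns. */
static inline uint64_t timerinterval_nsecs(void)
{
	return UINT64_C(1) << TIMERINTERVAL_BITS;
}

/* Whole timerintervals contained in a 64-bit nanosecond value. */
static inline uint64_t nsecs_to_timerintervals(uint64_t nsecs)
{
	return nsecs >> TIMERINTERVAL_BITS;
}
```

With 19 bits, one timerinterval is a little over half a millisecond, so
one second of uptime spans 1907 whole timerintervals.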
To sum up, soft-timers now use time (as defined by the
timeofday-subsystem) not ticks. Hopefully, the individual
e-mails/patches make this change clear.
Thanks,
Nish
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
@ 2005-07-14 20:28 ` Nishanth Aravamudan
2005-07-14 20:54 ` Dave Hansen
2005-07-14 20:40 ` [RFC][PATCH 2/4] human-time soft-timer core changes Nishanth Aravamudan
` (3 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-14 20:28 UTC (permalink / raw)
To: linux-kernel
From: Nishanth Aravamudan <nacc@us.ibm.com>
Description: Add a jiffies_to_nsecs() helper function. Make consistent
the size of microseconds (unsigned long) throughout the conversion
functions.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
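Illustration only, not part of the patch: a user-space sketch of the
conversion jiffies_to_nsecs() performs. The name ticks_to_nsecs() and
the run-time hz parameter are assumptions made for the example; the
kernel helper picks its branch at compile time from the HZ constant.
hz is assumed to be non-zero.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical user-space analogue of jiffies_to_nsecs(): hz is a
 * run-time parameter here rather than the compile-time HZ constant.
 */
static uint64_t ticks_to_nsecs(unsigned long ticks, unsigned long hz)
{
	if (NSEC_PER_SEC % hz == 0)
		/* Exact case, e.g. hz = 100, 250 or 1000. */
		return (NSEC_PER_SEC / hz) * (uint64_t)ticks;

	/* General case: widen first so the multiply cannot wrap at 32 bits. */
	return ((uint64_t)ticks * NSEC_PER_SEC) / hz;
}
```

The cast to a 64-bit type before multiplying is the important part: at
hz = 1000, a 32-bit product would wrap after about 4295 ticks.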
jiffies.h | 15 +++++++++++++--
1 files changed, 13 insertions(+), 2 deletions(-)
diff -urpN 2.6.13-rc3-base/include/linux/jiffies.h 2.6.13-rc3-dev/include/linux/jiffies.h
--- 2.6.13-rc3-base/include/linux/jiffies.h 2005-03-01 23:37:31.000000000 -0800
+++ 2.6.13-rc3-dev/include/linux/jiffies.h 2005-07-14 12:43:44.000000000 -0700
@@ -263,7 +263,7 @@ static inline unsigned int jiffies_to_ms
#endif
}
-static inline unsigned int jiffies_to_usecs(const unsigned long j)
+static inline unsigned long jiffies_to_usecs(const unsigned long j)
{
#if HZ <= 1000000 && !(1000000 % HZ)
return (1000000 / HZ) * j;
@@ -274,6 +274,17 @@ static inline unsigned int jiffies_to_us
#endif
}
+static inline u64 jiffies_to_nsecs(const unsigned long j)
+{
+#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
+ return (NSEC_PER_SEC / HZ) * (u64)j;
+#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
+ return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
+#else
+ return ((u64)j * NSEC_PER_SEC) / HZ;
+#endif
+}
+
static inline unsigned long msecs_to_jiffies(const unsigned int m)
{
if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET))
@@ -287,7 +298,7 @@ static inline unsigned long msecs_to_jif
#endif
}
-static inline unsigned long usecs_to_jiffies(const unsigned int u)
+static inline unsigned long usecs_to_jiffies(const unsigned long u)
{
if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
return MAX_JIFFY_OFFSET;
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC][PATCH 2/4] human-time soft-timer core changes
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
2005-07-14 20:28 ` [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs Nishanth Aravamudan
@ 2005-07-14 20:40 ` Nishanth Aravamudan
2005-07-18 21:53 ` [RFC][UPDATE PATCH " Nishanth Aravamudan
2005-07-14 20:41 ` [RFC][PATCH 3/4] new human-time schedule_timeout() functions Nishanth Aravamudan
` (2 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-14 20:40 UTC (permalink / raw)
To: linux-kernel
From: Nishanth Aravamudan <nacc@us.ibm.com>
Description: The core revision to the soft-timer subsystem to divorce it
from the timer interrupt in software, i.e. jiffies. Instead, use
getnstimeofday() (via do_monotonic_clock()) as the basis for addition
and expiration of timers. Add a new unit, the timerinterval, which is
2^TIMERINTERVAL_BITS nanoseconds in length. The converted value in
timerintervals is used where we would have used the timer's expires
member before. Add set_timer_nsecs() and set_timer_on_nsecs() functions
to directly request nanosecond delays. These functions replace
add_timer(), mod_timer() and add_timer_on().
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
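Illustration only, not part of the patch: a user-space sketch of the
"ceiling when we add, floor when we expire" rule. The function names
are made up for the example, TIMERINTERVAL_BITS is assumed to be 19 as
in this patch, and the comparison ignores counter wraparound, which the
kernel handles with time_after_eq().

```c
#include <stdint.h>

#define TIMERINTERVAL_BITS 19

/* Round up: used when a timer is added. */
static unsigned long to_intervals_ceiling(uint64_t nsecs)
{
	if (nsecs == 0)
		return 0;	/* avoid wrapping nsecs - 1 */
	return (unsigned long)(((nsecs - 1) >> TIMERINTERVAL_BITS) + 1);
}

/* Round down: used when checking for expiry. */
static unsigned long to_intervals_floor(uint64_t nsecs)
{
	return (unsigned long)(nsecs >> TIMERINTERVAL_BITS);
}

/* A timer is due once floor(now) has reached ceiling(expiry). */
static int timer_is_due(uint64_t now_nsecs, uint64_t expires_nsecs)
{
	return to_intervals_floor(now_nsecs) >=
	       to_intervals_ceiling(expires_nsecs);
}
```

Because the expiry is rounded up and the current time is rounded down,
a timer reported as due has always reached its requested nanosecond
expiry; the cost is that it may fire up to one timerinterval late.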
include/linux/time.h | 1
include/linux/timer.h | 27 +-----
kernel/time.c | 18 ++++
kernel/timer.c | 215 +++++++++++++++++++++++++++++++++++++++++++++-----
4 files changed, 220 insertions(+), 41 deletions(-)
diff -urpN 2.6.13-rc3-base/include/linux/time.h 2.6.13-rc3-dev/include/linux/time.h
--- 2.6.13-rc3-base/include/linux/time.h 2005-03-01 23:38:12.000000000 -0800
+++ 2.6.13-rc3-dev/include/linux/time.h 2005-07-14 12:44:40.000000000 -0700
@@ -103,6 +103,7 @@ struct itimerval;
extern int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue);
extern int do_getitimer(int which, struct itimerval *value);
extern void getnstimeofday (struct timespec *tv);
+extern u64 do_monotonic_clock(void);
extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
diff -urpN 2.6.13-rc3-base/include/linux/timer.h 2.6.13-rc3-dev/include/linux/timer.h
--- 2.6.13-rc3-base/include/linux/timer.h 2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/timer.h 2005-07-14 12:44:40.000000000 -0700
@@ -11,6 +11,7 @@ struct timer_base_s;
struct timer_list {
struct list_head entry;
unsigned long expires;
+ u64 expires_nsecs;
unsigned long magic;
@@ -27,6 +28,7 @@ extern struct timer_base_s __init_timer_
#define TIMER_INITIALIZER(_function, _expires, _data) { \
.function = (_function), \
.expires = (_expires), \
+ .expires_nsecs = 0, \
.data = (_data), \
.base = &__init_timer_base, \
.magic = TIMER_MAGIC, \
@@ -51,30 +53,15 @@ static inline int timer_pending(const st
extern void add_timer_on(struct timer_list *timer, int cpu);
extern int del_timer(struct timer_list * timer);
-extern int __mod_timer(struct timer_list *timer, unsigned long expires);
+extern int __mod_timer(struct timer_list *timer);
extern int mod_timer(struct timer_list *timer, unsigned long expires);
+extern void add_timer(struct timer_list *timer);
+extern int set_timer_nsecs(struct timer_list *timer, u64 expires_nsecs);
+extern void set_timer_on_nsecs(struct timer_list *timer, u64 expires_nsecs,
+ int cpu);
extern unsigned long next_timer_interrupt(void);
-/***
- * add_timer - start a timer
- * @timer: the timer to be added
- *
- * The kernel will do a ->function(->data) callback from the
- * timer interrupt at the ->expired point in the future. The
- * current time is 'jiffies'.
- *
- * The timer's ->expired, ->function (and if the handler uses it, ->data)
- * fields must be set prior calling this function.
- *
- * Timers with an ->expired field in the past will be executed in the next
- * timer tick.
- */
-static inline void add_timer(struct timer_list * timer)
-{
- __mod_timer(timer, timer->expires);
-}
-
#ifdef CONFIG_SMP
extern int try_to_del_timer_sync(struct timer_list *timer);
extern int del_timer_sync(struct timer_list *timer);
diff -urpN 2.6.13-rc3-base/kernel/time.c 2.6.13-rc3-dev/kernel/time.c
--- 2.6.13-rc3-base/kernel/time.c 2005-07-13 15:51:57.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/time.c 2005-07-14 12:44:40.000000000 -0700
@@ -589,3 +589,21 @@ EXPORT_SYMBOL(get_jiffies_64);
#endif
EXPORT_SYMBOL(jiffies);
+
+u64 do_monotonic_clock(void)
+{
+ struct timespec now, now_w2m;
+ unsigned long seq;
+
+ getnstimeofday(&now);
+
+ do {
+ seq = read_seqbegin(&xtime_lock);
+ now_w2m = wall_to_monotonic;
+ } while (read_seqretry(&xtime_lock, seq));
+
+ return (u64)(now.tv_sec + now_w2m.tv_sec) * NSEC_PER_SEC +
+ (now.tv_nsec + now_w2m.tv_nsec);
+}
+
+EXPORT_SYMBOL_GPL(do_monotonic_clock);
diff -urpN 2.6.13-rc3-base/kernel/timer.c 2.6.13-rc3-dev/kernel/timer.c
--- 2.6.13-rc3-base/kernel/timer.c 2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/timer.c 2005-07-14 12:44:40.000000000 -0700
@@ -56,6 +56,15 @@ static void time_interpolator_update(lon
#define TVR_SIZE (1 << TVR_BITS)
#define TVN_MASK (TVN_SIZE - 1)
#define TVR_MASK (TVR_SIZE - 1)
+/*
+ * Modifying TIMERINTERVAL_BITS changes the software resolution of
+ * soft-timers. While 20 bits would be closer to a millisecond, there
+ * are performance gains from allowing a software resolution finer than
+ * the hardware (HZ=1000)
+ */
+#define TIMERINTERVAL_BITS 19
+#define TIMERINTERVAL_SIZE (1 << TIMERINTERVAL_BITS)
+#define TIMERINTERVAL_MASK (TIMERINTERVAL_SIZE - 1)
struct timer_base_s {
spinlock_t lock;
@@ -72,7 +81,7 @@ typedef struct tvec_root_s {
struct tvec_t_base_s {
struct timer_base_s t_base;
- unsigned long timer_jiffies;
+ unsigned long last_timer_time;
tvec_root_t tv1;
tvec_t tv2;
tvec_t tv3;
@@ -114,11 +123,88 @@ static inline void check_timer(struct ti
check_timer_failed(timer);
}
+/*
+ * nsecs_to_timerintervals_ceiling - convert nanoseconds to timerintervals
+ * @nsecs: number of nanoseconds to convert
+ *
+ * This is where changes to TIMERINTERVAL_BITS affect the soft-timer
+ * subsystem.
+ *
+ * Some explanation of the math is necessary:
+ * Rather than do decimal arithmetic, we shift for the sake of speed.
+ * This does mean that the actual requestable sleeps are
+ * 2^(sizeof(unsigned long)*8 - TIMERINTERVAL_BITS)
+ * timerintervals.
+ *
+ * The conditional takes care of the corner case where we request a 0
+ * nanosecond sleep; if the quantity were unsigned, we would not
+ * propagate the carry and force a wrap when adding the 1.
+ *
+ * To prevent timers from being expired early, we:
+ * Take the ceiling when we add; and
+ * Take the floor when we expire.
+ */
+static inline unsigned long nsecs_to_timerintervals_ceiling(u64 nsecs)
+{
+ if (nsecs)
+ return (unsigned long)(((nsecs - 1) >> TIMERINTERVAL_BITS) + 1);
+ else
+ return 0UL;
+}
+
+/*
+ * nsecs_to_timerintervals_floor - convert nanoseconds to timerintervals
+ * @nsecs: number of nanoseconds to convert
+ *
+ * This is where changes to TIMERINTERVAL_BITS affect the soft-timer
+ * subsystem.
+ *
+ * Some explanation of the math is necessary:
+ * Rather than do decimal arithmetic, we shift for the sake of speed.
+ * This does mean that the actual requestable sleeps are
+ * 2^(sizeof(unsigned long)*8 - TIMERINTERVAL_BITS)
+ *
+ * There is no special case for 0 in the floor function, since we do not
+ * do any subtraction or addition of 1
+ *
+ * To prevent timers from being expired early, we:
+ * Take the ceiling when we add; and
+ * Take the floor when we expire.
+ */
+static inline unsigned long nsecs_to_timerintervals_floor(u64 nsecs)
+{
+ return (unsigned long)(nsecs >> TIMERINTERVAL_BITS);
+}
+
+/*
+ * jiffies_to_timerintervals - convert absolute jiffies to timerintervals
+ * @abs_jiffies: number of jiffies to convert
+ *
+ * First, we convert the absolute jiffies parameter to a relative
+ * jiffies value. To maintain precision, we convert the relative
+ * jiffies value to a relative nanosecond value and then convert that
+ * to a relative soft-timer interval unit value. We then add this
+ * relative value to the current time according to the timeofday-
+ * subsystem, converted to soft-timer interval units.
+ *
+ * We only use this function when adding timers, so we are free to
+ * always use the ceiling version of nsecs_to_timerintervals.
+ *
+ * This function only exists to support deprecated interfaces. Once
+ * those interfaces have been converted to the alternatives, it should
+ * be removed.
+ */
+static inline unsigned long jiffies_to_timerintervals(unsigned long abs_jiffies)
+{
+ unsigned long relative_jiffies = abs_jiffies - jiffies;
+ return nsecs_to_timerintervals_ceiling(do_monotonic_clock() +
+ jiffies_to_nsecs(relative_jiffies));
+}
static void internal_add_timer(tvec_base_t *base, struct timer_list *timer)
{
- unsigned long expires = timer->expires;
- unsigned long idx = expires - base->timer_jiffies;
+ unsigned long expires = nsecs_to_timerintervals_ceiling(timer->expires_nsecs);
+ unsigned long idx = expires - base->last_timer_time;
struct list_head *vec;
if (idx < TVR_SIZE) {
@@ -138,7 +224,7 @@ static void internal_add_timer(tvec_base
* Can happen if you add a timer with expires == jiffies,
* or you set a timer to go off in the past
*/
- vec = base->tv1.vec + (base->timer_jiffies & TVR_MASK);
+ vec = base->tv1.vec + (base->last_timer_time & TVR_MASK);
} else {
int i;
/* If the timeout is larger than 0xffffffff on 64-bit
@@ -146,7 +232,7 @@ static void internal_add_timer(tvec_base
*/
if (idx > 0xffffffffUL) {
idx = 0xffffffffUL;
- expires = idx + base->timer_jiffies;
+ expires = idx + base->last_timer_time;
}
i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
vec = base->tv5.vec + i;
@@ -222,7 +308,7 @@ static timer_base_t *lock_timer_base(str
}
}
-int __mod_timer(struct timer_list *timer, unsigned long expires)
+int __mod_timer(struct timer_list *timer)
{
timer_base_t *base;
tvec_base_t *new_base;
@@ -261,7 +347,7 @@ int __mod_timer(struct timer_list *timer
}
}
- timer->expires = expires;
+ /* expires should be in timerintervals, and is currently ignored? */
internal_add_timer(new_base, timer);
spin_unlock_irqrestore(&new_base->t_base.lock, flags);
@@ -281,21 +367,50 @@ void add_timer_on(struct timer_list *tim
{
tvec_base_t *base = &per_cpu(tvec_bases, cpu);
unsigned long flags;
-
+
BUG_ON(timer_pending(timer) || !timer->function);
check_timer(timer);
spin_lock_irqsave(&base->t_base.lock, flags);
+ timer->expires_nsecs = do_monotonic_clock() +
+ jiffies_to_nsecs(timer->expires - jiffies);
timer->base = &base->t_base;
internal_add_timer(base, timer);
spin_unlock_irqrestore(&base->t_base.lock, flags);
}
+/***
+ * add_timer - start a timer
+ * @timer: the timer to be added
+ *
+ * The kernel will do a ->function(->data) callback from the
+ * timer interrupt at the ->expired point in the future. The
+ * current time is 'jiffies'.
+ *
+ * The timer's ->expired, ->function (and if the handler uses it, ->data)
+ * fields must be set prior calling this function.
+ *
+ * Timers with an ->expired field in the past will be executed in the next
+ * timer tick.
+ *
+ * The callers of add_timer() should be aware that the interface is now
+ * deprecated. set_timer_nsecs() is the single interface for adding and
+ * modifying timers.
+ */
+void add_timer(struct timer_list * timer)
+{
+ timer->expires_nsecs = do_monotonic_clock() +
+ jiffies_to_nsecs(timer->expires - jiffies);
+ __mod_timer(timer);
+}
+
+EXPORT_SYMBOL(add_timer);
/***
* mod_timer - modify a timer's timeout
* @timer: the timer to be modified
+ * @expires: absolute time, in jiffies, when timer should expire
*
* mod_timer is a more efficient way to update the expire field of an
* active timer (if the timer is inactive it will be activated)
@@ -311,6 +426,10 @@ void add_timer_on(struct timer_list *tim
* The function returns whether it has modified a pending timer or not.
* (ie. mod_timer() of an inactive timer returns 0, mod_timer() of an
* active timer returns 1.)
+ *
+ * The callers of mod_timer() should be aware that the interface is now
+ * deprecated. set_timer_nsecs() is the single interface for adding and
+ * modifying timers.
*/
int mod_timer(struct timer_list *timer, unsigned long expires)
{
@@ -318,6 +437,9 @@ int mod_timer(struct timer_list *timer,
check_timer(timer);
+ timer->expires_nsecs = do_monotonic_clock() +
+ jiffies_to_nsecs(expires - jiffies);
+
/*
* This is a common optimization triggered by the
* networking code - if the timer is re-modified
@@ -326,10 +448,56 @@ int mod_timer(struct timer_list *timer,
if (timer->expires == expires && timer_pending(timer))
return 1;
- return __mod_timer(timer, expires);
+ return __mod_timer(timer);
}
-EXPORT_SYMBOL(mod_timer);
+/*
+ * set_timer_nsecs - modify a timer's timeout in nsecs
+ * @timer: the timer to be modified
+ *
+ * set_timer_nsecs replaces both add_timer and mod_timer. The caller
+ * should call do_monotonic_clock() to determine the absolute timeout
+ * necessary.
+ */
+int set_timer_nsecs(struct timer_list *timer, u64 expires_nsecs)
+{
+ BUG_ON(!timer->function);
+
+ check_timer(timer);
+
+ if (timer_pending(timer) && timer->expires_nsecs == expires_nsecs)
+ return 1;
+
+ timer->expires_nsecs = expires_nsecs;
+
+ return __mod_timer(timer);
+}
+
+EXPORT_SYMBOL_GPL(set_timer_nsecs);
+
+/***
+ * set_timer_on_nsecs - start a timer on a particular CPU
+ * @timer: the timer to be added
+ * @expires_nsecs: absolute time in nsecs when timer should expire
+ * @cpu: the CPU to start it on
+ *
+ * This is not very scalable on SMP. Double adds are not possible.
+ */
+void set_timer_on_nsecs(struct timer_list *timer, u64 expires_nsecs, int cpu)
+{
+ tvec_base_t *base = &per_cpu(tvec_bases, cpu);
+ unsigned long flags;
+
+ BUG_ON(timer_pending(timer) || !timer->function);
+
+ check_timer(timer);
+
+ spin_lock_irqsave(&base->t_base.lock, flags);
+ timer->expires_nsecs = expires_nsecs;
+ timer->base = &base->t_base;
+ internal_add_timer(base, timer);
+ spin_unlock_irqrestore(&base->t_base.lock, flags);
+}
/***
* del_timer - deactive a timer.
@@ -455,17 +623,17 @@ static int cascade(tvec_base_t *base, tv
* This function cascades all vectors and executes all expired timer
* vectors.
*/
-#define INDEX(N) (base->timer_jiffies >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK
+#define INDEX(N) (base->last_timer_time >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK
-static inline void __run_timers(tvec_base_t *base)
+static inline void __run_timers(tvec_base_t *base, unsigned long current_timer_time)
{
struct timer_list *timer;
spin_lock_irq(&base->t_base.lock);
- while (time_after_eq(jiffies, base->timer_jiffies)) {
+ while (time_after_eq(current_timer_time, base->last_timer_time)) {
struct list_head work_list = LIST_HEAD_INIT(work_list);
struct list_head *head = &work_list;
- int index = base->timer_jiffies & TVR_MASK;
+ int index = base->last_timer_time & TVR_MASK;
/*
* Cascade timers:
@@ -475,7 +643,7 @@ static inline void __run_timers(tvec_bas
(!cascade(base, &base->tv3, INDEX(1))) &&
!cascade(base, &base->tv4, INDEX(2)))
cascade(base, &base->tv5, INDEX(3));
- ++base->timer_jiffies;
+ ++base->last_timer_time;
list_splice_init(base->tv1.vec + index, &work_list);
while (!list_empty(head)) {
void (*fn)(unsigned long);
@@ -524,20 +692,20 @@ unsigned long next_timer_interrupt(void)
base = &__get_cpu_var(tvec_bases);
spin_lock(&base->t_base.lock);
- expires = base->timer_jiffies + (LONG_MAX >> 1);
+ expires = base->last_timer_time + (LONG_MAX >> 1);
list = 0;
/* Look for timer events in tv1. */
- j = base->timer_jiffies & TVR_MASK;
+ j = base->last_timer_time & TVR_MASK;
do {
list_for_each_entry(nte, base->tv1.vec + j, entry) {
expires = nte->expires;
- if (j < (base->timer_jiffies & TVR_MASK))
+ if (j < (base->last_timer_time & TVR_MASK))
list = base->tv2.vec + (INDEX(0));
goto found;
}
j = (j + 1) & TVR_MASK;
- } while (j != (base->timer_jiffies & TVR_MASK));
+ } while (j != (base->last_timer_time & TVR_MASK));
/* Check tv2-tv5. */
varray[0] = &base->tv2;
@@ -910,10 +1078,15 @@ EXPORT_SYMBOL(xtime_lock);
*/
static void run_timer_softirq(struct softirq_action *h)
{
+ unsigned long current_timer_time;
tvec_base_t *base = &__get_cpu_var(tvec_bases);
- if (time_after_eq(jiffies, base->timer_jiffies))
- __run_timers(base);
+ /* cache the converted current time, rounding down */
+ current_timer_time =
+ nsecs_to_timerintervals_floor(do_monotonic_clock());
+
+ if (time_after_eq(current_timer_time, base->last_timer_time))
+ __run_timers(base, current_timer_time);
}
/*
* [RFC][PATCH 3/4] new human-time schedule_timeout() functions
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
2005-07-14 20:28 ` [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs Nishanth Aravamudan
2005-07-14 20:40 ` [RFC][PATCH 2/4] human-time soft-timer core changes Nishanth Aravamudan
@ 2005-07-14 20:41 ` Nishanth Aravamudan
2005-07-14 20:43 ` [RFC][PATCH 4/4] convert sys_nanosleep() to use set_timer_nsecs() Nishanth Aravamudan
2005-07-14 22:28 ` [RFC][PATCH 0/4] new human-time soft-timer subsystem Roman Zippel
4 siblings, 0 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-14 20:41 UTC (permalink / raw)
To: linux-kernel
From: Nishanth Aravamudan <nacc@us.ibm.com>
Description: Add new human-time schedule_timeout() style functions,
along with the appropriate constants/prototypes.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
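Illustration only, not part of the patch: a user-space sketch of the
round-up conversion applied to the remaining time before it is handed
back in coarser units. The helper name is hypothetical. The explicit
check for 0 is an assumption about the intended behaviour (a timer that
expired on time reports 0 remaining); it is there because subtracting
one from an unsigned 0 would wrap.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/*
 * Hypothetical helper: convert remaining nanoseconds to microseconds,
 * rounding up so a caller never sees less time left than there is.
 */
static unsigned long remaining_nsecs_to_usecs(uint64_t nsecs)
{
	if (nsecs == 0)
		return 0;	/* nsecs - 1 would wrap for an unsigned 0 */
	return (unsigned long)((nsecs - 1) / NSEC_PER_USEC) + 1UL;
}
```

The same shape works for milliseconds by dividing by NSEC_PER_MSEC
instead.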
include/linux/sched.h | 7 ++
include/linux/time.h | 4 +
kernel/timer.c | 147 ++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 158 insertions(+)
diff -urpN 2.6.13-rc3-base/include/linux/sched.h 2.6.13-rc3-dev/include/linux/sched.h
--- 2.6.13-rc3-base/include/linux/sched.h 2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/sched.h 2005-07-14 12:45:15.000000000 -0700
@@ -182,7 +182,14 @@ extern void scheduler_tick(void);
extern int in_sched_functions(unsigned long addr);
#define MAX_SCHEDULE_TIMEOUT LONG_MAX
+#define MAX_SCHEDULE_TIMEOUT_NSECS ((u64)(-1))
+#define MAX_SCHEDULE_TIMEOUT_USECS ULONG_MAX
+#define MAX_SCHEDULE_TIMEOUT_MSECS UINT_MAX
+
extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern u64 FASTCALL(schedule_timeout_nsecs(u64 timeout_nsecs));
+extern unsigned long FASTCALL(schedule_timeout_usecs(unsigned long timeout_usecs));
+extern unsigned int FASTCALL(schedule_timeout_msecs(unsigned int timeout_msecs));
asmlinkage void schedule(void);
struct namespace;
diff -urpN 2.6.13-rc3-base/include/linux/time.h 2.6.13-rc3-dev/include/linux/time.h
--- 2.6.13-rc3-base/include/linux/time.h 2005-07-14 12:45:07.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/time.h 2005-07-14 12:45:15.000000000 -0700
@@ -36,6 +36,10 @@ struct timezone {
#define NSEC_PER_SEC (1000000000L)
#endif
+#ifndef NSEC_PER_MSEC
+#define NSEC_PER_MSEC (1000000L)
+#endif
+
#ifndef NSEC_PER_USEC
#define NSEC_PER_USEC (1000L)
#endif
diff -urpN 2.6.13-rc3-base/kernel/timer.c 2.6.13-rc3-dev/kernel/timer.c
--- 2.6.13-rc3-base/kernel/timer.c 2005-07-14 12:45:07.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/timer.c 2005-07-14 12:45:15.000000000 -0700
@@ -1271,6 +1271,10 @@ static void process_timeout(unsigned lon
* value will be %MAX_SCHEDULE_TIMEOUT.
*
* In all cases the return value is guaranteed to be non-negative.
+ *
+ * The callers of schedule_timeout() should be aware that the interface
+ * is now deprecated. schedule_timeout_{msecs,usecs,nsecs}() are now the
+ * interfaces for relative timeout requests.
*/
fastcall signed long __sched schedule_timeout(signed long timeout)
{
@@ -1326,6 +1330,149 @@ fastcall signed long __sched schedule_ti
EXPORT_SYMBOL(schedule_timeout);
+/**
+ * schedule_timeout_nsecs - sleep until timeout
+ * @timeout_nsecs: timeout value in nanoseconds
+ *
+ * Make the current task sleep until @timeout_nsecs nsecs have
+ * elapsed. The routine will return immediately unless
+ * the current task state has been set (see set_current_state()).
+ *
+ * You can set the task state as follows -
+ *
+ * %TASK_UNINTERRUPTIBLE - at least @timeout_nsecs nsecs are guaranteed
+ * to pass before the routine returns. The routine will return 0
+ *
+ * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
+ * delivered to the current task. In this case the remaining time
+ * in nsecs will be returned, or 0 if the timer expired in time
+ *
+ * The current task state is guaranteed to be TASK_RUNNING when this
+ * routine returns.
+ *
+ * Specifying a @timeout_nsecs value of %MAX_SCHEDULE_TIMEOUT_NSECS will
+ * schedule the CPU away without a bound on the timeout. In this case
+ * the return value will be %MAX_SCHEDULE_TIMEOUT_NSECS.
+ */
+fastcall u64 __sched schedule_timeout_nsecs(u64 timeout_nsecs)
+{
+ struct timer_list timer;
+ u64 expires;
+
+ if (timeout_nsecs == MAX_SCHEDULE_TIMEOUT_NSECS) {
+ schedule();
+ goto out;
+ }
+
+ expires = do_monotonic_clock() + timeout_nsecs;
+
+ init_timer(&timer);
+ timer.data = (unsigned long) current;
+ timer.function = process_timeout;
+
+ set_timer_nsecs(&timer, expires);
+ schedule();
+ del_singleshot_timer_sync(&timer);
+
+ timeout_nsecs = do_monotonic_clock();
+ if (expires < timeout_nsecs)
+ timeout_nsecs = (u64)0UL;
+ else
+ timeout_nsecs = expires - timeout_nsecs;
+out:
+ return timeout_nsecs;
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_nsecs);
+
+/**
+ * schedule_timeout_usecs - sleep until timeout
+ * @timeout_usecs: timeout value in microseconds
+ *
+ * Make the current task sleep until @timeout_usecs usecs have
+ * elapsed. The routine will return immediately unless
+ * the current task state has been set (see set_current_state()).
+ *
+ * You can set the task state as follows -
+ *
+ * %TASK_UNINTERRUPTIBLE - at least @timeout_usecs usecs are guaranteed
+ * to pass before the routine returns. The routine will return 0
+ *
+ * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
+ * delivered to the current task. In this case the remaining time
+ * in usecs will be returned, or 0 if the timer expired in time
+ *
+ * The current task state is guaranteed to be TASK_RUNNING when this
+ * routine returns.
+ *
+ * Specifying a @timeout_usecs value of %MAX_SCHEDULE_TIMEOUT_USECS will
+ * schedule the CPU away without a bound on the timeout. In this case
+ * the return value will be %MAX_SCHEDULE_TIMEOUT_USECS.
+ */
+fastcall inline unsigned long __sched schedule_timeout_usecs(unsigned long timeout_usecs)
+{
+ u64 timeout_nsecs;
+
+ if (timeout_usecs == MAX_SCHEDULE_TIMEOUT_USECS)
+ timeout_nsecs = MAX_SCHEDULE_TIMEOUT_NSECS;
+ else
+ timeout_nsecs = timeout_usecs * (u64)NSEC_PER_USEC;
+ /*
+ * Make sure to round up by subtracting one before division and
+ * adding one after
+ */
+ timeout_nsecs = schedule_timeout_nsecs(timeout_nsecs) - 1;
+ do_div(timeout_nsecs, NSEC_PER_USEC);
+ timeout_usecs = (unsigned long)timeout_nsecs + 1UL;
+ return timeout_usecs;
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_usecs);
+
+/**
+ * schedule_timeout_msecs - sleep until timeout
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * Make the current task sleep until @timeout_msecs msecs have
+ * elapsed. The routine will return immediately unless
+ * the current task state has been set (see set_current_state()).
+ *
+ * You can set the task state as follows -
+ *
+ * %TASK_UNINTERRUPTIBLE - at least @timeout_msecs msecs are guaranteed
+ * to pass before the routine returns. The routine will return 0
+ *
+ * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
+ * delivered to the current task. In this case the remaining time
+ * in msecs will be returned, or 0 if the timer expired in time
+ *
+ * The current task state is guaranteed to be TASK_RUNNING when this
+ * routine returns.
+ *
+ * Specifying a @timeout_msecs value of %MAX_SCHEDULE_TIMEOUT_MSECS will
+ * schedule the CPU away without a bound on the timeout. In this case
+ * the return value will be %MAX_SCHEDULE_TIMEOUT_MSECS.
+ */
+fastcall inline unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
+{
+ u64 timeout_nsecs;
+
+ if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS)
+ timeout_nsecs = MAX_SCHEDULE_TIMEOUT_NSECS;
+ else
+ timeout_nsecs = timeout_msecs * (u64)NSEC_PER_MSEC;
+ /*
+ * Make sure to round up by subtracting one before division and
+ * adding one after
+ */
+ timeout_nsecs = schedule_timeout_nsecs(timeout_nsecs) - 1;
+ do_div(timeout_nsecs, NSEC_PER_MSEC);
+ timeout_msecs = (unsigned int)timeout_nsecs + 1;
+ return timeout_msecs;
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs);
+
/* Thread ID - the internal kernel "pid" */
asmlinkage long sys_gettid(void)
{
* [RFC][PATCH 4/4] convert sys_nanosleep() to use set_timer_nsecs()
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
` (2 preceding siblings ...)
2005-07-14 20:41 ` [RFC][PATCH 3/4] new human-time schedule_timeout() functions Nishanth Aravamudan
@ 2005-07-14 20:43 ` Nishanth Aravamudan
2005-07-14 22:28 ` [RFC][PATCH 0/4] new human-time soft-timer subsystem Roman Zippel
4 siblings, 0 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-14 20:43 UTC (permalink / raw)
To: linux-kernel
From: Nishanth Aravamudan <nacc@us.ibm.com>
Description: Add timespec and timeval conversion functions for
nanoseconds. Convert sys_nanosleep() to use schedule_timeout_nsecs().
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
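Illustration only, not part of the patch: a user-space sketch of the
timespec/nanosecond round trip that sys_nanosleep() relies on after
this change. The function names are made up for the example, and plain
64-bit division stands in for the kernel's div_long_long_rem().

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* User-space analogue of timespec_to_nsecs() from the patch. */
static uint64_t ts_to_nsecs(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}

/*
 * User-space analogue of nsecs_to_timespec(): the quotient becomes
 * tv_sec and the remainder becomes tv_nsec.
 */
static struct timespec nsecs_to_ts(uint64_t ns)
{
	struct timespec ts;

	ts.tv_sec = (time_t)(ns / NSEC_PER_SEC);
	ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
	return ts;
}
```

The sketch assumes the timespec has already been validated the way
sys_nanosleep() does it (non-negative fields, tv_nsec below one
second), so the unsigned casts are safe.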
include/linux/time.h | 33 +++++++++++++++++++++++++++++++++
kernel/timer.c | 24 ++++++++++++------------
2 files changed, 45 insertions(+), 12 deletions(-)
diff -urpN 2.6.13-rc3-base/include/linux/time.h 2.6.13-rc3-dev/include/linux/time.h
--- 2.6.13-rc3-base/include/linux/time.h 2005-07-14 12:46:46.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/time.h 2005-07-14 12:48:25.000000000 -0700
@@ -2,6 +2,7 @@
#define _LINUX_TIME_H
#include <linux/types.h>
+#include <asm/div64.h>
#ifdef __KERNEL__
#include <linux/seqlock.h>
@@ -126,6 +127,38 @@ set_normalized_timespec (struct timespec
ts->tv_nsec = nsec;
}
+/* Inline helper functions */
+static inline struct timeval nsecs_to_timeval(u64 ns)
+{
+ struct timeval tv;
+ tv.tv_sec = div_long_long_rem(ns, NSEC_PER_SEC, &tv.tv_usec);
+ tv.tv_usec = (tv.tv_usec + NSEC_PER_USEC/2) / NSEC_PER_USEC;
+ return tv;
+}
+
+static inline struct timespec nsecs_to_timespec(u64 ns)
+{
+ struct timespec ts;
+ ts.tv_sec = div_long_long_rem(ns, NSEC_PER_SEC, &ts.tv_nsec);
+ return ts;
+}
+
+static inline u64 timespec_to_nsecs(struct timespec* ts)
+{
+ u64 ret;
+ ret = ((u64)ts->tv_sec) * NSEC_PER_SEC;
+ ret += (u64)ts->tv_nsec;
+ return ret;
+}
+
+static inline u64 timeval_to_nsecs(struct timeval* tv)
+{
+ u64 ret;
+ ret = ((u64)tv->tv_sec) * NSEC_PER_SEC;
+ ret += ((u64)tv->tv_usec) * NSEC_PER_USEC;
+ return ret;
+}
+
#endif /* __KERNEL__ */
#define NFDBITS __NFDBITS
diff -urpN 2.6.13-rc3-base/kernel/timer.c 2.6.13-rc3-dev/kernel/timer.c
--- 2.6.13-rc3-base/kernel/timer.c 2005-07-14 12:46:46.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/timer.c 2005-07-14 12:48:25.000000000 -0700
@@ -1481,21 +1481,21 @@ asmlinkage long sys_gettid(void)
static long __sched nanosleep_restart(struct restart_block *restart)
{
- unsigned long expire = restart->arg0, now = jiffies;
+ u64 expire = restart->arg0, now = do_monotonic_clock();
struct timespec __user *rmtp = (struct timespec __user *) restart->arg1;
long ret;
/* Did it expire while we handled signals? */
- if (!time_after(expire, now))
+ if (now > expire)
return 0;
- current->state = TASK_INTERRUPTIBLE;
- expire = schedule_timeout(expire - now);
+ set_current_state(TASK_INTERRUPTIBLE);
+ expire = schedule_timeout_nsecs(expire - now);
ret = 0;
if (expire) {
struct timespec t;
- jiffies_to_timespec(expire, &t);
+ t = nsecs_to_timespec(expire);
ret = -ERESTART_RESTARTBLOCK;
if (rmtp && copy_to_user(rmtp, &t, sizeof(t)))
@@ -1508,7 +1508,7 @@ static long __sched nanosleep_restart(st
asmlinkage long sys_nanosleep(struct timespec __user *rqtp, struct timespec __user *rmtp)
{
struct timespec t;
- unsigned long expire;
+ u64 expire;
long ret;
if (copy_from_user(&t, rqtp, sizeof(t)))
@@ -1517,20 +1517,20 @@ asmlinkage long sys_nanosleep(struct tim
if ((t.tv_nsec >= 1000000000L) || (t.tv_nsec < 0) || (t.tv_sec < 0))
return -EINVAL;
- expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
- current->state = TASK_INTERRUPTIBLE;
- expire = schedule_timeout(expire);
+ expire = timespec_to_nsecs(&t);
+ set_current_state(TASK_INTERRUPTIBLE);
+ expire = schedule_timeout_nsecs(expire);
ret = 0;
if (expire) {
struct restart_block *restart;
- jiffies_to_timespec(expire, &t);
+ t = nsecs_to_timespec(expire);
if (rmtp && copy_to_user(rmtp, &t, sizeof(t)))
return -EFAULT;
restart = &current_thread_info()->restart_block;
restart->fn = nanosleep_restart;
- restart->arg0 = jiffies + expire;
+ restart->arg0 = do_monotonic_clock() + expire;
restart->arg1 = (unsigned long) rmtp;
ret = -ERESTART_RESTARTBLOCK;
}
@@ -1642,7 +1642,7 @@ static void __devinit init_timers_cpu(in
for (j = 0; j < TVR_SIZE; j++)
INIT_LIST_HEAD(base->tv1.vec + j);
- base->timer_jiffies = jiffies;
+ base->last_timer_time = 0UL;
}
#ifdef CONFIG_HOTPLUG_CPU
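As a userspace illustration of the conversion helpers in the patch above, the following is a minimal sketch and not the kernel code. The struct and function names are hypothetical stand-ins for struct timespec, struct timeval and the nsecs_to_*() helpers. The carry step in the timeval conversion is an assumption added here: rounding to the nearest microsecond can otherwise yield a microsecond field of 1000000 when the remainder is within 500 ns of the next second.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC  1000000000ULL
#define SKETCH_NSEC_PER_USEC 1000ULL
#define SKETCH_USEC_PER_SEC  1000000LL

/* Hypothetical stand-ins for struct timespec and struct timeval. */
struct ts_sketch { int64_t sec; int64_t nsec; };
struct tv_sketch { int64_t sec; int64_t usec; };

/* Seconds + nanoseconds to a single 64-bit nanosecond count. */
static uint64_t ts_to_nsecs(const struct ts_sketch *ts)
{
	/* Same validity rule sys_nanosleep() applies before converting. */
	assert(ts->sec >= 0 && ts->nsec >= 0 &&
	       (uint64_t)ts->nsec < SKETCH_NSEC_PER_SEC);
	return (uint64_t)ts->sec * SKETCH_NSEC_PER_SEC + (uint64_t)ts->nsec;
}

/* 64-bit nanosecond count back to seconds + nanoseconds. */
static struct ts_sketch nsecs_to_ts(uint64_t ns)
{
	struct ts_sketch ts;

	ts.sec = (int64_t)(ns / SKETCH_NSEC_PER_SEC);
	ts.nsec = (int64_t)(ns % SKETCH_NSEC_PER_SEC);
	return ts;
}

/* Round to the nearest microsecond, carrying into seconds if needed. */
static struct tv_sketch nsecs_to_tv(uint64_t ns)
{
	struct tv_sketch tv;
	uint64_t rem = ns % SKETCH_NSEC_PER_SEC;

	tv.sec = (int64_t)(ns / SKETCH_NSEC_PER_SEC);
	tv.usec = (int64_t)((rem + SKETCH_NSEC_PER_USEC / 2) /
			    SKETCH_NSEC_PER_USEC);
	if (tv.usec >= SKETCH_USEC_PER_SEC) {	/* assumed carry step */
		tv.usec -= SKETCH_USEC_PER_SEC;
		tv.sec++;
	}
	return tv;
}
```

The remaining sleep reported to userspace in the patch is this kind of round trip: a relative nanosecond count converted back to seconds and nanoseconds.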
* Re: [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
2005-07-14 20:28 ` [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs Nishanth Aravamudan
@ 2005-07-14 20:54 ` Dave Hansen
2005-07-14 21:03 ` Nishanth Aravamudan
0 siblings, 1 reply; 12+ messages in thread
From: Dave Hansen @ 2005-07-14 20:54 UTC (permalink / raw)
To: Nishanth Aravamudan; +Cc: Linux Kernel Mailing List
On Thu, 2005-07-14 at 13:28 -0700, Nishanth Aravamudan wrote:
> +static inline u64 jiffies_to_nsecs(const unsigned long j)
> +{
> +#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
> + return (NSEC_PER_SEC / HZ) * (u64)j;
> +#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
> + return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> +#else
> + return ((u64)j * NSEC_PER_SEC) / HZ;
> +#endif
> +}
That might look a little better something like:
static inline u64 jiffies_to_nsecs(const unsigned long __j)
{
u64 j = __j;
if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
return (NSEC_PER_SEC / HZ) * j;
else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
else
return (j * NSEC_PER_SEC) / HZ;
}
Compilers are smart :)
-- Dave
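The point about plain if () on constants can be tried outside the kernel. The sketch below assumes a tick rate of 250 (SKETCH_HZ is a hypothetical stand-in for HZ) and keeps only the two branches whose divisors are non-zero for that rate. Because every operand of the condition is a compile-time constant, an optimizing compiler is free to drop the branch that is not taken.

```c
#include <stdint.h>

#define SKETCH_HZ           250ULL		/* assumed tick rate */
#define SKETCH_NSEC_PER_SEC 1000000000ULL

static inline uint64_t sketch_jiffies_to_nsecs(unsigned long j)
{
	/* Constant condition: resolved at compile time, like #if. */
	if (SKETCH_NSEC_PER_SEC % SKETCH_HZ == 0)
		return (SKETCH_NSEC_PER_SEC / SKETCH_HZ) * (uint64_t)j;
	else
		return ((uint64_t)j * SKETCH_NSEC_PER_SEC) / SKETCH_HZ;
}
```

With these assumed values one tick is 4000000 ns, so 250 ticks come out as exactly one second.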
* Re: [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
2005-07-14 20:54 ` Dave Hansen
@ 2005-07-14 21:03 ` Nishanth Aravamudan
2005-07-15 12:14 ` Pavel Machek
0 siblings, 1 reply; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-14 21:03 UTC (permalink / raw)
To: Dave Hansen; +Cc: Linux Kernel Mailing List
On 14.07.2005 [13:54:47 -0700], Dave Hansen wrote:
> On Thu, 2005-07-14 at 13:28 -0700, Nishanth Aravamudan wrote:
> > +static inline u64 jiffies_to_nsecs(const unsigned long j)
> > +{
> > +#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
> > + return (NSEC_PER_SEC / HZ) * (u64)j;
> > +#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
> > + return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > +#else
> > + return ((u64)j * NSEC_PER_SEC) / HZ;
> > +#endif
> > +}
>
> That might look a little better something like:
>
> static inline u64 jiffies_to_nsecs(const unsigned long __j)
> {
> u64 j = __j;
>
> if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
> return (NSEC_PER_SEC / HZ) * j;
> else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
> return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> else
> return (j * NSEC_PER_SEC) / HZ;
> }
>
> Compilers are smart :)
Well, I was trying to keep it similar to the other conversion functions.
I guess the compiler can evaluate the conditional full of constants at
compile-time regardless of whether it is #if or if ().
I can make these changes if others would like them as well.
Thanks,
Nish
* Re: [RFC][PATCH 0/4] new human-time soft-timer subsystem
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
` (3 preceding siblings ...)
2005-07-14 20:43 ` [RFC][PATCH 4/4] convert sys_nanosleep() to use set_timer_nsecs() Nishanth Aravamudan
@ 2005-07-14 22:28 ` Roman Zippel
2005-07-17 0:53 ` Nishanth Aravamudan
4 siblings, 1 reply; 12+ messages in thread
From: Roman Zippel @ 2005-07-14 22:28 UTC (permalink / raw)
To: Nishanth Aravamudan; +Cc: linux-kernel
Hi,
On Thu, 14 Jul 2005, Nishanth Aravamudan wrote:
> We no longer use jiffies (the variable) as the basis for determining
> what "time" a timer should expire or when it should be added. Instead,
> we use a new function, do_monotonic_clock(), which is simply a wrapper
> for getnstimeofday().
And suddenly a simple 32bit integer becomes a complex 64bit integer, which
requires hardware access to read a timer and additional conversion into ns.
Why is suddenly everyone so obsessed with molesting something simple and
cute as jiffies?
bye, Roman
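For readers who want to see what such a read involves, here is a userspace analogue only, assuming POSIX clock_gettime() with CLOCK_MONOTONIC. It is not the kernel's do_monotonic_clock(); the function name is hypothetical. It shows the two steps under discussion: reading a clock, then folding seconds and nanoseconds into one 64-bit value.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/*
 * Returns the monotonic clock as a single 64-bit nanosecond count,
 * or 0 if the clock cannot be read.
 */
static uint64_t sketch_monotonic_nsecs(void)
{
	struct timespec now;

	if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
		return 0;
	return (uint64_t)now.tv_sec * SKETCH_NSEC_PER_SEC +
	       (uint64_t)now.tv_nsec;
}
```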
* Re: [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
2005-07-14 21:03 ` Nishanth Aravamudan
@ 2005-07-15 12:14 ` Pavel Machek
2005-07-17 0:44 ` Nishanth Aravamudan
0 siblings, 1 reply; 12+ messages in thread
From: Pavel Machek @ 2005-07-15 12:14 UTC (permalink / raw)
To: Nishanth Aravamudan; +Cc: Dave Hansen, Linux Kernel Mailing List
Hi!
> > > +static inline u64 jiffies_to_nsecs(const unsigned long j)
> > > +{
> > > +#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
> > > + return (NSEC_PER_SEC / HZ) * (u64)j;
> > > +#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
> > > + return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > > +#else
> > > + return ((u64)j * NSEC_PER_SEC) / HZ;
> > > +#endif
> > > +}
> >
> > That might look a little better something like:
> >
> > static inline u64 jiffies_to_nsecs(const unsigned long __j)
> > {
> > u64 j = __j;
> >
> > if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
> > return (NSEC_PER_SEC / HZ) * j;
> > else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
> > return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > else
> > return (j * NSEC_PER_SEC) / HZ;
> > }
> >
> > Compilers are smart :)
>
> Well, I was trying to keep it similar to the other conversion functions.
> I guess the compiler can evaluate the conditional full of constants at
> compile-time regardless of whether it is #if or if ().
>
> I can make these changes if others would like them as well.
Yes, please. And feel free to convert nearby functions, too ;-).
Pavel
--
teflon -- maybe it is a trademark, but it should not be.
* Re: [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
2005-07-15 12:14 ` Pavel Machek
@ 2005-07-17 0:44 ` Nishanth Aravamudan
0 siblings, 0 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-17 0:44 UTC (permalink / raw)
To: Pavel Machek; +Cc: Dave Hansen, Linux Kernel Mailing List
On 15.07.2005 [14:14:25 +0200], Pavel Machek wrote:
> Hi!
>
> > > > +static inline u64 jiffies_to_nsecs(const unsigned long j)
> > > > +{
> > > > +#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
> > > > + return (NSEC_PER_SEC / HZ) * (u64)j;
> > > > +#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
> > > > + return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > > > +#else
> > > > + return ((u64)j * NSEC_PER_SEC) / HZ;
> > > > +#endif
> > > > +}
> > >
> > > That might look a little better something like:
> > >
> > > static inline u64 jiffies_to_nsecs(const unsigned long __j)
> > > {
> > > u64 j = __j;
> > >
> > > if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
> > > return (NSEC_PER_SEC / HZ) * j;
> > > else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
> > > return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > > else
> > > return (j * NSEC_PER_SEC) / HZ;
> > > }
> > >
> > > Compilers are smart :)
> >
> > Well, I was trying to keep it similar to the other conversion functions.
> > I guess the compiler can evaluate the conditional full of constants at
> > compile-time regardless of whether it is #if or if ().
> >
> > I can make these changes if others would like them as well.
>
> Yes, please. And feel free to convert nearby functions, too ;-).
I have a patch to make this change for all the jiffies <--> human-time
functions, but have a problem. I noticed that these functions, in the
if/else form (as opposed to #if/#else) will warn about division-by-zero
problems, as (HZ / MSEC_PER_SEC), (HZ / USEC_PER_SEC) & (HZ /
NSEC_PER_SEC) are all 0 if HZ < 1000 (which, of course, is the default
now :) ). Any suggestions? Just leave the functions as is? Even then,
I'm going to update this patch to use USEC_PER_SEC and MSEC_PER_SEC in
the other conversion functions like I use NSEC_PER_SEC in the first
version.
Thanks,
Nish
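The warning described above comes from the middle branch: with integer division, HZ / NSEC_PER_SEC is 0 whenever HZ is below NSEC_PER_SEC, and the compiler still sees the constant division even though the branch is dead. One possible workaround, sketched here as an assumption and not something proposed in this thread, is to substitute a divisor of 1 when the quotient is zero. SKETCH_HZ and the other names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_HZ           250ULL		/* assumed tick rate */
#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Integer quotient: 0 for any tick rate below one per nanosecond. */
#define SKETCH_HZ_PER_NSEC  (SKETCH_HZ / SKETCH_NSEC_PER_SEC)
/* Divisor that is never the constant zero, even in a dead branch. */
#define SKETCH_SAFE_DIV \
	(SKETCH_HZ_PER_NSEC ? SKETCH_HZ_PER_NSEC : 1ULL)

static inline uint64_t sketch_jiffies_to_nsecs_guarded(unsigned long j)
{
	if (SKETCH_HZ <= SKETCH_NSEC_PER_SEC &&
	    !(SKETCH_NSEC_PER_SEC % SKETCH_HZ))
		return (SKETCH_NSEC_PER_SEC / SKETCH_HZ) * (uint64_t)j;
	else if (SKETCH_HZ > SKETCH_NSEC_PER_SEC &&
		 !(SKETCH_HZ % SKETCH_NSEC_PER_SEC))
		/* Dead for the assumed rate, but free of a zero divisor. */
		return ((uint64_t)j + SKETCH_SAFE_DIV - 1) / SKETCH_SAFE_DIV;
	else
		return ((uint64_t)j * SKETCH_NSEC_PER_SEC) / SKETCH_HZ;
}
```

The guard only matters for the branch that can never run at the assumed rate; keeping the #if form avoids the question entirely, at the cost of the readability Dave was after.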
* Re: [RFC][PATCH 0/4] new human-time soft-timer subsystem
2005-07-14 22:28 ` [RFC][PATCH 0/4] new human-time soft-timer subsystem Roman Zippel
@ 2005-07-17 0:53 ` Nishanth Aravamudan
0 siblings, 0 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-17 0:53 UTC (permalink / raw)
To: Roman Zippel; +Cc: linux-kernel
On 15.07.2005 [00:28:44 +0200], Roman Zippel wrote:
> Hi,
>
> On Thu, 14 Jul 2005, Nishanth Aravamudan wrote:
>
> > We no longer use jiffies (the variable) as the basis for determining
> > what "time" a timer should expire or when it should be added. Instead,
> > we use a new function, do_monotonic_clock(), which is simply a wrapper
> > for getnstimeofday().
>
> And suddenly a simple 32bit integer becomes a complex 64bit integer, which
> requires hardware access to read a timer and additional conversion into ns.
> Why is suddenly everyone so obsessed with molesting something simple and
> cute as jiffies?
Thanks for the feedback, Roman. I know the 64-bit operations are
critical from a performance perspective and may be excessive from a
pragmatic perspective. Maybe an alternative would be to only provide
*microsecond* resolution in the software, which I currently assume is
storable in an unsigned long (a little over an hour?). We could then
provide a supplemental interface for those sleeps which would exceed
this time, either via looping or a 64-bit parameter for this special
interface.
Would that perhaps be a better alternative from the 64-bit perspective?
We could do this one better, perhaps, by basically doing exactly what
jiffies does now, but storing a time value (in microseconds) instead of
a count of the number of ticks (jiffies' current interpretation). This
would perhaps be a 64-bit op, but that is currently the case with
jiffies_64++ (or jiffies_64 += jiffies_increment). I will work on some
patches to do something to this effect and will bring it up during the
time/timer talk (Saturday at 13h30).
Thanks again,
Nish
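The "little over an hour" estimate can be checked with a few lines of arithmetic. The sketch below assumes a 32-bit unsigned long counting microseconds; the function name is hypothetical. 2^32 microseconds is about 4294.97 seconds, or roughly 71.6 minutes.

```c
#include <stdint.h>

#define SKETCH_USEC_PER_SEC 1000000ULL

/*
 * Largest whole number of seconds a counter of the given bit width
 * can hold when each count is one microsecond.
 */
static uint64_t max_seconds_in_usec_counter(unsigned int bits)
{
	uint64_t max = (bits >= 64) ? UINT64_MAX : ((1ULL << bits) - 1);

	return max / SKETCH_USEC_PER_SEC;
}
```

For 32 bits this gives 4294 whole seconds, i.e. 71 minutes and 34 seconds, so any longer sleep would need the supplemental interface mentioned above.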
* [RFC][UPDATE PATCH 2/4] human-time soft-timer core changes
2005-07-14 20:40 ` [RFC][PATCH 2/4] human-time soft-timer core changes Nishanth Aravamudan
@ 2005-07-18 21:53 ` Nishanth Aravamudan
0 siblings, 0 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-18 21:53 UTC (permalink / raw)
To: linux-kernel
On 14.07.2005 [13:40:11 -0700], Nishanth Aravamudan wrote:
> From: Nishanth Aravamudan <nacc@us.ibm.com>
>
> Description: The core revision to the soft-timer subsystem to divorce it
> from the timer interrupt in software, i.e. jiffies. Instead, use
> getnstimeofday() (via do_monotonic_clock()) as the basis for addition
> and expiration of timers. Add a new unit, the timerinterval, which is
> 2^TIMERINTERVAL_BITS nanoseconds in length. The converted value in
> timerintervals is used where we would have used the timer's expires
> member before. Add set_timer_nsecs() and set_timer_nsecs_on() functions
> to directly request nanosecond delays. These functions replace
> add_timer(), mod_timer() and add_timer_on().
>
> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Sigh, one version of my development patches removed the export of
mod_timer(). Of course, I forgot to revert that hunk before sending it
out. If anyone (maybe not likely) is testing out these patches, please
use this version.
Thanks,
Nish
---
include/linux/time.h | 1
include/linux/timer.h | 27 +-----
kernel/time.c | 18 ++++
kernel/timer.c | 215 +++++++++++++++++++++++++++++++++++++++++++++-----
4 files changed, 221 insertions(+), 40 deletions(-)
diff -urpN 2.6.13-rc3-base/include/linux/time.h 2.6.13-rc3-dev/include/linux/time.h
--- 2.6.13-rc3-base/include/linux/time.h 2005-03-01 23:38:12.000000000 -0800
+++ 2.6.13-rc3-dev/include/linux/time.h 2005-07-14 12:44:40.000000000 -0700
@@ -103,6 +103,7 @@ struct itimerval;
extern int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue);
extern int do_getitimer(int which, struct itimerval *value);
extern void getnstimeofday (struct timespec *tv);
+extern u64 do_monotonic_clock(void);
extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
diff -urpN 2.6.13-rc3-base/include/linux/timer.h 2.6.13-rc3-dev/include/linux/timer.h
--- 2.6.13-rc3-base/include/linux/timer.h 2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/timer.h 2005-07-14 12:44:40.000000000 -0700
@@ -11,6 +11,7 @@ struct timer_base_s;
struct timer_list {
struct list_head entry;
unsigned long expires;
+ u64 expires_nsecs;
unsigned long magic;
@@ -27,6 +28,7 @@ extern struct timer_base_s __init_timer_
#define TIMER_INITIALIZER(_function, _expires, _data) { \
.function = (_function), \
.expires = (_expires), \
+ .expires_nsecs = 0, \
.data = (_data), \
.base = &__init_timer_base, \
.magic = TIMER_MAGIC, \
@@ -51,30 +53,15 @@ static inline int timer_pending(const st
extern void add_timer_on(struct timer_list *timer, int cpu);
extern int del_timer(struct timer_list * timer);
-extern int __mod_timer(struct timer_list *timer, unsigned long expires);
+extern int __mod_timer(struct timer_list *timer);
extern int mod_timer(struct timer_list *timer, unsigned long expires);
+extern void add_timer(struct timer_list *timer);
+extern int set_timer_nsecs(struct timer_list *timer, u64 expires_nsecs);
+extern void set_timer_on_nsecs(struct timer_list *timer, u64 expires_nsecs,
+ int cpu);
extern unsigned long next_timer_interrupt(void);
-/***
- * add_timer - start a timer
- * @timer: the timer to be added
- *
- * The kernel will do a ->function(->data) callback from the
- * timer interrupt at the ->expired point in the future. The
- * current time is 'jiffies'.
- *
- * The timer's ->expired, ->function (and if the handler uses it, ->data)
- * fields must be set prior calling this function.
- *
- * Timers with an ->expired field in the past will be executed in the next
- * timer tick.
- */
-static inline void add_timer(struct timer_list * timer)
-{
- __mod_timer(timer, timer->expires);
-}
-
#ifdef CONFIG_SMP
extern int try_to_del_timer_sync(struct timer_list *timer);
extern int del_timer_sync(struct timer_list *timer);
diff -urpN 2.6.13-rc3-base/kernel/time.c 2.6.13-rc3-dev/kernel/time.c
--- 2.6.13-rc3-base/kernel/time.c 2005-07-13 15:51:57.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/time.c 2005-07-14 12:44:40.000000000 -0700
@@ -589,3 +589,21 @@ EXPORT_SYMBOL(get_jiffies_64);
#endif
EXPORT_SYMBOL(jiffies);
+
+u64 do_monotonic_clock(void)
+{
+ struct timespec now, now_w2m;
+ unsigned long seq;
+
+ getnstimeofday(&now);
+
+ do {
+ seq = read_seqbegin(&xtime_lock);
+ now_w2m = wall_to_monotonic;
+ } while (read_seqretry(&xtime_lock, seq));
+
+ return (u64)(now.tv_sec + now_w2m.tv_sec) * NSEC_PER_SEC +
+ (now.tv_nsec + now_w2m.tv_nsec);
+}
+
+EXPORT_SYMBOL_GPL(do_monotonic_clock);
diff -urpN 2.6.13-rc3-base/kernel/timer.c 2.6.13-rc3-dev/kernel/timer.c
--- 2.6.13-rc3-base/kernel/timer.c 2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/timer.c 2005-07-14 12:44:40.000000000 -0700
@@ -56,6 +56,15 @@ static void time_interpolator_update(lon
#define TVR_SIZE (1 << TVR_BITS)
#define TVN_MASK (TVN_SIZE - 1)
#define TVR_MASK (TVR_SIZE - 1)
+/*
+ * Modifying TIMERINTERVAL_BITS changes the software resolution of
+ * soft-timers. While 20 bits would be closer to a millisecond, there
+ * are performance gains from allowing a software resolution finer than
+ * the hardware (HZ=1000)
+ */
+#define TIMERINTERVAL_BITS 19
+#define TIMERINTERVAL_SIZE (1 << TIMERINTERVAL_BITS)
+#define TIMERINTERVAL_MASK (TIMERINTERVAL_SIZE - 1)
struct timer_base_s {
spinlock_t lock;
@@ -72,7 +81,7 @@ typedef struct tvec_root_s {
struct tvec_t_base_s {
struct timer_base_s t_base;
- unsigned long timer_jiffies;
+ unsigned long last_timer_time;
tvec_root_t tv1;
tvec_t tv2;
tvec_t tv3;
@@ -114,11 +123,88 @@ static inline void check_timer(struct ti
check_timer_failed(timer);
}
+/*
+ * nsecs_to_timerintervals_ceiling - convert nanoseconds to timerintervals
+ * @nsecs: number of nanoseconds to convert
+ *
+ * This is where changes to TIMERINTERVAL_BITS affect the soft-timer
+ * subsystem.
+ *
+ * Some explanation of the math is necessary:
+ * Rather than do decimal arithmetic, we shift for the sake of speed.
+ * This does mean that the actual requestable sleeps are
+ * 2^(sizeof(unsigned long)*8 - TIMERINTERVAL_BITS)
+ * timerintervals.
+ *
+ * The conditional takes care of the corner case where we request a 0
+ * nanosecond sleep; if the quantity were unsigned, we would not
+ * propagate the carry and force a wrap when adding the 1.
+ *
+ * To prevent timers from being expired early, we:
+ * Take the ceiling when we add; and
+ * Take the floor when we expire.
+ */
+static inline unsigned long nsecs_to_timerintervals_ceiling(u64 nsecs)
+{
+ if (nsecs)
+ return (unsigned long)(((nsecs - 1) >> TIMERINTERVAL_BITS) + 1);
+ else
+ return 0UL;
+}
+
+/*
+ * nsecs_to_timerintervals_floor - convert nanoseconds to timerintervals
+ * @nsecs: number of nanoseconds to convert
+ *
+ * This is where changes to TIMERINTERVAL_BITS affect the soft-timer
+ * subsystem.
+ *
+ * Some explanation of the math is necessary:
+ * Rather than do decimal arithmetic, we shift for the sake of speed.
+ * This does mean that the actual requestable sleeps are
+ * 2^(sizeof(unsigned long)*8 - TIMERINTERVAL_BITS)
+ *
+ * There is no special case for 0 in the floor function, since we do not
+ * do any subtraction or addition of 1
+ *
+ * To prevent timers from being expired early, we:
+ * Take the ceiling when we add; and
+ * Take the floor when we expire.
+ */
+static inline unsigned long nsecs_to_timerintervals_floor(u64 nsecs)
+{
+ return (unsigned long)(nsecs >> TIMERINTERVAL_BITS);
+}
+
+/*
+ * jiffies_to_timerintervals - convert absolute jiffies to timerintervals
+ * @abs_jiffies: number of jiffies to convert
+ *
+ * First, we convert the absolute jiffies parameter to a relative
+ * jiffies value. To maintain precision, we convert the relative
+ * jiffies value to a relative nanosecond value and then convert that
+ * to a relative soft-timer interval unit value. We then add this
+ * relative value to the current time according to the timeofday-
+ * subsystem, converted to soft-timer interval units.
+ *
+ * We only use this function when adding timers, so we are free to
+ * always use the ceiling version of nsecs_to_timerintervals.
+ *
+ * This function only exists to support deprecated interfaces. Once
+ * those interfaces have been converted to the alternatives, it should
+ * be removed.
+ */
+static inline unsigned long jiffies_to_timerintervals(unsigned long abs_jiffies)
+{
+ unsigned long relative_jiffies = abs_jiffies - jiffies;
+ return nsecs_to_timerintervals_ceiling(do_monotonic_clock() +
+ jiffies_to_nsecs(relative_jiffies));
+}
static void internal_add_timer(tvec_base_t *base, struct timer_list *timer)
{
- unsigned long expires = timer->expires;
- unsigned long idx = expires - base->timer_jiffies;
+ unsigned long expires = nsecs_to_timerintervals_ceiling(timer->expires_nsecs);
+ unsigned long idx = expires - base->last_timer_time;
struct list_head *vec;
if (idx < TVR_SIZE) {
@@ -138,7 +224,7 @@ static void internal_add_timer(tvec_base
* Can happen if you add a timer with expires == jiffies,
* or you set a timer to go off in the past
*/
- vec = base->tv1.vec + (base->timer_jiffies & TVR_MASK);
+ vec = base->tv1.vec + (base->last_timer_time & TVR_MASK);
} else {
int i;
/* If the timeout is larger than 0xffffffff on 64-bit
@@ -146,7 +232,7 @@ static void internal_add_timer(tvec_base
*/
if (idx > 0xffffffffUL) {
idx = 0xffffffffUL;
- expires = idx + base->timer_jiffies;
+ expires = idx + base->last_timer_time;
}
i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
vec = base->tv5.vec + i;
@@ -222,7 +308,7 @@ static timer_base_t *lock_timer_base(str
}
}
-int __mod_timer(struct timer_list *timer, unsigned long expires)
+int __mod_timer(struct timer_list *timer)
{
timer_base_t *base;
tvec_base_t *new_base;
@@ -261,7 +347,7 @@ int __mod_timer(struct timer_list *timer
}
}
- timer->expires = expires;
+ /* expires should be in timerintervals, and is currently ignored? */
internal_add_timer(new_base, timer);
spin_unlock_irqrestore(&new_base->t_base.lock, flags);
@@ -281,21 +367,50 @@ void add_timer_on(struct timer_list *tim
{
tvec_base_t *base = &per_cpu(tvec_bases, cpu);
unsigned long flags;
-
+
BUG_ON(timer_pending(timer) || !timer->function);
check_timer(timer);
spin_lock_irqsave(&base->t_base.lock, flags);
+ timer->expires_nsecs = do_monotonic_clock() +
+ jiffies_to_nsecs(timer->expires - jiffies);
timer->base = &base->t_base;
internal_add_timer(base, timer);
spin_unlock_irqrestore(&base->t_base.lock, flags);
}
+/***
+ * add_timer - start a timer
+ * @timer: the timer to be added
+ *
+ * The kernel will do a ->function(->data) callback from the
+ * timer interrupt at the ->expired point in the future. The
+ * current time is 'jiffies'.
+ *
+ * The timer's ->expired, ->function (and if the handler uses it, ->data)
+ * fields must be set prior calling this function.
+ *
+ * Timers with an ->expired field in the past will be executed in the next
+ * timer tick.
+ *
+ * The callers of add_timer() should be aware that the interface is now
+ * deprecated. set_timer_nsecs() is the single interface for adding and
+ * modifying timers.
+ */
+void add_timer(struct timer_list * timer)
+{
+ timer->expires_nsecs = do_monotonic_clock() +
+ jiffies_to_nsecs(timer->expires - jiffies);
+ __mod_timer(timer);
+}
+
+EXPORT_SYMBOL(add_timer);
/***
* mod_timer - modify a timer's timeout
* @timer: the timer to be modified
+ * @expires: absolute time, in jiffies, when timer should expire
*
* mod_timer is a more efficient way to update the expire field of an
* active timer (if the timer is inactive it will be activated)
@@ -311,6 +426,10 @@ void add_timer_on(struct timer_list *tim
* The function returns whether it has modified a pending timer or not.
* (ie. mod_timer() of an inactive timer returns 0, mod_timer() of an
* active timer returns 1.)
+ *
+ * The callers of mod_timer() should be aware that the interface is now
+ * deprecated. set_timer_nsecs() is the single interface for adding and
+ * modifying timers.
*/
int mod_timer(struct timer_list *timer, unsigned long expires)
{
@@ -318,6 +437,9 @@ int mod_timer(struct timer_list *timer,
check_timer(timer);
+ timer->expires_nsecs = do_monotonic_clock() +
+ jiffies_to_nsecs(expires - jiffies);
+
/*
* This is a common optimization triggered by the
* networking code - if the timer is re-modified
@@ -326,10 +448,56 @@ int mod_timer(struct timer_list *timer,
if (timer->expires == expires && timer_pending(timer))
return 1;
- return __mod_timer(timer, expires);
+ return __mod_timer(timer);
}
EXPORT_SYMBOL(mod_timer);
+
+/*
+ * set_timer_nsecs - modify a timer's timeout in nsecs
+ * @timer: the timer to be modified
+ *
+ * set_timer_nsecs replaces both add_timer and mod_timer. The caller
+ * should call do_monotonic_clock() to determine the absolute timeout
+ * necessary.
+ */
+int set_timer_nsecs(struct timer_list *timer, u64 expires_nsecs)
+{
+ BUG_ON(!timer->function);
+
+ check_timer(timer);
+
+ if (timer_pending(timer) && timer->expires_nsecs == expires_nsecs)
+ return 1;
+
+ timer->expires_nsecs = expires_nsecs;
+
+ return __mod_timer(timer);
+}
+
+EXPORT_SYMBOL_GPL(set_timer_nsecs);
+
+/***
+ * set_timer_on_nsecs - start a timer on a particular CPU
+ * @timer: the timer to be added
+ * @expires_nsecs: absolute time in nsecs when timer should expire
+ * @cpu: the CPU to start it on
+ *
+ * This is not very scalable on SMP. Double adds are not possible.
+ */
+void set_timer_on_nsecs(struct timer_list *timer, u64 expires_nsecs, int cpu)
+{
+ tvec_base_t *base = &per_cpu(tvec_bases, cpu);
+ unsigned long flags;
+
+ BUG_ON(timer_pending(timer) || !timer->function);
+
+ check_timer(timer);
+
+ spin_lock_irqsave(&base->t_base.lock, flags);
+ timer->expires_nsecs = expires_nsecs;
+ timer->base = &base->t_base;
+ internal_add_timer(base, timer);
+ spin_unlock_irqrestore(&base->t_base.lock, flags);
+}
/***
* del_timer - deactive a timer.
@@ -455,17 +623,17 @@ static int cascade(tvec_base_t *base, tv
* This function cascades all vectors and executes all expired timer
* vectors.
*/
-#define INDEX(N) (base->timer_jiffies >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK
+#define INDEX(N) (base->last_timer_time >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK
-static inline void __run_timers(tvec_base_t *base)
+static inline void __run_timers(tvec_base_t *base, unsigned long current_timer_time)
{
struct timer_list *timer;
spin_lock_irq(&base->t_base.lock);
- while (time_after_eq(jiffies, base->timer_jiffies)) {
+ while (time_after_eq(current_timer_time, base->last_timer_time)) {
struct list_head work_list = LIST_HEAD_INIT(work_list);
struct list_head *head = &work_list;
- int index = base->timer_jiffies & TVR_MASK;
+ int index = base->last_timer_time & TVR_MASK;
/*
* Cascade timers:
@@ -475,7 +643,7 @@ static inline void __run_timers(tvec_bas
(!cascade(base, &base->tv3, INDEX(1))) &&
!cascade(base, &base->tv4, INDEX(2)))
cascade(base, &base->tv5, INDEX(3));
- ++base->timer_jiffies;
+ ++base->last_timer_time;
list_splice_init(base->tv1.vec + index, &work_list);
while (!list_empty(head)) {
void (*fn)(unsigned long);
@@ -524,20 +692,20 @@ unsigned long next_timer_interrupt(void)
base = &__get_cpu_var(tvec_bases);
spin_lock(&base->t_base.lock);
- expires = base->timer_jiffies + (LONG_MAX >> 1);
+ expires = base->last_timer_time + (LONG_MAX >> 1);
list = 0;
/* Look for timer events in tv1. */
- j = base->timer_jiffies & TVR_MASK;
+ j = base->last_timer_time & TVR_MASK;
do {
list_for_each_entry(nte, base->tv1.vec + j, entry) {
expires = nte->expires;
- if (j < (base->timer_jiffies & TVR_MASK))
+ if (j < (base->last_timer_time & TVR_MASK))
list = base->tv2.vec + (INDEX(0));
goto found;
}
j = (j + 1) & TVR_MASK;
- } while (j != (base->timer_jiffies & TVR_MASK));
+ } while (j != (base->last_timer_time & TVR_MASK));
/* Check tv2-tv5. */
varray[0] = &base->tv2;
@@ -910,10 +1078,15 @@ EXPORT_SYMBOL(xtime_lock);
*/
static void run_timer_softirq(struct softirq_action *h)
{
+ unsigned long current_timer_time;
tvec_base_t *base = &__get_cpu_var(tvec_bases);
- if (time_after_eq(jiffies, base->timer_jiffies))
- __run_timers(base);
+ /* cache the converted current time, rounding down */
+ current_timer_time =
+ nsecs_to_timerintervals_floor(do_monotonic_clock());
+
+ if (time_after_eq(current_timer_time, base->last_timer_time))
+ __run_timers(base, current_timer_time);
}
/*
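The ceiling-on-add, floor-on-expire rule from the patch comments can be exercised in isolation. This is a userspace sketch that mirrors the two shift-based helpers under hypothetical names, assuming the same TIMERINTERVAL_BITS value of 19, so one timerinterval is 2^19 = 524288 ns (about 0.52 ms).

```c
#include <stdint.h>

#define SKETCH_TIMERINTERVAL_BITS 19	/* 2^19 ns per timerinterval */

/* Round up: used when a timer is added, so it cannot be queued early. */
static unsigned long sketch_intervals_ceiling(uint64_t nsecs)
{
	if (nsecs)
		return (unsigned long)
			(((nsecs - 1) >> SKETCH_TIMERINTERVAL_BITS) + 1);
	return 0UL;
}

/* Round down: used for the current time when timers are expired. */
static unsigned long sketch_intervals_floor(uint64_t nsecs)
{
	return (unsigned long)(nsecs >> SKETCH_TIMERINTERVAL_BITS);
}
```

A timer due at 524289 ns lands in interval 2, while a current time of 524289 ns floors to interval 1, so the timer is not run until the clock has actually passed its expiry.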
end of thread, other threads:[~2005-07-18 21:53 UTC | newest]
Thread overview: 12+ messages
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
2005-07-14 20:28 ` [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs Nishanth Aravamudan
2005-07-14 20:54 ` Dave Hansen
2005-07-14 21:03 ` Nishanth Aravamudan
2005-07-15 12:14 ` Pavel Machek
2005-07-17 0:44 ` Nishanth Aravamudan
2005-07-14 20:40 ` [RFC][PATCH 2/4] human-time soft-timer core changes Nishanth Aravamudan
2005-07-18 21:53 ` [RFC][UPDATE PATCH " Nishanth Aravamudan
2005-07-14 20:41 ` [RFC][PATCH 3/4] new human-time schedule_timeout() functions Nishanth Aravamudan
2005-07-14 20:43 ` [RFC][PATCH 4/4] convert sys_nanosleep() to use set_timer_nsecs() Nishanth Aravamudan
2005-07-14 22:28 ` [RFC][PATCH 0/4] new human-time soft-timer subsystem Roman Zippel
2005-07-17 0:53 ` Nishanth Aravamudan