* [RFC][PATCH 0/4] new human-time soft-timer subsystem
@ 2005-07-14 20:26 Nishanth Aravamudan
From: Nishanth Aravamudan @ 2005-07-14 20:26 UTC
To: linux-kernel
On 14.07.2005 [12:18:41 -0700], john stultz wrote:
<snip>
> Nish has some code, which I hope he'll be sending out shortly that
> does just this, converting the soft-timer subsystem to use absolute
> time instead of ticks for expiration. I feel it both simplifies the
> code and makes it easier to change the timer interrupt frequency
> while the system is running.
Here's the set of patches John promised :)
1/4: add jiffies conversion helper functions
2/4: core human-time modifications to soft-timer subsystem
3/4: add new human-time schedule_timeout() functions
4/4: rework sys_nanosleep() to use schedule_timeout_nsecs()
The individual patches have more details, but the gist is this:
We no longer use jiffies (the variable) as the basis for determining
what "time" a timer should expire or when it should be added. Instead,
we use a new function, do_monotonic_clock(), which is simply a wrapper
for getnstimeofday(). That is to say, we use uptime in nanoseconds. But,
to avoid modifying the existing soft-timer algorithm, we convert the
64-bit nanosecond value to "timerinterval" units. Each unit is
2^TIMERINTERVAL_BITS nanoseconds long, so its length is fixed at
compile time.
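As a rough illustration of the unit conversion described above, here is a
minimal userspace sketch in C. It assumes TIMERINTERVAL_BITS is 19 (the
value patch 2/4 uses), uses uint64_t in place of the kernel's u64, and
uses shortened function names that are stand-ins for the helpers in the
patch, not the kernel code itself.

```c
#include <stdint.h>

/* Assumed value, taken from patch 2/4: one timerinterval = 2^19 ns (~0.52 ms). */
#define TIMERINTERVAL_BITS 19

/* Round up: used when a timer is added, so it can never expire early. */
static inline uint64_t nsecs_to_intervals_ceiling(uint64_t nsecs)
{
	if (nsecs == 0)
		return 0;	/* avoid wrapping on (0 - 1) */
	return ((nsecs - 1) >> TIMERINTERVAL_BITS) + 1;
}

/* Round down: used when the current time is compared against pending timers. */
static inline uint64_t nsecs_to_intervals_floor(uint64_t nsecs)
{
	return nsecs >> TIMERINTERVAL_BITS;
}
```

With these definitions, a value of 1,000,000 ns rounds up to 2
timerintervals and rounds down to 1, which is the asymmetry that keeps
timers from firing before their requested time.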
To sum up, soft-timers now use time (as defined by the
timeofday-subsystem) not ticks. Hopefully, the individual
e-mails/patches make this change clear.
Thanks,
Nish
* [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
From: Nishanth Aravamudan @ 2005-07-14 20:28 UTC
To: linux-kernel

From: Nishanth Aravamudan <nacc@us.ibm.com>

Description: Add a jiffies_to_nsecs() helper function. Make consistent
the size of microseconds (unsigned long) throughout the conversion
functions.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 jiffies.h |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff -urpN 2.6.13-rc3-base/include/linux/jiffies.h 2.6.13-rc3-dev/include/linux/jiffies.h
--- 2.6.13-rc3-base/include/linux/jiffies.h	2005-03-01 23:37:31.000000000 -0800
+++ 2.6.13-rc3-dev/include/linux/jiffies.h	2005-07-14 12:43:44.000000000 -0700
@@ -263,7 +263,7 @@ static inline unsigned int jiffies_to_ms
 #endif
 }

-static inline unsigned int jiffies_to_usecs(const unsigned long j)
+static inline unsigned long jiffies_to_usecs(const unsigned long j)
 {
 #if HZ <= 1000000 && !(1000000 % HZ)
 	return (1000000 / HZ) * j;
@@ -274,6 +274,17 @@ static inline unsigned int jiffies_to_us
 #endif
 }

+static inline u64 jiffies_to_nsecs(const unsigned long j)
+{
+#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
+	return (NSEC_PER_SEC / HZ) * (u64)j;
+#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
+	return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
+#else
+	return ((u64)j * NSEC_PER_SEC) / HZ;
+#endif
+}
+
 static inline unsigned long msecs_to_jiffies(const unsigned int m)
 {
 	if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET))
@@ -287,7 +298,7 @@ static inline unsigned long msecs_to_jif
 #endif
 }

-static inline unsigned long usecs_to_jiffies(const unsigned int u)
+static inline unsigned long usecs_to_jiffies(const unsigned long u)
 {
 	if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
 		return MAX_JIFFY_OFFSET;
* Re: [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
From: Dave Hansen @ 2005-07-14 20:54 UTC
To: Nishanth Aravamudan; +Cc: Linux Kernel Mailing List

On Thu, 2005-07-14 at 13:28 -0700, Nishanth Aravamudan wrote:
> +static inline u64 jiffies_to_nsecs(const unsigned long j)
> +{
> +#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
> +	return (NSEC_PER_SEC / HZ) * (u64)j;
> +#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
> +	return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> +#else
> +	return ((u64)j * NSEC_PER_SEC) / HZ;
> +#endif
> +}

That might look a little better something like:

static inline u64 jiffies_to_nsecs(const unsigned long __j)
{
	u64 j = __j;

	if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
		return (NSEC_PER_SEC / HZ) * j;
	else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
		return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
	else
		return (j * NSEC_PER_SEC) / HZ;
}

Compilers are smart :)

-- Dave
* Re: [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
From: Nishanth Aravamudan @ 2005-07-14 21:03 UTC
To: Dave Hansen; +Cc: Linux Kernel Mailing List

On 14.07.2005 [13:54:47 -0700], Dave Hansen wrote:
> On Thu, 2005-07-14 at 13:28 -0700, Nishanth Aravamudan wrote:
> > +static inline u64 jiffies_to_nsecs(const unsigned long j)
> > +{
> > +#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
> > +	return (NSEC_PER_SEC / HZ) * (u64)j;
> > +#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
> > +	return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > +#else
> > +	return ((u64)j * NSEC_PER_SEC) / HZ;
> > +#endif
> > +}
>
> That might look a little better something like:
>
> static inline u64 jiffies_to_nsecs(const unsigned long __j)
> {
> 	u64 j = __j;
>
> 	if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
> 		return (NSEC_PER_SEC / HZ) * j;
> 	else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
> 		return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> 	else
> 		return (j * NSEC_PER_SEC) / HZ;
> }
>
> Compilers are smart :)

Well, I was trying to keep it similar to the other conversion functions.
I guess the compiler can evaluate the conditional full of constants at
compile-time regardless of whether it is #if or if ().

I can make these changes if others would like them as well.

Thanks,
Nish
* Re: [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
From: Pavel Machek @ 2005-07-15 12:14 UTC
To: Nishanth Aravamudan; +Cc: Dave Hansen, Linux Kernel Mailing List

Hi!

> > > +static inline u64 jiffies_to_nsecs(const unsigned long j)
> > > +{
> > > +#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
> > > +	return (NSEC_PER_SEC / HZ) * (u64)j;
> > > +#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
> > > +	return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > > +#else
> > > +	return ((u64)j * NSEC_PER_SEC) / HZ;
> > > +#endif
> > > +}
> >
> > That might look a little better something like:
> >
> > static inline u64 jiffies_to_nsecs(const unsigned long __j)
> > {
> > 	u64 j = __j;
> >
> > 	if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
> > 		return (NSEC_PER_SEC / HZ) * j;
> > 	else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
> > 		return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > 	else
> > 		return (j * NSEC_PER_SEC) / HZ;
> > }
> >
> > Compilers are smart :)
>
> Well, I was trying to keep it similar to the other conversion functions.
> I guess the compiler can evaluate the conditional full of constants at
> compile-time regardless of whether it is #if or if ().
>
> I can make these changes if others would like them as well.

Yes, please. And feel free to convert nearby functions, too ;-).

								Pavel
--
teflon -- maybe it is a trademark, but it should not be.
* Re: [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs
From: Nishanth Aravamudan @ 2005-07-17 0:44 UTC
To: Pavel Machek; +Cc: Dave Hansen, Linux Kernel Mailing List

On 15.07.2005 [14:14:25 +0200], Pavel Machek wrote:
> Hi!
>
> > > > +static inline u64 jiffies_to_nsecs(const unsigned long j)
> > > > +{
> > > > +#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
> > > > +	return (NSEC_PER_SEC / HZ) * (u64)j;
> > > > +#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
> > > > +	return ((u64)j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > > > +#else
> > > > +	return ((u64)j * NSEC_PER_SEC) / HZ;
> > > > +#endif
> > > > +}
> > >
> > > That might look a little better something like:
> > >
> > > static inline u64 jiffies_to_nsecs(const unsigned long __j)
> > > {
> > > 	u64 j = __j;
> > >
> > > 	if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
> > > 		return (NSEC_PER_SEC / HZ) * j;
> > > 	else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
> > > 		return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
> > > 	else
> > > 		return (j * NSEC_PER_SEC) / HZ;
> > > }
> > >
> > > Compilers are smart :)
> >
> > Well, I was trying to keep it similar to the other conversion functions.
> > I guess the compiler can evaluate the conditional full of constants at
> > compile-time regardless of whether it is #if or if ().
> >
> > I can make these changes if others would like them as well.
>
> Yes, please. And feel free to convert nearby functions, too ;-).

I have a patch to make this change for all the jiffies <--> human-time
functions, but have a problem. I noticed that these functions, in the
if/else form (as opposed to #if/#else) will warn about division-by-zero
problems, as (HZ / MSEC_PER_SEC), (HZ / USEC_PER_SEC) & (HZ /
NSEC_PER_SEC) are all 0 if HZ < 1000 (which, of course, is the default
now :) ).

Any suggestions? Just leave the functions as is? Even then, I'm going to
update this patch to use USEC_PER_SEC and MSEC_PER_SEC in the other
conversion functions like I use NSEC_PER_SEC in the first version.

Thanks,
Nish
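To make the division-by-zero issue above concrete: in the if/else form
every branch is still compiled, so with HZ below NSEC_PER_SEC the
integer expression (HZ / NSEC_PER_SEC) is a constant 0 in the divisor of
the middle branch, even though that branch can never run. The userspace
sketch below shows one possible way to sidestep this by substituting a
harmless divisor of 1. This is only an illustration and not something
proposed in the thread; HZ = 250, the name jiffies_to_nsecs_sketch and
the jiffies_per_nsec helper variable are assumptions made for the
example, and uint64_t stands in for u64.

```c
#include <stdint.h>

#define HZ 250			/* assumed tick rate, for illustration only */
#define NSEC_PER_SEC 1000000000L

static inline uint64_t jiffies_to_nsecs_sketch(unsigned long jiffies_count)
{
	uint64_t j = jiffies_count;
	/*
	 * HZ / NSEC_PER_SEC is 0 whenever HZ < NSEC_PER_SEC. Fall back to 1
	 * so the never-taken branch below has no constant-zero divisor.
	 */
	const uint64_t jiffies_per_nsec =
		(HZ / NSEC_PER_SEC) ? (HZ / NSEC_PER_SEC) : 1;

	if (HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ))
		return (NSEC_PER_SEC / HZ) * j;		/* taken for HZ = 250 */
	else if (HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC))
		return (j + jiffies_per_nsec - 1) / jiffies_per_nsec;
	else
		return (j * NSEC_PER_SEC) / HZ;
}
```

The #if/#elif form in the original patch avoids the problem differently:
the preprocessor drops the untaken branches before the compiler ever
sees them.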
* [RFC][PATCH 2/4] human-time soft-timer core changes
From: Nishanth Aravamudan @ 2005-07-14 20:40 UTC
To: linux-kernel

From: Nishanth Aravamudan <nacc@us.ibm.com>

Description: The core revision to the soft-timer subsystem to divorce it
from the timer interrupt in software, i.e. jiffies. Instead, use
getnstimeofday() (via do_monotonic_clock()) as the basis for addition
and expiration of timers. Add a new unit, the timerinterval, which is
2^TIMERINTERVAL_BITS nanoseconds in length. The converted value in
timerintervals is used where we would have used the timer's expires
member before. Add set_timer_nsecs() and set_timer_on_nsecs() functions
to directly request nanosecond delays. These functions replace
add_timer(), mod_timer() and add_timer_on().

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- include/linux/time.h | 1 include/linux/timer.h | 27 +----- kernel/time.c | 18 ++++ kernel/timer.c | 215 +++++++++++++++++++++++++++++++++++++++++++++----- 4 files changed, 220 insertions(+), 41 deletions(-) diff -urpN 2.6.13-rc3-base/include/linux/time.h 2.6.13-rc3-dev/include/linux/time.h --- 2.6.13-rc3-base/include/linux/time.h 2005-03-01 23:38:12.000000000 -0800 +++ 2.6.13-rc3-dev/include/linux/time.h 2005-07-14 12:44:40.000000000 -0700 @@ -103,6 +103,7 @@ struct itimerval; extern int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue); extern int do_getitimer(int which, struct itimerval *value); extern void getnstimeofday (struct timespec *tv); +extern u64 do_monotonic_clock(void); extern struct timespec timespec_trunc(struct timespec t, unsigned gran); diff -urpN 2.6.13-rc3-base/include/linux/timer.h 2.6.13-rc3-dev/include/linux/timer.h --- 2.6.13-rc3-base/include/linux/timer.h 2005-07-13 15:52:14.000000000 -0700 +++ 2.6.13-rc3-dev/include/linux/timer.h 2005-07-14 12:44:40.000000000 -0700 @@ -11,6 +11,7 @@ struct timer_base_s; struct timer_list { struct list_head entry; unsigned long expires; + u64 expires_nsecs; unsigned long magic; @@ -27,6 +28,7 @@ extern struct timer_base_s __init_timer_ #define TIMER_INITIALIZER(_function, _expires, _data) { \ .function = (_function), \ .expires = (_expires), \ + .expires_nsecs = 0, \ .data = (_data), \ .base = &__init_timer_base, \ .magic = TIMER_MAGIC, \ @@ -51,30 +53,15 @@ static inline int timer_pending(const st extern void add_timer_on(struct timer_list *timer, int cpu); extern int del_timer(struct timer_list * timer); -extern int __mod_timer(struct timer_list *timer, unsigned long expires); +extern int __mod_timer(struct timer_list *timer); extern int mod_timer(struct timer_list *timer, unsigned long expires); +extern void add_timer(struct timer_list *timer); +extern int set_timer_nsecs(struct timer_list *timer, u64 expires_nsecs); 
+extern void set_timer_on_nsecs(struct timer_list *timer, u64 expires_nsecs, + int cpu); extern unsigned long next_timer_interrupt(void); -/*** - * add_timer - start a timer - * @timer: the timer to be added - * - * The kernel will do a ->function(->data) callback from the - * timer interrupt at the ->expired point in the future. The - * current time is 'jiffies'. - * - * The timer's ->expired, ->function (and if the handler uses it, ->data) - * fields must be set prior calling this function. - * - * Timers with an ->expired field in the past will be executed in the next - * timer tick. - */ -static inline void add_timer(struct timer_list * timer) -{ - __mod_timer(timer, timer->expires); -} - #ifdef CONFIG_SMP extern int try_to_del_timer_sync(struct timer_list *timer); extern int del_timer_sync(struct timer_list *timer); diff -urpN 2.6.13-rc3-base/kernel/time.c 2.6.13-rc3-dev/kernel/time.c --- 2.6.13-rc3-base/kernel/time.c 2005-07-13 15:51:57.000000000 -0700 +++ 2.6.13-rc3-dev/kernel/time.c 2005-07-14 12:44:40.000000000 -0700 @@ -589,3 +589,21 @@ EXPORT_SYMBOL(get_jiffies_64); #endif EXPORT_SYMBOL(jiffies); + +u64 do_monotonic_clock(void) +{ + struct timespec now, now_w2m; + unsigned long seq; + + getnstimeofday(&now); + + do { + seq = read_seqbegin(&xtime_lock); + now_w2m = wall_to_monotonic; + } while (read_seqretry(&xtime_lock, seq)); + + return (u64)(now.tv_sec + now_w2m.tv_sec) * NSEC_PER_SEC + + (now.tv_nsec + now_w2m.tv_nsec); +} + +EXPORT_SYMBOL_GPL(do_monotonic_clock); diff -urpN 2.6.13-rc3-base/kernel/timer.c 2.6.13-rc3-dev/kernel/timer.c --- 2.6.13-rc3-base/kernel/timer.c 2005-07-13 15:52:14.000000000 -0700 +++ 2.6.13-rc3-dev/kernel/timer.c 2005-07-14 12:44:40.000000000 -0700 @@ -56,6 +56,15 @@ static void time_interpolator_update(lon #define TVR_SIZE (1 << TVR_BITS) #define TVN_MASK (TVN_SIZE - 1) #define TVR_MASK (TVR_SIZE - 1) +/* + * Modifying TIMERINTERVAL_BITS changes the software resolution of + * soft-timers. 
While 20 bits would be closer to a millisecond, there + * are performance gains from allowing a software resolution finer than + * the hardware (HZ=1000) + */ +#define TIMERINTERVAL_BITS 19 +#define TIMERINTERVAL_SIZE (1 << TIMERINTERVAL_BITS) +#define TIMERINTERVAL_MASK (TIMERINTERVAL_SIZE - 1) struct timer_base_s { spinlock_t lock; @@ -72,7 +81,7 @@ typedef struct tvec_root_s { struct tvec_t_base_s { struct timer_base_s t_base; - unsigned long timer_jiffies; + unsigned long last_timer_time; tvec_root_t tv1; tvec_t tv2; tvec_t tv3; @@ -114,11 +123,88 @@ static inline void check_timer(struct ti check_timer_failed(timer); } +/* + * nsecs_to_timerintervals_ceiling - convert nanoseconds to timerintervals + * @n: number of nanoseconds to convert + * + * This is where changes to TIMERINTERVAL_BITS affect the soft-timer + * subsystem. + * + * Some explanation of the math is necessary: + * Rather than do decimal arithmetic, we shift for the sake of speed. + * This does mean that the actual requestable sleeps are + * 2^(sizeof(unsigned long)*8 - TIMERINTERVAL_BITS) + * timerintervals. + * + * The conditional takes care of the corner case where we request a 0 + * nanosecond sleep; if the quantity were unsigned, we would not + * propogate the carry and force a wrap when adding the 1. + * + * To prevent timers from being expired early, we: + * Take the ceiling when we add; and + * Take the floor when we expire. + */ +static inline unsigned long nsecs_to_timerintervals_ceiling(u64 nsecs) +{ + if (nsecs) + return (unsigned long)(((nsecs - 1) >> TIMERINTERVAL_BITS) + 1); + else + return 0UL; +} + +/* + * nsecs_to_timerintervals_floor - convert nanoseconds to timerintervals + * @n: number of nanoseconds to convert + * + * This is where changes to TIMERINTERVAL_BITS affect the soft-timer + * subsystem. + * + * Some explanation of the math is necessary: + * Rather than do decimal arithmetic, we shift for the sake of speed. 
+ * This does mean that the actual requestable sleeps are + * 2^(sizeof(unsigned long)*8 - TIMERINTERVAL_BITS) + * + * There is no special case for 0 in the floor function, since we do not + * do any subtraction or addition of 1 + * + * To prevent timers from being expired early, we: + * Take the ceiling when we add; and + * Take the floor when we expire. + */ +static inline unsigned long nsecs_to_timerintervals_floor(u64 nsecs) +{ + return (unsigned long)(nsecs >> TIMERINTERVAL_BITS); +} + +/* + * jiffies_to_timerintervals - convert absolute jiffies to timerintervals + * @abs_jiffies: number of jiffies to convert + * + * First, we convert the absolute jiffies parameter to a relative + * jiffies value. To maintain precision, we convert the relative + * jiffies value to a relative nanosecond value and then convert that + * to a relative soft-timer interval unit value. We then add this + * relative value to the current time according to the timeofday- + * subsystem, converted to soft-timer interval units. + * + * We only use this function when adding timers, so we are free to + * always use the ceiling version of nsecs_to_timerintervals. + * + * This function only exists to support deprecated interfaces. Once + * those interfaces have been converted to the alternatives, it should + * be removed. 
+ */ +static inline unsigned long jiffies_to_timerintervals(unsigned long abs_jiffies) +{ + unsigned long relative_jiffies = abs_jiffies - jiffies; + return nsecs_to_timerintervals_ceiling(do_monotonic_clock() + + jiffies_to_nsecs(relative_jiffies)); +} static void internal_add_timer(tvec_base_t *base, struct timer_list *timer) { - unsigned long expires = timer->expires; - unsigned long idx = expires - base->timer_jiffies; + unsigned long expires = nsecs_to_timerintervals_ceiling(timer->expires_nsecs); + unsigned long idx = expires - base->last_timer_time; struct list_head *vec; if (idx < TVR_SIZE) { @@ -138,7 +224,7 @@ static void internal_add_timer(tvec_base * Can happen if you add a timer with expires == jiffies, * or you set a timer to go off in the past */ - vec = base->tv1.vec + (base->timer_jiffies & TVR_MASK); + vec = base->tv1.vec + (base->last_timer_time & TVR_MASK); } else { int i; /* If the timeout is larger than 0xffffffff on 64-bit @@ -146,7 +232,7 @@ static void internal_add_timer(tvec_base */ if (idx > 0xffffffffUL) { idx = 0xffffffffUL; - expires = idx + base->timer_jiffies; + expires = idx + base->last_timer_time; } i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK; vec = base->tv5.vec + i; @@ -222,7 +308,7 @@ static timer_base_t *lock_timer_base(str } } -int __mod_timer(struct timer_list *timer, unsigned long expires) +int __mod_timer(struct timer_list *timer) { timer_base_t *base; tvec_base_t *new_base; @@ -261,7 +347,7 @@ int __mod_timer(struct timer_list *timer } } - timer->expires = expires; + /* expires should be in timerintervals, and is currently ignored? 
*/ internal_add_timer(new_base, timer); spin_unlock_irqrestore(&new_base->t_base.lock, flags); @@ -281,21 +367,50 @@ void add_timer_on(struct timer_list *tim { tvec_base_t *base = &per_cpu(tvec_bases, cpu); unsigned long flags; - + BUG_ON(timer_pending(timer) || !timer->function); check_timer(timer); spin_lock_irqsave(&base->t_base.lock, flags); + timer->expires_nsecs = do_monotonic_clock() + + jiffies_to_nsecs(timer->expires - jiffies); timer->base = &base->t_base; internal_add_timer(base, timer); spin_unlock_irqrestore(&base->t_base.lock, flags); } +/*** + * add_timer - start a timer + * @timer: the timer to be added + * + * The kernel will do a ->function(->data) callback from the + * timer interrupt at the ->expired point in the future. The + * current time is 'jiffies'. + * + * The timer's ->expired, ->function (and if the handler uses it, ->data) + * fields must be set prior calling this function. + * + * Timers with an ->expired field in the past will be executed in the next + * timer tick. + * + * The callers of add_timer() should be aware that the interface is now + * deprecated. set_timer_nsecs() is the single interface for adding and + * modifying timers. + */ +void add_timer(struct timer_list * timer) +{ + timer->expires_nsecs = do_monotonic_clock() + + jiffies_to_nsecs(timer->expires - jiffies); + __mod_timer(timer); +} + +EXPORT_SYMBOL(add_timer); /*** * mod_timer - modify a timer's timeout * @timer: the timer to be modified + * @expires: absolute time, in jiffies, when timer should expire * * mod_timer is a more efficient way to update the expire field of an * active timer (if the timer is inactive it will be activated) @@ -311,6 +426,10 @@ void add_timer_on(struct timer_list *tim * The function returns whether it has modified a pending timer or not. * (ie. mod_timer() of an inactive timer returns 0, mod_timer() of an * active timer returns 1.) + * + * The callers of mod_timer() should be aware that the interface is now + * deprecated. 
set_timer_nsecs() is the single interface for adding and + * modifying timers. */ int mod_timer(struct timer_list *timer, unsigned long expires) { @@ -318,6 +437,9 @@ int mod_timer(struct timer_list *timer, check_timer(timer); + timer->expires_nsecs = do_monotonic_clock() + + jiffies_to_nsecs(expires - jiffies); + /* * This is a common optimization triggered by the * networking code - if the timer is re-modified @@ -326,10 +448,56 @@ int mod_timer(struct timer_list *timer, if (timer->expires == expires && timer_pending(timer)) return 1; - return __mod_timer(timer, expires); + return __mod_timer(timer); } -EXPORT_SYMBOL(mod_timer); +/* + * set_timer_nsecs - modify a timer's timeout in nsecs + * @timer: the timer to be modified + * + * set_timer_nsecs replaces both add_timer and mod_timer. The caller + * should call do_monotonic_clock() to determine the absolute timeout + * necessary. + */ +int set_timer_nsecs(struct timer_list *timer, u64 expires_nsecs) +{ + BUG_ON(!timer->function); + + check_timer(timer); + + if (timer_pending(timer) && timer->expires_nsecs == expires_nsecs) + return 1; + + timer->expires_nsecs = expires_nsecs; + + return __mod_timer(timer); +} + +EXPORT_SYMBOL_GPL(set_timer_nsecs); + +/*** + * set_timer_on_nsecs - start a timer on a particular CPU + * @timer: the timer to be added + * @expires_nsecs: absolute time in nsecs when timer should expire + * @cpu: the CPU to start it on + * + * This is not very scalable on SMP. Double adds are not possible. + */ +void set_timer_on_nsecs(struct timer_list *timer, u64 expires_nsecs, int cpu) +{ + tvec_base_t *base = &per_cpu(tvec_bases, cpu); + unsigned long flags; + + BUG_ON(timer_pending(timer) || !timer->function); + + check_timer(timer); + + spin_lock_irqsave(&base->t_base.lock, flags); + timer->expires_nsecs = expires_nsecs; + timer->base = &base->t_base; + internal_add_timer(base, timer); + spin_unlock_irqrestore(&base->t_base.lock, flags); +} /*** * del_timer - deactive a timer. 
@@ -455,17 +623,17 @@ static int cascade(tvec_base_t *base, tv * This function cascades all vectors and executes all expired timer * vectors. */ -#define INDEX(N) (base->timer_jiffies >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK +#define INDEX(N) (base->last_timer_time >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK -static inline void __run_timers(tvec_base_t *base) +static inline void __run_timers(tvec_base_t *base, unsigned long current_timer_time) { struct timer_list *timer; spin_lock_irq(&base->t_base.lock); - while (time_after_eq(jiffies, base->timer_jiffies)) { + while (time_after_eq(current_timer_time, base->last_timer_time)) { struct list_head work_list = LIST_HEAD_INIT(work_list); struct list_head *head = &work_list; - int index = base->timer_jiffies & TVR_MASK; + int index = base->last_timer_time & TVR_MASK; /* * Cascade timers: @@ -475,7 +643,7 @@ static inline void __run_timers(tvec_bas (!cascade(base, &base->tv3, INDEX(1))) && !cascade(base, &base->tv4, INDEX(2))) cascade(base, &base->tv5, INDEX(3)); - ++base->timer_jiffies; + ++base->last_timer_time; list_splice_init(base->tv1.vec + index, &work_list); while (!list_empty(head)) { void (*fn)(unsigned long); @@ -524,20 +692,20 @@ unsigned long next_timer_interrupt(void) base = &__get_cpu_var(tvec_bases); spin_lock(&base->t_base.lock); - expires = base->timer_jiffies + (LONG_MAX >> 1); + expires = base->last_timer_time + (LONG_MAX >> 1); list = 0; /* Look for timer events in tv1. */ - j = base->timer_jiffies & TVR_MASK; + j = base->last_timer_time & TVR_MASK; do { list_for_each_entry(nte, base->tv1.vec + j, entry) { expires = nte->expires; - if (j < (base->timer_jiffies & TVR_MASK)) + if (j < (base->last_timer_time & TVR_MASK)) list = base->tv2.vec + (INDEX(0)); goto found; } j = (j + 1) & TVR_MASK; - } while (j != (base->timer_jiffies & TVR_MASK)); + } while (j != (base->last_timer_time & TVR_MASK)); /* Check tv2-tv5. 
*/ varray[0] = &base->tv2; @@ -910,10 +1078,15 @@ EXPORT_SYMBOL(xtime_lock); */ static void run_timer_softirq(struct softirq_action *h) { + unsigned long current_timer_time; tvec_base_t *base = &__get_cpu_var(tvec_bases); - if (time_after_eq(jiffies, base->timer_jiffies)) - __run_timers(base); + /* cache the converted current time, rounding down */ + current_timer_time = + nsecs_to_timerintervals_floor(do_monotonic_clock()); + + if (time_after_eq(current_timer_time, base->last_timer_time)) + __run_timers(base, current_timer_time); } /* ^ permalink raw reply [flat|nested] 12+ messages in thread
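The set_timer_nsecs() comment in the patch above says the caller should
read do_monotonic_clock() to work out the absolute timeout. A rough
userspace analogue of that calling pattern is sketched below. It assumes
POSIX clock_gettime(CLOCK_MONOTONIC) as a stand-in for
do_monotonic_clock(); monotonic_nsecs() and expiry_after() are
hypothetical names used only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Userspace stand-in for do_monotonic_clock(): monotonic time in ns. */
static inline uint64_t monotonic_nsecs(void)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (uint64_t)now.tv_sec * 1000000000ULL + (uint64_t)now.tv_nsec;
}

/* Turn a relative delay into the absolute expiry a timer would be given. */
static inline uint64_t expiry_after(uint64_t now_nsecs, uint64_t delay_nsecs)
{
	return now_nsecs + delay_nsecs;
}
```

A caller wanting a 5 ms timeout would compute
expiry_after(monotonic_nsecs(), 5000000) and hand the result to the
timer interface as the absolute expiry.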
* [RFC][UPDATE PATCH 2/4] human-time soft-timer core changes 2005-07-14 20:40 ` [RFC][PATCH 2/4] human-time soft-timer core changes Nishanth Aravamudan @ 2005-07-18 21:53 ` Nishanth Aravamudan 0 siblings, 0 replies; 12+ messages in thread From: Nishanth Aravamudan @ 2005-07-18 21:53 UTC (permalink / raw) To: linux-kernel On 14.07.2005 [13:40:11 -0700], Nishanth Aravamudan wrote: > From: Nishanth Aravamudan <nacc@us.ibm.com> > > Description: The core revision to the soft-timer subsystem to divorce it > from the timer interrupt in software, i.e. jiffies. Instead, use > getnstimeofday() (via do_monotonic_clock()) as the basis for addition > and expiration of timers. Add a new unit, the timerinterval, which is > a 2^TIMERINTERVAL_BITS nanoseconds in length. The converted value in > timerintervals is used where we would have used the timer's expires > member before. Add set_timer_nsecs() and set_timer_nsecs_on() functions > to directly request nanosecond delays. These functions replace > add_timer(), mod_timer() and add_timer_on(). > > Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Sigh, one version of my development patches removed the export of mod_timer(). Of course, I forgot to revert that hunk before sending it out. If anyone (maybe not likely) is testing out these patches, please use this version. 
Thanks,
Nish

---

 include/linux/time.h  |    1
 include/linux/timer.h |   27 +-----
 kernel/time.c         |   18 ++++
 kernel/timer.c        |  215 +++++++++++++++++++++++++++++++++++++++++++++-----
 4 files changed, 221 insertions(+), 40 deletions(-)

diff -urpN 2.6.13-rc3-base/include/linux/time.h 2.6.13-rc3-dev/include/linux/time.h
--- 2.6.13-rc3-base/include/linux/time.h	2005-03-01 23:38:12.000000000 -0800
+++ 2.6.13-rc3-dev/include/linux/time.h	2005-07-14 12:44:40.000000000 -0700
@@ -103,6 +103,7 @@ struct itimerval;
 extern int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue);
 extern int do_getitimer(int which, struct itimerval *value);
 extern void getnstimeofday (struct timespec *tv);
+extern u64 do_monotonic_clock(void);
 
 extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
 
diff -urpN 2.6.13-rc3-base/include/linux/timer.h 2.6.13-rc3-dev/include/linux/timer.h
--- 2.6.13-rc3-base/include/linux/timer.h	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/timer.h	2005-07-14 12:44:40.000000000 -0700
@@ -11,6 +11,7 @@ struct timer_base_s;
 struct timer_list {
 	struct list_head entry;
 	unsigned long expires;
+	u64 expires_nsecs;
 
 	unsigned long magic;
 
@@ -27,6 +28,7 @@ extern struct timer_base_s __init_timer_
 #define TIMER_INITIALIZER(_function, _expires, _data) {		\
 		.function = (_function),			\
 		.expires = (_expires),				\
+		.expires_nsecs = 0,				\
 		.data = (_data),				\
 		.base = &__init_timer_base,			\
 		.magic = TIMER_MAGIC,				\
@@ -51,30 +53,15 @@ static inline int timer_pending(const st
 
 extern void add_timer_on(struct timer_list *timer, int cpu);
 extern int del_timer(struct timer_list * timer);
-extern int __mod_timer(struct timer_list *timer, unsigned long expires);
+extern int __mod_timer(struct timer_list *timer);
 extern int mod_timer(struct timer_list *timer, unsigned long expires);
+extern void add_timer(struct timer_list *timer);
+extern int set_timer_nsecs(struct timer_list *timer, u64 expires_nsecs);
+extern void set_timer_on_nsecs(struct timer_list *timer, u64 expires_nsecs,
+					int cpu);
 
 extern unsigned long next_timer_interrupt(void);
 
-/***
- * add_timer - start a timer
- * @timer: the timer to be added
- *
- * The kernel will do a ->function(->data) callback from the
- * timer interrupt at the ->expired point in the future. The
- * current time is 'jiffies'.
- *
- * The timer's ->expired, ->function (and if the handler uses it, ->data)
- * fields must be set prior calling this function.
- *
- * Timers with an ->expired field in the past will be executed in the next
- * timer tick.
- */
-static inline void add_timer(struct timer_list * timer)
-{
-	__mod_timer(timer, timer->expires);
-}
-
 #ifdef CONFIG_SMP
   extern int try_to_del_timer_sync(struct timer_list *timer);
   extern int del_timer_sync(struct timer_list *timer);
diff -urpN 2.6.13-rc3-base/kernel/time.c 2.6.13-rc3-dev/kernel/time.c
--- 2.6.13-rc3-base/kernel/time.c	2005-07-13 15:51:57.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/time.c	2005-07-14 12:44:40.000000000 -0700
@@ -589,3 +589,21 @@ EXPORT_SYMBOL(get_jiffies_64);
 #endif
 
 EXPORT_SYMBOL(jiffies);
+
+u64 do_monotonic_clock(void)
+{
+	struct timespec now, now_w2m;
+	unsigned long seq;
+
+	getnstimeofday(&now);
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		now_w2m = wall_to_monotonic;
+	} while (read_seqretry(&xtime_lock, seq));
+
+	return (u64)(now.tv_sec + now_w2m.tv_sec) * NSEC_PER_SEC +
+				(now.tv_nsec + now_w2m.tv_nsec);
+}
+
+EXPORT_SYMBOL_GPL(do_monotonic_clock);
diff -urpN 2.6.13-rc3-base/kernel/timer.c 2.6.13-rc3-dev/kernel/timer.c
--- 2.6.13-rc3-base/kernel/timer.c	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/timer.c	2005-07-14 12:44:40.000000000 -0700
@@ -56,6 +56,15 @@ static void time_interpolator_update(lon
 #define TVR_SIZE (1 << TVR_BITS)
 #define TVN_MASK (TVN_SIZE - 1)
 #define TVR_MASK (TVR_SIZE - 1)
+/*
+ * Modifying TIMERINTERVAL_BITS changes the software resolution of
+ * soft-timers. While 20 bits would be closer to a millisecond, there
+ * are performance gains from allowing a software resolution finer than
+ * the hardware (HZ=1000)
+ */
+#define TIMERINTERVAL_BITS 19
+#define TIMERINTERVAL_SIZE (1 << TIMERINTERVAL_BITS)
+#define TIMERINTERVAL_MASK (TIMERINTERVAL_SIZE - 1)
 
 struct timer_base_s {
 	spinlock_t lock;
@@ -72,7 +81,7 @@ typedef struct tvec_root_s {
 
 struct tvec_t_base_s {
 	struct timer_base_s t_base;
-	unsigned long timer_jiffies;
+	unsigned long last_timer_time;
 	tvec_root_t tv1;
 	tvec_t tv2;
 	tvec_t tv3;
@@ -114,11 +123,88 @@ static inline void check_timer(struct ti
 		check_timer_failed(timer);
 }
 
+/*
+ * nsecs_to_timerintervals_ceiling - convert nanoseconds to timerintervals
+ * @n: number of nanoseconds to convert
+ *
+ * This is where changes to TIMERINTERVAL_BITS affect the soft-timer
+ * subsystem.
+ *
+ * Some explanation of the math is necessary:
+ * Rather than do decimal arithmetic, we shift for the sake of speed.
+ * This does mean that the actual requestable sleeps are
+ *	2^(sizeof(unsigned long)*8 - TIMERINTERVAL_BITS)
+ * timerintervals.
+ *
+ * The conditional takes care of the corner case where we request a 0
+ * nanosecond sleep; if the quantity were unsigned, we would not
+ * propagate the carry and force a wrap when adding the 1.
+ *
+ * To prevent timers from being expired early, we:
+ *	Take the ceiling when we add; and
+ *	Take the floor when we expire.
+ */
+static inline unsigned long nsecs_to_timerintervals_ceiling(u64 nsecs)
+{
+	if (nsecs)
+		return (unsigned long)(((nsecs - 1) >> TIMERINTERVAL_BITS) + 1);
+	else
+		return 0UL;
+}
+
+/*
+ * nsecs_to_timerintervals_floor - convert nanoseconds to timerintervals
+ * @n: number of nanoseconds to convert
+ *
+ * This is where changes to TIMERINTERVAL_BITS affect the soft-timer
+ * subsystem.
+ *
+ * Some explanation of the math is necessary:
+ * Rather than do decimal arithmetic, we shift for the sake of speed.
+ * This does mean that the actual requestable sleeps are
+ *	2^(sizeof(unsigned long)*8 - TIMERINTERVAL_BITS)
+ *
+ * There is no special case for 0 in the floor function, since we do not
+ * do any subtraction or addition of 1
+ *
+ * To prevent timers from being expired early, we:
+ *	Take the ceiling when we add; and
+ *	Take the floor when we expire.
+ */
+static inline unsigned long nsecs_to_timerintervals_floor(u64 nsecs)
+{
+	return (unsigned long)(nsecs >> TIMERINTERVAL_BITS);
+}
+
+/*
+ * jiffies_to_timerintervals - convert absolute jiffies to timerintervals
+ * @abs_jiffies: number of jiffies to convert
+ *
+ * First, we convert the absolute jiffies parameter to a relative
+ * jiffies value. To maintain precision, we convert the relative
+ * jiffies value to a relative nanosecond value and then convert that
+ * to a relative soft-timer interval unit value. We then add this
+ * relative value to the current time according to the timeofday-
+ * subsystem, converted to soft-timer interval units.
+ *
+ * We only use this function when adding timers, so we are free to
+ * always use the ceiling version of nsecs_to_timerintervals.
+ *
+ * This function only exists to support deprecated interfaces. Once
+ * those interfaces have been converted to the alternatives, it should
+ * be removed.
+ */
+static inline unsigned long jiffies_to_timerintervals(unsigned long abs_jiffies)
+{
+	unsigned long relative_jiffies = abs_jiffies - jiffies;
+	return nsecs_to_timerintervals_ceiling(do_monotonic_clock() +
+				jiffies_to_nsecs(relative_jiffies));
+}
 
 static void internal_add_timer(tvec_base_t *base, struct timer_list *timer)
 {
-	unsigned long expires = timer->expires;
-	unsigned long idx = expires - base->timer_jiffies;
+	unsigned long expires = nsecs_to_timerintervals_ceiling(timer->expires_nsecs);
+	unsigned long idx = expires - base->last_timer_time;
 	struct list_head *vec;
 
 	if (idx < TVR_SIZE) {
@@ -138,7 +224,7 @@ static void internal_add_timer(tvec_base
 		 * Can happen if you add a timer with expires == jiffies,
 		 * or you set a timer to go off in the past
 		 */
-		vec = base->tv1.vec + (base->timer_jiffies & TVR_MASK);
+		vec = base->tv1.vec + (base->last_timer_time & TVR_MASK);
 	} else {
 		int i;
 		/* If the timeout is larger than 0xffffffff on 64-bit
@@ -146,7 +232,7 @@ static void internal_add_timer(tvec_base
 		 */
 		if (idx > 0xffffffffUL) {
 			idx = 0xffffffffUL;
-			expires = idx + base->timer_jiffies;
+			expires = idx + base->last_timer_time;
 		}
 		i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
 		vec = base->tv5.vec + i;
@@ -222,7 +308,7 @@ static timer_base_t *lock_timer_base(str
 	}
 }
 
-int __mod_timer(struct timer_list *timer, unsigned long expires)
+int __mod_timer(struct timer_list *timer)
 {
 	timer_base_t *base;
 	tvec_base_t *new_base;
@@ -261,7 +347,7 @@ int __mod_timer(struct timer_list *timer
 		}
 	}
 
-	timer->expires = expires;
+	/* expires should be in timerintervals, and is currently ignored? */
 	internal_add_timer(new_base, timer);
 	spin_unlock_irqrestore(&new_base->t_base.lock, flags);
 
@@ -281,21 +367,50 @@ void add_timer_on(struct timer_list *tim
 {
 	tvec_base_t *base = &per_cpu(tvec_bases, cpu);
 	unsigned long flags;
-
+
 	BUG_ON(timer_pending(timer) || !timer->function);
 
 	check_timer(timer);
 
 	spin_lock_irqsave(&base->t_base.lock, flags);
+	timer->expires_nsecs = do_monotonic_clock() +
+			jiffies_to_nsecs(timer->expires - jiffies);
 	timer->base = &base->t_base;
 	internal_add_timer(base, timer);
 	spin_unlock_irqrestore(&base->t_base.lock, flags);
 }
 
+/***
+ * add_timer - start a timer
+ * @timer: the timer to be added
+ *
+ * The kernel will do a ->function(->data) callback from the
+ * timer interrupt at the ->expired point in the future. The
+ * current time is 'jiffies'.
+ *
+ * The timer's ->expired, ->function (and if the handler uses it, ->data)
+ * fields must be set prior calling this function.
+ *
+ * Timers with an ->expired field in the past will be executed in the next
+ * timer tick.
+ *
+ * The callers of add_timer() should be aware that the interface is now
+ * deprecated. set_timer_nsecs() is the single interface for adding and
+ * modifying timers.
+ */
+void add_timer(struct timer_list * timer)
+{
+	timer->expires_nsecs = do_monotonic_clock() +
+			jiffies_to_nsecs(timer->expires - jiffies);
+	__mod_timer(timer);
+}
+
+EXPORT_SYMBOL(add_timer);
 
 /***
  * mod_timer - modify a timer's timeout
  * @timer: the timer to be modified
+ * @expires: absolute time, in jiffies, when timer should expire
  *
  * mod_timer is a more efficient way to update the expire field of an
  * active timer (if the timer is inactive it will be activated)
@@ -311,6 +426,10 @@ void add_timer_on(struct timer_list *tim
  * The function returns whether it has modified a pending timer or not.
  * (ie. mod_timer() of an inactive timer returns 0, mod_timer() of an
  * active timer returns 1.)
+ *
+ * The callers of mod_timer() should be aware that the interface is now
+ * deprecated. set_timer_nsecs() is the single interface for adding and
+ * modifying timers.
  */
 int mod_timer(struct timer_list *timer, unsigned long expires)
 {
@@ -318,6 +437,9 @@ int mod_timer(struct timer_list *timer, 
 
 	check_timer(timer);
 
+	timer->expires_nsecs = do_monotonic_clock() +
+			jiffies_to_nsecs(expires - jiffies);
+
 	/*
 	 * This is a common optimization triggered by the
 	 * networking code - if the timer is re-modified
@@ -326,10 +448,56 @@ int mod_timer(struct timer_list *timer, 
 	if (timer->expires == expires && timer_pending(timer))
 		return 1;
 
-	return __mod_timer(timer, expires);
+	return __mod_timer(timer);
 }
 
 EXPORT_SYMBOL(mod_timer);
+
+/*
+ * set_timer_nsecs - modify a timer's timeout in nsecs
+ * @timer: the timer to be modified
+ *
+ * set_timer_nsecs replaces both add_timer and mod_timer. The caller
+ * should call do_monotonic_clock() to determine the absolute timeout
+ * necessary.
+ */
+int set_timer_nsecs(struct timer_list *timer, u64 expires_nsecs)
+{
+	BUG_ON(!timer->function);
+
+	check_timer(timer);
+
+	if (timer_pending(timer) && timer->expires_nsecs == expires_nsecs)
+		return 1;
+
+	timer->expires_nsecs = expires_nsecs;
+
+	return __mod_timer(timer);
+}
+
+EXPORT_SYMBOL_GPL(set_timer_nsecs);
+
+/***
+ * set_timer_on_nsecs - start a timer on a particular CPU
+ * @timer: the timer to be added
+ * @expires_nsecs: absolute time in nsecs when timer should expire
+ * @cpu: the CPU to start it on
+ *
+ * This is not very scalable on SMP. Double adds are not possible.
+ */
+void set_timer_on_nsecs(struct timer_list *timer, u64 expires_nsecs, int cpu)
+{
+	tvec_base_t *base = &per_cpu(tvec_bases, cpu);
+	unsigned long flags;
+
+	BUG_ON(timer_pending(timer) || !timer->function);
+
+	check_timer(timer);
+
+	spin_lock_irqsave(&base->t_base.lock, flags);
+	timer->expires_nsecs = expires_nsecs;
+	timer->base = &base->t_base;
+	internal_add_timer(base, timer);
+	spin_unlock_irqrestore(&base->t_base.lock, flags);
+}
 
 /***
  * del_timer - deactive a timer.
@@ -455,17 +623,17 @@ static int cascade(tvec_base_t *base, tv
  * This function cascades all vectors and executes all expired timer
  * vectors.
  */
-#define INDEX(N) (base->timer_jiffies >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK
+#define INDEX(N) (base->last_timer_time >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK
 
-static inline void __run_timers(tvec_base_t *base)
+static inline void __run_timers(tvec_base_t *base, unsigned long current_timer_time)
 {
 	struct timer_list *timer;
 
 	spin_lock_irq(&base->t_base.lock);
-	while (time_after_eq(jiffies, base->timer_jiffies)) {
+	while (time_after_eq(current_timer_time, base->last_timer_time)) {
 		struct list_head work_list = LIST_HEAD_INIT(work_list);
 		struct list_head *head = &work_list;
-		int index = base->timer_jiffies & TVR_MASK;
+		int index = base->last_timer_time & TVR_MASK;
 
 		/*
 		 * Cascade timers:
@@ -475,7 +643,7 @@ static inline void __run_timers(tvec_bas
 				(!cascade(base, &base->tv3, INDEX(1))) &&
 					!cascade(base, &base->tv4, INDEX(2)))
 			cascade(base, &base->tv5, INDEX(3));
-		++base->timer_jiffies;
+		++base->last_timer_time;
 		list_splice_init(base->tv1.vec + index, &work_list);
 		while (!list_empty(head)) {
 			void (*fn)(unsigned long);
@@ -524,20 +692,20 @@ unsigned long next_timer_interrupt(void)
 
 	base = &__get_cpu_var(tvec_bases);
 	spin_lock(&base->t_base.lock);
-	expires = base->timer_jiffies + (LONG_MAX >> 1);
+	expires = base->last_timer_time + (LONG_MAX >> 1);
 	list = 0;
 
 	/* Look for timer events in tv1. */
-	j = base->timer_jiffies & TVR_MASK;
+	j = base->last_timer_time & TVR_MASK;
 	do {
 		list_for_each_entry(nte, base->tv1.vec + j, entry) {
 			expires = nte->expires;
-			if (j < (base->timer_jiffies & TVR_MASK))
+			if (j < (base->last_timer_time & TVR_MASK))
 				list = base->tv2.vec + (INDEX(0));
 			goto found;
 		}
 		j = (j + 1) & TVR_MASK;
-	} while (j != (base->timer_jiffies & TVR_MASK));
+	} while (j != (base->last_timer_time & TVR_MASK));
 
 	/* Check tv2-tv5. */
 	varray[0] = &base->tv2;
@@ -910,10 +1078,15 @@ EXPORT_SYMBOL(xtime_lock);
  */
 static void run_timer_softirq(struct softirq_action *h)
 {
+	unsigned long current_timer_time;
 	tvec_base_t *base = &__get_cpu_var(tvec_bases);
 
-	if (time_after_eq(jiffies, base->timer_jiffies))
-		__run_timers(base);
+	/* cache the converted current time, rounding down */
+	current_timer_time =
+		nsecs_to_timerintervals_floor(do_monotonic_clock());
+
+	if (time_after_eq(current_timer_time, base->last_timer_time))
+		__run_timers(base, current_timer_time);
 }
 
 /*
* [RFC][PATCH 3/4] new human-time schedule_timeout() functions
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
2005-07-14 20:28 ` [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs Nishanth Aravamudan
2005-07-14 20:40 ` [RFC][PATCH 2/4] human-time soft-timer core changes Nishanth Aravamudan
@ 2005-07-14 20:41 ` Nishanth Aravamudan
2005-07-14 20:43 ` [RFC][PATCH 4/4] convert sys_nanosleep() to use set_timer_nsecs() Nishanth Aravamudan
2005-07-14 22:28 ` [RFC][PATCH 0/4] new human-time soft-timer subsystem Roman Zippel
4 siblings, 0 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-14 20:41 UTC (permalink / raw)
To: linux-kernel

From: Nishanth Aravamudan <nacc@us.ibm.com>

Description: Add new human-time schedule_timeout() style functions,
along with the appropriate constants/prototypes.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 include/linux/sched.h |    7 ++
 include/linux/time.h  |    4 +
 kernel/timer.c        |  147 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 158 insertions(+)

diff -urpN 2.6.13-rc3-base/include/linux/sched.h 2.6.13-rc3-dev/include/linux/sched.h
--- 2.6.13-rc3-base/include/linux/sched.h	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/sched.h	2005-07-14 12:45:15.000000000 -0700
@@ -182,7 +182,14 @@ extern void scheduler_tick(void);
 extern int in_sched_functions(unsigned long addr);
 
 #define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT_NSECS	((u64)(-1))
+#define	MAX_SCHEDULE_TIMEOUT_USECS	ULONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT_MSECS	UINT_MAX
+
 extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern u64 FASTCALL(schedule_timeout_nsecs(u64 timeout_nsecs));
+extern unsigned long FASTCALL(schedule_timeout_usecs(unsigned long timeout_usecs));
+extern unsigned int FASTCALL(schedule_timeout_msecs(unsigned int timeout_msecs));
 asmlinkage void schedule(void);
 
 struct namespace;
diff -urpN 2.6.13-rc3-base/include/linux/time.h 2.6.13-rc3-dev/include/linux/time.h
--- 2.6.13-rc3-base/include/linux/time.h	2005-07-14 12:45:07.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/time.h	2005-07-14 12:45:15.000000000 -0700
@@ -36,6 +36,10 @@ struct timezone {
 #define NSEC_PER_SEC (1000000000L)
 #endif
 
+#ifndef NSEC_PER_MSEC
+#define NSEC_PER_MSEC (1000000L)
+#endif
+
 #ifndef NSEC_PER_USEC
 #define NSEC_PER_USEC (1000L)
 #endif
diff -urpN 2.6.13-rc3-base/kernel/timer.c 2.6.13-rc3-dev/kernel/timer.c
--- 2.6.13-rc3-base/kernel/timer.c	2005-07-14 12:45:07.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/timer.c	2005-07-14 12:45:15.000000000 -0700
@@ -1271,6 +1271,10 @@ static void process_timeout(unsigned lon
  * value will be %MAX_SCHEDULE_TIMEOUT.
  *
  * In all cases the return value is guaranteed to be non-negative.
+ *
+ * The callers of schedule_timeout() should be aware that the interface
+ * is now deprecated. schedule_timeout_{msecs,usecs,nsecs}() are now the
+ * interfaces for relative timeout requests.
  */
 fastcall signed long __sched schedule_timeout(signed long timeout)
 {
@@ -1326,6 +1330,149 @@ fastcall signed long __sched schedule_ti
 
 EXPORT_SYMBOL(schedule_timeout);
 
+/**
+ * schedule_timeout_nsecs - sleep until timeout
+ * @timeout_nsecs: timeout value in nanoseconds
+ *
+ * Make the current task sleep until @timeout_nsecs nsecs have
+ * elapsed. The routine will return immediately unless
+ * the current task state has been set (see set_current_state()).
+ *
+ * You can set the task state as follows -
+ *
+ * %TASK_UNINTERRUPTIBLE - at least @timeout_nsecs nsecs are guaranteed
+ * to pass before the routine returns. The routine will return 0
+ *
+ * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
+ * delivered to the current task. In this case the remaining time
+ * in nsecs will be returned, or 0 if the timer expired in time
+ *
+ * The current task state is guaranteed to be TASK_RUNNING when this
+ * routine returns.
+ *
+ * Specifying a @timeout value of %MAX_SCHEDULE_TIMEOUT_NSECS will
+ * schedule the CPU away without a bound on the timeout. In this case
+ * the return value will be %MAX_SCHEDULE_TIMEOUT_NSECS.
+ */
+fastcall u64 __sched schedule_timeout_nsecs(u64 timeout_nsecs)
+{
+	struct timer_list timer;
+	u64 expires;
+
+	if (timeout_nsecs == MAX_SCHEDULE_TIMEOUT_NSECS) {
+		schedule();
+		goto out;
+	}
+
+	expires = do_monotonic_clock() + timeout_nsecs;
+
+	init_timer(&timer);
+	timer.data = (unsigned long) current;
+	timer.function = process_timeout;
+
+	set_timer_nsecs(&timer, expires);
+	schedule();
+	del_singleshot_timer_sync(&timer);
+
+	timeout_nsecs = do_monotonic_clock();
+	if (expires < timeout_nsecs)
+		timeout_nsecs = (u64)0UL;
+	else
+		timeout_nsecs = expires - timeout_nsecs;
+out:
+	return timeout_nsecs;
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_nsecs);
+
+/**
+ * schedule_timeout_usecs - sleep until timeout
+ * @timeout_usecs: timeout value in microseconds
+ *
+ * Make the current task sleep until @timeout_usecs usecs have
+ * elapsed. The routine will return immediately unless
+ * the current task state has been set (see set_current_state()).
+ *
+ * You can set the task state as follows -
+ *
+ * %TASK_UNINTERRUPTIBLE - at least @timeout_usecs usecs are guaranteed
+ * to pass before the routine returns. The routine will return 0
+ *
+ * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
+ * delivered to the current task. In this case the remaining time
+ * in usecs will be returned, or 0 if the timer expired in time
+ *
+ * The current task state is guaranteed to be TASK_RUNNING when this
+ * routine returns.
+ *
+ * Specifying a @timeout value of %MAX_SCHEDULE_TIMEOUT_USECS will
+ * schedule the CPU away without a bound on the timeout. In this case
+ * the return value will be %MAX_SCHEDULE_TIMEOUT_USECS.
+ */
+fastcall inline unsigned long __sched schedule_timeout_usecs(unsigned long timeout_usecs)
+{
+	u64 timeout_nsecs;
+
+	if (timeout_usecs == MAX_SCHEDULE_TIMEOUT_USECS)
+		timeout_nsecs = MAX_SCHEDULE_TIMEOUT_NSECS;
+	else
+		timeout_nsecs = timeout_usecs * (u64)NSEC_PER_USEC;
+	/*
+	 * Make sure to round up by subtracting one before division and
+	 * adding one after
+	 */
+	timeout_nsecs = schedule_timeout_nsecs(timeout_nsecs) - 1;
+	do_div(timeout_nsecs, NSEC_PER_USEC);
+	timeout_usecs = (unsigned long)timeout_nsecs + 1UL;
+	return timeout_usecs;
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_usecs);
+
+/**
+ * schedule_timeout_msecs - sleep until timeout
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * Make the current task sleep until @timeout_msecs msecs have
+ * elapsed. The routine will return immediately unless
+ * the current task state has been set (see set_current_state()).
+ *
+ * You can set the task state as follows -
+ *
+ * %TASK_UNINTERRUPTIBLE - at least @timeout_msecs msecs are guaranteed
+ * to pass before the routine returns. The routine will return 0
+ *
+ * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
+ * delivered to the current task. In this case the remaining time
+ * in msecs will be returned, or 0 if the timer expired in time
+ *
+ * The current task state is guaranteed to be TASK_RUNNING when this
+ * routine returns.
+ *
+ * Specifying a @timeout value of %MAX_SCHEDULE_TIMEOUT_MSECS will
+ * schedule the CPU away without a bound on the timeout. In this case
+ * the return value will be %MAX_SCHEDULE_TIMEOUT_MSECS.
+ */
+fastcall inline unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
+{
+	u64 timeout_nsecs;
+
+	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS)
+		timeout_nsecs = MAX_SCHEDULE_TIMEOUT_NSECS;
+	else
+		timeout_nsecs = timeout_msecs * (u64)NSEC_PER_MSEC;
+	/*
+	 * Make sure to round up by subtracting one before division and
+	 * adding one after
+	 */
+	timeout_nsecs = schedule_timeout_nsecs(timeout_nsecs) - 1;
+	do_div(timeout_nsecs, NSEC_PER_MSEC);
+	timeout_msecs = (unsigned int)timeout_nsecs + 1;
+	return timeout_msecs;
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs);
+
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {
* [RFC][PATCH 4/4] convert sys_nanosleep() to use set_timer_nsecs()
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
` (2 preceding siblings ...)
2005-07-14 20:41 ` [RFC][PATCH 3/4] new human-time schedule_timeout() functions Nishanth Aravamudan
@ 2005-07-14 20:43 ` Nishanth Aravamudan
2005-07-14 22:28 ` [RFC][PATCH 0/4] new human-time soft-timer subsystem Roman Zippel
4 siblings, 0 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-14 20:43 UTC (permalink / raw)
To: linux-kernel

From: Nishanth Aravamudan <nacc@us.ibm.com>

Description: Add timespec and timeval conversion functions for
nanoseconds. Convert sys_nanosleep() to use schedule_timeout_nsecs().

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 include/linux/time.h |   33 +++++++++++++++++++++++++++++++++
 kernel/timer.c       |   24 ++++++++++++------------
 2 files changed, 45 insertions(+), 12 deletions(-)

diff -urpN 2.6.13-rc3-base/include/linux/time.h 2.6.13-rc3-dev/include/linux/time.h
--- 2.6.13-rc3-base/include/linux/time.h	2005-07-14 12:46:46.000000000 -0700
+++ 2.6.13-rc3-dev/include/linux/time.h	2005-07-14 12:48:25.000000000 -0700
@@ -2,6 +2,7 @@
 #define _LINUX_TIME_H
 
 #include <linux/types.h>
+#include <asm/div64.h>
 
 #ifdef __KERNEL__
 #include <linux/seqlock.h>
@@ -126,6 +127,38 @@ set_normalized_timespec (struct timespec
 	ts->tv_nsec = nsec;
 }
 
+/* Inline helper functions */
+static inline struct timeval nsecs_to_timeval(u64 ns)
+{
+	struct timeval tv;
+	tv.tv_sec = div_long_long_rem(ns, NSEC_PER_SEC, &tv.tv_usec);
+	tv.tv_usec = (tv.tv_usec + NSEC_PER_USEC/2) / NSEC_PER_USEC;
+	return tv;
+}
+
+static inline struct timespec nsecs_to_timespec(u64 ns)
+{
+	struct timespec ts;
+	ts.tv_sec = div_long_long_rem(ns, NSEC_PER_SEC, &ts.tv_nsec);
+	return ts;
+}
+
+static inline u64 timespec_to_nsecs(struct timespec* ts)
+{
+	u64 ret;
+	ret = ((u64)ts->tv_sec) * NSEC_PER_SEC;
+	ret += (u64)ts->tv_nsec;
+	return ret;
+}
+
+static inline u64 timeval_to_nsecs(struct timeval* tv)
+{
+	u64 ret;
+	ret = ((u64)tv->tv_sec) * NSEC_PER_SEC;
+	ret += ((u64)tv->tv_usec) * NSEC_PER_USEC;
+	return ret;
+}
+
 #endif /* __KERNEL__ */
 
 #define NFDBITS			__NFDBITS
diff -urpN 2.6.13-rc3-base/kernel/timer.c 2.6.13-rc3-dev/kernel/timer.c
--- 2.6.13-rc3-base/kernel/timer.c	2005-07-14 12:46:46.000000000 -0700
+++ 2.6.13-rc3-dev/kernel/timer.c	2005-07-14 12:48:25.000000000 -0700
@@ -1481,21 +1481,21 @@ asmlinkage long sys_gettid(void)
 
 static long __sched nanosleep_restart(struct restart_block *restart)
 {
-	unsigned long expire = restart->arg0, now = jiffies;
+	u64 expire = restart->arg0, now = do_monotonic_clock();
 	struct timespec __user *rmtp = (struct timespec __user *) restart->arg1;
 	long ret;
 
 	/* Did it expire while we handled signals? */
-	if (!time_after(expire, now))
+	if (now > expire)
 		return 0;
 
-	current->state = TASK_INTERRUPTIBLE;
-	expire = schedule_timeout(expire - now);
+	set_current_state(TASK_INTERRUPTIBLE);
+	expire = schedule_timeout_nsecs(expire - now);
 
 	ret = 0;
 	if (expire) {
 		struct timespec t;
-		jiffies_to_timespec(expire, &t);
+		t = nsecs_to_timespec(expire);
 
 		ret = -ERESTART_RESTARTBLOCK;
 		if (rmtp && copy_to_user(rmtp, &t, sizeof(t)))
@@ -1508,7 +1508,7 @@ static long __sched nanosleep_restart(st
 asmlinkage long sys_nanosleep(struct timespec __user *rqtp, struct timespec __user *rmtp)
 {
 	struct timespec t;
-	unsigned long expire;
+	u64 expire;
 	long ret;
 
 	if (copy_from_user(&t, rqtp, sizeof(t)))
@@ -1517,20 +1517,20 @@ asmlinkage long sys_nanosleep(struct tim
 	if ((t.tv_nsec >= 1000000000L) || (t.tv_nsec < 0) || (t.tv_sec < 0))
 		return -EINVAL;
 
-	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
-	current->state = TASK_INTERRUPTIBLE;
-	expire = schedule_timeout(expire);
+	expire = timespec_to_nsecs(&t);
+	set_current_state(TASK_INTERRUPTIBLE);
+	expire = schedule_timeout_nsecs(expire);
 
 	ret = 0;
 	if (expire) {
 		struct restart_block *restart;
-		jiffies_to_timespec(expire, &t);
+		t = nsecs_to_timespec(expire);
 		if (rmtp && copy_to_user(rmtp, &t, sizeof(t)))
 			return -EFAULT;
 
 		restart = &current_thread_info()->restart_block;
 		restart->fn = nanosleep_restart;
-		restart->arg0 = jiffies + expire;
+		restart->arg0 = do_monotonic_clock() + expire;
 		restart->arg1 = (unsigned long) rmtp;
 		ret = -ERESTART_RESTARTBLOCK;
 	}
@@ -1642,7 +1642,7 @@ static void __devinit init_timers_cpu(in
 	for (j = 0; j < TVR_SIZE; j++)
 		INIT_LIST_HEAD(base->tv1.vec + j);
 
-	base->timer_jiffies = jiffies;
+	base->last_timer_time = 0UL;
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
* Re: [RFC][PATCH 0/4] new human-time soft-timer subsystem
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
` (3 preceding siblings ...)
2005-07-14 20:43 ` [RFC][PATCH 4/4] convert sys_nanosleep() to use set_timer_nsecs() Nishanth Aravamudan
@ 2005-07-14 22:28 ` Roman Zippel
2005-07-17 0:53 ` Nishanth Aravamudan
4 siblings, 1 reply; 12+ messages in thread
From: Roman Zippel @ 2005-07-14 22:28 UTC (permalink / raw)
To: Nishanth Aravamudan; +Cc: linux-kernel

Hi,

On Thu, 14 Jul 2005, Nishanth Aravamudan wrote:

> We no longer use jiffies (the variable) as the basis for determining
> what "time" a timer should expire or when it should be added. Instead,
> we use a new function, do_monotonic_clock(), which is simply a wrapper
> for getnstimeofday().

And suddenly a simple 32bit integer becomes a complex 64bit integer, which
requires hardware access to read a timer and additional conversion into ns.
Why is suddenly everyone so obsessed with molesting something simple and
cute as jiffies?

bye, Roman
* Re: [RFC][PATCH 0/4] new human-time soft-timer subsystem
2005-07-14 22:28 ` [RFC][PATCH 0/4] new human-time soft-timer subsystem Roman Zippel
@ 2005-07-17 0:53 ` Nishanth Aravamudan
0 siblings, 0 replies; 12+ messages in thread
From: Nishanth Aravamudan @ 2005-07-17 0:53 UTC (permalink / raw)
To: Roman Zippel; +Cc: linux-kernel

On 15.07.2005 [00:28:44 +0200], Roman Zippel wrote:
> Hi,
>
> On Thu, 14 Jul 2005, Nishanth Aravamudan wrote:
>
> > We no longer use jiffies (the variable) as the basis for determining
> > what "time" a timer should expire or when it should be added. Instead,
> > we use a new function, do_monotonic_clock(), which is simply a wrapper
> > for getnstimeofday().
>
> And suddenly a simple 32bit integer becomes a complex 64bit integer, which
> requires hardware access to read a timer and additional conversion into ns.
> Why is suddenly everyone so obsessed with molesting something simple and
> cute as jiffies?

Thanks for the feedback, Roman. I know the 64-bit operations are
critical from a performance perspective and may be excessive from a
pragmatic perspective. Maybe an alternative would be to only provide
*microsecond* resolution in the software, which I currently assume is
storable in an unsigned long (a little over an hour?). We could then
provide a supplemental interface for those sleeps which would exceed
this time, either via looping or a 64-bit parameter for this special
interface. Would that perhaps be a better alternative from the 64-bit
perspective?

We could do this one better, perhaps, by basically doing exactly what
jiffies does now, but storing a time value (in microseconds) instead of
a count of the number of ticks (jiffies' current interpretation). This
would perhaps be a 64-bit op, but that is currently the case with
jiffies_64++ (or jiffies_64 += jiffies_increment).

I will work on some patches to do something to this effect and will
bring it up during the time/timer talk (Saturday at 13h30).

Thanks again,
Nish
Thread overview: 12+ messages
2005-07-14 20:26 [RFC][PATCH 0/4] new human-time soft-timer subsystem Nishanth Aravamudan
2005-07-14 20:28 ` [RFC][PATCH 1/4] add jiffies_to_nsecs() helper and fix up size of usecs Nishanth Aravamudan
2005-07-14 20:54 ` Dave Hansen
2005-07-14 21:03 ` Nishanth Aravamudan
2005-07-15 12:14 ` Pavel Machek
2005-07-17 0:44 ` Nishanth Aravamudan
2005-07-14 20:40 ` [RFC][PATCH 2/4] human-time soft-timer core changes Nishanth Aravamudan
2005-07-18 21:53 ` [RFC][UPDATE PATCH " Nishanth Aravamudan
2005-07-14 20:41 ` [RFC][PATCH 3/4] new human-time schedule_timeout() functions Nishanth Aravamudan
2005-07-14 20:43 ` [RFC][PATCH 4/4] convert sys_nanosleep() to use set_timer_nsecs() Nishanth Aravamudan
2005-07-14 22:28 ` [RFC][PATCH 0/4] new human-time soft-timer subsystem Roman Zippel
2005-07-17 0:53 ` Nishanth Aravamudan