* [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state()
@ 2005-07-07 21:31 domen
2005-07-08 23:08 ` Andrew Morton
0 siblings, 1 reply; 50+ messages in thread
From: domen @ 2005-07-07 21:31 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, Christophe Lucas, domen
[-- Attachment #1: set_current_state-drivers_char_ip2_i2lib --]
[-- Type: text/plain, Size: 1092 bytes --]
From: Christophe Lucas <clucas@rotomalug.org>
Use set_current_state() instead of direct assignment of
current->state.
Signed-off-by: Christophe Lucas <clucas@rotomalug.org>
Signed-off-by: Domen Puncer <domen@coderock.org>
---
i2lib.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
Index: quilt/drivers/char/ip2/i2lib.c
===================================================================
--- quilt.orig/drivers/char/ip2/i2lib.c
+++ quilt/drivers/char/ip2/i2lib.c
@@ -655,7 +655,7 @@ i2QueueCommands(int type, i2ChanStrPtr p
timeout--; // So negative values == forever
if (!in_interrupt()) {
- current->state = TASK_INTERRUPTIBLE;
+ set_current_state(TASK_INTERRUPTIBLE);
schedule_timeout(1); // short nap
} else {
// we cannot sched/sleep in interrrupt silly
@@ -1132,7 +1132,7 @@ i2Output(i2ChanStrPtr pCh, const char *p
ip2trace (CHANN, ITRC_OUTPUT, 61, 0 );
- current->state = TASK_INTERRUPTIBLE;
+ set_current_state(TASK_INTERRUPTIBLE);
schedule_timeout(2);
if (signal_pending(current)) {
break;
--
^ permalink raw reply [flat|nested] 50+ messages in thread* Re: [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state() 2005-07-07 21:31 [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state() domen @ 2005-07-08 23:08 ` Andrew Morton 2005-07-08 23:22 ` Nish Aravamudan 2005-07-23 0:27 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan 0 siblings, 2 replies; 50+ messages in thread From: Andrew Morton @ 2005-07-08 23:08 UTC (permalink / raw) To: domen; +Cc: linux-kernel, clucas, domen domen@coderock.org wrote: > > @@ -655,7 +655,7 @@ i2QueueCommands(int type, i2ChanStrPtr p > timeout--; // So negative values == forever > > if (!in_interrupt()) { I worry about what this driver is trying to do... > - current->state = TASK_INTERRUPTIBLE; > + set_current_state(TASK_INTERRUPTIBLE); > schedule_timeout(1); // short nap We do this all over the place. Adding new schedule_timeout_interruptible() and schedule_timeout_uninterruptible() would reduce kernel size and neaten things up. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state() 2005-07-08 23:08 ` Andrew Morton @ 2005-07-08 23:22 ` Nish Aravamudan 2005-07-23 0:27 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan 1 sibling, 0 replies; 50+ messages in thread From: Nish Aravamudan @ 2005-07-08 23:22 UTC (permalink / raw) To: Andrew Morton; +Cc: domen, linux-kernel, clucas On 7/8/05, Andrew Morton <akpm@osdl.org> wrote: > domen@coderock.org wrote: > > > > @@ -655,7 +655,7 @@ i2QueueCommands(int type, i2ChanStrPtr p > > timeout--; // So negative values == forever > > > > if (!in_interrupt()) { > > I worry about what this driver is trying to do... > > > - current->state = TASK_INTERRUPTIBLE; > > + set_current_state(TASK_INTERRUPTIBLE); > > schedule_timeout(1); // short nap > > We do this all over the place. Adding new schedule_timeout_interruptible() > and schedule_timeout_uninterruptible() would reduce kernel size and neaten > things up. I'll get a patch and users set up soon to do just this. Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-08 23:08 ` Andrew Morton 2005-07-08 23:22 ` Nish Aravamudan @ 2005-07-23 0:27 ` Nishanth Aravamudan 2005-07-23 0:31 ` Arjan van de Ven 1 sibling, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 0:27 UTC (permalink / raw) To: Andrew Morton; +Cc: domen, linux-kernel, clucas On 08.07.2005 [16:08:24 -0700], Andrew Morton wrote: > domen@coderock.org wrote: > > > > @@ -655,7 +655,7 @@ i2QueueCommands(int type, i2ChanStrPtr p > > timeout--; // So negative values == forever > > > > if (!in_interrupt()) { > > I worry about what this driver is trying to do... > > > - current->state = TASK_INTERRUPTIBLE; > > + set_current_state(TASK_INTERRUPTIBLE); > > schedule_timeout(1); // short nap > > We do this all over the place. Adding new schedule_timeout_interruptible() > and schedule_timeout_uninterruptible() would reduce kernel size and neaten > things up. How does something like this look? If this looks ok, I'll send out bunches of patches to add users of the new interfaces. Description: Add wrappers for interruptible/uninterruptible schedule_timeout() callers. Also add millisecond equivalents. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- include/linux/sched.h | 11 ++++ kernel/timer.c | 125 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 135 insertions(+), 1 deletion(-) diff -urpN 2.6.13-rc3/include/linux/sched.h 2.6.13-rc3-new_interfaces/include/linux/sched.h --- 2.6.13-rc3/include/linux/sched.h 2005-07-13 15:52:14.000000000 -0700 +++ 2.6.13-rc3-new_interfaces/include/linux/sched.h 2005-07-22 16:28:47.000000000 -0700 @@ -181,8 +181,17 @@ extern void scheduler_tick(void); /* Is this address in the __sched functions? */ extern int in_sched_functions(unsigned long addr); -#define MAX_SCHEDULE_TIMEOUT LONG_MAX +#define MAX_SCHEDULE_TIMEOUT LONG_MAX +#define MAX_SCHEDULE_TIMEOUT_MSECS UINT_MAX extern signed long FASTCALL(schedule_timeout(signed long timeout)); +extern signed long FASTCALL(schedule_timeout_interruptible + (signed long timeout)); +extern signed long FASTCALL(schedule_timeout_uninterruptible + (signed long timeout)); +extern unsigned int FASTCALL(schedule_timeout_interruptible_msecs + (unsigned int timeout_msecs)); +extern unsigned int FASTCALL(schedule_timeout_uninterruptible_msecs + (unsigned int timeout_msecs)); asmlinkage void schedule(void); struct namespace; diff -urpN 2.6.13-rc3/kernel/timer.c 2.6.13-rc3-new_interfaces/kernel/timer.c --- 2.6.13-rc3/kernel/timer.c 2005-07-13 15:52:14.000000000 -0700 +++ 2.6.13-rc3-new_interfaces/kernel/timer.c 2005-07-22 16:31:31.000000000 -0700 @@ -1153,6 +1153,131 @@ fastcall signed long __sched schedule_ti EXPORT_SYMBOL(schedule_timeout); +/** + * schedule_timeout_interruptible - sleep until timeout, wait-queue + * event or signal + * @timeout: timeout value in jiffies + * + * See the comment for schedule_timeout() for details. + */ +fastcall signed long __sched schedule_timeout_interruptible(signed long timeout) +{ + set_current_state(TASK_INTERRUPTIBLE); + return schedule_timeout(timeout); +} + +EXPORT_SYMBOL(schedule_timeout_interruptible); + +/** + * schedule_timeout_uninterruptible - sleep until timeout or wait-queue + * event + * @timeout: timeout value in jiffies + * + * See the comment for schedule_timeout() for details. + */ +fastcall signed long __sched schedule_timeout_uninterruptible(signed long timeout) +{ + set_current_state(TASK_UNINTERRUPTIBLE); + return schedule_timeout(timeout); +} + +EXPORT_SYMBOL(schedule_timeout_uninterruptible); + +/* + * schedule_timeout_msecs - sleep until timeout + * @timeout_msecs: timeout value in milliseconds + * + * A human-time (but otherwise identical) alternative to + * schedule_timeout() The state, therefore, *does* need to be set before + * calling this function, but this function should *never* be called + * directly. Use the nice wrappers, schedule_{interruptible, + * uninterruptible}_timeout_msecs(). + * + * See the comment for schedule_timeout() for details. + */ +fastcall inline unsigned int __sched schedule_timeout_msecs + (unsigned int timeout_msecs) +{ + struct timer_list timer; + unsigned long expire_jifs; + signed long remaining_jifs; + + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { + schedule(); + goto out; + } + + /* + * msecs_to_jiffies() is a unit conversion, which truncates + * (rounds down), so we need to add 1. + */ + expire_jifs = jiffies + msecs_to_jiffies(timeout_msecs) + 1; + + init_timer(&timer); + timer.expires = expire_jifs; + timer.data = (unsigned long) current; + timer.function = process_timeout; + + add_timer(&timer); + schedule(); + del_singleshot_timer_sync(&timer); + + remaining_jifs = expire_jifs - jiffies; + /* if we have woken up *before* we have requested */ + if (remaining_jifs > 0) + /* + * don't need to add 1 here, even though there is + * truncation, because we will add 1 if/when the value + * is sent back in + */ + timeout_msecs = jiffies_to_msecs(remaining_jifs); + else + timeout_msecs = 0; + + out: + return timeout_msecs; +} + +/** + * schedule_timeout_msecs_interruptible - sleep until timeout, + * wait-queue event, or signal + * @timeout_msecs: timeout value in milliseconds + * + * A nice wrapper for the common + * set_current_state()/schedule_timeout_msecs() usage. The state, + * therefore, does *not* need to be set before calling this function. + * + * See the comment for schedule_timeout() for details. + */ +fastcall unsigned int __sched schedule_timeout_msecs_interruptible + (unsigned int timeout_msecs) +{ + set_current_state(TASK_INTERRUPTIBLE); + return schedule_timeout_msecs(timeout_msecs); +} + +EXPORT_SYMBOL_GPL(schedule_timeout_msecs_interruptible); + +/** + * schedule_timeout_msecs_uninterrutible - sleep until timeout or + * wait-queue event + * @timeout_msecs: timeout value in milliseconds + * + * A nice wrapper for the common + * set_current_state()/schedule_timeout_msecs() usage. The state, + * therefore, does *not* need to be set before calling this function. + * + * See the comment for schedule_timeout() for details. + */ +fastcall unsigned int __sched schedule_timeout_msecs_uninterruptible + (unsigned int timeout_msecs) +{ + set_current_state(TASK_UNINTERRUPTIBLE); + return schedule_timeout_msecs(timeout_msecs); +} + +EXPORT_SYMBOL_GPL(schedule_timeout_msecs_uninterruptible); + /* Thread ID - the internal kernel "pid" */ asmlinkage long sys_gettid(void) { ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 0:27 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan @ 2005-07-23 0:31 ` Arjan van de Ven 2005-07-23 1:08 ` [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces Nishanth Aravamudan 2005-07-23 10:50 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel 0 siblings, 2 replies; 50+ messages in thread From: Arjan van de Ven @ 2005-07-23 0:31 UTC (permalink / raw) To: Nishanth Aravamudan; +Cc: Andrew Morton, domen, linux-kernel, clucas > How does something like this look? If this looks ok, I'll send out > bunches of patches to add users of the new interfaces. I'd drop the FASTCALL stuff... nowadays with regparm that's automatic and the cost of register-vs-stack isn't too big anyway Also I'd rather not add the non-msec ones... either you're raw and use HZ, or you are "cooked" and use the msec variant.. I dont' see the point of adding an "in the middle" one. (Yes this means that several users need to be transformed to msecs but... I consider that progress ;) ^ permalink raw reply [flat|nested] 50+ messages in thread
* [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces 2005-07-23 0:31 ` Arjan van de Ven @ 2005-07-23 1:08 ` Nishanth Aravamudan 2005-07-23 2:30 ` Andrew Morton 2005-07-23 10:50 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel 1 sibling, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 1:08 UTC (permalink / raw) To: Arjan van de Ven; +Cc: Andrew Morton, domen, linux-kernel, clucas On 22.07.2005 [20:31:55 -0400], Arjan van de Ven wrote: > > > How does something like this look? If this looks ok, I'll send out > > bunches of patches to add users of the new interfaces. > > I'd drop the FASTCALL stuff... nowadays with regparm that's automatic > and the cost of register-vs-stack isn't too big anyway > > > Also I'd rather not add the non-msec ones... either you're raw and use > HZ, or you are "cooked" and use the msec variant.. I dont' see the point > of adding an "in the middle" one. (Yes this means that several users > need to be transformed to msecs but... I consider that progress ;) Thanks for the feedback! Does this look better? I will be more than happy to convert those users to msecs which should, btw :) I've been waiting to add this interface so that the ones that couldn't use msleep{,_interruptible}() because of wait-queues, can also be changed to use milliseconds. Hrm, I also noticed I typo'd the externs in sched.h. Fixed in the new patch. Description: Add schedule_timeout_msecs() and add wrappers for interruptible/uninterruptible schedule_timeout_msecs() callers. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- include/linux/sched.h | 7 +++ kernel/timer.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 100 insertions(+), 1 deletion(-) diff -urpN 2.6.13-rc3/include/linux/sched.h 2.6.13-rc3-new_interfaces/include/linux/sched.h --- 2.6.13-rc3/include/linux/sched.h 2005-07-13 15:52:14.000000000 -0700 +++ 2.6.13-rc3-new_interfaces/include/linux/sched.h 2005-07-22 18:06:36.000000000 -0700 @@ -181,8 +181,13 @@ extern void scheduler_tick(void); /* Is this address in the __sched functions? */ extern int in_sched_functions(unsigned long addr); -#define MAX_SCHEDULE_TIMEOUT LONG_MAX +#define MAX_SCHEDULE_TIMEOUT LONG_MAX +#define MAX_SCHEDULE_TIMEOUT_MSECS UINT_MAX extern signed long FASTCALL(schedule_timeout(signed long timeout)); +extern unsigned int schedule_timeout_msecs_interruptible + (unsigned int timeout_msecs); +extern unsigned int schedule_timeout_msecs_uninterruptible + (unsigned int timeout_msecs); asmlinkage void schedule(void); struct namespace; diff -urpN 2.6.13-rc3/kernel/timer.c 2.6.13-rc3-new_interfaces/kernel/timer.c --- 2.6.13-rc3/kernel/timer.c 2005-07-13 15:52:14.000000000 -0700 +++ 2.6.13-rc3-new_interfaces/kernel/timer.c 2005-07-22 18:00:46.000000000 -0700 @@ -1153,6 +1153,100 @@ fastcall signed long __sched schedule_ti EXPORT_SYMBOL(schedule_timeout); +/* + * schedule_timeout_msecs - sleep until timeout + * @timeout_msecs: timeout value in milliseconds + * + * A human-time (but otherwise identical) alternative to + * schedule_timeout() The state, therefore, *does* need to be set before + * calling this function, but this function should *never* be called + * directly. Use the nice wrappers, schedule_{interruptible, + * uninterruptible}_timeout_msecs(). + * + * See the comment for schedule_timeout() for details. + */ +inline unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) +{ + struct timer_list timer; + unsigned long expire_jifs; + signed long remaining_jifs; + + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { + schedule(); + goto out; + } + + /* + * msecs_to_jiffies() is a unit conversion, which truncates + * (rounds down), so we need to add 1. + */ + expire_jifs = jiffies + msecs_to_jiffies(timeout_msecs) + 1; + + init_timer(&timer); + timer.expires = expire_jifs; + timer.data = (unsigned long) current; + timer.function = process_timeout; + + add_timer(&timer); + schedule(); + del_singleshot_timer_sync(&timer); + + remaining_jifs = expire_jifs - jiffies; + /* if we have woken up *before* we have requested */ + if (remaining_jifs > 0) + /* + * don't need to add 1 here, even though there is + * truncation, because we will add 1 if/when the value + * is sent back in + */ + timeout_msecs = jiffies_to_msecs(remaining_jifs); + else + timeout_msecs = 0; + + out: + return timeout_msecs; +} + +/** + * schedule_timeout_msecs_interruptible - sleep until timeout, + * wait-queue event, or signal + * @timeout_msecs: timeout value in milliseconds + * + * A nice wrapper for the common + * set_current_state()/schedule_timeout_msecs() usage. The state, + * therefore, does *not* need to be set before calling this function. + * + * See the comment for schedule_timeout() for details. + */ +unsigned int __sched schedule_timeout_msecs_interruptible + (unsigned int timeout_msecs) +{ + set_current_state(TASK_INTERRUPTIBLE); + return schedule_timeout_msecs(timeout_msecs); +} + +EXPORT_SYMBOL_GPL(schedule_timeout_msecs_interruptible); + +/** + * schedule_timeout_msecs_uninterrutible - sleep until timeout or + * wait-queue event + * @timeout_msecs: timeout value in milliseconds + * + * A nice wrapper for the common + * set_current_state()/schedule_timeout_msecs() usage. The state, + * therefore, does *not* need to be set before calling this function. + * + * See the comment for schedule_timeout() for details. + */ +unsigned int __sched schedule_timeout_msecs_uninterruptible + (unsigned int timeout_msecs) +{ + set_current_state(TASK_UNINTERRUPTIBLE); + return schedule_timeout_msecs(timeout_msecs); +} + +EXPORT_SYMBOL_GPL(schedule_timeout_msecs_uninterruptible); + /* Thread ID - the internal kernel "pid" */ asmlinkage long sys_gettid(void) { ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces 2005-07-23 1:08 ` [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces Nishanth Aravamudan @ 2005-07-23 2:30 ` Andrew Morton 2005-07-23 16:23 ` Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: Andrew Morton @ 2005-07-23 2:30 UTC (permalink / raw) To: Nishanth Aravamudan; +Cc: arjan, domen, linux-kernel, clucas Nishanth Aravamudan <nacc@us.ibm.com> wrote: > > > +/* > + * schedule_timeout_msecs - sleep until timeout > + * @timeout_msecs: timeout value in milliseconds > + * > + * A human-time (but otherwise identical) alternative to > + * schedule_timeout() The state, therefore, *does* need to be set before > + * calling this function, but this function should *never* be called > + * directly. Use the nice wrappers, schedule_{interruptible, > + * uninterruptible}_timeout_msecs(). > + * > + * See the comment for schedule_timeout() for details. > + */ > +inline unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) > +{ Making this inline will add more kernel text. Can't we uninline it? I'm surprised that this function is not implemented as a call to the existing schedule_timeout()? > + init_timer(&timer); > + timer.expires = expire_jifs; > + timer.data = (unsigned long) current; > + timer.function = process_timeout; hm, if we add the needed typecast to TIMER_INITIALIZER() the above could be timer = TIMER_INITIALIZER(process_timeout, expire_jifs, (unsigned long)current); ^ permalink raw reply [flat|nested] 50+ messages in thread
* [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces 2005-07-23 2:30 ` Andrew Morton @ 2005-07-23 16:23 ` Nishanth Aravamudan 0 siblings, 0 replies; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 16:23 UTC (permalink / raw) To: Andrew Morton; +Cc: arjan, domen, linux-kernel, clucas On 23.07.2005 [12:30:00 +1000], Andrew Morton wrote: > Nishanth Aravamudan <nacc@us.ibm.com> wrote: > > > > > > +/* > > + * schedule_timeout_msecs - sleep until timeout > > + * @timeout_msecs: timeout value in milliseconds > > + * > > + * A human-time (but otherwise identical) alternative to > > + * schedule_timeout() The state, therefore, *does* need to be set before > > + * calling this function, but this function should *never* be called > > + * directly. Use the nice wrappers, schedule_{interruptible, > > + * uninterruptible}_timeout_msecs(). > > + * > > + * See the comment for schedule_timeout() for details. > > + */ > > +inline unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) > > +{ > > Making this inline will add more kernel text. Can't we uninline it? Uninlined in the latest version (see below). > I'm surprised that this function is not implemented as a call to the > existing schedule_timeout()? Well, I was considering doing that, but I tried to make this new interface a little more sane. I don't think we need to worry about signedness of the parameter any more (and thus we don't need to consider returning a negative value). Calling schedule_timeout() underneath schedule_timeout_msecs() will require us to do those conditional checks; I guess this isn't criticial in a sleeping path. Also, I would need to make some modifications (not critical, really), so that MAX_SCHEDULE_TIMEOUT_MSECS would map 1:1 to MAX_SCHEDULE_TIMEOUT (which is in jiffies). But I'm happy to change this, also done below -- does that map better to what you were thinking? > > + init_timer(&timer); > > + timer.expires = expire_jifs; > > + timer.data = (unsigned long) current; > > + timer.function = process_timeout; > > hm, if we add the needed typecast to TIMER_INITIALIZER() the above could be > > timer = TIMER_INITIALIZER(process_timeout, expire_jifs, > (unsigned long)current); Done as well below. Thanks, Nish Description: Add millisecond wrapper for schedule_timeout(), schedule_timeout_msecs(). Add wrappers for interruptible/uninterruptible schedule_timeout_msecs() callers. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- include/linux/sched.h | 7 ++++ kernel/timer.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 81 insertions(+), 1 deletion(-) diff -urpN 2.6.13-rc3/include/linux/sched.h 2.6.13-rc3-new_interfaces/include/linux/sched.h --- 2.6.13-rc3/include/linux/sched.h 2005-07-13 15:52:14.000000000 -0700 +++ 2.6.13-rc3-new_interfaces/include/linux/sched.h 2005-07-22 18:06:36.000000000 -0700 @@ -181,8 +181,13 @@ extern void scheduler_tick(void); /* Is this address in the __sched functions? */ extern int in_sched_functions(unsigned long addr); -#define MAX_SCHEDULE_TIMEOUT LONG_MAX +#define MAX_SCHEDULE_TIMEOUT LONG_MAX +#define MAX_SCHEDULE_TIMEOUT_MSECS UINT_MAX extern signed long FASTCALL(schedule_timeout(signed long timeout)); +extern unsigned int schedule_timeout_msecs_interruptible + (unsigned int timeout_msecs); +extern unsigned int schedule_timeout_msecs_uninterruptible + (unsigned int timeout_msecs); asmlinkage void schedule(void); struct namespace; diff -urpN 2.6.13-rc3/kernel/timer.c 2.6.13-rc3-new_interfaces/kernel/timer.c --- 2.6.13-rc3/kernel/timer.c 2005-07-13 15:52:14.000000000 -0700 +++ 2.6.13-rc3-new_interfaces/kernel/timer.c 2005-07-23 09:20:51.000000000 -0700 @@ -1153,6 +1153,81 @@ fastcall signed long __sched schedule_ti EXPORT_SYMBOL(schedule_timeout); +/* + * schedule_timeout_msecs - sleep until timeout + * @timeout_msecs: timeout value in milliseconds + * + * A human-time (but otherwise identical) alternative to + * schedule_timeout() The state, therefore, *does* need to be set before + * calling this function, but this function should *never* be called + * directly. Use the nice wrappers, schedule_{interruptible, + * uninterruptible}_timeout_msecs(). + * + * See the comment for schedule_timeout() for details. + */ +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) +{ + unsigned long expire_jifs; + + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { + expire_jifs = MAX_SCHEDULE_TIMEOUT; + } else { + /* + * msecs_to_jiffies() is a unit conversion, which truncates + * (rounds down), so we need to add 1. + */ + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + } + + expire_jifs = schedule_timeout(expire_jifs); + + /* + * don't need to add 1 here, even though there is truncation, + * because we will add 1 if/when the value is sent back in + */ + return jiffies_to_msecs(expire_jifs); +} + +/** + * schedule_timeout_msecs_interruptible - sleep until timeout, + * wait-queue event, or signal + * @timeout_msecs: timeout value in milliseconds + * + * A nice wrapper for the common + * set_current_state()/schedule_timeout_msecs() usage. The state, + * therefore, does *not* need to be set before calling this function. + * + * See the comment for schedule_timeout() for details. + */ +unsigned int __sched schedule_timeout_msecs_interruptible + (unsigned int timeout_msecs) +{ + set_current_state(TASK_INTERRUPTIBLE); + return schedule_timeout_msecs(timeout_msecs); +} + +EXPORT_SYMBOL_GPL(schedule_timeout_msecs_interruptible); + +/** + * schedule_timeout_msecs_uninterrutible - sleep until timeout or + * wait-queue event + * @timeout_msecs: timeout value in milliseconds + * + * A nice wrapper for the common + * set_current_state()/schedule_timeout_msecs() usage. The state, + * therefore, does *not* need to be set before calling this function. + * + * See the comment for schedule_timeout() for details. + */ +unsigned int __sched schedule_timeout_msecs_uninterruptible + (unsigned int timeout_msecs) +{ + set_current_state(TASK_UNINTERRUPTIBLE); + return schedule_timeout_msecs(timeout_msecs); +} + +EXPORT_SYMBOL_GPL(schedule_timeout_msecs_uninterruptible); + /* Thread ID - the internal kernel "pid" */ asmlinkage long sys_gettid(void) { ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 0:31 ` Arjan van de Ven 2005-07-23 1:08 ` [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces Nishanth Aravamudan @ 2005-07-23 10:50 ` Roman Zippel 2005-07-23 11:09 ` Arjan van de Ven 2005-07-23 16:30 ` Nishanth Aravamudan 1 sibling, 2 replies; 50+ messages in thread From: Roman Zippel @ 2005-07-23 10:50 UTC (permalink / raw) To: Arjan van de Ven Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas Hi, On Fri, 22 Jul 2005, Arjan van de Ven wrote: > Also I'd rather not add the non-msec ones... either you're raw and use > HZ, or you are "cooked" and use the msec variant.. I dont' see the point > of adding an "in the middle" one. (Yes this means that several users > need to be transformed to msecs but... I consider that progress ;) What's wrong with using jiffies? It's simple and the current timeout system is based on it. Calling it something else doesn't suddenly give you more precision. bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 10:50 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel @ 2005-07-23 11:09 ` Arjan van de Ven 2005-07-23 11:55 ` Roman Zippel 2005-07-23 16:30 ` Nishanth Aravamudan 1 sibling, 1 reply; 50+ messages in thread From: Arjan van de Ven @ 2005-07-23 11:09 UTC (permalink / raw) To: Roman Zippel Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas On Sat, 2005-07-23 at 12:50 +0200, Roman Zippel wrote: > Hi, > > On Fri, 22 Jul 2005, Arjan van de Ven wrote: > > > Also I'd rather not add the non-msec ones... either you're raw and use > > HZ, or you are "cooked" and use the msec variant.. I dont' see the point > > of adding an "in the middle" one. (Yes this means that several users > > need to be transformed to msecs but... I consider that progress ;) > > What's wrong with using jiffies? A lot of the (driver) users want a wallclock based timeout. For that, miliseconds is a more obvious API with less chance to get the jiffies/HZ conversion wrong by the driver writer. > It's simple and the current timeout > system is based on it. Calling it something else doesn't suddenly give you > more precision. It's not about precision, it's about making the new API (which is intended to be a simplification already due to sucking in the state setting) match the intent closer. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 11:09 ` Arjan van de Ven @ 2005-07-23 11:55 ` Roman Zippel 2005-07-23 12:51 ` Arjan van de Ven 2005-07-23 16:37 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan 0 siblings, 2 replies; 50+ messages in thread From: Roman Zippel @ 2005-07-23 11:55 UTC (permalink / raw) To: Arjan van de Ven Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas Hi, On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > What's wrong with using jiffies? > > A lot of the (driver) users want a wallclock based timeout. For that, > miliseconds is a more obvious API with less chance to get the jiffies/HZ > conversion wrong by the driver writer. We have helper functions for that. The point about using jiffies is to make it _very_ clear, that the timeout is imprecise and for most users this is sufficient. bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 11:55 ` Roman Zippel @ 2005-07-23 12:51 ` Arjan van de Ven 2005-07-23 13:04 ` Roman Zippel 2005-07-23 16:37 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan 1 sibling, 1 reply; 50+ messages in thread From: Arjan van de Ven @ 2005-07-23 12:51 UTC (permalink / raw) To: Roman Zippel Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas On Sat, 2005-07-23 at 13:55 +0200, Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > > > What's wrong with using jiffies? > > > > A lot of the (driver) users want a wallclock based timeout. For that, > > miliseconds is a more obvious API with less chance to get the jiffies/HZ > > conversion wrong by the driver writer. > > We have helper functions for that. I think we just disagree then... I consider this one a helper function as well, one with a simpler API that wraps the more complex and powerful api. Sure sleeps are imprecise. All sleeps are even mdelay() (due to preemption). ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 12:51 ` Arjan van de Ven @ 2005-07-23 13:04 ` Roman Zippel 2005-07-23 13:12 ` Arjan van de Ven 2005-07-23 16:43 ` Nishanth Aravamudan 0 siblings, 2 replies; 50+ messages in thread From: Roman Zippel @ 2005-07-23 13:04 UTC (permalink / raw) To: Arjan van de Ven Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas Hi, On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > > > What's wrong with using jiffies? > > > > > > A lot of the (driver) users want a wallclock based timeout. For that, > > > miliseconds is a more obvious API with less chance to get the jiffies/HZ > > > conversion wrong by the driver writer. > > > > We have helper functions for that. > > I think we just disagree then... I consider this one a helper function > as well, one with a simpler API that wraps the more complex and powerful > api. How is it more powerful? The next step in introducing such API is that people start complaining, they don't get ms precision, which of course is fixed by the next patch, with then results in that the whole timer system becomes more complex for a few misguided users. Keep the thing as simple as possible and jiffies _are_ simple. Please show me the problem, that needs to be fixed. bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 13:04 ` Roman Zippel @ 2005-07-23 13:12 ` Arjan van de Ven 2005-07-23 13:29 ` Roman Zippel 2005-07-23 16:43 ` Nishanth Aravamudan 1 sibling, 1 reply; 50+ messages in thread From: Arjan van de Ven @ 2005-07-23 13:12 UTC (permalink / raw) To: Roman Zippel Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas On Sat, 2005-07-23 at 15:04 +0200, Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > > > > > What's wrong with using jiffies? > > > > > > > > A lot of the (driver) users want a wallclock based timeout. For that, > > > > miliseconds is a more obvious API with less chance to get the jiffies/HZ > > > > conversion wrong by the driver writer. > > > > > > We have helper functions for that. > > > > I think we just disagree then... I consider this one a helper function > > as well, one with a simpler API that wraps the more complex and powerful > > api. > > How is it more powerful? jiffies/HZ is the more powerful API. The msec one which also sets current to (un)interruptible is the simplified wrapper on top. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 13:12 ` Arjan van de Ven @ 2005-07-23 13:29 ` Roman Zippel 2005-07-23 13:32 ` Arjan van de Ven 2005-07-23 16:44 ` Nishanth Aravamudan 0 siblings, 2 replies; 50+ messages in thread From: Roman Zippel @ 2005-07-23 13:29 UTC (permalink / raw) To: Arjan van de Ven Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas Hi, On Sat, 23 Jul 2005, Arjan van de Ven wrote: > jiffies/HZ is the more powerful API. The msec one which also sets > current to (un)interruptible is the simplified wrapper on top. So why do you want to hide it? Make the jiffies based API the primary one and add some convenience functions, where we BTW could convert already constants at compile time. bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 13:29 ` Roman Zippel @ 2005-07-23 13:32 ` Arjan van de Ven 2005-07-23 15:56 ` Roman Zippel 2005-07-23 16:44 ` Nishanth Aravamudan 1 sibling, 1 reply; 50+ messages in thread From: Arjan van de Ven @ 2005-07-23 13:32 UTC (permalink / raw) To: Roman Zippel Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas On Sat, 2005-07-23 at 15:29 +0200, Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > > jiffies/HZ is the more powerful API. The msec one which also sets > > current to (un)interruptible is the simplified wrapper on top. > > So why do you want to hide it? I don't want to hide it at all. I want to provide a simpler variant for it for the cases where a simpler variant is enough. It really is a helper to take some common task and get that closer to what the programmer wants (wallclock time delay) ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 13:32 ` Arjan van de Ven @ 2005-07-23 15:56 ` Roman Zippel 0 siblings, 0 replies; 50+ messages in thread From: Roman Zippel @ 2005-07-23 15:56 UTC (permalink / raw) To: Arjan van de Ven Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas Hi, On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > > jiffies/HZ is the more powerful API. The msec one which also sets > > > current to (un)interruptible is the simplified wrapper on top. > > > > So why do you want to hide it? > > I don't want to hide it at all. I want to provide a simpler variant for > it for the cases where a simpler variant is enough. Then I don't understand your initial comment and I don't understand the point in adding a ms variant of this interface. If someone does care about the return value, he is better off to convert the initial value into jiffies and continue with that. > It really is a > helper to take some common task and get that closer to what the > programmer wants (wallclock time delay) And that's not what he gets... bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 13:29 ` Roman Zippel 2005-07-23 13:32 ` Arjan van de Ven @ 2005-07-23 16:44 ` Nishanth Aravamudan 1 sibling, 0 replies; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 16:44 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 23.07.2005 [15:29:42 +0200], Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > > jiffies/HZ is the more powerful API. The msec one which also sets > > current to (un)interruptible is the simplified wrapper on top. > > So why do you want to hide it? Make the jiffies based API the primary one > and add some convenience functions, where we BTW could convert already > constants at compile time. We are not hiding anything; we are providing an alternative for those users that would like to request wall-time (human-time) units -- those users do not (or should not, if nothing else) expect any kind of precision from the sleeping path. Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 13:04 ` Roman Zippel 2005-07-23 13:12 ` Arjan van de Ven @ 2005-07-23 16:43 ` Nishanth Aravamudan 2005-07-23 17:17 ` Roman Zippel 1 sibling, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 16:43 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 23.07.2005 [15:04:44 +0200], Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > > > > > What's wrong with using jiffies? > > > > > > > > A lot of the (driver) users want a wallclock based timeout. For that, > > > > miliseconds is a more obvious API with less chance to get the jiffies/HZ > > > > conversion wrong by the driver writer. > > > > > > We have helper functions for that. > > > > I think we just disagree then... I consider this one a helper function > > as well, one with a simpler API that wraps the more complex and powerful > > api. > > How is it more powerful? The next step in introducing such API is that > people start complaining, they don't get ms precision, which of course is > fixed by the next patch, with then results in that the whole timer system > becomes more complex for a few misguided users. Do people complain about not getting jiffy precision? I think the whole idea of discussing precision in this particular sleeping path is ludicrous. Some users of this are requesting up to 100s of millisecond delays! And, please, don't confuse this patch with my other work. The soft-timer rework was an RFC, meaning I was hoping to start a discussion and get some comments. I got comments from you and others, which I greatly appreciate. But, I think you interpret that patch as being a final version, which is ready for mainline. It very explicitly is not. I am intent on dealing with your issues (perhaps moving to usecs...), but this patch has nothing to do with that. It has everything to do with my experience replacing caller after caller of schedule_timeout() and a request from Andrew for a compact interface (the milliseconds were admittedly my idea, but he hasn't said no to that yet :) ). > Keep the thing as simple as possible and jiffies _are_ simple. Please show > me the problem, that needs to be fixed. I am not sure why jiffies are any simpler than milliseconds. In the millisecond case, conversion happens in common code and only needs to be audited once. In the jiffies case, I have to check *every* caller to see if they are doing stupid math :) Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 16:43 ` Nishanth Aravamudan @ 2005-07-23 17:17 ` Roman Zippel 2005-07-23 19:10 ` Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: Roman Zippel @ 2005-07-23 17:17 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, On Sat, 23 Jul 2005, Nishanth Aravamudan wrote: > > Keep the thing as simple as possible and jiffies _are_ simple. Please show > > me the problem, that needs to be fixed. > > I am not sure why jiffies are any simpler than milliseconds. In the > millisecond case, conversion happens in common code and only needs to be > audited once. In the jiffies case, I have to check *every* caller to see > if they are doing stupid math :) Jiffies are the basic time unit for kernel timers, hiding that fact gives users only wrong expectations about them. I don't mind using helper functions, but constant conversion can already happen at compile time and for variable timeouts the user should seriously consider using jiffies directly instead of constantly converting time values. bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 17:17 ` Roman Zippel @ 2005-07-23 19:10 ` Nishanth Aravamudan 2005-07-23 20:12 ` Roman Zippel 0 siblings, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 19:10 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 23.07.2005 [19:17:37 +0200], Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Nishanth Aravamudan wrote: > > > > Keep the thing as simple as possible and jiffies _are_ simple. Please show > > > me the problem, that needs to be fixed. > > > > I am not sure why jiffies are any simpler than milliseconds. In the > > millisecond case, conversion happens in common code and only needs to be > > audited once. In the jiffies case, I have to check *every* caller to see > > if they are doing stupid math :) > > Jiffies are the basic time unit for kernel timers, hiding that fact gives > users only wrong expectations about them. We already have msleep() and msleep_interruptible(), which hide jiffies in milliseconds. These interfaces are their parallel in the wait-queue case. If you don't want to use them, or their use is not appropriate, then the callers won't be changed. > I don't mind using helper functions, but constant conversion can already > happen at compile time and for variable timeouts the user should seriously > consider using jiffies directly instead of constantly converting time > values. My goal is to distinguish between these cases in sleeping-logic: 1) tick-oriented use schedule_timeout(), add_timer(), etc. 2) time-oriented use schedule_timeout_msecs() I am not going to run some global sed script to change the whole kernel. I may not always be successful, but I do try to be cautious in my patches. Only those callers that can benefit from using a millisecond interface will be changed. Another question (actually the same question expressed again), do Andrew or Arjan think I should reinsert the schedule_timeout_{,un}interruptible() functions for the 1) case above? Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 19:10 ` Nishanth Aravamudan @ 2005-07-23 20:12 ` Roman Zippel 2005-07-27 22:29 ` Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: Roman Zippel @ 2005-07-23 20:12 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, On Sat, 23 Jul 2005, Nishanth Aravamudan wrote: > > Jiffies are the basic time unit for kernel timers, hiding that fact gives > > users only wrong expectations about them. > > We already have msleep() and msleep_interruptible(), which hide jiffies > in milliseconds. These interfaces are their parallel in the wait-queue > case. If you don't want to use them, or their use is not appropriate, > then the callers won't be changed. I'm not exactly impressed with their implementation, it's completely silly that there is no such convience function based on jiffies, e.g. if you look at the msleep_interruptible(), you'll find quite a few "msleep_interruptible(jiffies_to_msecs())". msleep_interruptible() is especially bad as there are a few users who check the return value and since it adds one to the timeout, you can create loops which may never timeout (as e.g. in drivers/w1/w1_therm.c: w1_therm_read_bin()), this is nice example of a bad interface. These two function should actually look like this: static inline void msleep(unsigned int msecs) { sleep(msecs_to_jiffies(msecs)); } static inline int msleep_interruptible(unsigned int msecs) { sleep_interruptible(msecs_to_jiffies(msecs)) != 0; } > My goal is to distinguish between these cases in sleeping-logic: > > 1) tick-oriented > use schedule_timeout(), add_timer(), etc. > > 2) time-oriented > use schedule_timeout_msecs() There is _no_ difference, the scheduler is based on ticks. Even if we soon have different time sources, the scheduler will continue to measure the time in ticks and for a simple reason - portability. Jiffies _are_ simple, don't throw that away. bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 20:12 ` Roman Zippel @ 2005-07-27 22:29 ` Nishanth Aravamudan 2005-07-30 23:35 ` Roman Zippel 0 siblings, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-27 22:29 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 23.07.2005 [22:12:30 +0200], Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Nishanth Aravamudan wrote: > > > > Jiffies are the basic time unit for kernel timers, hiding that fact gives > > > users only wrong expectations about them. > > > > We already have msleep() and msleep_interruptible(), which hide jiffies > > in milliseconds. These interfaces are their parallel in the wait-queue > > case. If you don't want to use them, or their use is not appropriate, > > then the callers won't be changed. > > I'm not exactly impressed with their implementation, it's completely silly > that there is no such convience function based on jiffies, e.g. if you > look at the msleep_interruptible(), you'll find quite a few > "msleep_interruptible(jiffies_to_msecs())". Patches are welcome, I guess :) > msleep_interruptible() is especially bad as there are a few users who > check the return value and since it adds one to the timeout, you can > create loops which may never timeout (as e.g. in drivers/w1/w1_therm.c: > w1_therm_read_bin()), this is nice example of a bad interface. > > These two function should actually look like this: > > static inline void msleep(unsigned int msecs) > { > sleep(msecs_to_jiffies(msecs)); > } > > static inline int msleep_interruptible(unsigned int msecs) > { > sleep_interruptible(msecs_to_jiffies(msecs)) != 0; > } Where are sleep() and sleep_interruptible() defined? And this basically pushes down the identical implementation a layer, correct? Finally, you cannot simply do msecs_to_jiffies() in the caller of sleep{,_interruptible()}, unless sleep{,_interruptible()} add 1 to every request themselves. The basic issue is that with jiffies and ticks being identical, there is no concept of the inter-tick position (i.e., a sub-jiffy value), and the callers of these conversion functions *must* assume they are going to be added to the timer subsystem immediately before a timer interrupt occurs. An entire tick has not elapsed, but if the caller had not added 1, then we could potentially go off almost an entire tick early. The +1 is basically specifying we are asking to add our delay to the *next* tick. > > My goal is to distinguish between these cases in sleeping-logic: > > > > 1) tick-oriented > > use schedule_timeout(), add_timer(), etc. > > > > 2) time-oriented > > use schedule_timeout_msecs() > > There is _no_ difference, the scheduler is based on ticks. Even if we soon > have different time sources, the scheduler will continue to measure the > time in ticks and for a simple reason - portability. Jiffies _are_ simple, > don't throw that away. I agree that from an internal perspective there is no difference, but from an *interface* perspective they are hugely different, simply on the basis that one uses human-time units and one does not. I guess we must continue to agree to disagree. I am still waiting for some feedback from Andrew if he would prefer a jiffy-only version or both the jiffy and msec version. Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-27 22:29 ` Nishanth Aravamudan @ 2005-07-30 23:35 ` Roman Zippel 2005-08-01 19:35 ` [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: Roman Zippel @ 2005-07-30 23:35 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, On Wed, 27 Jul 2005, Nishanth Aravamudan wrote: > > > My goal is to distinguish between these cases in sleeping-logic: > > > > > > 1) tick-oriented > > > use schedule_timeout(), add_timer(), etc. > > > > > > 2) time-oriented > > > use schedule_timeout_msecs() > > > > There is _no_ difference, the scheduler is based on ticks. Even if we soon > > have different time sources, the scheduler will continue to measure the > > time in ticks and for a simple reason - portability. Jiffies _are_ simple, > > don't throw that away. > > I agree that from an internal perspective there is no difference, but > from an *interface* perspective they are hugely different, simply on the > basis that one uses human-time units and one does not. > > I guess we must continue to agree to disagree. I'm not really sure, what you disagree about. 1 HZ is about one second, which I don't think is such a difficult concept. I already said wrapper functions are fine and for anything smaller than HZ/2 it's probably a good idea nowadays. My main point is to keep the core functionality in jiffies and provide some wrapper functions. What exactly do you disagree here on? bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces 2005-07-30 23:35 ` Roman Zippel @ 2005-08-01 19:35 ` Nishanth Aravamudan 2005-08-03 14:20 ` Roman Zippel 0 siblings, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-01 19:35 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 31.07.2005 [01:35:35 +0200], Roman Zippel wrote: > Hi, > > On Wed, 27 Jul 2005, Nishanth Aravamudan wrote: > > > > > My goal is to distinguish between these cases in sleeping-logic: > > > > > > > > 1) tick-oriented > > > > use schedule_timeout(), add_timer(), etc. > > > > > > > > 2) time-oriented > > > > use schedule_timeout_msecs() > > > > > > There is _no_ difference, the scheduler is based on ticks. Even if we soon > > > have different time sources, the scheduler will continue to measure the > > > time in ticks and for a simple reason - portability. Jiffies _are_ simple, > > > don't throw that away. > > > > I agree that from an internal perspective there is no difference, but > > from an *interface* perspective they are hugely different, simply on the > > basis that one uses human-time units and one does not. > > > > I guess we must continue to agree to disagree. > > I'm not really sure, what you disagree about. > 1 HZ is about one second, which I don't think is such a difficult concept. > I already said wrapper functions are fine and for anything smaller than > HZ/2 it's probably a good idea nowadays. > My main point is to keep the core functionality in jiffies and provide > some wrapper functions. What exactly do you disagree here on? Well, I was under the impression you were against the patch I had sent to add some millisecond wrappers (which did *not* change any core functionality) for schedule_timeout(). If that isn't the case, I'm sorry, I misunderstood. Here's the patch again. I'm still open to feedback. Thanks, Nish Description: Add wrappers for interruptible/uninterruptible schedule_timeout() callers. Also add millisecond equivalents. I tried to make the names a bit more reasonable to prevent long lines. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- include/linux/sched.h | 7 +++ kernel/timer.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 95 insertions(+), 1 deletion(-) diff -urpN 2.6.13-rc4/include/linux/sched.h 2.6.13-rc4-dev/include/linux/sched.h --- 2.6.13-rc4/include/linux/sched.h 2005-07-29 14:11:49.000000000 -0700 +++ 2.6.13-rc4-dev/include/linux/sched.h 2005-08-01 11:20:31.000000000 -0700 @@ -181,8 +181,13 @@ extern void scheduler_tick(void); /* Is this address in the __sched functions? */ extern int in_sched_functions(unsigned long addr); -#define MAX_SCHEDULE_TIMEOUT LONG_MAX +#define MAX_SCHEDULE_TIMEOUT LONG_MAX +#define MAX_SCHEDULE_TIMEOUT_MSECS UINT_MAX extern signed long FASTCALL(schedule_timeout(signed long timeout)); +extern signed long schedule_timeout_intr(signed long timeout); +extern signed long schedule_timeout_unintr(signed long timeout); +extern unsigned int schedule_timeout_msecs_intr(unsigned int timeout_msecs); +extern unsigned int schedule_timeout_msecs_unintr(unsigned int timeout_msecs); asmlinkage void schedule(void); struct namespace; diff -urpN 2.6.13-rc4/kernel/timer.c 2.6.13-rc4-dev/kernel/timer.c --- 2.6.13-rc4/kernel/timer.c 2005-07-29 14:11:49.000000000 -0700 +++ 2.6.13-rc4-dev/kernel/timer.c 2005-08-01 12:31:53.000000000 -0700 @@ -1153,6 +1153,95 @@ fastcall signed long __sched schedule_ti EXPORT_SYMBOL(schedule_timeout); +signed long __sched schedule_timeout_intr(signed long timeout) +{ + set_current_state(TASK_INTERRUPTIBLE); + return schedule_timeout(timeout); +} + +EXPORT_SYMBOL(schedule_timeout_intr); + +signed long __sched schedule_timeout_unintr(signed long timeout) +{ + set_current_state(TASK_UNINTERRUPTIBLE); + return schedule_timeout(timeout); +} + +EXPORT_SYMBOL(schedule_timeout_unintr); + +/* + * schedule_timeout_msecs - sleep until timeout + * @timeout_msecs: timeout value in milliseconds + * + * A human-time (but otherwise identical) alternative to + * schedule_timeout() The state, therefore, *does* need to be set before + * calling this function, but this function should *never* be called + * directly. Use the nice wrappers, schedule_{interruptible, + * uninterruptible}_timeout_msecs(). + * + * See the comment for schedule_timeout() for details. + */ +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) +{ + unsigned long expire_jifs; + + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { + expire_jifs = MAX_SCHEDULE_TIMEOUT; + } else { + /* + * msecs_to_jiffies() is a unit conversion, which truncates + * (rounds down), so we need to add 1. + */ + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + } + + expire_jifs = schedule_timeout(expire_jifs); + + /* + * don't need to add 1 here, even though there is truncation, + * because we will add 1 if/when the value is sent back in + */ + return jiffies_to_msecs(expire_jifs); +} + +/** + * schedule_timeout_msecs_interruptible - sleep until timeout, + * wait-queue event, or signal + * @timeout_msecs: timeout value in milliseconds + * + * A nice wrapper for the common + * set_current_state()/schedule_timeout_msecs() usage. The state, + * therefore, does *not* need to be set before calling this function. + * + * See the comment for schedule_timeout() for details. + */ +unsigned int __sched schedule_timeout_msecs_intr(unsigned int timeout_msecs) +{ + set_current_state(TASK_INTERRUPTIBLE); + return schedule_timeout_msecs(timeout_msecs); +} + +EXPORT_SYMBOL_GPL(schedule_timeout_msecs_intr); + +/** + * schedule_timeout_msecs_uninterrutible - sleep until timeout or + * wait-queue event + * @timeout_msecs: timeout value in milliseconds + * + * A nice wrapper for the common + * set_current_state()/schedule_timeout_msecs() usage. The state, + * therefore, does *not* need to be set before calling this function. + * + * See the comment for schedule_timeout() for details. + */ +unsigned int __sched schedule_timeout_msecs_unintr(unsigned int timeout_msecs) +{ + set_current_state(TASK_UNINTERRUPTIBLE); + return schedule_timeout_msecs(timeout_msecs); +} + +EXPORT_SYMBOL_GPL(schedule_timeout_msecs_unintr); + /* Thread ID - the internal kernel "pid" */ asmlinkage long sys_gettid(void) { ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces 2005-08-01 19:35 ` [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces Nishanth Aravamudan @ 2005-08-03 14:20 ` Roman Zippel 2005-08-04 0:51 ` [PATCH] push rounding up of relative request to schedule_timeout() Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: Roman Zippel @ 2005-08-03 14:20 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, On Mon, 1 Aug 2005, Nishanth Aravamudan wrote: > +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) > +{ > + unsigned long expire_jifs; > + > + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { > + expire_jifs = MAX_SCHEDULE_TIMEOUT; > + } else { > + /* > + * msecs_to_jiffies() is a unit conversion, which truncates > + * (rounds down), so we need to add 1. > + */ > + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; > + } > + > + expire_jifs = schedule_timeout(expire_jifs); > + > + /* > + * don't need to add 1 here, even though there is truncation, > + * because we will add 1 if/when the value is sent back in > + */ > + return jiffies_to_msecs(expire_jifs); > +} As I already mentioned for msleep_interruptible this is a really terrible interface. The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the process is immediately woken up again) makes the caller suspectible to timeout manipulations and requires constant reauditing, that no caller gets it wrong, so it's better to avoid this error source completely. Constant conversion between different time units is a really bad idea. If the user needs the remaining time, he is really better off to do it himself by checking jiffies and only does an initial conversion from relative to absolute (kernel) time. This wrapper function really should be an inline function and should look more like this: static inline int schedule_timeout_msecs(unsigned int timeout_msecs) { return schedule_timeout(msecs_to_jiffies(timeout_msecs) + 1) != 0; } bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH] push rounding up of relative request to schedule_timeout() 2005-08-03 14:20 ` Roman Zippel @ 2005-08-04 0:51 ` Nishanth Aravamudan 2005-08-04 5:14 ` [UPDATE PATCH] " Nishanth Aravamudan 2005-08-04 9:38 ` [PATCH] " Roman Zippel 0 siblings, 2 replies; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-04 0:51 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote: > Hi, > > On Mon, 1 Aug 2005, Nishanth Aravamudan wrote: > > > +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) > > +{ > > + unsigned long expire_jifs; > > + > > + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { > > + expire_jifs = MAX_SCHEDULE_TIMEOUT; > > + } else { > > + /* > > + * msecs_to_jiffies() is a unit conversion, which truncates > > + * (rounds down), so we need to add 1. > > + */ > > + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; > > + } > > + > > + expire_jifs = schedule_timeout(expire_jifs); > > + > > + /* > > + * don't need to add 1 here, even though there is truncation, > > + * because we will add 1 if/when the value is sent back in > > + */ > > + return jiffies_to_msecs(expire_jifs); > > +} > > As I already mentioned for msleep_interruptible this is a really terrible > interface. > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the > process is immediately woken up again) makes the caller suspectible to > timeout manipulations and requires constant reauditing, that no caller > gets it wrong, so it's better to avoid this error source completely. After some thought today, I realized the +1 case is not specific to milliseconds. It's just that it's only being done *correctly* in the milliseconds case...I think ;) So, consider the following: We are requesting a 10 jiffy sleep via set_current_state(TASK_INTERRUPTIBLE); schedule_timeout(10); Keep in mind that jiffies is only incremented when the timer interrupt occurs (the whole point being, again, we do not have any inter-tick jiffy-value). In schedule_timeout() we will add the 10 to jiffies' current value. But what happens if we were calling schedule_timeout() immediately before the next timer interrupt occurs? Then we will only sleep slightly more than 9 jiffies, instead of the 10 requested. The basic issue is that we are always taking the floor of the current position in jiffies in schedule_timeout() by adding the relative offset to jiffies. To guarantee the timeout requested passes, we must add the relative offset to (jiffies+1) [See the attached patch, which I think fixes the "problem" in 2.6.13-rc5]. Most callers are already rounding up or adding one to their request, so it may not be a problem. And, often, these are sleeping paths, so most callers don't care about precision. So, while you are correct that there is a chance for "infinite" sleep in msleep_interruptible() and schedule_timeout_{intr,unintr}_msecs(), there technically *should* be such a possibility in the jiffies case too, but the code wasn't correct up until now. All in all, seems buggy, but my analysis may also be wrong -- and this case may be damn well near impossible to actually create. > Constant conversion between different time units is a really bad idea. If > the user needs the remaining time, he is really better off to do it > himself by checking jiffies and only does an initial conversion from > relative to absolute (kernel) time. > This wrapper function really should be an inline function and should look > more like this: > > static inline int schedule_timeout_msecs(unsigned int timeout_msecs) > { > return schedule_timeout(msecs_to_jiffies(timeout_msecs) + 1) != 0; > } I don't think I want the schedule_timeout*() functions' return values' meanings to be different depending on whether you use milliseconds or jiffies. Your version makes the millisecond-case boolean in return, which is differnent than schedule_timeout()'s remaining-jiffies return value. I have also been thinking about the need to use while(time_after(timeout_jiffies, timeout)) vs. while(timeout_msecs), but I will respond to a different e-mail about that. Thanks, Nish --- timer.c | 11 ++++++----- 1 files changed, 6 insertions(+), 5 deletions(-) Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- 2.6.13-rc5/kernel/timer.c 2005-08-01 12:31:53.000000000 -0700 +++ 2.6.13-rc5-dev/kernel/timer.c 2005-08-03 17:30:10.000000000 -0700 @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti } } - expire = timeout + jiffies; + expire = timeout + jiffies + 1; init_timer(&timer); timer.expires = expire; @@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms } else { /* * msecs_to_jiffies() is a unit conversion, which truncates - * (rounds down), so we need to add 1. + * (rounds down), so we need to add 1, but this is taken + * care of by schedule_timeout() now. */ - expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + expire_jifs = msecs_to_jiffies(timeout_msecs); } expire_jifs = schedule_timeout(expire_jifs); @@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time */ void msleep(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout) { set_current_state(TASK_UNINTERRUPTIBLE); @@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep); */ unsigned long msleep_interruptible(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout && !signal_pending(current)) { set_current_state(TASK_INTERRUPTIBLE); ^ permalink raw reply [flat|nested] 50+ messages in thread
* [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 0:51 ` [PATCH] push rounding up of relative request to schedule_timeout() Nishanth Aravamudan @ 2005-08-04 5:14 ` Nishanth Aravamudan 2005-08-04 16:45 ` George Anzinger 2005-08-04 17:05 ` George Anzinger 2005-08-04 9:38 ` [PATCH] " Roman Zippel 1 sibling, 2 replies; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-04 5:14 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote: > On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote: > > Hi, > > > > On Mon, 1 Aug 2005, Nishanth Aravamudan wrote: > > > > > +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) > > > +{ > > > + unsigned long expire_jifs; > > > + > > > + if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { > > > + expire_jifs = MAX_SCHEDULE_TIMEOUT; > > > + } else { > > > + /* > > > + * msecs_to_jiffies() is a unit conversion, which truncates > > > + * (rounds down), so we need to add 1. > > > + */ > > > + expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; > > > + } > > > + > > > + expire_jifs = schedule_timeout(expire_jifs); > > > + > > > + /* > > > + * don't need to add 1 here, even though there is truncation, > > > + * because we will add 1 if/when the value is sent back in > > > + */ > > > + return jiffies_to_msecs(expire_jifs); > > > +} > > > > As I already mentioned for msleep_interruptible this is a really terrible > > interface. > > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the > > process is immediately woken up again) makes the caller suspectible to > > timeout manipulations and requires constant reauditing, that no caller > > gets it wrong, so it's better to avoid this error source completely. > > After some thought today, I realized the +1 case is not specific to > milliseconds. It's just that it's only being done *correctly* in the > milliseconds case...I think ;) <snip> > Description: Ensure that schedule_timeout() requests can not possibly > expire early in the timeout case, by adding the requested relative jiffy > value to the next value of jiffies. Currently, by adding to the current > value of jiffies, we might actually expire a jiffy too early (in a > corner case). Sorry, I forgot that sys_nanosleep() also always adds 1 to the request (to account for this same issue, I believe, as POSIX demands no early return from nanosleep() calls). There are some other locations where similar + (t.tv_sec || t.tv_nsec) rounding-ups occur. I'll fix those separately if this patch goes in. I change the one in sys_nanosleep() below to maintain the same latency as we currently have. I also screwed up my layout in the previous submission, sorry about that. Description: Ensure that schedule_timeout() requests can not possibly expire early in the timeout case, by adding the requested relative jiffy value to the next value of jiffies. Currently, by adding to the current value of jiffies, we might actually expire a jiffy too early (in a corner case). Modify the callers of schedule_timeout() in timer.c to not add 1 themselves. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- timer.c | 13 +++++++------ 1 files changed, 7 insertions(+), 6 deletions(-) --- 2.6.13-rc4-dev/kernel/timer.c 2005-08-01 12:31:53.000000000 -0700 +++ 2.6.13-rc5/kernel/timer.c 2005-08-03 22:05:16.000000000 -0700 @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti } } - expire = timeout + jiffies; + expire = timeout + jiffies + 1; init_timer(&timer); timer.expires = expire; @@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms } else { /* * msecs_to_jiffies() is a unit conversion, which truncates - * (rounds down), so we need to add 1. + * (rounds down), so we need to add 1, but this is taken + * care of by schedule_timeout() now. */ - expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; + expire_jifs = msecs_to_jiffies(timeout_msecs); } expire_jifs = schedule_timeout(expire_jifs); @@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim if ((t.tv_nsec >= 1000000000L) || (t.tv_nsec < 0) || (t.tv_sec < 0)) return -EINVAL; - expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); + expire = timespec_to_jiffies(&t); current->state = TASK_INTERRUPTIBLE; expire = schedule_timeout(expire); @@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time */ void msleep(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout) { set_current_state(TASK_UNINTERRUPTIBLE); @@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep); */ unsigned long msleep_interruptible(unsigned int msecs) { - unsigned long timeout = msecs_to_jiffies(msecs) + 1; + unsigned long timeout = msecs_to_jiffies(msecs); while (timeout && !signal_pending(current)) { set_current_state(TASK_INTERRUPTIBLE); ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 5:14 ` [UPDATE PATCH] " Nishanth Aravamudan @ 2005-08-04 16:45 ` George Anzinger 2005-08-04 18:48 ` Nish Aravamudan 2005-08-16 23:05 ` Nishanth Aravamudan 2005-08-04 17:05 ` George Anzinger 1 sibling, 2 replies; 50+ messages in thread From: George Anzinger @ 2005-08-04 16:45 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and timeval_to_jiffies() to add 1. This is NOT the right thing to do. For repeating times (see setitimer code) we need the actual time as we KNOW where the jiffies edge is in the repeating case. The +1 is needed ONLY for the initial time, not the repeating time. See: http://marc.theaimsgroup.com/?l=linux-kernel&m=112208357906156&w=2 George Nishanth Aravamudan wrote: > On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote: > >>On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote: >> >>>Hi, >>> >>>On Mon, 1 Aug 2005, Nishanth Aravamudan wrote: >>> >>> >>>>+unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs) >>>>+{ >>>>+ unsigned long expire_jifs; >>>>+ >>>>+ if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) { >>>>+ expire_jifs = MAX_SCHEDULE_TIMEOUT; >>>>+ } else { >>>>+ /* >>>>+ * msecs_to_jiffies() is a unit conversion, which truncates >>>>+ * (rounds down), so we need to add 1. >>>>+ */ >>>>+ expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; >>>>+ } >>>>+ >>>>+ expire_jifs = schedule_timeout(expire_jifs); >>>>+ >>>>+ /* >>>>+ * don't need to add 1 here, even though there is truncation, >>>>+ * because we will add 1 if/when the value is sent back in >>>>+ */ >>>>+ return jiffies_to_msecs(expire_jifs); >>>>+} >>> >>>As I already mentioned for msleep_interruptible this is a really terrible >>>interface. >>>The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the >>>process is immediately woken up again) makes the caller suspectible to >>>timeout manipulations and requires constant reauditing, that no caller >>>gets it wrong, so it's better to avoid this error source completely. >> >>After some thought today, I realized the +1 case is not specific to >>milliseconds. It's just that it's only being done *correctly* in the >>milliseconds case...I think ;) > > > <snip> > >>Description: Ensure that schedule_timeout() requests can not possibly >>expire early in the timeout case, by adding the requested relative jiffy >>value to the next value of jiffies. Currently, by adding to the current >>value of jiffies, we might actually expire a jiffy too early (in a >>corner case). > > > Sorry, I forgot that sys_nanosleep() also always adds 1 to the request > (to account for this same issue, I believe, as POSIX demands no early > return from nanosleep() calls). There are some other locations where > similar > > + (t.tv_sec || t.tv_nsec) > > rounding-ups occur. I'll fix those separately if this patch goes in. I > change the one in sys_nanosleep() below to maintain the same latency as > we currently have. I also screwed up my layout in the previous > submission, sorry about that. > > Description: Ensure that schedule_timeout() requests can not possibly > expire early in the timeout case, by adding the requested relative jiffy > value to the next value of jiffies. Currently, by adding to the current > value of jiffies, we might actually expire a jiffy too early (in a > corner case). Modify the callers of schedule_timeout() in timer.c to not > add 1 themselves. > > Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> > > --- > > timer.c | 13 +++++++------ > 1 files changed, 7 insertions(+), 6 deletions(-) > > --- 2.6.13-rc4-dev/kernel/timer.c 2005-08-01 12:31:53.000000000 -0700 > +++ 2.6.13-rc5/kernel/timer.c 2005-08-03 22:05:16.000000000 -0700 > @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti > } > } > > - expire = timeout + jiffies; > + expire = timeout + jiffies + 1; > > init_timer(&timer); > timer.expires = expire; > @@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms > } else { > /* > * msecs_to_jiffies() is a unit conversion, which truncates > - * (rounds down), so we need to add 1. > + * (rounds down), so we need to add 1, but this is taken > + * care of by schedule_timeout() now. > */ > - expire_jifs = msecs_to_jiffies(timeout_msecs) + 1; > + expire_jifs = msecs_to_jiffies(timeout_msecs); > } > > expire_jifs = schedule_timeout(expire_jifs); > @@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim > if ((t.tv_nsec >= 1000000000L) || (t.tv_nsec < 0) || (t.tv_sec < 0)) > return -EINVAL; > > - expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); > + expire = timespec_to_jiffies(&t); > current->state = TASK_INTERRUPTIBLE; > expire = schedule_timeout(expire); > > @@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time > */ > void msleep(unsigned int msecs) > { > - unsigned long timeout = msecs_to_jiffies(msecs) + 1; > + unsigned long timeout = msecs_to_jiffies(msecs); > > while (timeout) { > set_current_state(TASK_UNINTERRUPTIBLE); > @@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep); > */ > unsigned long msleep_interruptible(unsigned int msecs) > { > - unsigned long timeout = msecs_to_jiffies(msecs) + 1; > + unsigned long timeout = msecs_to_jiffies(msecs); > > while (timeout && !signal_pending(current)) { > set_current_state(TASK_INTERRUPTIBLE); > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 16:45 ` George Anzinger @ 2005-08-04 18:48 ` Nish Aravamudan 2005-08-16 23:05 ` Nishanth Aravamudan 1 sibling, 0 replies; 50+ messages in thread From: Nish Aravamudan @ 2005-08-04 18:48 UTC (permalink / raw) To: george Cc: Nishanth Aravamudan, Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 8/4/05, George Anzinger <george@mvista.com> wrote: > Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and > timeval_to_jiffies() to add 1. This is NOT the right thing to do. For > repeating times (see setitimer code) we need the actual time as we KNOW > where the jiffies edge is in the repeating case. The +1 is needed ONLY > for the initial time, not the repeating time. Please read the patch. I didn't touch timespec_to_jiffies() or timeval_to_jiffies(). Not sure why you think I did. I agree that we only need the initial time, my patch is no good. But it is hard for non-itimers, like schedule_timeout() callers, to provide an interface that only adds 1 to the initial request, since the callers currently pass in an absolute jiffies value. Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 16:45 ` George Anzinger 2005-08-04 18:48 ` Nish Aravamudan @ 2005-08-16 23:05 ` Nishanth Aravamudan 2005-08-17 0:39 ` George Anzinger 1 sibling, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-16 23:05 UTC (permalink / raw) To: George Anzinger Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: > Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and > timeval_to_jiffies() to add 1. This is NOT the right thing to do. For > repeating times (see setitimer code) we need the actual time as we KNOW > where the jiffies edge is in the repeating case. The +1 is needed ONLY > for the initial time, not the repeating time. > > > See: > http://marc.theaimsgroup.com/?l=linux-kernel&m=112208357906156&w=2 I followed that thread, George, but I think it's a different case with schedule_timeout() [maybe this indicates drivers/other users should maybe be using itimers, but I'll get to that in a sec]. With schedule_timeout(), we are just given a relative jiffies value. We have no context as to which task is requesting the delay, per se, meaning we don't (can't) know from the interface whether this is the first delay in a sequence, or a brand new one, without changing all users to have some sort of control structure. The callers of schedule_timeout() don't even get a pointer to the timer added internally. So, adding 1 to all sleeps seems like it might be reasonable, as looping sleeps probably need to use a different interface. I had worked a bit ago on something like poll_event() with the kernel-janitors group, which would abstract out the repeated sleeps. Basically wait_event() without wait-queues... Maybe we could make such an interface just use itimers? I've attached my old patch for poll_event(), just for reference. My point, I guess, is that in the schedule_timeout() case, we don't know where the jiffies edge is, as we either expire or receive a wait-queue event/signal, we never mod_timer() the internal timer... So we have to assume that we need to sleep the request. But maybe Roman's idea of sleeping a certain number of jiffy edges is sufficient. I am not yet convinced driver authors want/need such an interface, though, still thinking it over. Thanks for all the feedback, Nish --- 2.6.11-kj-v/include/linux/delay.h 2005-03-01 23:37:51.000000000 -0800 +++ 2.6.11-kj/include/linux/delay.h 2005-03-04 16:03:24.000000000 -0800 @@ -47,4 +47,106 @@ static inline void ssleep(unsigned int s msleep(seconds * 1000); } +/* + * poll_event* check if an @condition is true every @interval milliseconds, + * up to a @timeout maximum, if specified, else forever. + * The *_interruptible versions will sleep in TASK_INTERRUPTIBLE + * (instead of TASK_UNINTERRUPTIBLE) and will return early on signals + * Note that these interfaces are distinctly different than + * wait_event*() in two ways: + * 1) No wait-queues + * 2) Specify the time you want to sleep, not when you want to + * wake-up. Thus, callers should not add the current time to + * @timeout before calling, as it will be done by the macro. + */ + +/** + * poll_event - check a condition at regular intervals + * @condition: Condition to check + * @interval: Number of milliseconds between checks + */ +#define poll_event(condition, interval) \ +do { \ + while (!(condition)) \ + msleep(interval); \ +} while(0) + +/** + * poll_event_timeout - check a condition at regular intervals with a timeout + * @condition: Condition to check + * @interval: Number of milliseconds between checks + * @timeout: Maximum total number of milliseconds to take + * returns: 0, if condition caused return + * -ETIME, if timeout + */ +#define poll_event_timeout(condition, interval, timeout) \ +({ \ + int __ret = 0; \ + unsigned long stop = jiffies + msecs_to_jiffies(timeout); \ + \ + for (;;) { \ + if (condition) \ + break; \ + msleep(interval); \ + if (time_after(jiffies, stop)) { \ + __ret = -ETIME; \ + break; \ + } \ + } \ + __ret; \ +}) + +/** + * poll_event_interruptible - check a condition at regular intervals, returning early on signals + * @condition: Condition to check + * @interval: Number of milliseconds between checks + * returns: 0, if condition caused return + * -EINTR, if a signal + */ +#define poll_event_interruptible(condition, interval) \ +({ \ + int __ret = 0; \ + \ + for (;;) { \ + if (condition) \ + break; \ + msleep_interruptible(interval); \ + if (signal_pending(current)) { \ + __ret = -EINTR; \ + break; \ + } \ + } \ + __ret; +}) + +/** + * poll_event_interruptible_timeout - check a condition at regular intervals with a timeout, returning early on signals + * @condition: Condition to check + * @interval: Number of milliseconds between checks + * @timeout: Maximum total number of milliseconds to take + * returns 0, if condition caused return + * -EINTR, if a signal + * -ETIME, if timeout + */ +#define poll_event_interruptible_timeout(conditition, interval, timeout) \ +({ \ + int __ret = 0; \ + unsigned long stop = jiffies + msecs_to_jiffies(timeout); \ + \ + for (;;) { \ + if (condition) \ + break; \ + msleep_interruptible(interval); \ + if (signal_pending(current)) { \ + __ret = -EINTR; \ + break; \ + } \ + if (time_after(jiffies, stop)) { \ + __ret = -ETIME; \ + break; \ + } \ + } \ + __ret; \ +}) + #endif /* defined(_LINUX_DELAY_H) */ ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-16 23:05 ` Nishanth Aravamudan @ 2005-08-17 0:39 ` George Anzinger 2005-08-17 5:56 ` Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: George Anzinger @ 2005-08-17 0:39 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Nishanth Aravamudan wrote: > On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: > >>Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and >>timeval_to_jiffies() to add 1. This is NOT the right thing to do. For >>repeating times (see setitimer code) we need the actual time as we KNOW >>where the jiffies edge is in the repeating case. The +1 is needed ONLY >>for the initial time, not the repeating time. >> >> >>See: >>http://marc.theaimsgroup.com/?l=linux-kernel&m=112208357906156&w=2 > > > I followed that thread, George, but I think it's a different case with > schedule_timeout() [maybe this indicates drivers/other users should > maybe be using itimers, but I'll get to that in a sec]. I think I miss understood back then :). > > With schedule_timeout(), we are just given a relative jiffies value. We > have no context as to which task is requesting the delay, per se, > meaning we don't (can't) know from the interface whether this is the > first delay in a sequence, or a brand new one, without changing all > users to have some sort of control structure. The callers of > schedule_timeout() don't even get a pointer to the timer added > internally. > > So, adding 1 to all sleeps seems like it might be reasonable, as looping > sleeps probably need to use a different interface. I had worked a bit > ago on something like poll_event() with the kernel-janitors group, which > would abstract out the repeated sleeps. Basically wait_event() without > wait-queues... Maybe we could make such an interface just use itimers? > I've attached my old patch for poll_event(), just for reference. I think not. itimers is really pointed at a particular system call and has resources in the task structure to do it. These would be hard to share... > > My point, I guess, is that in the schedule_timeout() case, we don't know > where the jiffies edge is, as we either expire or receive a wait-queue > event/signal, we never mod_timer() the internal timer... So we have to > assume that we need to sleep the request. But maybe Roman's idea of > sleeping a certain number of jiffy edges is sufficient. I am not yet > convinced driver authors want/need such an interface, though, still > thinking it over. IMNSHO we should not get too parental with kernel only interfaces. Adding 1 is easy enough for the caller and even easier to explain in the instructions (i.e. this call sleeps for X jiffies edges). This allows the caller to do more if needed and, should he ever just want to sync to the next jiffie he does not have to deal with backing out that +1. > -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-17 0:39 ` George Anzinger @ 2005-08-17 5:56 ` Nishanth Aravamudan 2005-08-17 19:51 ` George Anzinger 0 siblings, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-17 5:56 UTC (permalink / raw) To: George Anzinger Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 16.08.2005 [17:39:11 -0700], George Anzinger wrote: > Nishanth Aravamudan wrote: > >On 04.08.2005 [09:45:55 -0700], George Anzinger wrote: > > > >>Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and > >>timeval_to_jiffies() to add 1. This is NOT the right thing to do. For > >>repeating times (see setitimer code) we need the actual time as we KNOW > >>where the jiffies edge is in the repeating case. The +1 is needed ONLY > >>for the initial time, not the repeating time. > >> > >> > >>See: > >>http://marc.theaimsgroup.com/?l=linux-kernel&m=112208357906156&w=2 > > > > > >I followed that thread, George, but I think it's a different case with > >schedule_timeout() [maybe this indicates drivers/other users should > >maybe be using itimers, but I'll get to that in a sec]. > > I think I miss understood back then :). Ok, no problem. I was just thinking today about all the issues that were borught up... I really appreciate your feedback. > > > >With schedule_timeout(), we are just given a relative jiffies value. We > >have no context as to which task is requesting the delay, per se, > >meaning we don't (can't) know from the interface whether this is the > >first delay in a sequence, or a brand new one, without changing all > >users to have some sort of control structure. The callers of > >schedule_timeout() don't even get a pointer to the timer added > >internally. > > > >So, adding 1 to all sleeps seems like it might be reasonable, as looping > >sleeps probably need to use a different interface. I had worked a bit > >ago on something like poll_event() with the kernel-janitors group, which > >would abstract out the repeated sleeps. Basically wait_event() without > >wait-queues... Maybe we could make such an interface just use itimers? > >I've attached my old patch for poll_event(), just for reference. > > I think not. itimers is really pointed at a particular system call and > has resources in the task structure to do it. These would be hard to > share... Ok, I wasn't sure about that -- I'm not too familiar with the itimer code (maybe I'll take a look at it soon :) ) > >My point, I guess, is that in the schedule_timeout() case, we don't know > >where the jiffies edge is, as we either expire or receive a wait-queue > >event/signal, we never mod_timer() the internal timer... So we have to > >assume that we need to sleep the request. But maybe Roman's idea of > >sleeping a certain number of jiffy edges is sufficient. I am not yet > >convinced driver authors want/need such an interface, though, still > >thinking it over. > > IMNSHO we should not get too parental with kernel only interfaces. > Adding 1 is easy enough for the caller and even easier to explain in the > instructions (i.e. this call sleeps for X jiffies edges). This allows > the caller to do more if needed and, should he ever just want to sync to > the next jiffie he does not have to deal with backing out that +1. I don't want to be too parental either, but I also am trying to avoid code duplication. Lots of drivers basically do something like poll_event() does (or could do with some changes), i.e. looping a constant amount multiple times, checking something every so often. The patch was just a thought, though. I will keep evaluating drivers and see if it's a useful interface to have eventually. I guess I'm just concerned with making an unintuitive interface. As was brought up at OLS, drivers are a major source of bugs/buggy code. The simpler, more useful we can make interfaces, the better, I think. I'm not claiming you disagree, I just want to make my own motives clear. While fixing up the schedule_timeout() comment would make it clear what schedule_timeout() achieves, I'm not sure how useful such an interface is, if every caller adds 1 :) I need to mull it over, though... Lots to consider. I also, of course, want to stay flexible for the reasons you mention (letting the driver adjust the timeout as they expect to). Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-17 5:56 ` Nishanth Aravamudan @ 2005-08-17 19:51 ` George Anzinger 2005-08-17 22:24 ` Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: George Anzinger @ 2005-08-17 19:51 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Nishanth Aravamudan wrote: ~ >>IMNSHO we should not get too parental with kernel only interfaces. >>Adding 1 is easy enough for the caller and even easier to explain in the >>instructions (i.e. this call sleeps for X jiffies edges). This allows >>the caller to do more if needed and, should he ever just want to sync to >>the next jiffie he does not have to deal with backing out that +1. > > > I don't want to be too parental either, but I also am trying to avoid > code duplication. Lots of drivers basically do something like > poll_event() does (or could do with some changes), i.e. looping a > constant amount multiple times, checking something every so often. The > patch was just a thought, though. I will keep evaluating drivers and see > if it's a useful interface to have eventually. > > I guess I'm just concerned with making an unintuitive interface. As was > brought up at OLS, drivers are a major source of bugs/buggy code. The > simpler, more useful we can make interfaces, the better, I think. I'm > not claiming you disagree, I just want to make my own motives clear. > While fixing up the schedule_timeout() comment would make it clear what > schedule_timeout() achieves, I'm not sure how useful such an interface > is, if every caller adds 1 :) I need to mull it over, though... Lots to > consider. I also, of course, want to stay flexible for the reasons you > mention (letting the driver adjust the timeout as they expect to). I would leave the +1 alone and put in the correct documentation. This way _more_ folks will be made aware of the mid jiffie issue. Far to often we see (and let get in) patches that mess up user interfaces around this issue. The recent changes to itimer come to mind... > ~ -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-17 19:51 ` George Anzinger @ 2005-08-17 22:24 ` Nishanth Aravamudan 0 siblings, 0 replies; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-17 22:24 UTC (permalink / raw) To: George Anzinger Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 17.08.2005 [12:51:17 -0700], George Anzinger wrote: > Nishanth Aravamudan wrote: > ~ > >>IMNSHO we should not get too parental with kernel only interfaces. > >>Adding 1 is easy enough for the caller and even easier to explain in the > >>instructions (i.e. this call sleeps for X jiffies edges). This allows > >>the caller to do more if needed and, should he ever just want to sync to > >>the next jiffie he does not have to deal with backing out that +1. > > > > > >I don't want to be too parental either, but I also am trying to avoid > >code duplication. Lots of drivers basically do something like > >poll_event() does (or could do with some changes), i.e. looping a > >constant amount multiple times, checking something every so often. The > >patch was just a thought, though. I will keep evaluating drivers and see > >if it's a useful interface to have eventually. > > > >I guess I'm just concerned with making an unintuitive interface. As was > >brought up at OLS, drivers are a major source of bugs/buggy code. The > >simpler, more useful we can make interfaces, the better, I think. I'm > >not claiming you disagree, I just want to make my own motives clear. > >While fixing up the schedule_timeout() comment would make it clear what > >schedule_timeout() achieves, I'm not sure how useful such an interface > >is, if every caller adds 1 :) I need to mull it over, though... Lots to > >consider. I also, of course, want to stay flexible for the reasons you > >mention (letting the driver adjust the timeout as they expect to). > > I would leave the +1 alone and put in the correct documentation. This > way _more_ folks will be made aware of the mid jiffie issue. Far to > often we see (and let get in) patches that mess up user interfaces > around this issue. The recent changes to itimer come to mind... Ok, makes sense to me; does the following patch work for everybody? The wording is a bit awkward, but so is the issue :) Description: Fix schedule_timeout()'s comment, indicated the inter-tick rounding issue. Since the kernel does not keep track of an inter-tick position in jiffies, a caller which wishes to sleep for at least a certain number of jiffies should add its request to the *next* jiffies value (meaning add 1 to its relative request). Make that clear in the comment. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- diff -urpN 2.6.13-rc6/kernel/timer.c 2.6.13-rc6-dev/kernel/timer.c --- 2.6.13-rc6/kernel/timer.c 2005-08-09 15:22:57.000000000 -0700 +++ 2.6.13-rc6-dev/kernel/timer.c 2005-08-17 15:21:35.000000000 -0700 @@ -1077,9 +1077,15 @@ static void process_timeout(unsigned lon * schedule_timeout - sleep until timeout * @timeout: timeout value in jiffies * - * Make the current task sleep until @timeout jiffies have - * elapsed. The routine will return immediately unless - * the current task state has been set (see set_current_state()). + * Make the current task sleep until @timeout timer interrupts have + * occurred, meaning jiffies has incremented @timeout times and not + * necessarily that @timeout jiffies have elapsed. If the task wishes to + * sleep until (at least) @timeout jiffies have elapsed, then it should + * add 1 to its request. This is necessary, as the kernel does not keep + * track of an inter-jiffy position, so the caller must "round up" its + * request so that it begins at the next jiffy. The routine will return + * immediately unless the current task state has been set (see + * set_current_state()). * * You can set the task state as follows - * ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 5:14 ` [UPDATE PATCH] " Nishanth Aravamudan 2005-08-04 16:45 ` George Anzinger @ 2005-08-04 17:05 ` George Anzinger 2005-08-04 18:49 ` Nish Aravamudan 1 sibling, 1 reply; 50+ messages in thread From: George Anzinger @ 2005-08-04 17:05 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Nishanth Aravamudan wrote: ~ > Sorry, I forgot that sys_nanosleep() also always adds 1 to the request > (to account for this same issue, I believe, as POSIX demands no early > return from nanosleep() calls). There are some other locations where > similar > > + (t.tv_sec || t.tv_nsec) This is not the same as "always add 1". We don't do it this way just to have fun with C. If you change schedule_timeout() to add the 1, nanosleep() will need to do things differently to get the same behavior. (And, YES users do pass in zero sleep times.) -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 17:05 ` George Anzinger @ 2005-08-04 18:49 ` Nish Aravamudan 0 siblings, 0 replies; 50+ messages in thread From: Nish Aravamudan @ 2005-08-04 18:49 UTC (permalink / raw) To: george Cc: Nishanth Aravamudan, Roman Zippel, Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 8/4/05, George Anzinger <george@mvista.com> wrote: > Nishanth Aravamudan wrote: > ~ > > Sorry, I forgot that sys_nanosleep() also always adds 1 to the request > > (to account for this same issue, I believe, as POSIX demands no early > > return from nanosleep() calls). There are some other locations where > > similar > > > > + (t.tv_sec || t.tv_nsec) > > This is not the same as "always add 1". We don't do it this way just to > have fun with C. If you change schedule_timeout() to add the 1, > nanosleep() will need to do things differently to get the same behavior. > (And, YES users do pass in zero sleep times.) Fair enough. Will need to think about this more. Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 0:51 ` [PATCH] push rounding up of relative request to schedule_timeout() Nishanth Aravamudan 2005-08-04 5:14 ` [UPDATE PATCH] " Nishanth Aravamudan @ 2005-08-04 9:38 ` Roman Zippel 2005-08-04 14:33 ` Nishanth Aravamudan 2005-08-04 17:08 ` Andrew Morton 1 sibling, 2 replies; 50+ messages in thread From: Roman Zippel @ 2005-08-04 9:38 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, Andrew, please drop this patch. Nish, please stop fucking around with kernel APIs. On Wed, 3 Aug 2005, Nishanth Aravamudan wrote: > > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the > > process is immediately woken up again) makes the caller suspectible to > > timeout manipulations and requires constant reauditing, that no caller > > gets it wrong, so it's better to avoid this error source completely. Nish, did you read this? Is my English this bad? > --- 2.6.13-rc5/kernel/timer.c 2005-08-01 12:31:53.000000000 -0700 > +++ 2.6.13-rc5-dev/kernel/timer.c 2005-08-03 17:30:10.000000000 -0700 > @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti > } > } > > - expire = timeout + jiffies; > + expire = timeout + jiffies + 1; > > init_timer(&timer); > timer.expires = expire; And a little later it does: timeout = expire - jiffies; which means callers can get back a larger timeout. Nish, did you check and fix _all_ users? I can easily find a number of users which immediately use the return value as next timeout. There are _a_lot_ of schedule_timeout(1) for small busy loops, these are asking for "please schedule until next tick". Did you check that these are still ok? schedule_timeout() is arguably a broken API, but can we please _first_ come up with a plan to fix this, before we break even more? bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 9:38 ` [PATCH] " Roman Zippel @ 2005-08-04 14:33 ` Nishanth Aravamudan 2005-08-04 18:59 ` Roman Zippel 2005-08-04 17:08 ` Andrew Morton 1 sibling, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-04 14:33 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 04.08.2005 [11:38:33 +0200], Roman Zippel wrote: > Hi, > > Andrew, please drop this patch. > Nish, please stop fucking around with kernel APIs. The comment for schedule_timeout() claims: * Make the current task sleep until @timeout jiffies have * elapsed. Currently, it does not do so. I was simply trying to make the function do what it claims it does. > On Wed, 3 Aug 2005, Nishanth Aravamudan wrote: > > > > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the > > > process is immediately woken up again) makes the caller suspectible to > > > timeout manipulations and requires constant reauditing, that no caller > > > gets it wrong, so it's better to avoid this error source completely. > > Nish, did you read this? Is my English this bad? Your English is fine. My point was that the +1 issues (potential infinite timeouts) are a problem with *jiffies* not milliseconds. And thus need to be pushed down to the jiffies layer. I think my explanation was pretty clear. > > > --- 2.6.13-rc5/kernel/timer.c 2005-08-01 12:31:53.000000000 -0700 > > +++ 2.6.13-rc5-dev/kernel/timer.c 2005-08-03 17:30:10.000000000 -0700 > > @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti > > } > > } > > > > - expire = timeout + jiffies; > > + expire = timeout + jiffies + 1; > > > > init_timer(&timer); > > timer.expires = expire; > > And a little later it does: > > timeout = expire - jiffies; > > which means callers can get back a larger timeout. Hrm, maybe I will need to cache the parameter. But the only way you would get a return value greater than requested is if schedule() returns before the next jiffy? Which I guess could happen if a wait-queue event or signal (if the task is set as such) occurs before the next interrupt *and* there are no other threads running, I believe, as process_timeout() -> try_to_wake_up() only puts the task back on the run queue; I don't think it actually preempts the currently running task, does it? Just an FYI, though, that is a problem in the current code, in the sense that we will pass back the exact same value again Let me take a look, maybe we need to do some -1 later (or just cache the request, as I mentioned). Thanks for pointing this out. > Nish, did you check and fix _all_ users? I can easily find a number of > users which immediately use the return value as next timeout. I haven't fixed all users yet. I plan on trying to do that today. Would those callers that do immediately use the return value as the next timeout fall under the functions that should be using time_after()/time_before()? > There are _a_lot_ of schedule_timeout(1) for small busy loops, these are > asking for "please schedule until next tick". Did you check that these are > still ok? According to the comment for schedule_timeout(), that is not true. They are asking to sleep for *1* ticks, no the *next* tick. If the assumption in the code is the latter, they have been calling this function inappropriately for a while. They probably should be requesting schedule_timeout(0), i.e. no sleep at all (but we still will add a timer internally and it won't expire until the next interrupt occurs). > schedule_timeout() is arguably a broken API, but can we please _first_ > come up with a plan to fix this, before we break even more? It *is* broken, and my patch *does* fix it. I'm not suggesting it go to Linus today, but it's ok in Andrew's tree, especially if I'm able to get the depending patches sent out soon. I am pretty sure there is no way to fix the problems of adding 1 without an inter-tick position. The +1 is the closest thing we have to a "ceiling" function with jiffies. Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 14:33 ` Nishanth Aravamudan @ 2005-08-04 18:59 ` Roman Zippel 2005-08-04 19:11 ` Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: Roman Zippel @ 2005-08-04 18:59 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, On Thu, 4 Aug 2005, Nishanth Aravamudan wrote: > The comment for schedule_timeout() claims: > > * Make the current task sleep until @timeout jiffies have > * elapsed. > > Currently, it does not do so. I was simply trying to make the function > do what it claims it does. What makes you think the comment is correct? This comment was added at 2.4.3, while schedule_timeout() has this behaviour since it was added at 2.1.127. > My point was that the +1 issues (potential > infinite timeouts) are a problem with *jiffies* not milliseconds. And > thus need to be pushed down to the jiffies layer. I think my explanation > was pretty clear. Not really, could you go into more details why it's "a problem with *jiffies* not milliseconds"? bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 18:59 ` Roman Zippel @ 2005-08-04 19:11 ` Nishanth Aravamudan 2005-08-04 23:20 ` Roman Zippel 0 siblings, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-04 19:11 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 04.08.2005 [20:59:50 +0200], Roman Zippel wrote: > Hi, > > On Thu, 4 Aug 2005, Nishanth Aravamudan wrote: > > > The comment for schedule_timeout() claims: > > > > * Make the current task sleep until @timeout jiffies have > > * elapsed. > > > > Currently, it does not do so. I was simply trying to make the function > > do what it claims it does. > > What makes you think the comment is correct? This comment was added at > 2.4.3, while schedule_timeout() has this behaviour since it was added at > 2.1.127. Fair enough. Should we change the comment? What are the expectations of programmers wrt. to this interface? > > My point was that the +1 issues (potential > > infinite timeouts) are a problem with *jiffies* not milliseconds. And > > thus need to be pushed down to the jiffies layer. I think my explanation > > was pretty clear. > > Not really, could you go into more details why it's "a problem with > *jiffies* not milliseconds"? Well, I may have phrased that wrong. It's not that milliseconds in the kernel wouldn't have the same problem -- they would/do, because they convert to jiffies eventually. This is all the same issue, as George pointed out, that itimers face. If a user requests schedule_timeout(HZ/100); which, if HZ is 1000, is 10 jiffies, yes? But, since we are between ticks, we want to actually add that request to the next interval, not to the current one. Otherwise, we do have the possibility of returning early. Currently, we require callers to add the +1 to their request, and thus they only add it to the first one. The problem with my patch which pushed it to schedule_timeout(), is that we will do +1 to every request. I'm not sure, without some sort of "persistent" timeout control structure, how we get around that, though, in the schedule_timeout() case. Does that make any more sense? Thanks Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 19:11 ` Nishanth Aravamudan @ 2005-08-04 23:20 ` Roman Zippel 0 siblings, 0 replies; 50+ messages in thread From: Roman Zippel @ 2005-08-04 23:20 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, On Thu, 4 Aug 2005, Nishanth Aravamudan wrote: > > What makes you think the comment is correct? This comment was added at > > 2.4.3, while schedule_timeout() has this behaviour since it was added at > > 2.1.127. > > Fair enough. Should we change the comment? It can't hurt to fix the comment. > If a user requests schedule_timeout(HZ/100); which, if HZ is 1000, is 10 > jiffies, yes? But, since we are between ticks, we want to actually add > that request to the next interval, not to the current one. Otherwise, we > do have the possibility of returning early. Currently, we require > callers to add the +1 to their request, and thus they only add it to the > first one. The problem with my patch which pushed it to > schedule_timeout(), is that we will do +1 to every request. I'm not > sure, without some sort of "persistent" timeout control structure, how > we get around that, though, in the schedule_timeout() case. Does that > make any more sense? I don't disagree, but please create some sane interfaces, e.g. something like below. This allows to first convert as much as possible schedule_timeout() users and then we can still change the schedule_timeout() interface. bye, Roman include/linux/sched.h | 17 +++++++++++ kernel/timer.c | 71 +++++++++++++++++++++++++++++++++++++++++++------- 2 files changed, 78 insertions(+), 10 deletions(-) Index: linux-2.6/include/linux/sched.h =================================================================== --- linux-2.6.orig/include/linux/sched.h 2005-06-18 15:00:59.000000000 +0200 +++ linux-2.6/include/linux/sched.h 2005-08-04 23:42:43.000000000 +0200 @@ -182,9 +182,24 @@ extern void scheduler_tick(void); extern int in_sched_functions(unsigned long addr); #define MAX_SCHEDULE_TIMEOUT LONG_MAX -extern signed long FASTCALL(schedule_timeout(signed long timeout)); +extern void schedule_until(unsigned long expire) fastcall; +extern int schedule_until_intr(unsigned long expire) fastcall; +extern int schedule_until_unintr(unsigned long expire) fastcall; +extern signed long schedule_timeout(signed long timeout) fastcall; +extern int schedule_timeout_intr(signed long timeout) fastcall; +extern int schedule_timeout_unintr(signed long timeout) fastcall; asmlinkage void schedule(void); +static inline int schedule_timeout_msecs_intr(unsigned int timeout) +{ + return schedule_timeout_intr(msecs_to_jiffies(timeout)); +} + +static inline int schedule_timeout_msecs_unintr(unsigned int timeout) +{ + return schedule_timeout_unintr(msecs_to_jiffies(timeout)); +} + struct namespace; /* Maximum number of active map areas.. This is a random (large) number */ Index: linux-2.6/kernel/timer.c =================================================================== --- linux-2.6.orig/kernel/timer.c 2005-06-18 15:01:12.000000000 +0200 +++ linux-2.6/kernel/timer.c 2005-08-04 23:42:36.000000000 +0200 @@ -1049,6 +1049,67 @@ static void process_timeout(unsigned lon wake_up_process((task_t *)__data); } +fastcall __sched void schedule_until(unsigned long expire) +{ + struct timer_list timer; + + init_timer(&timer); + timer.expires = expire; + timer.data = (unsigned long) current; + timer.function = process_timeout; + + add_timer(&timer); + schedule(); + del_singleshot_timer_sync(&timer); +} +EXPORT_SYMBOL(schedule_until); + +fastcall __sched int schedule_until_intr(unsigned long expire) +{ + set_current_state(TASK_INTERRUPTIBLE); + schedule_until(expire); + return time_before(expire, jiffies); +} +EXPORT_SYMBOL(schedule_until_intr); + +fastcall __sched int schedule_until_unintr(unsigned long expire) +{ + set_current_state(TASK_UNINTERRUPTIBLE); + schedule_until(expire); + return time_before(expire, jiffies); +} +EXPORT_SYMBOL(schedule_until_intr); + +fastcall __sched int schedule_timeout_intr(signed long timeout) +{ + unsigned long expire; + + set_current_state(TASK_INTERRUPTIBLE); + if (timeout >= MAX_SCHEDULE_TIMEOUT - 1) { + schedule(); + return 1; + } + expire = jiffies + timeout + 1; + schedule_until(expire); + return time_before(expire, jiffies); +} +EXPORT_SYMBOL(schedule_timeout_intr); + +fastcall __sched int schedule_timeout_unintr(signed long timeout) +{ + unsigned long expire; + + set_current_state(TASK_UNINTERRUPTIBLE); + if (timeout >= MAX_SCHEDULE_TIMEOUT - 1) { + schedule(); + return 1; + } + expire = jiffies + timeout + 1; + schedule_until(expire); + return time_before(expire, jiffies); +} +EXPORT_SYMBOL(schedule_timeout_unintr); + /** * schedule_timeout - sleep until timeout * @timeout: timeout value in jiffies @@ -1077,7 +1138,6 @@ static void process_timeout(unsigned lon */ fastcall signed long __sched schedule_timeout(signed long timeout) { - struct timer_list timer; unsigned long expire; switch (timeout) @@ -1112,14 +1172,7 @@ fastcall signed long __sched schedule_ti expire = timeout + jiffies; - init_timer(&timer); - timer.expires = expire; - timer.data = (unsigned long) current; - timer.function = process_timeout; - - add_timer(&timer); - schedule(); - del_singleshot_timer_sync(&timer); + schedule_until(expire); timeout = expire - jiffies; ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] push rounding up of relative request to schedule_timeout() 2005-08-04 9:38 ` [PATCH] " Roman Zippel 2005-08-04 14:33 ` Nishanth Aravamudan @ 2005-08-04 17:08 ` Andrew Morton 2005-08-04 19:00 ` [PATCH] add schedule_timeout_{,un}intr() interfaces Nishanth Aravamudan 1 sibling, 1 reply; 50+ messages in thread From: Andrew Morton @ 2005-08-04 17:08 UTC (permalink / raw) To: Roman Zippel; +Cc: nacc, arjan, domen, linux-kernel, clucas Roman Zippel <zippel@linux-m68k.org> wrote: > > Andrew, please drop this patch. Well I was sitting on them with the intention of taking a look later and trying to work out what the heck you two have been going on about. But maybe dropping them means that we'll later get a clear and concise description of the problem and its solution, so I'll do that. (This was supposed to be just a simple "consolidate set_current_state+schedule_timeout into a single function call" patch. What happened?) ^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH] add schedule_timeout_{,un}intr() interfaces 2005-08-04 17:08 ` Andrew Morton @ 2005-08-04 19:00 ` Nishanth Aravamudan 2005-08-05 7:38 ` Andrew Morton 0 siblings, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-08-04 19:00 UTC (permalink / raw) To: Andrew Morton; +Cc: Roman Zippel, arjan, domen, linux-kernel, clucas On 04.08.2005 [10:08:10 -0700], Andrew Morton wrote: > Roman Zippel <zippel@linux-m68k.org> wrote: > > > > Andrew, please drop this patch. > > Well I was sitting on them with the intention of taking a look later and > trying to work out what the heck you two have been going on about. I appreciate that... Good luck figuring out this mess... > But maybe dropping them means that we'll later get a clear and concise > description of the problem and its solution, so I'll do that. I will try and formulate this as well as I can today or tomorrow. > (This was supposed to be just a simple "consolidate > set_current_state+schedule_timeout into a single function call" patch. > What happened?) Here's that patch. What happened? I did, to be honest. Trying to fix too many things at once. Thanks, Nish Description: Add schedule_timeout_{,un}intr() interfaces so that schedule_timeout() callers don't have to worry about forgetting to add the set_current_state() call beforehand. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> --- include/linux/sched.h | 2 ++ kernel/timer.c | 14 ++++++++++++++ 2 files changed, 16 insertions(+) diff -urpN 2.6.13-rc5/include/linux/sched.h 2.6.13-rc5-dev/include/linux/sched.h --- 2.6.13-rc5/include/linux/sched.h 2005-08-04 11:46:06.000000000 -0700 +++ 2.6.13-rc5-dev/include/linux/sched.h 2005-08-04 11:53:20.000000000 -0700 @@ -183,6 +183,8 @@ extern int in_sched_functions(unsigned l #define MAX_SCHEDULE_TIMEOUT LONG_MAX extern signed long FASTCALL(schedule_timeout(signed long timeout)); +extern signed long schedule_timeout_intr(signed long timeout); +extern signed long schedule_timeout_unintr(signed long timeout); asmlinkage void schedule(void); struct namespace; diff -urpN 2.6.13-rc5/kernel/timer.c 2.6.13-rc5-dev/kernel/timer.c --- 2.6.13-rc5/kernel/timer.c 2005-08-04 11:46:06.000000000 -0700 +++ 2.6.13-rc5-dev/kernel/timer.c 2005-08-04 11:52:06.000000000 -0700 @@ -1153,6 +1153,20 @@ fastcall signed long __sched schedule_ti EXPORT_SYMBOL(schedule_timeout); +signed long __sched schedule_timeout_intr(signed long timeout) +{ + set_current_state(TASK_INTERRUPTIBLE); + return schedule_timeout(timeout); +} +EXPORT_SYMBOL(schedule_timeout_intr); + +signed long __sched schedule_timeout_unintr(signed long timeout) +{ + set_current_state(TASK_UNINTERRUPTIBLE); + return schedule_timeout(timeout); +} +EXPORT_SYMBOL(schedule_timeout_unintr); + /* Thread ID - the internal kernel "pid" */ asmlinkage long sys_gettid(void) { ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] add schedule_timeout_{,un}intr() interfaces 2005-08-04 19:00 ` [PATCH] add schedule_timeout_{,un}intr() interfaces Nishanth Aravamudan @ 2005-08-05 7:38 ` Andrew Morton 0 siblings, 0 replies; 50+ messages in thread From: Andrew Morton @ 2005-08-05 7:38 UTC (permalink / raw) To: Nishanth Aravamudan; +Cc: zippel, arjan, domen, linux-kernel, clucas Nishanth Aravamudan <nacc@us.ibm.com> wrote: > > Add schedule_timeout_{,un}intr() interfaces I did s/intr/interruptible/. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 11:55 ` Roman Zippel 2005-07-23 12:51 ` Arjan van de Ven @ 2005-07-23 16:37 ` Nishanth Aravamudan 2005-07-23 17:01 ` Roman Zippel 1 sibling, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 16:37 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 23.07.2005 [13:55:58 +0200], Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Arjan van de Ven wrote: > > > > What's wrong with using jiffies? > > > > A lot of the (driver) users want a wallclock based timeout. For that, > > miliseconds is a more obvious API with less chance to get the jiffies/HZ > > conversion wrong by the driver writer. > > We have helper functions for that. The point about using jiffies is to > make it _very_ clear, that the timeout is imprecise and for most users > this is sufficient. We do have helper functions for human-time <-> jiffies (I keep adding new ones :) ). But why not, instead of set_current_state(TASK_{,UN}INTERRUPTIBLE); schedule_timeout(msecs_to_jiffies(some_constant_msecs)); just have an interface that allows schedule_timeout_msecs_{,un}interruptible(some_constant_msecs); and push the jiffies conversion to common code? There are some 300 or so users of schedule_timeout() in 2.6.12. I would say about half are doing something along the lines of set_current_state(TASK_{,UN}INTERRUPTIBLE); schedule_timeout(HZ/some_constant); These would be replaced with schedule_timeout_msecs_{,un}interruptible(1000/some_constant); I would *not* be changing the callers that do set_current_state(TASK_{,UN}INTERRUPTIBLE); schedule_timeout(some_other_constant); even though I think most of these are 2.4 remnants that don't need short, e.g. 1 or 2 timer interrupt, sleeps, but actually can use a 10 or 20 millisecond (HZ=100, 1 or 2 jiffies) sleep. This emphasizes another advantage of adding these new interfaces, the delay requested does not change with HZ. Internally it does, certainly, but the callers don't need to know that :) Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 16:37 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan @ 2005-07-23 17:01 ` Roman Zippel 2005-07-23 19:06 ` Nishanth Aravamudan 0 siblings, 1 reply; 50+ messages in thread From: Roman Zippel @ 2005-07-23 17:01 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, On Sat, 23 Jul 2005, Nishanth Aravamudan wrote: > set_current_state(TASK_{,UN}INTERRUPTIBLE); > schedule_timeout(msecs_to_jiffies(some_constant_msecs)); > > just have an interface that allows > > schedule_timeout_msecs_{,un}interruptible(some_constant_msecs); > > and push the jiffies conversion to common code? What's wrong with just: schedule_timeout_{,un}interruptible(msecs_to_jiffies(some_constant_msecs)); The majority of users use a constant, which can already be converted at compile tile. Additionally such an interface also had to return a ms value and instead of that constant conversion, the user is better off to work with jiffies directly. bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 17:01 ` Roman Zippel @ 2005-07-23 19:06 ` Nishanth Aravamudan 2005-07-23 20:22 ` Roman Zippel 0 siblings, 1 reply; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 19:06 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 23.07.2005 [19:01:57 +0200], Roman Zippel wrote: > Hi, > > On Sat, 23 Jul 2005, Nishanth Aravamudan wrote: > > > set_current_state(TASK_{,UN}INTERRUPTIBLE); > > schedule_timeout(msecs_to_jiffies(some_constant_msecs)); > > > > just have an interface that allows > > > > schedule_timeout_msecs_{,un}interruptible(some_constant_msecs); > > > > and push the jiffies conversion to common code? > > What's wrong with just: > > schedule_timeout_{,un}interruptible(msecs_to_jiffies(some_constant_msecs)); Nothing, I suppose. I just prefer directly using msecs. I understand your point more now, I think. You are worried about those people that actually use the return value of schedule_timeout(). > The majority of users use a constant, which can already be converted at > compile tile. > Additionally such an interface also had to return a ms value and instead > of that constant conversion, the user is better off to work with jiffies > directly. So, I just spent a good hour looking at every caller of schedule_timeout() which actually stores the return value. Beyond the other wrappers for it (wait_event(), wait_for_completion(), sys_nanosleep(), etc., which I will leave alone using schedule_timeout() until I can change *their* parameters ;) ), I found two cases. 1) Sleep, see if you actually slept the whole time: remainder = schedule_timeout(some_value_in_jiffies); if (!remaining) report_timeout(); 2) Sleep in a loop, keeping track of remaining timeout each iteration: while (timeout) { do_some_stuff(); timeout = schedule_timeout(timeout); if (some_condition) break; } Clearly, neither needs to use jiffies. The former only wants to know if the full timeout elapsed. I didn't find anyone returning that stored value (again, excepting wrapper interfaces) to the caller. They just want to know if they should return -ETIME{,DOUT}. The latter just is a means to guarantee the entire time is slept, but doesn't care about the units. Now, some of these might depend on structures which have members with jiffy-unit values. But I will be more than happy to try and either leave them alone or convert those structures. We'll see about that on a case-by-case basis? Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 19:06 ` Nishanth Aravamudan @ 2005-07-23 20:22 ` Roman Zippel 0 siblings, 0 replies; 50+ messages in thread From: Roman Zippel @ 2005-07-23 20:22 UTC (permalink / raw) To: Nishanth Aravamudan Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas Hi, On Sat, 23 Jul 2005, Nishanth Aravamudan wrote: > > What's wrong with just: > > > > schedule_timeout_{,un}interruptible(msecs_to_jiffies(some_constant_msecs)); > > Nothing, I suppose. I just prefer directly using msecs. I understand > your point more now, I think. You are worried about those people that > actually use the return value of schedule_timeout(). It's only half the point: > > The majority of users use a constant, which can already be converted at > > compile tile. The kernel time unit is and will be jiffies and the kernel time functions should be in ticks with some optional wrappers in other time units, not the other way around. > 2) Sleep in a loop, keeping track of remaining timeout each iteration: > > while (timeout) { > do_some_stuff(); > timeout = schedule_timeout(timeout); > if (some_condition) > break; > } This actually is a pre-preempt construct and should probably use time_before() now. bye, Roman ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces 2005-07-23 10:50 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel 2005-07-23 11:09 ` Arjan van de Ven @ 2005-07-23 16:30 ` Nishanth Aravamudan 1 sibling, 0 replies; 50+ messages in thread From: Nishanth Aravamudan @ 2005-07-23 16:30 UTC (permalink / raw) To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas On 23.07.2005 [12:50:45 +0200], Roman Zippel wrote: > Hi, > > On Fri, 22 Jul 2005, Arjan van de Ven wrote: > > > Also I'd rather not add the non-msec ones... either you're raw and use > > HZ, or you are "cooked" and use the msec variant.. I dont' see the point > > of adding an "in the middle" one. (Yes this means that several users > > need to be transformed to msecs but... I consider that progress ;) > > What's wrong with using jiffies? It's simple and the current timeout > system is based on it. Calling it something else doesn't suddenly give you > more precision. I agree, and for users that want a certain number of ticks (they are the minority, in my experience), they can still use schedule_timeout() (so maybe I should add back in the schedule_timeout_{,un}interruptible() wrappers? As far as precision, we *are* talking about a sleeping path, where the concept of precision is vague at best :) Consider these functions expressing the completeness of what msleep() and msleep_interruptible() started, i.e. for the wait-queue case. And to be honest, we are not changing the semantics of the schedule_timeout() family of functions. The caller expresses their request in some units, the kenrel then does what it can to satisfy the request, but is at liberty to be late (or even early in these two cases). Thanks, Nish ^ permalink raw reply [flat|nested] 50+ messages in thread
end of thread, other threads:[~2005-08-17 22:24 UTC | newest]
Thread overview: 50+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-07-07 21:31 [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state() domen
2005-07-08 23:08 ` Andrew Morton
2005-07-08 23:22 ` Nish Aravamudan
2005-07-23 0:27 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
2005-07-23 0:31 ` Arjan van de Ven
2005-07-23 1:08 ` [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces Nishanth Aravamudan
2005-07-23 2:30 ` Andrew Morton
2005-07-23 16:23 ` Nishanth Aravamudan
2005-07-23 10:50 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel
2005-07-23 11:09 ` Arjan van de Ven
2005-07-23 11:55 ` Roman Zippel
2005-07-23 12:51 ` Arjan van de Ven
2005-07-23 13:04 ` Roman Zippel
2005-07-23 13:12 ` Arjan van de Ven
2005-07-23 13:29 ` Roman Zippel
2005-07-23 13:32 ` Arjan van de Ven
2005-07-23 15:56 ` Roman Zippel
2005-07-23 16:44 ` Nishanth Aravamudan
2005-07-23 16:43 ` Nishanth Aravamudan
2005-07-23 17:17 ` Roman Zippel
2005-07-23 19:10 ` Nishanth Aravamudan
2005-07-23 20:12 ` Roman Zippel
2005-07-27 22:29 ` Nishanth Aravamudan
2005-07-30 23:35 ` Roman Zippel
2005-08-01 19:35 ` [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces Nishanth Aravamudan
2005-08-03 14:20 ` Roman Zippel
2005-08-04 0:51 ` [PATCH] push rounding up of relative request to schedule_timeout() Nishanth Aravamudan
2005-08-04 5:14 ` [UPDATE PATCH] " Nishanth Aravamudan
2005-08-04 16:45 ` George Anzinger
2005-08-04 18:48 ` Nish Aravamudan
2005-08-16 23:05 ` Nishanth Aravamudan
2005-08-17 0:39 ` George Anzinger
2005-08-17 5:56 ` Nishanth Aravamudan
2005-08-17 19:51 ` George Anzinger
2005-08-17 22:24 ` Nishanth Aravamudan
2005-08-04 17:05 ` George Anzinger
2005-08-04 18:49 ` Nish Aravamudan
2005-08-04 9:38 ` [PATCH] " Roman Zippel
2005-08-04 14:33 ` Nishanth Aravamudan
2005-08-04 18:59 ` Roman Zippel
2005-08-04 19:11 ` Nishanth Aravamudan
2005-08-04 23:20 ` Roman Zippel
2005-08-04 17:08 ` Andrew Morton
2005-08-04 19:00 ` [PATCH] add schedule_timeout_{,un}intr() interfaces Nishanth Aravamudan
2005-08-05 7:38 ` Andrew Morton
2005-07-23 16:37 ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
2005-07-23 17:01 ` Roman Zippel
2005-07-23 19:06 ` Nishanth Aravamudan
2005-07-23 20:22 ` Roman Zippel
2005-07-23 16:30 ` Nishanth Aravamudan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox