public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state()
@ 2005-07-07 21:31 domen
  2005-07-08 23:08 ` Andrew Morton
  0 siblings, 1 reply; 50+ messages in thread
From: domen @ 2005-07-07 21:31 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, Christophe Lucas, domen

[-- Attachment #1: set_current_state-drivers_char_ip2_i2lib --]
[-- Type: text/plain, Size: 1092 bytes --]

From: Christophe Lucas <clucas@rotomalug.org>



Use set_current_state() instead of direct assignment of
current->state.

Signed-off-by: Christophe Lucas <clucas@rotomalug.org>
Signed-off-by: Domen Puncer <domen@coderock.org>


---
 i2lib.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: quilt/drivers/char/ip2/i2lib.c
===================================================================
--- quilt.orig/drivers/char/ip2/i2lib.c
+++ quilt/drivers/char/ip2/i2lib.c
@@ -655,7 +655,7 @@ i2QueueCommands(int type, i2ChanStrPtr p
 			timeout--;   // So negative values == forever
 		
 		if (!in_interrupt()) {
-			current->state = TASK_INTERRUPTIBLE;
+			set_current_state(TASK_INTERRUPTIBLE);
 			schedule_timeout(1);	// short nap 
 		} else {
 			// we cannot sched/sleep in interrrupt silly
@@ -1132,7 +1132,7 @@ i2Output(i2ChanStrPtr pCh, const char *p
 
 					ip2trace (CHANN, ITRC_OUTPUT, 61, 0 );
 
-					current->state = TASK_INTERRUPTIBLE;
+					set_current_state(TASK_INTERRUPTIBLE);
 					schedule_timeout(2);
 					if (signal_pending(current)) {
 						break;

--

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state()
  2005-07-07 21:31 [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state() domen
@ 2005-07-08 23:08 ` Andrew Morton
  2005-07-08 23:22   ` Nish Aravamudan
  2005-07-23  0:27   ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
  0 siblings, 2 replies; 50+ messages in thread
From: Andrew Morton @ 2005-07-08 23:08 UTC (permalink / raw)
  To: domen; +Cc: linux-kernel, clucas, domen

domen@coderock.org wrote:
>
> @@ -655,7 +655,7 @@ i2QueueCommands(int type, i2ChanStrPtr p
>  			timeout--;   // So negative values == forever
>  		
>  		if (!in_interrupt()) {

I worry about what this driver is trying to do...

> -			current->state = TASK_INTERRUPTIBLE;
> +			set_current_state(TASK_INTERRUPTIBLE);
>  			schedule_timeout(1);	// short nap 

We do this all over the place.  Adding new schedule_timeout_interruptible()
and schedule_timeout_uninterruptible() would reduce kernel size and neaten
things up.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state()
  2005-07-08 23:08 ` Andrew Morton
@ 2005-07-08 23:22   ` Nish Aravamudan
  2005-07-23  0:27   ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
  1 sibling, 0 replies; 50+ messages in thread
From: Nish Aravamudan @ 2005-07-08 23:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: domen, linux-kernel, clucas

On 7/8/05, Andrew Morton <akpm@osdl.org> wrote:
> domen@coderock.org wrote:
> >
> > @@ -655,7 +655,7 @@ i2QueueCommands(int type, i2ChanStrPtr p
> >                       timeout--;   // So negative values == forever
> >
> >               if (!in_interrupt()) {
> 
> I worry about what this driver is trying to do...
> 
> > -                     current->state = TASK_INTERRUPTIBLE;
> > +                     set_current_state(TASK_INTERRUPTIBLE);
> >                       schedule_timeout(1);    // short nap
> 
> We do this all over the place.  Adding new schedule_timeout_interruptible()
> and schedule_timeout_uninterruptible() would reduce kernel size and neaten
> things up.

I'll get a patch and users set up soon to do just this.

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-08 23:08 ` Andrew Morton
  2005-07-08 23:22   ` Nish Aravamudan
@ 2005-07-23  0:27   ` Nishanth Aravamudan
  2005-07-23  0:31     ` Arjan van de Ven
  1 sibling, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23  0:27 UTC (permalink / raw)
  To: Andrew Morton; +Cc: domen, linux-kernel, clucas

On 08.07.2005 [16:08:24 -0700], Andrew Morton wrote:
> domen@coderock.org wrote:
> >
> > @@ -655,7 +655,7 @@ i2QueueCommands(int type, i2ChanStrPtr p
> >  			timeout--;   // So negative values == forever
> >  		
> >  		if (!in_interrupt()) {
> 
> I worry about what this driver is trying to do...
> 
> > -			current->state = TASK_INTERRUPTIBLE;
> > +			set_current_state(TASK_INTERRUPTIBLE);
> >  			schedule_timeout(1);	// short nap 
> 
> We do this all over the place.  Adding new schedule_timeout_interruptible()
> and schedule_timeout_uninterruptible() would reduce kernel size and neaten
> things up.

How does something like this look? If this looks ok, I'll send out
bunches of patches to add users of the new interfaces.

Description: Add wrappers for interruptible/uninterruptible
schedule_timeout() callers. Also add millisecond equivalents.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 include/linux/sched.h |   11 ++++
 kernel/timer.c        |  125 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 135 insertions(+), 1 deletion(-)

diff -urpN 2.6.13-rc3/include/linux/sched.h 2.6.13-rc3-new_interfaces/include/linux/sched.h
--- 2.6.13-rc3/include/linux/sched.h	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-new_interfaces/include/linux/sched.h	2005-07-22 16:28:47.000000000 -0700
@@ -181,8 +181,17 @@ extern void scheduler_tick(void);
 /* Is this address in the __sched functions? */
 extern int in_sched_functions(unsigned long addr);
 
-#define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT		LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT_MSECS	UINT_MAX
 extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern signed long FASTCALL(schedule_timeout_interruptible
+							(signed long timeout));
+extern signed long FASTCALL(schedule_timeout_uninterruptible
+							(signed long timeout));
+extern unsigned int FASTCALL(schedule_timeout_interruptible_msecs
+						(unsigned int timeout_msecs));
+extern unsigned int FASTCALL(schedule_timeout_uninterruptible_msecs
+						(unsigned int timeout_msecs));
 asmlinkage void schedule(void);
 
 struct namespace;
diff -urpN 2.6.13-rc3/kernel/timer.c 2.6.13-rc3-new_interfaces/kernel/timer.c
--- 2.6.13-rc3/kernel/timer.c	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-new_interfaces/kernel/timer.c	2005-07-22 16:31:31.000000000 -0700
@@ -1153,6 +1153,131 @@ fastcall signed long __sched schedule_ti
 
 EXPORT_SYMBOL(schedule_timeout);
 
+/**
+ * schedule_timeout_interruptible - sleep until timeout, wait-queue
+ *					 event or signal
+ * @timeout: timeout value in jiffies
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+fastcall signed long __sched schedule_timeout_interruptible(signed long timeout)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+	return schedule_timeout(timeout);
+}
+
+EXPORT_SYMBOL(schedule_timeout_interruptible);
+
+/**
+ * schedule_timeout_uninterruptible - sleep until timeout or wait-queue
+ * 					event
+ * @timeout: timeout value in jiffies
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+fastcall signed long __sched schedule_timeout_uninterruptible(signed long timeout)
+{
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	return schedule_timeout(timeout);
+}
+
+EXPORT_SYMBOL(schedule_timeout_uninterruptible);
+
+/*
+ * schedule_timeout_msecs - sleep until timeout
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A human-time (but otherwise identical) alternative to
+ * schedule_timeout() The state, therefore, *does* need to be set before
+ * calling this function, but this function should *never* be called
+ * directly. Use the nice wrappers, schedule_{interruptible,
+ * uninterruptible}_timeout_msecs().
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+fastcall inline unsigned int __sched schedule_timeout_msecs
+						(unsigned int timeout_msecs)
+{
+	struct timer_list timer;
+	unsigned long expire_jifs;
+	signed long remaining_jifs;
+
+	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
+		schedule();
+		goto out;
+	}
+
+	/*
+	 * msecs_to_jiffies() is a unit conversion, which truncates
+	 * (rounds down), so we need to add 1.
+	 */
+	expire_jifs = jiffies + msecs_to_jiffies(timeout_msecs) + 1;
+
+	init_timer(&timer);
+	timer.expires = expire_jifs;
+	timer.data = (unsigned long) current;
+	timer.function = process_timeout;
+
+	add_timer(&timer);
+	schedule();
+	del_singleshot_timer_sync(&timer);
+
+	remaining_jifs = expire_jifs - jiffies;
+	/* if we have woken up *before* we have requested */
+	if (remaining_jifs > 0)
+		/*
+		 * don't need to add 1 here, even though there is
+		 * truncation, because we will add 1 if/when the value
+		 * is sent back in
+		 */
+		timeout_msecs = jiffies_to_msecs(remaining_jifs);
+	else
+		timeout_msecs = 0;
+
+ out:
+	return timeout_msecs;
+}
+
+/**
+ * schedule_timeout_msecs_interruptible - sleep until timeout,
+ *						 wait-queue event, or signal
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A nice wrapper for the common
+ * set_current_state()/schedule_timeout_msecs() usage.  The state,
+ * therefore, does *not* need to be set before calling this function.
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+fastcall unsigned int __sched schedule_timeout_msecs_interruptible
+						(unsigned int timeout_msecs)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+	return schedule_timeout_msecs(timeout_msecs);
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs_interruptible);
+
+/**
+ * schedule_timeout_msecs_uninterrutible - sleep until timeout or
+ * 						wait-queue event
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A nice wrapper for the common
+ * set_current_state()/schedule_timeout_msecs() usage.  The state,
+ * therefore, does *not* need to be set before calling this function.
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+fastcall unsigned int __sched schedule_timeout_msecs_uninterruptible
+						(unsigned int timeout_msecs)
+{
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	return schedule_timeout_msecs(timeout_msecs);
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs_uninterruptible);
+
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23  0:27   ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
@ 2005-07-23  0:31     ` Arjan van de Ven
  2005-07-23  1:08       ` [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces Nishanth Aravamudan
  2005-07-23 10:50       ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel
  0 siblings, 2 replies; 50+ messages in thread
From: Arjan van de Ven @ 2005-07-23  0:31 UTC (permalink / raw)
  To: Nishanth Aravamudan; +Cc: Andrew Morton, domen, linux-kernel, clucas


> How does something like this look? If this looks ok, I'll send out
> bunches of patches to add users of the new interfaces.

I'd drop the FASTCALL stuff... nowadays with regparm that's automatic
and the cost of register-vs-stack isn't too big anyway


Also I'd rather not add the non-msec ones... either you're raw and use
HZ, or you are "cooked" and use the msec variant.. I dont' see the point
of adding an "in the middle" one. (Yes this means that several users
need to be transformed to msecs but... I consider that progress ;)


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces
  2005-07-23  0:31     ` Arjan van de Ven
@ 2005-07-23  1:08       ` Nishanth Aravamudan
  2005-07-23  2:30         ` Andrew Morton
  2005-07-23 10:50       ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel
  1 sibling, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23  1:08 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Andrew Morton, domen, linux-kernel, clucas

On 22.07.2005 [20:31:55 -0400], Arjan van de Ven wrote:
> 
> > How does something like this look? If this looks ok, I'll send out
> > bunches of patches to add users of the new interfaces.
> 
> I'd drop the FASTCALL stuff... nowadays with regparm that's automatic
> and the cost of register-vs-stack isn't too big anyway
> 
> 
> Also I'd rather not add the non-msec ones... either you're raw and use
> HZ, or you are "cooked" and use the msec variant.. I dont' see the point
> of adding an "in the middle" one. (Yes this means that several users
> need to be transformed to msecs but... I consider that progress ;)

Thanks for the feedback! Does this look better? I will be more than
happy to convert those users to msecs which should, btw :) I've been
waiting to add this interface so that the ones that couldn't use
msleep{,_interruptible}() because of wait-queues, can also be changed to
use milliseconds.

Hrm, I also noticed I typo'd the externs in sched.h. Fixed in the new
patch.

Description: Add schedule_timeout_msecs() and add wrappers for
interruptible/uninterruptible schedule_timeout_msecs() callers.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 include/linux/sched.h |    7 +++
 kernel/timer.c        |   94 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 100 insertions(+), 1 deletion(-)

diff -urpN 2.6.13-rc3/include/linux/sched.h 2.6.13-rc3-new_interfaces/include/linux/sched.h
--- 2.6.13-rc3/include/linux/sched.h	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-new_interfaces/include/linux/sched.h	2005-07-22 18:06:36.000000000 -0700
@@ -181,8 +181,13 @@ extern void scheduler_tick(void);
 /* Is this address in the __sched functions? */
 extern int in_sched_functions(unsigned long addr);
 
-#define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT		LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT_MSECS	UINT_MAX
 extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern unsigned int schedule_timeout_msecs_interruptible
+						(unsigned int timeout_msecs);
+extern unsigned int schedule_timeout_msecs_uninterruptible
+						(unsigned int timeout_msecs);
 asmlinkage void schedule(void);
 
 struct namespace;
diff -urpN 2.6.13-rc3/kernel/timer.c 2.6.13-rc3-new_interfaces/kernel/timer.c
--- 2.6.13-rc3/kernel/timer.c	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-new_interfaces/kernel/timer.c	2005-07-22 18:00:46.000000000 -0700
@@ -1153,6 +1153,100 @@ fastcall signed long __sched schedule_ti
 
 EXPORT_SYMBOL(schedule_timeout);
 
+/*
+ * schedule_timeout_msecs - sleep until timeout
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A human-time (but otherwise identical) alternative to
+ * schedule_timeout() The state, therefore, *does* need to be set before
+ * calling this function, but this function should *never* be called
+ * directly. Use the nice wrappers, schedule_{interruptible,
+ * uninterruptible}_timeout_msecs().
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+inline unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
+{
+	struct timer_list timer;
+	unsigned long expire_jifs;
+	signed long remaining_jifs;
+
+	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
+		schedule();
+		goto out;
+	}
+
+	/*
+	 * msecs_to_jiffies() is a unit conversion, which truncates
+	 * (rounds down), so we need to add 1.
+	 */
+	expire_jifs = jiffies + msecs_to_jiffies(timeout_msecs) + 1;
+
+	init_timer(&timer);
+	timer.expires = expire_jifs;
+	timer.data = (unsigned long) current;
+	timer.function = process_timeout;
+
+	add_timer(&timer);
+	schedule();
+	del_singleshot_timer_sync(&timer);
+
+	remaining_jifs = expire_jifs - jiffies;
+	/* if we have woken up *before* we have requested */
+	if (remaining_jifs > 0)
+		/*
+		 * don't need to add 1 here, even though there is
+		 * truncation, because we will add 1 if/when the value
+		 * is sent back in
+		 */
+		timeout_msecs = jiffies_to_msecs(remaining_jifs);
+	else
+		timeout_msecs = 0;
+
+ out:
+	return timeout_msecs;
+}
+
+/**
+ * schedule_timeout_msecs_interruptible - sleep until timeout,
+ *						 wait-queue event, or signal
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A nice wrapper for the common
+ * set_current_state()/schedule_timeout_msecs() usage.  The state,
+ * therefore, does *not* need to be set before calling this function.
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+unsigned int __sched schedule_timeout_msecs_interruptible
+						(unsigned int timeout_msecs)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+	return schedule_timeout_msecs(timeout_msecs);
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs_interruptible);
+
+/**
+ * schedule_timeout_msecs_uninterrutible - sleep until timeout or
+ * 						wait-queue event
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A nice wrapper for the common
+ * set_current_state()/schedule_timeout_msecs() usage.  The state,
+ * therefore, does *not* need to be set before calling this function.
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+unsigned int __sched schedule_timeout_msecs_uninterruptible
+						(unsigned int timeout_msecs)
+{
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	return schedule_timeout_msecs(timeout_msecs);
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs_uninterruptible);
+
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces
  2005-07-23  1:08       ` [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces Nishanth Aravamudan
@ 2005-07-23  2:30         ` Andrew Morton
  2005-07-23 16:23           ` Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: Andrew Morton @ 2005-07-23  2:30 UTC (permalink / raw)
  To: Nishanth Aravamudan; +Cc: arjan, domen, linux-kernel, clucas

Nishanth Aravamudan <nacc@us.ibm.com> wrote:
>
> 
> +/*
> + * schedule_timeout_msecs - sleep until timeout
> + * @timeout_msecs: timeout value in milliseconds
> + *
> + * A human-time (but otherwise identical) alternative to
> + * schedule_timeout() The state, therefore, *does* need to be set before
> + * calling this function, but this function should *never* be called
> + * directly. Use the nice wrappers, schedule_{interruptible,
> + * uninterruptible}_timeout_msecs().
> + *
> + * See the comment for schedule_timeout() for details.
> + */
> +inline unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
> +{

Making this inline will add more kernel text.    Can't we uninline it?

I'm surprised that this function is not implemented as a call to the
existing schedule_timeout()?

> +	init_timer(&timer);
> +	timer.expires = expire_jifs;
> +	timer.data = (unsigned long) current;
> +	timer.function = process_timeout;

hm, if we add the needed typecast to TIMER_INITIALIZER() the above could be

	timer = TIMER_INITIALIZER(process_timeout, expire_jifs,
				(unsigned long)current);



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23  0:31     ` Arjan van de Ven
  2005-07-23  1:08       ` [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces Nishanth Aravamudan
@ 2005-07-23 10:50       ` Roman Zippel
  2005-07-23 11:09         ` Arjan van de Ven
  2005-07-23 16:30         ` Nishanth Aravamudan
  1 sibling, 2 replies; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 10:50 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Fri, 22 Jul 2005, Arjan van de Ven wrote:

> Also I'd rather not add the non-msec ones... either you're raw and use
> HZ, or you are "cooked" and use the msec variant.. I dont' see the point
> of adding an "in the middle" one. (Yes this means that several users
> need to be transformed to msecs but... I consider that progress ;)

What's wrong with using jiffies? It's simple and the current timeout 
system is based on it. Calling it something else doesn't suddenly give you 
more precision.

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 10:50       ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel
@ 2005-07-23 11:09         ` Arjan van de Ven
  2005-07-23 11:55           ` Roman Zippel
  2005-07-23 16:30         ` Nishanth Aravamudan
  1 sibling, 1 reply; 50+ messages in thread
From: Arjan van de Ven @ 2005-07-23 11:09 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

On Sat, 2005-07-23 at 12:50 +0200, Roman Zippel wrote:
> Hi,
> 
> On Fri, 22 Jul 2005, Arjan van de Ven wrote:
> 
> > Also I'd rather not add the non-msec ones... either you're raw and use
> > HZ, or you are "cooked" and use the msec variant.. I dont' see the point
> > of adding an "in the middle" one. (Yes this means that several users
> > need to be transformed to msecs but... I consider that progress ;)
> 
> What's wrong with using jiffies? 

A lot of the (driver) users want a wallclock based timeout. For that,
miliseconds is a more obvious API with less chance to get the jiffies/HZ
conversion wrong by the driver writer.

> It's simple and the current timeout 
> system is based on it. Calling it something else doesn't suddenly give you 
> more precision.

It's not about precision, it's about making the new API (which is
intended to be a simplification already due to sucking in the state
setting) match the intent closer.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 11:09         ` Arjan van de Ven
@ 2005-07-23 11:55           ` Roman Zippel
  2005-07-23 12:51             ` Arjan van de Ven
  2005-07-23 16:37             ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
  0 siblings, 2 replies; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 11:55 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Sat, 23 Jul 2005, Arjan van de Ven wrote:

> > What's wrong with using jiffies? 
> 
> A lot of the (driver) users want a wallclock based timeout. For that,
> miliseconds is a more obvious API with less chance to get the jiffies/HZ
> conversion wrong by the driver writer.

We have helper functions for that. The point about using jiffies is to 
make it _very_ clear, that the timeout is imprecise and for most users 
this is sufficient.

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 11:55           ` Roman Zippel
@ 2005-07-23 12:51             ` Arjan van de Ven
  2005-07-23 13:04               ` Roman Zippel
  2005-07-23 16:37             ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
  1 sibling, 1 reply; 50+ messages in thread
From: Arjan van de Ven @ 2005-07-23 12:51 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

On Sat, 2005-07-23 at 13:55 +0200, Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Arjan van de Ven wrote:
> 
> > > What's wrong with using jiffies? 
> > 
> > A lot of the (driver) users want a wallclock based timeout. For that,
> > miliseconds is a more obvious API with less chance to get the jiffies/HZ
> > conversion wrong by the driver writer.
> 
> We have helper functions for that.

I think we just disagree then... I consider this one a helper function
as well, one with a simpler API that wraps the more complex and powerful
api.

Sure sleeps are imprecise. All sleeps are even mdelay() (due to
preemption).



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 12:51             ` Arjan van de Ven
@ 2005-07-23 13:04               ` Roman Zippel
  2005-07-23 13:12                 ` Arjan van de Ven
  2005-07-23 16:43                 ` Nishanth Aravamudan
  0 siblings, 2 replies; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 13:04 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Sat, 23 Jul 2005, Arjan van de Ven wrote:

> > > > What's wrong with using jiffies? 
> > > 
> > > A lot of the (driver) users want a wallclock based timeout. For that,
> > > miliseconds is a more obvious API with less chance to get the jiffies/HZ
> > > conversion wrong by the driver writer.
> > 
> > We have helper functions for that.
> 
> I think we just disagree then... I consider this one a helper function
> as well, one with a simpler API that wraps the more complex and powerful
> api.

How is it more powerful? The next step in introducing such API is that 
people start complaining, they don't get ms precision, which of course is 
fixed by the next patch, with then results in that the whole timer system 
becomes more complex for a few misguided users.
Keep the thing as simple as possible and jiffies _are_ simple. Please show 
me the problem, that needs to be fixed.

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 13:04               ` Roman Zippel
@ 2005-07-23 13:12                 ` Arjan van de Ven
  2005-07-23 13:29                   ` Roman Zippel
  2005-07-23 16:43                 ` Nishanth Aravamudan
  1 sibling, 1 reply; 50+ messages in thread
From: Arjan van de Ven @ 2005-07-23 13:12 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

On Sat, 2005-07-23 at 15:04 +0200, Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Arjan van de Ven wrote:
> 
> > > > > What's wrong with using jiffies? 
> > > > 
> > > > A lot of the (driver) users want a wallclock based timeout. For that,
> > > > miliseconds is a more obvious API with less chance to get the jiffies/HZ
> > > > conversion wrong by the driver writer.
> > > 
> > > We have helper functions for that.
> > 
> > I think we just disagree then... I consider this one a helper function
> > as well, one with a simpler API that wraps the more complex and powerful
> > api.
> 
> How is it more powerful?

jiffies/HZ is the more powerful API. The msec one which also sets
current to (un)interruptible is the simplified wrapper on top.




^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 13:12                 ` Arjan van de Ven
@ 2005-07-23 13:29                   ` Roman Zippel
  2005-07-23 13:32                     ` Arjan van de Ven
  2005-07-23 16:44                     ` Nishanth Aravamudan
  0 siblings, 2 replies; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 13:29 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Sat, 23 Jul 2005, Arjan van de Ven wrote:

> jiffies/HZ is the more powerful API. The msec one which also sets
> current to (un)interruptible is the simplified wrapper on top.

So why do you want to hide it? Make the jiffies based API the primary one 
and add some convenience functions, where we BTW could convert already 
constants at compile time.

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 13:29                   ` Roman Zippel
@ 2005-07-23 13:32                     ` Arjan van de Ven
  2005-07-23 15:56                       ` Roman Zippel
  2005-07-23 16:44                     ` Nishanth Aravamudan
  1 sibling, 1 reply; 50+ messages in thread
From: Arjan van de Ven @ 2005-07-23 13:32 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

On Sat, 2005-07-23 at 15:29 +0200, Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Arjan van de Ven wrote:
> 
> > jiffies/HZ is the more powerful API. The msec one which also sets
> > current to (un)interruptible is the simplified wrapper on top.
> 
> So why do you want to hide it? 

I don't want to hide it at all. I want to provide a simpler variant for
it for the cases where a simpler variant is enough. It really is a
helper to take some common task and get that closer to what the
programmer wants (wallclock time delay)


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 13:32                     ` Arjan van de Ven
@ 2005-07-23 15:56                       ` Roman Zippel
  0 siblings, 0 replies; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 15:56 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Nishanth Aravamudan, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Sat, 23 Jul 2005, Arjan van de Ven wrote:

> > > jiffies/HZ is the more powerful API. The msec one which also sets
> > > current to (un)interruptible is the simplified wrapper on top.
> > 
> > So why do you want to hide it? 
> 
> I don't want to hide it at all. I want to provide a simpler variant for
> it for the cases where a simpler variant is enough.

Then I don't understand your initial comment and I don't understand the 
point in adding a ms variant of this interface. If someone does care about 
the return value, he is better off to convert the initial value into 
jiffies and continue with that.

> It really is a
> helper to take some common task and get that closer to what the
> programmer wants (wallclock time delay)

And that's not what he gets...

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces
  2005-07-23  2:30         ` Andrew Morton
@ 2005-07-23 16:23           ` Nishanth Aravamudan
  0 siblings, 0 replies; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23 16:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: arjan, domen, linux-kernel, clucas

On 23.07.2005 [12:30:00 +1000], Andrew Morton wrote:
> Nishanth Aravamudan <nacc@us.ibm.com> wrote:
> >
> > 
> > +/*
> > + * schedule_timeout_msecs - sleep until timeout
> > + * @timeout_msecs: timeout value in milliseconds
> > + *
> > + * A human-time (but otherwise identical) alternative to
> > + * schedule_timeout() The state, therefore, *does* need to be set before
> > + * calling this function, but this function should *never* be called
> > + * directly. Use the nice wrappers, schedule_{interruptible,
> > + * uninterruptible}_timeout_msecs().
> > + *
> > + * See the comment for schedule_timeout() for details.
> > + */
> > +inline unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
> > +{
> 
> Making this inline will add more kernel text.    Can't we uninline it?

Uninlined in the latest version (see below).

> I'm surprised that this function is not implemented as a call to the
> existing schedule_timeout()?

Well, I was considering doing that, but I tried to make this new
interface a little more sane. I don't think we need to worry about
signedness of the parameter any more (and thus we don't need to consider
returning a negative value). Calling schedule_timeout() underneath
schedule_timeout_msecs() will require us to do those conditional checks;
I guess this isn't criticial in a sleeping path.

Also, I would need to make some modifications (not critical, really), so
that MAX_SCHEDULE_TIMEOUT_MSECS would map 1:1 to MAX_SCHEDULE_TIMEOUT
(which is in jiffies).

But I'm happy to change this, also done below -- does that map better to
what you were thinking?

> > +	init_timer(&timer);
> > +	timer.expires = expire_jifs;
> > +	timer.data = (unsigned long) current;
> > +	timer.function = process_timeout;
> 
> hm, if we add the needed typecast to TIMER_INITIALIZER() the above could be
> 
> 	timer = TIMER_INITIALIZER(process_timeout, expire_jifs,
> 				(unsigned long)current);

Done as well below.

Thanks,
Nish

Description: Add millisecond wrapper for schedule_timeout(),
schedule_timeout_msecs(). Add wrappers for interruptible/uninterruptible
schedule_timeout_msecs() callers.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 include/linux/sched.h |    7 ++++
 kernel/timer.c        |   75 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+), 1 deletion(-)

diff -urpN 2.6.13-rc3/include/linux/sched.h 2.6.13-rc3-new_interfaces/include/linux/sched.h
--- 2.6.13-rc3/include/linux/sched.h	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-new_interfaces/include/linux/sched.h	2005-07-22 18:06:36.000000000 -0700
@@ -181,8 +181,13 @@ extern void scheduler_tick(void);
 /* Is this address in the __sched functions? */
 extern int in_sched_functions(unsigned long addr);
 
-#define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT		LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT_MSECS	UINT_MAX
 extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern unsigned int schedule_timeout_msecs_interruptible
+						(unsigned int timeout_msecs);
+extern unsigned int schedule_timeout_msecs_uninterruptible
+						(unsigned int timeout_msecs);
 asmlinkage void schedule(void);
 
 struct namespace;
diff -urpN 2.6.13-rc3/kernel/timer.c 2.6.13-rc3-new_interfaces/kernel/timer.c
--- 2.6.13-rc3/kernel/timer.c	2005-07-13 15:52:14.000000000 -0700
+++ 2.6.13-rc3-new_interfaces/kernel/timer.c	2005-07-23 09:20:51.000000000 -0700
@@ -1153,6 +1153,81 @@ fastcall signed long __sched schedule_ti
 
 EXPORT_SYMBOL(schedule_timeout);
 
+/*
+ * schedule_timeout_msecs - sleep until timeout
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A human-time (but otherwise identical) alternative to
+ * schedule_timeout() The state, therefore, *does* need to be set before
+ * calling this function, but this function should *never* be called
+ * directly. Use the nice wrappers, schedule_{interruptible,
+ * uninterruptible}_timeout_msecs().
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
+{
+	unsigned long expire_jifs;
+
+	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
+		expire_jifs = MAX_SCHEDULE_TIMEOUT;
+	} else {
+		/*
+		 * msecs_to_jiffies() is a unit conversion, which truncates
+		 * (rounds down), so we need to add 1.
+		 */
+		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+	}
+
+	expire_jifs = schedule_timeout(expire_jifs);
+
+	/*
+	 * don't need to add 1 here, even though there is truncation,
+	 * because we will add 1 if/when the value is sent back in
+	 */
+	return jiffies_to_msecs(expire_jifs);
+}
+
+/**
+ * schedule_timeout_msecs_interruptible - sleep until timeout,
+ *						 wait-queue event, or signal
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A nice wrapper for the common
+ * set_current_state()/schedule_timeout_msecs() usage.  The state,
+ * therefore, does *not* need to be set before calling this function.
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+unsigned int __sched schedule_timeout_msecs_interruptible
+						(unsigned int timeout_msecs)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+	return schedule_timeout_msecs(timeout_msecs);
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs_interruptible);
+
+/**
+ * schedule_timeout_msecs_uninterrutible - sleep until timeout or
+ * 						wait-queue event
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A nice wrapper for the common
+ * set_current_state()/schedule_timeout_msecs() usage.  The state,
+ * therefore, does *not* need to be set before calling this function.
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+unsigned int __sched schedule_timeout_msecs_uninterruptible
+						(unsigned int timeout_msecs)
+{
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	return schedule_timeout_msecs(timeout_msecs);
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs_uninterruptible);
+
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 10:50       ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel
  2005-07-23 11:09         ` Arjan van de Ven
@ 2005-07-23 16:30         ` Nishanth Aravamudan
  1 sibling, 0 replies; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23 16:30 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 23.07.2005 [12:50:45 +0200], Roman Zippel wrote:
> Hi,
> 
> On Fri, 22 Jul 2005, Arjan van de Ven wrote:
> 
> > Also I'd rather not add the non-msec ones... either you're raw and use
> > HZ, or you are "cooked" and use the msec variant.. I dont' see the point
> > of adding an "in the middle" one. (Yes this means that several users
> > need to be transformed to msecs but... I consider that progress ;)
> 
> What's wrong with using jiffies? It's simple and the current timeout 
> system is based on it. Calling it something else doesn't suddenly give you 
> more precision.

I agree, and for users that want a certain number of ticks (they are the
minority, in my experience), they can still use schedule_timeout() (so
maybe I should add back in the schedule_timeout_{,un}interruptible()
wrappers?

As far as precision, we *are* talking about a sleeping path, where the
concept of precision is vague at best :) Consider these functions
expressing the completeness of what msleep() and msleep_interruptible()
started, i.e. for the wait-queue case. And to be honest, we are not
changing the semantics of the schedule_timeout() family of functions.
The caller expresses their request in some units, the kenrel then does
what it can to satisfy the request, but is at liberty to be late (or
even early in these two cases).

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 11:55           ` Roman Zippel
  2005-07-23 12:51             ` Arjan van de Ven
@ 2005-07-23 16:37             ` Nishanth Aravamudan
  2005-07-23 17:01               ` Roman Zippel
  1 sibling, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23 16:37 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 23.07.2005 [13:55:58 +0200], Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Arjan van de Ven wrote:
> 
> > > What's wrong with using jiffies? 
> > 
> > A lot of the (driver) users want a wallclock based timeout. For that,
> > miliseconds is a more obvious API with less chance to get the jiffies/HZ
> > conversion wrong by the driver writer.
> 
> We have helper functions for that. The point about using jiffies is to 
> make it _very_ clear, that the timeout is imprecise and for most users 
> this is sufficient.

We do have helper functions for human-time <-> jiffies (I keep adding
new ones :) ). But why not, instead of

	set_current_state(TASK_{,UN}INTERRUPTIBLE);
	schedule_timeout(msecs_to_jiffies(some_constant_msecs));

just have an interface that allows

	schedule_timeout_msecs_{,un}interruptible(some_constant_msecs);

and push the jiffies conversion to common code?

There are some 300 or so users of schedule_timeout() in 2.6.12. I would
say about half are doing something along the lines of

	set_current_state(TASK_{,UN}INTERRUPTIBLE);
	schedule_timeout(HZ/some_constant);

These would be replaced with

	schedule_timeout_msecs_{,un}interruptible(1000/some_constant);

I would *not* be changing the callers that do

	set_current_state(TASK_{,UN}INTERRUPTIBLE);
	schedule_timeout(some_other_constant);

even though I think most of these are 2.4 remnants that don't need
short, e.g.  1 or 2 timer interrupt, sleeps, but actually can use a 10
or 20 millisecond (HZ=100, 1 or 2 jiffies) sleep.

This emphasizes another advantage of adding these new interfaces, the
delay requested does not change with HZ. Internally it does, certainly,
but the callers don't need to know that :)

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 13:04               ` Roman Zippel
  2005-07-23 13:12                 ` Arjan van de Ven
@ 2005-07-23 16:43                 ` Nishanth Aravamudan
  2005-07-23 17:17                   ` Roman Zippel
  1 sibling, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23 16:43 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 23.07.2005 [15:04:44 +0200], Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Arjan van de Ven wrote:
> 
> > > > > What's wrong with using jiffies? 
> > > > 
> > > > A lot of the (driver) users want a wallclock based timeout. For that,
> > > > miliseconds is a more obvious API with less chance to get the jiffies/HZ
> > > > conversion wrong by the driver writer.
> > > 
> > > We have helper functions for that.
> > 
> > I think we just disagree then... I consider this one a helper function
> > as well, one with a simpler API that wraps the more complex and powerful
> > api.
> 
> How is it more powerful? The next step in introducing such API is that 
> people start complaining, they don't get ms precision, which of course is 
> fixed by the next patch, with then results in that the whole timer system 
> becomes more complex for a few misguided users.

Do people complain about not getting jiffy precision? I think the whole
idea of discussing precision in this particular sleeping path is
ludicrous. Some users of this are requesting up to 100s of millisecond
delays!

And, please, don't confuse this patch with my other work. The soft-timer
rework was an RFC, meaning I was hoping to start a discussion and get
some comments. I got comments from you and others, which I greatly
appreciate. But, I think you interpret that patch as being a final
version, which is ready for mainline. It very explicitly is not. I am
intent on dealing with your issues (perhaps moving to usecs...), but
this patch has nothing to do with that. It has everything to do with my
experience replacing caller after caller of schedule_timeout() and a
request from Andrew for a compact interface (the milliseconds were
admittedly my idea, but he hasn't said no to that yet :) ).

> Keep the thing as simple as possible and jiffies _are_ simple. Please show 
> me the problem, that needs to be fixed.

I am not sure why jiffies are any simpler than milliseconds. In the
millisecond case, conversion happens in common code and only needs to be
audited once. In the jiffies case, I have to check *every* caller to see
if they are doing stupid math :)

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 13:29                   ` Roman Zippel
  2005-07-23 13:32                     ` Arjan van de Ven
@ 2005-07-23 16:44                     ` Nishanth Aravamudan
  1 sibling, 0 replies; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23 16:44 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 23.07.2005 [15:29:42 +0200], Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Arjan van de Ven wrote:
> 
> > jiffies/HZ is the more powerful API. The msec one which also sets
> > current to (un)interruptible is the simplified wrapper on top.
> 
> So why do you want to hide it? Make the jiffies based API the primary one 
> and add some convenience functions, where we BTW could convert already 
> constants at compile time.

We are not hiding anything; we are providing an alternative for those
users that would like to request wall-time (human-time) units -- those
users do not (or should not, if nothing else) expect any kind of
precision from the sleeping path.

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 16:37             ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
@ 2005-07-23 17:01               ` Roman Zippel
  2005-07-23 19:06                 ` Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 17:01 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Sat, 23 Jul 2005, Nishanth Aravamudan wrote:

> 	set_current_state(TASK_{,UN}INTERRUPTIBLE);
> 	schedule_timeout(msecs_to_jiffies(some_constant_msecs));
> 
> just have an interface that allows
> 
> 	schedule_timeout_msecs_{,un}interruptible(some_constant_msecs);
> 
> and push the jiffies conversion to common code?

What's wrong with just:

	schedule_timeout_{,un}interruptible(msecs_to_jiffies(some_constant_msecs));

The majority of users use a constant, which can already be converted at 
compile tile.
Additionally such an interface also had to return a ms value and instead 
of that constant conversion, the user is better off to work with jiffies 
directly.

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 16:43                 ` Nishanth Aravamudan
@ 2005-07-23 17:17                   ` Roman Zippel
  2005-07-23 19:10                     ` Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 17:17 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Sat, 23 Jul 2005, Nishanth Aravamudan wrote:

> > Keep the thing as simple as possible and jiffies _are_ simple. Please show 
> > me the problem, that needs to be fixed.
> 
> I am not sure why jiffies are any simpler than milliseconds. In the
> millisecond case, conversion happens in common code and only needs to be
> audited once. In the jiffies case, I have to check *every* caller to see
> if they are doing stupid math :)

Jiffies are the basic time unit for kernel timers, hiding that fact gives 
users only wrong expectations about them.
I don't mind using helper functions, but constant conversion can already 
happen at compile time and for variable timeouts the user should seriously 
consider using jiffies directly instead of constantly converting time 
values.

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 17:01               ` Roman Zippel
@ 2005-07-23 19:06                 ` Nishanth Aravamudan
  2005-07-23 20:22                   ` Roman Zippel
  0 siblings, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23 19:06 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 23.07.2005 [19:01:57 +0200], Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Nishanth Aravamudan wrote:
> 
> > 	set_current_state(TASK_{,UN}INTERRUPTIBLE);
> > 	schedule_timeout(msecs_to_jiffies(some_constant_msecs));
> > 
> > just have an interface that allows
> > 
> > 	schedule_timeout_msecs_{,un}interruptible(some_constant_msecs);
> > 
> > and push the jiffies conversion to common code?
> 
> What's wrong with just:
> 
> 	schedule_timeout_{,un}interruptible(msecs_to_jiffies(some_constant_msecs));

Nothing, I suppose. I just prefer directly using msecs. I understand
your point more now, I think. You are worried about those people that
actually use the return value of schedule_timeout().

> The majority of users use a constant, which can already be converted at 
> compile tile.
> Additionally such an interface also had to return a ms value and instead 
> of that constant conversion, the user is better off to work with jiffies 
> directly.

So, I just spent a good hour looking at every caller of
schedule_timeout() which actually stores the return value. Beyond the
other wrappers for it (wait_event(), wait_for_completion(),
sys_nanosleep(), etc., which I will leave alone using schedule_timeout()
until I can change *their* parameters ;) ), I found two cases.

1) Sleep, see if you actually slept the whole time:

	remainder = schedule_timeout(some_value_in_jiffies);
	if (!remaining)
		report_timeout();

2) Sleep in a loop, keeping track of remaining timeout each iteration:

	while (timeout) {
		do_some_stuff();
		timeout = schedule_timeout(timeout);
		if (some_condition)
			break;
	}

Clearly, neither needs to use jiffies. The former only wants to know if
the full timeout elapsed. I didn't find anyone returning that stored
value (again, excepting wrapper interfaces) to the caller. They just
want to know if they should return -ETIME{,DOUT}. The latter just is a
means to guarantee the entire time is slept, but doesn't care about the
units.

Now, some of these might depend on structures which have members with
jiffy-unit values. But I will be more than happy to try and either leave
them alone or convert those structures. We'll see about that on a
case-by-case basis?

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 17:17                   ` Roman Zippel
@ 2005-07-23 19:10                     ` Nishanth Aravamudan
  2005-07-23 20:12                       ` Roman Zippel
  0 siblings, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-23 19:10 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 23.07.2005 [19:17:37 +0200], Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Nishanth Aravamudan wrote:
> 
> > > Keep the thing as simple as possible and jiffies _are_ simple. Please show 
> > > me the problem, that needs to be fixed.
> > 
> > I am not sure why jiffies are any simpler than milliseconds. In the
> > millisecond case, conversion happens in common code and only needs to be
> > audited once. In the jiffies case, I have to check *every* caller to see
> > if they are doing stupid math :)
> 
> Jiffies are the basic time unit for kernel timers, hiding that fact gives 
> users only wrong expectations about them.

We already have msleep() and msleep_interruptible(), which hide jiffies
in milliseconds. These interfaces are their parallel in the wait-queue
case. If you don't want to use them, or their use is not appropriate,
then the callers won't be changed.

> I don't mind using helper functions, but constant conversion can already 
> happen at compile time and for variable timeouts the user should seriously 
> consider using jiffies directly instead of constantly converting time 
> values.

My goal is to distinguish between these cases in sleeping-logic:

1) tick-oriented
	use schedule_timeout(), add_timer(), etc.

2) time-oriented
	use schedule_timeout_msecs()

I am not going to run some global sed script to change the whole kernel.
I may not always be successful, but I do try to be cautious in my
patches. Only those callers that can benefit from using a millisecond
interface will be changed.

Another question (actually the same question expressed again), do Andrew
or Arjan think I should reinsert the
schedule_timeout_{,un}interruptible() functions for the 1) case above?

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 19:10                     ` Nishanth Aravamudan
@ 2005-07-23 20:12                       ` Roman Zippel
  2005-07-27 22:29                         ` Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 20:12 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Sat, 23 Jul 2005, Nishanth Aravamudan wrote:

> > Jiffies are the basic time unit for kernel timers, hiding that fact gives 
> > users only wrong expectations about them.
> 
> We already have msleep() and msleep_interruptible(), which hide jiffies
> in milliseconds. These interfaces are their parallel in the wait-queue
> case. If you don't want to use them, or their use is not appropriate,
> then the callers won't be changed.

I'm not exactly impressed with their implementation, it's completely silly 
that there is no such convience function based on jiffies, e.g. if you 
look at the msleep_interruptible(), you'll find quite a few 
"msleep_interruptible(jiffies_to_msecs())".
msleep_interruptible() is especially bad as there are a few users who 
check the return value and since it adds one to the timeout, you can 
create loops which may never timeout (as e.g. in drivers/w1/w1_therm.c: 
w1_therm_read_bin()), this is nice example of a bad interface.

These two function should actually look like this:

static inline void msleep(unsigned int msecs)
{
	sleep(msecs_to_jiffies(msecs));
}

static inline int msleep_interruptible(unsigned int msecs)
{   
	sleep_interruptible(msecs_to_jiffies(msecs)) != 0;
}

> My goal is to distinguish between these cases in sleeping-logic:
> 
> 1) tick-oriented
> 	use schedule_timeout(), add_timer(), etc.
> 
> 2) time-oriented
> 	use schedule_timeout_msecs()

There is _no_ difference, the scheduler is based on ticks. Even if we soon 
have different time sources, the scheduler will continue to measure the 
time in ticks and for a simple reason - portability. Jiffies _are_ simple, 
don't throw that away.

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 19:06                 ` Nishanth Aravamudan
@ 2005-07-23 20:22                   ` Roman Zippel
  0 siblings, 0 replies; 50+ messages in thread
From: Roman Zippel @ 2005-07-23 20:22 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Sat, 23 Jul 2005, Nishanth Aravamudan wrote:

> > What's wrong with just:
> > 
> > 	schedule_timeout_{,un}interruptible(msecs_to_jiffies(some_constant_msecs));
> 
> Nothing, I suppose. I just prefer directly using msecs. I understand
> your point more now, I think. You are worried about those people that
> actually use the return value of schedule_timeout().

It's only half the point:

> > The majority of users use a constant, which can already be converted at 
> > compile tile.

The kernel time unit is and will be jiffies and the kernel time functions 
should be in ticks with some optional wrappers in other time units, not 
the other way around.

> 2) Sleep in a loop, keeping track of remaining timeout each iteration:
> 
> 	while (timeout) {
> 		do_some_stuff();
> 		timeout = schedule_timeout(timeout);
> 		if (some_condition)
> 			break;
> 	}

This actually is a pre-preempt construct and should probably use 
time_before() now.

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-23 20:12                       ` Roman Zippel
@ 2005-07-27 22:29                         ` Nishanth Aravamudan
  2005-07-30 23:35                           ` Roman Zippel
  0 siblings, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-07-27 22:29 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 23.07.2005 [22:12:30 +0200], Roman Zippel wrote:
> Hi,
> 
> On Sat, 23 Jul 2005, Nishanth Aravamudan wrote:
> 
> > > Jiffies are the basic time unit for kernel timers, hiding that fact gives 
> > > users only wrong expectations about them.
> > 
> > We already have msleep() and msleep_interruptible(), which hide jiffies
> > in milliseconds. These interfaces are their parallel in the wait-queue
> > case. If you don't want to use them, or their use is not appropriate,
> > then the callers won't be changed.
> 
> I'm not exactly impressed with their implementation, it's completely silly 
> that there is no such convience function based on jiffies, e.g. if you 
> look at the msleep_interruptible(), you'll find quite a few 
> "msleep_interruptible(jiffies_to_msecs())".

Patches are welcome, I guess :)

> msleep_interruptible() is especially bad as there are a few users who 
> check the return value and since it adds one to the timeout, you can 
> create loops which may never timeout (as e.g. in drivers/w1/w1_therm.c: 
> w1_therm_read_bin()), this is nice example of a bad interface.
> 
> These two function should actually look like this:
> 
> static inline void msleep(unsigned int msecs)
> {
> 	sleep(msecs_to_jiffies(msecs));
> }
> 
> static inline int msleep_interruptible(unsigned int msecs)
> {   
> 	sleep_interruptible(msecs_to_jiffies(msecs)) != 0;
> }

Where are sleep() and sleep_interruptible() defined? And this basically
pushes down the identical implementation a layer, correct? Finally, you
cannot simply do msecs_to_jiffies() in the caller of
sleep{,_interruptible()}, unless sleep{,_interruptible()} add 1 to every
request themselves. The basic issue is that with jiffies and ticks being
identical, there is no concept of the inter-tick position (i.e., a
sub-jiffy value), and the callers of these conversion functions *must*
assume they are going to be added to the timer subsystem immediately
before a timer interrupt occurs. An entire tick has not elapsed, but if
the caller had not added 1, then we could potentially go off almost an
entire tick early. The +1 is basically specifying we are asking to add
our delay to the *next* tick.

> > My goal is to distinguish between these cases in sleeping-logic:
> > 
> > 1) tick-oriented
> > 	use schedule_timeout(), add_timer(), etc.
> > 
> > 2) time-oriented
> > 	use schedule_timeout_msecs()
> 
> There is _no_ difference, the scheduler is based on ticks. Even if we soon 
> have different time sources, the scheduler will continue to measure the 
> time in ticks and for a simple reason - portability. Jiffies _are_ simple, 
> don't throw that away.

I agree that from an internal perspective there is no difference, but
from an *interface* perspective they are hugely different, simply on the
basis that one uses human-time units and one does not.

I guess we must continue to agree to disagree.

I am still waiting for some feedback from Andrew if he would prefer a
jiffy-only version or both the jiffy and msec version.

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces
  2005-07-27 22:29                         ` Nishanth Aravamudan
@ 2005-07-30 23:35                           ` Roman Zippel
  2005-08-01 19:35                             ` [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2005-07-30 23:35 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Wed, 27 Jul 2005, Nishanth Aravamudan wrote:

> > > My goal is to distinguish between these cases in sleeping-logic:
> > > 
> > > 1) tick-oriented
> > > 	use schedule_timeout(), add_timer(), etc.
> > > 
> > > 2) time-oriented
> > > 	use schedule_timeout_msecs()
> > 
> > There is _no_ difference, the scheduler is based on ticks. Even if we soon 
> > have different time sources, the scheduler will continue to measure the 
> > time in ticks and for a simple reason - portability. Jiffies _are_ simple, 
> > don't throw that away.
> 
> I agree that from an internal perspective there is no difference, but
> from an *interface* perspective they are hugely different, simply on the
> basis that one uses human-time units and one does not.
> 
> I guess we must continue to agree to disagree.

I'm not really sure, what you disagree about.
1 HZ is about one second, which I don't think is such a difficult concept. 
I already said wrapper functions are fine and for anything smaller than 
HZ/2 it's probably a good idea nowadays.
My main point is to keep the core functionality in jiffies and provide 
some wrapper functions. What exactly do you disagree here on?

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces
  2005-07-30 23:35                           ` Roman Zippel
@ 2005-08-01 19:35                             ` Nishanth Aravamudan
  2005-08-03 14:20                               ` Roman Zippel
  0 siblings, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-01 19:35 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 31.07.2005 [01:35:35 +0200], Roman Zippel wrote:
> Hi,
> 
> On Wed, 27 Jul 2005, Nishanth Aravamudan wrote:
> 
> > > > My goal is to distinguish between these cases in sleeping-logic:
> > > > 
> > > > 1) tick-oriented
> > > > 	use schedule_timeout(), add_timer(), etc.
> > > > 
> > > > 2) time-oriented
> > > > 	use schedule_timeout_msecs()
> > > 
> > > There is _no_ difference, the scheduler is based on ticks. Even if we soon 
> > > have different time sources, the scheduler will continue to measure the 
> > > time in ticks and for a simple reason - portability. Jiffies _are_ simple, 
> > > don't throw that away.
> > 
> > I agree that from an internal perspective there is no difference, but
> > from an *interface* perspective they are hugely different, simply on the
> > basis that one uses human-time units and one does not.
> > 
> > I guess we must continue to agree to disagree.
> 
> I'm not really sure, what you disagree about.
> 1 HZ is about one second, which I don't think is such a difficult concept. 
> I already said wrapper functions are fine and for anything smaller than 
> HZ/2 it's probably a good idea nowadays.
> My main point is to keep the core functionality in jiffies and provide 
> some wrapper functions. What exactly do you disagree here on?

Well, I was under the impression you were against the patch I had sent
to add some millisecond wrappers (which did *not* change any core
functionality) for schedule_timeout(). If that isn't the case, I'm
sorry, I misunderstood.  Here's the patch again. I'm still open to
feedback.

Thanks,
Nish

Description: Add wrappers for interruptible/uninterruptible
schedule_timeout() callers. Also add millisecond equivalents. I tried to
make the names a bit more reasonable to prevent long lines.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 include/linux/sched.h |    7 +++
 kernel/timer.c        |   89 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 95 insertions(+), 1 deletion(-)

diff -urpN 2.6.13-rc4/include/linux/sched.h 2.6.13-rc4-dev/include/linux/sched.h
--- 2.6.13-rc4/include/linux/sched.h	2005-07-29 14:11:49.000000000 -0700
+++ 2.6.13-rc4-dev/include/linux/sched.h	2005-08-01 11:20:31.000000000 -0700
@@ -181,8 +181,13 @@ extern void scheduler_tick(void);
 /* Is this address in the __sched functions? */
 extern int in_sched_functions(unsigned long addr);
 
-#define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT		LONG_MAX
+#define	MAX_SCHEDULE_TIMEOUT_MSECS	UINT_MAX
 extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern signed long schedule_timeout_intr(signed long timeout);
+extern signed long schedule_timeout_unintr(signed long timeout);
+extern unsigned int schedule_timeout_msecs_intr(unsigned int timeout_msecs);
+extern unsigned int schedule_timeout_msecs_unintr(unsigned int timeout_msecs);
 asmlinkage void schedule(void);
 
 struct namespace;
diff -urpN 2.6.13-rc4/kernel/timer.c 2.6.13-rc4-dev/kernel/timer.c
--- 2.6.13-rc4/kernel/timer.c	2005-07-29 14:11:49.000000000 -0700
+++ 2.6.13-rc4-dev/kernel/timer.c	2005-08-01 12:31:53.000000000 -0700
@@ -1153,6 +1153,95 @@ fastcall signed long __sched schedule_ti
 
 EXPORT_SYMBOL(schedule_timeout);
 
+signed long __sched schedule_timeout_intr(signed long timeout)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+	return schedule_timeout(timeout);
+}
+
+EXPORT_SYMBOL(schedule_timeout_intr);
+
+signed long __sched schedule_timeout_unintr(signed long timeout)
+{
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	return schedule_timeout(timeout);
+}
+
+EXPORT_SYMBOL(schedule_timeout_unintr);
+
+/*
+ * schedule_timeout_msecs - sleep until timeout
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A human-time (but otherwise identical) alternative to
+ * schedule_timeout() The state, therefore, *does* need to be set before
+ * calling this function, but this function should *never* be called
+ * directly. Use the nice wrappers, schedule_{interruptible,
+ * uninterruptible}_timeout_msecs().
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
+{
+	unsigned long expire_jifs;
+
+	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
+		expire_jifs = MAX_SCHEDULE_TIMEOUT;
+	} else {
+		/*
+		 * msecs_to_jiffies() is a unit conversion, which truncates
+		 * (rounds down), so we need to add 1.
+		 */
+		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+	}
+
+	expire_jifs = schedule_timeout(expire_jifs);
+
+	/*
+	 * don't need to add 1 here, even though there is truncation,
+	 * because we will add 1 if/when the value is sent back in
+	 */
+	return jiffies_to_msecs(expire_jifs);
+}
+
+/**
+ * schedule_timeout_msecs_interruptible - sleep until timeout,
+ *						 wait-queue event, or signal
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A nice wrapper for the common
+ * set_current_state()/schedule_timeout_msecs() usage.  The state,
+ * therefore, does *not* need to be set before calling this function.
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+unsigned int __sched schedule_timeout_msecs_intr(unsigned int timeout_msecs)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+	return schedule_timeout_msecs(timeout_msecs);
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs_intr);
+
+/**
+ * schedule_timeout_msecs_uninterrutible - sleep until timeout or
+ * 						wait-queue event
+ * @timeout_msecs: timeout value in milliseconds
+ *
+ * A nice wrapper for the common
+ * set_current_state()/schedule_timeout_msecs() usage.  The state,
+ * therefore, does *not* need to be set before calling this function.
+ *
+ * See the comment for schedule_timeout() for details.
+ */
+unsigned int __sched schedule_timeout_msecs_unintr(unsigned int timeout_msecs)
+{
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	return schedule_timeout_msecs(timeout_msecs);
+}
+
+EXPORT_SYMBOL_GPL(schedule_timeout_msecs_unintr);
+
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces
  2005-08-01 19:35                             ` [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces Nishanth Aravamudan
@ 2005-08-03 14:20                               ` Roman Zippel
  2005-08-04  0:51                                 ` [PATCH] push rounding up of relative request to schedule_timeout() Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2005-08-03 14:20 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Mon, 1 Aug 2005, Nishanth Aravamudan wrote:

> +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
> +{
> +	unsigned long expire_jifs;
> +
> +	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
> +		expire_jifs = MAX_SCHEDULE_TIMEOUT;
> +	} else {
> +		/*
> +		 * msecs_to_jiffies() is a unit conversion, which truncates
> +		 * (rounds down), so we need to add 1.
> +		 */
> +		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
> +	}
> +
> +	expire_jifs = schedule_timeout(expire_jifs);
> +
> +	/*
> +	 * don't need to add 1 here, even though there is truncation,
> +	 * because we will add 1 if/when the value is sent back in
> +	 */
> +	return jiffies_to_msecs(expire_jifs);
> +}

As I already mentioned for msleep_interruptible this is a really terrible 
interface.
The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the 
process is immediately woken up again) makes the caller suspectible to 
timeout manipulations and requires constant reauditing, that no caller 
gets it wrong, so it's better to avoid this error source completely.
Constant conversion between different time units is a really bad idea. If 
the user needs the remaining time, he is really better off to do it 
himself by checking jiffies and only does an initial conversion from 
relative to absolute (kernel) time.
This wrapper function really should be an inline function and should look 
more like this:

static inline int schedule_timeout_msecs(unsigned int timeout_msecs)
{
	return schedule_timeout(msecs_to_jiffies(timeout_msecs) + 1) != 0;
}

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-03 14:20                               ` Roman Zippel
@ 2005-08-04  0:51                                 ` Nishanth Aravamudan
  2005-08-04  5:14                                   ` [UPDATE PATCH] " Nishanth Aravamudan
  2005-08-04  9:38                                   ` [PATCH] " Roman Zippel
  0 siblings, 2 replies; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-04  0:51 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote:
> Hi,
> 
> On Mon, 1 Aug 2005, Nishanth Aravamudan wrote:
> 
> > +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
> > +{
> > +	unsigned long expire_jifs;
> > +
> > +	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
> > +		expire_jifs = MAX_SCHEDULE_TIMEOUT;
> > +	} else {
> > +		/*
> > +		 * msecs_to_jiffies() is a unit conversion, which truncates
> > +		 * (rounds down), so we need to add 1.
> > +		 */
> > +		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
> > +	}
> > +
> > +	expire_jifs = schedule_timeout(expire_jifs);
> > +
> > +	/*
> > +	 * don't need to add 1 here, even though there is truncation,
> > +	 * because we will add 1 if/when the value is sent back in
> > +	 */
> > +	return jiffies_to_msecs(expire_jifs);
> > +}
> 
> As I already mentioned for msleep_interruptible this is a really terrible 
> interface.
> The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the 
> process is immediately woken up again) makes the caller suspectible to 
> timeout manipulations and requires constant reauditing, that no caller 
> gets it wrong, so it's better to avoid this error source completely.

After some thought today, I realized the +1 case is not specific to
milliseconds. It's just that it's only being done *correctly* in the
milliseconds case...I think ;)

So, consider the following:

We are requesting a 10 jiffy sleep via

	set_current_state(TASK_INTERRUPTIBLE);
	schedule_timeout(10);

Keep in mind that jiffies is only incremented when the timer interrupt
occurs (the whole point being, again, we do not have any inter-tick
jiffy-value).

In schedule_timeout() we will add the 10 to jiffies' current value. But
what happens if we were calling schedule_timeout() immediately before
the next timer interrupt occurs? Then we will only sleep slightly more
than 9 jiffies, instead of the 10 requested. The basic issue is that we
are always taking the floor of the current position in jiffies in
schedule_timeout() by adding the relative offset to jiffies. To
guarantee the timeout requested passes, we must add the relative offset
to (jiffies+1) [See the attached patch, which I think fixes the
"problem" in 2.6.13-rc5]. Most callers are already rounding up or adding
one to their request, so it may not be a problem. And, often, these are
sleeping paths, so most callers don't care about precision. So, while
you are correct that there is a chance for "infinite" sleep in
msleep_interruptible() and schedule_timeout_{intr,unintr}_msecs(), there
technically *should* be such a possibility in the jiffies case too, but
the code wasn't correct up until now.

All in all, seems buggy, but my analysis may also be wrong -- and this
case may be damn well near impossible to actually create.

> Constant conversion between different time units is a really bad idea. If 
> the user needs the remaining time, he is really better off to do it 
> himself by checking jiffies and only does an initial conversion from 
> relative to absolute (kernel) time.
> This wrapper function really should be an inline function and should look 
> more like this:
> 
> static inline int schedule_timeout_msecs(unsigned int timeout_msecs)
> {
> 	return schedule_timeout(msecs_to_jiffies(timeout_msecs) + 1) != 0;
> }

I don't think I want the schedule_timeout*() functions' return values'
meanings to be different depending on whether you use milliseconds or
jiffies. Your version makes the millisecond-case boolean in return,
which is differnent than schedule_timeout()'s remaining-jiffies return
value.

I have also been thinking about the need to use
while(time_after(timeout_jiffies, timeout)) vs.  while(timeout_msecs),
but I will respond to a different e-mail about that.

Thanks,
Nish

---

 timer.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

Description: Ensure that schedule_timeout() requests can not possibly
expire early in the timeout case, by adding the requested relative jiffy
value to the next value of jiffies. Currently, by adding to the current
value of jiffies, we might actually expire a jiffy too early (in a
corner case).

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

--- 2.6.13-rc5/kernel/timer.c	2005-08-01 12:31:53.000000000 -0700
+++ 2.6.13-rc5-dev/kernel/timer.c	2005-08-03 17:30:10.000000000 -0700
@@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
 		}
 	}
 
-	expire = timeout + jiffies;
+	expire = timeout + jiffies + 1;
 
 	init_timer(&timer);
 	timer.expires = expire;
@@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms
 	} else {
 		/*
 		 * msecs_to_jiffies() is a unit conversion, which truncates
-		 * (rounds down), so we need to add 1.
+		 * (rounds down), so we need to add 1, but this is taken
+		 * care of by schedule_timeout() now.
 		 */
-		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+		expire_jifs = msecs_to_jiffies(timeout_msecs);
 	}
 
 	expire_jifs = schedule_timeout(expire_jifs);
@@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time
  */
 void msleep(unsigned int msecs)
 {
-	unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+	unsigned long timeout = msecs_to_jiffies(msecs);
 
 	while (timeout) {
 		set_current_state(TASK_UNINTERRUPTIBLE);
@@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep);
  */
 unsigned long msleep_interruptible(unsigned int msecs)
 {
-	unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+	unsigned long timeout = msecs_to_jiffies(msecs);
 
 	while (timeout && !signal_pending(current)) {
 		set_current_state(TASK_INTERRUPTIBLE);

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04  0:51                                 ` [PATCH] push rounding up of relative request to schedule_timeout() Nishanth Aravamudan
@ 2005-08-04  5:14                                   ` Nishanth Aravamudan
  2005-08-04 16:45                                     ` George Anzinger
  2005-08-04 17:05                                     ` George Anzinger
  2005-08-04  9:38                                   ` [PATCH] " Roman Zippel
  1 sibling, 2 replies; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-04  5:14 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote:
> On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote:
> > Hi,
> > 
> > On Mon, 1 Aug 2005, Nishanth Aravamudan wrote:
> > 
> > > +unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
> > > +{
> > > +	unsigned long expire_jifs;
> > > +
> > > +	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
> > > +		expire_jifs = MAX_SCHEDULE_TIMEOUT;
> > > +	} else {
> > > +		/*
> > > +		 * msecs_to_jiffies() is a unit conversion, which truncates
> > > +		 * (rounds down), so we need to add 1.
> > > +		 */
> > > +		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
> > > +	}
> > > +
> > > +	expire_jifs = schedule_timeout(expire_jifs);
> > > +
> > > +	/*
> > > +	 * don't need to add 1 here, even though there is truncation,
> > > +	 * because we will add 1 if/when the value is sent back in
> > > +	 */
> > > +	return jiffies_to_msecs(expire_jifs);
> > > +}
> > 
> > As I already mentioned for msleep_interruptible this is a really terrible 
> > interface.
> > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the 
> > process is immediately woken up again) makes the caller suspectible to 
> > timeout manipulations and requires constant reauditing, that no caller 
> > gets it wrong, so it's better to avoid this error source completely.
> 
> After some thought today, I realized the +1 case is not specific to
> milliseconds. It's just that it's only being done *correctly* in the
> milliseconds case...I think ;)

<snip>

> Description: Ensure that schedule_timeout() requests can not possibly
> expire early in the timeout case, by adding the requested relative jiffy
> value to the next value of jiffies. Currently, by adding to the current
> value of jiffies, we might actually expire a jiffy too early (in a
> corner case).

Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
(to account for this same issue, I believe, as POSIX demands no early
return from nanosleep() calls). There are some other locations where
similar

	+ (t.tv_sec || t.tv_nsec)

rounding-ups occur. I'll fix those separately if this patch goes in.  I
change the one in sys_nanosleep() below to maintain the same latency as
we currently have. I also screwed up my layout in the previous
submission, sorry about that.

Description: Ensure that schedule_timeout() requests can not possibly
expire early in the timeout case, by adding the requested relative jiffy
value to the next value of jiffies. Currently, by adding to the current
value of jiffies, we might actually expire a jiffy too early (in a
corner case). Modify the callers of schedule_timeout() in timer.c to not
add 1 themselves.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 timer.c |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

--- 2.6.13-rc4-dev/kernel/timer.c	2005-08-01 12:31:53.000000000 -0700
+++ 2.6.13-rc5/kernel/timer.c	2005-08-03 22:05:16.000000000 -0700
@@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
 		}
 	}
 
-	expire = timeout + jiffies;
+	expire = timeout + jiffies + 1;
 
 	init_timer(&timer);
 	timer.expires = expire;
@@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms
 	} else {
 		/*
 		 * msecs_to_jiffies() is a unit conversion, which truncates
-		 * (rounds down), so we need to add 1.
+		 * (rounds down), so we need to add 1, but this is taken
+		 * care of by schedule_timeout() now.
 		 */
-		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
+		expire_jifs = msecs_to_jiffies(timeout_msecs);
 	}
 
 	expire_jifs = schedule_timeout(expire_jifs);
@@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim
 	if ((t.tv_nsec >= 1000000000L) || (t.tv_nsec < 0) || (t.tv_sec < 0))
 		return -EINVAL;
 
-	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
+	expire = timespec_to_jiffies(&t);
 	current->state = TASK_INTERRUPTIBLE;
 	expire = schedule_timeout(expire);
 
@@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time
  */
 void msleep(unsigned int msecs)
 {
-	unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+	unsigned long timeout = msecs_to_jiffies(msecs);
 
 	while (timeout) {
 		set_current_state(TASK_UNINTERRUPTIBLE);
@@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep);
  */
 unsigned long msleep_interruptible(unsigned int msecs)
 {
-	unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+	unsigned long timeout = msecs_to_jiffies(msecs);
 
 	while (timeout && !signal_pending(current)) {
 		set_current_state(TASK_INTERRUPTIBLE);


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04  0:51                                 ` [PATCH] push rounding up of relative request to schedule_timeout() Nishanth Aravamudan
  2005-08-04  5:14                                   ` [UPDATE PATCH] " Nishanth Aravamudan
@ 2005-08-04  9:38                                   ` Roman Zippel
  2005-08-04 14:33                                     ` Nishanth Aravamudan
  2005-08-04 17:08                                     ` Andrew Morton
  1 sibling, 2 replies; 50+ messages in thread
From: Roman Zippel @ 2005-08-04  9:38 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

Andrew, please drop this patch. 
Nish, please stop fucking around with kernel APIs.

On Wed, 3 Aug 2005, Nishanth Aravamudan wrote:

> > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the 
> > process is immediately woken up again) makes the caller suspectible to 
> > timeout manipulations and requires constant reauditing, that no caller 
> > gets it wrong, so it's better to avoid this error source completely.

Nish, did you read this? Is my English this bad?

> --- 2.6.13-rc5/kernel/timer.c	2005-08-01 12:31:53.000000000 -0700
> +++ 2.6.13-rc5-dev/kernel/timer.c	2005-08-03 17:30:10.000000000 -0700
> @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
>  		}
>  	}
>  
> -	expire = timeout + jiffies;
> +	expire = timeout + jiffies + 1;
>  
>  	init_timer(&timer);
>  	timer.expires = expire;

And a little later it does:

	timeout = expire - jiffies;

which means callers can get back a larger timeout.
Nish, did you check and fix _all_ users? I can easily find a number of 
users which immediately use the return value as next timeout.
There are _a_lot_ of schedule_timeout(1) for small busy loops, these are 
asking for "please schedule until next tick". Did you check that these are 
still ok?
schedule_timeout() is arguably a broken API, but can we please _first_ 
come up with a plan to fix this, before we break even more?

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04  9:38                                   ` [PATCH] " Roman Zippel
@ 2005-08-04 14:33                                     ` Nishanth Aravamudan
  2005-08-04 18:59                                       ` Roman Zippel
  2005-08-04 17:08                                     ` Andrew Morton
  1 sibling, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-04 14:33 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 04.08.2005 [11:38:33 +0200], Roman Zippel wrote:
> Hi,
> 
> Andrew, please drop this patch. 
> Nish, please stop fucking around with kernel APIs.

The comment for schedule_timeout() claims:

 * Make the current task sleep until @timeout jiffies have
 * elapsed.

Currently, it does not do so. I was simply trying to make the function
do what it claims it does.

> On Wed, 3 Aug 2005, Nishanth Aravamudan wrote:
> 
> > > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the 
> > > process is immediately woken up again) makes the caller suspectible to 
> > > timeout manipulations and requires constant reauditing, that no caller 
> > > gets it wrong, so it's better to avoid this error source completely.
> 
> Nish, did you read this? Is my English this bad?

Your English is fine. My point was that the +1 issues (potential
infinite timeouts) are a problem with *jiffies* not milliseconds. And
thus need to be pushed down to the jiffies layer. I think my explanation
was pretty clear.

> 
> > --- 2.6.13-rc5/kernel/timer.c	2005-08-01 12:31:53.000000000 -0700
> > +++ 2.6.13-rc5-dev/kernel/timer.c	2005-08-03 17:30:10.000000000 -0700
> > @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
> >  		}
> >  	}
> >  
> > -	expire = timeout + jiffies;
> > +	expire = timeout + jiffies + 1;
> >  
> >  	init_timer(&timer);
> >  	timer.expires = expire;
> 
> And a little later it does:
> 
> 	timeout = expire - jiffies;
> 
> which means callers can get back a larger timeout.

Hrm, maybe I will need to cache the parameter. But the only way you
would get a return value greater than requested is if schedule() returns
before the next jiffy? Which I guess could happen if a wait-queue event
or signal (if the task is set as such) occurs before the next interrupt
*and* there are no other threads running, I believe, as
process_timeout() -> try_to_wake_up() only puts the task back on the run
queue; I don't think it actually preempts the currently running task,
does it?  Just an FYI, though, that is a problem in the current code, in
the sense that we will pass back the exact same value again Let me take
a look, maybe we need to do some -1 later (or just cache the request, as
I mentioned).  Thanks for pointing this out.

> Nish, did you check and fix _all_ users? I can easily find a number of 
> users which immediately use the return value as next timeout.

I haven't fixed all users yet. I plan on trying to do that today. Would
those callers that do immediately use the return value as the next
timeout fall under the functions that should be using
time_after()/time_before()?

> There are _a_lot_ of schedule_timeout(1) for small busy loops, these are 
> asking for "please schedule until next tick". Did you check that these are 
> still ok?

According to the comment for schedule_timeout(), that is not true. They
are asking to sleep for *1* ticks, no the *next* tick. If the assumption
in the code is the latter, they have been calling this function
inappropriately for a while. They probably should be requesting
schedule_timeout(0), i.e. no sleep at all (but we still will add a timer
internally and it won't expire until the next interrupt occurs).

> schedule_timeout() is arguably a broken API, but can we please _first_ 
> come up with a plan to fix this, before we break even more?

It *is* broken, and my patch *does* fix it. I'm not suggesting it go to
Linus today, but it's ok in Andrew's tree, especially if I'm able to get
the depending patches sent out soon.

I am pretty sure there is no way to fix the problems of adding 1 without
an inter-tick position. The +1 is the closest thing we have to a
"ceiling" function with jiffies.

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04  5:14                                   ` [UPDATE PATCH] " Nishanth Aravamudan
@ 2005-08-04 16:45                                     ` George Anzinger
  2005-08-04 18:48                                       ` Nish Aravamudan
  2005-08-16 23:05                                       ` Nishanth Aravamudan
  2005-08-04 17:05                                     ` George Anzinger
  1 sibling, 2 replies; 50+ messages in thread
From: George Anzinger @ 2005-08-04 16:45 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen,
	linux-kernel, clucas

Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
repeating times (see setitimer code) we need the actual time as we KNOW 
where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
for the initial time, not the repeating time.


See:
http://marc.theaimsgroup.com/?l=linux-kernel&m=112208357906156&w=2

George

Nishanth Aravamudan wrote:
> On 03.08.2005 [17:51:47 -0700], Nishanth Aravamudan wrote:
> 
>>On 03.08.2005 [16:20:57 +0200], Roman Zippel wrote:
>>
>>>Hi,
>>>
>>>On Mon, 1 Aug 2005, Nishanth Aravamudan wrote:
>>>
>>>
>>>>+unsigned int __sched schedule_timeout_msecs(unsigned int timeout_msecs)
>>>>+{
>>>>+	unsigned long expire_jifs;
>>>>+
>>>>+	if (timeout_msecs == MAX_SCHEDULE_TIMEOUT_MSECS) {
>>>>+		expire_jifs = MAX_SCHEDULE_TIMEOUT;
>>>>+	} else {
>>>>+		/*
>>>>+		 * msecs_to_jiffies() is a unit conversion, which truncates
>>>>+		 * (rounds down), so we need to add 1.
>>>>+		 */
>>>>+		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
>>>>+	}
>>>>+
>>>>+	expire_jifs = schedule_timeout(expire_jifs);
>>>>+
>>>>+	/*
>>>>+	 * don't need to add 1 here, even though there is truncation,
>>>>+	 * because we will add 1 if/when the value is sent back in
>>>>+	 */
>>>>+	return jiffies_to_msecs(expire_jifs);
>>>>+}
>>>
>>>As I already mentioned for msleep_interruptible this is a really terrible 
>>>interface.
>>>The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when the 
>>>process is immediately woken up again) makes the caller suspectible to 
>>>timeout manipulations and requires constant reauditing, that no caller 
>>>gets it wrong, so it's better to avoid this error source completely.
>>
>>After some thought today, I realized the +1 case is not specific to
>>milliseconds. It's just that it's only being done *correctly* in the
>>milliseconds case...I think ;)
> 
> 
> <snip>
> 
>>Description: Ensure that schedule_timeout() requests can not possibly
>>expire early in the timeout case, by adding the requested relative jiffy
>>value to the next value of jiffies. Currently, by adding to the current
>>value of jiffies, we might actually expire a jiffy too early (in a
>>corner case).
> 
> 
> Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
> (to account for this same issue, I believe, as POSIX demands no early
> return from nanosleep() calls). There are some other locations where
> similar
> 
> 	+ (t.tv_sec || t.tv_nsec)
> 
> rounding-ups occur. I'll fix those separately if this patch goes in.  I
> change the one in sys_nanosleep() below to maintain the same latency as
> we currently have. I also screwed up my layout in the previous
> submission, sorry about that.
> 
> Description: Ensure that schedule_timeout() requests can not possibly
> expire early in the timeout case, by adding the requested relative jiffy
> value to the next value of jiffies. Currently, by adding to the current
> value of jiffies, we might actually expire a jiffy too early (in a
> corner case). Modify the callers of schedule_timeout() in timer.c to not
> add 1 themselves.
> 
> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
> 
> ---
> 
>  timer.c |   13 +++++++------
>  1 files changed, 7 insertions(+), 6 deletions(-)
> 
> --- 2.6.13-rc4-dev/kernel/timer.c	2005-08-01 12:31:53.000000000 -0700
> +++ 2.6.13-rc5/kernel/timer.c	2005-08-03 22:05:16.000000000 -0700
> @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti
>  		}
>  	}
>  
> -	expire = timeout + jiffies;
> +	expire = timeout + jiffies + 1;
>  
>  	init_timer(&timer);
>  	timer.expires = expire;
> @@ -1190,9 +1190,10 @@ unsigned int __sched schedule_timeout_ms
>  	} else {
>  		/*
>  		 * msecs_to_jiffies() is a unit conversion, which truncates
> -		 * (rounds down), so we need to add 1.
> +		 * (rounds down), so we need to add 1, but this is taken
> +		 * care of by schedule_timeout() now.
>  		 */
> -		expire_jifs = msecs_to_jiffies(timeout_msecs) + 1;
> +		expire_jifs = msecs_to_jiffies(timeout_msecs);
>  	}
>  
>  	expire_jifs = schedule_timeout(expire_jifs);
> @@ -1286,7 +1287,7 @@ asmlinkage long sys_nanosleep(struct tim
>  	if ((t.tv_nsec >= 1000000000L) || (t.tv_nsec < 0) || (t.tv_sec < 0))
>  		return -EINVAL;
>  
> -	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
> +	expire = timespec_to_jiffies(&t);
>  	current->state = TASK_INTERRUPTIBLE;
>  	expire = schedule_timeout(expire);
>  
> @@ -1675,7 +1676,7 @@ unregister_time_interpolator(struct time
>   */
>  void msleep(unsigned int msecs)
>  {
> -	unsigned long timeout = msecs_to_jiffies(msecs) + 1;
> +	unsigned long timeout = msecs_to_jiffies(msecs);
>  
>  	while (timeout) {
>  		set_current_state(TASK_UNINTERRUPTIBLE);
> @@ -1691,7 +1692,7 @@ EXPORT_SYMBOL(msleep);
>   */
>  unsigned long msleep_interruptible(unsigned int msecs)
>  {
> -	unsigned long timeout = msecs_to_jiffies(msecs) + 1;
> +	unsigned long timeout = msecs_to_jiffies(msecs);
>  
>  	while (timeout && !signal_pending(current)) {
>  		set_current_state(TASK_INTERRUPTIBLE);
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

-- 
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04  5:14                                   ` [UPDATE PATCH] " Nishanth Aravamudan
  2005-08-04 16:45                                     ` George Anzinger
@ 2005-08-04 17:05                                     ` George Anzinger
  2005-08-04 18:49                                       ` Nish Aravamudan
  1 sibling, 1 reply; 50+ messages in thread
From: George Anzinger @ 2005-08-04 17:05 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen,
	linux-kernel, clucas

Nishanth Aravamudan wrote:
~
> Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
> (to account for this same issue, I believe, as POSIX demands no early
> return from nanosleep() calls). There are some other locations where
> similar
> 
> 	+ (t.tv_sec || t.tv_nsec)

This is not the same as "always add 1".  We don't do it this way just to 
have fun with C.  If you change schedule_timeout() to add the 1, 
nanosleep() will need to do things differently to get the same behavior. 
  (And, YES users do pass in zero sleep times.)


-- 
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04  9:38                                   ` [PATCH] " Roman Zippel
  2005-08-04 14:33                                     ` Nishanth Aravamudan
@ 2005-08-04 17:08                                     ` Andrew Morton
  2005-08-04 19:00                                       ` [PATCH] add schedule_timeout_{,un}intr() interfaces Nishanth Aravamudan
  1 sibling, 1 reply; 50+ messages in thread
From: Andrew Morton @ 2005-08-04 17:08 UTC (permalink / raw)
  To: Roman Zippel; +Cc: nacc, arjan, domen, linux-kernel, clucas

Roman Zippel <zippel@linux-m68k.org> wrote:
>
> Andrew, please drop this patch. 

Well I was sitting on them with the intention of taking a look later and
trying to work out what the heck you two have been going on about.

But maybe dropping them means that we'll later get a clear and concise
description of the problem and its solution, so I'll do that.

(This was supposed to be just a simple "consolidate
set_current_state+schedule_timeout into a single function call" patch. 
What happened?)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04 16:45                                     ` George Anzinger
@ 2005-08-04 18:48                                       ` Nish Aravamudan
  2005-08-16 23:05                                       ` Nishanth Aravamudan
  1 sibling, 0 replies; 50+ messages in thread
From: Nish Aravamudan @ 2005-08-04 18:48 UTC (permalink / raw)
  To: george
  Cc: Nishanth Aravamudan, Roman Zippel, Arjan van de Ven,
	Andrew Morton, domen, linux-kernel, clucas

On 8/4/05, George Anzinger <george@mvista.com> wrote:
> Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and
> timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For
> repeating times (see setitimer code) we need the actual time as we KNOW
> where the jiffies edge is in the repeating case.  The +1 is needed ONLY
> for the initial time, not the repeating time.

Please read the patch. I didn't touch timespec_to_jiffies() or
timeval_to_jiffies(). Not sure why you think I did. I agree that we
only need the initial time, my patch is no good. But it is hard for
non-itimers, like schedule_timeout() callers, to provide an interface
that only adds 1 to the initial request, since the callers currently
pass in an absolute jiffies value.

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04 17:05                                     ` George Anzinger
@ 2005-08-04 18:49                                       ` Nish Aravamudan
  0 siblings, 0 replies; 50+ messages in thread
From: Nish Aravamudan @ 2005-08-04 18:49 UTC (permalink / raw)
  To: george
  Cc: Nishanth Aravamudan, Roman Zippel, Arjan van de Ven,
	Andrew Morton, domen, linux-kernel, clucas

On 8/4/05, George Anzinger <george@mvista.com> wrote:
> Nishanth Aravamudan wrote:
> ~
> > Sorry, I forgot that sys_nanosleep() also always adds 1 to the request
> > (to account for this same issue, I believe, as POSIX demands no early
> > return from nanosleep() calls). There are some other locations where
> > similar
> >
> >       + (t.tv_sec || t.tv_nsec)
> 
> This is not the same as "always add 1".  We don't do it this way just to
> have fun with C.  If you change schedule_timeout() to add the 1,
> nanosleep() will need to do things differently to get the same behavior.
>   (And, YES users do pass in zero sleep times.)

Fair enough. Will need to think about this more.

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04 14:33                                     ` Nishanth Aravamudan
@ 2005-08-04 18:59                                       ` Roman Zippel
  2005-08-04 19:11                                         ` Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2005-08-04 18:59 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Thu, 4 Aug 2005, Nishanth Aravamudan wrote:

> The comment for schedule_timeout() claims:
> 
>  * Make the current task sleep until @timeout jiffies have
>  * elapsed.
> 
> Currently, it does not do so. I was simply trying to make the function
> do what it claims it does.

What makes you think the comment is correct? This comment was added at 
2.4.3, while schedule_timeout() has this behaviour since it was added at 
2.1.127.

> My point was that the +1 issues (potential
> infinite timeouts) are a problem with *jiffies* not milliseconds. And
> thus need to be pushed down to the jiffies layer. I think my explanation
> was pretty clear.

Not really, could you go into more details why it's "a problem with 
*jiffies* not milliseconds"?

bye, Roman

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH] add schedule_timeout_{,un}intr() interfaces
  2005-08-04 17:08                                     ` Andrew Morton
@ 2005-08-04 19:00                                       ` Nishanth Aravamudan
  2005-08-05  7:38                                         ` Andrew Morton
  0 siblings, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-04 19:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Roman Zippel, arjan, domen, linux-kernel, clucas

On 04.08.2005 [10:08:10 -0700], Andrew Morton wrote:
> Roman Zippel <zippel@linux-m68k.org> wrote:
> >
> > Andrew, please drop this patch. 
> 
> Well I was sitting on them with the intention of taking a look later and
> trying to work out what the heck you two have been going on about.

I appreciate that... Good luck figuring out this mess...

> But maybe dropping them means that we'll later get a clear and concise
> description of the problem and its solution, so I'll do that.

I will try and formulate this as well as I can today or tomorrow.

> (This was supposed to be just a simple "consolidate
> set_current_state+schedule_timeout into a single function call" patch. 
> What happened?)

Here's that patch. What happened? I did, to be honest. Trying to fix too
many things at once.

Thanks,
Nish

Description: Add schedule_timeout_{,un}intr() interfaces so that
schedule_timeout() callers don't have to worry about forgetting to add
the set_current_state() call beforehand.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

 include/linux/sched.h |    2 ++
 kernel/timer.c        |   14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff -urpN 2.6.13-rc5/include/linux/sched.h 2.6.13-rc5-dev/include/linux/sched.h
--- 2.6.13-rc5/include/linux/sched.h	2005-08-04 11:46:06.000000000 -0700
+++ 2.6.13-rc5-dev/include/linux/sched.h	2005-08-04 11:53:20.000000000 -0700
@@ -183,6 +183,8 @@ extern int in_sched_functions(unsigned l
 
 #define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
 extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern signed long schedule_timeout_intr(signed long timeout);
+extern signed long schedule_timeout_unintr(signed long timeout);
 asmlinkage void schedule(void);
 
 struct namespace;
diff -urpN 2.6.13-rc5/kernel/timer.c 2.6.13-rc5-dev/kernel/timer.c
--- 2.6.13-rc5/kernel/timer.c	2005-08-04 11:46:06.000000000 -0700
+++ 2.6.13-rc5-dev/kernel/timer.c	2005-08-04 11:52:06.000000000 -0700
@@ -1153,6 +1153,20 @@ fastcall signed long __sched schedule_ti
 
 EXPORT_SYMBOL(schedule_timeout);
 
+signed long __sched schedule_timeout_intr(signed long timeout)
+{
+       set_current_state(TASK_INTERRUPTIBLE);
+       return schedule_timeout(timeout);
+}
+EXPORT_SYMBOL(schedule_timeout_intr);
+
+signed long __sched schedule_timeout_unintr(signed long timeout)
+{
+       set_current_state(TASK_UNINTERRUPTIBLE);
+       return schedule_timeout(timeout);
+}
+EXPORT_SYMBOL(schedule_timeout_unintr);
+
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04 18:59                                       ` Roman Zippel
@ 2005-08-04 19:11                                         ` Nishanth Aravamudan
  2005-08-04 23:20                                           ` Roman Zippel
  0 siblings, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-04 19:11 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

On 04.08.2005 [20:59:50 +0200], Roman Zippel wrote:
> Hi,
> 
> On Thu, 4 Aug 2005, Nishanth Aravamudan wrote:
> 
> > The comment for schedule_timeout() claims:
> > 
> >  * Make the current task sleep until @timeout jiffies have
> >  * elapsed.
> > 
> > Currently, it does not do so. I was simply trying to make the function
> > do what it claims it does.
> 
> What makes you think the comment is correct? This comment was added at 
> 2.4.3, while schedule_timeout() has this behaviour since it was added at 
> 2.1.127.

Fair enough. Should we change the comment? What are the expectations of
programmers wrt. to this interface?

> > My point was that the +1 issues (potential
> > infinite timeouts) are a problem with *jiffies* not milliseconds. And
> > thus need to be pushed down to the jiffies layer. I think my explanation
> > was pretty clear.
> 
> Not really, could you go into more details why it's "a problem with 
> *jiffies* not milliseconds"?

Well, I may have phrased that wrong. It's not that milliseconds in the
kernel wouldn't have the same problem -- they would/do, because they
convert to jiffies eventually.

This is all the same issue, as George pointed out, that itimers face.

If a user requests schedule_timeout(HZ/100); which, if HZ is 1000, is 10
jiffies, yes? But, since we are between ticks, we want to actually add
that request to the next interval, not to the current one. Otherwise, we
do have the possibility of returning early. Currently, we require
callers to add the +1 to their request, and thus they only add it to the
first one. The problem with my patch which pushed it to
schedule_timeout(), is that we will do +1 to every request. I'm not
sure, without some sort of "persistent" timeout control structure, how
we get around that, though, in the schedule_timeout() case. Does that
make any more sense?

Thanks
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04 19:11                                         ` Nishanth Aravamudan
@ 2005-08-04 23:20                                           ` Roman Zippel
  0 siblings, 0 replies; 50+ messages in thread
From: Roman Zippel @ 2005-08-04 23:20 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Arjan van de Ven, Andrew Morton, domen, linux-kernel, clucas

Hi,

On Thu, 4 Aug 2005, Nishanth Aravamudan wrote:

> > What makes you think the comment is correct? This comment was added at 
> > 2.4.3, while schedule_timeout() has this behaviour since it was added at 
> > 2.1.127.
> 
> Fair enough. Should we change the comment?

It can't hurt to fix the comment.

> If a user requests schedule_timeout(HZ/100); which, if HZ is 1000, is 10
> jiffies, yes? But, since we are between ticks, we want to actually add
> that request to the next interval, not to the current one. Otherwise, we
> do have the possibility of returning early. Currently, we require
> callers to add the +1 to their request, and thus they only add it to the
> first one. The problem with my patch which pushed it to
> schedule_timeout(), is that we will do +1 to every request. I'm not
> sure, without some sort of "persistent" timeout control structure, how
> we get around that, though, in the schedule_timeout() case. Does that
> make any more sense?

I don't disagree, but please create some sane interfaces, e.g. something 
like below. This allows to first convert as much as possible 
schedule_timeout() users and then we can still change the 
schedule_timeout() interface.

bye, Roman

 include/linux/sched.h |   17 +++++++++++
 kernel/timer.c        |   71 +++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 78 insertions(+), 10 deletions(-)

Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h	2005-06-18 15:00:59.000000000 +0200
+++ linux-2.6/include/linux/sched.h	2005-08-04 23:42:43.000000000 +0200
@@ -182,9 +182,24 @@ extern void scheduler_tick(void);
 extern int in_sched_functions(unsigned long addr);
 
 #define	MAX_SCHEDULE_TIMEOUT	LONG_MAX
-extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern void schedule_until(unsigned long expire) fastcall;
+extern int schedule_until_intr(unsigned long expire) fastcall;
+extern int schedule_until_unintr(unsigned long expire) fastcall;
+extern signed long schedule_timeout(signed long timeout) fastcall;
+extern int schedule_timeout_intr(signed long timeout) fastcall;
+extern int schedule_timeout_unintr(signed long timeout) fastcall;
 asmlinkage void schedule(void);
 
+static inline int schedule_timeout_msecs_intr(unsigned int timeout)
+{
+	return schedule_timeout_intr(msecs_to_jiffies(timeout));
+}
+
+static inline int schedule_timeout_msecs_unintr(unsigned int timeout)
+{
+	return schedule_timeout_unintr(msecs_to_jiffies(timeout));
+}
+
 struct namespace;
 
 /* Maximum number of active map areas.. This is a random (large) number */
Index: linux-2.6/kernel/timer.c
===================================================================
--- linux-2.6.orig/kernel/timer.c	2005-06-18 15:01:12.000000000 +0200
+++ linux-2.6/kernel/timer.c	2005-08-04 23:42:36.000000000 +0200
@@ -1049,6 +1049,67 @@ static void process_timeout(unsigned lon
 	wake_up_process((task_t *)__data);
 }
 
+fastcall __sched void schedule_until(unsigned long expire)
+{
+	struct timer_list timer;
+
+	init_timer(&timer);
+	timer.expires = expire;
+	timer.data = (unsigned long) current;
+	timer.function = process_timeout;
+
+	add_timer(&timer);
+	schedule();
+	del_singleshot_timer_sync(&timer);
+}
+EXPORT_SYMBOL(schedule_until);
+
+fastcall __sched int schedule_until_intr(unsigned long expire)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+	schedule_until(expire);
+	return time_before(expire, jiffies);
+}
+EXPORT_SYMBOL(schedule_until_intr);
+
+fastcall __sched int schedule_until_unintr(unsigned long expire)
+{
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	schedule_until(expire);
+	return time_before(expire, jiffies);
+}
+EXPORT_SYMBOL(schedule_until_intr);
+
+fastcall __sched int schedule_timeout_intr(signed long timeout)
+{
+	unsigned long expire;
+	
+	set_current_state(TASK_INTERRUPTIBLE);
+	if (timeout >= MAX_SCHEDULE_TIMEOUT - 1) {
+		schedule();
+		return 1;
+	}
+	expire = jiffies + timeout + 1;
+	schedule_until(expire);
+	return time_before(expire, jiffies);
+}
+EXPORT_SYMBOL(schedule_timeout_intr);
+
+fastcall __sched int schedule_timeout_unintr(signed long timeout)
+{
+	unsigned long expire;
+
+	set_current_state(TASK_UNINTERRUPTIBLE);
+	if (timeout >= MAX_SCHEDULE_TIMEOUT - 1) {
+		schedule();
+		return 1;
+	}
+	expire = jiffies + timeout + 1;
+	schedule_until(expire);
+	return time_before(expire, jiffies);
+}
+EXPORT_SYMBOL(schedule_timeout_unintr);
+
 /**
  * schedule_timeout - sleep until timeout
  * @timeout: timeout value in jiffies
@@ -1077,7 +1138,6 @@ static void process_timeout(unsigned lon
  */
 fastcall signed long __sched schedule_timeout(signed long timeout)
 {
-	struct timer_list timer;
 	unsigned long expire;
 
 	switch (timeout)
@@ -1112,14 +1172,7 @@ fastcall signed long __sched schedule_ti
 
 	expire = timeout + jiffies;
 
-	init_timer(&timer);
-	timer.expires = expire;
-	timer.data = (unsigned long) current;
-	timer.function = process_timeout;
-
-	add_timer(&timer);
-	schedule();
-	del_singleshot_timer_sync(&timer);
+	schedule_until(expire);
 
 	timeout = expire - jiffies;
 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] add schedule_timeout_{,un}intr() interfaces
  2005-08-04 19:00                                       ` [PATCH] add schedule_timeout_{,un}intr() interfaces Nishanth Aravamudan
@ 2005-08-05  7:38                                         ` Andrew Morton
  0 siblings, 0 replies; 50+ messages in thread
From: Andrew Morton @ 2005-08-05  7:38 UTC (permalink / raw)
  To: Nishanth Aravamudan; +Cc: zippel, arjan, domen, linux-kernel, clucas

Nishanth Aravamudan <nacc@us.ibm.com> wrote:
>
> Add schedule_timeout_{,un}intr() interfaces

I did s/intr/interruptible/.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-04 16:45                                     ` George Anzinger
  2005-08-04 18:48                                       ` Nish Aravamudan
@ 2005-08-16 23:05                                       ` Nishanth Aravamudan
  2005-08-17  0:39                                         ` George Anzinger
  1 sibling, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-16 23:05 UTC (permalink / raw)
  To: George Anzinger
  Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen,
	linux-kernel, clucas

On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:
> Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
> timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
> repeating times (see setitimer code) we need the actual time as we KNOW 
> where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
> for the initial time, not the repeating time.
> 
> 
> See:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=112208357906156&w=2

I followed that thread, George, but I think it's a different case with
schedule_timeout() [maybe this indicates drivers/other users should
maybe be using itimers, but I'll get to that in a sec].

With schedule_timeout(), we are just given a relative jiffies value. We
have no context as to which task is requesting the delay, per se,
meaning we don't (can't) know from the interface whether this is the
first delay in a sequence, or a brand new one, without changing all
users to have some sort of control structure. The callers of
schedule_timeout() don't even get a pointer to the timer added
internally.

So, adding 1 to all sleeps seems like it might be reasonable, as looping
sleeps probably need to use a different interface. I had worked a bit
ago on something like poll_event() with the kernel-janitors group, which
would abstract out the repeated sleeps. Basically wait_event() without
wait-queues... Maybe we could make such an interface just use itimers?
I've attached my old patch for poll_event(), just for reference.

My point, I guess, is that in the schedule_timeout() case, we don't know
where the jiffies edge is, as we either expire or receive a wait-queue
event/signal, we never mod_timer() the internal timer... So we have to
assume that we need to sleep the request. But maybe Roman's idea of
sleeping a certain number of jiffy edges is sufficient. I am not yet
convinced driver authors want/need such an interface, though, still
thinking it over.

Thanks for all the feedback,

Nish


--- 2.6.11-kj-v/include/linux/delay.h	2005-03-01 23:37:51.000000000 -0800
+++ 2.6.11-kj/include/linux/delay.h	2005-03-04 16:03:24.000000000 -0800
@@ -47,4 +47,106 @@ static inline void ssleep(unsigned int s
 	msleep(seconds * 1000);
 }
 
+/*
+ * poll_event* check if an @condition is true every @interval milliseconds,
+ * up to a @timeout maximum, if specified, else forever.
+ * The *_interruptible versions will sleep in TASK_INTERRUPTIBLE
+ * (instead of TASK_UNINTERRUPTIBLE) and will return early on signals
+ * Note that these interfaces are distinctly different than
+ * wait_event*() in two ways:
+ * 	1) No wait-queues
+ * 	2) Specify the time you want to sleep, not when you want to
+ * 	wake-up. Thus, callers should not add the current time to
+ * 	@timeout before calling, as it will be done by the macro.
+ */
+
+/**
+ * poll_event - check a condition at regular intervals
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ */
+#define poll_event(condition, interval)				\
+do {								\
+	while (!(condition))					\
+		msleep(interval);				\
+} while(0)
+
+/**
+ * poll_event_timeout - check a condition at regular intervals with a timeout
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ * @timeout: Maximum total number of milliseconds to take
+ * returns: 0, if condition caused return
+ * 	    -ETIME, if timeout
+ */
+#define poll_event_timeout(condition, interval, timeout)		\
+({									\
+	int __ret = 0;							\
+	unsigned long stop = jiffies + msecs_to_jiffies(timeout);	\
+ 									\
+	for (;;) {							\
+		if (condition)						\
+			break;						\
+		msleep(interval);					\
+		if (time_after(jiffies, stop)) {			\
+			__ret = -ETIME;					\
+ 			break;						\
+		}							\
+	}								\
+	__ret;								\
+})
+
+/**
+ * poll_event_interruptible - check a condition at regular intervals, returning early on signals
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ * returns: 0, if condition caused return
+ * 	    -EINTR, if a signal
+ */
+#define poll_event_interruptible(condition, interval)	\
+({							\
+	int __ret = 0;					\
+							\
+	for (;;) {					\
+		if (condition)				\
+			break;				\
+		msleep_interruptible(interval);		\
+		if (signal_pending(current)) {		\
+			__ret = -EINTR;			\
+ 			break;				\
+		}					\
+	}						\
+	__ret;
+})
+
+/**
+ * poll_event_interruptible_timeout - check a condition at regular intervals with a timeout, returning early on signals
+ * @condition: Condition to check
+ * @interval: Number of milliseconds between checks
+ * @timeout: Maximum total number of milliseconds to take
+ * returns 0, if condition caused return
+ * 	   -EINTR, if a signal
+ * 	   -ETIME, if timeout
+ */
+#define poll_event_interruptible_timeout(conditition, interval, timeout)	\
+({										\
+	int __ret = 0;								\
+	unsigned long stop = jiffies + msecs_to_jiffies(timeout);		\
+										\
+	for (;;) {								\
+		if (condition)							\
+			break;							\
+		msleep_interruptible(interval);					\
+		if (signal_pending(current)) {					\
+			__ret = -EINTR;						\
+			break;							\
+		}								\
+		if (time_after(jiffies, stop)) {				\
+			__ret = -ETIME;						\
+			break;							\
+		}								\
+	}									\
+	__ret;									\
+})
+
 #endif /* defined(_LINUX_DELAY_H) */

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-16 23:05                                       ` Nishanth Aravamudan
@ 2005-08-17  0:39                                         ` George Anzinger
  2005-08-17  5:56                                           ` Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: George Anzinger @ 2005-08-17  0:39 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen,
	linux-kernel, clucas

Nishanth Aravamudan wrote:
> On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:
> 
>>Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
>>timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
>>repeating times (see setitimer code) we need the actual time as we KNOW 
>>where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
>>for the initial time, not the repeating time.
>>
>>
>>See:
>>http://marc.theaimsgroup.com/?l=linux-kernel&m=112208357906156&w=2
> 
> 
> I followed that thread, George, but I think it's a different case with
> schedule_timeout() [maybe this indicates drivers/other users should
> maybe be using itimers, but I'll get to that in a sec].

I think I miss understood back then :).

> 
> With schedule_timeout(), we are just given a relative jiffies value. We
> have no context as to which task is requesting the delay, per se,
> meaning we don't (can't) know from the interface whether this is the
> first delay in a sequence, or a brand new one, without changing all
> users to have some sort of control structure. The callers of
> schedule_timeout() don't even get a pointer to the timer added
> internally.
> 
> So, adding 1 to all sleeps seems like it might be reasonable, as looping
> sleeps probably need to use a different interface. I had worked a bit
> ago on something like poll_event() with the kernel-janitors group, which
> would abstract out the repeated sleeps. Basically wait_event() without
> wait-queues... Maybe we could make such an interface just use itimers?
> I've attached my old patch for poll_event(), just for reference.

I think not.  itimers is really pointed at a particular system call and 
has resources in the task structure to do it.  These would be hard to 
share...
> 
> My point, I guess, is that in the schedule_timeout() case, we don't know
> where the jiffies edge is, as we either expire or receive a wait-queue
> event/signal, we never mod_timer() the internal timer... So we have to
> assume that we need to sleep the request. But maybe Roman's idea of
> sleeping a certain number of jiffy edges is sufficient. I am not yet
> convinced driver authors want/need such an interface, though, still
> thinking it over.

IMNSHO we should not get too parental with kernel only interfaces. 
Adding 1 is easy enough for the caller and even easier to explain in the 
instructions (i.e. this call sleeps for X jiffies edges).  This allows 
the caller to do more if needed and, should he ever just want to sync to 
the next jiffie he does not have to deal with backing out that +1.
> 

-- 
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-17  0:39                                         ` George Anzinger
@ 2005-08-17  5:56                                           ` Nishanth Aravamudan
  2005-08-17 19:51                                             ` George Anzinger
  0 siblings, 1 reply; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-17  5:56 UTC (permalink / raw)
  To: George Anzinger
  Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen,
	linux-kernel, clucas

On 16.08.2005 [17:39:11 -0700], George Anzinger wrote:
> Nishanth Aravamudan wrote:
> >On 04.08.2005 [09:45:55 -0700], George Anzinger wrote:
> >
> >>Uh... PLEASE tell me you are NOT changing timespec_to_jiffies() (and 
> >>timeval_to_jiffies() to add 1.  This is NOT the right thing to do.  For 
> >>repeating times (see setitimer code) we need the actual time as we KNOW 
> >>where the jiffies edge is in the repeating case.  The +1 is needed ONLY 
> >>for the initial time, not the repeating time.
> >>
> >>
> >>See:
> >>http://marc.theaimsgroup.com/?l=linux-kernel&m=112208357906156&w=2
> >
> >
> >I followed that thread, George, but I think it's a different case with
> >schedule_timeout() [maybe this indicates drivers/other users should
> >maybe be using itimers, but I'll get to that in a sec].
> 
> I think I miss understood back then :).

Ok, no problem. I was just thinking today about all the issues that were
borught up... I really appreciate your feedback.

> >
> >With schedule_timeout(), we are just given a relative jiffies value. We
> >have no context as to which task is requesting the delay, per se,
> >meaning we don't (can't) know from the interface whether this is the
> >first delay in a sequence, or a brand new one, without changing all
> >users to have some sort of control structure. The callers of
> >schedule_timeout() don't even get a pointer to the timer added
> >internally.
> >
> >So, adding 1 to all sleeps seems like it might be reasonable, as looping
> >sleeps probably need to use a different interface. I had worked a bit
> >ago on something like poll_event() with the kernel-janitors group, which
> >would abstract out the repeated sleeps. Basically wait_event() without
> >wait-queues... Maybe we could make such an interface just use itimers?
> >I've attached my old patch for poll_event(), just for reference.
> 
> I think not.  itimers is really pointed at a particular system call and 
> has resources in the task structure to do it.  These would be hard to 
> share...

Ok, I wasn't sure about that -- I'm not too familiar with the itimer
code (maybe I'll take a look at it soon :) )

> >My point, I guess, is that in the schedule_timeout() case, we don't know
> >where the jiffies edge is, as we either expire or receive a wait-queue
> >event/signal, we never mod_timer() the internal timer... So we have to
> >assume that we need to sleep the request. But maybe Roman's idea of
> >sleeping a certain number of jiffy edges is sufficient. I am not yet
> >convinced driver authors want/need such an interface, though, still
> >thinking it over.
> 
> IMNSHO we should not get too parental with kernel only interfaces. 
> Adding 1 is easy enough for the caller and even easier to explain in the 
> instructions (i.e. this call sleeps for X jiffies edges).  This allows 
> the caller to do more if needed and, should he ever just want to sync to 
> the next jiffie he does not have to deal with backing out that +1.

I don't want to be too parental either, but I also am trying to avoid
code duplication. Lots of drivers basically do something like
poll_event() does (or could do with some changes), i.e. looping a
constant amount multiple times, checking something every so often. The
patch was just a thought, though. I will keep evaluating drivers and see
if it's a useful interface to have eventually.

I guess I'm just concerned with making an unintuitive interface. As was
brought up at OLS, drivers are a major source of bugs/buggy code. The
simpler, more useful we can make interfaces, the better, I think. I'm
not claiming you disagree, I just want to make my own motives clear.
While fixing up the schedule_timeout() comment would make it clear what
schedule_timeout() achieves, I'm not sure how useful such an interface
is, if every caller adds 1 :) I need to mull it over, though... Lots to
consider. I also, of course, want to stay flexible for the reasons you
mention (letting the driver adjust the timeout as they expect to).

Thanks,
Nish

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-17  5:56                                           ` Nishanth Aravamudan
@ 2005-08-17 19:51                                             ` George Anzinger
  2005-08-17 22:24                                               ` Nishanth Aravamudan
  0 siblings, 1 reply; 50+ messages in thread
From: George Anzinger @ 2005-08-17 19:51 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen,
	linux-kernel, clucas

Nishanth Aravamudan wrote:
~
>>IMNSHO we should not get too parental with kernel only interfaces. 
>>Adding 1 is easy enough for the caller and even easier to explain in the 
>>instructions (i.e. this call sleeps for X jiffies edges).  This allows 
>>the caller to do more if needed and, should he ever just want to sync to 
>>the next jiffie he does not have to deal with backing out that +1.
> 
> 
> I don't want to be too parental either, but I also am trying to avoid
> code duplication. Lots of drivers basically do something like
> poll_event() does (or could do with some changes), i.e. looping a
> constant amount multiple times, checking something every so often. The
> patch was just a thought, though. I will keep evaluating drivers and see
> if it's a useful interface to have eventually.
> 
> I guess I'm just concerned with making an unintuitive interface. As was
> brought up at OLS, drivers are a major source of bugs/buggy code. The
> simpler, more useful we can make interfaces, the better, I think. I'm
> not claiming you disagree, I just want to make my own motives clear.
> While fixing up the schedule_timeout() comment would make it clear what
> schedule_timeout() achieves, I'm not sure how useful such an interface
> is, if every caller adds 1 :) I need to mull it over, though... Lots to
> consider. I also, of course, want to stay flexible for the reasons you
> mention (letting the driver adjust the timeout as they expect to).

I would leave the +1 alone and put in the correct documentation.  This 
way _more_ folks will be made aware of the mid jiffie issue.  Far to 
often we see (and let get in) patches that mess up user interfaces 
around this issue.  The recent changes to itimer come to mind...
> 
~
-- 
George Anzinger   george@mvista.com
HRT (High-res-timers):  http://sourceforge.net/projects/high-res-timers/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [UPDATE PATCH] push rounding up of relative request to schedule_timeout()
  2005-08-17 19:51                                             ` George Anzinger
@ 2005-08-17 22:24                                               ` Nishanth Aravamudan
  0 siblings, 0 replies; 50+ messages in thread
From: Nishanth Aravamudan @ 2005-08-17 22:24 UTC (permalink / raw)
  To: George Anzinger
  Cc: Roman Zippel, Arjan van de Ven, Andrew Morton, domen,
	linux-kernel, clucas

On 17.08.2005 [12:51:17 -0700], George Anzinger wrote:
> Nishanth Aravamudan wrote:
> ~
> >>IMNSHO we should not get too parental with kernel only interfaces. 
> >>Adding 1 is easy enough for the caller and even easier to explain in the 
> >>instructions (i.e. this call sleeps for X jiffies edges).  This allows 
> >>the caller to do more if needed and, should he ever just want to sync to 
> >>the next jiffie he does not have to deal with backing out that +1.
> >
> >
> >I don't want to be too parental either, but I also am trying to avoid
> >code duplication. Lots of drivers basically do something like
> >poll_event() does (or could do with some changes), i.e. looping a
> >constant amount multiple times, checking something every so often. The
> >patch was just a thought, though. I will keep evaluating drivers and see
> >if it's a useful interface to have eventually.
> >
> >I guess I'm just concerned with making an unintuitive interface. As was
> >brought up at OLS, drivers are a major source of bugs/buggy code. The
> >simpler, more useful we can make interfaces, the better, I think. I'm
> >not claiming you disagree, I just want to make my own motives clear.
> >While fixing up the schedule_timeout() comment would make it clear what
> >schedule_timeout() achieves, I'm not sure how useful such an interface
> >is, if every caller adds 1 :) I need to mull it over, though... Lots to
> >consider. I also, of course, want to stay flexible for the reasons you
> >mention (letting the driver adjust the timeout as they expect to).
> 
> I would leave the +1 alone and put in the correct documentation.  This 
> way _more_ folks will be made aware of the mid jiffie issue.  Far to 
> often we see (and let get in) patches that mess up user interfaces 
> around this issue.  The recent changes to itimer come to mind...

Ok, makes sense to me; does the following patch work for everybody? The
wording is a bit awkward, but so is the issue :)

Description: Fix schedule_timeout()'s comment, indicated the inter-tick
rounding issue. Since the kernel does not keep track of an inter-tick
position in jiffies, a caller which wishes to sleep for at least a
certain number of jiffies should add its request to the *next* jiffies
value (meaning add 1 to its relative request). Make that clear in the
comment.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

---

diff -urpN 2.6.13-rc6/kernel/timer.c 2.6.13-rc6-dev/kernel/timer.c
--- 2.6.13-rc6/kernel/timer.c	2005-08-09 15:22:57.000000000 -0700
+++ 2.6.13-rc6-dev/kernel/timer.c	2005-08-17 15:21:35.000000000 -0700
@@ -1077,9 +1077,15 @@ static void process_timeout(unsigned lon
  * schedule_timeout - sleep until timeout
  * @timeout: timeout value in jiffies
  *
- * Make the current task sleep until @timeout jiffies have
- * elapsed. The routine will return immediately unless
- * the current task state has been set (see set_current_state()).
+ * Make the current task sleep until @timeout timer interrupts have
+ * occurred, meaning jiffies has incremented @timeout times and not
+ * necessarily that @timeout jiffies have elapsed. If the task wishes to
+ * sleep until (at least) @timeout jiffies have elapsed, then it should
+ * add 1 to its request. This is necessary, as the kernel does not keep
+ * track of an inter-jiffy position, so the caller must "round up" its
+ * request so that it begins at the next jiffy. The routine will return
+ * immediately unless the current task state has been set (see
+ * set_current_state()).
  *
  * You can set the task state as follows -
  *

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2005-08-17 22:24 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-07-07 21:31 [patch 1/4] drivers/char/ip2/i2lib.c: replace direct assignment with set_current_state() domen
2005-07-08 23:08 ` Andrew Morton
2005-07-08 23:22   ` Nish Aravamudan
2005-07-23  0:27   ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
2005-07-23  0:31     ` Arjan van de Ven
2005-07-23  1:08       ` [UPDATE PATCH] Add schedule_timeout_{interruptible,uninterruptible}_msecs() interfaces Nishanth Aravamudan
2005-07-23  2:30         ` Andrew Morton
2005-07-23 16:23           ` Nishanth Aravamudan
2005-07-23 10:50       ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Roman Zippel
2005-07-23 11:09         ` Arjan van de Ven
2005-07-23 11:55           ` Roman Zippel
2005-07-23 12:51             ` Arjan van de Ven
2005-07-23 13:04               ` Roman Zippel
2005-07-23 13:12                 ` Arjan van de Ven
2005-07-23 13:29                   ` Roman Zippel
2005-07-23 13:32                     ` Arjan van de Ven
2005-07-23 15:56                       ` Roman Zippel
2005-07-23 16:44                     ` Nishanth Aravamudan
2005-07-23 16:43                 ` Nishanth Aravamudan
2005-07-23 17:17                   ` Roman Zippel
2005-07-23 19:10                     ` Nishanth Aravamudan
2005-07-23 20:12                       ` Roman Zippel
2005-07-27 22:29                         ` Nishanth Aravamudan
2005-07-30 23:35                           ` Roman Zippel
2005-08-01 19:35                             ` [UPDATE PATCH] Add schedule_timeout_{intr,unintr}{,_msecs}() interfaces Nishanth Aravamudan
2005-08-03 14:20                               ` Roman Zippel
2005-08-04  0:51                                 ` [PATCH] push rounding up of relative request to schedule_timeout() Nishanth Aravamudan
2005-08-04  5:14                                   ` [UPDATE PATCH] " Nishanth Aravamudan
2005-08-04 16:45                                     ` George Anzinger
2005-08-04 18:48                                       ` Nish Aravamudan
2005-08-16 23:05                                       ` Nishanth Aravamudan
2005-08-17  0:39                                         ` George Anzinger
2005-08-17  5:56                                           ` Nishanth Aravamudan
2005-08-17 19:51                                             ` George Anzinger
2005-08-17 22:24                                               ` Nishanth Aravamudan
2005-08-04 17:05                                     ` George Anzinger
2005-08-04 18:49                                       ` Nish Aravamudan
2005-08-04  9:38                                   ` [PATCH] " Roman Zippel
2005-08-04 14:33                                     ` Nishanth Aravamudan
2005-08-04 18:59                                       ` Roman Zippel
2005-08-04 19:11                                         ` Nishanth Aravamudan
2005-08-04 23:20                                           ` Roman Zippel
2005-08-04 17:08                                     ` Andrew Morton
2005-08-04 19:00                                       ` [PATCH] add schedule_timeout_{,un}intr() interfaces Nishanth Aravamudan
2005-08-05  7:38                                         ` Andrew Morton
2005-07-23 16:37             ` [PATCH] Add schedule_timeout_{interruptible,uninterruptible}{,_msecs}() interfaces Nishanth Aravamudan
2005-07-23 17:01               ` Roman Zippel
2005-07-23 19:06                 ` Nishanth Aravamudan
2005-07-23 20:22                   ` Roman Zippel
2005-07-23 16:30         ` Nishanth Aravamudan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox