public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* RE: FW: [RFC] A more general timeout specification
@ 2005-08-31 22:11 Perez-Gonzalez, Inaky
  2005-08-31 23:06 ` Christopher Friesen
  0 siblings, 1 reply; 30+ messages in thread
From: Perez-Gonzalez, Inaky @ 2005-08-31 22:11 UTC (permalink / raw)
  To: Christopher Friesen, joe.korty; +Cc: akpm, george, johnstul, linux-kernel

>From: Christopher Friesen [mailto:cfriesen@nortel.com]
>Joe Korty wrote:
>
>> The returned timeout struct has a bit used to mark the value as
absolute.  Thus
>> the caller treats the returned timeout as a opaque cookie that can be
>> reapplied to the next (or more likely, the to-be restarted) timeout.
>
>Okay, endtime is always absolute value of when it should have expired.
>But I think I see a problem with the opaque cookie scheme and repeating
>timeouts.
>
>Suppose I want to wake my application at INTERVAL nanoseconds from now
>on the MONOTONIC clock, then again every INTERVAL nanoseconds after
that.

This API is not intended for your application to use directly, but
for kernel APIs that take sleeps from userspace (like
pthread_mutex_lock()
and friends), so this scenario is not very likely.

Granted, sleep() can be implemented with it too, so...

>How do I do that with this API?
>
>I can get the first sleep.  Suppose I oversleep by X nanoseconds.  I
>wake, and get an opaque timeout back.  How do I ask for the new wake
>time to be "endtime + INTERVAL"?

endtime.ts += INTERVAL

[we all know opaque is relative too] 
Or better, use itimers :)

-- Inaky

^ permalink raw reply	[flat|nested] 30+ messages in thread
* RE: FW: [RFC] A more general timeout specification
@ 2005-09-01 11:59 Perez-Gonzalez, Inaky
  2005-09-01 12:55 ` Roman Zippel
  0 siblings, 1 reply; 30+ messages in thread
From: Perez-Gonzalez, Inaky @ 2005-09-01 11:59 UTC (permalink / raw)
  To: Roman Zippel; +Cc: akpm, joe.korty, george, johnstul, linux-kernel

>From: Roman Zippel [mailto:zippel@linux-m68k.org]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> Hmm, I cannot think of more ways to specify a timeout than how
>> long I want to wait (relative) or until when (absolute) and which
>> is the reference clock. And they don't seem broken to me, common
>> sense, in any case. Do you have any examples?
>
>You still didn't explain what's the point in choosing different clock
>sources for a _timeout_.

The same reasons that compel to have CLOCK_REALTIME or 
CLOCK_MONOTONIC, for example. Or the need to time out on a
high resolution clock. 

A certain application might have a need for a 10ms timeout, 
but another one might have it on 100us--modern CPUs make that
more than possible. The precission of your time source permeates
to the precission of your timeout.

[of course, now at the end it is still kernel time, but the 
ongoing revamp work on timers will change some of that, one
way or another].

>> Different versions of the same function that do relative, absolute.
>> If I keep going that way, the reason becomes:
>>
>> sys_mutex_lock
>> sys_mutex_lock_timed_relative_clock_realtime
>> sys_mutex_lock_timed_absolute_clock_realtime
>> sys_mutex_lock_timed_relative_clock_monotonic
>> sys_mutex_lock_timed_absolute_clock_monotonic
>> sys_mutex_lock_timed_relative_clock_monotonic_highres
>> sys_mutex_lock_timed_absolute_clock_monotonic_highres
>
>Hiding it behind an API makes it better?

It certainly cuts out cruft and to my not-so-trained eye, makes it
cleaner and easier to maintain.

>You didn't answer my other question, let's assume we add such a timeout
>structure, what's wrong with converting it to kernel time (which would
>automatically validate it).

And again, that's what at the end this API is doing, convering it to 
kernel time. 

Give it a more "human" specification (timespec) and gets the job done.
No need to care on how long a jiffy is today in this system, no need
to replicate endlessly the conversion code, which happens to be
non-trivial (for the absolute time case--but still way more trivial 
than userspace asking the kernel for the time, computing a relative 
shift and dealing with the skews that preemption at a Murphy moment 
could cause). 

It is mostly the same as schedule_timeout(), but it takes the sleep
time in a more general format. As every other API, it is designed so 
that the caller doesn't need to care or know about the gory details 
on how it has to be converted.

-- Inaky

^ permalink raw reply	[flat|nested] 30+ messages in thread
* RE: FW: [RFC] A more general timeout specification
@ 2005-09-01  0:00 Perez-Gonzalez, Inaky
  2005-09-01  9:19 ` Roman Zippel
  0 siblings, 1 reply; 30+ messages in thread
From: Perez-Gonzalez, Inaky @ 2005-09-01  0:00 UTC (permalink / raw)
  To: Roman Zippel; +Cc: akpm, joe.korty, george, johnstul, linux-kernel

>From: Roman Zippel [mailto:zippel@linux-m68k.org]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> I cannot produce (top of my head) any other POSIX API calls that
>> allow you to specify another clock source, but they are there,
>> somewhere. If I am to introduce a new API, I better make it
>> flexible enough so that other subsystems can use it for more stuff
>> other than...
>
>So we have to deal at kernel level with every broken timeout
specification
>that comes along?

Hmm, I cannot think of more ways to specify a timeout than how
long I want to wait (relative) or until when (absolute) and which
is the reference clock. And they don't seem broken to me, common
sense, in any case. Do you have any examples?

In any case, like it or not, POSIX is what almost every application
uses to talk to the kernel.

>> ...adding more versions that add complexity and duplicate
>> code in many different places (user-to-kernel copy, syscall entry
>> points, timespec validation). And the minute you add a clock_id
>> you can steal some bits for specifying absolute/relative (or vice
>> versa), so it is almost a win-win situarion.
>
>What "more versions" are you talking about? When you convert a user
time
>to kernel time you can automatically validate it and later you can use
>standard kernel APIs, so you don't have to add even more API bloat.

The versions you were talking about:

>From: Roman Zippel [mailto:zippel@linux-m68k.org]
>...
>Why is not sufficient to just add a relative/absolute version,
>which convert the time at entry to kernel time?

Different versions of the same function that do relative, absolute.
If I keep going that way, the reason becomes:

sys_mutex_lock
sys_mutex_lock_timed_relative_clock_realtime
sys_mutex_lock_timed_absolute_clock_realtime
sys_mutex_lock_timed_relative_clock_monotonic
sys_mutex_lock_timed_absolute_clock_monotonic
sys_mutex_lock_timed_relative_clock_monotonic_highres
sys_mutex_lock_timed_absolute_clock_monotonic_highres

s/mutex_lock/ with whatever system call that takes a timeout you want
and
keep adding combinations. On each of those check for validity of the
__user pointer, copy it, validate the timespec.

[admitedly I am stretching the point with the different clock types].

So where is the problem on unifying all that handling? You are still 
not offering any constructive criticism to solve the issue that now
the syscalls take relative timeouts vs the absolutes we need.

-- Inaky

^ permalink raw reply	[flat|nested] 30+ messages in thread
* RE: FW: [RFC] A more general timeout specification
@ 2005-08-31 23:24 Perez-Gonzalez, Inaky
  2005-08-31 23:50 ` Roman Zippel
  0 siblings, 1 reply; 30+ messages in thread
From: Perez-Gonzalez, Inaky @ 2005-08-31 23:24 UTC (permalink / raw)
  To: Roman Zippel; +Cc: akpm, joe.korty, george, johnstul, linux-kernel

>From: Roman Zippel [mailto:zippel@linux-m68k.org]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> Usefulness: (see the rationale in the patch), but in a nutshell;
>> most POSIX timeout specs have to be absolute in CLOCK_REALTIME
>> (eg: pthread_mutex_timed_lock()). Current kernel needs the timeout
>> relative, so glibc calls the kernel/however gets the time, computes
>> relative times and syscalls. Race conditions, overhead...etc.
>>
>> This mechanism supports both. That's why it is more general.
>
>Your patch basically only mentions fusyn, why does it need multiple
clock
>sources?

I cannot produce (top of my head) any other POSIX API calls that
allow you to specify another clock source, but they are there,
somewhere. If I am to introduce a new API, I better make it 
flexible enough so that other subsystems can use it for more stuff
other than...

>Why is not sufficient to just add a relative/absolute version,
>which convert the time at entry to kernel time?

...adding more versions that add complexity and duplicate
code in many different places (user-to-kernel copy, syscall entry 
points, timespec validation). And the minute you add a clock_id
you can steal some bits for specifying absolute/relative (or vice
versa), so it is almost a win-win situarion.

To summarize: thought about that, but it is fugly and not too practical.


Consider also his allows you to write extensions to POSIX or your
own user-level APIs that could allow (following the fusyn example) 
you to wait on a mutex with a timeout based off a monotonic clock, 
if you need it (or something that makes more sense than this--highres 
comes to mind). 

-- Inaky

^ permalink raw reply	[flat|nested] 30+ messages in thread
* RE: FW: [RFC] A more general timeout specification
@ 2005-08-31 23:17 Perez-Gonzalez, Inaky
  0 siblings, 0 replies; 30+ messages in thread
From: Perez-Gonzalez, Inaky @ 2005-08-31 23:17 UTC (permalink / raw)
  To: Christopher Friesen; +Cc: joe.korty, akpm, george, johnstul, linux-kernel

>From: Christopher Friesen [mailto:cfriesen@nortel.com]
>Perez-Gonzalez, Inaky wrote:
>
>>>I can get the first sleep.  Suppose I oversleep by X nanoseconds.  I
>>>wake, and get an opaque timeout back.  How do I ask for the new wake
>>>time to be "endtime + INTERVAL"?
>>
>>
>> endtime.ts += INTERVAL
>> [we all know opaque is relative too]
>
>Heh. Okay, then what are the rules about what I'm allowed to do with
>endtime?  Joe mentioned there was a bit in there somewhere to denote
>absolute time.

Well, it doesn't really matter. The bit in endtime.clock_id (highest,
AFAIR) says if it is absolute or not, but because adding a relative
value to a value maintains its condition (absolute or relative), it
is not a concern. Just add it.

Unless I am missing something really basic, of course.

>> Or better, use itimers :)
>
>I as actually thinking in terms of implementing itimers on top of your
>new API.

Heh, got me.

-- Inaky

^ permalink raw reply	[flat|nested] 30+ messages in thread
* RE: FW: [RFC] A more general timeout specification
@ 2005-08-31 22:15 Perez-Gonzalez, Inaky
  2005-08-31 23:01 ` Roman Zippel
  0 siblings, 1 reply; 30+ messages in thread
From: Perez-Gonzalez, Inaky @ 2005-08-31 22:15 UTC (permalink / raw)
  To: Roman Zippel; +Cc: akpm, joe.korty, george, johnstul, linux-kernel

>From: Roman Zippel [mailto:zippel@linux-m68k.org]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> +	flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
>> +	clock_id = tp->clock_id & TIMEOUT_CLOCK_MASK;
>> +
>> +	result = -EINVAL;
>> +	if (flags & ~TIMEOUT_RELATIVE)
>> +	    goto out;
>> +
>> +	/* someday, we should support *all* clocks available to us */
>> +	if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
>> +		goto out;
>> +	if ((unsigned long)tp->ts.tv_nsec >= NSEC_PER_SEC)
>> +		goto out;
>
>Why is that needed in a _general_ timeout API? What exactly makes it so
>useful for everyone and not just more complex for everyone?

Because if a system call gets a timeout specification it needs to
verify its correctness first. Instead of doing that at the point
where it goes to sleep, that could be deep in an atomic section,
we provide a separate function [timeout_validate()] which is the
one you mention, to do that.

Usefulness: (see the rationale in the patch), but in a nutshell;
most POSIX timeout specs have to be absolute in CLOCK_REALTIME
(eg: pthread_mutex_timed_lock()). Current kernel needs the timeout
relative, so glibc calls the kernel/however gets the time, computes
relative times and syscalls. Race conditions, overhead...etc. 

This mechanism supports both. That's why it is more general.

-- Inaky

^ permalink raw reply	[flat|nested] 30+ messages in thread
* FW: [RFC] A more general timeout specification
@ 2005-08-31 20:55 Perez-Gonzalez, Inaky
  2005-08-31 21:15 ` Joe Korty
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Perez-Gonzalez, Inaky @ 2005-08-31 20:55 UTC (permalink / raw)
  To: akpm; +Cc: joe.korty, george, johnstul, linux-kernel



Hi Andrew

I would like to ask for the inclusion of the general timeout
specification API in the -mm tree, looking forward it to bubble
up to mainline.

For more context, please visit:

http://groups.google.com/group/linux.kernel/browse_frm/thread/a426e5b0d0
f17860/b53000aa01304c4a?tvc=1&q=a+more+general+timeout+specification#b53
000aa01304c4a

This was developed by Joe Korty <joe.korty@ccur.com>, greatly 
enhancing something I had done before, so I am signing it out 
(although Joe should too, Joe?).

--

The fusyn (robust mutexes) project proposes the creation
of a more general data structure, 'struct timeout', for the
specification of timeouts in new services.  In this structure,
the user specifies:

    a time, in timespec format.
    the clock the time is specified against (eg, CLOCK_MONOTONIC).
    whether the time is absolute, or relative to 'now'.

That is, all combinations of useful timeout attributes become
possible.

Also proposed are two new kernel routines for the manipulation
of timeouts:

	timeout_validate()
	timeout_sleep()

timeout_validate() error-checks the syntax of a timeout
argument and returns either zero or -EINVAL.  By breaking
timeout_validate() out from timeout_sleep(), it becomes possible
to error check the timeout 'far away' from the places in the
code where we would actually do the timeout, as well as being
able to perform such checks only at those places we know the
timeout specification is coming from an unsafe source.

timeout_sleep() puts the caller to sleep until the
specified end time is in the past, as measured against
the given clock, or until the caller is awakened by other
means (such as wake_up_process()).  Like schedule_timeout(),
TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE must be set ahead
of time; if TASK_INTERRUPTIBLE is set then signals will also
break the caller out of the sleep.

timeout_sleep() returns either 0 (returned early) or -ETIMEDOUT
(returned due to timeout).  It is up to the caller to resolve,
in the "returned early" case, why it returned early.

Timeout_sleep has a return argument, endtime, which is also in
'struct timeout' format.  If the input time was relative, then
it is converted to absolute and returned through this argument.
This can be used when an early-terminated service must be
restarted and side effects of the early termination-n-restart
(such as end time drift) are to be avoided.

Joe
"Money can buy bandwidth, but latency is forever" -- John Mashey

Signed-off-by: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>

 2.6.12-rc4-jak/include/linux/time.h    |    6 +
 2.6.12-rc4-jak/include/linux/timeout.h |   48 ++++++++
 2.6.12-rc4-jak/kernel/posix-timers.c   |    7 +
 2.6.12-rc4-jak/kernel/timer.c          |  184
+++++++++++++++++++++++++++++++++
 4 files changed, 245 insertions(+)

diff -puNa include/linux/time.h~a.more.flexible.timeout.approach
include/linux/time.h
--- 2.6.12-rc4/include/linux/time.h~a.more.flexible.timeout.approach
2005-05-18 13:53:14.204417169 -0400
+++ 2.6.12-rc4-jak/include/linux/time.h	2005-05-18 13:53:14.212416002
-0400
@@ -25,6 +25,8 @@ struct timezone {
 	int	tz_dsttime;	/* type of dst correction */
 };
 
+#include <linux/timeout.h>
+
 #ifdef __KERNEL__
 
 /* Parameters used to convert the timespec values */
@@ -103,6 +105,10 @@ struct itimerval;
 extern int do_setitimer(int which, struct itimerval *value, struct
itimerval *ovalue);
 extern int do_getitimer(int which, struct itimerval *value);
 extern void getnstimeofday (struct timespec *tv);
+extern long clock_gettime(int which, struct timespec *tp);
+
+extern int FASTCALL(abs_timespec_to_abs_jiffies (clockid_t clock, const
struct timespec *tp, unsigned long *jp));
+extern int FASTCALL(rel_to_abs_timespec(clockid_t clock, const struct
timespec *tsrel, struct timespec *tsabs));
 
 extern struct timespec timespec_trunc(struct timespec t, unsigned
gran);
 
diff -puNa /dev/null include/linux/timeout.h
--- /dev/null	2004-06-24 14:04:38.000000000 -0400
+++ 2.6.12-rc4-jak/include/linux/timeout.h	2005-05-18
13:53:14.212416002 -0400
@@ -0,0 +1,48 @@
+/*
+ * Extended timeout specification
+ *
+ * (C) 2002-2005 Intel Corp
+ * Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>.
+ *
+ * Licensed under the FSF's GNU Public License v2 or later.
+ *
+ * Generic extended timeout specification.  Broken out by Joe Korty
+ * <joe.korty@ccur.com> from linux/time.h so that it can be included
+ * by userspace applications in conjunction with #include "time.h".
+ */
+
+#ifndef _LINUX_TIMEOUT_H
+#define _LINUX_TIMEOUT_H
+
+/* 'struct timeout' flag values.  OR these into clock_id along with
+ * a clock specification such as CLOCK_REALTIME or CLOCK_MONOTONIC.
+ */
+enum {
+	TIMEOUT_RELATIVE   = 0x10000000,	/* relative timeout */
+
+	TIMEOUT_FLAGS_MASK = 0xf0000000,	/* flags mask for
clock_id */
+	TIMEOUT_CLOCK_MASK = 0x0fffffff,	/* clock mask for
clock_id */
+};
+
+/* Magic values a 'struct timeout' pointer can have */
+
+#define TIMEOUT_MAX	((struct timeout *) ~0UL) /* never time out */
+#define TIMEOUT_NONE	((struct timeout *) 0UL)  /* time out
immediately */
+
+/**
+ * struct timeout - general timeout specification
+ *
+ * @clock_id: which clock source to use ORed with flags describing use.
+ * @ts:       timespec for the timeout
+ */
+struct timeout {
+	clockid_t clock_id;
+	struct timespec ts;
+};
+
+#ifdef __KERNEL__
+extern int FASTCALL(timeout_validate (const struct timeout *));
+extern int FASTCALL(timeout_sleep (const struct timeout *, struct
timeout *));
+#endif
+
+#endif
diff -puNa kernel/posix-timers.c~a.more.flexible.timeout.approach
kernel/posix-timers.c
--- 2.6.12-rc4/kernel/posix-timers.c~a.more.flexible.timeout.approach
2005-05-18 13:53:14.207416731 -0400
+++ 2.6.12-rc4-jak/kernel/posix-timers.c	2005-05-18
13:53:14.214415710 -0400
@@ -1288,6 +1288,13 @@ sys_clock_settime(clockid_t which_clock,
 	return CLOCK_DISPATCH(which_clock, clock_set, (which_clock,
&new_tp));
 }
 
+long clock_gettime(clockid_t which_clock, struct timespec *tp)
+{
+	if (invalid_clockid(which_clock))
+		return -EINVAL;
+	return CLOCK_DISPATCH(which_clock, clock_get, (which_clock,
tp));
+}
+
 asmlinkage long
 sys_clock_gettime(clockid_t which_clock, struct timespec __user *tp)
 {
diff -puNa kernel/timer.c~a.more.flexible.timeout.approach
kernel/timer.c
--- 2.6.12-rc4/kernel/timer.c~a.more.flexible.timeout.approach
2005-05-18 13:53:14.209416440 -0400
+++ 2.6.12-rc4-jak/kernel/timer.c	2005-05-18 15:27:06.363710594
-0400
@@ -1129,6 +1129,190 @@ fastcall signed long __sched schedule_ti
 
 EXPORT_SYMBOL(schedule_timeout);
 
+/**
+ * timeout_validate - verify that a timeout specification is
self-consistant.
+ * @tp - pointer to the 'struct timeout' to verify
+ * @returns - 0 if no error, <0 errno code if there was
+ */
+fastcall int timeout_validate(const struct timeout *tp)
+{
+	int result;
+	unsigned flags, clock_id;
+
+	result = 0;
+	if (tp == TIMEOUT_MAX || tp == TIMEOUT_NONE)
+		goto out;
+
+	flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
+	clock_id = tp->clock_id & TIMEOUT_CLOCK_MASK;
+
+	result = -EINVAL;
+	if (flags & ~TIMEOUT_RELATIVE)
+	    goto out;
+
+	/* someday, we should support *all* clocks available to us */
+	if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
+		goto out;
+	if ((unsigned long)tp->ts.tv_nsec >= NSEC_PER_SEC)
+		goto out;
+	result = 0;
+out:
+	return result;
+}
+EXPORT_SYMBOL_GPL(timeout_validate);
+
+/** rel_to_abs_timespec - convert a relative timespec into an absolute
timespec
+ *  @clock - the system clock the timespecs are specified in.
+ *  @tsrel - a pointer to the relative timespec to be converted.
+ *  @tsabs - a pointer to where the absolute timespec is to be stored.
+ *  @returns - 0 if converted, <0 if error.
+ */
+fastcall int rel_to_abs_timespec(clockid_t clock, const struct timespec
*tsrel,
+				struct timespec *tsabs)
+{
+	int result;
+	struct timespec now;
+
+	result = clock_gettime(clock, &now);
+	if (result >= 0) {
+		set_normalized_timespec(tsabs, now.tv_sec +
tsrel->tv_sec,
+				now.tv_nsec + tsrel->tv_nsec);
+	}
+	return result;
+}
+EXPORT_SYMBOL_GPL(rel_to_abs_timespec);
+
+/**
+ * abs_timespec_to_abs_jiffies - convert an absolute timespec into
+ * absolute jiffies.
+ *
+ * @clock - select which clock @tp is specified against.
CLOCK_MONOTONIC and
+ *     CLOCK_REALTIME are two possibilities.
+ * @tp - absolute timespec that is to be converted to jiffies.
+ * @jp - absolute jiffies returned through this pointer.
+ * @returns -  0 if converted, <0 if error.
+ */
+fastcall int abs_timespec_to_abs_jiffies (clockid_t clock,
+			const struct timespec *tp, unsigned long *jp)
+{
+	int result;
+	unsigned long jiffies_abs, seq;
+	struct timespec oc, now;
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		result = clock_gettime(clock, &now);
+		if (result < 0)
+			return result;
+		jiffies_abs = jiffies;
+	} while (read_seqretry(&xtime_lock, seq));
+
+	set_normalized_timespec(&oc, tp->tv_sec - now.tv_sec,
+				tp->tv_nsec - now.tv_nsec);
+
+	if (oc.tv_sec > 0 || (oc.tv_sec == 0 && oc.tv_nsec > 0))
+		jiffies_abs += timespec_to_jiffies(&oc) + 1;
+	*jp = jiffies_abs;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(abs_timespec_to_abs_jiffies);
+
+/* Helper function.  Handles the proper setting of @endtime when
+ * an @endtime needs to be returned.  Returns an approximation for
+ * the ending time in jiffies.  The approximation is guaranteed to
+ * be on or just past the true ending time.
+ */
+static inline
+unsigned long __timeout_sleep_timespec (const struct timeout *timeout,
+				struct timeout *endtime)
+{
+	unsigned long jiffies_abs;
+	struct timespec end;
+	clockid_t clock_id = timeout->clock_id & TIMEOUT_CLOCK_MASK;
+
+	if (timeout->clock_id & TIMEOUT_RELATIVE) {
+		rel_to_abs_timespec(clock_id, &timeout->ts, &end);
+		if (endtime) {
+			endtime->clock_id = clock_id;
+			endtime->ts = end;
+		}
+	}
+	else
+		end = timeout->ts;
+
+	abs_timespec_to_abs_jiffies(clock_id, &end, &jiffies_abs);
+	return jiffies_abs;
+}
+
+/**
+ * timeout_sleep - sleep until woken up, interrupted, or the
+ *		   designated wakeup time is past.
+ * @timeout:
+ *    time to wait.  const struct timeout pointer.  May take
+ *    on the special values TIMEOUT_MAX (never time out) or
+ *    TIMEOUT_NONE (times out instantly).  @timeout must be
+ *    error-checked by timeout_validate() beforehand.
+ * @endtime:
+ *    time remaining.  A 'struct timeout' value, holding the
+ *    end time of the period in absolute format and specified
+ *    against the same clock that @timeout was specified against,
+ *    is returned through this pointer.  No value is returned
+ *    if @endtime is NULL, if @timeout is TIMEOUT_MAX or
+ *    TIMEOUT_NONE, or if @timeout is itself specified in
+ *    absolute time.
+ * @returns:
+ *    -ETIMEDOUT if awakened due to timeout, 0 otherwise.
+ */
+fastcall int timeout_sleep (const struct timeout *timeout, struct
timeout *endtime)
+{
+	unsigned long jiffies_endtime;
+	struct timer_list timer;
+	int result;
+
+	if (timeout == TIMEOUT_NONE)
+		goto simulate_timeout;
+
+	if (timeout == TIMEOUT_MAX)
+		goto never_timeout;
+
+	if (timeout->ts.tv_sec  == MAX_SCHEDULE_TIMEOUT)
+		goto never_timeout;
+
+	jiffies_endtime = __timeout_sleep_timespec(timeout, endtime);
+
+	if (time_after_eq(jiffies, jiffies_endtime))
+		goto simulate_timeout;
+
+	init_timer(&timer);
+	timer.expires = jiffies_endtime;
+	timer.data = (unsigned long)current;
+	timer.function = process_timeout;
+	add_timer(&timer);
+	schedule();
+	result = -ETIMEDOUT;
+	if (del_singleshot_timer_sync (&timer))
+		result = 0;
+	goto out;
+
+	/*
+	 * simulate a timeout without actually expiring a timer
+	 */
+simulate_timeout:
+	process_timeout((unsigned long)current);
+	result = -ETIMEDOUT;
+	goto out;
+
+	/*
+	 * Sleep without a timeout scheduled.
+	 */
+never_timeout:
+	schedule();
+	result = 0;
+out:
+	return result;
+}
+EXPORT_SYMBOL_GPL (timeout_sleep);
+
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {

-- 

Inaky


^ permalink raw reply	[flat|nested] 30+ messages in thread
* RE: FW: [RFC] A more general timeout specification
@ 2005-08-23  1:13 Perez-Gonzalez, Inaky
  0 siblings, 0 replies; 30+ messages in thread
From: Perez-Gonzalez, Inaky @ 2005-08-23  1:13 UTC (permalink / raw)
  To: john stultz, Inaky Perez-Gonzalez
  Cc: Nishanth Aravamudan, joe.korty, george, linux-kernel

Hi John

>From: john stultz [mailto:johnstul@us.ibm.com]
>On Thu, 2005-07-28 at 18:52 -0700, Inaky Perez-Gonzalez wrote:
>> The main user of this new inteface is to allow system calls to get
>> time specified in an absolute form (as most of POSIX states) and thus
>> avoid extra time conversion work.
...
>>
http://groups-beta.google.com/groups?q=a+more+general+timeout+specificat
ion
...
>>
>> timeout_validate() error-checks the syntax of a timeout
>> argument and returns either zero or -EINVAL.  By breaking
>> timeout_validate() out from timeout_sleep(), it becomes possible
>> to error check the timeout 'far away' from the places in the
>> code where we would actually do the timeout, as well as being
>> able to perform such checks only at those places we know the
>> timeout specification is coming from an unsafe source.
>
>using gettimeofday() so that part looks good. I'm not completely sold
on
>why the validate interface is needed, but I didn't hear any objections
>from George, so I'd defer to those who deal more with those interfaces.

_validate() is mostly needed when we take a timeout specification from
user space (timeouts from kernel space are supposed to be ok). We need
to validate that the clock id passed is correct (existant), that
the 'struct timespec' is also legal (eg: nsec < 1000M), that the flags 
are ok (relative/absolute), etc...

The idea is that in your code that uses this, once you copy the 'struct 
timeout' from  user space you check it for validity. Then you can 
dive into any kind of code (atomic, sleep paths, whatever) without
having
to code an error path from deeep down for when the user passed a bad 
timeout.

It sure makes it more simple :)

-- Inaky

^ permalink raw reply	[flat|nested] 30+ messages in thread
* [RFC] A more general timeout specification
@ 2005-05-18 20:15 Joe Korty
  2005-07-29  1:52 ` FW: " Inaky Perez-Gonzalez
  0 siblings, 1 reply; 30+ messages in thread
From: Joe Korty @ 2005-05-18 20:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: robustmutexes, george

[ for comment only ]

The fusyn (robust mutexes) project proposes the creation
of a more general data structure, 'struct timeout', for the
specification of timeouts in new services.  In this structure,
the user specifies:

    a time, in timespec format.
    the clock the time is specified against (eg, CLOCK_MONOTONIC).
    whether the time is absolute, or relative to 'now'.

That is, all combinations of useful timeout attributes become
possible.

Also proposed are two new kernel routines for the manipulation
of timeouts:

	timeout_validate()
	timeout_sleep()

timeout_validate() error-checks the syntax of a timeout
argument and returns either zero or -EINVAL.  By breaking
timeout_validate() out from timeout_sleep(), it becomes possible
to error check the timeout 'far away' from the places in the
code where we would actually do the timeout, as well as being
able to perform such checks only at those places we know the
timeout specification is coming from an unsafe source.

timeout_sleep() puts the caller to sleep until the
specified end time is in the past, as measured against
the given clock, or until the caller is awakened by other
means (such as wake_up_process()).  Like schedule_timeout(),
TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE must be set ahead
of time; if TASK_INTERRUPTIBLE is set then signals will also
break the caller out of the sleep.

timeout_sleep() returns either 0 (returned early) or -ETIMEDOUT
(returned due to timeout).  It is up to the caller to resolve,
in the "returned early" case, why it returned early.

Timeout_sleep has a return argument, endtime, which is also in
'struct timeout' format.  If the input time was relative, then
it is converted to absolute and returned through this argument.
This can be used when an early-terminated service must be
restarted and side effects of the early termination-n-restart
(such as end time drift) are to be avoided.

Joe
"Money can buy bandwidth, but latency is forever" -- John Mashey


 2.6.12-rc4-jak/include/linux/time.h    |    6 +
 2.6.12-rc4-jak/include/linux/timeout.h |   48 ++++++++
 2.6.12-rc4-jak/kernel/posix-timers.c   |    7 +
 2.6.12-rc4-jak/kernel/timer.c          |  184 +++++++++++++++++++++++++++++++++
 4 files changed, 245 insertions(+)

diff -puNa include/linux/time.h~a.more.flexible.timeout.approach include/linux/time.h
--- 2.6.12-rc4/include/linux/time.h~a.more.flexible.timeout.approach	2005-05-18 13:53:14.204417169 -0400
+++ 2.6.12-rc4-jak/include/linux/time.h	2005-05-18 13:53:14.212416002 -0400
@@ -25,6 +25,8 @@ struct timezone {
 	int	tz_dsttime;	/* type of dst correction */
 };
 
+#include <linux/timeout.h>
+
 #ifdef __KERNEL__
 
 /* Parameters used to convert the timespec values */
@@ -103,6 +105,10 @@ struct itimerval;
 extern int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue);
 extern int do_getitimer(int which, struct itimerval *value);
 extern void getnstimeofday (struct timespec *tv);
+extern long clock_gettime(int which, struct timespec *tp);
+
+extern int FASTCALL(abs_timespec_to_abs_jiffies (clockid_t clock, const struct timespec *tp, unsigned long *jp));
+extern int FASTCALL(rel_to_abs_timespec(clockid_t clock, const struct timespec *tsrel, struct timespec *tsabs));
 
 extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
 
diff -puNa /dev/null include/linux/timeout.h
--- /dev/null	2004-06-24 14:04:38.000000000 -0400
+++ 2.6.12-rc4-jak/include/linux/timeout.h	2005-05-18 13:53:14.212416002 -0400
@@ -0,0 +1,48 @@
+/*
+ * Extended timeout specification
+ *
+ * (C) 2002-2005 Intel Corp
+ * Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>.
+ *
+ * Licensed under the FSF's GNU Public License v2 or later.
+ *
+ * Generic extended timeout specification.  Broken out by Joe Korty
+ * <joe.korty@ccur.com> from linux/time.h so that it can be included
+ * by userspace applications in conjunction with #include "time.h".
+ */
+
+#ifndef _LINUX_TIMEOUT_H
+#define _LINUX_TIMEOUT_H
+
+/* 'struct timeout' flag values.  OR these into clock_id along with
+ * a clock specification such as CLOCK_REALTIME or CLOCK_MONOTONIC.
+ */
+enum {
+	TIMEOUT_RELATIVE   = 0x10000000,	/* relative timeout */
+
+	TIMEOUT_FLAGS_MASK = 0xf0000000,	/* flags mask for clock_id */
+	TIMEOUT_CLOCK_MASK = 0x0fffffff,	/* clock mask for clock_id */
+};
+
+/* Magic values a 'struct timeout' pointer can have */
+
+#define TIMEOUT_MAX	((struct timeout *) ~0UL) /* never time out */
+#define TIMEOUT_NONE	((struct timeout *) 0UL)  /* time out immediately */
+
+/**
+ * struct timeout - general timeout specification
+ *
+ * @clock_id: which clock source to use ORed with flags describing use.
+ * @ts:       timespec for the timeout
+ */
+struct timeout {
+	clockid_t clock_id;
+	struct timespec ts;
+};
+
+#ifdef __KERNEL__
+extern int FASTCALL(timeout_validate (const struct timeout *));
+extern int FASTCALL(timeout_sleep (const struct timeout *, struct timeout *));
+#endif
+
+#endif
diff -puNa kernel/posix-timers.c~a.more.flexible.timeout.approach kernel/posix-timers.c
--- 2.6.12-rc4/kernel/posix-timers.c~a.more.flexible.timeout.approach	2005-05-18 13:53:14.207416731 -0400
+++ 2.6.12-rc4-jak/kernel/posix-timers.c	2005-05-18 13:53:14.214415710 -0400
@@ -1288,6 +1288,13 @@ sys_clock_settime(clockid_t which_clock,
 	return CLOCK_DISPATCH(which_clock, clock_set, (which_clock, &new_tp));
 }
 
+long clock_gettime(clockid_t which_clock, struct timespec *tp)
+{
+	if (invalid_clockid(which_clock))
+		return -EINVAL;
+	return CLOCK_DISPATCH(which_clock, clock_get, (which_clock, tp));
+}
+
 asmlinkage long
 sys_clock_gettime(clockid_t which_clock, struct timespec __user *tp)
 {
diff -puNa kernel/timer.c~a.more.flexible.timeout.approach kernel/timer.c
--- 2.6.12-rc4/kernel/timer.c~a.more.flexible.timeout.approach	2005-05-18 13:53:14.209416440 -0400
+++ 2.6.12-rc4-jak/kernel/timer.c	2005-05-18 15:27:06.363710594 -0400
@@ -1129,6 +1129,190 @@ fastcall signed long __sched schedule_ti
 
 EXPORT_SYMBOL(schedule_timeout);
 
+/**
+ * timeout_validate - verify that a timeout specification is self-consistant.
+ * @tp - pointer to the 'struct timeout' to verify
+ * @returns - 0 if no error, <0 errno code if there was
+ */
+fastcall int timeout_validate(const struct timeout *tp)
+{
+	int result;
+	unsigned flags, clock_id;
+
+	result = 0;
+	if (tp == TIMEOUT_MAX || tp == TIMEOUT_NONE)
+		goto out;
+
+	flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
+	clock_id = tp->clock_id & TIMEOUT_CLOCK_MASK;
+
+	result = -EINVAL;
+	if (flags & ~TIMEOUT_RELATIVE)
+	    goto out;
+
+	/* someday, we should support *all* clocks available to us */
+	if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
+		goto out;
+	if ((unsigned long)tp->ts.tv_nsec >= NSEC_PER_SEC)
+		goto out;
+	result = 0;
+out:
+	return result;
+}
+EXPORT_SYMBOL_GPL(timeout_validate);
+
+/** rel_to_abs_timespec - convert a relative timespec into an absolute timespec
+ *  @clock - the system clock the timespecs are specified in.
+ *  @tsrel - a pointer to the relative timespec to be converted.
+ *  @tsabs - a pointer to where the absolute timespec is to be stored.
+ *  @returns - 0 if converted, <0 if error.
+ */
+fastcall int rel_to_abs_timespec(clockid_t clock, const struct timespec *tsrel,
+				struct timespec *tsabs)
+{
+	int result;
+	struct timespec now;
+
+	result = clock_gettime(clock, &now);
+	if (result >= 0) {
+		set_normalized_timespec(tsabs, now.tv_sec + tsrel->tv_sec,
+				now.tv_nsec + tsrel->tv_nsec);
+	}
+	return result;
+}
+EXPORT_SYMBOL_GPL(rel_to_abs_timespec);
+
+/**
+ * abs_timespec_to_abs_jiffies - convert an absolute timespec into
+ * absolute jiffies.
+ *
+ * @clock - select which clock @tp is specified against.  CLOCK_MONOTONIC and
+ *     CLOCK_REALTIME are two possibilities.
+ * @tp - absolute timespec that is to be converted to jiffies.
+ * @jp - absolute jiffies returned through this pointer.
+ * @returns -  0 if converted, <0 if error.
+ */
+fastcall int abs_timespec_to_abs_jiffies (clockid_t clock,
+			const struct timespec *tp, unsigned long *jp)
+{
+	int result;
+	unsigned long jiffies_abs, seq;
+	struct timespec oc, now;
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		result = clock_gettime(clock, &now);
+		if (result < 0)
+			return result;
+		jiffies_abs = jiffies;
+	} while (read_seqretry(&xtime_lock, seq));
+
+	set_normalized_timespec(&oc, tp->tv_sec - now.tv_sec,
+				tp->tv_nsec - now.tv_nsec);
+
+	if (oc.tv_sec > 0 || (oc.tv_sec == 0 && oc.tv_nsec > 0))
+		jiffies_abs += timespec_to_jiffies(&oc) + 1;
+	*jp = jiffies_abs;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(abs_timespec_to_abs_jiffies);
+
+/* Helper function.  Handles the proper setting of @endtime when
+ * an @endtime needs to be returned.  Returns an approximation for
+ * the ending time in jiffies.  The approximation is guaranteed to
+ * be on or just past the true ending time.
+ */
+static inline
+unsigned long __timeout_sleep_timespec (const struct timeout *timeout,
+				struct timeout *endtime)
+{
+	unsigned long jiffies_abs;
+	struct timespec end;
+	clockid_t clock_id = timeout->clock_id & TIMEOUT_CLOCK_MASK;
+
+	if (timeout->clock_id & TIMEOUT_RELATIVE) {
+		rel_to_abs_timespec(clock_id, &timeout->ts, &end);
+		if (endtime) {
+			endtime->clock_id = clock_id;
+			endtime->ts = end;
+		}
+	}
+	else
+		end = timeout->ts;
+
+	abs_timespec_to_abs_jiffies(clock_id, &end, &jiffies_abs);
+	return jiffies_abs;
+}
+
+/**
+ * timeout_sleep - sleep until woken up, interrupted, or the
+ *		   designated wakeup time is past.
+ * @timeout:
+ *    time to wait.  const struct timeout pointer.  May take
+ *    on the special values TIMEOUT_MAX (never time out) or
+ *    TIMEOUT_NONE (times out instantly).  @timeout must be
+ *    error-checked by timeout_validate() beforehand.
+ * @endtime:
+ *    time remaining.  A 'struct timeout' value, holding the
+ *    end time of the period in absolute format and specified
+ *    against the same clock that @timeout was specified against,
+ *    is returned through this pointer.  No value is returned
+ *    if @endtime is NULL, if @timeout is TIMEOUT_MAX or
+ *    TIMEOUT_NONE, or if @timeout is itself specified in
+ *    absolute time.
+ * @returns:
+ *    -ETIMEDOUT if awakened due to timeout, 0 otherwise.
+ */
+fastcall int timeout_sleep (const struct timeout *timeout, struct timeout *endtime)
+{
+	unsigned long jiffies_endtime;
+	struct timer_list timer;
+	int result;
+
+	if (timeout == TIMEOUT_NONE)
+		goto simulate_timeout;
+
+	if (timeout == TIMEOUT_MAX)
+		goto never_timeout;
+
+	if (timeout->ts.tv_sec  == MAX_SCHEDULE_TIMEOUT)
+		goto never_timeout;
+
+	jiffies_endtime = __timeout_sleep_timespec(timeout, endtime);
+
+	if (time_after_eq(jiffies, jiffies_endtime))
+		goto simulate_timeout;
+
+	init_timer(&timer);
+	timer.expires = jiffies_endtime;
+	timer.data = (unsigned long)current;
+	timer.function = process_timeout;
+	add_timer(&timer);
+	schedule();
+	result = -ETIMEDOUT;
+	if (del_singleshot_timer_sync (&timer))
+		result = 0;
+	goto out;
+
+	/*
+	 * simulate a timeout without actually expiring a timer
+	 */
+simulate_timeout:
+	process_timeout((unsigned long)current);
+	result = -ETIMEDOUT;
+	goto out;
+
+	/*
+	 * Sleep without a timeout scheduled.
+	 */
+never_timeout:
+	schedule();
+	result = 0;
+out:
+	return result;
+}
+EXPORT_SYMBOL_GPL (timeout_sleep);
+
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {

_

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2005-09-01 15:48 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-08-31 22:11 FW: [RFC] A more general timeout specification Perez-Gonzalez, Inaky
2005-08-31 23:06 ` Christopher Friesen
  -- strict thread matches above, loose matches on Subject: below --
2005-09-01 11:59 Perez-Gonzalez, Inaky
2005-09-01 12:55 ` Roman Zippel
2005-09-01  0:00 Perez-Gonzalez, Inaky
2005-09-01  9:19 ` Roman Zippel
2005-09-01 13:48   ` Joe Korty
2005-09-01 15:18     ` Roman Zippel
2005-08-31 23:24 Perez-Gonzalez, Inaky
2005-08-31 23:50 ` Roman Zippel
2005-09-01  0:08   ` Daniel Walker
2005-09-01  9:22     ` Roman Zippel
2005-09-01 13:53       ` Joe Korty
2005-09-01 13:50   ` Joe Korty
2005-09-01 14:32     ` Roman Zippel
2005-09-01 15:10       ` Daniel Walker
2005-09-01 15:11       ` Daniel Walker
2005-09-01 15:47       ` Joe Korty
2005-08-31 23:17 Perez-Gonzalez, Inaky
2005-08-31 22:15 Perez-Gonzalez, Inaky
2005-08-31 23:01 ` Roman Zippel
2005-08-31 20:55 Perez-Gonzalez, Inaky
2005-08-31 21:15 ` Joe Korty
2005-08-31 21:20 ` Christopher Friesen
2005-08-31 21:34   ` Joe Korty
2005-08-31 22:06     ` Christopher Friesen
2005-08-31 22:10 ` Roman Zippel
2005-08-23  1:13 Perez-Gonzalez, Inaky
2005-05-18 20:15 Joe Korty
2005-07-29  1:52 ` FW: " Inaky Perez-Gonzalez
2005-08-22 22:56   ` john stultz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox