* [PATCH/RFC] msleep() with hrtimers
@ 2007-07-15 22:42 Jonathan Corbet
2007-07-15 23:10 ` Arnd Bergmann
` (3 more replies)
0 siblings, 4 replies; 24+ messages in thread
From: Jonathan Corbet @ 2007-07-15 22:42 UTC (permalink / raw)
To: linux-kernel; +Cc: Thomas Gleixner, Ingo Molnar
The OLPC folks and I recently discovered something interesting: on a
HZ=100 system, a call to msleep(1) will delay for about 20ms. The
combination of jiffies timekeeping and rounding up means that the
minimum delay from msleep will be two jiffies, never less. That led to
multi-second delays in a driver which does a bunch of short msleep()
calls and, in response, a change to mdelay(), which will come back in
something closer to the requested time.
Here's another approach: a reimplementation of msleep() and
msleep_interruptible() using hrtimers. On a system without real
hrtimers this code will at least drop down to single-jiffy delays much
of the time (though not deterministically so). On my x86_64 system with
Thomas's hrtimer/dyntick patch applied, msleep(1) gives almost exactly
what was asked for.
jon
Use hrtimers for msleep() and msleep_interruptible()
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
--- 2.6.22-vanilla/kernel/timer.c 2007-07-15 13:08:36.000000000 -0600
+++ 2.6.22-timer/kernel/timer.c 2007-07-15 16:13:34.000000000 -0600
@@ -1522,18 +1522,43 @@ unregister_time_interpolator(struct time
}
#endif /* CONFIG_TIME_INTERPOLATION */
+
+
+
+static void do_msleep(unsigned int msecs, struct hrtimer_sleeper *sleeper,
+ int sigs)
+{
+ enum hrtimer_mode mode = HRTIMER_MODE_REL;
+ int state = sigs ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
+
+ /*
+ * This is really just a reworked and simplified version
+ * of do_nanosleep().
+ */
+ hrtimer_init(&sleeper->timer, CLOCK_MONOTONIC, mode);
+ sleeper->timer.expires = ktime_set(0, msecs*NSEC_PER_MSEC);
+ hrtimer_init_sleeper(sleeper, current);
+
+ do {
+ set_current_state(state);
+ hrtimer_start(&sleeper->timer, sleeper->timer.expires, mode);
+ if (sleeper->task)
+ schedule();
+ hrtimer_cancel(&sleeper->timer);
+ mode = HRTIMER_MODE_ABS;
+ } while (sleeper->task && !(sigs && signal_pending(current)));
+}
+
/**
* msleep - sleep safely even with waitqueue interruptions
* @msecs: Time in milliseconds to sleep for
*/
void msleep(unsigned int msecs)
{
- unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+ struct hrtimer_sleeper sleeper;
- while (timeout)
- timeout = schedule_timeout_uninterruptible(timeout);
+ do_msleep(msecs, &sleeper, 0);
}
-
EXPORT_SYMBOL(msleep);
/**
@@ -1542,11 +1567,15 @@ EXPORT_SYMBOL(msleep);
*/
unsigned long msleep_interruptible(unsigned int msecs)
{
- unsigned long timeout = msecs_to_jiffies(msecs) + 1;
+ struct hrtimer_sleeper sleeper;
+ ktime_t left;
- while (timeout && !signal_pending(current))
- timeout = schedule_timeout_interruptible(timeout);
- return jiffies_to_msecs(timeout);
-}
+ do_msleep(msecs, &sleeper, 1);
+ if (! sleeper.task)
+ return 0;
+ left = ktime_sub(sleeper.timer.expires,
+ sleeper.timer.base->get_time());
+ return max(((long) ktime_to_ns(left))/NSEC_PER_MSEC, 1L);
+}
EXPORT_SYMBOL(msleep_interruptible);
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-15 22:42 [PATCH/RFC] msleep() with hrtimers Jonathan Corbet
@ 2007-07-15 23:10 ` Arnd Bergmann
2007-07-16 10:48 ` Ingo Molnar
` (2 subsequent siblings)
3 siblings, 0 replies; 24+ messages in thread
From: Arnd Bergmann @ 2007-07-15 23:10 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: linux-kernel, Thomas Gleixner, Ingo Molnar
On Monday 16 July 2007, Jonathan Corbet wrote:
> Here's another approach: a reimplementation of msleep() and
> msleep_interruptible() using hrtimers. On a system without real
> hrtimers this code will at least drop down to single-jiffy delays much
> of the time (though not deterministically so). On my x86_64 system with
> Thomas's hrtimer/dyntick patch applied, msleep(1) gives almost exactly
> what was asked for.
This reminds me of a patch I did some time ago, without ever getting
any feedback on it:
http://lkml.org/lkml/2007/3/4/165
In addition to just msleep, it converted all users of schedule_timeout()
as well as sys_*select(). I actually ran with that patch on my
main worstation for a few weeks, and while it did not crash, I
saw some strange behaviour after all.
Arnd <><
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-15 22:42 [PATCH/RFC] msleep() with hrtimers Jonathan Corbet
2007-07-15 23:10 ` Arnd Bergmann
@ 2007-07-16 10:48 ` Ingo Molnar
2007-07-16 11:39 ` Roman Zippel
2007-07-16 14:11 ` Roman Zippel
3 siblings, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2007-07-16 10:48 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: linux-kernel, Thomas Gleixner
* Jonathan Corbet <corbet@lwn.net> wrote:
> The OLPC folks and I recently discovered something interesting: on a
> HZ=100 system, a call to msleep(1) will delay for about 20ms. The
> combination of jiffies timekeeping and rounding up means that the
> minimum delay from msleep will be two jiffies, never less. That led
> to multi-second delays in a driver which does a bunch of short
> msleep() calls and, in response, a change to mdelay(), which will come
> back in something closer to the requested time.
>
> Here's another approach: a reimplementation of msleep() and
> msleep_interruptible() using hrtimers. On a system without real
> hrtimers this code will at least drop down to single-jiffy delays much
> of the time (though not deterministically so). On my x86_64 system
> with Thomas's hrtimer/dyntick patch applied, msleep(1) gives almost
> exactly what was asked for.
>
> jon
>
> Use hrtimers for msleep() and msleep_interruptible()
>
> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
i certainly like this - an msleep() is like mdelay(), so there are real
delays (and not timeouts) involved, so converting to hrtimers is the
right thing to do from a conceptual POV too.
Acked-by: Ingo Molnar <mingo@elte.hu>
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-15 22:42 [PATCH/RFC] msleep() with hrtimers Jonathan Corbet
2007-07-15 23:10 ` Arnd Bergmann
2007-07-16 10:48 ` Ingo Molnar
@ 2007-07-16 11:39 ` Roman Zippel
2007-07-16 11:47 ` Ingo Molnar
2007-07-16 14:42 ` Jonathan Corbet
2007-07-16 14:11 ` Roman Zippel
3 siblings, 2 replies; 24+ messages in thread
From: Roman Zippel @ 2007-07-16 11:39 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: linux-kernel, Thomas Gleixner, Ingo Molnar
Hi,
On Sun, 15 Jul 2007, Jonathan Corbet wrote:
> Here's another approach: a reimplementation of msleep() and
> msleep_interruptible() using hrtimers. On a system without real
> hrtimers this code will at least drop down to single-jiffy delays much
> of the time (though not deterministically so). On my x86_64 system with
> Thomas's hrtimer/dyntick patch applied, msleep(1) gives almost exactly
> what was asked for.
One possible problem here is that setting up that timer can be
considerably more expensive, for a relative timer you have to read the
current time, which can be quite expensive (e.g. your machine now uses the
PIT timer, because TSC was deemed unstable).
One question here would be, is it really a problem to sleep a little more?
Another possibility would be to add another sleep function, which uses
hrtimer and could also take ktime argument.
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 11:39 ` Roman Zippel
@ 2007-07-16 11:47 ` Ingo Molnar
2007-07-16 11:54 ` Roman Zippel
2007-07-16 14:42 ` Jonathan Corbet
1 sibling, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2007-07-16 11:47 UTC (permalink / raw)
To: Roman Zippel; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > Here's another approach: a reimplementation of msleep() and
> > msleep_interruptible() using hrtimers. On a system without real
> > hrtimers this code will at least drop down to single-jiffy delays
> > much of the time (though not deterministically so). On my x86_64
> > system with Thomas's hrtimer/dyntick patch applied, msleep(1) gives
> > almost exactly what was asked for.
>
> One possible problem here is that setting up that timer can be
> considerably more expensive, for a relative timer you have to read the
> current time, which can be quite expensive (e.g. your machine now uses
> the PIT timer, because TSC was deemed unstable).
i dont think there's any significant overhead. The OLPC folks are pretty
sensitive to performance, so if there was any genuine measurable
overhead due to this, i'd expect them to report it. And even if there
_was_ overhead, it would be well worth its price, the legacies of HZ are
clearly biting the OLPC project here. The sooner we get rid of HZ
dependencies and HZ artifacts, the better.
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 11:47 ` Ingo Molnar
@ 2007-07-16 11:54 ` Roman Zippel
2007-07-16 11:57 ` Ingo Molnar
0 siblings, 1 reply; 24+ messages in thread
From: Roman Zippel @ 2007-07-16 11:54 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner
Hi,
On Mon, 16 Jul 2007, Ingo Molnar wrote:
> i dont think there's any significant overhead. The OLPC folks are pretty
> sensitive to performance,
How is a sleep function relevant to performace?
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 11:54 ` Roman Zippel
@ 2007-07-16 11:57 ` Ingo Molnar
2007-07-16 12:05 ` Roman Zippel
0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2007-07-16 11:57 UTC (permalink / raw)
To: Roman Zippel; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner
* Roman Zippel <zippel@linux-m68k.org> wrote:
> > > One possible problem here is that setting up that timer can be
> > > considerably more expensive, for a relative timer you have to read
> > > the current time, which can be quite expensive (e.g. your machine
> > > now uses the PIT timer, because TSC was deemed unstable).
> >
> > i dont think there's any significant overhead. The OLPC folks are
> > pretty sensitive to performance, so if there was any genuine
> > measurable overhead due to this, i'd expect them to report it. And
> > even if there _was_ overhead, it would be well worth its price, the
> > legacies of HZ are clearly biting the OLPC project here. The sooner
> > we get rid of HZ dependencies and HZ artifacts, the better.
>
> How is a sleep function relevant to performace?
i'm not sure how your question relates/connects to what i wrote above,
could you please re-phrase your question into a bit more verbose form so
that i can answer it? Thanks,
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 11:57 ` Ingo Molnar
@ 2007-07-16 12:05 ` Roman Zippel
2007-07-16 12:19 ` Ingo Molnar
0 siblings, 1 reply; 24+ messages in thread
From: Roman Zippel @ 2007-07-16 12:05 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner
Hi,
On Mon, 16 Jul 2007, Ingo Molnar wrote:
> i'm not sure how your question relates/connects to what i wrote above,
> could you please re-phrase your question into a bit more verbose form so
> that i can answer it? Thanks,
Well, you cut out the major question from my initial mail:
One question here would be, is it really a problem to sleep a little more?
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 12:05 ` Roman Zippel
@ 2007-07-16 12:19 ` Ingo Molnar
2007-07-16 13:00 ` Roman Zippel
0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2007-07-16 12:19 UTC (permalink / raw)
To: Roman Zippel; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner
* Roman Zippel <zippel@linux-m68k.org> wrote:
> Hi,
>
> On Mon, 16 Jul 2007, Ingo Molnar wrote:
>
> > i'm not sure how your question relates/connects to what i wrote above,
> > could you please re-phrase your question into a bit more verbose form so
> > that i can answer it? Thanks,
>
> Well, you cut out the major question from my initial mail:
> One question here would be, is it really a problem to sleep a little more?
oh, i did not want to embarrass you (and distract the discussion) with
answering a pretty stupid, irrelevant question that has the obvious
answer even for the most casual observer: "yes, of course it really is a
problem to sleep a little more, read the description of the fine patch
you are replying to" ...
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 12:19 ` Ingo Molnar
@ 2007-07-16 13:00 ` Roman Zippel
2007-07-16 14:09 ` Ingo Molnar
0 siblings, 1 reply; 24+ messages in thread
From: Roman Zippel @ 2007-07-16 13:00 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner
Hi,
On Mon, 16 Jul 2007, Ingo Molnar wrote:
> > Well, you cut out the major question from my initial mail:
> > One question here would be, is it really a problem to sleep a little more?
>
> oh, i did not want to embarrass you (and distract the discussion) with
> answering a pretty stupid, irrelevant question that has the obvious
> answer even for the most casual observer: "yes, of course it really is a
> problem to sleep a little more, read the description of the fine patch
> you are replying to" ...
And your insults continue... :-(
I ask a simple question and try to explore alternative solutions and this
is your contribution to it?
To put this into a little more context, this is the complete text you cut
off:
| One question here would be, is it really a problem to sleep a little more?
| Another possibility would be to add another sleep function, which uses
| hrtimer and could also take ktime argument.
So instead of considering this suggestion, you just read what you want out
of what I wrote and turn everything into an insult. Nicely done, Ingo. :-(
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 13:00 ` Roman Zippel
@ 2007-07-16 14:09 ` Ingo Molnar
2007-07-16 14:32 ` Roman Zippel
0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2007-07-16 14:09 UTC (permalink / raw)
To: Roman Zippel; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner
* Roman Zippel <zippel@linux-m68k.org> wrote:
> One question here would be, is it really a problem to sleep a little
> more? Another possibility would be to add another sleep function,
> which uses hrtimer and could also take ktime argument.
i'm not sure how this helps your argument, could you please explain it
more verbosely? (because when i assumed the obvious, you called it an
insult - so please dont leave any room for assumptions and remove any
ambiguity - especially as our communication seems to be marred by what
appears to be frequent misunderstandings ;-)
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-15 22:42 [PATCH/RFC] msleep() with hrtimers Jonathan Corbet
` (2 preceding siblings ...)
2007-07-16 11:39 ` Roman Zippel
@ 2007-07-16 14:11 ` Roman Zippel
3 siblings, 0 replies; 24+ messages in thread
From: Roman Zippel @ 2007-07-16 14:11 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: linux-kernel, Thomas Gleixner, Ingo Molnar
Hi,
On Sun, 15 Jul 2007, Jonathan Corbet wrote:
> The OLPC folks and I recently discovered something interesting: on a
> HZ=100 system, a call to msleep(1) will delay for about 20ms. The
> combination of jiffies timekeeping and rounding up means that the
> minimum delay from msleep will be two jiffies, never less. That led to
> multi-second delays in a driver which does a bunch of short msleep()
> calls and, in response, a change to mdelay(), which will come back in
> something closer to the requested time.
>
> Here's another approach: a reimplementation of msleep() and
> msleep_interruptible() using hrtimers. On a system without real
> hrtimers this code will at least drop down to single-jiffy delays much
> of the time (though not deterministically so). On my x86_64 system with
> Thomas's hrtimer/dyntick patch applied, msleep(1) gives almost exactly
> what was asked for.
BTW there is another thing to consider. If you already run with hrtimer/
dyntick, there is not much reason to keep HZ at 100, so you could just
increase HZ to get the same effect.
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 14:09 ` Ingo Molnar
@ 2007-07-16 14:32 ` Roman Zippel
0 siblings, 0 replies; 24+ messages in thread
From: Roman Zippel @ 2007-07-16 14:32 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner
Hi,
On Mon, 16 Jul 2007, Ingo Molnar wrote:
> because when i assumed the obvious, you called it an
> insult so please dont leave any room for assumptions and remove any
> ambiguity - especially as our communication seems to be marred by what
> appears to be frequent misunderstandings ;-)
What the hell is this supposed to be? How am I not to take this:
"i did not want to embarrass you (and distract the discussion) with
answering a pretty stupid, irrelevant question"
as an insult? How does it reflect on someone if he asks "stupid,
irrelevant questions"? If it had been a misunderstanding, you could have
ask appropriately instead of assuming I'm an idiot.
BTW Adding smileys to it doesn't make it any funnier, since it I don't
believe in the "misunderstandings" theory anymore. :-(
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 11:39 ` Roman Zippel
2007-07-16 11:47 ` Ingo Molnar
@ 2007-07-16 14:42 ` Jonathan Corbet
2007-07-16 15:18 ` Ingo Molnar
2007-07-16 15:43 ` Roman Zippel
1 sibling, 2 replies; 24+ messages in thread
From: Jonathan Corbet @ 2007-07-16 14:42 UTC (permalink / raw)
To: Roman Zippel; +Cc: linux-kernel, Thomas Gleixner, Ingo Molnar
Hey, Roman,
> One possible problem here is that setting up that timer can be
> considerably more expensive, for a relative timer you have to read the
> current time, which can be quite expensive (e.g. your machine now uses the
> PIT timer, because TSC was deemed unstable).
That's a possibility, I admit I haven't benchmarked it. I will say that
I don't think it will be enough to matter - msleep() is not a hot-path
sort of function. Once the system is up and running it almost never
gets called at all - at least, on my setup.
> One question here would be, is it really a problem to sleep a little more?
"A little more" is a bit different than "twenty times as long as you
asked for." That "little bit more" added up to a few seconds when
programming a device which needs a brief delay after tweaking each of
almost 200 registers.
> BTW there is another thing to consider. If you already run with hrtimer/
> dyntick, there is not much reason to keep HZ at 100, so you could just
> increase HZ to get the same effect.
Except that then, with the current implementation, you're paying for the
higher HZ whenever the CPU is busy. I bet that doesn't take long to
overwhelm any added overhead in the hrtimer msleep().
In the end, I did this because I thought msleep() should do what it
claims to do, because I thought that getting a known-to-expire timeout
off the timer wheel made sense, and to make a tiny baby step in the
direction of reducing the use of jiffies in the core code.
jon
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 14:42 ` Jonathan Corbet
@ 2007-07-16 15:18 ` Ingo Molnar
2007-07-16 15:43 ` Roman Zippel
1 sibling, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2007-07-16 15:18 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: Roman Zippel, linux-kernel, Thomas Gleixner
* Jonathan Corbet <corbet@lwn.net> wrote:
> In the end, I did this because I thought msleep() should do what it
> claims to do, because I thought that getting a known-to-expire timeout
> off the timer wheel made sense, and to make a tiny baby step in the
> direction of reducing the use of jiffies in the core code.
yeah, fully agreed.
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 14:42 ` Jonathan Corbet
2007-07-16 15:18 ` Ingo Molnar
@ 2007-07-16 15:43 ` Roman Zippel
2007-07-16 15:57 ` Ray Lee
` (2 more replies)
1 sibling, 3 replies; 24+ messages in thread
From: Roman Zippel @ 2007-07-16 15:43 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: linux-kernel, Thomas Gleixner, Ingo Molnar
Hi,
On Mon, 16 Jul 2007, Jonathan Corbet wrote:
> > One possible problem here is that setting up that timer can be
> > considerably more expensive, for a relative timer you have to read the
> > current time, which can be quite expensive (e.g. your machine now uses the
> > PIT timer, because TSC was deemed unstable).
>
> That's a possibility, I admit I haven't benchmarked it. I will say that
> I don't think it will be enough to matter - msleep() is not a hot-path
> sort of function. Once the system is up and running it almost never
> gets called at all - at least, on my setup.
That's a bit my problem - we have to consider other setups as well.
Is it worth converting all msleep users behind their back or should we
just a provide a separate function for those who care?
I would really like to keep hrtimer and kernel timer separate and make it
obvious who is using what, as the usage requirements are somewhat
different.
> > One question here would be, is it really a problem to sleep a little more?
>
> "A little more" is a bit different than "twenty times as long as you
> asked for." That "little bit more" added up to a few seconds when
> programming a device which needs a brief delay after tweaking each of
> almost 200 registers.
Which driver is this? I'd like to look at this, in case there's some other
hidden problem.
> > BTW there is another thing to consider. If you already run with hrtimer/
> > dyntick, there is not much reason to keep HZ at 100, so you could just
> > increase HZ to get the same effect.
>
> Except that then, with the current implementation, you're paying for the
> higher HZ whenever the CPU is busy. I bet that doesn't take long to
> overwhelm any added overhead in the hrtimer msleep().
Actually if that's the case I'd consider this a bug, where is that extra
cost coming from?
> In the end, I did this because I thought msleep() should do what it
> claims to do, because I thought that getting a known-to-expire timeout
> off the timer wheel made sense, and to make a tiny baby step in the
> direction of reducing the use of jiffies in the core code.
I know that Ingo considers everything HZ related evil, but it really is
not - it keeps Linux scalable. Unless you need the high resolution the
timer wheel performance is still pretty hard to beat. That
"known-to-expire" stuff is really the least significant problem to
consider here, please just forget about it.
I don't want to keep anyone from using hrtimer, if it's just some driver
go wild, but in generic code we have to consider portability issues. Using
jiffies as a time base is still unbeatable cheap in the general case, so
we have to carefully consider whether using a different time source is
required. There is nothing wrong with using jiffies if it fits the bill
and in many cases it still does.
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 15:43 ` Roman Zippel
@ 2007-07-16 15:57 ` Ray Lee
2007-07-16 16:08 ` Nish Aravamudan
2007-07-16 16:10 ` Ingo Molnar
2007-07-16 17:46 ` Jonathan Corbet
2 siblings, 1 reply; 24+ messages in thread
From: Ray Lee @ 2007-07-16 15:57 UTC (permalink / raw)
To: Roman Zippel; +Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner, Ingo Molnar
On 7/16/07, Roman Zippel <zippel@linux-m68k.org> wrote:
> On Mon, 16 Jul 2007, Jonathan Corbet wrote:
> > > One possible problem here is that setting up that timer can be
> > > considerably more expensive, for a relative timer you have to read the
> > > current time, which can be quite expensive (e.g. your machine now uses the
> > > PIT timer, because TSC was deemed unstable).
> >
> > That's a possibility, I admit I haven't benchmarked it. I will say that
> > I don't think it will be enough to matter - msleep() is not a hot-path
> > sort of function. Once the system is up and running it almost never
> > gets called at all - at least, on my setup.
>
> That's a bit my problem - we have to consider other setups as well.
> Is it worth converting all msleep users behind their back or should we
> just a provide a separate function for those who care?
As a driver author (2.4 timeframe, embedded platform, see gitinc.com
for the hardware description), I would rather msleep did what it says
it's going to do. If the current one can wait 20 times longer than you
ask for, then that's just broken.
> > > One question here would be, is it really a problem to sleep a little more?
> >
> > "A little more" is a bit different than "twenty times as long as you
> > asked for." That "little bit more" added up to a few seconds when
> > programming a device which needs a brief delay after tweaking each of
> > almost 200 registers.
>
> Which driver is this? I'd like to look at this, in case there's some other
> hidden problem.
"Hidden problem"? Lots of hardware has quiescent wait-time
requirements, to guarantee that whatever is going on it it's little
state machine head finally reaches a stable state. Waiting for those
changes to take effect before starting the next set of register
reprogramming is a common requirement -- on the hardware I've worked
with at least.
Ray
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 15:57 ` Ray Lee
@ 2007-07-16 16:08 ` Nish Aravamudan
2007-07-17 4:04 ` Nick Piggin
0 siblings, 1 reply; 24+ messages in thread
From: Nish Aravamudan @ 2007-07-16 16:08 UTC (permalink / raw)
To: Ray Lee
Cc: Roman Zippel, Jonathan Corbet, linux-kernel, Thomas Gleixner,
Ingo Molnar
On 7/16/07, Ray Lee <ray-lk@madrabbit.org> wrote:
> On 7/16/07, Roman Zippel <zippel@linux-m68k.org> wrote:
> > On Mon, 16 Jul 2007, Jonathan Corbet wrote:
> > > > One possible problem here is that setting up that timer can be
> > > > considerably more expensive, for a relative timer you have to read the
> > > > current time, which can be quite expensive (e.g. your machine now uses the
> > > > PIT timer, because TSC was deemed unstable).
> > >
> > > That's a possibility, I admit I haven't benchmarked it. I will say that
> > > I don't think it will be enough to matter - msleep() is not a hot-path
> > > sort of function. Once the system is up and running it almost never
> > > gets called at all - at least, on my setup.
> >
> > That's a bit my problem - we have to consider other setups as well.
> > Is it worth converting all msleep users behind their back or should we
> > just a provide a separate function for those who care?
>
> As a driver author (2.4 timeframe, embedded platform, see gitinc.com
> for the hardware description), I would rather msleep did what it says
> it's going to do. If the current one can wait 20 times longer than you
> ask for, then that's just broken.
Well, before these changes, the only guarantee msleep() could make,
just like the only guarantee schedule_timeout() could make, was that
it would not return early. The 1-jiffy sleep was always tough to deal
with, because of rounding and such. And it's simply exacerbated with
HZ=100. It's not technically 20 times longer in all cases, it's 2
jiffies longer, which depends on HZ, so varies from 2 msecs longer to
20 msecs longer.
Thanks,
Nish
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 15:43 ` Roman Zippel
2007-07-16 15:57 ` Ray Lee
@ 2007-07-16 16:10 ` Ingo Molnar
2007-07-16 16:55 ` Roman Zippel
2007-07-16 17:46 ` Jonathan Corbet
2 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2007-07-16 16:10 UTC (permalink / raw)
To: Roman Zippel
Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner, Arjan van de Ven
* Roman Zippel <zippel@linux-m68k.org> wrote:
> I know that Ingo considers everything HZ related evil, [...]
no, you are misrepresenting me, i dont consider everything HZ related
evil - where did you get that from?
I explained it numerous times (remember the 'timeout' vs. 'timer event'
discussion?) that i consider timer granularity important to scalability.
Basically, in every case where we know with great certainty that a
time-out will _not_ occur (where the time-out is in essence just an
exception handling mechanism), using struct timer_list is the best
solution. Furthermore, in cases where we know that the "granularity" of
a timer event is coarse, we can 'cluster' related timer events together.
(Arjan's timeout-rounding API additions do that.)
msleep() API is neither, and it's a perfect example for conversion to
hrtimers. It is exactly the type of timer API we intended hrtimers for.
what i consider harmful on the other hand are all the HZ assumptions
embedded into various pieces of code. The most harmful ones are design
details that depend on HZ and kernel-internal API details that depends
on HZ. Yes, NTP was such an example, and it was hard to fix, and you
didnt help much with that. (perhaps that is one source of this
increasingly testy exchange ;-) In any case we are slowly and surely
eradicating them. (we long ago eradicated all externally visible HZ
dependencies via the introduction of USER_HZ)
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 16:10 ` Ingo Molnar
@ 2007-07-16 16:55 ` Roman Zippel
0 siblings, 0 replies; 24+ messages in thread
From: Roman Zippel @ 2007-07-16 16:55 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jonathan Corbet, linux-kernel, Thomas Gleixner, Arjan van de Ven
Hi,
On Mon, 16 Jul 2007, Ingo Molnar wrote:
> I explained it numerous times (remember the 'timeout' vs. 'timer event'
> discussion?) that i consider timer granularity important to scalability.
> Basically, in every case where we know with great certainty that a
> time-out will _not_ occur (where the time-out is in essence just an
> exception handling mechanism), using struct timer_list is the best
> solution.
Whether the timer expires or not is in many cases completely irrelevant.
You need special cases, where the timer wheel behaviour becomes an issue
and whether hrtimer would behave any better in such situations is
questionable.
Again, for the average user such details are pretty much irrelevant.
> what i consider harmful on the other hand are all the HZ assumptions
> embedded into various pieces of code. The most harmful ones are design
> details that depend on HZ and kernel-internal API details that depends
> on HZ. Yes, NTP was such an example, and it was hard to fix, and you
> didnt help much with that.
Stop spreading lies! :-(
One only has to look at the history of kernel/time/ntp.c
John's rather simple "HZ free ntp" patch wouldn't have been that simple
without all the cleanup patches before that done by me, which were
precisely intended to make this possible.
> (perhaps that is one source of this
> increasingly testy exchange ;-)
No, it's your prejudice against me based on wrong facts.
Get your facts straight and stop being an ass.
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 15:43 ` Roman Zippel
2007-07-16 15:57 ` Ray Lee
2007-07-16 16:10 ` Ingo Molnar
@ 2007-07-16 17:46 ` Jonathan Corbet
2007-07-20 12:49 ` Roman Zippel
2 siblings, 1 reply; 24+ messages in thread
From: Jonathan Corbet @ 2007-07-16 17:46 UTC (permalink / raw)
To: Roman Zippel; +Cc: linux-kernel, Thomas Gleixner, Ingo Molnar
Roman Zippel <zippel@linux-m68k.org> wrote:
> > That's a possibility, I admit I haven't benchmarked it. I will say that
> > I don't think it will be enough to matter - msleep() is not a hot-path
> > sort of function. Once the system is up and running it almost never
> > gets called at all - at least, on my setup.
>
> That's a bit my problem - we have to consider other setups as well.
> Is it worth converting all msleep users behind their back or should we
> just a provide a separate function for those who care?
Any additional overhead is clearly small - enough not to disrupt a *high
resolution* timer, after all. And msleep() is used mostly during
initialization time. My box had a few hundred calls, almost all during
boot. Any cost will be bounded by the fact that, well, it sleeps for
milliseconds on every call.
I could run a million-msleep profile if you really want, but I just
can't imagine it's going to have a measurable effect on any real-world
situation. Is there a specific setup you're worried about?
I'm not *that* attached to this patch; if it causes heartburn we can
just forget it. But I had thought it might be useful...
> Which driver is this? I'd like to look at this, in case there's some other
> hidden problem.
drivers/media/video/cafe_ccic.c, and cafe_smbus_write_data() in
particular. The "hidden problem," though, is that the hardware has
periods where reading the status registers will send it off into its
room where it will hide under its bed and never come out.
I have an alternative which avoids the sleep at the cost of a small,
relatively harmless race; it may be my real solution in the end, we'll
see.
> > Except that then, with the current implementation, you're paying for the
> > higher HZ whenever the CPU is busy. I bet that doesn't take long to
> > overwhelm any added overhead in the hrtimer msleep().
>
> Actually if that's the case I'd consider this a bug, where is that extra
> cost coming from?
My understanding is that the current dyntick code only turns off the
tick during idle periods; while things are running it's business as
usual. Perhaps I misunderstood?
jon
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 16:08 ` Nish Aravamudan
@ 2007-07-17 4:04 ` Nick Piggin
2007-07-18 17:53 ` Nish Aravamudan
0 siblings, 1 reply; 24+ messages in thread
From: Nick Piggin @ 2007-07-17 4:04 UTC (permalink / raw)
To: Nish Aravamudan
Cc: Ray Lee, Roman Zippel, Jonathan Corbet, linux-kernel,
Thomas Gleixner, Ingo Molnar
Nish Aravamudan wrote:
> Well, before these changes, the only guarantee msleep() could make,
> just like the only guarantee schedule_timeout() could make, was that
> it would not return early. The 1-jiffy sleep was always tough to deal
> with, because of rounding and such. And it's simply exacerbated with
> HZ=100. It's not technically 20 times longer in all cases, it's 2
> jiffies longer, which depends on HZ, so varies from 2 msecs longer to
> 20 msecs longer.
I don't think you should rely on anything else actually, because Linux
is not an RT OS. If your driver needs a specific sequence in a given
amount of time, you have to do something like disable interrupts and
use delays.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-17 4:04 ` Nick Piggin
@ 2007-07-18 17:53 ` Nish Aravamudan
0 siblings, 0 replies; 24+ messages in thread
From: Nish Aravamudan @ 2007-07-18 17:53 UTC (permalink / raw)
To: Nick Piggin
Cc: Ray Lee, Roman Zippel, Jonathan Corbet, linux-kernel,
Thomas Gleixner, Ingo Molnar
On 7/16/07, Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> Nish Aravamudan wrote:
>
> > Well, before these changes, the only guarantee msleep() could make,
> > just like the only guarantee schedule_timeout() could make, was that
> > it would not return early. The 1-jiffy sleep was always tough to deal
> > with, because of rounding and such. And it's simply exacerbated with
> > HZ=100. It's not technically 20 times longer in all cases, it's 2
> > jiffies longer, which depends on HZ, so varies from 2 msecs longer to
> > 20 msecs longer.
>
> I don't think you should rely on anything else actually, because Linux
> is not an RT OS. If your driver needs a specific sequence in a given
> amount of time, you have to do something like disable interrupts and
> use delays.
I agree -- I wasn't trying to indicate what one should or should not
do. Just tried to provide some insight into the intent behind the
interfaces. They were never meant to be precise, per se -- simply
guaranteeing the request would be satisfied, with no upper bound. I
agree that if timing really is that critical then delays or more
accurate timing specifications would be appropriate (one reason to use
msecs_to_jiffies() and co. rather than jiffies directly in drivers,
perhaps).
Thanks,
Nish
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH/RFC] msleep() with hrtimers
2007-07-16 17:46 ` Jonathan Corbet
@ 2007-07-20 12:49 ` Roman Zippel
0 siblings, 0 replies; 24+ messages in thread
From: Roman Zippel @ 2007-07-20 12:49 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: linux-kernel, Thomas Gleixner, Ingo Molnar
Hi,
On Mon, 16 Jul 2007, Jonathan Corbet wrote:
> > That's a bit my problem - we have to consider other setups as well.
> > Is it worth converting all msleep users behind their back or should we
> > just a provide a separate function for those who care?
>
> Any additional overhead is clearly small - enough not to disrupt a *high
> resolution* timer, after all.
If you use already high resolution timer, you also need a fast time
source, so that in this case it indeed doesn't matter much how you sleep.
> And msleep() is used mostly during
> initialization time. My box had a few hundred calls, almost all during
> boot. Any cost will be bounded by the fact that, well, it sleeps for
> milliseconds on every call.
Well, there are over 1500 msleep calls, so I'm not sure they're mostly
during initialization.
> I'm not *that* attached to this patch; if it causes heartburn we can
> just forget it. But I had thought it might be useful...
I'm not against using hrtimer in drivers, if you add a hrsleep() function
and use that, that would be perfectly fine.
The really important point is to keep our APIs clean, so it's obvious who
is using what. The requirements for both timers are different, so there
should be a choice in what to use.
> > Which driver is this? I'd like to look at this, in case there's some other
> > hidden problem.
>
> drivers/media/video/cafe_ccic.c, and cafe_smbus_write_data() in
> particular. The "hidden problem," though, is that the hardware has
> periods where reading the status registers will send it off into its
> room where it will hide under its bed and never come out.
It's indeed not a trivial problem, as it's not localized to the driver
(the request comes from generic code).
The most elegant and general solution might be to move such initialization
sequences into a separate thread, where they don't hold up the rest.
> My understanding is that the current dyntick code only turns off the
> tick during idle periods; while things are running it's business as
> usual. Perhaps I misunderstood?
jiffies needs to be updated, theoretically one could reduce the timer
tick even then, but one has to be careful about the increased resolution,
so jiffies+1 isn't enough anymore to round it up.
In general it's doable by further cleaning up our APIs, but here it's
really important to keep the APIs clean to keep Linux running on a wide
range of hardware. It should be clear whether one requests a low
resolution, but low overhead timer or a high resolution and more precise
timer (and _please_ ignore that "likely to expire" stuff).
It's e.g. a possibility to map everything to high resolution timer on a
hardware, which can deal with this, but on other hardware that's not
possible without paying a significant price.
bye, Roman
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2007-07-20 12:49 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-15 22:42 [PATCH/RFC] msleep() with hrtimers Jonathan Corbet
2007-07-15 23:10 ` Arnd Bergmann
2007-07-16 10:48 ` Ingo Molnar
2007-07-16 11:39 ` Roman Zippel
2007-07-16 11:47 ` Ingo Molnar
2007-07-16 11:54 ` Roman Zippel
2007-07-16 11:57 ` Ingo Molnar
2007-07-16 12:05 ` Roman Zippel
2007-07-16 12:19 ` Ingo Molnar
2007-07-16 13:00 ` Roman Zippel
2007-07-16 14:09 ` Ingo Molnar
2007-07-16 14:32 ` Roman Zippel
2007-07-16 14:42 ` Jonathan Corbet
2007-07-16 15:18 ` Ingo Molnar
2007-07-16 15:43 ` Roman Zippel
2007-07-16 15:57 ` Ray Lee
2007-07-16 16:08 ` Nish Aravamudan
2007-07-17 4:04 ` Nick Piggin
2007-07-18 17:53 ` Nish Aravamudan
2007-07-16 16:10 ` Ingo Molnar
2007-07-16 16:55 ` Roman Zippel
2007-07-16 17:46 ` Jonathan Corbet
2007-07-20 12:49 ` Roman Zippel
2007-07-16 14:11 ` Roman Zippel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox