* Re: x86: disable preemption in delay_tsc()
[not found] <200711150400.lAF40lIr020160@hera.kernel.org>
@ 2007-11-16 3:41 ` Arjan van de Ven
2007-11-16 3:52 ` Andrew Morton
2007-11-16 8:08 ` Avi Kivity
0 siblings, 2 replies; 11+ messages in thread
From: Arjan van de Ven @ 2007-11-16 3:41 UTC (permalink / raw)
To: Linux Kernel Mailing List; +Cc: akpm, mingo
On Thu, 15 Nov 2007 04:00:47 GMT
Linux Kernel Mailing List <linux-kernel@vger.kernel.org> wrote:
> Gitweb:
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=35d5d08a085c56f153458c3f5d8ce24123617faf
> Commit: 35d5d08a085c56f153458c3f5d8ce24123617faf Parent:
> 7eea436433b7b18045f272562e256976f593f7c0 Author: Andrew Morton
> <akpm@linux-foundation.org> AuthorDate: Wed Nov 14 17:00:41 2007 -0800
> Committer: Linus Torvalds <torvalds@woody.linux-foundation.org>
> CommitDate: Wed Nov 14 18:45:44 2007 -0800
>
> x86: disable preemption in delay_tsc()
>
> Marin Mitov points out that delay_tsc() can misbehave if it is
> preempted and rescheduled on a different CPU which has a skewed TSC.
> Fix it by disabling preemption.
>
this worries me.. this appears to effectively disable preemption during
udelay() and mdelay() loops... which are very obvious latency inducers.
Now you can argue that if you're preemptible you should have used
msleep() and co, and I'll totally buy that.
Maybe we should just check if we're still on the same cpu or something,
or have a cheap way to pin a process to a cpu.... but both are longer
term solutions.
--
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: x86: disable preemption in delay_tsc()
2007-11-16 3:41 ` x86: disable preemption in delay_tsc() Arjan van de Ven
@ 2007-11-16 3:52 ` Andrew Morton
2007-11-16 6:13 ` Ingo Molnar
2007-11-16 8:08 ` Avi Kivity
1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2007-11-16 3:52 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Linux Kernel Mailing List, mingo
On Thu, 15 Nov 2007 19:41:16 -0800 Arjan van de Ven <arjan@infradead.org> wrote:
> On Thu, 15 Nov 2007 04:00:47 GMT
> Linux Kernel Mailing List <linux-kernel@vger.kernel.org> wrote:
>
> > Gitweb:
> > http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=35d5d08a085c56f153458c3f5d8ce24123617faf
> > Commit: 35d5d08a085c56f153458c3f5d8ce24123617faf Parent:
> > 7eea436433b7b18045f272562e256976f593f7c0 Author: Andrew Morton
> > <akpm@linux-foundation.org> AuthorDate: Wed Nov 14 17:00:41 2007 -0800
> > Committer: Linus Torvalds <torvalds@woody.linux-foundation.org>
> > CommitDate: Wed Nov 14 18:45:44 2007 -0800
> >
> > x86: disable preemption in delay_tsc()
> >
> > Marin Mitov points out that delay_tsc() can misbehave if it is
> > preempted and rescheduled on a different CPU which has a skewed TSC.
> > Fix it by disabling preemption.
> >
>
> this worries me.. this appears to effectively disable preemption during
> udelay() and mdelay() loops... which are very obvious latency inducers.
>
> Now you can argue that if you're preemptible you should have used
> msleep() and co, and I'll totally buy that.
>
>
> Maybe we should just check if we're still on the same cpu or something,
> or have a cheap way to pin a process to a cpu.... but both are longer
> term solutions.
>
Yes, we can do better.
But this bug can cause very rare failures in probably a large number of
device drivers on a minorty of machines. Ugly. So I felt it best to
plug it fast while people think about more sophisticated fixes.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: x86: disable preemption in delay_tsc()
2007-11-16 3:52 ` Andrew Morton
@ 2007-11-16 6:13 ` Ingo Molnar
2007-11-16 7:08 ` Andrew Morton
0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2007-11-16 6:13 UTC (permalink / raw)
To: Andrew Morton; +Cc: Arjan van de Ven, Linux Kernel Mailing List
* Andrew Morton <akpm@linux-foundation.org> wrote:
> > > x86: disable preemption in delay_tsc()
> > >
> > > Marin Mitov points out that delay_tsc() can misbehave if it is
> > > preempted and rescheduled on a different CPU which has a skewed
> > > TSC. Fix it by disabling preemption.
> > >
> >
> > this worries me.. this appears to effectively disable preemption
> > during udelay() and mdelay() loops... which are very obvious latency
> > inducers.
> >
> > Now you can argue that if you're preemptible you should have used
> > msleep() and co, and I'll totally buy that.
> >
> >
> > Maybe we should just check if we're still on the same cpu or
> > something, or have a cheap way to pin a process to a cpu.... but
> > both are longer term solutions.
>
> Yes, we can do better.
>
> But this bug can cause very rare failures in probably a large number
> of device drivers on a minorty of machines. Ugly. So I felt it best
> to plug it fast while people think about more sophisticated fixes.
how about using usleep() transparently if high-res timers are active and
we have !preempt_count()? That would be a sufficient solution and would
avoid all the calibration and per-cpu-ness problems.
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: x86: disable preemption in delay_tsc()
2007-11-16 6:13 ` Ingo Molnar
@ 2007-11-16 7:08 ` Andrew Morton
2007-11-16 7:17 ` Ingo Molnar
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2007-11-16 7:08 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Arjan van de Ven, Linux Kernel Mailing List
On Fri, 16 Nov 2007 07:13:32 +0100 Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > > > x86: disable preemption in delay_tsc()
> > > >
> > > > Marin Mitov points out that delay_tsc() can misbehave if it is
> > > > preempted and rescheduled on a different CPU which has a skewed
> > > > TSC. Fix it by disabling preemption.
> > > >
> > >
> > > this worries me.. this appears to effectively disable preemption
> > > during udelay() and mdelay() loops... which are very obvious latency
> > > inducers.
> > >
> > > Now you can argue that if you're preemptible you should have used
> > > msleep() and co, and I'll totally buy that.
> > >
> > >
> > > Maybe we should just check if we're still on the same cpu or
> > > something, or have a cheap way to pin a process to a cpu.... but
> > > both are longer term solutions.
> >
> > Yes, we can do better.
> >
> > But this bug can cause very rare failures in probably a large number
> > of device drivers on a minorty of machines. Ugly. So I felt it best
> > to plug it fast while people think about more sophisticated fixes.
>
> how about using usleep() transparently if high-res timers are active and
> we have !preempt_count()?
And CONFIG_PREEMPT, of course
> That would be a sufficient solution and would
> avoid all the calibration and per-cpu-ness problems.
It sounds like it would work OK. What is the setup cost for a usleep? I'd
have thought that code which does something like
while (i++ < 1000) {
foo();
udelay(1);
}
would take qiute a bit longer with such a change?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: x86: disable preemption in delay_tsc()
2007-11-16 7:08 ` Andrew Morton
@ 2007-11-16 7:17 ` Ingo Molnar
2007-11-16 7:26 ` Andrew Morton
0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2007-11-16 7:17 UTC (permalink / raw)
To: Andrew Morton; +Cc: Arjan van de Ven, Linux Kernel Mailing List
* Andrew Morton <akpm@linux-foundation.org> wrote:
> It sounds like it would work OK. What is the setup cost for a usleep?
> I'd have thought that code which does something like
>
> while (i++ < 1000) {
> foo();
> udelay(1);
> }
>
> would take qiute a bit longer with such a change?
full roundtrip cost ought to be below 10 usecs, depending on the system.
There's no problem doing a non-preemptible udelay up to 10 usecs and we
could use usleep above that.
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: x86: disable preemption in delay_tsc()
2007-11-16 7:17 ` Ingo Molnar
@ 2007-11-16 7:26 ` Andrew Morton
0 siblings, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2007-11-16 7:26 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Arjan van de Ven, Linux Kernel Mailing List
On Fri, 16 Nov 2007 08:17:08 +0100 Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > It sounds like it would work OK. What is the setup cost for a usleep?
> > I'd have thought that code which does something like
> >
> > while (i++ < 1000) {
> > foo();
> > udelay(1);
> > }
> >
> > would take qiute a bit longer with such a change?
>
> full roundtrip cost ought to be below 10 usecs, depending on the system.
Ow. So the above timeout would take 10x longer. That probably won't break
anything, but quite a few drivers do udelay(1) for post-IO settling times
and they might not like it.
> There's no problem doing a non-preemptible udelay up to 10 usecs and we
> could use usleep above that.
Yup, with a few smarts in there we could work out which is the best to use,
and also compensate for the setup costs.
It doesn't sound very 2.6.24ish though.
As a quicky things perhaps we could only do the preempt_disable()/preempt_enable()
if the TSCs are unsynced? Do we reliably know that? I guess not..
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: x86: disable preemption in delay_tsc()
2007-11-16 3:41 ` x86: disable preemption in delay_tsc() Arjan van de Ven
2007-11-16 3:52 ` Andrew Morton
@ 2007-11-16 8:08 ` Avi Kivity
2007-11-16 8:36 ` Ingo Molnar
1 sibling, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2007-11-16 8:08 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Linux Kernel Mailing List, akpm, mingo
Arjan van de Ven wrote:
> On Thu, 15 Nov 2007 04:00:47 GMT
> Linux Kernel Mailing List <linux-kernel@vger.kernel.org> wrote:
>
>
>> Gitweb:
>> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=35d5d08a085c56f153458c3f5d8ce24123617faf
>> Commit: 35d5d08a085c56f153458c3f5d8ce24123617faf Parent:
>> 7eea436433b7b18045f272562e256976f593f7c0 Author: Andrew Morton
>> <akpm@linux-foundation.org> AuthorDate: Wed Nov 14 17:00:41 2007 -0800
>> Committer: Linus Torvalds <torvalds@woody.linux-foundation.org>
>> CommitDate: Wed Nov 14 18:45:44 2007 -0800
>>
>> x86: disable preemption in delay_tsc()
>>
>> Marin Mitov points out that delay_tsc() can misbehave if it is
>> preempted and rescheduled on a different CPU which has a skewed TSC.
>> Fix it by disabling preemption.
>>
>>
>
> this worries me.. this appears to effectively disable preemption during
> udelay() and mdelay() loops... which are very obvious latency inducers.
>
> Now you can argue that if you're preemptible you should have used
> msleep() and co, and I'll totally buy that.
>
>
> Maybe we should just check if we're still on the same cpu or something,
> or have a cheap way to pin a process to a cpu.... but both are longer
> term solutions.
>
>
You can use preemption notifiers to get a callback when you are
preempted. Not sure what you'd to with that callback, though.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: x86: disable preemption in delay_tsc()
2007-11-16 8:08 ` Avi Kivity
@ 2007-11-16 8:36 ` Ingo Molnar
2007-11-16 8:47 ` [patch] x86: make delay_tsc() preemptible again Ingo Molnar
0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2007-11-16 8:36 UTC (permalink / raw)
To: Avi Kivity; +Cc: Arjan van de Ven, Linux Kernel Mailing List, akpm
* Avi Kivity <avi@qumranet.com> wrote:
> You can use preemption notifiers to get a callback when you are
> preempted. Not sure what you'd to with that callback, though.
but that should not be needed in this case. Why doesnt the TSC using
delay loop simply poll the CPU it is on and fix up the TSC?
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* [patch] x86: make delay_tsc() preemptible again
2007-11-16 8:36 ` Ingo Molnar
@ 2007-11-16 8:47 ` Ingo Molnar
2007-11-16 9:39 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2007-11-16 8:47 UTC (permalink / raw)
To: Avi Kivity; +Cc: Arjan van de Ven, Linux Kernel Mailing List, akpm
* Ingo Molnar <mingo@elte.hu> wrote:
> but that should not be needed in this case. Why doesnt the TSC using
> delay loop simply poll the CPU it is on and fix up the TSC?
something like the patch below.
Ingo
--------------->
Subject: x86: make delay_tsc() preemptible again
From: Ingo Molnar <mingo@elte.hu>
make delay_tsc() preemptible again.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/lib/delay_32.c | 28 +++++++++++++++++++++++-----
arch/x86/lib/delay_64.c | 30 ++++++++++++++++++++++++------
2 files changed, 47 insertions(+), 11 deletions(-)
Index: linux/arch/x86/lib/delay_32.c
===================================================================
--- linux.orig/arch/x86/lib/delay_32.c
+++ linux/arch/x86/lib/delay_32.c
@@ -38,17 +38,35 @@ static void delay_loop(unsigned long loo
:"0" (loops));
}
-/* TSC based delay: */
+/*
+ * TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
static void delay_tsc(unsigned long loops)
{
- unsigned long bclock, now;
+ unsigned long prev, now;
+ long left = loops;
+ int prev_cpu, cpu;
- preempt_disable(); /* TSC's are per-cpu */
- rdtscl(bclock);
+ preempt_disable();
+ rdtscl(prev);
do {
+ prev_cpu = smp_processor_id();
rep_nop();
+ preempt_enable();
+
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(now);
- } while ((now-bclock) < loops);
+ /*
+ * If we preempted we skip this small amount of time:
+ */
+ if (prev_cpu != cpu)
+ prev = now;
+ left -= now - prev;
+ prev = now;
+ } while (left > 0);
preempt_enable();
}
Index: linux/arch/x86/lib/delay_64.c
===================================================================
--- linux.orig/arch/x86/lib/delay_64.c
+++ linux/arch/x86/lib/delay_64.c
@@ -26,17 +26,35 @@ int read_current_timer(unsigned long *ti
return 0;
}
+/*
+ * TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
void __delay(unsigned long loops)
{
- unsigned bclock, now;
+ unsigned long prev, now;
+ long left = loops;
+ int prev_cpu, cpu;
- preempt_disable(); /* TSC's are pre-cpu */
- rdtscl(bclock);
+ preempt_disable();
+ rdtscl(prev);
do {
- rep_nop();
+ prev_cpu = smp_processor_id();
+ rep_nop();
+ preempt_enable();
+
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(now);
- }
- while ((now-bclock) < loops);
+ /*
+ * If we preempted we skip this small amount of time:
+ */
+ if (prev_cpu != cpu)
+ prev = now;
+ left -= now - prev;
+ prev = now;
+ } while (left > 0);
preempt_enable();
}
EXPORT_SYMBOL(__delay);
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch] x86: make delay_tsc() preemptible again
2007-11-16 8:47 ` [patch] x86: make delay_tsc() preemptible again Ingo Molnar
@ 2007-11-16 9:39 ` Peter Zijlstra
2007-11-16 10:50 ` Ingo Molnar
0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2007-11-16 9:39 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Avi Kivity, Arjan van de Ven, Linux Kernel Mailing List, akpm
On Fri, 2007-11-16 at 09:47 +0100, Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
>
> > but that should not be needed in this case. Why doesnt the TSC using
> > delay loop simply poll the CPU it is on and fix up the TSC?
>
> something like the patch below.
>
> Ingo
>
> --------------->
> Subject: x86: make delay_tsc() preemptible again
> From: Ingo Molnar <mingo@elte.hu>
>
> make delay_tsc() preemptible again.
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
> arch/x86/lib/delay_32.c | 28 +++++++++++++++++++++++-----
> arch/x86/lib/delay_64.c | 30 ++++++++++++++++++++++++------
> 2 files changed, 47 insertions(+), 11 deletions(-)
>
> Index: linux/arch/x86/lib/delay_32.c
> ===================================================================
> --- linux.orig/arch/x86/lib/delay_32.c
> +++ linux/arch/x86/lib/delay_32.c
> @@ -38,17 +38,35 @@ static void delay_loop(unsigned long loo
> :"0" (loops));
> }
>
> -/* TSC based delay: */
> +/*
> + * TSC based delay:
> + *
> + * We are careful about preemption as TSC's are per-CPU.
> + */
> static void delay_tsc(unsigned long loops)
> {
> - unsigned long bclock, now;
> + unsigned long prev, now;
> + long left = loops;
> + int prev_cpu, cpu;
>
> - preempt_disable(); /* TSC's are per-cpu */
> - rdtscl(bclock);
> + preempt_disable();
> + rdtscl(prev);
> do {
> + prev_cpu = smp_processor_id();
> rep_nop();
> + preempt_enable();
Why not have the rep_nop() here between the enable, and disable ?
> +
> + preempt_disable();
> + cpu = smp_processor_id();
> rdtscl(now);
> - } while ((now-bclock) < loops);
> + /*
> + * If we preempted we skip this small amount of time:
^ migrated, perhaps?
> + */
> + if (prev_cpu != cpu)
> + prev = now;
> + left -= now - prev;
> + prev = now;
> + } while (left > 0);
> preempt_enable();
> }
Otherwise, looks like a very nice patch :-)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch] x86: make delay_tsc() preemptible again
2007-11-16 9:39 ` Peter Zijlstra
@ 2007-11-16 10:50 ` Ingo Molnar
0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2007-11-16 10:50 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Avi Kivity, Arjan van de Ven, Linux Kernel Mailing List, akpm
* Peter Zijlstra <peterz@infradead.org> wrote:
> > + prev_cpu = smp_processor_id();
> > rep_nop();
> > + preempt_enable();
>
> Why not have the rep_nop() here between the enable, and disable ?
yes, indeed - fixed.
> > + /*
> > + * If we preempted we skip this small amount of time:
> ^ migrated, perhaps?
yeah, fixed.
updated patch below.
Ingo
---------------->
Subject: x86: make delay_tsc() preemptible again
From: Ingo Molnar <mingo@elte.hu>
make delay_tsc() preemptible again.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
arch/x86/lib/delay_32.c | 27 ++++++++++++++++++++++-----
arch/x86/lib/delay_64.c | 29 +++++++++++++++++++++++------
2 files changed, 45 insertions(+), 11 deletions(-)
Index: linux/arch/x86/lib/delay_32.c
===================================================================
--- linux.orig/arch/x86/lib/delay_32.c
+++ linux/arch/x86/lib/delay_32.c
@@ -38,17 +38,34 @@ static void delay_loop(unsigned long loo
:"0" (loops));
}
-/* TSC based delay: */
+/*
+ * TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
static void delay_tsc(unsigned long loops)
{
- unsigned long bclock, now;
+ unsigned long prev, now;
+ long left = loops;
+ int prev_cpu, cpu;
- preempt_disable(); /* TSC's are per-cpu */
- rdtscl(bclock);
+ preempt_disable();
+ rdtscl(prev);
do {
+ prev_cpu = smp_processor_id();
+ preempt_enable();
rep_nop();
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(now);
- } while ((now-bclock) < loops);
+ /*
+ * If we migrated we skip this small amount of time:
+ */
+ if (prev_cpu != cpu)
+ prev = now;
+ left -= now - prev;
+ prev = now;
+ } while (left > 0);
preempt_enable();
}
Index: linux/arch/x86/lib/delay_64.c
===================================================================
--- linux.orig/arch/x86/lib/delay_64.c
+++ linux/arch/x86/lib/delay_64.c
@@ -26,17 +26,34 @@ int read_current_timer(unsigned long *ti
return 0;
}
+/*
+ * TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
void __delay(unsigned long loops)
{
- unsigned bclock, now;
+ unsigned long prev, now;
+ long left = loops;
+ int prev_cpu, cpu;
- preempt_disable(); /* TSC's are pre-cpu */
- rdtscl(bclock);
+ preempt_disable();
+ rdtscl(prev);
do {
- rep_nop();
+ prev_cpu = smp_processor_id();
+ preempt_enable();
+ rep_nop();
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(now);
- }
- while ((now-bclock) < loops);
+ /*
+ * If we migrated we skip this small amount of time:
+ */
+ if (prev_cpu != cpu)
+ prev = now;
+ left -= now - prev;
+ prev = now;
+ } while (left > 0);
preempt_enable();
}
EXPORT_SYMBOL(__delay);
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2007-11-16 10:51 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200711150400.lAF40lIr020160@hera.kernel.org>
2007-11-16 3:41 ` x86: disable preemption in delay_tsc() Arjan van de Ven
2007-11-16 3:52 ` Andrew Morton
2007-11-16 6:13 ` Ingo Molnar
2007-11-16 7:08 ` Andrew Morton
2007-11-16 7:17 ` Ingo Molnar
2007-11-16 7:26 ` Andrew Morton
2007-11-16 8:08 ` Avi Kivity
2007-11-16 8:36 ` Ingo Molnar
2007-11-16 8:47 ` [patch] x86: make delay_tsc() preemptible again Ingo Molnar
2007-11-16 9:39 ` Peter Zijlstra
2007-11-16 10:50 ` Ingo Molnar
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.