* Re: [PATCH -v4] x86: enable preemption in delay
@ 2008-05-25 18:08 Marin Mitov
2008-05-25 19:30 ` Steven Rostedt
2008-05-25 22:21 ` Thomas Gleixner
0 siblings, 2 replies; 22+ messages in thread
From: Marin Mitov @ 2008-05-25 18:08 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-kernel
Hi Steven,
look at this patch, proposed by Ingo and me a few months ago.
I think it solves the problem you found, but unfortunately
it was lost (never included in the mainline).
http://lkml.org/lkml/2007/11/20/343
It is against the kernel as of the moment the message was sent.
I think it can be applied to the current kernel too.
Best regards
Marin Mitov
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH -v4] x86: enable preemption in delay
2008-05-25 18:08 Re: [PATCH -v4] x86: enable preemption in delay Marin Mitov
@ 2008-05-25 19:30 ` Steven Rostedt
2008-05-25 19:58 ` [PATCH " Marin Mitov
2008-05-25 22:21 ` Thomas Gleixner
1 sibling, 1 reply; 22+ messages in thread
From: Steven Rostedt @ 2008-05-25 19:30 UTC (permalink / raw)
To: Marin Mitov; +Cc: linux-kernel
On Sun, 25 May 2008, Marin Mitov wrote:
> Hi Steven,
>
> look at this patch, proposed by Ingo and me a few months ago.
> I think it solves the problem you found, but unfortunately
> it was lost (never included in the mainline).
>
> http://lkml.org/lkml/2007/11/20/343
>
> It is against the kernel as of the moment the message was sent.
> I think it can be applied to the current kernel too.
>
Hi Marin,
That patch looks basically like what I did. But mine has less code and is
a bit simpler ;-)
Thanks,
-- Steve
* Re: [PATCH -v4] x86: enable preemption in delay
2008-05-25 19:30 ` Steven Rostedt
@ 2008-05-25 19:58 ` Marin Mitov
0 siblings, 0 replies; 22+ messages in thread
From: Marin Mitov @ 2008-05-25 19:58 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-kernel
On Sunday 25 May 2008 10:30:25 pm Steven Rostedt wrote:
>
> On Sun, 25 May 2008, Marin Mitov wrote:
>
> > Hi Steven,
> >
> > look at this patch, proposed by Ingo and me a few months ago.
> > I think it solves the problem you found, but unfortunately
> > it was lost (never included in the mainline).
> >
> > http://lkml.org/lkml/2007/11/20/343
> >
> > It is against the kernel as of the moment the message was sent.
> > I think it can be applied to the current kernel too.
> >
>
> Hi Marin,
>
> That patch looks basically like what I did. But mine has less code and is
How much "less code" :-)
> a bit simpler ;-)
:-)
Marin Mitov
>
> Thanks,
>
> -- Steve
>
* Re: [PATCH -v4] x86: enable preemption in delay
2008-05-25 18:08 Re: [PATCH -v4] x86: enable preemption in delay Marin Mitov
2008-05-25 19:30 ` Steven Rostedt
@ 2008-05-25 22:21 ` Thomas Gleixner
2008-05-26 5:14 ` [PATCH " Marin Mitov
2008-06-01 16:01 ` [PATCH][resubmit] " Marin Mitov
1 sibling, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2008-05-25 22:21 UTC (permalink / raw)
To: Marin Mitov
Cc: Steven Rostedt, LKML, linux-rt-users, akpm, Ingo Molnar,
Clark Williams, Peter Zijlstra, Luis Claudio R. Goncalves,
Gregory Haskins, Linus Torvalds, Andi Kleen
On Sun, 25 May 2008, Marin Mitov wrote:
> Hi Steven,
Please do not trim CC lists without a good reason. [Restored it]
> look at this patch, proposed by Ingo and me a few months ago.
> I think it solves the problem you found, but unfortunately
> it was lost (never included in the mainline).
Yep, and it might be simply because the mail thread ended with:
http://lkml.org/lkml/2007/11/20/426
> hi Marin,
>
> here's the patch we are carrying in x86.git at the moment - could you
> please update it with v3 of your code, and send us the patch (with the
> patch metadata kept intact, like you see it below)? Thanks,
>
> Ingo
And there was no response. I just checked my x86 quilt archives and it
simply went into the "wait for update" category and got dropped
unfortunately.
Thanks,
tglx
* Re: [PATCH -v4] x86: enable preemption in delay
2008-05-25 22:21 ` Thomas Gleixner
@ 2008-05-26 5:14 ` Marin Mitov
2008-06-01 16:01 ` [PATCH][resubmit] " Marin Mitov
1 sibling, 0 replies; 22+ messages in thread
From: Marin Mitov @ 2008-05-26 5:14 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Steven Rostedt, LKML, linux-rt-users, akpm, Ingo Molnar,
Clark Williams, Peter Zijlstra, Luis Claudio R. Goncalves,
Gregory Haskins, Linus Torvalds, Andi Kleen
Hi Thomas,
On Monday 26 May 2008 01:21:04 am Thomas Gleixner wrote:
> On Sun, 25 May 2008, Marin Mitov wrote:
> > Hi Steven,
>
> Please do not trim CC lists without a good reason. [Restored it]
>
> > look at this patch, proposed by Ingo and me a few months ago.
> > I think it solves the problem you found, but unfortunately
> > it was lost (never included in the mainline).
>
> Yep, and it might be simply because the mail thread ended with:
>
> http://lkml.org/lkml/2007/11/20/426
Could be. I remember answering Ingo, and he gave me
a long lecture (thanks to him for it, I am a newbie) on how to use quilt
(because I had updated the patch by hand). In any case, the final
version is:
http://lkml.org/lkml/2007/11/20/343
It applies to 2.6.25.4 with the following warnings:
patching file arch/x86/lib/delay_32.c
Hunk #1 succeeded at 40 (offset 2 lines).
patching file arch/x86/lib/delay_64.c
Hunk #1 succeeded at 28 (offset 2 lines).
If you find the patch useful I will rebase it to 2.6.26-rc3.
>
> > hi Marin,
> >
> > here's the patch we are carrying in x86.git at the moment - could you
> > please update it with v3 of your code, and send us the patch (with the
> > patch metadata kept intact, like you see it below)? Thanks,
> >
> > Ingo
>
> And there was no response. I just checked my x86 quilt archives and it
> simply went into the "wait for update" category and got dropped
> unfortunately.
Here is an extract from Ingo's mail sent to me at that time:
From: Ingo Molnar <mingo@elte.hu>
To: Marin Mitov <mitov@issp.bas.bg>
CC: akpm@linux-foundation.org, tglx@linutronix.de
<snip>
* Marin Mitov <mitov@issp.bas.bg> wrote:
> > > The difference is explained in the reference above. Ingo asked me
> > > to send the last changes:
> > >
> > > http://lkml.org/lkml/2007/11/20/426
> > >
> > > and I have sent them to him.
> >
> > yep, and we've got that queued in the x86 tree.
>
> According to the attachment to the e-mail I have got from Andrew, the
> patch "added to -mm tree" is not the patch (v.3):
>
> http://lkml.org/lkml/2007/11/20/343
>
> but the patch (v.2) that has the flaw of a possible infinite
> loop:
>
> http://lkml.org/lkml/2007/11/18/5
>
> If that is the intention, OK.
the patch that got queued up in the x86 tree 3 days ago is the one below
- your latest.
Ingo
<snip>
As the patch (in Ingo's mail) was really the latest,
I decided all was OK. But it did not appear in 2.6.24.
That is the story as I know it ;-)
Best regards.
Marin Mitov
* [PATCH][resubmit] x86: enable preemption in delay
2008-05-25 22:21 ` Thomas Gleixner
2008-05-26 5:14 ` [PATCH " Marin Mitov
@ 2008-06-01 16:01 ` Marin Mitov
2008-06-01 16:25 ` Andi Kleen
2008-06-09 12:13 ` Ingo Molnar
1 sibling, 2 replies; 22+ messages in thread
From: Marin Mitov @ 2008-06-01 16:01 UTC (permalink / raw)
To: LKML
Cc: Steven Rostedt, Thomas Gleixner, linux-rt-users, akpm,
Ingo Molnar, Clark Williams, Peter Zijlstra,
Luis Claudio R. Goncalves, Gregory Haskins, Linus Torvalds,
Andi Kleen
Hi all,
This is a resubmit of the patch proposed by Ingo and me
a few months ago:
http://lkml.org/lkml/2007/11/20/343
It was in -mm for a while and then removed on its way into
the mainline, but it never appeared there.
As a new discussion on the subject started recently, I decided
to resubmit it.
It is now against 2.6.25.4, but as there are no changes
in the modified files it should apply to 2.6.26-rc4 too.
Comments/criticism are welcome.
Marin Mitov
Signed-off-by: Marin Mitov <mitov@issp.bas.bg>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
=========================================
--- a/arch/x86/lib/delay_32.c 2007-11-18 08:14:05.000000000 +0200
+++ b/arch/x86/lib/delay_32.c 2007-11-20 19:03:52.000000000 +0200
@@ -40,18 +40,42 @@
:"0" (loops));
}
-/* TSC based delay: */
+/* TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
static void delay_tsc(unsigned long loops)
{
- unsigned long bclock, now;
+ unsigned prev, prev_1, now;
+ unsigned left = loops;
+ unsigned prev_cpu, cpu;
+
+ preempt_disable();
+ rdtscl(prev);
+ prev_cpu = smp_processor_id();
+ preempt_enable();
+ now = prev;
- preempt_disable(); /* TSC's are per-cpu */
- rdtscl(bclock);
do {
rep_nop();
+
+ left -= now - prev;
+ prev = now;
+
+ preempt_disable();
+ rdtscl(prev_1);
+ cpu = smp_processor_id();
rdtscl(now);
- } while ((now-bclock) < loops);
- preempt_enable();
+ preempt_enable();
+
+ if (prev_cpu != cpu) {
+ /*
+ * We have migrated, forget prev_cpu's tsc reading
+ */
+ prev = prev_1;
+ prev_cpu = cpu;
+ }
+ } while ((now-prev) < left);
}
/*
--- a/arch/x86/lib/delay_64.c 2007-11-18 08:14:40.000000000 +0200
+++ b/arch/x86/lib/delay_64.c 2007-11-20 19:47:29.000000000 +0200
@@ -28,18 +28,42 @@
return 0;
}
+/* TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
void __delay(unsigned long loops)
{
- unsigned bclock, now;
+ unsigned prev, prev_1, now;
+ unsigned left = loops;
+ unsigned prev_cpu, cpu;
+
+ preempt_disable();
+ rdtscl(prev);
+ prev_cpu = smp_processor_id();
+ preempt_enable();
+ now = prev;
- preempt_disable(); /* TSC's are pre-cpu */
- rdtscl(bclock);
do {
- rep_nop();
+ rep_nop();
+
+ left -= now - prev;
+ prev = now;
+
+ preempt_disable();
+ rdtscl(prev_1);
+ cpu = smp_processor_id();
rdtscl(now);
- }
- while ((now-bclock) < loops);
- preempt_enable();
+ preempt_enable();
+
+ if (prev_cpu != cpu) {
+ /*
+ * We have migrated, forget prev_cpu's tsc reading
+ */
+ prev = prev_1;
+ prev_cpu = cpu;
+ }
+ } while ((now-prev) < left);
}
EXPORT_SYMBOL(__delay);
-
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-01 16:01 ` [PATCH][resubmit] " Marin Mitov
@ 2008-06-01 16:25 ` Andi Kleen
2008-06-01 17:17 ` Marin Mitov
2008-06-09 12:13 ` Ingo Molnar
1 sibling, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2008-06-01 16:25 UTC (permalink / raw)
To: Marin Mitov
Cc: LKML, Steven Rostedt, Thomas Gleixner, linux-rt-users, akpm,
Ingo Molnar, Clark Williams, Peter Zijlstra,
Luis Claudio R. Goncalves, Gregory Haskins, Linus Torvalds,
Andi Kleen
> static void delay_tsc(unsigned long loops)
> {
> - unsigned long bclock, now;
> + unsigned prev, prev_1, now;
> + unsigned left = loops;
> + unsigned prev_cpu, cpu;
> +
> + preempt_disable();
> + rdtscl(prev);
The unsigneds should probably be u64 and the rdtsc rdtscll.
Otherwise this will all overflow for longer waits on very
fast systems (e.g. a 5GHz system wraps 32 bits in ~0.86 seconds).
Normally such delays shouldn't be that long, but why risk
overflow even in extreme cases?
-Andi
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-01 16:25 ` Andi Kleen
@ 2008-06-01 17:17 ` Marin Mitov
0 siblings, 0 replies; 22+ messages in thread
From: Marin Mitov @ 2008-06-01 17:17 UTC (permalink / raw)
To: Andi Kleen
Cc: LKML, Steven Rostedt, Thomas Gleixner, linux-rt-users, akpm,
Ingo Molnar, Clark Williams, Peter Zijlstra,
Luis Claudio R. Goncalves, Gregory Haskins, Linus Torvalds,
Andi Kleen
On Sunday 01 June 2008 07:25:17 pm Andi Kleen wrote:
> > static void delay_tsc(unsigned long loops)
> > {
> > - unsigned long bclock, now;
> > + unsigned prev, prev_1, now;
> > + unsigned left = loops;
> > + unsigned prev_cpu, cpu;
> > +
> > + preempt_disable();
> > + rdtscl(prev);
>
>
> The unsigneds should probably be u64 and the rdtsc rdtscll.
> Otherwise this will all overflow for longer waits on very
> fast systems (e.g. a 5GHz system wraps 32 bits in ~0.86 seconds).
> Normally such delays shouldn't be that long, but why risk
> overflow even in extreme cases?
Yes in principle, but an overflow (that could happen between
rdtscl(prev) and rdtscl(now)) is taken into account the same way
as in the time_after()/time_before() macros (differences only), see:
+ left -= now - prev;
.........
+ } while ((now-prev) < left);
If more than one overflow happens between rdtscl(prev) and
rdtscl(now) (the task is suspended for a long time between the two
readings), all overflows after the first will be lost. But the
patch was originally submitted to guarantee a minimum udelay().
Sure, I could change to u64 if we reach a consensus.
Best regards.
Marin Mitov
>
> -Andi
>
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-01 16:01 ` [PATCH][resubmit] " Marin Mitov
2008-06-01 16:25 ` Andi Kleen
@ 2008-06-09 12:13 ` Ingo Molnar
2008-06-09 16:11 ` Marin Mitov
1 sibling, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2008-06-09 12:13 UTC (permalink / raw)
To: Marin Mitov
Cc: LKML, Steven Rostedt, Thomas Gleixner, linux-rt-users, akpm,
Clark Williams, Peter Zijlstra, Luis Claudio R. Goncalves,
Gregory Haskins, Linus Torvalds, Andi Kleen
* Marin Mitov <mitov@issp.bas.bg> wrote:
> Hi all,
>
> This is a resubmit of the patch proposed by Ingo and me a few months ago:
>
> http://lkml.org/lkml/2007/11/20/343
>
> It was in -mm for a while and then removed on its way into the
> mainline, but it never appeared there.
hm, we've got this overlapping commit in -tip for the same general
problem:
| # x86/urgent: e71e716: x86: enable preemption in delay
|
| From: Steven Rostedt <rostedt@goodmis.org>
| Date: Sun, 25 May 2008 11:13:32 -0400
| Subject: [PATCH] x86: enable preemption in delay
i think Thomas had a concern with the original fix - forgot the details.
Ingo
------------------>
commit e71e716c531557308895598002bee24c431d3be1
Author: Steven Rostedt <rostedt@goodmis.org>
Date: Sun May 25 11:13:32 2008 -0400
x86: enable preemption in delay
The RT team has been searching for a nasty latency. This latency shows
up out of the blue and has been seen to be as big as 5ms!
Using ftrace I found the cause of the latency.
pcscd-2995 3dNh1 52360300us : irq_exit (smp_apic_timer_interrupt)
pcscd-2995 3dN.2 52360301us : idle_cpu (irq_exit)
pcscd-2995 3dN.2 52360301us : rcu_irq_exit (irq_exit)
pcscd-2995 3dN.1 52360771us : smp_apic_timer_interrupt (apic_timer_interrupt
)
pcscd-2995 3dN.1 52360771us : exit_idle (smp_apic_timer_interrupt)
Here's an example of a 400 us latency. pcscd took a timer interrupt and
returned with "need resched" enabled, but did not reschedule until after
the next interrupt came in at 52360771us 400us later!
At first I thought we somehow missed a preemption check in entry.S. But
I also noticed that this always seemed to happen during a __delay call.
pcscd-2995 3dN.2 52360836us : rcu_irq_exit (irq_exit)
pcscd-2995 3.N.. 52361265us : preempt_schedule (__delay)
Looking at the x86 delay, I found my problem.
In git commit 35d5d08a085c56f153458c3f5d8ce24123617faf, Andrew Morton
placed preempt_disable around the entire delay due to TSC's not working
nicely on SMP. Unfortunately for those that care about latencies this
is devastating! Especially when we have callers to mdelay(8).
Here I enable preemption during the loop and account for anytime the task
migrates to a new CPU. The delay asked for may be extended a bit by
the migration, but delay only guarantees that it will delay for that minimum
time. Delaying longer should not be an issue.
[
Thanks to Thomas Gleixner for spotting that cpu wasn't updated,
and to place the rep_nop between preempt_enabled/disable.
]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: akpm@osdl.org
Cc: Clark Williams <clark.williams@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Gregory Haskins <ghaskins@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi-suse@firstfloor.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
index 4535e6d..d710f2d 100644
--- a/arch/x86/lib/delay_32.c
+++ b/arch/x86/lib/delay_32.c
@@ -44,13 +44,36 @@ static void delay_loop(unsigned long loops)
static void delay_tsc(unsigned long loops)
{
unsigned long bclock, now;
+ int cpu;
- preempt_disable(); /* TSC's are per-cpu */
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(bclock);
- do {
- rep_nop();
+ for (;;) {
rdtscl(now);
- } while ((now-bclock) < loops);
+ if ((now - bclock) >= loops)
+ break;
+
+ /* Allow RT tasks to run */
+ preempt_enable();
+ rep_nop();
+ preempt_disable();
+
+ /*
+ * It is possible that we moved to another CPU, and
+ * since TSC's are per-cpu we need to calculate
+ * that. The delay must guarantee that we wait "at
+ * least" the amount of time. Being moved to another
+ * CPU could make the wait longer but we just need to
+ * make sure we waited long enough. Rebalance the
+ * counter for this CPU.
+ */
+ if (unlikely(cpu != smp_processor_id())) {
+ loops -= (now - bclock);
+ cpu = smp_processor_id();
+ rdtscl(bclock);
+ }
+ }
preempt_enable();
}
diff --git a/arch/x86/lib/delay_64.c b/arch/x86/lib/delay_64.c
index bbc6105..4c441be 100644
--- a/arch/x86/lib/delay_64.c
+++ b/arch/x86/lib/delay_64.c
@@ -31,14 +31,36 @@ int __devinit read_current_timer(unsigned long *timer_value)
void __delay(unsigned long loops)
{
unsigned bclock, now;
+ int cpu;
- preempt_disable(); /* TSC's are pre-cpu */
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(bclock);
- do {
- rep_nop();
+ for (;;) {
rdtscl(now);
+ if ((now - bclock) >= loops)
+ break;
+
+ /* Allow RT tasks to run */
+ preempt_enable();
+ rep_nop();
+ preempt_disable();
+
+ /*
+ * It is possible that we moved to another CPU, and
+ * since TSC's are per-cpu we need to calculate
+ * that. The delay must guarantee that we wait "at
+ * least" the amount of time. Being moved to another
+ * CPU could make the wait longer but we just need to
+ * make sure we waited long enough. Rebalance the
+ * counter for this CPU.
+ */
+ if (unlikely(cpu != smp_processor_id())) {
+ loops -= (now - bclock);
+ cpu = smp_processor_id();
+ rdtscl(bclock);
+ }
}
- while ((now-bclock) < loops);
preempt_enable();
}
EXPORT_SYMBOL(__delay);
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-09 12:13 ` Ingo Molnar
@ 2008-06-09 16:11 ` Marin Mitov
2008-06-09 16:16 ` Ingo Molnar
0 siblings, 1 reply; 22+ messages in thread
From: Marin Mitov @ 2008-06-09 16:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Steven Rostedt, Thomas Gleixner, linux-rt-users, akpm,
Clark Williams, Peter Zijlstra, Luis Claudio R. Goncalves,
Gregory Haskins, Linus Torvalds, Andi Kleen
On Monday 09 June 2008 03:13:09 pm Ingo Molnar wrote:
>
> * Marin Mitov <mitov@issp.bas.bg> wrote:
>
> > Hi all,
> >
> > This is a resubmit of the patch proposed by Ingo and me a few months ago:
> >
> > http://lkml.org/lkml/2007/11/20/343
> >
> > It was in -mm for a while and then removed on its way into the
> > mainline, but it never appeared there.
>
> hm, we've got this overlapping commit in -tip for the same general
> problem:
>
> | # x86/urgent: e71e716: x86: enable preemption in delay
> |
> | From: Steven Rostedt <rostedt@goodmis.org>
> | Date: Sun, 25 May 2008 11:13:32 -0400
> | Subject: [PATCH] x86: enable preemption in delay
>
> i think Thomas had a concern with the original fix - forgot the details.
Here they are:
http://lkml.org/lkml/2008/5/25/251
but see my comment to it.
>
> Ingo
There is no fundamental difference between the two patches. I have seen Steven's
one merged in linux-2.6.26-rc5. The only difference (if it matters at all) is
that in my patch the preempt_disable()/preempt_enable() sections are shorter
and protect only the code that must be protected:
preempt_disable()
rdtscl()
smp_processor_id()
preempt_enable()
As Steven's patch is already merged - let it be :-)
Regards
Marin Mitov
>
> ------------------>
> commit e71e716c531557308895598002bee24c431d3be1
> Author: Steven Rostedt <rostedt@goodmis.org>
> Date: Sun May 25 11:13:32 2008 -0400
>
> x86: enable preemption in delay
>
> The RT team has been searching for a nasty latency. This latency shows
> up out of the blue and has been seen to be as big as 5ms!
>
> Using ftrace I found the cause of the latency.
>
> pcscd-2995 3dNh1 52360300us : irq_exit (smp_apic_timer_interrupt)
> pcscd-2995 3dN.2 52360301us : idle_cpu (irq_exit)
> pcscd-2995 3dN.2 52360301us : rcu_irq_exit (irq_exit)
> pcscd-2995 3dN.1 52360771us : smp_apic_timer_interrupt (apic_timer_interrupt
> )
> pcscd-2995 3dN.1 52360771us : exit_idle (smp_apic_timer_interrupt)
>
> Here's an example of a 400 us latency. pcscd took a timer interrupt and
> returned with "need resched" enabled, but did not reschedule until after
> the next interrupt came in at 52360771us 400us later!
>
> At first I thought we somehow missed a preemption check in entry.S. But
> I also noticed that this always seemed to happen during a __delay call.
>
> pcscd-2995 3dN.2 52360836us : rcu_irq_exit (irq_exit)
> pcscd-2995 3.N.. 52361265us : preempt_schedule (__delay)
>
> Looking at the x86 delay, I found my problem.
>
> In git commit 35d5d08a085c56f153458c3f5d8ce24123617faf, Andrew Morton
> placed preempt_disable around the entire delay due to TSC's not working
> nicely on SMP. Unfortunately for those that care about latencies this
> is devastating! Especially when we have callers to mdelay(8).
>
> Here I enable preemption during the loop and account for anytime the task
> migrates to a new CPU. The delay asked for may be extended a bit by
> the migration, but delay only guarantees that it will delay for that minimum
> time. Delaying longer should not be an issue.
>
> [
> Thanks to Thomas Gleixner for spotting that cpu wasn't updated,
> and to place the rep_nop between preempt_enabled/disable.
> ]
>
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> Cc: akpm@osdl.org
> Cc: Clark Williams <clark.williams@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
> Cc: Gregory Haskins <ghaskins@novell.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andi Kleen <andi-suse@firstfloor.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
> index 4535e6d..d710f2d 100644
> --- a/arch/x86/lib/delay_32.c
> +++ b/arch/x86/lib/delay_32.c
> @@ -44,13 +44,36 @@ static void delay_loop(unsigned long loops)
> static void delay_tsc(unsigned long loops)
> {
> unsigned long bclock, now;
> + int cpu;
>
> - preempt_disable(); /* TSC's are per-cpu */
> + preempt_disable();
> + cpu = smp_processor_id();
> rdtscl(bclock);
> - do {
> - rep_nop();
> + for (;;) {
> rdtscl(now);
> - } while ((now-bclock) < loops);
> + if ((now - bclock) >= loops)
> + break;
> +
> + /* Allow RT tasks to run */
> + preempt_enable();
> + rep_nop();
> + preempt_disable();
> +
> + /*
> + * It is possible that we moved to another CPU, and
> + * since TSC's are per-cpu we need to calculate
> + * that. The delay must guarantee that we wait "at
> + * least" the amount of time. Being moved to another
> + * CPU could make the wait longer but we just need to
> + * make sure we waited long enough. Rebalance the
> + * counter for this CPU.
> + */
> + if (unlikely(cpu != smp_processor_id())) {
> + loops -= (now - bclock);
> + cpu = smp_processor_id();
> + rdtscl(bclock);
> + }
> + }
> preempt_enable();
> }
>
> diff --git a/arch/x86/lib/delay_64.c b/arch/x86/lib/delay_64.c
> index bbc6105..4c441be 100644
> --- a/arch/x86/lib/delay_64.c
> +++ b/arch/x86/lib/delay_64.c
> @@ -31,14 +31,36 @@ int __devinit read_current_timer(unsigned long *timer_value)
> void __delay(unsigned long loops)
> {
> unsigned bclock, now;
> + int cpu;
>
> - preempt_disable(); /* TSC's are pre-cpu */
> + preempt_disable();
> + cpu = smp_processor_id();
> rdtscl(bclock);
> - do {
> - rep_nop();
> + for (;;) {
> rdtscl(now);
> + if ((now - bclock) >= loops)
> + break;
> +
> + /* Allow RT tasks to run */
> + preempt_enable();
> + rep_nop();
> + preempt_disable();
> +
> + /*
> + * It is possible that we moved to another CPU, and
> + * since TSC's are per-cpu we need to calculate
> + * that. The delay must guarantee that we wait "at
> + * least" the amount of time. Being moved to another
> + * CPU could make the wait longer but we just need to
> + * make sure we waited long enough. Rebalance the
> + * counter for this CPU.
> + */
> + if (unlikely(cpu != smp_processor_id())) {
> + loops -= (now - bclock);
> + cpu = smp_processor_id();
> + rdtscl(bclock);
> + }
> }
> - while ((now-bclock) < loops);
> preempt_enable();
> }
> EXPORT_SYMBOL(__delay);
>
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-09 16:11 ` Marin Mitov
@ 2008-06-09 16:16 ` Ingo Molnar
2008-06-15 17:58 ` Marin Mitov
0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2008-06-09 16:16 UTC (permalink / raw)
To: Marin Mitov
Cc: LKML, Steven Rostedt, Thomas Gleixner, linux-rt-users, akpm,
Clark Williams, Peter Zijlstra, Luis Claudio R. Goncalves,
Gregory Haskins, Linus Torvalds, Andi Kleen
* Marin Mitov <mitov@issp.bas.bg> wrote:
> > i think Thomas had a concern with the original fix - forgot the
> > details.
>
> Here they are:
>
> http://lkml.org/lkml/2008/5/25/251
>
> but see my comment to it.
>
> There is no fundamental difference between the two patches. I have seen
> Steven's one merged in linux-2.6.26-rc5. The only difference (if it
> matters at all) is that in my patch the
> preempt_disable()/preempt_enable() sections are shorter and protect
> only the code that must be protected:
>
> preempt_disable()
> rdtscl()
> smp_processor_id()
> preempt_enable()
>
> As Steven's patch is already merged - let it be :-)
we could still merge your enhancements as well on top of Steven's, if you
would like to pursue it. If so then please send a patch against -rc5 or
against -tip. Reducing the length of preempt-off sections is a fair
goal.
Ingo
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-09 16:16 ` Ingo Molnar
@ 2008-06-15 17:58 ` Marin Mitov
2008-06-18 7:55 ` Ingo Molnar
0 siblings, 1 reply; 22+ messages in thread
From: Marin Mitov @ 2008-06-15 17:58 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Steven Rostedt, Thomas Gleixner, linux-rt-users, akpm,
Clark Williams, Peter Zijlstra, Luis Claudio R. Goncalves,
Gregory Haskins, Linus Torvalds, Andi Kleen
On Monday 09 June 2008 07:16:06 pm Ingo Molnar wrote:
> > There is no fundamental difference between the two patches. I have seen
> > Steven's one merged in linux-2.6.26-rc5. The only difference (if it
> > matters at all) is that in my patch the
> > preempt_disable()/preempt_enable() sections are shorter and protect
> > only the code that must be protected:
> >
> > preempt_disable()
> > rdtscl()
> > smp_processor_id()
> > preempt_enable()
> >
> > As Steven's patch is already merged - let it be :-)
>
> we could still merge your enhancements as well on top of Steven's, if you
> would like to pursue it. If so then please send a patch against -rc5 or
> against -tip. Reducing the length of preempt-off sections is a fair
> goal.
>
> Ingo
>
Well, the patch is below (against 2.6.26-rc6), but I would really
appreciate your comments.
The difference is that in Steven's version the loop is running
mainly with preemption disabled, except for:
preempt_enable();
rep_nop();
preempt_disable();
while in the proposed version the loop is running with
preemption enabled, except for the part:
preempt_disable();
rdtscl(prev_1);
cpu = smp_processor_id();
rdtscl(now);
preempt_enable();
I do realize that this is not a big difference, as the loop
execution time is quite short.
In fact, in the case of TSCs the problem is not the preemption
itself, but the migration to another CPU after the preemption.
Why not something like this (do keep in mind I am not
an expert :-):
static void delay_tsc(unsigned long loops)
{
get and store the mask of allowed cpus;
/* prevent the migration */
set the mask of allowed cpus to the current cpu only;
/* is it possible? could it be guaranteed? */
loop for the delay;
restore the old mask of allowed cpus;
}
You get the idea. Could it be implemented? Is it more expensive
than the current implementation? So, comments, please.
Regards,
Marin Mitov
Signed-off-by: Marin Mitov <mitov@issp.bas.bg>
=========================================
--- a/arch/x86/lib/delay_32.c 2008-06-15 11:04:05.000000000 +0300
+++ b/arch/x86/lib/delay_32.c 2008-06-15 11:09:45.000000000 +0300
@@ -40,41 +40,42 @@ static void delay_loop(unsigned long loo
:"0" (loops));
}
-/* TSC based delay: */
+/* TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
static void delay_tsc(unsigned long loops)
{
- unsigned long bclock, now;
- int cpu;
+ unsigned prev, prev_1, now;
+ unsigned left = loops;
+ unsigned prev_cpu, cpu;
preempt_disable();
- cpu = smp_processor_id();
- rdtscl(bclock);
- for (;;) {
- rdtscl(now);
- if ((now - bclock) >= loops)
- break;
+ rdtscl(prev);
+ prev_cpu = smp_processor_id();
+ preempt_enable();
+ now = prev;
- /* Allow RT tasks to run */
- preempt_enable();
+ do {
rep_nop();
+
+ left -= now - prev;
+ prev = now;
+
preempt_disable();
+ rdtscl(prev_1);
+ cpu = smp_processor_id();
+ rdtscl(now);
+ preempt_enable();
- /*
- * It is possible that we moved to another CPU, and
- * since TSC's are per-cpu we need to calculate
- * that. The delay must guarantee that we wait "at
- * least" the amount of time. Being moved to another
- * CPU could make the wait longer but we just need to
- * make sure we waited long enough. Rebalance the
- * counter for this CPU.
- */
- if (unlikely(cpu != smp_processor_id())) {
- loops -= (now - bclock);
- cpu = smp_processor_id();
- rdtscl(bclock);
+ if (prev_cpu != cpu) {
+ /*
+ * We have migrated, forget prev_cpu's tsc reading
+ */
+ prev = prev_1;
+ prev_cpu = cpu;
}
- }
- preempt_enable();
+ } while ((now-prev) < left);
}
/*
--- a/arch/x86/lib/delay_64.c 2008-06-15 11:04:17.000000000 +0300
+++ b/arch/x86/lib/delay_64.c 2008-06-15 11:09:52.000000000 +0300
@@ -28,40 +28,42 @@ int __devinit read_current_timer(unsigne
return 0;
}
+/* TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
void __delay(unsigned long loops)
{
- unsigned bclock, now;
- int cpu;
+ unsigned prev, prev_1, now;
+ unsigned left = loops;
+ unsigned prev_cpu, cpu;
preempt_disable();
- cpu = smp_processor_id();
- rdtscl(bclock);
- for (;;) {
- rdtscl(now);
- if ((now - bclock) >= loops)
- break;
+ rdtscl(prev);
+ prev_cpu = smp_processor_id();
+ preempt_enable();
+ now = prev;
- /* Allow RT tasks to run */
- preempt_enable();
+ do {
rep_nop();
+
+ left -= now - prev;
+ prev = now;
+
preempt_disable();
+ rdtscl(prev_1);
+ cpu = smp_processor_id();
+ rdtscl(now);
+ preempt_enable();
- /*
- * It is possible that we moved to another CPU, and
- * since TSC's are per-cpu we need to calculate
- * that. The delay must guarantee that we wait "at
- * least" the amount of time. Being moved to another
- * CPU could make the wait longer but we just need to
- * make sure we waited long enough. Rebalance the
- * counter for this CPU.
- */
- if (unlikely(cpu != smp_processor_id())) {
- loops -= (now - bclock);
- cpu = smp_processor_id();
- rdtscl(bclock);
+ if (prev_cpu != cpu) {
+ /*
+ * We have migrated, forget prev_cpu's tsc reading
+ */
+ prev = prev_1;
+ prev_cpu = cpu;
}
- }
- preempt_enable();
+ } while ((now-prev) < left);
}
EXPORT_SYMBOL(__delay);
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-15 17:58 ` Marin Mitov
@ 2008-06-18 7:55 ` Ingo Molnar
2008-06-18 12:08 ` Gregory Haskins
2008-06-28 10:44 ` Pavel Machek
0 siblings, 2 replies; 22+ messages in thread
From: Ingo Molnar @ 2008-06-18 7:55 UTC (permalink / raw)
To: Marin Mitov
Cc: LKML, Steven Rostedt, Thomas Gleixner, linux-rt-users, akpm,
Clark Williams, Peter Zijlstra, Luis Claudio R. Goncalves,
Gregory Haskins, Linus Torvalds, Andi Kleen
* Marin Mitov <mitov@issp.bas.bg> wrote:
> Why not something like that (do keep in mind I am not an expert :-):
>
> static void delay_tsc(unsigned long loops)
> {
> get and store the mask of allowed cpus;
> /* prevent the migration */
> set the mask of allowed cpus to the current cpu only;
> /* is it possible? could it be guaranteed? */
> loop for the delay;
> restore the old mask of allowed cpus;
> }
>
> You have got the idea. Could it be realized? Is it more expensive than
> the current realization? So, comments, please.
hm, changing/saving/restoring cpus_allowed is really considered a 'heavy'
operation compared to preempt_disable(). On a 4096 CPUs box cpus_allowed
is 4096 bits which is half a kilobyte ...
preempt_disable()/enable() on the other hand only touches a single
variable (thread_info->preempt_count, which is a u32)
Ingo
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 7:55 ` Ingo Molnar
@ 2008-06-18 12:08 ` Gregory Haskins
2008-06-18 12:13 ` Gregory Haskins
2008-06-18 12:16 ` Peter Zijlstra
2008-06-28 10:44 ` Pavel Machek
1 sibling, 2 replies; 22+ messages in thread
From: Gregory Haskins @ 2008-06-18 12:08 UTC (permalink / raw)
To: Ingo Molnar, Marin Mitov
Cc: Andi Kleen, Clark Williams, Steven Rostedt, Peter Zijlstra,
Thomas Gleixner, Linus Torvalds, akpm, Luis Claudio R. Goncalves,
LKML, linux-rt-users
>>> On Wed, Jun 18, 2008 at 3:55 AM, in message <20080618075518.GD4135@elte.hu>,
Ingo Molnar <mingo@elte.hu> wrote:
> * Marin Mitov <mitov@issp.bas.bg> wrote:
>
>> Why not something like that (do keep in mind I am not an expert :-):
>>
>> static void delay_tsc(unsigned long loops)
>> {
>> get and store the mask of allowed cpus;
>> /* prevent the migration */
>> set the mask of allowed cpus to the current cpu only;
>> /* is it possible? could it be guaranteed? */
>> loop for the delay;
>> restore the old mask of allowed cpus;
>> }
>>
>> You have got the idea. Could it be realized? Is it more expensive than
>> the current realization? So, comments, please.
>
> hm, changing/saving/restoring cpus_allowed is really considered a 'heavy'
> operation compared to preempt_disable(). On a 4096 CPUs box cpus_allowed
> is 4096 bits which is half a kilobyte ...
>
> preempt_disable()/enable() on the other hand only touches a single
> variable (thread_info->preempt_count, which is a u32)
>
> Ingo
FWIW: I had submitted some "migration disable" patches a while back that would solve this without the cpus_allowed manipulations described here. It's more expensive than a preempt-disable (but it's preemptible), yet it's way cheaper (and more correct / less racy) than changing cpus_allowed. I could resubmit if there was any interest, though I think Ingo said he didn't like the concept on the first pass. Anyway, FYI.
-Greg
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 12:08 ` Gregory Haskins
@ 2008-06-18 12:13 ` Gregory Haskins
2008-06-18 12:16 ` Peter Zijlstra
1 sibling, 0 replies; 22+ messages in thread
From: Gregory Haskins @ 2008-06-18 12:13 UTC (permalink / raw)
To: Ingo Molnar, Marin Mitov, Gregory Haskins
Cc: Andi Kleen, Clark Williams, Steven Rostedt, Peter Zijlstra,
Thomas Gleixner, Linus Torvalds, akpm, Luis Claudio R. Goncalves,
LKML, linux-rt-users
>>> On Wed, Jun 18, 2008 at 8:08 AM, in message <4858C286.BA47.005A.0@novell.com>,
"Gregory Haskins" <ghaskins@novell.com> wrote:
>>>> On Wed, Jun 18, 2008 at 3:55 AM, in message <20080618075518.GD4135@elte.hu>,
> Ingo Molnar <mingo@elte.hu> wrote:
>
>> * Marin Mitov <mitov@issp.bas.bg> wrote:
>>
>>> Why not something like that (do keep in mind I am not an expert :-):
>>>
>>> static void delay_tsc(unsigned long loops)
>>> {
>>> get and store the mask of allowed cpus;
>>> /* prevent the migration */
>>> set the mask of allowed cpus to the current cpu only;
>>> /* is it possible? could it be guaranteed? */
>>> loop for the delay;
>>> restore the old mask of allowed cpus;
>>> }
>>>
>>> You have got the idea. Could it be realized? Is it more expensive than
>>> the current realization? So, comments, please.
>>
>> hm, changing/saving/restoring cpus_allowed is really considered a 'heavy'
>> operation compared to preempt_disable(). On a 4096 CPUs box cpus_allowed
>> is 4096 bits which is half a kilobyte ...
>>
>> preempt_disable()/enable() on the other hand only touches a single
>> variable (thread_info->preempt_count, which is a u32)
>>
>> Ingo
>
> FWIW: I had submitted some "migration disable" patches a while back that
> would solve this without the cpus_allowed manipulations described here. It's
> more expensive than a preempt-disable (but it's preemptible), yet it's way
> cheaper (and more correct / less racy) than changing cpus_allowed. I could
> resubmit if there was any interest, though I think Ingo said he didn't like
> the concept on the first pass. Anyway, FYI.
Sorry, should have provided a reference:
http://lkml.org/lkml/2008/2/12/344
>
> -Greg
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 12:08 ` Gregory Haskins
2008-06-18 12:13 ` Gregory Haskins
@ 2008-06-18 12:16 ` Peter Zijlstra
2008-06-18 12:25 ` Gregory Haskins
1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2008-06-18 12:16 UTC (permalink / raw)
To: Gregory Haskins
Cc: Ingo Molnar, Marin Mitov, Andi Kleen, Clark Williams,
Steven Rostedt, Thomas Gleixner, Linus Torvalds, akpm,
Luis Claudio R. Goncalves, LKML, linux-rt-users
On Wed, 2008-06-18 at 06:08 -0600, Gregory Haskins wrote:
> >>> On Wed, Jun 18, 2008 at 3:55 AM, in message <20080618075518.GD4135@elte.hu>,
> Ingo Molnar <mingo@elte.hu> wrote:
>
> > * Marin Mitov <mitov@issp.bas.bg> wrote:
> >
> >> Why not something like that (do keep in mind I am not an expert :-):
> >>
> >> static void delay_tsc(unsigned long loops)
> >> {
> >> get and store the mask of allowed cpus;
> >> /* prevent the migration */
> >> set the mask of allowed cpus to the current cpu only;
> >> /* is it possible? could it be guaranteed? */
> >> loop for the delay;
> >> restore the old mask of allowed cpus;
> >> }
> >>
> >> You have got the idea. Could it be realized? Is it more expensive than
> >> the current realization? So, comments, please.
> >
> > hm, changing/saving/restoring cpus_allowed is really considered a 'heavy'
> > operation compared to preempt_disable(). On a 4096 CPUs box cpus_allowed
> > is 4096 bits which is half a kilobyte ...
> >
> > preempt_disable()/enable() on the other hand only touches a single
> > variable (thread_info->preempt_count, which is a u32)
> >
> > Ingo
>
> FWIW: I had submitted some "migration disable" patches a while back
> that would solve this without the cpus_allowed manipulations described
> here. It's more expensive than a preempt-disable (but it's
> preemptible), yet it's way cheaper (and more correct / less racy) than
> changing cpus_allowed. I could resubmit if there was any interest,
> though I think Ingo said he didn't like the concept on the first pass.
> Anyway, FYI.
(please teach your mailer to wrap text)
Yeah - migrate_disable() has been proposed several times. The reason I
don't like it is that it creates scheduling artefacts like latencies by
not being able to load-balance (and thereby complicates all that, and
you know we don't need more complication there).
Things like preempt latency and irq off latency are rather easy to
notice and debug in general. migrate_disable() would be fully
preemptable/schedulable, which makes it much, much harder to instrument
or even detect that we have an issue. Which in turn makes it much harder
to find abuse.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 12:16 ` Peter Zijlstra
@ 2008-06-18 12:25 ` Gregory Haskins
2008-06-18 12:42 ` Nick Piggin
0 siblings, 1 reply; 22+ messages in thread
From: Gregory Haskins @ 2008-06-18 12:25 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Andi Kleen, Clark Williams, Steven Rostedt,
Marin Mitov, Thomas Gleixner, Linus Torvalds, akpm,
Luis Claudio R. Goncalves, LKML, linux-rt-users
>>> On Wed, Jun 18, 2008 at 8:16 AM, in message
<1213791416.16944.222.camel@twins>, Peter Zijlstra <peterz@infradead.org>
wrote:
> On Wed, 2008-06-18 at 06:08 -0600, Gregory Haskins wrote:
>> >>> On Wed, Jun 18, 2008 at 3:55 AM, in message <20080618075518.GD4135@elte.hu>,
>> Ingo Molnar <mingo@elte.hu> wrote:
>>
>> > * Marin Mitov <mitov@issp.bas.bg> wrote:
>> >
>> >> Why not something like that (do keep in mind I am not an expert :-):
>> >>
>> >> static void delay_tsc(unsigned long loops)
>> >> {
>> >> get and store the mask of allowed cpus;
>> >> /* prevent the migration */
>> >> set the mask of allowed cpus to the current cpu only;
>> >> /* is it possible? could it be guaranteed? */
>> >> loop for the delay;
>> >> restore the old mask of allowed cpus;
>> >> }
>> >>
>> >> You have got the idea. Could it be realized? Is it more expensive than
>> >> the current realization? So, comments, please.
>> >
>> > hm, changing/saving/restoring cpus_allowed is really considered a 'heavy'
>> > operation compared to preempt_disable(). On a 4096 CPUs box cpus_allowed
>> > is 4096 bits which is half a kilobyte ...
>> >
>> > preempt_disable()/enable() on the other hand only touches a single
>> > variable (thread_info->preempt_count, which is a u32)
>> >
>> > Ingo
>>
>> FWIW: I had submitted some "migration disable" patches a while back
>> that would solve this without the cpus_allowed manipulations described
>> here. It's more expensive than a preempt-disable (but it's
>> preemptible), yet it's way cheaper (and more correct / less racy) than
>> changing cpus_allowed. I could resubmit if there was any interest,
>> though I think Ingo said he didn't like the concept on the first pass.
>> Anyway, FYI.
>
> (please teach your mailer to wrap text)
Sorry... I know it's really annoying, but I have no control over it in GroupWise :(
It's a server-side / MTA setting. Go figure. I will try to wrap manually.
>
> Yeah - migrate_disable() has been proposed several times. The reason I
> don't like it is that is creates scheduling artefacts like latencies by
> not being able to load-balance (and thereby complicates all that, and
> you know we don't need more complication there).
True, and good point. But this concept would certainly be useful to avoid
the heavyweight (w.r.t. latency) preempt-disable() in quite a few different
areas, so if we can make it work with reasonable visibility, it might be nice
to have.
>
> Things like preempt latency and irq off latency are rather easy to
> notice and debug in general. migrate_disable() would be fully
> preemptable/schedulable which makes it much much harder to instrument or
> even detect we have an issue. Which in turn makes it much harder to find
> abuse.
I wonder if we can come up with any creative instrumentation to get coverage
in this case. I will think about it and add it to the migration-disable queue I
have, to be submitted together (unless Ingo et al. feel strongly that it will
never be accepted even with good instrumentation... then I will not waste
any effort on it).
Regards,
-Greg
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 12:25 ` Gregory Haskins
@ 2008-06-18 12:42 ` Nick Piggin
2008-06-18 13:04 ` Gregory Haskins
2008-06-18 16:16 ` Steven Rostedt
0 siblings, 2 replies; 22+ messages in thread
From: Nick Piggin @ 2008-06-18 12:42 UTC (permalink / raw)
To: Gregory Haskins
Cc: Peter Zijlstra, Ingo Molnar, Andi Kleen, Clark Williams,
Steven Rostedt, Marin Mitov, Thomas Gleixner, Linus Torvalds,
akpm, Luis Claudio R. Goncalves, LKML, linux-rt-users
On Wednesday 18 June 2008 22:25, Gregory Haskins wrote:
> >>> On Wed, Jun 18, 2008 at 8:16 AM, in message
> > Yeah - migrate_disable() has been proposed several times. The reason I
> > don't like it is that it creates scheduling artefacts like latencies by
> > not being able to load-balance (and thereby complicates all that, and
> > you know we don't need more complication there).
>
> True, and good point. But this concept would certainly be useful to avoid
> the heavyweight (w.r.t. latency) preempt-disable() in quite a few different
> areas, so if we can make it work with reasonable visibility, it might be
> nice to have.
It just seems like pretty worthless bloat to me.
There are _some_ cases where it can be used, but nobody has been
able to come up with compelling uses really. I don't think this
case is helped very much either because the logic in there using
preempt-disable is fine, isn't it?
Except that it should also have a cond_resched in it. Seems like
an ideal place to put cond_resched because it is not a fastpath.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 12:42 ` Nick Piggin
@ 2008-06-18 13:04 ` Gregory Haskins
2008-06-18 16:16 ` Steven Rostedt
1 sibling, 0 replies; 22+ messages in thread
From: Gregory Haskins @ 2008-06-18 13:04 UTC (permalink / raw)
To: Nick Piggin
Cc: Ingo Molnar, Andi Kleen, Clark Williams, Steven Rostedt,
Peter Zijlstra, Marin Mitov, Thomas Gleixner, Linus Torvalds,
akpm, Luis Claudio R. Goncalves, LKML, linux-rt-users
>>> On Wed, Jun 18, 2008 at 8:42 AM, in message
<200806182242.49245.nickpiggin@yahoo.com.au>, Nick Piggin
<nickpiggin@yahoo.com.au> wrote:
> On Wednesday 18 June 2008 22:25, Gregory Haskins wrote:
>> >>> On Wed, Jun 18, 2008 at 8:16 AM, in message
>
>> > Yeah - migrate_disable() has been proposed several times. The reason I
>> > don't like it is that it creates scheduling artefacts like latencies by
>> > not being able to load-balance (and thereby complicates all that, and
>> > you know we don't need more complication there).
>>
>> True, and good point. But this concept would certainly be useful to avoid
>> the heavyweight (w.r.t. latency) preempt-disable() in quite a few different
>> areas, so if we can make it work with reasonable visibility, it might be
>> nice to have.
>
> It just seems like pretty worthless bloat to me.
>
> There are _some_ cases where it can be used, but nobody has been
> able to come up with compelling uses really.
Well, I have some ongoing R&D which (I believe) *would* make compelling use of a
migration-disable feature. But to date it is not ready to see the light of day. As
far as existing uses go, I think I mostly agree with you. The one argument to the
contrary would be to clean up the handful of places that implement an ad-hoc
migration-disable by messing with cpus_allowed in a similar manner. But perhaps
those could be solved with a preempt-disable() as well.
>I don't think this
> case is helped very much either because the logic in there using
> preempt-disable is fine, isn't it?
You are probably right. In my own defense, I was just responding to the
question about manipulating cpus_allowed. If you are going to do that,
you are better off with my patch (IMO). Changing cpus_allowed to prevent
migration is racy, among other things. Whether this tsc code is optimal with
migration disabled or preemption disabled is a separate matter which
I did not address.
>
> Except that it should also have a cond_resched in it. Seems like
> an ideal place to put cond_resched because it is not a fastpath.
Seems reasonable to me. Thanks Nick.
-Greg
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 12:42 ` Nick Piggin
2008-06-18 13:04 ` Gregory Haskins
@ 2008-06-18 16:16 ` Steven Rostedt
2008-06-18 17:18 ` Nick Piggin
1 sibling, 1 reply; 22+ messages in thread
From: Steven Rostedt @ 2008-06-18 16:16 UTC (permalink / raw)
To: Nick Piggin
Cc: Gregory Haskins, Peter Zijlstra, Ingo Molnar, Andi Kleen,
Clark Williams, Marin Mitov, Thomas Gleixner, Linus Torvalds,
akpm, Luis Claudio R. Goncalves, LKML, linux-rt-users
On Wed, 18 Jun 2008, Nick Piggin wrote:
> There are _some_ cases where it can be used, but nobody has been
> able to come up with compelling uses really. I don't think this
> case is helped very much either because the logic in there using
> preempt-disable is fine, isn't it?
>
> Except that it should also have a cond_resched in it. Seems like
> an ideal place to put cond_resched because it is not a fastpath.
>
Does it really need a cond_resched? preempt_enable when it goes to zero
will already check to see if it should schedule.
-- Steve
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 16:16 ` Steven Rostedt
@ 2008-06-18 17:18 ` Nick Piggin
0 siblings, 0 replies; 22+ messages in thread
From: Nick Piggin @ 2008-06-18 17:18 UTC (permalink / raw)
To: Steven Rostedt
Cc: Gregory Haskins, Peter Zijlstra, Ingo Molnar, Andi Kleen,
Clark Williams, Marin Mitov, Thomas Gleixner, Linus Torvalds,
akpm, Luis Claudio R. Goncalves, LKML, linux-rt-users
On Thursday 19 June 2008 02:16, Steven Rostedt wrote:
> On Wed, 18 Jun 2008, Nick Piggin wrote:
> > There are _some_ cases where it can be used, but nobody has been
> > able to come up with compelling uses really. I don't think this
> > case is helped very much either because the logic in there using
> > preempt-disable is fine, isn't it?
> >
> > Except that it should also have a cond_resched in it. Seems like
> > an ideal place to put cond_resched because it is not a fastpath.
>
> Does it really need a cond_resched? preempt_enable when it goes to zero
> will already check to see if it should schedule.
It doesn't really need one, but of course having one would help
non-preempt kernels.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH][resubmit] x86: enable preemption in delay
2008-06-18 7:55 ` Ingo Molnar
2008-06-18 12:08 ` Gregory Haskins
@ 2008-06-28 10:44 ` Pavel Machek
1 sibling, 0 replies; 22+ messages in thread
From: Pavel Machek @ 2008-06-28 10:44 UTC (permalink / raw)
To: Ingo Molnar
Cc: Marin Mitov, LKML, Steven Rostedt, Thomas Gleixner,
linux-rt-users, akpm, Clark Williams, Peter Zijlstra,
Luis Claudio R. Goncalves, Gregory Haskins, Linus Torvalds,
Andi Kleen
On Wed 2008-06-18 09:55:18, Ingo Molnar wrote:
>
> * Marin Mitov <mitov@issp.bas.bg> wrote:
>
> > Why not something like that (do keep in mind I am not an expert :-):
> >
> > static void delay_tsc(unsigned long loops)
> > {
> > get and store the mask of allowed cpus;
> > /* prevent the migration */
> > set the mask of allowed cpus to the current cpu only;
> > /* is it possible? could it be guaranteed? */
> > loop for the delay;
> > restore the old mask of allowed cpus;
> > }
> >
> > You have got the idea. Could it be realized? Is it more expensive than
> > the current realization? So, comments, please.
>
> hm, changing/saving/restoring cpus_allowed is really considered a 'heavy'
> operation compared to preempt_disable(). On a 4096 CPUs box cpus_allowed
> is 4096 bits which is half a kilobyte ...
Time to RLE the bitmap?
<runs> Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2008-07-06 14:47 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-25 18:08 Re:[PATCH -v4] x86: enable preemption in delay Marin Mitov
2008-05-25 19:30 ` Steven Rostedt
2008-05-25 19:58 ` [PATCH " Marin Mitov
2008-05-25 22:21 ` Thomas Gleixner
2008-05-26 5:14 ` [PATCH " Marin Mitov
2008-06-01 16:01 ` [PATCH][resubmit] " Marin Mitov
2008-06-01 16:25 ` Andi Kleen
2008-06-01 17:17 ` Marin Mitov
2008-06-09 12:13 ` Ingo Molnar
2008-06-09 16:11 ` Marin Mitov
2008-06-09 16:16 ` Ingo Molnar
2008-06-15 17:58 ` Marin Mitov
2008-06-18 7:55 ` Ingo Molnar
2008-06-18 12:08 ` Gregory Haskins
2008-06-18 12:13 ` Gregory Haskins
2008-06-18 12:16 ` Peter Zijlstra
2008-06-18 12:25 ` Gregory Haskins
2008-06-18 12:42 ` Nick Piggin
2008-06-18 13:04 ` Gregory Haskins
2008-06-18 16:16 ` Steven Rostedt
2008-06-18 17:18 ` Nick Piggin
2008-06-28 10:44 ` Pavel Machek
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox