public inbox for linux-kernel@vger.kernel.org
* [patch 02/12] uml: cpu_relax fix
@ 2005-03-22 16:21 blaisorblade
  2005-03-23 17:09 ` [uml-devel] " Bodo Stroesser
  0 siblings, 1 reply; 5+ messages in thread
From: blaisorblade @ 2005-03-22 16:21 UTC
  To: akpm; +Cc: jdike, linux-kernel, user-mode-linux-devel, blaisorblade


Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s doing
that (i.e. i386 and x86_64).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
---

 linux-2.6.11-paolo/include/asm-um/processor-generic.h |    2 --
 linux-2.6.11-paolo/include/asm-um/processor-i386.h    |    8 ++++++++
 linux-2.6.11-paolo/include/asm-um/processor-x86_64.h  |    8 ++++++++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff -puN include/asm-um/processor-generic.h~uml-cpu_relax include/asm-um/processor-generic.h
--- linux-2.6.11/include/asm-um/processor-generic.h~uml-cpu_relax	2005-03-22 16:52:25.000000000 +0100
+++ linux-2.6.11-paolo/include/asm-um/processor-generic.h	2005-03-22 16:54:41.000000000 +0100
@@ -16,8 +16,6 @@ struct task_struct;
 
 struct mm_struct;
 
-#define cpu_relax()   barrier()
-
 struct thread_struct {
 	int forking;
 	int nsyscalls;
diff -puN include/asm-um/processor-i386.h~uml-cpu_relax include/asm-um/processor-i386.h
--- linux-2.6.11/include/asm-um/processor-i386.h~uml-cpu_relax	2005-03-22 16:53:43.000000000 +0100
+++ linux-2.6.11-paolo/include/asm-um/processor-i386.h	2005-03-22 16:54:39.000000000 +0100
@@ -19,6 +19,14 @@ struct arch_thread {
 
 #include "asm/arch/user.h"
 
+/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
+static inline void rep_nop(void)
+{
+	__asm__ __volatile__("rep;nop": : :"memory");
+}
+
+#define cpu_relax()	rep_nop()
+
 /*
  * Default implementation of macro that returns current
  * instruction pointer ("program counter"). Stolen
diff -puN include/asm-um/processor-x86_64.h~uml-cpu_relax include/asm-um/processor-x86_64.h
--- linux-2.6.11/include/asm-um/processor-x86_64.h~uml-cpu_relax	2005-03-22 16:56:30.000000000 +0100
+++ linux-2.6.11-paolo/include/asm-um/processor-x86_64.h	2005-03-22 16:56:32.000000000 +0100
@@ -12,6 +12,14 @@
 struct arch_thread {
 };
 
+/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
+extern inline void rep_nop(void)
+{
+	__asm__ __volatile__("rep;nop": : :"memory");
+}
+
+#define cpu_relax()   rep_nop()
+
 #define INIT_ARCH_THREAD { }
 
 #define current_text_addr() \
_


* Re: [uml-devel] [patch 02/12] uml: cpu_relax fix
  2005-03-22 16:21 [patch 02/12] uml: cpu_relax fix blaisorblade
@ 2005-03-23 17:09 ` Bodo Stroesser
  2005-03-24  1:50   ` Blaisorblade
  0 siblings, 1 reply; 5+ messages in thread
From: Bodo Stroesser @ 2005-03-23 17:09 UTC
  To: blaisorblade; +Cc: akpm, jdike, linux-kernel, user-mode-linux-devel

blaisorblade@yahoo.it wrote:
> Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s doing
> that (i.e. i386 and x86_64).

IIRC, Jeff had the idea of using sched_yield() for this (from a discussion on #uml).
S390 does something similar, using a special DIAG opcode that tells z/VM that
another guest may run.

On a host running many UMLs, this might improve performance.

So, I would like to have the small patch below (it's not tested, just an idea).

		Bodo


> diff -puN include/asm-um/processor-generic.h~uml-cpu_relax include/asm-um/processor-generic.h
> --- linux-2.6.11/include/asm-um/processor-generic.h~uml-cpu_relax	2005-03-22 16:52:25.000000000 +0100
> +++ linux-2.6.11-paolo/include/asm-um/processor-generic.h	2005-03-22 16:54:41.000000000 +0100
> @@ -16,7 +16,8 @@ struct task_struct;
>  
>  struct mm_struct;
>  
> -#define cpu_relax()   barrier()
> +#include "kern.h"
> +#define cpu_relax()   sched_yield()
>  
>  struct thread_struct {
>  	int forking;


* Re: [uml-devel] [patch 02/12] uml: cpu_relax fix
  2005-03-23 17:09 ` [uml-devel] " Bodo Stroesser
@ 2005-03-24  1:50   ` Blaisorblade
  2005-03-24  2:02     ` Andrew Morton
  2005-03-24  2:09     ` Nick Piggin
  0 siblings, 2 replies; 5+ messages in thread
From: Blaisorblade @ 2005-03-24  1:50 UTC
  To: user-mode-linux-devel; +Cc: Bodo Stroesser, akpm, jdike, linux-kernel

On Wednesday 23 March 2005 18:09, Bodo Stroesser wrote:
> blaisorblade@yahoo.it wrote:
> > Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s
> > doing that (i.e. i386 and x86_64).
>
> IIRC, Jeff had the idea of using sched_yield() for this (from a discussion
> on #uml).
Hmm, makes sense, but this would need careful benchmarking... I remember from
early discussions on the 2.6 scheduler that using sched_yield() might decrease
performance (IIRC it can starve the calling application).

Also, that call should go in the idle loop, not in cpu_relax(), which is a
very different thing: it is used (for instance) in kernel/spinlock.c for
spinlocks and in similar busy-wait loops. The "PAUSE" opcode is explicitly
recommended (by the Intel manuals, I don't recall why) for things like
spinlock loops, and using yield there would be bad.
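
To make the spinlock point concrete, here is a minimal userspace sketch of a
busy-wait loop that calls a cpu_relax() built on "rep; nop", next to the
compiler-only barrier() it replaces. It is not the kernel's spinlock code:
struct demo_lock, demo_lock_acquire_bounded() and demo_lock_release() are
hypothetical names made up for this illustration, the spin bound exists only so
the example terminates without a second thread, and GCC or Clang is assumed for
the inline assembly.

```c
#include <assert.h>
#include <stdatomic.h>

/* Compiler-only barrier: what the old UML cpu_relax() expanded to. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Userspace stand-in for cpu_relax(); an illustration, not the kernel header. */
static inline void cpu_relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
	/* "rep; nop" is the encoding of the PAUSE instruction. */
	__asm__ __volatile__("rep; nop" : : : "memory");
#else
	/* Assumption: on other architectures fall back to a plain barrier. */
	barrier();
#endif
}

/* Hypothetical toy lock: 0 means free, 1 means held. */
struct demo_lock {
	atomic_int locked;
};

/*
 * Try to take the lock, spinning at most max_spins times.
 * Returns 1 if the lock was acquired, 0 if the bound was reached.
 */
static int demo_lock_acquire_bounded(struct demo_lock *l, long max_spins)
{
	long spins = 0;

	assert(max_spins >= 0);
	for (;;) {
		/* Atomic test-and-set; an old value of 0 means we now own it. */
		if (!atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
			return 1;
		/* Wait read-only until the lock looks free, relaxing each turn. */
		while (atomic_load_explicit(&l->locked, memory_order_relaxed)) {
			if (spins++ >= max_spins)
				return 0;
			cpu_relax();
		}
	}
}

static void demo_lock_release(struct demo_lock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

Each pass through the inner loop is expected to be short, so the waiter stays
on the CPU and only hints to the processor that it is spinning; a yield at that
spot would hand the CPU to the scheduler instead.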

> S390 does something similar, using a special DIAG opcode that tells
> z/VM that another guest may run.

> On a host running many UMLs, this might improve performance.
>
> So, I would like to have the small patch below (it's not tested, just an
> idea).

-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
http://www.user-mode-linux.org/~blaisorblade



* Re: [uml-devel] [patch 02/12] uml: cpu_relax fix
  2005-03-24  1:50   ` Blaisorblade
@ 2005-03-24  2:02     ` Andrew Morton
  2005-03-24  2:09     ` Nick Piggin
  1 sibling, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2005-03-24  2:02 UTC
  To: Blaisorblade; +Cc: user-mode-linux-devel, bstroesser, jdike, linux-kernel

Blaisorblade <blaisorblade@yahoo.it> wrote:
>
> On Wednesday 23 March 2005 18:09, Bodo Stroesser wrote:
>  > blaisorblade@yahoo.it wrote:
>  > > Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s
>  > > doing that (i.e. i386 and x86_64).
>  >
>  > IIRC, Jeff had the idea of using sched_yield() for this (from a discussion
>  > on #uml).
>  Hmm, makes sense, but this would need careful benchmarking... I remember from
>  early discussions on the 2.6 scheduler that using sched_yield() might decrease
>  performance (IIRC it can starve the calling application).

yup, sched_yield() is pretty uniformly bad, and can result in heaps of
starvation if the machine is busy.  Best to avoid it unless you really want
it, and have tested it thoroughly under many-tasks-busy workloads.



* Re: [uml-devel] [patch 02/12] uml: cpu_relax fix
  2005-03-24  1:50   ` Blaisorblade
  2005-03-24  2:02     ` Andrew Morton
@ 2005-03-24  2:09     ` Nick Piggin
  1 sibling, 0 replies; 5+ messages in thread
From: Nick Piggin @ 2005-03-24  2:09 UTC
  To: Blaisorblade
  Cc: user-mode-linux-devel, Bodo Stroesser, akpm, jdike, linux-kernel

Blaisorblade wrote:
> On Wednesday 23 March 2005 18:09, Bodo Stroesser wrote:
> 
>>blaisorblade@yahoo.it wrote:
>>
>>>Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s
>>>doing that (i.e. i386 and x86_64).
>>
>>IIRC, Jeff had the idea of using sched_yield() for this (from a discussion
>>on #uml).
> 
> Hmm, makes sense, but this would need careful benchmarking... I remember from
> early discussions on the 2.6 scheduler that using sched_yield() might decrease
> performance (IIRC it can starve the calling application).
> 

Typically, for places where cpu_relax() is used, sched_yield() would be
a poor fit. So yes, it could easily reduce performance.

> Also, that call should go in the idle loop, not in cpu_relax(), which is a
> very different thing: it is used (for instance) in kernel/spinlock.c for
> spinlocks and in similar busy-wait loops. The "PAUSE" opcode is explicitly
> recommended (by the Intel manuals, I don't recall why) for things like
> spinlock loops, and using yield there would be bad.
> 

The other thing is that sched_yield() won't relax at all if you are the
only thing running; it will just be a busy wait. So again, maybe not a great
fit for the idle loop either.
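
A rough userspace way to observe this is to count how often sched_yield()
returns within a short time window. The sketch below assumes a POSIX system
with clock_gettime(); count_yields() is a hypothetical helper written for this
illustration, and the number it returns depends entirely on the machine and its
load, so no particular value should be expected.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <time.h>

/*
 * Call sched_yield() in a loop for roughly `ms` milliseconds and return
 * how many times it came back. With nothing else runnable, each call is
 * expected to return almost immediately, so the loop is a busy wait.
 */
static long count_yields(long ms)
{
	struct timespec start, now;
	long elapsed_ms;
	long n = 0;

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		sched_yield();
		n++;
		clock_gettime(CLOCK_MONOTONIC, &now);
		elapsed_ms = (now.tv_sec - start.tv_sec) * 1000L
			   + (now.tv_nsec - start.tv_nsec) / 1000000L;
	} while (elapsed_ms < ms);

	return n;
}
```

On an otherwise idle machine, a large count for a small window would suggest
the caller kept running the whole time rather than sleeping.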


