* [uml-devel] [patch 02/12] uml: cpu_relax fix
@ 2005-03-22 16:21 ` blaisorblade
0 siblings, 0 replies; 10+ messages in thread
From: blaisorblade @ 2005-03-22 16:21 UTC (permalink / raw)
To: akpm; +Cc: jdike, linux-kernel, user-mode-linux-devel, blaisorblade
Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s doing
that (i.e. i386 and x86_64).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
---
linux-2.6.11-paolo/include/asm-um/processor-generic.h | 2 --
linux-2.6.11-paolo/include/asm-um/processor-i386.h | 8 ++++++++
linux-2.6.11-paolo/include/asm-um/processor-x86_64.h | 8 ++++++++
3 files changed, 16 insertions(+), 2 deletions(-)
diff -puN include/asm-um/processor-generic.h~uml-cpu_relax include/asm-um/processor-generic.h
--- linux-2.6.11/include/asm-um/processor-generic.h~uml-cpu_relax 2005-03-22 16:52:25.000000000 +0100
+++ linux-2.6.11-paolo/include/asm-um/processor-generic.h 2005-03-22 16:54:41.000000000 +0100
@@ -16,8 +16,6 @@ struct task_struct;
struct mm_struct;
-#define cpu_relax() barrier()
-
struct thread_struct {
int forking;
int nsyscalls;
diff -puN include/asm-um/processor-i386.h~uml-cpu_relax include/asm-um/processor-i386.h
--- linux-2.6.11/include/asm-um/processor-i386.h~uml-cpu_relax 2005-03-22 16:53:43.000000000 +0100
+++ linux-2.6.11-paolo/include/asm-um/processor-i386.h 2005-03-22 16:54:39.000000000 +0100
@@ -19,6 +19,14 @@ struct arch_thread {
#include "asm/arch/user.h"
+/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
+static inline void rep_nop(void)
+{
+ __asm__ __volatile__("rep;nop": : :"memory");
+}
+
+#define cpu_relax() rep_nop()
+
/*
* Default implementation of macro that returns current
* instruction pointer ("program counter"). Stolen
diff -puN include/asm-um/processor-x86_64.h~uml-cpu_relax include/asm-um/processor-x86_64.h
--- linux-2.6.11/include/asm-um/processor-x86_64.h~uml-cpu_relax 2005-03-22 16:56:30.000000000 +0100
+++ linux-2.6.11-paolo/include/asm-um/processor-x86_64.h 2005-03-22 16:56:32.000000000 +0100
@@ -12,6 +12,14 @@
struct arch_thread {
};
+/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
+extern inline void rep_nop(void)
+{
+ __asm__ __volatile__("rep;nop": : :"memory");
+}
+
+#define cpu_relax() rep_nop()
+
#define INIT_ARCH_THREAD { }
#define current_text_addr() \
_
* Re: [uml-devel] [patch 02/12] uml: cpu_relax fix
2005-03-22 16:21 ` blaisorblade
@ 2005-03-23 17:09 ` Bodo Stroesser
-1 siblings, 0 replies; 10+ messages in thread
From: Bodo Stroesser @ 2005-03-23 17:09 UTC (permalink / raw)
To: blaisorblade; +Cc: akpm, jdike, linux-kernel, user-mode-linux-devel

blaisorblade@yahoo.it wrote:
> Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s doing
> that (i.e. i386 and x86_64).

IIRC, Jeff had the idea, to use sched_yield() for this (from a discussion on #uml).
S390 does something similar using a special DIAG-opcode that
gives permission to zVM, that another Guest might run.
On a host running many UMLs, this might improve performance.

So, I would like to have the small patch below (it's not tested, just an idea).

Bodo

> diff -puN include/asm-um/processor-generic.h~uml-cpu_relax include/asm-um/processor-generic.h
> --- linux-2.6.11/include/asm-um/processor-generic.h~uml-cpu_relax 2005-03-22 16:52:25.000000000 +0100
> +++ linux-2.6.11-paolo/include/asm-um/processor-generic.h 2005-03-22 16:54:41.000000000 +0100
> @@ -16,7 +16,8 @@ struct task_struct;
>
> struct mm_struct;
>
> -#define cpu_relax() barrier()
> +#include "kern.h"
> +#define cpu_relax() sched_yield()
>
> struct thread_struct {
> int forking;

* Re: [uml-devel] [patch 02/12] uml: cpu_relax fix
2005-03-23 17:09 ` Bodo Stroesser
@ 2005-03-24 1:50 ` Blaisorblade
-1 siblings, 0 replies; 10+ messages in thread
From: Blaisorblade @ 2005-03-24 1:50 UTC (permalink / raw)
To: user-mode-linux-devel; +Cc: Bodo Stroesser, akpm, jdike, linux-kernel

On Wednesday 23 March 2005 18:09, Bodo Stroesser wrote:
> blaisorblade@yahoo.it wrote:
> > Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s
> > doing that (i.e. i386 and x86_64).
>
> IIRC, Jeff had the idea, to use sched_yield() for this (from a discussion
> on #uml).

Hmm, makes sense, but this is to benchmark well... I remember from early
discussions on 2.6 scheduler that using sched_yield might decrease
performance (IIRC starve the calling application).

Also, that call should be put inside the idle loop, not for cpu_relax, which
is very different, since it is used (for instance) in kernel/spinlock.c for
spinlocks, and in such things. The "Pause" opcode is explicitly recommended
(by Intel manuals, I don't recall why) for things like spinlock loops, and
using yield there would be bad.

> S390 does something similar using a special DIAG-opcode that
> gives permission to zVM, that another Guest might run.
> On a host running many UMLs, this might improve performance.
>
> So, I would like to have the small patch below (it's not tested, just an
> idea).

--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
http://www.user-mode-linux.org/~blaisorblade

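To make the spinlock point above concrete, here is a minimal userspace sketch of a busy-wait loop that calls a cpu_relax-style helper on each failed attempt. Everything named *_sketch is hypothetical and is not kernel code: the lock is a plain C11 test-and-set flag, not the kernel's spinlock_t, and the non-x86 branch is only a compiler barrier, i.e. the behaviour the patch replaces.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the kernel's cpu_relax(). */
static inline void cpu_relax_sketch(void)
{
#if defined(__i386__) || defined(__x86_64__)
	/* "rep; nop" is the encoding of PAUSE; the "memory" clobber
	 * also makes this a compiler barrier, as barrier() is. */
	__asm__ __volatile__("rep; nop" : : : "memory");
#else
	/* Assumption for other architectures: compiler barrier only. */
	__asm__ __volatile__("" : : : "memory");
#endif
}

/* Hypothetical test-and-set lock, for illustration only. */
struct spin_sketch {
	atomic_flag locked;
};

/* Returns 1 if the lock was taken, 0 if it was already held. */
static inline int spin_sketch_trylock(struct spin_sketch *l)
{
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

static inline void spin_sketch_lock(struct spin_sketch *l)
{
	/* Busy-wait: the CPU is never given up, PAUSE is only a hint
	 * to the core that this is a spin-wait loop. */
	while (!spin_sketch_trylock(l))
		cpu_relax_sketch();
}

static inline void spin_sketch_unlock(struct spin_sketch *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The loop body stays a few instructions long and never enters the scheduler, which is why a system call such as sched_yield() in that position changes the cost of every contended lock acquisition.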
* Re: [uml-devel] [patch 02/12] uml: cpu_relax fix
2005-03-24 1:50 ` Blaisorblade
@ 2005-03-24 2:02 ` Andrew Morton
-1 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2005-03-24 2:02 UTC (permalink / raw)
To: Blaisorblade; +Cc: user-mode-linux-devel, bstroesser, jdike, linux-kernel

Blaisorblade <blaisorblade@yahoo.it> wrote:
>
> On Wednesday 23 March 2005 18:09, Bodo Stroesser wrote:
> > blaisorblade@yahoo.it wrote:
> > > Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s
> > > doing that (i.e. i386 and x86_64).
> >
> > IIRC, Jeff had the idea, to use sched_yield() for this (from a discussion
> > on #uml).
> Hmm, makes sense, but this is to benchmark well... I remember from early
> discussions on 2.6 scheduler that using sched_yield might decrease
> performance (IIRC starve the calling application).

yup, sched_yield() is pretty uniformly bad, and can result in heaps of
starvation if the machine is busy. Best to avoid it unless you really want
it, and have tested it thoroughly under many-tasks-busy workloads.

* Re: [uml-devel] [patch 02/12] uml: cpu_relax fix
2005-03-24 1:50 ` Blaisorblade
@ 2005-03-24 2:09 ` Nick Piggin
-1 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2005-03-24 2:09 UTC (permalink / raw)
To: Blaisorblade
Cc: user-mode-linux-devel, Bodo Stroesser, akpm, jdike, linux-kernel

Blaisorblade wrote:
> On Wednesday 23 March 2005 18:09, Bodo Stroesser wrote:
>
>>blaisorblade@yahoo.it wrote:
>>
>>>Use rep_nop instead of barrier for cpu_relax, following $(SUBARCH)'s
>>>doing that (i.e. i386 and x86_64).
>>
>>IIRC, Jeff had the idea, to use sched_yield() for this (from a discussion
>>on #uml).
>
> Hmm, makes sense, but this is to benchmark well... I remember from early
> discussions on 2.6 scheduler that using sched_yield might decrease
> performance (IIRC starve the calling application).
>

Typically, for places where cpu_relax is used, sched_yield would be
a poor fit. So yes it could easily reduce performance.

> Also, that call should be put inside the idle loop, not for cpu_relax, which
> is very different, since it is used (for instance) in kernel/spinlock.c for
> spinlocks, and in such things. The "Pause" opcode is explicitly recommended
> (by Intel manuals, I don't recall why) for things like spinlock loops, and
> using yield there would be bad.
>

The other thing is that sched_yield won't relax at all if you are the
only thing running, it will be a busy wait. So again, maybe not a great
fit for the idle loop either.

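The busy-wait remark above can be illustrated with a small userspace sketch, assuming a POSIX host. The helper name poll_with_yield is hypothetical; only sched_yield() from <sched.h> is a real call. The point is that sched_yield() never sleeps: it returns to the caller, so a loop built around it keeps polling, and when no other task is runnable it simply spins through the system call.

```c
#include <sched.h>

/*
 * Hypothetical helper: poll *flag, yielding between polls, for at most
 * max_polls iterations. Returns the number of polls performed.
 */
static long poll_with_yield(const volatile int *flag, long max_polls)
{
	long polls = 0;

	while (polls < max_polls && !*flag) {
		/* Does not block; with nothing else runnable the caller
		 * is rescheduled immediately and the loop goes on. */
		sched_yield();
		polls++;
	}
	return polls;
}
```

If the flag is never set, the helper runs all max_polls iterations without ever sleeping, which is the behaviour that makes it a questionable replacement for an idle loop that is meant to give the host CPU back.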