* [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS
@ 2010-09-30 13:35 Deng-Cheng Zhu
2010-10-01 21:45 ` David Daney
0 siblings, 1 reply; 5+ messages in thread
From: Deng-Cheng Zhu @ 2010-09-30 13:35 UTC (permalink / raw)
To: linux-mips, ralf, ddaney, a.p.zijlstra, paulus, mingo, acme
(Directing this patch to Perf-events maintainers for review.)
With the kernel facility of Linux performance counters, we want the user
level tool tools/perf to be cross compiled for MIPS platform. To do this,
we need to include unistd.h, add rmb() and cpu_relax() in perf.h.
Your review comments are especially required for the definition of rmb():
In perf.h, we need to have a proper rmb() for _all_ MIPS platforms. And
we don't have CONFIG_* things for use in here. Looking at barrier.h,
rmb() goes into barrier() and __sync() for CAVIUM OCTEON and other CPUs,
respectively. What's more, __sync() has different versions as well.
Referring to BARRIER() in dump_tlb.c, I propose the "common" definition
for perf tool rmb() in this patch. Do you have any comments?
In addition, for testing the kernel part code I sent several days
ago, I was using the "particular" rmb() version for 24K/34K/74K cores:
#define rmb() asm volatile( \
".set push\n\t" \
".set noreorder\n\t" \
".set mips2\n\t" \
"sync\n\t" \
".set pop" \
: /* no output */ \
: /* no input */ \
: "memory")
This is the definition of __sync() for CONFIG_CPU_HAS_SYNC.
Thanks,
Deng-Cheng
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
---
tools/perf/perf.h | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 6fb379b..cd05284 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -73,6 +73,18 @@
#define cpu_relax() asm volatile("":::"memory")
#endif
+#ifdef __mips__
+#include "../../arch/mips/include/asm/unistd.h"
+#define rmb() asm volatile( \
+ ".set noreorder\n\t" \
+ "nop;nop;nop;nop;nop;nop;nop\n\t" \
+ ".set reorder" \
+ : /* no output */ \
+ : /* no input */ \
+ : "memory")
+#define cpu_relax() asm volatile("" ::: "memory")
+#endif
+
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS
2010-09-30 13:35 [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS Deng-Cheng Zhu
@ 2010-10-01 21:45 ` David Daney
2010-10-02 1:59 ` Ralf Baechle
0 siblings, 1 reply; 5+ messages in thread
From: David Daney @ 2010-10-01 21:45 UTC (permalink / raw)
To: Deng-Cheng Zhu; +Cc: linux-mips, ralf, a.p.zijlstra, paulus, mingo, acme
On 09/30/2010 06:35 AM, Deng-Cheng Zhu wrote:
>
>
> (Directing this patch to Perf-events maintainers for review.)
>
> With the kernel facility of Linux performance counters, we want the user
> level tool tools/perf to be cross compiled for MIPS platform. To do this,
> we need to include unistd.h, add rmb() and cpu_relax() in perf.h.
>
> Your review comments are especially required for the definition of rmb():
> In perf.h, we need to have a proper rmb() for _all_ MIPS platforms. And
> we don't have CONFIG_* things for use in here. Looking at barrier.h,
> rmb() goes into barrier() and __sync() for CAVIUM OCTEON and other CPUs,
> respectively. What's more, __sync() has different versions as well.
> Referring to BARRIER() in dump_tlb.c, I propose the "common" definition
> for perf tool rmb() in this patch. Do you have any comments?
>
In fact I do.
In user space the rmb() must expand to a SYNC instruction. I am not
sure what your version in the patch is doing with all those NOPs. That
is not guaranteed to do anything.
The instruction set specifications say that SYNC orders all loads and
stores. This is a heaver operation than rmb() demands, but is the only
universally available instruction that imposes ordering.
For processors that do not support SYNC, the kernel will emulate it, so
it is safe to use in userspace. I wouldn't worry about emulation
overhead though, because processors that lack SYNC probably also lack
performance counters, so are not as interesting from a perf-tool point
of view.
David Daney
> In addition, for testing the kernel part code I sent several days
> ago, I was using the "particular" rmb() version for 24K/34K/74K cores:
>
> #define rmb() asm volatile( \
> ".set push\n\t" \
> ".set noreorder\n\t" \
> ".set mips2\n\t" \
> "sync\n\t" \
> ".set pop" \
> : /* no output */ \
> : /* no input */ \
> : "memory")
>
> This is the definition of __sync() for CONFIG_CPU_HAS_SYNC.
>
>
> Thanks,
>
> Deng-Cheng
>
> Signed-off-by: Deng-Cheng Zhu<dengcheng.zhu@gmail.com>
> ---
> tools/perf/perf.h | 12 ++++++++++++
> 1 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index 6fb379b..cd05284 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -73,6 +73,18 @@
> #define cpu_relax() asm volatile("":::"memory")
> #endif
>
> +#ifdef __mips__
> +#include "../../arch/mips/include/asm/unistd.h"
> +#define rmb() asm volatile( \
> + ".set noreorder\n\t" \
> + "nop;nop;nop;nop;nop;nop;nop\n\t" \
> + ".set reorder" \
> + : /* no output */ \
> + : /* no input */ \
> + : "memory")
> +#define cpu_relax() asm volatile("" ::: "memory")
> +#endif
> +
> #include<time.h>
> #include<unistd.h>
> #include<sys/types.h>
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS
2010-10-01 21:45 ` David Daney
@ 2010-10-02 1:59 ` Ralf Baechle
2010-10-02 2:54 ` Deng-Cheng Zhu
0 siblings, 1 reply; 5+ messages in thread
From: Ralf Baechle @ 2010-10-02 1:59 UTC (permalink / raw)
To: David Daney; +Cc: Deng-Cheng Zhu, linux-mips, a.p.zijlstra, paulus, mingo, acme
On Fri, Oct 01, 2010 at 02:45:17PM -0700, David Daney wrote:
> In user space the rmb() must expand to a SYNC instruction. I am not
> sure what your version in the patch is doing with all those NOPs. That
> is not guaranteed to do anything.
That's a rather old version of the kernel rmb macro I think. The NOPs
where there to enforce ordering of a mix of cached and uncached accesses
on the R4400 (not R4000) where according to my reading the manual leaves
it a bit unclear if a SYNC is sufficient or if the pipeline needs to be
drained in addition. See version 2 of the R4000/R4400 User's Manual.
> The instruction set specifications say that SYNC orders all loads and
> stores. This is a heaver operation than rmb() demands, but is the only
> universally available instruction that imposes ordering.
>
> For processors that do not support SYNC, the kernel will emulate it, so
> it is safe to use in userspace. I wouldn't worry about emulation
> overhead though, because processors that lack SYNC probably also lack
> performance counters, so are not as interesting from a perf-tool point
> of view.
Yes, just use SYNC. SYNC-less processors would only be R2000/R3000
processors and a few other oddball processors which for performance
optimization are totally uninteresting since years.
Ralf
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS
2010-10-02 1:59 ` Ralf Baechle
@ 2010-10-02 2:54 ` Deng-Cheng Zhu
2010-10-09 5:39 ` Ralf Baechle
0 siblings, 1 reply; 5+ messages in thread
From: Deng-Cheng Zhu @ 2010-10-02 2:54 UTC (permalink / raw)
To: Ralf Baechle; +Cc: David Daney, linux-mips, a.p.zijlstra, paulus, mingo, acme
Thanks guys. So let's turn the patch into the following?
Signed-off-by: Deng-Cheng Zhu<dengcheng.zhu@gmail.com>
---
tools/perf/perf.h | 14 ++++++++++++++
1 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 6fb379b..cd05284 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -73,6 +73,20 @@
#define cpu_relax() asm volatile("":::"memory")
#endif
+#ifdef __mips__
+#include "../../arch/mips/include/asm/unistd.h"
+#define rmb() asm volatile( \
+ ".set push\n\t" \
+ ".set noreorder\n\t" \
+ ".set mips2\n\t" \
+ "sync\n\t" \
+ ".set pop" \
+ : /* no output */ \
+ : /* no input */ \
+ : "memory")
+#define cpu_relax() asm volatile("" ::: "memory")
+#endif
+
#include<time.h>
#include<unistd.h>
#include<sys/types.h>
On 2010-10-2 9:59, Ralf Baechle wrote:
> On Fri, Oct 01, 2010 at 02:45:17PM -0700, David Daney wrote:
>
>> In user space the rmb() must expand to a SYNC instruction. I am not
>> sure what your version in the patch is doing with all those NOPs. That
>> is not guaranteed to do anything.
> That's a rather old version of the kernel rmb macro I think. The NOPs
> where there to enforce ordering of a mix of cached and uncached accesses
> on the R4400 (not R4000) where according to my reading the manual leaves
> it a bit unclear if a SYNC is sufficient or if the pipeline needs to be
> drained in addition. See version 2 of the R4000/R4400 User's Manual.
>
>> The instruction set specifications say that SYNC orders all loads and
>> stores. This is a heaver operation than rmb() demands, but is the only
>> universally available instruction that imposes ordering.
>>
>> For processors that do not support SYNC, the kernel will emulate it, so
>> it is safe to use in userspace. I wouldn't worry about emulation
>> overhead though, because processors that lack SYNC probably also lack
>> performance counters, so are not as interesting from a perf-tool point
>> of view.
> Yes, just use SYNC. SYNC-less processors would only be R2000/R3000
> processors and a few other oddball processors which for performance
> optimization are totally uninteresting since years.
>
> Ralf
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS
2010-10-02 2:54 ` Deng-Cheng Zhu
@ 2010-10-09 5:39 ` Ralf Baechle
0 siblings, 0 replies; 5+ messages in thread
From: Ralf Baechle @ 2010-10-09 5:39 UTC (permalink / raw)
To: Deng-Cheng Zhu; +Cc: David Daney, linux-mips, a.p.zijlstra, paulus, mingo, acme
On Sat, Oct 02, 2010 at 10:54:04AM +0800, Deng-Cheng Zhu wrote:
> Thanks guys. So let's turn the patch into the following?
>
> Signed-off-by: Deng-Cheng Zhu<dengcheng.zhu@gmail.com>
> ---
> tools/perf/perf.h | 14 ++++++++++++++
> 1 files changed, 14 insertions(+), 0 deletions(-)
>
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index 6fb379b..cd05284 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -73,6 +73,20 @@
> #define cpu_relax() asm volatile("":::"memory")
> #endif
>
> +#ifdef __mips__
> +#include "../../arch/mips/include/asm/unistd.h"
> +#define rmb() asm volatile( \
> + ".set push\n\t" \
> + ".set noreorder\n\t" \
> + ".set mips2\n\t" \
> + "sync\n\t" \
> + ".set pop" \
> + : /* no output */ \
> + : /* no input */ \
> + : "memory")
> +#define cpu_relax() asm volatile("" ::: "memory")
> +#endif
Yes, aside of cosmetic issues this is looking good. The cosmetic issue
is that there are lots of useless dot-ops in the inline assembler. That
could be reduced to just .set mips2 before the SYNC and .set mips0 after.
There are some considerations wrt. to the MT ASE so I've bounced a few
emails around in the meantime but that's nothing that should stop merging
this patch with the rest of the perf patches.
Ralf
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-10-09 5:39 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-30 13:35 [PATCH resend] Perf-tool/MIPS: support cross compiling of tools/perf for MIPS Deng-Cheng Zhu
2010-10-01 21:45 ` David Daney
2010-10-02 1:59 ` Ralf Baechle
2010-10-02 2:54 ` Deng-Cheng Zhu
2010-10-09 5:39 ` Ralf Baechle
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.