* Re: porting oprofile to ppc
2003-02-28 23:27 porting oprofile to ppc Albert Cahalan
@ 2003-03-01 0:22 ` John Levon
2003-03-03 1:49 ` Segher Boessenkool
2003-03-04 16:36 ` AW: " Oliver Oppitz
2 siblings, 0 replies; 7+ messages in thread
From: John Levon @ 2003-03-01 0:22 UTC (permalink / raw)
To: Albert Cahalan; +Cc: oprofile-list, linuxppc-dev
On Fri, Feb 28, 2003 at 06:27:15PM -0500, Albert Cahalan wrote:
> I run both 2.4.xx and 2.5.xx kernels, compiled from
> source. Neither one has any performance monitoring
> hooks ready to use. I could add them. What is needed?
I suggest you start with 2.5 kernels. ppc64 has some oprofile code in
the kernel: the first step is to copy that code into ppc32 and make it
work. That will give you the basic timer interrupt functionality.
> I notice that RTC support conflicts with the /dev/rtc
> driver. Couldn't it use the driver? Sometimes the RTC
No.
> I'm not a Qt fan. Can I avoid it? All my stuff is GNOME,
> plain X11, or non-GUI. Somehow libqt.so.2.3.1 did get
> installed though.
It's completely unnecessary for you to deal with, in any way.
> The 7400 chip additionally gives me a set of performance
> monitoring registers, with read-only access from user code.
> There are four counters, PMC1 to PMC4, and control registers.
> I can freeze the counters in kernel mode, in user mode,
> and according to a flag that may be used to mark a process.
The crucial thing for perfctr support is that it is able to provide an
interrupt when it overflows. Then you (in-kernel) call
oprofile_add_sample() to record the EIP etc.
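To illustrate the overflow-driven sampling described above, here is a minimal user-space simulation, not kernel code. It assumes a 32-bit counter that interrupts when it "goes negative" (most significant bit set), as the original post describes for the PMCs. The names preload_for, run and record_sample are hypothetical; record_sample merely stands in for the oprofile_add_sample() call, whose real signature is not reproduced here.

```python
# Simulation only: models a 32-bit PMC that raises an interrupt when it
# "goes negative" (most significant bit set). All names are hypothetical.
OVERFLOW_BIT = 0x80000000
MASK32 = 0xFFFFFFFF


def preload_for(period):
    """Start value that makes the counter go negative after `period` events."""
    if not 0 < period <= OVERFLOW_BIT:
        raise ValueError("period must be in 1..2**31")
    return OVERFLOW_BIT - period


def run(trace, period, record_sample):
    """Feed one event per program-counter value in `trace`.

    `record_sample` stands in for the in-kernel sample-recording call.
    """
    counter = preload_for(period)
    for pc in trace:
        counter = (counter + 1) & MASK32
        if counter & OVERFLOW_BIT:         # the overflow interrupt would fire here
            record_sample(pc)
            counter = preload_for(period)  # re-arm for the next sample


samples = []
run(trace=range(0x1000, 0x1000 + 40, 4), period=3, record_sample=samples.append)
print([hex(pc) for pc in samples])
```

With a period of 3, every third event in the trace is sampled. A real handler would read the interrupted instruction address from the saved registers instead of taking it from a list.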
You may need to tweak certain details of how the perfctrs are
represented in oprofile userspace, but this is something we can work
through. Your first step is to get the ppc64-style support going and
working properly.
Hope that helps,
john
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
* Re: porting oprofile to ppc
2003-02-28 23:27 porting oprofile to ppc Albert Cahalan
2003-03-01 0:22 ` John Levon
@ 2003-03-03 1:49 ` Segher Boessenkool
2003-03-03 8:14 ` Albert Cahalan
2003-03-04 16:36 ` AW: " Oliver Oppitz
2 siblings, 1 reply; 7+ messages in thread
From: Segher Boessenkool @ 2003-03-03 1:49 UTC (permalink / raw)
To: Albert Cahalan; +Cc: oprofile-list, linuxppc-dev
Albert Cahalan wrote:
> I'm considering a port to the MPC7400 ("G4") PowerPC.
> This is out of desperation, since there isn't anything
> beyond gprof available for Linux/ppc users.
Great to hear someone's willing to work on this!
I currently use the following hack to use the pmc's:
I have a trivial kernel module that accepts as parameters
the events to count on each pmc (like, insmod pmc.o 1 2 3 4),
sets the PMCn and MMCRn regs, and fails to load. This sets
the counters running, and then I instrument the program
to be profiled to read the PMC's at interesting program
locations. I wrote this years ago and never got around
to finishing any better tools.
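The "read the PMCs at interesting program locations" approach boils down to taking a difference of two counter reads. A minimal sketch follows, assuming the counters are 32 bits wide and wrap at most once between reads. read_pmc is a hypothetical callable standing in for a user-mode register read, and the fake counter below exists only for demonstration.

```python
MASK32 = 0xFFFFFFFF  # assumption: the PMCs are 32-bit registers


def pmc_delta(before, after):
    """Events counted between two reads, tolerating one 32-bit wraparound."""
    return (after - before) & MASK32


def measure(read_pmc, region):
    """Run `region` and return how far the counter advanced.

    `read_pmc` is a hypothetical stand-in for reading one PMC register.
    """
    start = read_pmc()
    region()
    return pmc_delta(start, read_pmc())


# Fake counter for demonstration: starts near the top so that it wraps.
state = {"pmc": 0xFFFFFFF0}


def fake_read():
    return state["pmc"]


def fake_region():
    state["pmc"] = (state["pmc"] + 0x20) & MASK32


print(measure(fake_read, fake_region))
```

Masking the difference to 32 bits keeps the result correct even when the second read is numerically smaller than the first.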
> I could use some advice. Where do I even start?
> Anybody else doing this or interested in helping?
I'll answer any questions you have -- feel free to email
me in private about this.
> On the 7xx and 74xx chips, I get a user-readable 64-bit
> counter that ticks at 1/16 of the memory bus clock. So on
You are talking about the time base? It runs at 1/4th
the cpu clock. It can be disabled by means of a hardware
pin (GPIO9 on most Mac's); default is running and that's
just what you want, I think ;)
> my 450 MHz Mac with a 100 MHz bus, it ticks at 6.25 MHz.
25MHz.
> There's also a privileged 32-bit count-down register that
> gives an interrupt.
The decrementer; same frequency as the time base on all
G3 and G4 cpu's.
> There isn't a CPU core cycle counter,
> unless you have a 7400 (or above?) and are willing to
> devote a performance counter to that purpose.
Actually, all G3 and G4 cpu's have event 1 on all pmc's
as such a counter.
> The 7400 chip additionally gives me a set of performance
> monitoring registers, with read-only access from user code.
> There are four counters, PMC1 to PMC4, and control registers.
750 has four pmc's as well, 7450 has six of-em.
> I can freeze the counters in kernel mode, in user mode,
> and according to a flag that may be used to mark a process.
There's no mark flag on the 750.
> There's a threshold value for some of the performance
> counters, taking on values from 0..63 times 2 or 32.
> (0,2,4,...,124,126,128,160,192,...,1952,1984,2016)
> So for example, I could count loads that stall for more
> than 1952 ticks.
>
> I can enable counters PMC2..PMC4 when PMC1 goes negative.
> I can freeze all the counters (or cause an interrupt)
> when one of PMC2...PMC4 goes negative.
Or both; the most useful mode, imho.
> There are ways for external hardware to mask counting or
> interrupt generation. I'm not about to solder a button
> onto my CPU for this, but I guess it should be supported.
Luckily, GPIO8 is just what you need. It isn't all that
useful, though. [Beware: always check the device tree
for the exact gpio number on your box -- this can vary].
> All four counters can count:
>
> core cycles
> completed instructions, excluding folded branches
> memory cycles divided by 32, 8k, 128k, or 2M
This one should read "time base ticks".
> instructions dispatched (0, 1, or 2 per core cycle)
>
> Then of course each register has a selection of other
> choices. Of interest:
>
> instruction breakpoint matches, with a bit mask
> (could be abused to count system calls or interrupts)
> various cache things, loads, stores, etc.
>
> There must be 60 to 240 choices, depending on how one
> counts duplicates.
Lots and lots more combo's, although not all combinations
are all that useful ;)
Also, all cpu's have different event assignments (and
different MMCRn registers, too).
Good luck and have fun,
Segher
* Re: porting oprofile to ppc
2003-03-03 1:49 ` Segher Boessenkool
@ 2003-03-03 8:14 ` Albert Cahalan
2003-03-03 8:28 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 7+ messages in thread
From: Albert Cahalan @ 2003-03-03 8:14 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: Albert Cahalan, oprofile-list, linuxppc-dev
On Sun, 2003-03-02 at 20:49, Segher Boessenkool wrote:
> Albert Cahalan wrote:
>> I'm considering a port to the MPC7400 ("G4") PowerPC.
>> This is out of desperation, since there isn't anything
>> beyond gprof available for Linux/ppc users.
>
> Great to hear someone's willing to work on this!
Sort of. "feeling dragged into" is more like it.
Porting oprofile looks easier than porting valgrind
and kcachegrind, and nobody else seems to care
about PowerPC.
> I currently use the following hack to use the pmc's:
> I have a trivial kernel module that accepts as parameters
> the events to count on each pmc (like, insmod pmc.o 1 2 3 4),
> sets the PMCn and MMCRn regs, and fails to load. This sets
> the counters running, and then I instrument the program
> to be profiled to read the PMC's at interesting program
> locations. I wrote this years ago and never got around
> to finishing any better tools.
Oooh... you shouldn't have said that. I might be
mostly satisfied. Right now I'm using a nasty hack
involving "gcc -finstrument-functions" and "nm".
I guess I most want instruction pointer (NIP) sampling
at irregular intervals, or at least not tied to the
clock tick. I could see using some random pre-initialized
profiling counter ("42 foo from now") to get this. I could
hook into the regular external interrupt and rely on USB
traffic to do my sampling. >:-)
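One way to get sampling at irregular intervals is to vary the counter preload from sample to sample. The sketch below assumes a counter that interrupts when its most significant bit becomes set, so a period of N events corresponds to a preload of 0x80000000 - N. The function name and the base/jitter parameters are hypothetical.

```python
import random

OVERFLOW_BIT = 0x80000000  # assumption: interrupt when the counter goes negative


def jittered_preloads(base, jitter, count, seed=None):
    """Yield `count` preload values whose periods vary in base +/- jitter."""
    if jitter >= base:
        raise ValueError("jitter must be smaller than base")
    rng = random.Random(seed)
    for _ in range(count):
        period = base + rng.randint(-jitter, jitter)
        yield OVERFLOW_BIT - period


preloads = list(jittered_preloads(base=10000, jitter=2500, count=5, seed=42))
periods = [OVERFLOW_BIT - p for p in preloads]
print(periods)
```

Randomizing the period avoids sampling in lockstep with periodic activity such as the clock tick.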
> You are talking about the time base? It runs at 1/4th
> the cpu clock. It can be disabled by means of a hardware
> pin (GPIO9 on most Mac's); default is running and that's
> just what you want, I think ;)
Yes. I wonder where I got the 1/16 idea.
* Re: porting oprofile to ppc
2003-03-03 8:14 ` Albert Cahalan
@ 2003-03-03 8:28 ` Benjamin Herrenschmidt
2003-03-04 3:44 ` Segher Boessenkool
0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2003-03-03 8:28 UTC (permalink / raw)
To: Albert Cahalan; +Cc: Segher Boessenkool, oprofile-list, linuxppc-dev
On Mon, 2003-03-03 at 09:14, Albert Cahalan wrote:
> > You are talking about the time base? It runs at 1/4th
> > the cpu clock. It can be disabled by means of a hardware
> > pin (GPIO9 on most Mac's); default is running and that's
> > just what you want, I think ;)
>
> Yes. I wonder where I got the 1/16 idea.
Isn't it rather the bus clock?
Ben.
* Re: porting oprofile to ppc
2003-03-03 8:28 ` Benjamin Herrenschmidt
@ 2003-03-04 3:44 ` Segher Boessenkool
0 siblings, 0 replies; 7+ messages in thread
From: Segher Boessenkool @ 2003-03-04 3:44 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Albert Cahalan, oprofile-list, linuxppc-dev
Benjamin Herrenschmidt wrote:
> On Mon, 2003-03-03 at 09:14, Albert Cahalan wrote:
>
>
>>>You are talking about the time base? It runs at 1/4th
>>>the cpu clock. It can be disabled by means of a hardware
>>>pin (GPIO9 on most Mac's); default is running and that's
>>>just what you want, I think ;)
>>
>>Yes. I wonder where I got the 1/16 idea.
>
> Isn't it rather the bus clock?
Yes, the cpu bus clock. Sorry about the confusion.
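The arithmetic behind the corrected figure can be checked quickly. This sketch uses only the numbers given in the thread (a 100 MHz bus, a time base at 1/4 of the bus clock, a 32-bit decrementer at the same rate); the variable names are illustrative.

```python
from fractions import Fraction

BUS_HZ = 100_000_000   # 100 MHz bus, from the original post
TB_DIVIDER = 4         # time base = bus clock / 4, per this thread

tb_hz = Fraction(BUS_HZ, TB_DIVIDER)
print(f"time base: {float(tb_hz) / 1e6} MHz")    # 25.0 MHz
print(f"1/16 guess: {BUS_HZ / 16 / 1e6} MHz")    # 6.25 MHz

# The decrementer is 32 bits wide and ticks at the same rate, so one
# full cycle through all 32-bit values takes 2**32 ticks.
dec_wrap_s = Fraction(2**32) / tb_hz
print(f"decrementer cycle: {float(dec_wrap_s):.1f} s")

# The 64-bit time base takes far longer to wrap.
tb_wrap_years = Fraction(2**64) / tb_hz / (365 * 24 * 3600)
print(f"time base cycle: {float(tb_wrap_years):.0f} years")
```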
Segher
* AW: porting oprofile to ppc
2003-02-28 23:27 porting oprofile to ppc Albert Cahalan
2003-03-01 0:22 ` John Levon
2003-03-03 1:49 ` Segher Boessenkool
@ 2003-03-04 16:36 ` Oliver Oppitz
2 siblings, 0 replies; 7+ messages in thread
From: Oliver Oppitz @ 2003-03-04 16:36 UTC (permalink / raw)
To: Albert Cahalan, oprofile-list; +Cc: linuxppc-dev
I have been playing around a bit with performance counters on an MPC7441
(eMac), even found an undocumented "issue" with the "retired instructions"
counter: the value is erroneously incremented by interrupts (+1 for each
time entering/leaving the interrupt handler), even though it was set up for
counting only user-mode instructions. I did these tests via some hijacked
system call (vm86) that I adapted to read/write the SPRs. It is not good style,
but it works for my purposes.
In any case, I would like to offer my help with the porting project. Maybe
testing to begin with, as I need to learn something about kernel modules
before being of any help with that...
Regards, Oliver
-----Original Message-----
From: owner-linuxppc-dev@lists.linuxppc.org
[mailto:owner-linuxppc-dev@lists.linuxppc.org] On Behalf Of Albert
Cahalan
Sent: Saturday, March 1, 2003 00:27
To: oprofile-list@lists.sourceforge.net
Cc: linuxppc-dev@lists.linuxppc.org
Subject: porting oprofile to ppc
I'm considering a port to the MPC7400 ("G4") PowerPC.
This is out of desperation, since there isn't anything
beyond gprof available for Linux/ppc users.
I could use some advice. Where do I even start?
Anybody else doing this or interested in helping?
I run both 2.4.xx and 2.5.xx kernels, compiled from
source. Neither one has any performance monitoring
hooks ready to use. I could add them. What is needed?
I notice that RTC support conflicts with the /dev/rtc
driver. Couldn't it use the driver? Sometimes the RTC
is available via memory-mapped IO, and sometimes the
RTC is emulated by the /dev/rtc driver. Even on x86 you
need the /dev/rtc driver to safely set the clock with SMP.
I'm not a Qt fan. Can I avoid it? All my stuff is GNOME,
plain X11, or non-GUI. Somehow libqt.so.2.3.1 did get
installed though.
On the 7xx and 74xx chips, I get a user-readable 64-bit
counter that ticks at 1/16 of the memory bus clock. So on
my 450 MHz Mac with a 100 MHz bus, it ticks at 6.25 MHz.
There's also a privileged 32-bit count-down register that
gives an interrupt. There isn't a CPU core cycle counter,
unless you have a 7400 (or above?) and are willing to
devote a performance counter to that purpose.
The 7400 chip additionally gives me a set of performance
monitoring registers, with read-only access from user code.
There are four counters, PMC1 to PMC4, and control registers.
I can freeze the counters in kernel mode, in user mode,
and according to a flag that may be used to mark a process.
There's a threshold value for some of the performance
counters, taking on values from 0..63 times 2 or 32.
(0,2,4,...,124,126,128,160,192,...,1952,1984,2016)
So for example, I could count loads that stall for more
than 1952 ticks.
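The threshold list above can be reproduced by enumerating a 6-bit value (0..63) under each of the two multipliers. This is a small sketch based only on the description in this message; the function name is illustrative.

```python
def threshold_values():
    """All distinct thresholds: a 6-bit value (0..63) times 2 or times 32."""
    return sorted({n * mult for n in range(64) for mult in (2, 32)})


values = threshold_values()
print(len(values), values[:4], values[62:66], values[-3:])
```

The two multipliers overlap at 0, 32, 64 and 96, so there are fewer than 128 distinct thresholds. The x2 range covers everything up to 126 in steps of 2, and the x32 range continues from 128 to 2016 in steps of 32.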
I can enable counters PMC2..PMC4 when PMC1 goes negative.
I can freeze all the counters (or cause an interrupt)
when one of PMC2...PMC4 goes negative.
There are ways for external hardware to mask counting or
interrupt generation. I'm not about to solder a button
onto my CPU for this, but I guess it should be supported.
All four counters can count:
core cycles
completed instructions, excluding folded branches
memory cycles divided by 32, 8k, 128k, or 2M
instructions dispatched (0, 1, or 2 per core cycle)
Then of course each register has a selection of other
choices. Of interest:
instruction breakpoint matches, with a bit mask
(could be abused to count system calls or interrupts)
various cache things, loads, stores, etc.
There must be 60 to 240 choices, depending on how one
counts duplicates.