* [RFC, PATCH 0/4] Add support to OProfile for profiling Cell BE SPUs -- update @ 2007-01-29 19:45 Maynard Johnson 2007-01-29 19:46 ` [RFC, PATCH 1/4] " Maynard Johnson ` (4 more replies) 0 siblings, 5 replies; 43+ messages in thread From: Maynard Johnson @ 2007-01-29 19:45 UTC (permalink / raw) To: cbe-oss-dev, linux-kernel, linuxppc-dev, oprofile-list On December 14, 2006, I posted a patch that added support to the OProfile kernel driver for profiling Cell SPUs. There have been some changes/fixes to this patch since the original posting (including forward porting from 2.6.18-based kernel to 2.6.20-rc1), so I am reposting the patch for review now. This patch relies upon the following patches that have not been accepted yet: 1. oprofile cleanup patch (submitted on Nov 27) 2. Fix for PPU profiling (not submitted yet, since it depends on #1) 3. SPU task notification patch (last submitted on Jan 26) For those who may want to apply and build the oprofile SPU support patch, it would be necessary to first apply the above patches. For convenience, I will post all three of the above patches, along with the oprofile SPU support patch. Comments appreciated. Thank you. Maynard Johnson IBM LTC Toolchain ^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC, PATCH 1/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:45 [RFC, PATCH 0/4] Add support to OProfile for profiling Cell BE SPUs -- update Maynard Johnson @ 2007-01-29 19:46 ` Maynard Johnson 2007-01-30 4:07 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-30 10:39 ` Christoph Hellwig 2007-01-29 19:47 ` [RFC, PATCH 2/4] " Maynard Johnson ` (3 subsequent siblings) 4 siblings, 2 replies; 43+ messages in thread From: Maynard Johnson @ 2007-01-29 19:46 UTC (permalink / raw) To: cbe-oss-dev, linux-kernel, linuxppc-dev, oprofile-list [-- Attachment #1: Type: text/plain, Size: 2 bytes --] [-- Attachment #2: oprof-ppu-cleanup1.diff --] [-- Type: text/x-patch, Size: 12489 bytes --] This is a clean up patch that includes the following changes: -It removes some macro definitions that are only used once with the actual code. -Some comments were added to clarify the code based on feedback from the community. -The write_pm_cntrl() and set_count_mode() were passed a structure element from a global variable. The argument was removed so the functions now just operate on the global directly. -The set_pm_event() function call in the cell_virtual_cntr() routine was moved to a for-loop before the for_each_cpu loop Signed-off-by: Carl Love <carll@us.ibm.com> Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> Index: linux-2.6.20-rc1/arch/powerpc/oprofile/op_model_cell.c =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/oprofile/op_model_cell.c 2007-01-18 16:43:14.428510224 -0600 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/op_model_cell.c 2007-01-18 16:56:47.300605984 -0600 @@ -41,8 +41,12 @@ #define PPU_CYCLES_EVENT_NUM 1 /* event number for CYCLES */ #define CBE_COUNT_ALL_CYCLES 0x42800000 /* PPU cycle event specifier */ -#define NUM_THREADS 2 -#define VIRT_CNTR_SW_TIME_NS 100000000 // 0.5 seconds +#define NUM_THREADS 2 /* number of physical threads in + * physical processor + */ +#define NUM_TRACE_BUS_WORDS 4 +#define NUM_INPUT_BUS_WORDS 2 + struct pmc_cntrl_data { unsigned long vcntr; @@ -94,14 +98,6 @@ } pm_regs; -#define GET_SUB_UNIT(x) ((x & 0x0000f000) >> 12) -#define GET_BUS_WORD(x) ((x & 0x000000f0) >> 4) -#define GET_BUS_TYPE(x) ((x & 0x00000300) >> 8) -#define GET_POLARITY(x) ((x & 0x00000002) >> 1) -#define GET_COUNT_CYCLES(x) (x & 0x00000001) -#define GET_INPUT_CONTROL(x) ((x & 0x00000004) >> 2) - - static DEFINE_PER_CPU(unsigned long[NR_PHYS_CTRS], pmc_values); static struct pmc_cntrl_data pmc_cntrl[NUM_THREADS][NR_PHYS_CTRS]; @@ -129,8 +125,8 @@ static u32 ctr_enabled; -static unsigned char trace_bus[4]; -static unsigned char input_bus[2]; +static unsigned char trace_bus[NUM_TRACE_BUS_WORDS]; +static unsigned char input_bus[NUM_INPUT_BUS_WORDS]; /* * Firmware interface functions @@ -183,7 +179,8 @@ for (j = 0; j < count; j++) { /* fw expects physical cpu # */ pm_signal_local[j].cpu = node; - pm_signal_local[j].signal_group = pm_signal[j].signal_group; + pm_signal_local[j].signal_group + = pm_signal[j].signal_group; pm_signal_local[j].bus_word = pm_signal[j].bus_word; pm_signal_local[j].sub_unit = pm_signal[j].sub_unit; pm_signal_local[j].bit = pm_signal[j].bit; @@ -221,24 +218,32 @@ pm_regs.pm07_cntrl[ctr] = 0; } - bus_word = GET_BUS_WORD(unit_mask); - bus_type = GET_BUS_TYPE(unit_mask); - count_cycles = GET_COUNT_CYCLES(unit_mask); - polarity = GET_POLARITY(unit_mask); - input_control = GET_INPUT_CONTROL(unit_mask); + bus_word = (unit_mask & 0x000000f0) >> 4; + bus_type = (unit_mask & 0x00000300) >> 8; + count_cycles = unit_mask & 0x00000001; + polarity = (unit_mask & 0x00000002) >> 1; + input_control = (unit_mask & 0x00000004) >> 2; signal_bit = (event % 100); p = &(pm_signal[ctr]); p->signal_group = event / 100; p->bus_word = bus_word; - p->sub_unit = unit_mask & 0x0000f000; + p->sub_unit = (unit_mask & 0x0000f000) >> 12; pm_regs.pm07_cntrl[ctr] = 0; - pm_regs.pm07_cntrl[ctr] |= PM07_CTR_COUNT_CYCLES(count_cycles); - pm_regs.pm07_cntrl[ctr] |= PM07_CTR_POLARITY(polarity); - pm_regs.pm07_cntrl[ctr] |= PM07_CTR_INPUT_CONTROL(input_control); - + pm_regs.pm07_cntrl[ctr] |= (count_cycles & 1) << 23; + pm_regs.pm07_cntrl[ctr] |= (polarity & 1) << 24; + pm_regs.pm07_cntrl[ctr] |= (input_control & 1) << 25; + + /* Some of the islands signal selection is based on 64 bit words. + * The debug bus words are 32 bits, the input words to the performance + * counters are defined as 32 bits. Need to convert the 64 bit island + * specification to the appropriate 32 input bit and bus word for the + * performance counter event selection. See the CELL Performance + * monitoring signals manual and the Perf cntr hardware descriptions + * for the details. + */ if (input_control == 0) { if (signal_bit > 31) { signal_bit -= 32; @@ -253,18 +258,18 @@ if ((bus_type == 1) && p->signal_group >= 50) bus_type = 0; - pm_regs.pm07_cntrl[ctr] |= PM07_CTR_INPUT_MUX(signal_bit); + pm_regs.pm07_cntrl[ctr] |= (signal_bit & 0x3F) << 26; } else { pm_regs.pm07_cntrl[ctr] = 0; p->bit = signal_bit; } - for (i = 0; i < 4; i++) { + for (i = 0; i < NUM_TRACE_BUS_WORDS; i++) { if (bus_word & (1 << i)) { pm_regs.debug_bus_control |= (bus_type << (31 - (2 * i) + 1)); - for (j = 0; j < 2; j++) { + for (j = 0; j < NUM_INPUT_BUS_WORDS; j++) { if (input_bus[j] == 0xff) { input_bus[j] = i; pm_regs.group_control |= @@ -278,52 +283,58 @@ ; } -static void write_pm_cntrl(int cpu, struct pm_cntrl *pm_cntrl) +static void write_pm_cntrl(int cpu) { - /* Oprofile will use 32 bit counters, set bits 7:10 to 0 */ + /* Oprofile will use 32 bit counters, set bits 7:10 to 0 + * pmregs.pm_cntrl is a global + */ + u32 val = 0; - if (pm_cntrl->enable == 1) + if (pm_regs.pm_cntrl.enable == 1) val |= CBE_PM_ENABLE_PERF_MON; - if (pm_cntrl->stop_at_max == 1) + if (pm_regs.pm_cntrl.stop_at_max == 1) val |= CBE_PM_STOP_AT_MAX; - if (pm_cntrl->trace_mode == 1) - val |= CBE_PM_TRACE_MODE_SET(pm_cntrl->trace_mode); + if (pm_regs.pm_cntrl.trace_mode == 1) + val |= CBE_PM_TRACE_MODE_SET(pm_regs.pm_cntrl.trace_mode); - if (pm_cntrl->freeze == 1) + if (pm_regs.pm_cntrl.freeze == 1) val |= CBE_PM_FREEZE_ALL_CTRS; /* Routine set_count_mode must be called previously to set * the count mode based on the user selection of user and kernel. */ - val |= CBE_PM_COUNT_MODE_SET(pm_cntrl->count_mode); + val |= CBE_PM_COUNT_MODE_SET(pm_regs.pm_cntrl.count_mode); cbe_write_pm(cpu, pm_control, val); } static inline void -set_count_mode(u32 kernel, u32 user, struct pm_cntrl *pm_cntrl) +set_count_mode(u32 kernel, u32 user) { /* The user must specify user and kernel if they want them. If - * neither is specified, OProfile will count in hypervisor mode + * neither is specified, OProfile will count in hypervisor mode. + * pm_regs.pm_cntrl is a global */ if (kernel) { if (user) - pm_cntrl->count_mode = CBE_COUNT_ALL_MODES; + pm_regs.pm_cntrl.count_mode = CBE_COUNT_ALL_MODES; else - pm_cntrl->count_mode = CBE_COUNT_SUPERVISOR_MODE; + pm_regs.pm_cntrl.count_mode = + CBE_COUNT_SUPERVISOR_MODE; } else { if (user) - pm_cntrl->count_mode = CBE_COUNT_PROBLEM_MODE; + pm_regs.pm_cntrl.count_mode = CBE_COUNT_PROBLEM_MODE; else - pm_cntrl->count_mode = CBE_COUNT_HYPERVISOR_MODE; + pm_regs.pm_cntrl.count_mode = + CBE_COUNT_HYPERVISOR_MODE; } } static inline void enable_ctr(u32 cpu, u32 ctr, u32 * pm07_cntrl) { - pm07_cntrl[ctr] |= PM07_CTR_ENABLE(1); + pm07_cntrl[ctr] |= CBE_PM_CTR_ENABLE; cbe_write_pm07_control(cpu, ctr, pm07_cntrl[ctr]); } @@ -365,6 +376,14 @@ hdw_thread = 1 ^ hdw_thread; next_hdw_thread = hdw_thread; + for (i = 0; i < num_counters; i++) + /* There are some per thread events. Must do the + * set event, for the thread that is being started + */ + set_pm_event(i, + pmc_cntrl[next_hdw_thread][i].evnts, + pmc_cntrl[next_hdw_thread][i].masks); + /* The following is done only once per each node, but * we need cpu #, not node #, to pass to the cbe_xxx functions. */ @@ -385,12 +404,13 @@ == 0xFFFFFFFF) /* If the cntr value is 0xffffffff, we must * reset that to 0xfffffff0 when the current - * thread is restarted. This will generate a new - * interrupt and make sure that we never restore - * the counters to the max value. If the counters - * were restored to the max value, they do not - * increment and no interrupts are generated. Hence - * no more samples will be collected on that cpu. + * thread is restarted. This will generate a + * new interrupt and make sure that we never + * restore the counters to the max value. If + * the counters were restored to the max value, + * they do not increment and no interrupts are + * generated. Hence no more samples will be + * collected on that cpu. */ cbe_write_ctr(cpu, i, 0xFFFFFFF0); else @@ -410,9 +430,6 @@ * Must do the set event, enable_cntr * for each cpu. */ - set_pm_event(i, - pmc_cntrl[next_hdw_thread][i].evnts, - pmc_cntrl[next_hdw_thread][i].masks); enable_ctr(cpu, i, pm_regs.pm07_cntrl); } else { @@ -465,8 +482,7 @@ pm_regs.pm_cntrl.trace_mode = 0; pm_regs.pm_cntrl.freeze = 1; - set_count_mode(sys->enable_kernel, sys->enable_user, - &pm_regs.pm_cntrl); + set_count_mode(sys->enable_kernel, sys->enable_user); /* Setup the thread 0 events */ for (i = 0; i < num_ctrs; ++i) { @@ -498,10 +514,10 @@ pmc_cntrl[1][i].vcntr = i; } - for (i = 0; i < 4; i++) + for (i = 0; i < NUM_INPUT_BUS_WORDS; i++) trace_bus[i] = 0xff; - for (i = 0; i < 2; i++) + for (i = 0; i < NUM_INPUT_BUS_WORDS; i++) input_bus[i] = 0xff; /* Our counters count up, and "count" refers to @@ -560,7 +576,7 @@ cbe_write_pm(cpu, pm_start_stop, 0); cbe_write_pm(cpu, group_control, pm_regs.group_control); cbe_write_pm(cpu, debug_bus_control, pm_regs.debug_bus_control); - write_pm_cntrl(cpu, &pm_regs.pm_cntrl); + write_pm_cntrl(cpu); for (i = 0; i < num_counters; ++i) { if (ctr_enabled & (1 << i)) { @@ -602,7 +618,7 @@ } } - cbe_clear_pm_interrupts(cpu); + cbe_get_and_clear_pm_interrupts(cpu); cbe_enable_pm_interrupts(cpu, hdw_thread, interrupt_mask); cbe_enable_pm(cpu); } @@ -672,7 +688,7 @@ cbe_disable_pm(cpu); - interrupt_mask = cbe_clear_pm_interrupts(cpu); + interrupt_mask = cbe_get_and_clear_pm_interrupts(cpu); /* If the interrupt mask has been cleared, then the virt cntr * has cleared the interrupt. When the thread that generated Index: linux-2.6.20-rc1/arch/powerpc/platforms/cell/pmu.c =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/platforms/cell/pmu.c 2007-01-18 16:43:14.319526792 -0600 +++ linux-2.6.20-rc1/arch/powerpc/platforms/cell/pmu.c 2007-01-18 16:49:58.319611848 -0600 @@ -345,18 +345,12 @@ * Enabling/disabling interrupts for the entire performance monitoring unit. */ -u32 cbe_query_pm_interrupts(u32 cpu) -{ - return cbe_read_pm(cpu, pm_status); -} -EXPORT_SYMBOL_GPL(cbe_query_pm_interrupts); - -u32 cbe_clear_pm_interrupts(u32 cpu) +u32 cbe_get_and_clear_pm_interrupts(u32 cpu) { /* Reading pm_status clears the interrupt bits. */ - return cbe_query_pm_interrupts(cpu); + return cbe_read_pm(cpu, pm_status); } -EXPORT_SYMBOL_GPL(cbe_clear_pm_interrupts); +EXPORT_SYMBOL_GPL(cbe_get_and_clear_pm_interrupts); void cbe_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask) { @@ -371,7 +365,7 @@ void cbe_disable_pm_interrupts(u32 cpu) { - cbe_clear_pm_interrupts(cpu); + cbe_get_and_clear_pm_interrupts(cpu); cbe_write_pm(cpu, pm_status, 0); } EXPORT_SYMBOL_GPL(cbe_disable_pm_interrupts); Index: linux-2.6.20-rc1/include/asm-powerpc/cell-pmu.h =================================================================== --- linux-2.6.20-rc1.orig/include/asm-powerpc/cell-pmu.h 2007-01-18 16:43:19.932605072 -0600 +++ linux-2.6.20-rc1/include/asm-powerpc/cell-pmu.h 2007-01-18 16:59:38.230579688 -0600 @@ -89,8 +89,7 @@ extern void cbe_enable_pm_interrupts(u32 cpu, u32 thread, u32 mask); extern void cbe_disable_pm_interrupts(u32 cpu); -extern u32 cbe_query_pm_interrupts(u32 cpu); -extern u32 cbe_clear_pm_interrupts(u32 cpu); +extern u32 cbe_get_and_clear_pm_interrupts(u32 cpu); extern void cbe_sync_irq(int node); /* Utility functions, macros */ @@ -103,11 +102,4 @@ #define CBE_COUNT_PROBLEM_MODE 2 #define CBE_COUNT_ALL_MODES 3 -/* Macros for the pm07_control registers. */ -#define PM07_CTR_INPUT_MUX(x) (((x) & 0x3F) << 26) -#define PM07_CTR_INPUT_CONTROL(x) (((x) & 1) << 25) -#define PM07_CTR_POLARITY(x) (((x) & 1) << 24) -#define PM07_CTR_COUNT_CYCLES(x) (((x) & 1) << 23) -#define PM07_CTR_ENABLE(x) (((x) & 1) << 22) - #endif /* __ASM_CELL_PMU_H__ */ ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 1/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:46 ` [RFC, PATCH 1/4] " Maynard Johnson @ 2007-01-30 4:07 ` Arnd Bergmann 2007-01-30 10:39 ` Christoph Hellwig 1 sibling, 0 replies; 43+ messages in thread From: Arnd Bergmann @ 2007-01-30 4:07 UTC (permalink / raw) To: cbe-oss-dev, maynardj; +Cc: linuxppc-dev, linux-kernel, oprofile-list On Monday 29 January 2007 20:46, Maynard Johnson wrote: > This is a clean up patch that includes the following changes: >=20 > =A0 =A0 =A0 =A0 -It removes some macro definitions that are only used once > =A0 =A0 =A0 =A0 =A0with the actual code. > =A0 =A0 =A0 =A0 -Some comments were added to clarify the code based on fe= edback > =A0 =A0 =A0 =A0 =A0from the community. > =A0 =A0 =A0 =A0 -The write_pm_cntrl() and set_count_mode() were passed a = structure > =A0 =A0 =A0 =A0 =A0element from a global variable. =A0The argument was re= moved so the > =A0 =A0 =A0 =A0 =A0functions now just operate on the global directly. > =A0 =A0 =A0 =A0 -The set_pm_event() function call in the cell_virtual_cnt= r() routine > =A0 =A0 =A0 =A0 =A0was moved to a for-loop before the for_each_cpu loop >=20 > Signed-off-by: Carl Love <carll@us.ibm.com> > Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> >=20 Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Just a small side note: Please give each of your patches a one-line summary in the subject of the email. I'm filing this one under: "cell: oprofile cleanup". It would also be good if you could use a mailer that sends out patches as inline, so you don't need to resort to using attachments. Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 1/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:46 ` [RFC, PATCH 1/4] " Maynard Johnson 2007-01-30 4:07 ` [Cbe-oss-dev] " Arnd Bergmann @ 2007-01-30 10:39 ` Christoph Hellwig 2007-01-30 22:49 ` Carl Love 1 sibling, 1 reply; 43+ messages in thread From: Christoph Hellwig @ 2007-01-30 10:39 UTC (permalink / raw) To: Maynard Johnson; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel On Mon, Jan 29, 2007 at 01:46:50PM -0600, Maynard Johnson wrote: > > I don't think the macro removal is helpful, getting rid of the names makes the code less readable to me. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 1/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 10:39 ` Christoph Hellwig @ 2007-01-30 22:49 ` Carl Love 2007-01-30 22:57 ` Benjamin Herrenschmidt 2007-01-30 22:59 ` Benjamin Herrenschmidt 0 siblings, 2 replies; 43+ messages in thread From: Carl Love @ 2007-01-30 22:49 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel Christoph: In our earlier work on the PPU profiling patch, Benjamin Herrenschmidt said that we should remove macros that are only used once and just put the actual code in. That is why the macros were removed. Carl Love On Tue, 2007-01-30 at 11:39 +0100, Christoph Hellwig wrote: > On Mon, Jan 29, 2007 at 01:46:50PM -0600, Maynard Johnson wrote: > > > > > > I don't think the macro removal is helpful, getting rid of the names > makes the code less readable to me. > _______________________________________________ > cbe-oss-dev mailing list > cbe-oss-dev@ozlabs.org > https://ozlabs.org/mailman/listinfo/cbe-oss-dev ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 1/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 22:49 ` Carl Love @ 2007-01-30 22:57 ` Benjamin Herrenschmidt 2007-01-31 8:47 ` Christoph Hellwig 2007-01-30 22:59 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 43+ messages in thread From: Benjamin Herrenschmidt @ 2007-01-30 22:57 UTC (permalink / raw) To: Carl Love; +Cc: linuxppc-dev, linux-kernel, oprofile-list, cbe-oss-dev On Tue, 2007-01-30 at 14:49 -0800, Carl Love wrote: > Christoph: > > In our earlier work on the PPU profiling patch, Benjamin Herrenschmidt > said that we should remove macros that are only used once and just put > the actual code in. That is why the macros were removed. Heh... there is a balance to be found... In some cases, inline functions might be better too. Ben. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 1/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 22:57 ` Benjamin Herrenschmidt @ 2007-01-31 8:47 ` Christoph Hellwig 0 siblings, 0 replies; 43+ messages in thread From: Christoph Hellwig @ 2007-01-31 8:47 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linux-kernel, cbe-oss-dev, linuxppc-dev, oprofile-list, Carl Love On Wed, Jan 31, 2007 at 09:57:26AM +1100, Benjamin Herrenschmidt wrote: > On Tue, 2007-01-30 at 14:49 -0800, Carl Love wrote: > > Christoph: > > > > In our earlier work on the PPU profiling patch, Benjamin Herrenschmidt > > said that we should remove macros that are only used once and just put > > the actual code in. That is why the macros were removed. > > Heh... there is a balance to be found... In some cases, inline functions > might be better too. Well, unless there's a very good reasons against it (token pasting, header pollution) inlines are always preferable over macros, but I didn't want to bring that issue up aswell.. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 1/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 22:49 ` Carl Love 2007-01-30 22:57 ` Benjamin Herrenschmidt @ 2007-01-30 22:59 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 43+ messages in thread From: Benjamin Herrenschmidt @ 2007-01-30 22:59 UTC (permalink / raw) To: Carl Love; +Cc: linuxppc-dev, linux-kernel, oprofile-list, cbe-oss-dev On Tue, 2007-01-30 at 14:49 -0800, Carl Love wrote: > Christoph: > > In our earlier work on the PPU profiling patch, Benjamin Herrenschmidt > said that we should remove macros that are only used once and just put > the actual code in. That is why the macros were removed. I've looked at the macros you remove and indeed, they look like stuff you actually want to keep in macros. I don't have off the top of my head the circumstances where I asked you to remove macros in the PPE code, but I'm sure it was different. Ben. ^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC, PATCH 2/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:45 [RFC, PATCH 0/4] Add support to OProfile for profiling Cell BE SPUs -- update Maynard Johnson 2007-01-29 19:46 ` [RFC, PATCH 1/4] " Maynard Johnson @ 2007-01-29 19:47 ` Maynard Johnson 2007-01-30 4:08 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-29 19:48 ` [RFC, PATCH 3/4] " Maynard Johnson ` (2 subsequent siblings) 4 siblings, 1 reply; 43+ messages in thread From: Maynard Johnson @ 2007-01-29 19:47 UTC (permalink / raw) To: cbe-oss-dev, linux-kernel, linuxppc-dev, oprofile-list [-- Attachment #1: Type: text/plain, Size: 1 bytes --] [-- Attachment #2: oprof-ppu-fix2.diff --] [-- Type: text/x-patch, Size: 3634 bytes --] The code was setting up the debug bus for group 21 when profiling on the event PPU CYCLES. The debug bus is not actually used by the hardware performance counters when counting PPU CYCLES. Setting up the debug bus for PPU CYCLES causes signal routing conflicts on the debug bus when profiling PPU cycles and another PPU event. This patch fixes the code to only setup the debug bus to route the performance signals for the non PPU CYCLE events. Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> Signed-off-by: Carl Love <carll@us.ibm.com> Index: linux-2.6.20-rc1/arch/powerpc/oprofile/op_model_cell.c =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/oprofile/op_model_cell.c 2007-01-18 16:56:47.300605984 -0600 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/op_model_cell.c 2007-01-24 12:16:16.225609136 -0600 @@ -39,6 +39,9 @@ #include "../platforms/cell/interrupt.h" #define PPU_CYCLES_EVENT_NUM 1 /* event number for CYCLES */ +#define PPU_CYCLES_GRP_NUM 1 /* special group number for identifying + * PPU_CYCLES event + */ #define CBE_COUNT_ALL_CYCLES 0x42800000 /* PPU cycle event specifier */ #define NUM_THREADS 2 /* number of physical threads in @@ -62,7 +65,7 @@ struct pm_signal { u16 cpu; /* Processor to modify */ u16 sub_unit; /* hw subunit this applies to (if applicable) */ - u16 signal_group; /* Signal Group to Enable/Disable */ + short int signal_group; /* Signal Group to Enable/Disable */ u8 bus_word; /* Enable/Disable on this Trace/Trigger/Event * Bus Word(s) (bitmask) */ @@ -173,26 +176,40 @@ static void pm_rtas_activate_signals(u32 node, u32 count) { int ret; - int j; + int i, j; struct pm_signal pm_signal_local[NR_PHYS_CTRS]; + /* There is no debug setup required for the cycles event. + * Note that only events in the same group can be used. + * Otherwise, there will be conflicts in correctly routing + * the signals on the debug bus. It is the responsiblity + * of the OProfile user tool to check the events are in + * the same group. + */ + i = 0; for (j = 0; j < count; j++) { - /* fw expects physical cpu # */ - pm_signal_local[j].cpu = node; - pm_signal_local[j].signal_group - = pm_signal[j].signal_group; - pm_signal_local[j].bus_word = pm_signal[j].bus_word; - pm_signal_local[j].sub_unit = pm_signal[j].sub_unit; - pm_signal_local[j].bit = pm_signal[j].bit; - } + if (pm_signal[j].signal_group != PPU_CYCLES_GRP_NUM) { - ret = rtas_ibm_cbe_perftools(SUBFUNC_ACTIVATE, PASSTHRU_ENABLE, - pm_signal_local, - count * sizeof(struct pm_signal)); + /* fw expects physical cpu # */ + pm_signal_local[i].cpu = node; + pm_signal_local[i].signal_group + = pm_signal[j].signal_group; + pm_signal_local[i].bus_word = pm_signal[j].bus_word; + pm_signal_local[i].sub_unit = pm_signal[j].sub_unit; + pm_signal_local[i].bit = pm_signal[j].bit; + i++; + } + } - if (ret) - printk(KERN_WARNING "%s: rtas returned: %d\n", - __FUNCTION__, ret); + if (i != 0) { + ret = rtas_ibm_cbe_perftools(SUBFUNC_ACTIVATE, PASSTHRU_ENABLE, + pm_signal_local, + i * sizeof(struct pm_signal)); + + if (ret) + printk(KERN_WARNING "%s: rtas returned: %d\n", + __FUNCTION__, ret); + } } /* @@ -209,7 +226,7 @@ /* Special Event: Count all cpu cycles */ pm_regs.pm07_cntrl[ctr] = CBE_COUNT_ALL_CYCLES; p = &(pm_signal[ctr]); - p->signal_group = 21; + p->signal_group = PPU_CYCLES_GRP_NUM; p->bus_word = 1; p->sub_unit = 0; p->bit = 0; ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 2/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:47 ` [RFC, PATCH 2/4] " Maynard Johnson @ 2007-01-30 4:08 ` Arnd Bergmann 2007-01-30 23:51 ` Carl Love 0 siblings, 1 reply; 43+ messages in thread From: Arnd Bergmann @ 2007-01-30 4:08 UTC (permalink / raw) To: cbe-oss-dev, maynardj; +Cc: linuxppc-dev, linux-kernel, oprofile-list On Monday 29 January 2007 20:47, Maynard Johnson wrote: > The code was setting up the debug bus for group 21 when profiling on th= e=20 > event PPU CYCLES. =A0The debug bus is not actually used by the hardware=20 > performance counters when counting PPU CYCLES. =A0Setting up the debug bus > for PPU CYCLES causes signal routing conflicts on the debug bus when=20 > profiling PPU cycles and another PPU event. =A0This patch fixes the code = to=20 > only setup the debug bus to route the performance signals for the non > PPU CYCLE events. >=20 > Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> > Signed-off-by: Carl Love <carll@us.ibm.com> Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Any suggestion for a one-line patch title? ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 2/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 4:08 ` [Cbe-oss-dev] " Arnd Bergmann @ 2007-01-30 23:51 ` Carl Love 0 siblings, 0 replies; 43+ messages in thread From: Carl Love @ 2007-01-30 23:51 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel Arnd: Well most of the patch is simply cleaning up the pm_rtas_activate_signals() routine. I would be think "pm_rtas_activat_signals routine cleanup" would be good. Carl Love On Tue, 2007-01-30 at 05:08 +0100, Arnd Bergmann wrote: > On Monday 29 January 2007 20:47, Maynard Johnson wrote: > > The code was setting up the debug bus for group 21 when profiling on the > > event PPU CYCLES. The debug bus is not actually used by the hardware > > performance counters when counting PPU CYCLES. Setting up the debug bus > > for PPU CYCLES causes signal routing conflicts on the debug bus when > > profiling PPU cycles and another PPU event. This patch fixes the code to > > only setup the debug bus to route the performance signals for the non > > PPU CYCLE events. > > > > Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> > > Signed-off-by: Carl Love <carll@us.ibm.com> > > Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> > > Any suggestion for a one-line patch title? > _______________________________________________ > cbe-oss-dev mailing list > cbe-oss-dev@ozlabs.org > https://ozlabs.org/mailman/listinfo/cbe-oss-dev ^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC, PATCH 3/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:45 [RFC, PATCH 0/4] Add support to OProfile for profiling Cell BE SPUs -- update Maynard Johnson 2007-01-29 19:46 ` [RFC, PATCH 1/4] " Maynard Johnson 2007-01-29 19:47 ` [RFC, PATCH 2/4] " Maynard Johnson @ 2007-01-29 19:48 ` Maynard Johnson 2007-01-30 4:24 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-29 19:48 ` [RFC, PATCH 4/4] " Maynard Johnson 2007-01-30 8:37 ` [RFC, PATCH 0/4] " Arnd Bergmann 4 siblings, 1 reply; 43+ messages in thread From: Maynard Johnson @ 2007-01-29 19:48 UTC (permalink / raw) To: cbe-oss-dev, linux-kernel, linuxppc-dev, oprofile-list [-- Attachment #1: Type: text/plain, Size: 1 bytes --] [-- Attachment #2: spu-notifier.patch --] [-- Type: text/x-diff, Size: 4589 bytes --] Subject: Enable SPU switch notification to detect currently active SPU tasks. From: Maynard Johnson <maynardj@us.ibm.com> This patch adds to the capability of spu_switch_event_register so that the caller is also notified of currently active SPU tasks. It also exports spu_switch_event_register and spu_switch_event_unregister. Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> Index: linux-2.6.20-rc1/arch/powerpc/platforms/cell/spufs/sched.c =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/platforms/cell/spufs/sched.c 2007-01-18 16:43:14.324526032 -0600 +++ linux-2.6.20-rc1/arch/powerpc/platforms/cell/spufs/sched.c 2007-01-26 16:16:35.219668640 -0600 @@ -78,21 +78,46 @@ static BLOCKING_NOTIFIER_HEAD(spu_switch_notifier); -static void spu_switch_notify(struct spu *spu, struct spu_context *ctx) +void spu_switch_notify(struct spu *spu, struct spu_context *ctx) { blocking_notifier_call_chain(&spu_switch_notifier, ctx ? ctx->object_id : 0, spu); } +static void notify_spus_active(void) +{ + int node; + /* Wake up the active spu_contexts. When the awakened processes + * sees their notify_active flag is set, they will call + * spu_switch_notify(); + */ + for (node = 0; node < MAX_NUMNODES; node++) { + struct spu *spu; + mutex_lock(&spu_prio->active_mutex[node]); + list_for_each_entry(spu, &spu_prio->active_list[node], list) { + struct spu_context *ctx = spu->ctx; + spu->notify_active = 1; + wake_up_all(&ctx->stop_wq); + } + mutex_unlock(&spu_prio->active_mutex[node]); + } +} + int spu_switch_event_register(struct notifier_block * n) { - return blocking_notifier_chain_register(&spu_switch_notifier, n); + int ret; + ret = blocking_notifier_chain_register(&spu_switch_notifier, n); + if (!ret) + notify_spus_active(); + return ret; } +EXPORT_SYMBOL_GPL(spu_switch_event_register); int spu_switch_event_unregister(struct notifier_block * n) { return blocking_notifier_chain_unregister(&spu_switch_notifier, n); } +EXPORT_SYMBOL_GPL(spu_switch_event_unregister); static inline void bind_context(struct spu *spu, struct spu_context *ctx) Index: linux-2.6.20-rc1/arch/powerpc/platforms/cell/spufs/spufs.h =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/platforms/cell/spufs/spufs.h 2007-01-18 16:43:14.340523600 -0600 +++ linux-2.6.20-rc1/arch/powerpc/platforms/cell/spufs/spufs.h 2007-01-26 16:26:49.733703448 -0600 @@ -180,6 +180,7 @@ int spu_activate(struct spu_context *ctx, u64 flags); void spu_deactivate(struct spu_context *ctx); void spu_yield(struct spu_context *ctx); +void spu_switch_notify(struct spu *spu, struct spu_context *ctx); int __init spu_sched_init(void); void __exit spu_sched_exit(void); Index: linux-2.6.20-rc1/arch/powerpc/platforms/cell/spufs/run.c =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/platforms/cell/spufs/run.c 2007-01-18 16:43:14.340523600 -0600 +++ linux-2.6.20-rc1/arch/powerpc/platforms/cell/spufs/run.c 2007-01-26 16:24:38.979744856 -0600 @@ -45,9 +45,10 @@ u64 pte_fault; *stat = ctx->ops->status_read(ctx); - if (ctx->state != SPU_STATE_RUNNABLE) - return 1; + spu = ctx->spu; + if (ctx->state != SPU_STATE_RUNNABLE || spu->notify_active) + return 1; pte_fault = spu->dsisr & (MFC_DSISR_PTE_NOT_FOUND | MFC_DSISR_ACCESS_DENIED); return (!(*stat & 0x1) || pte_fault || spu->class_0_pending) ? 1 : 0; @@ -305,6 +306,7 @@ u32 *npc, u32 *event) { int ret; + struct spu * spu; u32 status; if (down_interruptible(&ctx->run_sema)) @@ -318,8 +320,16 @@ do { ret = spufs_wait(ctx->stop_wq, spu_stopped(ctx, &status)); + spu = ctx->spu; if (unlikely(ret)) break; + if (unlikely(spu->notify_active)) { + spu->notify_active = 0; + if (!(status & SPU_STATUS_STOPPED_BY_STOP)) { + spu_switch_notify(spu, ctx); + continue; + } + } if ((status & SPU_STATUS_STOPPED_BY_STOP) && (status >> SPU_STOP_STATUS_SHIFT == 0x2104)) { ret = spu_process_callback(ctx); Index: linux-2.6.20-rc1/include/asm-powerpc/spu.h =================================================================== --- linux-2.6.20-rc1.orig/include/asm-powerpc/spu.h 2007-01-18 16:43:19.932605072 -0600 +++ linux-2.6.20-rc1/include/asm-powerpc/spu.h 2007-01-24 12:17:30.209676992 -0600 @@ -144,6 +144,7 @@ void* pdata; /* platform private data */ struct sys_device sysdev; + int notify_active; }; struct spu *spu_alloc(void); ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 3/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:48 ` [RFC, PATCH 3/4] " Maynard Johnson @ 2007-01-30 4:24 ` Arnd Bergmann 2007-01-30 15:31 ` Maynard Johnson 0 siblings, 1 reply; 43+ messages in thread From: Arnd Bergmann @ 2007-01-30 4:24 UTC (permalink / raw) To: cbe-oss-dev, maynardj; +Cc: linuxppc-dev, linux-kernel, oprofile-list On Monday 29 January 2007 20:48, Maynard Johnson wrote: > Subject: Enable SPU switch notification to detect currently active SPU tasks. > > From: Maynard Johnson <maynardj@us.ibm.com> > > This patch adds to the capability of spu_switch_event_register so that the > caller is also notified of currently active SPU tasks. It also exports > spu_switch_event_register and spu_switch_event_unregister. > > Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> I looked through it again, and think I found a serious bug, but that should be easy enough to solve: > +static void notify_spus_active(void) > +{ > + int node; > + /* Wake up the active spu_contexts. When the awakened processes > + * sees their notify_active flag is set, they will call > + * spu_switch_notify(); > + */ > + for (node = 0; node < MAX_NUMNODES; node++) { > + struct spu *spu; > + mutex_lock(&spu_prio->active_mutex[node]); > + list_for_each_entry(spu, &spu_prio->active_list[node], list) { > + struct spu_context *ctx = spu->ctx; [side note] There is a small whitespace breakage in here, please make sure you always use tabs for indenting, not space characters. [/side note] > @@ -45,9 +45,10 @@ > u64 pte_fault; > > *stat = ctx->ops->status_read(ctx); > - if (ctx->state != SPU_STATE_RUNNABLE) > - return 1; > + > spu = ctx->spu; > + if (ctx->state != SPU_STATE_RUNNABLE || spu->notify_active) > + return 1; > pte_fault = spu->dsisr & > (MFC_DSISR_PTE_NOT_FOUND | MFC_DSISR_ACCESS_DENIED); > return (!(*stat & 0x1) || pte_fault || spu->class_0_pending) ? 1 : 0; > @@ -305,6 +306,7 @@ > u32 *npc, u32 *event) > { > int ret; > + struct spu * spu; > u32 status; > > if (down_interruptible(&ctx->run_sema)) > @@ -318,8 +320,16 @@ > > do { > ret = spufs_wait(ctx->stop_wq, spu_stopped(ctx, &status)); > + spu = ctx->spu; > if (unlikely(ret)) > break; > + if (unlikely(spu->notify_active)) { > + spu->notify_active = 0; > + if (!(status & SPU_STATUS_STOPPED_BY_STOP)) { > + spu_switch_notify(spu, ctx); > + continue; > + } > + } This is before spu_reacquire_runnable, so in case the spu got preempted at the same time when oprofile was enabled, ctx->spu is NULL, and you can't load the notify_active flag from it. On solution would be to move the notify_active flag from ctx->spu into ctx itself, but maybe there are other ways to solve this. Thanks, Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 3/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 4:24 ` [Cbe-oss-dev] " Arnd Bergmann @ 2007-01-30 15:31 ` Maynard Johnson 2007-01-31 0:35 ` Arnd Bergmann 0 siblings, 1 reply; 43+ messages in thread From: Maynard Johnson @ 2007-01-30 15:31 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel Arnd Bergmann wrote: > On Monday 29 January 2007 20:48, Maynard Johnson wrote: > >>Subject: Enable SPU switch notification to detect currently active SPU tasks. >> >>From: Maynard Johnson <maynardj@us.ibm.com> >> >>This patch adds to the capability of spu_switch_event_register so that the >>caller is also notified of currently active SPU tasks. It also exports >>spu_switch_event_register and spu_switch_event_unregister. >> >>Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> > > > I looked through it again, and think I found a serious bug, but that > should be easy enough to solve: > > >>+static void notify_spus_active(void) >>+{ >>+ int node; >>+ /* Wake up the active spu_contexts. When the awakened processes >>+ * sees their notify_active flag is set, they will call >>+ * spu_switch_notify(); >>+ */ >>+ for (node = 0; node < MAX_NUMNODES; node++) { >>+ struct spu *spu; >>+ mutex_lock(&spu_prio->active_mutex[node]); >>+ list_for_each_entry(spu, &spu_prio->active_list[node], list) { >>+ struct spu_context *ctx = spu->ctx; > > > [side note] > There is a small whitespace breakage in here, please make sure you always > use tabs for indenting, not space characters. > [/side note] > > >>@@ -45,9 +45,10 @@ >> u64 pte_fault; >> >> *stat = ctx->ops->status_read(ctx); >>- if (ctx->state != SPU_STATE_RUNNABLE) >>- return 1; >>+ >> spu = ctx->spu; >>+ if (ctx->state != SPU_STATE_RUNNABLE || spu->notify_active) >>+ return 1; >> pte_fault = spu->dsisr & >> (MFC_DSISR_PTE_NOT_FOUND | MFC_DSISR_ACCESS_DENIED); >> return (!(*stat & 0x1) || pte_fault || spu->class_0_pending) ? 1 : 0; >>@@ -305,6 +306,7 @@ >> u32 *npc, u32 *event) >> { >> int ret; >>+ struct spu * spu; >> u32 status; >> >> if (down_interruptible(&ctx->run_sema)) >>@@ -318,8 +320,16 @@ >> >> do { >> ret = spufs_wait(ctx->stop_wq, spu_stopped(ctx, &status)); >>+ spu = ctx->spu; >> if (unlikely(ret)) >> break; >>+ if (unlikely(spu->notify_active)) { >>+ spu->notify_active = 0; >>+ if (!(status & SPU_STATUS_STOPPED_BY_STOP)) { >>+ spu_switch_notify(spu, ctx); >>+ continue; >>+ } >>+ } > > > This is before spu_reacquire_runnable, so in case the spu got > preempted at the same time when oprofile was enabled, ctx->spu > is NULL, and you can't load the notify_active flag from it. > > On solution would be to move the notify_active flag from ctx->spu > into ctx itself, but maybe there are other ways to solve this. In an earlier review of this patch, Christopher Hellwig suggested I move the notify_active flag to be a bit in the sched_flags field that's added in his scheduler patch series. If this patch series will be a available in an "Arnd" tree that we'll be using for our current OProfile development, perhaps I should wait until that time to change this, since the window of vulnerability is quite small. What do you think? -Maynard > > Thanks, > > Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 3/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 15:31 ` Maynard Johnson @ 2007-01-31 0:35 ` Arnd Bergmann 0 siblings, 0 replies; 43+ messages in thread From: Arnd Bergmann @ 2007-01-31 0:35 UTC (permalink / raw) To: maynardj; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel On Tuesday 30 January 2007 16:31, Maynard Johnson wrote: >=20 > > On solution would be to move the notify_active flag from ctx->spu > > into ctx itself, but maybe there are other ways to solve this. > In an earlier review of this patch, Christopher Hellwig suggested I move= =20 > the notify_active flag to be a bit in the sched_flags field that's added= =20 > in his scheduler patch series. =A0If this patch series will be a availabl= e=20 > in an "Arnd" tree that we'll be using for our current OProfile=20 > development, perhaps I should wait until that time to change this, since= =20 > the window of vulnerability is quite small. =A0What do you think? Sounds good to me. Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:45 [RFC, PATCH 0/4] Add support to OProfile for profiling Cell BE SPUs -- update Maynard Johnson ` (2 preceding siblings ...) 2007-01-29 19:48 ` [RFC, PATCH 3/4] " Maynard Johnson @ 2007-01-29 19:48 ` Maynard Johnson 2007-01-30 7:39 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-31 9:24 ` Milton Miller 2007-01-30 8:37 ` [RFC, PATCH 0/4] " Arnd Bergmann 4 siblings, 2 replies; 43+ messages in thread From: Maynard Johnson @ 2007-01-29 19:48 UTC (permalink / raw) To: cbe-oss-dev, linux-kernel, linuxppc-dev, oprofile-list [-- Attachment #1: Type: text/plain, Size: 1 bytes --] [-- Attachment #2: oprof-spu.diff --] [-- Type: text/x-patch, Size: 48333 bytes --] Subject: Add support to OProfile for profiling Cell BE SPUs From: Maynard Johnson <maynardj@us.ibm.com> This patch updates the existing arch/powerpc/oprofile/op_model_cell.c to add in the SPU profiling capabilities. In addition, a 'cell' subdirectory was added to arch/powerpc/oprofile to hold Cell-specific SPU profiling code. Signed-off-by: Carl Love <carll@us.ibm.com> Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> Index: linux-2.6.20-rc1/arch/powerpc/configs/cell_defconfig =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/configs/cell_defconfig 2007-01-18 16:43:14.230540320 -0600 +++ linux-2.6.20-rc1/arch/powerpc/configs/cell_defconfig 2007-01-29 10:32:03.386789608 -0600 @@ -1403,7 +1403,7 @@ # Instrumentation Support # CONFIG_PROFILING=y -CONFIG_OPROFILE=y +CONFIG_OPROFILE=m # CONFIG_KPROBES is not set # Index: linux-2.6.20-rc1/arch/powerpc/oprofile/cell/pr_util.h =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/cell/pr_util.h 2007-01-29 10:32:03.388789304 -0600 @@ -0,0 +1,75 @@ + /* + * Cell Broadband Engine OProfile Support + * + * (C) Copyright IBM Corporation 2006 + * + * Author: Maynard Johnson <maynardj@us.ibm.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifndef PR_UTIL_H +#define PR_UTIL_H + +#include <linux/cpumask.h> +#include <linux/oprofile.h> +#include <asm/cell-pmu.h> +#include <asm/spu.h> + +#define number_of_online_nodes(nodes) { \ + u32 cpu; u32 tmp; \ + nodes = 0; \ + for_each_online_cpu(cpu) { \ + tmp = cbe_cpu_to_node(cpu) + 1;\ + if (tmp > nodes) \ + nodes++; \ + } \ +} + + +/* Defines used for sync_start */ +#define SKIP_GENERIC_SYNC 0 +#define SYNC_START_ERROR -1 +#define DO_GENERIC_SYNC 1 + +typedef struct vma_map +{ + struct vma_map *next; + unsigned int vma; + unsigned int size; + unsigned int offset; + unsigned int guard_ptr; + unsigned int guard_val; +} vma_map_t; + +/* The three functions below are for maintaining and accessing + * the vma-to-file offset map. + */ +vma_map_t * create_vma_map(const struct spu * spu, u64 objectid); +unsigned int vma_map_lookup(vma_map_t *map, unsigned int vma, + const struct spu * aSpu); +void vma_map_free(struct vma_map *map); + +/* + * Entry point for SPU profiling. + * cycles_reset is the SPU_CYCLES count value specified by the user. + */ +void start_spu_profiling(unsigned int cycles_reset); + +void stop_spu_profiling(void); + + +/* add the necessary profiling hooks */ +int spu_sync_start(void); + +/* remove the hooks */ +int spu_sync_stop(void); + +/* Record SPU program counter samples to the oprofile event buffer. */ +void spu_sync_buffer(int spu_num, unsigned int * samples, + int num_samples); + +#endif // PR_UTIL_H Index: linux-2.6.20-rc1/arch/powerpc/oprofile/cell/spu_profiler.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/cell/spu_profiler.c 2007-01-29 10:32:03.392788696 -0600 @@ -0,0 +1,204 @@ +/* + * Cell Broadband Engine OProfile Support + * + * (C) Copyright IBM Corporation 2006 + * + * Authors: Maynard Johnson <maynardj@us.ibm.com> + * Carl Love <carll@us.ibm.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include <linux/hrtimer.h> +#include <linux/smp.h> +#include <linux/slab.h> +#include <asm/cell-pmu.h> +#include "pr_util.h" + +#define TRACE_ARRAY_SIZE 1024 +static u32 * samples; +static u32 * samples_per_node; + +static int spu_prof_running = 0; +static unsigned int profiling_interval = 0; + +extern int num_nodes; +extern unsigned int khzfreq; + +/* + * Oprofile setup functions + */ + +#define NUM_SPU_BITS_TRBUF 16 +#define SPUS_PER_TB_ENTRY 4 +#define SPUS_PER_NODE 8 + +/* + * Collect the SPU program counter samples from the trace buffer. + * The global variable usage is as follows: + * samples[<total-spus>][TRACE_ARRAY_SIZE] - array to store SPU PC samples + * Assumption, the array will be all zeros on entry. + * u32 samples_per_node[num_nodes] - array of how many valid samples per node + */ +static void cell_spu_pc_collection(void) +{ + int cpu; + int node; + int spu; + u32 trace_addr; + /* the trace buffer is 128 bits */ + u64 trace_buffer[2]; + u64 spu_pc_lower; + u64 spu_pc_upper; + u64 spu_mask; + int entry, node_factor; + // process the collected SPU PC for each node + for_each_online_cpu(cpu) { + if (cbe_get_hw_thread_id(cpu)) + continue; + + node = cbe_cpu_to_node(cpu); + node_factor = node * SPUS_PER_NODE; + /* number of valid entries for this node */ + entry = 0; + + trace_addr = cbe_read_pm(cpu, trace_address); + while ((trace_addr & CBE_PM_TRACE_BUF_EMPTY) != 0x400) + { + /* there is data in the trace buffer to process */ + cbe_read_trace_buffer(cpu, trace_buffer); + spu_mask = 0xFFFF000000000000; + + /* Each SPU PC is 16 bits; hence, four spus in each of + * the two 64-bit buffer entries that make up the + * 128-bit trace_buffer entry. Process the upper and + * lower 64-bit values simultaneously. + */ + for (spu = 0; spu < SPUS_PER_TB_ENTRY; spu++) { + spu_pc_lower = spu_mask & trace_buffer[0]; + spu_pc_lower = spu_pc_lower >> (NUM_SPU_BITS_TRBUF + * (SPUS_PER_TB_ENTRY-spu-1)); + + spu_pc_upper = spu_mask & trace_buffer[1]; + spu_pc_upper = spu_pc_upper >> (NUM_SPU_BITS_TRBUF + * (SPUS_PER_TB_ENTRY-spu-1)); + + spu_mask = spu_mask >> NUM_SPU_BITS_TRBUF; + + /* spu PC trace entry is upper 16 bits of the + * 18 bit SPU program counter + */ + spu_pc_lower = spu_pc_lower << 2; + spu_pc_upper = spu_pc_upper << 2; + + samples[((node_factor + spu) * TRACE_ARRAY_SIZE) + entry] + = (u32) spu_pc_lower; + samples[((node_factor + spu + SPUS_PER_TB_ENTRY) * TRACE_ARRAY_SIZE) + entry] + = (u32) spu_pc_upper; + } + + entry++; + + if (entry >= TRACE_ARRAY_SIZE) + /* spu_samples is full */ + break; + + trace_addr = cbe_read_pm(cpu, trace_address); + } + samples_per_node[node] = entry; + } +} + + +static int profile_spus(struct hrtimer * timer) +{ + ktime_t kt; + int cpu, node, k, num_samples, spu_num; + + if (!spu_prof_running) + goto STOP; + + cell_spu_pc_collection(); + for_each_online_cpu(cpu) { + if (cbe_get_hw_thread_id(cpu)) + continue; + + node = cbe_cpu_to_node(cpu); + + num_samples = samples_per_node[node]; + if (num_samples == 0) + continue; + for (k = 0; k < SPUS_PER_NODE; k++) { + spu_num = k + (node * SPUS_PER_NODE); + spu_sync_buffer(spu_num, samples + (spu_num * TRACE_ARRAY_SIZE), num_samples); + } + } + smp_wmb(); + + kt = ktime_set(0, profiling_interval); + if (!spu_prof_running) + goto STOP; + hrtimer_forward(timer, timer->base->get_time(), kt); + return HRTIMER_RESTART; + + STOP: + printk(KERN_INFO "SPU_PROF: spu-prof timer ending\n"); + return HRTIMER_NORESTART; +} + +static struct hrtimer timer; +#define SCALE_SHIFT 14 +/* + * Entry point for SPU profiling. + * NOTE: SPU profiling is done system-wide, not per-CPU. + * + * cycles_reset is the count value specified by the user when + * setting up OProfile to count SPU_CYCLES. + */ +void start_spu_profiling(unsigned int cycles_reset) { + + ktime_t kt; + + /* To calculate a timeout in nanoseconds, the basic + * formula is ns = cycles_reset * (NSEC_PER_SEC / cpu frequency). + * To avoid floating point math, we use the scale math + * technique as described in linux/jiffies.h. We use + * a scale factor of SCALE_SHIFT,which provides 4 decimal places + * of precision, which is close enough for the purpose at hand. + */ + + /* Since cpufreq_quick_get returns frequency in kHz, we use + * USEC_PER_SEC here vs NSEC_PER_SEC. + */ + unsigned long nsPerCyc = (USEC_PER_SEC << SCALE_SHIFT)/khzfreq; + profiling_interval = (nsPerCyc * cycles_reset) >> SCALE_SHIFT; + + pr_debug("timer resolution: %lu\n", + TICK_NSEC); + kt = ktime_set(0, profiling_interval); + hrtimer_init(&timer, CLOCK_MONOTONIC, HRTIMER_REL); + timer.expires = kt; + timer.function = profile_spus; + + /* Allocate arrays for collecting SPU PC samples */ + samples = (u32 *) kzalloc(num_nodes * SPUS_PER_NODE * TRACE_ARRAY_SIZE * sizeof(u32), GFP_ATOMIC); + samples_per_node = (u32 *) kzalloc(num_nodes * sizeof(u32), GFP_ATOMIC); + + spu_prof_running = 1; + hrtimer_start(&timer, kt, HRTIMER_REL); +} + +void stop_spu_profiling(void) +{ + + hrtimer_cancel(&timer); + kfree(samples); + kfree(samples_per_node); + pr_debug("SPU_PROF: stop_spu_profiling issued\n"); + spu_prof_running = 0; +} + + Index: linux-2.6.20-rc1/arch/powerpc/oprofile/cell/spu_task_sync.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/cell/spu_task_sync.c 2007-01-29 10:32:03.398787784 -0600 @@ -0,0 +1,493 @@ +/* + * Cell Broadband Engine OProfile Support + * + * (C) Copyright IBM Corporation 2006 + * + * Author: Maynard Johnson <maynardj@us.ibm.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +/* The purpose of this file is to handle SPU event task switching + * and to record SPU context information into the OProfile + * event buffer. + * + * Additionally, the spu_sync_buffer function is provided as a helper + * for recoding actual SPU program counter samples to the event buffer. + */ + +#include <linux/notifier.h> +#include <linux/list.h> +#include <linux/numa.h> +#include <linux/mm.h> +#include <linux/dcookies.h> +#include <linux/spinlock.h> +#include <linux/kref.h> +#include <linux/oprofile.h> +#include "pr_util.h" + +#define DISCARD_ALL 9999 + +static spinlock_t buffer_lock = SPIN_LOCK_UNLOCKED; +static int num_spu_nodes; +int num_nodes; + +/* Conainer for caching information about an active SPU task. + * :map -- pointer to a list of vma_maps + * :spu -- the spu for this active SPU task + * :list -- potentially could be used to contain the cached_infos + * for inactive SPU tasks. + * + * Ideally, we would like to be able to create the cached_info for + * an SPU task just one time -- when libspe first loads the SPU + * binary file. We would store the cached_info in a list. Then, as + * SPU tasks are switched out and new ones switched in, the cached_info + * for inactive tasks would be kept, and the active one would be placed + * at the head of the list. But this technique may not with + * current spufs functionality since the spu used in bind_context may + * be a different spu than was used in a previous bind_context for a + * reactivated SPU task. Additionally, a reactivated SPU task may be + * assigned to run on a different physical SPE. We will investigate + * further if this can be done. + * + */ +struct cached_info { + vma_map_t * map; + struct spu * the_spu; + struct kref cache_ref; + struct list_head list; +}; + +/* A data structure for cached information about active SPU tasks. + * Storage is dynamically allocated, sized as + * "number of active nodes multplied by 8". + * The info_list[n] member holds 0 or more + * 'struct cached_info' objects for SPU#=n. + * + * As currently implemented, there will only ever be one cached_info + * in the list for a given SPU. If we can devise a way to maintain + * multiple cached_infos in our list, then it would make sense + * to also cache the dcookie representing the PPU task application. + * See above description of struct cached_info for more details. + */ +struct spu_info_stacks { + struct list_head * info_list; +}; + +static spinlock_t cache_lock = SPIN_LOCK_UNLOCKED; + + +static struct spu_info_stacks * spu_info; + +static void destroy_cached_info(struct kref * kref) +{ + struct cached_info * info; + info = container_of(kref, struct cached_info, cache_ref); + vma_map_free(info->map); + kfree(info); +} + +static int put_cached_info(struct cached_info * info) +{ + return kref_put(&info->cache_ref, &destroy_cached_info); +} + +/* Return the cached_info for the passed SPU number. + * Current implementation is such that a list will hold, at most, + * one cached_info. + * + * NOTE: Clients of this function MUST call put_cached_info() + * when finished using the returned cached_info (if the + * returned value is non-null). + */ +static struct cached_info * get_cached_info(int spu_num) +{ + struct cached_info * ret_info, * info; + unsigned long flags = 0; + ret_info = NULL; + spin_lock_irqsave(&cache_lock, flags); + if (spu_info == NULL) { + pr_debug("SPU_PROF:%s: spu_info does not exist\n", + __FUNCTION__); + goto out; + } + if (spu_num >= num_spu_nodes) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: Invalid index %d into spu info cache\n", + __FUNCTION__, __LINE__, spu_num); + goto out; + } + list_for_each_entry(info, &spu_info->info_list[spu_num], list) { + /* Only one item in the list, so return it. */ + ret_info = info; + kref_get(&info->cache_ref); + break; + } + +out: + spin_unlock_irqrestore(&cache_lock, flags); + return ret_info; +} + + +/* Looks for cached info for the passed spu. If not found, the + * cached info is created for the passed spu. + * Returns 0 for success; otherwise, -1 for error. + */ +static int +prepare_cached_spu_info(struct spu * spu, unsigned int objectId) +{ + vma_map_t * new_map; + unsigned long flags = 0; + int retval = 0; + /* spu->number is a system-wide value, not a per-node value. */ + struct cached_info * info = get_cached_info(spu->number); + if (info == NULL) { + /* create cached_info and add it to the list for SPU #<n>.*/ + info = kzalloc(sizeof(struct cached_info), GFP_ATOMIC); + if (!info) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: create vma_map failed\n", + __FUNCTION__, __LINE__); + goto ERR_ALLOC; + } + new_map = create_vma_map(spu, objectId); + if (!new_map) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: create vma_map failed\n", + __FUNCTION__, __LINE__); + goto ERR_ALLOC; + } + + pr_debug("Created vma_map\n"); + info->map = new_map; + info->the_spu = spu; + kref_init(&info->cache_ref); + spin_lock_irqsave(&cache_lock, flags); + list_add(&info->list, &spu_info->info_list[spu->number]); + spin_unlock_irqrestore(&cache_lock, flags); + goto OUT; + } else { + /* Immedidately put back reference to cached_info since we don't + * really need it -- just checking whether we have it. + */ + put_cached_info(info); + pr_debug("Found cached SPU info.\n"); + } + +ERR_ALLOC: + retval = -1; +OUT: + return retval; +} + + +/* Discard all cached info and free the memory. + * NOTE: The caller is responsible for locking the + * spu_info struct containing the cached_info + * prior to calling this function. + */ +static int discard_cached_info(int spu_index) +{ + struct cached_info * info, * tmp; + int index, end; + if (spu_index == DISCARD_ALL) { + end = num_spu_nodes; + index = 0; + } else { + if (spu_index >= num_spu_nodes) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: Invalid index %d into spu info cache\n", + __FUNCTION__, __LINE__, spu_index); + goto out; + } + end = spu_index +1; + index = spu_index; + } + for (; index < end; index++) { + list_for_each_entry_safe(info, tmp, + &spu_info->info_list[index], + list) { + list_del(&info->list); + put_cached_info(info); + } + } +out: + return 0; +} + +/* The source code for fast_get_dcookie was "borrowed" + * from drivers/oprofile/buffer_sync.c. + */ + +/* Optimisation. We can manage without taking the dcookie sem + * because we cannot reach this code without at least one + * dcookie user still being registered (namely, the reader + * of the event buffer). + */ +static inline unsigned long fast_get_dcookie(struct dentry * dentry, + struct vfsmount * vfsmnt) +{ + unsigned long cookie; + + if (dentry->d_cookie) + return (unsigned long)dentry; + get_dcookie(dentry, vfsmnt, &cookie); + return cookie; +} + +/* Look up the dcookie for the task's first VM_EXECUTABLE mapping, + * which corresponds loosely to "application name". Also, determine + * the offset for the SPU ELF object. If computed offset is + * non-zero, it implies an embedded SPU object; otherwise, it's a + * separate SPU binary, in which case we retrieve it's dcookie. + */ +static unsigned long +get_exec_dcookie_and_offset( + struct spu * spu, unsigned int * offsetp, + unsigned long * spu_bin_dcookie, + unsigned int spu_ref) +{ + unsigned long cookie = 0; + unsigned int my_offset = 0; + struct vm_area_struct * vma; + struct mm_struct * mm = spu->mm; + + if (!mm) + goto OUT; + + for (vma = mm->mmap; vma; vma = vma->vm_next) { + if (!vma->vm_file) + continue; + if (!(vma->vm_flags & VM_EXECUTABLE)) + continue; + cookie = fast_get_dcookie(vma->vm_file->f_dentry, + vma->vm_file->f_vfsmnt); + pr_debug("got dcookie for %s\n", + vma->vm_file->f_dentry->d_name.name); + break; + } + + for (vma = mm->mmap; vma; vma = vma->vm_next) { + if (vma->vm_start > spu_ref || vma->vm_end < spu_ref) + continue; + my_offset = spu_ref - vma->vm_start; + pr_debug("Found spu ELF at " + " %X for file %s\n", my_offset, + vma->vm_file->f_dentry->d_name.name); + *offsetp = my_offset; + if (my_offset == 0) { + if (!vma->vm_file) { + goto FAIL_NO_SPU_COOKIE; + } + *spu_bin_dcookie = fast_get_dcookie( + vma->vm_file->f_dentry, + vma->vm_file->f_vfsmnt); + pr_debug("got dcookie for %s\n", + vma->vm_file->f_dentry->d_name.name); + } + break; + } + +OUT: + return cookie; + +FAIL_NO_SPU_COOKIE: + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: Cannot find dcookie for SPU binary\n", + __FUNCTION__, __LINE__); + goto OUT; +} + + + +/* This function finds or creates cached context information for the + * passed SPU and records SPU context information into the OProfile + * event buffer. + */ +static int process_context_switch(struct spu * spu, unsigned int objectId) +{ + unsigned long flags = 0; + int retval = 0; + unsigned int offset = 0; + unsigned long spu_cookie = 0, app_dcookie = 0; + retval = prepare_cached_spu_info(spu, objectId); + if (retval == -1) { + goto OUT; + } + /* Get dcookie first because a mutex_lock is taken in that + * code path, so interrupts must not be disabled. + */ + app_dcookie = get_exec_dcookie_and_offset(spu, &offset, + &spu_cookie, objectId); + + /* Record context info in event buffer */ + spin_lock_irqsave(&buffer_lock, flags); + add_event_entry(ESCAPE_CODE); + add_event_entry(SPU_CTX_SWITCH_CODE); + add_event_entry(spu->number); + add_event_entry(spu->pid); + add_event_entry(spu->tgid); + add_event_entry(app_dcookie); + + add_event_entry(ESCAPE_CODE); + if (offset) { + /* When offset is non-zero, this means the SPU ELF was embedded; + * otherwise, it was loaded from a separate binary file. For the + * embedded case, we record the offset of the SPU ELF into the PPU + * executable; for the non-embedded case, we record a dcookie that + * points to the location of the SPU binary that was loaded. + */ + add_event_entry(SPU_OFFSET_CODE); + add_event_entry(offset); + } else { + add_event_entry(SPU_COOKIE_CODE); + add_event_entry(spu_cookie); + } + spin_unlock_irqrestore(&buffer_lock, flags); + smp_wmb(); +OUT: + return retval; +} + +/* + * This function is invoked on either a bind_context or unbind_context. + * If called for an unbind_context, the val arg is 0; otherwise, + * it is the object-id value for the spu context. + * The data arg is of type 'struct spu *'. + */ +static int spu_active_notify(struct notifier_block * self, unsigned long val, + void * data) +{ + int retval ; + unsigned long flags = 0; + struct spu * the_spu = (struct spu *) data; + pr_debug("SPU event notification arrived\n"); + if (val == 0){ + spin_lock_irqsave(&cache_lock, flags); + retval = discard_cached_info(the_spu->number); + spin_unlock_irqrestore(&cache_lock, flags); + } else { + retval = process_context_switch(the_spu, val); + } + return retval; +} + +static struct notifier_block spu_active = { + .notifier_call = spu_active_notify, +}; + +/* The main purpose of this function is to synchronize + * OProfile with SPUFS by registering to be notified of + * SPU task switches. + * + * NOTE: When profiling SPUs, we must ensure that only + * spu_sync_start is invoked and not the generic sync_start + * in drivers/oprofile/oprof.c. A return value of + * SKIP_GENERIC_SYNC or SYNC_START_ERROR will + * accomplish this. + */ +int spu_sync_start(void) { + int ret = SKIP_GENERIC_SYNC; + int register_ret; + int i; + unsigned long flags = 0; + number_of_online_nodes(num_nodes); + num_spu_nodes = num_nodes * 8; + spin_lock_irqsave(&cache_lock, flags); + spu_info = kzalloc(sizeof(struct spu_info_stacks), GFP_ATOMIC); + spu_info->info_list = kzalloc(sizeof(struct list_head) * num_spu_nodes, + GFP_ATOMIC); + + for (i = 0; i < num_spu_nodes; i++) { + INIT_LIST_HEAD(&spu_info->info_list[i]); + } + spin_unlock_irqrestore(&cache_lock, flags); + + spin_lock_irqsave(&buffer_lock, flags); + add_event_entry(ESCAPE_CODE); + add_event_entry(SPU_PROFILING_CODE); + add_event_entry(num_spu_nodes); + spin_unlock_irqrestore(&buffer_lock, flags); + + /* Register for SPU events */ + register_ret = spu_switch_event_register(&spu_active); + if (register_ret) { + ret = SYNC_START_ERROR; + goto OUT; + } + + pr_debug("spu_sync_start -- running.\n"); +OUT: + return ret; +} + +/* Record SPU program counter samples to the oprofile event buffer. */ +void spu_sync_buffer(int spu_num, unsigned int * samples, + int num_samples) +{ + unsigned long flags = 0; + int i; + vma_map_t * map; + struct spu * the_spu; + unsigned long long spu_num_ll = spu_num; + unsigned long long spu_num_shifted = spu_num_ll << 32; + struct cached_info * c_info = get_cached_info(spu_num); + if (c_info == NULL) { +/* This legitimately happens when the SPU task ends before all + * samples are recorded. No big deal -- so we just drop a few samples. + */ + pr_debug("SPU_PROF: No cached SPU contex " + "for SPU #%d. Dropping samples.\n", spu_num); + return ; + } + + map = c_info->map; + the_spu = c_info->the_spu; + spin_lock_irqsave(&buffer_lock, flags); + for (i = 0; i < num_samples; i++) { + unsigned long long file_offset; + unsigned int sample = *(samples+i); + if (sample == 0) + continue; + file_offset = vma_map_lookup( + map, sample, the_spu); + /* For now, we'll drop samples that can't be mapped. + * This can happen for generated stubs executed from + * the SPU stack. Do we need to record these somehow? + */ + if (unlikely(file_offset == -1)) + continue; + add_event_entry(file_offset | spu_num_shifted); + } + spin_unlock_irqrestore(&buffer_lock, flags); + put_cached_info(c_info); +} + + +int spu_sync_stop(void) +{ + unsigned long flags = 0; + int ret = spu_switch_event_unregister(&spu_active); + if (ret) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: spu_switch_event_unregister returned %d\n", + __FUNCTION__, __LINE__, ret); + goto OUT; + } + + spin_lock_irqsave(&cache_lock, flags); + ret = discard_cached_info(DISCARD_ALL); + kfree(spu_info->info_list); + kfree(spu_info); + spin_unlock_irqrestore(&cache_lock, flags); + +OUT: + pr_debug("spu_sync_stop -- done.\n"); + return ret; +} + + Index: linux-2.6.20-rc1/arch/powerpc/oprofile/cell/vma_map.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/cell/vma_map.c 2007-01-29 10:32:03.401787328 -0600 @@ -0,0 +1,228 @@ + /* + * Cell Broadband Engine OProfile Support + * + * (C) Copyright IBM Corporation 2006 + * + * Author: Maynard Johnson <maynardj@us.ibm.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +/* The code in this source file is responsible for generating + * vma-to-fileOffset maps for both overlay and non-overlay SPU + * applications. + */ + +#include <linux/mm.h> +#include <linux/string.h> +#include <linux/uaccess.h> +#include <linux/elf.h> +#include "pr_util.h" + + +void vma_map_free(struct vma_map *map) +{ + while (map) { + vma_map_t *next = map->next; + kfree(map); + map = next; + } +} + +unsigned int vma_map_lookup(vma_map_t *map, unsigned int vma, + const struct spu * aSpu) +{ + u32 offset = -1; + u32 ovly_grd; + for (; map; map = map->next) { + if (vma < map->vma || vma >= map->vma + map->size) + continue; + + if (map->guard_ptr) { + ovly_grd = *(u32 *)(aSpu->local_store + map->guard_ptr); + if (ovly_grd != map->guard_val) + continue; + } + break; + } + + if (likely(map != NULL)) { + offset = vma - map->vma + map->offset; + } + return offset; +} + +static vma_map_t * +vma_map_add(vma_map_t *map, unsigned int vma, unsigned int size, + unsigned int offset, unsigned int guard_ptr, + unsigned int guard_val) +{ + vma_map_t *new = kzalloc(sizeof(vma_map_t), GFP_ATOMIC); + if (!new) { + printk(KERN_ERR "SPU_PROF: %s, line %d: malloc failed\n", + __FUNCTION__, __LINE__); + vma_map_free(map); + return NULL; + } + + new->next = map; + new->vma = vma; + new->size = size; + new->offset = offset; + new->guard_ptr = guard_ptr; + new->guard_val = guard_val; + + return new; +} + + +/* Parse SPE ELF header and generate a list of vma_maps. + * A pointer to the first vma_map in the generated list + * of vma_maps is returned. */ +vma_map_t * create_vma_map(const struct spu * aSpu, + unsigned long spu_elf_start) +{ + static const unsigned char expected[EI_PAD] = { + [EI_MAG0] = ELFMAG0, + [EI_MAG1] = ELFMAG1, + [EI_MAG2] = ELFMAG2, + [EI_MAG3] = ELFMAG3, + [EI_CLASS] = ELFCLASS32, + [EI_DATA] = ELFDATA2MSB, + [EI_VERSION] = EV_CURRENT, + [EI_OSABI] = ELFOSABI_NONE + }; + + struct vma_map *map = NULL; + unsigned int overlay_tbl_offset = -1; + unsigned long phdr_start, shdr_start; + Elf32_Ehdr ehdr; + Elf32_Phdr phdr; + Elf32_Shdr shdr, shdr_str; + Elf32_Sym sym; + int i, j; + char name[32]; + + unsigned int ovly_table_sym = 0; + unsigned int ovly_buf_table_sym = 0; + unsigned int ovly_table_end_sym = 0; + unsigned int ovly_buf_table_end_sym = 0; + unsigned long ovly_table; + unsigned int n_ovlys; + + struct { + unsigned int vma; + unsigned int size; + unsigned int offset; + unsigned int buf; + } ovly; + + /* Get and validate ELF header. */ + + copy_from_user(&ehdr, (void *) spu_elf_start, sizeof (ehdr)); + if (memcmp(ehdr.e_ident, expected, EI_PAD) != 0) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: Unexpected value parsing SPU ELF\n", + __FUNCTION__, __LINE__); + return NULL; + } + if (ehdr.e_machine != 23) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: Unexpected value parsing SPU ELF\n", + __FUNCTION__, __LINE__); + + return NULL; + } + if (ehdr.e_type != ET_EXEC) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: Unexpected value parsing SPU ELF\n", + __FUNCTION__, __LINE__); + return NULL; + } + phdr_start = spu_elf_start + ehdr.e_phoff; + shdr_start = spu_elf_start + ehdr.e_shoff; + + /* Traverse program headers. */ + for (i = 0; i < ehdr.e_phnum; i++) { + copy_from_user(&phdr, (void *) (phdr_start + i * sizeof(phdr)), + sizeof(phdr)); + if (phdr.p_type != PT_LOAD) + continue; + if (phdr.p_flags & (1 << 27)) + continue; + + map = vma_map_add(map, phdr.p_vaddr, phdr.p_memsz, + phdr.p_offset, 0, 0); + if (!map) + return NULL; + } + + pr_debug("SPU_PROF: Created non-overlay maps\n"); + /* Traverse section table and search for overlay-related symbols. */ + for (i = 0; i < ehdr.e_shnum; i++) { + copy_from_user(&shdr, (void *) (shdr_start + i * sizeof(shdr)), + sizeof(shdr)); + if (shdr.sh_type != SHT_SYMTAB) + continue; + if (shdr.sh_entsize != sizeof (sym)) + continue; + + copy_from_user(&shdr_str, + (void *) (shdr_start + shdr.sh_link * sizeof(shdr)), + sizeof(shdr)); + if (shdr_str.sh_type != SHT_STRTAB) + return NULL; + + for (j = 0; j < shdr.sh_size / sizeof (sym); j++) { + copy_from_user(&sym, (void *) (spu_elf_start + + shdr.sh_offset + j * sizeof (sym)), + sizeof (sym)); + copy_from_user(name, (void *) (spu_elf_start + shdr_str.sh_offset + + sym.st_name), + 20); + if (memcmp(name, "_ovly_table", 12) == 0) + ovly_table_sym = sym.st_value; + if (memcmp(name, "_ovly_buf_table", 16) == 0) + ovly_buf_table_sym = sym.st_value; + if (memcmp(name, "_ovly_table_end", 16) == 0) + ovly_table_end_sym = sym.st_value; + if (memcmp(name, "_ovly_buf_table_end", 20) == 0) + ovly_buf_table_end_sym = sym.st_value; + } + } + + /* If we don't have overlays, we're done. */ + if (ovly_table_sym == 0 || ovly_buf_table_sym == 0 + || ovly_table_end_sym == 0 || ovly_buf_table_end_sym == 0) { + pr_debug("SPU_PROF: No overlay table found\n"); + return map; + } + else { + pr_debug("SPU_PROF: Overlay table found\n"); + } + + overlay_tbl_offset = vma_map_lookup(map, ovly_table_sym, aSpu); + if (overlay_tbl_offset < 0) { + printk(KERN_ERR "SPU_PROF: " + "%s, line %d: Error finding SPU overlay table\n", + __FUNCTION__, __LINE__); + return NULL; + } + ovly_table = spu_elf_start + overlay_tbl_offset; + n_ovlys = (ovly_table_end_sym - ovly_table_sym) / sizeof (ovly); + + /* Traverse overlay table. */ + for (i = 0; i < n_ovlys; i++) { + copy_from_user(&ovly, (void *) (ovly_table + i * sizeof (ovly)), + sizeof (ovly)); + map = vma_map_add(map, ovly.vma, ovly.size, ovly.offset, + ovly_buf_table_sym + (ovly.buf - 1) * 4, i + 1); + if (!map) + return NULL; + } + + return map; +} Index: linux-2.6.20-rc1/arch/powerpc/oprofile/common.c =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/oprofile/common.c 2007-01-18 16:43:14.429510072 -0600 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/common.c 2007-01-29 10:32:03.403787024 -0600 @@ -150,6 +150,8 @@ #ifdef CONFIG_PPC_CELL_NATIVE case PPC_OPROFILE_CELL: model = &op_model_cell; + ops->sync_start = model->sync_start; + ops->sync_stop = model->sync_stop; break; #endif case PPC_OPROFILE_RS64: Index: linux-2.6.20-rc1/arch/powerpc/oprofile/Kconfig =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/oprofile/Kconfig 2007-01-18 16:43:14.426510528 -0600 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/Kconfig 2007-01-29 10:32:03.404786872 -0600 @@ -7,7 +7,8 @@ config OPROFILE tristate "OProfile system profiling (EXPERIMENTAL)" - depends on PROFILING + default m + depends on SPU_FS && PROFILING && CBE_CPUFREQ help OProfile is a profiling system capable of profiling the whole system, include the kernel, kernel modules, libraries, Index: linux-2.6.20-rc1/arch/powerpc/oprofile/Makefile =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/oprofile/Makefile 2007-01-18 16:43:14.429510072 -0600 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/Makefile 2007-01-29 10:32:03.405786720 -0600 @@ -11,7 +11,8 @@ timer_int.o ) oprofile-y := $(DRIVER_OBJS) common.o backtrace.o -oprofile-$(CONFIG_PPC_CELL_NATIVE) += op_model_cell.o +oprofile-$(CONFIG_PPC_CELL_NATIVE) += op_model_cell.o \ + cell/spu_profiler.o cell/vma_map.o cell/spu_task_sync.o oprofile-$(CONFIG_PPC64) += op_model_rs64.o op_model_power4.o oprofile-$(CONFIG_FSL_BOOKE) += op_model_fsl_booke.o oprofile-$(CONFIG_6xx) += op_model_7450.o Index: linux-2.6.20-rc1/arch/powerpc/oprofile/op_model_cell.c =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/oprofile/op_model_cell.c 2007-01-24 12:16:16.225609136 -0600 +++ linux-2.6.20-rc1/arch/powerpc/oprofile/op_model_cell.c 2007-01-29 10:32:03.410785960 -0600 @@ -37,6 +37,17 @@ #include <asm/system.h> #include "../platforms/cell/interrupt.h" +#include "cell/pr_util.h" + +/* spu_cycle_reset is the number of cycles between samples. + * This variable is used for SPU profiling and should ONLY be set + * at the beginning of cell_reg_setup; otherwise, it's read-only. + */ +static unsigned int spu_cycle_reset = 0; +unsigned int khzfreq; + +#define NUM_SPUS_PER_NODE 8 +#define SPU_CYCLES_EVENT_NUM 2 /* event number for SPU_CYCLES */ #define PPU_CYCLES_EVENT_NUM 1 /* event number for CYCLES */ #define PPU_CYCLES_GRP_NUM 1 /* special group number for identifying @@ -50,7 +61,6 @@ #define NUM_TRACE_BUS_WORDS 4 #define NUM_INPUT_BUS_WORDS 2 - struct pmc_cntrl_data { unsigned long vcntr; unsigned long evnts; @@ -134,6 +144,7 @@ /* * Firmware interface functions */ + static int rtas_ibm_cbe_perftools(int subfunc, int passthru, void *address, unsigned long length) @@ -480,7 +491,22 @@ struct op_system_config *sys, int num_ctrs) { int i, j, cpu; + spu_cycle_reset = 0; + /* The cpufreq_quick_get function requires that cbe_cpufreq module + * be loaded. This function is not actually provided and exported + * by cbe_cpufreq, but it relies on cbe_cpufreq initialize kernel + * data structures. Since there's no way for depmod to realize + * that our OProfile module depends on cbe_cpufreq, we currently + * are letting the userspace tool, opcontrol, ensure that the + * cbe_cpufreq module is loaded. + */ + khzfreq = cpufreq_quick_get(smp_processor_id()); + + if (ctr[0].event == SPU_CYCLES_EVENT_NUM) { + spu_cycle_reset = ctr[0].count; + return; + } pm_rtas_token = rtas_token("ibm,cbe-perftools"); if (pm_rtas_token == RTAS_UNKNOWN_SERVICE) { printk(KERN_WARNING "%s: RTAS_UNKNOWN_SERVICE\n", @@ -566,6 +592,8 @@ ; } + + /* This function is called once for each cpu */ static void cell_cpu_setup(struct op_counter_config *cntr) { @@ -573,6 +601,9 @@ u32 num_enabled = 0; int i; + if (spu_cycle_reset) + return; + /* There is one performance monitor per processor chip (i.e. node), * so we only need to perform this function once per node. */ @@ -607,7 +638,121 @@ ; } -static void cell_global_start(struct op_counter_config *ctr) +static int calculate_lfsr(int n) +{ +#define size 24 + int i; + unsigned int newlfsr0; + unsigned int lfsr = 0xFFFFFF; + unsigned int howmany = lfsr - n; + + for (i = 2; i < howmany + 2; i++) { + newlfsr0 = (((lfsr >> (size - 1 - 0)) & 1) ^ + ((lfsr >> (size - 1 - 1)) & 1) ^ + (((lfsr >> (size - 1 - 6)) & 1) ^ + ((lfsr >> (size - 1 - 23)) & 1))); + + lfsr >>= 1; + lfsr = lfsr | (newlfsr0 << (size - 1)); + } + return lfsr; + +} + +static void pm_rtas_activate_spu_profiling(u32 node) +{ + int ret, i; + struct pm_signal pm_signal_local[NR_PHYS_CTRS]; + + /* Set up the rtas call to configure the debug bus to + * route the SPU PCs. Setup the pm_signal for each SPU */ + for (i = 0; i < NUM_SPUS_PER_NODE; i++) { + pm_signal_local[i].cpu = node; + pm_signal_local[i].signal_group = 41; + pm_signal_local[i].bus_word = 1 << i / 2; /* spu i on + * word (i/2) + */ + pm_signal_local[i].sub_unit = i; /* spu i */ + pm_signal_local[i].bit = 63; + } + + pm_rtas_token = rtas_token("ibm,cbe-perftools"); + if (pm_rtas_token == RTAS_UNKNOWN_SERVICE) { + printk(KERN_WARNING "%s: RTAS_UNKNOWN_SERVICE \n", + __FUNCTION__); + } + + ret = rtas_ibm_cbe_perftools(SUBFUNC_ACTIVATE, PASSTHRU_ENABLE, + pm_signal_local, + 8 * sizeof(struct pm_signal)); //FIXME 8 to #define + + if (ret) + printk(KERN_WARNING "%s: rtas returned: %d\n", + __FUNCTION__, ret); + +} + +static void cell_global_start_spu(struct op_counter_config *ctr) +{ + int subfunc, rtn_value; + unsigned int lfsr_value; + int cpu; + + for_each_online_cpu(cpu) { + if (cbe_get_hw_thread_id(cpu)) + continue; + /* Setup SPU cycle-based profiling. + * Set perf_mon_control bit 0 to a zero before + * enabling spu collection hardware. + */ + cbe_write_pm(cpu, pm_control, 0); + + pm_rtas_activate_spu_profiling(cbe_cpu_to_node(cpu)); + + if (spu_cycle_reset > 0xFFFFFE) + lfsr_value = calculate_lfsr(1); /* use largest possible + * value + */ + else + lfsr_value = calculate_lfsr(spu_cycle_reset); + + if (lfsr_value == 0) { /* must use a non zero value. Zero + * disables data collection. + */ + lfsr_value = calculate_lfsr(1); /* use largest possible + * value + */ + } + + lfsr_value = lfsr_value << 8; /* shift lfsr to correct + * register location + */ + + pm_rtas_token = rtas_token("ibm,cbe-spu-perftools"); + + if (pm_rtas_token == RTAS_UNKNOWN_SERVICE) { + printk(KERN_ERR + "%s: rtas token ibm,cbe-spu-perftools unknown\n", + __FUNCTION__); + } + + subfunc = 2; // 2 - activate SPU tracing, 3 - deactivate + + rtn_value = rtas_call(pm_rtas_token, 3, 1, NULL, subfunc, + cbe_cpu_to_node(cpu), lfsr_value); + + if (rtn_value != 0) + printk(KERN_ERR + "%s: rtas call ibm,cbe-spu-perftools failed, return = %d\n", + __FUNCTION__, rtn_value); + } + + start_spu_profiling(spu_cycle_reset); + + oprofile_running = 1; +} + +static void cell_global_start_ppu(struct op_counter_config *ctr) { u32 cpu; u32 interrupt_mask = 0; @@ -652,7 +797,44 @@ start_virt_cntrs(); } -static void cell_global_stop(void) +static void cell_global_start(struct op_counter_config *ctr) +{ + if (spu_cycle_reset) { + cell_global_start_spu(ctr); + } else { + cell_global_start_ppu(ctr); + } +} + +static void cell_global_stop_spu(void) +{ + int subfunc, rtn_value; + unsigned int lfsr_value; + int cpu; + + oprofile_running = 0; + + for_each_online_cpu(cpu) { + if (cbe_get_hw_thread_id(cpu)) + continue; + + subfunc = 3; // 2 - activate SPU tracing, 3 - deactivate + lfsr_value = 0x8f100000; + + rtn_value = + rtas_call(pm_rtas_token, 3, 1, NULL, subfunc, + cbe_cpu_to_node(cpu), lfsr_value); + + if (rtn_value != 0) + printk + ("ERROR, rtas call ibm,cbe-spu-perftools failed, return = %d\n", + rtn_value); + } + + stop_spu_profiling(); +} + +static void cell_global_stop_ppu(void) { int cpu; @@ -680,6 +862,16 @@ } } +static void cell_global_stop(void) +{ + if (spu_cycle_reset) { + cell_global_stop_spu(); + } else { + cell_global_stop_ppu(); + } + +} + static void cell_handle_interrupt(struct pt_regs *regs, struct op_counter_config *ctr) { @@ -748,10 +940,33 @@ spin_unlock_irqrestore(&virt_cntr_lock, flags); } +/* This function is called from the generic OProfile + * driver. When profiling PPUs, we need to do the + * generic sync start; otherwise, do spu_sync_start. + */ +static int cell_sync_start(void) +{ + if (spu_cycle_reset) + return spu_sync_start(); + else + return DO_GENERIC_SYNC; +} + +static int cell_sync_stop(void) +{ + if (spu_cycle_reset) + return spu_sync_stop(); + else + return 1; +} + + struct op_powerpc_model op_model_cell = { .reg_setup = cell_reg_setup, .cpu_setup = cell_cpu_setup, .global_start = cell_global_start, .global_stop = cell_global_stop, + .sync_start = cell_sync_start, + .sync_stop = cell_sync_stop, .handle_interrupt = cell_handle_interrupt, }; Index: linux-2.6.20-rc1/arch/powerpc/platforms/cell/spufs/sched.c =================================================================== --- linux-2.6.20-rc1.orig/arch/powerpc/platforms/cell/spufs/sched.c 2007-01-26 16:16:35.219668640 -0600 +++ linux-2.6.20-rc1/arch/powerpc/platforms/cell/spufs/sched.c 2007-01-29 10:32:03.413785504 -0600 @@ -94,7 +94,7 @@ for (node = 0; node < MAX_NUMNODES; node++) { struct spu *spu; mutex_lock(&spu_prio->active_mutex[node]); - list_for_each_entry(spu, &spu_prio->active_list[node], list) { + list_for_each_entry(spu, &spu_prio->active_list[node], list) { struct spu_context *ctx = spu->ctx; spu->notify_active = 1; wake_up_all(&ctx->stop_wq); @@ -129,6 +129,7 @@ ctx->spu = spu; ctx->ops = &spu_hw_ops; spu->pid = current->pid; + spu->tgid = current->tgid; spu->prio = current->prio; spu->mm = ctx->owner; mm_needs_global_tlbie(spu->mm); @@ -161,6 +162,7 @@ spu->dma_callback = NULL; spu->mm = NULL; spu->pid = 0; + spu->tgid = 0; spu->prio = MAX_PRIO; ctx->ops = &spu_backing_ops; ctx->spu = NULL; Index: linux-2.6.20-rc1/drivers/oprofile/buffer_sync.c =================================================================== --- linux-2.6.20-rc1.orig/drivers/oprofile/buffer_sync.c 2007-01-18 16:43:11.675529376 -0600 +++ linux-2.6.20-rc1/drivers/oprofile/buffer_sync.c 2007-01-29 10:32:03.415785200 -0600 @@ -26,6 +26,7 @@ #include <linux/profile.h> #include <linux/module.h> #include <linux/fs.h> +#include <linux/oprofile.h> #include "oprofile_stats.h" #include "event_buffer.h" Index: linux-2.6.20-rc1/drivers/oprofile/event_buffer.h =================================================================== --- linux-2.6.20-rc1.orig/drivers/oprofile/event_buffer.h 2007-01-18 16:43:11.673529680 -0600 +++ linux-2.6.20-rc1/drivers/oprofile/event_buffer.h 2007-01-29 10:32:03.417784896 -0600 @@ -19,28 +19,10 @@ /* wake up the process sleeping on the event file */ void wake_up_buffer_waiter(void); - -/* Each escaped entry is prefixed by ESCAPE_CODE - * then one of the following codes, then the - * relevant data. - */ -#define ESCAPE_CODE ~0UL -#define CTX_SWITCH_CODE 1 -#define CPU_SWITCH_CODE 2 -#define COOKIE_SWITCH_CODE 3 -#define KERNEL_ENTER_SWITCH_CODE 4 -#define KERNEL_EXIT_SWITCH_CODE 5 -#define MODULE_LOADED_CODE 6 -#define CTX_TGID_CODE 7 -#define TRACE_BEGIN_CODE 8 -#define TRACE_END_CODE 9 - + #define INVALID_COOKIE ~0UL #define NO_COOKIE 0UL -/* add data to the event buffer */ -void add_event_entry(unsigned long data); - extern struct file_operations event_buffer_fops; /* mutex between sync_cpu_buffers() and the Index: linux-2.6.20-rc1/drivers/oprofile/oprof.c =================================================================== --- linux-2.6.20-rc1.orig/drivers/oprofile/oprof.c 2007-01-18 16:43:11.675529376 -0600 +++ linux-2.6.20-rc1/drivers/oprofile/oprof.c 2007-01-29 10:32:03.419784592 -0600 @@ -53,9 +53,23 @@ * us missing task deaths and eventually oopsing * when trying to process the event buffer. */ + if (oprofile_ops.sync_start) { + int sync_ret = oprofile_ops.sync_start(); + switch (sync_ret) { + case 0: goto post_sync; + break; + case 1: goto do_generic; + break; + case -1: goto out3; + break; + default: goto out3; + } + } +do_generic: if ((err = sync_start())) goto out3; +post_sync: is_setup = 1; mutex_unlock(&start_mutex); return 0; @@ -118,7 +132,19 @@ void oprofile_shutdown(void) { mutex_lock(&start_mutex); + if (oprofile_ops.sync_stop) { + int sync_ret = oprofile_ops.sync_stop(); + switch (sync_ret) { + case 0: goto post_sync; + break; + case 1: goto do_generic; + break; + default: goto post_sync; + } + } +do_generic: sync_stop(); +post_sync: if (oprofile_ops.shutdown) oprofile_ops.shutdown(); is_setup = 0; Index: linux-2.6.20-rc1/include/asm-powerpc/oprofile_impl.h =================================================================== --- linux-2.6.20-rc1.orig/include/asm-powerpc/oprofile_impl.h 2007-01-18 16:43:19.315566704 -0600 +++ linux-2.6.20-rc1/include/asm-powerpc/oprofile_impl.h 2007-01-29 10:32:03.421784288 -0600 @@ -47,6 +47,8 @@ void (*global_start) (struct op_counter_config *); void (*stop) (void); void (*global_stop) (void); + int (*sync_start)(void); + int (*sync_stop)(void); void (*handle_interrupt) (struct pt_regs *, struct op_counter_config *); int num_counters; Index: linux-2.6.20-rc1/include/asm-powerpc/spu.h =================================================================== --- linux-2.6.20-rc1.orig/include/asm-powerpc/spu.h 2007-01-24 12:17:30.209676992 -0600 +++ linux-2.6.20-rc1/include/asm-powerpc/spu.h 2007-01-29 10:32:03.423783984 -0600 @@ -128,6 +128,7 @@ struct spu_runqueue *rq; unsigned long long timestamp; pid_t pid; + pid_t tgid; int prio; int class_0_pending; spinlock_t register_lock; Index: linux-2.6.20-rc1/include/linux/oprofile.h =================================================================== --- linux-2.6.20-rc1.orig/include/linux/oprofile.h 2007-01-18 16:43:18.379575976 -0600 +++ linux-2.6.20-rc1/include/linux/oprofile.h 2007-01-29 10:32:03.425783680 -0600 @@ -17,6 +17,28 @@ #include <linux/spinlock.h> #include <asm/atomic.h> +/* Each escaped entry is prefixed by ESCAPE_CODE + * then one of the following codes, then the + * relevant data. + * These #defines live in this file so that arch-specific + * buffer sync'ing code can access them. + */ +#define ESCAPE_CODE ~0UL +#define CTX_SWITCH_CODE 1 +#define CPU_SWITCH_CODE 2 +#define COOKIE_SWITCH_CODE 3 +#define KERNEL_ENTER_SWITCH_CODE 4 +#define KERNEL_EXIT_SWITCH_CODE 5 +#define MODULE_LOADED_CODE 6 +#define CTX_TGID_CODE 7 +#define TRACE_BEGIN_CODE 8 +#define TRACE_END_CODE 9 +#define XEN_ENTER_SWITCH_CODE 10 +#define SPU_PROFILING_CODE 11 +#define SPU_CTX_SWITCH_CODE 12 +#define SPU_OFFSET_CODE 13 +#define SPU_COOKIE_CODE 14 + struct super_block; struct dentry; struct file_operations; @@ -35,6 +57,14 @@ int (*start)(void); /* Stop delivering interrupts. */ void (*stop)(void); + /* Arch-specific buffer sync functions. + * Return value = 0: Success + * Return value = -1: Failure + * Return value = 1: Run generic sync function + */ + int (*sync_start)(void); + int (*sync_stop)(void); + /* Initiate a stack backtrace. Optional. */ void (*backtrace)(struct pt_regs * const regs, unsigned int depth); /* CPU identification string. */ @@ -56,6 +86,13 @@ void oprofile_arch_exit(void); /** + * Add data to the event buffer. + * The data passed is free-form, but typically consists of + * file offsets, dcookies, context information, and ESCAPE codes. + */ +void add_event_entry(unsigned long data); + +/** * Add a sample. This may be called from any context. Pass * smp_processor_id() as cpu. */ Index: linux-2.6.20-rc1/kernel/hrtimer.c =================================================================== --- linux-2.6.20-rc1.orig/kernel/hrtimer.c 2007-01-18 16:43:05.808489704 -0600 +++ linux-2.6.20-rc1/kernel/hrtimer.c 2007-01-29 10:32:48.321748656 -0600 @@ -335,6 +335,7 @@ return orun; } +EXPORT_SYMBOL_GPL(hrtimer_forward); /* * enqueue_hrtimer - internal function to (re)start a timer ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:48 ` [RFC, PATCH 4/4] " Maynard Johnson @ 2007-01-30 7:39 ` Arnd Bergmann 2007-01-30 7:53 ` Benjamin Herrenschmidt ` (2 more replies) 2007-01-31 9:24 ` Milton Miller 1 sibling, 3 replies; 43+ messages in thread From: Arnd Bergmann @ 2007-01-30 7:39 UTC (permalink / raw) To: cbe-oss-dev, maynardj; +Cc: linuxppc-dev, linux-kernel, oprofile-list T24gTW9uZGF5IDI5IEphbnVhcnkgMjAwNyAyMDo0OCwgTWF5bmFyZCBKb2huc29uIHdyb3RlOgo+ IFN1YmplY3Q6IEFkZCBzdXBwb3J0IHRvIE9Qcm9maWxlIGZvciBwcm9maWxpbmcgQ2VsbCBCRSBT UFVzCj4gCj4gRnJvbTogTWF5bmFyZCBKb2huc29uIDxtYXluYXJkakB1cy5pYm0uY29tPgo+IAo+ IFRoaXMgcGF0Y2ggdXBkYXRlcyB0aGUgZXhpc3RpbmcgYXJjaC9wb3dlcnBjL29wcm9maWxlL29w X21vZGVsX2NlbGwuYwo+IHRvIGFkZCBpbiB0aGUgU1BVIHByb2ZpbGluZyBjYXBhYmlsaXRpZXMu IKBJbiBhZGRpdGlvbiwgYSAnY2VsbCcgc3ViZGlyZWN0b3J5Cj4gd2FzIGFkZGVkIHRvIGFyY2gv cG93ZXJwYy9vcHJvZmlsZSB0byBob2xkIENlbGwtc3BlY2lmaWMgU1BVIHByb2ZpbGluZwo+IGNv ZGUuCj4gCj4gU2lnbmVkLW9mZi1ieTogQ2FybCBMb3ZlIDxjYXJsbEB1cy5pYm0uY29tPgo+IFNp Z25lZC1vZmYtYnk6IE1heW5hcmQgSm9obnNvbiA8bXBqb2huQHVzLmlibS5jb20+CgpJIGNhbid0 IHJlYWxseSBzYXkgbXVjaCBhYm91dCB0aGUgY29tbW9uIG9wcm9maWxlIGZpbGVzIHRoYXQgeW91 IGFyZQp0b3VjaGluZywgbWF5YmUgc29tZW9uZSBmcm9tIG9wcm9maWxlLWxpc3QgKFBoaWxpcHBl PykgdG8gbG9vayBvdmVyIHRoZW0KYW5kIGFjay9uYWNrIHRoZW0uCgo+ICsjZGVmaW5lIG51bWJl cl9vZl9vbmxpbmVfbm9kZXMobm9kZXMpIHsgoCCgIKAgoFwKPiArIKAgoCCgIKB1MzIgY3B1OyB1 MzIgdG1wOyCgIKAgoCCgIKAgoCCgIKAgoCCgIKBcCj4gKyCgIKAgoCCgbm9kZXMgPSAwOyCgIKAg oCCgIKAgoCCgIKAgoCCgIKAgoCCgIKAgXAo+ICsgoCCgIKAgoGZvcl9lYWNoX29ubGluZV9jcHUo Y3B1KSB7IKAgoCCgIKAgoCCgIFwKPiArIKAgoCCgIKAgoCCgIKAgoHRtcCA9IGNiZV9jcHVfdG9f bm9kZShjcHUpICsgMTtcCj4gKyCgIKAgoCCgIKAgoCCgIKBpZiAodG1wID4gbm9kZXMpIKAgoCCg IKAgoCCgIKAgXAo+ICsgoCCgIKAgoCCgIKAgoCCgIKAgoCCgIKBub2RlcysrOyCgIKAgoCCgIKAg oCCgIFwKPiAroKCgoKCgoH0goCCgIKAgoCCgIKAgoCCgIKAgoCCgIKAgoCCgIKAgoCCgIKAgoFwK PiArfQoKSSd2ZSBiZWVuIGRpc2N1c3Npbmcgd2l0aCBiZW5oIGFib3V0IGEgYmV0dGVyIHdheSB0 byBkbyB0aGlzLiBXZSBzaG91bGQKY2hhbmdlIGFsbCB0aGUgcmVmZXJlbmNlcyB0byBub2RlcyBh bmQgY3B1IG51bWJlcnMgdG8gc29tZXRoaW5nIG1vcmUKY29ycmVjdCBpbiB0aGUgZnV0dXJlLCBz byB3ZSBnZXQgcmlkIG9mIHRoZSBhc3N1bXB0aW9uIHRoYXQgZWFjaApudW1hIG5vZGUgaXMgYSBj ZWxsIGNoaXAuIEl0J3MgcHJvYmFibHkgYmVzdCB0byBsZWF2ZSB5b3VyIGNvZGUgYXMKaXMgZm9y IG5vdywgYnV0IHdlIG5lZWQgdG8gcmVtZW1iZXIgdGhpcyBvbmUgd2hlbiBjbGVhbmluZyBpdCB1 cC4KCllvdSBzaG91bGQgaG93ZXZlciBjb252ZXJ0IHRoaXMgaW50byBhbiBpbmxpbmUgZnVuY3Rp b24Kb2YgcHJvdG90eXBlIAoKc3RhdGljIGlubGluZSBpbnQgbnVtYmVyX29mX29ubGluZV9ub2Rl cyh2b2lkKQoKaW5zdGVhZCBvZiBhIG1hY3JvLgoKPiArLyogRGVmaW5lcyB1c2VkIGZvciBzeW5j X3N0YXJ0ICovCj4gKyNkZWZpbmUgU0tJUF9HRU5FUklDX1NZTkMgMAo+ICsjZGVmaW5lIFNZTkNf U1RBUlRfRVJST1IgLTEKPiArI2RlZmluZSBET19HRU5FUklDX1NZTkMgMQo+ICsKPiArdHlwZWRl ZiBzdHJ1Y3Qgdm1hX21hcAo+ICt7Cj4gK6CgoKCgoKBzdHJ1Y3Qgdm1hX21hcCAqbmV4dDsKPiAr oKCgoKCgoHVuc2lnbmVkIGludCB2bWE7Cj4gK6CgoKCgoKB1bnNpZ25lZCBpbnQgc2l6ZTsKPiAr oKCgoKCgoHVuc2lnbmVkIGludCBvZmZzZXQ7Cj4gK6CgoKCgoKB1bnNpZ25lZCBpbnQgZ3VhcmRf cHRyOwo+ICugoKCgoKCgdW5zaWduZWQgaW50IGd1YXJkX3ZhbDsKPiArfSB2bWFfbWFwX3Q7Cgpw bGVhc2UgZG9uJ3QgdHlwZWRlZiBzdHJ1Y3R1cmVzLgoKPiArLyogVGhlIHRocmVlIGZ1bmN0aW9u cyBiZWxvdyBhcmUgZm9yIG1haW50YWluaW5nIGFuZCBhY2Nlc3NpbmcKPiArICogdGhlIHZtYS10 by1maWxlIG9mZnNldCBtYXAuCj4gKyAqLwo+ICt2bWFfbWFwX3QgKiBjcmVhdGVfdm1hX21hcChj b25zdCBzdHJ1Y3Qgc3B1ICogc3B1LCB1NjQgb2JqZWN0aWQpOwo+ICt1bnNpZ25lZCBpbnQgdm1h X21hcF9sb29rdXAodm1hX21hcF90ICptYXAsIHVuc2lnbmVkIGludCB2bWEsCj4gK6CgoKCgoKCg oKCgoKCgoKCgoKCgoKCgIKAgoGNvbnN0IHN0cnVjdCBzcHUgKiBhU3B1KTsKPiArdm9pZCB2bWFf bWFwX2ZyZWUoc3RydWN0IHZtYV9tYXAgKm1hcCk7Cj4gKwo+ICsvKgo+ICsgKiBFbnRyeSBwb2lu dCBmb3IgU1BVIHByb2ZpbGluZy4KPiArICogY3ljbGVzX3Jlc2V0IGlzIHRoZSBTUFVfQ1lDTEVT IGNvdW50IHZhbHVlIHNwZWNpZmllZCBieSB0aGUgdXNlci4KPiArICovCj4gK3ZvaWQgc3RhcnRf c3B1X3Byb2ZpbGluZyh1bnNpZ25lZCBpbnQgY3ljbGVzX3Jlc2V0KTsKPiArCj4gK3ZvaWQgc3Rv cF9zcHVfcHJvZmlsaW5nKHZvaWQpOwo+ICsKPiArIAo+ICsvKiBhZGQgdGhlIG5lY2Vzc2FyeSBw cm9maWxpbmcgaG9va3MgKi8KPiAraW50IHNwdV9zeW5jX3N0YXJ0KHZvaWQpOwo+ICsKPiArLyog cmVtb3ZlIHRoZSBob29rcyAqLwo+ICtpbnQgc3B1X3N5bmNfc3RvcCh2b2lkKTsKPiArIAo+ICsv KiBSZWNvcmQgU1BVIHByb2dyYW0gY291bnRlciBzYW1wbGVzIHRvIHRoZSBvcHJvZmlsZSBldmVu dCBidWZmZXIuICovCj4gK3ZvaWQgc3B1X3N5bmNfYnVmZmVyKGludCBzcHVfbnVtLCB1bnNpZ25l ZCBpbnQgKiBzYW1wbGVzLCAKPiAroKCgoKCgoKCgoKCgoKCgIKAgoCBpbnQgbnVtX3NhbXBsZXMp Owo+ICsKPiArI2VuZGlmIKAgoC8vIFBSX1VUSUxfSCAKPiBJbmRleDogbGludXgtMi42LjIwLXJj MS9hcmNoL3Bvd2VycGMvb3Byb2ZpbGUvY2VsbC9zcHVfcHJvZmlsZXIuYwo+ID09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0K PiAtLS0gL2Rldi9udWxsoKCgMTk3MC0wMS0wMSAwMDowMDowMC4wMDAwMDAwMDAgKzAwMDAKPiAr KysgbGludXgtMi42LjIwLXJjMS9hcmNoL3Bvd2VycGMvb3Byb2ZpbGUvY2VsbC9zcHVfcHJvZmls ZXIuY6CgMjAwNy0wMS0yOSAxMDozMjowMy4zOTI3ODg2OTYgLTA2MDAKPiBAQCAtMCwwICsxLDIw NCBAQAo+ICsvKgo+ICsgKiBDZWxsIEJyb2FkYmFuZCBFbmdpbmUgT1Byb2ZpbGUgU3VwcG9ydAo+ ICsgKgo+ICsgKiAoQykgQ29weXJpZ2h0IElCTSBDb3Jwb3JhdGlvbiAyMDA2Cj4gKyAqCj4gKyAq IEF1dGhvcnM6IE1heW5hcmQgSm9obnNvbiA8bWF5bmFyZGpAdXMuaWJtLmNvbT4KPiArICogoCCg IKAgoCCgQ2FybCBMb3ZlIDxjYXJsbEB1cy5pYm0uY29tPgo+ICsgKgo+ICsgKiBUaGlzIHByb2dy YW0gaXMgZnJlZSBzb2Z0d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUgaXQgYW5kL29yCj4gKyAq IG1vZGlmeSBpdCB1bmRlciB0aGUgdGVybXMgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNl bnNlCj4gKyAqIGFzIHB1Ymxpc2hlZCBieSB0aGUgRnJlZSBTb2Z0d2FyZSBGb3VuZGF0aW9uOyBl aXRoZXIgdmVyc2lvbgo+ICsgKiAyIG9mIHRoZSBMaWNlbnNlLCBvciAoYXQgeW91ciBvcHRpb24p IGFueSBsYXRlciB2ZXJzaW9uLgo+ICsgKi8KPiArCj4gKyNpbmNsdWRlIDxsaW51eC9ocnRpbWVy Lmg+Cj4gKyNpbmNsdWRlIDxsaW51eC9zbXAuaD4KPiArI2luY2x1ZGUgPGxpbnV4L3NsYWIuaD4K PiArI2luY2x1ZGUgPGFzbS9jZWxsLXBtdS5oPgo+ICsjaW5jbHVkZSAicHJfdXRpbC5oIgo+ICsK PiArI2RlZmluZSBUUkFDRV9BUlJBWV9TSVpFIDEwMjQKPiArc3RhdGljIHUzMiAqIHNhbXBsZXM7 Cj4gK3N0YXRpYyB1MzIgKiBzYW1wbGVzX3Blcl9ub2RlOwo+ICsKPiArc3RhdGljIGludCBzcHVf cHJvZl9ydW5uaW5nID0gMDsKPiArc3RhdGljIHVuc2lnbmVkIGludCBwcm9maWxpbmdfaW50ZXJ2 YWwgPSAwOwo+ICsKPiArZXh0ZXJuIGludCBudW1fbm9kZXM7Cj4gK2V4dGVybiB1bnNpZ25lZCBp bnQga2h6ZnJlcTsKCllvdSByZWFsbHkgY2FuJ3QgaGF2ZSBnbG9iYWwgdmFyaWFibGUgd2l0aCBz dWNoIGdlbmVyaWMgbmFtZXMuIEZvcgpzdGF0aWMgdmFyaWFibGVzLCBpdCdzIGxlc3Mgb2YgYSBw cm9ibGVtLCBzaW5jZSB0aGV5IGFyZSBub3QgaW4gdGhlCnNhbWUgbmFtZSBzcGFjZSwgYnV0IGZv ciBlYXNpZXIgZGVidWdnaW5nLCBpdCdzIGdvb2QgdG8gYWx3YXlzIGhhdmUKdGhlIG5hbWUgb2Yg eW91ciBtb2R1bGUgKGUuZy4gc3B1X3Byb2ZfKSBhcyBhIHByZWZpeCB0byB0aGUgaWRlbnRpZmll ci4KCk9mIGNvdXJzZSwgdGhlIGJlc3Qgd2F5IHdvdWxkIGJlIHRvIGF2b2lkIGdsb2JhbCBhbmQg c3RhdGljIHZhcmlhYmxlcwplbnRpcmVseSwgYnV0IHRoYXQncyBub3QgYWx3YXlzIHBvc3NpYmxl LgoKPiArLyoKPiArICogT3Byb2ZpbGUgc2V0dXAgZnVuY3Rpb25zCj4gKyAqLwo+ICsKPiArI2Rl ZmluZSBOVU1fU1BVX0JJVFNfVFJCVUYgMTYKPiArI2RlZmluZSBTUFVTX1BFUl9UQl9FTlRSWSCg IDQKPiArI2RlZmluZSBTUFVTX1BFUl9OT0RFIKAgoCCgIDgKPiArCj4gKy8qIAo+ICsgKiBDb2xs ZWN0IHRoZSBTUFUgcHJvZ3JhbSBjb3VudGVyIHNhbXBsZXMgZnJvbSB0aGUgdHJhY2UgYnVmZmVy Lgo+ICsgKiBUaGUgZ2xvYmFsIHZhcmlhYmxlIHVzYWdlIGlzIGFzIGZvbGxvd3M6Cj4gKyAqIKAg oHNhbXBsZXNbPHRvdGFsLXNwdXM+XVtUUkFDRV9BUlJBWV9TSVpFXSAtIGFycmF5IHRvIHN0b3Jl IFNQVSBQQyBzYW1wbGVzCj4gKyAqIKAgoCCgIKAgoCBBc3N1bXB0aW9uLCB0aGUgYXJyYXkgd2ls bCBiZSBhbGwgemVyb3Mgb24gZW50cnkuCj4gKyAqIKAgoHUzMiBzYW1wbGVzX3Blcl9ub2RlW251 bV9ub2Rlc10gLSBhcnJheSBvZiBob3cgbWFueSB2YWxpZCBzYW1wbGVzIHBlciBub2RlCj4gKyAq Lwo+ICtzdGF0aWMgdm9pZCBjZWxsX3NwdV9wY19jb2xsZWN0aW9uKHZvaWQpCj4gK3sKPiAroKCg oKCgoGludCBjcHU7Cj4gK6CgoKCgoKBpbnQgbm9kZTsKPiAroKCgoKCgoGludCBzcHU7Cj4gK6Cg oKCgoKB1MzIgdHJhY2VfYWRkcjsKPiArIKAgoCCgIKAvKiB0aGUgdHJhY2UgYnVmZmVyIGlzIDEy OCBiaXRzICovCj4gK6CgoKCgoKB1NjQgdHJhY2VfYnVmZmVyWzJdOwo+ICugoKCgoKCgdTY0IHNw dV9wY19sb3dlcjsgoAo+ICugoKCgoKCgdTY0IHNwdV9wY191cHBlcjsKPiAroKCgoKCgoHU2NCBz cHVfbWFzazsKPiAroKCgoKCgoGludCBlbnRyeSwgbm9kZV9mYWN0b3I7Cj4gK6CgoKCgoKAvLyBw cm9jZXNzIHRoZSBjb2xsZWN0ZWQgU1BVIFBDIGZvciBlYWNoIG5vZGUKPiAroKCgoKCgoGZvcl9l YWNoX29ubGluZV9jcHUoY3B1KSB7Cj4gK6CgoKCgoKCgoKCgoKCgoGlmIChjYmVfZ2V0X2h3X3Ro cmVhZF9pZChjcHUpKQo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoGNvbnRpbnVlOwo+ICsKPiAr oKCgoKCgoKCgoKCgoKCgbm9kZSA9IGNiZV9jcHVfdG9fbm9kZShjcHUpOwo+ICugoKCgoKCgoKCg oKCgoKBub2RlX2ZhY3RvciA9IG5vZGUgKiBTUFVTX1BFUl9OT0RFOwo+ICsgoCCgIKAgoCCgIKAg oCCgLyogbnVtYmVyIG9mIHZhbGlkIGVudHJpZXMgZm9yIHRoaXMgbm9kZSAqLwo+ICugoKCgoKCg oKCgoKCgoKBlbnRyeSA9IDA7Cj4gKwo+ICugoKCgoKCgoKCgoKCgoKB0cmFjZV9hZGRyID0gY2Jl X3JlYWRfcG0oY3B1LCB0cmFjZV9hZGRyZXNzKTsKPiAroKCgoKCgoKCgoKCgoKCgd2hpbGUgKCh0 cmFjZV9hZGRyICYgQ0JFX1BNX1RSQUNFX0JVRl9FTVBUWSkgIT0gMHg0MDApCj4gK6CgoKCgoKCg oKCgoKCgoHsKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKAvKiB0aGVyZSBpcyBkYXRhIGluIHRo ZSB0cmFjZSBidWZmZXIgdG8gcHJvY2VzcyAqLwo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoGNi ZV9yZWFkX3RyYWNlX2J1ZmZlcihjcHUsIHRyYWNlX2J1ZmZlcik7IKAKPiAroKCgoKCgoKCgoKCg oKCgoKCgoKCgoKBzcHVfbWFzayA9IDB4RkZGRjAwMDAwMDAwMDAwMDsKPiArCj4gK6CgoKCgoKCg oKCgoKCgoKCgoKCgoKCgLyogRWFjaCBTUFUgUEMgaXMgMTYgYml0czsgaGVuY2UsIGZvdXIgc3B1 cyBpbiBlYWNoIG9mIAo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoCAqIHRoZSB0d28gNjQtYml0 IGJ1ZmZlciBlbnRyaWVzIHRoYXQgbWFrZSB1cCB0aGUKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCg oKAgKiAxMjgtYml0IHRyYWNlX2J1ZmZlciBlbnRyeS4goFByb2Nlc3MgdGhlIHVwcGVyIGFuZAo+ ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoCAqIGxvd2VyIDY0LWJpdCB2YWx1ZXMgc2ltdWx0YW5l b3VzbHkuCj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgICovCj4gK6CgoKCgoKCgoKCgoKCgoKCg oKCgoKCgZm9yIChzcHUgPSAwOyBzcHUgPCBTUFVTX1BFUl9UQl9FTlRSWTsgc3B1KyspIHsKPiAr oKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoHNwdV9wY19sb3dlciA9IHNwdV9tYXNrICYg dHJhY2VfYnVmZmVyWzBdOwo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgc3B1X3Bj X2xvd2VyID0gc3B1X3BjX2xvd2VyID4+IChOVU1fU1BVX0JJVFNfVFJCVUYKPiAroKCgoKCgoKCg oKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgKiAo U1BVU19QRVJfVEJfRU5UUlktc3B1LTEpKTsKPiArCj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCg oKCgoKCgoKBzcHVfcGNfdXBwZXIgPSBzcHVfbWFzayAmIHRyYWNlX2J1ZmZlclsxXTsKPiAroKCg oKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoHNwdV9wY191cHBlciA9IHNwdV9wY191cHBlciA+ PiAoTlVNX1NQVV9CSVRTX1RSQlVGCj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCg oKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoCogKFNQVVNfUEVSX1RCX0VOVFJZLXNwdS0x KSk7Cj4gKwo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgc3B1X21hc2sgPSBzcHVf bWFzayA+PiBOVU1fU1BVX0JJVFNfVFJCVUY7Cj4gKwo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCg oKCgoKCgoKCgLyogc3B1IFBDIHRyYWNlIGVudHJ5IGlzIHVwcGVyIDE2IGJpdHMgb2YgdGhlCj4g K6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKAgKiAxOCBiaXQgU1BVIHByb2dyYW0gY291 bnRlciAKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoCAqLwo+ICugoKCgoKCgoKCg oKCgoKCgoKCgoKCgoKCgoKCgoKCgc3B1X3BjX2xvd2VyID0gc3B1X3BjX2xvd2VyIDw8IDI7Cj4g K6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKBzcHVfcGNfdXBwZXIgPSBzcHVfcGNfdXBw ZXIgPDwgMjsKPiArCj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKBzYW1wbGVzWygo bm9kZV9mYWN0b3IgKyBzcHUpICogVFJBQ0VfQVJSQVlfU0laRSkgKyBlbnRyeV0KPiAroKCgoKCg oKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgPSAodTMyKSBzcHVfcGNfbG93ZXI7Cj4g K6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKBzYW1wbGVzWygobm9kZV9mYWN0b3IgKyBz cHUgKyBTUFVTX1BFUl9UQl9FTlRSWSkgKiBUUkFDRV9BUlJBWV9TSVpFKSArIGVudHJ5XQo+ICug oKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKA9ICh1MzIpIHNwdV9wY191cHBl cjsKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKB9Cj4gKwo+ICugoKCgoKCgoKCgoKCgoKCgoKCg oKCgoGVudHJ5Kys7Cj4gKwo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoGlmIChlbnRyeSA+PSBU UkFDRV9BUlJBWV9TSVpFKSAKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoC8qIHNw dV9zYW1wbGVzIGlzIGZ1bGwgKi8KPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoGJy ZWFrOwo+ICsKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKB0cmFjZV9hZGRyID0gY2JlX3JlYWRf cG0oY3B1LCB0cmFjZV9hZGRyZXNzKTsKPiAroKCgoKCgoKCgoKCgoKCgfQo+ICugoKCgoKCgoKCg oKCgoKBzYW1wbGVzX3Blcl9ub2RlW25vZGVdID0gZW50cnk7Cj4gK6CgoKCgoKB9Cj4gK30KCldo aWxlIEkgY2FuJ3Qgc2VlIGFueXRoaW5nIHRlY2huaWNhbGx5IHdyb25nIHdpdGggdGhpcyBmdW5j dGlvbiwgaXQgd291bGQgYmUKZ29vZCB0byBzcGxpdCBpdCBpbnRvIHNtYWxsZXIgZnVuY3Rpb25z LiBTaW5jZSB5b3UgYXJlIG5lc3RpbmcgdGhyZWUKbG9vcHMsIGl0IHNob3VsZCBiZSBwb3NzaWJs ZSB0byBtYWtlIGEgc2VwYXJhdGUgZnVuY3Rpb24gZnJvbSBvbmUgb2YgdGhlCmlubmVyIGxvb3Bz IHdpdGhvdXQgY2hhbmdpbmcgdGhlIGFjdHVhbCBsb2dpYyBiZWhpbmQgaXQuCgo+ICsKPiArc3Rh dGljIGludCBwcm9maWxlX3NwdXMoc3RydWN0IGhydGltZXIgKiB0aW1lcikKPiArewo+ICugoKCg oKCga3RpbWVfdCBrdDsKPiArIKAgoCCgIKBpbnQgY3B1LCBub2RlLCBrLCBudW1fc2FtcGxlcywg c3B1X251bTsKCndoaXRlc3BhY2UgZGFtYWdlCgo+ICugoKCgoKCgCj4gK6CgoKCgoKBpZiAoIXNw dV9wcm9mX3J1bm5pbmcpCj4gK6CgoKCgoKCgoKCgoKCgoGdvdG8gU1RPUDsKPiArCj4gK6CgoKCg oKBjZWxsX3NwdV9wY19jb2xsZWN0aW9uKCk7Cj4gK6CgoKCgoKBmb3JfZWFjaF9vbmxpbmVfY3B1 KGNwdSkgewo+ICugoKCgoKCgoKCgoKCgoKBpZiAoY2JlX2dldF9od190aHJlYWRfaWQoY3B1KSkK PiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKBjb250aW51ZTsKCkhlcmUsIHlvdSBlbnRlciB0aGUg c2FtZSB0b3AtbGV2ZWwgbG9vcCBhZ2Fpbiwgd2h5IG5vdCBtYWtlIGl0Cglmb3JfZWFjaF9vbmxp bmVfY3B1KGNwdSkgewoJCWlmIChjYmVfZ2V0X2h3X3RocmVhZF9pZChjcHUpKQogICAgICAgICAg ICAgICAgICAgICAgICAgY29udGludWU7CgkJbnVtX3NhbXBsZXMgPSBjZWxsX3NwdV9wY19jb2xs ZWN0aW9uKGNwdSk7CgkJLi4uCgo+ICugoKCgoKCga3QgPSBrdGltZV9zZXQoMCwgcHJvZmlsaW5n X2ludGVydmFsKTsKPiAroKCgoKCgoGlmICghc3B1X3Byb2ZfcnVubmluZykKPiAroKCgoKCgoKCg oKCgoKCgZ290byBTVE9QOwo+ICugoKCgoKCgaHJ0aW1lcl9mb3J3YXJkKHRpbWVyLCB0aW1lci0+ YmFzZS0+Z2V0X3RpbWUoKSwga3QpOwo+ICugoKCgoKCgcmV0dXJuIEhSVElNRVJfUkVTVEFSVDsK CmlzIGhydGltZXJfZm9yd2FyZCByZWFsbHkgdGhlIHJpZ2h0IGludGVyZmFjZSBoZXJlPyBZb3Ug YXJlIGlnbm9yaW5nCnRoZSBudW1iZXIgb2Ygb3ZlcnJ1bnMgYW55d2F5LCBzbyBocnRpbWVyX3N0 YXJ0KCwsKSBzb3VuZHMgbW9yZQpjb3JyZWN0IHRvIG1lLgoKPiArCj4gKyBTVE9QOgoKbGFiZWxz IHNob3VsZCBiZSBpbiBzbWFsbCBsZXR0ZXJzLgoKPiAroKCgoKCgoHByaW50ayhLRVJOX0lORk8g IlNQVV9QUk9GOiBzcHUtcHJvZiB0aW1lciBlbmRpbmdcbiIpOwo+ICugoKCgoKCgcmV0dXJuIEhS VElNRVJfTk9SRVNUQVJUOwo+ICt9Cgo+ICt2b2lkIHN0YXJ0X3NwdV9wcm9maWxpbmcodW5zaWdu ZWQgaW50IGN5Y2xlc19yZXNldCkgewo+ICsKPiAroKCgoKCgoGt0aW1lX3Qga3Q7Cj4gKwo+ICsg oCCgIKAgoC8qIFRvIGNhbGN1bGF0ZSBhIHRpbWVvdXQgaW4gbmFub3NlY29uZHMsIHRoZSBiYXNp Ywo+ICugoKCgoKCgICogZm9ybXVsYSBpcyBucyA9IGN5Y2xlc19yZXNldCAqIChOU0VDX1BFUl9T RUMgLyBjcHUgZnJlcXVlbmN5KS4KPiAroKCgoKCgoCAqIFRvIGF2b2lkIGZsb2F0aW5nIHBvaW50 IG1hdGgsIHdlIHVzZSB0aGUgc2NhbGUgbWF0aAo+ICugoKCgoKCgICogdGVjaG5pcXVlIGFzIGRl c2NyaWJlZCBpbiBsaW51eC9qaWZmaWVzLmguIKBXZSB1c2UKPiAroKCgoKCgoCAqIGEgc2NhbGUg ZmFjdG9yIG9mIFNDQUxFX1NISUZULHdoaWNoIHByb3ZpZGVzIDQgZGVjaW1hbCBwbGFjZXMKPiAr oKCgoKCgoCAqIG9mIHByZWNpc2lvbiwgd2hpY2ggaXMgY2xvc2UgZW5vdWdoIGZvciB0aGUgcHVy cG9zZSBhdCBoYW5kLgo+ICugoKCgoKCgICovCj4gKwo+ICugoKCgoKCgLyogU2luY2UgY3B1ZnJl cV9xdWlja19nZXQgcmV0dXJucyBmcmVxdWVuY3kgaW4ga0h6LCB3ZSB1c2UKPiAroKCgoKCgoCAq IFVTRUNfUEVSX1NFQyBoZXJlIHZzIE5TRUNfUEVSX1NFQy4KPiAroKCgoKCgoCAqLwo+ICugoKCg oKCgdW5zaWduZWQgbG9uZyBuc1BlckN5YyA9IChVU0VDX1BFUl9TRUMgPDwgU0NBTEVfU0hJRlQp L2toemZyZXE7Cj4gK6CgoKCgoKBwcm9maWxpbmdfaW50ZXJ2YWwgPSAobnNQZXJDeWMgKiBjeWNs ZXNfcmVzZXQpID4+IFNDQUxFX1NISUZUOwo+ICugoKCgoKCgCj4gK6CgoKCgoKBwcl9kZWJ1Zygi dGltZXIgcmVzb2x1dGlvbjogJWx1XG4iLCAKPiAroKCgoKCgoKCgoKCgoKCgIFRJQ0tfTlNFQyk7 CgpEb24ndCB5b3UgbmVlZCB0byBhZGFwdCB0aGUgcHJvZmlsaW5nX2ludGVydmFsIGF0IHJ1biB0 aW1lLCB3aGVuIGNwdWZyZXEKY2hhbmdlcyB0aGUgY29yZSBmcmVxdWVuY3k/IFlvdSBzaG91bGQg cHJvYmFibHkgdXNlCmNwdWZyZXFfcmVnaXN0ZXJfbm90aWZpZXIoKSB0byB1cGRhdGUgdGhpcy4K Cj4gK6CgoKCgoKBrdCA9IGt0aW1lX3NldCgwLCBwcm9maWxpbmdfaW50ZXJ2YWwpOwo+ICugoKCg oKCgaHJ0aW1lcl9pbml0KCZ0aW1lciwgQ0xPQ0tfTU9OT1RPTklDLCBIUlRJTUVSX1JFTCk7Cj4g K6CgoKCgoKB0aW1lci5leHBpcmVzID0ga3Q7Cj4gK6CgoKCgoKB0aW1lci5mdW5jdGlvbiA9IHBy b2ZpbGVfc3B1czsKPiArCj4gKyCgIKAgoCCgLyogQWxsb2NhdGUgYXJyYXlzIGZvciBjb2xsZWN0 aW5nIFNQVSBQQyBzYW1wbGVzICovCj4gK6CgoKCgoKBzYW1wbGVzID0gKHUzMiAqKSBremFsbG9j KG51bV9ub2RlcyAqIFNQVVNfUEVSX05PREUgKiBUUkFDRV9BUlJBWV9TSVpFICogc2l6ZW9mKHUz MiksIEdGUF9BVE9NSUMpOwoKVHJ5IHRvIGF2b2lkIGF0b21pYyBhbGxvY2F0aW9ucy4gSSBkb24n dCB0aGluayB5b3UgYXJlIGluIGFuIGF0b21pYwpjb250ZXh0IGhlcmUgYXQgYWxsLCBzbyB5b3Ug Y2FuIGp1c3QgdXNlIEdGUF9LRVJORUwuCgo+ICugoKCgoKCgc2FtcGxlc19wZXJfbm9kZSA9ICh1 MzIgKikga3phbGxvYyhudW1fbm9kZXMgKiBzaXplb2YodTMyKSwgR0ZQX0FUT01JQyk7CgpTaW5j ZSBNQVhfTlVNTk9ERVMgaXMgc21hbGwsIGl0J3MgcHJvYmFibHkgbW9yZSBlZmZpY2llbnQgdG8g anVzdCBhbGxvY2F0ZSB0aGlzCnN0YXRpY2FsbHkuCgo+ICsKPiAroKCgoKCgoHNwdV9wcm9mX3J1 bm5pbmcgPSAxOwo+ICugoKCgoKCgaHJ0aW1lcl9zdGFydCgmdGltZXIsIGt0LCBIUlRJTUVSX1JF TCk7Cj4gK30KPgo+ICsKPiArdm9pZCBzdG9wX3NwdV9wcm9maWxpbmcodm9pZCkgCj4gK3sKPiAr Cj4gK6CgoKCgoKBocnRpbWVyX2NhbmNlbCgmdGltZXIpOwo+ICugoKCgoKCga2ZyZWUoc2FtcGxl cyk7Cj4gK6CgoKCgoKBrZnJlZShzYW1wbGVzX3Blcl9ub2RlKTsKPiAroKCgoKCgoHByX2RlYnVn KCJTUFVfUFJPRjogc3RvcF9zcHVfcHJvZmlsaW5nIGlzc3VlZFxuIik7Cj4gK6CgoKCgoKBzcHVf cHJvZl9ydW5uaW5nID0gMDsKPiArfQoKc2hvdWxkbid0IHlvdSBzZXQgc3B1X3Byb2ZfcnVubmlu ZyA9IDAgYmVmb3JlIGRvaW5nIGFueSBvZiB0aGUgb3RoZXIgdGhpbmdzPwpJdCBsb29rcyB0byBt ZSBsaWtlIHlvdSBjb3VsZCBvdGhlcndpc2UgZ2V0IGludG8gYSB1c2UtYWZ0ZXItZnJlZQpzaXR1 YXRpb24uIElmIEknbSB3cm9uZyB3aXRoIHRoYXQsIHlvdSBwcm9iYWJseSBkb24ndCBuZWVkIHNw dV9wcm9mX3J1bm5pbmcKYXQgYWxsIDstKQoKPiArLyogQ29uYWluZXIgZm9yIGNhY2hpbmcgaW5m b3JtYXRpb24gYWJvdXQgYW4gYWN0aXZlIFNQVSB0YXNrLgo+ICsgKiCgIDptYXAgLS0gcG9pbnRl ciB0byBhIGxpc3Qgb2Ygdm1hX21hcHMKPiArICogoCA6c3B1IC0tIHRoZSBzcHUgZm9yIHRoaXMg YWN0aXZlIFNQVSB0YXNrCj4gKyAqIKAgOmxpc3QgLS0gcG90ZW50aWFsbHkgY291bGQgYmUgdXNl ZCB0byBjb250YWluIHRoZSBjYWNoZWRfaW5mb3MKPiArICogoCCgIKAgoCCgIKBmb3IgaW5hY3Rp dmUgU1BVIHRhc2tzLgoKRG9jdW1lbnRpbmcgc3RydWN0dXJlcyBpcyBnb29kLCBidXQgcGxlYXNl IHVzZSB0aGUgY29tbW9uIGtlcm5lbGRvYyBmb3JtYXQKZm9yIGl0LiBUaGVyZSBhcmUgYSBudW1i ZXIgb2YgZXhhbXBsZXMgZm9yIHRoaXMgaW4gaW5jbHVkZS9saW51eC8KCj4gKyAqIAo+ICsgKiBJ ZGVhbGx5LCB3ZSB3b3VsZCBsaWtlIHRvIGJlIGFibGUgdG8gY3JlYXRlIHRoZSBjYWNoZWRfaW5m byBmb3IKPiArICogYW4gU1BVIHRhc2sganVzdCBvbmUgdGltZSAtLSB3aGVuIGxpYnNwZSBmaXJz dCBsb2FkcyB0aGUgU1BVIAo+ICsgKiBiaW5hcnkgZmlsZS4goFdlIHdvdWxkIHN0b3JlIHRoZSBj YWNoZWRfaW5mbyBpbiBhIGxpc3QuIKBUaGVuLCBhcwo+ICsgKiBTUFUgdGFza3MgYXJlIHN3aXRj aGVkIG91dCBhbmQgbmV3IG9uZXMgc3dpdGNoZWQgaW4sIHRoZSBjYWNoZWRfaW5mbwo+ICsgKiBm b3IgaW5hY3RpdmUgdGFza3Mgd291bGQgYmUga2VwdCwgYW5kIHRoZSBhY3RpdmUgb25lIHdvdWxk IGJlIHBsYWNlZAo+ICsgKiBhdCB0aGUgaGVhZCBvZiB0aGUgbGlzdC4goEJ1dCB0aGlzIHRlY2hu aXF1ZSBtYXkgbm90IHdpdGgKPiArICogY3VycmVudCBzcHVmcyBmdW5jdGlvbmFsaXR5IHNpbmNl IHRoZSBzcHUgdXNlZCBpbiBiaW5kX2NvbnRleHQgbWF5Cj4gKyAqIGJlIGEgZGlmZmVyZW50IHNw dSB0aGFuIHdhcyB1c2VkIGluIGEgcHJldmlvdXMgYmluZF9jb250ZXh0IGZvciBhCj4gKyAqIHJl YWN0aXZhdGVkIFNQVSB0YXNrLiCgQWRkaXRpb25hbGx5LCBhIHJlYWN0aXZhdGVkIFNQVSB0YXNr IG1heSBiZQo+ICsgKiBhc3NpZ25lZCB0byBydW4gb24gYSBkaWZmZXJlbnQgcGh5c2ljYWwgU1BF LiCgV2Ugd2lsbCBpbnZlc3RpZ2F0ZQo+ICsgKiBmdXJ0aGVyIGlmIHRoaXMgY2FuIGJlIGRvbmUu Cj4gKyAqCj4gKyAqLwoKWW91IHNob3VsZCBzdHVmZiBhIHBvaW50ZXIgdG8gY2FjaGVkX2luZm8g aW50byBzdHJ1Y3Qgc3B1X2NvbnRleHQsCmUuZy4gJ3ZvaWQgKnByb2ZpbGVfcHJpdmF0ZScuCgo+ ICtzdHJ1Y3QgY2FjaGVkX2luZm8gewo+ICugoKCgoKCgdm1hX21hcF90ICogbWFwOwo+ICugoKCg oKCgc3RydWN0IHNwdSAqIHRoZV9zcHU7Cj4gK6CgoKCgoKBzdHJ1Y3Qga3JlZiBjYWNoZV9yZWY7 Cj4gK6CgoKCgoKBzdHJ1Y3QgbGlzdF9oZWFkIGxpc3Q7Cj4gK307CgpBbmQgcmVwbGFjZSB0aGUg J3RoZV9zcHUnIG1lbWJlciB3aXRoIGEgYmFjayBwb2ludGVyIHRvIHRoZQpzcHVfY29udGV4dCBp ZiB5b3UgbmVlZCBpdC4KCj4gKwo+ICsvKiBBIGRhdGEgc3RydWN0dXJlIGZvciBjYWNoZWQgaW5m b3JtYXRpb24gYWJvdXQgYWN0aXZlIFNQVSB0YXNrcy4KPiArICogU3RvcmFnZSBpcyBkeW5hbWlj YWxseSBhbGxvY2F0ZWQsIHNpemVkIGFzCj4gKyAqICJudW1iZXIgb2YgYWN0aXZlIG5vZGVzIG11 bHRwbGllZCBieSA4Ii4gCj4gKyAqIFRoZSBpbmZvX2xpc3Rbbl0gbWVtYmVyIGhvbGRzIDAgb3Ig bW9yZSAKPiArICogJ3N0cnVjdCBjYWNoZWRfaW5mbycgb2JqZWN0cyBmb3IgU1BVIz1uLiAKPiAr ICoKPiArICogQXMgY3VycmVudGx5IGltcGxlbWVudGVkLCB0aGVyZSB3aWxsIG9ubHkgZXZlciBi ZSBvbmUgY2FjaGVkX2luZm8gCj4gKyAqIGluIHRoZSBsaXN0IGZvciBhIGdpdmVuIFNQVS4goElm IHdlIGNhbiBkZXZpc2UgYSB3YXkgdG8gbWFpbnRhaW4KPiArICogbXVsdGlwbGUgY2FjaGVkX2lu Zm9zIGluIG91ciBsaXN0LCB0aGVuIGl0IHdvdWxkIG1ha2Ugc2Vuc2UKPiArICogdG8gYWxzbyBj YWNoZSB0aGUgZGNvb2tpZSByZXByZXNlbnRpbmcgdGhlIFBQVSB0YXNrIGFwcGxpY2F0aW9uLgo+ ICsgKiBTZWUgYWJvdmUgZGVzY3JpcHRpb24gb2Ygc3RydWN0IGNhY2hlZF9pbmZvIGZvciBtb3Jl IGRldGFpbHMuCj4gKyAqLwo+ICtzdHJ1Y3Qgc3B1X2luZm9fc3RhY2tzIHsKPiAroKCgoKCgoHN0 cnVjdCBsaXN0X2hlYWQgKiBpbmZvX2xpc3Q7Cj4gK307CgpXaHkgZG8geW91IHN0b3JlIHBvaW50 ZXJzIHRvIGxpc3RfaGVhZCBzdHJ1Y3R1cmVzPyBJZiB5b3Ugd2FudCB0byBzdG9yZQpsaXN0cywg eW91IHNob3VsZCBoYXZlIGEgbGlzdHNfaGVhZCBpdHNlbGYgaW4gaGVyZS4KCldoeSBkbyB5b3Ug c3RvcmUgdGhlbSBwZXIgc3B1IGluIHRoZSBmaXJzdCBwbGFjZT8gVGhlIHBoeXNpY2FsIHNwdQpk b2Vzbid0IGhhdmUgYW55IHJlbGV2YW5jZSB0byB0aGlzIGF0IGFsbCwgdGhlIG9ubHkgZGF0YSB0 aGF0IGlzCnBlciBzcHUgaXMgdGhlIHNhbXBsZSBkYXRhIGNvbGxlY3RlZCBvbiBhIHByb2ZpbGlu ZyBpbnRlcnJ1cHQsCndoaWNoIHlvdSBjYW4gdGhlbiBjb3B5IGluIHRoZSBwZXItY29udGV4dCBk YXRhIG9uIGEgY29udGV4dCBzd2l0Y2guCgoKPiArLyogTG9va3MgZm9yIGNhY2hlZCBpbmZvIGZv ciB0aGUgcGFzc2VkIHNwdS4goElmIG5vdCBmb3VuZCwgdGhlCj4gKyAqIGNhY2hlZCBpbmZvIGlz IGNyZWF0ZWQgZm9yIHRoZSBwYXNzZWQgc3B1Lgo+ICsgKiBSZXR1cm5zIDAgZm9yIHN1Y2Nlc3M7 IG90aGVyd2lzZSwgLTEgZm9yIGVycm9yLiCgCj4gKyAqLyAKPiArc3RhdGljIGludAo+ICtwcmVw YXJlX2NhY2hlZF9zcHVfaW5mbyhzdHJ1Y3Qgc3B1ICogc3B1LCB1bnNpZ25lZCBpbnQgb2JqZWN0 SWQpCj4gK3sKCnNlZSBhYm92ZSwgdGhpcyBzaG91bGQgZ2V0IHRoZSBzcHVfY29udGV4dCBwb2lu dGVyIGFzIGl0cyBhcmd1bWVudCwKbm90IHRoZSBzcHUuCgo+ICugoKCgoKCgdm1hX21hcF90ICog bmV3X21hcDsKPiAroKCgoKCgoHVuc2lnbmVkIGxvbmcgZmxhZ3MgPSAwOwo+ICugoKCgoKCgaW50 IHJldHZhbCA9IDA7Cj4gK6CgoKCgoKAvKiBzcHUtPm51bWJlciBpcyBhIHN5c3RlbS13aWRlIHZh bHVlLCBub3QgYSBwZXItbm9kZSB2YWx1ZS4gKi8KPiAroKCgoKCgoHN0cnVjdCBjYWNoZWRfaW5m byAqIGluZm8gPSBnZXRfY2FjaGVkX2luZm8oc3B1LT5udW1iZXIpOwo+ICugoKCgoKCgaWYgKGlu Zm8gPT0gTlVMTCkgewoKaWYgeW91IHJldmVydCB0aGUgbG9naWMgdG8KCglpZiAoaW5mbykKCQln b3RvIG91dDsKCnRoZW4gdGhlIGJ1bGsgb2YgeW91ciBmdW5jdGlvbiBkb2Vzbid0IG5lZWQgdG8g Z2V0IGluZGVudGVkLAp3aGljaCBoZWxwcyByZWFkYWJpbGl0eS4KCj4gK6CgoKCgoKCgoKCgoKCg oC8qIGNyZWF0ZSBjYWNoZWRfaW5mbyBhbmQgYWRkIGl0IHRvIHRoZSBsaXN0IGZvciBTUFUgIzxu Pi4qLwo+ICugoKCgoKCgoKCgoKCgoKBpbmZvID0ga3phbGxvYyhzaXplb2Yoc3RydWN0IGNhY2hl ZF9pbmZvKSwgR0ZQX0FUT01JQyk7CgpHRlBfS0VSTkVMCgo+ICugoKCgoKCgoKCgoKCgoKBpZiAo IWluZm8pIHsKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKBwcmludGsoS0VSTl9FUlIgIlNQVV9Q Uk9GOiAiCj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgIKAgoCCgICIlcywgbGluZSAlZDogY3Jl YXRlIHZtYV9tYXAgZmFpbGVkXG4iLAo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoCCgIKAgoCBf X0ZVTkNUSU9OX18sIF9fTElORV9fKTsKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKBnb3RvIEVS Ul9BTExPQzsKPiAroKCgoKCgoKCgoKCgoKCgfQo+ICugoKCgoKCgoKCgoKCgoKBuZXdfbWFwID0g Y3JlYXRlX3ZtYV9tYXAoc3B1LCBvYmplY3RJZCk7Cj4gK6CgoKCgoKCgoKCgoKCgoGlmICghbmV3 X21hcCkgewo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoHByaW50ayhLRVJOX0VSUiAiU1BVX1BS T0Y6ICIKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKAgoCCgIKAgIiVzLCBsaW5lICVkOiBjcmVh dGUgdm1hX21hcCBmYWlsZWRcbiIsCj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgIKAgoCCgIF9f RlVOQ1RJT05fXywgX19MSU5FX18pOwo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoGdvdG8gRVJS X0FMTE9DOwo+ICugoKCgoKCgoKCgoKCgoKB9Cj4gKwo+ICugoKCgoKCgoKCgoKCgoKBwcl9kZWJ1 ZygiQ3JlYXRlZCB2bWFfbWFwXG4iKTsKPiArIKCgoKCgoKCgoKCgoKCgaW5mby0+bWFwID0gbmV3 X21hcDsKPiAroKCgoKCgoKCgoKCgoKCgaW5mby0+dGhlX3NwdSA9IHNwdTsKPiAroKCgoKCgoKCg oKCgoKCga3JlZl9pbml0KCZpbmZvLT5jYWNoZV9yZWYpOwo+ICugoKCgoKCgoKCgoKCgoKBzcGlu X2xvY2tfaXJxc2F2ZSgmY2FjaGVfbG9jaywgZmxhZ3MpOwo+ICugoKCgoKCgoKCgoKCgoKBsaXN0 X2FkZCgmaW5mby0+bGlzdCwgJnNwdV9pbmZvLT5pbmZvX2xpc3Rbc3B1LT5udW1iZXJdKTsKPiAr oKCgoKCgoKCgoKCgoKCgc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSgmY2FjaGVfbG9jaywgZmxhZ3Mp Owo+ICugoKCgoKCgoKCgoKCgoKBnb3RvIE9VVDsKPiAroKCgoKCgoH0gZWxzZSB7Cj4gKyCgIKAg oCCgLyogSW1tZWRpZGF0ZWx5IHB1dCBiYWNrIHJlZmVyZW5jZSB0byBjYWNoZWRfaW5mbyBzaW5j ZSB3ZSBkb24ndAo+ICugoKCgoKCgICogcmVhbGx5IG5lZWQgaXQgLS0ganVzdCBjaGVja2luZyB3 aGV0aGVyIHdlIGhhdmUgaXQuCj4gK6CgoKCgoKAgKi8KPiAroKCgoKCgoKCgoKCgoKCgcHV0X2Nh Y2hlZF9pbmZvKGluZm8pOwo+ICugoKCgoKCgoKCgoKCgoKBwcl9kZWJ1ZygiRm91bmQgY2FjaGVk IFNQVSBpbmZvLlxuIik7Cj4gK6CgoKCgoKB9Cj4gK6CgoKCgoKAKPiArRVJSX0FMTE9DOgo+ICug oKCgoKCgcmV0dmFsID0gLTE7Cj4gK09VVDoKPiAroKCgoKCgoHJldHVybiByZXR2YWw7Cj4gK30K Cj4gKy8qIExvb2sgdXAgdGhlIGRjb29raWUgZm9yIHRoZSB0YXNrJ3MgZmlyc3QgVk1fRVhFQ1VU QUJMRSBtYXBwaW5nLAo+ICsgKiB3aGljaCBjb3JyZXNwb25kcyBsb29zZWx5IHRvICJhcHBsaWNh dGlvbiBuYW1lIi4gQWxzbywgZGV0ZXJtaW5lCj4gKyAqIHRoZSBvZmZzZXQgZm9yIHRoZSBTUFUg RUxGIG9iamVjdC4goElmIGNvbXB1dGVkIG9mZnNldCBpcyAKPiArICogbm9uLXplcm8sIGl0IGlt cGxpZXMgYW4gZW1iZWRkZWQgU1BVIG9iamVjdDsgb3RoZXJ3aXNlLCBpdCdzIGEKPiArICogc2Vw YXJhdGUgU1BVIGJpbmFyeSwgaW4gd2hpY2ggY2FzZSB3ZSByZXRyaWV2ZSBpdCdzIGRjb29raWUu Cj4gKyAqLwo+ICtzdGF0aWMgdW5zaWduZWQgbG9uZyAKPiArZ2V0X2V4ZWNfZGNvb2tpZV9hbmRf b2Zmc2V0KAo+ICugoKCgoKCgc3RydWN0IHNwdSAqIHNwdSwgdW5zaWduZWQgaW50ICogb2Zmc2V0 cCwKPiAroKCgoKCgoHVuc2lnbmVkIGxvbmcgKiBzcHVfYmluX2Rjb29raWUsCj4gK6CgoKCgoKB1 bnNpZ25lZCBpbnQgc3B1X3JlZikKPiArewo+ICsgoCCgIKAgoHVuc2lnbmVkIGxvbmcgY29va2ll ID0gMDsKPiAroKCgoKCgoHVuc2lnbmVkIGludCBteV9vZmZzZXQgPSAwOwo+ICsgoCCgIKAgoHN0 cnVjdCB2bV9hcmVhX3N0cnVjdCAqIHZtYTsKPiAroKCgoKCgoHN0cnVjdCBtbV9zdHJ1Y3QgKiBt bSA9IHNwdS0+bW07CgppbmRlbnRpbmcKCj4gKyCgIKAgoCCgaWYgKCFtbSkKPiArIKAgoCCgIKAg oCCgIKAgoGdvdG8gT1VUOwo+ICsKPiArIKAgoCCgIKBmb3IgKHZtYSA9IG1tLT5tbWFwOyB2bWE7 IHZtYSA9IHZtYS0+dm1fbmV4dCkgewo+ICsgoCCgIKAgoCCgIKAgoCCgaWYgKCF2bWEtPnZtX2Zp bGUpCj4gKyCgIKAgoCCgIKAgoCCgIKAgoCCgIKAgoGNvbnRpbnVlOwo+ICsgoCCgIKAgoCCgIKAg oCCgaWYgKCEodm1hLT52bV9mbGFncyAmIFZNX0VYRUNVVEFCTEUpKQo+ICsgoCCgIKAgoCCgIKAg oCCgIKAgoCCgIKBjb250aW51ZTsKPiArIKAgoCCgIKAgoCCgIKAgoGNvb2tpZSA9IGZhc3RfZ2V0 X2Rjb29raWUodm1hLT52bV9maWxlLT5mX2RlbnRyeSwKPiArIKAgoCCgIKAgoCCgIKAgoCCgIKAg oCCgdm1hLT52bV9maWxlLT5mX3Zmc21udCk7Cj4gK6CgoKCgoKCgoKCgoKCgoHByX2RlYnVnKCJn b3QgZGNvb2tpZSBmb3IgJXNcbiIsCj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgIKB2bWEtPnZt X2ZpbGUtPmZfZGVudHJ5LT5kX25hbWUubmFtZSk7Cj4gKyCgIKAgoCCgIKAgoCCgIKBicmVhazsK PiArIKAgoCCgIKB9Cj4gKwo+ICugoKCgoKCgZm9yICh2bWEgPSBtbS0+bW1hcDsgdm1hOyB2bWEg PSB2bWEtPnZtX25leHQpIHsKPiAroKCgoKCgoKCgoKCgoKCgaWYgKHZtYS0+dm1fc3RhcnQgPiBz cHVfcmVmIHx8IHZtYS0+dm1fZW5kIDwgc3B1X3JlZikKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCg oKBjb250aW51ZTsKPiAroKCgoKCgoKCgoKCgoKCgbXlfb2Zmc2V0ID0gc3B1X3JlZiAtIHZtYS0+ dm1fc3RhcnQ7Cj4gK6CgoKCgoKCgoKCgoKCgoHByX2RlYnVnKCJGb3VuZCBzcHUgRUxGIGF0ICIK PiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKAgoCIgJVggZm9yIGZpbGUgJXNcbiIsIG15X29mZnNl dCwKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKAgoHZtYS0+dm1fZmlsZS0+Zl9kZW50cnktPmRf bmFtZS5uYW1lKTsKPiAroKCgoKCgoKCgoKCgoKCgKm9mZnNldHAgPSBteV9vZmZzZXQ7Cj4gK6Cg oKCgoKCgoKCgoKCgoGlmIChteV9vZmZzZXQgPT0gMCkgewo+ICugoKCgoKCgoKCgoKCgoKCgoKCg oKCgoGlmICghdm1hLT52bV9maWxlKSB7Cj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCg oKBnb3RvIEZBSUxfTk9fU1BVX0NPT0tJRTsKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKB9Cj4g K6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgKnNwdV9iaW5fZGNvb2tpZSA9IGZhc3RfZ2V0X2Rjb29r aWUoCj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKB2bWEtPnZtX2ZpbGUtPmZfZGVu dHJ5LAo+ICugoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgdm1hLT52bV9maWxlLT5mX3Zm c21udCk7Cj4gK6CgoKCgoKCgoKCgoKCgoKCgoKCgoKCgcHJfZGVidWcoImdvdCBkY29va2llIGZv ciAlc1xuIiwKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoCCgdm1hLT52bV9maWxl LT5mX2RlbnRyeS0+ZF9uYW1lLm5hbWUpOwo+ICugoKCgoKCgoKCgoKCgoKB9Cj4gK6CgoKCgoKCg oKCgoKCgoGJyZWFrOwo+ICugoKCgoKCgfQo+ICugoKCgoKCgCj4gK09VVDoKPiArIKAgoCCgIKBy ZXR1cm4gY29va2llOwo+ICsKPiArRkFJTF9OT19TUFVfQ09PS0lFOgo+ICugoKCgoKCgcHJpbnRr KEtFUk5fRVJSICJTUFVfUFJPRjogIgo+ICugoKCgoKCgIKAgoCCgICIlcywgbGluZSAlZDogQ2Fu bm90IGZpbmQgZGNvb2tpZSBmb3IgU1BVIGJpbmFyeVxuIiwKPiAroKCgoKCgoCCgIKAgoCBfX0ZV TkNUSU9OX18sIF9fTElORV9fKTsKPiAroKCgoKCgoGdvdG8gT1VUOwo+ICt9Cj4gKwo+ICsKPiAr Cj4gKy8qIFRoaXMgZnVuY3Rpb24gZmluZHMgb3IgY3JlYXRlcyBjYWNoZWQgY29udGV4dCBpbmZv cm1hdGlvbiBmb3IgdGhlCj4gKyAqIHBhc3NlZCBTUFUgYW5kIHJlY29yZHMgU1BVIGNvbnRleHQg aW5mb3JtYXRpb24gaW50byB0aGUgT1Byb2ZpbGUKPiArICogZXZlbnQgYnVmZmVyLgo+ICsgKi8K PiArc3RhdGljIGludCBwcm9jZXNzX2NvbnRleHRfc3dpdGNoKHN0cnVjdCBzcHUgKiBzcHUsIHVu c2lnbmVkIGludCBvYmplY3RJZCkKPiArewo+ICugoKCgoKCgdW5zaWduZWQgbG9uZyBmbGFncyA9 IDA7Cj4gK6CgoKCgoKBpbnQgcmV0dmFsID0gMDsKPiAroKCgoKCgoHVuc2lnbmVkIGludCBvZmZz ZXQgPSAwOwo+ICugoKCgoKCgdW5zaWduZWQgbG9uZyBzcHVfY29va2llID0gMCwgYXBwX2Rjb29r aWUgPSAwOwo+ICugoKCgoKCgcmV0dmFsID0gcHJlcGFyZV9jYWNoZWRfc3B1X2luZm8oc3B1LCBv YmplY3RJZCk7Cj4gK6CgoKCgoKBpZiAocmV0dmFsID09IC0xKSB7Cj4gK6CgoKCgoKCgoKCgoKCg oGdvdG8gT1VUOwo+ICugoKCgoKCgfQo+ICsgoCCgIKAgoC8qIEdldCBkY29va2llIGZpcnN0IGJl Y2F1c2UgYSBtdXRleF9sb2NrIGlzIHRha2VuIGluIHRoYXQKPiAroKCgoKCgoCAqIGNvZGUgcGF0 aCwgc28gaW50ZXJydXB0cyBtdXN0IG5vdCBiZSBkaXNhYmxlZC4KPiAroKCgoKCgoCAqLwo+ICug oKCgoKCgYXBwX2Rjb29raWUgPSBnZXRfZXhlY19kY29va2llX2FuZF9vZmZzZXQoc3B1LCAmb2Zm c2V0LCAKPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKAg oCZzcHVfY29va2llLCBvYmplY3RJZCk7Cj4gKwo+ICsgoCCgIKAgoC8qIFJlY29yZCBjb250ZXh0 IGluZm8gaW4gZXZlbnQgYnVmZmVyICovCj4gK6CgoKCgoKBzcGluX2xvY2tfaXJxc2F2ZSgmYnVm ZmVyX2xvY2ssIGZsYWdzKTsKPiAroKCgoKCgoGFkZF9ldmVudF9lbnRyeShFU0NBUEVfQ09ERSk7 Cj4gK6CgoKCgoKBhZGRfZXZlbnRfZW50cnkoU1BVX0NUWF9TV0lUQ0hfQ09ERSk7Cj4gK6CgoKCg oKBhZGRfZXZlbnRfZW50cnkoc3B1LT5udW1iZXIpOwo+ICugoKCgoKCgYWRkX2V2ZW50X2VudHJ5 KHNwdS0+cGlkKTsKPiAroKCgoKCgoGFkZF9ldmVudF9lbnRyeShzcHUtPnRnaWQpOwo+ICugoKCg oKCgYWRkX2V2ZW50X2VudHJ5KGFwcF9kY29va2llKTsKPiArCj4gK6CgoKCgoKBhZGRfZXZlbnRf ZW50cnkoRVNDQVBFX0NPREUpOwo+ICugoKCgoKCgaWYgKG9mZnNldCkgewo+ICugoKCgoKCgIKAv KiBXaGVuIG9mZnNldCBpcyBub24temVybywgoHRoaXMgbWVhbnMgdGhlIFNQVSBFTEYgd2FzIGVt YmVkZGVkOwo+ICugoKCgoKCgIKAgKiBvdGhlcndpc2UsIGl0IHdhcyBsb2FkZWQgZnJvbSBhIHNl cGFyYXRlIGJpbmFyeSBmaWxlLiCgRm9yIHRoZQo+ICugoKCgoKCgIKAgKiBlbWJlZGRlZCBjYXNl LCB3ZSByZWNvcmQgdGhlIG9mZnNldCBvZiB0aGUgU1BVIEVMRiBpbnRvIHRoZSBQUFUKPiAroKCg oKCgoCCgICogZXhlY3V0YWJsZTsgZm9yIHRoZSBub24tZW1iZWRkZWQgY2FzZSwgd2UgcmVjb3Jk IGEgZGNvb2tpZSB0aGF0Cj4gK6CgoKCgoKAgoCAqIHBvaW50cyB0byB0aGUgbG9jYXRpb24gb2Yg dGhlIFNQVSBiaW5hcnkgdGhhdCB3YXMgbG9hZGVkLgo+ICugoKCgoKCgIKAgKi8KPiAroKCgoKCg oKCgoKCgoKCgYWRkX2V2ZW50X2VudHJ5KFNQVV9PRkZTRVRfQ09ERSk7Cj4gK6CgoKCgoKCgoKCg oKCgoGFkZF9ldmVudF9lbnRyeShvZmZzZXQpOwo+ICugoKCgoKCgfSBlbHNlIHsKPiAroKCgoKCg oKCgoKCgoKCgYWRkX2V2ZW50X2VudHJ5KFNQVV9DT09LSUVfQ09ERSk7Cj4gK6CgoKCgoKCgoKCg oKCgoGFkZF9ldmVudF9lbnRyeShzcHVfY29va2llKTsKPiAroKCgoKCgoH0KCkkgZG9uJ3QgZ2V0 IGl0LiBXaGF0IGlzIHRoZSBhcHBfZGNvb2tpZSB1c2VkIGZvcj8gSWYgdGhlIHNwdSBiaW5hcnkg aXMKZW1iZWRkZWQgaW50byBhIGxpYnJhcnksIHlvdSBhcmUgc3RpbGwgbWlzc2luZyB0aGUgZGNv b2tpZSB0byB0aGF0IC5zbyBmaWxlLApiZWNhdXNlIHlvdSByZXR1cm4gb25seSBhbiBvZmZzZXQu Cgo8bml0cGlja2luZz4KCj4gK6CgoKCgoKB1bnNpZ25lZCBsb25nIGZsYWdzID0gMDsKCm5vIG5l ZWQgdG8gaW5pdGlhbGl6ZQoKPiAroKCgoKCgoHN0cnVjdCBzcHUgKiB0aGVfc3B1ID0gKHN0cnVj dCBzcHUgKikgZGF0YTsKCm5vIG5lZWQgZm9yIHRoZSBjYXN0Cgo+ICugoKCgoKCgcHJfZGVidWco IlNQVSBldmVudCBub3RpZmljYXRpb24gYXJyaXZlZFxuIik7Cj4gK6CgoKCgoKBpZiAodmFsID09 IDApewoKaWYgKCF2YWwpCgo+ICugoKCgoKCgcHJfZGVidWcoInNwdV9zeW5jX3N0YXJ0IC0tIHJ1 bm5pbmcuXG4iKTsKPiArT1VUOgoKb3V0OgoKPiAroKCgoKCgoHJldHVybiByZXQ7oKCgoKAKPiAr fQoKPC9uaXRwaWNraW5nPgoKCgo+IEBAIC00ODAsNyArNDkxLDIyIEBACj4goKCgoKCgoKAgoCCg IKAgc3RydWN0IG9wX3N5c3RlbV9jb25maWcgKnN5cywgaW50IG51bV9jdHJzKQo+IKB7Cj4goKCg oKCgoKBpbnQgaSwgaiwgY3B1Owo+ICugoKCgoKCgc3B1X2N5Y2xlX3Jlc2V0ID0gMDsKPiCgCj4g K6CgoKCgoKAvKiBUaGUgY3B1ZnJlcV9xdWlja19nZXQgZnVuY3Rpb24gcmVxdWlyZXMgdGhhdCBj YmVfY3B1ZnJlcSBtb2R1bGUKPiAroKCgoKCgoCAqIGJlIGxvYWRlZC4goFRoaXMgZnVuY3Rpb24g aXMgbm90IGFjdHVhbGx5IHByb3ZpZGVkIGFuZCBleHBvcnRlZAo+ICugoKCgoKCgICogYnkgY2Jl X2NwdWZyZXEsIGJ1dCBpdCByZWxpZXMgb24gY2JlX2NwdWZyZXEgaW5pdGlhbGl6ZSBrZXJuZWwK PiAroKCgoKCgoCAqIGRhdGEgc3RydWN0dXJlcy4goFNpbmNlIHRoZXJlJ3Mgbm8gd2F5IGZvciBk ZXBtb2QgdG8gcmVhbGl6ZQo+ICugoKCgoKCgICogdGhhdCBvdXIgT1Byb2ZpbGUgbW9kdWxlIGRl cGVuZHMgb24gY2JlX2NwdWZyZXEsIHdlIGN1cnJlbnRseQo+ICugoKCgoKCgICogYXJlIGxldHRp bmcgdGhlIHVzZXJzcGFjZSB0b29sLCBvcGNvbnRyb2wsIGVuc3VyZSB0aGF0IHRoZQo+ICugoKCg oKCgICogY2JlX2NwdWZyZXEgbW9kdWxlIGlzIGxvYWRlZC4KPiAroKCgoKCgoCAqLwo+ICugoKCg oKCga2h6ZnJlcSA9IGNwdWZyZXFfcXVpY2tfZ2V0KHNtcF9wcm9jZXNzb3JfaWQoKSk7CgpZb3Ug c2hvdWxkIHByb2JhYmx5IGhhdmUgYSBmYWxsYmFjayBpbiBoZXJlIGluIGNhc2UgdGhlIGNwdWZy ZXEgbW9kdWxlCmlzIG5vdCBsb2FkZWQuIFRoZXJlIGlzIGEgZ2xvYmFsIHZhcmlhYmxlIHBwY19w cm9jX2ZyZXEgKGluIEh6KSB0aGF0CnlvdSBjYW4gYWNjZXNzLgo+IKCgoKCgoKCgOwo+IKB9Cj4g oAo+IC1zdGF0aWMgdm9pZCBjZWxsX2dsb2JhbF9zdGFydChzdHJ1Y3Qgb3BfY291bnRlcl9jb25m aWcgKmN0cikKPiArc3RhdGljIGludCBjYWxjdWxhdGVfbGZzcihpbnQgbikKPiArewo+ICsjZGVm aW5lIHNpemUgMjQKPiAroKCgoKCgoGludCBpOwo+ICugoKCgoKCgdW5zaWduZWQgaW50IG5ld2xm c3IwOwo+ICugoKCgoKCgdW5zaWduZWQgaW50IGxmc3IgPSAweEZGRkZGRjsKPiAroKCgoKCgoHVu c2lnbmVkIGludCBob3dtYW55ID0gbGZzciAtIG47Cj4gKwo+ICugoKCgoKCgZm9yIChpID0gMjsg aSA8IGhvd21hbnkgKyAyOyBpKyspIHsKPiAroKCgoKCgoKCgoKCgoKCgbmV3bGZzcjAgPSAoKChs ZnNyID4+IChzaXplIC0gMSAtIDApKSAmIDEpIF4KPiAroKCgoKCgoKCgoKCgoKCgoKCgoKCgoKAg oCCgKChsZnNyID4+IChzaXplIC0gMSAtIDEpKSAmIDEpIF4KPiAroKCgoKCgoKCgoKCgoKCgoKCg oKCgoKAgoCCgKCgobGZzciA+PiAoc2l6ZSAtIDEgLSA2KSkgJiAxKSBeCj4gK6CgoKCgoKCgoKCg oKCgoKCgoKCgoKCgIKAgoCAoKGxmc3IgPj4gKHNpemUgLSAxIC0gMjMpKSAmIDEpKSk7Cj4gKwo+ ICugoKCgoKCgoKCgoKCgoKBsZnNyID4+PSAxOwo+ICugoKCgoKCgoKCgoKCgoKBsZnNyID0gbGZz ciB8IChuZXdsZnNyMCA8PCAoc2l6ZSAtIDEpKTsKPiAroKCgoKCgoH0KPiAroKCgoKCgoHJldHVy biBsZnNyOwo+ICsKPiArfQoKSSBkb24ndCBoYXZlIHRoZSBzbGlnaHRlc3QgaWRlYSB3aGF0IHRo aXMgY29kZSBpcyBhYm91dCwgYnV0Cml0IGNlcnRhaW5seSBsb29rcyBpbmVmZmljaWVudCB0byBs b29wIDE2IG1pbGxpb24gdGltZXMgdG8KY29tcHV0ZSBhIGNvbnN0YW50LiBDb3VsZCB5b3UgdXNl IGEgZmFzdGVyIGFsZ29yaXRobSBpbnN0ZWFkLApvciBhdCBsZWFzdCBhZGQgYSBjb21tZW50IGFi b3V0IHdoeSB5b3UgZG8gaXQgdGhpcyB3YXk/Cgo+ICtzdGF0aWMgdm9pZCBjZWxsX2dsb2JhbF9z dG9wKHZvaWQpCj4gK3sKPiAroKCgoKCgoGlmIChzcHVfY3ljbGVfcmVzZXQpIHsKPiAroKCgoKCg oKCgoKCgoKCgY2VsbF9nbG9iYWxfc3RvcF9zcHUoKTsKPiAroKCgoKCgoH0gZWxzZSB7Cj4gK6Cg oKCgoKCgoKCgoKCgoGNlbGxfZ2xvYmFsX3N0b3BfcHB1KCk7Cj4gK6CgoKCgoKB9Cj4gKwo+ICt9 CgpUaGlzIGxvb2tzIHdlaXJkIGFzIHdlbGwuIEkgc3VwcG9zZSBpdCdzIGEgbGltaXRhdGlvbiBv ZiB0aGUgaGFyZHdhcmUKdGhhdCB5b3UgY2FuIG9ubHkgZG8gZWl0aGVyIHBwdSBvciBzcHUgcHJv ZmlsaW5nLiBIb3dldmVyLCBtYWtpbmcgdGhhdApkZXBlbmRlbnQgb2Ygd2hldGhlciB0aGUgJ3Nw dV9jeWNsZV9yZXNldCcgdmFyaWFibGUgaXMgc2V0IHNvdW5kcwpyYXRoZXIgYm9ndXMuCgpJIGRv bid0IGtub3cgd2hhdCB0aGUgYmVzdCBpbnRlcmZhY2UgZm9yIGNob29zaW5nIHRoZSB0YXJnZXQg ZnJvbQp1c2VyIHNwYWNlIHdvdWxkIGJlLCBidXQgeW91IHByb2JhYmx5IGFsc28gd2FudCB0byBi ZSBhYmxlIHRvCnN3aXRjaCBiZXR3ZWVuIHRoZW0gYXQgcnVuIHRpbWUuCgoJQXJuZCA8PjwK ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 7:39 ` [Cbe-oss-dev] " Arnd Bergmann @ 2007-01-30 7:53 ` Benjamin Herrenschmidt 2007-01-30 10:41 ` Christoph Hellwig 2007-01-30 21:41 ` Maynard Johnson 2007-02-03 23:49 ` Maynard Johnson 2 siblings, 1 reply; 43+ messages in thread From: Benjamin Herrenschmidt @ 2007-01-30 7:53 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel > > +/* Defines used for sync_start */ > > +#define SKIP_GENERIC_SYNC 0 > > +#define SYNC_START_ERROR -1 > > +#define DO_GENERIC_SYNC 1 > > + > > +typedef struct vma_map > > +{ > > + struct vma_map *next; > > + unsigned int vma; > > + unsigned int size; > > + unsigned int offset; > > + unsigned int guard_ptr; > > + unsigned int guard_val; > > +} vma_map_t; I haven't had time to look in details yet but in that context, what does "vma" stands for ? There's already an important vm data structure in linux routinely called "vma" and thus I suspect this is a poor naming choice as it will cause confusion. Cheers, Ben. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 7:53 ` Benjamin Herrenschmidt @ 2007-01-30 10:41 ` Christoph Hellwig 2007-01-30 23:09 ` Maynard Johnson 0 siblings, 1 reply; 43+ messages in thread From: Christoph Hellwig @ 2007-01-30 10:41 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linuxppc-dev, oprofile-list, cbe-oss-dev, Arnd Bergmann, linux-kernel On Tue, Jan 30, 2007 at 06:53:50PM +1100, Benjamin Herrenschmidt wrote: > > > > +/* Defines used for sync_start */ > > > +#define SKIP_GENERIC_SYNC 0 > > > +#define SYNC_START_ERROR -1 > > > +#define DO_GENERIC_SYNC 1 > > > + > > > +typedef struct vma_map > > > +{ > > > + struct vma_map *next; > > > + unsigned int vma; > > > + unsigned int size; > > > + unsigned int offset; > > > + unsigned int guard_ptr; > > > + unsigned int guard_val; > > > +} vma_map_t; > > I haven't had time to look in details yet but in that context, what does > "vma" stands for ? There's already an important vm data structure in > linux routinely called "vma" and thus I suspect this is a poor naming > choice as it will cause confusion. It looks like it actually is dealing with vma to me. But then again: - please don't use typedefs for structures - there might be a more descriptive name for this than just vma_map ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 10:41 ` Christoph Hellwig @ 2007-01-30 23:09 ` Maynard Johnson 0 siblings, 0 replies; 43+ messages in thread From: Maynard Johnson @ 2007-01-30 23:09 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, linux-kernel, linuxppc-dev, oprofile-list, cbe-oss-dev Christoph Hellwig wrote: > On Tue, Jan 30, 2007 at 06:53:50PM +1100, Benjamin Herrenschmidt wrote: > >>>>+/* Defines used for sync_start */ >>>>+#define SKIP_GENERIC_SYNC 0 >>>>+#define SYNC_START_ERROR -1 >>>>+#define DO_GENERIC_SYNC 1 >>>>+ >>>>+typedef struct vma_map >>>>+{ >>>>+ struct vma_map *next; >>>>+ unsigned int vma; >>>>+ unsigned int size; >>>>+ unsigned int offset; >>>>+ unsigned int guard_ptr; >>>>+ unsigned int guard_val; >>>>+} vma_map_t; >> >>I haven't had time to look in details yet but in that context, what does >>"vma" stands for ? There's already an important vm data structure in >>linux routinely called "vma" and thus I suspect this is a poor naming >>choice as it will cause confusion. > > > It looks like it actually is dealing with vma to me. But then again: > > - please don't use typedefs for structures > - there might be a more descriptive name for this than just vma_map Yes, I'll come up with some (hopefully) better name. > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys - and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > oprofile-list mailing list > oprofile-list@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/oprofile-list ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 7:39 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-30 7:53 ` Benjamin Herrenschmidt @ 2007-01-30 21:41 ` Maynard Johnson 2007-01-30 22:54 ` Maynard Johnson ` (2 more replies) 2007-02-03 23:49 ` Maynard Johnson 2 siblings, 3 replies; 43+ messages in thread From: Maynard Johnson @ 2007-01-30 21:41 UTC (permalink / raw) To: Arnd Bergmann Cc: linuxppc-dev, Anton Blanchard, cbe-oss-dev, oprofile-list, linux-kernel Arnd Bergmann wrote: > On Monday 29 January 2007 20:48, Maynard Johnson wrote: > >>Subject: Add support to OProfile for profiling Cell BE SPUs >> >>From: Maynard Johnson <maynardj@us.ibm.com> >> >>This patch updates the existing arch/powerpc/oprofile/op_model_cell.c >>to add in the SPU profiling capabilities. In addition, a 'cell' subdirectory >>was added to arch/powerpc/oprofile to hold Cell-specific SPU profiling >>code. >> >>Signed-off-by: Carl Love <carll@us.ibm.com> >>Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> > > > I can't really say much about the common oprofile files that you are > touching, maybe someone from oprofile-list (Philippe?) to look over them > and ack/nack them. Anton (added to cc list) may be another good reviewer of drivers/oprofile changes. > > >>+#define number_of_online_nodes(nodes) { \ >>+ u32 cpu; u32 tmp; \ >>+ nodes = 0; \ >>+ for_each_online_cpu(cpu) { \ >>+ tmp = cbe_cpu_to_node(cpu) + 1;\ >>+ if (tmp > nodes) \ >>+ nodes++; \ >>+ } \ >>+} > > > I've been discussing with benh about a better way to do this. We should > change all the references to nodes and cpu numbers to something more > correct in the future, so we get rid of the assumption that each > numa node is a cell chip. It's probably best to leave your code as > is for now, but we need to remember this one when cleaning it up. > > You should however convert this into an inline function > of prototype > > static inline int number_of_online_nodes(void) > > instead of a macro. OK. > > >>+/* Defines used for sync_start */ >>+#define SKIP_GENERIC_SYNC 0 >>+#define SYNC_START_ERROR -1 >>+#define DO_GENERIC_SYNC 1 >>+ >>+typedef struct vma_map >>+{ >>+ struct vma_map *next; >>+ unsigned int vma; >>+ unsigned int size; >>+ unsigned int offset; >>+ unsigned int guard_ptr; >>+ unsigned int guard_val; >>+} vma_map_t; > > > please don't typedef structures. Sure. > > [snip] >>+ >>+static int spu_prof_running = 0; >>+static unsigned int profiling_interval = 0; >>+ >>+extern int num_nodes; >>+extern unsigned int khzfreq; > > > You really can't have global variable with such generic names. For > static variables, it's less of a problem, since they are not in the > same name space, but for easier debugging, it's good to always have > the name of your module (e.g. spu_prof_) as a prefix to the identifier. OK, we'll add 'spu_prof' prefix to them. > > Of course, the best way would be to avoid global and static variables > entirely, but that's not always possible. > > [snip] >>+ // process the collected SPU PC for each node >>+ for_each_online_cpu(cpu) { >>+ if (cbe_get_hw_thread_id(cpu)) >>+ continue; >>+ >>+ node = cbe_cpu_to_node(cpu); >>+ node_factor = node * SPUS_PER_NODE; >>+ /* number of valid entries for this node */ >>+ entry = 0; >>+ >>+ trace_addr = cbe_read_pm(cpu, trace_address); >>+ while ((trace_addr & CBE_PM_TRACE_BUF_EMPTY) != 0x400) >>+ { >>+ /* there is data in the trace buffer to process */ >>+ cbe_read_trace_buffer(cpu, trace_buffer); >>+ spu_mask = 0xFFFF000000000000; >>+ >>+ /* Each SPU PC is 16 bits; hence, four spus in each of >>+ * the two 64-bit buffer entries that make up the >>+ * 128-bit trace_buffer entry. Process the upper and >>+ * lower 64-bit values simultaneously. >>+ */ >>+ for (spu = 0; spu < SPUS_PER_TB_ENTRY; spu++) { >>+ spu_pc_lower = spu_mask & trace_buffer[0]; >>+ spu_pc_lower = spu_pc_lower >> (NUM_SPU_BITS_TRBUF >>+ * (SPUS_PER_TB_ENTRY-spu-1)); >>+ >>+ spu_pc_upper = spu_mask & trace_buffer[1]; >>+ spu_pc_upper = spu_pc_upper >> (NUM_SPU_BITS_TRBUF >>+ * (SPUS_PER_TB_ENTRY-spu-1)); >>+ >>+ spu_mask = spu_mask >> NUM_SPU_BITS_TRBUF; >>+ >>+ /* spu PC trace entry is upper 16 bits of the >>+ * 18 bit SPU program counter >>+ */ >>+ spu_pc_lower = spu_pc_lower << 2; >>+ spu_pc_upper = spu_pc_upper << 2; >>+ >>+ samples[((node_factor + spu) * TRACE_ARRAY_SIZE) + entry] >>+ = (u32) spu_pc_lower; >>+ samples[((node_factor + spu + SPUS_PER_TB_ENTRY) * TRACE_ARRAY_SIZE) + entry] >>+ = (u32) spu_pc_upper; >>+ } >>+ >>+ entry++; >>+ >>+ if (entry >= TRACE_ARRAY_SIZE) >>+ /* spu_samples is full */ >>+ break; >>+ >>+ trace_addr = cbe_read_pm(cpu, trace_address); >>+ } >>+ samples_per_node[node] = entry; >>+ } >>+} > > > While I can't see anything technically wrong with this function, it would be > good to split it into smaller functions. Since you are nesting three > loops, it should be possible to make a separate function from one of the > inner loops without changing the actual logic behind it. Will do. > > >>+ >>+static int profile_spus(struct hrtimer * timer) >>+{ >>+ ktime_t kt; >>+ int cpu, node, k, num_samples, spu_num; > > > whitespace damage fixed > > >>+ >>+ if (!spu_prof_running) >>+ goto STOP; >>+ >>+ cell_spu_pc_collection(); >>+ for_each_online_cpu(cpu) { >>+ if (cbe_get_hw_thread_id(cpu)) >>+ continue; > > > Here, you enter the same top-level loop again, why not make it > for_each_online_cpu(cpu) { > if (cbe_get_hw_thread_id(cpu)) > continue; > num_samples = cell_spu_pc_collection(cpu); > ... Yes, good suggestion. > > >>+ kt = ktime_set(0, profiling_interval); >>+ if (!spu_prof_running) >>+ goto STOP; >>+ hrtimer_forward(timer, timer->base->get_time(), kt); >>+ return HRTIMER_RESTART; > > > is hrtimer_forward really the right interface here? You are ignoring > the number of overruns anyway, so hrtimer_start(,,) sounds more > correct to me. According to Tom Gleixner, "hrtimer_forward is a convenience function to move the expiry time of a timer forward in multiples of the interval, so it is in the future. After setting the expiry time you restart the timer either with [sic] a return HRTIMER_RESTART (if you are in the timer callback function)." > > >>+ >>+ STOP: > > > labels should be in small letters. OK > > >>+ printk(KERN_INFO "SPU_PROF: spu-prof timer ending\n"); >>+ return HRTIMER_NORESTART; >>+} > > >>+void start_spu_profiling(unsigned int cycles_reset) { >>+ >>+ ktime_t kt; >>+ >>+ /* To calculate a timeout in nanoseconds, the basic >>+ * formula is ns = cycles_reset * (NSEC_PER_SEC / cpu frequency). >>+ * To avoid floating point math, we use the scale math >>+ * technique as described in linux/jiffies.h. We use >>+ * a scale factor of SCALE_SHIFT,which provides 4 decimal places >>+ * of precision, which is close enough for the purpose at hand. >>+ */ >>+ >>+ /* Since cpufreq_quick_get returns frequency in kHz, we use >>+ * USEC_PER_SEC here vs NSEC_PER_SEC. >>+ */ >>+ unsigned long nsPerCyc = (USEC_PER_SEC << SCALE_SHIFT)/khzfreq; >>+ profiling_interval = (nsPerCyc * cycles_reset) >> SCALE_SHIFT; >>+ >>+ pr_debug("timer resolution: %lu\n", >>+ TICK_NSEC); > > > Don't you need to adapt the profiling_interval at run time, when cpufreq > changes the core frequency? You should probably use > cpufreq_register_notifier() to update this. Since OProfile is a statistical profiler, the exact frequency is not critical. The user is going to be looking for hot spots in their code, so it's all relative. With that said, I don't imagine using the cpufreq notiication would be a big deal. We'll look at it. > > >>+ kt = ktime_set(0, profiling_interval); >>+ hrtimer_init(&timer, CLOCK_MONOTONIC, HRTIMER_REL); >>+ timer.expires = kt; >>+ timer.function = profile_spus; >>+ >>+ /* Allocate arrays for collecting SPU PC samples */ >>+ samples = (u32 *) kzalloc(num_nodes * SPUS_PER_NODE * TRACE_ARRAY_SIZE * sizeof(u32), GFP_ATOMIC); > > > Try to avoid atomic allocations. I don't think you are in an atomic > context here at all, so you can just use GFP_KERNEL. OK, I'll check it out. > > >>+ samples_per_node = (u32 *) kzalloc(num_nodes * sizeof(u32), GFP_ATOMIC); > > > Since MAX_NUMNODES is small, it's probably more efficient to just allocate this > statically. OK. > > >>+ >>+ spu_prof_running = 1; >>+ hrtimer_start(&timer, kt, HRTIMER_REL); >>+} >> >>+ >>+void stop_spu_profiling(void) >>+{ >>+ >>+ hrtimer_cancel(&timer); >>+ kfree(samples); >>+ kfree(samples_per_node); >>+ pr_debug("SPU_PROF: stop_spu_profiling issued\n"); >>+ spu_prof_running = 0; >>+} > > > shouldn't you set spu_prof_running = 0 before doing any of the other things? > It looks to me like you could otherwise get into a use-after-free > situation. If I'm wrong with that, you probably don't need spu_prof_running > at all ;-) No, you're right. :-) > > >>+/* Conainer for caching information about an active SPU task. >>+ * :map -- pointer to a list of vma_maps >>+ * :spu -- the spu for this active SPU task >>+ * :list -- potentially could be used to contain the cached_infos >>+ * for inactive SPU tasks. > > > Documenting structures is good, but please use the common kerneldoc format > for it. There are a number of examples for this in include/linux/ OK > > >>+ * >>+ * Ideally, we would like to be able to create the cached_info for >>+ * an SPU task just one time -- when libspe first loads the SPU >>+ * binary file. We would store the cached_info in a list. Then, as >>+ * SPU tasks are switched out and new ones switched in, the cached_info >>+ * for inactive tasks would be kept, and the active one would be placed >>+ * at the head of the list. But this technique may not with >>+ * current spufs functionality since the spu used in bind_context may >>+ * be a different spu than was used in a previous bind_context for a >>+ * reactivated SPU task. Additionally, a reactivated SPU task may be >>+ * assigned to run on a different physical SPE. We will investigate >>+ * further if this can be done. >>+ * >>+ */ > > > You should stuff a pointer to cached_info into struct spu_context, > e.g. 'void *profile_private'. > > >>+struct cached_info { >>+ vma_map_t * map; >>+ struct spu * the_spu; >>+ struct kref cache_ref; >>+ struct list_head list; >>+}; > > > And replace the 'the_spu' member with a back pointer to the > spu_context if you need it. > > >>+ >>+/* A data structure for cached information about active SPU tasks. >>+ * Storage is dynamically allocated, sized as >>+ * "number of active nodes multplied by 8". >>+ * The info_list[n] member holds 0 or more >>+ * 'struct cached_info' objects for SPU#=n. >>+ * >>+ * As currently implemented, there will only ever be one cached_info >>+ * in the list for a given SPU. If we can devise a way to maintain >>+ * multiple cached_infos in our list, then it would make sense >>+ * to also cache the dcookie representing the PPU task application. >>+ * See above description of struct cached_info for more details. >>+ */ >>+struct spu_info_stacks { >>+ struct list_head * info_list; >>+}; > > > Why do you store pointers to list_head structures? If you want to store > lists, you should have a lists_head itself in here. info_list is an array of n lists, where n is the number of SPUs. > > Why do you store them per spu in the first place? The physical spu > doesn't have any relevance to this at all, the only data that is > per spu is the sample data collected on a profiling interrupt, > which you can then copy in the per-context data on a context switch. The sample data is written out to the event buffer on every profiling interrupt. But we don't write out the SPU program counter samples directly to the event buffer. First, we have to find the cached_info for the appropriate SPU context to retrieve the cached vma-to-fileoffset map. Then we do the vma_map_lookup to find the fileoffset corresponding to the SPU PC sample, which we then write out to the event buffer. This is one of the most time-critical pieces of the SPU profiling code, so I used an array to hold the cached_info for fast random access. But as I stated in a code comment above, the negative implication of this current implementation is that the array can only hold the cached_info for currently running SPU tasks. I need to give this some more thought. > > > >>+/* Looks for cached info for the passed spu. If not found, the >>+ * cached info is created for the passed spu. >>+ * Returns 0 for success; otherwise, -1 for error. >>+ */ >>+static int >>+prepare_cached_spu_info(struct spu * spu, unsigned int objectId) >>+{ > > > see above, this should get the spu_context pointer as its argument, > not the spu. > > >>+ vma_map_t * new_map; >>+ unsigned long flags = 0; >>+ int retval = 0; >>+ /* spu->number is a system-wide value, not a per-node value. */ >>+ struct cached_info * info = get_cached_info(spu->number); >>+ if (info == NULL) { > > > if you revert the logic to > > if (info) > goto out; > > then the bulk of your function doesn't need to get indented, > which helps readability. OK > > >>+ /* create cached_info and add it to the list for SPU #<n>.*/ >>+ info = kzalloc(sizeof(struct cached_info), GFP_ATOMIC); > > > GFP_KERNEL OK > > >>+ if (!info) { >>+ printk(KERN_ERR "SPU_PROF: " >>+ "%s, line %d: create vma_map failed\n", >>+ __FUNCTION__, __LINE__); >>+ goto ERR_ALLOC; >>+ } >>+ new_map = create_vma_map(spu, objectId); >>+ if (!new_map) { >>+ printk(KERN_ERR "SPU_PROF: " >>+ "%s, line %d: create vma_map failed\n", >>+ __FUNCTION__, __LINE__); >>+ goto ERR_ALLOC; >>+ } >>+ >>+ pr_debug("Created vma_map\n"); >>+ info->map = new_map; >>+ info->the_spu = spu; >>+ kref_init(&info->cache_ref); >>+ spin_lock_irqsave(&cache_lock, flags); >>+ list_add(&info->list, &spu_info->info_list[spu->number]); >>+ spin_unlock_irqrestore(&cache_lock, flags); >>+ goto OUT; >>+ } else { >>+ /* Immedidately put back reference to cached_info since we don't >>+ * really need it -- just checking whether we have it. >>+ */ >>+ put_cached_info(info); >>+ pr_debug("Found cached SPU info.\n"); >>+ } >>+ >>+ERR_ALLOC: >>+ retval = -1; >>+OUT: >>+ return retval; >>+} > > >>+/* Look up the dcookie for the task's first VM_EXECUTABLE mapping, >>+ * which corresponds loosely to "application name". Also, determine >>+ * the offset for the SPU ELF object. If computed offset is >>+ * non-zero, it implies an embedded SPU object; otherwise, it's a >>+ * separate SPU binary, in which case we retrieve it's dcookie. >>+ */ >>+static unsigned long >>+get_exec_dcookie_and_offset( >>+ struct spu * spu, unsigned int * offsetp, >>+ unsigned long * spu_bin_dcookie, >>+ unsigned int spu_ref) >>+{ >>+ unsigned long cookie = 0; >>+ unsigned int my_offset = 0; >>+ struct vm_area_struct * vma; >>+ struct mm_struct * mm = spu->mm; > > > indenting uh-huh > > >>+ if (!mm) >>+ goto OUT; >>+ >>+ for (vma = mm->mmap; vma; vma = vma->vm_next) { >>+ if (!vma->vm_file) >>+ continue; >>+ if (!(vma->vm_flags & VM_EXECUTABLE)) >>+ continue; >>+ cookie = fast_get_dcookie(vma->vm_file->f_dentry, >>+ vma->vm_file->f_vfsmnt); >>+ pr_debug("got dcookie for %s\n", >>+ vma->vm_file->f_dentry->d_name.name); >>+ break; >>+ } >>+ >>+ for (vma = mm->mmap; vma; vma = vma->vm_next) { >>+ if (vma->vm_start > spu_ref || vma->vm_end < spu_ref) >>+ continue; >>+ my_offset = spu_ref - vma->vm_start; >>+ pr_debug("Found spu ELF at " >>+ " %X for file %s\n", my_offset, >>+ vma->vm_file->f_dentry->d_name.name); >>+ *offsetp = my_offset; >>+ if (my_offset == 0) { >>+ if (!vma->vm_file) { >>+ goto FAIL_NO_SPU_COOKIE; >>+ } >>+ *spu_bin_dcookie = fast_get_dcookie( >>+ vma->vm_file->f_dentry, >>+ vma->vm_file->f_vfsmnt); >>+ pr_debug("got dcookie for %s\n", >>+ vma->vm_file->f_dentry->d_name.name); >>+ } >>+ break; >>+ } >>+ >>+OUT: >>+ return cookie; >>+ >>+FAIL_NO_SPU_COOKIE: >>+ printk(KERN_ERR "SPU_PROF: " >>+ "%s, line %d: Cannot find dcookie for SPU binary\n", >>+ __FUNCTION__, __LINE__); >>+ goto OUT; >>+} >>+ >>+ >>+ >>+/* This function finds or creates cached context information for the >>+ * passed SPU and records SPU context information into the OProfile >>+ * event buffer. >>+ */ >>+static int process_context_switch(struct spu * spu, unsigned int objectId) >>+{ >>+ unsigned long flags = 0; >>+ int retval = 0; >>+ unsigned int offset = 0; >>+ unsigned long spu_cookie = 0, app_dcookie = 0; >>+ retval = prepare_cached_spu_info(spu, objectId); >>+ if (retval == -1) { >>+ goto OUT; >>+ } >>+ /* Get dcookie first because a mutex_lock is taken in that >>+ * code path, so interrupts must not be disabled. >>+ */ >>+ app_dcookie = get_exec_dcookie_and_offset(spu, &offset, >>+ &spu_cookie, objectId); >>+ >>+ /* Record context info in event buffer */ >>+ spin_lock_irqsave(&buffer_lock, flags); >>+ add_event_entry(ESCAPE_CODE); >>+ add_event_entry(SPU_CTX_SWITCH_CODE); >>+ add_event_entry(spu->number); >>+ add_event_entry(spu->pid); >>+ add_event_entry(spu->tgid); >>+ add_event_entry(app_dcookie); >>+ >>+ add_event_entry(ESCAPE_CODE); >>+ if (offset) { >>+ /* When offset is non-zero, this means the SPU ELF was embedded; >>+ * otherwise, it was loaded from a separate binary file. For the >>+ * embedded case, we record the offset of the SPU ELF into the PPU >>+ * executable; for the non-embedded case, we record a dcookie that >>+ * points to the location of the SPU binary that was loaded. >>+ */ >>+ add_event_entry(SPU_OFFSET_CODE); >>+ add_event_entry(offset); >>+ } else { >>+ add_event_entry(SPU_COOKIE_CODE); >>+ add_event_entry(spu_cookie); >>+ } > > > I don't get it. What is the app_dcookie used for? If the spu binary is > embedded into a library, you are still missing the dcookie to that .so file, > because you return only an offset. For embedded SPU app, the post-processing tool opens the PPE binary app file and obtains the SPU ELF embedded thereine, and from that, we obtain the name of the SPU binary. Also, the app name is included in the report, along with the SPU binary filename, if the report contains samples from more than one application. > > <nitpicking> > >>+ unsigned long flags = 0; > > > no need to initialize > > >>+ struct spu * the_spu = (struct spu *) data; > > > no need for the cast > > >>+ pr_debug("SPU event notification arrived\n"); >>+ if (val == 0){ > > > if (!val) > > >>+ pr_debug("spu_sync_start -- running.\n"); >>+OUT: > > > out: > > >>+ return ret; >>+} > > > </nitpicking> OK, we'll fix up. > > > >>@@ -480,7 +491,22 @@ >> struct op_system_config *sys, int num_ctrs) >> { >> int i, j, cpu; >>+ spu_cycle_reset = 0; >> >>+ /* The cpufreq_quick_get function requires that cbe_cpufreq module >>+ * be loaded. This function is not actually provided and exported >>+ * by cbe_cpufreq, but it relies on cbe_cpufreq initialize kernel >>+ * data structures. Since there's no way for depmod to realize >>+ * that our OProfile module depends on cbe_cpufreq, we currently >>+ * are letting the userspace tool, opcontrol, ensure that the >>+ * cbe_cpufreq module is loaded. >>+ */ >>+ khzfreq = cpufreq_quick_get(smp_processor_id()); > > > You should probably have a fallback in here in case the cpufreq module > is not loaded. There is a global variable ppc_proc_freq (in Hz) that > you can access. Our userspace tool ensures the cpufreq module is loaded. > >> ; >> } >> >>-static void cell_global_start(struct op_counter_config *ctr) >>+static int calculate_lfsr(int n) >>+{ >>+#define size 24 >>+ int i; >>+ unsigned int newlfsr0; >>+ unsigned int lfsr = 0xFFFFFF; >>+ unsigned int howmany = lfsr - n; >>+ >>+ for (i = 2; i < howmany + 2; i++) { >>+ newlfsr0 = (((lfsr >> (size - 1 - 0)) & 1) ^ >>+ ((lfsr >> (size - 1 - 1)) & 1) ^ >>+ (((lfsr >> (size - 1 - 6)) & 1) ^ >>+ ((lfsr >> (size - 1 - 23)) & 1))); >>+ >>+ lfsr >>= 1; >>+ lfsr = lfsr | (newlfsr0 << (size - 1)); >>+ } >>+ return lfsr; >>+ >>+} > > > I don't have the slightest idea what this code is about, but Me neither. Carl, can you comment? > it certainly looks inefficient to loop 16 million times to > compute a constant. Could you use a faster algorithm instead, > or at least add a comment about why you do it this way? > > >>+static void cell_global_stop(void) >>+{ >>+ if (spu_cycle_reset) { >>+ cell_global_stop_spu(); >>+ } else { >>+ cell_global_stop_ppu(); >>+ } >>+ >>+} > > > This looks weird as well. I suppose it's a limitation of the hardware > that you can only do either ppu or spu profiling. However, making that > dependent of whether the 'spu_cycle_reset' variable is set sounds > rather bogus. It is the only file-scoped variable relating to SPU profiling, and will always be non-zero when the user selects to perform SPU profiling. Seemed like a logical-enough choice to me. > > I don't know what the best interface for choosing the target from > user space would be, but you probably also want to be able to > switch between them at run time. The hardware setup is so completely different, I don't think there's a viable way of switching between PPU profiling and SPU profiling without a "stop" in between. > > Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 21:41 ` Maynard Johnson @ 2007-01-30 22:54 ` Maynard Johnson 2007-01-30 23:34 ` Benjamin Herrenschmidt 2007-01-31 6:52 ` Arnd Bergmann 2007-01-30 23:31 ` Carl Love 2007-01-31 5:57 ` Arnd Bergmann 2 siblings, 2 replies; 43+ messages in thread From: Maynard Johnson @ 2007-01-30 22:54 UTC (permalink / raw) Cc: linuxppc-dev, oprofile-list, cbe-oss-dev, Arnd Bergmann, linux-kernel Maynard Johnson wrote: > Arnd Bergmann wrote: > > >>On Monday 29 January 2007 20:48, Maynard Johnson wrote: >> >> >>>Subject: Add support to OProfile for profiling Cell BE SPUs >>> >>>From: Maynard Johnson <maynardj@us.ibm.com> >>> >>>This patch updates the existing arch/powerpc/oprofile/op_model_cell.c >>>to add in the SPU profiling capabilities. In addition, a 'cell' subdirectory >>>was added to arch/powerpc/oprofile to hold Cell-specific SPU profiling >>>code. >>> >>>Signed-off-by: Carl Love <carll@us.ibm.com> >>>Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> >> [snip] >> >>>+ * >>>+ * Ideally, we would like to be able to create the cached_info for >>>+ * an SPU task just one time -- when libspe first loads the SPU >>>+ * binary file. We would store the cached_info in a list. Then, as >>>+ * SPU tasks are switched out and new ones switched in, the cached_info >>>+ * for inactive tasks would be kept, and the active one would be placed >>>+ * at the head of the list. But this technique may not with >>>+ * current spufs functionality since the spu used in bind_context may >>>+ * be a different spu than was used in a previous bind_context for a >>>+ * reactivated SPU task. Additionally, a reactivated SPU task may be >>>+ * assigned to run on a different physical SPE. We will investigate >>>+ * further if this can be done. >>>+ * >>>+ */ >> >> >>You should stuff a pointer to cached_info into struct spu_context, >>e.g. 'void *profile_private'. >> >> >> >>>+struct cached_info { >>>+ vma_map_t * map; >>>+ struct spu * the_spu; >>>+ struct kref cache_ref; >>>+ struct list_head list; >>>+}; >> >> >>And replace the 'the_spu' member with a back pointer to the >>spu_context if you need it. >> >> >> >>>+ >>>+/* A data structure for cached information about active SPU tasks. >>>+ * Storage is dynamically allocated, sized as >>>+ * "number of active nodes multplied by 8". >>>+ * The info_list[n] member holds 0 or more >>>+ * 'struct cached_info' objects for SPU#=n. >>>+ * >>>+ * As currently implemented, there will only ever be one cached_info >>>+ * in the list for a given SPU. If we can devise a way to maintain >>>+ * multiple cached_infos in our list, then it would make sense >>>+ * to also cache the dcookie representing the PPU task application. >>>+ * See above description of struct cached_info for more details. >>>+ */ >>>+struct spu_info_stacks { >>>+ struct list_head * info_list; >>>+}; >> >> >>Why do you store pointers to list_head structures? If you want to store >>lists, you should have a lists_head itself in here. > > info_list is an array of n lists, where n is the number of SPUs. > >>Why do you store them per spu in the first place? The physical spu >>doesn't have any relevance to this at all, the only data that is >>per spu is the sample data collected on a profiling interrupt, >>which you can then copy in the per-context data on a context switch. > > The sample data is written out to the event buffer on every profiling > interrupt. But we don't write out the SPU program counter samples > directly to the event buffer. First, we have to find the cached_info > for the appropriate SPU context to retrieve the cached vma-to-fileoffset > map. Then we do the vma_map_lookup to find the fileoffset corresponding > to the SPU PC sample, which we then write out to the event buffer. This > is one of the most time-critical pieces of the SPU profiling code, so I > used an array to hold the cached_info for fast random access. But as I > stated in a code comment above, the negative implication of this current > implementation is that the array can only hold the cached_info for > currently running SPU tasks. I need to give this some more thought. I've given this some more thought, and I'm coming to the conclusion that a pure array-based implementation for holding cached_info (getting rid of the lists) would work well for the vast majority of cases in which OProfile will be used. Yes, it is true that the mapping of an SPU context to a phsyical spu-numbered array location cannot be guaranteed to stay valid, and that's why I discard the cached_info at that array location when the SPU task is switched out. Yes, it would be terribly inefficient if the same SPU task gets switched back in later and we would have to recreate the cached_info. However, I contend that OProfile users are interested in profiling one application at a time. They are not going to want to muddy the waters with multiple SPU apps running at the same time. I can't think of any reason why someone would conscisouly choose to do that. Any thoughts from the general community, especially OProfile users? Thanks. -Maynard > >> [snip] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 22:54 ` Maynard Johnson @ 2007-01-30 23:34 ` Benjamin Herrenschmidt 2007-01-31 0:29 ` Maynard Johnson 2007-01-31 6:52 ` Arnd Bergmann 1 sibling, 1 reply; 43+ messages in thread From: Benjamin Herrenschmidt @ 2007-01-30 23:34 UTC (permalink / raw) To: maynardj Cc: linuxppc-dev, Arnd Bergmann, cbe-oss-dev, oprofile-list, linux-kernel > I've given this some more thought, and I'm coming to the conclusion that > a pure array-based implementation for holding cached_info (getting rid > of the lists) would work well for the vast majority of cases in which > OProfile will be used. Yes, it is true that the mapping of an SPU > context to a phsyical spu-numbered array location cannot be guaranteed > to stay valid, and that's why I discard the cached_info at that array > location when the SPU task is switched out. Yes, it would be terribly > inefficient if the same SPU task gets switched back in later and we > would have to recreate the cached_info. However, I contend that > OProfile users are interested in profiling one application at a time. > They are not going to want to muddy the waters with multiple SPU apps > running at the same time. I can't think of any reason why someone would > conscisouly choose to do that. > > Any thoughts from the general community, especially OProfile users? Well, it's my understanding that quite a few typical usage scenario involve different tasks running on different SPUs passing each other data around. Ben. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 23:34 ` Benjamin Herrenschmidt @ 2007-01-31 0:29 ` Maynard Johnson 0 siblings, 0 replies; 43+ messages in thread From: Maynard Johnson @ 2007-01-31 0:29 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linuxppc-dev, Arnd Bergmann, cbe-oss-dev, oprofile-list, linux-kernel Benjamin Herrenschmidt wrote: >>I've given this some more thought, and I'm coming to the conclusion that >>a pure array-based implementation for holding cached_info (getting rid >>of the lists) would work well for the vast majority of cases in which >>OProfile will be used. Yes, it is true that the mapping of an SPU >>context to a phsyical spu-numbered array location cannot be guaranteed >>to stay valid, and that's why I discard the cached_info at that array >>location when the SPU task is switched out. Yes, it would be terribly >>inefficient if the same SPU task gets switched back in later and we >>would have to recreate the cached_info. However, I contend that >>OProfile users are interested in profiling one application at a time. >>They are not going to want to muddy the waters with multiple SPU apps >>running at the same time. I can't think of any reason why someone would >>conscisouly choose to do that. >> >>Any thoughts from the general community, especially OProfile users? > > > Well, it's my understanding that quite a few typical usage scenario > involve different tasks running on different SPUs passing each other > data around. That shouldn't be a problem. I would consider this to be "one large application" consisting of multiple SPU binaries running simultaneously. Such a scenario can be handled with no negative performance impact using a simple 16 element array of cached_info objects -- as long as there isn't (much) SPU task switching being done. -Maynard > > Ben. > > ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 22:54 ` Maynard Johnson 2007-01-30 23:34 ` Benjamin Herrenschmidt @ 2007-01-31 6:52 ` Arnd Bergmann 2007-02-02 16:47 ` Maynard Johnson 1 sibling, 1 reply; 43+ messages in thread From: Arnd Bergmann @ 2007-01-31 6:52 UTC (permalink / raw) To: cbe-oss-dev, maynardj; +Cc: linuxppc-dev, linux-kernel, oprofile-list On Tuesday 30 January 2007 23:54, Maynard Johnson wrote: > >>Why do you store them per spu in the first place? The physical spu > >>doesn't have any relevance to this at all, the only data that is > >>per spu is the sample data collected on a profiling interrupt, > >>which you can then copy in the per-context data on a context switch. > >=20 > > The sample data is written out to the event buffer on every profiling=20 > > interrupt. =A0But we don't write out the SPU program counter samples=20 > > directly to the event buffer. =A0First, we have to find the cached_info= =20 > > for the appropriate SPU context to retrieve the cached vma-to-fileoffse= t=20 > > map. =A0Then we do the vma_map_lookup to find the fileoffset correspond= ing=20 > > to the SPU PC sample, which we then write out to the event buffer. =A0T= his=20 > > is one of the most time-critical pieces of the SPU profiling code, so I= =20 > > used an array to hold the cached_info for fast random access. =A0But as= I=20 > > stated in a code comment above, the negative implication of this curren= t=20 > > implementation is that the array can only hold the cached_info for=20 > > currently running SPU tasks. =A0I need to give this some more thought. >=20 > I've given this some more thought, and I'm coming to the conclusion that= =20 > a pure array-based implementation for holding cached_info (getting rid=20 > of the lists) would work well for the vast majority of cases in which=20 > OProfile will be used. =A0Yes, it is true that the mapping of an SPU=20 > context to a phsyical spu-numbered array location cannot be guaranteed=20 > to stay valid, and that's why I discard the cached_info at that array=20 > location when the SPU task is switched out. =A0Yes, it would be terribly= =20 > inefficient if the same SPU task gets switched back in later and we=20 > would have to recreate the cached_info. =A0However, I contend that=20 > OProfile users are interested in profiling one application at a time.=20 > They are not going to want to muddy the waters with multiple SPU apps=20 > running at the same time. =A0I can't think of any reason why someone woul= d=20 > conscisouly choose to do that. >=20 > Any thoughts from the general community, especially OProfile users? >=20 Please assume that in the near future we will be scheduling SPU contexts in and out multiple times a second. Even in a single application, you can easily have more contexts than you have physical SPUs. The event buffer by definition needs to be per context. If you for some reason want to collect the samples per physical SPU during an event interrupt, you should at least make sure that they are copied into the per-context event buffer on a context switch. At the context switch point, you probably also want to drain the hw event counters, so that you account all events correctly. We also want to be able to profile the context switch code itself, which means that we also need one event buffer associated with the kernel to collect events that for a zero context_id. Of course, the recording of raw samples in the per-context buffer does not need to have the dcookies along with it, you can still resolve the pointers when the SPU context gets destroyed (or an object gets unmapped). Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-31 6:52 ` Arnd Bergmann @ 2007-02-02 16:47 ` Maynard Johnson 2007-02-03 7:40 ` Arnd Bergmann 0 siblings, 1 reply; 43+ messages in thread From: Maynard Johnson @ 2007-02-02 16:47 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel Arnd Bergmann wrote: > On Tuesday 30 January 2007 23:54, Maynard Johnson wrote: > > >>>>Why do you store them per spu in the first place? The physical spu >>>>doesn't have any relevance to this at all, the only data that is >>>>per spu is the sample data collected on a profiling interrupt, >>>>which you can then copy in the per-context data on a context switch. >>> >>>The sample data is written out to the event buffer on every profiling >>>interrupt. But we don't write out the SPU program counter samples >>>directly to the event buffer. First, we have to find the cached_info >>>for the appropriate SPU context to retrieve the cached vma-to-fileoffset >>>map. Then we do the vma_map_lookup to find the fileoffset corresponding >>>to the SPU PC sample, which we then write out to the event buffer. This >>>is one of the most time-critical pieces of the SPU profiling code, so I >>>used an array to hold the cached_info for fast random access. But as I >>>stated in a code comment above, the negative implication of this current >>>implementation is that the array can only hold the cached_info for >>>currently running SPU tasks. I need to give this some more thought. >> >>I've given this some more thought, and I'm coming to the conclusion that >>a pure array-based implementation for holding cached_info (getting rid >>of the lists) would work well for the vast majority of cases in which >>OProfile will be used. Yes, it is true that the mapping of an SPU >>context to a phsyical spu-numbered array location cannot be guaranteed >>to stay valid, and that's why I discard the cached_info at that array >>location when the SPU task is switched out. Yes, it would be terribly >>inefficient if the same SPU task gets switched back in later and we >>would have to recreate the cached_info. However, I contend that >>OProfile users are interested in profiling one application at a time. >>They are not going to want to muddy the waters with multiple SPU apps >>running at the same time. I can't think of any reason why someone would >>conscisouly choose to do that. >> >>Any thoughts from the general community, especially OProfile users? >> > > Please assume that in the near future we will be scheduling SPU contexts > in and out multiple times a second. Even in a single application, you > can easily have more contexts than you have physical SPUs. Arnd, thanks for pointing this out. That's definitely a good reason why my simplistic approach won't work. I'll look at other options. > > The event buffer by definition needs to be per context. If you for some Yes, and it is. Right now, with the current simplistic approach, the context and the physical SPU are kept in sync. > reason want to collect the samples per physical SPU during an event > interrupt, you should at least make sure that they are copied into the > per-context event buffer on a context switch. > > At the context switch point, you probably also want to drain the > hw event counters, so that you account all events correctly. Yeah, that's a good idea. The few extraneous invalid samples would probably never rise above the noise level, but we should do this anyway for completeness. > > We also want to be able to profile the context switch code itself, which > means that we also need one event buffer associated with the kernel to > collect events that for a zero context_id. The hardware design precludes tracing both SPU and PPU simultaneously. -Maynard > > Of course, the recording of raw samples in the per-context buffer does > not need to have the dcookies along with it, you can still resolve > the pointers when the SPU context gets destroyed (or an object gets > unmapped). > > Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-02-02 16:47 ` Maynard Johnson @ 2007-02-03 7:40 ` Arnd Bergmann 2007-02-03 20:03 ` Maynard Johnson 0 siblings, 1 reply; 43+ messages in thread From: Arnd Bergmann @ 2007-02-03 7:40 UTC (permalink / raw) To: maynardj; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel On Friday 02 February 2007 17:47, Maynard Johnson wrote: > > > We also want to be able to profile the context switch code itself, which > > means that we also need one event buffer associated with the kernel to > > collect events that for a zero context_id. > The hardware design precludes tracing both SPU and PPU simultaneously. > I mean the SPU-side part of the context switch code, which you can find in arch/powerpc/platforms/cell/spufs/spu_{save,restore}*. This code is the one that runs when context_id == 0 is passed to the callback. Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-02-03 7:40 ` Arnd Bergmann @ 2007-02-03 20:03 ` Maynard Johnson 2007-02-04 2:42 ` Arnd Bergmann 0 siblings, 1 reply; 43+ messages in thread From: Maynard Johnson @ 2007-02-03 20:03 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel Arnd Bergmann wrote: >On Friday 02 February 2007 17:47, Maynard Johnson wrote: > > >>>We also want to be able to profile the context switch code itself, which >>>means that we also need one event buffer associated with the kernel to >>>collect events that for a zero context_id. >>> >>> >>The hardware design precludes tracing both SPU and PPU simultaneously. >> >> >> >I mean the SPU-side part of the context switch code, which you can find >in arch/powerpc/platforms/cell/spufs/spu_{save,restore}*. > >This code is the one that runs when context_id == 0 is passed to the >callback. > > I presume you mean 'object_id'. What you're asking for is a new requirement, and one which I don't believe is achievable in the current timeframe. Since this is spufs code that's dynamicaly loaded into the SPU at runtime, the symbol information for this code is not accessible to the userspace post-processing tools. It would require an altogether different mechanism to record samples along with necessary information, not to mention the changes required in the post-processing tools. This will have to be a future enhancement. -Maynard > Arnd <>< > > ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-02-03 20:03 ` Maynard Johnson @ 2007-02-04 2:42 ` Arnd Bergmann 2007-02-04 17:11 ` Maynard Johnson 0 siblings, 1 reply; 43+ messages in thread From: Arnd Bergmann @ 2007-02-04 2:42 UTC (permalink / raw) To: cbe-oss-dev; +Cc: linuxppc-dev, linux-kernel, oprofile-list On Saturday 03 February 2007 21:03, Maynard Johnson wrote: > I presume you mean 'object_id'.=20 Right, sorry for the confusion. > What you're asking for is a new =20 > requirement, and one which I don't believe is achievable in the current=20 > timeframe. =A0Since this is spufs code that's dynamicaly loaded into the= =20 > SPU at runtime, the symbol information for this code is not accessible=20 > to the userspace post-processing tools. We can always fix the user space tool later, but it is important to get at least the kernel interface right, so we can add the functionality later without breaking new user space on old kernels or vice versa. > It would require an altogether =20 > different mechanism to record samples along with necessary information,=20 > not to mention the changes required in the post-processing tools. =A0This= =20 > will have to be a future enhancement. So what do you do with the samples for object_id =3D=3D 0? I would expect t= hem to be simply added to the sample buffer for the kernel, which is sort of special in oprofile anyway. Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-02-04 2:42 ` Arnd Bergmann @ 2007-02-04 17:11 ` Maynard Johnson 0 siblings, 0 replies; 43+ messages in thread From: Maynard Johnson @ 2007-02-04 17:11 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel Arnd Bergmann wrote: >On Saturday 03 February 2007 21:03, Maynard Johnson wrote: > > >>I presume you mean 'object_id'. >> >> > >Right, sorry for the confusion. > > > >>What you're asking for is a new >>requirement, and one which I don't believe is achievable in the current >>timeframe. Since this is spufs code that's dynamicaly loaded into the >>SPU at runtime, the symbol information for this code is not accessible >>to the userspace post-processing tools. >> >> > >We can always fix the user space tool later, but it is important to >get at least the kernel interface right, so we can add the functionality >later without breaking new user space on old kernels or vice versa. > > There's no obvious solution to this problem, so it's going to take some design effort to come up with a solution. We can work on the problem, but I don't think we can allow it to get in the way of getting our currently proposed SPU profiling functionality into the kernel. > > >>It would require an altogether >>different mechanism to record samples along with necessary information, >>not to mention the changes required in the post-processing tools. This >>will have to be a future enhancement. >> >> > >So what do you do with the samples for object_id == 0? I would expect them >to be simply added to the sample buffer for the kernel, which is sort >of special in oprofile anyway. > > There is no sample buffer for the kernel in SPU profiling. When OProfile gets notified of an SPU task switching out (object_if == 0), we stop recording samples for the corresponding SPU. If SPUFs sends the notification after the spu_save operation, we still would be collecting samples during that time; however, since the VMAs of these samples would not map to any fileoffset in the SPU binary that's executing, our current implementation throws them away. We could change that behavior and record them in the samples buffer as "anonymous samples". Not sure it that would help too much, as you wouldn't be able to distinguish between spu_save samples and samples from generated stubs that are executing from the stack. -Maynard > Arnd <>< > > ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 21:41 ` Maynard Johnson 2007-01-30 22:54 ` Maynard Johnson @ 2007-01-30 23:31 ` Carl Love 2007-01-31 1:25 ` Christian Krafft 2007-01-31 6:06 ` Arnd Bergmann 2007-01-31 5:57 ` Arnd Bergmann 2 siblings, 2 replies; 43+ messages in thread From: Carl Love @ 2007-01-30 23:31 UTC (permalink / raw) To: maynardj Cc: linuxppc-dev, oprofile-list, cbe-oss-dev, Arnd Bergmann, linux-kernel On Tue, 2007-01-30 at 15:41 -0600, Maynard Johnson wrote: [snip > [snip] > >>+ // process the collected SPU PC for each node > >>+ for_each_online_cpu(cpu) { > >>+ if (cbe_get_hw_thread_id(cpu)) > >>+ continue; > >>+ > >>+ node = cbe_cpu_to_node(cpu); > >>+ node_factor = node * SPUS_PER_NODE; > >>+ /* number of valid entries for this node */ > >>+ entry = 0; > >>+ > >>+ trace_addr = cbe_read_pm(cpu, trace_address); > >>+ while ((trace_addr & CBE_PM_TRACE_BUF_EMPTY) != 0x400) > >>+ { > >>+ /* there is data in the trace buffer to process */ > >>+ cbe_read_trace_buffer(cpu, trace_buffer); > >>+ spu_mask = 0xFFFF000000000000; > >>+ > >>+ /* Each SPU PC is 16 bits; hence, four spus in each of > >>+ * the two 64-bit buffer entries that make up the > >>+ * 128-bit trace_buffer entry. Process the upper and > >>+ * lower 64-bit values simultaneously. > >>+ */ > >>+ for (spu = 0; spu < SPUS_PER_TB_ENTRY; spu++) { > >>+ spu_pc_lower = spu_mask & trace_buffer[0]; > >>+ spu_pc_lower = spu_pc_lower >> (NUM_SPU_BITS_TRBUF > >>+ * (SPUS_PER_TB_ENTRY-spu-1)); > >>+ > >>+ spu_pc_upper = spu_mask & trace_buffer[1]; > >>+ spu_pc_upper = spu_pc_upper >> (NUM_SPU_BITS_TRBUF > >>+ * (SPUS_PER_TB_ENTRY-spu-1)); > >>+ > >>+ spu_mask = spu_mask >> NUM_SPU_BITS_TRBUF; > >>+ > >>+ /* spu PC trace entry is upper 16 bits of the > >>+ * 18 bit SPU program counter > >>+ */ > >>+ spu_pc_lower = spu_pc_lower << 2; > >>+ spu_pc_upper = spu_pc_upper << 2; > >>+ > >>+ samples[((node_factor + spu) * TRACE_ARRAY_SIZE) + entry] > >>+ = (u32) spu_pc_lower; > >>+ samples[((node_factor + spu + SPUS_PER_TB_ENTRY) * TRACE_ARRAY_SIZE) + entry] > >>+ = (u32) spu_pc_upper; > >>+ } > >>+ > >>+ entry++; > >>+ > >>+ if (entry >= TRACE_ARRAY_SIZE) > >>+ /* spu_samples is full */ > >>+ break; > >>+ > >>+ trace_addr = cbe_read_pm(cpu, trace_address); > >>+ } > >>+ samples_per_node[node] = entry; > >>+ } > >>+} > > > > > > While I can't see anything technically wrong with this function, it would be > > good to split it into smaller functions. Since you are nesting three > > loops, it should be possible to make a separate function from one of the > > inner loops without changing the actual logic behind it. > Will do. > > > > > >>+ > >>+static int profile_spus(struct hrtimer * timer) > >>+{ > >>+ ktime_t kt; > >>+ int cpu, node, k, num_samples, spu_num; > > > > > > whitespace damage > fixed > > > > > >>+ > >>+ if (!spu_prof_running) > >>+ goto STOP; > >>+ > >>+ cell_spu_pc_collection(); > >>+ for_each_online_cpu(cpu) { > >>+ if (cbe_get_hw_thread_id(cpu)) > >>+ continue; > > > > > > Here, you enter the same top-level loop again, why not make it > > for_each_online_cpu(cpu) { > > if (cbe_get_hw_thread_id(cpu)) > > continue; > > num_samples = cell_spu_pc_collection(cpu); > > ... > Yes, good suggestion. I believe what you are asking here is why can't the cell_spu_pc_collection() function put the data in to the samples array for a given node, in the loop that then processes the samples array for that node. Yes, I believe that this can be done. The only restriction is that cell_spu_pc_collection() will have to extract the SPU program counter data for all SPUs on that node. This is due to the fact the data for the 8 SPUs are all stored in a single entry of the hardware trace buffer. Once the hardware trace buffer is read, the hardware advances the read pointer so there is no way to go back and re-read the entry. [snip] [snip] > >>+static int calculate_lfsr(int n) > >>+{ > >>+#define size 24 > >>+ int i; > >>+ unsigned int newlfsr0; > >>+ unsigned int lfsr = 0xFFFFFF; > >>+ unsigned int howmany = lfsr - n; > >>+ > >>+ for (i = 2; i < howmany + 2; i++) { > >>+ newlfsr0 = (((lfsr >> (size - 1 - 0)) & 1) ^ > >>+ ((lfsr >> (size - 1 - 1)) & 1) ^ > >>+ (((lfsr >> (size - 1 - 6)) & 1) ^ > >>+ ((lfsr >> (size - 1 - 23)) & 1))); > >>+ > >>+ lfsr >>= 1; > >>+ lfsr = lfsr | (newlfsr0 << (size - 1)); > >>+ } > >>+ return lfsr; > >>+ > >>+} > > > > > > I don't have the slightest idea what this code is about, but > Me neither. Carl, can you comment? > > it certainly looks inefficient to loop 16 million times to > > compute a constant. Could you use a faster algorithm instead, > > or at least add a comment about why you do it this way? > > > > An LFSR sequence is similar to a pseudo random number sequence. For a 24 bit LFSR sequence each number between 0 and 2^24 will occur once in the sequence but not in a normal counting order. The hardware uses the LFSR sequence to count to since it is much simpler to implement in hardware then a normal counter. Unfortunately, the only way we know how to figure out what the LFSR value that corresponds to the number in the sequence that is N before the last value (0xFFFFFF) is to calculate the previous value N times. It is like trying to ask what is the pseudo random number that is N before this pseudo random number? I will add a short comment to the code that will summarize the above paragraph. [snip] ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 23:31 ` Carl Love @ 2007-01-31 1:25 ` Christian Krafft 2007-01-31 6:06 ` Arnd Bergmann 1 sibling, 0 replies; 43+ messages in thread From: Christian Krafft @ 2007-01-31 1:25 UTC (permalink / raw) To: Carl Love Cc: Arnd Bergmann, linux-kernel, linuxppc-dev, oprofile-list, cbe-oss-dev On Tue, 30 Jan 2007 15:31:09 -0800 Carl Love <cel@us.ibm.com> wrote: > An LFSR sequence is similar to a pseudo random number sequence. For a 24 > bit LFSR sequence each number between 0 and 2^24 will occur once in the > sequence but not in a normal counting order. The hardware uses the LFSR > sequence to count to since it is much simpler to implement in hardware > then a normal counter. Unfortunately, the only way we know how to > figure out what the LFSR value that corresponds to the number in the > sequence that is N before the last value (0xFFFFFF) is to calculate the > previous value N times. It is like trying to ask what is the pseudo > random number that is N before this pseudo random number? That should be no problem.=20 You can just revers your algorithm and let it run x times instead of 0xFFFF= FF-x. >=20 > I will add a short comment to the code that will summarize the above > paragraph. > [snip] >=20 > _______________________________________________ > cbe-oss-dev mailing list > cbe-oss-dev@ozlabs.org > https://ozlabs.org/mailman/listinfo/cbe-oss-dev --=20 Mit freundlichen Gr=FCssen, kind regards, Christian Krafft IBM Systems & Technology Group,=20 Linux Kernel Development IT Specialist ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 23:31 ` Carl Love 2007-01-31 1:25 ` Christian Krafft @ 2007-01-31 6:06 ` Arnd Bergmann 1 sibling, 0 replies; 43+ messages in thread From: Arnd Bergmann @ 2007-01-31 6:06 UTC (permalink / raw) To: cbe-oss-dev; +Cc: linuxppc-dev, linux-kernel, oprofile-list, Carl Love On Wednesday 31 January 2007 00:31, Carl Love wrote: > Unfortunately, the only way we know how to > figure out what the LFSR value that corresponds to the number in the > sequence that is N before the last value (0xFFFFFF) is to calculate the > previous value N times. =A0It is like trying to ask what is the pseudo > random number that is N before this pseudo random number? Well, you can at least implement the lfsr both ways, and choose the one that is faster to get at, like u32 get_lfsr(u32 v) { int i; u32 r =3D 0xffffff; if (v < 0x7fffff) { for (i =3D 0; i < v; i++) r =3D lfsr_forwards(r); } else { for (i =3D 0; i < (0x1000000 - v); i++) r =3D lfsr_backwards(r); } return r; } Also, if the value doesn't have to be really exact, you could have a small lookup table with precomputed values, like: u32 get_lfsr(u32 v) { static const lookup[256] =3D { 0xab3492, 0x3e3f34, 0xc47610c, ... /* insert actual values */ }; return lookup[v >> 16]; } Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 21:41 ` Maynard Johnson 2007-01-30 22:54 ` Maynard Johnson 2007-01-30 23:31 ` Carl Love @ 2007-01-31 5:57 ` Arnd Bergmann 2007-02-02 19:27 ` Maynard Johnson 2 siblings, 1 reply; 43+ messages in thread From: Arnd Bergmann @ 2007-01-31 5:57 UTC (permalink / raw) To: maynardj Cc: linuxppc-dev, Anton Blanchard, cbe-oss-dev, oprofile-list, linux-kernel On Tuesday 30 January 2007 22:41, Maynard Johnson wrote: > Arnd Bergmann wrote: > >>+ kt = ktime_set(0, profiling_interval); > >>+ if (!spu_prof_running) > >>+ goto STOP; > >>+ hrtimer_forward(timer, timer->base->get_time(), kt); > >>+ return HRTIMER_RESTART; > > > > > > is hrtimer_forward really the right interface here? You are ignoring > > the number of overruns anyway, so hrtimer_start(,,) sounds more > > correct to me. > According to Tom Gleixner, "hrtimer_forward is a convenience function to > move the expiry time of a timer forward in multiples of the interval, so > it is in the future. After setting the expiry time you restart the > timer either with [sic] a return HRTIMER_RESTART (if you are in the > timer callback function)." > > Ok, I see. Have you seen the timer actually coming in late, resulting in hrtimer_forward returning non-zero? I guess it's not a big problem for statistic data collection if that happens, but you might still want to be able to see it. > >>+ /* Since cpufreq_quick_get returns frequency in kHz, we use > >>+ * USEC_PER_SEC here vs NSEC_PER_SEC. > >>+ */ > >>+ unsigned long nsPerCyc = (USEC_PER_SEC << SCALE_SHIFT)/khzfreq; > >>+ profiling_interval = (nsPerCyc * cycles_reset) >> SCALE_SHIFT; > >>+ > >>+ pr_debug("timer resolution: %lu\n", > >>+ TICK_NSEC); > > > > > > Don't you need to adapt the profiling_interval at run time, when cpufreq > > changes the core frequency? You should probably use > > cpufreq_register_notifier() to update this. > Since OProfile is a statistical profiler, the exact frequency is not > critical. The user is going to be looking for hot spots in their code, > so it's all relative. With that said, I don't imagine using the > cpufreq notiication would be a big deal. We'll look at it. > > >>@@ -480,7 +491,22 @@ > >> struct op_system_config *sys, int num_ctrs) > >> { > >> int i, j, cpu; > >>+ spu_cycle_reset = 0; > >> > >>+ /* The cpufreq_quick_get function requires that cbe_cpufreq module > >>+ * be loaded. This function is not actually provided and exported > >>+ * by cbe_cpufreq, but it relies on cbe_cpufreq initialize kernel > >>+ * data structures. Since there's no way for depmod to realize > >>+ * that our OProfile module depends on cbe_cpufreq, we currently > >>+ * are letting the userspace tool, opcontrol, ensure that the > >>+ * cbe_cpufreq module is loaded. > >>+ */ > >>+ khzfreq = cpufreq_quick_get(smp_processor_id()); > > > > > > You should probably have a fallback in here in case the cpufreq module > > is not loaded. There is a global variable ppc_proc_freq (in Hz) that > > you can access. > > Our userspace tool ensures the cpufreq module is loaded. You should not rely on user space tools to do the right thing in the kernel. Moreover, if the exact frequency is not that important, as you mentioned above, you can probably just hardcode a compile-time constant here. > >>+ * > >>+ * Ideally, we would like to be able to create the cached_info for > >>+ * an SPU task just one time -- when libspe first loads the SPU > >>+ * binary file. We would store the cached_info in a list. Then, as > >>+ * SPU tasks are switched out and new ones switched in, the cached_info > >>+ * for inactive tasks would be kept, and the active one would be placed > >>+ * at the head of the list. But this technique may not with > >>+ * current spufs functionality since the spu used in bind_context may > >>+ * be a different spu than was used in a previous bind_context for a > >>+ * reactivated SPU task. Additionally, a reactivated SPU task may be > >>+ * assigned to run on a different physical SPE. We will investigate > >>+ * further if this can be done. > >>+ * > >>+ */ > > > > > > You should stuff a pointer to cached_info into struct spu_context, > > e.g. 'void *profile_private'. > > > > > >>+struct cached_info { > >>+ vma_map_t * map; > >>+ struct spu * the_spu; > >>+ struct kref cache_ref; > >>+ struct list_head list; > >>+}; > > > > > > And replace the 'the_spu' member with a back pointer to the > > spu_context if you need it. > > > > > >>+ > >>+/* A data structure for cached information about active SPU tasks. > >>+ * Storage is dynamically allocated, sized as > >>+ * "number of active nodes multplied by 8". > >>+ * The info_list[n] member holds 0 or more > >>+ * 'struct cached_info' objects for SPU#=n. > >>+ * > >>+ * As currently implemented, there will only ever be one cached_info > >>+ * in the list for a given SPU. If we can devise a way to maintain > >>+ * multiple cached_infos in our list, then it would make sense > >>+ * to also cache the dcookie representing the PPU task application. > >>+ * See above description of struct cached_info for more details. > >>+ */ > >>+struct spu_info_stacks { > >>+ struct list_head * info_list; > >>+}; > > > > > > Why do you store pointers to list_head structures? If you want to store > > lists, you should have a lists_head itself in here. > info_list is an array of n lists, where n is the number of SPUs. My point was that it's not an array of lists, but an array of pointers to lists. The way that include/list.h works is by having a struct list_head as the anchor and then add nodes to it. By simply pointing to a list_head, you won't be able to use the list_for_each_entry() macro the way it is meant. > > > > I don't get it. What is the app_dcookie used for? If the spu binary is > > embedded into a library, you are still missing the dcookie to that .so file, > > because you return only an offset. > For embedded SPU app, the post-processing tool opens the PPE binary app > file and obtains the SPU ELF embedded thereine, and from that, we obtain > the name of the SPU binary. Also, the app name is included in the > report, along with the SPU binary filename, if the report contains > samples from more than one application. That's not what I meant. If you have an embedded spu ELF file in a shared library object, you end up recording the file of the main PPE binary, and the offset of the SPU ELF inside of the .so file, but not the name of the .so file itself. You also say that the you record the name of the application on purpose for separate SPU ELF files. My assumption was that this is not necessary, but I don't know much about that aspect of oprofile. My feeling is that the samples for an external SPU ELF file should be handled the same way that oprofile handles events in shared libraries on the PPU (e.g. in libc.so). Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-31 5:57 ` Arnd Bergmann @ 2007-02-02 19:27 ` Maynard Johnson 0 siblings, 0 replies; 43+ messages in thread From: Maynard Johnson @ 2007-02-02 19:27 UTC (permalink / raw) To: Arnd Bergmann Cc: linuxppc-dev, Anton Blanchard, cbe-oss-dev, oprofile-list, linux-kernel Arnd Bergmann wrote: > On Tuesday 30 January 2007 22:41, Maynard Johnson wrote: > >>Arnd Bergmann wrote: > > >>>>+ kt = ktime_set(0, profiling_interval); >>>>+ if (!spu_prof_running) >>>>+ goto STOP; >>>>+ hrtimer_forward(timer, timer->base->get_time(), kt); >>>>+ return HRTIMER_RESTART; >>> >>> >>>is hrtimer_forward really the right interface here? You are ignoring >>>the number of overruns anyway, so hrtimer_start(,,) sounds more >>>correct to me. >> >>According to Tom Gleixner, "hrtimer_forward is a convenience function to >>move the expiry time of a timer forward in multiples of the interval, so >>it is in the future. After setting the expiry time you restart the >>timer either with [sic] a return HRTIMER_RESTART (if you are in the >>timer callback function)." >> > > Ok, I see. Have you seen the timer actually coming in late, resulting > in hrtimer_forward returning non-zero? I guess it's not a big problem > for statistic data collection if that happens, but you might still want > to be able to see it. I don't think there's much point in knowing if we have overrun(s). We're not going to schedule the timer any differently. We want to keep the timer interrupts as consistent as possible according to the user's request. > > >>>>+ /* Since cpufreq_quick_get returns frequency in kHz, we use >>>>+ * USEC_PER_SEC here vs NSEC_PER_SEC. >>>>+ */ >>>>+ unsigned long nsPerCyc = (USEC_PER_SEC << SCALE_SHIFT)/khzfreq; >>>>+ profiling_interval = (nsPerCyc * cycles_reset) >> SCALE_SHIFT; >>>>+ >>>>+ pr_debug("timer resolution: %lu\n", >>>>+ TICK_NSEC); >>> >>> >>>Don't you need to adapt the profiling_interval at run time, when cpufreq >>>changes the core frequency? You should probably use >>>cpufreq_register_notifier() to update this. >> >>Since OProfile is a statistical profiler, the exact frequency is not >>critical. The user is going to be looking for hot spots in their code, >>so it's all relative. With that said, I don't imagine using the >>cpufreq notiication would be a big deal. We'll look at it. >> >> >>>>@@ -480,7 +491,22 @@ >>>> struct op_system_config *sys, int num_ctrs) >>>>{ >>>> int i, j, cpu; >>>>+ spu_cycle_reset = 0; >>>> >>>>+ /* The cpufreq_quick_get function requires that cbe_cpufreq module >>>>+ * be loaded. This function is not actually provided and exported >>>>+ * by cbe_cpufreq, but it relies on cbe_cpufreq initialize kernel >>>>+ * data structures. Since there's no way for depmod to realize >>>>+ * that our OProfile module depends on cbe_cpufreq, we currently >>>>+ * are letting the userspace tool, opcontrol, ensure that the >>>>+ * cbe_cpufreq module is loaded. >>>>+ */ >>>>+ khzfreq = cpufreq_quick_get(smp_processor_id()); >>> >>> >>>You should probably have a fallback in here in case the cpufreq module >>>is not loaded. There is a global variable ppc_proc_freq (in Hz) that >>>you can access. >> >>Our userspace tool ensures the cpufreq module is loaded. > > > You should not rely on user space tools to do the right thing in the kernel. Ok, we'll look at the fallback option you suggest. I don't recall if I even knew about ppc_proc_freq before or why I originally chose cpufreq_guick_get. Maybe we can do without the cpufreq and use ppc_proc_freq all the time. We'll see . . . > > Moreover, if the exact frequency is not that important, as you mentioned > above, you can probably just hardcode a compile-time constant here. Well, exact frequency isn't critical, but it should, as close as possible, correspond with the user's requested value for "spu cycle reset". > > >>>>+ * >>>>+ * Ideally, we would like to be able to create the cached_info for >>>>+ * an SPU task just one time -- when libspe first loads the SPU >>>>+ * binary file. We would store the cached_info in a list. Then, as >>>>+ * SPU tasks are switched out and new ones switched in, the cached_info >>>>+ * for inactive tasks would be kept, and the active one would be placed >>>>+ * at the head of the list. But this technique may not with >>>>+ * current spufs functionality since the spu used in bind_context may >>>>+ * be a different spu than was used in a previous bind_context for a >>>>+ * reactivated SPU task. Additionally, a reactivated SPU task may be >>>>+ * assigned to run on a different physical SPE. We will investigate >>>>+ * further if this can be done. >>>>+ * >>>>+ */ >>> >>> >>>You should stuff a pointer to cached_info into struct spu_context, >>>e.g. 'void *profile_private'. >>> >>> >>> >>>>+struct cached_info { >>>>+ vma_map_t * map; >>>>+ struct spu * the_spu; >>>>+ struct kref cache_ref; >>>>+ struct list_head list; >>>>+}; >>> >>> >>>And replace the 'the_spu' member with a back pointer to the >>>spu_context if you need it. >>> >>> >>> >>>>+ >>>>+/* A data structure for cached information about active SPU tasks. >>>>+ * Storage is dynamically allocated, sized as >>>>+ * "number of active nodes multplied by 8". >>>>+ * The info_list[n] member holds 0 or more >>>>+ * 'struct cached_info' objects for SPU#=n. >>>>+ * >>>>+ * As currently implemented, there will only ever be one cached_info >>>>+ * in the list for a given SPU. If we can devise a way to maintain >>>>+ * multiple cached_infos in our list, then it would make sense >>>>+ * to also cache the dcookie representing the PPU task application. >>>>+ * See above description of struct cached_info for more details. >>>>+ */ >>>>+struct spu_info_stacks { >>>>+ struct list_head * info_list; >>>>+}; >>> >>> >>>Why do you store pointers to list_head structures? If you want to store >>>lists, you should have a lists_head itself in here. >> >>info_list is an array of n lists, where n is the number of SPUs. > > > My point was that it's not an array of lists, but an array of pointers > to lists. The way that include/list.h works is by having a struct list_head > as the anchor and then add nodes to it. By simply pointing to a list_head, > you won't be able to use the list_for_each_entry() macro the way it is meant. I've got to rework this implementation anyway . . . > > [snip] -Maynard > > Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-30 7:39 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-30 7:53 ` Benjamin Herrenschmidt 2007-01-30 21:41 ` Maynard Johnson @ 2007-02-03 23:49 ` Maynard Johnson 2007-02-04 2:52 ` Arnd Bergmann 2 siblings, 1 reply; 43+ messages in thread From: Maynard Johnson @ 2007-02-03 23:49 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel Arnd Bergmann wrote: >On Monday 29 January 2007 20:48, Maynard Johnson wrote: > > >>Subject: Add support to OProfile for profiling Cell BE SPUs >> >> >> > > [snip] >>+ * >>+ * Ideally, we would like to be able to create the cached_info for >>+ * an SPU task just one time -- when libspe first loads the SPU >>+ * binary file. We would store the cached_info in a list. Then, as >>+ * SPU tasks are switched out and new ones switched in, the cached_info >>+ * for inactive tasks would be kept, and the active one would be placed >>+ * at the head of the list. But this technique may not with >>+ * current spufs functionality since the spu used in bind_context may >>+ * be a different spu than was used in a previous bind_context for a >>+ * reactivated SPU task. Additionally, a reactivated SPU task may be >>+ * assigned to run on a different physical SPE. We will investigate >>+ * further if this can be done. >>+ * >>+ */ >> >> > >You should stuff a pointer to cached_info into struct spu_context, >e.g. 'void *profile_private'. > > I seem to recall looking at this option a while back, but didn't go that route since struct spu_context is opaque to me. With such a teqnique, I could then use a simple 16-element array of pointers to cached_info objects, creating them as needed when spu_context->profile_private is NULL. I suppose the better option for now is to add a get_profile_private() function to SPUFs, rather than requiring spu_context to be visible. Don't know why I didn't think to do that before. Ah, well, live and learn. -Maynard > > >>+struct cached_info { >>+ vma_map_t * map; >>+ struct spu * the_spu; >>+ struct kref cache_ref; >>+ struct list_head list; >>+}; >> >> > >And replace the 'the_spu' member with a back pointer to the >spu_context if you need it. > > > > Arnd <>< > > ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-02-03 23:49 ` Maynard Johnson @ 2007-02-04 2:52 ` Arnd Bergmann 2007-02-04 17:33 ` Maynard Johnson 0 siblings, 1 reply; 43+ messages in thread From: Arnd Bergmann @ 2007-02-04 2:52 UTC (permalink / raw) To: cbe-oss-dev; +Cc: linuxppc-dev, linux-kernel, oprofile-list On Sunday 04 February 2007 00:49, Maynard Johnson wrote: > I seem to recall looking at this option a while back, but didn't go that= =20 > route since struct spu_context is opaque to me. =A0With such a teqnique, = I=20 > could then use a simple 16-element array of =A0pointers to cached_info=20 > objects, creating them as needed when spu_context->profile_private is=20 > NULL. =A0I suppose the better option for now is to add a=20 > get_profile_private() function to SPUFs, rather than requiring=20 > spu_context to be visible. Yes, that sounds good. Note that the file providing the=20 spufs_get_profile_private (and respective spufs_set_profile_private) functions needs to be compiled into the kernel then in case oprofile gets linked in but spufs is a module. I think it would also be necessary to have another interface for cleaning up this data when spufs destroys the context. That could possibly a variation of the existing notifier call, or a new call, or you establish the convention that if the private pointer is non-NULL, spufs will kfree it. Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-02-04 2:52 ` Arnd Bergmann @ 2007-02-04 17:33 ` Maynard Johnson 0 siblings, 0 replies; 43+ messages in thread From: Maynard Johnson @ 2007-02-04 17:33 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, oprofile-list, linux-kernel Arnd Bergmann wrote: >On Sunday 04 February 2007 00:49, Maynard Johnson wrote: > > >>I seem to recall looking at this option a while back, but didn't go that >>route since struct spu_context is opaque to me. With such a teqnique, I >>could then use a simple 16-element array of pointers to cached_info >>objects, creating them as needed when spu_context->profile_private is >>NULL. I suppose the better option for now is to add a >>get_profile_private() function to SPUFs, rather than requiring >>spu_context to be visible. >> >> > >Yes, that sounds good. Note that the file providing the >spufs_get_profile_private (and respective spufs_set_profile_private) >functions needs to be compiled into the kernel then in case oprofile >gets linked in but spufs is a module. > > Hmm . . . we already depend on the register/unregister functions in sched.c, so my patch changes the oprofile Kconfig to default to 'm' and 'depends on SPU_FS'. >I think it would also be necessary to have another interface for cleaning >up this data when spufs destroys the context. That could possibly >a variation of the existing notifier call, or a new call, or you >establish the convention that if the private pointer is non-NULL, >spufs will kfree it. > > Yes, I was thnking along the lines of your last suggestion. I presume OProfile gets notified (object_id == 0) before the context is actually destroyed. At that time, we would NULL-out the reference to the cached_info, so then SPUFS would kfree it at destroy time. -Maynard > Arnd <>< > > ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:48 ` [RFC, PATCH 4/4] " Maynard Johnson 2007-01-30 7:39 ` [Cbe-oss-dev] " Arnd Bergmann @ 2007-01-31 9:24 ` Milton Miller 2007-01-31 15:42 ` Arnd Bergmann 1 sibling, 1 reply; 43+ messages in thread From: Milton Miller @ 2007-01-31 9:24 UTC (permalink / raw) To: Maynard Johnson, Carl Love; +Cc: linuxppc-dev, cbe-oss-dev, Arnd Bergmann I've actually read most of the replys. Hopefully I included enough headers to get my mailer to put this in the right thread. Sorry if I missed someone on cc, the mail archives don't give one that info. On Jan 30, 2007, at 5:48 AM, Maynard Johnson wrote: > > Subject: Add support to OProfile for profiling Cell BE SPUs > > From: Maynard Johnson <maynardj@us.ibm.com> > > This patch updates the existing arch/powerpc/oprofile/op_model_cell.c > to add in the SPU profiling capabilities. In addition, a 'cell' > subdirectory > was added to arch/powerpc/oprofile to hold Cell-specific SPU profiling > code. > > Signed-off-by: Carl Love <carll@us.ibm.com> > Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com> > Index: linux-2.6.20-rc1/arch/powerpc/oprofile/cell/pr_util.h > =================================================================== > --- /dev/null 1970-01-01 00:00:00.000000000 +0000 > +++ linux-2.6.20-rc1/arch/powerpc/oprofile/cell/pr_util.h 2007-01-29 > 10:32:03.388789304 -0600 ... > +typedef struct vma_map > +{ > + struct vma_map *next; > + unsigned int vma; > + unsigned int size; > + unsigned int offset; > + unsigned int guard_ptr; > + unsigned int guard_val; > +} vma_map_t; > + > +/* The three functions below are for maintaining and accessing > + * the vma-to-file offset map. > + */ > +vma_map_t * create_vma_map(const struct spu * spu, u64 objectid); > +unsigned int vma_map_lookup(vma_map_t *map, unsigned int vma, > + const struct spu * aSpu); > +void vma_map_free(struct vma_map *map); > + Why would the SPU to cookie translation need to be different than the standard vm one? Is it that spufs takes refs on the pages but doesn't have the standard vma? Maybe an approach of creating them would reuse the oprofile code. > Index: linux-2.6.20-rc1/arch/powerpc/oprofile/cell/spu_profiler.c > =================================================================== > --- /dev/null 1970-01-01 00:00:00.000000000 +0000 > +++ > linux-2.6.20-rc1/arch/powerpc/oprofile/cell/spu_profiler.c 2007-01-29 > 10:32:03.392788696 -0600 ... > +#define TRACE_ARRAY_SIZE 1024 > +static u32 * samples; > +static u32 * samples_per_node; > + > +static int spu_prof_running = 0; > +static unsigned int profiling_interval = 0; > + > +extern int num_nodes; > +extern unsigned int khzfreq; > + > +/* > + * Oprofile setup functions > + */ > + > +#define NUM_SPU_BITS_TRBUF 16 > +#define SPUS_PER_TB_ENTRY 4 > +#define SPUS_PER_NODE 8 > + > +/* > + * Collect the SPU program counter samples from the trace buffer. > + * The global variable usage is as follows: > + * samples[<total-spus>][TRACE_ARRAY_SIZE] - array to store SPU PC > samples > + * Assumption, the array will be all zeros on entry. > + * u32 samples_per_node[num_nodes] - array of how many valid > samples per node > + */ > +static void cell_spu_pc_collection(void) > +{ > + int cpu; > + int node; > + int spu; > + u32 trace_addr; > + /* the trace buffer is 128 bits */ > + u64 trace_buffer[2]; > + u64 spu_pc_lower; > + u64 spu_pc_upper; > + u64 spu_mask; > + int entry, node_factor; > + // process the collected SPU PC for each node > + for_each_online_cpu(cpu) { > + if (cbe_get_hw_thread_id(cpu)) > + continue; > + > + node = cbe_cpu_to_node(cpu); > + node_factor = node * SPUS_PER_NODE; > + /* number of valid entries for this node */ > + entry = 0; > + > + trace_addr = cbe_read_pm(cpu, trace_address); > + while ((trace_addr & CBE_PM_TRACE_BUF_EMPTY) != 0x400) > + { > + /* there is data in the trace buffer to process */ > + cbe_read_trace_buffer(cpu, trace_buffer); > + spu_mask = 0xFFFF000000000000; > + > + /* Each SPU PC is 16 bits; hence, four spus in each of > + * the two 64-bit buffer entries that make up the > + * 128-bit trace_buffer entry. Process the upper and > + * lower 64-bit values simultaneously. > + */ > + for (spu = 0; spu < SPUS_PER_TB_ENTRY; spu++) { > + spu_pc_lower = spu_mask & trace_buffer[0]; > + spu_pc_lower = spu_pc_lower >> (NUM_SPU_BITS_TRBUF > + * (SPUS_PER_TB_ENTRY-spu-1)); > Calculating the shift each time through the loop has to be inefficient. As mentioned by others, I would suggest making this loop split the value into the 4 parts. It would probably be better to shift the raw data to the left each pass, and then always take the top 16 bits and shift them down low. > + > + spu_pc_upper = spu_mask & trace_buffer[1]; > + spu_pc_upper = spu_pc_upper >> (NUM_SPU_BITS_TRBUF > + * (SPUS_PER_TB_ENTRY-spu-1)); > + > + spu_mask = spu_mask >> NUM_SPU_BITS_TRBUF; > + > + /* spu PC trace entry is upper 16 bits of the > + * 18 bit SPU program counter > + */ > + spu_pc_lower = spu_pc_lower << 2; > + spu_pc_upper = spu_pc_upper << 2; > + > + samples[((node_factor + spu) * TRACE_ARRAY_SIZE) + entry] > + = (u32) spu_pc_lower; > + samples[((node_factor + spu + SPUS_PER_TB_ENTRY) * > TRACE_ARRAY_SIZE) + entry] > + = (u32) spu_pc_upper; > + } > + > + entry++; > + > + if (entry >= TRACE_ARRAY_SIZE) > + /* spu_samples is full */ > + break; > + > + trace_addr = cbe_read_pm(cpu, trace_address); > + } looks more like a for loop to me ... its for (a=func; a & bit, a = func) Actually, i'd probably change this to for (entry = 0; entry < trace size; entry++) { a = read_trace_data(x); if (trace_not_valid) break; spilt sample per notes aove; } Rational: the stop for entrys is absolute you are processing data to fill, which eliminates duplicate obtain_data code. Its obvious you should stop if you are full (aside) the timer should be set to avoid full, but delays could cause it to fill. Perhaps if we fill we should slightly reduce the timer period? > + samples_per_node[node] = entry; > + } > +} > + > + > +/* > + * Entry point for SPU profiling. > + * NOTE: SPU profiling is done system-wide, not per-CPU. > + * > + * cycles_reset is the count value specified by the user when > + * setting up OProfile to count SPU_CYCLES. > + */ > +void start_spu_profiling(unsigned int cycles_reset) { > + > + ktime_t kt; > + > + /* To calculate a timeout in nanoseconds, the basic > + * formula is ns = cycles_reset * (NSEC_PER_SEC / cpu frequency). > + * To avoid floating point math, we use the scale math > + * technique as described in linux/jiffies.h. We use > + * a scale factor of SCALE_SHIFT,which provides 4 decimal places > + * of precision, which is close enough for the purpose at hand. > + */ > + > + /* Since cpufreq_quick_get returns frequency in kHz, we use > + * USEC_PER_SEC here vs NSEC_PER_SEC. > + */ > + unsigned long nsPerCyc = (USEC_PER_SEC << SCALE_SHIFT)/khzfreq; > + profiling_interval = (nsPerCyc * cycles_reset) >> SCALE_SHIFT; > + > + pr_debug("timer resolution: %lu\n", > + TICK_NSEC); > + kt = ktime_set(0, profiling_interval); > + hrtimer_init(&timer, CLOCK_MONOTONIC, HRTIMER_REL); > + timer.expires = kt; > + timer.function = profile_spus; > + > + /* Allocate arrays for collecting SPU PC samples */ > + samples = (u32 *) kzalloc(num_nodes * SPUS_PER_NODE * > TRACE_ARRAY_SIZE * sizeof(u32), GFP_ATOMIC); > + samples_per_node = (u32 *) kzalloc(num_nodes * sizeof(u32), > GFP_ATOMIC); > + > + spu_prof_running = 1; > + hrtimer_start(&timer, kt, HRTIMER_REL); > +} So we are doing all this calculations, including adding cpufreq dependencys, just to caculate once how often to dump the trace array once? And then we have all the convert-to-lfsr in the kernel, which as people noticed will run up to 2^24 loops of bit manuplitions as specified by the user .... I would propose instead the user space code passes (1) the requested dump interval and (2) the computed lfsr value to load. The kernel can impose sanity checks on the interrupt rate. The invalid lfsr value is known (0), so that means that that the worse that could happen is (1) user space programs too small of a count, and we don't dump the trace competely. or (2) we dump more often that we need to. If we want to correct or tune, we (1) can be detected and the kernel gradually reduce the poll interval; (2) can be detected by empty trace arrays on timer with no intervening context switch dump; and only results in interrupt overhead wich can be limited by sanity checks (bounds testing). This would also allow a path in the future for userspace to change the requested sample rates .... milton ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-31 9:24 ` Milton Miller @ 2007-01-31 15:42 ` Arnd Bergmann 2007-02-01 18:56 ` Milton Miller 0 siblings, 1 reply; 43+ messages in thread From: Arnd Bergmann @ 2007-01-31 15:42 UTC (permalink / raw) To: cbe-oss-dev; +Cc: linuxppc-dev, Milton Miller, Carl Love On Wednesday 31 January 2007 10:24, Milton Miller wrote: > > +/* The three functions below are for maintaining and accessing > > + * the vma-to-file offset map. > > + */ > > +vma_map_t * create_vma_map(const struct spu * spu, u64 objectid); > > +unsigned int vma_map_lookup(vma_map_t *map, unsigned int vma, > > +=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 =A0 = =A0const struct spu * aSpu); > > +void vma_map_free(struct vma_map *map); > > + >=20 > Why would the SPU to cookie translation need to be different > than the standard vm one? =A0 Is it that spufs takes refs on the > pages but doesn't have the standard vma? =A0 Maybe an approach > of creating them would reuse the oprofile code. It's a two stage process for SPUs: * samples are relative to the local store, and for each sample we need to know what context was running. The context is identified over a pointer to user space effective addresses. * The effective address identifies the SPU ELF binary mapped at that address. It may however be in the middle of a VMA, so you get another offset into the mapped file. =46or each sample, you then get an offset into the ls, an offset into the file to identify the ELF object, and the dcookie for the file containing that object. As a consequence, you only need dcookies for the case where a context switch happens (the executable changes), but not for each of the samples during the a time slice, they all point to the same object. Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-31 15:42 ` Arnd Bergmann @ 2007-02-01 18:56 ` Milton Miller 2007-02-02 0:54 ` Arnd Bergmann 0 siblings, 1 reply; 43+ messages in thread From: Milton Miller @ 2007-02-01 18:56 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linuxppc-dev, cbe-oss-dev, Carl Love On Jan 31, 2007, at 9:42 AM, Arnd Bergmann wrote: > On Wednesday 31 January 2007 10:24, Milton Miller wrote: >>> +/* The three functions below are for maintaining and accessing >>> + * the vma-to-file offset map. >>> + */ >>> +vma_map_t * create_vma_map(const struct spu * spu, u64 objectid); >>> +unsigned int vma_map_lookup(vma_map_t *map, unsigned int vma, >>> +=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 =A0 = =A0const struct spu * aSpu); >>> +void vma_map_free(struct vma_map *map); >>> + >> >> Why would the SPU to cookie translation need to be different >> than the standard vm one? =A0 Is it that spufs takes refs on the >> pages but doesn't have the standard vma? =A0 Maybe an approach >> of creating them would reuse the oprofile code. > > It's a two stage process for SPUs: > > * samples are relative to the local store, and for each sample > we need to know what context was running. The context is > identified over a pointer to user space effective addresses. > Ok I'm with you so far. There is a separate address space, the local store, which you abbreviate ls below. The trace gives this local address. > * The effective address identifies the SPU ELF binary mapped > at that address. It may however be in the middle of a VMA, > so you get another offset into the mapped file. I think this is where we start to diverge. Can you only map 1 linear range of 1 file as the SPU local store? I thought you were mentioning shared libraries, and had mmap, shared mappings of text, etc. > > For each sample, you then get an offset into the ls, an offset > into the file to identify the ELF object, and the dcookie > for the file containing that object You are supplying (1) offset in local store (2) offset from dcookie to local store (? you said ELF object) (3) file containing #2 So there is exactly one backing object for the entire local store, and its mapped linearly? > As a consequence, you only need dcookies for the case where > a context switch happens (the executable changes), but not > for each of the samples during the a time slice, they all > point to the same object. My understanding is the dcookie is supposed to represent a backing file object. The combination (dcookie, offset) should lead to the backing object, which userspace can then disassemble, etc. Therefore, if you allow multiple pieces to be mapped into local store, then you should be reverse translating each ls address into (file,=20 offset) of that file system object. The fact that its contained in a bigger elf file may mean that userspace needs some more info, but it still needs the same info. If you do allow more than 1 backing object, then my suggestion was to use the common code by setting up a fake vm context that has kernel vmas and let the generic code lookup the file from this context. milton= ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-02-01 18:56 ` Milton Miller @ 2007-02-02 0:54 ` Arnd Bergmann 0 siblings, 0 replies; 43+ messages in thread From: Arnd Bergmann @ 2007-02-02 0:54 UTC (permalink / raw) To: Milton Miller; +Cc: linuxppc-dev, cbe-oss-dev, Carl Love On Thursday 01 February 2007 19:56, Milton Miller wrote: > > * The effective address identifies the SPU ELF binary mapped > > =A0 at that address. It may however be in the middle of a VMA, > > =A0 so you get another offset into the mapped file. >=20 > I think this is where we start to diverge. >=20 > Can you only map 1 linear range of 1 file as the SPU local store? >=20 > I thought you were mentioning shared libraries, and had mmap, > shared mappings of text, etc. There are two address spaces in the SPU. Code and local variables are all in the local store (256kb), so the samples that oprofile takes come from there.=20 The other address space is the DMA space, which is defined by the mm_struct of the process, and established on the PPU (the linux program). Shared libraries and mappings are all in this space. To load an SPU program, the SPU ELF file is mapped into the process address space (DMA space) and the ELF loader transfers sections from it to the local store of the SPU. Consequently, there is a unique translation from every local store address to a file offset in the original ELF file, but there can be multiple linear ranges. > > For each sample, you then get an offset into the ls, an offset > > into the file to identify the ELF object, and the dcookie > > for the file containing that object >=20 > You are supplying > =A0 =A0 (1) offset in local store > =A0 =A0 (2) offset from dcookie to local store (? =A0 you said ELF object) > =A0 =A0 (3) file containing #2 >=20 > So there is exactly one backing object for the entire local > store, and its mapped linearly? 1 object, but not linearly. To make things worse, there can also be overlays (think turbo pascal on MS-DOS), so the object can be larger than the actual local store, and parts of it get loaded on demand. Also, a file can contain multiple SPU ELF binaries, we have the embedspu tool that encapsulates a statically linked SPU program into a powerpc object containing a single symbol. You can then link multiple such objects into a shared library or your main powerpc-side application. > > As a consequence, you only need dcookies for the case where > > a context switch happens (the executable changes), but not > > for each of the samples during the a time slice, they all > > point to the same object. >=20 >=20 > My understanding is the dcookie is supposed to represent a backing > file object. =A0The combination (dcookie, offset) should lead to > the backing object, which userspace can then disassemble, etc. The (dcookie, offset) tuple first identifies the SPU ELF binary, offset here being the file offset where the actual ELF image starts. You need the second offset to identify the pointer in there. > Therefore, if you allow multiple pieces to be mapped into local store, > then you should be reverse translating each ls address into (file,=20 > offset) > of that file system object. =A0 The fact that its contained in a bigger > elf file may mean that userspace needs some more info, but it still > needs the same info. At the minimum, userspaces needs information like * spu3 is <dcookie=3D87656323, offset=3D4567> /* first program gets loaded = */ * sample at <spu=3D3, offset=3D1234> /* samples */ * sample at <spu=3D3, offset=3D1248> * sample at <spu=3D3, offset=3D160c> * spu3 is <dcookie=3D34563287, offset=3D5476> /* context switch to other */ * sample at <spu=3D3, offset=3D5a32> /* samples in new program */ * sample at <spu=3D3, offset=3D7231> The same can be expressed as * sample at <dcookie=3D87656323, foffset=3D4567, offset=3D1234> /* samples = */ * sample at <dcookie=3D87656323, foffset=3D4567, offset=3D1248> * sample at <dcookie=3D87656323, foffset=3D4567, offset=3D160c> * sample at <dcookie=3D34563287, foffset=3D5476, offset=3D5a32> /* samples = in new program */ * sample at <dcookie=3D34563287, foffset=3D5476, offset=3D7231> And potentially other information per sample. > If you do allow more than 1 backing object, then my suggestion was > to use the common code by setting up a fake vm context that has > kernel vmas and let the generic code lookup the file from this context. The problem of more than one backing object is only if you look the samples being collected per physical SPU, because that means it will see context switches. If the samples get collected per context, there is only one backing object. Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC, PATCH 0/4] Add support to OProfile for profiling Cell BE SPUs -- update 2007-01-29 19:45 [RFC, PATCH 0/4] Add support to OProfile for profiling Cell BE SPUs -- update Maynard Johnson ` (3 preceding siblings ...) 2007-01-29 19:48 ` [RFC, PATCH 4/4] " Maynard Johnson @ 2007-01-30 8:37 ` Arnd Bergmann 4 siblings, 0 replies; 43+ messages in thread From: Arnd Bergmann @ 2007-01-30 8:37 UTC (permalink / raw) To: linuxppc-dev, maynardj; +Cc: cbe-oss-dev, oprofile-list, linux-kernel On Monday 29 January 2007 20:45, Maynard Johnson wrote: > On December 14, 2006, I posted a patch that added support to the=20 > OProfile kernel driver for profiling Cell SPUs. =A0There have been some=20 > changes/fixes to this patch since the original posting (including=20 > forward porting from 2.6.18-based kernel to 2.6.20-rc1), so I am=20 > reposting the patch for review now. =A0This patch relies upon the=20 > following patches that have not been accepted yet: > =A0 =A01. oprofile cleanup patch (submitted on Nov 27) > =A0 =A02. Fix for PPU profiling (not submitted yet, since it depends on #= 1) > =A0 =A03. SPU task notification patch (last submitted on Jan 26) >=20 Sorry for taking so much time before reviewing this. I knew it would be a long patch that requires time to go through in detail (though certainly not nearly as much time as it took you to write it), so I kept procrastinating. Arnd <>< ^ permalink raw reply [flat|nested] 43+ messages in thread
end of thread, other threads:[~2007-02-04 17:33 UTC | newest] Thread overview: 43+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-01-29 19:45 [RFC, PATCH 0/4] Add support to OProfile for profiling Cell BE SPUs -- update Maynard Johnson 2007-01-29 19:46 ` [RFC, PATCH 1/4] " Maynard Johnson 2007-01-30 4:07 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-30 10:39 ` Christoph Hellwig 2007-01-30 22:49 ` Carl Love 2007-01-30 22:57 ` Benjamin Herrenschmidt 2007-01-31 8:47 ` Christoph Hellwig 2007-01-30 22:59 ` Benjamin Herrenschmidt 2007-01-29 19:47 ` [RFC, PATCH 2/4] " Maynard Johnson 2007-01-30 4:08 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-30 23:51 ` Carl Love 2007-01-29 19:48 ` [RFC, PATCH 3/4] " Maynard Johnson 2007-01-30 4:24 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-30 15:31 ` Maynard Johnson 2007-01-31 0:35 ` Arnd Bergmann 2007-01-29 19:48 ` [RFC, PATCH 4/4] " Maynard Johnson 2007-01-30 7:39 ` [Cbe-oss-dev] " Arnd Bergmann 2007-01-30 7:53 ` Benjamin Herrenschmidt 2007-01-30 10:41 ` Christoph Hellwig 2007-01-30 23:09 ` Maynard Johnson 2007-01-30 21:41 ` Maynard Johnson 2007-01-30 22:54 ` Maynard Johnson 2007-01-30 23:34 ` Benjamin Herrenschmidt 2007-01-31 0:29 ` Maynard Johnson 2007-01-31 6:52 ` Arnd Bergmann 2007-02-02 16:47 ` Maynard Johnson 2007-02-03 7:40 ` Arnd Bergmann 2007-02-03 20:03 ` Maynard Johnson 2007-02-04 2:42 ` Arnd Bergmann 2007-02-04 17:11 ` Maynard Johnson 2007-01-30 23:31 ` Carl Love 2007-01-31 1:25 ` Christian Krafft 2007-01-31 6:06 ` Arnd Bergmann 2007-01-31 5:57 ` Arnd Bergmann 2007-02-02 19:27 ` Maynard Johnson 2007-02-03 23:49 ` Maynard Johnson 2007-02-04 2:52 ` Arnd Bergmann 2007-02-04 17:33 ` Maynard Johnson 2007-01-31 9:24 ` Milton Miller 2007-01-31 15:42 ` Arnd Bergmann 2007-02-01 18:56 ` Milton Miller 2007-02-02 0:54 ` Arnd Bergmann 2007-01-30 8:37 ` [RFC, PATCH 0/4] " Arnd Bergmann
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).