From: Robert Richter <robert.richter@amd.com>
To: Carl Love <cel@us.ibm.com>
Cc: Barry Kasindorf <barry.kasindorf@amd.com>,
Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
oprofile-list <oprofile-list@lists.sourceforge.net>
Subject: Re: [PATCH 10/24] x86/oprofile: Add IBS support for AMD CPUs, IBS buffer handling routines
Date: Wed, 23 Jul 2008 22:19:25 +0200 [thread overview]
Message-ID: <20080723201924.GJ17134@erda.amd.com> (raw)
In-Reply-To: <1216843317.26829.58.camel@carll-linux-desktop>
On 23.07.08 13:01:57, Carl Love wrote:
>
> On Tue, 2008-07-22 at 21:08 +0200, Robert Richter wrote:
> > From: Barry Kasindorf <barry.kasindorf@amd.com>
> >
> > This patchset supports the new profiling hardware available in the
> > latest AMD CPUs in the oProfile driver.
> >
> > Signed-off-by: Barry Kasindorf <barry.kasindorf@amd.com>
> > Signed-off-by: Robert Richter <robert.richter@amd.com>
> > ---
> > drivers/oprofile/buffer_sync.c | 72 +++++++++++++++++++++++++++++++++++++++-
> > drivers/oprofile/cpu_buffer.c | 68 +++++++++++++++++++++++++++++++++++++-
> > drivers/oprofile/cpu_buffer.h | 2 +
> > 3 files changed, 140 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
> > index 615929f..e1782d2 100644
> > --- a/drivers/oprofile/buffer_sync.c
> > +++ b/drivers/oprofile/buffer_sync.c
> > @@ -5,6 +5,7 @@
> > * @remark Read the file COPYING
> > *
> > * @author John Levon <levon@movementarian.org>
> > + * @author Barry Kasindorf
> > *
> > * This is the core of the buffer management. Each
> > * CPU buffer is processed and entered into the
> > @@ -272,7 +273,7 @@ static void increment_tail(struct oprofile_cpu_buffer *b)
> > {
> > unsigned long new_tail = b->tail_pos + 1;
> >
> > - rmb();
> > + rmb(); /* be sure fifo pointers are synchromized */
> >
> > if (new_tail < b->buffer_size)
> > b->tail_pos = new_tail;
> > @@ -327,6 +328,67 @@ static void add_trace_begin(void)
> > add_event_entry(TRACE_BEGIN_CODE);
> > }
> >
> > +#define IBS_FETCH_CODE_SIZE 2
> > +#define IBS_OP_CODE_SIZE 5
> > +#define IBS_EIP(offset) \
> > + (((struct op_sample *)&cpu_buf->buffer[(offset)])->eip)
> > +#define IBS_EVENT(offset) \
> > + (((struct op_sample *)&cpu_buf->buffer[(offset)])->event)
> > +
> > +/*
> > + * Add IBS fetch and op entries to event buffer
> > + */
> > +static void add_ibs_begin(struct oprofile_cpu_buffer *cpu_buf, int code,
> > + int in_kernel, struct mm_struct *mm)
> > +{
> > + unsigned long rip;
> > + int i, count;
> > + unsigned long ibs_cookie = 0;
> > + off_t offset;
> > +
> > + increment_tail(cpu_buf); /* move to RIP entry */
> > +
> > + rip = IBS_EIP(cpu_buf->tail_pos);
> > +
> > +#ifdef __LP64__
> > + rip += IBS_EVENT(cpu_buf->tail_pos) << 32;
> > +#endif
> > +
> > + if (mm) {
> > + ibs_cookie = lookup_dcookie(mm, rip, &offset);
> > +
> > + if (ibs_cookie == NO_COOKIE)
> > + offset = rip;
> > + if (ibs_cookie == INVALID_COOKIE) {
> > + atomic_inc(&oprofile_stats.sample_lost_no_mapping);
> > + offset = rip;
> > + }
> > + if (ibs_cookie != last_cookie) {
> > + add_cookie_switch(ibs_cookie);
> > + last_cookie = ibs_cookie;
> > + }
> > + } else
> > + offset = rip;
> > +
> > + add_event_entry(ESCAPE_CODE);
> > + add_event_entry(code);
> > + add_event_entry(offset); /* Offset from Dcookie */
> > +
> > + /* we send the Dcookie offset, but send the raw Linear Add also*/
> > + add_event_entry(IBS_EIP(cpu_buf->tail_pos));
> > + add_event_entry(IBS_EVENT(cpu_buf->tail_pos));
> > +
> > + if (code == IBS_FETCH_CODE)
> > + count = IBS_FETCH_CODE_SIZE; /*IBS FETCH is 2 int64s*/
> > + else
> > + count = IBS_OP_CODE_SIZE; /*IBS OP is 5 int64s*/
> > +
> > + for (i = 0; i < count; i++) {
> > + increment_tail(cpu_buf);
> > + add_event_entry(IBS_EIP(cpu_buf->tail_pos));
> > + add_event_entry(IBS_EVENT(cpu_buf->tail_pos));
> > + }
> > +}
>
> In general, I think it would be good to make the IBS stuff as arch
> independent as possible. Could you move the above function to the arch
> specific code. Here a generic function pointer could be exported which
> would be assigned to the arch specific routine. This would enable
> another architecture to reuse this code.
Yes, this will change too (see my mail to Maynard). No IBS outside of
op_model_amd.c. The solution is copy the samples directly in
sync_buffer() until a new escape code is received. Internals of the
samples (type, size, etc.) are not needed in this case. The address
convertion will use existion functions as well.
-Robert
>
> >
> > static void add_sample_entry(unsigned long offset, unsigned long event)
> > {
> > @@ -524,6 +586,14 @@ void sync_buffer(int cpu)
> > } else if (s->event == CPU_TRACE_BEGIN) {
> > state = sb_bt_start;
> > add_trace_begin();
> > + } else if (s->event == IBS_FETCH_BEGIN) {
> > + state = sb_bt_start;
> > + add_ibs_begin(cpu_buf,
> > + IBS_FETCH_CODE, in_kernel, mm);
> > + } else if (s->event == IBS_OP_BEGIN) {
> > + state = sb_bt_start;
> > + add_ibs_begin(cpu_buf,
> > + IBS_OP_CODE, in_kernel, mm);
> > } else {
> > struct mm_struct *oldmm = mm;
> >
>
> If you made the log_ibs_sample() more generic, as discussed below, then
> the #defines could be changed to generic names and thus allow other
> architectures to leverage the same code.
>
>
> > diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c
> > index 2450b3a..c9ac4e1 100644
> > --- a/drivers/oprofile/cpu_buffer.c
> > +++ b/drivers/oprofile/cpu_buffer.c
> > @@ -5,6 +5,7 @@
> > * @remark Read the file COPYING
> > *
> > * @author John Levon <levon@movementarian.org>
> > + * @author Barry Kasindorf <barry.kasindorf@amd.com>
> > *
> > * Each CPU has a local buffer that stores PC value/event
> > * pairs. We also log context switches when we notice them.
> > @@ -207,7 +208,7 @@ static int log_sample(struct oprofile_cpu_buffer * cpu_buf, unsigned long pc,
> > return 1;
> > }
> >
> > -static int oprofile_begin_trace(struct oprofile_cpu_buffer * cpu_buf)
> > +static int oprofile_begin_trace(struct oprofile_cpu_buffer *cpu_buf)
> > {
> > if (nr_available_slots(cpu_buf) < 4) {
> > cpu_buf->sample_lost_overflow++;
> > @@ -252,6 +253,71 @@ void oprofile_add_sample(struct pt_regs * const regs, unsigned long event)
> > oprofile_add_ext_sample(pc, regs, event, is_kernel);
> > }
> >
> > +#define MAX_IBS_SAMPLE_SIZE 14
> > +static int log_ibs_sample(struct oprofile_cpu_buffer *cpu_buf,
> > + unsigned long pc, int is_kernel, unsigned int *ibs, int ibs_code)
> > +{
> > + struct task_struct *task;
> > +
> > + cpu_buf->sample_received++;
> > +
> > + if (nr_available_slots(cpu_buf) < MAX_IBS_SAMPLE_SIZE) {
> > + cpu_buf->sample_lost_overflow++;
> > + return 0;
> > + }
> > +
> > + is_kernel = !!is_kernel;
> > +
> > + /* notice a switch from user->kernel or vice versa */
> > + if (cpu_buf->last_is_kernel != is_kernel) {
> > + cpu_buf->last_is_kernel = is_kernel;
> > + add_code(cpu_buf, is_kernel);
> > + }
> > +
> > + /* notice a task switch */
> > + if (!is_kernel) {
> > + task = current;
> > +
> > + if (cpu_buf->last_task != task) {
> > + cpu_buf->last_task = task;
> > + add_code(cpu_buf, (unsigned long)task);
> > + }
> > + }
> > +
> > + add_code(cpu_buf, ibs_code);
> > + add_sample(cpu_buf, ibs[0], ibs[1]);
> > + add_sample(cpu_buf, ibs[2], ibs[3]);
> > + add_sample(cpu_buf, ibs[4], ibs[5]);
> > +
> > + if (ibs_code == IBS_OP_BEGIN) {
> > + add_sample(cpu_buf, ibs[6], ibs[7]);
> > + add_sample(cpu_buf, ibs[8], ibs[9]);
> > + add_sample(cpu_buf, ibs[10], ibs[11]);
> > + }
> > +
> > + return 1;
> > +}
> > +
>
> Can we make this function a bit more general? Specifically, suppose
>
> static int log_generic_sample (struct oprofile_cpu_buffer *cpu_buf,
> unsigned int *array, int num_entries)
>
> If we just passed in an array of data that is ready to just be copied
> directly to
> the kernel buffer in a for loop:
>
> for (i=0; i<num_entries; i=i+2)
>
> add_sample(cpu_buf, array[i], array[i+1]);
>
> The arch specific code would have to do the if(ibs_code == IBS_OP_BEGIN)
> code to either add the stuff into the input array or not. The
> appropriate number of entries would be passed.
>
> The if (cpu_buf->last_is_kernel != is_kernel) statement might be hard
> to do from the architecture code and might have to stay in the
> log_generic_sample. It would be good if we could find a way to also move
> this line of code to the arch specific code to setup the generic array
> of data to pass to log_generic_sample(). Maybe the arch specific code
> could have an array of last_is_kernel to refer to when preparing the
> data for the log_generic_sample() call.
>
> By making this more generic and flexible in the number of entries and
> pushing the logic of what to put into the array into the arch specific
> code, it could be used by other architectures. I had something along
> the lines of the above log_generic_sample() in a version of a patch for
> the CELL architecture. On CELL, I have to take samples from the various
> SPUs and push them into the kernel buffer. I had tried to do it using
> the per CPU buffers. I ended up dropping the approach for other
> reasons. The point is I can see where something a little more
> generic/flexible could be used by other architectures.
>
> > +void oprofile_add_ibs_sample(struct pt_regs *const regs,
> > + unsigned int * const ibs_sample, u8 code)
> > +{
> > + int is_kernel = !user_mode(regs);
> > + unsigned long pc = profile_pc(regs);
> > +
> > + struct oprofile_cpu_buffer *cpu_buf =
> > + &per_cpu(cpu_buffer, smp_processor_id());
> > +
> > + if (!backtrace_depth) {
> > + log_ibs_sample(cpu_buf, pc, is_kernel, ibs_sample, code);
> > + return;
> > + }
> > +
> > + /* if log_sample() fails we can't backtrace since we lost the source
> > + * of this event */
> > + if (log_ibs_sample(cpu_buf, pc, is_kernel, ibs_sample, code))
> > + oprofile_ops.backtrace(regs, backtrace_depth);
> > +}
> > +
> > void oprofile_add_pc(unsigned long pc, int is_kernel, unsigned long event)
> > {
> > struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(cpu_buffer);
> > diff --git a/drivers/oprofile/cpu_buffer.h b/drivers/oprofile/cpu_buffer.h
> > index c3e366b..9c44d00 100644
> > --- a/drivers/oprofile/cpu_buffer.h
> > +++ b/drivers/oprofile/cpu_buffer.h
> > @@ -55,5 +55,7 @@ void cpu_buffer_reset(struct oprofile_cpu_buffer * cpu_buf);
> > /* transient events for the CPU buffer -> event buffer */
> > #define CPU_IS_KERNEL 1
> > #define CPU_TRACE_BEGIN 2
> > +#define IBS_FETCH_BEGIN 3
> > +#define IBS_OP_BEGIN 4
> >
> > #endif /* OPROFILE_CPU_BUFFER_H */
>
>
--
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com
next prev parent reply other threads:[~2008-07-23 20:20 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-22 19:08 [PATCH 0/24] oprofile: Add IBS support for AMD CPUs Robert Richter
2008-07-22 19:08 ` [PATCH 01/24] x86: Add PCI IDs for AMD Barcelona PCI devices Robert Richter
2008-07-22 19:08 ` [PATCH 02/24] x86: apic_*.c: Add description to AMD's extended LVT functions Robert Richter
2008-07-22 19:08 ` [PATCH 03/24] oprofile: Add support for AMD Family 11h Robert Richter
2008-07-26 17:32 ` Daniel K.
2008-07-28 15:35 ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 04/24] x86/oprofile: Introduce model specific init/exit functions Robert Richter
2008-07-22 19:08 ` [PATCH 05/24] x86/oprofile: Minor changes in op_model_athlon.c Robert Richter
2008-07-22 19:08 ` [PATCH 06/24] x86/oprofile: Renaming athlon_*() into op_amd_*() Robert Richter
2008-07-26 9:55 ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 07/24] drivers/oprofile: Coding style fixes in buffer_sync.c Robert Richter
2008-07-22 19:08 ` [PATCH 08/24] OProfile: Moving increment_tail() " Robert Richter
2008-07-26 9:56 ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 09/24] OProfile: Add IBS code macros Robert Richter
2008-07-22 19:08 ` [PATCH 10/24] x86/oprofile: Add IBS support for AMD CPUs, IBS buffer handling routines Robert Richter
2008-07-23 19:20 ` Maynard Johnson
2008-07-23 19:46 ` Robert Richter
2008-07-23 20:01 ` Carl Love
2008-07-23 20:19 ` Robert Richter [this message]
2008-07-26 9:58 ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 11/24] x86/oprofile: Add IBS support for AMD CPUs, model specific code Robert Richter
2008-07-24 14:15 ` Maynard Johnson
2008-07-24 14:36 ` Robert Richter
2008-07-24 14:39 ` Robert Richter
2008-07-26 10:03 ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 12/24] x86/oprofile: Separating the IBS handler Robert Richter
2008-07-22 19:08 ` [PATCH 13/24] OProfile: Change IBS interrupt initialization Robert Richter
2008-07-26 10:05 ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 14/24] OProfile: Fix build error in op_model_athlon.c Robert Richter
2008-07-26 10:05 ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 15/24] OProfile: on_each_cpu(): kill unused retry parameter Robert Richter
2008-07-22 19:09 ` [PATCH 16/24] OProfile: Fix setup_ibs_files() function interface Robert Richter
2008-07-22 19:09 ` [PATCH 17/24] OProfile: Enable IBS for AMD CPUs Robert Richter
2008-07-26 10:09 ` Ingo Molnar
2008-07-22 19:09 ` [PATCH 18/24] OProfile: Fix IBS build error for UP Robert Richter
2008-07-22 19:09 ` [PATCH 19/24] x86/oprofile: Macro definition cleanup in op_model_athlon.c Robert Richter
2008-07-22 19:09 ` [PATCH 20/24] x86/oprofile: op_model_athlon.c: Fix counter reset when reenabling IBS OP Robert Richter
2008-07-22 19:09 ` [PATCH 21/24] x86: apic: Export symbols for extended interrupt LVT functions Robert Richter
2008-07-22 19:53 ` Arjan van de Ven
2008-07-23 13:28 ` [PATCH] x86: apic: Changing export symbols to *_GPL Robert Richter
2008-07-23 19:29 ` linux-os (Dick Johnson)
2008-07-23 20:01 ` Robert Richter
2008-07-24 8:16 ` Stefan Richter
2008-07-22 19:09 ` [PATCH 22/24] x86/oprofile: Add CONFIG_OPROFILE_IBS option Robert Richter
2008-07-26 10:15 ` Ingo Molnar
2008-07-22 19:09 ` [PATCH 23/24] oprofile: Fix printk in cpu_buffer.c Robert Richter
2008-07-22 19:09 ` [PATCH 24/24] x86/oprofile: Reanaming op_model_athlon.c to op_model_amd.c Robert Richter
2008-07-26 10:17 ` Ingo Molnar
2008-07-23 12:24 ` [PATCH 0/24] oprofile: Add IBS support for AMD CPUs Maynard Johnson
2008-07-26 9:52 ` Ingo Molnar
2008-07-28 14:02 ` Robert Richter
2008-07-31 10:32 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080723201924.GJ17134@erda.amd.com \
--to=robert.richter@amd.com \
--cc=barry.kasindorf@amd.com \
--cc=cel@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=oprofile-list@lists.sourceforge.net \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.