public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Robert Richter <robert.richter@amd.com>
To: Carl Love <cel@us.ibm.com>
Cc: Barry Kasindorf <barry.kasindorf@amd.com>,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	oprofile-list <oprofile-list@lists.sourceforge.net>
Subject: Re: [PATCH 10/24] x86/oprofile: Add IBS support for AMD CPUs, IBS buffer handling routines
Date: Wed, 23 Jul 2008 22:19:25 +0200	[thread overview]
Message-ID: <20080723201924.GJ17134@erda.amd.com> (raw)
In-Reply-To: <1216843317.26829.58.camel@carll-linux-desktop>

On 23.07.08 13:01:57, Carl Love wrote:
> 
> On Tue, 2008-07-22 at 21:08 +0200, Robert Richter wrote:
> > From: Barry Kasindorf <barry.kasindorf@amd.com>
> > 
> > This patchset supports the new profiling hardware available in the
> > latest AMD CPUs in the oProfile driver.
> > 
> > Signed-off-by: Barry Kasindorf <barry.kasindorf@amd.com>
> > Signed-off-by: Robert Richter <robert.richter@amd.com>
> > ---
> >  drivers/oprofile/buffer_sync.c |   72 +++++++++++++++++++++++++++++++++++++++-
> >  drivers/oprofile/cpu_buffer.c  |   68 +++++++++++++++++++++++++++++++++++++-
> >  drivers/oprofile/cpu_buffer.h  |    2 +
> >  3 files changed, 140 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
> > index 615929f..e1782d2 100644
> > --- a/drivers/oprofile/buffer_sync.c
> > +++ b/drivers/oprofile/buffer_sync.c
> > @@ -5,6 +5,7 @@
> >   * @remark Read the file COPYING
> >   *
> >   * @author John Levon <levon@movementarian.org>
> > + * @author Barry Kasindorf
> >   *
> >   * This is the core of the buffer management. Each
> >   * CPU buffer is processed and entered into the
> > @@ -272,7 +273,7 @@ static void increment_tail(struct oprofile_cpu_buffer *b)
> >  {
> >  	unsigned long new_tail = b->tail_pos + 1;
> > 
> > -	rmb();
> > +	rmb();	/* be sure fifo pointers are synchromized */
> > 
> >  	if (new_tail < b->buffer_size)
> >  		b->tail_pos = new_tail;
> > @@ -327,6 +328,67 @@ static void add_trace_begin(void)
> >  	add_event_entry(TRACE_BEGIN_CODE);
> >  }
> > 
> > +#define IBS_FETCH_CODE_SIZE	2
> > +#define IBS_OP_CODE_SIZE	5
> > +#define IBS_EIP(offset)				\
> > +	(((struct op_sample *)&cpu_buf->buffer[(offset)])->eip)
> > +#define IBS_EVENT(offset)				\
> > +	(((struct op_sample *)&cpu_buf->buffer[(offset)])->event)
> > +
> > +/*
> > + * Add IBS fetch and op entries to event buffer
> > + */
> > +static void add_ibs_begin(struct oprofile_cpu_buffer *cpu_buf, int code,
> > +	int in_kernel, struct mm_struct *mm)
> > +{
> > +	unsigned long rip;
> > +	int i, count;
> > +	unsigned long ibs_cookie = 0;
> > +	off_t offset;
> > +
> > +	increment_tail(cpu_buf);	/* move to RIP entry */
> > +
> > +	rip = IBS_EIP(cpu_buf->tail_pos);
> > +
> > +#ifdef __LP64__
> > +	rip += IBS_EVENT(cpu_buf->tail_pos) << 32;
> > +#endif
> > +
> > +	if (mm) {
> > +		ibs_cookie = lookup_dcookie(mm, rip, &offset);
> > +
> > +		if (ibs_cookie == NO_COOKIE)
> > +			offset = rip;
> > +		if (ibs_cookie == INVALID_COOKIE) {
> > +			atomic_inc(&oprofile_stats.sample_lost_no_mapping);
> > +			offset = rip;
> > +		}
> > +		if (ibs_cookie != last_cookie) {
> > +			add_cookie_switch(ibs_cookie);
> > +			last_cookie = ibs_cookie;
> > +		}
> > +	} else
> > +		offset = rip;
> > +
> > +	add_event_entry(ESCAPE_CODE);
> > +	add_event_entry(code);
> > +	add_event_entry(offset);	/* Offset from Dcookie */
> > +
> > +	/* we send the Dcookie offset, but send the raw Linear Add also*/
> > +	add_event_entry(IBS_EIP(cpu_buf->tail_pos));
> > +	add_event_entry(IBS_EVENT(cpu_buf->tail_pos));
> > +
> > +	if (code == IBS_FETCH_CODE)
> > +		count = IBS_FETCH_CODE_SIZE;	/*IBS FETCH is 2 int64s*/
> > +	else
> > +		count = IBS_OP_CODE_SIZE;	/*IBS OP is 5 int64s*/
> > +
> > +	for (i = 0; i < count; i++) {
> > +		increment_tail(cpu_buf);
> > +		add_event_entry(IBS_EIP(cpu_buf->tail_pos));
> > +		add_event_entry(IBS_EVENT(cpu_buf->tail_pos));
> > +	}
> > +}
> 
> In general, I think it would be good to make the IBS stuff as arch
> independent as possible.  Could you move the above function to the arch
> specific code.  Here a generic function pointer could be exported which
> would be assigned to the arch specific routine.  This would enable
> another architecture to reuse this code.  

Yes, this will change too (see my mail to Maynard). No IBS outside of
op_model_amd.c. The solution is copy the samples directly in
sync_buffer() until a new escape code is received. Internals of the
samples (type, size, etc.) are not needed in this case. The address
convertion will use existion functions as well.

-Robert

> 
> > 
> >  static void add_sample_entry(unsigned long offset, unsigned long event)
> >  {
> > @@ -524,6 +586,14 @@ void sync_buffer(int cpu)
> >  			} else if (s->event == CPU_TRACE_BEGIN) {
> >  				state = sb_bt_start;
> >  				add_trace_begin();
> > +			} else if (s->event == IBS_FETCH_BEGIN) {
> > +				state = sb_bt_start;
> > +				add_ibs_begin(cpu_buf,
> > +					IBS_FETCH_CODE, in_kernel, mm);
> > +			} else if (s->event == IBS_OP_BEGIN) {
> > +				state = sb_bt_start;
> > +				add_ibs_begin(cpu_buf,
> > +					IBS_OP_CODE, in_kernel, mm);
> >  			} else {
> >  				struct mm_struct *oldmm = mm;
> > 
> 
> If you made the log_ibs_sample() more generic, as discussed below, then
> the #defines could be changed to generic names and thus allow other
> architectures to leverage the same code.  
> 
> 
> > diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c
> > index 2450b3a..c9ac4e1 100644
> > --- a/drivers/oprofile/cpu_buffer.c
> > +++ b/drivers/oprofile/cpu_buffer.c
> > @@ -5,6 +5,7 @@
> >   * @remark Read the file COPYING
> >   *
> >   * @author John Levon <levon@movementarian.org>
> > + * @author Barry Kasindorf <barry.kasindorf@amd.com>
> >   *
> >   * Each CPU has a local buffer that stores PC value/event
> >   * pairs. We also log context switches when we notice them.
> > @@ -207,7 +208,7 @@ static int log_sample(struct oprofile_cpu_buffer * cpu_buf, unsigned long pc,
> >  	return 1;
> >  }
> > 
> > -static int oprofile_begin_trace(struct oprofile_cpu_buffer * cpu_buf)
> > +static int oprofile_begin_trace(struct oprofile_cpu_buffer *cpu_buf)
> >  {
> >  	if (nr_available_slots(cpu_buf) < 4) {
> >  		cpu_buf->sample_lost_overflow++;
> > @@ -252,6 +253,71 @@ void oprofile_add_sample(struct pt_regs * const regs, unsigned long event)
> >  	oprofile_add_ext_sample(pc, regs, event, is_kernel);
> >  }
> > 
> > +#define MAX_IBS_SAMPLE_SIZE	14
> > +static int log_ibs_sample(struct oprofile_cpu_buffer *cpu_buf,
> > +	unsigned long pc, int is_kernel, unsigned  int *ibs, int ibs_code)
> > +{
> > +	struct task_struct *task;
> > +
> > +	cpu_buf->sample_received++;
> > +
> > +	if (nr_available_slots(cpu_buf) < MAX_IBS_SAMPLE_SIZE) {
> > +		cpu_buf->sample_lost_overflow++;
> > +		return 0;
> > +	}
> > +
> > +	is_kernel = !!is_kernel;
> > +
> > +	/* notice a switch from user->kernel or vice versa */
> > +	if (cpu_buf->last_is_kernel != is_kernel) {
> > +		cpu_buf->last_is_kernel = is_kernel;
> > +		add_code(cpu_buf, is_kernel);
> > +	}
> > +
> > +	/* notice a task switch */
> > +	if (!is_kernel) {
> > +		task = current;
> > +
> > +		if (cpu_buf->last_task != task) {
> > +			cpu_buf->last_task = task;
> > +			add_code(cpu_buf, (unsigned long)task);
> > +		}
> > +	}
> > +
> > +	add_code(cpu_buf, ibs_code);
> > +	add_sample(cpu_buf, ibs[0], ibs[1]);
> > +	add_sample(cpu_buf, ibs[2], ibs[3]);
> > +	add_sample(cpu_buf, ibs[4], ibs[5]);
> > +
> > +	if (ibs_code == IBS_OP_BEGIN) {
> > +	add_sample(cpu_buf, ibs[6], ibs[7]);
> > +	add_sample(cpu_buf, ibs[8], ibs[9]);
> > +	add_sample(cpu_buf, ibs[10], ibs[11]);
> > +	}
> > +
> > +	return 1;
> > +}
> > +
> 
> Can we make this function a bit more general?  Specifically, suppose 
> 
> static int log_generic_sample (struct oprofile_cpu_buffer *cpu_buf, 
>                                  unsigned int *array, int num_entries) 
> 
> If we just passed in an array of data that is ready to just be copied
> directly to 
> the kernel buffer in a for loop: 
> 
> for (i=0; i<num_entries; i=i+2) 
> 
>         add_sample(cpu_buf, array[i], array[i+1]); 
> 
> The arch specific code would have to do the if(ibs_code == IBS_OP_BEGIN)
> code to either add the stuff into the input array or not.  The
> appropriate number of entries would be passed.   
> 
> The  if (cpu_buf->last_is_kernel != is_kernel) statement might be hard
> to do from the architecture code and might have to stay in the
> log_generic_sample. It would be good if we could find a way to also move
> this line of code to the arch specific code to setup the generic array
> of data to pass to log_generic_sample(). Maybe the arch specific code
> could have an array of last_is_kernel to refer to when preparing the
> data for the log_generic_sample() call.
> 
> By making this more generic and flexible in the number of entries and
> pushing the logic of what to put into the array into the arch specific
> code, it could be used by other architectures.  I had something along
> the lines of the above log_generic_sample() in a version of a patch for
> the CELL architecture.  On CELL, I have to take samples from the various
> SPUs and push them into the kernel buffer.  I had tried to do it using
> the per CPU buffers.  I ended up dropping the approach for other
> reasons.  The point is I can see where something a little more
> generic/flexible could be used by other architectures.
> 
> > +void oprofile_add_ibs_sample(struct pt_regs *const regs,
> > +				unsigned int * const ibs_sample, u8 code)
> > +{
> > +	int is_kernel = !user_mode(regs);
> > +	unsigned long pc = profile_pc(regs);
> > +
> > +	struct oprofile_cpu_buffer *cpu_buf =
> > +			 &per_cpu(cpu_buffer, smp_processor_id());
> > +
> > +	if (!backtrace_depth) {
> > +		log_ibs_sample(cpu_buf, pc, is_kernel, ibs_sample, code);
> > +		return;
> > +	}
> > +
> > +	/* if log_sample() fails we can't backtrace since we lost the source
> > +	* of this event */
> > +	if (log_ibs_sample(cpu_buf, pc, is_kernel, ibs_sample, code))
> > +		oprofile_ops.backtrace(regs, backtrace_depth);
> > +}
> > +
> >  void oprofile_add_pc(unsigned long pc, int is_kernel, unsigned long event)
> >  {
> >  	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(cpu_buffer);
> > diff --git a/drivers/oprofile/cpu_buffer.h b/drivers/oprofile/cpu_buffer.h
> > index c3e366b..9c44d00 100644
> > --- a/drivers/oprofile/cpu_buffer.h
> > +++ b/drivers/oprofile/cpu_buffer.h
> > @@ -55,5 +55,7 @@ void cpu_buffer_reset(struct oprofile_cpu_buffer * cpu_buf);
> >  /* transient events for the CPU buffer -> event buffer */
> >  #define CPU_IS_KERNEL 1
> >  #define CPU_TRACE_BEGIN 2
> > +#define IBS_FETCH_BEGIN 3
> > +#define IBS_OP_BEGIN    4
> > 
> >  #endif /* OPROFILE_CPU_BUFFER_H */
> 
> 

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com


  reply	other threads:[~2008-07-23 20:20 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-22 19:08 [PATCH 0/24] oprofile: Add IBS support for AMD CPUs Robert Richter
2008-07-22 19:08 ` [PATCH 01/24] x86: Add PCI IDs for AMD Barcelona PCI devices Robert Richter
2008-07-22 19:08 ` [PATCH 02/24] x86: apic_*.c: Add description to AMD's extended LVT functions Robert Richter
2008-07-22 19:08 ` [PATCH 03/24] oprofile: Add support for AMD Family 11h Robert Richter
2008-07-26 17:32   ` Daniel K.
2008-07-28 15:35     ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 04/24] x86/oprofile: Introduce model specific init/exit functions Robert Richter
2008-07-22 19:08 ` [PATCH 05/24] x86/oprofile: Minor changes in op_model_athlon.c Robert Richter
2008-07-22 19:08 ` [PATCH 06/24] x86/oprofile: Renaming athlon_*() into op_amd_*() Robert Richter
2008-07-26  9:55   ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 07/24] drivers/oprofile: Coding style fixes in buffer_sync.c Robert Richter
2008-07-22 19:08 ` [PATCH 08/24] OProfile: Moving increment_tail() " Robert Richter
2008-07-26  9:56   ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 09/24] OProfile: Add IBS code macros Robert Richter
2008-07-22 19:08 ` [PATCH 10/24] x86/oprofile: Add IBS support for AMD CPUs, IBS buffer handling routines Robert Richter
2008-07-23 19:20   ` Maynard Johnson
2008-07-23 19:46     ` Robert Richter
2008-07-23 20:01   ` Carl Love
2008-07-23 20:19     ` Robert Richter [this message]
2008-07-26  9:58   ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 11/24] x86/oprofile: Add IBS support for AMD CPUs, model specific code Robert Richter
2008-07-24 14:15   ` Maynard Johnson
2008-07-24 14:36     ` Robert Richter
2008-07-24 14:39   ` Robert Richter
2008-07-26 10:03   ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 12/24] x86/oprofile: Separating the IBS handler Robert Richter
2008-07-22 19:08 ` [PATCH 13/24] OProfile: Change IBS interrupt initialization Robert Richter
2008-07-26 10:05   ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 14/24] OProfile: Fix build error in op_model_athlon.c Robert Richter
2008-07-26 10:05   ` Ingo Molnar
2008-07-22 19:08 ` [PATCH 15/24] OProfile: on_each_cpu(): kill unused retry parameter Robert Richter
2008-07-22 19:09 ` [PATCH 16/24] OProfile: Fix setup_ibs_files() function interface Robert Richter
2008-07-22 19:09 ` [PATCH 17/24] OProfile: Enable IBS for AMD CPUs Robert Richter
2008-07-26 10:09   ` Ingo Molnar
2008-07-22 19:09 ` [PATCH 18/24] OProfile: Fix IBS build error for UP Robert Richter
2008-07-22 19:09 ` [PATCH 19/24] x86/oprofile: Macro definition cleanup in op_model_athlon.c Robert Richter
2008-07-22 19:09 ` [PATCH 20/24] x86/oprofile: op_model_athlon.c: Fix counter reset when reenabling IBS OP Robert Richter
2008-07-22 19:09 ` [PATCH 21/24] x86: apic: Export symbols for extended interrupt LVT functions Robert Richter
2008-07-22 19:53   ` Arjan van de Ven
2008-07-23 13:28     ` [PATCH] x86: apic: Changing export symbols to *_GPL Robert Richter
2008-07-23 19:29       ` linux-os (Dick Johnson)
2008-07-23 20:01         ` Robert Richter
2008-07-24  8:16           ` Stefan Richter
2008-07-22 19:09 ` [PATCH 22/24] x86/oprofile: Add CONFIG_OPROFILE_IBS option Robert Richter
2008-07-26 10:15   ` Ingo Molnar
2008-07-22 19:09 ` [PATCH 23/24] oprofile: Fix printk in cpu_buffer.c Robert Richter
2008-07-22 19:09 ` [PATCH 24/24] x86/oprofile: Reanaming op_model_athlon.c to op_model_amd.c Robert Richter
2008-07-26 10:17   ` Ingo Molnar
2008-07-23 12:24 ` [PATCH 0/24] oprofile: Add IBS support for AMD CPUs Maynard Johnson
2008-07-26  9:52 ` Ingo Molnar
2008-07-28 14:02   ` Robert Richter
2008-07-31 10:32     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080723201924.GJ17134@erda.amd.com \
    --to=robert.richter@amd.com \
    --cc=barry.kasindorf@amd.com \
    --cc=cel@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=oprofile-list@lists.sourceforge.net \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox