From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: Stephane Eranian <eranian@google.com>
Cc: Michael Neuling <mikey@neuling.org>,
	"ak@linux.intel.com" <ak@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Michael Ellerman <michael@ellerman.id.au>,
	Linux PPC dev <linuxppc-dev@ozlabs.org>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
	Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [V6 00/11] perf: New conditional branch filter
Date: Wed, 28 May 2014 13:34:03 +0530
Message-ID: <53859873.9070603@linux.vnet.ibm.com>
In-Reply-To: <CABPqkBS+3N6PW0wJr3xnvpF4zinZM7+iwFWzwS7BDm-LTkam5Q@mail.gmail.com>

On 05/27/2014 05:39 PM, Stephane Eranian wrote:
> I have been looking at those patches and ran some tests.
> And I found a few issues so far.
> 
> I am running:
> $ perf record -j any_ret -e cycles:u test_program
> $ perf report -D
> 
> Most entries are okay and match the filter, however some do not make sense:
> 
> 3642586996762 0x15d0 [0x108]: PERF_RECORD_SAMPLE(IP, 2): 17921/17921:
> 0x10001170 period: 613678 addr: 0
> .... branch stack: nr:9
> .....  0: 00000000100011cc -> 0000000010000e38
> .....  1: 0000000010001150 -> 00000000100011bc
> .....  2: 0000000010001208 -> 0000000010000e38
> .....  3: 0000000010001160 -> 00000000100011f8
> .....  4: 00000000100011cc -> 0000000010000e38
> .....  5: 0000000010001150 -> 00000000100011bc
> .....  6: 0000000010001208 -> 0000000010000e38
> .....  7: 0000000010001160 -> 00000000100011f8
> .....  8: 0000000000000000 -> 0000000010001160
> ^^^^^^
> Entry 8 does not make sense, unless 0x0 is a valid return branch
> instruction address.
> If an address is invalid, the whole entry needs to be eliminated. It is
> okay to have less than the max number of entries supported by HW.

Hey Stephane,

Okay. The same behaviour is also reflected in the test results that I
shared in the patchset. Here is that section.

(3) perf record -j any_ret -e branch-misses:u ./cprog

# Overhead  Command  Source Shared Object          Source Symbol  Target Shared Object          Target Symbol
# ........  .......  ....................  .....................  ....................  .....................
#
    15.61%    cprog  [unknown]             [.] 00000000           cprog                 [.] sw_3_1           
     6.28%    cprog  cprog                 [.] symbol2            cprog                 [.] hw_1_2           
     6.28%    cprog  cprog                 [.] ctr_addr           cprog                 [.] sw_4_1           
     6.26%    cprog  cprog                 [.] success_3_1_3      cprog                 [.] sw_3_1           
     6.24%    cprog  cprog                 [.] symbol1            cprog                 [.] hw_1_1           
     6.24%    cprog  cprog                 [.] sw_4_2             cprog                 [.] callme           
     6.21%    cprog  [unknown]             [.] 00000000           cprog                 [.] callme           
     6.19%    cprog  cprog                 [.] lr_addr            cprog                 [.] sw_4_2           
     3.16%    cprog  cprog                 [.] hw_1_2             cprog                 [.] callme           
     3.15%    cprog  cprog                 [.] success_3_1_1      cprog                 [.] sw_3_1           
     3.15%    cprog  cprog                 [.] sw_4_1             cprog                 [.] callme           
     3.14%    cprog  cprog                 [.] callme             cprog                 [.] main             
     3.13%    cprog  cprog                 [.] hw_1_1             cprog                 [.] callme

So a lot of the samples above have 0x0 as the "from" address. This originates from the code
section here inside the function "power_pmu_bhrb_read", where we hit two back-to-back
target addresses. We zero out the from address for the first target address and re-read
the second address. That's how we get zero as the from address. This is how the
HW captures the samples. I was reluctant to drop these samples, but I agree that these kinds of
samples can be dropped if we need to.

if (val & BHRB_TARGET) {
	/* Shouldn't have two targets in a
	   row.. Reset index and try again */
	r_index--;
	addr = 0;
}
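
For illustration only, dropping such entries could look roughly like the
sketch below. It is a standalone example, not the actual kernel change:
"struct branch_entry" and "drop_invalid_entries" are hypothetical names
made up here, and the only assumption is that an entry whose from address
is zero should be removed as a whole, as suggested above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one branch record (not the kernel's own type). */
struct branch_entry {
	uint64_t from;
	uint64_t to;
};

/*
 * Compact the array in place, dropping every entry whose "from" address
 * is zero. Returns the number of entries kept; their order is preserved.
 */
static size_t drop_invalid_entries(struct branch_entry *entries, size_t nr)
{
	size_t kept = 0;

	for (size_t i = 0; i < nr; i++) {
		if (entries[i].from == 0)
			continue;	/* no valid source address: skip the whole entry */
		entries[kept++] = entries[i];
	}
	return kept;
}
```

The caller would then report the returned count as the number of entries,
which may be less than the maximum the HW supports.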

> I also had cases where monitoring only at the user level, got me
> branch addresses in the 0xc0000000...... range. My test program is
> linked statically.
> 

That's weird. I would need more information and details on this. BTW,
what system are you running on? Could you please share its
/proc/cpuinfo details?

> when eliminating the bogus entries, my tests yielded only return
> branch instruction addresses which is good. Will run more tests.

Sure. Thanks for the tests and comments.
