* [PATCH] perf, script: Minimize "not reaching sample" for brstackinsn
@ 2024-02-27 18:39 Andi Kleen
  2024-02-28 11:04 ` Adrian Hunter
  0 siblings, 1 reply; 4+ messages in thread
From: Andi Kleen @ 2024-02-27 18:39 UTC (permalink / raw)
  To: linux-perf-users; +Cc: adrian.hunter, Andi Kleen

In some situations perf script -F +brstackinsn sees a lot of
"not reaching sample" messages. This happens when the last LBR block
before the sample contains a branch that is not in the LBR,
and the instruction dumping stops.

$ perf record -b  emacs -Q --batch '()'
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.396 MB perf.data (443 samples) ]
$ perf script -F +brstackinsn
...
        00007f0ab2d171a4        insn: 41 0f 94 c0
        00007f0ab2d171a8        insn: 83 fa 01
        00007f0ab2d171ab        insn: 74 d3                     # PRED 6 cycles [313] 1.00 IPC
        00007f0ab2d17180        insn: 45 84 c0
        00007f0ab2d17183        insn: 74 28
        ... not reaching sample ...

$ perf script -F +brstackinsn | grep -c reach
136

This is a problem for further analysis that wants to see the full
code up to the sample.

There are two common cases where the message is bogus:
- The LBR only logs taken branches, but the branch might be a
conditional branch that is not taken (that is the most common
case actually)
- The LBR sampling uses a filter ignoring some branches,
but the perf script check checks for all branches.

This patch fixes these two conditions, by only checking
for unconditional branches, as well as checking the perf_event_attr's
branch filter attributes.
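
As a rough illustration of the first condition (a toy model, not the perf
code itself), the walk from the last LBR target to the sample can be
sketched over a list of pre-classified instructions. The names branch_kind,
toy_insn and reaches_sample are hypothetical, and the sample address used
in the note after the code is made up, since the trace above does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the decoder's branch classification. */
enum branch_kind { BR_NONE, BR_CONDITIONAL, BR_UNCONDITIONAL, BR_INDIRECT };

struct toy_insn {
	uint64_t addr;
	enum branch_kind kind;
};

/* Old check: any branch instruction ends the walk. */
static bool is_branch(const struct toy_insn *in)
{
	return in->kind != BR_NONE;
}

/* New check: only branches that are always taken end the walk. */
static bool is_uncond_branch(const struct toy_insn *in)
{
	return in->kind == BR_UNCONDITIONAL || in->kind == BR_INDIRECT;
}

/*
 * Walk the instructions after the last LBR target. Returns true if the
 * sample IP is reached, false if the walk gives up first (the
 * "not reaching sample" case).
 */
static bool reaches_sample(const struct toy_insn *insns, size_t n,
			   uint64_t sample_ip,
			   bool (*stops)(const struct toy_insn *))
{
	assert(insns != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (insns[i].addr == sample_ip)
			return true;	/* reached the sample */
		if (stops(&insns[i]))
			return false;	/* treated as a missing branch */
	}
	return false;			/* ran out of instructions */
}
```

For a block like the one above (a non-branch at ...180, the conditional
"74 28" at ...183, then an assumed sample at ...185), reaches_sample()
returns false with is_branch and true with is_uncond_branch, because a
not-taken conditional branch simply falls through.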

For the test case above it fixes all the messages:

$ ./perf script -F +brstackinsn | grep -c reach
0

Note that there are still conditions when the message is hit --
sometimes there can be an unconditional branch that misses the LBR
update before the sample -- but they are much rarer now.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/builtin-script.c                              | 4 +++-
 tools/perf/util/dump-insn.c                              | 2 +-
 tools/perf/util/dump-insn.h                              | 2 +-
 tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c | 5 +++--
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 37088cc0ff1b..df2555fdb18f 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1378,7 +1378,9 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
 		printed += fprintf(fp, "\n");
 		if (ilen == 0)
 			break;
-		if (arch_is_branch(buffer + off, len - off, x.is64bit) && start + off != sample->ip) {
+		if ((attr->branch_sample_type == 0 || attr->branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
+				&& arch_is_uncond_branch(buffer + off, len - off, x.is64bit)
+				&& start + off != sample->ip) {
 			/*
 			 * Hit a missing branch. Just stop.
 			 */
diff --git a/tools/perf/util/dump-insn.c b/tools/perf/util/dump-insn.c
index 2bd8585db93c..c1cc0ade48d0 100644
--- a/tools/perf/util/dump-insn.c
+++ b/tools/perf/util/dump-insn.c
@@ -15,7 +15,7 @@ const char *dump_insn(struct perf_insn *x __maybe_unused,
 }
 
 __weak
-int arch_is_branch(const unsigned char *buf __maybe_unused,
+int arch_is_uncond_branch(const unsigned char *buf __maybe_unused,
 		   size_t len __maybe_unused,
 		   int x86_64 __maybe_unused)
 {
diff --git a/tools/perf/util/dump-insn.h b/tools/perf/util/dump-insn.h
index 650125061530..a5de239679d7 100644
--- a/tools/perf/util/dump-insn.h
+++ b/tools/perf/util/dump-insn.h
@@ -20,6 +20,6 @@ struct perf_insn {
 
 const char *dump_insn(struct perf_insn *x, u64 ip,
 		      u8 *inbuf, int inlen, int *lenp);
-int arch_is_branch(const unsigned char *buf, size_t len, int x86_64);
+int arch_is_uncond_branch(const unsigned char *buf, size_t len, int x86_64);
 
 #endif
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
index c5d57027ec23..292027a984a9 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
@@ -200,12 +200,13 @@ int intel_pt_get_insn(const unsigned char *buf, size_t len, int x86_64,
 	return 0;
 }
 
-int arch_is_branch(const unsigned char *buf, size_t len, int x86_64)
+int arch_is_uncond_branch(const unsigned char *buf, size_t len, int x86_64)
 {
 	struct intel_pt_insn in;
 	if (intel_pt_get_insn(buf, len, x86_64, &in) < 0)
 		return -1;
-	return in.branch != INTEL_PT_BR_NO_BRANCH;
+	return in.branch == INTEL_PT_BR_UNCONDITIONAL ||
+	       in.branch == INTEL_PT_BR_INDIRECT;
 }
 
 const char *dump_insn(struct perf_insn *x, uint64_t ip __maybe_unused,
-- 
2.43.0



* Re: [PATCH] perf, script: Minimize "not reaching sample" for brstackinsn
  2024-02-27 18:39 [PATCH] perf, script: Minimize "not reaching sample" for brstackinsn Andi Kleen
@ 2024-02-28 11:04 ` Adrian Hunter
  2024-02-28 23:33   ` Andi Kleen
  0 siblings, 1 reply; 4+ messages in thread
From: Adrian Hunter @ 2024-02-28 11:04 UTC (permalink / raw)
  To: Andi Kleen, linux-perf-users

On 27/02/24 20:39, Andi Kleen wrote:
> In some situations perf script -F +brstackinsn sees a lot of
> "not reaching sample" messages. This happens when the last LBR block
> before the sample contains a branch that is not in the LBR,
> and the instruction dumping stops.
> 
> $ perf record -b  emacs -Q --batch '()'
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.396 MB perf.data (443 samples) ]
> $ perf script -F +brstackinsn
> ...
>         00007f0ab2d171a4        insn: 41 0f 94 c0
>         00007f0ab2d171a8        insn: 83 fa 01
>         00007f0ab2d171ab        insn: 74 d3                     # PRED 6 cycles [313] 1.00 IPC
>         00007f0ab2d17180        insn: 45 84 c0
>         00007f0ab2d17183        insn: 74 28
>         ... not reaching sample ...
> 
> $ perf script -F +brstackinsn | grep -c reach
> 136
> 
> This is a problem for further analysis that wants to see the full
> code up to the sample.
> 
> There are two common cases where the message is bogus:
> - The LBR only logs taken branches, but the branch might be a
> conditional branch that is not taken (that is the most common
> case actually)

How do you know it is not a taken branch that missed the LBR update?

> - The LBR sampling uses a filter ignoring some branches,
> but the perf script check checks for all branches.

Not understanding this case.  Do you mean you expect not to reach
the sample, so there is no point printing the message?

> 
> This patch fixes these two conditions, by only checking
> for unconditional branches, as well as checking the perf_event_attr's
> branch filter attributes.
> 
> For the test case above it fixes all the messages:
> 
> $ ./perf script -F +brstackinsn | grep -c reach
> 0
> 
> Note that there are still conditions when the message is hit --
> sometimes there can be an unconditional branch that misses the LBR
> update before the sample -- but they are much rarer now.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/builtin-script.c                              | 4 +++-
>  tools/perf/util/dump-insn.c                              | 2 +-
>  tools/perf/util/dump-insn.h                              | 2 +-
>  tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c | 5 +++--
>  4 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 37088cc0ff1b..df2555fdb18f 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -1378,7 +1378,9 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
>  		printed += fprintf(fp, "\n");
>  		if (ilen == 0)
>  			break;
> -		if (arch_is_branch(buffer + off, len - off, x.is64bit) && start + off != sample->ip) {
> +		if ((attr->branch_sample_type == 0 || attr->branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
> +				&& arch_is_uncond_branch(buffer + off, len - off, x.is64bit)
> +				&& start + off != sample->ip) {

Needs a comment, or update the comment further back.

>  			/*
>  			 * Hit a missing branch. Just stop.
>  			 */
> diff --git a/tools/perf/util/dump-insn.c b/tools/perf/util/dump-insn.c
> index 2bd8585db93c..c1cc0ade48d0 100644
> --- a/tools/perf/util/dump-insn.c
> +++ b/tools/perf/util/dump-insn.c
> @@ -15,7 +15,7 @@ const char *dump_insn(struct perf_insn *x __maybe_unused,
>  }
>  
>  __weak
> -int arch_is_branch(const unsigned char *buf __maybe_unused,
> +int arch_is_uncond_branch(const unsigned char *buf __maybe_unused,
>  		   size_t len __maybe_unused,
>  		   int x86_64 __maybe_unused)
>  {
> diff --git a/tools/perf/util/dump-insn.h b/tools/perf/util/dump-insn.h
> index 650125061530..a5de239679d7 100644
> --- a/tools/perf/util/dump-insn.h
> +++ b/tools/perf/util/dump-insn.h
> @@ -20,6 +20,6 @@ struct perf_insn {
>  
>  const char *dump_insn(struct perf_insn *x, u64 ip,
>  		      u8 *inbuf, int inlen, int *lenp);
> -int arch_is_branch(const unsigned char *buf, size_t len, int x86_64);
> +int arch_is_uncond_branch(const unsigned char *buf, size_t len, int x86_64);
>  
>  #endif
> diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
> index c5d57027ec23..292027a984a9 100644
> --- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
> +++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
> @@ -200,12 +200,13 @@ int intel_pt_get_insn(const unsigned char *buf, size_t len, int x86_64,
>  	return 0;
>  }
>  
> -int arch_is_branch(const unsigned char *buf, size_t len, int x86_64)
> +int arch_is_uncond_branch(const unsigned char *buf, size_t len, int x86_64)
>  {
>  	struct intel_pt_insn in;
>  	if (intel_pt_get_insn(buf, len, x86_64, &in) < 0)
>  		return -1;
> -	return in.branch != INTEL_PT_BR_NO_BRANCH;
> +	return in.branch == INTEL_PT_BR_UNCONDITIONAL ||
> +	       in.branch == INTEL_PT_BR_INDIRECT;
>  }
>  
>  const char *dump_insn(struct perf_insn *x, uint64_t ip __maybe_unused,



* Re: [PATCH] perf, script: Minimize "not reaching sample" for brstackinsn
  2024-02-28 11:04 ` Adrian Hunter
@ 2024-02-28 23:33   ` Andi Kleen
  2024-02-29  7:37     ` Adrian Hunter
  0 siblings, 1 reply; 4+ messages in thread
From: Andi Kleen @ 2024-02-28 23:33 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: linux-perf-users

> > There are two common cases where the message is bogus:
> > - The LBR only logs taken branches, but the branch might be a
> > conditional branch that is not taken (that is the most common
> > case actually)
> 
> How do you know it is not a taken branch that missed the LBR update?

I don't, but the not taken case is totally valid (and also common)
so it doesn't make sense to have a mere sanity check make a common
case unusable.

> 
> > - The LBR sampling uses a filter ignoring some branches,
> > but the perf script check checks for all branches.
> 
> Not understanding this case.  Do you mean you expect not to reach
> the sample, so there is no point printing the message?

If the LBR is e.g. filtered on far branches it makes no sense to
check for non-far branches. There are a lot of filtering cases here which
would be very complicated to handle for a mere sanity check,
so the best way is to not do the bogus sanity check.
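
A minimal sketch of that gate, assuming the condition from the patch is
taken as-is: the missing-branch check only runs when no filter is recorded
or when the filter includes "any branch". The TOY_* constants and the
function name are hypothetical local definitions meant to mirror
PERF_SAMPLE_BRANCH_ANY and PERF_SAMPLE_BRANCH_ANY_CALL, redefined here so
the example builds without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed local mirrors of two branch_sample_type bits (illustrative). */
#define TOY_BRANCH_ANY		(1ULL << 3)
#define TOY_BRANCH_ANY_CALL	(1ULL << 4)

/*
 * Mirrors the first half of the new condition in the patch:
 *   branch_sample_type == 0 || branch_sample_type & PERF_SAMPLE_BRANCH_ANY
 * When this is false, the walk never stops on a "missing" branch.
 */
static bool missing_branch_check_enabled(uint64_t branch_sample_type)
{
	return branch_sample_type == 0 ||
	       (branch_sample_type & TOY_BRANCH_ANY) != 0;
}
```

With these definitions the check stays enabled for 0 and for any value
that has TOY_BRANCH_ANY set, and is skipped for a narrower filter such as
TOY_BRANCH_ANY_CALL alone.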

-Andi


* Re: [PATCH] perf, script: Minimize "not reaching sample" for brstackinsn
  2024-02-28 23:33   ` Andi Kleen
@ 2024-02-29  7:37     ` Adrian Hunter
  0 siblings, 0 replies; 4+ messages in thread
From: Adrian Hunter @ 2024-02-29  7:37 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-perf-users

On 29/02/24 01:33, Andi Kleen wrote:
>>> There are two common cases where the message is bogus:
>>> - The LBR only logs taken branches, but the branch might be a
>>> conditional branch that is not taken (that is the most common
>>> case actually)
>>
>> How do you know it is not a taken branch that missed the LBR update?
> 
> I don't, but the not taken case is totally valid (and also common)
> so it doesn't make sense to have a mere sanity check make a common
> case unusable.
> 
>>
>>> - The LBR sampling uses a filter ignoring some branches,
>>> but the perf script check checks for all branches.
>>
>> Not understanding this case.  Do you mean you expect not to reach
>> the sample, so there is no point printing the message?
> 
> If the LBR is e.g. filtered on far branches it makes no sense to
> check for non-far branches. There are a lot of filtering cases here which
> would be very complicated to handle for a mere sanity check,
> so the best way is to not do the bogus sanity check.

This comment could be updated

	 * between final branch and sample. When this happens just
	 * continue walking after the last TO until we hit a branch.

It's not clear why "attr->branch_sample_type == 0" is there.


