From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754219AbdDDOSO (ORCPT ); Tue, 4 Apr 2017 10:18:14 -0400 Received: from mail.kernel.org ([198.145.29.136]:49332 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752890AbdDDOSM (ORCPT ); Tue, 4 Apr 2017 10:18:12 -0400 Date: Tue, 4 Apr 2017 11:18:05 -0300 From: Arnaldo Carvalho de Melo To: Jin Yao Cc: Jiri Olsa , linux-kernel@vger.kernel.org, ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com, Peter Zijlstra , Alexander Shishkin , Ingo Molnar Subject: Re: [PATCH v1 1/5] perf/core: Define the common branch type classification Message-ID: <20170404141805.GA12903@kernel.org> References: <1490973522-5499-1-git-send-email-yao.jin@linux.intel.com> <1490973522-5499-2-git-send-email-yao.jin@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1490973522-5499-2-git-send-email-yao.jin@linux.intel.com> X-Url: http://acmel.wordpress.com User-Agent: Mutt/1.8.0 (2017-02-23) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Adding the perf kernel maintainers to the CC list. Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu: > It is often useful to know the branch types while analyzing branch > data. For example, a call is very different from a conditional branch. > > Currently we have to look it up in binary while the binary may later > not be available and even the binary is available but user has to take > some time. It is very useful for user to check it directly in perf > report. > > Perf already has support for disassembling the branch instruction > to get the branch type. The branch type is defined in lbr.c. > > To keep consistent on kernel and userspace and make the classification > more common, the patch adds the common branch type classification > in perf_event.h. > > Since the disassembling of branch instruction needs some overhead, > a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it > needs to disassemble the branch instruction and record the branch > type. > > Signed-off-by: Jin Yao > --- > include/uapi/linux/perf_event.h | 24 +++++++++++++++++++++++- > tools/include/uapi/linux/perf_event.h | 24 +++++++++++++++++++++++- > 2 files changed, 46 insertions(+), 2 deletions(-) > > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index d09a9cd..4d731fd 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift { > PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT = 14, /* no flags */ > PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT = 15, /* no cycles */ > > + PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT = 16, /* save branch type */ > + > PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */ > }; > > @@ -198,9 +200,27 @@ enum perf_branch_sample_type { > PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT, > PERF_SAMPLE_BRANCH_NO_CYCLES = 1U << PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT, > > + PERF_SAMPLE_BRANCH_TYPE_SAVE = > + 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT, > + > PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT, > }; > > +/* > + * Common flow change classification > + */ > +enum { > + PERF_BR_NONE = 0, /* unknown */ > + PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */ > + PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */ > + PERF_BR_JMP = 1 << 3, /* jump */ > + PERF_BR_IND_JMP = 1 << 4, /* indirect jump */ > + PERF_BR_CALL = 1 << 5, /* call */ > + PERF_BR_IND_CALL = 1 << 6, /* indirect call */ > + PERF_BR_RET = 1 << 7, /* return */ > + PERF_BR_FAR_BRANCH = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */ Humm, wouldn't be better to have those in separate buckets? I.e. PERF_BR_SYSCALL PERF_BR_SYSRET, PERF_BR_IRQ etc? And why a bitmask? /me reads a bit more... couldn't find a reason for this: + type:9, /* branch type */ Do you have a reason to use 9 bits? Why not just: enum { PERF_BR_NONE = 0, /* unknown */ PERF_BR_JCC_FWD = 1, /* conditional forward jump */ PERF_BR_JCC_BWD = 2, /* conditional backward jump */ PERF_BR_JMP = 3, /* jump */ PERF_BR_IND_JMP = 4, /* indirect jump */ PERF_BR_CALL = 5, /* call */ PERF_BR_IND_CALL = 6, /* indirect call */ PERF_BR_RET = 7, /* return */ PERF_BR_FAR_BRANCH = 8, /* SYSCALL,SYSRET,IRQ,... */ And then use, say, 4 or 5 bits for that type field? I must be missing something trivial ;-\ - Arnaldo > +}; > + > #define PERF_SAMPLE_BRANCH_PLM_ALL \ > (PERF_SAMPLE_BRANCH_USER|\ > PERF_SAMPLE_BRANCH_KERNEL|\ > @@ -999,6 +1019,7 @@ union perf_mem_data_src { > * in_tx: running in a hardware transaction > * abort: aborting a hardware transaction > * cycles: cycles from last branch (or 0 if not supported) > + * type: branch type > */ > struct perf_branch_entry { > __u64 from; > @@ -1008,7 +1029,8 @@ struct perf_branch_entry { > in_tx:1, /* in transaction */ > abort:1, /* transaction abort */ > cycles:16, /* cycle count to last branch */ > - reserved:44; > + type:9, /* branch type */ > + reserved:35; > }; > > #endif /* _UAPI_LINUX_PERF_EVENT_H */ > diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h > index d09a9cd..4d731fd 100644 > --- a/tools/include/uapi/linux/perf_event.h > +++ b/tools/include/uapi/linux/perf_event.h > @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift { > PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT = 14, /* no flags */ > PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT = 15, /* no cycles */ > > + PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT = 16, /* save branch type */ > + > PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */ > }; > > @@ -198,9 +200,27 @@ enum perf_branch_sample_type { > PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT, > PERF_SAMPLE_BRANCH_NO_CYCLES = 1U << PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT, > > + PERF_SAMPLE_BRANCH_TYPE_SAVE = > + 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT, > + > PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT, > }; > > +/* > + * Common flow change classification > + */ > +enum { > + PERF_BR_NONE = 0, /* unknown */ > + PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */ > + PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */ > + PERF_BR_JMP = 1 << 3, /* jump */ > + PERF_BR_IND_JMP = 1 << 4, /* indirect jump */ > + PERF_BR_CALL = 1 << 5, /* call */ > + PERF_BR_IND_CALL = 1 << 6, /* indirect call */ > + PERF_BR_RET = 1 << 7, /* return */ > + PERF_BR_FAR_BRANCH = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */ > +}; > + > #define PERF_SAMPLE_BRANCH_PLM_ALL \ > (PERF_SAMPLE_BRANCH_USER|\ > PERF_SAMPLE_BRANCH_KERNEL|\ > @@ -999,6 +1019,7 @@ union perf_mem_data_src { > * in_tx: running in a hardware transaction > * abort: aborting a hardware transaction > * cycles: cycles from last branch (or 0 if not supported) > + * type: branch type > */ > struct perf_branch_entry { > __u64 from; > @@ -1008,7 +1029,8 @@ struct perf_branch_entry { > in_tx:1, /* in transaction */ > abort:1, /* transaction abort */ > cycles:16, /* cycle count to last branch */ > - reserved:44; > + type:9, /* branch type */ > + reserved:35; > }; > > #endif /* _UAPI_LINUX_PERF_EVENT_H */ > -- > 2.7.4