From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932411AbbJMNk2 (ORCPT );
	Tue, 13 Oct 2015 09:40:28 -0400
Received: from mail-wi0-f170.google.com ([209.85.212.170]:33963 "EHLO
	mail-wi0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932107AbbJMNk0 (ORCPT );
	Tue, 13 Oct 2015 09:40:26 -0400
Date: Tue, 13 Oct 2015 15:40:04 +0200
From: Ingo Molnar
To: Stephane Eranian
Cc: linux-kernel@vger.kernel.org, acme@redhat.com, peterz@infradead.org,
	mingo@elte.hu, ak@linux.intel.com, jolsa@redhat.com,
	namhyung@kernel.org, khandual@linux.vnet.ibm.com
Subject: Re: [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL
Message-ID: <20151013134004.GA8843@gmail.com>
References: <1444720151-10275-1-git-send-email-eranian@google.com>
	<1444720151-10275-3-git-send-email-eranian@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1444720151-10275-3-git-send-email-eranian@google.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Stephane Eranian wrote:

> This patch enables support for PERF_SAMPLE_BRANCH_CALL on Intel x86
> processors. When the processor supports LBR filtering, the selection
> is done in hardware. Otherwise, the filter is applied by software.
> Note that we chose to include zero-length calls because they also
> represent calls.
>
> Signed-off-by: Stephane Eranian
> ---
>  arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> index ad0b8b0..bfd0b71 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> @@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
>  	if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
>  		mask |= X86_BR_IND_JMP;
>
> +	if (br_type & PERF_SAMPLE_BRANCH_CALL)
> +		mask |= X86_BR_CALL | X86_BR_ZERO_CALL;

I'm wondering how frequent zero-length calls are. If they still occur in
typical user-space, would it make sense to also have a separate branch
sampling type for zero-length calls?

Intel documents zero-length calls as ones that (ab-)use the call
instruction to push the current IP on the stack:

	call next_addr
next_addr:
	pop %reg

which can take over 10 cycles on certain microarchitectures (and it
unbalances whatever call stack tracking/caching the CPU does as well).

So it might make sense to analyze them separately. I guess that's the
reason why Intel added a separate flag for them in the PMU.

Thanks,

	Ingo