From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@kernel.org, namhyung@kernel.org, irogers@google.com,
	adrian.hunter@intel.com, ak@linux.intel.com, eranian@google.com,
	alexey.v.bayduraev@linux.intel.com, tinghao.zhang@intel.com
Subject: Re: [PATCH V4 4/7] perf/x86/intel: Support LBR event logging
Date: Fri, 20 Oct 2023 08:45:04 -0400
Message-ID: <149fef35-b24f-426a-9e8a-d7cadcd7bdeb@linux.intel.com>
In-Reply-To: <20231019111202.GJ36211@noisy.programming.kicks-ass.net>



On 2023-10-19 7:12 a.m., Peter Zijlstra wrote:
> On Wed, Oct 04, 2023 at 11:40:41AM -0700, kan.liang@linux.intel.com wrote:
>> +static __always_inline void get_lbr_events(struct cpu_hw_events *cpuc,
>> +					   int i, u64 info)
>> +{
>> +	/*
>> +	 * The later code will decide what content can be disclosed
>> +	 * to the perf tool. It's no harmful to unconditionally update
>> +	 * the cpuc->lbr_events.
>> +	 * Pleae see intel_pmu_lbr_event_reorder()
>> +	 */
>> +	cpuc->lbr_events[i] = info & LBR_INFO_EVENTS;
>> +}
> 
> You could be forcing an extra cachemiss here. 

This is only temporary storage for the branch counter information. Maybe
we can use the reserved field of cpuc->lbr_entries[i] instead, which
would avoid the extra cache miss.

	e->reserved = info & LBR_INFO_COUNTERS;

I tried to add something like a static_assert to check the size of the
reserved field, in case the field is shrunk later. But the reserved field
is a bit field, and I don't know of a way to get the exact size of a bit
field other than defining a macro. Is something like the below OK? Any
suggestions are appreciated.


diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 1e80a551a4c2..62675593e39a 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1582,6 +1582,8 @@ static bool is_arch_lbr_xsave_available(void)
 	return true;
 }

+static_assert((64 - PERF_BRANCH_ENTRY_INFO_BITS_MAX) > LBR_INFO_COUNTERS_MAX_NUM * 2);
+
 void __init intel_pmu_arch_lbr_init(void)
 {
 	struct pmu *pmu = x86_get_pmu(smp_processor_id());
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f220c3598d03..e9ff8eba5efd 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -238,6 +238,7 @@
 #define LBR_INFO_BR_TYPE		(0xfull << LBR_INFO_BR_TYPE_OFFSET)
 #define LBR_INFO_EVENTS_OFFSET		32
 #define LBR_INFO_EVENTS			(0xffull << LBR_INFO_EVENTS_OFFSET)
+#define LBR_INFO_COUNTERS_MAX_NUM	4

 #define MSR_ARCH_LBR_CTL		0x000014ce
 #define ARCH_LBR_CTL_LBREN		BIT(0)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 4461f380425b..3a64499b0f5d 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1437,6 +1437,9 @@ struct perf_branch_entry {
 		reserved:31;
 };

+/* Size of used info bits in struct perf_branch_entry */
+#define PERF_BRANCH_ENTRY_INFO_BITS_MAX		33
+
 union perf_sample_weight {
 	__u64		full;
 #if defined(__LITTLE_ENDIAN_BITFIELD)
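
The static_assert idea above can be tried outside the kernel. What follows
is only a minimal userspace sketch, not the code from the patch: the struct
layout, the macro names, the 2-bit counter width, and the stash_counters()
helper are all hypothetical stand-ins. It assumes the counter bits are
shifted down before being stored, because an unsigned bit-field keeps only
the low bits of whatever is assigned to it. Since sizeof cannot be applied
to a bit-field, the field widths are derived from the same macros that the
assertion checks.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

/* Hypothetical stand-ins for the macros proposed in the patch. */
#define ENTRY_INFO_BITS_USED	33	/* bits already assigned to named flags */
#define ENTRY_RESERVED_BITS	(64 - ENTRY_INFO_BITS_USED)
#define COUNTERS_MAX_NUM	4	/* counters logged per branch */
#define COUNTER_WIDTH		2	/* bits per counter (assumed) */
#define COUNTERS_BITS		(COUNTERS_MAX_NUM * COUNTER_WIDTH)
#define COUNTERS_SHIFT		32	/* position of the counters in the info word */
#define COUNTERS_MASK		((UINT64_C(1) << COUNTERS_BITS) - 1)

struct branch_entry {
	uint64_t from;
	uint64_t to;
	/*
	 * Bit-fields of a 64-bit type are implementation-defined in ISO C;
	 * GCC and Clang accept them.
	 */
	uint64_t flags:ENTRY_INFO_BITS_USED,
		 reserved:ENTRY_RESERVED_BITS;
};

/* sizeof(entry.reserved) does not compile, so check the macros instead. */
static_assert(ENTRY_RESERVED_BITS >= COUNTERS_BITS,
	      "reserved field too small to hold the branch counters");
/* Both bit-fields must still pack into a single 64-bit word. */
static_assert(sizeof(struct branch_entry) == 3 * sizeof(uint64_t),
	      "bit-fields no longer fit in one 64-bit word");

/* Stash the counter bits of an info word in the reserved field. */
static inline void stash_counters(struct branch_entry *e, uint64_t info)
{
	e->reserved = (info >> COUNTERS_SHIFT) & COUNTERS_MASK;
}
```

If ENTRY_INFO_BITS_USED ever grows so that fewer than COUNTERS_BITS bits
remain, the build fails instead of the counters being silently truncated.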



> A long time ago I had
> hacks to profile perf with perf, but perhaps PT can be abused for that
> now?

As I understand it, PT only provides trace information, and may not be
able to tell whether there is a cache miss.
I will take a closer look and see whether PT can help in this case.

Thanks,
Kan
