linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: James Clark <james.clark@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>,
	Mark Brown <broonie@kernel.org>, Rob Herring <robh@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Suzuki Poulose <suzuki.poulose@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	linux-perf-users@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, will@kernel.org,
	catalin.marinas@arm.com
Subject: Re: [PATCH V17 0/9] arm64/perf: Enable branch stack sampling
Date: Thu, 30 May 2024 18:41:14 +0100	[thread overview]
Message-ID: <Zli6Ot0TwK3Qy-ee@J2N7QTR9R3> (raw)
In-Reply-To: <80d33844-bdd2-4fee-81dd-9cd37c63d42c@arm.com>

On Thu, May 30, 2024 at 10:47:34AM +0100, James Clark wrote:
> On 05/04/2024 03:46, Anshuman Khandual wrote:
> > This series enables perf branch stack sampling support on arm64 platform
> > via a new arch feature called Branch Record Buffer Extension (BRBE). All
> > the relevant register definitions could be accessed here.
> > 
> > https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
> > 
> > This series applies on 6.9-rc2.
> > 
> > Also this series is being hosted below for quick access, review and test.
> > 
> > https://git.gitlab.arm.com/linux-arm/linux-anshuman.git (brbe_v17)
> > 
> > There are still some open questions regarding handling multiple perf events
> > with different privilege branch filters getting on the same PMU, supporting
> > guest branch stack tracing from the host etc. Finally also looking for some
> > suggestions regarding supporting BRBE inside the guest. The series has been
> > re-organized completely as suggested earlier.
> > 
> > - Anshuman
> > 
> [...]
> > 
> > ------------------ Possible 'branch_sample_type' Mismatch -----------------
> > 
> > Branch stack sampling attributes 'event->attr.branch_sample_type' generally
> > remain the same for all the events during a perf record session.
> > 
> > $perf record -e <event_1> -e <event_2> -j <branch_filters> [workload]
> > 
> > event_1->attr.branch_sample_type == event_2->attr.branch_sample_type
> > 
> > This 'branch_sample_type' is used to configure the BRBE hardware, when both
> > events i.e <event_1> and <event_2> get scheduled on a given PMU. But during
> > PMU HW event's privilege filter inheritance, 'branch_sample_type' does not
> > remain the same for all events. Let's consider the following example
> > 
> > $perf record -e cycles:u -e instructions:k -j any,save_type ls
> > 
> > cycles->attr.branch_sample_type != instructions->attr.branch_sample_type
> > 
> > Because cycles event inherits PERF_SAMPLE_BRANCH_USER and instruction event
> > inherits PERF_SAMPLE_BRANCH_KERNEL. The proposed solution here configures
> > BRBE hardware with 'branch_sample_type' from last event to be added in the
> > PMU and hence captured branch records only get passed on to matching events
> > during a PMU interrupt.
> > 
> 
> Hi Anshuman,
> 
> Surely because of this example we should merge? At least we have to try
> to make the most common basic command lines work. Unless we expect all
> tools to know whether the branch buffer is shared between PMUs on each
> architecture or not. The driver knows though, so can merge the settings
> because it all has to go into one BRBE.

The difficulty here is that these are opened as independent events (not
in the same event group), and so from the driver's PoV, this is no
different two two users independently doing:

	perf record -e event:u -j any,save_type -p ${SOME_PID}

	perf record -e event:k -j any,save_type -p ${SOME_PID}

... where either would be surprised to get the merged result.

So, if we merge the filters into the most permissive set, we *must*
filter them when handing them to userspace so that each event gets the
expected set of branch records.

Assuming we do that, for Anshuman's case above:

	perf record -e cycles:u -e instructions:k -j any,save_type ls	

... the overflow of the cycles evnt will only record user branch
records, and the overflow of the instructions event will only record
kernel branch records.

I think it's arguable either way as to whether users want that or merged
records; we should probably figure out with the tools folk what the
desired behaviour is for that command line, but as above I don't think
that we can have the kernel give both events merged records unless
userspace asks for that explicitly.

> Merging the settings in tools would be a much harder problem.

I can see that it'd be harder to do that up-front when parsing each
event independently, but there is a phase where the tool knows about all
of the events and *might* be able to do that, where the driver doesn't
really know at any point that these events are related.

Regardless, I assume that if the user passes:

	perf record -e cycles:u -e instructions:k -k any,u,k,save_type ls

... both events will be opened with PERF_SAMPLE_BRANCH_USER and
PERF_SAMPLE_BRANCH_KERNEL, so maybe that's good, and in-kernel filtering
is sufficient.

> Any user that doesn't have permission for anything in the merged result
> can continue to get nothing.
> 
> And we can always add filtering in the kernel later on if we want to to
> make it appear to behave even more normally.

As above, I think if we merge the HW filters in the kernel then the
kernel must do SW filtering. I don't think that's something we can leave
for later.

> > static int
> > armpmu_add(struct perf_event *event, int flags)
> > {
> > 	........
> > 	if (has_branch_stack(event)) {
> > 		/*
> > 		 * Reset branch records buffer if a new task event gets
> > 		 * scheduled on a PMU which might have existing records.
> > 		 * Otherwise older branch records present in the buffer
> > 		 * might leak into the new task event.
> > 		 */
> > 		if (event->ctx->task && hw_events->brbe_context != event->ctx) {
> > 			hw_events->brbe_context = event->ctx;
> > 			if (armpmu->branch_reset)
> > 				armpmu->branch_reset();
> > 		}
> > 		hw_events->brbe_users++;
> > Here ------->	hw_events->brbe_sample_type = event->attr.branch_sample_type;
> > 	}
> > 	........
> > }
> > 
> > Instead of overriding existing 'branch_sample_type', both could be merged.
> > 
> 
> I can't see any use case where anyone would want the override behavior.
> Especially if you imagine multiple users not even aware of each other.

I completely agree that one event overriding another is not an
acceptable solution.

> Either the current "no records for mismatches" or the merged one make sense.

I think our options are:

1) Do not allow events with conflicting HW filters to be scheduled (e.g.
   treating this like a counter constraint).

2) Allow events with conflicting HW filters to be scheduled, merge the
   active HW filters, and SW filter the records in the kernel.

3) Allow events with conflicting branch filters to be scheduled, but
   only those which match the "active" filter get records.

So far (2) seems to me like the best option, and IIUC that's what x86
does with LBR. I suspect that also justifies discarding records at
context switch time, since we can't know how many relevant records were
discarded due to conflicting records (and so cannot safely stitch
records), and users have to expect that they may get fewer records than
may exist in HW anyway.

Mark.

  reply	other threads:[~2024-05-30 17:41 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-05  2:46 [PATCH V17 0/9] arm64/perf: Enable branch stack sampling Anshuman Khandual
2024-04-05  2:46 ` [PATCH V17 1/9] arm64/sysreg: Add BRBE registers and fields Anshuman Khandual
2024-05-21 13:24   ` Mark Rutland
2024-06-03  5:12     ` Anshuman Khandual
2024-04-05  2:46 ` [PATCH V17 2/9] KVM: arm64: Explicitly handle BRBE traps as UNDEFINED Anshuman Khandual
2024-05-21 13:26   ` Mark Rutland
2024-06-03  5:31     ` Anshuman Khandual
2024-04-05  2:46 ` [PATCH V17 3/9] drivers: perf: arm_pmu: Add infrastructure for branch stack sampling Anshuman Khandual
2024-05-21 13:44   ` Mark Rutland
2024-06-03  6:40     ` Anshuman Khandual
2024-04-05  2:46 ` [PATCH V17 4/9] arm64/boot: Enable EL2 requirements for BRBE Anshuman Khandual
2024-05-29 10:51   ` Mark Rutland
2024-06-03  9:11     ` Anshuman Khandual
2024-06-03  9:38       ` Mark Rutland
2024-04-05  2:46 ` [PATCH V17 5/9] drivers: perf: arm_pmuv3: Enable branch stack sampling via FEAT_BRBE Anshuman Khandual
2024-04-05  2:46 ` [PATCH V17 6/9] KVM: arm64: nvhe: Disable branch generation in nVHE guests Anshuman Khandual
2024-04-05  2:46 ` [PATCH V17 7/9] perf: test: Speed up running brstack test on an Arm model Anshuman Khandual
2024-04-05  2:46 ` [PATCH V17 8/9] perf: test: Remove empty lines from branch filter test output Anshuman Khandual
2024-04-05  2:46 ` [PATCH V17 9/9] perf: test: Extend branch stack sampling test for Arm64 BRBE Anshuman Khandual
2024-05-30  9:47 ` [PATCH V17 0/9] arm64/perf: Enable branch stack sampling James Clark
2024-05-30 17:41   ` Mark Rutland [this message]
2024-05-31 13:01     ` Mark Rutland
2024-06-06  4:58     ` Anshuman Khandual
2024-06-06  6:27       ` Anshuman Khandual
2024-06-06 11:01       ` James Clark
2024-06-06  3:57   ` Anshuman Khandual
2024-05-30 10:03 ` James Clark
2024-06-03  9:18   ` Anshuman Khandual
2024-06-03  9:39     ` Mark Rutland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zli6Ot0TwK3Qy-ee@J2N7QTR9R3 \
    --to=mark.rutland@arm.com \
    --cc=acme@kernel.org \
    --cc=anshuman.khandual@arm.com \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=james.clark@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=robh@kernel.org \
    --cc=suzuki.poulose@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).