public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Leo Yan <leo.yan@arm.com>
To: Rob Herring <robh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Jonathan Corbet <corbet@lwn.net>, Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	James Clark <james.clark@linaro.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, kvmarm@lists.linux.dev
Subject: Re: [PATCH v20 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
Date: Tue, 25 Feb 2025 12:38:13 +0000	[thread overview]
Message-ID: <20250225123813.GA1821331@e132581.arm.com> (raw)
In-Reply-To: <CAL_JsqKNad6eEBerUOco=SDWxdp6dgRD3FDrSt5OpGQYwwstSg@mail.gmail.com>

On Mon, Feb 24, 2025 at 07:31:52PM -0600, Rob Herring wrote:

[...]

> > > > When event rotation happens, if without context switch, in theory we
> > > > should can directly use the branch record (no invalidation, no injection)
> > > > for all events.
> > >
> > > No; that only works in *some* cases, and will produce incorrect results
> > > in others.
> > >
> > > For example, consider filtering. Imagine a PMU with a single counter,
> > > and two events, where event-A filters for calls-and-returns and event-B
> > > filters for calls-only. When switching from event-A to event-B, it's
> > > theoretically possible to keep the existing records around, knowing that
> > > the returns can be filtered out later. When switching from event-B to
> > > event-A we cannot keep the existing records, since there are gaps
> > > whenever a return should have been recorded.
> >
> > Seems to me, the problem is not caused by event rotation.  We need to
> > calculate a correct filter in the first place - the BRBE driver should
> > calculate a superset for all filters of events for a session.  Then,
> > generate branch record based event's specific filter.
> 
> The driver doesn't have enough information. If it is told to schedule
> event A, it doesn't know anything about event B. It could in theory
> try to remember event B if event B had already been scheduled, but it
> never knows when event B is gone.

E.g., I tried below command for enabling 10 events in a perf session:

  perf record -e armv9_nevis/r04/ -e armv9_nevis/r05/ \
              -e armv9_nevis/r06/ -e armv9_nevis/r07/ \
              -e armv9_nevis/r08/ -e armv9_nevis/r09/ \
              -e armv9_nevis/r10/ -e armv9_nevis/r11/ \
              -e armv9_nevis/r12/ -e armv9_nevis/r13/ \
              -- sleep 1

For Arm PMU, the flow below is invoked for every event on every
affinied CPU in initialization phase:

  armpmu_event_init() {
    armv8pmu_set_event_filter();
  }

Shouldn't we calculate a superset branch filter for all events, store
it into a per-CPU data structure and then apply the filter on BRBE?

> > > There are a number of cases of that shape given the set of configurable
> > > filters. In theory it's possible to retain those in some cases, but I
> > > don't think that the complexity is justified.
> > >
> > > Similarly, whenever kernel branches are recorded it's necessary to drop
> > > the stale branches whenever branch recording is paused, as there's
> > > necessarily a blackout period and hence a gap in the records.
> >
> > If we save BRBE record when a process is switched out and then restore
> > the record when a process is switched in, should we can keep a decent
> > branch record for performance profiling?
> 
> Keep in mind that there's only 64 branches recorded at most. How many
> branches in a context switch plus reconfiguring the PMU? Not a small
> percentage of 64 I think. In traces where freeze on overflow was not
> working (there's an example in v18), just the interrupt entry until
> BRBE was stopped was a significant part of the trace. A context switch
> is going to be similar.

That is true for kernel mode enabled tracing.  But we will have no
such kind noises for userspace only mode tracing.

[...]

> > > Do you have a reason why you think we *must* keep events around?
> >
> > Here I am really concerned are cases when a process is preempted or
> > migrated.  The driver doesn't save and restore branch records for these
> > cases, it just invalidates all records when a task is scheduled in.
> >
> > As a result, if an event overflow is close to context switch, it is
> > likely to capture incomplete branch records.  For a userspace-only
> > tracing, it is risk to capture empty branch record after preemption
> > and migrations.
> 
> There's the same risk if something else is recording kernel branches
> when you are recording userspace only. I think the user has to be
> aware if other things like context switches are perturbing their data.

I am confused for the decription above.  Does it refer to branch
recording cross different sessions?  It is fine for me that the branch
data is interleaved by different sessions (e.g. one is global tracing
and another is only per-thread tracing).

We might need to consider an intact branch record for the single perf
session case.  E.g. if userspace program calls:

    func_a -> func_b -> func_c

In a case for only userspace tracing, we will have no chance to preserve
the call sequence of these functions after the program is switched out.

Thanks,
Leo

  reply	other threads:[~2025-02-25 12:38 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-18 20:39 [PATCH v20 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
2025-02-18 20:39 ` [PATCH v20 01/11] perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters Rob Herring (Arm)
2025-02-18 20:39 ` [PATCH v20 02/11] perf: arm_pmu: Don't disable counter in armpmu_add() Rob Herring (Arm)
2025-02-18 20:39 ` [PATCH v20 03/11] perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event() Rob Herring (Arm)
2025-02-18 20:39 ` [PATCH v20 04/11] perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts Rob Herring (Arm)
2025-02-18 20:40 ` [PATCH v20 05/11] perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event() Rob Herring (Arm)
2025-02-18 20:40 ` [PATCH v20 06/11] perf: apple_m1: Don't disable counter in m1_pmu_enable_event() Rob Herring (Arm)
2025-02-18 20:40 ` [PATCH v20 07/11] perf: arm_pmu: Move PMUv3-specific data Rob Herring (Arm)
2025-02-18 20:40 ` [PATCH v20 08/11] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
2025-02-18 20:40 ` [PATCH v20 09/11] arm64: Handle BRBE booting requirements Rob Herring (Arm)
2025-02-18 20:40 ` [PATCH v20 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
2025-02-24 10:41   ` Leo Yan
2025-02-18 20:40 ` [PATCH v20 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
2025-02-24 12:25   ` Leo Yan
2025-02-24 12:46     ` Rob Herring
2025-02-24 14:03       ` Leo Yan
2025-02-24 16:05         ` Mark Rutland
2025-02-24 18:03           ` Leo Yan
2025-02-25  1:31             ` Rob Herring
2025-02-25 12:38               ` Leo Yan [this message]
2025-02-25 15:35                 ` Rob Herring
2025-02-25 19:46                 ` Mark Rutland
2025-02-25 12:01             ` Mark Rutland
2025-02-25 17:48               ` Leo Yan
2025-02-25 19:04                 ` Rob Herring
2025-02-25 19:58                 ` Mark Rutland
2025-02-26 13:48                   ` Leo Yan
2025-02-26 14:26                     ` Rob Herring
2025-02-19 16:09 ` [PATCH v20 00/11] arm64/perf: Enable branch stack sampling James Clark
2025-03-01  7:05 ` Will Deacon
2025-03-03 16:44   ` Rob Herring
2025-03-04 11:25     ` Catalin Marinas
2025-03-04 16:25       ` Mark Rutland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250225123813.GA1821331@e132581.arm.com \
    --to=leo.yan@arm.com \
    --cc=anshuman.khandual@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=corbet@lwn.net \
    --cc=james.clark@linaro.org \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=robh@kernel.org \
    --cc=suzuki.poulose@arm.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox