From: Rajnesh Kanwal <rkanwal@rivosinc.com>
To: Ian Rogers <irogers@google.com>, ak@linux.intel.com
Cc: "Liang, Kan" <kan.liang@linux.intel.com>,
linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-perf-users@vger.kernel.org, adrian.hunter@intel.com,
alexander.shishkin@linux.intel.com, ajones@ventanamicro.com,
anup@brainfault.org, acme@kernel.org, atishp@rivosinc.com,
beeman@rivosinc.com, brauner@kernel.org, conor@kernel.org,
heiko@sntech.de, mingo@redhat.com, james.clark@arm.com,
renyu.zj@linux.alibaba.com, jolsa@kernel.org,
jisheng.teoh@starfivetech.com, palmer@dabbelt.com,
will@kernel.org, kaiwenxue1@gmail.com, vincent.chen@sifive.com
Subject: Re: [PATCH v2 1/7] perf: Increase the maximum number of samples to 256.
Date: Wed, 21 May 2025 11:47:18 +0100 [thread overview]
Message-ID: <CAECbVCui6ZgHBXBr3LdVGd16ERe0GgAnA1zy_zOQZVTU3bPoWw@mail.gmail.com> (raw)
In-Reply-To: <CAECbVCvX8St8sXh9pTnyO_94-cJT_DB4MyggtS_-PXqWNtXDXw@mail.gmail.com>
On Thu, Apr 17, 2025 at 1:51 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
>
> + Adding Andi Kleen.
>
> On Thu, Feb 20, 2025 at 6:51 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Thu, Jan 16, 2025 at 3:10 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> > >
> > > RISCV CTR extension support a maximum depth of 256 last branch records.
> > > The 127 entries limit results in corrupting CTR entries for RISC-V if
> > > configured to be 256 entries. This will not impact any other architectures
> > > as it is just increasing maximum limit of possible entries.
> >
> > I wonder if rather than a constant this code should just use the auto
> > resizing hashmap code?
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/hashmap.h
> >
> > I assume the value of 127 comes from perf_event.h's PERF_MAX_STACK_DEPTH:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/perf_event.h#n1252
> >
> > Perhaps these constants shouldn't exist. The perf-record man page mentions:
> > sysctl.kernel.perf_event_max_stack
> > which I believe gets a value from
> > /proc/sys/kernel/perf_event_max_stack, so maybe these should be
> > runtime determined constants rather than compile time.
While working on this, I came across the following two patches. It
looks like what
you have suggested, it was tried before but later on Arnaldo reverted the change
from report and script cmds due to reasons mentioned in the second patch.
https://lore.kernel.org/lkml/1461767472-8827-31-git-send-email-acme@kernel.org/
https://lore.kernel.org/lkml/1463696493-27528-8-git-send-email-acme@kernel.org/
Regards
Rajnesh
> >
>
> Thanks Ian for your feedback. I am not sure if it's feasible to use auto
> resizing hashmap here. On each sample of 256 entries we will be doing
> 6 callocs and transferring a whole lot of entries in hashmap_grow. We
> can't reuse old hashmap as well. On each sample we bear the same cost
>
> But I do agree this should be more dynamic but the maximum number
> of entries remove_loops can process is limited by the type of chash array
> here. I can change it and related logic to use uint16_t or higher but we
> will still have a cap on the number of entries.
>
> PERF_MAX_BRANCH_DEPTH seems to be denoting what remove_loops
> can process. This is being used by thread__resolve_callchain_sample to
> check if the sample is processable before calling remove_loops. I think
> this can't be changed to use perf_event_max_stack. But I can rename
> this macro to avoid confusion.
>
> I didn't notice PERF_MAX_STACK_DEPTH. This seems to be defined in
> multiple places and touches bpf as well. I agree that we should avoid
> using this macro and use runtime determined value instead. Tbh I don't
> have super in-depth perf understanding. I will give this a try and send a
> patch in the next update. It would be helpful if you can review it.
>
> Thanks
> -Rajnesh
>
> > Thanks,
> > Ian
> >
> > > Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> > > ---
> > > tools/perf/util/machine.c | 21 ++++++++++++++-------
> > > 1 file changed, 14 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > > index 27d5345d2b30..f2eb3c20274e 100644
> > > --- a/tools/perf/util/machine.c
> > > +++ b/tools/perf/util/machine.c
> > > @@ -2174,25 +2174,32 @@ static void save_iterations(struct iterations *iter,
> > > iter->cycles += be[i].flags.cycles;
> > > }
> > >
> > > -#define CHASHSZ 127
> > > -#define CHASHBITS 7
> > > -#define NO_ENTRY 0xff
> > > +#define CHASHBITS 8
> > > +#define NO_ENTRY 0xffU
> > >
> > > -#define PERF_MAX_BRANCH_DEPTH 127
> > > +#define PERF_MAX_BRANCH_DEPTH 256
> > >
> > > /* Remove loops. */
> > > +/* Note: Last entry (i==ff) will never be checked against NO_ENTRY
> > > + * so it's safe to have an unsigned char array to process 256 entries
> > > + * without causing clash between last entry and NO_ENTRY value.
> > > + */
> > > static int remove_loops(struct branch_entry *l, int nr,
> > > struct iterations *iter)
> > > {
> > > int i, j, off;
> > > - unsigned char chash[CHASHSZ];
> > > + unsigned char chash[PERF_MAX_BRANCH_DEPTH];
> > >
> > > memset(chash, NO_ENTRY, sizeof(chash));
> > >
> > > - BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
> > > + BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
> > >
> > > for (i = 0; i < nr; i++) {
> > > - int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
> > > + /* Remainder division by PERF_MAX_BRANCH_DEPTH is not
> > > + * needed as hash_64 will anyway limit the hash
> > > + * to CHASHBITS
> > > + */
> > > + int h = hash_64(l[i].from, CHASHBITS);
> > >
> > > /* no collision handling for now */
> > > if (chash[h] == NO_ENTRY) {
> > > --
> > > 2.34.1
> > >
next prev parent reply other threads:[~2025-05-21 10:47 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 1/7] perf: Increase the maximum number of samples to 256 Rajnesh Kanwal
2025-02-20 18:51 ` Ian Rogers
2025-04-17 12:51 ` Rajnesh Kanwal
2025-05-21 10:47 ` Rajnesh Kanwal [this message]
2025-05-21 15:36 ` Ian Rogers
2025-05-21 17:40 ` Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 2/7] riscv: pmu: Add Control transfer records CSR definations Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 3/7] riscv: Add Control Transfer Records extension parsing Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
2025-01-17 7:26 ` Krzysztof Kozlowski
2025-01-20 14:31 ` Rajnesh Kanwal
2025-01-20 18:49 ` Conor Dooley
2025-01-16 23:09 ` [PATCH v2 5/7] riscv: pmu: Add infrastructure for Control Transfer Record Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 6/7] riscv: pmu: Add driver for Control Transfer Records Ext Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 7/7] riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver Rajnesh Kanwal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAECbVCui6ZgHBXBr3LdVGd16ERe0GgAnA1zy_zOQZVTU3bPoWw@mail.gmail.com \
--to=rkanwal@rivosinc.com \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=ajones@ventanamicro.com \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=anup@brainfault.org \
--cc=atishp@rivosinc.com \
--cc=beeman@rivosinc.com \
--cc=brauner@kernel.org \
--cc=conor@kernel.org \
--cc=heiko@sntech.de \
--cc=irogers@google.com \
--cc=james.clark@arm.com \
--cc=jisheng.teoh@starfivetech.com \
--cc=jolsa@kernel.org \
--cc=kaiwenxue1@gmail.com \
--cc=kan.liang@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=mingo@redhat.com \
--cc=palmer@dabbelt.com \
--cc=renyu.zj@linux.alibaba.com \
--cc=vincent.chen@sifive.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).