linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: linux-perf-users@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>, Jiri Olsa <jolsa@kernel.org>
Subject: Re: [PATCH] Fix (well, work around) perf stat --null segfault
Date: Mon, 1 Dec 2025 18:54:38 +0100	[thread overview]
Message-ID: <aS3WXn4nz9GE87bM@gmail.com> (raw)
In-Reply-To: <CAP-5=fXK8HORZaFUxydEatLV1xXQTcLLBnEMeveJYDHDQPkf_w@mail.gmail.com>


* Ian Rogers <irogers@google.com> wrote:

> Agreed on the trainwreck :-) So NULL is a valid CPU map value. It
> means no CPUs, but magically no CPUs can also be converted in the code
> to become the "any" CPU value of -1. So iterating the perf_cpu_map of
> NULL will yield a single CPU of -1 :-( This behavior existed before my
> time working on the code, I've tried to make the expectation in the
> using code clearer, does the user want the magic "any" CPU behavior?
> "any" CPU is distinct from every/all CPUs which would be something
> like CPUs 0-127, and then does "all" include online and offline CPUs?
> This is something that's been discussed and I think things can be
> better, I'd rather the code was explicit whenever terms like "all",
> "online" and "any" were being used - it is very easy to confuse "all"
> and "any". I've been trying to adjust the code and make it more
> sensible with changes like:
> https://lore.kernel.org/all/20240202234057.2085863-1-irogers@google.com/
> but I agree with you that things are a mess here. At least we
> shouldn't be leaking memory (due to the reference count checking) and
> we shouldn't confuse indices and CPU numbers (due to the introduction
> of a wrapper type), problems that were pervasive and broke things like
> uncore counters frequently before.
> 
> Fwiw, part of me thinks we should change the perf_cpu_map to be a
> cpu_set_t, the "any" case could be an additional boolean in the
> struct. Perhaps we can just auto change "all" to "any" as is somewhat
> done here:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n92
> Then perf_cpu_map can be a cpu_set_t and I think life will be clearer.

Yeah, so when I looked at the data structure a few 
minutes ago:

	/**
	 * A sized, reference counted, sorted array of integers representing CPU
	 * numbers. This is commonly used to capture which CPUs a PMU is associated
	 * with. The indices into the cpumap are frequently used as they avoid having
	 * gaps if CPU numbers were used. For events associated with a pid, rather than
	 * a CPU, a single dummy map with an entry of -1 is used.
	 */
	DECLARE_RC_STRUCT(perf_cpu_map) {
		refcount_t	refcnt;
		/** Length of the map array. */
		int		nr;
		/** The CPU values. */
		struct perf_cpu	map[];
	};


My first thought was "Why isn't this a bitmask?". :-)

The thing is, even with 8192 CPUs, the mask size is 1K. 
That's still not catastrophic compared to the nightmare 
of managing a [8192] array...

And I would keep such core data structures as simple as 
possible: a single continuous bitmask. No sparse 
support, no NULL tricks. Just an always-present 
bitmask, and maybe an any/all modifier flag if the ABIs 
strongly suggest so.

The closer a data structure is to the core concepts of 
a tool, the more important its robust KISS properties 
are - at almost any runtime cost. </handwaving> :)

> Fwiw, even on the kernel side this is borked as I tried to explain in
> this RFC with 0 follow ups:
> https://lore.kernel.org/lkml/20250716223924.825772-1-irogers@google.com/

So that's a neat patch and kudos on the ASCII graphics. :-)

I'd suggest a re-send.

> Imagine the case of an event opened with 1 CPU and a TID. The filter
> will mean the counter only counts on the 1 CPU but the enabled and
> running time will mean the counter will be scaled (on your system) 128
> times!
> All of this matters when you consider different events opened in the
> same evlist and there being combined perf_cpu_maps like the evlist's
> all_cpus value.

> > -       nr = perf_cpu_map__max(evsel_list->core.all_cpus).cpu + 1;
> > +       if (evsel_list->core.all_cpus)
> > +               nr = perf_cpu_map__max(evsel_list->core.all_cpus).cpu + 1;
> > +       else
> > +               nr = 0;
> 
> So I think the better fix is to make max handle a NULL perf_cpu_map.
> That will yield "any" CPU of -1 ( :-( ) and then the -1 + 1 will give
> an empty aggregation map.
> 
> I'll write this up, add the test and see if I can spot similar issues
> wrt NULL in the vicinity in the perf_cpu_map code.

Thank you! In the short run I'd prefer whichever fix 
can restore perf stat --null functionality the fastest. ;-)

Thanks,

	Ingo

  reply	other threads:[~2025-12-01 17:54 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-30 11:43 [PATCH] Fix (well, work around) perf stat --null segfault Ingo Molnar
2025-12-01 16:52 ` Ian Rogers
2025-12-01 17:54   ` Ingo Molnar [this message]
2025-12-01 23:33     ` Ian Rogers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aS3WXn4nz9GE87bM@gmail.com \
    --to=mingo@kernel.org \
    --cc=acme@kernel.org \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=namhyung@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).