All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: linux-perf-users@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>, Jiri Olsa <jolsa@kernel.org>
Subject: Re: [PATCH] Fix (well, work around) perf stat --null segfault
Date: Mon, 1 Dec 2025 18:54:38 +0100	[thread overview]
Message-ID: <aS3WXn4nz9GE87bM@gmail.com> (raw)
In-Reply-To: <CAP-5=fXK8HORZaFUxydEatLV1xXQTcLLBnEMeveJYDHDQPkf_w@mail.gmail.com>


* Ian Rogers <irogers@google.com> wrote:

> Agreed on the trainwreck :-) So NULL is a valid CPU map value. It
> means no CPUs, but magically no CPUs can also be converted in the code
> to become the "any" CPU value of -1. So iterating the perf_cpu_map of
> NULL will yield a single CPU of -1 :-( This behavior existed before my
> time working on the code, I've tried to make the expectation in the
> using code clearer, does the user want the magic "any" CPU behavior?
> "any" CPU is distinct from every/all CPUs which would be something
> like CPUs 0-127, and then does "all" include online and offline CPUs?
> This is something that's been discussed and I think things can be
> better, I'd rather the code was explicit whenever terms like "all",
> "online" and "any" were being used - it is very easy to confuse "all"
> and "any". I've been trying to adjust the code and make it more
> sensible with changes like:
> https://lore.kernel.org/all/20240202234057.2085863-1-irogers@google.com/
> but I agree with you that things are a mess here. At least we
> shouldn't be leaking memory (due to the reference count checking) and
> we shouldn't confuse indices and CPU numbers (due to the introduction
> of a wrapper type), problems that were pervasive and broke things like
> uncore counters frequently before.
> 
> Fwiw, part of me thinks we should change the perf_cpu_map to be a
> cpu_set_t, the "any" case could be an additional boolean in the
> struct. Perhaps we can just auto change "all" to "any" as is somewhat
> done here:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n92
> Then perf_cpu_map can be a cpu_set_t and I think life will be clearer.

Yeah, so when I looked at the data structure a few 
minutes ago:

	/**
	 * A sized, reference counted, sorted array of integers representing CPU
	 * numbers. This is commonly used to capture which CPUs a PMU is associated
	 * with. The indices into the cpumap are frequently used as they avoid having
	 * gaps if CPU numbers were used. For events associated with a pid, rather than
	 * a CPU, a single dummy map with an entry of -1 is used.
	 */
	DECLARE_RC_STRUCT(perf_cpu_map) {
		refcount_t	refcnt;
		/** Length of the map array. */
		int		nr;
		/** The CPU values. */
		struct perf_cpu	map[];
	};


My first thought was "Why isn't this a bitmask?". :-)

The thing is, even with 8192 CPUs, the mask size is 1K. 
That's still not catastrophic compared to the nightmare 
of managing a [8192] array...

And I would keep such core data structures as simple as 
possible: a single continuous bitmask. No sparse 
support, no NULL tricks. Just an always-present 
bitmask, and maybe an any/all modifier flag if the ABIs 
strongly suggest so.

The closer a data structure is to the core concepts of 
a tool, the more important its robust KISS properties 
are - at almost any runtime cost. </handwaving> :)

> Fwiw, even on the kernel side this is borked as I tried to explain in
> this RFC with 0 follow ups:
> https://lore.kernel.org/lkml/20250716223924.825772-1-irogers@google.com/

So that's a neat patch and kudos on the ASCII graphics. :-)

I'd suggest a re-send.

> Imagine the case of an event opened with 1 CPU and a TID. The filter
> will mean the counter only counts on the 1 CPU but the enabled and
> running time will mean the counter will be scaled (on your system) 128
> times!
> All of this matters when you consider different events opened in the
> same evlist and there being combined perf_cpu_maps like the evlist's
> all_cpus value.

> > -       nr = perf_cpu_map__max(evsel_list->core.all_cpus).cpu + 1;
> > +       if (evsel_list->core.all_cpus)
> > +               nr = perf_cpu_map__max(evsel_list->core.all_cpus).cpu + 1;
> > +       else
> > +               nr = 0;
> 
> So I think the better fix is to make max handle a NULL perf_cpu_map.
> That will yield "any" CPU of -1 ( :-( ) and then the -1 + 1 will give
> an empty aggregation map.
> 
> I'll write this up, add the test and see if I can spot similar issues
> wrt NULL in the vicinity in the perf_cpu_map code.

Thank you! In the short run I'd prefer whichever fix 
can restore perf stat --null functionality the fastest. ;-)

Thanks,

	Ingo

  reply	other threads:[~2025-12-01 17:54 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-30 11:43 [PATCH] Fix (well, work around) perf stat --null segfault Ingo Molnar
2025-12-01 16:52 ` Ian Rogers
2025-12-01 17:54   ` Ingo Molnar [this message]
2025-12-01 23:33     ` Ian Rogers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aS3WXn4nz9GE87bM@gmail.com \
    --to=mingo@kernel.org \
    --cc=acme@kernel.org \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=namhyung@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.