From: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
Kan Liang <kan.liang@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-perf-users@vger.kernel.org,
Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH] perf report: Add 'tgid' sort key
Date: Wed, 12 Feb 2025 17:52:55 -0800 [thread overview]
Message-ID: <Z61Qd7NG4k8DDvc6@google.com> (raw)
In-Reply-To: <CAP-5=fUqcykMdApHVweETg9bp2EVPJhJOj_PR8cByOOA6OyQGw@mail.gmail.com>
On Wed, Feb 12, 2025 at 02:10:42PM -0800, Ian Rogers wrote:
> On Wed, Feb 12, 2025 at 1:59 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Wed, Feb 12, 2025 at 1:07 PM Arnaldo Carvalho de Melo
> > <acme@kernel.org> wrote:
> > >
> > > On Wed, Feb 12, 2025 at 10:05:27PM +0100, Arnaldo Carvalho de Melo wrote:
> > > > On Wed, Feb 05, 2025 at 04:01:37PM -0800, Namhyung Kim wrote:
> > > > > Sometimes we need to analyze the data in process level but current sort
> > > > > keys only work on thread level. Let's add 'tgid' sort key for that as
> > > > > 'pid' is already taken for thread.
> > > > >
> > > > > This will look mostly the same, but it only uses tgid instead of tid.
> > > > > Here's an example of a process with two threads (thloop).
> > > > >
> > > > > $ perf record -- perf test -w thloop
> > > >
> > > > Unrelated, but when building perf with DEBUG=1 and trying to test the
> > > > above I noticed:
> > > >
> > > > root@number:~# perf record -- perf test -w thloop
> > > > [ perf record: Woken up 1 times to write data ]
> > > > [ perf record: Captured and wrote 0.404 MB perf.data (7968 samples) ]
> > > > perf: util/maps.c:95: check_invariants: Assertion `map__end(prev) <= map__end(map)' failed.
> > > > Aborted (core dumped)
> > > > root@number:~# perf record -- perf test -w offcpu
> > > > [ perf record: Woken up 1 times to write data ]
> > > > [ perf record: Captured and wrote 0.040 MB perf.data (23 samples) ]
> > > > perf: util/maps.c:95: check_invariants: Assertion `map__end(prev) <= map__end(map)' failed.
> > > > Aborted (core dumped)
> > > > root@number:~#
> > > >
> > > > I have:
> > > >
> > > > ⬢ [acme@toolbox perf-tools-next]$ git log --oneline perf-tools-next/perf-tools-next..
> > > > 9de1ed6fa3b73cb1 (HEAD -> perf-tools-next) perf report: Add 'tgid' sort key
> > > > 23e98ede2a353530 perf trace: Add --summary-mode option
> > > > e6d6104625a3790b perf tools: Get rid of now-unused rb_resort.h
> > > > 173ec14e72ef4ed7 perf trace: Convert syscall_stats to hashmap
> > > > 66edfb5d404e743d perf trace: Allocate syscall stats only if summary is on
> > > > ca6637e1ea08e6f4 perf parse-events filter: Use evsel__find_pmu()
> > > > bd1ac4a678f7f2c8 perf bench evlist-open-close: Reduce scope of 2 variables
> > > > cd59081880e89df8 perf test: Add direct off-cpu test
> > > > 56cbd794c0c46ba9 perf record --off-cpu: Add --off-cpu-thresh option
> > > > 28d9b19c5455556f perf record --off-cpu: Dump the remaining samples in BPF's stack trace map
> > > > 2bc05b02743b50a7 perf script: Display off-cpu samples correctly
> > > > bfa457a621596947 perf record --off-cpu: Disable perf_event's callchain collection
> > > > eca732cc42d20266 perf evsel: Assemble offcpu samples
> > > > 74ce50e40c569e90 perf record --off-cpu: Dump off-cpu samples in BPF
> > > > e75f8ce63bfa6cb9 perf record --off-cpu: Preparation of off-cpu BPF program
> > > > 0ffab9d26971c91c perf record --off-cpu: Parse off-cpu event
> > > > efc3fe2070853b7d perf evsel: Expose evsel__is_offcpu_event() for future use
> > > > ⬢ [acme@toolbox perf-tools-next]$
> > > >
> > > > locally, that is the stuff I've been testing lately, doubt it is related
> > > > to these patches, I'll investigate later, have to go AFK, so FWIW as a
> > > > heads up.
> > >
> > > Had time to extract this, now going really AFK:
> > >
> > > [New Thread 0x7fffdf24c6c0 (LWP 580622)]
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 0.403 MB perf.data (7948 samples) ]
> > > [Thread 0x7fffdf24c6c0 (LWP 580622) exited]
> > > perf: util/maps.c:95: check_invariants: Assertion `map__end(prev) <= map__end(map)' failed.
> > >
> > > Thread 1 "perf" received signal SIGABRT, Aborted.
> > > Downloading 4.06 K source file /usr/src/debug/glibc-2.39-37.fc40.x86_64/nptl/pthread_kill.c
> > > __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
> > > 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
> > > (gdb) bt
> > > #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
> > > #1 0x00007ffff6ea80a3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
> > > #2 0x00007ffff6e4ef1e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> > > #3 0x00007ffff6e36902 in __GI_abort () at abort.c:79
> > > #4 0x00007ffff6e3681e in __assert_fail_base (fmt=0x7ffff6fc3bb8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7bef08 "map__end(prev) <= map__end(map)",
> > > file=file@entry=0x7bedf8 "util/maps.c", line=line@entry=95, function=function@entry=0x7bf1c0 <__PRETTY_FUNCTION__.6> "check_invariants") at assert.c:96
> > > #5 0x00007ffff6e47047 in __assert_fail (assertion=0x7bef08 "map__end(prev) <= map__end(map)", file=0x7bedf8 "util/maps.c", line=95,
> > > function=0x7bf1c0 <__PRETTY_FUNCTION__.6> "check_invariants") at assert.c:105
> > > #6 0x00000000006347a1 in check_invariants (maps=0xf987e0) at util/maps.c:95
> > > #7 0x0000000000635ae2 in maps__remove (maps=0xf987e0, map=0xf98a80) at util/maps.c:538
> > > #8 0x000000000062afd2 in machine__destroy_kernel_maps (machine=0xf98178) at util/machine.c:1176
> > > #9 0x000000000062b32b in machines__destroy_kernel_maps (machines=0xf98178) at util/machine.c:1238
> > > #10 0x00000000006388af in perf_session__destroy_kernel_maps (session=0xf97f60) at util/session.c:105
> > > #11 0x0000000000638df0 in perf_session__delete (session=0xf97f60) at util/session.c:248
> > > #12 0x0000000000431f18 in __cmd_record (rec=0xecace0 <record>, argc=4, argv=0x7fffffffde60) at builtin-record.c:2888
> > > #13 0x00000000004351fb in cmd_record (argc=4, argv=0x7fffffffde60) at builtin-record.c:4286
> > > #14 0x00000000004bd4d4 in run_builtin (p=0xecddc0 <commands+288>, argc=6, argv=0x7fffffffde60) at perf.c:351
> > > #15 0x00000000004bd77b in handle_internal_command (argc=6, argv=0x7fffffffde60) at perf.c:404
> > > #16 0x00000000004bd8d4 in run_argv (argcp=0x7fffffffdc4c, argv=0x7fffffffdc40) at perf.c:448
> > > #17 0x00000000004bdc1d in main (argc=6, argv=0x7fffffffde60) at perf.c:556
> > > (gdb)
> >
> > So my guess would be that something modified a map and broke the
> > invariants of the maps_by_addresss/maps_by_name. It should be possible
> > to add more check_invariants to work out where this happens.
> >
> > Thanks,
> > Ian
>
> I also suspect this is a regression. If you could bisect to find the
> cause then the fix is probably to not modify a map but clone it,
> change it and then reinsert it into the maps - the insert is called
> maps__fixup_overlap_and_insert so that maps don't overlap one another
> like the invariant check is detecting. Fwiw, in the older rbtree code,
> invariant breakages like this would be silently ignored, so we may
> have a latent bug :-(
Sorry, I also cannot reproduce it on my machine. But I think it's
unrelated to this change since you saw it during record. It'd be nice
if you could bisect.
And I think I forgot to call thread__put() for the leader thread. :)
Will update in v2.
Thanks,
Namhyung
next prev parent reply other threads:[~2025-02-13 1:52 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-06 0:01 [PATCH] perf report: Add 'tgid' sort key Namhyung Kim
2025-02-11 22:43 ` Ian Rogers
2025-02-12 21:05 ` Arnaldo Carvalho de Melo
2025-02-12 21:07 ` Arnaldo Carvalho de Melo
2025-02-12 21:59 ` Ian Rogers
2025-02-12 22:10 ` Ian Rogers
2025-02-13 1:52 ` Namhyung Kim [this message]
2025-02-14 22:22 ` Arnaldo Carvalho de Melo
2025-02-18 20:36 ` Arnaldo Carvalho de Melo
2025-02-18 21:01 ` Arnaldo Carvalho de Melo
2025-02-18 22:03 ` Namhyung Kim
2025-02-19 14:37 ` Arnaldo Carvalho de Melo
2025-02-19 14:47 ` Arnaldo Carvalho de Melo
2025-02-19 21:10 ` Namhyung Kim
2025-02-20 17:12 ` Ian Rogers
2025-02-21 7:04 ` Namhyung Kim
2025-02-24 18:18 ` Ian Rogers
2025-02-25 2:51 ` Namhyung Kim
2025-02-25 4:40 ` Ian Rogers
2025-02-25 7:51 ` Namhyung Kim
2025-02-25 19:07 ` Arnaldo Carvalho de Melo
2025-02-25 19:11 ` Arnaldo Carvalho de Melo
2025-02-25 19:25 ` Arnaldo Carvalho de Melo
2025-02-25 19:48 ` Arnaldo Carvalho de Melo
2025-02-26 19:34 ` Namhyung Kim
2025-02-26 21:37 ` Arnaldo Carvalho de Melo
2025-02-26 21:38 ` Arnaldo Carvalho de Melo
2025-02-26 21:40 ` Arnaldo Carvalho de Melo
2025-02-26 21:49 ` Arnaldo Carvalho de Melo
2025-02-26 22:17 ` Namhyung Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z61Qd7NG4k8DDvc6@google.com \
--to=namhyung@kernel.org \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=eranian@google.com \
--cc=irogers@google.com \
--cc=jolsa@kernel.org \
--cc=kan.liang@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.