From: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Arnaldo Carvalho de Melo" <acme@kernel.org>,
"Mark Rutland" <mark.rutland@arm.com>,
"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
"Jiri Olsa" <jolsa@kernel.org>,
"Adrian Hunter" <adrian.hunter@intel.com>,
"Kan Liang" <kan.liang@linux.intel.com>,
"John Garry" <john.g.garry@oracle.com>,
"Will Deacon" <will@kernel.org>,
"James Clark" <james.clark@linaro.org>,
"Mike Leach" <mike.leach@linaro.org>,
"Leo Yan" <leo.yan@linux.dev>, guoren <guoren@kernel.org>,
"Paul Walmsley" <paul.walmsley@sifive.com>,
"Palmer Dabbelt" <palmer@dabbelt.com>,
"Albert Ou" <aou@eecs.berkeley.edu>,
"Charlie Jenkins" <charlie@rivosinc.com>,
"Bibo Mao" <maobibo@loongson.cn>,
"Huacai Chen" <chenhuacai@kernel.org>,
"Catalin Marinas" <catalin.marinas@arm.com>,
"Jiri Slaby" <jirislaby@kernel.org>,
"Björn Töpel" <bjorn@rivosinc.com>,
"Howard Chu" <howardchu95@gmail.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
linux-riscv@lists.infradead.org, linux-mips@vger.kernel.org,
"Arnd Bergmann" <arnd@arndb.de>
Subject: Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
Date: Wed, 26 Feb 2025 23:24:27 -0800 [thread overview]
Message-ID: <Z8ATKyAhb-67NGIM@google.com> (raw)
In-Reply-To: <CAP-5=fVAuCF4i8HQEGYx_ApcSuBg82pXCsgk98oU9-2Gm=NOcw@mail.gmail.com>
On Wed, Feb 26, 2025 at 09:24:15PM -0800, Ian Rogers wrote:
> On Wed, Feb 26, 2025 at 4:00 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Mon, Feb 24, 2025 at 08:22:50PM -0800, Ian Rogers wrote:
> > > On Mon, Feb 24, 2025 at 7:20 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> > > > > This work builds on the clean up of system call tables and removal of
> > > > > libaudit by Charlie Jenkins <charlie@rivosinc.com>.
> > > > >
> > > > > The system call table in perf trace is used to map system call numbers
> > > > > to names and vice versa. Prior to these changes, a single table
> > > > > matching the perf binary's build was present. The table would be
> > > > > incorrect if tracing say a 32-bit binary from a 64-bit version of
> > > > > perf, the names and numbers wouldn't match.
> > > > >
> > > > > Change the build so that a single system call file is built and the
> > > > > potentially multiple tables are identifiable from the ELF machine type
> > > > > of the process being examined. To determine the ELF machine type, the
> > > > > executable's header is read from /proc/pid/exe with fallbacks to using
> > > > > the perf's binary type when unknown.
> > > >
> > > > Hmm.. then this is limited to live mode and potentially detect wrong
> > > > machine type if it reads an old data, right?
> > > >
> > > > Also IIUC fallback to the perf binary means it cannot use cross-machine
> > > > table. For example, it cannot process data from ARM64 on x86, no? It
> > > > seems it should use perf_env.arch.
> > >
> > > The perf env arch is kind of horrid. On x86 it has the value x86 and
> > > then there is an extra 64bit flag, who knows how x32 should be encoded
> > > - but we barely support x32 as-is. I'd rather we added a new feature
> > > for the e_machine/e_flags of the executable and worked with those, but
> > > it is kind of weird with doing system wide mode. I didn't want to drag
> > > that into this patch series anyway as there is already enough here.
> >
> > Right, I don't know how to handle x32 properly. Maybe we can just
> > ignore it for now.
> >
> > But anyway looking at /proc/PID for recorded data doesn't seem correct.
> > Can you please add a flag to do that only from trace__run() and just use
> > EM_HOST for trace__replay()?
>
> So I was hoping at some later point the e_machine on the thread could
> be populated from the data file - hence the accessor being on thread
> and not part of the trace code.
Fair enough.
> We could add a global flag to thread
> to disable the reading from /proc but we do similar reading in
> machine.c for /proc/version, /proc/kallsyms, /proc/modules, etc.
You can add a flag to struct trace and only care about the perf trace
use case - whether to call thread__get_e_machine() or not.
In general, reading /proc from perf record is fine. But doing that from
perf report or similar is not good. You don't need to fix them, if any,
with this change. But let's not introduce more bugs.
> I think the chance a pid is recycled and the process has a different
> e_machine are remote enough that it is similar in nature. Adding the
> flag means we need to go and fix up all uses, we only need to set the
> flag in builtin-trace.c currently, but we've been historically bad at
> setting these globals and bugs creep in. I also don't think
> record/replay is working well and I didn't want the syscalltbl cleanup
> to turn into a perf trace record/replay fixing exercise.
Yep, please see above. Anyway I think record/replay on the same machine
is working well.
Thanks,
Namhyung
>
> > Later, we may need to add a misc flag or so to PERF_RECORD_FORK (and
> > PERF_RECORD_COMM with MISC_COMM_EXEC) to indicate non-standard ABI for a
> > new thread. But it's not clear how to make it arch-independent.
> >
> > >
> > > > One more concern is BPF. The BPF should know about the ABI of the
> > > > current process so that it can augment the syscall arguments correctly.
> > > > Currently it only checks the syscall number but it can be different on
> > > > 32-bit and 64-bit.
> > >
> > > That's right. This change is trying to clean up
> > > tools/perf/util/syscalltbl.c and the perf trace usage. I didn't go as
> > > far as making BPF programs pair system call number with e_machine and
> > > e_flags, there is enough here and the behavior after these patches
> > > matches the behavior before - that is to assume the system call ABI
> > > matches that of the perf binary.
> >
> > Right, the next step would be adding a BPF kfunc to identify the current
> > ABI.
> >
> > Thanks,
> > Namhyung
> >
prev parent reply other threads:[~2025-02-27 7:24 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-19 18:56 [PATCH v3 0/8] perf: Support multiple system call tables in the build Ian Rogers
2025-02-19 18:56 ` [PATCH v3 1/8] perf syscalltble: Remove syscall_table.h Ian Rogers
2025-02-19 18:56 ` [PATCH v3 2/8] perf trace: Reorganize syscalls Ian Rogers
2025-02-19 18:56 ` [PATCH v3 3/8] perf syscalltbl: Remove struct syscalltbl Ian Rogers
2025-02-19 18:56 ` [PATCH v3 4/8] perf thread: Add support for reading the e_machine type for a thread Ian Rogers
2025-02-19 18:56 ` [PATCH v3 5/8] perf trace beauty: Add syscalltbl.sh generating all system call tables Ian Rogers
2025-02-19 18:56 ` [PATCH v3 6/8] perf syscalltbl: Use lookup table containing multiple architectures Ian Rogers
2025-02-19 18:56 ` [PATCH v3 7/8] perf build: Remove Makefile.syscalls Ian Rogers
2025-02-19 18:56 ` [PATCH v3 8/8] perf syscalltbl: Mask off ABI type for MIPS system calls Ian Rogers
2025-02-25 3:05 ` [PATCH v3 0/8] perf: Support multiple system call tables in the build Namhyung Kim
2025-02-25 4:37 ` Ian Rogers
2025-02-25 5:40 ` Namhyung Kim
2025-02-26 2:47 ` Namhyung Kim
2025-02-26 23:47 ` Namhyung Kim
2025-02-25 3:20 ` Namhyung Kim
2025-02-25 4:22 ` Ian Rogers
2025-02-27 0:00 ` Namhyung Kim
2025-02-27 5:24 ` Ian Rogers
2025-02-27 7:24 ` Namhyung Kim [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z8ATKyAhb-67NGIM@google.com \
--to=namhyung@kernel.org \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=aou@eecs.berkeley.edu \
--cc=arnd@arndb.de \
--cc=bjorn@rivosinc.com \
--cc=catalin.marinas@arm.com \
--cc=charlie@rivosinc.com \
--cc=chenhuacai@kernel.org \
--cc=guoren@kernel.org \
--cc=howardchu95@gmail.com \
--cc=irogers@google.com \
--cc=james.clark@linaro.org \
--cc=jirislaby@kernel.org \
--cc=john.g.garry@oracle.com \
--cc=jolsa@kernel.org \
--cc=kan.liang@linux.intel.com \
--cc=leo.yan@linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-csky@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=maobibo@loongson.cn \
--cc=mark.rutland@arm.com \
--cc=mike.leach@linaro.org \
--cc=mingo@redhat.com \
--cc=palmer@dabbelt.com \
--cc=paul.walmsley@sifive.com \
--cc=peterz@infradead.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).