* [PATCH v3 0/8] perf: Support multiple system call tables in the build
From: Ian Rogers @ 2025-02-19 18:56 UTC
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv, linux-mips,
Arnd Bergmann
This work builds on the cleanup of system call tables and removal of
libaudit by Charlie Jenkins <charlie@rivosinc.com>.
The system call table in perf trace is used to map system call numbers
to names and vice versa. Prior to these changes, a single table
matching the perf binary's build was present. The table would be
incorrect when tracing, say, a 32-bit binary from a 64-bit build of
perf: the names and numbers wouldn't match.
Change the build so that a single system call file is built and the
potentially multiple tables in it are identified by the ELF machine
type of the process being examined. To determine the ELF machine type,
the executable's header is read from /proc/pid/exe, falling back to
the perf binary's own machine type when it cannot be determined.
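As a rough illustration of the /proc/pid/exe lookup described above,
here is a minimal standalone sketch. It is not the code from this
series: sketch_read_e_machine() and its fallback parameter are
hypothetical names. It relies only on the ELF layout, where e_machine
is the 16-bit field at byte offset 18 in both ELF32 and ELF64 headers,
stored in the byte order given by e_ident[EI_DATA].

```c
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of perf: return the e_machine of the
 * executable behind /proc/<pid>/exe, or 'fallback' if the file cannot
 * be opened or does not start with a valid ELF header.
 */
static uint16_t sketch_read_e_machine(pid_t pid, uint16_t fallback)
{
	/* e_ident (16 bytes) + e_type (2 bytes) + e_machine (2 bytes). */
	unsigned char hdr[20];
	char path[64];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/proc/%d/exe", (int)pid);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return fallback;

	n = read(fd, hdr, sizeof(hdr));
	close(fd);
	if (n != (ssize_t)sizeof(hdr) || memcmp(hdr, ELFMAG, SELFMAG) != 0)
		return fallback;

	/* Decode the two e_machine bytes according to the file's endianness. */
	if (hdr[EI_DATA] == ELFDATA2MSB)
		return (uint16_t)((hdr[18] << 8) | hdr[19]);
	return (uint16_t)(hdr[18] | (hdr[19] << 8));
}
```

Calling sketch_read_e_machine(getpid(), EM_NONE) on a system with
/proc mounted would be expected to return the machine type the calling
binary was built for; passing the tool's own machine type as the
fallback mirrors the fallback behavior described above.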
Remove some runtime types used by the system call tables and make
equivalents generated at build time.
v3: Add Charlie's reviewed-by tags. Incorporate feedback from Arnd
Bergmann <arnd@arndb.de> on the additional optional column and MIPS
system call numbering. Rebase past Namhyung's global system call
statistics and add comments that they don't yet support an
e_machine other than EM_HOST.
v2: Change the 1 element cache for the last table as suggested by
Howard Chu, add Howard's reviewed-by tags.
Add a comment and apology to Charlie for not doing better in
guiding:
https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
After discussion on v1, he agreed this patch series would be
the better direction.
Ian Rogers (8):
perf syscalltble: Remove syscall_table.h
perf trace: Reorganize syscalls
perf syscalltbl: Remove struct syscalltbl
perf thread: Add support for reading the e_machine type for a thread
perf trace beauty: Add syscalltbl.sh generating all system call tables
perf syscalltbl: Use lookup table containing multiple architectures
perf build: Remove Makefile.syscalls
perf syscalltbl: Mask off ABI type for MIPS system calls
tools/perf/Makefile.perf | 10 +-
tools/perf/arch/alpha/entry/syscalls/Kbuild | 2 -
.../alpha/entry/syscalls/Makefile.syscalls | 5 -
tools/perf/arch/alpha/include/syscall_table.h | 2 -
tools/perf/arch/arc/entry/syscalls/Kbuild | 2 -
.../arch/arc/entry/syscalls/Makefile.syscalls | 3 -
tools/perf/arch/arc/include/syscall_table.h | 2 -
tools/perf/arch/arm/entry/syscalls/Kbuild | 4 -
.../arch/arm/entry/syscalls/Makefile.syscalls | 2 -
tools/perf/arch/arm/include/syscall_table.h | 2 -
tools/perf/arch/arm64/entry/syscalls/Kbuild | 3 -
.../arm64/entry/syscalls/Makefile.syscalls | 6 -
tools/perf/arch/arm64/include/syscall_table.h | 8 -
tools/perf/arch/csky/entry/syscalls/Kbuild | 2 -
.../csky/entry/syscalls/Makefile.syscalls | 3 -
tools/perf/arch/csky/include/syscall_table.h | 2 -
.../perf/arch/loongarch/entry/syscalls/Kbuild | 2 -
.../entry/syscalls/Makefile.syscalls | 3 -
.../arch/loongarch/include/syscall_table.h | 2 -
tools/perf/arch/mips/entry/syscalls/Kbuild | 2 -
.../mips/entry/syscalls/Makefile.syscalls | 5 -
tools/perf/arch/mips/include/syscall_table.h | 2 -
tools/perf/arch/parisc/entry/syscalls/Kbuild | 3 -
.../parisc/entry/syscalls/Makefile.syscalls | 6 -
.../perf/arch/parisc/include/syscall_table.h | 8 -
tools/perf/arch/powerpc/entry/syscalls/Kbuild | 3 -
.../powerpc/entry/syscalls/Makefile.syscalls | 6 -
.../perf/arch/powerpc/include/syscall_table.h | 8 -
tools/perf/arch/riscv/entry/syscalls/Kbuild | 2 -
.../riscv/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/arch/riscv/include/syscall_table.h | 8 -
tools/perf/arch/s390/entry/syscalls/Kbuild | 2 -
.../s390/entry/syscalls/Makefile.syscalls | 5 -
tools/perf/arch/s390/include/syscall_table.h | 2 -
tools/perf/arch/sh/entry/syscalls/Kbuild | 2 -
.../arch/sh/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/arch/sh/include/syscall_table.h | 2 -
tools/perf/arch/sparc/entry/syscalls/Kbuild | 3 -
.../sparc/entry/syscalls/Makefile.syscalls | 5 -
tools/perf/arch/sparc/include/syscall_table.h | 8 -
tools/perf/arch/x86/entry/syscalls/Kbuild | 3 -
.../arch/x86/entry/syscalls/Makefile.syscalls | 6 -
tools/perf/arch/x86/include/syscall_table.h | 8 -
tools/perf/arch/xtensa/entry/syscalls/Kbuild | 2 -
.../xtensa/entry/syscalls/Makefile.syscalls | 4 -
.../perf/arch/xtensa/include/syscall_table.h | 2 -
tools/perf/builtin-trace.c | 290 +++++++++++-------
tools/perf/scripts/Makefile.syscalls | 61 ----
tools/perf/scripts/syscalltbl.sh | 86 ------
tools/perf/trace/beauty/syscalltbl.sh | 274 +++++++++++++++++
tools/perf/util/syscalltbl.c | 148 ++++-----
tools/perf/util/syscalltbl.h | 22 +-
tools/perf/util/thread.c | 50 +++
tools/perf/util/thread.h | 14 +-
54 files changed, 616 insertions(+), 509 deletions(-)
delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/alpha/include/syscall_table.h
delete mode 100644 tools/perf/arch/arc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arc/include/syscall_table.h
delete mode 100644 tools/perf/arch/arm/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arm/include/syscall_table.h
delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arm64/include/syscall_table.h
delete mode 100644 tools/perf/arch/csky/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/csky/include/syscall_table.h
delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/loongarch/include/syscall_table.h
delete mode 100644 tools/perf/arch/mips/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/mips/include/syscall_table.h
delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/parisc/include/syscall_table.h
delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/powerpc/include/syscall_table.h
delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/riscv/include/syscall_table.h
delete mode 100644 tools/perf/arch/s390/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/s390/include/syscall_table.h
delete mode 100644 tools/perf/arch/sh/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/sh/include/syscall_table.h
delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/sparc/include/syscall_table.h
delete mode 100644 tools/perf/arch/x86/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/x86/include/syscall_table.h
delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/xtensa/include/syscall_table.h
delete mode 100644 tools/perf/scripts/Makefile.syscalls
delete mode 100755 tools/perf/scripts/syscalltbl.sh
create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
--
2.48.1.601.g30ceb7b040-goog
* [PATCH v3 1/8] perf syscalltble: Remove syscall_table.h
From: Ian Rogers @ 2025-02-19 18:56 UTC
The definition of "static const char *const syscalltbl[] = {" is done
in a generated syscalls_32.h or syscalls_64.h that is architecture
dependent. In order to include the appropriate file a syscall_table.h
is found via the perf include path and it includes the syscalls_32.h
or syscalls_64.h as appropriate.
To support having multiple syscall tables, one for 32-bit and one for
64-bit, or for different architectures, an include path cannot be
used. Remove syscall_table.h because of this and inline what it does
into syscalltbl.c.
For architectures without a syscall_table.h this will cause a failure
to include either syscalls_32.h or syscalls_64.h rather than a failure
to include syscall_table.h. For architectures that only included one
or the other, the choice now follows __BITS_PER_LONG, as was previously
done on architectures supporting both syscalls_32.h and syscalls_64.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
tools/perf/arch/alpha/include/syscall_table.h | 2 --
tools/perf/arch/arc/include/syscall_table.h | 2 --
tools/perf/arch/arm/include/syscall_table.h | 2 --
tools/perf/arch/arm64/include/syscall_table.h | 8 --------
tools/perf/arch/csky/include/syscall_table.h | 2 --
tools/perf/arch/loongarch/include/syscall_table.h | 2 --
tools/perf/arch/mips/include/syscall_table.h | 2 --
tools/perf/arch/parisc/include/syscall_table.h | 8 --------
tools/perf/arch/powerpc/include/syscall_table.h | 8 --------
tools/perf/arch/riscv/include/syscall_table.h | 8 --------
tools/perf/arch/s390/include/syscall_table.h | 2 --
tools/perf/arch/sh/include/syscall_table.h | 2 --
tools/perf/arch/sparc/include/syscall_table.h | 8 --------
tools/perf/arch/x86/include/syscall_table.h | 8 --------
tools/perf/arch/xtensa/include/syscall_table.h | 2 --
tools/perf/util/syscalltbl.c | 8 +++++++-
16 files changed, 7 insertions(+), 67 deletions(-)
delete mode 100644 tools/perf/arch/alpha/include/syscall_table.h
delete mode 100644 tools/perf/arch/arc/include/syscall_table.h
delete mode 100644 tools/perf/arch/arm/include/syscall_table.h
delete mode 100644 tools/perf/arch/arm64/include/syscall_table.h
delete mode 100644 tools/perf/arch/csky/include/syscall_table.h
delete mode 100644 tools/perf/arch/loongarch/include/syscall_table.h
delete mode 100644 tools/perf/arch/mips/include/syscall_table.h
delete mode 100644 tools/perf/arch/parisc/include/syscall_table.h
delete mode 100644 tools/perf/arch/powerpc/include/syscall_table.h
delete mode 100644 tools/perf/arch/riscv/include/syscall_table.h
delete mode 100644 tools/perf/arch/s390/include/syscall_table.h
delete mode 100644 tools/perf/arch/sh/include/syscall_table.h
delete mode 100644 tools/perf/arch/sparc/include/syscall_table.h
delete mode 100644 tools/perf/arch/x86/include/syscall_table.h
delete mode 100644 tools/perf/arch/xtensa/include/syscall_table.h
diff --git a/tools/perf/arch/alpha/include/syscall_table.h b/tools/perf/arch/alpha/include/syscall_table.h
deleted file mode 100644
index b53e31c15805..000000000000
--- a/tools/perf/arch/alpha/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_64.h>
diff --git a/tools/perf/arch/arc/include/syscall_table.h b/tools/perf/arch/arc/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/arc/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/arch/arm/include/syscall_table.h b/tools/perf/arch/arm/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/arm/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/arch/arm64/include/syscall_table.h b/tools/perf/arch/arm64/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/arm64/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/csky/include/syscall_table.h b/tools/perf/arch/csky/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/csky/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/arch/loongarch/include/syscall_table.h b/tools/perf/arch/loongarch/include/syscall_table.h
deleted file mode 100644
index 9d0646d3455c..000000000000
--- a/tools/perf/arch/loongarch/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscall_table_64.h>
diff --git a/tools/perf/arch/mips/include/syscall_table.h b/tools/perf/arch/mips/include/syscall_table.h
deleted file mode 100644
index b53e31c15805..000000000000
--- a/tools/perf/arch/mips/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_64.h>
diff --git a/tools/perf/arch/parisc/include/syscall_table.h b/tools/perf/arch/parisc/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/parisc/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/powerpc/include/syscall_table.h b/tools/perf/arch/powerpc/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/powerpc/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/riscv/include/syscall_table.h b/tools/perf/arch/riscv/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/riscv/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/s390/include/syscall_table.h b/tools/perf/arch/s390/include/syscall_table.h
deleted file mode 100644
index b53e31c15805..000000000000
--- a/tools/perf/arch/s390/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_64.h>
diff --git a/tools/perf/arch/sh/include/syscall_table.h b/tools/perf/arch/sh/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/sh/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/arch/sparc/include/syscall_table.h b/tools/perf/arch/sparc/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/sparc/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/x86/include/syscall_table.h b/tools/perf/arch/x86/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/x86/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/xtensa/include/syscall_table.h b/tools/perf/arch/xtensa/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/xtensa/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 928aca4cd6e9..2f76241494c8 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -7,13 +7,19 @@
#include "syscalltbl.h"
#include <stdlib.h>
+#include <asm/bitsperlong.h>
#include <linux/compiler.h>
#include <linux/zalloc.h>
#include <string.h>
#include "string2.h"
-#include <syscall_table.h>
+#if __BITS_PER_LONG == 64
+ #include <asm/syscalls_64.h>
+#else
+ #include <asm/syscalls_32.h>
+#endif
+
const int syscalltbl_native_max_id = SYSCALLTBL_MAX_ID;
static const char *const *syscalltbl_native = syscalltbl;
--
2.48.1.601.g30ceb7b040-goog
* [PATCH v3 2/8] perf trace: Reorganize syscalls
From: Ian Rogers @ 2025-02-19 18:56 UTC
Identify struct syscall information in the syscalls table by a machine
type and syscall number, not just system call number. Having the
machine type means that 32-bit system calls can be differentiated from
64-bit ones on a machine capable of both. Having a table for all
machine types and all system call numbers would be too large, so
maintain a sorted array of system calls as they are encountered.
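To make the data structure described above concrete outside of
builtin-trace.c, here is a minimal standalone sketch of a sorted array
keyed by (e_machine, id) with find-or-insert semantics. The names
sc_entry, sc_table and sc_table_find_or_add() are hypothetical and the
entry carries only the key; the comparator uses explicit comparisons
rather than subtraction purely as a choice for this sketch.

```c
#include <stdlib.h>

/* Hypothetical, reduced entry: only the two-part lookup key. */
struct sc_entry {
	int e_machine;
	int id;
};

struct sc_table {
	struct sc_entry *entries;	/* Sorted by (e_machine, id). */
	size_t nr;
};

static int sc_entry_cmp(const void *va, const void *vb)
{
	const struct sc_entry *a = va, *b = vb;

	if (a->e_machine != b->e_machine)
		return a->e_machine < b->e_machine ? -1 : 1;
	return (a->id > b->id) - (a->id < b->id);
}

/* Return the entry for (e_machine, id), adding it if not yet present. */
static struct sc_entry *sc_table_find_or_add(struct sc_table *t, int e_machine, int id)
{
	struct sc_entry key = { .e_machine = e_machine, .id = id };
	struct sc_entry *e = NULL, *tmp;

	if (t->nr)
		e = bsearch(&key, t->entries, t->nr, sizeof(key), sc_entry_cmp);
	if (e)
		return e;

	tmp = realloc(t->entries, (t->nr + 1) * sizeof(*tmp));
	if (!tmp)
		return NULL;

	t->entries = tmp;
	t->entries[t->nr++] = key;
	/* Restore sorted order, then look the new entry up at its final position. */
	qsort(t->entries, t->nr, sizeof(*tmp), sc_entry_cmp);
	return bsearch(&key, t->entries, t->nr, sizeof(key), sc_entry_cmp);
}
```

One property of this design worth keeping in mind: a pointer returned
by sc_table_find_or_add() is only valid until the next insertion, since
both realloc() and qsort() may move entries. The table grows only with
the (machine, number) pairs actually seen, which is what keeps it small.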
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
tools/perf/builtin-trace.c | 177 ++++++++++++++++++++++++-------------
1 file changed, 118 insertions(+), 59 deletions(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index f55a8a6481f2..eb3551fb0e7b 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -66,6 +66,7 @@
#include "syscalltbl.h"
#include "../perf.h"
#include "trace_augment.h"
+#include "dwarf-regs.h"
#include <errno.h>
#include <inttypes.h>
@@ -86,6 +87,7 @@
#include <linux/ctype.h>
#include <perf/mmap.h>
+#include <tools/libc_compat.h>
#ifdef HAVE_LIBTRACEEVENT
#include <event-parse.h>
@@ -149,7 +151,10 @@ struct trace {
struct perf_tool tool;
struct syscalltbl *sctbl;
struct {
+ /** Sorted syscall numbers used by the trace. */
struct syscall *table;
+ /** Size of table. */
+ size_t table_size;
struct {
struct evsel *sys_enter,
*sys_exit,
@@ -1454,22 +1459,37 @@ static const struct syscall_fmt *syscall_fmt__find_by_alias(const char *alias)
return __syscall_fmt__find_by_alias(syscall_fmts, nmemb, alias);
}
-/*
- * is_exit: is this "exit" or "exit_group"?
- * is_open: is this "open" or "openat"? To associate the fd returned in sys_exit with the pathname in sys_enter.
- * args_size: sum of the sizes of the syscall arguments, anything after that is augmented stuff: pathname for openat, etc.
- * nonexistent: Just a hole in the syscall table, syscall id not allocated
+/**
+ * struct syscall
*/
struct syscall {
+ /** @e_machine: The ELF machine associated with the entry. */
+ int e_machine;
+ /** @id: id value from the tracepoint, the system call number. */
+ int id;
struct tep_event *tp_format;
int nr_args;
+ /**
+ * @args_size: sum of the sizes of the syscall arguments, anything
+ * after that is augmented stuff: pathname for openat, etc.
+ */
+
int args_size;
struct {
struct bpf_program *sys_enter,
*sys_exit;
} bpf_prog;
+ /** @is_exit: is this "exit" or "exit_group"? */
bool is_exit;
+ /**
+ * @is_open: is this "open" or "openat"? To associate the fd returned in
+ * sys_exit with the pathname in sys_enter.
+ */
bool is_open;
+ /**
+ * @nonexistent: Name lookup failed. Just a hole in the syscall table,
+ * syscall id not allocated.
+ */
bool nonexistent;
bool use_btf;
struct tep_format_field *args;
@@ -2107,22 +2127,21 @@ static int syscall__set_arg_fmts(struct syscall *sc)
return 0;
}
-static int trace__read_syscall_info(struct trace *trace, int id)
+static int syscall__read_info(struct syscall *sc, struct trace *trace)
{
char tp_name[128];
- struct syscall *sc;
- const char *name = syscalltbl__name(trace->sctbl, id);
+ const char *name;
int err;
- if (trace->syscalls.table == NULL) {
- trace->syscalls.table = calloc(trace->sctbl->syscalls.max_id + 1, sizeof(*sc));
- if (trace->syscalls.table == NULL)
- return -ENOMEM;
- }
- sc = trace->syscalls.table + id;
if (sc->nonexistent)
return -EEXIST;
+ if (sc->name) {
+ /* Info already read. */
+ return 0;
+ }
+
+ name = syscalltbl__name(trace->sctbl, sc->id);
if (name == NULL) {
sc->nonexistent = true;
return -EEXIST;
@@ -2145,15 +2164,16 @@ static int trace__read_syscall_info(struct trace *trace, int id)
*/
if (IS_ERR(sc->tp_format)) {
sc->nonexistent = true;
- return PTR_ERR(sc->tp_format);
+ err = PTR_ERR(sc->tp_format);
+ sc->tp_format = NULL;
+ return err;
}
/*
* The tracepoint format contains __syscall_nr field, so it's one more
* than the actual number of syscall arguments.
*/
- if (syscall__alloc_arg_fmts(sc, IS_ERR(sc->tp_format) ?
- RAW_SYSCALL_ARGS_NUM : sc->tp_format->format.nr_fields - 1))
+ if (syscall__alloc_arg_fmts(sc, sc->tp_format->format.nr_fields - 1))
return -ENOMEM;
sc->args = sc->tp_format->format.fields;
@@ -2442,13 +2462,67 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
return printed;
}
+static void syscall__init(struct syscall *sc, int e_machine, int id)
+{
+ memset(sc, 0, sizeof(*sc));
+ sc->e_machine = e_machine;
+ sc->id = id;
+}
+
+static void syscall__exit(struct syscall *sc)
+{
+ if (!sc)
+ return;
+
+ zfree(&sc->arg_fmt);
+}
+
+static int syscall__cmp(const void *va, const void *vb)
+{
+ const struct syscall *a = va, *b = vb;
+
+ if (a->e_machine != b->e_machine)
+ return a->e_machine - b->e_machine;
+
+ return a->id - b->id;
+}
+
+static struct syscall *trace__find_syscall(struct trace *trace, int e_machine, int id)
+{
+ struct syscall key = {
+ .e_machine = e_machine,
+ .id = id,
+ };
+ struct syscall *sc, *tmp;
+
+ sc = bsearch(&key, trace->syscalls.table, trace->syscalls.table_size,
+ sizeof(struct syscall), syscall__cmp);
+ if (sc)
+ return sc;
+
+ tmp = reallocarray(trace->syscalls.table, trace->syscalls.table_size + 1,
+ sizeof(struct syscall));
+ if (!tmp)
+ return NULL;
+
+ trace->syscalls.table = tmp;
+ sc = &trace->syscalls.table[trace->syscalls.table_size++];
+ syscall__init(sc, e_machine, id);
+ qsort(trace->syscalls.table, trace->syscalls.table_size, sizeof(struct syscall),
+ syscall__cmp);
+ sc = bsearch(&key, trace->syscalls.table, trace->syscalls.table_size,
+ sizeof(struct syscall), syscall__cmp);
+ return sc;
+}
+
typedef int (*tracepoint_handler)(struct trace *trace, struct evsel *evsel,
union perf_event *event,
struct perf_sample *sample);
-static struct syscall *trace__syscall_info(struct trace *trace,
- struct evsel *evsel, int id)
+static struct syscall *trace__syscall_info(struct trace *trace, struct evsel *evsel,
+ int e_machine, int id)
{
+ struct syscall *sc;
int err = 0;
if (id < 0) {
@@ -2473,28 +2547,20 @@ static struct syscall *trace__syscall_info(struct trace *trace,
err = -EINVAL;
- if (id > trace->sctbl->syscalls.max_id) {
- goto out_cant_read;
- }
-
- if ((trace->syscalls.table == NULL || trace->syscalls.table[id].name == NULL) &&
- (err = trace__read_syscall_info(trace, id)) != 0)
- goto out_cant_read;
+ sc = trace__find_syscall(trace, e_machine, id);
+ if (sc)
+ err = syscall__read_info(sc, trace);
- if (trace->syscalls.table && trace->syscalls.table[id].nonexistent)
- goto out_cant_read;
-
- return &trace->syscalls.table[id];
-
-out_cant_read:
- if (verbose > 0) {
+ if (err && verbose > 0) {
char sbuf[STRERR_BUFSIZE];
- fprintf(trace->output, "Problems reading syscall %d: %d (%s)", id, -err, str_error_r(-err, sbuf, sizeof(sbuf)));
- if (id <= trace->sctbl->syscalls.max_id && trace->syscalls.table[id].name != NULL)
- fprintf(trace->output, "(%s)", trace->syscalls.table[id].name);
+
+ fprintf(trace->output, "Problems reading syscall %d: %d (%s)", id, -err,
+ str_error_r(-err, sbuf, sizeof(sbuf)));
+ if (sc && sc->name)
+ fprintf(trace->output, "(%s)", sc->name);
fputs(" information\n", trace->output);
}
- return NULL;
+ return err ? NULL : sc;
}
struct syscall_stats {
@@ -2643,14 +2709,6 @@ static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sam
return NULL;
}
-static void syscall__exit(struct syscall *sc)
-{
- if (!sc)
- return;
-
- zfree(&sc->arg_fmt);
-}
-
static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
union perf_event *event __maybe_unused,
struct perf_sample *sample)
@@ -2662,7 +2720,7 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
int augmented_args_size = 0;
void *augmented_args = NULL;
- struct syscall *sc = trace__syscall_info(trace, evsel, id);
+ struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
struct thread_trace *ttrace;
if (sc == NULL)
@@ -2736,7 +2794,7 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
struct thread_trace *ttrace;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
- struct syscall *sc = trace__syscall_info(trace, evsel, id);
+ struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
char msg[1024];
void *args, *augmented_args = NULL;
int augmented_args_size;
@@ -2811,7 +2869,7 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
int alignment = trace->args_alignment;
- struct syscall *sc = trace__syscall_info(trace, evsel, id);
+ struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
struct thread_trace *ttrace;
if (sc == NULL)
@@ -3164,7 +3222,7 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
if (evsel == trace->syscalls.events.bpf_output) {
int id = perf_evsel__sc_tp_uint(evsel, id, sample);
- struct syscall *sc = trace__syscall_info(trace, evsel, id);
+ struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
if (sc) {
fprintf(trace->output, "%s(", sc->name);
@@ -3673,7 +3731,7 @@ static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, str
static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
if (sc == NULL)
return;
@@ -3684,20 +3742,20 @@ static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
return sc ? bpf_program__fd(sc->bpf_prog.sys_enter) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
{
struct tep_format_field *field;
- struct syscall *sc = trace__syscall_info(trace, NULL, key);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
const struct btf_type *bt;
char *struct_offset, *tmp, name[32];
bool can_augment = false;
@@ -3795,7 +3853,7 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
try_to_find_pair:
for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
int id = syscalltbl__id_at_idx(trace->sctbl, i);
- struct syscall *pair = trace__syscall_info(trace, NULL, id);
+ struct syscall *pair = trace__syscall_info(trace, NULL, EM_HOST, id);
struct bpf_program *pair_prog;
bool is_candidate = false;
@@ -3945,7 +4003,7 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
*/
for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
int key = syscalltbl__id_at_idx(trace->sctbl, i);
- struct syscall *sc = trace__syscall_info(trace, NULL, key);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
struct bpf_program *pair_prog;
int prog_fd;
@@ -4760,7 +4818,10 @@ static size_t syscall__dump_stats(struct trace *trace, FILE *fp,
pct = avg ? 100.0 * stddev_stats(&stats->stats) / avg : 0.0;
avg /= NSEC_PER_MSEC;
- sc = &trace->syscalls.table[entry->syscall];
+ sc = trace__syscall_info(trace, /*evsel=*/NULL, EM_HOST, entry->syscall);
+ if (!sc)
+ continue;
+
printed += fprintf(fp, " %-15s", sc->name);
printed += fprintf(fp, " %8" PRIu64 " %6" PRIu64 " %9.3f %9.3f %9.3f",
n, stats->nr_failures, entry->msecs, min, avg);
@@ -5217,12 +5278,10 @@ static int trace__config(const char *var, const char *value, void *arg)
static void trace__exit(struct trace *trace)
{
- int i;
-
strlist__delete(trace->ev_qualifier);
zfree(&trace->ev_qualifier_ids.entries);
if (trace->syscalls.table) {
- for (i = 0; i <= trace->sctbl->syscalls.max_id; i++)
+ for (size_t i = 0; i < trace->syscalls.table_size; i++)
syscall__exit(&trace->syscalls.table[i]);
zfree(&trace->syscalls.table);
}
--
2.48.1.601.g30ceb7b040-goog
* [PATCH v3 3/8] perf syscalltbl: Remove struct syscalltbl
From: Ian Rogers @ 2025-02-19 18:56 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv, linux-mips,
Arnd Bergmann
The syscalltbl held entries of system call name and number pairs,
generated from a native syscalltbl at start up. As there are gaps in
the system call numbers there is a notion of an index into the
table. Going forward we want the system call table to be identifiable
by a machine type, for example, i386 vs x86-64. Change the interface
to the syscalltbl so that (1) a machine type is passed (currently
unused and always EM_HOST) and (2) the index to system call number and
system call name mapping is computed at build time.
Two tables are used for this: an array mapping system call number to
name, and an array of system call numbers sorted by system call
name. The sorted array doesn't store strings, in part to save memory
and relocations. The index notion is carried forward and is an index
into the sorted array of system call numbers. The data structures are
opaque (held only in syscalltbl.c), so the number of indices for a
machine type is exposed as a new API.
The arrays are computed in the syscalltbl.sh script and so no start-up
time computation and storage is necessary.
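As an illustration of the two-array layout described above, here is a
minimal standalone sketch. It is not the generated code: the demo_*
names and the tiny five-entry table are hypothetical, chosen only to
show a number-to-name array with a gap and a second array of numbers
ordered by strcmp() of their names, searched with bsearch().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical miniature table; the real one is emitted by syscalltbl.sh. */
static const char *const demo_num_to_name[] = {
	[0] = "read",
	[1] = "write",
	[2] = "open",
	[3] = "close",
	/* Number 4 is deliberately a gap and so is NULL. */
	[5] = "fstat",
};

/* System call numbers ordered by strcmp() of their names. */
static const uint16_t demo_sorted_names[] = {
	3, /* close */
	5, /* fstat */
	2, /* open */
	0, /* read */
	1, /* write */
};

/* bsearch() comparator: key is a name, entry is a system call number. */
static int demo_cmp_name(const void *vkey, const void *ventry)
{
	const char *key = vkey;
	const uint16_t *entry = ventry;

	return strcmp(key, demo_num_to_name[*entry]);
}

/* Number to name; NULL for out-of-range numbers and for gaps. */
const char *demo_name(int id)
{
	if (id >= 0 && id < (int)DEMO_ARRAY_SIZE(demo_num_to_name))
		return demo_num_to_name[id];
	return NULL;
}

/* Name to number via binary search of the sorted array; -1 if unknown. */
int demo_id(const char *name)
{
	const uint16_t *id = bsearch(name, demo_sorted_names,
				     DEMO_ARRAY_SIZE(demo_sorted_names),
				     sizeof(demo_sorted_names[0]),
				     demo_cmp_name);

	return id ? *id : -1;
}
```

The sorted array holds only 16-bit numbers, so the name strings are
stored once and the second array needs no relocations.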
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
tools/perf/builtin-trace.c | 106 +++++++++++++++++------------
tools/perf/scripts/syscalltbl.sh | 36 ++++------
tools/perf/util/syscalltbl.c | 113 ++++++++++---------------------
tools/perf/util/syscalltbl.h | 22 ++----
4 files changed, 117 insertions(+), 160 deletions(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index eb3551fb0e7b..fbf21055cff6 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -149,7 +149,6 @@ enum summary_mode {
struct trace {
struct perf_tool tool;
- struct syscalltbl *sctbl;
struct {
/** Sorted sycall numbers used by the trace. */
struct syscall *table;
@@ -188,6 +187,14 @@ struct trace {
pid_t *entries;
struct bpf_map *map;
} filter_pids;
+ /*
+ * TODO: The map is from an ID (aka system call number) to struct
+ * syscall_stats. If there is >1 e_machine, such as i386 and x86-64
+ * processes, then the stats here will gather the wrong statistics for
+ * the non EM_HOST system calls. A fix would be to add the e_machine
+ * into the key, but this would make the code inconsistent with the
+ * per-thread version.
+ */
struct hashmap *syscall_stats;
double duration_filter;
double runtime_ms;
@@ -2141,7 +2148,7 @@ static int syscall__read_info(struct syscall *sc, struct trace *trace)
return 0;
}
- name = syscalltbl__name(trace->sctbl, sc->id);
+ name = syscalltbl__name(sc->e_machine, sc->id);
if (name == NULL) {
sc->nonexistent = true;
return -EEXIST;
@@ -2241,10 +2248,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
strlist__for_each_entry(pos, trace->ev_qualifier) {
const char *sc = pos->s;
- int id = syscalltbl__id(trace->sctbl, sc), match_next = -1;
+ /*
+ * TODO: Assume more than the validation/warnings are all for
+ * the same binary type as perf.
+ */
+ int id = syscalltbl__id(EM_HOST, sc), match_next = -1;
if (id < 0) {
- id = syscalltbl__strglobmatch_first(trace->sctbl, sc, &match_next);
+ id = syscalltbl__strglobmatch_first(EM_HOST, sc, &match_next);
if (id >= 0)
goto matches;
@@ -2264,7 +2275,7 @@ static int trace__validate_ev_qualifier(struct trace *trace)
continue;
while (1) {
- id = syscalltbl__strglobmatch_next(trace->sctbl, sc, &match_next);
+ id = syscalltbl__strglobmatch_next(EM_HOST, sc, &match_next);
if (id < 0)
break;
if (nr_allocated == nr_used) {
@@ -2720,6 +2731,7 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
int augmented_args_size = 0;
void *augmented_args = NULL;
+ /* TODO: get e_machine from thread. */
struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
struct thread_trace *ttrace;
@@ -2794,6 +2806,7 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
struct thread_trace *ttrace;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
+ /* TODO: get e_machine from thread. */
struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
char msg[1024];
void *args, *augmented_args = NULL;
@@ -2869,6 +2882,7 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
int alignment = trace->args_alignment;
+ /* TODO: get e_machine from thread. */
struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
struct thread_trace *ttrace;
@@ -3222,6 +3236,7 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
if (evsel == trace->syscalls.events.bpf_output) {
int id = perf_evsel__sc_tp_uint(evsel, id, sample);
+ /* TODO: get e_machine from thread. */
struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
if (sc) {
@@ -3729,9 +3744,9 @@ static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, str
return trace->skel->progs.syscall_unaugmented;
}
-static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
+static void trace__init_syscall_bpf_progs(struct trace *trace, int e_machine, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
if (sc == NULL)
return;
@@ -3740,22 +3755,22 @@ static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
sc->bpf_prog.sys_exit = trace__find_syscall_bpf_prog(trace, sc, sc->fmt ? sc->fmt->bpf_prog_name.sys_exit : NULL, "exit");
}
-static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int id)
+static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int e_machine, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
return sc ? bpf_program__fd(sc->bpf_prog.sys_enter) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
-static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
+static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int e_machine, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
-static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
+static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int e_machine, int key, unsigned int *beauty_array)
{
struct tep_format_field *field;
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, key);
const struct btf_type *bt;
char *struct_offset, *tmp, name[32];
bool can_augment = false;
@@ -3851,9 +3866,9 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
return NULL;
try_to_find_pair:
- for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
- int id = syscalltbl__id_at_idx(trace->sctbl, i);
- struct syscall *pair = trace__syscall_info(trace, NULL, EM_HOST, id);
+ for (int i = 0, num_idx = syscalltbl__num_idx(sc->e_machine); i < num_idx; ++i) {
+ int id = syscalltbl__id_at_idx(sc->e_machine, i);
+ struct syscall *pair = trace__syscall_info(trace, NULL, sc->e_machine, id);
struct bpf_program *pair_prog;
bool is_candidate = false;
@@ -3937,7 +3952,7 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
return NULL;
}
-static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
+static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace, int e_machine)
{
int map_enter_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_enter);
int map_exit_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_exit);
@@ -3945,27 +3960,27 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
int err = 0;
unsigned int beauty_array[6];
- for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
- int prog_fd, key = syscalltbl__id_at_idx(trace->sctbl, i);
+ for (int i = 0, num_idx = syscalltbl__num_idx(e_machine); i < num_idx; ++i) {
+ int prog_fd, key = syscalltbl__id_at_idx(e_machine, i);
if (!trace__syscall_enabled(trace, key))
continue;
- trace__init_syscall_bpf_progs(trace, key);
+ trace__init_syscall_bpf_progs(trace, e_machine, key);
// It'll get at least the "!raw_syscalls:unaugmented"
- prog_fd = trace__bpf_prog_sys_enter_fd(trace, key);
+ prog_fd = trace__bpf_prog_sys_enter_fd(trace, e_machine, key);
err = bpf_map_update_elem(map_enter_fd, &key, &prog_fd, BPF_ANY);
if (err)
break;
- prog_fd = trace__bpf_prog_sys_exit_fd(trace, key);
+ prog_fd = trace__bpf_prog_sys_exit_fd(trace, e_machine, key);
err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
if (err)
break;
/* use beauty_map to tell BPF how many bytes to collect, set beauty_map's value here */
memset(beauty_array, 0, sizeof(beauty_array));
- err = trace__bpf_sys_enter_beauty_map(trace, key, (unsigned int *)beauty_array);
+ err = trace__bpf_sys_enter_beauty_map(trace, e_machine, key, (unsigned int *)beauty_array);
if (err)
continue;
err = bpf_map_update_elem(beauty_map_fd, &key, beauty_array, BPF_ANY);
@@ -4001,9 +4016,9 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
* first and second arg (this one on the raw_syscalls:sys_exit prog
* array tail call, then that one will be used.
*/
- for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
- int key = syscalltbl__id_at_idx(trace->sctbl, i);
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
+ for (int i = 0, num_idx = syscalltbl__num_idx(e_machine); i < num_idx; ++i) {
+ int key = syscalltbl__id_at_idx(e_machine, i);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, key);
struct bpf_program *pair_prog;
int prog_fd;
@@ -4449,8 +4464,13 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
goto out_error_mem;
#ifdef HAVE_BPF_SKEL
- if (trace->skel && trace->skel->progs.sys_enter)
- trace__init_syscalls_bpf_prog_array_maps(trace);
+ if (trace->skel && trace->skel->progs.sys_enter) {
+ /*
+ * TODO: Initialize for all host binary machine types, not just
+ * those matching the perf binary.
+ */
+ trace__init_syscalls_bpf_prog_array_maps(trace, EM_HOST);
+ }
#endif
if (trace->ev_qualifier_ids.nr > 0) {
@@ -4475,7 +4495,8 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
* So just disable this beautifier (SCA_FD, SCA_FDAT) when 'close' is
* not in use.
*/
- trace->fd_path_disabled = !trace__syscall_enabled(trace, syscalltbl__id(trace->sctbl, "close"));
+ /* TODO: support for more than just perf binary machine type close. */
+ trace->fd_path_disabled = !trace__syscall_enabled(trace, syscalltbl__id(EM_HOST, "close"));
err = trace__expand_filters(trace, &evsel);
if (err)
@@ -4787,7 +4808,7 @@ static struct syscall_entry *syscall__sort_stats(struct hashmap *syscall_stats)
return entry;
}
-static size_t syscall__dump_stats(struct trace *trace, FILE *fp,
+static size_t syscall__dump_stats(struct trace *trace, int e_machine, FILE *fp,
struct hashmap *syscall_stats)
{
size_t printed = 0;
@@ -4818,7 +4839,7 @@ static size_t syscall__dump_stats(struct trace *trace, FILE *fp,
pct = avg ? 100.0 * stddev_stats(&stats->stats) / avg : 0.0;
avg /= NSEC_PER_MSEC;
- sc = trace__syscall_info(trace, /*evsel=*/NULL, EM_HOST, entry->syscall);
+ sc = trace__syscall_info(trace, /*evsel=*/NULL, e_machine, entry->syscall);
if (!sc)
continue;
@@ -4845,14 +4866,14 @@ static size_t syscall__dump_stats(struct trace *trace, FILE *fp,
}
static size_t thread__dump_stats(struct thread_trace *ttrace,
- struct trace *trace, FILE *fp)
+ struct trace *trace, int e_machine, FILE *fp)
{
- return syscall__dump_stats(trace, fp, ttrace->syscall_stats);
+ return syscall__dump_stats(trace, e_machine, fp, ttrace->syscall_stats);
}
-static size_t system__dump_stats(struct trace *trace, FILE *fp)
+static size_t system__dump_stats(struct trace *trace, int e_machine, FILE *fp)
{
- return syscall__dump_stats(trace, fp, trace->syscall_stats);
+ return syscall__dump_stats(trace, e_machine, fp, trace->syscall_stats);
}
static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trace *trace)
@@ -4878,7 +4899,8 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
else if (fputc('\n', fp) != EOF)
++printed;
- printed += thread__dump_stats(ttrace, trace, fp);
+ /* TODO: get e_machine from thread. */
+ printed += thread__dump_stats(ttrace, trace, EM_HOST, fp);
return printed;
}
@@ -4939,7 +4961,8 @@ static size_t trace__fprintf_total_summary(struct trace *trace, FILE *fp)
else if (fputc('\n', fp) != EOF)
++printed;
- printed += system__dump_stats(trace, fp);
+ /* TODO: get all system e_machines. */
+ printed += system__dump_stats(trace, EM_HOST, fp);
return printed;
}
@@ -5131,8 +5154,9 @@ static int trace__parse_events_option(const struct option *opt, const char *str,
*sep = '\0';
list = 0;
- if (syscalltbl__id(trace->sctbl, s) >= 0 ||
- syscalltbl__strglobmatch_first(trace->sctbl, s, &idx) >= 0) {
+ /* TODO: support for more than just perf binary machine type syscalls. */
+ if (syscalltbl__id(EM_HOST, s) >= 0 ||
+ syscalltbl__strglobmatch_first(EM_HOST, s, &idx) >= 0) {
list = 1;
goto do_concat;
}
@@ -5285,7 +5309,6 @@ static void trace__exit(struct trace *trace)
syscall__exit(&trace->syscalls.table[i]);
zfree(&trace->syscalls.table);
}
- syscalltbl__delete(trace->sctbl);
zfree(&trace->perfconfig_events);
}
@@ -5434,9 +5457,8 @@ int cmd_trace(int argc, const char **argv)
sigaction(SIGCHLD, &sigchld_act, NULL);
trace.evlist = evlist__new();
- trace.sctbl = syscalltbl__new();
- if (trace.evlist == NULL || trace.sctbl == NULL) {
+ if (trace.evlist == NULL) {
pr_err("Not enough memory to run!\n");
err = -ENOMEM;
goto out;
diff --git a/tools/perf/scripts/syscalltbl.sh b/tools/perf/scripts/syscalltbl.sh
index 1ce0d5aa8b50..a39b3013b103 100755
--- a/tools/perf/scripts/syscalltbl.sh
+++ b/tools/perf/scripts/syscalltbl.sh
@@ -50,37 +50,27 @@ fi
infile="$1"
outfile="$2"
-nxt=0
-
-syscall_macro() {
- nr="$1"
- name="$2"
-
- echo " [$nr] = \"$name\","
-}
-
-emit() {
- nr="$1"
- entry="$2"
-
- syscall_macro "$nr" "$entry"
-}
-
-echo "static const char *const syscalltbl[] = {" > $outfile
-
sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > $sorted_table
-max_nr=0
+echo "static const char *const syscall_num_to_name[] = {" > $outfile
# the params are: nr abi name entry compat
# use _ for intentionally unused variables according to SC2034
while read nr _ name _ _; do
- emit "$nr" "$name" >> $outfile
- max_nr=$nr
+ echo " [$nr] = \"$name\"," >> $outfile
done < $sorted_table
+echo "};" >> $outfile
-rm -f $sorted_table
+echo "static const uint16_t syscall_sorted_names[] = {" >> $outfile
+# When sorting by name, add a suffix of 0s up to 20 characters so that system
+# calls that differ with a numerical suffix don't sort before those
+# without. This default behavior of sort differs from that of strcmp used at
+# runtime. Use sed to strip the trailing 0s suffix afterwards.
+grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > $sorted_table
+while read name nr; do
+ echo " $nr, /* $name */" >> $outfile
+done < $sorted_table
echo "};" >> $outfile
-echo "#define SYSCALLTBL_MAX_ID ${max_nr}" >> $outfile
+rm -f $sorted_table
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 2f76241494c8..760ac4d0869f 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -9,6 +9,7 @@
#include <stdlib.h>
#include <asm/bitsperlong.h>
#include <linux/compiler.h>
+#include <linux/kernel.h>
#include <linux/zalloc.h>
#include <string.h>
@@ -20,112 +21,66 @@
#include <asm/syscalls_32.h>
#endif
-const int syscalltbl_native_max_id = SYSCALLTBL_MAX_ID;
-static const char *const *syscalltbl_native = syscalltbl;
+const char *syscalltbl__name(int e_machine __maybe_unused, int id)
+{
+ if (id >= 0 && id < (int)ARRAY_SIZE(syscall_num_to_name))
+ return syscall_num_to_name[id];
+ return NULL;
+}
-struct syscall {
- int id;
+struct syscall_cmp_key {
const char *name;
+ const char *const *tbl;
};
static int syscallcmpname(const void *vkey, const void *ventry)
{
- const char *key = vkey;
- const struct syscall *entry = ventry;
+ const struct syscall_cmp_key *key = vkey;
+ const uint16_t *entry = ventry;
- return strcmp(key, entry->name);
+ return strcmp(key->name, key->tbl[*entry]);
}
-static int syscallcmp(const void *va, const void *vb)
+int syscalltbl__id(int e_machine __maybe_unused, const char *name)
{
- const struct syscall *a = va, *b = vb;
-
- return strcmp(a->name, b->name);
+ struct syscall_cmp_key key = {
+ .name = name,
+ .tbl = syscall_num_to_name,
+ };
+ const uint16_t *id = bsearch(&key, syscall_sorted_names,
+ ARRAY_SIZE(syscall_sorted_names),
+ sizeof(syscall_sorted_names[0]),
+ syscallcmpname);
+
+ return id ? *id : -1;
}
-static int syscalltbl__init_native(struct syscalltbl *tbl)
+int syscalltbl__num_idx(int e_machine __maybe_unused)
{
- int nr_entries = 0, i, j;
- struct syscall *entries;
-
- for (i = 0; i <= syscalltbl_native_max_id; ++i)
- if (syscalltbl_native[i])
- ++nr_entries;
-
- entries = tbl->syscalls.entries = malloc(sizeof(struct syscall) * nr_entries);
- if (tbl->syscalls.entries == NULL)
- return -1;
-
- for (i = 0, j = 0; i <= syscalltbl_native_max_id; ++i) {
- if (syscalltbl_native[i]) {
- entries[j].name = syscalltbl_native[i];
- entries[j].id = i;
- ++j;
- }
- }
-
- qsort(tbl->syscalls.entries, nr_entries, sizeof(struct syscall), syscallcmp);
- tbl->syscalls.nr_entries = nr_entries;
- tbl->syscalls.max_id = syscalltbl_native_max_id;
- return 0;
+ return ARRAY_SIZE(syscall_sorted_names);
}
-struct syscalltbl *syscalltbl__new(void)
+int syscalltbl__id_at_idx(int e_machine __maybe_unused, int idx)
{
- struct syscalltbl *tbl = malloc(sizeof(*tbl));
- if (tbl) {
- if (syscalltbl__init_native(tbl)) {
- free(tbl);
- return NULL;
- }
- }
- return tbl;
-}
-
-void syscalltbl__delete(struct syscalltbl *tbl)
-{
- zfree(&tbl->syscalls.entries);
- free(tbl);
-}
-
-const char *syscalltbl__name(const struct syscalltbl *tbl __maybe_unused, int id)
-{
- return id <= syscalltbl_native_max_id ? syscalltbl_native[id]: NULL;
-}
-
-int syscalltbl__id(struct syscalltbl *tbl, const char *name)
-{
- struct syscall *sc = bsearch(name, tbl->syscalls.entries,
- tbl->syscalls.nr_entries, sizeof(*sc),
- syscallcmpname);
-
- return sc ? sc->id : -1;
-}
-
-int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx)
-{
- struct syscall *syscalls = tbl->syscalls.entries;
-
- return idx < tbl->syscalls.nr_entries ? syscalls[idx].id : -1;
+ return syscall_sorted_names[idx];
}
-int syscalltbl__strglobmatch_next(struct syscalltbl *tbl, const char *syscall_glob, int *idx)
+int syscalltbl__strglobmatch_next(int e_machine __maybe_unused, const char *syscall_glob, int *idx)
{
- int i;
- struct syscall *syscalls = tbl->syscalls.entries;
+ for (int i = *idx + 1; i < (int)ARRAY_SIZE(syscall_sorted_names); ++i) {
+ const char *name = syscall_num_to_name[syscall_sorted_names[i]];
- for (i = *idx + 1; i < tbl->syscalls.nr_entries; ++i) {
- if (strglobmatch(syscalls[i].name, syscall_glob)) {
+ if (strglobmatch(name, syscall_glob)) {
*idx = i;
- return syscalls[i].id;
+ return syscall_sorted_names[i];
}
}
return -1;
}
-int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_glob, int *idx)
+int syscalltbl__strglobmatch_first(int e_machine, const char *syscall_glob, int *idx)
{
*idx = -1;
- return syscalltbl__strglobmatch_next(tbl, syscall_glob, idx);
+ return syscalltbl__strglobmatch_next(e_machine, syscall_glob, idx);
}
diff --git a/tools/perf/util/syscalltbl.h b/tools/perf/util/syscalltbl.h
index 362411a6d849..2bb628eff367 100644
--- a/tools/perf/util/syscalltbl.h
+++ b/tools/perf/util/syscalltbl.h
@@ -2,22 +2,12 @@
#ifndef __PERF_SYSCALLTBL_H
#define __PERF_SYSCALLTBL_H
-struct syscalltbl {
- struct {
- int max_id;
- int nr_entries;
- void *entries;
- } syscalls;
-};
+const char *syscalltbl__name(int e_machine, int id);
+int syscalltbl__id(int e_machine, const char *name);
+int syscalltbl__num_idx(int e_machine);
+int syscalltbl__id_at_idx(int e_machine, int idx);
-struct syscalltbl *syscalltbl__new(void);
-void syscalltbl__delete(struct syscalltbl *tbl);
-
-const char *syscalltbl__name(const struct syscalltbl *tbl, int id);
-int syscalltbl__id(struct syscalltbl *tbl, const char *name);
-int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx);
-
-int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_glob, int *idx);
-int syscalltbl__strglobmatch_next(struct syscalltbl *tbl, const char *syscall_glob, int *idx);
+int syscalltbl__strglobmatch_first(int e_machine, const char *syscall_glob, int *idx);
+int syscalltbl__strglobmatch_next(int e_machine, const char *syscall_glob, int *idx);
#endif /* __PERF_SYSCALLTBL_H */
--
2.48.1.601.g30ceb7b040-goog
* [PATCH v3 4/8] perf thread: Add support for reading the e_machine type for a thread
From: Ian Rogers @ 2025-02-19 18:56 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv, linux-mips,
Arnd Bergmann
Use the executable from /proc/pid/exe and read the e_machine from the
ELF header. On failure use EM_HOST. Change builtin-trace syscall
functions to pass e_machine from the thread rather than EM_HOST, so
that once syscalltbl can use the e_machine in later patches, the
system calls are specific to the architecture.
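The read described above can be tried outside of perf with a small
standalone sketch. The helper name demo_read_e_machine is hypothetical
and the caching on struct thread is left out. Like the patch, it
assumes the ELF file has the same byte order as the reader and relies
on e_machine being at the same offset in the 32-bit and 64-bit ELF
headers.

```c
#include <elf.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the ELF e_machine of the executable behind
 * /proc/<pid>/exe, or EM_NONE if it cannot be opened or read.
 */
uint16_t demo_read_e_machine(pid_t pid)
{
	char path[64];
	uint16_t e_machine = EM_NONE;
	int fd;

	_Static_assert(offsetof(Elf32_Ehdr, e_machine) ==
		       offsetof(Elf64_Ehdr, e_machine),
		       "e_machine offset differs between ELF classes");

	snprintf(path, sizeof(path), "/proc/%d/exe", (int)pid);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return EM_NONE;

	/* Read only the 2-byte e_machine field; no need to parse the rest. */
	if (pread(fd, &e_machine, sizeof(e_machine),
		  offsetof(Elf64_Ehdr, e_machine)) != (ssize_t)sizeof(e_machine))
		e_machine = EM_NONE;

	close(fd);
	return e_machine;
}
```

A caller would map an EM_NONE result to EM_HOST, matching the fallback
in the commit message.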
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
tools/perf/builtin-trace.c | 43 ++++++++++++++++----------------
tools/perf/util/thread.c | 50 ++++++++++++++++++++++++++++++++++++++
tools/perf/util/thread.h | 14 ++++++++++-
3 files changed, 85 insertions(+), 22 deletions(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index fbf21055cff6..de6896ebe82b 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2729,16 +2729,16 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
int printed = 0;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
- int augmented_args_size = 0;
+ int augmented_args_size = 0, e_machine;
void *augmented_args = NULL;
- /* TODO: get e_machine from thread. */
- struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
+ struct syscall *sc;
struct thread_trace *ttrace;
- if (sc == NULL)
- return -1;
-
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
+ e_machine = thread__e_machine(thread, trace->host);
+ sc = trace__syscall_info(trace, evsel, e_machine, id);
+ if (sc == NULL)
+ goto out_put;
ttrace = thread__trace(thread, trace);
if (ttrace == NULL)
goto out_put;
@@ -2806,17 +2806,18 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
struct thread_trace *ttrace;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
- /* TODO: get e_machine from thread. */
- struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
+ struct syscall *sc;
char msg[1024];
void *args, *augmented_args = NULL;
- int augmented_args_size;
+ int augmented_args_size, e_machine;
size_t printed = 0;
- if (sc == NULL)
- return -1;
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
+ e_machine = thread__e_machine(thread, trace->host);
+ sc = trace__syscall_info(trace, evsel, e_machine, id);
+ if (sc == NULL)
+ return -1;
ttrace = thread__trace(thread, trace);
/*
* We need to get ttrace just to make sure it is there when syscall__scnprintf_args()
@@ -2881,15 +2882,15 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
bool duration_calculated = false;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
- int alignment = trace->args_alignment;
- /* TODO: get e_machine from thread. */
- struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
+ int alignment = trace->args_alignment, e_machine;
+ struct syscall *sc;
struct thread_trace *ttrace;
- if (sc == NULL)
- return -1;
-
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
+ e_machine = thread__e_machine(thread, trace->host);
+ sc = trace__syscall_info(trace, evsel, e_machine, id);
+ if (sc == NULL)
+ goto out_put;
ttrace = thread__trace(thread, trace);
if (ttrace == NULL)
goto out_put;
@@ -3236,8 +3237,8 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
if (evsel == trace->syscalls.events.bpf_output) {
int id = perf_evsel__sc_tp_uint(evsel, id, sample);
- /* TODO: get e_machine from thread. */
- struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
+ int e_machine = thread ? thread__e_machine(thread, trace->host) : EM_HOST;
+ struct syscall *sc = trace__syscall_info(trace, evsel, e_machine, id);
if (sc) {
fprintf(trace->output, "%s(", sc->name);
@@ -4880,6 +4881,7 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
{
size_t printed = 0;
struct thread_trace *ttrace = thread__priv(thread);
+ int e_machine = thread__e_machine(thread, trace->host);
double ratio;
if (ttrace == NULL)
@@ -4899,8 +4901,7 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
else if (fputc('\n', fp) != EOF)
++printed;
- /* TODO: get e_machine from thread. */
- printed += thread__dump_stats(ttrace, trace, EM_HOST, fp);
+ printed += thread__dump_stats(ttrace, trace, e_machine, fp);
return printed;
}
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 0ffdd52d86d7..a07446a280ed 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -1,5 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
+#include <elf.h>
#include <errno.h>
+#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
@@ -16,6 +18,7 @@
#include "symbol.h"
#include "unwind.h"
#include "callchain.h"
+#include "dwarf-regs.h"
#include <api/fs/fs.h>
@@ -51,6 +54,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
thread__set_ppid(thread, -1);
thread__set_cpu(thread, -1);
thread__set_guest_cpu(thread, -1);
+ thread__set_e_machine(thread, EM_NONE);
thread__set_lbr_stitch_enable(thread, false);
INIT_LIST_HEAD(thread__namespaces_list(thread));
INIT_LIST_HEAD(thread__comm_list(thread));
@@ -423,6 +427,52 @@ void thread__find_cpumode_addr_location(struct thread *thread, u64 addr,
}
}
+static uint16_t read_proc_e_machine_for_pid(pid_t pid)
+{
+ char path[6 /* "/proc/" */ + 11 /* max length of pid */ + 5 /* "/exe\0" */];
+ int fd;
+ uint16_t e_machine = EM_NONE;
+
+ snprintf(path, sizeof(path), "/proc/%d/exe", pid);
+ fd = open(path, O_RDONLY);
+ if (fd >= 0) {
+ _Static_assert(offsetof(Elf32_Ehdr, e_machine) == 18, "Unexpected offset");
+ _Static_assert(offsetof(Elf64_Ehdr, e_machine) == 18, "Unexpected offset");
+ if (pread(fd, &e_machine, sizeof(e_machine), 18) != sizeof(e_machine))
+ e_machine = EM_NONE;
+ close(fd);
+ }
+ return e_machine;
+}
+
+uint16_t thread__e_machine(struct thread *thread, struct machine *machine)
+{
+ pid_t tid, pid;
+ uint16_t e_machine = RC_CHK_ACCESS(thread)->e_machine;
+
+ if (e_machine != EM_NONE)
+ return e_machine;
+
+ tid = thread__tid(thread);
+ pid = thread__pid(thread);
+ if (pid != tid) {
+ struct thread *parent = machine__findnew_thread(machine, pid, pid);
+
+ if (parent) {
+ e_machine = thread__e_machine(parent, machine);
+ thread__set_e_machine(thread, e_machine);
+ return e_machine;
+ }
+ /* Something went wrong, fallback. */
+ }
+ e_machine = read_proc_e_machine_for_pid(pid);
+ if (e_machine != EM_NONE)
+ thread__set_e_machine(thread, e_machine);
+ else
+ e_machine = EM_HOST;
+ return e_machine;
+}
+
struct thread *thread__main_thread(struct machine *machine, struct thread *thread)
{
if (thread__pid(thread) == thread__tid(thread))
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 6cbf6eb2812e..cd574a896418 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -60,7 +60,11 @@ DECLARE_RC_STRUCT(thread) {
struct srccode_state srccode_state;
bool filter;
int filter_entry_depth;
-
+ /**
+ * @e_machine: The ELF EM_* associated with the thread. EM_NONE if not
+ * computed.
+ */
+ uint16_t e_machine;
/* LBR call stack stitch */
bool lbr_stitch_enable;
struct lbr_stitch *lbr_stitch;
@@ -302,6 +306,14 @@ static inline void thread__set_filter_entry_depth(struct thread *thread, int dep
RC_CHK_ACCESS(thread)->filter_entry_depth = depth;
}
+uint16_t thread__e_machine(struct thread *thread, struct machine *machine);
+
+static inline void thread__set_e_machine(struct thread *thread, uint16_t e_machine)
+{
+ RC_CHK_ACCESS(thread)->e_machine = e_machine;
+}
+
+
static inline bool thread__lbr_stitch_enable(const struct thread *thread)
{
return RC_CHK_ACCESS(thread)->lbr_stitch_enable;
--
2.48.1.601.g30ceb7b040-goog
* [PATCH v3 5/8] perf trace beauty: Add syscalltbl.sh generating all system call tables
From: Ian Rogers @ 2025-02-19 18:56 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv, linux-mips,
Arnd Bergmann
Rather than generating individual syscall header files, generate a
single trace/beauty/generated/syscalltbl.c. A syscalltbls array holds
references to each architecture's tables along with the
corresponding e_machine. When the 32-bit or 64-bit table is ambiguous,
match the perf binary's type. For ARM32, don't use the arm64 32-bit
table, which is smaller. An EM_NONE entry is present for when no machine
matches. Conditionally compile the tables, only having the appropriate
32-bit and 64-bit tables. If ALL_SYSCALLTBL is defined, all tables are
compiled.
Add comment for noreturn column suggested by Arnd Bergmann:
https://lore.kernel.org/lkml/d47c35dd-9c52-48e7-a00d-135572f11fbb@app.fastmail.com/
and added in commit 9142be9e6443 ("x86/syscall: Mark exit[_group]
syscall handlers __noreturn").
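For illustration only, the generated file has roughly the following
shape. This is a hand-written sketch, not real script output: only a
few entries per table are shown, and a local ARRAY_SIZE stands in for
the one from <linux/kernel.h>. The struct layout and the
syscall_num_to_name_*/syscall_sorted_names_* naming follow the script.
```c
#include <elf.h>
#include <stdint.h>

/* Local stand-in for the kernel tools' ARRAY_SIZE (assumption). */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

struct syscalltbl {
	const char *const *num_to_name;
	const uint16_t *sorted_names;
	uint16_t e_machine;
	uint16_t num_to_name_len;
	uint16_t sorted_names_len;
};

/* Architecture tables are only compiled when they can be relevant. */
#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
static const char *const syscall_num_to_name_EM_X86_64[] = {
	[0] = "read",
	[1] = "write",
	[2] = "open",
	[3] = "close",
};
/* Syscall numbers ordered by strcmp() of their names. */
static const uint16_t syscall_sorted_names_EM_X86_64[] = {
	3, /* close */
	2, /* open */
	0, /* read */
	1, /* write */
};
#endif

/* Generic table, always compiled, used when no machine matches. */
static const char *const syscall_num_to_name_EM_NONE[] = {
	[56] = "openat",
	[57] = "close",
	[63] = "read",
	[64] = "write",
};
static const uint16_t syscall_sorted_names_EM_NONE[] = {
	57, /* close */
	56, /* openat */
	63, /* read */
	64, /* write */
};

static const struct syscalltbl syscalltbls[] = {
#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
	{
		.num_to_name = syscall_num_to_name_EM_X86_64,
		.sorted_names = syscall_sorted_names_EM_X86_64,
		.e_machine = EM_X86_64,
		.num_to_name_len = ARRAY_SIZE(syscall_num_to_name_EM_X86_64),
		.sorted_names_len = ARRAY_SIZE(syscall_sorted_names_EM_X86_64),
	},
#endif
	/* EM_NONE is emitted last so it only matches as a fallback. */
	{
		.num_to_name = syscall_num_to_name_EM_NONE,
		.sorted_names = syscall_sorted_names_EM_NONE,
		.e_machine = EM_NONE,
		.num_to_name_len = ARRAY_SIZE(syscall_num_to_name_EM_NONE),
		.sorted_names_len = ARRAY_SIZE(syscall_sorted_names_EM_NONE),
	},
};
```
Because num_to_name is indexed by syscall number, its length is the
highest number plus one, while sorted_names has one entry per syscall.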
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
tools/perf/Makefile.perf | 9 +
tools/perf/trace/beauty/syscalltbl.sh | 274 ++++++++++++++++++++++++++
2 files changed, 283 insertions(+)
create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 55d6ce9ea52f..793e702f9aaf 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -559,6 +559,14 @@ beauty_ioctl_outdir := $(beauty_outdir)/ioctl
# Create output directory if not already present
$(shell [ -d '$(beauty_ioctl_outdir)' ] || mkdir -p '$(beauty_ioctl_outdir)')
+syscall_array := $(beauty_outdir)/syscalltbl.c
+syscall_tbl := $(srctree)/tools/perf/trace/beauty/syscalltbl.sh
+syscall_tbl_data := $(srctree)/tools/scripts/syscall.tbl \
+ $(wildcard $(srctree)/tools/perf/arch/*/entry/syscalls/syscall*.tbl)
+
+$(syscall_array): $(syscall_tbl) $(syscall_tbl_data)
+ $(Q)$(SHELL) '$(syscall_tbl)' $(srctree)/tools $@
+
fs_at_flags_array := $(beauty_outdir)/fs_at_flags_array.c
fs_at_flags_tbl := $(srctree)/tools/perf/trace/beauty/fs_at_flags.sh
@@ -878,6 +886,7 @@ build-dir = $(or $(__build-dir),.)
prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders \
arm64-sysreg-defs \
+ $(syscall_array) \
$(fs_at_flags_array) \
$(clone_flags_array) \
$(drm_ioctl_array) \
diff --git a/tools/perf/trace/beauty/syscalltbl.sh b/tools/perf/trace/beauty/syscalltbl.sh
new file mode 100755
index 000000000000..1199618dc178
--- /dev/null
+++ b/tools/perf/trace/beauty/syscalltbl.sh
@@ -0,0 +1,274 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Generate all syscall tables.
+#
+# Each line of the syscall table should have the following format:
+#
+# NR ABI NAME [NATIVE] [COMPAT [noreturn]]
+#
+# NR syscall number
+# ABI ABI name
+# NAME syscall name
+# NATIVE native entry point (optional)
+# COMPAT compat entry point (optional)
+# noreturn system call doesn't return (optional)
+set -e
+
+usage() {
+ cat >&2 <<EOF
+usage: $0 <TOOLS DIRECTORY> <OUTFILE>
+
+ <TOOLS DIRECTORY> path to kernel tools directory
+ <OUTFILE> output C source file
+EOF
+ exit 1
+}
+
+if [ $# -ne 2 ]; then
+ usage
+fi
+tools_dir=$1
+outfile=$2
+
+build_tables() {
+ infile="$1"
+ outfile="$2"
+ abis=$(echo "($3)" | tr ',' '|')
+ e_machine="$4"
+
+ if [ ! -f "$infile" ]
+ then
+ echo "Missing file $infile"
+ exit 1
+ fi
+ sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
+ grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > "$sorted_table"
+
+ echo "static const char *const syscall_num_to_name_${e_machine}[] = {" >> "$outfile"
+ # the params are: nr abi name entry compat
+ # use _ for intentionally unused variables according to SC2034
+ while read -r nr _ name _ _; do
+ echo " [$nr] = \"$name\"," >> "$outfile"
+ done < "$sorted_table"
+ echo "};" >> "$outfile"
+
+ echo "static const uint16_t syscall_sorted_names_${e_machine}[] = {" >> "$outfile"
+
+ # When sorting by name, add a suffix of 0s up to 20 characters so that
+ # system calls that differ with a numerical suffix don't sort before
+ # those without. This default behavior of sort differs from that of
+ # strcmp used at runtime. Use sed to strip the trailing 0s suffix
+ # afterwards.
+ grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > "$sorted_table"
+ while read -r name nr; do
+ echo " $nr, /* $name */" >> "$outfile"
+ done < "$sorted_table"
+ echo "};" >> "$outfile"
+
+ rm -f "$sorted_table"
+}
+
+rm -f "$outfile"
+cat >> "$outfile" <<EOF
+#include <elf.h>
+#include <stdint.h>
+#include <asm/bitsperlong.h>
+#include <linux/kernel.h>
+
+struct syscalltbl {
+ const char *const *num_to_name;
+ const uint16_t *sorted_names;
+ uint16_t e_machine;
+ uint16_t num_to_name_len;
+ uint16_t sorted_names_len;
+};
+
+#if defined(ALL_SYSCALLTBL) || defined(__alpha__)
+EOF
+build_tables "$tools_dir/perf/arch/alpha/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_ALPHA
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__alpha__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
+EOF
+build_tables "$tools_dir/perf/arch/arm/entry/syscalls/syscall.tbl" "$outfile" common,32,oabi EM_ARM
+build_tables "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_AARCH64
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__csky__)
+EOF
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32,csky,time32,stat64,rlimit EM_CSKY
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__csky__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__mips__)
+EOF
+build_tables "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile" common,64,n64 EM_MIPS
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__hppa__)
+#if __BITS_PER_LONG != 64
+EOF
+build_tables "$tools_dir/perf/arch/parisc/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_PARISC
+echo "#else" >> "$outfile"
+build_tables "$tools_dir/perf/arch/parisc/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_PARISC
+cat >> "$outfile" <<EOF
+#endif //__BITS_PER_LONG != 64
+#endif // defined(ALL_SYSCALLTBL) || defined(__hppa__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
+EOF
+build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl" "$outfile" common,32,nospu EM_PPC
+build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl" "$outfile" common,64,nospu EM_PPC64
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__riscv)
+#if __BITS_PER_LONG != 64
+EOF
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32,riscv,memfd_secret EM_RISCV
+echo "#else" >> "$outfile"
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,64,riscv,rlimit,memfd_secret EM_RISCV
+cat >> "$outfile" <<EOF
+#endif //__BITS_PER_LONG != 64
+#endif // defined(ALL_SYSCALLTBL) || defined(__riscv)
+#if defined(ALL_SYSCALLTBL) || defined(__s390x__)
+EOF
+build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__sh__)
+EOF
+build_tables "$tools_dir/perf/arch/sh/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_SH
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__sh__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
+#if __BITS_PER_LONG != 64
+EOF
+build_tables "$tools_dir/perf/arch/sparc/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_SPARC
+echo "#else" >> "$outfile"
+build_tables "$tools_dir/perf/arch/sparc/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_SPARC
+cat >> "$outfile" <<EOF
+#endif //__BITS_PER_LONG != 64
+#endif // defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
+EOF
+build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl" "$outfile" common,32,i386 EM_386
+build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl" "$outfile" common,64 EM_X86_64
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__xtensa__)
+EOF
+build_tables "$tools_dir/perf/arch/xtensa/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_XTENSA
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__xtensa__)
+
+#if __BITS_PER_LONG != 64
+EOF
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32 EM_NONE
+echo "#else" >> "$outfile"
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,64 EM_NONE
+echo "#endif //__BITS_PER_LONG != 64" >> "$outfile"
+
+build_outer_table() {
+ e_machine=$1
+ outfile="$2"
+ cat >> "$outfile" <<EOF
+ {
+ .num_to_name = syscall_num_to_name_$e_machine,
+ .sorted_names = syscall_sorted_names_$e_machine,
+ .e_machine = $e_machine,
+ .num_to_name_len = ARRAY_SIZE(syscall_num_to_name_$e_machine),
+ .sorted_names_len = ARRAY_SIZE(syscall_sorted_names_$e_machine),
+ },
+EOF
+}
+
+cat >> "$outfile" <<EOF
+static const struct syscalltbl syscalltbls[] = {
+#if defined(ALL_SYSCALLTBL) || defined(__alpha__)
+EOF
+build_outer_table EM_ALPHA "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__alpha__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
+EOF
+build_outer_table EM_ARM "$outfile"
+build_outer_table EM_AARCH64 "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__csky__)
+EOF
+build_outer_table EM_CSKY "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__csky__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__mips__)
+EOF
+build_outer_table EM_MIPS "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__hppa__)
+EOF
+build_outer_table EM_PARISC "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__hppa__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
+EOF
+build_outer_table EM_PPC "$outfile"
+build_outer_table EM_PPC64 "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__riscv)
+EOF
+build_outer_table EM_RISCV "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__riscv)
+
+#if defined(ALL_SYSCALLTBL) || defined(__s390x__)
+EOF
+build_outer_table EM_S390 "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__sh__)
+EOF
+build_outer_table EM_SH "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__sh__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
+EOF
+build_outer_table EM_SPARC "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
+EOF
+build_outer_table EM_386 "$outfile"
+build_outer_table EM_X86_64 "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__xtensa__)
+EOF
+build_outer_table EM_XTENSA "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__xtensa__)
+EOF
+build_outer_table EM_NONE "$outfile"
+cat >> "$outfile" <<EOF
+};
+EOF
--
2.48.1.601.g30ceb7b040-goog
* [PATCH v3 6/8] perf syscalltbl: Use lookup table containing multiple architectures
From: Ian Rogers @ 2025-02-19 18:56 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv, linux-mips,
Arnd Bergmann
Switch to using the lookup table containing all architectures rather
than the tables matching the perf binary.
This fixes perf trace when tracing a 32-bit i386 binary on an
x86-64 machine. In the output below, note the system call names
reported for the 32-bit i386 binary by an x86-64 perf.
Before:
```
? ( ): a.out/447296 ... [continued]: munmap()) = 0
0.024 ( 0.001 ms): a.out/447296 recvfrom(ubuf: 0x2, size: 4160585708, flags: DONTROUTE|CTRUNC|TRUNC|DONTWAIT|EOR|WAITALL|FIN|SYN|CONFIRM|RST|ERRQUEUE|NOSIGNAL|WAITFORONE|BATCH|SOCK_DEVMEM|ZEROCOPY|FASTOPEN|CMSG_CLOEXEC|0x91f80000, addr: 0xe30, addr_len: 0xffce438c) = 1475198976
0.042 ( 0.003 ms): a.out/447296 lgetxattr(name: "", value: 0x3, size: 34) = 4160344064
0.054 ( 0.003 ms): a.out/447296 dup2(oldfd: -134422744, newfd: 4) = -1 ENOENT (No such file or directory)
0.060 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x2e646c2f6374652f,.iov_len = (__kernel_size_t)7307199665335594867,}, vlen: 557056, pos_h: 4160585708) = 3
0.074 ( 0.004 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2) = 4160237568
0.080 ( 0.001 ms): a.out/447296 lstat(filename: "", statbuf: 0x193f6) = 0
0.089 ( 0.007 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x3833692f62696c2f,.iov_len = (__kernel_size_t)3276497845987585334,}, vlen: 557056, pos_h: 4160585708) = 3
0.097 ( 0.002 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 512
0.103 ( 0.002 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2050) = 4157935616
0.107 ( 0.007 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x5, size: 2066) = 4158078976
0.116 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x1, size: 2066) = 4159639552
0.121 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 2066) = 4160184320
0.129 ( 0.002 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 50) = 4160196608
0.138 ( 0.001 ms): a.out/447296 lstat(filename: "") = 0
0.145 ( 0.002 ms): a.out/447296 mq_timedreceive(mqdes: 4291706800, u_msg_ptr: 0xf7f9ea48, msg_len: 134616640, u_msg_prio: 0xf7fd7fec, u_abs_timeout: (struct __kernel_timespec){.tv_sec = (__kernel_time64_t)-578174027777317696,.tv_nsec = (long long int)4160349376,}) = 0
0.148 ( 0.001 ms): a.out/447296 mkdirat(dfd: -134617816, pathname: " ��� ���▒���▒���", mode: IFREG|ISUID|IRUSR|IWGRP|0xf7fd0000) = 447296
0.150 ( 0.001 ms): a.out/447296 process_vm_writev(pid: -134617812, lvec: (struct iovec){.iov_base = (void *)0xf7f9e9c8f7f9e4c0,.iov_len = (__kernel_size_t)4160349376,}, liovcnt: 4160588048, rvec: (struct iovec){}, riovcnt: 4160585708, flags: 4291707352) = 0
0.197 ( 0.004 ms): a.out/447296 capget(header: 4160184320, dataptr: 8192) = 0
0.202 ( 0.002 ms): a.out/447296 capget(header: 1448669184, dataptr: 4096) = 0
0.208 ( 0.002 ms): a.out/447296 capget(header: 4160577536, dataptr: 8192) = 0
0.220 ( 0.001 ms): a.out/447296 getxattr(pathname: "", name: "c������", value: 0xf7f77e34, size: 1) = 0
0.228 ( 0.005 ms): a.out/447296 fchmod(fd: -134729728, mode: IRUGO|IWUGO|IFREG|IFIFO|ISVTX|IXUSR|0x10000) = 0
0.240 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: 0x5658e008, pos_h: 4160192052) = 3
0.250 ( 0.008 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 1436
0.260 ( 0.018 ms): a.out/447296 stat(filename: "", statbuf: 0xffce32ac) = 1436
0.288 (1000.213 ms): a.out/447296 readlinkat(buf: 0xffce31d4, bufsiz: 4291703244) = 0
```
After:
```
? ( ): a.out/442930 ... [continued]: execve()) = 0
0.023 ( 0.002 ms): a.out/442930 brk() = 0x57760000
0.052 ( 0.003 ms): a.out/442930 access(filename: 0xf7f5af28, mode: R) = -1 ENOENT (No such file or directory)
0.059 ( 0.009 ms): a.out/442930 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
0.078 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
0.087 ( 0.007 ms): a.out/442930 openat(dfd: CWD, filename: "/lib/i386-linux-", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
0.095 ( 0.002 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdbb70, count: 512) = 512
0.135 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
0.148 ( 0.001 ms): a.out/442930 set_tid_address(tidptr: 0xf7f2b528) = 442930 (a.out)
0.150 ( 0.001 ms): a.out/442930 set_robust_list(head: 0xf7f2b52c, len: 12) =
0.196 ( 0.004 ms): a.out/442930 mprotect(start: 0xf7f03000, len: 8192, prot: READ) = 0
0.202 ( 0.002 ms): a.out/442930 mprotect(start: 0x5658e000, len: 4096, prot: READ) = 0
0.207 ( 0.002 ms): a.out/442930 mprotect(start: 0xf7f63000, len: 8192, prot: READ) = 0
0.230 ( 0.005 ms): a.out/442930 munmap(addr: 0xf7f10000, len: 103414) = 0
0.244 ( 0.010 ms): a.out/442930 openat(dfd: CWD, filename: 0x5658d008) = 3
0.255 ( 0.007 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdb67c, count: 4096) = 1436
0.264 ( 0.018 ms): a.out/442930 write(fd: 1</dev/pts/4>, buf: , count: 1436) = 1436
0.292 (1000.173 ms): a.out/442930 clock_nanosleep(rqtp: { .tv_sec: 17866546940376776704, .tv_nsec: 4159878336 }, rmtp: 0xffbdb59c) = 0
1000.478 ( ): a.out/442930 exit_group() = ?
```
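The mismatch above comes from the same number naming different system
calls on each ABI. A minimal sketch of a per-e_machine lookup with an
EM_NONE fallback follows. It is not the perf implementation: the
sketch_* names and the handful of table entries are illustrative
assumptions, and the one-entry cache is omitted.
```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

struct syscalltbl {
	const char *const *num_to_name;
	const uint16_t *sorted_names;
	uint16_t e_machine;
	uint16_t num_to_name_len;
	uint16_t sorted_names_len;
};

/* Tiny illustrative subsets of the real tables. */
static const char *const names_386[] = {
	[1] = "exit", [3] = "read", [4] = "write", [5] = "open", [6] = "close",
};
static const uint16_t sorted_386[] = { 6, 1, 5, 3, 4 };

static const char *const names_x86_64[] = {
	[0] = "read", [1] = "write", [2] = "open", [3] = "close",
};
static const uint16_t sorted_x86_64[] = { 3, 2, 0, 1 };

static const char *const names_none[] = {
	[57] = "close", [63] = "read", [64] = "write",
};
static const uint16_t sorted_none[] = { 57, 63, 64 };

static const struct syscalltbl sketch_tbls[] = {
	{ names_386, sorted_386, EM_386,
	  ARRAY_SIZE(names_386), ARRAY_SIZE(sorted_386) },
	{ names_x86_64, sorted_x86_64, EM_X86_64,
	  ARRAY_SIZE(names_x86_64), ARRAY_SIZE(sorted_x86_64) },
	/* Fallback entry, must stay last. */
	{ names_none, sorted_none, EM_NONE,
	  ARRAY_SIZE(names_none), ARRAY_SIZE(sorted_none) },
};

/* Return the table for e_machine, or the EM_NONE table if none matches. */
static const struct syscalltbl *sketch_find_table(int e_machine)
{
	for (size_t i = 0; i < ARRAY_SIZE(sketch_tbls); i++) {
		const struct syscalltbl *entry = &sketch_tbls[i];

		if (entry->e_machine == e_machine || entry->e_machine == EM_NONE)
			return entry;
	}
	return NULL;
}

/* Map a syscall number to its name; NULL if out of range or a hole. */
static const char *sketch_name(int e_machine, int id)
{
	const struct syscalltbl *table = sketch_find_table(e_machine);

	if (table && id >= 0 && id < table->num_to_name_len)
		return table->num_to_name[id];
	return NULL;
}

struct sketch_cmp_key {
	const char *name;
	const char *const *tbl;
};

/* bsearch() comparator: entries are syscall numbers sorted by name. */
static int sketch_cmp_name(const void *vkey, const void *ventry)
{
	const struct sketch_cmp_key *key = vkey;
	const uint16_t *entry = ventry;

	return strcmp(key->name, key->tbl[*entry]);
}

/* Map a syscall name to its number; -1 if the name is unknown. */
static int sketch_id(int e_machine, const char *name)
{
	const struct syscalltbl *table = sketch_find_table(e_machine);
	struct sketch_cmp_key key;
	const uint16_t *id;

	if (!table)
		return -1;
	key.name = name;
	key.tbl = table->num_to_name;
	id = bsearch(&key, table->sorted_names, table->sorted_names_len,
		     sizeof(table->sorted_names[0]), sketch_cmp_name);
	return id ? *id : -1;
}
```
With these tables, number 3 resolves to "read" for EM_386 but to
"close" for EM_X86_64, which is the kind of confusion visible in the
"Before" output. An e_machine without its own table, such as
EM_AARCH64 here, falls through to the EM_NONE entry.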
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
tools/perf/util/syscalltbl.c | 89 ++++++++++++++++++++++++++----------
1 file changed, 64 insertions(+), 25 deletions(-)
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 760ac4d0869f..db0d2b81aed1 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -15,16 +15,39 @@
#include <string.h>
#include "string2.h"
-#if __BITS_PER_LONG == 64
- #include <asm/syscalls_64.h>
-#else
- #include <asm/syscalls_32.h>
-#endif
+#include "trace/beauty/generated/syscalltbl.c"
-const char *syscalltbl__name(int e_machine __maybe_unused, int id)
+static const struct syscalltbl *find_table(int e_machine)
{
- if (id >= 0 && id <= (int)ARRAY_SIZE(syscall_num_to_name))
- return syscall_num_to_name[id];
+ static const struct syscalltbl *last_table;
+ static int last_table_machine = EM_NONE;
+
+ /* Tables only exist for EM_SPARC. */
+ if (e_machine == EM_SPARCV9)
+ e_machine = EM_SPARC;
+
+ if (last_table_machine == e_machine && last_table != NULL)
+ return last_table;
+
+ for (size_t i = 0; i < ARRAY_SIZE(syscalltbls); i++) {
+ const struct syscalltbl *entry = &syscalltbls[i];
+
+ if (entry->e_machine != e_machine && entry->e_machine != EM_NONE)
+ continue;
+
+ last_table = entry;
+ last_table_machine = e_machine;
+ return entry;
+ }
+ return NULL;
+}
+
+const char *syscalltbl__name(int e_machine, int id)
+{
+ const struct syscalltbl *table = find_table(e_machine);
+
+ if (table && id >= 0 && id < table->num_to_name_len)
+ return table->num_to_name[id];
return NULL;
}
@@ -41,38 +64,54 @@ static int syscallcmpname(const void *vkey, const void *ventry)
return strcmp(key->name, key->tbl[*entry]);
}
-int syscalltbl__id(int e_machine __maybe_unused, const char *name)
+int syscalltbl__id(int e_machine, const char *name)
{
- struct syscall_cmp_key key = {
- .name = name,
- .tbl = syscall_num_to_name,
- };
- const int *id = bsearch(&key, syscall_sorted_names,
- ARRAY_SIZE(syscall_sorted_names),
- sizeof(syscall_sorted_names[0]),
- syscallcmpname);
+ const struct syscalltbl *table = find_table(e_machine);
+ struct syscall_cmp_key key;
+ const uint16_t *id;
+
+ if (!table)
+ return -1;
+
+ key.name = name;
+ key.tbl = table->num_to_name;
+ id = bsearch(&key, table->sorted_names, table->sorted_names_len,
+ sizeof(table->sorted_names[0]), syscallcmpname);
return id ? *id : -1;
}
-int syscalltbl__num_idx(int e_machine __maybe_unused)
+int syscalltbl__num_idx(int e_machine)
{
- return ARRAY_SIZE(syscall_sorted_names);
+ const struct syscalltbl *table = find_table(e_machine);
+
+ if (!table)
+ return 0;
+
+ return table->sorted_names_len;
}
-int syscalltbl__id_at_idx(int e_machine __maybe_unused, int idx)
+int syscalltbl__id_at_idx(int e_machine, int idx)
{
- return syscall_sorted_names[idx];
+ const struct syscalltbl *table = find_table(e_machine);
+
+ if (!table)
+ return -1;
+
+ assert(idx >= 0 && idx < table->sorted_names_len);
+ return table->sorted_names[idx];
}
-int syscalltbl__strglobmatch_next(int e_machine __maybe_unused, const char *syscall_glob, int *idx)
+int syscalltbl__strglobmatch_next(int e_machine, const char *syscall_glob, int *idx)
{
- for (int i = *idx + 1; i < (int)ARRAY_SIZE(syscall_sorted_names); ++i) {
- const char *name = syscall_num_to_name[syscall_sorted_names[i]];
+ const struct syscalltbl *table = find_table(e_machine);
+
+ for (int i = *idx + 1; table && i < table->sorted_names_len; ++i) {
+ const char *name = table->num_to_name[table->sorted_names[i]];
if (strglobmatch(name, syscall_glob)) {
*idx = i;
- return syscall_sorted_names[i];
+ return table->sorted_names[i];
}
}
--
2.48.1.601.g30ceb7b040-goog
* [PATCH v3 7/8] perf build: Remove Makefile.syscalls
From: Ian Rogers @ 2025-02-19 18:56 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv, linux-mips,
Arnd Bergmann
Now that a single beauty file is generated and used by all
architectures, remove the per-architecture Makefiles, Kbuild files and
the previous generator script.
Note: there was a conversation with Charlie Jenkins
<charlie@rivosinc.com>, who had written an alternative approach to
support multiple architectures:
https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
It would have been better to have helped Charlie fix their series (my
apologies) but they agreed that the approach taken here was likely
best for longer term maintainability:
https://lore.kernel.org/lkml/Z6Jk_UN9i69QGqUj@ghost/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
tools/perf/Makefile.perf | 1 -
tools/perf/arch/alpha/entry/syscalls/Kbuild | 2 -
.../alpha/entry/syscalls/Makefile.syscalls | 5 --
tools/perf/arch/arc/entry/syscalls/Kbuild | 2 -
.../arch/arc/entry/syscalls/Makefile.syscalls | 3 -
tools/perf/arch/arm/entry/syscalls/Kbuild | 4 -
.../arch/arm/entry/syscalls/Makefile.syscalls | 2 -
tools/perf/arch/arm64/entry/syscalls/Kbuild | 3 -
.../arm64/entry/syscalls/Makefile.syscalls | 6 --
tools/perf/arch/csky/entry/syscalls/Kbuild | 2 -
.../csky/entry/syscalls/Makefile.syscalls | 3 -
.../perf/arch/loongarch/entry/syscalls/Kbuild | 2 -
.../entry/syscalls/Makefile.syscalls | 3 -
tools/perf/arch/mips/entry/syscalls/Kbuild | 2 -
.../mips/entry/syscalls/Makefile.syscalls | 5 --
tools/perf/arch/parisc/entry/syscalls/Kbuild | 3 -
.../parisc/entry/syscalls/Makefile.syscalls | 6 --
tools/perf/arch/powerpc/entry/syscalls/Kbuild | 3 -
.../powerpc/entry/syscalls/Makefile.syscalls | 6 --
tools/perf/arch/riscv/entry/syscalls/Kbuild | 2 -
.../riscv/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/arch/s390/entry/syscalls/Kbuild | 2 -
.../s390/entry/syscalls/Makefile.syscalls | 5 --
tools/perf/arch/sh/entry/syscalls/Kbuild | 2 -
.../arch/sh/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/arch/sparc/entry/syscalls/Kbuild | 3 -
.../sparc/entry/syscalls/Makefile.syscalls | 5 --
tools/perf/arch/x86/entry/syscalls/Kbuild | 3 -
.../arch/x86/entry/syscalls/Makefile.syscalls | 6 --
tools/perf/arch/xtensa/entry/syscalls/Kbuild | 2 -
.../xtensa/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/scripts/Makefile.syscalls | 61 ---------------
tools/perf/scripts/syscalltbl.sh | 76 -------------------
33 files changed, 242 deletions(-)
delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arm/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/csky/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/mips/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/s390/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/sh/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/x86/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/scripts/Makefile.syscalls
delete mode 100755 tools/perf/scripts/syscalltbl.sh
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 793e702f9aaf..62176d685445 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -339,7 +339,6 @@ ifeq ($(filter feature-dump,$(MAKECMDGOALS)),feature-dump)
FEATURE_TESTS := all
endif
endif
-include $(srctree)/tools/perf/scripts/Makefile.syscalls
include Makefile.config
endif
diff --git a/tools/perf/arch/alpha/entry/syscalls/Kbuild b/tools/perf/arch/alpha/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/alpha/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls b/tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 690168aac34d..000000000000
--- a/tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,5 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_64 +=
-
-syscalltbl = $(srctree)/tools/perf/arch/alpha/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/arc/entry/syscalls/Kbuild b/tools/perf/arch/arc/entry/syscalls/Kbuild
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/arc/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/arc/entry/syscalls/Makefile.syscalls b/tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 391d30ab7a83..000000000000
--- a/tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += arc time32 renameat stat64 rlimit
diff --git a/tools/perf/arch/arm/entry/syscalls/Kbuild b/tools/perf/arch/arm/entry/syscalls/Kbuild
deleted file mode 100644
index 9d777540f089..000000000000
--- a/tools/perf/arch/arm/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,4 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += oabi
-syscalltbl = $(srctree)/tools/perf/arch/arm/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/arm/entry/syscalls/Makefile.syscalls b/tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/arm64/entry/syscalls/Kbuild b/tools/perf/arch/arm64/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/arm64/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls b/tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index e7e78c2d1c02..000000000000
--- a/tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscall_abis_64 += renameat rlimit memfd_secret
-
-syscalltbl = $(srctree)/tools/perf/arch/arm64/entry/syscalls/syscall_%.tbl
diff --git a/tools/perf/arch/csky/entry/syscalls/Kbuild b/tools/perf/arch/csky/entry/syscalls/Kbuild
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/csky/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/csky/entry/syscalls/Makefile.syscalls b/tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index ea2dd10d0571..000000000000
--- a/tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += csky time32 stat64 rlimit
diff --git a/tools/perf/arch/loongarch/entry/syscalls/Kbuild b/tools/perf/arch/loongarch/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/loongarch/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls b/tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 47d32da2aed8..000000000000
--- a/tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_64 +=
diff --git a/tools/perf/arch/mips/entry/syscalls/Kbuild b/tools/perf/arch/mips/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/mips/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/mips/entry/syscalls/Makefile.syscalls b/tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 9ee914bdfb05..000000000000
--- a/tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,5 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_64 += n64
-
-syscalltbl = $(srctree)/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
diff --git a/tools/perf/arch/parisc/entry/syscalls/Kbuild b/tools/perf/arch/parisc/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/parisc/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls b/tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index ae326fecb83b..000000000000
--- a/tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscall_abis_64 +=
-
-syscalltbl = $(srctree)/tools/perf/arch/parisc/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/powerpc/entry/syscalls/Kbuild b/tools/perf/arch/powerpc/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/powerpc/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls b/tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index e35afbc57c79..000000000000
--- a/tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += nospu
-syscall_abis_64 += nospu
-
-syscalltbl = $(srctree)/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/riscv/entry/syscalls/Kbuild b/tools/perf/arch/riscv/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/riscv/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls b/tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 9668fd1faf60..000000000000
--- a/tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,4 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += riscv memfd_secret
-syscall_abis_64 += riscv rlimit memfd_secret
diff --git a/tools/perf/arch/s390/entry/syscalls/Kbuild b/tools/perf/arch/s390/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/s390/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/s390/entry/syscalls/Makefile.syscalls b/tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 9762d7abf17c..000000000000
--- a/tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,5 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_64 += renameat rlimit memfd_secret
-
-syscalltbl = $(srctree)/tools/perf/arch/s390/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/sh/entry/syscalls/Kbuild b/tools/perf/arch/sh/entry/syscalls/Kbuild
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/sh/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/sh/entry/syscalls/Makefile.syscalls b/tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 25080390e4ed..000000000000
--- a/tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,4 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscalltbl = $(srctree)/tools/perf/arch/sh/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/sparc/entry/syscalls/Kbuild b/tools/perf/arch/sparc/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/sparc/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls b/tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 212c1800b644..000000000000
--- a/tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,5 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscall_abis_64 +=
-syscalltbl = $(srctree)/tools/perf/arch/sparc/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/x86/entry/syscalls/Kbuild b/tools/perf/arch/x86/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/x86/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/x86/entry/syscalls/Makefile.syscalls b/tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index db3d5d6d4e56..000000000000
--- a/tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += i386
-syscall_abis_64 +=
-
-syscalltbl = $(srctree)/tools/perf/arch/x86/entry/syscalls/syscall_%.tbl
diff --git a/tools/perf/arch/xtensa/entry/syscalls/Kbuild b/tools/perf/arch/xtensa/entry/syscalls/Kbuild
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/xtensa/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls b/tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index d4aa2358460c..000000000000
--- a/tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,4 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscalltbl = $(srctree)/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl
diff --git a/tools/perf/scripts/Makefile.syscalls b/tools/perf/scripts/Makefile.syscalls
deleted file mode 100644
index 8bf55333262e..000000000000
--- a/tools/perf/scripts/Makefile.syscalls
+++ /dev/null
@@ -1,61 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# This Makefile generates headers in
-# tools/perf/arch/$(SRCARCH)/include/generated/asm from the architecture's
-# syscall table. This will either be from the generic syscall table, or from a
-# table that is specific to that architecture.
-
-PHONY := all
-all:
-
-obj := $(OUTPUT)arch/$(SRCARCH)/include/generated/asm
-
-syscall_abis_32 := common,32
-syscall_abis_64 := common,64
-syscalltbl := $(srctree)/tools/scripts/syscall.tbl
-
-# let architectures override $(syscall_abis_%) and $(syscalltbl)
--include $(srctree)/tools/perf/arch/$(SRCARCH)/entry/syscalls/Makefile.syscalls
-include $(srctree)/tools/build/Build.include
--include $(srctree)/tools/perf/arch/$(SRCARCH)/entry/syscalls/Kbuild
-
-systbl := $(srctree)/tools/perf/scripts/syscalltbl.sh
-
-syscall-y := $(addprefix $(obj)/, $(syscall-y))
-
-# Remove stale wrappers when the corresponding files are removed from generic-y
-old-headers := $(wildcard $(obj)/*.h)
-unwanted := $(filter-out $(syscall-y),$(old-headers))
-
-quiet_cmd_remove = REMOVE $(unwanted)
- cmd_remove = rm -f $(unwanted)
-
-quiet_cmd_systbl = SYSTBL $@
- cmd_systbl = $(CONFIG_SHELL) $(systbl) \
- $(if $(systbl-args-$*),$(systbl-args-$*),$(systbl-args)) \
- --abis $(subst $(space),$(comma),$(strip $(syscall_abis_$*))) \
- $< $@
-
-all: $(syscall-y)
- $(if $(unwanted),$(call cmd,remove))
- @:
-
-$(obj)/syscalls_%.h: $(syscalltbl) $(systbl) FORCE
- $(call if_changed,systbl)
-
-targets := $(syscall-y)
-
-# Create output directory. Skip it if at least one old header exists
-# since we know the output directory already exists.
-ifeq ($(old-headers),)
-$(shell mkdir -p $(obj))
-endif
-
-PHONY += FORCE
-
-FORCE:
-
-existing-targets := $(wildcard $(sort $(targets)))
-
--include $(foreach f,$(existing-targets),$(dir $(f)).$(notdir $(f)).cmd)
-
-.PHONY: $(PHONY)
diff --git a/tools/perf/scripts/syscalltbl.sh b/tools/perf/scripts/syscalltbl.sh
deleted file mode 100755
index a39b3013b103..000000000000
--- a/tools/perf/scripts/syscalltbl.sh
+++ /dev/null
@@ -1,76 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0
-#
-# Generate a syscall table header.
-#
-# Each line of the syscall table should have the following format:
-#
-# NR ABI NAME [NATIVE] [COMPAT]
-#
-# NR syscall number
-# ABI ABI name
-# NAME syscall name
-# NATIVE native entry point (optional)
-# COMPAT compat entry point (optional)
-
-set -e
-
-usage() {
- echo >&2 "usage: $0 [--abis ABIS] INFILE OUTFILE" >&2
- echo >&2
- echo >&2 " INFILE input syscall table"
- echo >&2 " OUTFILE output header file"
- echo >&2
- echo >&2 "options:"
- echo >&2 " --abis ABIS ABI(s) to handle (By default, all lines are handled)"
- exit 1
-}
-
-# default unless specified by options
-abis=
-
-while [ $# -gt 0 ]
-do
- case $1 in
- --abis)
- abis=$(echo "($2)" | tr ',' '|')
- shift 2;;
- -*)
- echo "$1: unknown option" >&2
- usage;;
- *)
- break;;
- esac
-done
-
-if [ $# -ne 2 ]; then
- usage
-fi
-
-infile="$1"
-outfile="$2"
-
-sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
-grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > $sorted_table
-
-echo "static const char *const syscall_num_to_name[] = {" > $outfile
-# the params are: nr abi name entry compat
-# use _ for intentionally unused variables according to SC2034
-while read nr _ name _ _; do
- echo " [$nr] = \"$name\"," >> $outfile
-done < $sorted_table
-echo "};" >> $outfile
-
-echo "static const uint16_t syscall_sorted_names[] = {" >> $outfile
-
-# When sorting by name, add a suffix of 0s upto 20 characters so that system
-# calls that differ with a numerical suffix don't sort before those
-# without. This default behavior of sort differs from that of strcmp used at
-# runtime. Use sed to strip the trailing 0s suffix afterwards.
-grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > $sorted_table
-while read name nr; do
- echo " $nr, /* $name */" >> $outfile
-done < $sorted_table
-echo "};" >> $outfile
-
-rm -f $sorted_table
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
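For readers following the removal of tools/perf/scripts/syscalltbl.sh in the patch above: the script emitted two arrays, syscall_num_to_name (indexed by number) and syscall_sorted_names (numbers ordered by name, matching strcmp). The sketch below shows how such a pair can serve a name-to-number lookup with a binary search. The four-entry table, the comparison helper and syscall_name_to_id() are illustrative assumptions, not code from perf.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Tiny made-up table in the shape the removed script generated. */
static const char *const syscall_num_to_name[] = {
	[0] = "read",
	[1] = "write",
	[2] = "open",
	[3] = "close",
};

/* Syscall numbers ordered by strcmp() of their names. */
static const uint16_t syscall_sorted_names[] = {
	3, /* close */
	2, /* open */
	0, /* read */
	1, /* write */
};

/* bsearch() comparator: the key is a name, the element is a syscall number. */
static int cmp_name_to_id(const void *key, const void *elem)
{
	const char *name = key;
	uint16_t id = *(const uint16_t *)elem;

	return strcmp(name, syscall_num_to_name[id]);
}

/* Hypothetical lookup: binary search the name index, -1 if not found. */
static int syscall_name_to_id(const char *name)
{
	size_t nr = sizeof(syscall_sorted_names) / sizeof(syscall_sorted_names[0]);
	const uint16_t *id = bsearch(name, syscall_sorted_names, nr,
				     sizeof(syscall_sorted_names[0]),
				     cmp_name_to_id);

	return id ? *id : -1;
}
```

Because the index is searched with strcmp, the build-time ordering has to agree with strcmp as well, which is the reason the removed script padded names before calling sort.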
* [PATCH v3 8/8] perf syscalltbl: Mask off ABI type for MIPS system calls
2025-02-19 18:56 [PATCH v3 0/8] perf: Support multiple system call tables in the build Ian Rogers
` (6 preceding siblings ...)
2025-02-19 18:56 ` [PATCH v3 7/8] perf build: Remove Makefile.syscalls Ian Rogers
@ 2025-02-19 18:56 ` Ian Rogers
2025-02-25 3:05 ` [PATCH v3 0/8] perf: Support multiple system call tables in the build Namhyung Kim
2025-02-25 3:20 ` Namhyung Kim
9 siblings, 0 replies; 19+ messages in thread
From: Ian Rogers @ 2025-02-19 18:56 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv, linux-mips,
Arnd Bergmann
Arnd Bergmann pointed out that MIPS system call numbers don't
necessarily start from 0 because an ABI prefix is applied:
https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
When decoding the "id" (aka system call number) for MIPS, mask off the
ABI prefix when the value is greater than 1000.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/syscalltbl.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index db0d2b81aed1..ace66e69c1bc 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -46,6 +46,14 @@ const char *syscalltbl__name(int e_machine, int id)
{
const struct syscalltbl *table = find_table(e_machine);
+ if (e_machine == EM_MIPS && id > 1000) {
+ /*
+ * MIPS may encode the N32/64/O32 type in the high part of
+ * syscall number. Mask this off if present. See the values of
+ * __NR_N32_Linux, __NR_64_Linux, __NR_O32_Linux and __NR_Linux.
+ */
+ id = id % 1000;
+ }
if (table && id >= 0 && id < table->num_to_name_len)
return table->num_to_name[id];
return NULL;
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
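As a worked illustration of the masking in the patch above: the MIPS uapi headers define the ABI bases as 4000 (O32), 5000 (N64) and 6000 (N32), so a raw id such as 5017 carries the N64 base plus the table index 17. The sketch below repeats the patch's arithmetic in a standalone helper; the enum names and syscall_table_index() are hypothetical and exist only for this example.

```c
#include <elf.h> /* EM_MIPS, EM_X86_64 */

/* ABI base offsets as defined in the MIPS uapi <asm/unistd.h>. */
enum {
	MIPS_NR_O32_BASE = 4000, /* __NR_O32_Linux */
	MIPS_NR_N64_BASE = 5000, /* __NR_64_Linux */
	MIPS_NR_N32_BASE = 6000, /* __NR_N32_Linux */
};

/*
 * Hypothetical helper: return the table index for a raw syscall id,
 * dropping the MIPS ABI base when one is present.
 */
static int syscall_table_index(int e_machine, int id)
{
	if (e_machine == EM_MIPS && id > 1000)
		return id % 1000;
	return id;
}
```

With this helper, syscall_table_index(EM_MIPS, MIPS_NR_N64_BASE + 17) and syscall_table_index(EM_MIPS, 17) both yield 17, while ids for other machine types pass through unchanged.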
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-19 18:56 [PATCH v3 0/8] perf: Support multiple system call tables in the build Ian Rogers
` (7 preceding siblings ...)
2025-02-19 18:56 ` [PATCH v3 8/8] perf syscalltbl: Mask off ABI type for MIPS system calls Ian Rogers
@ 2025-02-25 3:05 ` Namhyung Kim
2025-02-25 4:37 ` Ian Rogers
2025-02-25 3:20 ` Namhyung Kim
9 siblings, 1 reply; 19+ messages in thread
From: Namhyung Kim @ 2025-02-25 3:05 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> This work builds on the clean up of system call tables and removal of
> libaudit by Charlie Jenkins <charlie@rivosinc.com>.
>
> The system call table in perf trace is used to map system call numbers
> to names and vice versa. Prior to these changes, a single table
> matching the perf binary's build was present. The table would be
> incorrect if tracing say a 32-bit binary from a 64-bit version of
> perf, the names and numbers wouldn't match.
>
> Change the build so that a single system call file is built and the
> potentially multiple tables are identifiable from the ELF machine type
> of the process being examined. To determine the ELF machine type, the
> executable's header is read from /proc/pid/exe with fallbacks to using
> the perf's binary type when unknown.
>
> Remove some runtime types used by the system call tables and make
> equivalents generated at build time.
So I tested this with a test program.
$ cat a.c
#include <stdio.h>
int main(void)
{
char buf[4096];
FILE *fp = fopen("a.c", "r");
size_t len;
len = fread(buf, sizeof(buf), 1, fp);
fwrite(buf, 1, len, stdout);
fflush(stdout);
fclose(fp);
return 0;
}
$ gcc -o a64.out a.c
$ gcc -o a32.out -m32 a.c
$ ./perf version
perf version 6.14.rc1.ge002a64f6188
$ git show
commit e002a64f61882626992dd6513c0db3711c06fea7 (HEAD -> perf-check)
Author: Ian Rogers <irogers@google.com>
Date: Wed Feb 19 10:56:57 2025 -0800
perf syscalltbl: Mask off ABI type for MIPS system calls
Arnd Bergmann described that MIPS system calls don't necessarily start
from 0 as an ABI prefix is applied:
https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
When decoding the "id" (aka system call number) for MIPS ignore values
greater-than 1000.
Signed-off-by: Ian Rogers <irogers@google.com>
It works well with the 64-bit binary.
$ sudo ./perf trace ./a64.out |& tail
0.266 ( 0.007 ms): a64.out/858681 munmap(addr: 0x7f392723a000, len: 109058) = 0
0.286 ( 0.002 ms): a64.out/858681 getrandom(ubuf: 0x7f3927232178, len: 8, flags: NONBLOCK) = 8
0.289 ( 0.001 ms): a64.out/858681 brk() = 0x56419ecf7000
0.291 ( 0.002 ms): a64.out/858681 brk(brk: 0x56419ed18000) = 0x56419ed18000
0.299 ( 0.009 ms): a64.out/858681 openat(dfd: CWD, filename: "a.c") = 3
0.312 ( 0.001 ms): a64.out/858681 fstat(fd: 3, statbuf: 0x7ffdfadf1eb0) = 0
0.315 ( 0.002 ms): a64.out/858681 read(fd: 3, buf: 0x7ffdfadf2030, count: 4096) = 211
0.318 ( 0.009 ms): a64.out/858681 read(fd: 3, buf: 0x56419ecf7480, count: 4096) = 0
0.330 ( 0.001 ms): a64.out/858681 close(fd: 3) = 0
0.338 ( ): a64.out/858681 exit_group() = ?
But the 32-bit binary is still broken: it wrongly uses the 64-bit syscall table.
$ file a32.out
a32.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2,
BuildID[sha1]=6eea873c939012e6c715e8f030261642bf61cb4e, for GNU/Linux 3.2.0, not stripped
$ sudo ./perf trace ./a32.out |& tail
0.296 ( 0.001 ms): a32.out/858699 getxattr(pathname: "", name: "������", value: 0xf7f6ce14, size: 1) = 0
0.305 ( 0.007 ms): a32.out/858699 fchmod(fd: -134774784, mode: IFLNK|ISUID|ISVTX|IWOTH|0x10000) = 0
0.333 ( 0.001 ms): a32.out/858699 recvfrom(size: 4160146964, flags: RST|0x20000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1481879552
0.335 ( 0.004 ms): a32.out/858699 recvfrom(fd: 1482014720, ubuf: 0xf7f71278, size: 4160146964, flags: NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482014720
0.355 ( 0.002 ms): a32.out/858699 recvfrom(fd: 1482018816, ubuf: 0x5855d000, size: 4160146964, flags: RST|NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482018816
0.362 ( 0.010 ms): a32.out/858699 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x1b01000000632e62,.iov_len = (__kernel_size_t)1125899909479171,}, pos_h: 4160146964) = 3
0.385 ( 0.002 ms): a32.out/858699 close(fd: 3) = 211
0.388 ( 0.001 ms): a32.out/858699 close(fd: 3) = 0
0.393 ( 0.002 ms): a32.out/858699 lstat(filename: "") = 0
0.396 ( 0.004 ms): a32.out/858699 recvfrom(fd: 1482014720, size: 4160146964, flags: NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482014720
The last 5 should be openat, read, read, close and brk(?).
Thanks,
Namhyung
>
> v3: Add Charlie's reviewed-by tags. Incorporate feedback from Arnd
> Bergmann <arnd@arndb.de> on additional optional column and MIPS
> system call numbering. Rebase past Namhyung's global system call
> statistics and add comments that they don't yet support an
> e_machine other than EM_HOST.
>
> v2: Change the 1 element cache for the last table as suggested by
> Howard Chu, add Howard's reviewed-by tags.
> Add a comment and apology to Charlie for not doing better in
> guiding:
> https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
> After discussion on v1 and he agreed this patch series would be
> the better direction.
>
> Ian Rogers (8):
> perf syscalltble: Remove syscall_table.h
> perf trace: Reorganize syscalls
> perf syscalltbl: Remove struct syscalltbl
> perf thread: Add support for reading the e_machine type for a thread
> perf trace beauty: Add syscalltbl.sh generating all system call tables
> perf syscalltbl: Use lookup table containing multiple architectures
> perf build: Remove Makefile.syscalls
> perf syscalltbl: Mask off ABI type for MIPS system calls
>
> tools/perf/Makefile.perf | 10 +-
> tools/perf/arch/alpha/entry/syscalls/Kbuild | 2 -
> .../alpha/entry/syscalls/Makefile.syscalls | 5 -
> tools/perf/arch/alpha/include/syscall_table.h | 2 -
> tools/perf/arch/arc/entry/syscalls/Kbuild | 2 -
> .../arch/arc/entry/syscalls/Makefile.syscalls | 3 -
> tools/perf/arch/arc/include/syscall_table.h | 2 -
> tools/perf/arch/arm/entry/syscalls/Kbuild | 4 -
> .../arch/arm/entry/syscalls/Makefile.syscalls | 2 -
> tools/perf/arch/arm/include/syscall_table.h | 2 -
> tools/perf/arch/arm64/entry/syscalls/Kbuild | 3 -
> .../arm64/entry/syscalls/Makefile.syscalls | 6 -
> tools/perf/arch/arm64/include/syscall_table.h | 8 -
> tools/perf/arch/csky/entry/syscalls/Kbuild | 2 -
> .../csky/entry/syscalls/Makefile.syscalls | 3 -
> tools/perf/arch/csky/include/syscall_table.h | 2 -
> .../perf/arch/loongarch/entry/syscalls/Kbuild | 2 -
> .../entry/syscalls/Makefile.syscalls | 3 -
> .../arch/loongarch/include/syscall_table.h | 2 -
> tools/perf/arch/mips/entry/syscalls/Kbuild | 2 -
> .../mips/entry/syscalls/Makefile.syscalls | 5 -
> tools/perf/arch/mips/include/syscall_table.h | 2 -
> tools/perf/arch/parisc/entry/syscalls/Kbuild | 3 -
> .../parisc/entry/syscalls/Makefile.syscalls | 6 -
> .../perf/arch/parisc/include/syscall_table.h | 8 -
> tools/perf/arch/powerpc/entry/syscalls/Kbuild | 3 -
> .../powerpc/entry/syscalls/Makefile.syscalls | 6 -
> .../perf/arch/powerpc/include/syscall_table.h | 8 -
> tools/perf/arch/riscv/entry/syscalls/Kbuild | 2 -
> .../riscv/entry/syscalls/Makefile.syscalls | 4 -
> tools/perf/arch/riscv/include/syscall_table.h | 8 -
> tools/perf/arch/s390/entry/syscalls/Kbuild | 2 -
> .../s390/entry/syscalls/Makefile.syscalls | 5 -
> tools/perf/arch/s390/include/syscall_table.h | 2 -
> tools/perf/arch/sh/entry/syscalls/Kbuild | 2 -
> .../arch/sh/entry/syscalls/Makefile.syscalls | 4 -
> tools/perf/arch/sh/include/syscall_table.h | 2 -
> tools/perf/arch/sparc/entry/syscalls/Kbuild | 3 -
> .../sparc/entry/syscalls/Makefile.syscalls | 5 -
> tools/perf/arch/sparc/include/syscall_table.h | 8 -
> tools/perf/arch/x86/entry/syscalls/Kbuild | 3 -
> .../arch/x86/entry/syscalls/Makefile.syscalls | 6 -
> tools/perf/arch/x86/include/syscall_table.h | 8 -
> tools/perf/arch/xtensa/entry/syscalls/Kbuild | 2 -
> .../xtensa/entry/syscalls/Makefile.syscalls | 4 -
> .../perf/arch/xtensa/include/syscall_table.h | 2 -
> tools/perf/builtin-trace.c | 290 +++++++++++-------
> tools/perf/scripts/Makefile.syscalls | 61 ----
> tools/perf/scripts/syscalltbl.sh | 86 ------
> tools/perf/trace/beauty/syscalltbl.sh | 274 +++++++++++++++++
> tools/perf/util/syscalltbl.c | 148 ++++-----
> tools/perf/util/syscalltbl.h | 22 +-
> tools/perf/util/thread.c | 50 +++
> tools/perf/util/thread.h | 14 +-
> 54 files changed, 616 insertions(+), 509 deletions(-)
> delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/alpha/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arc/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/arc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arm/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/arm/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/arm64/include/syscall_table.h
> delete mode 100644 tools/perf/arch/csky/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/csky/include/syscall_table.h
> delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/loongarch/include/syscall_table.h
> delete mode 100644 tools/perf/arch/mips/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/mips/include/syscall_table.h
> delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/parisc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/powerpc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/riscv/include/syscall_table.h
> delete mode 100644 tools/perf/arch/s390/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/s390/include/syscall_table.h
> delete mode 100644 tools/perf/arch/sh/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/sh/include/syscall_table.h
> delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/sparc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/x86/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/x86/include/syscall_table.h
> delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/xtensa/include/syscall_table.h
> delete mode 100644 tools/perf/scripts/Makefile.syscalls
> delete mode 100755 tools/perf/scripts/syscalltbl.sh
> create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
>
> --
> 2.48.1.601.g30ceb7b040-goog
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-19 18:56 [PATCH v3 0/8] perf: Support multiple system call tables in the build Ian Rogers
` (8 preceding siblings ...)
2025-02-25 3:05 ` [PATCH v3 0/8] perf: Support multiple system call tables in the build Namhyung Kim
@ 2025-02-25 3:20 ` Namhyung Kim
2025-02-25 4:22 ` Ian Rogers
9 siblings, 1 reply; 19+ messages in thread
From: Namhyung Kim @ 2025-02-25 3:20 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> This work builds on the clean up of system call tables and removal of
> libaudit by Charlie Jenkins <charlie@rivosinc.com>.
>
> The system call table in perf trace is used to map system call numbers
> to names and vice versa. Prior to these changes, a single table
> matching the perf binary's build was present. The table would be
> incorrect if tracing say a 32-bit binary from a 64-bit version of
> perf, the names and numbers wouldn't match.
>
> Change the build so that a single system call file is built and the
> potentially multiple tables are identifiable from the ELF machine type
> of the process being examined. To determine the ELF machine type, the
> executable's header is read from /proc/pid/exe with fallbacks to using
> the perf's binary type when unknown.
Hmm, then this is limited to live mode and could detect the wrong
machine type when it reads old data, right?
Also, IIUC, falling back to the perf binary means it cannot use a
cross-machine table. For example, it cannot process data from ARM64 on
x86, no? It seems it should use perf_env.arch.
One more concern is BPF. The BPF program should know about the ABI of
the current process so that it can augment the syscall arguments
correctly. Currently it only checks the syscall number, but that can
differ between 32-bit and 64-bit.
Thanks,
Namhyung
>
> Remove some runtime types used by the system call tables and make
> equivalents generated at build time.
>
> v3: Add Charlie's reviewed-by tags. Incorporate feedback from Arnd
> Bergmann <arnd@arndb.de> on additional optional column and MIPS
> system call numbering. Rebase past Namhyung's global system call
> statistics and add comments that they don't yet support an
> e_machine other than EM_HOST.
>
> v2: Change the 1 element cache for the last table as suggested by
> Howard Chu, add Howard's reviewed-by tags.
> Add a comment and apology to Charlie for not doing better in
> guiding:
> https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
> After discussion on v1 and he agreed this patch series would be
> the better direction.
>
> Ian Rogers (8):
> perf syscalltble: Remove syscall_table.h
> perf trace: Reorganize syscalls
> perf syscalltbl: Remove struct syscalltbl
> perf thread: Add support for reading the e_machine type for a thread
> perf trace beauty: Add syscalltbl.sh generating all system call tables
> perf syscalltbl: Use lookup table containing multiple architectures
> perf build: Remove Makefile.syscalls
> perf syscalltbl: Mask off ABI type for MIPS system calls
>
> tools/perf/Makefile.perf | 10 +-
> tools/perf/arch/alpha/entry/syscalls/Kbuild | 2 -
> .../alpha/entry/syscalls/Makefile.syscalls | 5 -
> tools/perf/arch/alpha/include/syscall_table.h | 2 -
> tools/perf/arch/arc/entry/syscalls/Kbuild | 2 -
> .../arch/arc/entry/syscalls/Makefile.syscalls | 3 -
> tools/perf/arch/arc/include/syscall_table.h | 2 -
> tools/perf/arch/arm/entry/syscalls/Kbuild | 4 -
> .../arch/arm/entry/syscalls/Makefile.syscalls | 2 -
> tools/perf/arch/arm/include/syscall_table.h | 2 -
> tools/perf/arch/arm64/entry/syscalls/Kbuild | 3 -
> .../arm64/entry/syscalls/Makefile.syscalls | 6 -
> tools/perf/arch/arm64/include/syscall_table.h | 8 -
> tools/perf/arch/csky/entry/syscalls/Kbuild | 2 -
> .../csky/entry/syscalls/Makefile.syscalls | 3 -
> tools/perf/arch/csky/include/syscall_table.h | 2 -
> .../perf/arch/loongarch/entry/syscalls/Kbuild | 2 -
> .../entry/syscalls/Makefile.syscalls | 3 -
> .../arch/loongarch/include/syscall_table.h | 2 -
> tools/perf/arch/mips/entry/syscalls/Kbuild | 2 -
> .../mips/entry/syscalls/Makefile.syscalls | 5 -
> tools/perf/arch/mips/include/syscall_table.h | 2 -
> tools/perf/arch/parisc/entry/syscalls/Kbuild | 3 -
> .../parisc/entry/syscalls/Makefile.syscalls | 6 -
> .../perf/arch/parisc/include/syscall_table.h | 8 -
> tools/perf/arch/powerpc/entry/syscalls/Kbuild | 3 -
> .../powerpc/entry/syscalls/Makefile.syscalls | 6 -
> .../perf/arch/powerpc/include/syscall_table.h | 8 -
> tools/perf/arch/riscv/entry/syscalls/Kbuild | 2 -
> .../riscv/entry/syscalls/Makefile.syscalls | 4 -
> tools/perf/arch/riscv/include/syscall_table.h | 8 -
> tools/perf/arch/s390/entry/syscalls/Kbuild | 2 -
> .../s390/entry/syscalls/Makefile.syscalls | 5 -
> tools/perf/arch/s390/include/syscall_table.h | 2 -
> tools/perf/arch/sh/entry/syscalls/Kbuild | 2 -
> .../arch/sh/entry/syscalls/Makefile.syscalls | 4 -
> tools/perf/arch/sh/include/syscall_table.h | 2 -
> tools/perf/arch/sparc/entry/syscalls/Kbuild | 3 -
> .../sparc/entry/syscalls/Makefile.syscalls | 5 -
> tools/perf/arch/sparc/include/syscall_table.h | 8 -
> tools/perf/arch/x86/entry/syscalls/Kbuild | 3 -
> .../arch/x86/entry/syscalls/Makefile.syscalls | 6 -
> tools/perf/arch/x86/include/syscall_table.h | 8 -
> tools/perf/arch/xtensa/entry/syscalls/Kbuild | 2 -
> .../xtensa/entry/syscalls/Makefile.syscalls | 4 -
> .../perf/arch/xtensa/include/syscall_table.h | 2 -
> tools/perf/builtin-trace.c | 290 +++++++++++-------
> tools/perf/scripts/Makefile.syscalls | 61 ----
> tools/perf/scripts/syscalltbl.sh | 86 ------
> tools/perf/trace/beauty/syscalltbl.sh | 274 +++++++++++++++++
> tools/perf/util/syscalltbl.c | 148 ++++-----
> tools/perf/util/syscalltbl.h | 22 +-
> tools/perf/util/thread.c | 50 +++
> tools/perf/util/thread.h | 14 +-
> 54 files changed, 616 insertions(+), 509 deletions(-)
> delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/alpha/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arc/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/arc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arm/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/arm/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/arm64/include/syscall_table.h
> delete mode 100644 tools/perf/arch/csky/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/csky/include/syscall_table.h
> delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/loongarch/include/syscall_table.h
> delete mode 100644 tools/perf/arch/mips/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/mips/include/syscall_table.h
> delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/parisc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/powerpc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/riscv/include/syscall_table.h
> delete mode 100644 tools/perf/arch/s390/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/s390/include/syscall_table.h
> delete mode 100644 tools/perf/arch/sh/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/sh/include/syscall_table.h
> delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/sparc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/x86/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/x86/include/syscall_table.h
> delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Kbuild
> delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
> delete mode 100644 tools/perf/arch/xtensa/include/syscall_table.h
> delete mode 100644 tools/perf/scripts/Makefile.syscalls
> delete mode 100755 tools/perf/scripts/syscalltbl.sh
> create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
>
> --
> 2.48.1.601.g30ceb7b040-goog
>
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-25 3:20 ` Namhyung Kim
@ 2025-02-25 4:22 ` Ian Rogers
2025-02-27 0:00 ` Namhyung Kim
0 siblings, 1 reply; 19+ messages in thread
From: Ian Rogers @ 2025-02-25 4:22 UTC (permalink / raw)
To: Namhyung Kim
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Mon, Feb 24, 2025 at 7:20 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> > This work builds on the clean up of system call tables and removal of
> > libaudit by Charlie Jenkins <charlie@rivosinc.com>.
> >
> > The system call table in perf trace is used to map system call numbers
> > to names and vice versa. Prior to these changes, a single table
> > matching the perf binary's build was present. The table would be
> > incorrect if tracing say a 32-bit binary from a 64-bit version of
> > perf, the names and numbers wouldn't match.
> >
> > Change the build so that a single system call file is built and the
> > potentially multiple tables are identifiable from the ELF machine type
> > of the process being examined. To determine the ELF machine type, the
> > executable's header is read from /proc/pid/exe with fallbacks to using
> > the perf's binary type when unknown.
>
> Hmm.. then this is limited to live mode, and it could detect the wrong
> machine type if it reads old data, right?
>
> Also, IIUC, falling back to the perf binary means it cannot use a
> cross-machine table. For example, it cannot process data from ARM64 on
> x86, no? It seems it should use perf_env.arch.

The perf env arch is kind of horrid. On x86 it has the value x86 and
then there is an extra 64bit flag; who knows how x32 should be encoded,
though we barely support x32 as-is. I'd rather we added a new feature
for the e_machine/e_flags of the executable and worked with those, but
that gets weird in system-wide mode. I didn't want to drag that into
this patch series anyway, as there is already enough here.

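As a rough illustration of the e_machine approach, the lookup described
in the cover letter (read the ELF header via /proc/pid/exe, fall back to
perf's own type when unknown) might look like the sketch below. This is
not the code from the series: read_e_machine() and pid_e_machine() are
hypothetical names, and only e_machine is read, not e_flags.

```c
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the e_machine of the ELF file at path,
 * or EM_NONE if the file cannot be read or is not an ELF file.
 */
static uint16_t read_e_machine(const char *path)
{
	/* e_ident (16 bytes), e_type (2 bytes), e_machine (2 bytes). */
	unsigned char hdr[EI_NIDENT + 4];
	uint16_t e_machine = EM_NONE;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return EM_NONE;

	if (read(fd, hdr, sizeof(hdr)) == (ssize_t)sizeof(hdr) &&
	    memcmp(hdr, ELFMAG, SELFMAG) == 0) {
		/* e_machine is at offset 18 in both ELF32 and ELF64 headers. */
		const unsigned char *p = hdr + EI_NIDENT + 2;

		if (hdr[EI_DATA] == ELFDATA2MSB)
			e_machine = (uint16_t)((p[0] << 8) | p[1]);
		else
			e_machine = (uint16_t)(p[0] | (p[1] << 8));
	}
	close(fd);
	return e_machine;
}

/*
 * Hypothetical wrapper for a live process: use /proc/<pid>/exe and fall
 * back to the caller-supplied host type when the lookup fails.
 */
static uint16_t pid_e_machine(pid_t pid, uint16_t host)
{
	char path[64];
	uint16_t e_machine;

	snprintf(path, sizeof(path), "/proc/%d/exe", (int)pid);
	e_machine = read_e_machine(path);
	return e_machine == EM_NONE ? host : e_machine;
}
```

For example, pid_e_machine(getpid(), EM_X86_64) returns the calling
process's own machine type, and the host value only if /proc/<pid>/exe
cannot be read or is not an ELF file. Because this reads a live /proc
entry, it does not help when processing a previously recorded perf.data
file, which is the limitation raised above.
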
> One more concern is BPF. The BPF program should know the ABI of the
> current process so that it can augment the syscall arguments correctly.
> Currently it only checks the syscall number, but that can differ
> between 32-bit and 64-bit.

That's right. This change is trying to clean up
tools/perf/util/syscalltbl.c and the perf trace usage. I didn't go as
far as making the BPF programs pair the system call number with
e_machine and e_flags. There is enough here already, and the behavior
after these patches matches the behavior before: the system call ABI is
assumed to match that of the perf binary.

Thanks,
Ian
> Thanks,
> Namhyung
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-25 3:05 ` [PATCH v3 0/8] perf: Support multiple system call tables in the build Namhyung Kim
@ 2025-02-25 4:37 ` Ian Rogers
2025-02-25 5:40 ` Namhyung Kim
0 siblings, 1 reply; 19+ messages in thread
From: Ian Rogers @ 2025-02-25 4:37 UTC (permalink / raw)
To: Namhyung Kim
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Mon, Feb 24, 2025 at 7:05 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> > This work builds on the clean up of system call tables and removal of
> > libaudit by Charlie Jenkins <charlie@rivosinc.com>.
> >
> > The system call table in perf trace is used to map system call numbers
> > to names and vice versa. Prior to these changes, a single table
> > matching the perf binary's build was present. The table would be
> > incorrect if tracing say a 32-bit binary from a 64-bit version of
> > perf, the names and numbers wouldn't match.
> >
> > Change the build so that a single system call file is built and the
> > potentially multiple tables are identifiable from the ELF machine type
> > of the process being examined. To determine the ELF machine type, the
> > executable's header is read from /proc/pid/exe with fallbacks to using
> > the perf's binary type when unknown.
> >
> > Remove some runtime types used by the system call tables and make
> > equivalents generated at build time.
>
> So I tested this with a test program.
>
> $ cat a.c
> #include <stdio.h>
> int main(void)
> {
> char buf[4096];
> FILE *fp = fopen("a.c", "r");
> size_t len;
>
> len = fread(buf, sizeof(buf), 1, fp);
> fwrite(buf, 1, len, stdout);
> fflush(stdout);
> fclose(fp);
> return 0;
> }
>
> $ gcc -o a64.out a.c
> $ gcc -o a32.out -m32 a.c
>
> $ ./perf version
> perf version 6.14.rc1.ge002a64f6188
>
> $ git show
> commit e002a64f61882626992dd6513c0db3711c06fea7 (HEAD -> perf-check)
> Author: Ian Rogers <irogers@google.com>
> Date: Wed Feb 19 10:56:57 2025 -0800
>
> perf syscalltbl: Mask off ABI type for MIPS system calls
>
> Arnd Bergmann described that MIPS system calls don't necessarily start
> from 0 as an ABI prefix is applied:
> https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
> When decoding the "id" (aka system call number) for MIPS ignore values
> greater-than 1000.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
>
> It works well with 64bit.
>
> $ sudo ./perf trace ./a64.out |& tail
> 0.266 ( 0.007 ms): a64.out/858681 munmap(addr: 0x7f392723a000, len: 109058) = 0
> 0.286 ( 0.002 ms): a64.out/858681 getrandom(ubuf: 0x7f3927232178, len: 8, flags: NONBLOCK) = 8
> 0.289 ( 0.001 ms): a64.out/858681 brk() = 0x56419ecf7000
> 0.291 ( 0.002 ms): a64.out/858681 brk(brk: 0x56419ed18000) = 0x56419ed18000
> 0.299 ( 0.009 ms): a64.out/858681 openat(dfd: CWD, filename: "a.c") = 3
> 0.312 ( 0.001 ms): a64.out/858681 fstat(fd: 3, statbuf: 0x7ffdfadf1eb0) = 0
> 0.315 ( 0.002 ms): a64.out/858681 read(fd: 3, buf: 0x7ffdfadf2030, count: 4096) = 211
> 0.318 ( 0.009 ms): a64.out/858681 read(fd: 3, buf: 0x56419ecf7480, count: 4096) = 0
> 0.330 ( 0.001 ms): a64.out/858681 close(fd: 3) = 0
> 0.338 ( ): a64.out/858681 exit_group() = ?
>
> But 32bit is still broken and wrongly uses the 64bit syscall table.
>
> $ file a32.out
> a32.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2,
> BuildID[sha1]=6eea873c939012e6c715e8f030261642bf61cb4e, for GNU/Linux 3.2.0, not stripped
>
> $ sudo ./perf trace ./a32.out |& tail
> 0.296 ( 0.001 ms): a32.out/858699 getxattr(pathname: "", name: "������", value: 0xf7f6ce14, size: 1) = 0
> 0.305 ( 0.007 ms): a32.out/858699 fchmod(fd: -134774784, mode: IFLNK|ISUID|ISVTX|IWOTH|0x10000) = 0
> 0.333 ( 0.001 ms): a32.out/858699 recvfrom(size: 4160146964, flags: RST|0x20000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1481879552
> 0.335 ( 0.004 ms): a32.out/858699 recvfrom(fd: 1482014720, ubuf: 0xf7f71278, size: 4160146964, flags: NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482014720
> 0.355 ( 0.002 ms): a32.out/858699 recvfrom(fd: 1482018816, ubuf: 0x5855d000, size: 4160146964, flags: RST|NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482018816
> 0.362 ( 0.010 ms): a32.out/858699 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x1b01000000632e62,.iov_len = (__kernel_size_t)1125899909479171,}, pos_h: 4160146964) = 3
> 0.385 ( 0.002 ms): a32.out/858699 close(fd: 3) = 211
> 0.388 ( 0.001 ms): a32.out/858699 close(fd: 3) = 0
> 0.393 ( 0.002 ms): a32.out/858699 lstat(filename: "") = 0
> 0.396 ( 0.004 ms): a32.out/858699 recvfrom(fd: 1482014720, size: 4160146964, flags: NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482014720
>
> The last 5 should be openat, read, read, close and brk(?).
That's strange as nearly the same test works for me:
```
$ git show
commit 7920020237af8138f7be1a21be9a2918a71ddc5e (HEAD -> ptn-syscalltbl)
Author: Ian Rogers <irogers@google.com>
Date: Fri Jan 31 21:34:07 2025 -0800

    perf syscalltbl: Mask off ABI type for MIPS system calls

    Arnd Bergmann described that MIPS system calls don't necessarily start
    from 0 as an ABI prefix is applied:
    https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
    When decoding the "id" (aka system call number) for MIPS ignore values
    greater-than 1000.

    Signed-off-by: Ian Rogers <irogers@google.com>
..
$ file a.out
a.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=3fcd28f85a27a3108941661a91dbc675c06868f9, for GNU/Linux 3.2.0, not stripped
$ sudo /tmp/perf/perf trace ./a.out
...
? ( ): a.out/218604 ... [continued]: execve()) = 0
0.067 ( 0.003 ms): a.out/218604 brk() = 0x5749e000
0.154 ( 0.007 ms): a.out/218604 access(filename: 0xf7fc7f28, mode: R) = -1 ENOENT (No such file or directory)
0.168 ( 0.023 ms): a.out/218604 openat(dfd: CWD, filename: 0xf7fc44c3, flags: RDONLY|CLOEXEC|LARGEFILE) = 3
0.193 ( 0.006 ms): a.out/218604 statx(dfd: 3</proc/218604/status>, filename: 0xf7fc510a, flags: NO_AUTOMOUNT|EMPTY_PATH, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS, buffer: 0xffaa6b88) = 0
0.212 ( 0.002 ms): a.out/218604 close(fd: 3</proc/218604/status>) = 0
0.233 ( 0.019 ms): a.out/218604 openat(dfd: CWD, filename: 0xf7f973e0, flags: RDONLY|CLOEXEC|LARGEFILE) = 3
0.255 ( 0.004 ms): a.out/218604 read(fd: 3</proc/218604/status>, buf: 0xffaa6df0, count: 512) = 512
0.262 ( 0.003 ms): a.out/218604 statx(dfd: 3</proc/218604/status>, filename: 0xf7fc510a, flags: NO_AUTOMOUNT|EMPTY_PATH, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS, buffer: 0xffaa6b38) = 0
0.347 ( 0.002 ms): a.out/218604 close(fd: 3</proc/218604/status>) = 0
0.372 ( 0.002 ms): a.out/218604 set_tid_address(tidptr: 0xf7f98528) = 218604 (a.out)
0.376 ( 0.002 ms): a.out/218604 set_robust_list(head: 0xf7f9852c, len: 12) =
0.381 ( 0.002 ms): a.out/218604 rseq(rseq: 0xf7f98960, rseq_len: 32, sig: 1392848979) =
0.469 ( 0.010 ms): a.out/218604 mprotect(start: 0xf7f6e000, len: 8192, prot: READ) = 0
0.489 ( 0.007 ms): a.out/218604 mprotect(start: 0x5661a000, len: 4096, prot: READ) = 0
0.503 ( 0.007 ms): a.out/218604 mprotect(start: 0xf7fd0000, len: 8192, prot: READ) = 0
0.550 ( 0.015 ms): a.out/218604 munmap(addr: 0xf7f7b000, len: 111198) = 0
0.589 ( 0.035 ms): a.out/218604 openat(dfd: CWD, filename: 0x56619008) = 3
0.627 ( 0.024 ms): a.out/218604 read(fd: 3</proc/218604/status>, buf: 0xffaa68fc, count: 4096) = 1437
0.654 ( 0.090 ms): a.out/218604 write(fd: 1</dev/pts/3>, buf: , count: 1437) = 1437
0.766 (1000.164 ms): a.out/218604 clock_nanosleep(rqtp: 0xffaa6824, rmtp: 0xffaa681c) = 0
1000.942 ( ): a.out/218604 exit_group()
$ file /tmp/perf/perf
/tmp/perf/perf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=60b07f65d2559a7193b2d1d36cfa00054dfbd076, for GNU/Linux 3.2.0, with debug_info, not stripped
```
Perhaps your a.out binary was built as an x32 one?
Looking under the covers with gdb:
```
$ sudo gdb --args /tmp/perf/perf trace ./a.out
GNU gdb (Debian 15.1-1) 15.1
...
Reading symbols from /tmp/perf/perf...
(gdb) b syscalltbl__name
Breakpoint 1 at 0x23a51b: file util/syscalltbl.c, line 47.
(gdb) r
...
[Detaching after vfork from child process 218826]
Breakpoint 1, syscalltbl__name (e_machine=3, id=11) at util/syscalltbl.c:47
47 const struct syscalltbl *table = find_table(e_machine);
```
So the e_machine is 3 which corresponds to EM_386.
I've not fixed every use of syscalltbl but I believe this one is working.
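
To make the effect of e_machine concrete, here is a minimal sketch of a
per-machine lookup. It is not the generated table or the find_table()
implementation from util/syscalltbl.c; sc_name() and the table layout
are hypothetical, and only a handful of i386 and x86-64 entries are
included. The numbers themselves are the standard ones from the kernel's
syscall_32.tbl and syscall_64.tbl.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative excerpts only; the real tables are generated at build time. */
struct sc_entry {
	int id;
	const char *name;
};

static const struct sc_entry i386_calls[] = {
	{ 3, "read" }, { 6, "close" }, { 11, "execve" },
	{ 45, "brk" }, { 295, "openat" },
};

static const struct sc_entry x86_64_calls[] = {
	{ 3, "close" }, { 6, "lstat" }, { 11, "munmap" },
	{ 45, "recvfrom" }, { 295, "preadv" },
};

/* One table per ELF machine type (hypothetical layout). */
struct sc_table {
	uint16_t e_machine;
	const struct sc_entry *calls;
	size_t nr;
};

#define NR(a) (sizeof(a) / sizeof((a)[0]))

static const struct sc_table tables[] = {
	{ EM_386, i386_calls, NR(i386_calls) },
	{ EM_X86_64, x86_64_calls, NR(x86_64_calls) },
};

/* Map (e_machine, id) to a name; NULL if the machine or id is unknown. */
static const char *sc_name(uint16_t e_machine, int id)
{
	for (size_t t = 0; t < NR(tables); t++) {
		if (tables[t].e_machine != e_machine)
			continue;
		for (size_t i = 0; i < tables[t].nr; i++) {
			if (tables[t].calls[i].id == id)
				return tables[t].calls[i].name;
		}
	}
	return NULL;
}
```

With these entries, sc_name(EM_386, 11) gives "execve", consistent with
the first traced call being execve, while the same id under EM_X86_64
gives "munmap". Likewise ids 295, 3, 6 and 45 decode as openat, read,
close and brk for EM_386 but as preadv, close, lstat and recvfrom for
EM_X86_64, which is the pattern in the broken 32-bit output quoted
earlier.
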
Thanks,
Ian
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-25 4:37 ` Ian Rogers
@ 2025-02-25 5:40 ` Namhyung Kim
2025-02-26 2:47 ` Namhyung Kim
0 siblings, 1 reply; 19+ messages in thread
From: Namhyung Kim @ 2025-02-25 5:40 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Mon, Feb 24, 2025 at 08:37:01PM -0800, Ian Rogers wrote:
> On Mon, Feb 24, 2025 at 7:05 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> > > This work builds on the clean up of system call tables and removal of
> > > libaudit by Charlie Jenkins <charlie@rivosinc.com>.
> > >
> > > The system call table in perf trace is used to map system call numbers
> > > to names and vice versa. Prior to these changes, a single table
> > > matching the perf binary's build was present. The table would be
> > > incorrect if tracing say a 32-bit binary from a 64-bit version of
> > > perf, the names and numbers wouldn't match.
> > >
> > > Change the build so that a single system call file is built and the
> > > potentially multiple tables are identifiable from the ELF machine type
> > > of the process being examined. To determine the ELF machine type, the
> > > executable's header is read from /proc/pid/exe with fallbacks to using
> > > the perf's binary type when unknown.
> > >
> > > Remove some runtime types used by the system call tables and make
> > > equivalents generated at build time.
> >
> > So I tested this with a test program.
> >
> > $ cat a.c
> > #include <stdio.h>
> > int main(void)
> > {
> > char buf[4096];
> > FILE *fp = fopen("a.c", "r");
> > size_t len;
> >
> > len = fread(buf, sizeof(buf), 1, fp);
> > fwrite(buf, 1, len, stdout);
> > fflush(stdout);
> > fclose(fp);
> > return 0;
> > }
> >
> > $ gcc -o a64.out a.c
> > $ gcc -o a32.out -m32 a.c
> >
> > $ ./perf version
> > perf version 6.14.rc1.ge002a64f6188
> >
> > $ git show
> > commit e002a64f61882626992dd6513c0db3711c06fea7 (HEAD -> perf-check)
> > Author: Ian Rogers <irogers@google.com>
> > Date: Wed Feb 19 10:56:57 2025 -0800
> >
> > perf syscalltbl: Mask off ABI type for MIPS system calls
> >
> > Arnd Bergmann described that MIPS system calls don't necessarily start
> > from 0 as an ABI prefix is applied:
> > https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
> > When decoding the "id" (aka system call number) for MIPS ignore values
> > greater-than 1000.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> >
> > It works well with 64bit.
> >
> > $ sudo ./perf trace ./a64.out |& tail
> > 0.266 ( 0.007 ms): a64.out/858681 munmap(addr: 0x7f392723a000, len: 109058) = 0
> > 0.286 ( 0.002 ms): a64.out/858681 getrandom(ubuf: 0x7f3927232178, len: 8, flags: NONBLOCK) = 8
> > 0.289 ( 0.001 ms): a64.out/858681 brk() = 0x56419ecf7000
> > 0.291 ( 0.002 ms): a64.out/858681 brk(brk: 0x56419ed18000) = 0x56419ed18000
> > 0.299 ( 0.009 ms): a64.out/858681 openat(dfd: CWD, filename: "a.c") = 3
> > 0.312 ( 0.001 ms): a64.out/858681 fstat(fd: 3, statbuf: 0x7ffdfadf1eb0) = 0
> > 0.315 ( 0.002 ms): a64.out/858681 read(fd: 3, buf: 0x7ffdfadf2030, count: 4096) = 211
> > 0.318 ( 0.009 ms): a64.out/858681 read(fd: 3, buf: 0x56419ecf7480, count: 4096) = 0
> > 0.330 ( 0.001 ms): a64.out/858681 close(fd: 3) = 0
> > 0.338 ( ): a64.out/858681 exit_group() = ?
> >
> > But 32-bit is still broken and wrongly uses the 64-bit syscall table.
> >
> > $ file a32.out
> > a32.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2,
> > BuildID[sha1]=6eea873c939012e6c715e8f030261642bf61cb4e, for GNU/Linux 3.2.0, not stripped
> >
> > $ sudo ./perf trace ./a32.out |& tail
> > 0.296 ( 0.001 ms): a32.out/858699 getxattr(pathname: "", name: "������", value: 0xf7f6ce14, size: 1) = 0
> > 0.305 ( 0.007 ms): a32.out/858699 fchmod(fd: -134774784, mode: IFLNK|ISUID|ISVTX|IWOTH|0x10000) = 0
> > 0.333 ( 0.001 ms): a32.out/858699 recvfrom(size: 4160146964, flags: RST|0x20000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1481879552
> > 0.335 ( 0.004 ms): a32.out/858699 recvfrom(fd: 1482014720, ubuf: 0xf7f71278, size: 4160146964, flags: NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482014720
> > 0.355 ( 0.002 ms): a32.out/858699 recvfrom(fd: 1482018816, ubuf: 0x5855d000, size: 4160146964, flags: RST|NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482018816
> > 0.362 ( 0.010 ms): a32.out/858699 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x1b01000000632e62,.iov_len = (__kernel_size_t)1125899909479171,}, pos_h: 4160146964) = 3
> > 0.385 ( 0.002 ms): a32.out/858699 close(fd: 3) = 211
> > 0.388 ( 0.001 ms): a32.out/858699 close(fd: 3) = 0
> > 0.393 ( 0.002 ms): a32.out/858699 lstat(filename: "") = 0
> > 0.396 ( 0.004 ms): a32.out/858699 recvfrom(fd: 1482014720, size: 4160146964, flags: NOSIGNAL|MORE|WAITFORONE|BATCH|SPLICE_PAGES|CMSG_CLOEXEC|0x10500000, addr: 0xf7f6ce14, addr_len: 0xf7f71278) = 1482014720
> >
> > The last 5 should be openat, read, read, close and brk(?).
>
> That's strange as nearly the same test works for me:
> ```
> $ git show
> commit 7920020237af8138f7be1a21be9a2918a71ddc5e (HEAD -> ptn-syscalltbl)
> Author: Ian Rogers <irogers@google.com>
> Date: Fri Jan 31 21:34:07 2025 -0800
>
> perf syscalltbl: Mask off ABI type for MIPS system calls
>
> Arnd Bergmann described that MIPS system calls don't necessarily start
> from 0 as an ABI prefix is applied:
> https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
> When decoding the "id" (aka system call number) for MIPS ignore values
> greater-than 1000.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ..
> $ file a.out
> a.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV),
> dynamically linked, interpreter /lib/ld-linux.so.2,
> BuildID[sha1]=3fcd28f85a27a3108941661a91dbc675c06868f9, for GNU/Linux
> 3.2.0, not stripped
> $ sudo /tmp/perf/perf trace ./a.out
> ...
> ? ( ): a.out/218604 ... [continued]: execve())
> = 0
> 0.067 ( 0.003 ms): a.out/218604 brk()
> = 0x5749e000
> 0.154 ( 0.007 ms): a.out/218604 access(filename: 0xf7fc7f28,
> mode: R) = -1 ENOENT (No such file or
> directory)
> 0.168 ( 0.023 ms): a.out/218604 openat(dfd: CWD, filename:
> 0xf7fc44c3, flags: RDONLY|CLOEXEC|LARGEFILE) = 3
> 0.193 ( 0.006 ms): a.out/218604 statx(dfd:
> 3</proc/218604/status>, filename: 0xf7fc510a, flags:
> NO_AUTOMOUNT|EMPTY_PATH, mask:
> TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS, buffer:
> 0xffaa6b88) = 0
> 0.212 ( 0.002 ms): a.out/218604 close(fd: 3</proc/218604/status>)
> = 0
> 0.233 ( 0.019 ms): a.out/218604 openat(dfd: CWD, filename:
> 0xf7f973e0, flags: RDONLY|CLOEXEC|LARGEFILE) = 3
> 0.255 ( 0.004 ms): a.out/218604 read(fd: 3</proc/218604/status>,
> buf: 0xffaa6df0, count: 512) = 512
> 0.262 ( 0.003 ms): a.out/218604 statx(dfd:
> 3</proc/218604/status>, filename: 0xf7fc510a, flags:
> NO_AUTOMOUNT|EMPTY_PATH, mask:
> TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS, buffer:
> 0xffaa6b38) = 0
> 0.347 ( 0.002 ms): a.out/218604 close(fd: 3</proc/218604/status>)
> = 0
> 0.372 ( 0.002 ms): a.out/218604 set_tid_address(tidptr:
> 0xf7f98528) = 218604 (a.out)
> 0.376 ( 0.002 ms): a.out/218604 set_robust_list(head: 0xf7f9852c,
> len: 12) =
> 0.381 ( 0.002 ms): a.out/218604 rseq(rseq: 0xf7f98960, rseq_len:
> 32, sig: 1392848979) =
> 0.469 ( 0.010 ms): a.out/218604 mprotect(start: 0xf7f6e000, len:
> 8192, prot: READ) = 0
> 0.489 ( 0.007 ms): a.out/218604 mprotect(start: 0x5661a000, len:
> 4096, prot: READ) = 0
> 0.503 ( 0.007 ms): a.out/218604 mprotect(start: 0xf7fd0000, len:
> 8192, prot: READ) = 0
> 0.550 ( 0.015 ms): a.out/218604 munmap(addr: 0xf7f7b000, len:
> 111198) = 0
> 0.589 ( 0.035 ms): a.out/218604 openat(dfd: CWD, filename:
> 0x56619008) = 3
> 0.627 ( 0.024 ms): a.out/218604 read(fd: 3</proc/218604/status>,
> buf: 0xffaa68fc, count: 4096) = 1437
> 0.654 ( 0.090 ms): a.out/218604 write(fd: 1</dev/pts/3>, buf: ,
> count: 1437) = 1437
> 0.766 (1000.164 ms): a.out/218604 clock_nanosleep(rqtp:
> 0xffaa6824, rmtp: 0xffaa681c) = 0
> 1000.942 ( ): a.out/218604 exit_group()
> $ file /tmp/perf/perf
> /tmp/perf/perf: ELF 64-bit LSB pie executable, x86-64, version 1
> (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2,
> BuildID[sha1]=60b07f65d2559a7193b2d1d36cfa00054dfbd076, for GNU/Linux
> 3.2.0, with debug_info, not stripped
> ```
> Perhaps your a.out binary was built as an x32 one?
> Looking under the covers with gdb:
> ```
> $ sudo gdb --args /tmp/perf/perf trace ./a.out
> GNU gdb (Debian 15.1-1) 15.1
> ...
> Reading symbols from /tmp/perf/perf...
> (gdb) b syscalltbl__name
> Breakpoint 1 at 0x23a51b: file util/syscalltbl.c, line 47.
> (gdb) r
> ...
> [Detaching after vfork from child process 218826]
>
> Breakpoint 1, syscalltbl__name (e_machine=3, id=11) at util/syscalltbl.c:47
> 47 const struct syscalltbl *table = find_table(e_machine);
> ```
> So the e_machine is 3 which corresponds to EM_386.
>
> I've not fixed every use of syscalltbl but I believe this one is working.

Strange. I'm seeing 62 (x86_64).

$ sudo gdb -q --args ./perf trace ./a32.out
Reading symbols from ./perf...
(gdb) b syscalltbl__name
Breakpoint 1 at 0x27998b: file util/syscalltbl.c, line 46.
(gdb) r
Starting program: /home/namhyung/tmp/perf trace ./a32.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Detaching after fork from child process 886888]

Breakpoint 1, syscalltbl__name (e_machine=62, id=156) at util/syscalltbl.c:46
46 {

But the binary is i386.

$ file a32.out
a32.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2,
BuildID[sha1]=6eea873c939012e6c715e8f030261642bf61cb4e, for GNU/Linux 3.2.0, not stripped

$ readelf -h a32.out
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Position-Independent Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x10a0
Start of program headers: 52 (bytes into file)
Start of section headers: 13932 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 11
Size of section headers: 40 (bytes)
Number of section headers: 30
Section header string table index: 29

$ hexdump -C -n 32 a32.out
00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 03 00 03 00 01 00 00 00 a0 10 00 00 34 00 00 00 |............4...|
00000020 ----- -----
           ^     ^
           |     |
        ET_DYN   |
              EM_386
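
For reference, the two fields marked in the dump can be decoded by hand. The following is only a minimal sketch under stated assumptions, not the code perf uses: the `sketch_*` names are hypothetical, the offsets (16 for e_type, 18 for e_machine) and the EI_DATA byte-order flag come from the ELF specification, and only the first 20 bytes of the file are inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Field positions and values from the ELF specification (see <elf.h>). */
enum {
	SKETCH_EI_DATA       = 5,  /* e_ident[] index of the byte-order flag */
	SKETCH_ELFDATA2MSB   = 2,  /* EI_DATA value for big-endian files */
	SKETCH_E_TYPE_OFF    = 16, /* e_type follows the 16-byte e_ident[] */
	SKETCH_E_MACHINE_OFF = 18, /* e_machine follows e_type */
	SKETCH_MIN_HDR       = 20  /* bytes needed to reach e_machine */
};

/*
 * Hypothetical helper: read the 16-bit header field at 'off', honouring the
 * byte order the header declares. Returns -1 if the buffer is too short or
 * does not start with the ELF magic.
 */
static int sketch_elf_u16(const unsigned char *hdr, size_t len, size_t off)
{
	assert(hdr != NULL);
	if (len < SKETCH_MIN_HDR || hdr[0] != 0x7f || hdr[1] != 'E' ||
	    hdr[2] != 'L' || hdr[3] != 'F')
		return -1;
	if (hdr[SKETCH_EI_DATA] == SKETCH_ELFDATA2MSB)
		return (hdr[off] << 8) | hdr[off + 1];
	return hdr[off] | (hdr[off + 1] << 8);
}

/* e_type: 3 is ET_DYN. */
int sketch_elf_type(const unsigned char *hdr, size_t len)
{
	return sketch_elf_u16(hdr, len, SKETCH_E_TYPE_OFF);
}

/* e_machine: 3 is EM_386, 62 is EM_X86_64. */
int sketch_elf_machine(const unsigned char *hdr, size_t len)
{
	return sketch_elf_u16(hdr, len, SKETCH_E_MACHINE_OFF);
}
```

Fed the 32 bytes from the hexdump above, both helpers return 3 (ET_DYN and EM_386), so the 62 seen in gdb does not come from the file contents.
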
Thanks,
Namhyung
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-25 5:40 ` Namhyung Kim
@ 2025-02-26 2:47 ` Namhyung Kim
2025-02-26 23:47 ` Namhyung Kim
0 siblings, 1 reply; 19+ messages in thread
From: Namhyung Kim @ 2025-02-26 2:47 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Mon, Feb 24, 2025 at 09:40:55PM -0800, Namhyung Kim wrote:
> Strange. I'm seeing 62 (x86_64).
>
> $ sudo gdb -q --args ./perf trace ./a32.out
> Reading symbols from ./perf...
> (gdb) b syscalltbl__name
> Breakpoint 1 at 0x27998b: file util/syscalltbl.c, line 46.
> (gdb) r
> Starting program: /home/namhyung/tmp/perf trace ./a32.out
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> [Detaching after fork from child process 886888]
>
> Breakpoint 1, syscalltbl__name (e_machine=62, id=156) at util/syscalltbl.c:46
> 46 {
>
> But the binary is i386.
>
> $ file a32.out
> a32.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2,
> BuildID[sha1]=6eea873c939012e6c715e8f030261642bf61cb4e, for GNU/Linux 3.2.0, not stripped
>
> $ readelf -h a32.out
> ELF Header:
> Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
> Class: ELF32
> Data: 2's complement, little endian
> Version: 1 (current)
> OS/ABI: UNIX - System V
> ABI Version: 0
> Type: DYN (Position-Independent Executable file)
> Machine: Intel 80386
> Version: 0x1
> Entry point address: 0x10a0
> Start of program headers: 52 (bytes into file)
> Start of section headers: 13932 (bytes into file)
> Flags: 0x0
> Size of this header: 52 (bytes)
> Size of program headers: 32 (bytes)
> Number of program headers: 11
> Size of section headers: 40 (bytes)
> Number of section headers: 30
> Section header string table index: 29
>
> $ hexdump -C -n 32 a32.out
> 00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
> 00000010 03 00 03 00 01 00 00 00 a0 10 00 00 34 00 00 00 |............4...|
> 00000020 ----- -----
>            ^     ^
>            |     |
>         ET_DYN   |
>               EM_386
>
I found that it failed to open /proc/PID/exe for some reason. The open
failed with ENOENT, even though I've confirmed that the /proc/PID directory
exists. Strange...
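
To make that kind of failure visible instead of silently ending up with the host's table, the lookup can report why it fell back. This is a minimal sketch under stated assumptions, not the code in this series: `sketch_pid_e_machine` is a hypothetical name, the fallback value is supplied by the caller, and the reason for the ENOENT above is not established in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the e_machine of the executable behind 'pid'.
 * If /proc/<pid>/exe cannot be opened or is not an ELF file, return
 * 'fallback' and store the reason in '*err' ('*err' is 0 on success).
 */
int sketch_pid_e_machine(pid_t pid, int fallback, int *err)
{
	unsigned char hdr[20]; /* e_ident[16] + e_type + e_machine */
	char path[64];
	ssize_t n;
	int fd;

	assert(err != NULL);
	*err = 0;
	snprintf(path, sizeof(path), "/proc/%ld/exe", (long)pid);
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		*err = errno; /* e.g. ENOENT, as reported in this thread */
		return fallback;
	}
	n = read(fd, hdr, sizeof(hdr));
	if (n < 0)
		*err = errno;
	close(fd);
	if (n != (ssize_t)sizeof(hdr) || memcmp(hdr, "\177ELF", 4) != 0) {
		if (*err == 0)
			*err = ENOEXEC; /* short read or not an ELF file */
		return fallback;
	}
	/* e_machine is at offset 18; e_ident[5] == 2 means big-endian. */
	if (hdr[5] == 2)
		return (hdr[18] << 8) | hdr[19];
	return hdr[18] | (hdr[19] << 8);
}
```

Called with 62 (EM_X86_64) as the fallback, a failed open returns 62, which matches the e_machine seen in gdb, while `*err` holds the errno that explains the fallback.
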
Thanks,
Namhyung
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-26 2:47 ` Namhyung Kim
@ 2025-02-26 23:47 ` Namhyung Kim
0 siblings, 0 replies; 19+ messages in thread
From: Namhyung Kim @ 2025-02-26 23:47 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Tue, Feb 25, 2025 at 06:47:58PM -0800, Namhyung Kim wrote:
> On Mon, Feb 24, 2025 at 09:40:55PM -0800, Namhyung Kim wrote:
> > Strange. I'm seeing 62 (x86_64).
> >
> > $ sudo gdb -q --args ./perf trace ./a32.out
> > Reading symbols from ./perf...
> > (gdb) b syscalltbl__name
> > Breakpoint 1 at 0x27998b: file util/syscalltbl.c, line 46.
> > (gdb) r
> > Starting program: /home/namhyung/tmp/perf trace ./a32.out
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > [Detaching after fork from child process 886888]
> >
> > Breakpoint 1, syscalltbl__name (e_machine=62, id=156) at util/syscalltbl.c:46
> > 46 {
> >
> > But the binary is i386.
> >
> > $ file a32.out
> > a32.out: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2,
> > BuildID[sha1]=6eea873c939012e6c715e8f030261642bf61cb4e, for GNU/Linux 3.2.0, not stripped
> >
> > $ readelf -h a32.out
> > ELF Header:
> > Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
> > Class: ELF32
> > Data: 2's complement, little endian
> > Version: 1 (current)
> > OS/ABI: UNIX - System V
> > ABI Version: 0
> > Type: DYN (Position-Independent Executable file)
> > Machine: Intel 80386
> > Version: 0x1
> > Entry point address: 0x10a0
> > Start of program headers: 52 (bytes into file)
> > Start of section headers: 13932 (bytes into file)
> > Flags: 0x0
> > Size of this header: 52 (bytes)
> > Size of program headers: 32 (bytes)
> > Number of program headers: 11
> > Size of section headers: 40 (bytes)
> > Number of section headers: 30
> > Section header string table index: 29
> >
> > $ hexdump -C -n 32 a32.out
> > 00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
> > 00000010 03 00 03 00 01 00 00 00 a0 10 00 00 34 00 00 00 |............4...|
> > 00000020 ----- -----
> > ^ ^
> > | |
> > ET_DYN |
> > EM_386
> >
>
> I found it failed to open /proc/PID/exe for some reason. It failed with
> ENOENT but I've confirmed there's /proc/PID directory. Strange...
It sometimes succeeded and showed the correct syscall names. :(
I don't know what the problem is on my machine. But I think this is a
pre-existing problem and this patch improves it.
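
For reference, the lookup being discussed can be sketched roughly as below.
This is only an illustration under assumptions, not the code from the patch:
the sketch_* names are hypothetical, and the caller supplies the fallback
(an EM_HOST-like value). It reads the first 20 bytes of the executable,
checks the ELF magic, and takes e_machine from byte offset 18, which is
where the hexdump above shows the value 3 (EM_386). Any failure, such as
the ENOENT seen here, yields the fallback.

```c
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Read e_machine from the ELF header of the file at path. Returns
 * fallback when the file cannot be opened (e.g. ENOENT) or is not ELF.
 * Hypothetical helper, not perf's thread__get_e_machine().
 */
uint16_t sketch_read_e_machine(const char *path, uint16_t fallback)
{
	unsigned char hdr[20];	/* e_ident (16) + e_type (2) + e_machine (2) */
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return fallback;
	n = read(fd, hdr, sizeof(hdr));
	close(fd);
	if (n != (ssize_t)sizeof(hdr) || memcmp(hdr, ELFMAG, SELFMAG) != 0)
		return fallback;

	/* e_machine sits at byte offset 18 in both ELF32 and ELF64. */
	if (hdr[EI_DATA] == ELFDATA2LSB)
		return (uint16_t)(hdr[18] | (hdr[19] << 8));
	if (hdr[EI_DATA] == ELFDATA2MSB)
		return (uint16_t)((hdr[18] << 8) | hdr[19]);
	return fallback;
}

/* Build the /proc/<pid>/exe path and read the machine type from it. */
uint16_t sketch_pid_e_machine(int pid, uint16_t fallback)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%d/exe", pid);
	return sketch_read_e_machine(path, fallback);
}
```

With a fallback like this, an intermittent open failure silently selects the
host table, which would match seeing e_machine=62 for an i386 binary.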
Thanks,
Namhyung
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-25 4:22 ` Ian Rogers
@ 2025-02-27 0:00 ` Namhyung Kim
2025-02-27 5:24 ` Ian Rogers
0 siblings, 1 reply; 19+ messages in thread
From: Namhyung Kim @ 2025-02-27 0:00 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Mon, Feb 24, 2025 at 08:22:50PM -0800, Ian Rogers wrote:
> On Mon, Feb 24, 2025 at 7:20 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> > > This work builds on the clean up of system call tables and removal of
> > > libaudit by Charlie Jenkins <charlie@rivosinc.com>.
> > >
> > > The system call table in perf trace is used to map system call numbers
> > > to names and vice versa. Prior to these changes, a single table
> > > matching the perf binary's build was present. The table would be
> > > incorrect if tracing say a 32-bit binary from a 64-bit version of
> > > perf, the names and numbers wouldn't match.
> > >
> > > Change the build so that a single system call file is built and the
> > > potentially multiple tables are identifiable from the ELF machine type
> > > of the process being examined. To determine the ELF machine type, the
> > > executable's header is read from /proc/pid/exe with fallbacks to using
> > > the perf's binary type when unknown.
> >
> > Hmm.. then this is limited to live mode and potentially detect wrong
> > machine type if it reads an old data, right?
> >
> > Also IIUC fallback to the perf binary means it cannot use cross-machine
> > table. For example, it cannot process data from ARM64 on x86, no? It
> > seems it should use perf_env.arch.
>
> The perf env arch is kind of horrid. On x86 it has the value x86 and
> then there is an extra 64bit flag, who knows how x32 should be encoded
> - but we barely support x32 as-is. I'd rather we added a new feature
> for the e_machine/e_flags of the executable and worked with those, but
> it is kind of weird with doing system wide mode. I didn't want to drag
> that into this patch series anyway as there is already enough here.
Right, I don't know how to handle x32 properly. Maybe we can just
ignore it for now.
But anyway looking at /proc/PID for recorded data doesn't seem correct.
Can you please add a flag to do that only from trace__run() and just use
EM_HOST for trace__replay()?
Later, we may need to add a misc flag or so to PERF_RECORD_FORK (and
PERF_RECORD_COMM with MISC_COMM_EXEC) to indicate non-standard ABI for a
new thread. But it's not clear how to make it arch-independent.
>
> > One more concern is BPF. The BPF should know about the ABI of the
> > current process so that it can augment the syscall arguments correctly.
> > Currently it only checks the syscall number but it can be different on
> > 32-bit and 64-bit.
>
> That's right. This change is trying to clean up
> tools/perf/util/syscalltbl.c and the perf trace usage. I didn't go as
> far as making BPF programs pair system call number with e_machine and
> e_flags, there is enough here and the behavior after these patches
> matches the behavior before - that is to assume the system call ABI
> matches that of the perf binary.
Right, the next step would be adding a BPF kfunc to identify the current
ABI.
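
To make the concern concrete, here is a small sketch of why the syscall
number alone is ambiguous. It is not the generated table from this series:
the sketch_* names and the table layout are assumptions, and only a handful
of entries are listed. The numbers themselves are the usual i386 and x86_64
ones, where 11 is execve on i386 but munmap on x86_64.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EM_386     3	/* same value as EM_386 */
#define SKETCH_EM_X86_64 62	/* same value as EM_X86_64 */

struct sketch_syscall {
	int id;
	const char *name;
};

struct sketch_table {
	uint16_t e_machine;
	const struct sketch_syscall *entries;
	size_t nr;
};

/* A few entries only; the real tables are generated at build time. */
static const struct sketch_syscall sketch_i386[] = {
	{ 3, "read" }, { 4, "write" }, { 11, "execve" },
};

static const struct sketch_syscall sketch_x86_64[] = {
	{ 0, "read" }, { 1, "write" }, { 11, "munmap" }, { 59, "execve" },
};

static const struct sketch_table sketch_tables[] = {
	{ SKETCH_EM_386, sketch_i386,
	  sizeof(sketch_i386) / sizeof(sketch_i386[0]) },
	{ SKETCH_EM_X86_64, sketch_x86_64,
	  sizeof(sketch_x86_64) / sizeof(sketch_x86_64[0]) },
};

/* Map (e_machine, id) to a name; NULL if the table or id is unknown. */
const char *sketch_syscall_name(uint16_t e_machine, int id)
{
	size_t t, i;

	for (t = 0; t < sizeof(sketch_tables) / sizeof(sketch_tables[0]); t++) {
		if (sketch_tables[t].e_machine != e_machine)
			continue;
		for (i = 0; i < sketch_tables[t].nr; i++) {
			if (sketch_tables[t].entries[i].id == id)
				return sketch_tables[t].entries[i].name;
		}
		return NULL;
	}
	return NULL;
}
```

A BPF program that keys only on id 11 cannot tell these two cases apart,
so it would need the ABI as part of the key.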
Thanks,
Namhyung
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-27 0:00 ` Namhyung Kim
@ 2025-02-27 5:24 ` Ian Rogers
2025-02-27 7:24 ` Namhyung Kim
0 siblings, 1 reply; 19+ messages in thread
From: Ian Rogers @ 2025-02-27 5:24 UTC (permalink / raw)
To: Namhyung Kim
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Wed, Feb 26, 2025 at 4:00 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Mon, Feb 24, 2025 at 08:22:50PM -0800, Ian Rogers wrote:
> > On Mon, Feb 24, 2025 at 7:20 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> > > > This work builds on the clean up of system call tables and removal of
> > > > libaudit by Charlie Jenkins <charlie@rivosinc.com>.
> > > >
> > > > The system call table in perf trace is used to map system call numbers
> > > > to names and vice versa. Prior to these changes, a single table
> > > > matching the perf binary's build was present. The table would be
> > > > incorrect if tracing say a 32-bit binary from a 64-bit version of
> > > > perf, the names and numbers wouldn't match.
> > > >
> > > > Change the build so that a single system call file is built and the
> > > > potentially multiple tables are identifiable from the ELF machine type
> > > > of the process being examined. To determine the ELF machine type, the
> > > > executable's header is read from /proc/pid/exe with fallbacks to using
> > > > the perf's binary type when unknown.
> > >
> > > Hmm.. then this is limited to live mode and potentially detect wrong
> > > machine type if it reads an old data, right?
> > >
> > > Also IIUC fallback to the perf binary means it cannot use cross-machine
> > > table. For example, it cannot process data from ARM64 on x86, no? It
> > > seems it should use perf_env.arch.
> >
> > The perf env arch is kind of horrid. On x86 it has the value x86 and
> > then there is an extra 64bit flag, who knows how x32 should be encoded
> > - but we barely support x32 as-is. I'd rather we added a new feature
> > for the e_machine/e_flags of the executable and worked with those, but
> > it is kind of weird with doing system wide mode. I didn't want to drag
> > that into this patch series anyway as there is already enough here.
>
> Right, I don't know how to handle x32 properly. Maybe we can just
> ignore it for now.
>
> But anyway looking at /proc/PID for recorded data doesn't seem correct.
> Can you please add a flag to do that only from trace__run() and just use
> EM_HOST for trace__replay()?
So I was hoping that at some later point the e_machine on the thread
could be populated from the data file, hence the accessor being on
thread and not part of the trace code. We could add a global flag to
thread to disable the reading from /proc, but we do similar reading in
machine.c for /proc/version, /proc/kallsyms, /proc/modules, etc. I
think the chance that a pid is recycled and the new process has a
different e_machine is remote enough that it is similar in nature.
Adding the flag means we need to go and fix up all uses. We only need
to set the flag in builtin-trace.c currently, but we've been
historically bad at setting these globals and bugs creep in. I also
don't think record/replay is working well and I didn't want the
syscalltbl cleanup to turn into a perf trace record/replay fixing
exercise.
Thanks,
Ian
> Later, we may need to add a misc flag or so to PERF_RECORD_FORK (and
> PERF_RECORD_COMM with MISC_COMM_EXEC) to indicate non-standard ABI for a
> new thread. But it's not clear how to make it arch-independent.
>
> >
> > > One more concern is BPF. The BPF should know about the ABI of the
> > > current process so that it can augment the syscall arguments correctly.
> > > Currently it only checks the syscall number but it can be different on
> > > 32-bit and 64-bit.
> >
> > That's right. This change is trying to clean up
> > tools/perf/util/syscalltbl.c and the perf trace usage. I didn't go as
> > far as making BPF programs pair system call number with e_machine and
> > e_flags, there is enough here and the behavior after these patches
> > matches the behavior before - that is to assume the system call ABI
> > matches that of the perf binary.
>
> Right, the next step would be adding a BPF kfunc to identify the current
> ABI.
>
> Thanks,
> Namhyung
>
* Re: [PATCH v3 0/8] perf: Support multiple system call tables in the build
2025-02-27 5:24 ` Ian Rogers
@ 2025-02-27 7:24 ` Namhyung Kim
0 siblings, 0 replies; 19+ messages in thread
From: Namhyung Kim @ 2025-02-27 7:24 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, John Garry, Will Deacon, James Clark, Mike Leach,
Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Charlie Jenkins, Bibo Mao, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv, linux-mips, Arnd Bergmann
On Wed, Feb 26, 2025 at 09:24:15PM -0800, Ian Rogers wrote:
> On Wed, Feb 26, 2025 at 4:00 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Mon, Feb 24, 2025 at 08:22:50PM -0800, Ian Rogers wrote:
> > > On Mon, Feb 24, 2025 at 7:20 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > On Wed, Feb 19, 2025 at 10:56:49AM -0800, Ian Rogers wrote:
> > > > > This work builds on the clean up of system call tables and removal of
> > > > > libaudit by Charlie Jenkins <charlie@rivosinc.com>.
> > > > >
> > > > > The system call table in perf trace is used to map system call numbers
> > > > > to names and vice versa. Prior to these changes, a single table
> > > > > matching the perf binary's build was present. The table would be
> > > > > incorrect if tracing say a 32-bit binary from a 64-bit version of
> > > > > perf, the names and numbers wouldn't match.
> > > > >
> > > > > Change the build so that a single system call file is built and the
> > > > > potentially multiple tables are identifiable from the ELF machine type
> > > > > of the process being examined. To determine the ELF machine type, the
> > > > > executable's header is read from /proc/pid/exe with fallbacks to using
> > > > > the perf's binary type when unknown.
> > > >
> > > > Hmm.. then this is limited to live mode and potentially detect wrong
> > > > machine type if it reads an old data, right?
> > > >
> > > > Also IIUC fallback to the perf binary means it cannot use cross-machine
> > > > table. For example, it cannot process data from ARM64 on x86, no? It
> > > > seems it should use perf_env.arch.
> > >
> > > The perf env arch is kind of horrid. On x86 it has the value x86 and
> > > then there is an extra 64bit flag, who knows how x32 should be encoded
> > > - but we barely support x32 as-is. I'd rather we added a new feature
> > > for the e_machine/e_flags of the executable and worked with those, but
> > > it is kind of weird with doing system wide mode. I didn't want to drag
> > > that into this patch series anyway as there is already enough here.
> >
> > Right, I don't know how to handle x32 properly. Maybe we can just
> > ignore it for now.
> >
> > But anyway looking at /proc/PID for recorded data doesn't seem correct.
> > Can you please add a flag to do that only from trace__run() and just use
> > EM_HOST for trace__replay()?
>
> So I was hoping at some later point the e_machine on the thread could
> be populated from the data file - hence the accessor being on thread
> and not part of the trace code.
Fair enough.
> We could add a global flag to thread
> to disable the reading from /proc but we do similar reading in
> machine.c for /proc/version, /proc/kallsyms, /proc/modules, etc.
You can add a flag to struct trace and only care about the perf trace
use case - whether to call thread__get_e_machine() or not.
In general, reading /proc from perf record is fine. But doing that from
perf report or similar is not good. You don't need to fix them, if any,
with this change. But let's not introduce more bugs.
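
Something along these lines is what I have in mind. It is a rough sketch
with made-up names, not actual perf code: sketch_trace stands in for struct
trace, the live field is the proposed flag, and the thread's cached
e_machine stands in for whatever thread__get_e_machine() would return.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EM_HOST 62	/* assumption: perf was built for x86_64 */

/* Stand-in for a thread whose e_machine was resolved in live mode. */
struct sketch_thread {
	int pid;
	uint16_t e_machine;
};

/* Stand-in for struct trace with the proposed flag. */
struct sketch_trace {
	bool live;	/* true when set up from the live path, false on replay */
};

/*
 * Only trust the per-thread machine type in live mode. On replay the
 * pid may belong to an unrelated process, so use the host type.
 */
uint16_t sketch_trace__e_machine(const struct sketch_trace *trace,
				 const struct sketch_thread *thread)
{
	if (!trace->live)
		return SKETCH_EM_HOST;
	return thread->e_machine;
}
```

That keeps the decision local to perf trace and leaves the thread accessor
untouched for a later change that fills e_machine from the data file.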
> I think the chance a pid is recycled and the process has a different
> e_machine are remote enough that it is similar in nature. Adding the
> flag means we need to go and fix up all uses, we only need to set the
> flag in builtin-trace.c currently, but we've been historically bad at
> setting these globals and bugs creep in. I also don't think
> record/replay is working well and I didn't want the syscalltbl cleanup
> to turn into a perf trace record/replay fixing exercise.
Yep, please see above. Anyway I think record/replay on the same machine
is working well.
Thanks,
Namhyung
>
> > Later, we may need to add a misc flag or so to PERF_RECORD_FORK (and
> > PERF_RECORD_COMM with MISC_COMM_EXEC) to indicate non-standard ABI for a
> > new thread. But it's not clear how to make it arch-independent.
> >
> > >
> > > > One more concern is BPF. The BPF should know about the ABI of the
> > > > current process so that it can augment the syscall arguments correctly.
> > > > Currently it only checks the syscall number but it can be different on
> > > > 32-bit and 64-bit.
> > >
> > > That's right. This change is trying to clean up
> > > tools/perf/util/syscalltbl.c and the perf trace usage. I didn't go as
> > > far as making BPF programs pair system call number with e_machine and
> > > e_flags, there is enough here and the behavior after these patches
> > > matches the behavior before - that is to assume the system call ABI
> > > matches that of the perf binary.
> >
> > Right, the next step would be adding a BPF kfunc to identify the current
> > ABI.
> >
> > Thanks,
> > Namhyung
> >
Thread overview: 19+ messages
2025-02-19 18:56 [PATCH v3 0/8] perf: Support multiple system call tables in the build Ian Rogers
2025-02-19 18:56 ` [PATCH v3 1/8] perf syscalltble: Remove syscall_table.h Ian Rogers
2025-02-19 18:56 ` [PATCH v3 2/8] perf trace: Reorganize syscalls Ian Rogers
2025-02-19 18:56 ` [PATCH v3 3/8] perf syscalltbl: Remove struct syscalltbl Ian Rogers
2025-02-19 18:56 ` [PATCH v3 4/8] perf thread: Add support for reading the e_machine type for a thread Ian Rogers
2025-02-19 18:56 ` [PATCH v3 5/8] perf trace beauty: Add syscalltbl.sh generating all system call tables Ian Rogers
2025-02-19 18:56 ` [PATCH v3 6/8] perf syscalltbl: Use lookup table containing multiple architectures Ian Rogers
2025-02-19 18:56 ` [PATCH v3 7/8] perf build: Remove Makefile.syscalls Ian Rogers
2025-02-19 18:56 ` [PATCH v3 8/8] perf syscalltbl: Mask off ABI type for MIPS system calls Ian Rogers
2025-02-25 3:05 ` [PATCH v3 0/8] perf: Support multiple system call tables in the build Namhyung Kim
2025-02-25 4:37 ` Ian Rogers
2025-02-25 5:40 ` Namhyung Kim
2025-02-26 2:47 ` Namhyung Kim
2025-02-26 23:47 ` Namhyung Kim
2025-02-25 3:20 ` Namhyung Kim
2025-02-25 4:22 ` Ian Rogers
2025-02-27 0:00 ` Namhyung Kim
2025-02-27 5:24 ` Ian Rogers
2025-02-27 7:24 ` Namhyung Kim