* [PATCH v2 0/7] perf: Support multiple system call tables in the build
From: Ian Rogers @ 2025-02-10 16:51 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, John Garry, Will Deacon,
James Clark, Mike Leach, Leo Yan, Guo Ren, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Charlie Jenkins, Bibo Mao,
Arnd Bergmann, Huacai Chen, Catalin Marinas, Jiri Slaby,
Björn Töpel, Howard Chu, linux-kernel, linux-perf-users,
linux-arm-kernel, linux-csky, linux-riscv
This work builds on the clean up of system call tables and removal of
libaudit by Charlie Jenkins <charlie@rivosinc.com>.
The system call table in perf trace is used to map system call numbers
to names and vice versa. Prior to these changes, a single table
matching the perf binary's build was present. The table would be
incorrect when tracing, say, a 32-bit binary from a 64-bit build of
perf, because the names and numbers wouldn't match.
Change the build so that a single system call file is built and the
potentially multiple tables are identifiable from the ELF machine type
of the process being examined. To determine the ELF machine type, the
executable's header is read from /proc/pid/exe, falling back to the
perf binary's own type when it cannot be determined.
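As a rough illustration of the lookup described above, the following is a minimal sketch and not the code added by this series. The helper name read_e_machine() and its fallback parameter are hypothetical; the only facts relied on are that e_machine is a 16-bit field at byte offset 18 of both ELF32 and ELF64 headers and that e_ident[EI_DATA] gives the byte order.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the ELF e_machine of the executable behind
 * /proc/<pid>/exe, or 'fallback' when the header cannot be read or parsed.
 */
static int read_e_machine(pid_t pid, int fallback)
{
	char path[64];
	/* e_ident (16 bytes), e_type (2 bytes), e_machine (2 bytes). */
	unsigned char hdr[EI_NIDENT + 4];
	unsigned int b0, b1;
	ssize_t n;
	int fd, len;

	len = snprintf(path, sizeof(path), "/proc/%d/exe", (int)pid);
	assert(len > 0 && (size_t)len < sizeof(path));

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return fallback;
	n = read(fd, hdr, sizeof(hdr));
	close(fd);

	if (n != (ssize_t)sizeof(hdr) || memcmp(hdr, ELFMAG, SELFMAG) != 0)
		return fallback;

	/* Decode the two e_machine bytes using the file's own byte order. */
	b0 = hdr[EI_NIDENT + 2];
	b1 = hdr[EI_NIDENT + 3];
	if (hdr[EI_DATA] == ELFDATA2MSB)
		return (int)((b0 << 8) | b1);
	return (int)((b1 << 8) | b0);
}
```

A caller would pass the machine type of the perf binary itself as the fallback, which mirrors the fallback behavior described in the cover letter.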
Remove some runtime types used by the system call tables and make
equivalents generated at build time.
v2: Change the 1 element cache for the last table as suggested by
Howard Chu, add Howard's reviewed-by tags.
Add a comment and apology to Charlie for not doing better in
guiding:
https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
After discussion on v1, he agreed this patch series would be
the better direction.
Ian Rogers (7):
perf syscalltble: Remove syscall_table.h
perf trace: Reorganize syscalls
perf syscalltbl: Remove struct syscalltbl
perf thread: Add support for reading the e_machine type for a thread
perf trace beauty: Add syscalltbl.sh generating all system call tables
perf syscalltbl: Use lookup table containing multiple architectures
perf build: Remove Makefile.syscalls
tools/perf/Makefile.perf | 10 +-
tools/perf/arch/alpha/entry/syscalls/Kbuild | 2 -
.../alpha/entry/syscalls/Makefile.syscalls | 5 -
tools/perf/arch/alpha/include/syscall_table.h | 2 -
tools/perf/arch/arc/entry/syscalls/Kbuild | 2 -
.../arch/arc/entry/syscalls/Makefile.syscalls | 3 -
tools/perf/arch/arc/include/syscall_table.h | 2 -
tools/perf/arch/arm/entry/syscalls/Kbuild | 4 -
.../arch/arm/entry/syscalls/Makefile.syscalls | 2 -
tools/perf/arch/arm/include/syscall_table.h | 2 -
tools/perf/arch/arm64/entry/syscalls/Kbuild | 3 -
.../arm64/entry/syscalls/Makefile.syscalls | 6 -
tools/perf/arch/arm64/include/syscall_table.h | 8 -
tools/perf/arch/csky/entry/syscalls/Kbuild | 2 -
.../csky/entry/syscalls/Makefile.syscalls | 3 -
tools/perf/arch/csky/include/syscall_table.h | 2 -
.../perf/arch/loongarch/entry/syscalls/Kbuild | 2 -
.../entry/syscalls/Makefile.syscalls | 3 -
.../arch/loongarch/include/syscall_table.h | 2 -
tools/perf/arch/mips/entry/syscalls/Kbuild | 2 -
.../mips/entry/syscalls/Makefile.syscalls | 5 -
tools/perf/arch/mips/include/syscall_table.h | 2 -
tools/perf/arch/parisc/entry/syscalls/Kbuild | 3 -
.../parisc/entry/syscalls/Makefile.syscalls | 6 -
.../perf/arch/parisc/include/syscall_table.h | 8 -
tools/perf/arch/powerpc/entry/syscalls/Kbuild | 3 -
.../powerpc/entry/syscalls/Makefile.syscalls | 6 -
.../perf/arch/powerpc/include/syscall_table.h | 8 -
tools/perf/arch/riscv/entry/syscalls/Kbuild | 2 -
.../riscv/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/arch/riscv/include/syscall_table.h | 8 -
tools/perf/arch/s390/entry/syscalls/Kbuild | 2 -
.../s390/entry/syscalls/Makefile.syscalls | 5 -
tools/perf/arch/s390/include/syscall_table.h | 2 -
tools/perf/arch/sh/entry/syscalls/Kbuild | 2 -
.../arch/sh/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/arch/sh/include/syscall_table.h | 2 -
tools/perf/arch/sparc/entry/syscalls/Kbuild | 3 -
.../sparc/entry/syscalls/Makefile.syscalls | 5 -
tools/perf/arch/sparc/include/syscall_table.h | 8 -
tools/perf/arch/x86/entry/syscalls/Kbuild | 3 -
.../arch/x86/entry/syscalls/Makefile.syscalls | 6 -
tools/perf/arch/x86/include/syscall_table.h | 8 -
tools/perf/arch/xtensa/entry/syscalls/Kbuild | 2 -
.../xtensa/entry/syscalls/Makefile.syscalls | 4 -
.../perf/arch/xtensa/include/syscall_table.h | 2 -
tools/perf/builtin-trace.c | 275 +++++++++++-------
tools/perf/scripts/Makefile.syscalls | 61 ----
tools/perf/scripts/syscalltbl.sh | 86 ------
tools/perf/trace/beauty/syscalltbl.sh | 274 +++++++++++++++++
tools/perf/util/syscalltbl.c | 142 ++++-----
tools/perf/util/syscalltbl.h | 22 +-
tools/perf/util/thread.c | 50 ++++
tools/perf/util/thread.h | 14 +-
54 files changed, 598 insertions(+), 506 deletions(-)
delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/alpha/include/syscall_table.h
delete mode 100644 tools/perf/arch/arc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arc/include/syscall_table.h
delete mode 100644 tools/perf/arch/arm/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arm/include/syscall_table.h
delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arm64/include/syscall_table.h
delete mode 100644 tools/perf/arch/csky/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/csky/include/syscall_table.h
delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/loongarch/include/syscall_table.h
delete mode 100644 tools/perf/arch/mips/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/mips/include/syscall_table.h
delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/parisc/include/syscall_table.h
delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/powerpc/include/syscall_table.h
delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/riscv/include/syscall_table.h
delete mode 100644 tools/perf/arch/s390/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/s390/include/syscall_table.h
delete mode 100644 tools/perf/arch/sh/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/sh/include/syscall_table.h
delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/sparc/include/syscall_table.h
delete mode 100644 tools/perf/arch/x86/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/x86/include/syscall_table.h
delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/xtensa/include/syscall_table.h
delete mode 100644 tools/perf/scripts/Makefile.syscalls
delete mode 100755 tools/perf/scripts/syscalltbl.sh
create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
--
2.48.1.502.g6dc24dfdaf-goog
* [PATCH v2 1/7] perf syscalltble: Remove syscall_table.h
From: Ian Rogers @ 2025-02-10 16:51 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, John Garry, Will Deacon,
James Clark, Mike Leach, Leo Yan, Guo Ren, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Charlie Jenkins, Bibo Mao,
Arnd Bergmann, Huacai Chen, Catalin Marinas, Jiri Slaby,
Björn Töpel, Howard Chu, linux-kernel, linux-perf-users,
linux-arm-kernel, linux-csky, linux-riscv
The definition of "static const char *const syscalltbl[] = {" is done
in a generated syscalls_32.h or syscalls_64.h that is architecture
dependent. In order to include the appropriate file a syscall_table.h
is found via the perf include path and it includes the syscalls_32.h
or syscalls_64.h as appropriate.
To support having multiple syscall tables, one for 32-bit and one for
64-bit, or for different architectures, an include path cannot be
used. Remove syscall_table.h because of this and inline what it does
into syscalltbl.c.
For architectures without a syscall_table.h this will cause a failure
to include either syscalls_32.h or syscalls_64.h rather than a failure
to include syscall_table.h. For architectures that only included one
or the other, the behavior matches BITS_PER_LONG as previously done on
architectures supporting both syscalls_32.h and syscalls_64.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
---
tools/perf/arch/alpha/include/syscall_table.h | 2 --
tools/perf/arch/arc/include/syscall_table.h | 2 --
tools/perf/arch/arm/include/syscall_table.h | 2 --
tools/perf/arch/arm64/include/syscall_table.h | 8 --------
tools/perf/arch/csky/include/syscall_table.h | 2 --
tools/perf/arch/loongarch/include/syscall_table.h | 2 --
tools/perf/arch/mips/include/syscall_table.h | 2 --
tools/perf/arch/parisc/include/syscall_table.h | 8 --------
tools/perf/arch/powerpc/include/syscall_table.h | 8 --------
tools/perf/arch/riscv/include/syscall_table.h | 8 --------
tools/perf/arch/s390/include/syscall_table.h | 2 --
tools/perf/arch/sh/include/syscall_table.h | 2 --
tools/perf/arch/sparc/include/syscall_table.h | 8 --------
tools/perf/arch/x86/include/syscall_table.h | 8 --------
tools/perf/arch/xtensa/include/syscall_table.h | 2 --
tools/perf/util/syscalltbl.c | 8 +++++++-
16 files changed, 7 insertions(+), 67 deletions(-)
delete mode 100644 tools/perf/arch/alpha/include/syscall_table.h
delete mode 100644 tools/perf/arch/arc/include/syscall_table.h
delete mode 100644 tools/perf/arch/arm/include/syscall_table.h
delete mode 100644 tools/perf/arch/arm64/include/syscall_table.h
delete mode 100644 tools/perf/arch/csky/include/syscall_table.h
delete mode 100644 tools/perf/arch/loongarch/include/syscall_table.h
delete mode 100644 tools/perf/arch/mips/include/syscall_table.h
delete mode 100644 tools/perf/arch/parisc/include/syscall_table.h
delete mode 100644 tools/perf/arch/powerpc/include/syscall_table.h
delete mode 100644 tools/perf/arch/riscv/include/syscall_table.h
delete mode 100644 tools/perf/arch/s390/include/syscall_table.h
delete mode 100644 tools/perf/arch/sh/include/syscall_table.h
delete mode 100644 tools/perf/arch/sparc/include/syscall_table.h
delete mode 100644 tools/perf/arch/x86/include/syscall_table.h
delete mode 100644 tools/perf/arch/xtensa/include/syscall_table.h
diff --git a/tools/perf/arch/alpha/include/syscall_table.h b/tools/perf/arch/alpha/include/syscall_table.h
deleted file mode 100644
index b53e31c15805..000000000000
--- a/tools/perf/arch/alpha/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_64.h>
diff --git a/tools/perf/arch/arc/include/syscall_table.h b/tools/perf/arch/arc/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/arc/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/arch/arm/include/syscall_table.h b/tools/perf/arch/arm/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/arm/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/arch/arm64/include/syscall_table.h b/tools/perf/arch/arm64/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/arm64/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/csky/include/syscall_table.h b/tools/perf/arch/csky/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/csky/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/arch/loongarch/include/syscall_table.h b/tools/perf/arch/loongarch/include/syscall_table.h
deleted file mode 100644
index 9d0646d3455c..000000000000
--- a/tools/perf/arch/loongarch/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscall_table_64.h>
diff --git a/tools/perf/arch/mips/include/syscall_table.h b/tools/perf/arch/mips/include/syscall_table.h
deleted file mode 100644
index b53e31c15805..000000000000
--- a/tools/perf/arch/mips/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_64.h>
diff --git a/tools/perf/arch/parisc/include/syscall_table.h b/tools/perf/arch/parisc/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/parisc/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/powerpc/include/syscall_table.h b/tools/perf/arch/powerpc/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/powerpc/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/riscv/include/syscall_table.h b/tools/perf/arch/riscv/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/riscv/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/s390/include/syscall_table.h b/tools/perf/arch/s390/include/syscall_table.h
deleted file mode 100644
index b53e31c15805..000000000000
--- a/tools/perf/arch/s390/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_64.h>
diff --git a/tools/perf/arch/sh/include/syscall_table.h b/tools/perf/arch/sh/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/sh/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/arch/sparc/include/syscall_table.h b/tools/perf/arch/sparc/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/sparc/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/x86/include/syscall_table.h b/tools/perf/arch/x86/include/syscall_table.h
deleted file mode 100644
index 7ff51b783000..000000000000
--- a/tools/perf/arch/x86/include/syscall_table.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/bitsperlong.h>
-
-#if __BITS_PER_LONG == 64
-#include <asm/syscalls_64.h>
-#else
-#include <asm/syscalls_32.h>
-#endif
diff --git a/tools/perf/arch/xtensa/include/syscall_table.h b/tools/perf/arch/xtensa/include/syscall_table.h
deleted file mode 100644
index 4c942821662d..000000000000
--- a/tools/perf/arch/xtensa/include/syscall_table.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm/syscalls_32.h>
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 928aca4cd6e9..2f76241494c8 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -7,13 +7,19 @@
#include "syscalltbl.h"
#include <stdlib.h>
+#include <asm/bitsperlong.h>
#include <linux/compiler.h>
#include <linux/zalloc.h>
#include <string.h>
#include "string2.h"
-#include <syscall_table.h>
+#if __BITS_PER_LONG == 64
+ #include <asm/syscalls_64.h>
+#else
+ #include <asm/syscalls_32.h>
+#endif
+
const int syscalltbl_native_max_id = SYSCALLTBL_MAX_ID;
static const char *const *syscalltbl_native = syscalltbl;
--
2.48.1.502.g6dc24dfdaf-goog
* [PATCH v2 2/7] perf trace: Reorganize syscalls
From: Ian Rogers @ 2025-02-10 16:51 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, John Garry, Will Deacon,
James Clark, Mike Leach, Leo Yan, Guo Ren, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Charlie Jenkins, Bibo Mao,
Arnd Bergmann, Huacai Chen, Catalin Marinas, Jiri Slaby,
Björn Töpel, Howard Chu, linux-kernel, linux-perf-users,
linux-arm-kernel, linux-csky, linux-riscv
Identify struct syscall information in the syscalls table by a machine
type and syscall number, not just system call number. Having the
machine type means that 32-bit system calls can be differentiated from
64-bit ones on a machine capable of both. Having a table for all
machine types and all system call numbers would be too large, so
maintain a sorted array of system calls as they are encountered.
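The find-or-add behavior described above can be sketched in isolation as follows. This is an illustration under assumptions, not the patch's code: struct sc_entry, struct sc_table and sc_table_find_or_add() are hypothetical stand-ins, the comparator uses explicit comparisons rather than subtraction, and the new entry is shifted into place with memmove() where the patch re-sorts with qsort().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for perf's struct syscall. */
struct sc_entry {
	int e_machine;
	int id;
};

/* Hypothetical table holding entries sorted by (e_machine, id). */
struct sc_table {
	struct sc_entry *entries;
	size_t nr;
};

/* Order by machine type first, then by system call number. */
static int sc_entry_cmp(const void *va, const void *vb)
{
	const struct sc_entry *a = va, *b = vb;

	if (a->e_machine != b->e_machine)
		return a->e_machine < b->e_machine ? -1 : 1;
	return (a->id > b->id) - (a->id < b->id);
}

/* Return the entry for (e_machine, id), creating it on first use. */
static struct sc_entry *sc_table_find_or_add(struct sc_table *t, int e_machine, int id)
{
	struct sc_entry key = { .e_machine = e_machine, .id = id };
	struct sc_entry *e = NULL, *tmp;
	size_t pos;

	assert(t != NULL);
	if (t->nr)
		e = bsearch(&key, t->entries, t->nr, sizeof(key), sc_entry_cmp);
	if (e)
		return e;

	tmp = realloc(t->entries, (t->nr + 1) * sizeof(*tmp));
	if (!tmp)
		return NULL;
	t->entries = tmp;

	/* Walk back from the end to the insertion point, then shift the tail. */
	pos = t->nr;
	while (pos > 0 && sc_entry_cmp(&key, &t->entries[pos - 1]) < 0)
		pos--;
	memmove(&t->entries[pos + 1], &t->entries[pos],
		(t->nr - pos) * sizeof(*tmp));
	t->entries[pos] = key;
	t->nr++;
	return &t->entries[pos];
}
```

Note that pointers returned earlier may be invalidated by a later insertion, since the array can move or shift; the same caveat would apply to any growable sorted array.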
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
---
tools/perf/builtin-trace.c | 178 +++++++++++++++++++++++++------------
1 file changed, 119 insertions(+), 59 deletions(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 06356217adeb..916a51df236b 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -66,6 +66,7 @@
#include "rb_resort.h"
#include "../perf.h"
#include "trace_augment.h"
+#include "dwarf-regs.h"
#include <errno.h>
#include <inttypes.h>
@@ -86,6 +87,7 @@
#include <linux/ctype.h>
#include <perf/mmap.h>
+#include <tools/libc_compat.h>
#ifdef HAVE_LIBTRACEEVENT
#include <event-parse.h>
@@ -143,7 +145,10 @@ struct trace {
struct perf_tool tool;
struct syscalltbl *sctbl;
struct {
+ /** Sorted syscall numbers used by the trace. */
struct syscall *table;
+ /** Size of table. */
+ size_t table_size;
struct {
struct evsel *sys_enter,
*sys_exit,
@@ -1445,22 +1450,37 @@ static const struct syscall_fmt *syscall_fmt__find_by_alias(const char *alias)
return __syscall_fmt__find_by_alias(syscall_fmts, nmemb, alias);
}
-/*
- * is_exit: is this "exit" or "exit_group"?
- * is_open: is this "open" or "openat"? To associate the fd returned in sys_exit with the pathname in sys_enter.
- * args_size: sum of the sizes of the syscall arguments, anything after that is augmented stuff: pathname for openat, etc.
- * nonexistent: Just a hole in the syscall table, syscall id not allocated
+/**
+ * struct syscall
*/
struct syscall {
+ /** @e_machine: The ELF machine associated with the entry. */
+ int e_machine;
+ /** @id: id value from the tracepoint, the system call number. */
+ int id;
struct tep_event *tp_format;
int nr_args;
+ /**
+ * @args_size: sum of the sizes of the syscall arguments, anything
+ * after that is augmented stuff: pathname for openat, etc.
+ */
+
int args_size;
struct {
struct bpf_program *sys_enter,
*sys_exit;
} bpf_prog;
+ /** @is_exit: is this "exit" or "exit_group"? */
bool is_exit;
+ /**
+ * @is_open: is this "open" or "openat"? To associate the fd returned in
+ * sys_exit with the pathname in sys_enter.
+ */
bool is_open;
+ /**
+ * @nonexistent: Name lookup failed. Just a hole in the syscall table,
+ * syscall id not allocated.
+ */
bool nonexistent;
bool use_btf;
struct tep_format_field *args;
@@ -2066,22 +2086,21 @@ static int syscall__set_arg_fmts(struct syscall *sc)
return 0;
}
-static int trace__read_syscall_info(struct trace *trace, int id)
+static int syscall__read_info(struct syscall *sc, struct trace *trace)
{
char tp_name[128];
- struct syscall *sc;
- const char *name = syscalltbl__name(trace->sctbl, id);
+ const char *name;
int err;
- if (trace->syscalls.table == NULL) {
- trace->syscalls.table = calloc(trace->sctbl->syscalls.max_id + 1, sizeof(*sc));
- if (trace->syscalls.table == NULL)
- return -ENOMEM;
- }
- sc = trace->syscalls.table + id;
if (sc->nonexistent)
return -EEXIST;
+ if (sc->name) {
+ /* Info already read. */
+ return 0;
+ }
+
+ name = syscalltbl__name(trace->sctbl, sc->id);
if (name == NULL) {
sc->nonexistent = true;
return -EEXIST;
@@ -2104,15 +2123,16 @@ static int trace__read_syscall_info(struct trace *trace, int id)
*/
if (IS_ERR(sc->tp_format)) {
sc->nonexistent = true;
- return PTR_ERR(sc->tp_format);
+ err = PTR_ERR(sc->tp_format);
+ sc->tp_format = NULL;
+ return err;
}
/*
* The tracepoint format contains __syscall_nr field, so it's one more
* than the actual number of syscall arguments.
*/
- if (syscall__alloc_arg_fmts(sc, IS_ERR(sc->tp_format) ?
- RAW_SYSCALL_ARGS_NUM : sc->tp_format->format.nr_fields - 1))
+ if (syscall__alloc_arg_fmts(sc, sc->tp_format->format.nr_fields - 1))
return -ENOMEM;
sc->args = sc->tp_format->format.fields;
@@ -2401,13 +2421,67 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
return printed;
}
+static void syscall__init(struct syscall *sc, int e_machine, int id)
+{
+ memset(sc, 0, sizeof(*sc));
+ sc->e_machine = e_machine;
+ sc->id = id;
+}
+
+static void syscall__exit(struct syscall *sc)
+{
+ if (!sc)
+ return;
+
+ zfree(&sc->arg_fmt);
+}
+
+static int syscall__cmp(const void *va, const void *vb)
+{
+ const struct syscall *a = va, *b = vb;
+
+ if (a->e_machine != b->e_machine)
+ return a->e_machine - b->e_machine;
+
+ return a->id - b->id;
+}
+
+static struct syscall *trace__find_syscall(struct trace *trace, int e_machine, int id)
+{
+ struct syscall key = {
+ .e_machine = e_machine,
+ .id = id,
+ };
+ struct syscall *sc, *tmp;
+
+ sc = bsearch(&key, trace->syscalls.table, trace->syscalls.table_size,
+ sizeof(struct syscall), syscall__cmp);
+ if (sc)
+ return sc;
+
+ tmp = reallocarray(trace->syscalls.table, trace->syscalls.table_size + 1,
+ sizeof(struct syscall));
+ if (!tmp)
+ return NULL;
+
+ trace->syscalls.table = tmp;
+ sc = &trace->syscalls.table[trace->syscalls.table_size++];
+ syscall__init(sc, e_machine, id);
+ qsort(trace->syscalls.table, trace->syscalls.table_size, sizeof(struct syscall),
+ syscall__cmp);
+ sc = bsearch(&key, trace->syscalls.table, trace->syscalls.table_size,
+ sizeof(struct syscall), syscall__cmp);
+ return sc;
+}
+
typedef int (*tracepoint_handler)(struct trace *trace, struct evsel *evsel,
union perf_event *event,
struct perf_sample *sample);
-static struct syscall *trace__syscall_info(struct trace *trace,
- struct evsel *evsel, int id)
+static struct syscall *trace__syscall_info(struct trace *trace, struct evsel *evsel,
+ int e_machine, int id)
{
+ struct syscall *sc;
int err = 0;
if (id < 0) {
@@ -2432,28 +2506,20 @@ static struct syscall *trace__syscall_info(struct trace *trace,
err = -EINVAL;
- if (id > trace->sctbl->syscalls.max_id) {
- goto out_cant_read;
- }
-
- if ((trace->syscalls.table == NULL || trace->syscalls.table[id].name == NULL) &&
- (err = trace__read_syscall_info(trace, id)) != 0)
- goto out_cant_read;
+ sc = trace__find_syscall(trace, e_machine, id);
+ if (sc)
+ err = syscall__read_info(sc, trace);
- if (trace->syscalls.table && trace->syscalls.table[id].nonexistent)
- goto out_cant_read;
-
- return &trace->syscalls.table[id];
-
-out_cant_read:
- if (verbose > 0) {
+ if (err && verbose > 0) {
char sbuf[STRERR_BUFSIZE];
- fprintf(trace->output, "Problems reading syscall %d: %d (%s)", id, -err, str_error_r(-err, sbuf, sizeof(sbuf)));
- if (id <= trace->sctbl->syscalls.max_id && trace->syscalls.table[id].name != NULL)
- fprintf(trace->output, "(%s)", trace->syscalls.table[id].name);
+
+ fprintf(trace->output, "Problems reading syscall %d: %d (%s)", id, -err,
+ str_error_r(-err, sbuf, sizeof(sbuf)));
+ if (sc && sc->name)
+ fprintf(trace->output, "(%s)", sc->name);
fputs(" information\n", trace->output);
}
- return NULL;
+ return err ? NULL : sc;
}
struct syscall_stats {
@@ -2600,14 +2666,6 @@ static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sam
return NULL;
}
-static void syscall__exit(struct syscall *sc)
-{
- if (!sc)
- return;
-
- zfree(&sc->arg_fmt);
-}
-
static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
union perf_event *event __maybe_unused,
struct perf_sample *sample)
@@ -2619,7 +2677,7 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
int augmented_args_size = 0;
void *augmented_args = NULL;
- struct syscall *sc = trace__syscall_info(trace, evsel, id);
+ struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
struct thread_trace *ttrace;
if (sc == NULL)
@@ -2693,7 +2751,7 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
struct thread_trace *ttrace;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
- struct syscall *sc = trace__syscall_info(trace, evsel, id);
+ struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
char msg[1024];
void *args, *augmented_args = NULL;
int augmented_args_size;
@@ -2768,7 +2826,7 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
int alignment = trace->args_alignment;
- struct syscall *sc = trace__syscall_info(trace, evsel, id);
+ struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
struct thread_trace *ttrace;
if (sc == NULL)
@@ -3121,7 +3179,7 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
if (evsel == trace->syscalls.events.bpf_output) {
int id = perf_evsel__sc_tp_uint(evsel, id, sample);
- struct syscall *sc = trace__syscall_info(trace, evsel, id);
+ struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
if (sc) {
fprintf(trace->output, "%s(", sc->name);
@@ -3626,7 +3684,7 @@ static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, str
static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
if (sc == NULL)
return;
@@ -3637,20 +3695,20 @@ static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
return sc ? bpf_program__fd(sc->bpf_prog.sys_enter) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
{
struct tep_format_field *field;
- struct syscall *sc = trace__syscall_info(trace, NULL, key);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
const struct btf_type *bt;
char *struct_offset, *tmp, name[32];
bool can_augment = false;
@@ -3748,7 +3806,7 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
try_to_find_pair:
for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
int id = syscalltbl__id_at_idx(trace->sctbl, i);
- struct syscall *pair = trace__syscall_info(trace, NULL, id);
+ struct syscall *pair = trace__syscall_info(trace, NULL, EM_HOST, id);
struct bpf_program *pair_prog;
bool is_candidate = false;
@@ -3898,7 +3956,7 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
*/
for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
int key = syscalltbl__id_at_idx(trace->sctbl, i);
- struct syscall *sc = trace__syscall_info(trace, NULL, key);
+ struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
struct bpf_program *pair_prog;
int prog_fd;
@@ -4663,7 +4721,11 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
pct = avg ? 100.0 * stddev_stats(&stats->stats) / avg : 0.0;
avg /= NSEC_PER_MSEC;
- sc = &trace->syscalls.table[syscall_stats_entry->syscall];
+ sc = trace__syscall_info(trace, /*evsel=*/NULL, EM_HOST,
+ syscall_stats_entry->syscall);
+ if (!sc)
+ continue;
+
printed += fprintf(fp, " %-15s", sc->name);
printed += fprintf(fp, " %8" PRIu64 " %6" PRIu64 " %9.3f %9.3f %9.3f",
n, stats->nr_failures, syscall_stats_entry->msecs, min, avg);
@@ -5071,12 +5133,10 @@ static int trace__config(const char *var, const char *value, void *arg)
static void trace__exit(struct trace *trace)
{
- int i;
-
strlist__delete(trace->ev_qualifier);
zfree(&trace->ev_qualifier_ids.entries);
if (trace->syscalls.table) {
- for (i = 0; i <= trace->sctbl->syscalls.max_id; i++)
+ for (size_t i = 0; i < trace->syscalls.table_size; i++)
syscall__exit(&trace->syscalls.table[i]);
zfree(&trace->syscalls.table);
}
--
2.48.1.502.g6dc24dfdaf-goog
* [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl
From: Ian Rogers @ 2025-02-10 16:51 UTC
The syscalltbl held entries of system call name and number pairs,
generated from a native syscalltbl at start up. As there are gaps in
the system call numbers, there is a notion of an index into the
table. Going forward we want the system call table to be identifiable
by a machine type, for example, i386 vs x86-64. Change the interface
to the syscalltbl so that (1) a machine type is passed (currently
always EM_HOST, and unused) and (2) the mappings from index to system
call number and from system call number to name are computed at build
time.
Two tables are used for this: an array mapping system call number to
name, and an array of system call numbers sorted by system call
name. The sorted array doesn't store strings, in part to save memory
and relocations. The index notion is carried forward and is an index
into the sorted array of system call numbers. The data structures are
opaque (held only in syscalltbl.c), so the number of indices for a
machine type is exposed as a new API.
The arrays are computed in the syscalltbl.sh script and so no start-up
time computation and storage is necessary.
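As an illustration of the two-array layout described above, here is a
minimal standalone sketch. It is not the generated code: the demo_*
names and the five-entry table with its system call numbers are
hypothetical, chosen only to show gaps in the numbering and a name that
is a prefix of another. The comparator follows the same idea as
syscallcmpname(), dereferencing a uint16_t entry to reach the name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical number-to-name table; unlisted slots are NULL (gaps). */
static const char *const demo_num_to_name[] = {
	[0] = "read",
	[1] = "write",
	[3] = "close",
	[5] = "accept",
	[7] = "accept4",
};

/* System call numbers ordered by name, using strcmp() ordering. */
static const uint16_t demo_sorted_names[] = {
	5, /* accept */
	7, /* accept4 */
	3, /* close */
	0, /* read */
	1, /* write */
};

struct demo_cmp_key {
	const char *name;
	const char *const *tbl;
};

/* bsearch() comparator: the entry is a number, the name comes from tbl. */
static int demo_cmp_name(const void *vkey, const void *ventry)
{
	const struct demo_cmp_key *key = vkey;
	const uint16_t *entry = ventry;

	return strcmp(key->name, key->tbl[*entry]);
}

/* Number to name; NULL for gaps and out-of-range numbers. */
const char *demo_name(int id)
{
	if (id >= 0 && id < (int)ARRAY_SIZE(demo_num_to_name))
		return demo_num_to_name[id];
	return NULL;
}

/* Name to number; -1 when the name is unknown. */
int demo_id(const char *name)
{
	struct demo_cmp_key key = { .name = name, .tbl = demo_num_to_name };
	const uint16_t *id;

	assert(name != NULL);
	id = bsearch(&key, demo_sorted_names, ARRAY_SIZE(demo_sorted_names),
		     sizeof(demo_sorted_names[0]), demo_cmp_name);
	return id ? *id : -1;
}

/* Number of valid indices into the sorted array. */
int demo_num_idx(void)
{
	return (int)ARRAY_SIZE(demo_sorted_names);
}

/* Index to number; -1 when the index is out of range. */
int demo_id_at_idx(int idx)
{
	if (idx < 0 || idx >= demo_num_idx())
		return -1;
	return demo_sorted_names[idx];
}
```

Iterating idx from 0 to demo_num_idx() - 1 visits every defined system
call exactly once, in name order, without touching the gaps in the
number-to-name array.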
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
---
tools/perf/builtin-trace.c | 88 +++++++++++++-----------
tools/perf/scripts/syscalltbl.sh | 36 ++++------
tools/perf/util/syscalltbl.c | 113 ++++++++++---------------------
tools/perf/util/syscalltbl.h | 22 ++----
4 files changed, 103 insertions(+), 156 deletions(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 916a51df236b..4b77c2ab3dba 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -143,7 +143,6 @@ struct syscall_fmt {
struct trace {
struct perf_tool tool;
- struct syscalltbl *sctbl;
struct {
/** Sorted sycall numbers used by the trace. */
struct syscall *table;
@@ -2100,7 +2099,7 @@ static int syscall__read_info(struct syscall *sc, struct trace *trace)
return 0;
}
- name = syscalltbl__name(trace->sctbl, sc->id);
+ name = syscalltbl__name(sc->e_machine, sc->id);
if (name == NULL) {
sc->nonexistent = true;
return -EEXIST;
@@ -2200,10 +2199,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
strlist__for_each_entry(pos, trace->ev_qualifier) {
const char *sc = pos->s;
- int id = syscalltbl__id(trace->sctbl, sc), match_next = -1;
+ /*
+ * TODO: Assume more than the validation/warnings are all for
+ * the same binary type as perf.
+ */
+ int id = syscalltbl__id(EM_HOST, sc), match_next = -1;
if (id < 0) {
- id = syscalltbl__strglobmatch_first(trace->sctbl, sc, &match_next);
+ id = syscalltbl__strglobmatch_first(EM_HOST, sc, &match_next);
if (id >= 0)
goto matches;
@@ -2223,7 +2226,7 @@ static int trace__validate_ev_qualifier(struct trace *trace)
continue;
while (1) {
- id = syscalltbl__strglobmatch_next(trace->sctbl, sc, &match_next);
+ id = syscalltbl__strglobmatch_next(EM_HOST, sc, &match_next);
if (id < 0)
break;
if (nr_allocated == nr_used) {
@@ -2677,6 +2680,7 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
int augmented_args_size = 0;
void *augmented_args = NULL;
+ /* TODO: get e_machine from thread. */
struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
struct thread_trace *ttrace;
@@ -2751,6 +2755,7 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
struct thread_trace *ttrace;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
+ /* TODO: get e_machine from thread. */
struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
char msg[1024];
void *args, *augmented_args = NULL;
@@ -2826,6 +2831,7 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
int alignment = trace->args_alignment;
+ /* TODO: get e_machine from thread. */
struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
struct thread_trace *ttrace;
@@ -3179,6 +3185,7 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
if (evsel == trace->syscalls.events.bpf_output) {
int id = perf_evsel__sc_tp_uint(evsel, id, sample);
+ /* TODO: get e_machine from thread. */
struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
if (sc) {
@@ -3682,9 +3689,9 @@ static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, str
return trace->skel->progs.syscall_unaugmented;
}
-static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
+static void trace__init_syscall_bpf_progs(struct trace *trace, int e_machine, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
if (sc == NULL)
return;
@@ -3693,22 +3700,22 @@ static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
sc->bpf_prog.sys_exit = trace__find_syscall_bpf_prog(trace, sc, sc->fmt ? sc->fmt->bpf_prog_name.sys_exit : NULL, "exit");
}
-static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int id)
+static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int e_machine, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
return sc ? bpf_program__fd(sc->bpf_prog.sys_enter) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
-static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
+static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int e_machine, int id)
{
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
-static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
+static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int e_machine, int key, unsigned int *beauty_array)
{
struct tep_format_field *field;
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, key);
const struct btf_type *bt;
char *struct_offset, *tmp, name[32];
bool can_augment = false;
@@ -3804,9 +3811,9 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
return NULL;
try_to_find_pair:
- for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
- int id = syscalltbl__id_at_idx(trace->sctbl, i);
- struct syscall *pair = trace__syscall_info(trace, NULL, EM_HOST, id);
+ for (int i = 0, num_idx = syscalltbl__num_idx(sc->e_machine); i < num_idx; ++i) {
+ int id = syscalltbl__id_at_idx(sc->e_machine, i);
+ struct syscall *pair = trace__syscall_info(trace, NULL, sc->e_machine, id);
struct bpf_program *pair_prog;
bool is_candidate = false;
@@ -3890,7 +3897,7 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
return NULL;
}
-static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
+static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace, int e_machine)
{
int map_enter_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_enter);
int map_exit_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_exit);
@@ -3898,27 +3905,27 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
int err = 0;
unsigned int beauty_array[6];
- for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
- int prog_fd, key = syscalltbl__id_at_idx(trace->sctbl, i);
+ for (int i = 0, num_idx = syscalltbl__num_idx(e_machine); i < num_idx; ++i) {
+ int prog_fd, key = syscalltbl__id_at_idx(e_machine, i);
if (!trace__syscall_enabled(trace, key))
continue;
- trace__init_syscall_bpf_progs(trace, key);
+ trace__init_syscall_bpf_progs(trace, e_machine, key);
// It'll get at least the "!raw_syscalls:unaugmented"
- prog_fd = trace__bpf_prog_sys_enter_fd(trace, key);
+ prog_fd = trace__bpf_prog_sys_enter_fd(trace, e_machine, key);
err = bpf_map_update_elem(map_enter_fd, &key, &prog_fd, BPF_ANY);
if (err)
break;
- prog_fd = trace__bpf_prog_sys_exit_fd(trace, key);
+ prog_fd = trace__bpf_prog_sys_exit_fd(trace, e_machine, key);
err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
if (err)
break;
/* use beauty_map to tell BPF how many bytes to collect, set beauty_map's value here */
memset(beauty_array, 0, sizeof(beauty_array));
- err = trace__bpf_sys_enter_beauty_map(trace, key, (unsigned int *)beauty_array);
+ err = trace__bpf_sys_enter_beauty_map(trace, e_machine, key, (unsigned int *)beauty_array);
if (err)
continue;
err = bpf_map_update_elem(beauty_map_fd, &key, beauty_array, BPF_ANY);
@@ -3954,9 +3961,9 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
* first and second arg (this one on the raw_syscalls:sys_exit prog
* array tail call, then that one will be used.
*/
- for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
- int key = syscalltbl__id_at_idx(trace->sctbl, i);
- struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
+ for (int i = 0, num_idx = syscalltbl__num_idx(e_machine); i < num_idx; ++i) {
+ int key = syscalltbl__id_at_idx(e_machine, i);
+ struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, key);
struct bpf_program *pair_prog;
int prog_fd;
@@ -4393,8 +4400,13 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
goto out_error_mem;
#ifdef HAVE_BPF_SKEL
- if (trace->skel && trace->skel->progs.sys_enter)
- trace__init_syscalls_bpf_prog_array_maps(trace);
+ if (trace->skel && trace->skel->progs.sys_enter) {
+ /*
+ * TODO: Initialize for all host binary machine types, not just
+ * those matching the perf binary.
+ */
+ trace__init_syscalls_bpf_prog_array_maps(trace, EM_HOST);
+ }
#endif
if (trace->ev_qualifier_ids.nr > 0) {
@@ -4419,7 +4431,8 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
* So just disable this beautifier (SCA_FD, SCA_FDAT) when 'close' is
* not in use.
*/
- trace->fd_path_disabled = !trace__syscall_enabled(trace, syscalltbl__id(trace->sctbl, "close"));
+ /* TODO: support for more than just perf binary machine type close. */
+ trace->fd_path_disabled = !trace__syscall_enabled(trace, syscalltbl__id(EM_HOST, "close"));
err = trace__expand_filters(trace, &evsel);
if (err)
@@ -4692,8 +4705,7 @@ DEFINE_RESORT_RB(syscall_stats, a->msecs > b->msecs,
entry->msecs = stats ? (u64)stats->stats.n * (avg_stats(&stats->stats) / NSEC_PER_MSEC) : 0;
}
-static size_t thread__dump_stats(struct thread_trace *ttrace,
- struct trace *trace, FILE *fp)
+static size_t thread__dump_stats(struct thread_trace *ttrace, struct trace *trace, int e_machine, FILE *fp)
{
size_t printed = 0;
struct syscall *sc;
@@ -4721,7 +4733,7 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
pct = avg ? 100.0 * stddev_stats(&stats->stats) / avg : 0.0;
avg /= NSEC_PER_MSEC;
- sc = trace__syscall_info(trace, /*evsel=*/NULL, EM_HOST,
+ sc = trace__syscall_info(trace, /*evsel=*/NULL, e_machine,
syscall_stats_entry->syscall);
if (!sc)
continue;
@@ -4771,7 +4783,8 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
else if (fputc('\n', fp) != EOF)
++printed;
- printed += thread__dump_stats(ttrace, trace, fp);
+ /* TODO: get e_machine from thread. */
+ printed += thread__dump_stats(ttrace, trace, EM_HOST, fp);
return printed;
}
@@ -5003,8 +5016,9 @@ static int trace__parse_events_option(const struct option *opt, const char *str,
*sep = '\0';
list = 0;
- if (syscalltbl__id(trace->sctbl, s) >= 0 ||
- syscalltbl__strglobmatch_first(trace->sctbl, s, &idx) >= 0) {
+ /* TODO: support for more than just perf binary machine type syscalls. */
+ if (syscalltbl__id(EM_HOST, s) >= 0 ||
+ syscalltbl__strglobmatch_first(EM_HOST, s, &idx) >= 0) {
list = 1;
goto do_concat;
}
@@ -5140,7 +5154,6 @@ static void trace__exit(struct trace *trace)
syscall__exit(&trace->syscalls.table[i]);
zfree(&trace->syscalls.table);
}
- syscalltbl__delete(trace->sctbl);
zfree(&trace->perfconfig_events);
}
@@ -5286,9 +5299,8 @@ int cmd_trace(int argc, const char **argv)
sigaction(SIGCHLD, &sigchld_act, NULL);
trace.evlist = evlist__new();
- trace.sctbl = syscalltbl__new();
- if (trace.evlist == NULL || trace.sctbl == NULL) {
+ if (trace.evlist == NULL) {
pr_err("Not enough memory to run!\n");
err = -ENOMEM;
goto out;
diff --git a/tools/perf/scripts/syscalltbl.sh b/tools/perf/scripts/syscalltbl.sh
index 1ce0d5aa8b50..a39b3013b103 100755
--- a/tools/perf/scripts/syscalltbl.sh
+++ b/tools/perf/scripts/syscalltbl.sh
@@ -50,37 +50,27 @@ fi
infile="$1"
outfile="$2"
-nxt=0
-
-syscall_macro() {
- nr="$1"
- name="$2"
-
- echo " [$nr] = \"$name\","
-}
-
-emit() {
- nr="$1"
- entry="$2"
-
- syscall_macro "$nr" "$entry"
-}
-
-echo "static const char *const syscalltbl[] = {" > $outfile
-
sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > $sorted_table
-max_nr=0
+echo "static const char *const syscall_num_to_name[] = {" > $outfile
# the params are: nr abi name entry compat
# use _ for intentionally unused variables according to SC2034
while read nr _ name _ _; do
- emit "$nr" "$name" >> $outfile
- max_nr=$nr
+ echo " [$nr] = \"$name\"," >> $outfile
done < $sorted_table
+echo "};" >> $outfile
-rm -f $sorted_table
+echo "static const uint16_t syscall_sorted_names[] = {" >> $outfile
+# When sorting by name, add a suffix of 0s up to 20 characters so that system
+# calls that differ with a numerical suffix don't sort before those
+# without. This default behavior of sort differs from that of strcmp used at
+# runtime. Use sed to strip the trailing 0s suffix afterwards.
+grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > $sorted_table
+while read name nr; do
+ echo " $nr, /* $name */" >> $outfile
+done < $sorted_table
echo "};" >> $outfile
-echo "#define SYSCALLTBL_MAX_ID ${max_nr}" >> $outfile
+rm -f $sorted_table
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 2f76241494c8..760ac4d0869f 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -9,6 +9,7 @@
#include <stdlib.h>
#include <asm/bitsperlong.h>
#include <linux/compiler.h>
+#include <linux/kernel.h>
#include <linux/zalloc.h>
#include <string.h>
@@ -20,112 +21,66 @@
#include <asm/syscalls_32.h>
#endif
-const int syscalltbl_native_max_id = SYSCALLTBL_MAX_ID;
-static const char *const *syscalltbl_native = syscalltbl;
+const char *syscalltbl__name(int e_machine __maybe_unused, int id)
+{
+ if (id >= 0 && id < (int)ARRAY_SIZE(syscall_num_to_name))
+ return syscall_num_to_name[id];
+ return NULL;
+}
-struct syscall {
- int id;
+struct syscall_cmp_key {
const char *name;
+ const char *const *tbl;
};
static int syscallcmpname(const void *vkey, const void *ventry)
{
- const char *key = vkey;
- const struct syscall *entry = ventry;
+ const struct syscall_cmp_key *key = vkey;
+ const uint16_t *entry = ventry;
- return strcmp(key, entry->name);
+ return strcmp(key->name, key->tbl[*entry]);
}
-static int syscallcmp(const void *va, const void *vb)
+int syscalltbl__id(int e_machine __maybe_unused, const char *name)
{
- const struct syscall *a = va, *b = vb;
-
- return strcmp(a->name, b->name);
+ struct syscall_cmp_key key = {
+ .name = name,
+ .tbl = syscall_num_to_name,
+ };
+ const uint16_t *id = bsearch(&key, syscall_sorted_names,
+ ARRAY_SIZE(syscall_sorted_names),
+ sizeof(syscall_sorted_names[0]),
+ syscallcmpname);
+
+ return id ? *id : -1;
}
-static int syscalltbl__init_native(struct syscalltbl *tbl)
+int syscalltbl__num_idx(int e_machine __maybe_unused)
{
- int nr_entries = 0, i, j;
- struct syscall *entries;
-
- for (i = 0; i <= syscalltbl_native_max_id; ++i)
- if (syscalltbl_native[i])
- ++nr_entries;
-
- entries = tbl->syscalls.entries = malloc(sizeof(struct syscall) * nr_entries);
- if (tbl->syscalls.entries == NULL)
- return -1;
-
- for (i = 0, j = 0; i <= syscalltbl_native_max_id; ++i) {
- if (syscalltbl_native[i]) {
- entries[j].name = syscalltbl_native[i];
- entries[j].id = i;
- ++j;
- }
- }
-
- qsort(tbl->syscalls.entries, nr_entries, sizeof(struct syscall), syscallcmp);
- tbl->syscalls.nr_entries = nr_entries;
- tbl->syscalls.max_id = syscalltbl_native_max_id;
- return 0;
+ return ARRAY_SIZE(syscall_sorted_names);
}
-struct syscalltbl *syscalltbl__new(void)
+int syscalltbl__id_at_idx(int e_machine __maybe_unused, int idx)
{
- struct syscalltbl *tbl = malloc(sizeof(*tbl));
- if (tbl) {
- if (syscalltbl__init_native(tbl)) {
- free(tbl);
- return NULL;
- }
- }
- return tbl;
-}
-
-void syscalltbl__delete(struct syscalltbl *tbl)
-{
- zfree(&tbl->syscalls.entries);
- free(tbl);
-}
-
-const char *syscalltbl__name(const struct syscalltbl *tbl __maybe_unused, int id)
-{
- return id <= syscalltbl_native_max_id ? syscalltbl_native[id]: NULL;
-}
-
-int syscalltbl__id(struct syscalltbl *tbl, const char *name)
-{
- struct syscall *sc = bsearch(name, tbl->syscalls.entries,
- tbl->syscalls.nr_entries, sizeof(*sc),
- syscallcmpname);
-
- return sc ? sc->id : -1;
-}
-
-int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx)
-{
- struct syscall *syscalls = tbl->syscalls.entries;
-
- return idx < tbl->syscalls.nr_entries ? syscalls[idx].id : -1;
+ return syscall_sorted_names[idx];
}
-int syscalltbl__strglobmatch_next(struct syscalltbl *tbl, const char *syscall_glob, int *idx)
+int syscalltbl__strglobmatch_next(int e_machine __maybe_unused, const char *syscall_glob, int *idx)
{
- int i;
- struct syscall *syscalls = tbl->syscalls.entries;
+ for (int i = *idx + 1; i < (int)ARRAY_SIZE(syscall_sorted_names); ++i) {
+ const char *name = syscall_num_to_name[syscall_sorted_names[i]];
- for (i = *idx + 1; i < tbl->syscalls.nr_entries; ++i) {
- if (strglobmatch(syscalls[i].name, syscall_glob)) {
+ if (strglobmatch(name, syscall_glob)) {
*idx = i;
- return syscalls[i].id;
+ return syscall_sorted_names[i];
}
}
return -1;
}
-int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_glob, int *idx)
+int syscalltbl__strglobmatch_first(int e_machine, const char *syscall_glob, int *idx)
{
*idx = -1;
- return syscalltbl__strglobmatch_next(tbl, syscall_glob, idx);
+ return syscalltbl__strglobmatch_next(e_machine, syscall_glob, idx);
}
diff --git a/tools/perf/util/syscalltbl.h b/tools/perf/util/syscalltbl.h
index 362411a6d849..2bb628eff367 100644
--- a/tools/perf/util/syscalltbl.h
+++ b/tools/perf/util/syscalltbl.h
@@ -2,22 +2,12 @@
#ifndef __PERF_SYSCALLTBL_H
#define __PERF_SYSCALLTBL_H
-struct syscalltbl {
- struct {
- int max_id;
- int nr_entries;
- void *entries;
- } syscalls;
-};
+const char *syscalltbl__name(int e_machine, int id);
+int syscalltbl__id(int e_machine, const char *name);
+int syscalltbl__num_idx(int e_machine);
+int syscalltbl__id_at_idx(int e_machine, int idx);
-struct syscalltbl *syscalltbl__new(void);
-void syscalltbl__delete(struct syscalltbl *tbl);
-
-const char *syscalltbl__name(const struct syscalltbl *tbl, int id);
-int syscalltbl__id(struct syscalltbl *tbl, const char *name);
-int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx);
-
-int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_glob, int *idx);
-int syscalltbl__strglobmatch_next(struct syscalltbl *tbl, const char *syscall_glob, int *idx);
+int syscalltbl__strglobmatch_first(int e_machine, const char *syscall_glob, int *idx);
+int syscalltbl__strglobmatch_next(int e_machine, const char *syscall_glob, int *idx);
#endif /* __PERF_SYSCALLTBL_H */
--
2.48.1.502.g6dc24dfdaf-goog
* [PATCH v2 4/7] perf thread: Add support for reading the e_machine type for a thread
From: Ian Rogers @ 2025-02-10 16:51 UTC (permalink / raw)
Read the e_machine from the ELF header of the executable at
/proc/pid/exe. On failure use EM_HOST. Change the builtin-trace syscall
functions to pass the thread's e_machine rather than EM_HOST so that,
once syscalltbl can use the e_machine in later patches, the system
calls are specific to the architecture.
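A minimal standalone sketch of reading e_machine from an executable,
assuming a Linux system with <elf.h>. demo_read_e_machine() is a
hypothetical name, and the ELF magic check is an addition for
illustration that the patch itself does not make. As in the patch, the
field is read in host byte order, so a binary of the opposite
endianness would need a byte swap that is not shown here.

```c
#define _POSIX_C_SOURCE 200809L /* for pread() under strict -std= modes */
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* e_machine is at the same offset in 32-bit and 64-bit ELF headers. */
_Static_assert(offsetof(Elf32_Ehdr, e_machine) ==
	       offsetof(Elf64_Ehdr, e_machine),
	       "e_machine offset differs between ELF classes");

/*
 * Return the e_machine of the ELF file at path, or EM_NONE if the file
 * cannot be opened, is too short, or does not start with the ELF magic.
 */
uint16_t demo_read_e_machine(const char *path)
{
	unsigned char ident[SELFMAG];
	uint16_t e_machine = EM_NONE;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return EM_NONE;

	if (pread(fd, ident, sizeof(ident), 0) != (ssize_t)sizeof(ident) ||
	    memcmp(ident, ELFMAG, SELFMAG) != 0 ||
	    pread(fd, &e_machine, sizeof(e_machine),
		  offsetof(Elf64_Ehdr, e_machine)) != (ssize_t)sizeof(e_machine))
		e_machine = EM_NONE;

	close(fd);
	return e_machine;
}
```

A caller would format "/proc/<pid>/exe" for the process of interest and
treat EM_NONE as "unknown", falling back to the host machine type as
the commit message describes.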
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
---
tools/perf/builtin-trace.c | 41 ++++++++++++++++---------------
tools/perf/util/thread.c | 50 ++++++++++++++++++++++++++++++++++++++
tools/perf/util/thread.h | 14 ++++++++++-
3 files changed, 85 insertions(+), 20 deletions(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 4b77c2ab3dba..1ae609555018 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2678,16 +2678,17 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
int printed = 0;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
- int augmented_args_size = 0;
+ int augmented_args_size = 0, e_machine;
void *augmented_args = NULL;
/* TODO: get e_machine from thread. */
- struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
+ struct syscall *sc;
struct thread_trace *ttrace;
- if (sc == NULL)
- return -1;
-
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
+ e_machine = thread__e_machine(thread, trace->host);
+ sc = trace__syscall_info(trace, evsel, e_machine, id);
+ if (sc == NULL)
+ goto out_put;
ttrace = thread__trace(thread, trace->output);
if (ttrace == NULL)
goto out_put;
@@ -2756,16 +2757,18 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
/* TODO: get e_machine from thread. */
- struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
+ struct syscall *sc;
char msg[1024];
void *args, *augmented_args = NULL;
- int augmented_args_size;
+ int augmented_args_size, e_machine;
size_t printed = 0;
- if (sc == NULL)
- return -1;
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
+ e_machine = thread__e_machine(thread, trace->host);
+ sc = trace__syscall_info(trace, evsel, e_machine, id);
+ if (sc == NULL)
+ goto out_put;
ttrace = thread__trace(thread, trace->output);
/*
* We need to get ttrace just to make sure it is there when syscall__scnprintf_args()
@@ -2830,15 +2833,15 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
bool duration_calculated = false;
struct thread *thread;
int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
- int alignment = trace->args_alignment;
- /* TODO: get e_machine from thread. */
- struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
+ int alignment = trace->args_alignment, e_machine;
+ struct syscall *sc;
struct thread_trace *ttrace;
- if (sc == NULL)
- return -1;
-
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
+ e_machine = thread__e_machine(thread, trace->host);
+ sc = trace__syscall_info(trace, evsel, e_machine, id);
+ if (sc == NULL)
+ goto out_put;
ttrace = thread__trace(thread, trace->output);
if (ttrace == NULL)
goto out_put;
@@ -3185,8 +3188,8 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
if (evsel == trace->syscalls.events.bpf_output) {
int id = perf_evsel__sc_tp_uint(evsel, id, sample);
- /* TODO: get e_machine from thread. */
- struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
+ int e_machine = thread ? thread__e_machine(thread, trace->host) : EM_HOST;
+ struct syscall *sc = trace__syscall_info(trace, evsel, e_machine, id);
if (sc) {
fprintf(trace->output, "%s(", sc->name);
@@ -4764,6 +4767,7 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
{
size_t printed = 0;
struct thread_trace *ttrace = thread__priv(thread);
+ int e_machine = thread__e_machine(thread, trace->host);
double ratio;
if (ttrace == NULL)
@@ -4783,8 +4787,7 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
else if (fputc('\n', fp) != EOF)
++printed;
- /* TODO: get e_machine from thread. */
- printed += thread__dump_stats(ttrace, trace, EM_HOST, fp);
+ printed += thread__dump_stats(ttrace, trace, e_machine, fp);
return printed;
}
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 0ffdd52d86d7..a07446a280ed 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -1,5 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
+#include <elf.h>
#include <errno.h>
+#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
@@ -16,6 +18,7 @@
#include "symbol.h"
#include "unwind.h"
#include "callchain.h"
+#include "dwarf-regs.h"
#include <api/fs/fs.h>
@@ -51,6 +54,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
thread__set_ppid(thread, -1);
thread__set_cpu(thread, -1);
thread__set_guest_cpu(thread, -1);
+ thread__set_e_machine(thread, EM_NONE);
thread__set_lbr_stitch_enable(thread, false);
INIT_LIST_HEAD(thread__namespaces_list(thread));
INIT_LIST_HEAD(thread__comm_list(thread));
@@ -423,6 +427,52 @@ void thread__find_cpumode_addr_location(struct thread *thread, u64 addr,
}
}
+static uint16_t read_proc_e_machine_for_pid(pid_t pid)
+{
+ char path[6 /* "/proc/" */ + 11 /* max length of pid */ + 5 /* "/exe\0" */];
+ int fd;
+ uint16_t e_machine = EM_NONE;
+
+ snprintf(path, sizeof(path), "/proc/%d/exe", pid);
+ fd = open(path, O_RDONLY);
+ if (fd >= 0) {
+ _Static_assert(offsetof(Elf32_Ehdr, e_machine) == 18, "Unexpected offset");
+ _Static_assert(offsetof(Elf64_Ehdr, e_machine) == 18, "Unexpected offset");
+ if (pread(fd, &e_machine, sizeof(e_machine), 18) != sizeof(e_machine))
+ e_machine = EM_NONE;
+ close(fd);
+ }
+ return e_machine;
+}
+
+uint16_t thread__e_machine(struct thread *thread, struct machine *machine)
+{
+ pid_t tid, pid;
+ uint16_t e_machine = RC_CHK_ACCESS(thread)->e_machine;
+
+ if (e_machine != EM_NONE)
+ return e_machine;
+
+ tid = thread__tid(thread);
+ pid = thread__pid(thread);
+ if (pid != tid) {
+ struct thread *parent = machine__findnew_thread(machine, pid, pid);
+
+ if (parent) {
+ e_machine = thread__e_machine(parent, machine);
+ thread__set_e_machine(thread, e_machine);
+ return e_machine;
+ }
+ /* Something went wrong, fallback. */
+ }
+ e_machine = read_proc_e_machine_for_pid(pid);
+ if (e_machine != EM_NONE)
+ thread__set_e_machine(thread, e_machine);
+ else
+ e_machine = EM_HOST;
+ return e_machine;
+}
+
struct thread *thread__main_thread(struct machine *machine, struct thread *thread)
{
if (thread__pid(thread) == thread__tid(thread))
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 6cbf6eb2812e..cd574a896418 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -60,7 +60,11 @@ DECLARE_RC_STRUCT(thread) {
struct srccode_state srccode_state;
bool filter;
int filter_entry_depth;
-
+ /**
+ * @e_machine: The ELF EM_* associated with the thread. EM_NONE if not
+ * computed.
+ */
+ uint16_t e_machine;
/* LBR call stack stitch */
bool lbr_stitch_enable;
struct lbr_stitch *lbr_stitch;
@@ -302,6 +306,14 @@ static inline void thread__set_filter_entry_depth(struct thread *thread, int dep
RC_CHK_ACCESS(thread)->filter_entry_depth = depth;
}
+uint16_t thread__e_machine(struct thread *thread, struct machine *machine);
+
+static inline void thread__set_e_machine(struct thread *thread, uint16_t e_machine)
+{
+ RC_CHK_ACCESS(thread)->e_machine = e_machine;
+}
+
+
static inline bool thread__lbr_stitch_enable(const struct thread *thread)
{
return RC_CHK_ACCESS(thread)->lbr_stitch_enable;
--
2.48.1.502.g6dc24dfdaf-goog
* [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables
From: Ian Rogers @ 2025-02-10 16:51 UTC (permalink / raw)
Rather than generating individual syscall header files, generate a
single trace/beauty/generated/syscalltbl.c. A syscalltbls array holds
references to each architecture's tables along with the corresponding
e_machine. When the 32-bit or 64-bit table is ambiguous, match the
perf binary's type. For ARM32 don't use the arm64 32-bit table, which
is smaller. An EM_NONE entry is present for when no machine matches.
Conditionally compile the tables, only having the appropriate 32-bit
and 64-bit tables. If ALL_SYSCALLTBL is defined, all tables are
compiled.
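The per-machine selection described above could look roughly like the
following sketch. It is an assumption about the shape of the lookup,
not the generated file: struct demo_syscalltbl, the demo_* functions
and the three-entry tables are hypothetical, and only a few well-known
system call numbers are filled in. The EM_NONE entry is kept last and
acts as the fallback when no machine type matches.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical descriptor tying one number-to-name table to a machine. */
struct demo_syscalltbl {
	const char *const *num_to_name;
	int num_to_name_len;
	uint16_t e_machine;
};

/* Tiny excerpts; the real tables come from the syscall*.tbl files. */
static const char *const demo_names_386[] = {
	[1] = "exit", [3] = "read", [4] = "write",
};
static const char *const demo_names_x86_64[] = {
	[0] = "read", [1] = "write", [60] = "exit",
};
static const char *const demo_names_generic[] = {
	[63] = "read", [64] = "write", [93] = "exit",
};

static const struct demo_syscalltbl demo_tables[] = {
	{ demo_names_386, (int)ARRAY_SIZE(demo_names_386), EM_386 },
	{ demo_names_x86_64, (int)ARRAY_SIZE(demo_names_x86_64), EM_X86_64 },
	/* Fallback, must stay last. */
	{ demo_names_generic, (int)ARRAY_SIZE(demo_names_generic), EM_NONE },
};

/* Pick the table for e_machine, or the EM_NONE fallback. */
const struct demo_syscalltbl *demo_find_table(int e_machine)
{
	const size_t last = ARRAY_SIZE(demo_tables) - 1;

	assert(demo_tables[last].e_machine == EM_NONE);
	for (size_t i = 0; i < last; i++) {
		if (demo_tables[i].e_machine == e_machine)
			return &demo_tables[i];
	}
	return &demo_tables[last];
}

/* Number to name for a given machine; NULL for gaps or out of range. */
const char *demo_name_for(int e_machine, int id)
{
	const struct demo_syscalltbl *tbl = demo_find_table(e_machine);

	if (id >= 0 && id < tbl->num_to_name_len)
		return tbl->num_to_name[id];
	return NULL;
}
```

With this layout the same system call number resolves differently per
machine type: number 1 is "exit" in the EM_386 excerpt and "write" in
the EM_X86_64 excerpt.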
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
---
tools/perf/Makefile.perf | 9 +
tools/perf/trace/beauty/syscalltbl.sh | 274 ++++++++++++++++++++++++++
2 files changed, 283 insertions(+)
create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 55d6ce9ea52f..793e702f9aaf 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -559,6 +559,14 @@ beauty_ioctl_outdir := $(beauty_outdir)/ioctl
# Create output directory if not already present
$(shell [ -d '$(beauty_ioctl_outdir)' ] || mkdir -p '$(beauty_ioctl_outdir)')
+syscall_array := $(beauty_outdir)/syscalltbl.c
+syscall_tbl := $(srctree)/tools/perf/trace/beauty/syscalltbl.sh
+syscall_tbl_data := $(srctree)/tools/scripts/syscall.tbl \
+ $(wildcard $(srctree)/tools/perf/arch/*/entry/syscalls/syscall*.tbl)
+
+$(syscall_array): $(syscall_tbl) $(syscall_tbl_data)
+ $(Q)$(SHELL) '$(syscall_tbl)' $(srctree)/tools $@
+
fs_at_flags_array := $(beauty_outdir)/fs_at_flags_array.c
fs_at_flags_tbl := $(srctree)/tools/perf/trace/beauty/fs_at_flags.sh
@@ -878,6 +886,7 @@ build-dir = $(or $(__build-dir),.)
prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders \
arm64-sysreg-defs \
+ $(syscall_array) \
$(fs_at_flags_array) \
$(clone_flags_array) \
$(drm_ioctl_array) \
diff --git a/tools/perf/trace/beauty/syscalltbl.sh b/tools/perf/trace/beauty/syscalltbl.sh
new file mode 100755
index 000000000000..635924dc5f59
--- /dev/null
+++ b/tools/perf/trace/beauty/syscalltbl.sh
@@ -0,0 +1,274 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Generate all syscall tables.
+#
+# Each line of the syscall table should have the following format:
+#
+# NR ABI NAME [NATIVE] [COMPAT]
+#
+# NR syscall number
+# ABI ABI name
+# NAME syscall name
+# NATIVE native entry point (optional)
+# COMPAT compat entry point (optional)
+
+set -e
+
+usage() {
+ cat >&2 <<EOF
+usage: $0 <TOOLS DIRECTORY> <OUTFILE>
+
+ <TOOLS DIRECTORY> path to kernel tools directory
+ <OUTFILE> output header file
+EOF
+ exit 1
+}
+
+if [ $# -ne 2 ]; then
+ usage
+fi
+tools_dir=$1
+outfile=$2
+
+build_tables() {
+ infile="$1"
+ outfile="$2"
+ abis=$(echo "($3)" | tr ',' '|')
+ e_machine="$4"
+
+ if [ ! -f "$infile" ]
+ then
+ echo "Missing file $infile"
+ exit 1
+ fi
+ sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
+ grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > "$sorted_table"
+
+ echo "static const char *const syscall_num_to_name_${e_machine}[] = {" >> "$outfile"
+ # the params are: nr abi name entry compat
+ # use _ for intentionally unused variables according to SC2034
+ while read -r nr _ name _ _; do
+ echo " [$nr] = \"$name\"," >> "$outfile"
+ done < "$sorted_table"
+ echo "};" >> "$outfile"
+
+ echo "static const uint16_t syscall_sorted_names_${e_machine}[] = {" >> "$outfile"
+
+ # When sorting by name, add a suffix of 0s up to 20 characters so that
+ # system calls that differ with a numerical suffix don't sort before
+ # those without. This default behavior of sort differs from that of
+ # strcmp used at runtime. Use sed to strip the trailing 0s suffix
+ # afterwards.
+ grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > "$sorted_table"
+ while read -r name nr; do
+ echo " $nr, /* $name */" >> "$outfile"
+ done < "$sorted_table"
+ echo "};" >> "$outfile"
+
+ rm -f "$sorted_table"
+}
+
+rm -f "$outfile"
+cat >> "$outfile" <<EOF
+#include <elf.h>
+#include <stdint.h>
+#include <asm/bitsperlong.h>
+#include <linux/kernel.h>
+
+struct syscalltbl {
+ const char *const *num_to_name;
+ const uint16_t *sorted_names;
+ uint16_t e_machine;
+ uint16_t num_to_name_len;
+ uint16_t sorted_names_len;
+};
+
+#if defined(ALL_SYSCALLTBL) || defined(__alpha__)
+EOF
+build_tables "$tools_dir/perf/arch/alpha/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_ALPHA
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__alpha__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
+EOF
+build_tables "$tools_dir/perf/arch/arm/entry/syscalls/syscall.tbl" "$outfile" common,32,oabi EM_ARM
+build_tables "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_AARCH64
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__csky__)
+EOF
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32,csky,time32,stat64,rlimit EM_CSKY
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__csky__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__mips__)
+EOF
+build_tables "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile" common,64,n64 EM_MIPS
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__hppa__)
+#if __BITS_PER_LONG != 64
+EOF
+build_tables "$tools_dir/perf/arch/parisc/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_PARISC
+echo "#else" >> "$outfile"
+build_tables "$tools_dir/perf/arch/parisc/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_PARISC
+cat >> "$outfile" <<EOF
+#endif //__BITS_PER_LONG != 64
+#endif // defined(ALL_SYSCALLTBL) || defined(__hppa__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
+EOF
+build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl" "$outfile" common,32,nospu EM_PPC
+build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl" "$outfile" common,64,nospu EM_PPC64
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__riscv)
+#if __BITS_PER_LONG != 64
+EOF
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32,riscv,memfd_secret EM_RISCV
+echo "#else" >> "$outfile"
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,64,riscv,rlimit,memfd_secret EM_RISCV
+cat >> "$outfile" <<EOF
+#endif //__BITS_PER_LONG != 64
+#endif // defined(ALL_SYSCALLTBL) || defined(__riscv)
+#if defined(ALL_SYSCALLTBL) || defined(__s390x__)
+EOF
+build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__sh__)
+EOF
+build_tables "$tools_dir/perf/arch/sh/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_SH
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__sh__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
+#if __BITS_PER_LONG != 64
+EOF
+build_tables "$tools_dir/perf/arch/sparc/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_SPARC
+echo "#else" >> "$outfile"
+build_tables "$tools_dir/perf/arch/sparc/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_SPARC
+cat >> "$outfile" <<EOF
+#endif //__BITS_PER_LONG != 64
+#endif // defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
+EOF
+build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl" "$outfile" common,32,i386 EM_386
+build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl" "$outfile" common,64 EM_X86_64
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__xtensa__)
+EOF
+build_tables "$tools_dir/perf/arch/xtensa/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_XTENSA
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__xtensa__)
+
+#if __BITS_PER_LONG != 64
+EOF
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32 EM_NONE
+echo "#else" >> "$outfile"
+build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,64 EM_NONE
+echo "#endif //__BITS_PER_LONG != 64" >> "$outfile"
+
+build_outer_table() {
+ e_machine=$1
+ outfile="$2"
+ cat >> "$outfile" <<EOF
+ {
+ .num_to_name = syscall_num_to_name_$e_machine,
+ .sorted_names = syscall_sorted_names_$e_machine,
+ .e_machine = $e_machine,
+ .num_to_name_len = ARRAY_SIZE(syscall_num_to_name_$e_machine),
+ .sorted_names_len = ARRAY_SIZE(syscall_sorted_names_$e_machine),
+ },
+EOF
+}
+
+cat >> "$outfile" <<EOF
+static const struct syscalltbl syscalltbls[] = {
+#if defined(ALL_SYSCALLTBL) || defined(__alpha__)
+EOF
+build_outer_table EM_ALPHA "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__alpha__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
+EOF
+build_outer_table EM_ARM "$outfile"
+build_outer_table EM_AARCH64 "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__csky__)
+EOF
+build_outer_table EM_CSKY "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__csky__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__mips__)
+EOF
+build_outer_table EM_MIPS "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__hppa__)
+EOF
+build_outer_table EM_PARISC "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__hppa__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
+EOF
+build_outer_table EM_PPC "$outfile"
+build_outer_table EM_PPC64 "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__riscv)
+EOF
+build_outer_table EM_RISCV "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__riscv)
+
+#if defined(ALL_SYSCALLTBL) || defined(__s390x__)
+EOF
+build_outer_table EM_S390 "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__sh__)
+EOF
+build_outer_table EM_SH "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__sh__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
+EOF
+build_outer_table EM_SPARC "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
+EOF
+build_outer_table EM_386 "$outfile"
+build_outer_table EM_X86_64 "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
+
+#if defined(ALL_SYSCALLTBL) || defined(__xtensa__)
+EOF
+build_outer_table EM_XTENSA "$outfile"
+cat >> "$outfile" <<EOF
+#endif // defined(ALL_SYSCALLTBL) || defined(__xtensa__)
+EOF
+build_outer_table EM_NONE "$outfile"
+cat >> "$outfile" <<EOF
+};
+EOF
--
2.48.1.502.g6dc24dfdaf-goog
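The script's comment about padding names before sorting matters because
the runtime lookup is a bsearch() that compares with strcmp(), so the
index array has to be in strcmp() order. A minimal sketch of that
consumer side follows; the demo_ names and the five-entry table with
made-up numbers are assumptions for illustration only.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical number-to-name table; the numbers are made up. */
static const char *const demo_names[] = {
	[0] = "dup3",
	[1] = "dup",
	[2] = "dup2",
	[3] = "accept4",
	[4] = "accept",
};

/*
 * Numbers ordered by strcmp() of their names:
 * accept, accept4, dup, dup2, dup3.
 * A name sorts before the same name with a numeric suffix.
 */
static const uint16_t demo_sorted[] = { 4, 3, 1, 2, 0 };

struct demo_key {
	const char *name;
	const char *const *tbl;
};

/* Compare the wanted name with the name a sorted entry refers to. */
static int demo_cmp(const void *vkey, const void *ventry)
{
	const struct demo_key *key = vkey;
	const uint16_t *entry = ventry;

	return strcmp(key->name, key->tbl[*entry]);
}

/* Return the syscall number for name, or -1 if it is not in the table. */
int demo_id(const char *name)
{
	struct demo_key key = { .name = name, .tbl = demo_names };
	const uint16_t *id = bsearch(&key, demo_sorted,
				     sizeof(demo_sorted) / sizeof(demo_sorted[0]),
				     sizeof(demo_sorted[0]), demo_cmp);

	return id ? *id : -1;
}
```

If the index array were ordered differently from strcmp(), for example
with "dup2" placed before "dup", the binary search could miss names that
are present in the table.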
* [PATCH v2 6/7] perf syscalltbl: Use lookup table containing multiple architectures
2025-02-10 16:51 [PATCH v2 0/7] perf: Support multiple system call tables in the build Ian Rogers
` (4 preceding siblings ...)
2025-02-10 16:51 ` [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables Ian Rogers
@ 2025-02-10 16:51 ` Ian Rogers
2025-02-10 23:39 ` Charlie Jenkins
2025-02-11 0:23 ` Charlie Jenkins
2025-02-10 16:51 ` [PATCH v2 7/7] perf build: Remove Makefile.syscalls Ian Rogers
6 siblings, 2 replies; 26+ messages in thread
From: Ian Rogers @ 2025-02-10 16:51 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, John Garry, Will Deacon,
James Clark, Mike Leach, Leo Yan, Guo Ren, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Charlie Jenkins, Bibo Mao,
Arnd Bergmann, Huacai Chen, Catalin Marinas, Jiri Slaby,
Björn Töpel, Howard Chu, linux-kernel, linux-perf-users,
linux-arm-kernel, linux-csky, linux-riscv
Switch to use the lookup table containing all architectures rather
than tables matching the perf binary.
This fixes perf trace when tracing a 32-bit i386 binary on an
x86-64 machine. In the output below, note the system call names
reported for the 32-bit i386 binary by an x86-64 perf.
Before:
```
? ( ): a.out/447296 ... [continued]: munmap()) = 0
0.024 ( 0.001 ms): a.out/447296 recvfrom(ubuf: 0x2, size: 4160585708, flags: DONTROUTE|CTRUNC|TRUNC|DONTWAIT|EOR|WAITALL|FIN|SYN|CONFIRM|RST|ERRQUEUE|NOSIGNAL|WAITFORONE|BATCH|SOCK_DEVMEM|ZEROCOPY|FASTOPEN|CMSG_CLOEXEC|0x91f80000, addr: 0xe30, addr_len: 0xffce438c) = 1475198976
0.042 ( 0.003 ms): a.out/447296 lgetxattr(name: "", value: 0x3, size: 34) = 4160344064
0.054 ( 0.003 ms): a.out/447296 dup2(oldfd: -134422744, newfd: 4) = -1 ENOENT (No such file or directory)
0.060 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x2e646c2f6374652f,.iov_len = (__kernel_size_t)7307199665335594867,}, vlen: 557056, pos_h: 4160585708) = 3
0.074 ( 0.004 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2) = 4160237568
0.080 ( 0.001 ms): a.out/447296 lstat(filename: "", statbuf: 0x193f6) = 0
0.089 ( 0.007 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x3833692f62696c2f,.iov_len = (__kernel_size_t)3276497845987585334,}, vlen: 557056, pos_h: 4160585708) = 3
0.097 ( 0.002 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 512
0.103 ( 0.002 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2050) = 4157935616
0.107 ( 0.007 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x5, size: 2066) = 4158078976
0.116 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x1, size: 2066) = 4159639552
0.121 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 2066) = 4160184320
0.129 ( 0.002 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 50) = 4160196608
0.138 ( 0.001 ms): a.out/447296 lstat(filename: "") = 0
0.145 ( 0.002 ms): a.out/447296 mq_timedreceive(mqdes: 4291706800, u_msg_ptr: 0xf7f9ea48, msg_len: 134616640, u_msg_prio: 0xf7fd7fec, u_abs_timeout: (struct __kernel_timespec){.tv_sec = (__kernel_time64_t)-578174027777317696,.tv_nsec = (long long int)4160349376,}) = 0
0.148 ( 0.001 ms): a.out/447296 mkdirat(dfd: -134617816, pathname: " ��� ���▒���▒���", mode: IFREG|ISUID|IRUSR|IWGRP|0xf7fd0000) = 447296
0.150 ( 0.001 ms): a.out/447296 process_vm_writev(pid: -134617812, lvec: (struct iovec){.iov_base = (void *)0xf7f9e9c8f7f9e4c0,.iov_len = (__kernel_size_t)4160349376,}, liovcnt: 4160588048, rvec: (struct iovec){}, riovcnt: 4160585708, flags: 4291707352) = 0
0.197 ( 0.004 ms): a.out/447296 capget(header: 4160184320, dataptr: 8192) = 0
0.202 ( 0.002 ms): a.out/447296 capget(header: 1448669184, dataptr: 4096) = 0
0.208 ( 0.002 ms): a.out/447296 capget(header: 4160577536, dataptr: 8192) = 0
0.220 ( 0.001 ms): a.out/447296 getxattr(pathname: "", name: "c������", value: 0xf7f77e34, size: 1) = 0
0.228 ( 0.005 ms): a.out/447296 fchmod(fd: -134729728, mode: IRUGO|IWUGO|IFREG|IFIFO|ISVTX|IXUSR|0x10000) = 0
0.240 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: 0x5658e008, pos_h: 4160192052) = 3
0.250 ( 0.008 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 1436
0.260 ( 0.018 ms): a.out/447296 stat(filename: "", statbuf: 0xffce32ac) = 1436
0.288 (1000.213 ms): a.out/447296 readlinkat(buf: 0xffce31d4, bufsiz: 4291703244) = 0
```
After:
```
? ( ): a.out/442930 ... [continued]: execve()) = 0
0.023 ( 0.002 ms): a.out/442930 brk() = 0x57760000
0.052 ( 0.003 ms): a.out/442930 access(filename: 0xf7f5af28, mode: R) = -1 ENOENT (No such file or directory)
0.059 ( 0.009 ms): a.out/442930 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
0.078 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
0.087 ( 0.007 ms): a.out/442930 openat(dfd: CWD, filename: "/lib/i386-linux-", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
0.095 ( 0.002 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdbb70, count: 512) = 512
0.135 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
0.148 ( 0.001 ms): a.out/442930 set_tid_address(tidptr: 0xf7f2b528) = 442930 (a.out)
0.150 ( 0.001 ms): a.out/442930 set_robust_list(head: 0xf7f2b52c, len: 12) =
0.196 ( 0.004 ms): a.out/442930 mprotect(start: 0xf7f03000, len: 8192, prot: READ) = 0
0.202 ( 0.002 ms): a.out/442930 mprotect(start: 0x5658e000, len: 4096, prot: READ) = 0
0.207 ( 0.002 ms): a.out/442930 mprotect(start: 0xf7f63000, len: 8192, prot: READ) = 0
0.230 ( 0.005 ms): a.out/442930 munmap(addr: 0xf7f10000, len: 103414) = 0
0.244 ( 0.010 ms): a.out/442930 openat(dfd: CWD, filename: 0x5658d008) = 3
0.255 ( 0.007 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdb67c, count: 4096) = 1436
0.264 ( 0.018 ms): a.out/442930 write(fd: 1</dev/pts/4>, buf: , count: 1436) = 1436
0.292 (1000.173 ms): a.out/442930 clock_nanosleep(rqtp: { .tv_sec: 17866546940376776704, .tv_nsec: 4159878336 }, rmtp: 0xffbdb59c) = 0
1000.478 ( ): a.out/442930 exit_group() = ?
```
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
---
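The table selection in this patch can be read as a linear scan with a
one-entry cache and an EM_NONE catch-all. A minimal standalone sketch
of that control flow is below; the demo_ and DEMO_ names, the label
field and the three-entry array are assumptions for illustration, and
the constants are local copies of the <elf.h> values.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies (assumptions) of the EM_* values from <elf.h>. */
enum { DEMO_EM_NONE = 0, DEMO_EM_386 = 3, DEMO_EM_X86_64 = 62 };

struct demo_tbl {
	uint16_t e_machine;
	const char *label;	/* stands in for the real name arrays */
};

/* The catch-all entry is last so specific machines match first. */
static const struct demo_tbl demo_tbls[] = {
	{ DEMO_EM_386, "i386" },
	{ DEMO_EM_X86_64, "x86_64" },
	{ DEMO_EM_NONE, "generic" },
};

const struct demo_tbl *demo_find_table(int e_machine)
{
	/* One-entry cache: repeated lookups for one machine skip the scan. */
	static const struct demo_tbl *last_table;
	static int last_machine = DEMO_EM_NONE;

	if (last_table != NULL && last_machine == e_machine)
		return last_table;

	for (size_t i = 0; i < sizeof(demo_tbls) / sizeof(demo_tbls[0]); i++) {
		const struct demo_tbl *entry = &demo_tbls[i];

		/* Skip entries for other machines, but accept the catch-all. */
		if (entry->e_machine != e_machine &&
		    entry->e_machine != DEMO_EM_NONE)
			continue;

		last_table = entry;
		last_machine = e_machine;
		return entry;
	}
	return NULL;
}
```

The static cache makes the sketch non-reentrant, which is acceptable
only if lookups happen from a single thread.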
tools/perf/util/syscalltbl.c | 89 ++++++++++++++++++++++++++----------
1 file changed, 64 insertions(+), 25 deletions(-)
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 760ac4d0869f..db0d2b81aed1 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -15,16 +15,39 @@
#include <string.h>
#include "string2.h"
-#if __BITS_PER_LONG == 64
- #include <asm/syscalls_64.h>
-#else
- #include <asm/syscalls_32.h>
-#endif
+#include "trace/beauty/generated/syscalltbl.c"
-const char *syscalltbl__name(int e_machine __maybe_unused, int id)
+static const struct syscalltbl *find_table(int e_machine)
{
- if (id >= 0 && id <= (int)ARRAY_SIZE(syscall_num_to_name))
- return syscall_num_to_name[id];
+ static const struct syscalltbl *last_table;
+ static int last_table_machine = EM_NONE;
+
+ /* Tables only exist for EM_SPARC. */
+ if (e_machine == EM_SPARCV9)
+ e_machine = EM_SPARC;
+
+ if (last_table_machine == e_machine && last_table != NULL)
+ return last_table;
+
+ for (size_t i = 0; i < ARRAY_SIZE(syscalltbls); i++) {
+ const struct syscalltbl *entry = &syscalltbls[i];
+
+ if (entry->e_machine != e_machine && entry->e_machine != EM_NONE)
+ continue;
+
+ last_table = entry;
+ last_table_machine = e_machine;
+ return entry;
+ }
+ return NULL;
+}
+
+const char *syscalltbl__name(int e_machine, int id)
+{
+ const struct syscalltbl *table = find_table(e_machine);
+
+ if (table && id >= 0 && id < table->num_to_name_len)
+ return table->num_to_name[id];
return NULL;
}
@@ -41,38 +64,54 @@ static int syscallcmpname(const void *vkey, const void *ventry)
return strcmp(key->name, key->tbl[*entry]);
}
-int syscalltbl__id(int e_machine __maybe_unused, const char *name)
+int syscalltbl__id(int e_machine, const char *name)
{
- struct syscall_cmp_key key = {
- .name = name,
- .tbl = syscall_num_to_name,
- };
- const int *id = bsearch(&key, syscall_sorted_names,
- ARRAY_SIZE(syscall_sorted_names),
- sizeof(syscall_sorted_names[0]),
- syscallcmpname);
+ const struct syscalltbl *table = find_table(e_machine);
+ struct syscall_cmp_key key;
+ const uint16_t *id;
+
+ if (!table)
+ return -1;
+
+ key.name = name;
+ key.tbl = table->num_to_name;
+ id = bsearch(&key, table->sorted_names, table->sorted_names_len,
+ sizeof(table->sorted_names[0]), syscallcmpname);
return id ? *id : -1;
}
-int syscalltbl__num_idx(int e_machine __maybe_unused)
+int syscalltbl__num_idx(int e_machine)
{
- return ARRAY_SIZE(syscall_sorted_names);
+ const struct syscalltbl *table = find_table(e_machine);
+
+ if (!table)
+ return 0;
+
+ return table->sorted_names_len;
}
-int syscalltbl__id_at_idx(int e_machine __maybe_unused, int idx)
+int syscalltbl__id_at_idx(int e_machine, int idx)
{
- return syscall_sorted_names[idx];
+ const struct syscalltbl *table = find_table(e_machine);
+
+ if (!table)
+ return -1;
+
+ assert(idx >= 0 && idx < table->sorted_names_len);
+ return table->sorted_names[idx];
}
-int syscalltbl__strglobmatch_next(int e_machine __maybe_unused, const char *syscall_glob, int *idx)
+int syscalltbl__strglobmatch_next(int e_machine, const char *syscall_glob, int *idx)
{
- for (int i = *idx + 1; i < (int)ARRAY_SIZE(syscall_sorted_names); ++i) {
- const char *name = syscall_num_to_name[syscall_sorted_names[i]];
+ const struct syscalltbl *table = find_table(e_machine);
+
+ for (int i = *idx + 1; table && i < table->sorted_names_len; ++i) {
+ const char *name = table->num_to_name[table->sorted_names[i]];
if (strglobmatch(name, syscall_glob)) {
*idx = i;
- return syscall_sorted_names[i];
+ return table->sorted_names[i];
}
}
--
2.48.1.502.g6dc24dfdaf-goog
* [PATCH v2 7/7] perf build: Remove Makefile.syscalls
2025-02-10 16:51 [PATCH v2 0/7] perf: Support multiple system call tables in the build Ian Rogers
` (5 preceding siblings ...)
2025-02-10 16:51 ` [PATCH v2 6/7] perf syscalltbl: Use lookup table containing multiple architectures Ian Rogers
@ 2025-02-10 16:51 ` Ian Rogers
6 siblings, 0 replies; 26+ messages in thread
From: Ian Rogers @ 2025-02-10 16:51 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, John Garry, Will Deacon,
James Clark, Mike Leach, Leo Yan, Guo Ren, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Charlie Jenkins, Bibo Mao,
Arnd Bergmann, Huacai Chen, Catalin Marinas, Jiri Slaby,
Björn Töpel, Howard Chu, linux-kernel, linux-perf-users,
linux-arm-kernel, linux-csky, linux-riscv
Now that a single beauty file is generated and used by all
architectures, remove the per-architecture Makefiles, Kbuild files and
the previous generator script.
Note: there was a conversation with Charlie Jenkins
<charlie@rivosinc.com>, who had written an alternate approach to
support multiple architectures:
https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
It would have been better to have helped Charlie fix their series (my
apologies) but they agreed that the approach taken here was likely
best for longer term maintainability:
https://lore.kernel.org/lkml/Z6Jk_UN9i69QGqUj@ghost/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
---
tools/perf/Makefile.perf | 1 -
tools/perf/arch/alpha/entry/syscalls/Kbuild | 2 -
.../alpha/entry/syscalls/Makefile.syscalls | 5 --
tools/perf/arch/arc/entry/syscalls/Kbuild | 2 -
.../arch/arc/entry/syscalls/Makefile.syscalls | 3 -
tools/perf/arch/arm/entry/syscalls/Kbuild | 4 -
.../arch/arm/entry/syscalls/Makefile.syscalls | 2 -
tools/perf/arch/arm64/entry/syscalls/Kbuild | 3 -
.../arm64/entry/syscalls/Makefile.syscalls | 6 --
tools/perf/arch/csky/entry/syscalls/Kbuild | 2 -
.../csky/entry/syscalls/Makefile.syscalls | 3 -
.../perf/arch/loongarch/entry/syscalls/Kbuild | 2 -
.../entry/syscalls/Makefile.syscalls | 3 -
tools/perf/arch/mips/entry/syscalls/Kbuild | 2 -
.../mips/entry/syscalls/Makefile.syscalls | 5 --
tools/perf/arch/parisc/entry/syscalls/Kbuild | 3 -
.../parisc/entry/syscalls/Makefile.syscalls | 6 --
tools/perf/arch/powerpc/entry/syscalls/Kbuild | 3 -
.../powerpc/entry/syscalls/Makefile.syscalls | 6 --
tools/perf/arch/riscv/entry/syscalls/Kbuild | 2 -
.../riscv/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/arch/s390/entry/syscalls/Kbuild | 2 -
.../s390/entry/syscalls/Makefile.syscalls | 5 --
tools/perf/arch/sh/entry/syscalls/Kbuild | 2 -
.../arch/sh/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/arch/sparc/entry/syscalls/Kbuild | 3 -
.../sparc/entry/syscalls/Makefile.syscalls | 5 --
tools/perf/arch/x86/entry/syscalls/Kbuild | 3 -
.../arch/x86/entry/syscalls/Makefile.syscalls | 6 --
tools/perf/arch/xtensa/entry/syscalls/Kbuild | 2 -
.../xtensa/entry/syscalls/Makefile.syscalls | 4 -
tools/perf/scripts/Makefile.syscalls | 61 ---------------
tools/perf/scripts/syscalltbl.sh | 76 -------------------
33 files changed, 242 deletions(-)
delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arm/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/csky/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/mips/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/s390/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/sh/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/x86/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Kbuild
delete mode 100644 tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
delete mode 100644 tools/perf/scripts/Makefile.syscalls
delete mode 100755 tools/perf/scripts/syscalltbl.sh
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 793e702f9aaf..62176d685445 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -339,7 +339,6 @@ ifeq ($(filter feature-dump,$(MAKECMDGOALS)),feature-dump)
FEATURE_TESTS := all
endif
endif
-include $(srctree)/tools/perf/scripts/Makefile.syscalls
include Makefile.config
endif
diff --git a/tools/perf/arch/alpha/entry/syscalls/Kbuild b/tools/perf/arch/alpha/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/alpha/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls b/tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 690168aac34d..000000000000
--- a/tools/perf/arch/alpha/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,5 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_64 +=
-
-syscalltbl = $(srctree)/tools/perf/arch/alpha/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/arc/entry/syscalls/Kbuild b/tools/perf/arch/arc/entry/syscalls/Kbuild
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/arc/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/arc/entry/syscalls/Makefile.syscalls b/tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 391d30ab7a83..000000000000
--- a/tools/perf/arch/arc/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += arc time32 renameat stat64 rlimit
diff --git a/tools/perf/arch/arm/entry/syscalls/Kbuild b/tools/perf/arch/arm/entry/syscalls/Kbuild
deleted file mode 100644
index 9d777540f089..000000000000
--- a/tools/perf/arch/arm/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,4 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += oabi
-syscalltbl = $(srctree)/tools/perf/arch/arm/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/arm/entry/syscalls/Makefile.syscalls b/tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/arm/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/arm64/entry/syscalls/Kbuild b/tools/perf/arch/arm64/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/arm64/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls b/tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index e7e78c2d1c02..000000000000
--- a/tools/perf/arch/arm64/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscall_abis_64 += renameat rlimit memfd_secret
-
-syscalltbl = $(srctree)/tools/perf/arch/arm64/entry/syscalls/syscall_%.tbl
diff --git a/tools/perf/arch/csky/entry/syscalls/Kbuild b/tools/perf/arch/csky/entry/syscalls/Kbuild
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/csky/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/csky/entry/syscalls/Makefile.syscalls b/tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index ea2dd10d0571..000000000000
--- a/tools/perf/arch/csky/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += csky time32 stat64 rlimit
diff --git a/tools/perf/arch/loongarch/entry/syscalls/Kbuild b/tools/perf/arch/loongarch/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/loongarch/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls b/tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 47d32da2aed8..000000000000
--- a/tools/perf/arch/loongarch/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_64 +=
diff --git a/tools/perf/arch/mips/entry/syscalls/Kbuild b/tools/perf/arch/mips/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/mips/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/mips/entry/syscalls/Makefile.syscalls b/tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 9ee914bdfb05..000000000000
--- a/tools/perf/arch/mips/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,5 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_64 += n64
-
-syscalltbl = $(srctree)/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
diff --git a/tools/perf/arch/parisc/entry/syscalls/Kbuild b/tools/perf/arch/parisc/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/parisc/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls b/tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index ae326fecb83b..000000000000
--- a/tools/perf/arch/parisc/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscall_abis_64 +=
-
-syscalltbl = $(srctree)/tools/perf/arch/parisc/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/powerpc/entry/syscalls/Kbuild b/tools/perf/arch/powerpc/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/powerpc/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls b/tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index e35afbc57c79..000000000000
--- a/tools/perf/arch/powerpc/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += nospu
-syscall_abis_64 += nospu
-
-syscalltbl = $(srctree)/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/riscv/entry/syscalls/Kbuild b/tools/perf/arch/riscv/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/riscv/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls b/tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 9668fd1faf60..000000000000
--- a/tools/perf/arch/riscv/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,4 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += riscv memfd_secret
-syscall_abis_64 += riscv rlimit memfd_secret
diff --git a/tools/perf/arch/s390/entry/syscalls/Kbuild b/tools/perf/arch/s390/entry/syscalls/Kbuild
deleted file mode 100644
index 9a41e3572c3a..000000000000
--- a/tools/perf/arch/s390/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/s390/entry/syscalls/Makefile.syscalls b/tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 9762d7abf17c..000000000000
--- a/tools/perf/arch/s390/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,5 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_64 += renameat rlimit memfd_secret
-
-syscalltbl = $(srctree)/tools/perf/arch/s390/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/sh/entry/syscalls/Kbuild b/tools/perf/arch/sh/entry/syscalls/Kbuild
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/sh/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/sh/entry/syscalls/Makefile.syscalls b/tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 25080390e4ed..000000000000
--- a/tools/perf/arch/sh/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,4 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscalltbl = $(srctree)/tools/perf/arch/sh/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/sparc/entry/syscalls/Kbuild b/tools/perf/arch/sparc/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/sparc/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls b/tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index 212c1800b644..000000000000
--- a/tools/perf/arch/sparc/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,5 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscall_abis_64 +=
-syscalltbl = $(srctree)/tools/perf/arch/sparc/entry/syscalls/syscall.tbl
diff --git a/tools/perf/arch/x86/entry/syscalls/Kbuild b/tools/perf/arch/x86/entry/syscalls/Kbuild
deleted file mode 100644
index 84c6599b4ea6..000000000000
--- a/tools/perf/arch/x86/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,3 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
-syscall-y += syscalls_64.h
diff --git a/tools/perf/arch/x86/entry/syscalls/Makefile.syscalls b/tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index db3d5d6d4e56..000000000000
--- a/tools/perf/arch/x86/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,6 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 += i386
-syscall_abis_64 +=
-
-syscalltbl = $(srctree)/tools/perf/arch/x86/entry/syscalls/syscall_%.tbl
diff --git a/tools/perf/arch/xtensa/entry/syscalls/Kbuild b/tools/perf/arch/xtensa/entry/syscalls/Kbuild
deleted file mode 100644
index 11707c481a24..000000000000
--- a/tools/perf/arch/xtensa/entry/syscalls/Kbuild
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-syscall-y += syscalls_32.h
diff --git a/tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls b/tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
deleted file mode 100644
index d4aa2358460c..000000000000
--- a/tools/perf/arch/xtensa/entry/syscalls/Makefile.syscalls
+++ /dev/null
@@ -1,4 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-syscall_abis_32 +=
-syscalltbl = $(srctree)/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl
diff --git a/tools/perf/scripts/Makefile.syscalls b/tools/perf/scripts/Makefile.syscalls
deleted file mode 100644
index 8bf55333262e..000000000000
--- a/tools/perf/scripts/Makefile.syscalls
+++ /dev/null
@@ -1,61 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# This Makefile generates headers in
-# tools/perf/arch/$(SRCARCH)/include/generated/asm from the architecture's
-# syscall table. This will either be from the generic syscall table, or from a
-# table that is specific to that architecture.
-
-PHONY := all
-all:
-
-obj := $(OUTPUT)arch/$(SRCARCH)/include/generated/asm
-
-syscall_abis_32 := common,32
-syscall_abis_64 := common,64
-syscalltbl := $(srctree)/tools/scripts/syscall.tbl
-
-# let architectures override $(syscall_abis_%) and $(syscalltbl)
--include $(srctree)/tools/perf/arch/$(SRCARCH)/entry/syscalls/Makefile.syscalls
-include $(srctree)/tools/build/Build.include
--include $(srctree)/tools/perf/arch/$(SRCARCH)/entry/syscalls/Kbuild
-
-systbl := $(srctree)/tools/perf/scripts/syscalltbl.sh
-
-syscall-y := $(addprefix $(obj)/, $(syscall-y))
-
-# Remove stale wrappers when the corresponding files are removed from generic-y
-old-headers := $(wildcard $(obj)/*.h)
-unwanted := $(filter-out $(syscall-y),$(old-headers))
-
-quiet_cmd_remove = REMOVE $(unwanted)
- cmd_remove = rm -f $(unwanted)
-
-quiet_cmd_systbl = SYSTBL $@
- cmd_systbl = $(CONFIG_SHELL) $(systbl) \
- $(if $(systbl-args-$*),$(systbl-args-$*),$(systbl-args)) \
- --abis $(subst $(space),$(comma),$(strip $(syscall_abis_$*))) \
- $< $@
-
-all: $(syscall-y)
- $(if $(unwanted),$(call cmd,remove))
- @:
-
-$(obj)/syscalls_%.h: $(syscalltbl) $(systbl) FORCE
- $(call if_changed,systbl)
-
-targets := $(syscall-y)
-
-# Create output directory. Skip it if at least one old header exists
-# since we know the output directory already exists.
-ifeq ($(old-headers),)
-$(shell mkdir -p $(obj))
-endif
-
-PHONY += FORCE
-
-FORCE:
-
-existing-targets := $(wildcard $(sort $(targets)))
-
--include $(foreach f,$(existing-targets),$(dir $(f)).$(notdir $(f)).cmd)
-
-.PHONY: $(PHONY)
diff --git a/tools/perf/scripts/syscalltbl.sh b/tools/perf/scripts/syscalltbl.sh
deleted file mode 100755
index a39b3013b103..000000000000
--- a/tools/perf/scripts/syscalltbl.sh
+++ /dev/null
@@ -1,76 +0,0 @@
-#!/bin/sh
-# SPDX-License-Identifier: GPL-2.0
-#
-# Generate a syscall table header.
-#
-# Each line of the syscall table should have the following format:
-#
-# NR ABI NAME [NATIVE] [COMPAT]
-#
-# NR syscall number
-# ABI ABI name
-# NAME syscall name
-# NATIVE native entry point (optional)
-# COMPAT compat entry point (optional)
-
-set -e
-
-usage() {
- echo >&2 "usage: $0 [--abis ABIS] INFILE OUTFILE" >&2
- echo >&2
- echo >&2 " INFILE input syscall table"
- echo >&2 " OUTFILE output header file"
- echo >&2
- echo >&2 "options:"
- echo >&2 " --abis ABIS ABI(s) to handle (By default, all lines are handled)"
- exit 1
-}
-
-# default unless specified by options
-abis=
-
-while [ $# -gt 0 ]
-do
- case $1 in
- --abis)
- abis=$(echo "($2)" | tr ',' '|')
- shift 2;;
- -*)
- echo "$1: unknown option" >&2
- usage;;
- *)
- break;;
- esac
-done
-
-if [ $# -ne 2 ]; then
- usage
-fi
-
-infile="$1"
-outfile="$2"
-
-sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
-grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > $sorted_table
-
-echo "static const char *const syscall_num_to_name[] = {" > $outfile
-# the params are: nr abi name entry compat
-# use _ for intentionally unused variables according to SC2034
-while read nr _ name _ _; do
- echo " [$nr] = \"$name\"," >> $outfile
-done < $sorted_table
-echo "};" >> $outfile
-
-echo "static const uint16_t syscall_sorted_names[] = {" >> $outfile
-
-# When sorting by name, add a suffix of 0s upto 20 characters so that system
-# calls that differ with a numerical suffix don't sort before those
-# without. This default behavior of sort differs from that of strcmp used at
-# runtime. Use sed to strip the trailing 0s suffix afterwards.
-grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > $sorted_table
-while read name nr; do
- echo " $nr, /* $name */" >> $outfile
-done < $sorted_table
-echo "};" >> $outfile
-
-rm -f $sorted_table
--
2.48.1.502.g6dc24dfdaf-goog
* Re: [PATCH v2 6/7] perf syscalltbl: Use lookup table containing multiple architectures
2025-02-10 16:51 ` [PATCH v2 6/7] perf syscalltbl: Use lookup table containing multiple architectures Ian Rogers
@ 2025-02-10 23:39 ` Charlie Jenkins
2025-02-11 5:15 ` Ian Rogers
2025-02-11 0:23 ` Charlie Jenkins
1 sibling, 1 reply; 26+ messages in thread
From: Charlie Jenkins @ 2025-02-10 23:39 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 08:51:07AM -0800, Ian Rogers wrote:
> Switch to use the lookup table containing all architectures rather
> than tables matching the perf binary.
>
> This fixes perf trace when executed on a 32-bit i386 binary on an
> x86-64 machine. Note in the following the system call names of the
> 32-bit i386 binary as seen by an x86-64 perf.
>
> Before:
> ```
> ? ( ): a.out/447296 ... [continued]: munmap()) = 0
> 0.024 ( 0.001 ms): a.out/447296 recvfrom(ubuf: 0x2, size: 4160585708, flags: DONTROUTE|CTRUNC|TRUNC|DONTWAIT|EOR|WAITALL|FIN|SYN|CONFIRM|RST|ERRQUEUE|NOSIGNAL|WAITFORONE|BATCH|SOCK_DEVMEM|ZEROCOPY|FASTOPEN|CMSG_CLOEXEC|0x91f80000, addr: 0xe30, addr_len: 0xffce438c) = 1475198976
> 0.042 ( 0.003 ms): a.out/447296 lgetxattr(name: "", value: 0x3, size: 34) = 4160344064
> 0.054 ( 0.003 ms): a.out/447296 dup2(oldfd: -134422744, newfd: 4) = -1 ENOENT (No such file or directory)
> 0.060 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x2e646c2f6374652f,.iov_len = (__kernel_size_t)7307199665335594867,}, vlen: 557056, pos_h: 4160585708) = 3
> 0.074 ( 0.004 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2) = 4160237568
> 0.080 ( 0.001 ms): a.out/447296 lstat(filename: "", statbuf: 0x193f6) = 0
> 0.089 ( 0.007 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x3833692f62696c2f,.iov_len = (__kernel_size_t)3276497845987585334,}, vlen: 557056, pos_h: 4160585708) = 3
> 0.097 ( 0.002 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 512
> 0.103 ( 0.002 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2050) = 4157935616
> 0.107 ( 0.007 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x5, size: 2066) = 4158078976
> 0.116 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x1, size: 2066) = 4159639552
> 0.121 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 2066) = 4160184320
> 0.129 ( 0.002 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 50) = 4160196608
> 0.138 ( 0.001 ms): a.out/447296 lstat(filename: "") = 0
> 0.145 ( 0.002 ms): a.out/447296 mq_timedreceive(mqdes: 4291706800, u_msg_ptr: 0xf7f9ea48, msg_len: 134616640, u_msg_prio: 0xf7fd7fec, u_abs_timeout: (struct __kernel_timespec){.tv_sec = (__kernel_time64_t)-578174027777317696,.tv_nsec = (long long int)4160349376,}) = 0
> 0.148 ( 0.001 ms): a.out/447296 mkdirat(dfd: -134617816, pathname: " ��� ���▒���▒���", mode: IFREG|ISUID|IRUSR|IWGRP|0xf7fd0000) = 447296
> 0.150 ( 0.001 ms): a.out/447296 process_vm_writev(pid: -134617812, lvec: (struct iovec){.iov_base = (void *)0xf7f9e9c8f7f9e4c0,.iov_len = (__kernel_size_t)4160349376,}, liovcnt: 4160588048, rvec: (struct iovec){}, riovcnt: 4160585708, flags: 4291707352) = 0
> 0.197 ( 0.004 ms): a.out/447296 capget(header: 4160184320, dataptr: 8192) = 0
> 0.202 ( 0.002 ms): a.out/447296 capget(header: 1448669184, dataptr: 4096) = 0
> 0.208 ( 0.002 ms): a.out/447296 capget(header: 4160577536, dataptr: 8192) = 0
> 0.220 ( 0.001 ms): a.out/447296 getxattr(pathname: "", name: "c������", value: 0xf7f77e34, size: 1) = 0
> 0.228 ( 0.005 ms): a.out/447296 fchmod(fd: -134729728, mode: IRUGO|IWUGO|IFREG|IFIFO|ISVTX|IXUSR|0x10000) = 0
> 0.240 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: 0x5658e008, pos_h: 4160192052) = 3
> 0.250 ( 0.008 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 1436
> 0.260 ( 0.018 ms): a.out/447296 stat(filename: "", statbuf: 0xffce32ac) = 1436
> 0.288 (1000.213 ms): a.out/447296 readlinkat(buf: 0xffce31d4, bufsiz: 4291703244) = 0
> ```
>
> After:
> ```
> ? ( ): a.out/442930 ... [continued]: execve()) = 0
> 0.023 ( 0.002 ms): a.out/442930 brk() = 0x57760000
> 0.052 ( 0.003 ms): a.out/442930 access(filename: 0xf7f5af28, mode: R) = -1 ENOENT (No such file or directory)
> 0.059 ( 0.009 ms): a.out/442930 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
> 0.078 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
> 0.087 ( 0.007 ms): a.out/442930 openat(dfd: CWD, filename: "/lib/i386-linux-", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
> 0.095 ( 0.002 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdbb70, count: 512) = 512
> 0.135 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
> 0.148 ( 0.001 ms): a.out/442930 set_tid_address(tidptr: 0xf7f2b528) = 442930 (a.out)
> 0.150 ( 0.001 ms): a.out/442930 set_robust_list(head: 0xf7f2b52c, len: 12) =
> 0.196 ( 0.004 ms): a.out/442930 mprotect(start: 0xf7f03000, len: 8192, prot: READ) = 0
> 0.202 ( 0.002 ms): a.out/442930 mprotect(start: 0x5658e000, len: 4096, prot: READ) = 0
> 0.207 ( 0.002 ms): a.out/442930 mprotect(start: 0xf7f63000, len: 8192, prot: READ) = 0
> 0.230 ( 0.005 ms): a.out/442930 munmap(addr: 0xf7f10000, len: 103414) = 0
> 0.244 ( 0.010 ms): a.out/442930 openat(dfd: CWD, filename: 0x5658d008) = 3
> 0.255 ( 0.007 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdb67c, count: 4096) = 1436
> 0.264 ( 0.018 ms): a.out/442930 write(fd: 1</dev/pts/4>, buf: , count: 1436) = 1436
> 0.292 (1000.173 ms): a.out/442930 clock_nanosleep(rqtp: { .tv_sec: 17866546940376776704, .tv_nsec: 4159878336 }, rmtp: 0xffbdb59c) = 0
> 1000.478 ( ): a.out/442930 exit_group() = ?
> ```
>
I think I am conflating some things in my mind here. This change doesn't
impact perf report, does it? perf report shows syscall numbers only, but
could it be hooked up to this change so that it reports the correct
syscall name for a perf.data generated on any architecture?
I believe that question is tangential to this patch, but let me know!
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
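For readers following the thread, here is a minimal, self-contained
sketch of why the table has to be selected by ELF machine type. It is
not the generated perf table: the demo_* names, the two tiny tables and
the DEMO_EM_* constants (stand-ins for EM_386 and EM_X86_64 from
<elf.h>) are assumptions made for illustration. Only the lookup shape
(a number-to-name array plus an array of syscall numbers sorted by name,
searched with bsearch) follows the patch quoted below.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the e_machine values EM_386 and EM_X86_64. */
enum { DEMO_EM_386 = 3, DEMO_EM_X86_64 = 62 };

struct demo_syscalltbl {
	int e_machine;
	const char *const *num_to_name;     /* indexed by syscall number */
	int num_to_name_len;
	const unsigned short *sorted_names; /* syscall numbers ordered by name */
	int sorted_names_len;
};

/* A tiny subset of the i386 table; unset slots are holes (NULL). */
static const char *const demo_i386_names[] = {
	[1] = "exit", [3] = "read", [4] = "write", [5] = "open", [6] = "close",
};
static const unsigned short demo_i386_sorted[] = { 6, 1, 5, 3, 4 };

/* A tiny subset of the x86-64 table. */
static const char *const demo_x86_64_names[] = {
	[0] = "read", [1] = "write", [2] = "open", [3] = "close",
};
static const unsigned short demo_x86_64_sorted[] = { 3, 2, 0, 1 };

static const struct demo_syscalltbl demo_tables[] = {
	{ DEMO_EM_386, demo_i386_names, 7, demo_i386_sorted, 5 },
	{ DEMO_EM_X86_64, demo_x86_64_names, 4, demo_x86_64_sorted, 4 },
};

struct demo_key {
	const char *name;
	const char *const *tbl;
};

/* Compare the wanted name with the name of the syscall number in *ventry. */
static int demo_cmp(const void *vkey, const void *ventry)
{
	const struct demo_key *key = vkey;
	const unsigned short *entry = ventry;

	return strcmp(key->name, key->tbl[*entry]);
}

static const struct demo_syscalltbl *demo_find_table(int e_machine)
{
	for (size_t i = 0; i < sizeof(demo_tables) / sizeof(demo_tables[0]); i++) {
		if (demo_tables[i].e_machine == e_machine)
			return &demo_tables[i];
	}
	return NULL;
}

/* Number to name; NULL for unknown machines, out-of-range ids and holes. */
const char *demo_name(int e_machine, int id)
{
	const struct demo_syscalltbl *table = demo_find_table(e_machine);

	if (table && id >= 0 && id < table->num_to_name_len)
		return table->num_to_name[id];
	return NULL;
}

/* Name to number; -1 when the machine or the name is unknown. */
int demo_id(int e_machine, const char *name)
{
	const struct demo_syscalltbl *table = demo_find_table(e_machine);
	struct demo_key key;
	const unsigned short *id;

	if (!table)
		return -1;

	key.name = name;
	key.tbl = table->num_to_name;
	id = bsearch(&key, table->sorted_names, table->sorted_names_len,
		     sizeof(table->sorted_names[0]), demo_cmp);
	return id ? *id : -1;
}
```

With these tables, syscall number 3 resolves to "read" for the i386
machine type and to "close" for x86-64, which is the kind of mismatch
visible in the "Before" output quoted above.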
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>
> ---
> tools/perf/util/syscalltbl.c | 89 ++++++++++++++++++++++++++----------
> 1 file changed, 64 insertions(+), 25 deletions(-)
>
> diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
> index 760ac4d0869f..db0d2b81aed1 100644
> --- a/tools/perf/util/syscalltbl.c
> +++ b/tools/perf/util/syscalltbl.c
> @@ -15,16 +15,39 @@
> #include <string.h>
> #include "string2.h"
>
> -#if __BITS_PER_LONG == 64
> - #include <asm/syscalls_64.h>
> -#else
> - #include <asm/syscalls_32.h>
> -#endif
> +#include "trace/beauty/generated/syscalltbl.c"
>
> -const char *syscalltbl__name(int e_machine __maybe_unused, int id)
> +static const struct syscalltbl *find_table(int e_machine)
> {
> - if (id >= 0 && id <= (int)ARRAY_SIZE(syscall_num_to_name))
> - return syscall_num_to_name[id];
> + static const struct syscalltbl *last_table;
> + static int last_table_machine = EM_NONE;
> +
> + /* Tables only exist for EM_SPARC. */
> + if (e_machine == EM_SPARCV9)
> + e_machine = EM_SPARC;
> +
> + if (last_table_machine == e_machine && last_table != NULL)
> + return last_table;
> +
> + for (size_t i = 0; i < ARRAY_SIZE(syscalltbls); i++) {
> + const struct syscalltbl *entry = &syscalltbls[i];
> +
> + if (entry->e_machine != e_machine && entry->e_machine != EM_NONE)
> + continue;
> +
> + last_table = entry;
> + last_table_machine = e_machine;
> + return entry;
> + }
> + return NULL;
> +}
> +
> +const char *syscalltbl__name(int e_machine, int id)
> +{
> + const struct syscalltbl *table = find_table(e_machine);
> +
> + if (table && id >= 0 && id < table->num_to_name_len)
> + return table->num_to_name[id];
> return NULL;
> }
>
> @@ -41,38 +64,54 @@ static int syscallcmpname(const void *vkey, const void *ventry)
> return strcmp(key->name, key->tbl[*entry]);
> }
>
> -int syscalltbl__id(int e_machine __maybe_unused, const char *name)
> +int syscalltbl__id(int e_machine, const char *name)
> {
> - struct syscall_cmp_key key = {
> - .name = name,
> - .tbl = syscall_num_to_name,
> - };
> - const int *id = bsearch(&key, syscall_sorted_names,
> - ARRAY_SIZE(syscall_sorted_names),
> - sizeof(syscall_sorted_names[0]),
> - syscallcmpname);
> + const struct syscalltbl *table = find_table(e_machine);
> + struct syscall_cmp_key key;
> + const int *id;
> +
> + if (!table)
> + return -1;
> +
> + key.name = name;
> + key.tbl = table->num_to_name;
> + id = bsearch(&key, table->sorted_names, table->sorted_names_len,
> + sizeof(table->sorted_names[0]), syscallcmpname);
>
> return id ? *id : -1;
> }
>
> -int syscalltbl__num_idx(int e_machine __maybe_unused)
> +int syscalltbl__num_idx(int e_machine)
> {
> - return ARRAY_SIZE(syscall_sorted_names);
> + const struct syscalltbl *table = find_table(e_machine);
> +
> + if (!table)
> + return 0;
> +
> + return table->sorted_names_len;
> }
>
> -int syscalltbl__id_at_idx(int e_machine __maybe_unused, int idx)
> +int syscalltbl__id_at_idx(int e_machine, int idx)
> {
> - return syscall_sorted_names[idx];
> + const struct syscalltbl *table = find_table(e_machine);
> +
> + if (!table)
> + return -1;
> +
> + assert(idx >= 0 && idx < table->sorted_names_len);
> + return table->sorted_names[idx];
> }
>
> -int syscalltbl__strglobmatch_next(int e_machine __maybe_unused, const char *syscall_glob, int *idx)
> +int syscalltbl__strglobmatch_next(int e_machine, const char *syscall_glob, int *idx)
> {
> - for (int i = *idx + 1; i < (int)ARRAY_SIZE(syscall_sorted_names); ++i) {
> - const char *name = syscall_num_to_name[syscall_sorted_names[i]];
> + const struct syscalltbl *table = find_table(e_machine);
> +
> + for (int i = *idx + 1; table && i < table->sorted_names_len; ++i) {
> + const char *name = table->num_to_name[table->sorted_names[i]];
>
> if (strglobmatch(name, syscall_glob)) {
> *idx = i;
> - return syscall_sorted_names[i];
> + return table->sorted_names[i];
> }
> }
>
> --
> 2.48.1.502.g6dc24dfdaf-goog
>
* Re: [PATCH v2 1/7] perf syscalltble: Remove syscall_table.h
2025-02-10 16:51 ` [PATCH v2 1/7] perf syscalltble: Remove syscall_table.h Ian Rogers
@ 2025-02-10 23:48 ` Charlie Jenkins
0 siblings, 0 replies; 26+ messages in thread
From: Charlie Jenkins @ 2025-02-10 23:48 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 08:51:02AM -0800, Ian Rogers wrote:
> The definition of "static const char *const syscalltbl[] = {" is done
> in a generated syscalls_32.h or syscalls_64.h that is architecture
> dependent. In order to include the appropriate file a syscall_table.h
> is found via the perf include path and it includes the syscalls_32.h
> or syscalls_64.h as appropriate.
>
> To support having multiple syscall tables, one for 32-bit and one for
> 64-bit, or for different architectures, an include path cannot be
> used. Remove syscall_table.h because of this and inline what it does
> into syscalltbl.c.
>
> For architectures without a syscall_table.h this will cause a failure
> to include either syscalls_32.h or syscalls_64.h rather than a failure
> to include syscall_table.h. For architectures that only included one
> or the other, the behavior matches BITS_PER_LONG as previously done on
> architectures supporting both syscalls_32.h and syscalls_64.h.
This is a great way of doing this, thank you.
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>
> ---
> tools/perf/arch/alpha/include/syscall_table.h | 2 --
> tools/perf/arch/arc/include/syscall_table.h | 2 --
> tools/perf/arch/arm/include/syscall_table.h | 2 --
> tools/perf/arch/arm64/include/syscall_table.h | 8 --------
> tools/perf/arch/csky/include/syscall_table.h | 2 --
> tools/perf/arch/loongarch/include/syscall_table.h | 2 --
> tools/perf/arch/mips/include/syscall_table.h | 2 --
> tools/perf/arch/parisc/include/syscall_table.h | 8 --------
> tools/perf/arch/powerpc/include/syscall_table.h | 8 --------
> tools/perf/arch/riscv/include/syscall_table.h | 8 --------
> tools/perf/arch/s390/include/syscall_table.h | 2 --
> tools/perf/arch/sh/include/syscall_table.h | 2 --
> tools/perf/arch/sparc/include/syscall_table.h | 8 --------
> tools/perf/arch/x86/include/syscall_table.h | 8 --------
> tools/perf/arch/xtensa/include/syscall_table.h | 2 --
> tools/perf/util/syscalltbl.c | 8 +++++++-
> 16 files changed, 7 insertions(+), 67 deletions(-)
> delete mode 100644 tools/perf/arch/alpha/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arm/include/syscall_table.h
> delete mode 100644 tools/perf/arch/arm64/include/syscall_table.h
> delete mode 100644 tools/perf/arch/csky/include/syscall_table.h
> delete mode 100644 tools/perf/arch/loongarch/include/syscall_table.h
> delete mode 100644 tools/perf/arch/mips/include/syscall_table.h
> delete mode 100644 tools/perf/arch/parisc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/powerpc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/riscv/include/syscall_table.h
> delete mode 100644 tools/perf/arch/s390/include/syscall_table.h
> delete mode 100644 tools/perf/arch/sh/include/syscall_table.h
> delete mode 100644 tools/perf/arch/sparc/include/syscall_table.h
> delete mode 100644 tools/perf/arch/x86/include/syscall_table.h
> delete mode 100644 tools/perf/arch/xtensa/include/syscall_table.h
>
> diff --git a/tools/perf/arch/alpha/include/syscall_table.h b/tools/perf/arch/alpha/include/syscall_table.h
> deleted file mode 100644
> index b53e31c15805..000000000000
> --- a/tools/perf/arch/alpha/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscalls_64.h>
> diff --git a/tools/perf/arch/arc/include/syscall_table.h b/tools/perf/arch/arc/include/syscall_table.h
> deleted file mode 100644
> index 4c942821662d..000000000000
> --- a/tools/perf/arch/arc/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscalls_32.h>
> diff --git a/tools/perf/arch/arm/include/syscall_table.h b/tools/perf/arch/arm/include/syscall_table.h
> deleted file mode 100644
> index 4c942821662d..000000000000
> --- a/tools/perf/arch/arm/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscalls_32.h>
> diff --git a/tools/perf/arch/arm64/include/syscall_table.h b/tools/perf/arch/arm64/include/syscall_table.h
> deleted file mode 100644
> index 7ff51b783000..000000000000
> --- a/tools/perf/arch/arm64/include/syscall_table.h
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/bitsperlong.h>
> -
> -#if __BITS_PER_LONG == 64
> -#include <asm/syscalls_64.h>
> -#else
> -#include <asm/syscalls_32.h>
> -#endif
> diff --git a/tools/perf/arch/csky/include/syscall_table.h b/tools/perf/arch/csky/include/syscall_table.h
> deleted file mode 100644
> index 4c942821662d..000000000000
> --- a/tools/perf/arch/csky/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscalls_32.h>
> diff --git a/tools/perf/arch/loongarch/include/syscall_table.h b/tools/perf/arch/loongarch/include/syscall_table.h
> deleted file mode 100644
> index 9d0646d3455c..000000000000
> --- a/tools/perf/arch/loongarch/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscall_table_64.h>
> diff --git a/tools/perf/arch/mips/include/syscall_table.h b/tools/perf/arch/mips/include/syscall_table.h
> deleted file mode 100644
> index b53e31c15805..000000000000
> --- a/tools/perf/arch/mips/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscalls_64.h>
> diff --git a/tools/perf/arch/parisc/include/syscall_table.h b/tools/perf/arch/parisc/include/syscall_table.h
> deleted file mode 100644
> index 7ff51b783000..000000000000
> --- a/tools/perf/arch/parisc/include/syscall_table.h
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/bitsperlong.h>
> -
> -#if __BITS_PER_LONG == 64
> -#include <asm/syscalls_64.h>
> -#else
> -#include <asm/syscalls_32.h>
> -#endif
> diff --git a/tools/perf/arch/powerpc/include/syscall_table.h b/tools/perf/arch/powerpc/include/syscall_table.h
> deleted file mode 100644
> index 7ff51b783000..000000000000
> --- a/tools/perf/arch/powerpc/include/syscall_table.h
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/bitsperlong.h>
> -
> -#if __BITS_PER_LONG == 64
> -#include <asm/syscalls_64.h>
> -#else
> -#include <asm/syscalls_32.h>
> -#endif
> diff --git a/tools/perf/arch/riscv/include/syscall_table.h b/tools/perf/arch/riscv/include/syscall_table.h
> deleted file mode 100644
> index 7ff51b783000..000000000000
> --- a/tools/perf/arch/riscv/include/syscall_table.h
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/bitsperlong.h>
> -
> -#if __BITS_PER_LONG == 64
> -#include <asm/syscalls_64.h>
> -#else
> -#include <asm/syscalls_32.h>
> -#endif
> diff --git a/tools/perf/arch/s390/include/syscall_table.h b/tools/perf/arch/s390/include/syscall_table.h
> deleted file mode 100644
> index b53e31c15805..000000000000
> --- a/tools/perf/arch/s390/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscalls_64.h>
> diff --git a/tools/perf/arch/sh/include/syscall_table.h b/tools/perf/arch/sh/include/syscall_table.h
> deleted file mode 100644
> index 4c942821662d..000000000000
> --- a/tools/perf/arch/sh/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscalls_32.h>
> diff --git a/tools/perf/arch/sparc/include/syscall_table.h b/tools/perf/arch/sparc/include/syscall_table.h
> deleted file mode 100644
> index 7ff51b783000..000000000000
> --- a/tools/perf/arch/sparc/include/syscall_table.h
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/bitsperlong.h>
> -
> -#if __BITS_PER_LONG == 64
> -#include <asm/syscalls_64.h>
> -#else
> -#include <asm/syscalls_32.h>
> -#endif
> diff --git a/tools/perf/arch/x86/include/syscall_table.h b/tools/perf/arch/x86/include/syscall_table.h
> deleted file mode 100644
> index 7ff51b783000..000000000000
> --- a/tools/perf/arch/x86/include/syscall_table.h
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/bitsperlong.h>
> -
> -#if __BITS_PER_LONG == 64
> -#include <asm/syscalls_64.h>
> -#else
> -#include <asm/syscalls_32.h>
> -#endif
> diff --git a/tools/perf/arch/xtensa/include/syscall_table.h b/tools/perf/arch/xtensa/include/syscall_table.h
> deleted file mode 100644
> index 4c942821662d..000000000000
> --- a/tools/perf/arch/xtensa/include/syscall_table.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include <asm/syscalls_32.h>
> diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
> index 928aca4cd6e9..2f76241494c8 100644
> --- a/tools/perf/util/syscalltbl.c
> +++ b/tools/perf/util/syscalltbl.c
> @@ -7,13 +7,19 @@
>
> #include "syscalltbl.h"
> #include <stdlib.h>
> +#include <asm/bitsperlong.h>
> #include <linux/compiler.h>
> #include <linux/zalloc.h>
>
> #include <string.h>
> #include "string2.h"
>
> -#include <syscall_table.h>
> +#if __BITS_PER_LONG == 64
> + #include <asm/syscalls_64.h>
> +#else
> + #include <asm/syscalls_32.h>
> +#endif
> +
> const int syscalltbl_native_max_id = SYSCALLTBL_MAX_ID;
> static const char *const *syscalltbl_native = syscalltbl;
>
> --
> 2.48.1.502.g6dc24dfdaf-goog
>
* Re: [PATCH v2 2/7] perf trace: Reorganize syscalls
2025-02-10 16:51 ` [PATCH v2 2/7] perf trace: Reorganize syscalls Ian Rogers
@ 2025-02-11 0:17 ` Charlie Jenkins
0 siblings, 0 replies; 26+ messages in thread
From: Charlie Jenkins @ 2025-02-11 0:17 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 08:51:03AM -0800, Ian Rogers wrote:
> Identify struct syscall information in the syscalls table by a machine
> type and syscall number, not just system call number. Having the
> machine type means that 32-bit system calls can be differentiated from
> 64-bit ones on a machine capable of both. Having a table for all
> machine types and all system call numbers would be too large, so
> maintain a sorted array of system calls as they are encountered.
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
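The "sorted array of system calls as they are encountered" idea from
the commit message can be sketched roughly as below. This is an
illustration under assumptions, not the code in builtin-trace.c: the
demo_* names and the hits field are hypothetical, and growing the array
by one entry followed by a full qsort is only one simple way to keep it
ordered by (e_machine, id).

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_syscall {
	int e_machine;      /* ELF machine type of the traced process */
	int id;             /* system call number */
	unsigned long hits; /* hypothetical per-syscall payload */
};

struct demo_trace {
	struct demo_syscall *table; /* sorted by (e_machine, id) */
	size_t table_size;
};

/* Order by machine type first, then by system call number. */
static int demo_syscall_cmp(const void *va, const void *vb)
{
	const struct demo_syscall *a = va, *b = vb;

	if (a->e_machine != b->e_machine)
		return a->e_machine < b->e_machine ? -1 : 1;
	if (a->id != b->id)
		return a->id < b->id ? -1 : 1;
	return 0;
}

/*
 * Return the entry for (e_machine, id), creating it on first use.
 * The returned pointer is only valid until the next insertion, since
 * the array may be reallocated and re-sorted.
 */
struct demo_syscall *demo_find_or_add(struct demo_trace *trace,
				      int e_machine, int id)
{
	struct demo_syscall key = { .e_machine = e_machine, .id = id };
	struct demo_syscall *sc = NULL, *tmp;

	if (trace->table) {
		sc = bsearch(&key, trace->table, trace->table_size,
			     sizeof(key), demo_syscall_cmp);
	}
	if (sc)
		return sc;

	tmp = realloc(trace->table, (trace->table_size + 1) * sizeof(*tmp));
	if (!tmp)
		return NULL;

	trace->table = tmp;
	tmp[trace->table_size++] = key;
	qsort(tmp, trace->table_size, sizeof(*tmp), demo_syscall_cmp);
	return bsearch(&key, tmp, trace->table_size, sizeof(key),
		       demo_syscall_cmp);
}
```

The table only ever holds the (machine, number) pairs a trace session
actually sees, so its size tracks the workload rather than the product
of all machine types and all system call numbers.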
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>
> ---
> tools/perf/builtin-trace.c | 178 +++++++++++++++++++++++++------------
> 1 file changed, 119 insertions(+), 59 deletions(-)
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 06356217adeb..916a51df236b 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -66,6 +66,7 @@
> #include "rb_resort.h"
> #include "../perf.h"
> #include "trace_augment.h"
> +#include "dwarf-regs.h"
>
> #include <errno.h>
> #include <inttypes.h>
> @@ -86,6 +87,7 @@
>
> #include <linux/ctype.h>
> #include <perf/mmap.h>
> +#include <tools/libc_compat.h>
>
> #ifdef HAVE_LIBTRACEEVENT
> #include <event-parse.h>
> @@ -143,7 +145,10 @@ struct trace {
> struct perf_tool tool;
> struct syscalltbl *sctbl;
> struct {
> + /** Sorted syscall numbers used by the trace. */
> struct syscall *table;
> + /** Size of table. */
> + size_t table_size;
> struct {
> struct evsel *sys_enter,
> *sys_exit,
> @@ -1445,22 +1450,37 @@ static const struct syscall_fmt *syscall_fmt__find_by_alias(const char *alias)
> return __syscall_fmt__find_by_alias(syscall_fmts, nmemb, alias);
> }
>
> -/*
> - * is_exit: is this "exit" or "exit_group"?
> - * is_open: is this "open" or "openat"? To associate the fd returned in sys_exit with the pathname in sys_enter.
> - * args_size: sum of the sizes of the syscall arguments, anything after that is augmented stuff: pathname for openat, etc.
> - * nonexistent: Just a hole in the syscall table, syscall id not allocated
> +/**
> + * struct syscall
> */
> struct syscall {
> + /** @e_machine: The ELF machine associated with the entry. */
> + int e_machine;
> + /** @id: id value from the tracepoint, the system call number. */
> + int id;
> struct tep_event *tp_format;
> int nr_args;
> + /**
> + * @args_size: sum of the sizes of the syscall arguments, anything
> + * after that is augmented stuff: pathname for openat, etc.
> + */
> +
> int args_size;
> struct {
> struct bpf_program *sys_enter,
> *sys_exit;
> } bpf_prog;
> + /** @is_exit: is this "exit" or "exit_group"? */
> bool is_exit;
> + /**
> + * @is_open: is this "open" or "openat"? To associate the fd returned in
> + * sys_exit with the pathname in sys_enter.
> + */
> bool is_open;
> + /**
> + * @nonexistent: Name lookup failed. Just a hole in the syscall table,
> + * syscall id not allocated.
> + */
> bool nonexistent;
> bool use_btf;
> struct tep_format_field *args;
> @@ -2066,22 +2086,21 @@ static int syscall__set_arg_fmts(struct syscall *sc)
> return 0;
> }
>
> -static int trace__read_syscall_info(struct trace *trace, int id)
> +static int syscall__read_info(struct syscall *sc, struct trace *trace)
> {
> char tp_name[128];
> - struct syscall *sc;
> - const char *name = syscalltbl__name(trace->sctbl, id);
> + const char *name;
> int err;
>
> - if (trace->syscalls.table == NULL) {
> - trace->syscalls.table = calloc(trace->sctbl->syscalls.max_id + 1, sizeof(*sc));
> - if (trace->syscalls.table == NULL)
> - return -ENOMEM;
> - }
> - sc = trace->syscalls.table + id;
> if (sc->nonexistent)
> return -EEXIST;
>
> + if (sc->name) {
> + /* Info already read. */
> + return 0;
> + }
> +
> + name = syscalltbl__name(trace->sctbl, sc->id);
> if (name == NULL) {
> sc->nonexistent = true;
> return -EEXIST;
> @@ -2104,15 +2123,16 @@ static int trace__read_syscall_info(struct trace *trace, int id)
> */
> if (IS_ERR(sc->tp_format)) {
> sc->nonexistent = true;
> - return PTR_ERR(sc->tp_format);
> + err = PTR_ERR(sc->tp_format);
> + sc->tp_format = NULL;
> + return err;
> }
>
> /*
> * The tracepoint format contains __syscall_nr field, so it's one more
> * than the actual number of syscall arguments.
> */
> - if (syscall__alloc_arg_fmts(sc, IS_ERR(sc->tp_format) ?
> - RAW_SYSCALL_ARGS_NUM : sc->tp_format->format.nr_fields - 1))
> + if (syscall__alloc_arg_fmts(sc, sc->tp_format->format.nr_fields - 1))
> return -ENOMEM;
>
> sc->args = sc->tp_format->format.fields;
> @@ -2401,13 +2421,67 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
> return printed;
> }
>
> +static void syscall__init(struct syscall *sc, int e_machine, int id)
> +{
> + memset(sc, 0, sizeof(*sc));
> + sc->e_machine = e_machine;
> + sc->id = id;
> +}
> +
> +static void syscall__exit(struct syscall *sc)
> +{
> + if (!sc)
> + return;
> +
> + zfree(&sc->arg_fmt);
> +}
> +
> +static int syscall__cmp(const void *va, const void *vb)
> +{
> + const struct syscall *a = va, *b = vb;
> +
> + if (a->e_machine != b->e_machine)
> + return a->e_machine - b->e_machine;
> +
> + return a->id - b->id;
> +}
> +
> +static struct syscall *trace__find_syscall(struct trace *trace, int e_machine, int id)
> +{
> + struct syscall key = {
> + .e_machine = e_machine,
> + .id = id,
> + };
> + struct syscall *sc, *tmp;
> +
> + sc = bsearch(&key, trace->syscalls.table, trace->syscalls.table_size,
> + sizeof(struct syscall), syscall__cmp);
> + if (sc)
> + return sc;
> +
> + tmp = reallocarray(trace->syscalls.table, trace->syscalls.table_size + 1,
> + sizeof(struct syscall));
> + if (!tmp)
> + return NULL;
> +
> + trace->syscalls.table = tmp;
> + sc = &trace->syscalls.table[trace->syscalls.table_size++];
> + syscall__init(sc, e_machine, id);
> + qsort(trace->syscalls.table, trace->syscalls.table_size, sizeof(struct syscall),
> + syscall__cmp);
> + sc = bsearch(&key, trace->syscalls.table, trace->syscalls.table_size,
> + sizeof(struct syscall), syscall__cmp);
> + return sc;
> +}
> +
> typedef int (*tracepoint_handler)(struct trace *trace, struct evsel *evsel,
> union perf_event *event,
> struct perf_sample *sample);
>
> -static struct syscall *trace__syscall_info(struct trace *trace,
> - struct evsel *evsel, int id)
> +static struct syscall *trace__syscall_info(struct trace *trace, struct evsel *evsel,
> + int e_machine, int id)
> {
> + struct syscall *sc;
> int err = 0;
>
> if (id < 0) {
> @@ -2432,28 +2506,20 @@ static struct syscall *trace__syscall_info(struct trace *trace,
>
> err = -EINVAL;
>
> - if (id > trace->sctbl->syscalls.max_id) {
> - goto out_cant_read;
> - }
> -
> - if ((trace->syscalls.table == NULL || trace->syscalls.table[id].name == NULL) &&
> - (err = trace__read_syscall_info(trace, id)) != 0)
> - goto out_cant_read;
> + sc = trace__find_syscall(trace, e_machine, id);
> + if (sc)
> + err = syscall__read_info(sc, trace);
>
> - if (trace->syscalls.table && trace->syscalls.table[id].nonexistent)
> - goto out_cant_read;
> -
> - return &trace->syscalls.table[id];
> -
> -out_cant_read:
> - if (verbose > 0) {
> + if (err && verbose > 0) {
> char sbuf[STRERR_BUFSIZE];
> - fprintf(trace->output, "Problems reading syscall %d: %d (%s)", id, -err, str_error_r(-err, sbuf, sizeof(sbuf)));
> - if (id <= trace->sctbl->syscalls.max_id && trace->syscalls.table[id].name != NULL)
> - fprintf(trace->output, "(%s)", trace->syscalls.table[id].name);
> +
> + fprintf(trace->output, "Problems reading syscall %d: %d (%s)", id, -err,
> + str_error_r(-err, sbuf, sizeof(sbuf)));
> + if (sc && sc->name)
> + fprintf(trace->output, "(%s)", sc->name);
> fputs(" information\n", trace->output);
> }
> - return NULL;
> + return err ? NULL : sc;
> }
>
> struct syscall_stats {
> @@ -2600,14 +2666,6 @@ static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sam
> return NULL;
> }
>
> -static void syscall__exit(struct syscall *sc)
> -{
> - if (!sc)
> - return;
> -
> - zfree(&sc->arg_fmt);
> -}
> -
> static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
> union perf_event *event __maybe_unused,
> struct perf_sample *sample)
> @@ -2619,7 +2677,7 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
> int augmented_args_size = 0;
> void *augmented_args = NULL;
> - struct syscall *sc = trace__syscall_info(trace, evsel, id);
> + struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> struct thread_trace *ttrace;
>
> if (sc == NULL)
> @@ -2693,7 +2751,7 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
> struct thread_trace *ttrace;
> struct thread *thread;
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
> - struct syscall *sc = trace__syscall_info(trace, evsel, id);
> + struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> char msg[1024];
> void *args, *augmented_args = NULL;
> int augmented_args_size;
> @@ -2768,7 +2826,7 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
> struct thread *thread;
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
> int alignment = trace->args_alignment;
> - struct syscall *sc = trace__syscall_info(trace, evsel, id);
> + struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> struct thread_trace *ttrace;
>
> if (sc == NULL)
> @@ -3121,7 +3179,7 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
>
> if (evsel == trace->syscalls.events.bpf_output) {
> int id = perf_evsel__sc_tp_uint(evsel, id, sample);
> - struct syscall *sc = trace__syscall_info(trace, evsel, id);
> + struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
>
> if (sc) {
> fprintf(trace->output, "%s(", sc->name);
> @@ -3626,7 +3684,7 @@ static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, str
>
> static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
> {
> - struct syscall *sc = trace__syscall_info(trace, NULL, id);
> + struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
>
> if (sc == NULL)
> return;
> @@ -3637,20 +3695,20 @@ static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
>
> static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int id)
> {
> - struct syscall *sc = trace__syscall_info(trace, NULL, id);
> + struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
> return sc ? bpf_program__fd(sc->bpf_prog.sys_enter) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
> }
>
> static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
> {
> - struct syscall *sc = trace__syscall_info(trace, NULL, id);
> + struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
> return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
> }
>
> static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
> {
> struct tep_format_field *field;
> - struct syscall *sc = trace__syscall_info(trace, NULL, key);
> + struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
> const struct btf_type *bt;
> char *struct_offset, *tmp, name[32];
> bool can_augment = false;
> @@ -3748,7 +3806,7 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
> try_to_find_pair:
> for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
> int id = syscalltbl__id_at_idx(trace->sctbl, i);
> - struct syscall *pair = trace__syscall_info(trace, NULL, id);
> + struct syscall *pair = trace__syscall_info(trace, NULL, EM_HOST, id);
> struct bpf_program *pair_prog;
> bool is_candidate = false;
>
> @@ -3898,7 +3956,7 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
> */
> for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
> int key = syscalltbl__id_at_idx(trace->sctbl, i);
> - struct syscall *sc = trace__syscall_info(trace, NULL, key);
> + struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
> struct bpf_program *pair_prog;
> int prog_fd;
>
> @@ -4663,7 +4721,11 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
> pct = avg ? 100.0 * stddev_stats(&stats->stats) / avg : 0.0;
> avg /= NSEC_PER_MSEC;
>
> - sc = &trace->syscalls.table[syscall_stats_entry->syscall];
> + sc = trace__syscall_info(trace, /*evsel=*/NULL, EM_HOST,
> + syscall_stats_entry->syscall);
> + if (!sc)
> + continue;
> +
> printed += fprintf(fp, " %-15s", sc->name);
> printed += fprintf(fp, " %8" PRIu64 " %6" PRIu64 " %9.3f %9.3f %9.3f",
> n, stats->nr_failures, syscall_stats_entry->msecs, min, avg);
> @@ -5071,12 +5133,10 @@ static int trace__config(const char *var, const char *value, void *arg)
>
> static void trace__exit(struct trace *trace)
> {
> - int i;
> -
> strlist__delete(trace->ev_qualifier);
> zfree(&trace->ev_qualifier_ids.entries);
> if (trace->syscalls.table) {
> - for (i = 0; i <= trace->sctbl->syscalls.max_id; i++)
> + for (size_t i = 0; i < trace->syscalls.table_size; i++)
> syscall__exit(&trace->syscalls.table[i]);
> zfree(&trace->syscalls.table);
> }
> --
> 2.48.1.502.g6dc24dfdaf-goog
>
* Re: [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl
2025-02-10 16:51 ` [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl Ian Rogers
@ 2025-02-11 0:19 ` Charlie Jenkins
2025-02-11 7:48 ` Arnd Bergmann
1 sibling, 0 replies; 26+ messages in thread
From: Charlie Jenkins @ 2025-02-11 0:19 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 08:51:04AM -0800, Ian Rogers wrote:
> The syscalltbl held entries of system call name and number pairs,
> generated from a native syscalltbl at start up. As there are gaps in
> the system call numbers there is a notion of index into the
> table. Going forward we want the system call table to be identifiable
> by a machine type, for example, i386 vs x86-64. Change the interface
> to the syscalltbl so (1) a machine type (currently always EM_HOST and
> unused) is passed and (2) the index to syscall number and system call
> name mapping is computed at build time.
>
> Two tables are used for this: an array of system call number to name,
> and an array of system call numbers sorted by the system call name. The
> sorted array doesn't store strings in part to save memory and
> relocations. The index notion is carried forward and is an index into
> the sorted array of system call numbers. The data structures are
> opaque (held only in syscalltbl.c), and so the number of indices for a
> machine type is exposed as a new API.
>
> The arrays are computed in the syscalltbl.sh script and so no start-up
> time computation and storage is necessary.
Thank you for also generating the sorted table, that is a very nice
addition.
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
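As an illustration of the two-table layout described above, here is a
small standalone sketch. The table contents are made up (with a
deliberate gap at number 2) and the names num_to_name, sorted_names,
name_of() and id_of() are hypothetical; only the shape follows the
commit message: one array indexed by number, and one array of numbers
ordered by strcmp() of the names they refer to.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Made-up numbers with a hole at 2; not a real system call table. */
static const char *const num_to_name[] = {
	[0] = "read",
	[1] = "write",
	[3] = "close",
	[4] = "dup",
	[5] = "dup2",
};

/* System call numbers ordered by strcmp() of their names. */
static const uint16_t sorted_names[] = {
	3, /* close */
	4, /* dup */
	5, /* dup2 */
	0, /* read */
	1, /* write */
};

/* bsearch() key: the name being looked up plus the table to resolve entries. */
struct cmp_key {
	const char *name;
	const char *const *tbl;
};

static int cmp_name(const void *vkey, const void *ventry)
{
	const struct cmp_key *key = vkey;
	const uint16_t *entry = ventry;

	return strcmp(key->name, key->tbl[*entry]);
}

/* Number to name; NULL for out-of-range numbers and for holes. */
static const char *name_of(int id)
{
	if (id >= 0 && id < (int)ARRAY_SIZE(num_to_name))
		return num_to_name[id];
	return NULL;
}

/* Name to number via binary search of the sorted index; -1 if unknown. */
static int id_of(const char *name)
{
	struct cmp_key key = { .name = name, .tbl = num_to_name };
	const uint16_t *id = bsearch(&key, sorted_names, ARRAY_SIZE(sorted_names),
				     sizeof(sorted_names[0]), cmp_name);

	return id ? *id : -1;
}
```

In this sketch the bsearch() result is read through a pointer of the
same element type as the sorted array (uint16_t), and the number-to-name
bound is exclusive, so an id equal to the array size is rejected.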
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>
> ---
> tools/perf/builtin-trace.c | 88 +++++++++++++-----------
> tools/perf/scripts/syscalltbl.sh | 36 ++++------
> tools/perf/util/syscalltbl.c | 113 ++++++++++---------------------
> tools/perf/util/syscalltbl.h | 22 ++----
> 4 files changed, 103 insertions(+), 156 deletions(-)
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 916a51df236b..4b77c2ab3dba 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -143,7 +143,6 @@ struct syscall_fmt {
>
> struct trace {
> struct perf_tool tool;
> - struct syscalltbl *sctbl;
> struct {
> /** Sorted syscall numbers used by the trace. */
> struct syscall *table;
> @@ -2100,7 +2099,7 @@ static int syscall__read_info(struct syscall *sc, struct trace *trace)
> return 0;
> }
>
> - name = syscalltbl__name(trace->sctbl, sc->id);
> + name = syscalltbl__name(sc->e_machine, sc->id);
> if (name == NULL) {
> sc->nonexistent = true;
> return -EEXIST;
> @@ -2200,10 +2199,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
>
> strlist__for_each_entry(pos, trace->ev_qualifier) {
> const char *sc = pos->s;
> - int id = syscalltbl__id(trace->sctbl, sc), match_next = -1;
> + /*
> + * TODO: Support more than assuming the validation/warnings are
> + * all for the same binary type as perf.
> + */
> + int id = syscalltbl__id(EM_HOST, sc), match_next = -1;
>
> if (id < 0) {
> - id = syscalltbl__strglobmatch_first(trace->sctbl, sc, &match_next);
> + id = syscalltbl__strglobmatch_first(EM_HOST, sc, &match_next);
> if (id >= 0)
> goto matches;
>
> @@ -2223,7 +2226,7 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> continue;
>
> while (1) {
> - id = syscalltbl__strglobmatch_next(trace->sctbl, sc, &match_next);
> + id = syscalltbl__strglobmatch_next(EM_HOST, sc, &match_next);
> if (id < 0)
> break;
> if (nr_allocated == nr_used) {
> @@ -2677,6 +2680,7 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
> int augmented_args_size = 0;
> void *augmented_args = NULL;
> + /* TODO: get e_machine from thread. */
> struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> struct thread_trace *ttrace;
>
> @@ -2751,6 +2755,7 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
> struct thread_trace *ttrace;
> struct thread *thread;
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
> + /* TODO: get e_machine from thread. */
> struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> char msg[1024];
> void *args, *augmented_args = NULL;
> @@ -2826,6 +2831,7 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
> struct thread *thread;
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
> int alignment = trace->args_alignment;
> + /* TODO: get e_machine from thread. */
> struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> struct thread_trace *ttrace;
>
> @@ -3179,6 +3185,7 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
>
> if (evsel == trace->syscalls.events.bpf_output) {
> int id = perf_evsel__sc_tp_uint(evsel, id, sample);
> + /* TODO: get e_machine from thread. */
> struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
>
> if (sc) {
> @@ -3682,9 +3689,9 @@ static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, str
> return trace->skel->progs.syscall_unaugmented;
> }
>
> -static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
> +static void trace__init_syscall_bpf_progs(struct trace *trace, int e_machine, int id)
> {
> - struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
> + struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
>
> if (sc == NULL)
> return;
> @@ -3693,22 +3700,22 @@ static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
> sc->bpf_prog.sys_exit = trace__find_syscall_bpf_prog(trace, sc, sc->fmt ? sc->fmt->bpf_prog_name.sys_exit : NULL, "exit");
> }
>
> -static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int id)
> +static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int e_machine, int id)
> {
> - struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
> + struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
> return sc ? bpf_program__fd(sc->bpf_prog.sys_enter) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
> }
>
> -static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
> +static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int e_machine, int id)
> {
> - struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, id);
> + struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, id);
> return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
> }
>
> -static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
> +static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int e_machine, int key, unsigned int *beauty_array)
> {
> struct tep_format_field *field;
> - struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
> + struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, key);
> const struct btf_type *bt;
> char *struct_offset, *tmp, name[32];
> bool can_augment = false;
> @@ -3804,9 +3811,9 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
> return NULL;
>
> try_to_find_pair:
> - for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
> - int id = syscalltbl__id_at_idx(trace->sctbl, i);
> - struct syscall *pair = trace__syscall_info(trace, NULL, EM_HOST, id);
> + for (int i = 0, num_idx = syscalltbl__num_idx(sc->e_machine); i < num_idx; ++i) {
> + int id = syscalltbl__id_at_idx(sc->e_machine, i);
> + struct syscall *pair = trace__syscall_info(trace, NULL, sc->e_machine, id);
> struct bpf_program *pair_prog;
> bool is_candidate = false;
>
> @@ -3890,7 +3897,7 @@ static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace
> return NULL;
> }
>
> -static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
> +static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace, int e_machine)
> {
> int map_enter_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_enter);
> int map_exit_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_exit);
> @@ -3898,27 +3905,27 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
> int err = 0;
> unsigned int beauty_array[6];
>
> - for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
> - int prog_fd, key = syscalltbl__id_at_idx(trace->sctbl, i);
> + for (int i = 0, num_idx = syscalltbl__num_idx(e_machine); i < num_idx; ++i) {
> + int prog_fd, key = syscalltbl__id_at_idx(e_machine, i);
>
> if (!trace__syscall_enabled(trace, key))
> continue;
>
> - trace__init_syscall_bpf_progs(trace, key);
> + trace__init_syscall_bpf_progs(trace, e_machine, key);
>
> // It'll get at least the "!raw_syscalls:unaugmented"
> - prog_fd = trace__bpf_prog_sys_enter_fd(trace, key);
> + prog_fd = trace__bpf_prog_sys_enter_fd(trace, e_machine, key);
> err = bpf_map_update_elem(map_enter_fd, &key, &prog_fd, BPF_ANY);
> if (err)
> break;
> - prog_fd = trace__bpf_prog_sys_exit_fd(trace, key);
> + prog_fd = trace__bpf_prog_sys_exit_fd(trace, e_machine, key);
> err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
> if (err)
> break;
>
> /* use beauty_map to tell BPF how many bytes to collect, set beauty_map's value here */
> memset(beauty_array, 0, sizeof(beauty_array));
> - err = trace__bpf_sys_enter_beauty_map(trace, key, (unsigned int *)beauty_array);
> + err = trace__bpf_sys_enter_beauty_map(trace, e_machine, key, (unsigned int *)beauty_array);
> if (err)
> continue;
> err = bpf_map_update_elem(beauty_map_fd, &key, beauty_array, BPF_ANY);
> @@ -3954,9 +3961,9 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
> * first and second arg (this one on the raw_syscalls:sys_exit prog
> * array tail call, then that one will be used.
> */
> - for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
> - int key = syscalltbl__id_at_idx(trace->sctbl, i);
> - struct syscall *sc = trace__syscall_info(trace, NULL, EM_HOST, key);
> + for (int i = 0, num_idx = syscalltbl__num_idx(e_machine); i < num_idx; ++i) {
> + int key = syscalltbl__id_at_idx(e_machine, i);
> + struct syscall *sc = trace__syscall_info(trace, NULL, e_machine, key);
> struct bpf_program *pair_prog;
> int prog_fd;
>
> @@ -4393,8 +4400,13 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
> goto out_error_mem;
>
> #ifdef HAVE_BPF_SKEL
> - if (trace->skel && trace->skel->progs.sys_enter)
> - trace__init_syscalls_bpf_prog_array_maps(trace);
> + if (trace->skel && trace->skel->progs.sys_enter) {
> + /*
> + * TODO: Initialize for all host binary machine types, not just
> + * those matching the perf binary.
> + */
> + trace__init_syscalls_bpf_prog_array_maps(trace, EM_HOST);
> + }
> #endif
>
> if (trace->ev_qualifier_ids.nr > 0) {
> @@ -4419,7 +4431,8 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
> * So just disable this beautifier (SCA_FD, SCA_FDAT) when 'close' is
> * not in use.
> */
> - trace->fd_path_disabled = !trace__syscall_enabled(trace, syscalltbl__id(trace->sctbl, "close"));
> + /* TODO: support for more than just perf binary machine type close. */
> + trace->fd_path_disabled = !trace__syscall_enabled(trace, syscalltbl__id(EM_HOST, "close"));
>
> err = trace__expand_filters(trace, &evsel);
> if (err)
> @@ -4692,8 +4705,7 @@ DEFINE_RESORT_RB(syscall_stats, a->msecs > b->msecs,
> entry->msecs = stats ? (u64)stats->stats.n * (avg_stats(&stats->stats) / NSEC_PER_MSEC) : 0;
> }
>
> -static size_t thread__dump_stats(struct thread_trace *ttrace,
> - struct trace *trace, FILE *fp)
> +static size_t thread__dump_stats(struct thread_trace *ttrace, struct trace *trace, int e_machine, FILE *fp)
> {
> size_t printed = 0;
> struct syscall *sc;
> @@ -4721,7 +4733,7 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
> pct = avg ? 100.0 * stddev_stats(&stats->stats) / avg : 0.0;
> avg /= NSEC_PER_MSEC;
>
> - sc = trace__syscall_info(trace, /*evsel=*/NULL, EM_HOST,
> + sc = trace__syscall_info(trace, /*evsel=*/NULL, e_machine,
> syscall_stats_entry->syscall);
> if (!sc)
> continue;
> @@ -4771,7 +4783,8 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
> else if (fputc('\n', fp) != EOF)
> ++printed;
>
> - printed += thread__dump_stats(ttrace, trace, fp);
> + /* TODO: get e_machine from thread. */
> + printed += thread__dump_stats(ttrace, trace, EM_HOST, fp);
>
> return printed;
> }
> @@ -5003,8 +5016,9 @@ static int trace__parse_events_option(const struct option *opt, const char *str,
> *sep = '\0';
>
> list = 0;
> - if (syscalltbl__id(trace->sctbl, s) >= 0 ||
> - syscalltbl__strglobmatch_first(trace->sctbl, s, &idx) >= 0) {
> + /* TODO: support for more than just perf binary machine type syscalls. */
> + if (syscalltbl__id(EM_HOST, s) >= 0 ||
> + syscalltbl__strglobmatch_first(EM_HOST, s, &idx) >= 0) {
> list = 1;
> goto do_concat;
> }
> @@ -5140,7 +5154,6 @@ static void trace__exit(struct trace *trace)
> syscall__exit(&trace->syscalls.table[i]);
> zfree(&trace->syscalls.table);
> }
> - syscalltbl__delete(trace->sctbl);
> zfree(&trace->perfconfig_events);
> }
>
> @@ -5286,9 +5299,8 @@ int cmd_trace(int argc, const char **argv)
> sigaction(SIGCHLD, &sigchld_act, NULL);
>
> trace.evlist = evlist__new();
> - trace.sctbl = syscalltbl__new();
>
> - if (trace.evlist == NULL || trace.sctbl == NULL) {
> + if (trace.evlist == NULL) {
> pr_err("Not enough memory to run!\n");
> err = -ENOMEM;
> goto out;
> diff --git a/tools/perf/scripts/syscalltbl.sh b/tools/perf/scripts/syscalltbl.sh
> index 1ce0d5aa8b50..a39b3013b103 100755
> --- a/tools/perf/scripts/syscalltbl.sh
> +++ b/tools/perf/scripts/syscalltbl.sh
> @@ -50,37 +50,27 @@ fi
> infile="$1"
> outfile="$2"
>
> -nxt=0
> -
> -syscall_macro() {
> - nr="$1"
> - name="$2"
> -
> - echo " [$nr] = \"$name\","
> -}
> -
> -emit() {
> - nr="$1"
> - entry="$2"
> -
> - syscall_macro "$nr" "$entry"
> -}
> -
> -echo "static const char *const syscalltbl[] = {" > $outfile
> -
> sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
> grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > $sorted_table
>
> -max_nr=0
> +echo "static const char *const syscall_num_to_name[] = {" > $outfile
> # the params are: nr abi name entry compat
> # use _ for intentionally unused variables according to SC2034
> while read nr _ name _ _; do
> - emit "$nr" "$name" >> $outfile
> - max_nr=$nr
> + echo " [$nr] = \"$name\"," >> $outfile
> done < $sorted_table
> +echo "};" >> $outfile
>
> -rm -f $sorted_table
> +echo "static const uint16_t syscall_sorted_names[] = {" >> $outfile
>
> +# When sorting by name, add a suffix of 0s up to 20 characters so that system
> +# calls that differ with a numerical suffix don't sort before those
> +# without. This default behavior of sort differs from that of strcmp used at
> +# runtime. Use sed to strip the trailing 0s suffix afterwards.
> +grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > $sorted_table
> +while read name nr; do
> + echo " $nr, /* $name */" >> $outfile
> +done < $sorted_table
> echo "};" >> $outfile
>
> -echo "#define SYSCALLTBL_MAX_ID ${max_nr}" >> $outfile
> +rm -f $sorted_table
> diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
> index 2f76241494c8..760ac4d0869f 100644
> --- a/tools/perf/util/syscalltbl.c
> +++ b/tools/perf/util/syscalltbl.c
> @@ -9,6 +9,7 @@
> #include <stdlib.h>
> #include <asm/bitsperlong.h>
> #include <linux/compiler.h>
> +#include <linux/kernel.h>
> #include <linux/zalloc.h>
>
> #include <string.h>
> @@ -20,112 +21,66 @@
> #include <asm/syscalls_32.h>
> #endif
>
> -const int syscalltbl_native_max_id = SYSCALLTBL_MAX_ID;
> -static const char *const *syscalltbl_native = syscalltbl;
> +const char *syscalltbl__name(int e_machine __maybe_unused, int id)
> +{
> + if (id >= 0 && id < (int)ARRAY_SIZE(syscall_num_to_name))
> + return syscall_num_to_name[id];
> + return NULL;
> +}
>
> -struct syscall {
> - int id;
> +struct syscall_cmp_key {
> const char *name;
> + const char *const *tbl;
> };
>
> static int syscallcmpname(const void *vkey, const void *ventry)
> {
> - const char *key = vkey;
> - const struct syscall *entry = ventry;
> + const struct syscall_cmp_key *key = vkey;
> + const uint16_t *entry = ventry;
>
> - return strcmp(key, entry->name);
> + return strcmp(key->name, key->tbl[*entry]);
> }
>
> -static int syscallcmp(const void *va, const void *vb)
> +int syscalltbl__id(int e_machine __maybe_unused, const char *name)
> {
> - const struct syscall *a = va, *b = vb;
> -
> - return strcmp(a->name, b->name);
> + struct syscall_cmp_key key = {
> + .name = name,
> + .tbl = syscall_num_to_name,
> + };
> + const uint16_t *id = bsearch(&key, syscall_sorted_names,
> + ARRAY_SIZE(syscall_sorted_names),
> + sizeof(syscall_sorted_names[0]),
> + syscallcmpname);
> +
> + return id ? *id : -1;
> }
>
> -static int syscalltbl__init_native(struct syscalltbl *tbl)
> +int syscalltbl__num_idx(int e_machine __maybe_unused)
> {
> - int nr_entries = 0, i, j;
> - struct syscall *entries;
> -
> - for (i = 0; i <= syscalltbl_native_max_id; ++i)
> - if (syscalltbl_native[i])
> - ++nr_entries;
> -
> - entries = tbl->syscalls.entries = malloc(sizeof(struct syscall) * nr_entries);
> - if (tbl->syscalls.entries == NULL)
> - return -1;
> -
> - for (i = 0, j = 0; i <= syscalltbl_native_max_id; ++i) {
> - if (syscalltbl_native[i]) {
> - entries[j].name = syscalltbl_native[i];
> - entries[j].id = i;
> - ++j;
> - }
> - }
> -
> - qsort(tbl->syscalls.entries, nr_entries, sizeof(struct syscall), syscallcmp);
> - tbl->syscalls.nr_entries = nr_entries;
> - tbl->syscalls.max_id = syscalltbl_native_max_id;
> - return 0;
> + return ARRAY_SIZE(syscall_sorted_names);
> }
>
> -struct syscalltbl *syscalltbl__new(void)
> +int syscalltbl__id_at_idx(int e_machine __maybe_unused, int idx)
> {
> - struct syscalltbl *tbl = malloc(sizeof(*tbl));
> - if (tbl) {
> - if (syscalltbl__init_native(tbl)) {
> - free(tbl);
> - return NULL;
> - }
> - }
> - return tbl;
> -}
> -
> -void syscalltbl__delete(struct syscalltbl *tbl)
> -{
> - zfree(&tbl->syscalls.entries);
> - free(tbl);
> -}
> -
> -const char *syscalltbl__name(const struct syscalltbl *tbl __maybe_unused, int id)
> -{
> - return id <= syscalltbl_native_max_id ? syscalltbl_native[id]: NULL;
> -}
> -
> -int syscalltbl__id(struct syscalltbl *tbl, const char *name)
> -{
> - struct syscall *sc = bsearch(name, tbl->syscalls.entries,
> - tbl->syscalls.nr_entries, sizeof(*sc),
> - syscallcmpname);
> -
> - return sc ? sc->id : -1;
> -}
> -
> -int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx)
> -{
> - struct syscall *syscalls = tbl->syscalls.entries;
> -
> - return idx < tbl->syscalls.nr_entries ? syscalls[idx].id : -1;
> + return syscall_sorted_names[idx];
> }
>
> -int syscalltbl__strglobmatch_next(struct syscalltbl *tbl, const char *syscall_glob, int *idx)
> +int syscalltbl__strglobmatch_next(int e_machine __maybe_unused, const char *syscall_glob, int *idx)
> {
> - int i;
> - struct syscall *syscalls = tbl->syscalls.entries;
> + for (int i = *idx + 1; i < (int)ARRAY_SIZE(syscall_sorted_names); ++i) {
> + const char *name = syscall_num_to_name[syscall_sorted_names[i]];
>
> - for (i = *idx + 1; i < tbl->syscalls.nr_entries; ++i) {
> - if (strglobmatch(syscalls[i].name, syscall_glob)) {
> + if (strglobmatch(name, syscall_glob)) {
> *idx = i;
> - return syscalls[i].id;
> + return syscall_sorted_names[i];
> }
> }
>
> return -1;
> }
>
> -int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_glob, int *idx)
> +int syscalltbl__strglobmatch_first(int e_machine, const char *syscall_glob, int *idx)
> {
> *idx = -1;
> - return syscalltbl__strglobmatch_next(tbl, syscall_glob, idx);
> + return syscalltbl__strglobmatch_next(e_machine, syscall_glob, idx);
> }
> diff --git a/tools/perf/util/syscalltbl.h b/tools/perf/util/syscalltbl.h
> index 362411a6d849..2bb628eff367 100644
> --- a/tools/perf/util/syscalltbl.h
> +++ b/tools/perf/util/syscalltbl.h
> @@ -2,22 +2,12 @@
> #ifndef __PERF_SYSCALLTBL_H
> #define __PERF_SYSCALLTBL_H
>
> -struct syscalltbl {
> - struct {
> - int max_id;
> - int nr_entries;
> - void *entries;
> - } syscalls;
> -};
> +const char *syscalltbl__name(int e_machine, int id);
> +int syscalltbl__id(int e_machine, const char *name);
> +int syscalltbl__num_idx(int e_machine);
> +int syscalltbl__id_at_idx(int e_machine, int idx);
>
> -struct syscalltbl *syscalltbl__new(void);
> -void syscalltbl__delete(struct syscalltbl *tbl);
> -
> -const char *syscalltbl__name(const struct syscalltbl *tbl, int id);
> -int syscalltbl__id(struct syscalltbl *tbl, const char *name);
> -int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx);
> -
> -int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_glob, int *idx);
> -int syscalltbl__strglobmatch_next(struct syscalltbl *tbl, const char *syscall_glob, int *idx);
> +int syscalltbl__strglobmatch_first(int e_machine, const char *syscall_glob, int *idx);
> +int syscalltbl__strglobmatch_next(int e_machine, const char *syscall_glob, int *idx);
>
> #endif /* __PERF_SYSCALLTBL_H */
> --
> 2.48.1.502.g6dc24dfdaf-goog
>
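For readers following the first/next pair above, here is a minimal standalone sketch of the same iteration pattern. It is an illustration only: the table contents and the demo_* names are made up, and POSIX fnmatch() stands in for perf's internal strglobmatch(). The index walks the name-sorted array and the return value is the id stored at that position.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Made-up table: the array index plays the role of the syscall number. */
static const char *const demo_num_to_name[] = {
	[0] = "read", [1] = "write", [2] = "open", [3] = "close", [4] = "openat",
};
/* Ids ordered by name as strcmp() sees them: close, open, openat, read, write. */
static const int demo_sorted_names[] = { 3, 2, 4, 0, 1 };
#define DEMO_SORTED_LEN ((int)(sizeof(demo_sorted_names) / sizeof(demo_sorted_names[0])))

/* Resume after position *idx; on a match store the new position and return the id. */
static int demo_glob_next(const char *glob, int *idx)
{
	assert(glob != NULL && idx != NULL);
	for (int i = *idx + 1; i < DEMO_SORTED_LEN; ++i) {
		const char *name = demo_num_to_name[demo_sorted_names[i]];

		if (fnmatch(glob, name, 0) == 0) {
			*idx = i;
			return demo_sorted_names[i];
		}
	}
	return -1;
}

/* Start from before the first entry, then behave like demo_glob_next(). */
static int demo_glob_first(const char *glob, int *idx)
{
	*idx = -1;
	return demo_glob_next(glob, idx);
}
```

A caller would typically loop with demo_glob_first() followed by demo_glob_next() until -1 is returned, keeping idx between calls.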
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 4/7] perf thread: Add support for reading the e_machine type for a thread
2025-02-10 16:51 ` [PATCH v2 4/7] perf thread: Add support for reading the e_machine type for a thread Ian Rogers
@ 2025-02-11 0:20 ` Charlie Jenkins
0 siblings, 0 replies; 26+ messages in thread
From: Charlie Jenkins @ 2025-02-11 0:20 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 08:51:05AM -0800, Ian Rogers wrote:
> Use the executable from /proc/pid/exe and read the e_machine from the
> ELF header. On failure use EM_HOST. Change builtin-trace syscall
> functions to pass e_machine from the thread rather than EM_HOST, so
> that once syscalltbl can use the e_machine in later patches, the
> system calls are specific to the architecture.
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>
> ---
> tools/perf/builtin-trace.c | 41 ++++++++++++++++---------------
> tools/perf/util/thread.c | 50 ++++++++++++++++++++++++++++++++++++++
> tools/perf/util/thread.h | 14 ++++++++++-
> 3 files changed, 85 insertions(+), 20 deletions(-)
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 4b77c2ab3dba..1ae609555018 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -2678,16 +2678,17 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
> int printed = 0;
> struct thread *thread;
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
> - int augmented_args_size = 0;
> + int augmented_args_size = 0, e_machine;
> void *augmented_args = NULL;
> /* TODO: get e_machine from thread. */
> - struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> + struct syscall *sc;
> struct thread_trace *ttrace;
>
> - if (sc == NULL)
> - return -1;
> -
> thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
> + e_machine = thread__e_machine(thread, trace->host);
> + sc = trace__syscall_info(trace, evsel, e_machine, id);
> + if (sc == NULL)
> + goto out_put;
> ttrace = thread__trace(thread, trace->output);
> if (ttrace == NULL)
> goto out_put;
> @@ -2756,16 +2757,18 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
> struct thread *thread;
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1;
> /* TODO: get e_machine from thread. */
> - struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> + struct syscall *sc;
> char msg[1024];
> void *args, *augmented_args = NULL;
> - int augmented_args_size;
> + int augmented_args_size, e_machine;
> size_t printed = 0;
>
> - if (sc == NULL)
> - return -1;
>
> thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
> + e_machine = thread__e_machine(thread, trace->host);
> + sc = trace__syscall_info(trace, evsel, e_machine, id);
> + if (sc == NULL)
> + return -1;
> ttrace = thread__trace(thread, trace->output);
> /*
> * We need to get ttrace just to make sure it is there when syscall__scnprintf_args()
> @@ -2830,15 +2833,15 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
> bool duration_calculated = false;
> struct thread *thread;
> int id = perf_evsel__sc_tp_uint(evsel, id, sample), err = -1, callchain_ret = 0, printed = 0;
> - int alignment = trace->args_alignment;
> - /* TODO: get e_machine from thread. */
> - struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> + int alignment = trace->args_alignment, e_machine;
> + struct syscall *sc;
> struct thread_trace *ttrace;
>
> - if (sc == NULL)
> - return -1;
> -
> thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);
> + e_machine = thread__e_machine(thread, trace->host);
> + sc = trace__syscall_info(trace, evsel, e_machine, id);
> + if (sc == NULL)
> + goto out_put;
> ttrace = thread__trace(thread, trace->output);
> if (ttrace == NULL)
> goto out_put;
> @@ -3185,8 +3188,8 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
>
> if (evsel == trace->syscalls.events.bpf_output) {
> int id = perf_evsel__sc_tp_uint(evsel, id, sample);
> - /* TODO: get e_machine from thread. */
> - struct syscall *sc = trace__syscall_info(trace, evsel, EM_HOST, id);
> + int e_machine = thread ? thread__e_machine(thread, trace->host) : EM_HOST;
> + struct syscall *sc = trace__syscall_info(trace, evsel, e_machine, id);
>
> if (sc) {
> fprintf(trace->output, "%s(", sc->name);
> @@ -4764,6 +4767,7 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
> {
> size_t printed = 0;
> struct thread_trace *ttrace = thread__priv(thread);
> + int e_machine = thread__e_machine(thread, trace->host);
> double ratio;
>
> if (ttrace == NULL)
> @@ -4783,8 +4787,7 @@ static size_t trace__fprintf_thread(FILE *fp, struct thread *thread, struct trac
> else if (fputc('\n', fp) != EOF)
> ++printed;
>
> - /* TODO: get e_machine from thread. */
> - printed += thread__dump_stats(ttrace, trace, EM_HOST, fp);
> + printed += thread__dump_stats(ttrace, trace, e_machine, fp);
>
> return printed;
> }
> diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> index 0ffdd52d86d7..a07446a280ed 100644
> --- a/tools/perf/util/thread.c
> +++ b/tools/perf/util/thread.c
> @@ -1,5 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0
> +#include <elf.h>
> #include <errno.h>
> +#include <fcntl.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> @@ -16,6 +18,7 @@
> #include "symbol.h"
> #include "unwind.h"
> #include "callchain.h"
> +#include "dwarf-regs.h"
>
> #include <api/fs/fs.h>
>
> @@ -51,6 +54,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
> thread__set_ppid(thread, -1);
> thread__set_cpu(thread, -1);
> thread__set_guest_cpu(thread, -1);
> + thread__set_e_machine(thread, EM_NONE);
> thread__set_lbr_stitch_enable(thread, false);
> INIT_LIST_HEAD(thread__namespaces_list(thread));
> INIT_LIST_HEAD(thread__comm_list(thread));
> @@ -423,6 +427,52 @@ void thread__find_cpumode_addr_location(struct thread *thread, u64 addr,
> }
> }
>
> +static uint16_t read_proc_e_machine_for_pid(pid_t pid)
> +{
> + char path[6 /* "/proc/" */ + 11 /* max length of pid */ + 5 /* "/exe\0" */];
> + int fd;
> + uint16_t e_machine = EM_NONE;
> +
> + snprintf(path, sizeof(path), "/proc/%d/exe", pid);
> + fd = open(path, O_RDONLY);
> + if (fd >= 0) {
> + _Static_assert(offsetof(Elf32_Ehdr, e_machine) == 18, "Unexpected offset");
> + _Static_assert(offsetof(Elf64_Ehdr, e_machine) == 18, "Unexpected offset");
> + if (pread(fd, &e_machine, sizeof(e_machine), 18) != sizeof(e_machine))
> + e_machine = EM_NONE;
> + close(fd);
> + }
> + return e_machine;
> +}
> +
> +uint16_t thread__e_machine(struct thread *thread, struct machine *machine)
> +{
> + pid_t tid, pid;
> + uint16_t e_machine = RC_CHK_ACCESS(thread)->e_machine;
> +
> + if (e_machine != EM_NONE)
> + return e_machine;
> +
> + tid = thread__tid(thread);
> + pid = thread__pid(thread);
> + if (pid != tid) {
> + struct thread *parent = machine__findnew_thread(machine, pid, pid);
> +
> + if (parent) {
> + e_machine = thread__e_machine(parent, machine);
> + thread__set_e_machine(thread, e_machine);
> + return e_machine;
> + }
> + /* Something went wrong, fallback. */
> + }
> + e_machine = read_proc_e_machine_for_pid(pid);
> + if (e_machine != EM_NONE)
> + thread__set_e_machine(thread, e_machine);
> + else
> + e_machine = EM_HOST;
> + return e_machine;
> +}
> +
> struct thread *thread__main_thread(struct machine *machine, struct thread *thread)
> {
> if (thread__pid(thread) == thread__tid(thread))
> diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
> index 6cbf6eb2812e..cd574a896418 100644
> --- a/tools/perf/util/thread.h
> +++ b/tools/perf/util/thread.h
> @@ -60,7 +60,11 @@ DECLARE_RC_STRUCT(thread) {
> struct srccode_state srccode_state;
> bool filter;
> int filter_entry_depth;
> -
> + /**
> + * @e_machine: The ELF EM_* associated with the thread. EM_NONE if not
> + * computed.
> + */
> + uint16_t e_machine;
> /* LBR call stack stitch */
> bool lbr_stitch_enable;
> struct lbr_stitch *lbr_stitch;
> @@ -302,6 +306,14 @@ static inline void thread__set_filter_entry_depth(struct thread *thread, int dep
> RC_CHK_ACCESS(thread)->filter_entry_depth = depth;
> }
>
> +uint16_t thread__e_machine(struct thread *thread, struct machine *machine);
> +
> +static inline void thread__set_e_machine(struct thread *thread, uint16_t e_machine)
> +{
> + RC_CHK_ACCESS(thread)->e_machine = e_machine;
> +}
> +
> +
> static inline bool thread__lbr_stitch_enable(const struct thread *thread)
> {
> return RC_CHK_ACCESS(thread)->lbr_stitch_enable;
> --
> 2.48.1.502.g6dc24dfdaf-goog
>
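As a standalone illustration of the header read in read_proc_e_machine_for_pid(), the sketch below reads e_machine from an arbitrary path. The name demo_read_e_machine() is hypothetical, and the ELF magic check is an extra step that the patch itself does not perform. Like the patch, it reads the field in host byte order, so a foreign-endian ELF would yield a byte-swapped value.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* e_machine sits at the same offset in the 32-bit and 64-bit ELF headers. */
_Static_assert(offsetof(Elf32_Ehdr, e_machine) == offsetof(Elf64_Ehdr, e_machine),
	       "e_machine offset differs between ELF classes");

/* Hypothetical helper: return the ELF e_machine of the file, EM_NONE on failure. */
static uint16_t demo_read_e_machine(const char *path)
{
	unsigned char ident[SELFMAG];
	uint16_t e_machine = EM_NONE;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return EM_NONE;

	/* Extra check, not in the patch: require the ELF magic before trusting the field. */
	if (pread(fd, ident, sizeof(ident), 0) == (ssize_t)sizeof(ident) &&
	    memcmp(ident, ELFMAG, SELFMAG) == 0) {
		if (pread(fd, &e_machine, sizeof(e_machine),
			  offsetof(Elf64_Ehdr, e_machine)) != (ssize_t)sizeof(e_machine))
			e_machine = EM_NONE;
	}
	close(fd);
	return e_machine;
}
```

Calling it on "/proc/self/exe" should return the caller's own machine type on a Linux system with /proc mounted. A caller wanting the patch's behaviour would map EM_NONE to EM_HOST.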
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables
2025-02-10 16:51 ` [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables Ian Rogers
@ 2025-02-11 0:22 ` Charlie Jenkins
2025-02-11 5:08 ` Ian Rogers
2025-02-11 8:08 ` Arnd Bergmann
1 sibling, 1 reply; 26+ messages in thread
From: Charlie Jenkins @ 2025-02-11 0:22 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 08:51:06AM -0800, Ian Rogers wrote:
> Rather than generating individual syscall header files generate a
> single trace/beauty/generated/syscalltbl.c. In a syscalltbls array
> have references to each architecture's tables along with the
> corresponding e_machine. When the 32-bit or 64-bit table is ambiguous,
> match the perf binary's type. For ARM32 don't use the arm64 32-bit
> table which is smaller. EM_NONE is present for when no machine matches.
>
> Conditionally compile the tables, only having the appropriate 32 and
> 64-bit table. If ALL_SYSCALLTBL is defined all tables can be
> compiled.
Is there somewhere that ALL_SYSCALLTBL could be documented? I talk
about this more in patch 7, but if this could also help perf report
display the correct syscall names, then maybe ALL_SYSCALLTBL should be
the default?
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>
> ---
> tools/perf/Makefile.perf | 9 +
> tools/perf/trace/beauty/syscalltbl.sh | 274 ++++++++++++++++++++++++++
> 2 files changed, 283 insertions(+)
> create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 55d6ce9ea52f..793e702f9aaf 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -559,6 +559,14 @@ beauty_ioctl_outdir := $(beauty_outdir)/ioctl
> # Create output directory if not already present
> $(shell [ -d '$(beauty_ioctl_outdir)' ] || mkdir -p '$(beauty_ioctl_outdir)')
>
> +syscall_array := $(beauty_outdir)/syscalltbl.c
> +syscall_tbl := $(srctree)/tools/perf/trace/beauty/syscalltbl.sh
> +syscall_tbl_data := $(srctree)/tools/scripts/syscall.tbl \
> + $(wildcard $(srctree)/tools/perf/arch/*/entry/syscalls/syscall*.tbl)
> +
> +$(syscall_array): $(syscall_tbl) $(syscall_tbl_data)
> + $(Q)$(SHELL) '$(syscall_tbl)' $(srctree)/tools $@
> +
> fs_at_flags_array := $(beauty_outdir)/fs_at_flags_array.c
> fs_at_flags_tbl := $(srctree)/tools/perf/trace/beauty/fs_at_flags.sh
>
> @@ -878,6 +886,7 @@ build-dir = $(or $(__build-dir),.)
>
> prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders \
> arm64-sysreg-defs \
> + $(syscall_array) \
> $(fs_at_flags_array) \
> $(clone_flags_array) \
> $(drm_ioctl_array) \
> diff --git a/tools/perf/trace/beauty/syscalltbl.sh b/tools/perf/trace/beauty/syscalltbl.sh
> new file mode 100755
> index 000000000000..635924dc5f59
> --- /dev/null
> +++ b/tools/perf/trace/beauty/syscalltbl.sh
> @@ -0,0 +1,274 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Generate all syscall tables.
> +#
> +# Each line of the syscall table should have the following format:
> +#
> +# NR ABI NAME [NATIVE] [COMPAT]
> +#
> +# NR syscall number
> +# ABI ABI name
> +# NAME syscall name
> +# NATIVE native entry point (optional)
> +# COMPAT compat entry point (optional)
> +
> +set -e
> +
> +usage() {
> + cat >&2 <<EOF
> +usage: $0 <TOOLS DIRECTORY> <OUTFILE>
> +
> + <TOOLS DIRECTORY> path to kernel tools directory
> + <OUTFILE> output C file
> +EOF
> + exit 1
> +}
> +
> +if [ $# -ne 2 ]; then
> + usage
> +fi
> +tools_dir=$1
> +outfile=$2
> +
> +build_tables() {
> + infile="$1"
> + outfile="$2"
> + abis=$(echo "($3)" | tr ',' '|')
> + e_machine="$4"
> +
> + if [ ! -f "$infile" ]
> + then
> + echo "Missing file $infile"
> + exit 1
> + fi
> + sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
> + grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > "$sorted_table"
> +
> + echo "static const char *const syscall_num_to_name_${e_machine}[] = {" >> "$outfile"
> + # the params are: nr abi name entry compat
> + # use _ for intentionally unused variables according to SC2034
> + while read -r nr _ name _ _; do
> + echo " [$nr] = \"$name\"," >> "$outfile"
> + done < "$sorted_table"
> + echo "};" >> "$outfile"
> +
> + echo "static const uint16_t syscall_sorted_names_${e_machine}[] = {" >> "$outfile"
> +
> + # When sorting by name, add a suffix of 0s up to 20 characters so that
> + # system calls that differ with a numerical suffix don't sort before
> + # those without. This default behavior of sort differs from that of
> + # strcmp used at runtime. Use sed to strip the trailing 0s suffix
> + # afterwards.
> + grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > "$sorted_table"
> + while read -r name nr; do
> + echo " $nr, /* $name */" >> "$outfile"
> + done < "$sorted_table"
> + echo "};" >> "$outfile"
> +
> + rm -f "$sorted_table"
> +}
> +
> +rm -f "$outfile"
> +cat >> "$outfile" <<EOF
> +#include <elf.h>
> +#include <stdint.h>
> +#include <asm/bitsperlong.h>
> +#include <linux/kernel.h>
> +
> +struct syscalltbl {
> + const char *const *num_to_name;
> + const uint16_t *sorted_names;
> + uint16_t e_machine;
> + uint16_t num_to_name_len;
> + uint16_t sorted_names_len;
> +};
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__alpha__)
> +EOF
> +build_tables "$tools_dir/perf/arch/alpha/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_ALPHA
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__alpha__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> +EOF
> +build_tables "$tools_dir/perf/arch/arm/entry/syscalls/syscall.tbl" "$outfile" common,32,oabi EM_ARM
> +build_tables "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_AARCH64
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__csky__)
> +EOF
> +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32,csky,time32,stat64,rlimit EM_CSKY
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__csky__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
> +EOF
> +build_tables "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile" common,64,n64 EM_MIPS
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__hppa__)
> +#if __BITS_PER_LONG != 64
> +EOF
> +build_tables "$tools_dir/perf/arch/parisc/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_PARISC
> +echo "#else" >> "$outfile"
> +build_tables "$tools_dir/perf/arch/parisc/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_PARISC
> +cat >> "$outfile" <<EOF
> +#endif //__BITS_PER_LONG != 64
> +#endif // defined(ALL_SYSCALLTBL) || defined(__hppa__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
> +EOF
> +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl" "$outfile" common,32,nospu EM_PPC
> +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl" "$outfile" common,64,nospu EM_PPC64
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__riscv)
> +#if __BITS_PER_LONG != 64
> +EOF
> +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32,riscv,memfd_secret EM_RISCV
> +echo "#else" >> "$outfile"
> +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,64,riscv,rlimit,memfd_secret EM_RISCV
> +cat >> "$outfile" <<EOF
> +#endif //__BITS_PER_LONG != 64
> +#endif // defined(ALL_SYSCALLTBL) || defined(__riscv)
> +#if defined(ALL_SYSCALLTBL) || defined(__s390x__)
> +EOF
> +build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__sh__)
> +EOF
> +build_tables "$tools_dir/perf/arch/sh/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_SH
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__sh__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
> +#if __BITS_PER_LONG != 64
> +EOF
> +build_tables "$tools_dir/perf/arch/sparc/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_SPARC
> +echo "#else" >> "$outfile"
> +build_tables "$tools_dir/perf/arch/sparc/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_SPARC
> +cat >> "$outfile" <<EOF
> +#endif //__BITS_PER_LONG != 64
> +#endif // defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> +EOF
> +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl" "$outfile" common,32,i386 EM_386
> +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl" "$outfile" common,64 EM_X86_64
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__xtensa__)
> +EOF
> +build_tables "$tools_dir/perf/arch/xtensa/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_XTENSA
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__xtensa__)
> +
> +#if __BITS_PER_LONG != 64
> +EOF
> +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32 EM_NONE
> +echo "#else" >> "$outfile"
> +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,64 EM_NONE
> +echo "#endif //__BITS_PER_LONG != 64" >> "$outfile"
> +
> +build_outer_table() {
> + e_machine=$1
> + outfile="$2"
> + cat >> "$outfile" <<EOF
> + {
> + .num_to_name = syscall_num_to_name_$e_machine,
> + .sorted_names = syscall_sorted_names_$e_machine,
> + .e_machine = $e_machine,
> + .num_to_name_len = ARRAY_SIZE(syscall_num_to_name_$e_machine),
> + .sorted_names_len = ARRAY_SIZE(syscall_sorted_names_$e_machine),
> + },
> +EOF
> +}
> +
> +cat >> "$outfile" <<EOF
> +static const struct syscalltbl syscalltbls[] = {
> +#if defined(ALL_SYSCALLTBL) || defined(__alpha__)
> +EOF
> +build_outer_table EM_ALPHA "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__alpha__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> +EOF
> +build_outer_table EM_ARM "$outfile"
> +build_outer_table EM_AARCH64 "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__csky__)
> +EOF
> +build_outer_table EM_CSKY "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__csky__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
> +EOF
> +build_outer_table EM_MIPS "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__hppa__)
> +EOF
> +build_outer_table EM_PARISC "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__hppa__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
> +EOF
> +build_outer_table EM_PPC "$outfile"
> +build_outer_table EM_PPC64 "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__riscv)
> +EOF
> +build_outer_table EM_RISCV "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__riscv)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__s390x__)
> +EOF
> +build_outer_table EM_S390 "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__sh__)
> +EOF
> +build_outer_table EM_SH "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__sh__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
> +EOF
> +build_outer_table EM_SPARC "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> +EOF
> +build_outer_table EM_386 "$outfile"
> +build_outer_table EM_X86_64 "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> +
> +#if defined(ALL_SYSCALLTBL) || defined(__xtensa__)
> +EOF
> +build_outer_table EM_XTENSA "$outfile"
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__xtensa__)
> +EOF
> +build_outer_table EM_NONE "$outfile"
> +cat >> "$outfile" <<EOF
> +};
> +EOF
> --
> 2.48.1.502.g6dc24dfdaf-goog
>
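To make the generated layout easier to picture, here is a small sketch of how such a table of tables might be consulted. Everything named demo_* is made up for illustration, as are the table contents (these are not real syscall numbers). The shape follows the struct emitted by the script: a number-to-name array, an array of numbers ordered by name, and a trailing EM_NONE entry acting as the fallback.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_tbl {
	const char *const *num_to_name;
	const uint16_t *sorted_names;
	uint16_t e_machine;
	uint16_t num_to_name_len;
	uint16_t sorted_names_len;
};

/* Made-up contents; the sorted arrays hold numbers in strcmp() order of the names. */
static const char *const demo_names_386[] = { [0] = "read", [1] = "write", [2] = "open" };
static const uint16_t demo_sorted_386[] = { 2, 0, 1 };	/* open, read, write */
static const char *const demo_names_none[] = { [0] = "write", [1] = "close" };
static const uint16_t demo_sorted_none[] = { 1, 0 };	/* close, write */

static const struct demo_tbl demo_tbls[] = {
	{ .num_to_name = demo_names_386, .sorted_names = demo_sorted_386,
	  .e_machine = EM_386,
	  .num_to_name_len = DEMO_ARRAY_SIZE(demo_names_386),
	  .sorted_names_len = DEMO_ARRAY_SIZE(demo_sorted_386) },
	/* EM_NONE comes last so it is only used when nothing else matched. */
	{ .num_to_name = demo_names_none, .sorted_names = demo_sorted_none,
	  .e_machine = EM_NONE,
	  .num_to_name_len = DEMO_ARRAY_SIZE(demo_names_none),
	  .sorted_names_len = DEMO_ARRAY_SIZE(demo_sorted_none) },
};

static const struct demo_tbl *demo_find_table(int e_machine)
{
	for (size_t i = 0; i < DEMO_ARRAY_SIZE(demo_tbls); i++) {
		if (demo_tbls[i].e_machine == e_machine || demo_tbls[i].e_machine == EM_NONE)
			return &demo_tbls[i];
	}
	return NULL;
}

struct demo_key {
	const char *name;
	const char *const *tbl;
};

/* bsearch() comparator: each entry is a number, compared through its name. */
static int demo_cmp(const void *vkey, const void *ventry)
{
	const struct demo_key *key = vkey;
	const uint16_t *entry = ventry;

	return strcmp(key->name, key->tbl[*entry]);
}

static int demo_id(int e_machine, const char *name)
{
	const struct demo_tbl *table = demo_find_table(e_machine);
	struct demo_key key = { .name = name };
	const uint16_t *id;

	if (!table)
		return -1;
	key.tbl = table->num_to_name;
	id = bsearch(&key, table->sorted_names, table->sorted_names_len,
		     sizeof(table->sorted_names[0]), demo_cmp);
	return id ? *id : -1;
}

static const char *demo_name(int e_machine, int id)
{
	const struct demo_tbl *table = demo_find_table(e_machine);

	if (table && id >= 0 && id < table->num_to_name_len)
		return table->num_to_name[id];
	return NULL;
}
```

The point of the sketch is that the element type used by the comparator and by the caller has to match the uint16_t element type of the sorted array, and that an unknown e_machine lands on the EM_NONE table rather than failing.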
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 6/7] perf syscalltbl: Use lookup table containing multiple architectures
2025-02-10 16:51 ` [PATCH v2 6/7] perf syscalltbl: Use lookup table containing multiple architectures Ian Rogers
2025-02-10 23:39 ` Charlie Jenkins
@ 2025-02-11 0:23 ` Charlie Jenkins
1 sibling, 0 replies; 26+ messages in thread
From: Charlie Jenkins @ 2025-02-11 0:23 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 08:51:07AM -0800, Ian Rogers wrote:
> Switch to use the lookup table containing all architectures rather
> than tables matching the perf binary.
>
> This fixes perf trace when executed on a 32-bit i386 binary on an
> x86-64 machine. Note in the following the system call names of the
> 32-bit i386 binary as seen by an x86-64 perf.
>
> Before:
> ```
> ? ( ): a.out/447296 ... [continued]: munmap()) = 0
> 0.024 ( 0.001 ms): a.out/447296 recvfrom(ubuf: 0x2, size: 4160585708, flags: DONTROUTE|CTRUNC|TRUNC|DONTWAIT|EOR|WAITALL|FIN|SYN|CONFIRM|RST|ERRQUEUE|NOSIGNAL|WAITFORONE|BATCH|SOCK_DEVMEM|ZEROCOPY|FASTOPEN|CMSG_CLOEXEC|0x91f80000, addr: 0xe30, addr_len: 0xffce438c) = 1475198976
> 0.042 ( 0.003 ms): a.out/447296 lgetxattr(name: "", value: 0x3, size: 34) = 4160344064
> 0.054 ( 0.003 ms): a.out/447296 dup2(oldfd: -134422744, newfd: 4) = -1 ENOENT (No such file or directory)
> 0.060 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x2e646c2f6374652f,.iov_len = (__kernel_size_t)7307199665335594867,}, vlen: 557056, pos_h: 4160585708) = 3
> 0.074 ( 0.004 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2) = 4160237568
> 0.080 ( 0.001 ms): a.out/447296 lstat(filename: "", statbuf: 0x193f6) = 0
> 0.089 ( 0.007 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x3833692f62696c2f,.iov_len = (__kernel_size_t)3276497845987585334,}, vlen: 557056, pos_h: 4160585708) = 3
> 0.097 ( 0.002 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 512
> 0.103 ( 0.002 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2050) = 4157935616
> 0.107 ( 0.007 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x5, size: 2066) = 4158078976
> 0.116 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x1, size: 2066) = 4159639552
> 0.121 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 2066) = 4160184320
> 0.129 ( 0.002 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 50) = 4160196608
> 0.138 ( 0.001 ms): a.out/447296 lstat(filename: "") = 0
> 0.145 ( 0.002 ms): a.out/447296 mq_timedreceive(mqdes: 4291706800, u_msg_ptr: 0xf7f9ea48, msg_len: 134616640, u_msg_prio: 0xf7fd7fec, u_abs_timeout: (struct __kernel_timespec){.tv_sec = (__kernel_time64_t)-578174027777317696,.tv_nsec = (long long int)4160349376,}) = 0
> 0.148 ( 0.001 ms): a.out/447296 mkdirat(dfd: -134617816, pathname: " ��� ���▒���▒���", mode: IFREG|ISUID|IRUSR|IWGRP|0xf7fd0000) = 447296
> 0.150 ( 0.001 ms): a.out/447296 process_vm_writev(pid: -134617812, lvec: (struct iovec){.iov_base = (void *)0xf7f9e9c8f7f9e4c0,.iov_len = (__kernel_size_t)4160349376,}, liovcnt: 4160588048, rvec: (struct iovec){}, riovcnt: 4160585708, flags: 4291707352) = 0
> 0.197 ( 0.004 ms): a.out/447296 capget(header: 4160184320, dataptr: 8192) = 0
> 0.202 ( 0.002 ms): a.out/447296 capget(header: 1448669184, dataptr: 4096) = 0
> 0.208 ( 0.002 ms): a.out/447296 capget(header: 4160577536, dataptr: 8192) = 0
> 0.220 ( 0.001 ms): a.out/447296 getxattr(pathname: "", name: "c������", value: 0xf7f77e34, size: 1) = 0
> 0.228 ( 0.005 ms): a.out/447296 fchmod(fd: -134729728, mode: IRUGO|IWUGO|IFREG|IFIFO|ISVTX|IXUSR|0x10000) = 0
> 0.240 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: 0x5658e008, pos_h: 4160192052) = 3
> 0.250 ( 0.008 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 1436
> 0.260 ( 0.018 ms): a.out/447296 stat(filename: "", statbuf: 0xffce32ac) = 1436
> 0.288 (1000.213 ms): a.out/447296 readlinkat(buf: 0xffce31d4, bufsiz: 4291703244) = 0
> ```
>
> After:
> ```
> ? ( ): a.out/442930 ... [continued]: execve()) = 0
> 0.023 ( 0.002 ms): a.out/442930 brk() = 0x57760000
> 0.052 ( 0.003 ms): a.out/442930 access(filename: 0xf7f5af28, mode: R) = -1 ENOENT (No such file or directory)
> 0.059 ( 0.009 ms): a.out/442930 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
> 0.078 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
> 0.087 ( 0.007 ms): a.out/442930 openat(dfd: CWD, filename: "/lib/i386-linux-", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
> 0.095 ( 0.002 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdbb70, count: 512) = 512
> 0.135 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
> 0.148 ( 0.001 ms): a.out/442930 set_tid_address(tidptr: 0xf7f2b528) = 442930 (a.out)
> 0.150 ( 0.001 ms): a.out/442930 set_robust_list(head: 0xf7f2b52c, len: 12) =
> 0.196 ( 0.004 ms): a.out/442930 mprotect(start: 0xf7f03000, len: 8192, prot: READ) = 0
> 0.202 ( 0.002 ms): a.out/442930 mprotect(start: 0x5658e000, len: 4096, prot: READ) = 0
> 0.207 ( 0.002 ms): a.out/442930 mprotect(start: 0xf7f63000, len: 8192, prot: READ) = 0
> 0.230 ( 0.005 ms): a.out/442930 munmap(addr: 0xf7f10000, len: 103414) = 0
> 0.244 ( 0.010 ms): a.out/442930 openat(dfd: CWD, filename: 0x5658d008) = 3
> 0.255 ( 0.007 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdb67c, count: 4096) = 1436
> 0.264 ( 0.018 ms): a.out/442930 write(fd: 1</dev/pts/4>, buf: , count: 1436) = 1436
> 0.292 (1000.173 ms): a.out/442930 clock_nanosleep(rqtp: { .tv_sec: 17866546940376776704, .tv_nsec: 4159878336 }, rmtp: 0xffbdb59c) = 0
> 1000.478 ( ): a.out/442930 exit_group() = ?
> ```
>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>
> ---
> tools/perf/util/syscalltbl.c | 89 ++++++++++++++++++++++++++----------
> 1 file changed, 64 insertions(+), 25 deletions(-)
>
> diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
> index 760ac4d0869f..db0d2b81aed1 100644
> --- a/tools/perf/util/syscalltbl.c
> +++ b/tools/perf/util/syscalltbl.c
> @@ -15,16 +15,39 @@
> #include <string.h>
> #include "string2.h"
>
> -#if __BITS_PER_LONG == 64
> - #include <asm/syscalls_64.h>
> -#else
> - #include <asm/syscalls_32.h>
> -#endif
> +#include "trace/beauty/generated/syscalltbl.c"
>
> -const char *syscalltbl__name(int e_machine __maybe_unused, int id)
> +static const struct syscalltbl *find_table(int e_machine)
> {
> - if (id >= 0 && id <= (int)ARRAY_SIZE(syscall_num_to_name))
> - return syscall_num_to_name[id];
> + static const struct syscalltbl *last_table;
> + static int last_table_machine = EM_NONE;
> +
> + /* Tables only exist for EM_SPARC. */
> + if (e_machine == EM_SPARCV9)
> + e_machine = EM_SPARC;
> +
> + if (last_table_machine == e_machine && last_table != NULL)
> + return last_table;
> +
> + for (size_t i = 0; i < ARRAY_SIZE(syscalltbls); i++) {
> + const struct syscalltbl *entry = &syscalltbls[i];
> +
> + if (entry->e_machine != e_machine && entry->e_machine != EM_NONE)
> + continue;
> +
> + last_table = entry;
> + last_table_machine = e_machine;
> + return entry;
> + }
> + return NULL;
> +}
> +
> +const char *syscalltbl__name(int e_machine, int id)
> +{
> + const struct syscalltbl *table = find_table(e_machine);
> +
> + if (table && id >= 0 && id < table->num_to_name_len)
> + return table->num_to_name[id];
> return NULL;
> }
>
> @@ -41,38 +64,54 @@ static int syscallcmpname(const void *vkey, const void *ventry)
> return strcmp(key->name, key->tbl[*entry]);
> }
>
> -int syscalltbl__id(int e_machine __maybe_unused, const char *name)
> +int syscalltbl__id(int e_machine, const char *name)
> {
> - struct syscall_cmp_key key = {
> - .name = name,
> - .tbl = syscall_num_to_name,
> - };
> - const int *id = bsearch(&key, syscall_sorted_names,
> - ARRAY_SIZE(syscall_sorted_names),
> - sizeof(syscall_sorted_names[0]),
> - syscallcmpname);
> + const struct syscalltbl *table = find_table(e_machine);
> + struct syscall_cmp_key key;
> + const int *id;
> +
> + if (!table)
> + return -1;
> +
> + key.name = name;
> + key.tbl = table->num_to_name;
> + id = bsearch(&key, table->sorted_names, table->sorted_names_len,
> + sizeof(table->sorted_names[0]), syscallcmpname);
>
> return id ? *id : -1;
> }
>
> -int syscalltbl__num_idx(int e_machine __maybe_unused)
> +int syscalltbl__num_idx(int e_machine)
> {
> - return ARRAY_SIZE(syscall_sorted_names);
> + const struct syscalltbl *table = find_table(e_machine);
> +
> + if (!table)
> + return 0;
> +
> + return table->sorted_names_len;
> }
>
> -int syscalltbl__id_at_idx(int e_machine __maybe_unused, int idx)
> +int syscalltbl__id_at_idx(int e_machine, int idx)
> {
> - return syscall_sorted_names[idx];
> + const struct syscalltbl *table = find_table(e_machine);
> +
> + if (!table)
> + return -1;
> +
> + assert(idx >= 0 && idx < table->sorted_names_len);
> + return table->sorted_names[idx];
> }
>
> -int syscalltbl__strglobmatch_next(int e_machine __maybe_unused, const char *syscall_glob, int *idx)
> +int syscalltbl__strglobmatch_next(int e_machine, const char *syscall_glob, int *idx)
> {
> - for (int i = *idx + 1; i < (int)ARRAY_SIZE(syscall_sorted_names); ++i) {
> - const char *name = syscall_num_to_name[syscall_sorted_names[i]];
> + const struct syscalltbl *table = find_table(e_machine);
> +
> + for (int i = *idx + 1; table && i < table->sorted_names_len; ++i) {
> + const char *name = table->num_to_name[table->sorted_names[i]];
>
> if (strglobmatch(name, syscall_glob)) {
> *idx = i;
> - return syscall_sorted_names[i];
> + return table->sorted_names[i];
> }
> }
>
> --
> 2.48.1.502.g6dc24dfdaf-goog
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables
2025-02-11 0:22 ` Charlie Jenkins
@ 2025-02-11 5:08 ` Ian Rogers
0 siblings, 0 replies; 26+ messages in thread
From: Ian Rogers @ 2025-02-11 5:08 UTC (permalink / raw)
To: Charlie Jenkins
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 4:22 PM Charlie Jenkins <charlie@rivosinc.com> wrote:
>
> On Mon, Feb 10, 2025 at 08:51:06AM -0800, Ian Rogers wrote:
> > Rather than generating individual syscall header files generate a
> > single trace/beauty/generated/syscalltbl.c. In a syscalltbls array
> > have references to each architectures tables along with the
> > corresponding e_machine. When the 32-bit or 64-bit table is ambiguous,
> > match the perf binary's type. For ARM32 don't use the arm64 32-bit
> > table which is smaller. EM_NONE is present for when no machine matches.
> >
> > Conditionally compile the tables, only having the appropriate 32 and
> > 64-bit table. If ALL_SYSCALLTBL is defined all tables can be
> > compiled.
>
> Is there somewhere that the ALL_SYSCALLTBL could be documented? I talk
> about this more in patch 7, but if this also could help perf report
> display the correct syscall names, then ALL_SYSCALLTBL maybe should be
> the default?
I think ALL_SYSCALLTBL should become the default once we
have a use for it. Currently `perf trace record` doesn't capture the
e_machine of the executing processes, so recording on, say, a RISC-V
machine and then analyzing on an x86-64 machine isn't possible. I was worried
that just making ALL_SYSCALLTBL the default would lead to complaints
about increases in binary size or something. This patch series does
what's sensible for things that work right now. ALL_SYSCALLTBL is
useful for making sure the tables other than for your build machine at
least compile. If others feel strongly it should be the default I
don't have a problem changing the code.
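
To illustrate the trade-off, here is a minimal sketch of how the guards
behave. The names demo_tbl, demo_tbls and demo_find_table are made up
for illustration and the rows are stand-ins for the generated
syscalltbls[] entries; only the preprocessor conditions and the EM_NONE
fallback mirror the patches. Without ALL_SYSCALLTBL a foreign e_machine
falls through to the generic table rather than its own:

```c
#include <assert.h>
#include <stddef.h>

/* ELF machine numbers, matching the EM_* values in <elf.h>. */
enum {
	DEMO_EM_NONE = 0,
	DEMO_EM_386 = 3,
	DEMO_EM_X86_64 = 62,
	DEMO_EM_RISCV = 243,
};

/* Hypothetical stand-in for one row of the generated syscalltbls[]. */
struct demo_tbl {
	int e_machine;
	const char *label;
};

static const struct demo_tbl demo_tbls[] = {
#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
	{ DEMO_EM_386, "i386" },
	{ DEMO_EM_X86_64, "x86-64" },
#endif
#if defined(ALL_SYSCALLTBL) || defined(__riscv)
	{ DEMO_EM_RISCV, "riscv" },
#endif
	{ DEMO_EM_NONE, "generic" },	/* always built, matches anything */
};

#define DEMO_NUM_TBLS (sizeof(demo_tbls) / sizeof(demo_tbls[0]))

/* First exact match wins; the trailing EM_NONE row is the fallback. */
static const struct demo_tbl *demo_find_table(int e_machine)
{
	for (size_t i = 0; i < DEMO_NUM_TBLS; i++) {
		const struct demo_tbl *entry = &demo_tbls[i];

		if (entry->e_machine == e_machine ||
		    entry->e_machine == DEMO_EM_NONE)
			return entry;
	}
	return NULL;
}

static void demo_all_syscalltbl_usage(void)
{
	/* No table exists for this value, so the generic row is returned. */
	const struct demo_tbl *t = demo_find_table(0x7fff);

	assert(t != NULL && t->e_machine == DEMO_EM_NONE);
}
```

Building the sketch with -DALL_SYSCALLTBL adds every guarded row, which
is the binary size cost mentioned above.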
Thanks,
Ian
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > Reviewed-by: Howard Chu <howardchu95@gmail.com>
> > ---
> > tools/perf/Makefile.perf | 9 +
> > tools/perf/trace/beauty/syscalltbl.sh | 274 ++++++++++++++++++++++++++
> > 2 files changed, 283 insertions(+)
> > create mode 100755 tools/perf/trace/beauty/syscalltbl.sh
> >
> > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> > index 55d6ce9ea52f..793e702f9aaf 100644
> > --- a/tools/perf/Makefile.perf
> > +++ b/tools/perf/Makefile.perf
> > @@ -559,6 +559,14 @@ beauty_ioctl_outdir := $(beauty_outdir)/ioctl
> > # Create output directory if not already present
> > $(shell [ -d '$(beauty_ioctl_outdir)' ] || mkdir -p '$(beauty_ioctl_outdir)')
> >
> > +syscall_array := $(beauty_outdir)/syscalltbl.c
> > +syscall_tbl := $(srctree)/tools/perf/trace/beauty/syscalltbl.sh
> > +syscall_tbl_data := $(srctree)/tools/scripts/syscall.tbl \
> > + $(wildcard $(srctree)/tools/perf/arch/*/entry/syscalls/syscall*.tbl)
> > +
> > +$(syscall_array): $(syscall_tbl) $(syscall_tbl_data)
> > + $(Q)$(SHELL) '$(syscall_tbl)' $(srctree)/tools $@
> > +
> > fs_at_flags_array := $(beauty_outdir)/fs_at_flags_array.c
> > fs_at_flags_tbl := $(srctree)/tools/perf/trace/beauty/fs_at_flags.sh
> >
> > @@ -878,6 +886,7 @@ build-dir = $(or $(__build-dir),.)
> >
> > prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders \
> > arm64-sysreg-defs \
> > + $(syscall_array) \
> > $(fs_at_flags_array) \
> > $(clone_flags_array) \
> > $(drm_ioctl_array) \
> > diff --git a/tools/perf/trace/beauty/syscalltbl.sh b/tools/perf/trace/beauty/syscalltbl.sh
> > new file mode 100755
> > index 000000000000..635924dc5f59
> > --- /dev/null
> > +++ b/tools/perf/trace/beauty/syscalltbl.sh
> > @@ -0,0 +1,274 @@
> > +#!/bin/sh
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Generate all syscall tables.
> > +#
> > +# Each line of the syscall table should have the following format:
> > +#
> > +# NR ABI NAME [NATIVE] [COMPAT]
> > +#
> > +# NR syscall number
> > +# ABI ABI name
> > +# NAME syscall name
> > +# NATIVE native entry point (optional)
> > +# COMPAT compat entry point (optional)
> > +
> > +set -e
> > +
> > +usage() {
> > + cat >&2 <<EOF
> > +usage: $0 <TOOLS DIRECTORY> <OUTFILE>
> > +
> > + <TOOLS DIRECTORY> path to kernel tools directory
> > + <OUTFILE> output header file
> > +EOF
> > + exit 1
> > +}
> > +
> > +if [ $# -ne 2 ]; then
> > + usage
> > +fi
> > +tools_dir=$1
> > +outfile=$2
> > +
> > +build_tables() {
> > + infile="$1"
> > + outfile="$2"
> > + abis=$(echo "($3)" | tr ',' '|')
> > + e_machine="$4"
> > +
> > + if [ ! -f "$infile" ]
> > + then
> > + echo "Missing file $infile"
> > + exit 1
> > + fi
> > + sorted_table=$(mktemp /tmp/syscalltbl.XXXXXX)
> > + grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | sort -n > "$sorted_table"
> > +
> > + echo "static const char *const syscall_num_to_name_${e_machine}[] = {" >> "$outfile"
> > + # the params are: nr abi name entry compat
> > + # use _ for intentionally unused variables according to SC2034
> > + while read -r nr _ name _ _; do
> > + echo " [$nr] = \"$name\"," >> "$outfile"
> > + done < "$sorted_table"
> > + echo "};" >> "$outfile"
> > +
> > + echo "static const uint16_t syscall_sorted_names_${e_machine}[] = {" >> "$outfile"
> > +
> > + # When sorting by name, add a suffix of 0s upto 20 characters so that
> > + # system calls that differ with a numerical suffix don't sort before
> > + # those without. This default behavior of sort differs from that of
> > + # strcmp used at runtime. Use sed to strip the trailing 0s suffix
> > + # afterwards.
> > + grep -E "^[0-9]+[[:space:]]+$abis" "$infile" | awk '{printf $3; for (i = length($3); i < 20; i++) { printf "0"; }; print " " $1}'| sort | sed 's/\([a-zA-Z1-9]\+\)0\+ \([0-9]\+\)/\1 \2/' > "$sorted_table"
> > + while read -r name nr; do
> > + echo " $nr, /* $name */" >> "$outfile"
> > + done < "$sorted_table"
> > + echo "};" >> "$outfile"
> > +
> > + rm -f "$sorted_table"
> > +}
> > +
> > +rm -f "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#include <elf.h>
> > +#include <stdint.h>
> > +#include <asm/bitsperlong.h>
> > +#include <linux/kernel.h>
> > +
> > +struct syscalltbl {
> > + const char *const *num_to_name;
> > + const uint16_t *sorted_names;
> > + uint16_t e_machine;
> > + uint16_t num_to_name_len;
> > + uint16_t sorted_names_len;
> > +};
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__alpha__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/alpha/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_ALPHA
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__alpha__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/arm/entry/syscalls/syscall.tbl" "$outfile" common,32,oabi EM_ARM
> > +build_tables "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_AARCH64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__csky__)
> > +EOF
> > +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32,csky,time32,stat64,rlimit EM_CSKY
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__csky__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile" common,64,n64 EM_MIPS
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__hppa__)
> > +#if __BITS_PER_LONG != 64
> > +EOF
> > +build_tables "$tools_dir/perf/arch/parisc/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_PARISC
> > +echo "#else" >> "$outfile"
> > +build_tables "$tools_dir/perf/arch/parisc/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_PARISC
> > +cat >> "$outfile" <<EOF
> > +#endif //__BITS_PER_LONG != 64
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__hppa__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl" "$outfile" common,32,nospu EM_PPC
> > +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl" "$outfile" common,64,nospu EM_PPC64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__riscv)
> > +#if __BITS_PER_LONG != 64
> > +EOF
> > +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32,riscv,memfd_secret EM_RISCV
> > +echo "#else" >> "$outfile"
> > +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,64,riscv,rlimit,memfd_secret EM_RISCV
> > +cat >> "$outfile" <<EOF
> > +#endif //__BITS_PER_LONG != 64
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__riscv)
> > +#if defined(ALL_SYSCALLTBL) || defined(__s390x__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__sh__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/sh/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_SH
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__sh__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
> > +#if __BITS_PER_LONG != 64
> > +EOF
> > +build_tables "$tools_dir/perf/arch/sparc/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_SPARC
> > +echo "#else" >> "$outfile"
> > +build_tables "$tools_dir/perf/arch/sparc/entry/syscalls/syscall.tbl" "$outfile" common,64 EM_SPARC
> > +cat >> "$outfile" <<EOF
> > +#endif //__BITS_PER_LONG != 64
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl" "$outfile" common,32,i386 EM_386
> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl" "$outfile" common,64 EM_X86_64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__xtensa__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/xtensa/entry/syscalls/syscall.tbl" "$outfile" common,32 EM_XTENSA
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__xtensa__)
> > +
> > +#if __BITS_PER_LONG != 64
> > +EOF
> > +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,32 EM_NONE
> > +echo "#else" >> "$outfile"
> > +build_tables "$tools_dir/scripts/syscall.tbl" "$outfile" common,64 EM_NONE
> > +echo "#endif //__BITS_PER_LONG != 64" >> "$outfile"
> > +
> > +build_outer_table() {
> > + e_machine=$1
> > + outfile="$2"
> > + cat >> "$outfile" <<EOF
> > + {
> > + .num_to_name = syscall_num_to_name_$e_machine,
> > + .sorted_names = syscall_sorted_names_$e_machine,
> > + .e_machine = $e_machine,
> > + .num_to_name_len = ARRAY_SIZE(syscall_num_to_name_$e_machine),
> > + .sorted_names_len = ARRAY_SIZE(syscall_sorted_names_$e_machine),
> > + },
> > +EOF
> > +}
> > +
> > +cat >> "$outfile" <<EOF
> > +static const struct syscalltbl syscalltbls[] = {
> > +#if defined(ALL_SYSCALLTBL) || defined(__alpha__)
> > +EOF
> > +build_outer_table EM_ALPHA "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__alpha__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> > +EOF
> > +build_outer_table EM_ARM "$outfile"
> > +build_outer_table EM_AARCH64 "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__csky__)
> > +EOF
> > +build_outer_table EM_CSKY "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__csky__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
> > +EOF
> > +build_outer_table EM_MIPS "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__hppa__)
> > +EOF
> > +build_outer_table EM_PARISC "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__hppa__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
> > +EOF
> > +build_outer_table EM_PPC "$outfile"
> > +build_outer_table EM_PPC64 "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) || defined(__powerpc64__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__riscv)
> > +EOF
> > +build_outer_table EM_RISCV "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__riscv)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__s390x__)
> > +EOF
> > +build_outer_table EM_S390 "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__sh__)
> > +EOF
> > +build_outer_table EM_SH "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__sh__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
> > +EOF
> > +build_outer_table EM_SPARC "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__sparc64__) || defined(__sparc__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> > +EOF
> > +build_outer_table EM_386 "$outfile"
> > +build_outer_table EM_X86_64 "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> > +
> > +#if defined(ALL_SYSCALLTBL) || defined(__xtensa__)
> > +EOF
> > +build_outer_table EM_XTENSA "$outfile"
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__xtensa__)
> > +EOF
> > +build_outer_table EM_NONE "$outfile"
> > +cat >> "$outfile" <<EOF
> > +};
> > +EOF
> > --
> > 2.48.1.502.g6dc24dfdaf-goog
> >
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 6/7] perf syscalltbl: Use lookup table containing multiple architectures
2025-02-10 23:39 ` Charlie Jenkins
@ 2025-02-11 5:15 ` Ian Rogers
0 siblings, 0 replies; 26+ messages in thread
From: Ian Rogers @ 2025-02-11 5:15 UTC (permalink / raw)
To: Charlie Jenkins
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, Guo Ren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Bibo Mao, Arnd Bergmann, Huacai Chen, Catalin Marinas,
Jiri Slaby, Björn Töpel, Howard Chu, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky, linux-riscv
On Mon, Feb 10, 2025 at 3:39 PM Charlie Jenkins <charlie@rivosinc.com> wrote:
>
> On Mon, Feb 10, 2025 at 08:51:07AM -0800, Ian Rogers wrote:
> > Switch to use the lookup table containing all architectures rather
> > than tables matching the perf binary.
> >
> > This fixes perf trace when executed on a 32-bit i386 binary on an
> > x86-64 machine. Note in the following the system call names of the
> > 32-bit i386 binary as seen by an x86-64 perf.
> >
> > Before:
> > ```
> > ? ( ): a.out/447296 ... [continued]: munmap()) = 0
> > 0.024 ( 0.001 ms): a.out/447296 recvfrom(ubuf: 0x2, size: 4160585708, flags: DONTROUTE|CTRUNC|TRUNC|DONTWAIT|EOR|WAITALL|FIN|SYN|CONFIRM|RST|ERRQUEUE|NOSIGNAL|WAITFORONE|BATCH|SOCK_DEVMEM|ZEROCOPY|FASTOPEN|CMSG_CLOEXEC|0x91f80000, addr: 0xe30, addr_len: 0xffce438c) = 1475198976
> > 0.042 ( 0.003 ms): a.out/447296 lgetxattr(name: "", value: 0x3, size: 34) = 4160344064
> > 0.054 ( 0.003 ms): a.out/447296 dup2(oldfd: -134422744, newfd: 4) = -1 ENOENT (No such file or directory)
> > 0.060 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x2e646c2f6374652f,.iov_len = (__kernel_size_t)7307199665335594867,}, vlen: 557056, pos_h: 4160585708) = 3
> > 0.074 ( 0.004 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2) = 4160237568
> > 0.080 ( 0.001 ms): a.out/447296 lstat(filename: "", statbuf: 0x193f6) = 0
> > 0.089 ( 0.007 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x3833692f62696c2f,.iov_len = (__kernel_size_t)3276497845987585334,}, vlen: 557056, pos_h: 4160585708) = 3
> > 0.097 ( 0.002 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 512
> > 0.103 ( 0.002 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2050) = 4157935616
> > 0.107 ( 0.007 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x5, size: 2066) = 4158078976
> > 0.116 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x1, size: 2066) = 4159639552
> > 0.121 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 2066) = 4160184320
> > 0.129 ( 0.002 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 50) = 4160196608
> > 0.138 ( 0.001 ms): a.out/447296 lstat(filename: "") = 0
> > 0.145 ( 0.002 ms): a.out/447296 mq_timedreceive(mqdes: 4291706800, u_msg_ptr: 0xf7f9ea48, msg_len: 134616640, u_msg_prio: 0xf7fd7fec, u_abs_timeout: (struct __kernel_timespec){.tv_sec = (__kernel_time64_t)-578174027777317696,.tv_nsec = (long long int)4160349376,}) = 0
> > 0.148 ( 0.001 ms): a.out/447296 mkdirat(dfd: -134617816, pathname: " ��� ���▒���▒���", mode: IFREG|ISUID|IRUSR|IWGRP|0xf7fd0000) = 447296
> > 0.150 ( 0.001 ms): a.out/447296 process_vm_writev(pid: -134617812, lvec: (struct iovec){.iov_base = (void *)0xf7f9e9c8f7f9e4c0,.iov_len = (__kernel_size_t)4160349376,}, liovcnt: 4160588048, rvec: (struct iovec){}, riovcnt: 4160585708, flags: 4291707352) = 0
> > 0.197 ( 0.004 ms): a.out/447296 capget(header: 4160184320, dataptr: 8192) = 0
> > 0.202 ( 0.002 ms): a.out/447296 capget(header: 1448669184, dataptr: 4096) = 0
> > 0.208 ( 0.002 ms): a.out/447296 capget(header: 4160577536, dataptr: 8192) = 0
> > 0.220 ( 0.001 ms): a.out/447296 getxattr(pathname: "", name: "c������", value: 0xf7f77e34, size: 1) = 0
> > 0.228 ( 0.005 ms): a.out/447296 fchmod(fd: -134729728, mode: IRUGO|IWUGO|IFREG|IFIFO|ISVTX|IXUSR|0x10000) = 0
> > 0.240 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: 0x5658e008, pos_h: 4160192052) = 3
> > 0.250 ( 0.008 ms): a.out/447296 close(fd: 3</proc/447296/status>) = 1436
> > 0.260 ( 0.018 ms): a.out/447296 stat(filename: "", statbuf: 0xffce32ac) = 1436
> > 0.288 (1000.213 ms): a.out/447296 readlinkat(buf: 0xffce31d4, bufsiz: 4291703244) = 0
> > ```
> >
> > After:
> > ```
> > ? ( ): a.out/442930 ... [continued]: execve()) = 0
> > 0.023 ( 0.002 ms): a.out/442930 brk() = 0x57760000
> > 0.052 ( 0.003 ms): a.out/442930 access(filename: 0xf7f5af28, mode: R) = -1 ENOENT (No such file or directory)
> > 0.059 ( 0.009 ms): a.out/442930 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
> > 0.078 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
> > 0.087 ( 0.007 ms): a.out/442930 openat(dfd: CWD, filename: "/lib/i386-linux-", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
> > 0.095 ( 0.002 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdbb70, count: 512) = 512
> > 0.135 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>) = 0
> > 0.148 ( 0.001 ms): a.out/442930 set_tid_address(tidptr: 0xf7f2b528) = 442930 (a.out)
> > 0.150 ( 0.001 ms): a.out/442930 set_robust_list(head: 0xf7f2b52c, len: 12) =
> > 0.196 ( 0.004 ms): a.out/442930 mprotect(start: 0xf7f03000, len: 8192, prot: READ) = 0
> > 0.202 ( 0.002 ms): a.out/442930 mprotect(start: 0x5658e000, len: 4096, prot: READ) = 0
> > 0.207 ( 0.002 ms): a.out/442930 mprotect(start: 0xf7f63000, len: 8192, prot: READ) = 0
> > 0.230 ( 0.005 ms): a.out/442930 munmap(addr: 0xf7f10000, len: 103414) = 0
> > 0.244 ( 0.010 ms): a.out/442930 openat(dfd: CWD, filename: 0x5658d008) = 3
> > 0.255 ( 0.007 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdb67c, count: 4096) = 1436
> > 0.264 ( 0.018 ms): a.out/442930 write(fd: 1</dev/pts/4>, buf: , count: 1436) = 1436
> > 0.292 (1000.173 ms): a.out/442930 clock_nanosleep(rqtp: { .tv_sec: 17866546940376776704, .tv_nsec: 4159878336 }, rmtp: 0xffbdb59c) = 0
> > 1000.478 ( ): a.out/442930 exit_group() = ?
> > ```
> >
>
> I think I am conflating some things in my mind here. This change doesn't
> impact perf report does it? perf report reports syscall numbers only,
> but it could be hooked up into this change to correctly report the
> correct syscall name from a perf.data generated on any architecture?
>
> I believe that question is tangential to this patch but let me know!
You are right it is tangential. There are tracepoints for most system
calls that could be used to break down individual counts. You may be
thinking of `perf trace record` that I mentioned. Many of the perf
commands have a record and report option, so that things can be done
off-line. The support isn't what it should be. For example, `perf stat
record` will record event information and counts but it doesn't do
things like store metrics, so you'd need to manually compute a metric
from `perf stat report` if you wanted to use `perf stat` in this
record/report mode. I often think C is a very slow and
verbose way of doing the plumbing for these things, so I keep pushing
my python patches along.
Thanks for all your changes and the reviews! This patch series is just
moving things you've enabled around and was little work as a
consequence.
Thanks,
Ian
> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
>
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > Reviewed-by: Howard Chu <howardchu95@gmail.com>
> > ---
> > tools/perf/util/syscalltbl.c | 89 ++++++++++++++++++++++++++----------
> > 1 file changed, 64 insertions(+), 25 deletions(-)
> >
> > diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
> > index 760ac4d0869f..db0d2b81aed1 100644
> > --- a/tools/perf/util/syscalltbl.c
> > +++ b/tools/perf/util/syscalltbl.c
> > @@ -15,16 +15,39 @@
> > #include <string.h>
> > #include "string2.h"
> >
> > -#if __BITS_PER_LONG == 64
> > - #include <asm/syscalls_64.h>
> > -#else
> > - #include <asm/syscalls_32.h>
> > -#endif
> > +#include "trace/beauty/generated/syscalltbl.c"
> >
> > -const char *syscalltbl__name(int e_machine __maybe_unused, int id)
> > +static const struct syscalltbl *find_table(int e_machine)
> > {
> > - if (id >= 0 && id <= (int)ARRAY_SIZE(syscall_num_to_name))
> > - return syscall_num_to_name[id];
> > + static const struct syscalltbl *last_table;
> > + static int last_table_machine = EM_NONE;
> > +
> > + /* Tables only exist for EM_SPARC. */
> > + if (e_machine == EM_SPARCV9)
> > + e_machine = EM_SPARC;
> > +
> > + if (last_table_machine == e_machine && last_table != NULL)
> > + return last_table;
> > +
> > + for (size_t i = 0; i < ARRAY_SIZE(syscalltbls); i++) {
> > + const struct syscalltbl *entry = &syscalltbls[i];
> > +
> > + if (entry->e_machine != e_machine && entry->e_machine != EM_NONE)
> > + continue;
> > +
> > + last_table = entry;
> > + last_table_machine = e_machine;
> > + return entry;
> > + }
> > + return NULL;
> > +}
> > +
> > +const char *syscalltbl__name(int e_machine, int id)
> > +{
> > + const struct syscalltbl *table = find_table(e_machine);
> > +
> > + if (table && id >= 0 && id < table->num_to_name_len)
> > + return table->num_to_name[id];
> > return NULL;
> > }
> >
> > @@ -41,38 +64,54 @@ static int syscallcmpname(const void *vkey, const void *ventry)
> > return strcmp(key->name, key->tbl[*entry]);
> > }
> >
> > -int syscalltbl__id(int e_machine __maybe_unused, const char *name)
> > +int syscalltbl__id(int e_machine, const char *name)
> > {
> > - struct syscall_cmp_key key = {
> > - .name = name,
> > - .tbl = syscall_num_to_name,
> > - };
> > - const int *id = bsearch(&key, syscall_sorted_names,
> > - ARRAY_SIZE(syscall_sorted_names),
> > - sizeof(syscall_sorted_names[0]),
> > - syscallcmpname);
> > + const struct syscalltbl *table = find_table(e_machine);
> > + struct syscall_cmp_key key;
> > + const int *id;
> > +
> > + if (!table)
> > + return -1;
> > +
> > + key.name = name;
> > + key.tbl = table->num_to_name;
> > + id = bsearch(&key, table->sorted_names, table->sorted_names_len,
> > + sizeof(table->sorted_names[0]), syscallcmpname);
> >
> > return id ? *id : -1;
> > }
> >
> > -int syscalltbl__num_idx(int e_machine __maybe_unused)
> > +int syscalltbl__num_idx(int e_machine)
> > {
> > - return ARRAY_SIZE(syscall_sorted_names);
> > + const struct syscalltbl *table = find_table(e_machine);
> > +
> > + if (!table)
> > + return 0;
> > +
> > + return table->sorted_names_len;
> > }
> >
> > -int syscalltbl__id_at_idx(int e_machine __maybe_unused, int idx)
> > +int syscalltbl__id_at_idx(int e_machine, int idx)
> > {
> > - return syscall_sorted_names[idx];
> > + const struct syscalltbl *table = find_table(e_machine);
> > +
> > + if (!table)
> > + return -1;
> > +
> > + assert(idx >= 0 && idx < table->sorted_names_len);
> > + return table->sorted_names[idx];
> > }
> >
> > -int syscalltbl__strglobmatch_next(int e_machine __maybe_unused, const char *syscall_glob, int *idx)
> > +int syscalltbl__strglobmatch_next(int e_machine, const char *syscall_glob, int *idx)
> > {
> > - for (int i = *idx + 1; i < (int)ARRAY_SIZE(syscall_sorted_names); ++i) {
> > - const char *name = syscall_num_to_name[syscall_sorted_names[i]];
> > + const struct syscalltbl *table = find_table(e_machine);
> > +
> > + for (int i = *idx + 1; table && i < table->sorted_names_len; ++i) {
> > + const char *name = table->num_to_name[table->sorted_names[i]];
> >
> > if (strglobmatch(name, syscall_glob)) {
> > *idx = i;
> > - return syscall_sorted_names[i];
> > + return table->sorted_names[i];
> > }
> > }
> >
> > --
> > 2.48.1.502.g6dc24dfdaf-goog
> >
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl
2025-02-10 16:51 ` [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl Ian Rogers
2025-02-11 0:19 ` Charlie Jenkins
@ 2025-02-11 7:48 ` Arnd Bergmann
2025-02-11 16:18 ` Ian Rogers
1 sibling, 1 reply; 26+ messages in thread
From: Arnd Bergmann @ 2025-02-11 7:48 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv
On Mon, Feb 10, 2025, at 17:51, Ian Rogers wrote:
> The syscalltbl held entries of system call name and number pairs,
> generated from a native syscalltbl at start up. As there are gaps in
> the system call number there is a notion of index into the
> table. Going forward we want the system call table to be identifiable
> by a machine type, for example, i386 vs x86-64. Change the interface
> to the syscalltbl so (1) a (currently unused machine type of EM_HOST)
> is passed (2) the index to syscall number and system call name mapping
> is computed at build time.
>
> Two tables are used for this, an array of system call number to name,
> an array of system call numbers sorted by the system call name. The
> sorted array doesn't store strings in part to save memory and
> relocations. The index notion is carried forward and is an index into
> the sorted array of system call numbers, the data structures are
> opaque (held only in syscalltbl.c), and so the number of indices for a
> machine type is exposed as a new API.
>
> The arrays are computed in the syscalltbl.sh script and so no start-up
> time computation and storage is necessary.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Howard Chu <howardchu95@gmail.com>
Your changes look fine to me, but I noticed one part that may
be wrong before and after your patch:
>
> -const int syscalltbl_native_max_id = SYSCALLTBL_MAX_ID;
> -static const char *const *syscalltbl_native = syscalltbl;
> +const char *syscalltbl__name(int e_machine __maybe_unused, int id)
> +{
> + if (id >= 0 && id <= (int)ARRAY_SIZE(syscall_num_to_name))
> + return syscall_num_to_name[id];
> + return NULL;
> +}
The syscall numbers on mips (and previously on ia64) are offset by
a large number depending on the ABI (o32/n32/n64). I assume what
we want here is to have the small numbers without the offset in
syscall_num_to_name[], but that requires adding the offset during
the lookup. Can you check if this is handled correctly?
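
As a rough sketch of the lookup described above, assuming the table is
zero-based and the raw id carries the n64 base of 5000 (o32 uses 4000
and n32 uses 6000). The function and array names are hypothetical and
the three-entry table is only a stand-in for the generated one:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Linux/MIPS n64 syscall numbers start at this base. */
#define DEMO_MIPS_N64_BASE 5000

/* Tiny stand-in for a generated, zero-based n64 name table. */
static const char *const demo_n64_names[] = {
	[0] = "read",
	[1] = "write",
	[2] = "open",
};

#define DEMO_N64_LEN (sizeof(demo_n64_names) / sizeof(demo_n64_names[0]))

/* Map a raw syscall id, offset included, to a name or NULL. */
static const char *demo_mips_n64_name(int id)
{
	int idx = id - DEMO_MIPS_N64_BASE;

	if (idx < 0 || (size_t)idx >= DEMO_N64_LEN)
		return NULL;
	return demo_n64_names[idx];
}

static void demo_mips_n64_usage(void)
{
	/* 5001 is base + 1, so it resolves to the entry at index 1. */
	assert(strcmp(demo_mips_n64_name(5001), "write") == 0);
	/* An id without the offset is out of range for this ABI. */
	assert(demo_mips_n64_name(1) == NULL);
}
```

Without the subtraction, a bounds check against the table length would
reject every id that includes the offset.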
Arnd
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables
2025-02-10 16:51 ` [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables Ian Rogers
2025-02-11 0:22 ` Charlie Jenkins
@ 2025-02-11 8:08 ` Arnd Bergmann
2025-02-11 17:24 ` Ian Rogers
1 sibling, 1 reply; 26+ messages in thread
From: Arnd Bergmann @ 2025-02-11 8:08 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv
On Mon, Feb 10, 2025, at 17:51, Ian Rogers wrote:
> +# Each line of the syscall table should have the following format:
> +#
> +# NR ABI NAME [NATIVE] [COMPAT]
> +#
> +# NR syscall number
> +# ABI ABI name
> +# NAME syscall name
> +# NATIVE native entry point (optional)
> +# COMPAT compat entry point (optional)
On x86, there is now a sixth optional field.
> +#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> +EOF
> +build_tables "$tools_dir/perf/arch/arm/entry/syscalls/syscall.tbl"
> "$outfile" common,32,oabi EM_ARM
> +build_tables
The oabi syscalls probably shouldn't be part of the default set here.
Technically these are two separate ABIs, though EABI is a subset of
OABI for the most part. Some of the calling conventions are
also different.
> "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile"
> common,64,renameat,rlimit,memfd_secret EM_AARCH64
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) ||
> defined(__aarch64__)
Hardcoding the set of ABIs in the middle of the script seems
too fragile to me, I'm worried that these get out of sync quickly.
> +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
> +EOF
> +build_tables
> "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile"
> common,64,n64 EM_MIPS
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
What about n32/o32? The syscall tables are completely different here.
> +#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) ||
> defined(__powerpc64__)
> +EOF
> +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl"
> "$outfile" common,32,nospu EM_PPC
> +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl"
> "$outfile" common,64,nospu EM_PPC64
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) ||
> defined(__powerpc64__)
This skips the SPU table, but I think that's fine.
> +EOF
> +build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl"
> "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
This skips the 32-bit table, though I think that one is already
planned to be discontinued in the future.
> +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> +EOF
> +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl"
> "$outfile" common,32,i386 EM_386
> +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl"
> "$outfile" common,64 EM_X86_64
> +cat >> "$outfile" <<EOF
> +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) ||
> defined(__x86_64__)
This misses the x32 table.
Arnd
* Re: [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl
2025-02-11 7:48 ` Arnd Bergmann
@ 2025-02-11 16:18 ` Ian Rogers
2025-02-11 16:34 ` Arnd Bergmann
0 siblings, 1 reply; 26+ messages in thread
From: Ian Rogers @ 2025-02-11 16:18 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv
On Mon, Feb 10, 2025 at 11:48 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Mon, Feb 10, 2025, at 17:51, Ian Rogers wrote:
> > The syscalltbl held entries of system call name and number pairs,
> > generated from a native syscalltbl at start up. As there are gaps in
> > the system call number there is a notion of index into the
> > table. Going forward we want the system call table to be identifiable
> > by a machine type, for example, i386 vs x86-64. Change the interface
> > to the syscalltbl so (1) a (currently unused machine type of EM_HOST)
> > is passed (2) the index to syscall number and system call name mapping
> > is computed at build time.
> >
> > Two tables are used for this, an array of system call number to name,
> > an array of system call numbers sorted by the system call name. The
> > sorted array doesn't store strings in part to save memory and
> > relocations. The index notion is carried forward and is an index into
> > the sorted array of system call numbers, the data structures are
> > opaque (held only in syscalltbl.c), and so the number of indices for a
> > machine type is exposed as a new API.
> >
> > The arrays are computed in the syscalltbl.sh script and so no start-up
> > time computation and storage is necessary.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > Reviewed-by: Howard Chu <howardchu95@gmail.com>
>
> Your changes look fine to me, but I noticed one part that may
> be wrong before and after your patch:
>
> >
> > -const int syscalltbl_native_max_id = SYSCALLTBL_MAX_ID;
> > -static const char *const *syscalltbl_native = syscalltbl;
> > +const char *syscalltbl__name(int e_machine __maybe_unused, int id)
> > +{
> > + if (id >= 0 && id <= (int)ARRAY_SIZE(syscall_num_to_name))
> > + return syscall_num_to_name[id];
> > + return NULL;
> > +}
>
> The syscall numbers on mips (and previously on ia64) are offset by
> a large number depending on the ABI (o32/n32/n64). I assume what
> we want here is to have the small numbers without the offset in
> syscall_num_to_name[], but that requires adding the offset during
> the lookup. Can you check if this is handled correctly?
Thanks Arnd! I agree the tables are large and can be sparse, they'll
also be full of relocations. MIPS doesn't look like an outlier to me
here:
```
#if defined(ALL_SYSCALLTBL) || defined(__mips__)
static const char *const syscall_num_to_name_EM_MIPS[] = {
[0] = "read",
[1] = "write",
[2] = "open",
...
[465] = "listxattrat",
[466] = "removexattrat",
};
```
For contrast x86:
```
#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
static const char *const syscall_num_to_name_EM_386[] = {
[0] = "restart_syscall",
[1] = "exit",
[2] = "fork",
...
[464] = "getxattrat",
[465] = "listxattrat",
[466] = "removexattrat",
};
```
Looking through the tables, alpha has the highest-numbered syscall:
572, which is mseal.
I don't think this is great but in the current code (on x86-64) we
have in arch/x86/include/generated/asm/syscalls_64.h:
```
static const char *const syscalltbl[] = {
[0] = "read",
[1] = "write",
[2] = "open",
...
[465] = "listxattrat",
[466] = "removexattrat",
};
#define SYSCALLTBL_MAX_ID 466
```
So the change is carrying forward a bad behavior, although the table is
still only around 4kB. We could be more aggressive in compressing the
strings and pointers, for example the way we compress the perf events
and metrics. I think that is getting out of scope here, as that logic is
written in Python, with the aid of lots of dictionaries, whilst this
code is currently a shell script. It becomes more of an issue if we
enable all of the tables in the build at once.
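To make the "compressing the strings and pointers" idea a little more concrete, here is a minimal sketch, assuming one concatenated string blob plus 16-bit offsets instead of one `const char *` per entry. The `packed_*` names and the three-entry table are invented for illustration; this is not the layout of the existing events/metrics code, only the general technique.
```
#include <stddef.h>

/* All names packed into one blob, each NUL-terminated. */
static const char packed_strings[] =
	"\0"		/* offset 0: reserved to mean "no such syscall" */
	"read\0"	/* offset 1 */
	"write\0"	/* offset 6 */
	"open";		/* offset 12, ended by the literal's own terminator */

/* Number -> offset into packed_strings. Integers need no relocations. */
static const unsigned short packed_name_offset[] = {
	[0] = 1,
	[1] = 6,
	[2] = 12,
	[4] = 0,	/* explicit gap; the unlisted index 3 is 0 as well */
};

/* Number -> name, NULL for gaps and out-of-range ids. */
static const char *packed_name(int id)
{
	size_t n = sizeof(packed_name_offset) / sizeof(packed_name_offset[0]);

	if (id < 0 || (size_t)id >= n || packed_name_offset[id] == 0)
		return NULL;
	return &packed_strings[packed_name_offset[id]];
}
```
On a 64-bit build each entry shrinks from a pointer to two bytes, at the cost of the generator having to compute the offsets, which is the part that is awkward in a shell script.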
Thanks,
Ian
* Re: [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl
2025-02-11 16:18 ` Ian Rogers
@ 2025-02-11 16:34 ` Arnd Bergmann
2025-02-11 17:32 ` Ian Rogers
0 siblings, 1 reply; 26+ messages in thread
From: Arnd Bergmann @ 2025-02-11 16:34 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv
On Tue, Feb 11, 2025, at 17:18, Ian Rogers wrote:
> On Mon, Feb 10, 2025 at 11:48 PM Arnd Bergmann <arnd@arndb.de> wrote:
>> The syscall numbers on mips (and previously on ia64) are offset by
>> a large number depending on the ABI (o32/n32/n64). I assume what
>> we want here is to have the small numbers without the offset in
>> syscall_num_to_name[], but that requires adding the offset during
>> the lookup. Can you check if this is handled correctly?
>
> Thanks Arnd! I agree the tables are large and can be sparse, they'll
> also be full of relocations. MIPS doesn't look like an outlier to me
> here:
Sorry, I should have been clearer what I meant, see
arch/mips/include/uapi/asm/unistd.h:
#if _MIPS_SIM == _MIPS_SIM_NABI32
#define __NR_Linux 6000
#include <asm/unistd_n32.h>
#endif
and
arch/mips/include/generated/uapi/asm/unistd_n32.h
#define __NR_read (__NR_Linux + 0)
#define __NR_write (__NR_Linux + 1)
#define __NR_open (__NR_Linux + 2)
These offsets are 4000 (o32), 5000 (n64) and 6000 (n32).
> ```
> #if defined(ALL_SYSCALLTBL) || defined(__mips__)
> static const char *const syscall_num_to_name_EM_MIPS[] = {
> [0] = "read",
> [1] = "write",
> [2] = "open",
> ...
> [465] = "listxattrat",
> [466] = "removexattrat",
> };
This means the array is not sparse, but the numbers
here do not match the syscall number argument register.
The question is whether tracing on mips adds the same
per-ABI offset again, or if it tries and fails to look
up index 6002 for 'open'.
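One way the lookup could cope with this is sketched below. It is only an illustration: the `sketch_*` names are hypothetical, the bases are the `__NR_Linux` values (o32 4000, n64 5000, n32 6000), and the 1000-wide window per ABI is an assumption of the sketch rather than something perf or the kernel promises.
```
/* Per-ABI values of __NR_Linux in arch/mips/include/uapi/asm/unistd.h. */
#define SKETCH_MIPS_O32_BASE 4000
#define SKETCH_MIPS_N64_BASE 5000
#define SKETCH_MIPS_N32_BASE 6000
/* Assumption for this sketch: each ABI owns a 1000-wide window of numbers. */
#define SKETCH_MIPS_WINDOW   1000

enum sketch_mips_abi {
	SKETCH_MIPS_ABI_UNKNOWN,
	SKETCH_MIPS_ABI_O32,
	SKETCH_MIPS_ABI_N64,
	SKETCH_MIPS_ABI_N32,
};

/*
 * Split a raw syscall number, as seen in the syscall number register,
 * into an ABI and a zero-based index suitable for a table that starts
 * at [0]. Returns -1 if the number falls outside every window.
 */
static int sketch_mips_split(int raw, enum sketch_mips_abi *abi)
{
	static const struct {
		int base;
		enum sketch_mips_abi abi;
	} windows[] = {
		{ SKETCH_MIPS_O32_BASE, SKETCH_MIPS_ABI_O32 },
		{ SKETCH_MIPS_N64_BASE, SKETCH_MIPS_ABI_N64 },
		{ SKETCH_MIPS_N32_BASE, SKETCH_MIPS_ABI_N32 },
	};
	unsigned int i;

	for (i = 0; i < sizeof(windows) / sizeof(windows[0]); i++) {
		if (raw >= windows[i].base &&
		    raw < windows[i].base + SKETCH_MIPS_WINDOW) {
			*abi = windows[i].abi;
			return raw - windows[i].base;
		}
	}
	*abi = SKETCH_MIPS_ABI_UNKNOWN;
	return -1;
}
```
With this, a raw 6002 becomes index 2 in the n32 table, whereas an unadjusted lookup of 6002 in a table that ends near 466 simply fails.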
Arnd
* Re: [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables
2025-02-11 8:08 ` Arnd Bergmann
@ 2025-02-11 17:24 ` Ian Rogers
2025-02-11 17:53 ` Arnd Bergmann
0 siblings, 1 reply; 26+ messages in thread
From: Ian Rogers @ 2025-02-11 17:24 UTC (permalink / raw)
To: Arnd Bergmann, Arnaldo Carvalho de Melo, Howard Chu
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
John Garry, Will Deacon, James Clark, Mike Leach, Leo Yan, guoren,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins,
Bibo Mao, Huacai Chen, Catalin Marinas, Jiri Slaby,
Björn Töpel, linux-kernel, linux-perf-users,
linux-arm-kernel, linux-csky@vger.kernel.org, linux-riscv
On Tue, Feb 11, 2025 at 12:09 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Mon, Feb 10, 2025, at 17:51, Ian Rogers wrote:
>
> > +# Each line of the syscall table should have the following format:
> > +#
> > +# NR ABI NAME [NATIVE] [COMPAT]
> > +#
> > +# NR syscall number
> > +# ABI ABI name
> > +# NAME syscall name
> > +# NATIVE native entry point (optional)
> > +# COMPAT compat entry point (optional)
>
> On x86, there is now a sixth optional field.
Thanks, I'll add and repost a v3. I had some other questions below so
I'll try to do everything together to avoid noise.
> > +#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/arm/entry/syscalls/syscall.tbl"
> > "$outfile" common,32,oabi EM_ARM
> > +build_tables
>
> The oabi syscalls probably shouldn't be part of the default set here.
> Technically these are two separate ABIs, though EABI is a subset of
> OABI for the most part. Some of the calling conventions are
> also different.
Ack. I was carrying forward:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm/entry/syscalls/Kbuild#n3
but noticed that we weren't adding this for arm64:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/entry/syscalls/Kbuild
I'm happy to drop it and match the ARM64 behavior. I'll make it a
separate patch in case someone wants to revert it.
> > "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile"
> > common,64,renameat,rlimit,memfd_secret EM_AARCH64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) ||
> > defined(__aarch64__)
>
> Hardcoding the set of ABIs in the middle of the script seems
> too fragile to me, I'm worried that these get out of sync quickly.
I agree, again this is carrying forward a behavior and at least after
these changes the location is just one place. Do you have any
suggestions on how to do better?
Fwiw, a related problem/question that has come up, primarily with
Arnaldo and Howard, is having a way to determine system call
argument types so that perf trace can pretty print them. For example,
if via BTF it is found an argument is a "const char*" then it is
assumed to be a string, but a "char *" is not as it may just be an out
argument. There's a source for more information in the syzkaller
project:
https://github.com/google/syzkaller/blob/master/sys/linux/sys.txt
Perhaps there's a way to generate this information from the Linux
build and feed it into perf's build. It is out-of-scope for what I'm
trying to do here, but I thought it worth a mention given my general
ignorance on wider things.
> > +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
> > +EOF
> > +build_tables
> > "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile"
> > common,64,n64 EM_MIPS
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
>
> What about n32/o32? The syscall tables are completely different here.
So perf hasn't historically supported them and no one is asking for
support. Generating more tables isn't the problem, but we need to have
some way of determining which table to use for n32/o32. I see
EF_MIPS_ABI_O32 and EF_MIPS_ABI_O64, so we could add support by
extending the lookup of the table to be both of e_machine and e_flags.
I'm less clear on choosing n32. That said, back in the 90s I was
working to port MIPS code to Itanium via binary translation. Given that
Itanium is now obsolete, I'm not sure it is worth adding complexity for
the sake of MIPS. I'm happy to do what others feel is best here, but
my default position is just to carry what the existing behavior is
forward.
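A rough sketch of what keying the table choice on more than e_machine might look like follows. Everything named `SK_*`/`sk_*` is hypothetical, the constants are copied locally so the snippet stands alone, and the rule used (ELFCLASS64 means n64, EF_MIPS_ABI2 in a 32-bit object means n32, otherwise an ABI field of 0 or O32 means o32) is my reading of how the kernel classifies MIPS binaries and would need checking by someone with MIPS hardware.
```
#include <stdint.h>

/* ELF constants, redefined locally so the sketch is self-contained. */
#define SK_EM_MIPS		8
#define SK_ELFCLASS32		1
#define SK_ELFCLASS64		2
#define SK_EF_MIPS_ABI2		0x00000020u	/* set for n32 objects */
#define SK_EF_MIPS_ABI		0x0000f000u	/* mask of the ABI field */
#define SK_EF_MIPS_ABI_O32	0x00001000u

enum sk_mips_table { SK_TBL_NONE, SK_TBL_O32, SK_TBL_N32, SK_TBL_N64 };

/* Pick a MIPS syscall table from ELF header fields; SK_TBL_NONE if unsure. */
static enum sk_mips_table sk_pick_mips_table(uint16_t e_machine,
					     uint8_t ei_class,
					     uint32_t e_flags)
{
	uint32_t abi = e_flags & SK_EF_MIPS_ABI;

	if (e_machine != SK_EM_MIPS)
		return SK_TBL_NONE;
	if (ei_class == SK_ELFCLASS64)
		return SK_TBL_N64;
	if (e_flags & SK_EF_MIPS_ABI2)
		return SK_TBL_N32;
	if (abi == 0 || abi == SK_EF_MIPS_ABI_O32)
		return SK_TBL_O32;
	return SK_TBL_NONE;	/* e.g. o64 or EABI objects */
}
```
The point is only that e_machine alone cannot separate the three ABIs, so the ELF class and e_flags would have to be read alongside it from the executable's header.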
> > +#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) ||
> > defined(__powerpc64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl"
> > "$outfile" common,32,nospu EM_PPC
> > +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl"
> > "$outfile" common,64,nospu EM_PPC64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) ||
> > defined(__powerpc64__)
>
> This skips the SPU table, but I think that's fine.
>
> > +EOF
> > +build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl"
> > "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
>
> This skips the 32-bit table, though I think that one is already
> planned to be discontinued in the future.
Thankfully we have awesome s390 devs on the mailing list, hopefully
they'll shout out if I'm doing things wrong.
> > +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl"
> > "$outfile" common,32,i386 EM_386
> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl"
> > "$outfile" common,64 EM_X86_64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) ||
> > defined(__x86_64__)
>
> This misses the x32 table.
Again I'm carrying forward a behavior. Would it be worth adding x32?
Context: I handled x86 on Android over 10 years ago. x86 phones were
64-bit long before ARM or Apple phones; the kernel was 64-bit but the
userland was forced to be 32-bit. ARM32 has R15 as the program
counter (Sophie Wilson's idea), and when Android's security folks
experimented with ASLR they found it to be free as a consequence of
ARM32. On 32-bit x86, as opposed to x32, you're in the land of thunk
bx, losing a register and paying for extra instructions. We never did
x32 on Android as it became irrelevant when I brought up x86-64 on
Android. The desire for x32 was for RIP-relative encoding to fix the
ASLR issue, not so much the extra registers, and because Android used
to mandate a 32-bit userland (this was so extreme that even the
developer's C cross-compilers, generally targeting ARM, were built as
32-bit binaries). Given that x32 was done for Android and Android
never used it, I have a hard time thinking we should be adding support
to perf. That said there is likely other context
that I'm unaware of as I'm surprised x32 still exists. Fwiw, it
saddens me that the x32 experience means that for APX we're still
getting the x86-64 ABI moved forward, along with silly things like the
var args convention on %al (there for C80 compatibility I once
believed - not really an issue today). On ARM64 there are at least 8
callee-save general purpose and floating point registers, whereas in
the x86 model pretty much everything is caller-save, which means
function calls are expensive and you may need aggressive inlining.
Anyway, sorry for going on so much.
Thanks,
Ian
* Re: [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl
2025-02-11 16:34 ` Arnd Bergmann
@ 2025-02-11 17:32 ` Ian Rogers
0 siblings, 0 replies; 26+ messages in thread
From: Ian Rogers @ 2025-02-11 17:32 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, Howard Chu,
linux-kernel, linux-perf-users, linux-arm-kernel,
linux-csky@vger.kernel.org, linux-riscv
On Tue, Feb 11, 2025 at 8:34 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Tue, Feb 11, 2025, at 17:18, Ian Rogers wrote:
> > On Mon, Feb 10, 2025 at 11:48 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> >> The syscall numbers on mips (and previously on ia64) are offset by
> >> a large number depending on the ABI (o32/n32/n64). I assume what
> >> we want here is to have the small numbers without the offset in
> >> syscall_num_to_name[], but that requires adding the offset during
> >> the lookup. Can you check if this is handled correctly?
> >
> > Thanks Arnd! I agree the tables are large and can be sparse, they'll
> > also be full of relocations. MIPS doesn't look like an outlier to me
> > here:
>
> Sorry, I should have been clearer what I meant, see
> arch/mips/include/uapi/asm/unistd.h:
>
> #if _MIPS_SIM == _MIPS_SIM_NABI32
> #define __NR_Linux 6000
> #include <asm/unistd_n32.h>
> #endif
>
> and
> arch/mips/include/generated/uapi/asm/unistd_n32.h
>
> #define __NR_read (__NR_Linux + 0)
> #define __NR_write (__NR_Linux + 1)
> #define __NR_open (__NR_Linux + 2)
>
> These offsets are 4000 (o32), 5000 (n64) and 6000 (n32).
>
> > ```
> > #if defined(ALL_SYSCALLTBL) || defined(__mips__)
> > static const char *const syscall_num_to_name_EM_MIPS[] = {
> > [0] = "read",
> > [1] = "write",
> > [2] = "open",
> > ...
> > [465] = "listxattrat",
> > [466] = "removexattrat",
> > };
>
> This means the array is not sparse, but the numbers
> here do not match the syscall number argument register.
>
> The question is whether tracing on mips adds the same
> per-ABI offset again, or if it tries and fails to look
> up index 6002 for 'open'.
Thanks for clarifying. I believe it will use 6002 and be broken. I
believe that'd be true without these changes too. I'm not testing on
MIPS so it'd be nice to have a fix targeted at making it work. Ideally
we wouldn't use __NR_Linux and instead fudge the id/system call number
based on e_machine/e_flags. I think it is follow-on work, especially
because I don't find MIPS a very motivating use-case.
Thanks,
Ian
* Re: [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables
2025-02-11 17:24 ` Ian Rogers
@ 2025-02-11 17:53 ` Arnd Bergmann
2025-02-11 18:45 ` Ian Rogers
2025-02-12 13:59 ` David Laight
0 siblings, 2 replies; 26+ messages in thread
From: Arnd Bergmann @ 2025-02-11 17:53 UTC (permalink / raw)
To: Ian Rogers, Arnaldo Carvalho de Melo, Howard Chu
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
John Garry, Will Deacon, James Clark, Mike Leach, Leo Yan, guoren,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins,
Bibo Mao, Huacai Chen, Catalin Marinas, Jiri Slaby,
Björn Töpel, linux-kernel, linux-perf-users,
linux-arm-kernel, linux-csky@vger.kernel.org, linux-riscv
On Tue, Feb 11, 2025, at 18:24, Ian Rogers wrote:
> On Tue, Feb 11, 2025 at 12:09 AM Arnd Bergmann <arnd@arndb.de> wrote:
>> On Mon, Feb 10, 2025, at 17:51, Ian Rogers wrote:
>> > "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile"
>> > common,64,renameat,rlimit,memfd_secret EM_AARCH64
>> > +cat >> "$outfile" <<EOF
>> > +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) ||
>> > defined(__aarch64__)
>>
>> Hardcoding the set of ABIs in the middle of the script seems
>> too fragile to me, I'm worried that these get out of sync quickly.
>
> I agree, again this is carrying forward a behavior and at least after
> these changes the location is just one place. Do you have any
> suggestions on how to do better?
Not sure, but I have some patches that I was planning to send
that puts these into arch/*/kernel/Makefile.syscalls for all
architectures in a consistent way. Ideally we'd use the same
Makefile contents for tools/perf in order to trivially sync
them, but I'm also happy to hear other suggestions.
Your patches are currently ahead of mine, so I don't want to
hold you up.
> Fwiw, a related problem/question that has come up, primarily with
> Arnaldo and Howard, is having a way to determine system call
> argument types so that perf trace can pretty print them. For example,
> if via BTF it is found an argument is a "const char*" then it is
> assumed to be a string, but a "char *" is not as it may just be an out
> argument. There's a source for more information in the syzkaller
> project:
> https://github.com/google/syzkaller/blob/master/sys/linux/sys.txt
> Perhaps there's a way to generate this information from the Linux
> build and feed it into perf's build. It is out-of-scope for what I'm
> trying to do here, but I thought it worth a mention given my general
> ignorance on wider things.
Yes, this is also something I've been trying to work on. In particular
the calling conventions for 64-bit register arguments on 32-bit
targets need some help. My plan for this is to have a consistent
mapping of internal (sys_foo()) function names to argument lists,
instead of having some calls that are slightly different depending
on the architecture or ABI.
This should be in a machine-readable format so it can be parsed
not only by perf but also any other project that needs a list
(libc, gdb, qemu, strace, rust, ...)
>> > +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
>> > +EOF
>> > +build_tables
>> > "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile"
>> > common,64,n64 EM_MIPS
>> > +cat >> "$outfile" <<EOF
>> > +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
>>
>> What about n32/o32? The syscall tables are completely different here.
>
> So perf hasn't historically supported them and no one is asking for
> support. Generating more tables isn't the problem, but we need to have
> some way of determining which table to use for n32/o32. I see
> EF_MIPS_ABI_O32 and EF_MIPS_ABI_O64, so we could add support by
> extending the lookup of the table to be both of e_machine and e_flags.
> I'm less clear on choosing n32. That said, back in the 90s I was
> working to port MIPS code to Itanium via binary translation. Given now
> Itanium is obsolete, I'm not sure it is worth adding complexity for
> the sake of MIPS. I'm happy to do what others feel is best here, but
> my default position is just to carry what the existing behavior is
> forward.
I think the way it actually works on mips is that all syscalls are
allowed in any task and the actual number identifies both the
ABI and the syscall. In some variant, the same is true on arm
(oabi/eabi) and x86-64 (64/x32), but oabi and x32 are both too
obsolete to put much work into them.
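For the x86-64 case, "the number identifies the ABI" comes down to a single bit. A minimal sketch, with hypothetical `sk_*` names and the bit value taken from `__X32_SYSCALL_BIT` in arch/x86/include/uapi/asm/unistd.h:
```
/* Same value as __X32_SYSCALL_BIT; redefined so the sketch stands alone. */
#define SK_X32_SYSCALL_BIT 0x40000000u

/* Non-zero if the raw syscall number was issued through the x32 ABI. */
static int sk_is_x32(unsigned int raw)
{
	return (raw & SK_X32_SYSCALL_BIT) != 0;
}

/* Raw number with the ABI marker removed, usable as a table index. */
static unsigned int sk_strip_x32(unsigned int raw)
{
	return raw & ~SK_X32_SYSCALL_BIT;
}
```
If x32 stays an intentional omission, a check like sk_is_x32() would at least let the tool report "x32, unsupported" instead of looking up a huge index.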
There is still some interest in mips, maybe you can poke the
maintainers and see if someone is willing to help out since you
have done the bulk of the work already.
>> > +EOF
>> > +build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl"
>> > "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
>> > +cat >> "$outfile" <<EOF
>> > +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
>>
>> This skips the 32-bit table, though I think that one is already
>> planned to be discontinued in the future.
>
> Thankfully we have awesome s390 devs on the mailing list, hopefully
> they'll shout out if I'm doing things wrong.
I also remembered that I had a patch to bring the s390 syscall.tbl
into the same format as the others, since the behavior is currently
a bit different for compat calls. I think there is also a chance
that they want to discontinue 32-bit mode entirely, given that
the last 32-bit machine was discontinued over 20 years ago, and
support for native 32-bit kernels got removed 10 years ago
after Debian 8 moved to 64 bit.
If they are confident that there are no more remaining users that
rely on 32-bit binaries, we could both save some work.
>> > +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
>> > +EOF
>> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl"
>> > "$outfile" common,32,i386 EM_386
>> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl"
>> > "$outfile" common,64 EM_X86_64
>> > +cat >> "$outfile" <<EOF
>> > +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) ||
>> > defined(__x86_64__)
>>
>> This misses the x32 table.
>
> Again I'm carrying forward a behavior. Would it be worth adding x32?
I would probably document it in the file as an intentional
omission, same as for arm oabi
> That said there is likely other context
> that I'm unaware of as I'm surprised x32 still exists.
There are a handful of people still testing it, and some still
using it, but I agree it completely failed to get enough
traction to be worth maintaining.
I view x32 (and the corresponding arm64 ilp32 mode that never
made it in) mostly as an exercise in benchmark(et)ing, since
it showed noticeably higher results in some versions of specint
and some compiler workloads, compared to both normal 32-bit and
64-bit modes. The time that we already wasted on maintaining
it must have long surpassed any such benefits though, so I
certainly don't want to waste more time on it.
Arnd
* Re: [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables
2025-02-11 17:53 ` Arnd Bergmann
@ 2025-02-11 18:45 ` Ian Rogers
2025-02-12 13:59 ` David Laight
1 sibling, 0 replies; 26+ messages in thread
From: Ian Rogers @ 2025-02-11 18:45 UTC (permalink / raw)
To: Arnd Bergmann, linux-mips
Cc: Arnaldo Carvalho de Melo, Howard Chu, Peter Zijlstra, Ingo Molnar,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Kan Liang, John Garry, Will Deacon, James Clark,
Mike Leach, Leo Yan, guoren, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv
On Tue, Feb 11, 2025 at 9:53 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Tue, Feb 11, 2025, at 18:24, Ian Rogers wrote:
> > On Tue, Feb 11, 2025 at 12:09 AM Arnd Bergmann <arnd@arndb.de> wrote:
> >> On Mon, Feb 10, 2025, at 17:51, Ian Rogers wrote:
> >> > "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile"
> >> > common,64,renameat,rlimit,memfd_secret EM_AARCH64
> >> > +cat >> "$outfile" <<EOF
> >> > +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) ||
> >> > defined(__aarch64__)
> >>
> >> Hardcoding the set of ABIs in the middle of the script seems
> >> too fragile to me, I'm worried that these get out of sync quickly.
> >
> > I agree, again this is carrying forward a behavior and at least after
> > these changes the location is just one place. Do you have any
> > suggestions on how to do better?
>
> Not sure, but I have some patches that I was planning to send
> that puts these into arch/*/kernel/Makefile.syscalls for all
> architectures in a consistent way. Ideally we'd use the same
> Makefile contents for tools/perf in order to trivially sync
> them, but I'm also happy to hear other suggestions.
>
> Your patches are currently ahead of mine, so I don't want to
> hold you up.
>
> > Fwiw, a related problem/question that has come up, primarily with
> > Arnaldo and Howard, is having a way to determine system call
> > argument types so that perf trace can pretty print them. For example,
> > if via BTF it is found an argument is a "const char*" then it is
> > assumed to be a string, but a "char *" is not as it may just be an out
> > argument. There's a source for more information in the syzkaller
> > project:
> > https://github.com/google/syzkaller/blob/master/sys/linux/sys.txt
> > Perhaps there's a way to generate this information from the Linux
> > build and feed it into perf's build. It is out-of-scope for what I'm
> > trying to do here, but I thought it worth a mention given my general
> > ignorance on wider things.
>
> Yes, this is also something I've been trying to work on. In particular
> the calling conventions for 64-bit register arguments on 32-bit
> targets need some help. My plan for this is to have a consistent
> mapping of internal (sys_foo()) function names to argument lists,
> instead of having some calls that are slightly different depending
> on the architecture or ABI.
>
> This should be in a machine-readable format so it can be parsed
> not only by perf but also any other project that needs a list
> (libc, gdb, qemu, strace, rust, ...)
Awesome, thanks for working on this!
> >> > +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
> >> > +EOF
> >> > +build_tables
> >> > "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile"
> >> > common,64,n64 EM_MIPS
> >> > +cat >> "$outfile" <<EOF
> >> > +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
> >>
> >> What about n32/o32? The syscall tables are completely different here.
> >
> > So perf hasn't historically supported them and no one is asking for
> > support. Generating more tables isn't the problem, but we need to have
> > some way of determining which table to use for n32/o32. I see
> > EF_MIPS_ABI_O32 and EF_MIPS_ABI_O64, so we could add support by
> > extending the lookup of the table to be both of e_machine and e_flags.
> > I'm less clear on choosing n32. That said, back in the 90s I was
> > working to port MIPS code to Itanium via binary translation. Given
> > that Itanium is now obsolete, I'm not sure it is worth adding
> > complexity for the sake of MIPS. I'm happy to do what others feel is
> > best here, but my default position is just to carry the existing
> > behavior forward.
>
> I think the way it actually works on mips is that all syscalls are
> allowed in any task and the actual number identifies both the
> ABI and the syscall. In some variant, the same is true on arm
> (oabi/eabi) and x86-64 (64/x32), but oabi and x32 are both too
> obsolete to put much work into them.
>
> There is still some interest in mips, maybe you can poke the
> maintainers and see if someone is willing to help out since you
> have done the bulk of the work already.
Thanks, adding linux-mips@vger.kernel.org. Here is the original
feedback for them for context:
https://lore.kernel.org/lkml/07c5c3ad-5a6d-4eda-95f2-ed16e7504d4c@app.fastmail.com/
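For context, a minimal sketch of what keying the table lookup on both
e_machine and e_flags could look like. This is not code from the series:
every sketch_* name is hypothetical, the two EF_MIPS_* values are what I
understand the kernel's MIPS ELF header to define (redefined locally since
not every libc <elf.h> carries them), and falling back to the n64 table for
EM_MIPS is an assumption that mirrors the existing behavior. How to pick
n32 is deliberately left out.

```c
#include <assert.h>
#include <elf.h>	/* EM_386, EM_X86_64, EM_MIPS */
#include <stddef.h>
#include <stdint.h>

/* Assumed values, named locally so the sketch does not depend on the libc. */
#define SKETCH_EF_MIPS_ABI	0x0000f000u
#define SKETCH_EF_MIPS_ABI_O32	0x00001000u

/* Hypothetical table identifiers, standing in for the generated tables. */
enum sketch_tbl_id {
	SKETCH_TBL_NONE,
	SKETCH_TBL_I386,
	SKETCH_TBL_X86_64,
	SKETCH_TBL_MIPS_O32,
	SKETCH_TBL_MIPS_N64,
};

struct sketch_tbl_key {
	uint16_t e_machine;
	uint32_t flags_mask;	/* 0 means "any e_flags" */
	uint32_t flags_value;
	enum sketch_tbl_id id;
};

/* More specific entries come first; the first match wins. */
static const struct sketch_tbl_key sketch_tables[] = {
	{ EM_386,    0, 0, SKETCH_TBL_I386 },
	{ EM_X86_64, 0, 0, SKETCH_TBL_X86_64 },
	{ EM_MIPS, SKETCH_EF_MIPS_ABI, SKETCH_EF_MIPS_ABI_O32, SKETCH_TBL_MIPS_O32 },
	{ EM_MIPS,   0, 0, SKETCH_TBL_MIPS_N64 },	/* assumed default */
};

static enum sketch_tbl_id sketch_find_table(uint16_t e_machine, uint32_t e_flags)
{
	for (size_t i = 0; i < sizeof(sketch_tables) / sizeof(sketch_tables[0]); i++) {
		const struct sketch_tbl_key *k = &sketch_tables[i];

		/* A value outside its own mask could never match. */
		assert((k->flags_value & ~k->flags_mask) == 0);
		if (k->e_machine == e_machine &&
		    (e_flags & k->flags_mask) == k->flags_value)
			return k->id;
	}
	return SKETCH_TBL_NONE;
}
```

With this shape an architecture that only needs e_machine keeps a single
entry with a zero mask, so the extra key costs nothing outside MIPS.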
> >> > +EOF
> >> > +build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl" "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
> >> > +cat >> "$outfile" <<EOF
> >> > +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
> >>
> >> This skips the 32-bit table, though I think that one is already
> >> planned to be discontinued in the future.
> >
> > Thankfully we have awesome s390 devs on the mailing list, hopefully
> > they'll shout out if I'm doing things wrong.
>
> I also remembered that I had a patch to bring the s390 syscall.tbl
> into the same format as the others, since the behavior is currently
> a bit different for compat calls. I think there is also a chance
> that they want to discontinue 32-bit mode entirely, given that
> the last 32-bit machine was discontinued over 20 years ago, and
> support for native 32-bit kernels got removed 10 years ago
> after Debian 8 moved to 64 bit.
>
> If they are confident that there are no more remaining users that
> rely on 32-bit binaries, we could both save some work.
>
> >> > +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> >> > +EOF
> >> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl" "$outfile" common,32,i386 EM_386
> >> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl" "$outfile" common,64 EM_X86_64
> >> > +cat >> "$outfile" <<EOF
> >> > +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> >>
> >> This misses the x32 table.
> >
> > Again I'm carrying forward a behavior. Would it be worth adding x32?
>
> I would probably document it in the file as an intentional
> omission, same as for arm oabi
Ack.
> > That said, there is likely other context that I'm unaware of, as
> > I'm surprised x32 still exists.
>
> There are a handful of people still testing it, and some still
> using it, but I agree it completely failed to get enough
> traction to be worth maintaining.
>
> I view x32 (and the corresponding arm64 ilp32 mode that never
> made it in) mostly as an exercise in benchmark(et)ing, since
> it showed noticeably higher results in some versions of specint
> and some compiler workloads, compared to both normal 32-bit and
> 64-bit modes. The time that we already wasted on maintaining
> it must have long surpassed any such benefits though, so I
> certainly don't want to waste more time on it.
:-) I ate part of a cake intended as a goodwill gesture toward getting
x32 into Android.
Thanks,
Ian
* Re: [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables
2025-02-11 17:53 ` Arnd Bergmann
2025-02-11 18:45 ` Ian Rogers
@ 2025-02-12 13:59 ` David Laight
1 sibling, 0 replies; 26+ messages in thread
From: David Laight @ 2025-02-12 13:59 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Ian Rogers, Arnaldo Carvalho de Melo, Howard Chu, Peter Zijlstra,
Ingo Molnar, Namhyung Kim, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, John Garry, Will Deacon,
James Clark, Mike Leach, Leo Yan, guoren, Paul Walmsley,
Palmer Dabbelt, Albert Ou, Charlie Jenkins, Bibo Mao, Huacai Chen,
Catalin Marinas, Jiri Slaby, Björn Töpel, linux-kernel,
linux-perf-users, linux-arm-kernel, linux-csky@vger.kernel.org,
linux-riscv
On Tue, 11 Feb 2025 18:53:14 +0100
"Arnd Bergmann" <arnd@arndb.de> wrote:
...
> I think the way it actually works on mips is that all syscalls are
> allowed in any task and the actual number identifies both the
> ABI and the syscall. In some variant, the same is true on arm
> (oabi/eabi) and x86-64 (64/x32), but oabi and x32 are both too
> obsolete to put much work into them.
IIRC x86-64 processes can also just make i386 system calls.
Even switching to/from 64-bit mode isn't privileged.
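To make the "number identifies both the ABI and the syscall" point quoted
above concrete, here is a small sketch. It assumes the MIPS uapi numbering
where o32 starts at 4000, n64 at 5000 and n32 at 6000, and the x86-64
convention of marking x32 calls with bit 0x40000000. The sketch_* names, the
fixed 1000-entry window per MIPS ABI and the idea of returning an offset into
a per-ABI table are illustrative assumptions, not anything perf does today.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MIPS_O32_BASE	4000L
#define SKETCH_MIPS_N64_BASE	5000L
#define SKETCH_MIPS_N32_BASE	6000L
#define SKETCH_MIPS_RANGE	1000L		/* assumed window per ABI */
#define SKETCH_X32_SYSCALL_BIT	0x40000000L

static_assert(SKETCH_MIPS_O32_BASE + SKETCH_MIPS_RANGE <= SKETCH_MIPS_N64_BASE &&
	      SKETCH_MIPS_N64_BASE + SKETCH_MIPS_RANGE <= SKETCH_MIPS_N32_BASE,
	      "MIPS ABI number windows must not overlap");

enum sketch_mips_abi {
	SKETCH_MIPS_ABI_UNKNOWN,
	SKETCH_MIPS_ABI_O32,
	SKETCH_MIPS_ABI_N64,
	SKETCH_MIPS_ABI_N32,
};

struct sketch_decoded {
	enum sketch_mips_abi abi;
	long index;	/* offset from the ABI's base, -1 if unknown */
};

/* Split a raw MIPS syscall number into (ABI, offset within that ABI). */
static struct sketch_decoded sketch_mips_decode(long nr)
{
	struct sketch_decoded d = { SKETCH_MIPS_ABI_UNKNOWN, -1 };

	if (nr >= SKETCH_MIPS_N32_BASE && nr < SKETCH_MIPS_N32_BASE + SKETCH_MIPS_RANGE) {
		d.abi = SKETCH_MIPS_ABI_N32;
		d.index = nr - SKETCH_MIPS_N32_BASE;
	} else if (nr >= SKETCH_MIPS_N64_BASE && nr < SKETCH_MIPS_N64_BASE + SKETCH_MIPS_RANGE) {
		d.abi = SKETCH_MIPS_ABI_N64;
		d.index = nr - SKETCH_MIPS_N64_BASE;
	} else if (nr >= SKETCH_MIPS_O32_BASE && nr < SKETCH_MIPS_O32_BASE + SKETCH_MIPS_RANGE) {
		d.abi = SKETCH_MIPS_ABI_O32;
		d.index = nr - SKETCH_MIPS_O32_BASE;
	}
	return d;
}

/* On x86-64 an x32 call is the native number with the x32 bit set. */
static bool sketch_is_x32(long nr)
{
	return (nr & SKETCH_X32_SYSCALL_BIT) != 0;
}

static long sketch_x32_index(long nr)
{
	return nr & ~SKETCH_X32_SYSCALL_BIT;
}
```

If the number alone is enough, a tool would not need e_flags at all on
these architectures; it could pick the table per event rather than per
process.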
David
Thread overview: 26+ messages (end of thread, newest: 2025-02-12 13:59 UTC)
2025-02-10 16:51 [PATCH v2 0/7] perf: Support multiple system call tables in the build Ian Rogers
2025-02-10 16:51 ` [PATCH v2 1/7] perf syscalltble: Remove syscall_table.h Ian Rogers
2025-02-10 23:48 ` Charlie Jenkins
2025-02-10 16:51 ` [PATCH v2 2/7] perf trace: Reorganize syscalls Ian Rogers
2025-02-11 0:17 ` Charlie Jenkins
2025-02-10 16:51 ` [PATCH v2 3/7] perf syscalltbl: Remove struct syscalltbl Ian Rogers
2025-02-11 0:19 ` Charlie Jenkins
2025-02-11 7:48 ` Arnd Bergmann
2025-02-11 16:18 ` Ian Rogers
2025-02-11 16:34 ` Arnd Bergmann
2025-02-11 17:32 ` Ian Rogers
2025-02-10 16:51 ` [PATCH v2 4/7] perf thread: Add support for reading the e_machine type for a thread Ian Rogers
2025-02-11 0:20 ` Charlie Jenkins
2025-02-10 16:51 ` [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating all system call tables Ian Rogers
2025-02-11 0:22 ` Charlie Jenkins
2025-02-11 5:08 ` Ian Rogers
2025-02-11 8:08 ` Arnd Bergmann
2025-02-11 17:24 ` Ian Rogers
2025-02-11 17:53 ` Arnd Bergmann
2025-02-11 18:45 ` Ian Rogers
2025-02-12 13:59 ` David Laight
2025-02-10 16:51 ` [PATCH v2 6/7] perf syscalltbl: Use lookup table containing multiple architectures Ian Rogers
2025-02-10 23:39 ` Charlie Jenkins
2025-02-11 5:15 ` Ian Rogers
2025-02-11 0:23 ` Charlie Jenkins
2025-02-10 16:51 ` [PATCH v2 7/7] perf build: Remove Makefile.syscalls Ian Rogers