linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
@ 2025-02-22  6:10 Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative Ian Rogers
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

glibc's opendir allocates a minimum of 32kb, when called recursively
for a directory tree the memory consumption can add up - nearly 300kb
during perf start-up when processing modules. Add a stack allocated
variant of readdir sized a little more than 1kb

v3: Rebase on top of Krzysztof Łopatowski's work. Add additional
    defines for SYS_getdents64 on all other architectures if its
    definition is missing. Add a patch to further reduce the
    stack/memory usage in machine__set_modules_path_dir by appending
    to a buffer rather than creating a copy.
v2: Remove the feature test and always use a perf supplied getdents64
    to workaround an Alpine Linux issue in v1:
    https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/
    As suggested by Krzysztof Łopatowski
    <krzysztof.m.lopatowski@gmail.com> who also pointed to the perf
    trace performance improvements in start-up time eliminating stat
    calls can achieve:
    https://lore.kernel.org/lkml/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com/
    Convert parse-events and hwmon_pmu to use io_dir.
v1: This was previously part of the memory saving change set:
    https://lore.kernel.org/lkml/20231127220902.1315692-1-irogers@google.com/
    It is separated here and a feature check and syscall workaround
    for missing getdents64 added.

Ian Rogers (8):
  tools lib api: Add io_dir an allocation free readdir alternative
  perf maps: Switch modules tree walk to io_dir__readdir
  perf pmu: Switch to io_dir__readdir
  perf header: Switch mem topology to io_dir__readdir
  perf events: Remove scandir in thread synthesis
  perf parse-events: Switch tracepoints to io_dir__readdir
  perf hwmon_pmu: Switch event discovery to io_dir__readdir
  perf machine: Reuse module path buffer

 tools/lib/api/Makefile             |   2 +-
 tools/lib/api/io_dir.h             | 104 +++++++++++++++++++++++++++++
 tools/perf/util/header.c           |  31 ++++-----
 tools/perf/util/hwmon_pmu.c        |  42 +++++-------
 tools/perf/util/machine.c          |  57 ++++++++--------
 tools/perf/util/parse-events.c     |  32 +++++----
 tools/perf/util/pmu.c              |  46 ++++++-------
 tools/perf/util/pmus.c             |  30 +++------
 tools/perf/util/synthetic-events.c |  22 +++---
 9 files changed, 229 insertions(+), 137 deletions(-)
 create mode 100644 tools/lib/api/io_dir.h

-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
@ 2025-02-22  6:10 ` Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 2/8] perf maps: Switch modules tree walk to io_dir__readdir Ian Rogers
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

glibc's opendir allocates a minimum of 32kb, when called recursively
for a directory tree the memory consumption can add up - nearly 300kb
during perf start-up when processing modules. Add a stack allocated
variant of readdir sized a little more than 1kb.

As getdents64 may be missing from libc, add support using syscall. As
the system call number maybe missing, add #defines for those.

Note, an earlier version of this patch had a feature test for
getdents64 but there were problems on certains distros where
getdents64 would be #define renamed to getdents breaking the code. The
syscall use was made uncondtional to work around this. There is
context in:
https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/api/Makefile |   2 +-
 tools/lib/api/io_dir.h | 104 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+), 1 deletion(-)
 create mode 100644 tools/lib/api/io_dir.h

diff --git a/tools/lib/api/Makefile b/tools/lib/api/Makefile
index 7f6396087b46..8665c799e0fa 100644
--- a/tools/lib/api/Makefile
+++ b/tools/lib/api/Makefile
@@ -95,7 +95,7 @@ install_lib: $(LIBFILE)
 		$(call do_install_mkdir,$(libdir_SQ)); \
 		cp -fpR $(LIBFILE) $(DESTDIR)$(libdir_SQ)
 
-HDRS := cpu.h debug.h io.h
+HDRS := cpu.h debug.h io.h io_dir.h
 FD_HDRS := fd/array.h
 FS_HDRS := fs/fs.h fs/tracing_path.h
 INSTALL_HDRS_PFX := $(DESTDIR)$(prefix)/include/api
diff --git a/tools/lib/api/io_dir.h b/tools/lib/api/io_dir.h
new file mode 100644
index 000000000000..aab73393d2a2
--- /dev/null
+++ b/tools/lib/api/io_dir.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+/*
+ * Lightweight directory reading library.
+ */
+#ifndef __API_IO_DIR__
+#define __API_IO_DIR__
+
+#include <dirent.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/stat.h>
+#include <sys/syscall.h>
+
+#if !defined(SYS_getdents64)
+#if defined(__x86_64__) || defined(__arm__)
+  #define SYS_getdents64 217
+#elif defined(__i386__) || defined(__s390x__) || defined(__sh__)
+  #define SYS_getdents64 220
+#elif defined(__alpha__)
+  #define SYS_getdents64 377
+#elif defined(__mips__)
+  #define SYS_getdents64 308
+#elif defined(__powerpc64__) || defined(__powerpc__)
+  #define SYS_getdents64 202
+#elif defined(__sparc64__) || defined(__sparc__)
+  #define SYS_getdents64 154
+#elif defined(__xtensa__)
+  #define SYS_getdents64 60
+#else
+  #define SYS_getdents64 61
+#endif
+#endif /* !defined(SYS_getdents64) */
+
+static inline ssize_t perf_getdents64(int fd, void *dirp, size_t count)
+{
+#ifdef MEMORY_SANITIZER
+	memset(dirp, 0, count);
+#endif
+	return syscall(SYS_getdents64, fd, dirp, count);
+}
+
+struct io_dirent64 {
+	ino64_t        d_ino;    /* 64-bit inode number */
+	off64_t        d_off;    /* 64-bit offset to next structure */
+	unsigned short d_reclen; /* Size of this dirent */
+	unsigned char  d_type;   /* File type */
+	char           d_name[NAME_MAX + 1]; /* Filename (null-terminated) */
+};
+
+struct io_dir {
+	int dirfd;
+	ssize_t available_bytes;
+	struct io_dirent64 *next;
+	struct io_dirent64 buff[4];
+};
+
+static inline void io_dir__init(struct io_dir *iod, int dirfd)
+{
+	iod->dirfd = dirfd;
+	iod->available_bytes = 0;
+}
+
+static inline void io_dir__rewinddir(struct io_dir *iod)
+{
+	lseek(iod->dirfd, 0, SEEK_SET);
+	iod->available_bytes = 0;
+}
+
+static inline struct io_dirent64 *io_dir__readdir(struct io_dir *iod)
+{
+	struct io_dirent64 *entry;
+
+	if (iod->available_bytes <= 0) {
+		ssize_t rc = perf_getdents64(iod->dirfd, iod->buff, sizeof(iod->buff));
+
+		if (rc <= 0)
+			return NULL;
+		iod->available_bytes = rc;
+		iod->next = iod->buff;
+	}
+	entry = iod->next;
+	iod->next = (struct io_dirent64 *)((char *)entry + entry->d_reclen);
+	iod->available_bytes -= entry->d_reclen;
+	return entry;
+}
+
+static inline bool io_dir__is_dir(const struct io_dir *iod, struct io_dirent64 *dent)
+{
+	if (dent->d_type == DT_UNKNOWN) {
+		struct stat st;
+
+		if (fstatat(iod->dirfd, dent->d_name, &st, /*flags=*/0))
+			return false;
+
+		if (S_ISDIR(st.st_mode)) {
+			dent->d_type = DT_DIR;
+			return true;
+		}
+	}
+	return dent->d_type == DT_DIR;
+}
+
+#endif  /* __API_IO_DIR__ */
-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/8] perf maps: Switch modules tree walk to io_dir__readdir
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative Ian Rogers
@ 2025-02-22  6:10 ` Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 3/8] perf pmu: Switch " Ian Rogers
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

Compared to glibc's opendir/readdir this lowers the max RSS of perf
record by 1.8MB on a Debian machine.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/machine.c | 25 ++++++++-----------------
 1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 316f0879e5e0..e394c630e3a2 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -37,6 +37,7 @@
 #include <internal/lib.h> // page_size
 #include "cgroup.h"
 #include "arm64-frame-pointer-unwind-support.h"
+#include <api/io_dir.h>
 
 #include <linux/ctype.h>
 #include <symbol/kallsyms.h>
@@ -1339,31 +1340,21 @@ static int maps__set_module_path(struct maps *maps, const char *path, struct kmo
 
 static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, int depth)
 {
-	const struct dirent *dent;
-	DIR *dir = opendir(dir_name);
+	struct io_dirent64 *dent;
+	struct io_dir iod;
 	int ret = 0;
 
-	if (!dir) {
+	io_dir__init(&iod, open(dir_name, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+	if (iod.dirfd < 0) {
 		pr_debug("%s: cannot open %s dir\n", __func__, dir_name);
 		return -1;
 	}
 
-	while ((dent = readdir(dir)) != NULL) {
+	while ((dent = io_dir__readdir(&iod)) != NULL) {
 		char path[PATH_MAX];
-		unsigned char d_type = dent->d_type;
 
 		path__join(path, sizeof(path), dir_name, dent->d_name);
-
-		if (d_type == DT_UNKNOWN) {
-			struct stat st;
-
-			if (stat(path, &st))
-				continue;
-			if (S_ISDIR(st.st_mode))
-				d_type = DT_DIR;
-		}
-
-		if (d_type == DT_DIR) {
+		if (io_dir__is_dir(&iod, dent)) {
 			if (!strcmp(dent->d_name, ".") ||
 			    !strcmp(dent->d_name, ".."))
 				continue;
@@ -1396,7 +1387,7 @@ static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, i
 	}
 
 out:
-	closedir(dir);
+	close(iod.dirfd);
 	return ret;
 }
 
-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/8] perf pmu: Switch to io_dir__readdir
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 2/8] perf maps: Switch modules tree walk to io_dir__readdir Ian Rogers
@ 2025-02-22  6:10 ` Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 4/8] perf header: Switch mem topology " Ian Rogers
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/pmu.c  | 46 +++++++++++++++++-------------------------
 tools/perf/util/pmus.c | 30 ++++++++++-----------------
 2 files changed, 30 insertions(+), 46 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ec3878c890a9..e51c17b1ea35 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -13,6 +13,7 @@
 #include <dirent.h>
 #include <api/fs/fs.h>
 #include <api/io.h>
+#include <api/io_dir.h>
 #include <locale.h>
 #include <fnmatch.h>
 #include <math.h>
@@ -195,19 +196,17 @@ static void perf_pmu_format__load(const struct perf_pmu *pmu, struct perf_pmu_fo
  */
 static int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd, bool eager_load)
 {
-	struct dirent *evt_ent;
-	DIR *format_dir;
+	struct io_dirent64 *evt_ent;
+	struct io_dir format_dir;
 	int ret = 0;
 
-	format_dir = fdopendir(dirfd);
-	if (!format_dir)
-		return -EINVAL;
+	io_dir__init(&format_dir, dirfd);
 
-	while ((evt_ent = readdir(format_dir)) != NULL) {
+	while ((evt_ent = io_dir__readdir(&format_dir)) != NULL) {
 		struct perf_pmu_format *format;
 		char *name = evt_ent->d_name;
 
-		if (!strcmp(name, ".") || !strcmp(name, ".."))
+		if (io_dir__is_dir(&format_dir, evt_ent))
 			continue;
 
 		format = perf_pmu__new_format(&pmu->format, name);
@@ -234,7 +233,7 @@ static int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd, bool eager_lo
 		}
 	}
 
-	closedir(format_dir);
+	close(format_dir.dirfd);
 	return ret;
 }
 
@@ -635,14 +634,12 @@ static inline bool pmu_alias_info_file(const char *name)
  */
 static int __pmu_aliases_parse(struct perf_pmu *pmu, int events_dir_fd)
 {
-	struct dirent *evt_ent;
-	DIR *event_dir;
+	struct io_dirent64 *evt_ent;
+	struct io_dir event_dir;
 
-	event_dir = fdopendir(events_dir_fd);
-	if (!event_dir)
-		return -EINVAL;
+	io_dir__init(&event_dir, events_dir_fd);
 
-	while ((evt_ent = readdir(event_dir))) {
+	while ((evt_ent = io_dir__readdir(&event_dir))) {
 		char *name = evt_ent->d_name;
 		int fd;
 		FILE *file;
@@ -674,7 +671,6 @@ static int __pmu_aliases_parse(struct perf_pmu *pmu, int events_dir_fd)
 		fclose(file);
 	}
 
-	closedir(event_dir);
 	pmu->sysfs_aliases_loaded = true;
 	return 0;
 }
@@ -2221,10 +2217,9 @@ static void perf_pmu__del_caps(struct perf_pmu *pmu)
  */
 int perf_pmu__caps_parse(struct perf_pmu *pmu)
 {
-	struct stat st;
 	char caps_path[PATH_MAX];
-	DIR *caps_dir;
-	struct dirent *evt_ent;
+	struct io_dir caps_dir;
+	struct io_dirent64 *evt_ent;
 	int caps_fd;
 
 	if (pmu->caps_initialized)
@@ -2235,24 +2230,21 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
 	if (!perf_pmu__pathname_scnprintf(caps_path, sizeof(caps_path), pmu->name, "caps"))
 		return -1;
 
-	if (stat(caps_path, &st) < 0) {
+	caps_fd = open(caps_path, O_CLOEXEC | O_DIRECTORY | O_RDONLY);
+	if (caps_fd == -1) {
 		pmu->caps_initialized = true;
 		return 0;	/* no error if caps does not exist */
 	}
 
-	caps_dir = opendir(caps_path);
-	if (!caps_dir)
-		return -EINVAL;
-
-	caps_fd = dirfd(caps_dir);
+	io_dir__init(&caps_dir, caps_fd);
 
-	while ((evt_ent = readdir(caps_dir)) != NULL) {
+	while ((evt_ent = io_dir__readdir(&caps_dir)) != NULL) {
 		char *name = evt_ent->d_name;
 		char value[128];
 		FILE *file;
 		int fd;
 
-		if (!strcmp(name, ".") || !strcmp(name, ".."))
+		if (io_dir__is_dir(&caps_dir, evt_ent))
 			continue;
 
 		fd = openat(caps_fd, name, O_RDONLY);
@@ -2274,7 +2266,7 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
 		fclose(file);
 	}
 
-	closedir(caps_dir);
+	close(caps_fd);
 
 	pmu->caps_initialized = true;
 	return pmu->nr_caps;
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 8a0a919415d4..afd59d678fd0 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -3,10 +3,10 @@
 #include <linux/list_sort.h>
 #include <linux/string.h>
 #include <linux/zalloc.h>
+#include <api/io_dir.h>
 #include <subcmd/pager.h>
 #include <sys/types.h>
 #include <ctype.h>
-#include <dirent.h>
 #include <pthread.h>
 #include <string.h>
 #include <unistd.h>
@@ -235,20 +235,16 @@ static void pmu_read_sysfs(unsigned int to_read_types)
 
 	if (to_read_types & (PERF_TOOL_PMU_TYPE_PE_CORE_MASK | PERF_TOOL_PMU_TYPE_PE_OTHER_MASK)) {
 		int fd = perf_pmu__event_source_devices_fd();
-		DIR *dir;
-		struct dirent *dent;
+		struct io_dir dir;
+		struct io_dirent64 *dent;
 		bool core_only = (to_read_types & PERF_TOOL_PMU_TYPE_PE_OTHER_MASK) == 0;
 
 		if (fd < 0)
 			goto skip_pe_pmus;
 
-		dir = fdopendir(fd);
-		if (!dir) {
-			close(fd);
-			goto skip_pe_pmus;
-		}
+		io_dir__init(&dir, fd);
 
-		while ((dent = readdir(dir))) {
+		while ((dent = io_dir__readdir(&dir)) != NULL) {
 			if (!strcmp(dent->d_name, ".") || !strcmp(dent->d_name, ".."))
 				continue;
 			if (core_only && !is_pmu_core(dent->d_name))
@@ -257,7 +253,7 @@ static void pmu_read_sysfs(unsigned int to_read_types)
 			perf_pmu__find2(fd, dent->d_name);
 		}
 
-		closedir(dir);
+		close(fd);
 	}
 skip_pe_pmus:
 	if ((to_read_types & PERF_TOOL_PMU_TYPE_PE_CORE_MASK) && list_empty(&core_pmus)) {
@@ -721,8 +717,8 @@ bool perf_pmus__supports_extended_type(void)
 char *perf_pmus__default_pmu_name(void)
 {
 	int fd;
-	DIR *dir;
-	struct dirent *dent;
+	struct io_dir dir;
+	struct io_dirent64 *dent;
 	char *result = NULL;
 
 	if (!list_empty(&core_pmus))
@@ -732,13 +728,9 @@ char *perf_pmus__default_pmu_name(void)
 	if (fd < 0)
 		return strdup("cpu");
 
-	dir = fdopendir(fd);
-	if (!dir) {
-		close(fd);
-		return strdup("cpu");
-	}
+	io_dir__init(&dir, fd);
 
-	while ((dent = readdir(dir))) {
+	while ((dent = io_dir__readdir(&dir)) != NULL) {
 		if (!strcmp(dent->d_name, ".") || !strcmp(dent->d_name, ".."))
 			continue;
 		if (is_pmu_core(dent->d_name)) {
@@ -747,7 +739,7 @@ char *perf_pmus__default_pmu_name(void)
 		}
 	}
 
-	closedir(dir);
+	close(fd);
 	return result ?: strdup("cpu");
 }
 
-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 4/8] perf header: Switch mem topology to io_dir__readdir
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
                   ` (2 preceding siblings ...)
  2025-02-22  6:10 ` [PATCH v3 3/8] perf pmu: Switch " Ian Rogers
@ 2025-02-22  6:10 ` Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 5/8] perf events: Remove scandir in thread synthesis Ian Rogers
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

Switch memory_node__read and build_mem_topology from opendir/readdir
to io_dir__readdir, with smaller stack allocations. Reduces peak
memory consumption of perf record by 10kb.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/header.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index d06aa86352d3..1900965f8752 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -44,6 +44,7 @@
 #include "build-id.h"
 #include "data.h"
 #include <api/fs/fs.h>
+#include <api/io_dir.h>
 #include "asm/bug.h"
 #include "tool.h"
 #include "time-utils.h"
@@ -1311,11 +1312,11 @@ static int memory_node__read(struct memory_node *n, unsigned long idx)
 {
 	unsigned int phys, size = 0;
 	char path[PATH_MAX];
-	struct dirent *ent;
-	DIR *dir;
+	struct io_dirent64 *ent;
+	struct io_dir dir;
 
 #define for_each_memory(mem, dir)					\
-	while ((ent = readdir(dir)))					\
+	while ((ent = io_dir__readdir(&dir)) != NULL)			\
 		if (strcmp(ent->d_name, ".") &&				\
 		    strcmp(ent->d_name, "..") &&			\
 		    sscanf(ent->d_name, "memory%u", &mem) == 1)
@@ -1324,9 +1325,9 @@ static int memory_node__read(struct memory_node *n, unsigned long idx)
 		  "%s/devices/system/node/node%lu",
 		  sysfs__mountpoint(), idx);
 
-	dir = opendir(path);
-	if (!dir) {
-		pr_warning("failed: can't open memory sysfs data\n");
+	io_dir__init(&dir, open(path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+	if (dir.dirfd < 0) {
+		pr_warning("failed: can't open memory sysfs data '%s'\n", path);
 		return -1;
 	}
 
@@ -1338,20 +1339,20 @@ static int memory_node__read(struct memory_node *n, unsigned long idx)
 
 	n->set = bitmap_zalloc(size);
 	if (!n->set) {
-		closedir(dir);
+		close(dir.dirfd);
 		return -ENOMEM;
 	}
 
 	n->node = idx;
 	n->size = size;
 
-	rewinddir(dir);
+	io_dir__rewinddir(&dir);
 
 	for_each_memory(phys, dir) {
 		__set_bit(phys, n->set);
 	}
 
-	closedir(dir);
+	close(dir.dirfd);
 	return 0;
 }
 
@@ -1374,8 +1375,8 @@ static int memory_node__sort(const void *a, const void *b)
 static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
 {
 	char path[PATH_MAX];
-	struct dirent *ent;
-	DIR *dir;
+	struct io_dirent64 *ent;
+	struct io_dir dir;
 	int ret = 0;
 	size_t cnt = 0, size = 0;
 	struct memory_node *nodes = NULL;
@@ -1383,14 +1384,14 @@ static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
 	scnprintf(path, PATH_MAX, "%s/devices/system/node/",
 		  sysfs__mountpoint());
 
-	dir = opendir(path);
-	if (!dir) {
+	io_dir__init(&dir, open(path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+	if (dir.dirfd < 0) {
 		pr_debug2("%s: couldn't read %s, does this arch have topology information?\n",
 			  __func__, path);
 		return -1;
 	}
 
-	while (!ret && (ent = readdir(dir))) {
+	while (!ret && (ent = io_dir__readdir(&dir))) {
 		unsigned int idx;
 		int r;
 
@@ -1419,7 +1420,7 @@ static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
 			cnt += 1;
 	}
 out:
-	closedir(dir);
+	close(dir.dirfd);
 	if (!ret) {
 		*cntp = cnt;
 		*nodesp = nodes;
-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 5/8] perf events: Remove scandir in thread synthesis
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
                   ` (3 preceding siblings ...)
  2025-02-22  6:10 ` [PATCH v3 4/8] perf header: Switch mem topology " Ian Rogers
@ 2025-02-22  6:10 ` Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 6/8] perf parse-events: Switch tracepoints to io_dir__readdir Ian Rogers
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

This avoids scanddir reading the directory into memory that's
allocated and instead allocates on the stack.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/synthetic-events.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 2dfc4260d36d..2fc4d0537840 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -38,6 +38,7 @@
 #include <uapi/linux/mman.h> /* To get things like MAP_HUGETLB even on older libc headers */
 #include <api/fs/fs.h>
 #include <api/io.h>
+#include <api/io_dir.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
@@ -767,10 +768,10 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 				      bool needs_mmap, bool mmap_data)
 {
 	char filename[PATH_MAX];
-	struct dirent **dirent;
+	struct io_dir iod;
+	struct io_dirent64 *dent;
 	pid_t tgid, ppid;
 	int rc = 0;
-	int i, n;
 
 	/* special case: only send one comm event using passed in pid */
 	if (!full) {
@@ -802,16 +803,19 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 	snprintf(filename, sizeof(filename), "%s/proc/%d/task",
 		 machine->root_dir, pid);
 
-	n = scandir(filename, &dirent, filter_task, NULL);
-	if (n < 0)
-		return n;
+	io_dir__init(&iod, open(filename, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+	if (iod.dirfd < 0)
+		return -1;
 
-	for (i = 0; i < n; i++) {
+	while ((dent = io_dir__readdir(&iod)) != NULL) {
 		char *end;
 		pid_t _pid;
 		bool kernel_thread = false;
 
-		_pid = strtol(dirent[i]->d_name, &end, 10);
+		if (!isdigit(dent->d_name[0]))
+			continue;
+
+		_pid = strtol(dent->d_name, &end, 10);
 		if (*end)
 			continue;
 
@@ -845,9 +849,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 		}
 	}
 
-	for (i = 0; i < n; i++)
-		zfree(&dirent[i]);
-	free(dirent);
+	close(iod.dirfd);
 
 	return rc;
 }
-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 6/8] perf parse-events: Switch tracepoints to io_dir__readdir
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
                   ` (4 preceding siblings ...)
  2025-02-22  6:10 ` [PATCH v3 5/8] perf events: Remove scandir in thread synthesis Ian Rogers
@ 2025-02-22  6:10 ` Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 7/8] perf hwmon_pmu: Switch event discovery " Ian Rogers
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/parse-events.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6c36b98875bc..35e48fe56dfa 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -17,6 +17,7 @@
 #include "strbuf.h"
 #include "debug.h"
 #include <api/fs/tracing_path.h>
+#include <api/io_dir.h>
 #include <perf/cpumap.h>
 #include <util/parse-events-bison.h>
 #include <util/parse-events-flex.h>
@@ -554,8 +555,8 @@ static int add_tracepoint_multi_event(struct parse_events_state *parse_state,
 				      struct parse_events_terms *head_config, YYLTYPE *loc)
 {
 	char *evt_path;
-	struct dirent *evt_ent;
-	DIR *evt_dir;
+	struct io_dirent64 *evt_ent;
+	struct io_dir evt_dir;
 	int ret = 0, found = 0;
 
 	evt_path = get_events_file(sys_name);
@@ -563,14 +564,14 @@ static int add_tracepoint_multi_event(struct parse_events_state *parse_state,
 		tracepoint_error(err, errno, sys_name, evt_name, loc->first_column);
 		return -1;
 	}
-	evt_dir = opendir(evt_path);
-	if (!evt_dir) {
+	io_dir__init(&evt_dir, open(evt_path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+	if (evt_dir.dirfd < 0) {
 		put_events_file(evt_path);
 		tracepoint_error(err, errno, sys_name, evt_name, loc->first_column);
 		return -1;
 	}
 
-	while (!ret && (evt_ent = readdir(evt_dir))) {
+	while (!ret && (evt_ent = io_dir__readdir(&evt_dir))) {
 		if (!strcmp(evt_ent->d_name, ".")
 		    || !strcmp(evt_ent->d_name, "..")
 		    || !strcmp(evt_ent->d_name, "enable")
@@ -592,7 +593,7 @@ static int add_tracepoint_multi_event(struct parse_events_state *parse_state,
 	}
 
 	put_events_file(evt_path);
-	closedir(evt_dir);
+	close(evt_dir.dirfd);
 	return ret;
 }
 
@@ -615,17 +616,23 @@ static int add_tracepoint_multi_sys(struct parse_events_state *parse_state,
 				    struct parse_events_error *err,
 				    struct parse_events_terms *head_config, YYLTYPE *loc)
 {
-	struct dirent *events_ent;
-	DIR *events_dir;
+	struct io_dirent64 *events_ent;
+	struct io_dir events_dir;
 	int ret = 0;
+	char *events_dir_path = get_tracing_file("events");
 
-	events_dir = tracing_events__opendir();
-	if (!events_dir) {
+	if (!events_dir_path) {
+		tracepoint_error(err, errno, sys_name, evt_name, loc->first_column);
+		return -1;
+	}
+	io_dir__init(&events_dir, open(events_dir_path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+	put_events_file(events_dir_path);
+	if (events_dir.dirfd < 0) {
 		tracepoint_error(err, errno, sys_name, evt_name, loc->first_column);
 		return -1;
 	}
 
-	while (!ret && (events_ent = readdir(events_dir))) {
+	while (!ret && (events_ent = io_dir__readdir(&events_dir))) {
 		if (!strcmp(events_ent->d_name, ".")
 		    || !strcmp(events_ent->d_name, "..")
 		    || !strcmp(events_ent->d_name, "enable")
@@ -639,8 +646,7 @@ static int add_tracepoint_multi_sys(struct parse_events_state *parse_state,
 		ret = add_tracepoint_event(parse_state, list, events_ent->d_name,
 					   evt_name, err, head_config, loc);
 	}
-
-	closedir(events_dir);
+	close(events_dir.dirfd);
 	return ret;
 }
 
-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 7/8] perf hwmon_pmu: Switch event discovery to io_dir__readdir
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
                   ` (5 preceding siblings ...)
  2025-02-22  6:10 ` [PATCH v3 6/8] perf parse-events: Switch tracepoints to io_dir__readdir Ian Rogers
@ 2025-02-22  6:10 ` Ian Rogers
  2025-02-22  6:10 ` [PATCH v3 8/8] perf machine: Reuse module path buffer Ian Rogers
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/hwmon_pmu.c | 42 +++++++++++++++----------------------
 1 file changed, 17 insertions(+), 25 deletions(-)

diff --git a/tools/perf/util/hwmon_pmu.c b/tools/perf/util/hwmon_pmu.c
index acd889b2462f..3cce77fc8004 100644
--- a/tools/perf/util/hwmon_pmu.c
+++ b/tools/perf/util/hwmon_pmu.c
@@ -11,13 +11,13 @@
 #include <sys/types.h>
 #include <assert.h>
 #include <ctype.h>
-#include <dirent.h>
 #include <fcntl.h>
 #include <stddef.h>
 #include <stdlib.h>
 #include <string.h>
 #include <api/fs/fs.h>
 #include <api/io.h>
+#include <api/io_dir.h>
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/zalloc.h>
@@ -235,31 +235,22 @@ static void fix_name(char *p)
 
 static int hwmon_pmu__read_events(struct hwmon_pmu *pmu)
 {
-	DIR *dir;
-	struct dirent *ent;
-	int dup_fd, err = 0;
+	int err = 0;
 	struct hashmap_entry *cur, *tmp;
 	size_t bkt;
+	struct io_dirent64 *ent;
+	struct io_dir dir;
 
 	if (pmu->pmu.sysfs_aliases_loaded)
 		return 0;
 
-	/*
-	 * Use a dup-ed fd as closedir will close it. Use openat so that the
-	 * directory contents are refreshed.
-	 */
-	dup_fd = openat(pmu->hwmon_dir_fd, ".", O_DIRECTORY);
+	/* Use openat so that the directory contents are refreshed. */
+	io_dir__init(&dir, openat(pmu->hwmon_dir_fd, ".", O_CLOEXEC | O_DIRECTORY | O_RDONLY));
 
-	if (dup_fd == -1)
-		return -ENOMEM;
+	if (dir.dirfd < 0)
+		return -ENOENT;
 
-	dir = fdopendir(dup_fd);
-	if (!dir) {
-		close(dup_fd);
-		return -ENOMEM;
-	}
-
-	while ((ent = readdir(dir)) != NULL) {
+	while ((ent = io_dir__readdir(&dir)) != NULL) {
 		enum hwmon_type type;
 		int number;
 		enum hwmon_item item;
@@ -347,7 +338,7 @@ static int hwmon_pmu__read_events(struct hwmon_pmu *pmu)
 	pmu->pmu.sysfs_aliases_loaded = true;
 
 err_out:
-	closedir(dir);
+	close(dir.dirfd);
 	return err;
 }
 
@@ -702,8 +693,8 @@ int hwmon_pmu__check_alias(struct parse_events_terms *terms, struct perf_pmu_inf
 int perf_pmus__read_hwmon_pmus(struct list_head *pmus)
 {
 	char *line = NULL;
-	DIR *class_hwmon_dir;
-	struct dirent *class_hwmon_ent;
+	struct io_dirent64 *class_hwmon_ent;
+	struct io_dir class_hwmon_dir;
 	char buf[PATH_MAX];
 	const char *sysfs = sysfs__mountpoint();
 
@@ -711,11 +702,12 @@ int perf_pmus__read_hwmon_pmus(struct list_head *pmus)
 		return 0;
 
 	scnprintf(buf, sizeof(buf), "%s/class/hwmon/", sysfs);
-	class_hwmon_dir = opendir(buf);
-	if (!class_hwmon_dir)
+	io_dir__init(&class_hwmon_dir, open(buf, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+
+	if (class_hwmon_dir.dirfd < 0)
 		return 0;
 
-	while ((class_hwmon_ent = readdir(class_hwmon_dir)) != NULL) {
+	while ((class_hwmon_ent = io_dir__readdir(&class_hwmon_dir)) != NULL) {
 		size_t line_len;
 		int hwmon_dir, name_fd;
 		struct io io;
@@ -745,7 +737,7 @@ int perf_pmus__read_hwmon_pmus(struct list_head *pmus)
 		close(name_fd);
 	}
 	free(line);
-	closedir(class_hwmon_dir);
+	close(class_hwmon_dir.dirfd);
 	return 0;
 }
 
-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 8/8] perf machine: Reuse module path buffer
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
                   ` (6 preceding siblings ...)
  2025-02-22  6:10 ` [PATCH v3 7/8] perf hwmon_pmu: Switch event discovery " Ian Rogers
@ 2025-02-22  6:10 ` Ian Rogers
  2025-02-25  0:28 ` [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Namhyung Kim
  2025-02-25 21:58 ` Namhyung Kim
  9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22  6:10 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
	James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
	Michael Petlan, Anne Macedo, Dominique Martinet,
	Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
	Krzysztof Łopatowski
  Cc: Ian Rogers

Rather than copying the path and appending the directory entry in a
fresh path buffer, append to the path at the end of where it is for
the recursion level. This saves a PATH_MAX buffer per recursion level
and some unnecessary copying.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/machine.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e394c630e3a2..3f1faf94198d 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1338,22 +1338,23 @@ static int maps__set_module_path(struct maps *maps, const char *path, struct kmo
 	return 0;
 }
 
-static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, int depth)
+static int maps__set_modules_path_dir(struct maps *maps, char *path, size_t path_size, int depth)
 {
 	struct io_dirent64 *dent;
 	struct io_dir iod;
+	size_t root_len = strlen(path);
 	int ret = 0;
 
-	io_dir__init(&iod, open(dir_name, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+	io_dir__init(&iod, open(path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
 	if (iod.dirfd < 0) {
-		pr_debug("%s: cannot open %s dir\n", __func__, dir_name);
+		pr_debug("%s: cannot open %s dir\n", __func__, path);
 		return -1;
 	}
-
+	/* Bounds check, should never happen. */
+	if (root_len >= path_size)
+		return -1;
+	path[root_len++] = '/';
 	while ((dent = io_dir__readdir(&iod)) != NULL) {
-		char path[PATH_MAX];
-
-		path__join(path, sizeof(path), dir_name, dent->d_name);
 		if (io_dir__is_dir(&iod, dent)) {
 			if (!strcmp(dent->d_name, ".") ||
 			    !strcmp(dent->d_name, ".."))
@@ -1366,7 +1367,12 @@ static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, i
 					continue;
 			}
 
-			ret = maps__set_modules_path_dir(maps, path, depth + 1);
+			/* Bounds check, should never happen. */
+			if (root_len + strlen(dent->d_name) >= path_size)
+				continue;
+
+			strcpy(path + root_len, dent->d_name);
+			ret = maps__set_modules_path_dir(maps, path, path_size, depth + 1);
 			if (ret < 0)
 				goto out;
 		} else {
@@ -1376,9 +1382,14 @@ static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, i
 			if (ret)
 				goto out;
 
-			if (m.kmod)
-				ret = maps__set_module_path(maps, path, &m);
+			if (m.kmod) {
+				/* Bounds check, should never happen. */
+				if (root_len + strlen(dent->d_name) < path_size) {
+					strcpy(path + root_len, dent->d_name);
+					ret = maps__set_module_path(maps, path, &m);
 
+				}
+			}
 			zfree(&m.name);
 
 			if (ret)
@@ -1404,7 +1415,8 @@ static int machine__set_modules_path(struct machine *machine)
 		 machine->root_dir, version);
 	free(version);
 
-	return maps__set_modules_path_dir(machine__kernel_maps(machine), modules_path, 0);
+	return maps__set_modules_path_dir(machine__kernel_maps(machine),
+					  modules_path, sizeof(modules_path), 0);
 }
 int __weak arch__fix_module_text_start(u64 *start __maybe_unused,
 				u64 *size __maybe_unused,
-- 
2.48.1.658.g4767266eb4-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
                   ` (7 preceding siblings ...)
  2025-02-22  6:10 ` [PATCH v3 8/8] perf machine: Reuse module path buffer Ian Rogers
@ 2025-02-25  0:28 ` Namhyung Kim
  2025-02-25  0:30   ` Namhyung Kim
  2025-02-25 21:58 ` Namhyung Kim
  9 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2025-02-25  0:28 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Sam James, Jesper Juhl, James Clark, Zhongqiu Han,
	Yicong Yang, Thomas Richter, Michael Petlan, Anne Macedo,
	Dominique Martinet, Jean-Philippe Romain, Junhao He, linux-kernel,
	linux-perf-users, Krzysztof Łopatowski

Hi Ian,

On Fri, Feb 21, 2025 at 10:10:05PM -0800, Ian Rogers wrote:
> glibc's opendir allocates a minimum of 32kb, when called recursively
> for a directory tree the memory consumption can add up - nearly 300kb
> during perf start-up when processing modules. Add a stack allocated
> variant of readdir sized a little more than 1kb

It's still small and hard to verify.  I've run the following command
before and after the change but didn't see a difference.

  $ sudo time -f %Mk ./perf record -a true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.757 MB perf.data (563 samples) ]
  74724k

According to man time(1), %M is for max RSS.

Thanks,
Namhyung

> 
> v3: Rebase on top of Krzysztof Łopatowski's work. Add additional
>     defines for SYS_getdents64 on all other architectures if its
>     definition is missing. Add a patch to further reduce the
>     stack/memory usage in machine__set_modules_path_dir by appending
>     to a buffer rather than creating a copy.
> v2: Remove the feature test and always use a perf supplied getdents64
>     to workaround an Alpine Linux issue in v1:
>     https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/
>     As suggested by Krzysztof Łopatowski
>     <krzysztof.m.lopatowski@gmail.com> who also pointed to the perf
>     trace performance improvements in start-up time eliminating stat
>     calls can achieve:
>     https://lore.kernel.org/lkml/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com/
>     Convert parse-events and hwmon_pmu to use io_dir.
> v1: This was previously part of the memory saving change set:
>     https://lore.kernel.org/lkml/20231127220902.1315692-1-irogers@google.com/
>     It is separated here and a feature check and syscall workaround
>     for missing getdents64 added.
> 
> Ian Rogers (8):
>   tools lib api: Add io_dir an allocation free readdir alternative
>   perf maps: Switch modules tree walk to io_dir__readdir
>   perf pmu: Switch to io_dir__readdir
>   perf header: Switch mem topology to io_dir__readdir
>   perf events: Remove scandir in thread synthesis
>   perf parse-events: Switch tracepoints to io_dir__readdir
>   perf hwmon_pmu: Switch event discovery to io_dir__readdir
>   perf machine: Reuse module path buffer
> 
>  tools/lib/api/Makefile             |   2 +-
>  tools/lib/api/io_dir.h             | 104 +++++++++++++++++++++++++++++
>  tools/perf/util/header.c           |  31 ++++-----
>  tools/perf/util/hwmon_pmu.c        |  42 +++++-------
>  tools/perf/util/machine.c          |  57 ++++++++--------
>  tools/perf/util/parse-events.c     |  32 +++++----
>  tools/perf/util/pmu.c              |  46 ++++++-------
>  tools/perf/util/pmus.c             |  30 +++------
>  tools/perf/util/synthetic-events.c |  22 +++---
>  9 files changed, 229 insertions(+), 137 deletions(-)
>  create mode 100644 tools/lib/api/io_dir.h
> 
> -- 
> 2.48.1.658.g4767266eb4-goog
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
  2025-02-25  0:28 ` [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Namhyung Kim
@ 2025-02-25  0:30   ` Namhyung Kim
  2025-02-25  1:25     ` Ian Rogers
  0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2025-02-25  0:30 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Sam James, Jesper Juhl, James Clark, Zhongqiu Han,
	Yicong Yang, Thomas Richter, Michael Petlan, Anne Macedo,
	Dominique Martinet, Jean-Philippe Romain, Junhao He, linux-kernel,
	linux-perf-users, Krzysztof Łopatowski

On Mon, Feb 24, 2025 at 04:28:24PM -0800, Namhyung Kim wrote:
> Hi Ian,
> 
> On Fri, Feb 21, 2025 at 10:10:05PM -0800, Ian Rogers wrote:
> > glibc's opendir allocates a minimum of 32kb, when called recursively
> > for a directory tree the memory consumption can add up - nearly 300kb
> > during perf start-up when processing modules. Add a stack allocated
> > variant of readdir sized a little more than 1kb
> 
> It's still small and hard to verify.  I've run the following command
> before and after the change but didn't see a difference.
> 
>   $ sudo time -f %Mk ./perf record -a true
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 1.757 MB perf.data (563 samples) ]
>   74724k
> 
> According to man time(1), %M is for max RSS.

But anyway, it looks ok and build is fine now.

Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
  2025-02-25  0:30   ` Namhyung Kim
@ 2025-02-25  1:25     ` Ian Rogers
  0 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-25  1:25 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Sam James, Jesper Juhl, James Clark, Zhongqiu Han,
	Yicong Yang, Thomas Richter, Michael Petlan, Anne Macedo,
	Dominique Martinet, Jean-Philippe Romain, Junhao He, linux-kernel,
	linux-perf-users, Krzysztof Łopatowski

On Mon, Feb 24, 2025 at 4:30 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Mon, Feb 24, 2025 at 04:28:24PM -0800, Namhyung Kim wrote:
> > Hi Ian,
> >
> > On Fri, Feb 21, 2025 at 10:10:05PM -0800, Ian Rogers wrote:
> > > glibc's opendir allocates a minimum of 32kb, when called recursively
> > > for a directory tree the memory consumption can add up - nearly 300kb
> > > during perf start-up when processing modules. Add a stack allocated
> > > variant of readdir sized a little more than 1kb
> >
> > It's still small and hard to verify.  I've run the following command
> > before and after the change but didn't see a difference.
> >
> >   $ sudo time -f %Mk ./perf record -a true
> >   [ perf record: Woken up 1 times to write data ]
> >   [ perf record: Captured and wrote 1.757 MB perf.data (563 samples) ]
> >   74724k
> >
> > According to man time(1), %M is for max RSS.
>
> But anyway, it looks ok and build is fine now.

Thanks for the testing! So doing a regular build I could repeat what
you saw - basically the opendir isn't contributing to maxrss as the
BPF handling and so gets lost. Doing a minimal static build, that
loses BPF support, things were clearer but not as good as I'd
originally measured, 10880k reduced to 10696k - a 184k saving (raw
data below). Perhaps opendir got better or perhaps there are fewer
kernel modules. I tried heaptrack but unfortunately it wasn't able to
instrument the allocations in glibc's allocdir function (it reported
0). Originally heaptrack showing opendir allocations were significant
for `perf record` was what led me to this code. At the moment BPF
event synthesis and topdown event checking look particularly
expensive.

Thanks,
Ian

Before:
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (107 samples) ]
10880k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (106 samples) ]
10880k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (109 samples) ]
10880k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (102 samples) ]
10880k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (106 samples) ]
11008k

After:
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.473 MB perf.data (106 samples) ]
10820k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.473 MB perf.data (109 samples) ]
10696k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.473 MB perf.data (98 samples) ]
10696k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (98 samples) ]
10696k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.490 MB perf.data (110 samples) ]
10696k

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
  2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
                   ` (8 preceding siblings ...)
  2025-02-25  0:28 ` [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Namhyung Kim
@ 2025-02-25 21:58 ` Namhyung Kim
  9 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2025-02-25 21:58 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Sam James, Jesper Juhl, James Clark, Zhongqiu Han,
	Yicong Yang, Thomas Richter, Michael Petlan, Anne Macedo,
	Dominique Martinet, Jean-Philippe Romain, Junhao He, linux-kernel,
	linux-perf-users, Krzysztof Łopatowski, Ian Rogers

On Fri, 21 Feb 2025 22:10:05 -0800, Ian Rogers wrote:
> glibc's opendir allocates a minimum of 32kb, when called recursively
> for a directory tree the memory consumption can add up - nearly 300kb
> during perf start-up when processing modules. Add a stack allocated
> variant of readdir sized a little more than 1kb
> 
> v3: Rebase on top of Krzysztof Łopatowski's work. Add additional
>     defines for SYS_getdents64 on all other architectures if its
>     definition is missing. Add a patch to further reduce the
>     stack/memory usage in machine__set_modules_path_dir by appending
>     to a buffer rather than creating a copy.
> v2: Remove the feature test and always use a perf supplied getdents64
>     to workaround an Alpine Linux issue in v1:
>     https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/
>     As suggested by Krzysztof Łopatowski
>     <krzysztof.m.lopatowski@gmail.com> who also pointed to the perf
>     trace performance improvements in start-up time eliminating stat
>     calls can achieve:
>     https://lore.kernel.org/lkml/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com/
>     Convert parse-events and hwmon_pmu to use io_dir.
> v1: This was previously part of the memory saving change set:
>     https://lore.kernel.org/lkml/20231127220902.1315692-1-irogers@google.com/
>     It is separated here and a feature check and syscall workaround
>     for missing getdents64 added.
> 
> [...]
Applied to perf-tools-next, thanks!

Best regards,
Namhyung



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-02-25 21:58 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-22  6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
2025-02-22  6:10 ` [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative Ian Rogers
2025-02-22  6:10 ` [PATCH v3 2/8] perf maps: Switch modules tree walk to io_dir__readdir Ian Rogers
2025-02-22  6:10 ` [PATCH v3 3/8] perf pmu: Switch " Ian Rogers
2025-02-22  6:10 ` [PATCH v3 4/8] perf header: Switch mem topology " Ian Rogers
2025-02-22  6:10 ` [PATCH v3 5/8] perf events: Remove scandir in thread synthesis Ian Rogers
2025-02-22  6:10 ` [PATCH v3 6/8] perf parse-events: Switch tracepoints to io_dir__readdir Ian Rogers
2025-02-22  6:10 ` [PATCH v3 7/8] perf hwmon_pmu: Switch event discovery " Ian Rogers
2025-02-22  6:10 ` [PATCH v3 8/8] perf machine: Reuse module path buffer Ian Rogers
2025-02-25  0:28 ` [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Namhyung Kim
2025-02-25  0:30   ` Namhyung Kim
2025-02-25  1:25     ` Ian Rogers
2025-02-25 21:58 ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).