* [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
@ 2025-02-22 6:10 Ian Rogers
2025-02-22 6:10 ` [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative Ian Rogers
` (9 more replies)
0 siblings, 10 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
glibc's opendir allocates a minimum of 32kb, when called recursively
for a directory tree the memory consumption can add up - nearly 300kb
during perf start-up when processing modules. Add a stack allocated
variant of readdir sized a little more than 1kb
v3: Rebase on top of Krzysztof Łopatowski's work. Add additional
defines for SYS_getdents64 on all other architectures if its
definition is missing. Add a patch to further reduce the
stack/memory usage in machine__set_modules_path_dir by appending
to a buffer rather than creating a copy.
v2: Remove the feature test and always use a perf supplied getdents64
to workaround an Alpine Linux issue in v1:
https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/
As suggested by Krzysztof Łopatowski
<krzysztof.m.lopatowski@gmail.com> who also pointed to the perf
trace performance improvements in start-up time eliminating stat
calls can achieve:
https://lore.kernel.org/lkml/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com/
Convert parse-events and hwmon_pmu to use io_dir.
v1: This was previously part of the memory saving change set:
https://lore.kernel.org/lkml/20231127220902.1315692-1-irogers@google.com/
It is separated here and a feature check and syscall workaround
for missing getdents64 added.
Ian Rogers (8):
tools lib api: Add io_dir an allocation free readdir alternative
perf maps: Switch modules tree walk to io_dir__readdir
perf pmu: Switch to io_dir__readdir
perf header: Switch mem topology to io_dir__readdir
perf events: Remove scandir in thread synthesis
perf parse-events: Switch tracepoints to io_dir__readdir
perf hwmon_pmu: Switch event discovery to io_dir__readdir
perf machine: Reuse module path buffer
tools/lib/api/Makefile | 2 +-
tools/lib/api/io_dir.h | 104 +++++++++++++++++++++++++++++
tools/perf/util/header.c | 31 ++++-----
tools/perf/util/hwmon_pmu.c | 42 +++++-------
tools/perf/util/machine.c | 57 ++++++++--------
tools/perf/util/parse-events.c | 32 +++++----
tools/perf/util/pmu.c | 46 ++++++-------
tools/perf/util/pmus.c | 30 +++------
tools/perf/util/synthetic-events.c | 22 +++---
9 files changed, 229 insertions(+), 137 deletions(-)
create mode 100644 tools/lib/api/io_dir.h
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
@ 2025-02-22 6:10 ` Ian Rogers
2025-02-22 6:10 ` [PATCH v3 2/8] perf maps: Switch modules tree walk to io_dir__readdir Ian Rogers
` (8 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
glibc's opendir allocates a minimum of 32kb, when called recursively
for a directory tree the memory consumption can add up - nearly 300kb
during perf start-up when processing modules. Add a stack allocated
variant of readdir sized a little more than 1kb.
As getdents64 may be missing from libc, add support using syscall. As
the system call number maybe missing, add #defines for those.
Note, an earlier version of this patch had a feature test for
getdents64 but there were problems on certains distros where
getdents64 would be #define renamed to getdents breaking the code. The
syscall use was made uncondtional to work around this. There is
context in:
https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/lib/api/Makefile | 2 +-
tools/lib/api/io_dir.h | 104 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 105 insertions(+), 1 deletion(-)
create mode 100644 tools/lib/api/io_dir.h
diff --git a/tools/lib/api/Makefile b/tools/lib/api/Makefile
index 7f6396087b46..8665c799e0fa 100644
--- a/tools/lib/api/Makefile
+++ b/tools/lib/api/Makefile
@@ -95,7 +95,7 @@ install_lib: $(LIBFILE)
$(call do_install_mkdir,$(libdir_SQ)); \
cp -fpR $(LIBFILE) $(DESTDIR)$(libdir_SQ)
-HDRS := cpu.h debug.h io.h
+HDRS := cpu.h debug.h io.h io_dir.h
FD_HDRS := fd/array.h
FS_HDRS := fs/fs.h fs/tracing_path.h
INSTALL_HDRS_PFX := $(DESTDIR)$(prefix)/include/api
diff --git a/tools/lib/api/io_dir.h b/tools/lib/api/io_dir.h
new file mode 100644
index 000000000000..aab73393d2a2
--- /dev/null
+++ b/tools/lib/api/io_dir.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+/*
+ * Lightweight directory reading library.
+ */
+#ifndef __API_IO_DIR__
+#define __API_IO_DIR__
+
+#include <dirent.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/stat.h>
+#include <sys/syscall.h>
+
+#if !defined(SYS_getdents64)
+#if defined(__x86_64__) || defined(__arm__)
+ #define SYS_getdents64 217
+#elif defined(__i386__) || defined(__s390x__) || defined(__sh__)
+ #define SYS_getdents64 220
+#elif defined(__alpha__)
+ #define SYS_getdents64 377
+#elif defined(__mips__)
+ #define SYS_getdents64 308
+#elif defined(__powerpc64__) || defined(__powerpc__)
+ #define SYS_getdents64 202
+#elif defined(__sparc64__) || defined(__sparc__)
+ #define SYS_getdents64 154
+#elif defined(__xtensa__)
+ #define SYS_getdents64 60
+#else
+ #define SYS_getdents64 61
+#endif
+#endif /* !defined(SYS_getdents64) */
+
+static inline ssize_t perf_getdents64(int fd, void *dirp, size_t count)
+{
+#ifdef MEMORY_SANITIZER
+ memset(dirp, 0, count);
+#endif
+ return syscall(SYS_getdents64, fd, dirp, count);
+}
+
+struct io_dirent64 {
+ ino64_t d_ino; /* 64-bit inode number */
+ off64_t d_off; /* 64-bit offset to next structure */
+ unsigned short d_reclen; /* Size of this dirent */
+ unsigned char d_type; /* File type */
+ char d_name[NAME_MAX + 1]; /* Filename (null-terminated) */
+};
+
+struct io_dir {
+ int dirfd;
+ ssize_t available_bytes;
+ struct io_dirent64 *next;
+ struct io_dirent64 buff[4];
+};
+
+static inline void io_dir__init(struct io_dir *iod, int dirfd)
+{
+ iod->dirfd = dirfd;
+ iod->available_bytes = 0;
+}
+
+static inline void io_dir__rewinddir(struct io_dir *iod)
+{
+ lseek(iod->dirfd, 0, SEEK_SET);
+ iod->available_bytes = 0;
+}
+
+static inline struct io_dirent64 *io_dir__readdir(struct io_dir *iod)
+{
+ struct io_dirent64 *entry;
+
+ if (iod->available_bytes <= 0) {
+ ssize_t rc = perf_getdents64(iod->dirfd, iod->buff, sizeof(iod->buff));
+
+ if (rc <= 0)
+ return NULL;
+ iod->available_bytes = rc;
+ iod->next = iod->buff;
+ }
+ entry = iod->next;
+ iod->next = (struct io_dirent64 *)((char *)entry + entry->d_reclen);
+ iod->available_bytes -= entry->d_reclen;
+ return entry;
+}
+
+static inline bool io_dir__is_dir(const struct io_dir *iod, struct io_dirent64 *dent)
+{
+ if (dent->d_type == DT_UNKNOWN) {
+ struct stat st;
+
+ if (fstatat(iod->dirfd, dent->d_name, &st, /*flags=*/0))
+ return false;
+
+ if (S_ISDIR(st.st_mode)) {
+ dent->d_type = DT_DIR;
+ return true;
+ }
+ }
+ return dent->d_type == DT_DIR;
+}
+
+#endif /* __API_IO_DIR__ */
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 2/8] perf maps: Switch modules tree walk to io_dir__readdir
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
2025-02-22 6:10 ` [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative Ian Rogers
@ 2025-02-22 6:10 ` Ian Rogers
2025-02-22 6:10 ` [PATCH v3 3/8] perf pmu: Switch " Ian Rogers
` (7 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
Compared to glibc's opendir/readdir this lowers the max RSS of perf
record by 1.8MB on a Debian machine.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/machine.c | 25 ++++++++-----------------
1 file changed, 8 insertions(+), 17 deletions(-)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 316f0879e5e0..e394c630e3a2 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -37,6 +37,7 @@
#include <internal/lib.h> // page_size
#include "cgroup.h"
#include "arm64-frame-pointer-unwind-support.h"
+#include <api/io_dir.h>
#include <linux/ctype.h>
#include <symbol/kallsyms.h>
@@ -1339,31 +1340,21 @@ static int maps__set_module_path(struct maps *maps, const char *path, struct kmo
static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, int depth)
{
- const struct dirent *dent;
- DIR *dir = opendir(dir_name);
+ struct io_dirent64 *dent;
+ struct io_dir iod;
int ret = 0;
- if (!dir) {
+ io_dir__init(&iod, open(dir_name, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+ if (iod.dirfd < 0) {
pr_debug("%s: cannot open %s dir\n", __func__, dir_name);
return -1;
}
- while ((dent = readdir(dir)) != NULL) {
+ while ((dent = io_dir__readdir(&iod)) != NULL) {
char path[PATH_MAX];
- unsigned char d_type = dent->d_type;
path__join(path, sizeof(path), dir_name, dent->d_name);
-
- if (d_type == DT_UNKNOWN) {
- struct stat st;
-
- if (stat(path, &st))
- continue;
- if (S_ISDIR(st.st_mode))
- d_type = DT_DIR;
- }
-
- if (d_type == DT_DIR) {
+ if (io_dir__is_dir(&iod, dent)) {
if (!strcmp(dent->d_name, ".") ||
!strcmp(dent->d_name, ".."))
continue;
@@ -1396,7 +1387,7 @@ static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, i
}
out:
- closedir(dir);
+ close(iod.dirfd);
return ret;
}
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 3/8] perf pmu: Switch to io_dir__readdir
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
2025-02-22 6:10 ` [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative Ian Rogers
2025-02-22 6:10 ` [PATCH v3 2/8] perf maps: Switch modules tree walk to io_dir__readdir Ian Rogers
@ 2025-02-22 6:10 ` Ian Rogers
2025-02-22 6:10 ` [PATCH v3 4/8] perf header: Switch mem topology " Ian Rogers
` (6 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/pmu.c | 46 +++++++++++++++++-------------------------
tools/perf/util/pmus.c | 30 ++++++++++-----------------
2 files changed, 30 insertions(+), 46 deletions(-)
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ec3878c890a9..e51c17b1ea35 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -13,6 +13,7 @@
#include <dirent.h>
#include <api/fs/fs.h>
#include <api/io.h>
+#include <api/io_dir.h>
#include <locale.h>
#include <fnmatch.h>
#include <math.h>
@@ -195,19 +196,17 @@ static void perf_pmu_format__load(const struct perf_pmu *pmu, struct perf_pmu_fo
*/
static int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd, bool eager_load)
{
- struct dirent *evt_ent;
- DIR *format_dir;
+ struct io_dirent64 *evt_ent;
+ struct io_dir format_dir;
int ret = 0;
- format_dir = fdopendir(dirfd);
- if (!format_dir)
- return -EINVAL;
+ io_dir__init(&format_dir, dirfd);
- while ((evt_ent = readdir(format_dir)) != NULL) {
+ while ((evt_ent = io_dir__readdir(&format_dir)) != NULL) {
struct perf_pmu_format *format;
char *name = evt_ent->d_name;
- if (!strcmp(name, ".") || !strcmp(name, ".."))
+ if (io_dir__is_dir(&format_dir, evt_ent))
continue;
format = perf_pmu__new_format(&pmu->format, name);
@@ -234,7 +233,7 @@ static int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd, bool eager_lo
}
}
- closedir(format_dir);
+ close(format_dir.dirfd);
return ret;
}
@@ -635,14 +634,12 @@ static inline bool pmu_alias_info_file(const char *name)
*/
static int __pmu_aliases_parse(struct perf_pmu *pmu, int events_dir_fd)
{
- struct dirent *evt_ent;
- DIR *event_dir;
+ struct io_dirent64 *evt_ent;
+ struct io_dir event_dir;
- event_dir = fdopendir(events_dir_fd);
- if (!event_dir)
- return -EINVAL;
+ io_dir__init(&event_dir, events_dir_fd);
- while ((evt_ent = readdir(event_dir))) {
+ while ((evt_ent = io_dir__readdir(&event_dir))) {
char *name = evt_ent->d_name;
int fd;
FILE *file;
@@ -674,7 +671,6 @@ static int __pmu_aliases_parse(struct perf_pmu *pmu, int events_dir_fd)
fclose(file);
}
- closedir(event_dir);
pmu->sysfs_aliases_loaded = true;
return 0;
}
@@ -2221,10 +2217,9 @@ static void perf_pmu__del_caps(struct perf_pmu *pmu)
*/
int perf_pmu__caps_parse(struct perf_pmu *pmu)
{
- struct stat st;
char caps_path[PATH_MAX];
- DIR *caps_dir;
- struct dirent *evt_ent;
+ struct io_dir caps_dir;
+ struct io_dirent64 *evt_ent;
int caps_fd;
if (pmu->caps_initialized)
@@ -2235,24 +2230,21 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
if (!perf_pmu__pathname_scnprintf(caps_path, sizeof(caps_path), pmu->name, "caps"))
return -1;
- if (stat(caps_path, &st) < 0) {
+ caps_fd = open(caps_path, O_CLOEXEC | O_DIRECTORY | O_RDONLY);
+ if (caps_fd == -1) {
pmu->caps_initialized = true;
return 0; /* no error if caps does not exist */
}
- caps_dir = opendir(caps_path);
- if (!caps_dir)
- return -EINVAL;
-
- caps_fd = dirfd(caps_dir);
+ io_dir__init(&caps_dir, caps_fd);
- while ((evt_ent = readdir(caps_dir)) != NULL) {
+ while ((evt_ent = io_dir__readdir(&caps_dir)) != NULL) {
char *name = evt_ent->d_name;
char value[128];
FILE *file;
int fd;
- if (!strcmp(name, ".") || !strcmp(name, ".."))
+ if (io_dir__is_dir(&caps_dir, evt_ent))
continue;
fd = openat(caps_fd, name, O_RDONLY);
@@ -2274,7 +2266,7 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)
fclose(file);
}
- closedir(caps_dir);
+ close(caps_fd);
pmu->caps_initialized = true;
return pmu->nr_caps;
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 8a0a919415d4..afd59d678fd0 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -3,10 +3,10 @@
#include <linux/list_sort.h>
#include <linux/string.h>
#include <linux/zalloc.h>
+#include <api/io_dir.h>
#include <subcmd/pager.h>
#include <sys/types.h>
#include <ctype.h>
-#include <dirent.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>
@@ -235,20 +235,16 @@ static void pmu_read_sysfs(unsigned int to_read_types)
if (to_read_types & (PERF_TOOL_PMU_TYPE_PE_CORE_MASK | PERF_TOOL_PMU_TYPE_PE_OTHER_MASK)) {
int fd = perf_pmu__event_source_devices_fd();
- DIR *dir;
- struct dirent *dent;
+ struct io_dir dir;
+ struct io_dirent64 *dent;
bool core_only = (to_read_types & PERF_TOOL_PMU_TYPE_PE_OTHER_MASK) == 0;
if (fd < 0)
goto skip_pe_pmus;
- dir = fdopendir(fd);
- if (!dir) {
- close(fd);
- goto skip_pe_pmus;
- }
+ io_dir__init(&dir, fd);
- while ((dent = readdir(dir))) {
+ while ((dent = io_dir__readdir(&dir)) != NULL) {
if (!strcmp(dent->d_name, ".") || !strcmp(dent->d_name, ".."))
continue;
if (core_only && !is_pmu_core(dent->d_name))
@@ -257,7 +253,7 @@ static void pmu_read_sysfs(unsigned int to_read_types)
perf_pmu__find2(fd, dent->d_name);
}
- closedir(dir);
+ close(fd);
}
skip_pe_pmus:
if ((to_read_types & PERF_TOOL_PMU_TYPE_PE_CORE_MASK) && list_empty(&core_pmus)) {
@@ -721,8 +717,8 @@ bool perf_pmus__supports_extended_type(void)
char *perf_pmus__default_pmu_name(void)
{
int fd;
- DIR *dir;
- struct dirent *dent;
+ struct io_dir dir;
+ struct io_dirent64 *dent;
char *result = NULL;
if (!list_empty(&core_pmus))
@@ -732,13 +728,9 @@ char *perf_pmus__default_pmu_name(void)
if (fd < 0)
return strdup("cpu");
- dir = fdopendir(fd);
- if (!dir) {
- close(fd);
- return strdup("cpu");
- }
+ io_dir__init(&dir, fd);
- while ((dent = readdir(dir))) {
+ while ((dent = io_dir__readdir(&dir)) != NULL) {
if (!strcmp(dent->d_name, ".") || !strcmp(dent->d_name, ".."))
continue;
if (is_pmu_core(dent->d_name)) {
@@ -747,7 +739,7 @@ char *perf_pmus__default_pmu_name(void)
}
}
- closedir(dir);
+ close(fd);
return result ?: strdup("cpu");
}
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 4/8] perf header: Switch mem topology to io_dir__readdir
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
` (2 preceding siblings ...)
2025-02-22 6:10 ` [PATCH v3 3/8] perf pmu: Switch " Ian Rogers
@ 2025-02-22 6:10 ` Ian Rogers
2025-02-22 6:10 ` [PATCH v3 5/8] perf events: Remove scandir in thread synthesis Ian Rogers
` (5 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
Switch memory_node__read and build_mem_topology from opendir/readdir
to io_dir__readdir, with smaller stack allocations. Reduces peak
memory consumption of perf record by 10kb.
Signed-off-by: Ian Rogers <irogers@google.com>
---
| 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index d06aa86352d3..1900965f8752 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -44,6 +44,7 @@
#include "build-id.h"
#include "data.h"
#include <api/fs/fs.h>
+#include <api/io_dir.h>
#include "asm/bug.h"
#include "tool.h"
#include "time-utils.h"
@@ -1311,11 +1312,11 @@ static int memory_node__read(struct memory_node *n, unsigned long idx)
{
unsigned int phys, size = 0;
char path[PATH_MAX];
- struct dirent *ent;
- DIR *dir;
+ struct io_dirent64 *ent;
+ struct io_dir dir;
#define for_each_memory(mem, dir) \
- while ((ent = readdir(dir))) \
+ while ((ent = io_dir__readdir(&dir)) != NULL) \
if (strcmp(ent->d_name, ".") && \
strcmp(ent->d_name, "..") && \
sscanf(ent->d_name, "memory%u", &mem) == 1)
@@ -1324,9 +1325,9 @@ static int memory_node__read(struct memory_node *n, unsigned long idx)
"%s/devices/system/node/node%lu",
sysfs__mountpoint(), idx);
- dir = opendir(path);
- if (!dir) {
- pr_warning("failed: can't open memory sysfs data\n");
+ io_dir__init(&dir, open(path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+ if (dir.dirfd < 0) {
+ pr_warning("failed: can't open memory sysfs data '%s'\n", path);
return -1;
}
@@ -1338,20 +1339,20 @@ static int memory_node__read(struct memory_node *n, unsigned long idx)
n->set = bitmap_zalloc(size);
if (!n->set) {
- closedir(dir);
+ close(dir.dirfd);
return -ENOMEM;
}
n->node = idx;
n->size = size;
- rewinddir(dir);
+ io_dir__rewinddir(&dir);
for_each_memory(phys, dir) {
__set_bit(phys, n->set);
}
- closedir(dir);
+ close(dir.dirfd);
return 0;
}
@@ -1374,8 +1375,8 @@ static int memory_node__sort(const void *a, const void *b)
static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
{
char path[PATH_MAX];
- struct dirent *ent;
- DIR *dir;
+ struct io_dirent64 *ent;
+ struct io_dir dir;
int ret = 0;
size_t cnt = 0, size = 0;
struct memory_node *nodes = NULL;
@@ -1383,14 +1384,14 @@ static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
scnprintf(path, PATH_MAX, "%s/devices/system/node/",
sysfs__mountpoint());
- dir = opendir(path);
- if (!dir) {
+ io_dir__init(&dir, open(path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+ if (dir.dirfd < 0) {
pr_debug2("%s: couldn't read %s, does this arch have topology information?\n",
__func__, path);
return -1;
}
- while (!ret && (ent = readdir(dir))) {
+ while (!ret && (ent = io_dir__readdir(&dir))) {
unsigned int idx;
int r;
@@ -1419,7 +1420,7 @@ static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
cnt += 1;
}
out:
- closedir(dir);
+ close(dir.dirfd);
if (!ret) {
*cntp = cnt;
*nodesp = nodes;
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 5/8] perf events: Remove scandir in thread synthesis
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
` (3 preceding siblings ...)
2025-02-22 6:10 ` [PATCH v3 4/8] perf header: Switch mem topology " Ian Rogers
@ 2025-02-22 6:10 ` Ian Rogers
2025-02-22 6:10 ` [PATCH v3 6/8] perf parse-events: Switch tracepoints to io_dir__readdir Ian Rogers
` (4 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
This avoids scanddir reading the directory into memory that's
allocated and instead allocates on the stack.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/synthetic-events.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 2dfc4260d36d..2fc4d0537840 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -38,6 +38,7 @@
#include <uapi/linux/mman.h> /* To get things like MAP_HUGETLB even on older libc headers */
#include <api/fs/fs.h>
#include <api/io.h>
+#include <api/io_dir.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
@@ -767,10 +768,10 @@ static int __event__synthesize_thread(union perf_event *comm_event,
bool needs_mmap, bool mmap_data)
{
char filename[PATH_MAX];
- struct dirent **dirent;
+ struct io_dir iod;
+ struct io_dirent64 *dent;
pid_t tgid, ppid;
int rc = 0;
- int i, n;
/* special case: only send one comm event using passed in pid */
if (!full) {
@@ -802,16 +803,19 @@ static int __event__synthesize_thread(union perf_event *comm_event,
snprintf(filename, sizeof(filename), "%s/proc/%d/task",
machine->root_dir, pid);
- n = scandir(filename, &dirent, filter_task, NULL);
- if (n < 0)
- return n;
+ io_dir__init(&iod, open(filename, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+ if (iod.dirfd < 0)
+ return -1;
- for (i = 0; i < n; i++) {
+ while ((dent = io_dir__readdir(&iod)) != NULL) {
char *end;
pid_t _pid;
bool kernel_thread = false;
- _pid = strtol(dirent[i]->d_name, &end, 10);
+ if (!isdigit(dent->d_name[0]))
+ continue;
+
+ _pid = strtol(dent->d_name, &end, 10);
if (*end)
continue;
@@ -845,9 +849,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
}
}
- for (i = 0; i < n; i++)
- zfree(&dirent[i]);
- free(dirent);
+ close(iod.dirfd);
return rc;
}
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 6/8] perf parse-events: Switch tracepoints to io_dir__readdir
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
` (4 preceding siblings ...)
2025-02-22 6:10 ` [PATCH v3 5/8] perf events: Remove scandir in thread synthesis Ian Rogers
@ 2025-02-22 6:10 ` Ian Rogers
2025-02-22 6:10 ` [PATCH v3 7/8] perf hwmon_pmu: Switch event discovery " Ian Rogers
` (3 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/parse-events.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6c36b98875bc..35e48fe56dfa 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -17,6 +17,7 @@
#include "strbuf.h"
#include "debug.h"
#include <api/fs/tracing_path.h>
+#include <api/io_dir.h>
#include <perf/cpumap.h>
#include <util/parse-events-bison.h>
#include <util/parse-events-flex.h>
@@ -554,8 +555,8 @@ static int add_tracepoint_multi_event(struct parse_events_state *parse_state,
struct parse_events_terms *head_config, YYLTYPE *loc)
{
char *evt_path;
- struct dirent *evt_ent;
- DIR *evt_dir;
+ struct io_dirent64 *evt_ent;
+ struct io_dir evt_dir;
int ret = 0, found = 0;
evt_path = get_events_file(sys_name);
@@ -563,14 +564,14 @@ static int add_tracepoint_multi_event(struct parse_events_state *parse_state,
tracepoint_error(err, errno, sys_name, evt_name, loc->first_column);
return -1;
}
- evt_dir = opendir(evt_path);
- if (!evt_dir) {
+ io_dir__init(&evt_dir, open(evt_path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+ if (evt_dir.dirfd < 0) {
put_events_file(evt_path);
tracepoint_error(err, errno, sys_name, evt_name, loc->first_column);
return -1;
}
- while (!ret && (evt_ent = readdir(evt_dir))) {
+ while (!ret && (evt_ent = io_dir__readdir(&evt_dir))) {
if (!strcmp(evt_ent->d_name, ".")
|| !strcmp(evt_ent->d_name, "..")
|| !strcmp(evt_ent->d_name, "enable")
@@ -592,7 +593,7 @@ static int add_tracepoint_multi_event(struct parse_events_state *parse_state,
}
put_events_file(evt_path);
- closedir(evt_dir);
+ close(evt_dir.dirfd);
return ret;
}
@@ -615,17 +616,23 @@ static int add_tracepoint_multi_sys(struct parse_events_state *parse_state,
struct parse_events_error *err,
struct parse_events_terms *head_config, YYLTYPE *loc)
{
- struct dirent *events_ent;
- DIR *events_dir;
+ struct io_dirent64 *events_ent;
+ struct io_dir events_dir;
int ret = 0;
+ char *events_dir_path = get_tracing_file("events");
- events_dir = tracing_events__opendir();
- if (!events_dir) {
+ if (!events_dir_path) {
+ tracepoint_error(err, errno, sys_name, evt_name, loc->first_column);
+ return -1;
+ }
+ io_dir__init(&events_dir, open(events_dir_path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+ put_events_file(events_dir_path);
+ if (events_dir.dirfd < 0) {
tracepoint_error(err, errno, sys_name, evt_name, loc->first_column);
return -1;
}
- while (!ret && (events_ent = readdir(events_dir))) {
+ while (!ret && (events_ent = io_dir__readdir(&events_dir))) {
if (!strcmp(events_ent->d_name, ".")
|| !strcmp(events_ent->d_name, "..")
|| !strcmp(events_ent->d_name, "enable")
@@ -639,8 +646,7 @@ static int add_tracepoint_multi_sys(struct parse_events_state *parse_state,
ret = add_tracepoint_event(parse_state, list, events_ent->d_name,
evt_name, err, head_config, loc);
}
-
- closedir(events_dir);
+ close(events_dir.dirfd);
return ret;
}
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 7/8] perf hwmon_pmu: Switch event discovery to io_dir__readdir
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
` (5 preceding siblings ...)
2025-02-22 6:10 ` [PATCH v3 6/8] perf parse-events: Switch tracepoints to io_dir__readdir Ian Rogers
@ 2025-02-22 6:10 ` Ian Rogers
2025-02-22 6:10 ` [PATCH v3 8/8] perf machine: Reuse module path buffer Ian Rogers
` (2 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/hwmon_pmu.c | 42 +++++++++++++++----------------------
1 file changed, 17 insertions(+), 25 deletions(-)
diff --git a/tools/perf/util/hwmon_pmu.c b/tools/perf/util/hwmon_pmu.c
index acd889b2462f..3cce77fc8004 100644
--- a/tools/perf/util/hwmon_pmu.c
+++ b/tools/perf/util/hwmon_pmu.c
@@ -11,13 +11,13 @@
#include <sys/types.h>
#include <assert.h>
#include <ctype.h>
-#include <dirent.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <api/fs/fs.h>
#include <api/io.h>
+#include <api/io_dir.h>
#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/zalloc.h>
@@ -235,31 +235,22 @@ static void fix_name(char *p)
static int hwmon_pmu__read_events(struct hwmon_pmu *pmu)
{
- DIR *dir;
- struct dirent *ent;
- int dup_fd, err = 0;
+ int err = 0;
struct hashmap_entry *cur, *tmp;
size_t bkt;
+ struct io_dirent64 *ent;
+ struct io_dir dir;
if (pmu->pmu.sysfs_aliases_loaded)
return 0;
- /*
- * Use a dup-ed fd as closedir will close it. Use openat so that the
- * directory contents are refreshed.
- */
- dup_fd = openat(pmu->hwmon_dir_fd, ".", O_DIRECTORY);
+ /* Use openat so that the directory contents are refreshed. */
+ io_dir__init(&dir, openat(pmu->hwmon_dir_fd, ".", O_CLOEXEC | O_DIRECTORY | O_RDONLY));
- if (dup_fd == -1)
- return -ENOMEM;
+ if (dir.dirfd < 0)
+ return -ENOENT;
- dir = fdopendir(dup_fd);
- if (!dir) {
- close(dup_fd);
- return -ENOMEM;
- }
-
- while ((ent = readdir(dir)) != NULL) {
+ while ((ent = io_dir__readdir(&dir)) != NULL) {
enum hwmon_type type;
int number;
enum hwmon_item item;
@@ -347,7 +338,7 @@ static int hwmon_pmu__read_events(struct hwmon_pmu *pmu)
pmu->pmu.sysfs_aliases_loaded = true;
err_out:
- closedir(dir);
+ close(dir.dirfd);
return err;
}
@@ -702,8 +693,8 @@ int hwmon_pmu__check_alias(struct parse_events_terms *terms, struct perf_pmu_inf
int perf_pmus__read_hwmon_pmus(struct list_head *pmus)
{
char *line = NULL;
- DIR *class_hwmon_dir;
- struct dirent *class_hwmon_ent;
+ struct io_dirent64 *class_hwmon_ent;
+ struct io_dir class_hwmon_dir;
char buf[PATH_MAX];
const char *sysfs = sysfs__mountpoint();
@@ -711,11 +702,12 @@ int perf_pmus__read_hwmon_pmus(struct list_head *pmus)
return 0;
scnprintf(buf, sizeof(buf), "%s/class/hwmon/", sysfs);
- class_hwmon_dir = opendir(buf);
- if (!class_hwmon_dir)
+ io_dir__init(&class_hwmon_dir, open(buf, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+
+ if (class_hwmon_dir.dirfd < 0)
return 0;
- while ((class_hwmon_ent = readdir(class_hwmon_dir)) != NULL) {
+ while ((class_hwmon_ent = io_dir__readdir(&class_hwmon_dir)) != NULL) {
size_t line_len;
int hwmon_dir, name_fd;
struct io io;
@@ -745,7 +737,7 @@ int perf_pmus__read_hwmon_pmus(struct list_head *pmus)
close(name_fd);
}
free(line);
- closedir(class_hwmon_dir);
+ close(class_hwmon_dir.dirfd);
return 0;
}
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 8/8] perf machine: Reuse module path buffer
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
` (6 preceding siblings ...)
2025-02-22 6:10 ` [PATCH v3 7/8] perf hwmon_pmu: Switch event discovery " Ian Rogers
@ 2025-02-22 6:10 ` Ian Rogers
2025-02-25 0:28 ` [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Namhyung Kim
2025-02-25 21:58 ` Namhyung Kim
9 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-22 6:10 UTC (permalink / raw)
To: Namhyung Kim, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Kan Liang, Sam James, Jesper Juhl,
James Clark, Zhongqiu Han, Yicong Yang, Thomas Richter,
Michael Petlan, Anne Macedo, Dominique Martinet,
Jean-Philippe Romain, Junhao He, linux-kernel, linux-perf-users,
Krzysztof Łopatowski
Cc: Ian Rogers
Rather than copying the path and appending the directory entry in a
fresh path buffer, append to the path at the end of where it is for
the recursion level. This saves a PATH_MAX buffer per recursion level
and some unnecessary copying.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/machine.c | 34 +++++++++++++++++++++++-----------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e394c630e3a2..3f1faf94198d 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1338,22 +1338,23 @@ static int maps__set_module_path(struct maps *maps, const char *path, struct kmo
return 0;
}
-static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, int depth)
+static int maps__set_modules_path_dir(struct maps *maps, char *path, size_t path_size, int depth)
{
struct io_dirent64 *dent;
struct io_dir iod;
+ size_t root_len = strlen(path);
int ret = 0;
- io_dir__init(&iod, open(dir_name, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
+ io_dir__init(&iod, open(path, O_CLOEXEC | O_DIRECTORY | O_RDONLY));
if (iod.dirfd < 0) {
- pr_debug("%s: cannot open %s dir\n", __func__, dir_name);
+ pr_debug("%s: cannot open %s dir\n", __func__, path);
return -1;
}
-
+ /* Bounds check, should never happen. */
+ if (root_len >= path_size)
+ return -1;
+ path[root_len++] = '/';
while ((dent = io_dir__readdir(&iod)) != NULL) {
- char path[PATH_MAX];
-
- path__join(path, sizeof(path), dir_name, dent->d_name);
if (io_dir__is_dir(&iod, dent)) {
if (!strcmp(dent->d_name, ".") ||
!strcmp(dent->d_name, ".."))
@@ -1366,7 +1367,12 @@ static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, i
continue;
}
- ret = maps__set_modules_path_dir(maps, path, depth + 1);
+ /* Bounds check, should never happen. */
+ if (root_len + strlen(dent->d_name) >= path_size)
+ continue;
+
+ strcpy(path + root_len, dent->d_name);
+ ret = maps__set_modules_path_dir(maps, path, path_size, depth + 1);
if (ret < 0)
goto out;
} else {
@@ -1376,9 +1382,14 @@ static int maps__set_modules_path_dir(struct maps *maps, const char *dir_name, i
if (ret)
goto out;
- if (m.kmod)
- ret = maps__set_module_path(maps, path, &m);
+ if (m.kmod) {
+ /* Bounds check, should never happen. */
+ if (root_len + strlen(dent->d_name) < path_size) {
+ strcpy(path + root_len, dent->d_name);
+ ret = maps__set_module_path(maps, path, &m);
+ }
+ }
zfree(&m.name);
if (ret)
@@ -1404,7 +1415,8 @@ static int machine__set_modules_path(struct machine *machine)
machine->root_dir, version);
free(version);
- return maps__set_modules_path_dir(machine__kernel_maps(machine), modules_path, 0);
+ return maps__set_modules_path_dir(machine__kernel_maps(machine),
+ modules_path, sizeof(modules_path), 0);
}
int __weak arch__fix_module_text_start(u64 *start __maybe_unused,
u64 *size __maybe_unused,
--
2.48.1.658.g4767266eb4-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
` (7 preceding siblings ...)
2025-02-22 6:10 ` [PATCH v3 8/8] perf machine: Reuse module path buffer Ian Rogers
@ 2025-02-25 0:28 ` Namhyung Kim
2025-02-25 0:30 ` Namhyung Kim
2025-02-25 21:58 ` Namhyung Kim
9 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2025-02-25 0:28 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Sam James, Jesper Juhl, James Clark, Zhongqiu Han,
Yicong Yang, Thomas Richter, Michael Petlan, Anne Macedo,
Dominique Martinet, Jean-Philippe Romain, Junhao He, linux-kernel,
linux-perf-users, Krzysztof Łopatowski
Hi Ian,
On Fri, Feb 21, 2025 at 10:10:05PM -0800, Ian Rogers wrote:
> glibc's opendir allocates a minimum of 32kb, when called recursively
> for a directory tree the memory consumption can add up - nearly 300kb
> during perf start-up when processing modules. Add a stack allocated
> variant of readdir sized a little more than 1kb
It's still small and hard to verify. I've run the following command
before and after the change but didn't see a difference.
$ sudo time -f %Mk ./perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.757 MB perf.data (563 samples) ]
74724k
According to man time(1), %M is for max RSS.
Thanks,
Namhyung
>
> v3: Rebase on top of Krzysztof Łopatowski's work. Add additional
> defines for SYS_getdents64 on all other architectures if its
> definition is missing. Add a patch to further reduce the
> stack/memory usage in machine__set_modules_path_dir by appending
> to a buffer rather than creating a copy.
> v2: Remove the feature test and always use a perf supplied getdents64
> to workaround an Alpine Linux issue in v1:
> https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/
> As suggested by Krzysztof Łopatowski
> <krzysztof.m.lopatowski@gmail.com> who also pointed to the perf
> trace performance improvements in start-up time eliminating stat
> calls can achieve:
> https://lore.kernel.org/lkml/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com/
> Convert parse-events and hwmon_pmu to use io_dir.
> v1: This was previously part of the memory saving change set:
> https://lore.kernel.org/lkml/20231127220902.1315692-1-irogers@google.com/
> It is separated here and a feature check and syscall workaround
> for missing getdents64 added.
>
> Ian Rogers (8):
> tools lib api: Add io_dir an allocation free readdir alternative
> perf maps: Switch modules tree walk to io_dir__readdir
> perf pmu: Switch to io_dir__readdir
> perf header: Switch mem topology to io_dir__readdir
> perf events: Remove scandir in thread synthesis
> perf parse-events: Switch tracepoints to io_dir__readdir
> perf hwmon_pmu: Switch event discovery to io_dir__readdir
> perf machine: Reuse module path buffer
>
> tools/lib/api/Makefile | 2 +-
> tools/lib/api/io_dir.h | 104 +++++++++++++++++++++++++++++
> tools/perf/util/header.c | 31 ++++-----
> tools/perf/util/hwmon_pmu.c | 42 +++++-------
> tools/perf/util/machine.c | 57 ++++++++--------
> tools/perf/util/parse-events.c | 32 +++++----
> tools/perf/util/pmu.c | 46 ++++++-------
> tools/perf/util/pmus.c | 30 +++------
> tools/perf/util/synthetic-events.c | 22 +++---
> 9 files changed, 229 insertions(+), 137 deletions(-)
> create mode 100644 tools/lib/api/io_dir.h
>
> --
> 2.48.1.658.g4767266eb4-goog
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
2025-02-25 0:28 ` [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Namhyung Kim
@ 2025-02-25 0:30 ` Namhyung Kim
2025-02-25 1:25 ` Ian Rogers
0 siblings, 1 reply; 13+ messages in thread
From: Namhyung Kim @ 2025-02-25 0:30 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Sam James, Jesper Juhl, James Clark, Zhongqiu Han,
Yicong Yang, Thomas Richter, Michael Petlan, Anne Macedo,
Dominique Martinet, Jean-Philippe Romain, Junhao He, linux-kernel,
linux-perf-users, Krzysztof Łopatowski
On Mon, Feb 24, 2025 at 04:28:24PM -0800, Namhyung Kim wrote:
> Hi Ian,
>
> On Fri, Feb 21, 2025 at 10:10:05PM -0800, Ian Rogers wrote:
> > glibc's opendir allocates a minimum of 32kb, when called recursively
> > for a directory tree the memory consumption can add up - nearly 300kb
> > during perf start-up when processing modules. Add a stack allocated
> > variant of readdir sized a little more than 1kb
>
> It's still small and hard to verify. I've run the following command
> before and after the change but didn't see a difference.
>
> $ sudo time -f %Mk ./perf record -a true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.757 MB perf.data (563 samples) ]
> 74724k
>
> According to man time(1), %M is for max RSS.
But anyway, it looks ok and build is fine now.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
2025-02-25 0:30 ` Namhyung Kim
@ 2025-02-25 1:25 ` Ian Rogers
0 siblings, 0 replies; 13+ messages in thread
From: Ian Rogers @ 2025-02-25 1:25 UTC (permalink / raw)
To: Namhyung Kim
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Sam James, Jesper Juhl, James Clark, Zhongqiu Han,
Yicong Yang, Thomas Richter, Michael Petlan, Anne Macedo,
Dominique Martinet, Jean-Philippe Romain, Junhao He, linux-kernel,
linux-perf-users, Krzysztof Łopatowski
On Mon, Feb 24, 2025 at 4:30 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Mon, Feb 24, 2025 at 04:28:24PM -0800, Namhyung Kim wrote:
> > Hi Ian,
> >
> > On Fri, Feb 21, 2025 at 10:10:05PM -0800, Ian Rogers wrote:
> > > glibc's opendir allocates a minimum of 32kb, when called recursively
> > > for a directory tree the memory consumption can add up - nearly 300kb
> > > during perf start-up when processing modules. Add a stack allocated
> > > variant of readdir sized a little more than 1kb
> >
> > It's still small and hard to verify. I've run the following command
> > before and after the change but didn't see a difference.
> >
> > $ sudo time -f %Mk ./perf record -a true
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 1.757 MB perf.data (563 samples) ]
> > 74724k
> >
> > According to man time(1), %M is for max RSS.
>
> But anyway, it looks ok and build is fine now.
Thanks for the testing! So doing a regular build I could repeat what
you saw - basically the opendir isn't contributing to maxrss as the
BPF handling and so gets lost. Doing a minimal static build, that
loses BPF support, things were clearer but not as good as I'd
originally measured, 10880k reduced to 10696k - a 184k saving (raw
data below). Perhaps opendir got better or perhaps there are fewer
kernel modules. I tried heaptrack but unfortunately it wasn't able to
instrument the allocations in glibc's allocdir function (it reported
0). Originally heaptrack showing opendir allocations were significant
for `perf record` was what led me to this code. At the moment BPF
event synthesis and topdown event checking look particularly
expensive.
Thanks,
Ian
Before:
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (107 samples) ]
10880k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (106 samples) ]
10880k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (109 samples) ]
10880k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (102 samples) ]
10880k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (106 samples) ]
11008k
After:
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.473 MB perf.data (106 samples) ]
10820k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.473 MB perf.data (109 samples) ]
10696k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.473 MB perf.data (98 samples) ]
10696k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.489 MB perf.data (98 samples) ]
10696k
$ sudo /bin/time -f %Mk /tmp/perf/perf record -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.490 MB perf.data (110 samples) ]
10696k
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
` (8 preceding siblings ...)
2025-02-25 0:28 ` [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Namhyung Kim
@ 2025-02-25 21:58 ` Namhyung Kim
9 siblings, 0 replies; 13+ messages in thread
From: Namhyung Kim @ 2025-02-25 21:58 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Sam James, Jesper Juhl, James Clark, Zhongqiu Han,
Yicong Yang, Thomas Richter, Michael Petlan, Anne Macedo,
Dominique Martinet, Jean-Philippe Romain, Junhao He, linux-kernel,
linux-perf-users, Krzysztof Łopatowski, Ian Rogers
On Fri, 21 Feb 2025 22:10:05 -0800, Ian Rogers wrote:
> glibc's opendir allocates a minimum of 32kb, when called recursively
> for a directory tree the memory consumption can add up - nearly 300kb
> during perf start-up when processing modules. Add a stack allocated
> variant of readdir sized a little more than 1kb
>
> v3: Rebase on top of Krzysztof Łopatowski's work. Add additional
> defines for SYS_getdents64 on all other architectures if its
> definition is missing. Add a patch to further reduce the
> stack/memory usage in machine__set_modules_path_dir by appending
> to a buffer rather than creating a copy.
> v2: Remove the feature test and always use a perf supplied getdents64
> to workaround an Alpine Linux issue in v1:
> https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/
> As suggested by Krzysztof Łopatowski
> <krzysztof.m.lopatowski@gmail.com> who also pointed to the perf
> trace performance improvements in start-up time eliminating stat
> calls can achieve:
> https://lore.kernel.org/lkml/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com/
> Convert parse-events and hwmon_pmu to use io_dir.
> v1: This was previously part of the memory saving change set:
> https://lore.kernel.org/lkml/20231127220902.1315692-1-irogers@google.com/
> It is separated here and a feature check and syscall workaround
> for missing getdents64 added.
>
> [...]
Applied to perf-tools-next, thanks!
Best regards,
Namhyung
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-02-25 21:58 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-22 6:10 [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Ian Rogers
2025-02-22 6:10 ` [PATCH v3 1/8] tools lib api: Add io_dir an allocation free readdir alternative Ian Rogers
2025-02-22 6:10 ` [PATCH v3 2/8] perf maps: Switch modules tree walk to io_dir__readdir Ian Rogers
2025-02-22 6:10 ` [PATCH v3 3/8] perf pmu: Switch " Ian Rogers
2025-02-22 6:10 ` [PATCH v3 4/8] perf header: Switch mem topology " Ian Rogers
2025-02-22 6:10 ` [PATCH v3 5/8] perf events: Remove scandir in thread synthesis Ian Rogers
2025-02-22 6:10 ` [PATCH v3 6/8] perf parse-events: Switch tracepoints to io_dir__readdir Ian Rogers
2025-02-22 6:10 ` [PATCH v3 7/8] perf hwmon_pmu: Switch event discovery " Ian Rogers
2025-02-22 6:10 ` [PATCH v3 8/8] perf machine: Reuse module path buffer Ian Rogers
2025-02-25 0:28 ` [PATCH v3 0/8] Add io_dir to avoid memory overhead from opendir Namhyung Kim
2025-02-25 0:30 ` Namhyung Kim
2025-02-25 1:25 ` Ian Rogers
2025-02-25 21:58 ` Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).