* [PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc
@ 2024-10-10 23:25 Namhyung Kim
  2024-10-10 23:25 ` [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator Namhyung Kim
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Namhyung Kim @ 2024-10-10 23:25 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

Hello,

I'm proposing a new iterator and a kfunc for the slab memory allocator
to get information about each kmem_cache, like in /proc/slabinfo or
/sys/kernel/slab, in a more flexible way.

v5 changes:

 * set PTR_UNTRUSTED for return value of bpf_get_kmem_cache()  (Alexei)
 * add KF_RCU_PROTECTED to bpf_get_kmem_cache().  See below.  (Song)
 * add WARN_ON_ONCE and comment in kmem_cache_iter_seq_next()  (Song)
 * change kmem_cache_iter_seq functions not to call BPF on intermediate stop
 * add a subtest to compare the kmem cache info with /proc/slabinfo  (Alexei)

v4: https://lore.kernel.org/lkml/20241002180956.1781008-1-namhyung@kernel.org

 * skip kmem_cache_destroy() in kmem_cache_iter_seq_stop() if possible  (Vlastimil)
 * fix a bug in the kmem_cache_iter_seq_start() for the last entry

v3: https://lore.kernel.org/lkml/20241002065456.1580143-1-namhyung@kernel.org/

 * rework kmem_cache_iter not to hold slab_mutex when running BPF  (Alexei)
 * add virt_addr_valid() check  (Alexei)
 * fix random test failure by running test with the current task  (Hyeonggon)

v2: https://lore.kernel.org/lkml/20240927184133.968283-1-namhyung@kernel.org/

 * rename it to "kmem_cache_iter"
 * fix a build issue
 * add Acked-by's from Roman and Vlastimil (Thanks!)
 * add error codes in the test for debugging

v1: https://lore.kernel.org/lkml/20240925223023.735947-1-namhyung@kernel.org/

My use case is the `perf lock contention` tool, which shows contended
locks, but many of them are not global locks and don't have symbols.
If it could translate the address of a lock in a slab object to the
name of the slab cache, it'd be much more useful.

As far as I know, slab objects don't carry type information yet, but I
was told there's work to associate BTF IDs with them.  That would
definitely help my use case.  We'd probably need another kfunc to get
the start address of the object, or the offset within the object, from
an address once the type info is available.  But I want to start with
a simple thing first.

The kmem_cache_iter iterates the kmem_cache objects under slab_mutex
and will be useful for userspace to prepare work for specific slabs,
like setting up filters in advance.  And the bpf_get_kmem_cache()
kfunc returns a pointer to the slab cache for a given lock address.
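
For illustration, a lock profiler could use the kfunc roughly like the
sketch below.  This is a minimal sketch only - the tracepoint and the
surrounding logic are assumptions for illustration, not part of this
series (it assumes vmlinux.h, bpf_helpers.h and bpf_tracing.h):

SEC("tp_btf/contention_begin")
int BPF_PROG(on_contention, void *lock, unsigned int flags)
{
	struct kmem_cache *s;
	char name[32];

	s = bpf_get_kmem_cache((u64)lock);
	if (s == NULL)
		return 0;	/* not a slab object */

	/* s is untrusted; copy the name out explicitly */
	bpf_probe_read_kernel_str(name, sizeof(name), s->name);
	bpf_printk("contended lock in %s", name);
	return 0;
}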

Actually I'm not sure about the RCU lock - IIUC it doesn't protect the
kmem_cache itself, but kmem_cache_destroy() calls some RCU barrier
functions, so holding the RCU read lock would keep the object from
going away via kfree_rcu() or something, and in turn the kmem_cache.
But please correct me if I'm wrong.

The test code reads from the iterator and makes sure it finds the slab
cache of the task_struct for the current task.

The code is available in the 'bpf/slab-iter-v5' branch of
https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (3):
  bpf: Add kmem_cache iterator
  mm/bpf: Add bpf_get_kmem_cache() kfunc
  selftests/bpf: Add a test for kmem_cache_iter

 include/linux/btf_ids.h                       |   1 +
 kernel/bpf/Makefile                           |   1 +
 kernel/bpf/helpers.c                          |   1 +
 kernel/bpf/kmem_cache_iter.c                  | 175 ++++++++++++++++++
 kernel/bpf/verifier.c                         |   5 +
 mm/slab_common.c                              |  19 ++
 .../bpf/prog_tests/kmem_cache_iter.c          | 115 ++++++++++++
 tools/testing/selftests/bpf/progs/bpf_iter.h  |   7 +
 .../selftests/bpf/progs/kmem_cache_iter.c     |  95 ++++++++++
 9 files changed, 419 insertions(+)
 create mode 100644 kernel/bpf/kmem_cache_iter.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/kmem_cache_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/kmem_cache_iter.c


base-commit: 5bd48a3a14df4b3ee1be0757efcc0f40d4f57b35
-- 
2.47.0.rc1.288.g06298d1525-goog




* [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator
  2024-10-10 23:25 [PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc Namhyung Kim
@ 2024-10-10 23:25 ` Namhyung Kim
  2024-10-11 18:33   ` Alexei Starovoitov
                     ` (2 more replies)
  2024-10-10 23:25 ` [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc Namhyung Kim
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 18+ messages in thread
From: Namhyung Kim @ 2024-10-10 23:25 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

The new "kmem_cache" iterator will traverse the list of slab caches
and call attached BPF programs for each entry.  It should check the
argument (ctx.s) if it's NULL before using it.

Now the iteration grabs the slab_mutex only if it traverse the list and
releases the mutex when it runs the BPF program.  The kmem_cache entry
is protected by a refcount during the execution.

It includes the internal "mm/slab.h" header to access kmem_cache,
slab_caches and slab_mutex.  Hope it's ok to mm folks.
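
For illustration, a minimal program for this iterator could look like
the sketch below (the full test is in patch 3).  Note the NULL check
on ctx->s, which covers the final stop() callback:

SEC("iter/kmem_cache")
int dump_slabs(struct bpf_iter__kmem_cache *ctx)
{
	struct kmem_cache *s = ctx->s;

	/* the program is invoked once more with s == NULL at the end */
	if (s == NULL)
		return 0;

	BPF_SEQ_PRINTF(ctx->meta->seq, "%s: %u\n", s->name, s->size);
	return 0;
}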

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 include/linux/btf_ids.h      |   1 +
 kernel/bpf/Makefile          |   1 +
 kernel/bpf/kmem_cache_iter.c | 175 +++++++++++++++++++++++++++++++++++
 3 files changed, 177 insertions(+)
 create mode 100644 kernel/bpf/kmem_cache_iter.c

diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index c0e3e1426a82f5c4..139bdececdcfaefb 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -283,5 +283,6 @@ extern u32 btf_tracing_ids[];
 extern u32 bpf_cgroup_btf_id[];
 extern u32 bpf_local_storage_map_btf_id[];
 extern u32 btf_bpf_map_id[];
+extern u32 bpf_kmem_cache_btf_id[];
 
 #endif
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 9b9c151b5c826b31..105328f0b9c04e37 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -52,3 +52,4 @@ obj-$(CONFIG_BPF_PRELOAD) += preload/
 obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
+obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
diff --git a/kernel/bpf/kmem_cache_iter.c b/kernel/bpf/kmem_cache_iter.c
new file mode 100644
index 0000000000000000..2de0682c6d4c773f
--- /dev/null
+++ b/kernel/bpf/kmem_cache_iter.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024 Google */
+#include <linux/bpf.h>
+#include <linux/btf_ids.h>
+#include <linux/slab.h>
+#include <linux/kernel.h>
+#include <linux/seq_file.h>
+
+#include "../../mm/slab.h" /* kmem_cache, slab_caches and slab_mutex */
+
+struct bpf_iter__kmem_cache {
+	__bpf_md_ptr(struct bpf_iter_meta *, meta);
+	__bpf_md_ptr(struct kmem_cache *, s);
+};
+
+static void *kmem_cache_iter_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	loff_t cnt = 0;
+	bool found = false;
+	struct kmem_cache *s;
+
+	mutex_lock(&slab_mutex);
+
+	/* Find the entry at the given position in the slab_caches list
+	 * instead of keeping a reference (to the last visited entry, if
+	 * any) outside of slab_mutex.  It might miss an entry if one is
+	 * deleted while the lock is released, but that should be rare
+	 * and there's not much we can do about it.
+	 */
+	list_for_each_entry(s, &slab_caches, list) {
+		if (cnt == *pos) {
+			/* Make sure this entry remains in the list by getting
+			 * a new reference count.  Note that boot_cache entries
+			 * have a negative refcount, so don't touch them.
+			 */
+			if (s->refcount > 0)
+				s->refcount++;
+			found = true;
+			break;
+		}
+		cnt++;
+	}
+	mutex_unlock(&slab_mutex);
+
+	if (!found)
+		return NULL;
+
+	return s;
+}
+
+static void kmem_cache_iter_seq_stop(struct seq_file *seq, void *v)
+{
+	struct bpf_iter_meta meta;
+	struct bpf_iter__kmem_cache ctx = {
+		.meta = &meta,
+		.s = v,
+	};
+	struct bpf_prog *prog;
+	bool destroy = false;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, true);
+	if (prog && !ctx.s)
+		bpf_iter_run_prog(prog, &ctx);
+
+	if (ctx.s == NULL)
+		return;
+
+	mutex_lock(&slab_mutex);
+
+	/* Skip kmem_cache_destroy() for active entries */
+	if (ctx.s->refcount > 1)
+		ctx.s->refcount--;
+	else if (ctx.s->refcount == 1)
+		destroy = true;
+
+	mutex_unlock(&slab_mutex);
+
+	if (destroy)
+		kmem_cache_destroy(ctx.s);
+}
+
+static void *kmem_cache_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct kmem_cache *s = v;
+	struct kmem_cache *next = NULL;
+	bool destroy = false;
+
+	++*pos;
+
+	mutex_lock(&slab_mutex);
+
+	if (list_last_entry(&slab_caches, struct kmem_cache, list) != s) {
+		next = list_next_entry(s, list);
+
+		WARN_ON_ONCE(next->refcount == 0);
+
+		/* boot_caches have negative refcount, don't touch them */
+		if (next->refcount > 0)
+			next->refcount++;
+	}
+
+	/* Skip kmem_cache_destroy() for active entries */
+	if (s->refcount > 1)
+		s->refcount--;
+	else if (s->refcount == 1)
+		destroy = true;
+
+	mutex_unlock(&slab_mutex);
+
+	if (destroy)
+		kmem_cache_destroy(s);
+
+	return next;
+}
+
+static int kmem_cache_iter_seq_show(struct seq_file *seq, void *v)
+{
+	struct bpf_iter_meta meta;
+	struct bpf_iter__kmem_cache ctx = {
+		.meta = &meta,
+		.s = v,
+	};
+	struct bpf_prog *prog;
+	int ret = 0;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, false);
+	if (prog)
+		ret = bpf_iter_run_prog(prog, &ctx);
+
+	return ret;
+}
+
+static const struct seq_operations kmem_cache_iter_seq_ops = {
+	.start  = kmem_cache_iter_seq_start,
+	.next   = kmem_cache_iter_seq_next,
+	.stop   = kmem_cache_iter_seq_stop,
+	.show   = kmem_cache_iter_seq_show,
+};
+
+BTF_ID_LIST_GLOBAL_SINGLE(bpf_kmem_cache_btf_id, struct, kmem_cache)
+
+static const struct bpf_iter_seq_info kmem_cache_iter_seq_info = {
+	.seq_ops		= &kmem_cache_iter_seq_ops,
+};
+
+static void bpf_iter_kmem_cache_show_fdinfo(const struct bpf_iter_aux_info *aux,
+					    struct seq_file *seq)
+{
+	seq_puts(seq, "kmem_cache iter\n");
+}
+
+DEFINE_BPF_ITER_FUNC(kmem_cache, struct bpf_iter_meta *meta,
+		     struct kmem_cache *s)
+
+static struct bpf_iter_reg bpf_kmem_cache_reg_info = {
+	.target			= "kmem_cache",
+	.feature		= BPF_ITER_RESCHED,
+	.show_fdinfo		= bpf_iter_kmem_cache_show_fdinfo,
+	.ctx_arg_info_size	= 1,
+	.ctx_arg_info		= {
+		{ offsetof(struct bpf_iter__kmem_cache, s),
+		  PTR_TO_BTF_ID_OR_NULL | PTR_UNTRUSTED },
+	},
+	.seq_info		= &kmem_cache_iter_seq_info,
+};
+
+static int __init bpf_kmem_cache_iter_init(void)
+{
+	bpf_kmem_cache_reg_info.ctx_arg_info[0].btf_id = bpf_kmem_cache_btf_id[0];
+	return bpf_iter_reg_target(&bpf_kmem_cache_reg_info);
+}
+
+late_initcall(bpf_kmem_cache_iter_init);
-- 
2.47.0.rc1.288.g06298d1525-goog




* [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
  2024-10-10 23:25 [PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc Namhyung Kim
  2024-10-10 23:25 ` [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator Namhyung Kim
@ 2024-10-10 23:25 ` Namhyung Kim
  2024-10-11 18:35   ` Alexei Starovoitov
  2024-10-10 23:25 ` [PATCH v5 bpf-next 3/3] selftests/bpf: Add a test for kmem_cache_iter Namhyung Kim
  2024-10-15  2:00 ` [PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc patchwork-bot+netdevbpf
  3 siblings, 1 reply; 18+ messages in thread
From: Namhyung Kim @ 2024-10-10 23:25 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

The bpf_get_kmem_cache() is to get slab cache information from a
virtual address, like virt_to_cache().  If the address is a pointer
to a slab object, it returns a valid kmem_cache pointer; otherwise
NULL is returned.

It doesn't grab a reference count of the kmem_cache, so the caller is
responsible for managing the access.  The returned pointer is marked
as PTR_UNTRUSTED.  And the kfunc has KF_RCU_PROTECTED as the slab
object might be protected by RCU.

The intended use case for now is to symbolize locks in slab objects
from the lock contention tracepoints.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev> (mm/*)
Acked-by: Vlastimil Babka <vbabka@suse.cz> #mm/slab
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 kernel/bpf/helpers.c  |  1 +
 kernel/bpf/verifier.c |  5 +++++
 mm/slab_common.c      | 19 +++++++++++++++++++
 3 files changed, 25 insertions(+)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 4053f279ed4cc7ab..7bfef9378ab21267 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -3090,6 +3090,7 @@ BTF_ID_FLAGS(func, bpf_iter_bits_new, KF_ITER_NEW)
 BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY)
 BTF_ID_FLAGS(func, bpf_copy_from_user_str, KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_get_kmem_cache, KF_RCU_PROTECTED)
 BTF_KFUNCS_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index cfc62e0776bff2c8..f514247ba8ba8a57 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11259,6 +11259,7 @@ enum special_kfunc_type {
 	KF_bpf_preempt_enable,
 	KF_bpf_iter_css_task_new,
 	KF_bpf_session_cookie,
+	KF_bpf_get_kmem_cache,
 };
 
 BTF_SET_START(special_kfunc_set)
@@ -11324,6 +11325,7 @@ BTF_ID(func, bpf_session_cookie)
 #else
 BTF_ID_UNUSED
 #endif
+BTF_ID(func, bpf_get_kmem_cache)
 
 static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
 {
@@ -12834,6 +12836,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			regs[BPF_REG_0].type = PTR_TO_BTF_ID;
 			regs[BPF_REG_0].btf_id = ptr_type_id;
 
+			if (meta.func_id == special_kfunc_list[KF_bpf_get_kmem_cache])
+				regs[BPF_REG_0].type |= PTR_UNTRUSTED;
+
 			if (is_iter_next_kfunc(&meta)) {
 				struct bpf_reg_state *cur_iter;
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 7443244656150325..5484e1cd812f698e 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1322,6 +1322,25 @@ size_t ksize(const void *objp)
 }
 EXPORT_SYMBOL(ksize);
 
+#ifdef CONFIG_BPF_SYSCALL
+#include <linux/btf.h>
+
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc struct kmem_cache *bpf_get_kmem_cache(u64 addr)
+{
+	struct slab *slab;
+
+	if (!virt_addr_valid(addr))
+		return NULL;
+
+	slab = virt_to_slab((void *)(long)addr);
+	return slab ? slab->slab_cache : NULL;
+}
+
+__bpf_kfunc_end_defs();
+#endif /* CONFIG_BPF_SYSCALL */
+
 /* Tracepoints definitions. */
 EXPORT_TRACEPOINT_SYMBOL(kmalloc);
 EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
-- 
2.47.0.rc1.288.g06298d1525-goog




* [PATCH v5 bpf-next 3/3] selftests/bpf: Add a test for kmem_cache_iter
  2024-10-10 23:25 [PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc Namhyung Kim
  2024-10-10 23:25 ` [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator Namhyung Kim
  2024-10-10 23:25 ` [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc Namhyung Kim
@ 2024-10-10 23:25 ` Namhyung Kim
  2024-10-15  2:00 ` [PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc patchwork-bot+netdevbpf
  3 siblings, 0 replies; 18+ messages in thread
From: Namhyung Kim @ 2024-10-10 23:25 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

The test traverses all slab caches using the kmem_cache_iter and saves
the data into the slab_result array map.  It then checks whether the
current task's pointer comes from the "task_struct" slab cache using
bpf_get_kmem_cache().

It also compares the result array with /proc/slabinfo if available
(when CONFIG_SLUB_DEBUG is on).  Note that many of the fields in
slabinfo are transient, so it only compares the name and objsize
fields.
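
For reference, the parsed part of /proc/slabinfo looks roughly like
this (the numbers below are illustrative, not real output):

  slabinfo - version: 2.1
  # name  <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables ...
  task_struct  1032  1060  10240  3  8 : tunables 0 0 0 : slabdata ...

The test reads the name (1st field) and objsize (4th field) from each
line and compares them against the slab_result entries in order.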

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 .../bpf/prog_tests/kmem_cache_iter.c          | 115 ++++++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_iter.h  |   7 ++
 .../selftests/bpf/progs/kmem_cache_iter.c     |  95 +++++++++++++++
 3 files changed, 217 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/kmem_cache_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/kmem_cache_iter.c

diff --git a/tools/testing/selftests/bpf/prog_tests/kmem_cache_iter.c b/tools/testing/selftests/bpf/prog_tests/kmem_cache_iter.c
new file mode 100644
index 0000000000000000..848d8fc9171fae45
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/kmem_cache_iter.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google */
+
+#include <test_progs.h>
+#include <bpf/libbpf.h>
+#include <bpf/btf.h>
+#include "kmem_cache_iter.skel.h"
+
+#define SLAB_NAME_MAX  32
+
+struct kmem_cache_result {
+	char name[SLAB_NAME_MAX];
+	long obj_size;
+};
+
+static void subtest_kmem_cache_iter_check_task_struct(struct kmem_cache_iter *skel)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts,
+		.flags = 0,  /* Run it with the current task */
+	);
+	int prog_fd = bpf_program__fd(skel->progs.check_task_struct);
+
+	/* Get task_struct and check if it's from a slab cache */
+	ASSERT_OK(bpf_prog_test_run_opts(prog_fd, &opts), "prog_test_run");
+
+	/* The BPF program should set 'found' variable */
+	ASSERT_EQ(skel->bss->task_struct_found, 1, "task_struct_found");
+}
+
+static void subtest_kmem_cache_iter_check_slabinfo(struct kmem_cache_iter *skel)
+{
+	FILE *fp;
+	int map_fd;
+	char name[SLAB_NAME_MAX];
+	unsigned long objsize;
+	char rest_of_line[1000];
+	struct kmem_cache_result r;
+	int seen = 0;
+
+	fp = fopen("/proc/slabinfo", "r");
+	if (fp == NULL) {
+		/* CONFIG_SLUB_DEBUG is not enabled */
+		return;
+	}
+
+	map_fd = bpf_map__fd(skel->maps.slab_result);
+
+	/* Ignore first two lines for header */
+	fscanf(fp, "slabinfo - version: %*d.%*d\n");
+	fscanf(fp, "# %*s %*s %*s %*s %*s %*s : %[^\n]\n", rest_of_line);
+
+	/* Compare name and objsize only - others can change frequently */
+	while (fscanf(fp, "%s %*u %*u %lu %*u %*u : %[^\n]\n",
+		      name, &objsize, rest_of_line) == 3) {
+		int ret = bpf_map_lookup_elem(map_fd, &seen, &r);
+
+		if (!ASSERT_OK(ret, "kmem_cache_lookup"))
+			break;
+
+		ASSERT_STREQ(r.name, name, "kmem_cache_name");
+		ASSERT_EQ(r.obj_size, objsize, "kmem_cache_objsize");
+
+		seen++;
+	}
+
+	ASSERT_EQ(skel->bss->kmem_cache_seen, seen, "kmem_cache_seen_eq");
+
+	fclose(fp);
+}
+
+void test_kmem_cache_iter(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
+	struct kmem_cache_iter *skel = NULL;
+	union bpf_iter_link_info linfo = {};
+	struct bpf_link *link;
+	char buf[256];
+	int iter_fd;
+
+	skel = kmem_cache_iter__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "kmem_cache_iter__open_and_load"))
+		return;
+
+	opts.link_info = &linfo;
+	opts.link_info_len = sizeof(linfo);
+
+	link = bpf_program__attach_iter(skel->progs.slab_info_collector, &opts);
+	if (!ASSERT_OK_PTR(link, "attach_iter"))
+		goto destroy;
+
+	iter_fd = bpf_iter_create(bpf_link__fd(link));
+	if (!ASSERT_GE(iter_fd, 0, "iter_create"))
+		goto free_link;
+
+	memset(buf, 0, sizeof(buf));
+	while (read(iter_fd, buf, sizeof(buf)) > 0) {
+		/* Read out all contents */
+		printf("%s", buf);
+	}
+
+	/* Next reads should return 0 */
+	ASSERT_EQ(read(iter_fd, buf, sizeof(buf)), 0, "read");
+
+	if (test__start_subtest("check_task_struct"))
+		subtest_kmem_cache_iter_check_task_struct(skel);
+	if (test__start_subtest("check_slabinfo"))
+		subtest_kmem_cache_iter_check_slabinfo(skel);
+
+	close(iter_fd);
+
+free_link:
+	bpf_link__destroy(link);
+destroy:
+	kmem_cache_iter__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter.h b/tools/testing/selftests/bpf/progs/bpf_iter.h
index c41ee80533ca219a..3305dc3a74b32481 100644
--- a/tools/testing/selftests/bpf/progs/bpf_iter.h
+++ b/tools/testing/selftests/bpf/progs/bpf_iter.h
@@ -24,6 +24,7 @@
 #define BTF_F_PTR_RAW BTF_F_PTR_RAW___not_used
 #define BTF_F_ZERO BTF_F_ZERO___not_used
 #define bpf_iter__ksym bpf_iter__ksym___not_used
+#define bpf_iter__kmem_cache bpf_iter__kmem_cache___not_used
 #include "vmlinux.h"
 #undef bpf_iter_meta
 #undef bpf_iter__bpf_map
@@ -48,6 +49,7 @@
 #undef BTF_F_PTR_RAW
 #undef BTF_F_ZERO
 #undef bpf_iter__ksym
+#undef bpf_iter__kmem_cache
 
 struct bpf_iter_meta {
 	struct seq_file *seq;
@@ -165,3 +167,8 @@ struct bpf_iter__ksym {
 	struct bpf_iter_meta *meta;
 	struct kallsym_iter *ksym;
 };
+
+struct bpf_iter__kmem_cache {
+	struct bpf_iter_meta *meta;
+	struct kmem_cache *s;
+} __attribute__((preserve_access_index));
diff --git a/tools/testing/selftests/bpf/progs/kmem_cache_iter.c b/tools/testing/selftests/bpf/progs/kmem_cache_iter.c
new file mode 100644
index 0000000000000000..1cff8c7772683caf
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/kmem_cache_iter.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google */
+
+#include "bpf_iter.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#define SLAB_NAME_MAX  32
+
+struct kmem_cache_result {
+	char name[SLAB_NAME_MAX];
+	long obj_size;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(key_size, sizeof(void *));
+	__uint(value_size, SLAB_NAME_MAX);
+	__uint(max_entries, 1);
+} slab_hash SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(int));
+	__uint(value_size, sizeof(struct kmem_cache_result));
+	__uint(max_entries, 1024);
+} slab_result SEC(".maps");
+
+extern void bpf_rcu_read_lock(void) __ksym;
+extern void bpf_rcu_read_unlock(void) __ksym;
+extern struct kmem_cache *bpf_get_kmem_cache(u64 addr) __ksym;
+
+/* Result, will be checked by userspace */
+int task_struct_found;
+int kmem_cache_seen;
+
+SEC("iter/kmem_cache")
+int slab_info_collector(struct bpf_iter__kmem_cache *ctx)
+{
+	struct seq_file *seq = ctx->meta->seq;
+	struct kmem_cache *s = ctx->s;
+	struct kmem_cache_result *r;
+	int idx;
+
+	if (s) {
+		/* Make sure the slab iterator implements the seq
+		 * interface properly; it's also useful for debugging.
+		 */
+		BPF_SEQ_PRINTF(seq, "%s: %u\n", s->name, s->size);
+
+		idx = kmem_cache_seen;
+		r = bpf_map_lookup_elem(&slab_result, &idx);
+		if (r == NULL)
+			return 0;
+
+		kmem_cache_seen++;
+
+		/* Save name and size to match /proc/slabinfo */
+		bpf_probe_read_kernel_str(r->name, sizeof(r->name), s->name);
+		r->obj_size = s->size;
+
+		if (!bpf_strncmp(r->name, 11, "task_struct"))
+			bpf_map_update_elem(&slab_hash, &s, r->name, BPF_NOEXIST);
+	}
+
+	return 0;
+}
+
+SEC("raw_tp/bpf_test_finish")
+int BPF_PROG(check_task_struct)
+{
+	u64 curr = bpf_get_current_task();
+	struct kmem_cache *s;
+	char *name;
+
+	bpf_rcu_read_lock();
+
+	s = bpf_get_kmem_cache(curr);
+	if (s == NULL) {
+		task_struct_found = -1;
+		bpf_rcu_read_unlock();
+		return 0;
+	}
+
+	name = bpf_map_lookup_elem(&slab_hash, &s);
+	if (name && !bpf_strncmp(name, 11, "task_struct"))
+		task_struct_found = 1;
+	else
+		task_struct_found = -2;
+
+	bpf_rcu_read_unlock();
+	return 0;
+}
-- 
2.47.0.rc1.288.g06298d1525-goog




* Re: [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator
  2024-10-10 23:25 ` [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator Namhyung Kim
@ 2024-10-11 18:33   ` Alexei Starovoitov
  2024-10-11 19:13     ` Namhyung Kim
  2024-10-11 18:44   ` Alexei Starovoitov
  2024-10-14 15:25   ` Vlastimil Babka
  2 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2024-10-11 18:33 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> The new "kmem_cache" iterator will traverse the list of slab caches
> and call attached BPF programs for each entry.  It should check the
> argument (ctx.s) if it's NULL before using it.
>
> Now the iteration grabs the slab_mutex only if it traverse the list and

traverses

> releases the mutex when it runs the BPF program.  The kmem_cache entry
> is protected by a refcount during the execution.
>
> It includes the internal "mm/slab.h" header to access kmem_cache,
> slab_caches and slab_mutex.  Hope it's ok to mm folks.

What was the reason you dropped Vlastimil's and Roman's acks
from this patch while keeping them in patch 2 ?

Folks pls Ack again if it looks ok.

I'm ready to apply, but would like the acks first.

Also I'd like to remove the above paragraph
about mm/slab.h from the commit log.
It was good to ask during v1, but looks odd at v5.



* Re: [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
  2024-10-10 23:25 ` [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc Namhyung Kim
@ 2024-10-11 18:35   ` Alexei Starovoitov
  2024-10-11 19:14     ` Namhyung Kim
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2024-10-11 18:35 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> The bpf_get_kmem_cache() is to get a slab cache information from a
> virtual address like virt_to_cache().  If the address is a pointer
> to a slab object, it'd return a valid kmem_cache pointer, otherwise
> NULL is returned.
>
> It doesn't grab a reference count of the kmem_cache so the caller is
> responsible to manage the access.  The returned point is marked as
> PTR_UNTRUSTED.  And the kfunc has KF_RCU_PROTECTED as the slab object
> might be protected by RCU.

...
> +BTF_ID_FLAGS(func, bpf_get_kmem_cache, KF_RCU_PROTECTED)

This flag is unnecessary. PTR_UNTRUSTED can point to absolutely any memory.
In this case it likely points to a valid kmem_cache, but
the verifier will guard all accesses with probe_read anyway.

I can remove this flag while applying.
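
For reference, what that guarding means on the program side, roughly:
a load like

	size = s->size;

on the untrusted pointer is rewritten by the verifier into the
equivalent of bpf_probe_read_kernel(&size, sizeof(size), &s->size),
so a bogus pointer results in a failed (zeroed) read instead of a
fault.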



* Re: [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator
  2024-10-10 23:25 ` [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator Namhyung Kim
  2024-10-11 18:33   ` Alexei Starovoitov
@ 2024-10-11 18:44   ` Alexei Starovoitov
  2024-10-11 19:41     ` Andrii Nakryiko
  2024-10-14 15:25   ` Vlastimil Babka
  2 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2024-10-11 18:44 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> +struct bpf_iter__kmem_cache {
> +       __bpf_md_ptr(struct bpf_iter_meta *, meta);
> +       __bpf_md_ptr(struct kmem_cache *, s);
> +};

Just noticed this.
Not your fault. You're copy pasting from bpf_iter__*.
It looks like tech debt.

Andrii, Song,

do you remember why all iters are using this?
__bpf_md_ptr() wrap was necessary in uapi/bpf.h,
but this is kernel iters that go into vmlinux.h
It should be fine to remove them all and
progs wouldn't need to do the ugly dance of:

#define bpf_iter__ksym bpf_iter__ksym___not_used
#include "vmlinux.h"
#undef bpf_iter__ksym



* Re: [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator
  2024-10-11 18:33   ` Alexei Starovoitov
@ 2024-10-11 19:13     ` Namhyung Kim
  0 siblings, 0 replies; 18+ messages in thread
From: Namhyung Kim @ 2024-10-11 19:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

Hello Alexei,

On Fri, Oct 11, 2024 at 11:33:31AM -0700, Alexei Starovoitov wrote:
> On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > The new "kmem_cache" iterator will traverse the list of slab caches
> > and call attached BPF programs for each entry.  It should check the
> > argument (ctx.s) if it's NULL before using it.
> >
> > Now the iteration grabs the slab_mutex only if it traverse the list and
> 
> traverses
> 
> > releases the mutex when it runs the BPF program.  The kmem_cache entry
> > is protected by a refcount during the execution.
> >
> > It includes the internal "mm/slab.h" header to access kmem_cache,
> > slab_caches and slab_mutex.  Hope it's ok to mm folks.
> 
> What was the reason you dropped Vlastimil's and Roman's acks
> from this patch while keeping them in patch 2 ?

I wanted to make sure the slab maintainers agree with the refcounting
and locking logic changes.  But I forgot to add back Vlastimil's
Acked-by from v4, which is the same in this regard.

> 
> Folks pls Ack again if it looks ok.
> 
> I'm ready to apply, but would like the acks first.
> 
> Also I'd like to remove the above paragraph
> from mm/slab.h from the commit log.
> It was good to ask during v1, but looks odd at v5.

Sure, feel free to make any changes.

Thanks,
Namhyung




* Re: [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
  2024-10-11 18:35   ` Alexei Starovoitov
@ 2024-10-11 19:14     ` Namhyung Kim
  2024-10-14 18:13       ` Namhyung Kim
  0 siblings, 1 reply; 18+ messages in thread
From: Namhyung Kim @ 2024-10-11 19:14 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

On Fri, Oct 11, 2024 at 11:35:27AM -0700, Alexei Starovoitov wrote:
> On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > The bpf_get_kmem_cache() is to get a slab cache information from a
> > virtual address like virt_to_cache().  If the address is a pointer
> > to a slab object, it'd return a valid kmem_cache pointer, otherwise
> > NULL is returned.
> >
> > It doesn't grab a reference count of the kmem_cache so the caller is
> > responsible to manage the access.  The returned point is marked as
> > PTR_UNTRUSTED.  And the kfunc has KF_RCU_PROTECTED as the slab object
> > might be protected by RCU.
> 
> ...
> > +BTF_ID_FLAGS(func, bpf_get_kmem_cache, KF_RCU_PROTECTED)
> 
> This flag is unnecessary. PTR_UNTRUSTED can point to absolutely any memory.
> In this case it likely points to a valid kmem_cache, but
> the verifier will guard all accesses with probe_read anyway.
> 
> I can remove this flag while applying.

Ok, I'd be happy if you would remove it.

Thanks,
Namhyung




* Re: [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator
  2024-10-11 18:44   ` Alexei Starovoitov
@ 2024-10-11 19:41     ` Andrii Nakryiko
  2024-10-11 19:43       ` Andrii Nakryiko
  0 siblings, 1 reply; 18+ messages in thread
From: Andrii Nakryiko @ 2024-10-11 19:41 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Namhyung Kim, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, LKML, bpf, Andrew Morton, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo,
	Kees Cook, Paul E. McKenney

On Fri, Oct 11, 2024 at 11:44 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > +struct bpf_iter__kmem_cache {
> > +       __bpf_md_ptr(struct bpf_iter_meta *, meta);
> > +       __bpf_md_ptr(struct kmem_cache *, s);
> > +};
>
> Just noticed this.
> Not your fault. You're copy pasting from bpf_iter__*.
> It looks like tech debt.
>
> Andrii, Song,
>
> do you remember why all iters are using this?

I don't *know*, but I suspect we are doing this because of 32-bit host
architecture. BPF-side is always 64-bit, so to make memory layout
inside the kernel and in BPF programs compatible we have to do this
for pointers, no?
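
For reference, the macro in include/uapi/linux/bpf.h pads the pointer
to 64 bits so the layout matches between a 32-bit kernel and the
64-bit BPF side:

#define __bpf_md_ptr(type, name)	\
union {					\
	type name;			\
	__u64 :64;			\
} __attribute__((aligned(8)))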

> __bpf_md_ptr() wrap was necessary in uapi/bpf.h,
> but this is kernel iters that go into vmlinux.h
> It should be fine to remove them all and
> progs wouldn't need to do the ugly dance of:
>
> #define bpf_iter__ksym bpf_iter__ksym___not_used
> #include "vmlinux.h"
> #undef bpf_iter__ksym

I don't think __bpf_md_ptr is why we are doing this ___not_used dance.
At some point we probably didn't want to rely on having the very
latest vmlinux.h available in BPF selftests, so we chose to define
local versions of all relevant context types.

I think we can drop all that ___not_used dance regardless (and remove
local definitions in progs/bpf_iter.h).



* Re: [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator
  2024-10-11 19:41     ` Andrii Nakryiko
@ 2024-10-11 19:43       ` Andrii Nakryiko
  0 siblings, 0 replies; 18+ messages in thread
From: Andrii Nakryiko @ 2024-10-11 19:43 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Namhyung Kim, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, LKML, bpf, Andrew Morton, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo,
	Kees Cook, Paul E. McKenney

On Fri, Oct 11, 2024 at 12:41 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Oct 11, 2024 at 11:44 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > +struct bpf_iter__kmem_cache {
> > > +       __bpf_md_ptr(struct bpf_iter_meta *, meta);
> > > +       __bpf_md_ptr(struct kmem_cache *, s);
> > > +};

BTW, do we want/need to define an open-coded iterator version of this,
so that this iteration can be done from other BPF programs? Seems like
it has to be a sleepable BPF program, but that's probably fine?

> >
> > Just noticed this.
> > Not your fault. You're copy pasting from bpf_iter__*.
> > It looks like tech debt.
> >
> > Andrii, Song,
> >
> > do you remember why all iters are using this?
>
> I don't *know*, but I suspect we are doing this because of 32-bit host
> architecture. BPF-side is always 64-bit, so to make memory layout
> inside the kernel and in BPF programs compatible we have to do this
> for pointers, no?
>
> > __bpf_md_ptr() wrap was necessary in uapi/bpf.h,
> > but this is kernel iters that go into vmlinux.h
> > It should be fine to remove them all and
> > progs wouldn't need to do the ugly dance of:
> >
> > #define bpf_iter__ksym bpf_iter__ksym___not_used
> > #include "vmlinux.h"
> > #undef bpf_iter__ksym
>
> I don't think __bpf_md_ptr is why we are doing this ___not_used dance.
> At some point we probably didn't want to rely on having the very
> latest vmlinux.h available in BPF selftests, so we chose to define
> local versions of all relevant context types.
>
> I think we can drop all that ___not_used dance regardless (and remove
> local definitions in progs/bpf_iter.h).



* Re: [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator
  2024-10-10 23:25 ` [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator Namhyung Kim
  2024-10-11 18:33   ` Alexei Starovoitov
  2024-10-11 18:44   ` Alexei Starovoitov
@ 2024-10-14 15:25   ` Vlastimil Babka
  2 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2024-10-14 15:25 UTC (permalink / raw)
  To: Namhyung Kim, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko
  Cc: Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo,
	linux-mm, Arnaldo Carvalho de Melo, Kees Cook, Paul E. McKenney

On 10/11/24 01:25, Namhyung Kim wrote:
> The new "kmem_cache" iterator will traverse the list of slab caches
> and call attached BPF programs for each entry.  It should check the
> argument (ctx.s) if it's NULL before using it.
> 
> Now the iteration grabs the slab_mutex only if it traverse the list and
> releases the mutex when it runs the BPF program.  The kmem_cache entry
> is protected by a refcount during the execution.
> 
> It includes the internal "mm/slab.h" header to access kmem_cache,
> slab_caches and slab_mutex.  Hope it's ok to mm folks.

Yeah this paragraph can be dropped.

> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Acked-by: Vlastimil Babka <vbabka@suse.cz> #slab




* Re: [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
  2024-10-11 19:14     ` Namhyung Kim
@ 2024-10-14 18:13       ` Namhyung Kim
  2024-10-15  1:50         ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Namhyung Kim @ 2024-10-14 18:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

Hi Alexei,

On Fri, Oct 11, 2024 at 12:14:14PM -0700, Namhyung Kim wrote:
> On Fri, Oct 11, 2024 at 11:35:27AM -0700, Alexei Starovoitov wrote:
> > On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > The bpf_get_kmem_cache() is to get a slab cache information from a
> > > virtual address like virt_to_cache().  If the address is a pointer
> > > to a slab object, it'd return a valid kmem_cache pointer, otherwise
> > > NULL is returned.
> > >
> > > It doesn't grab a reference count of the kmem_cache so the caller is
> > > responsible to manage the access.  The returned point is marked as
> > > PTR_UNTRUSTED.  And the kfunc has KF_RCU_PROTECTED as the slab object
> > > might be protected by RCU.
> > 
> > ...
> > > +BTF_ID_FLAGS(func, bpf_get_kmem_cache, KF_RCU_PROTECTED)
> > 
> > This flag is unnecessary. PTR_UNTRUSTED can point to absolutely any memory.
> > In this case it likely points to a valid kmem_cache, but
> > the verifier will guard all accesses with probe_read anyway.
> > 
> > I can remove this flag while applying.
> 
> Ok, I'd be happy if you would remove it.

You will need to update the bpf_rcu_read_lock/unlock() in the test code
(patch 3).  I can send v6 with that and Vlastimil's Ack if you want.

Thanks,
Namhyung




* Re: [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
  2024-10-14 18:13       ` Namhyung Kim
@ 2024-10-15  1:50         ` Alexei Starovoitov
  2024-10-15 18:19           ` Namhyung Kim
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2024-10-15  1:50 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

On Mon, Oct 14, 2024 at 11:13 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Alexei,
>
> On Fri, Oct 11, 2024 at 12:14:14PM -0700, Namhyung Kim wrote:
> > On Fri, Oct 11, 2024 at 11:35:27AM -0700, Alexei Starovoitov wrote:
> > > On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > The bpf_get_kmem_cache() is to get a slab cache information from a
> > > > virtual address like virt_to_cache().  If the address is a pointer
> > > > to a slab object, it'd return a valid kmem_cache pointer, otherwise
> > > > NULL is returned.
> > > >
> > > > It doesn't grab a reference count of the kmem_cache so the caller is
> > > > responsible to manage the access.  The returned point is marked as
> > > > PTR_UNTRUSTED.  And the kfunc has KF_RCU_PROTECTED as the slab object
> > > > might be protected by RCU.
> > >
> > > ...
> > > > +BTF_ID_FLAGS(func, bpf_get_kmem_cache, KF_RCU_PROTECTED)
> > >
> > > This flag is unnecessary. PTR_UNTRUSTED can point to absolutely any memory.
> > > In this case it likely points to a valid kmem_cache, but
> > > the verifier will guard all accesses with probe_read anyway.
> > >
> > > I can remove this flag while applying.
> >
> > Ok, I'd be happy if you would remove it.
>
> You will need to update the bpf_rcu_read_lock/unlock() in the test code
> (patch 3).  I can send v6 with that and Vlastimil's Ack if you want.

Fixed all that while applying.

Could you please follow up with an open-coded iterator version
of the same slab iterator ?
So that progs can iterate slabs as a normal for/while loop ?



* Re: [PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc
  2024-10-10 23:25 [PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc Namhyung Kim
                   ` (2 preceding siblings ...)
  2024-10-10 23:25 ` [PATCH v5 bpf-next 3/3] selftests/bpf: Add a test for kmem_cache_iter Namhyung Kim
@ 2024-10-15  2:00 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 18+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-10-15  2:00 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: ast, daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, linux-kernel, bpf,
	akpm, cl, penberg, rientjes, iamjoonsoo.kim, vbabka,
	roman.gushchin, 42.hyeyoo, linux-mm, acme, kees, paulmck

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Thu, 10 Oct 2024 16:25:02 -0700 you wrote:
> Hello,
> 
> I'm proposing a new iterator and a kfunc for the slab memory allocator
> to get information of each kmem_cache like in /proc/slabinfo or
> /sys/kernel/slab in more flexible way.
> 
> v5 changes)
> 
> [...]

Here is the summary with links:
  - [v5,bpf-next,1/3] bpf: Add kmem_cache iterator
    https://git.kernel.org/bpf/bpf-next/c/4971266e1595
  - [v5,bpf-next,2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
    https://git.kernel.org/bpf/bpf-next/c/04b069ff0181
  - [v5,bpf-next,3/3] selftests/bpf: Add a test for kmem_cache_iter
    https://git.kernel.org/bpf/bpf-next/c/73bb7e74d181

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html





* Re: [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
  2024-10-15  1:50         ` Alexei Starovoitov
@ 2024-10-15 18:19           ` Namhyung Kim
  2024-10-15 18:25             ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Namhyung Kim @ 2024-10-15 18:19 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

On Mon, Oct 14, 2024 at 06:50:49PM -0700, Alexei Starovoitov wrote:
> On Mon, Oct 14, 2024 at 11:13 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Alexei,
> >
> > On Fri, Oct 11, 2024 at 12:14:14PM -0700, Namhyung Kim wrote:
> > > On Fri, Oct 11, 2024 at 11:35:27AM -0700, Alexei Starovoitov wrote:
> > > > On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > >
> > > > > The bpf_get_kmem_cache() is to get a slab cache information from a
> > > > > virtual address like virt_to_cache().  If the address is a pointer
> > > > > to a slab object, it'd return a valid kmem_cache pointer, otherwise
> > > > > NULL is returned.
> > > > >
> > > > > It doesn't grab a reference count of the kmem_cache so the caller is
> > > > > responsible to manage the access.  The returned point is marked as
> > > > > PTR_UNTRUSTED.  And the kfunc has KF_RCU_PROTECTED as the slab object
> > > > > might be protected by RCU.
> > > >
> > > > ...
> > > > > +BTF_ID_FLAGS(func, bpf_get_kmem_cache, KF_RCU_PROTECTED)
> > > >
> > > > This flag is unnecessary. PTR_UNTRUSTED can point to absolutely any memory.
> > > > In this case it likely points to a valid kmem_cache, but
> > > > the verifier will guard all accesses with probe_read anyway.
> > > >
> > > > I can remove this flag while applying.
> > >
> > > Ok, I'd be happy if you would remove it.
> >
> > You will need to update the bpf_rcu_read_lock/unlock() in the test code
> > (patch 3).  I can send v6 with that and Vlastimil's Ack if you want.
> 
> Fixed all that while applying.
> 
> Could you please follow up with an open-coded iterator version
> of the same slab iterator ?
> So that progs can iterate slabs as a normal for/while loop ?

I'm not sure I'm following.  Do you want a new test program to iterate
kmem_caches by reading list pointers manually?  How can I grab the
slab_mutex then?

Thanks,
Namhyung




* Re: [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
  2024-10-15 18:19           ` Namhyung Kim
@ 2024-10-15 18:25             ` Alexei Starovoitov
  2024-10-15 20:54               ` Namhyung Kim
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2024-10-15 18:25 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

On Tue, Oct 15, 2024 at 11:20 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Mon, Oct 14, 2024 at 06:50:49PM -0700, Alexei Starovoitov wrote:
> > On Mon, Oct 14, 2024 at 11:13 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > Hi Alexei,
> > >
> > > On Fri, Oct 11, 2024 at 12:14:14PM -0700, Namhyung Kim wrote:
> > > > On Fri, Oct 11, 2024 at 11:35:27AM -0700, Alexei Starovoitov wrote:
> > > > > On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > > >
> > > > > > The bpf_get_kmem_cache() is to get a slab cache information from a
> > > > > > virtual address like virt_to_cache().  If the address is a pointer
> > > > > > to a slab object, it'd return a valid kmem_cache pointer, otherwise
> > > > > > NULL is returned.
> > > > > >
> > > > > > It doesn't grab a reference count of the kmem_cache so the caller is
> > > > > > responsible to manage the access.  The returned point is marked as
> > > > > > PTR_UNTRUSTED.  And the kfunc has KF_RCU_PROTECTED as the slab object
> > > > > > might be protected by RCU.
> > > > >
> > > > > ...
> > > > > > +BTF_ID_FLAGS(func, bpf_get_kmem_cache, KF_RCU_PROTECTED)
> > > > >
> > > > > This flag is unnecessary. PTR_UNTRUSTED can point to absolutely any memory.
> > > > > In this case it likely points to a valid kmem_cache, but
> > > > > the verifier will guard all accesses with probe_read anyway.
> > > > >
> > > > > I can remove this flag while applying.
> > > >
> > > > Ok, I'd be happy if you would remove it.
> > >
> > > You will need to update the bpf_rcu_read_lock/unlock() in the test code
> > > (patch 3).  I can send v6 with that and Vlastimil's Ack if you want.
> >
> > Fixed all that while applying.
> >
> > Could you please follow up with an open-coded iterator version
> > of the same slab iterator ?
> > So that progs can iterate slabs as a normal for/while loop ?
>
> I'm not sure I'm following.  Do you want a new test program to iterate
> kmem_caches by reading list pointers manually?  How can I grab the
> slab_mutex then?

No.
See bpf_iter_task_new/_next/_destroy kfuncs and
commit c68a78ffe2cb ("bpf: Introduce task open coded iterator kfuncs").
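
I.e. something on the program side like the sketch below (hypothetical
kfunc names, mirroring the task iterator):

	struct bpf_iter_kmem_cache it;
	struct kmem_cache *s;

	bpf_iter_kmem_cache_new(&it);
	while ((s = bpf_iter_kmem_cache_next(&it)) != NULL) {
		/* inspect s */
	}
	bpf_iter_kmem_cache_destroy(&it);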



* Re: [PATCH v5 bpf-next 2/3] mm/bpf: Add bpf_get_kmem_cache() kfunc
  2024-10-15 18:25             ` Alexei Starovoitov
@ 2024-10-15 20:54               ` Namhyung Kim
  0 siblings, 0 replies; 18+ messages in thread
From: Namhyung Kim @ 2024-10-15 20:54 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	LKML, bpf, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Arnaldo Carvalho de Melo, Kees Cook,
	Paul E. McKenney

On Tue, Oct 15, 2024 at 11:25:11AM -0700, Alexei Starovoitov wrote:
> On Tue, Oct 15, 2024 at 11:20 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Mon, Oct 14, 2024 at 06:50:49PM -0700, Alexei Starovoitov wrote:
> > > On Mon, Oct 14, 2024 at 11:13 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > Hi Alexei,
> > > >
> > > > On Fri, Oct 11, 2024 at 12:14:14PM -0700, Namhyung Kim wrote:
> > > > > On Fri, Oct 11, 2024 at 11:35:27AM -0700, Alexei Starovoitov wrote:
> > > > > > On Thu, Oct 10, 2024 at 4:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > > > >
> > > > > > > The bpf_get_kmem_cache() is to get a slab cache information from a
> > > > > > > virtual address like virt_to_cache().  If the address is a pointer
> > > > > > > to a slab object, it'd return a valid kmem_cache pointer, otherwise
> > > > > > > NULL is returned.
> > > > > > >
> > > > > > > It doesn't grab a reference count of the kmem_cache so the caller is
> > > > > > > responsible to manage the access.  The returned point is marked as
> > > > > > > PTR_UNTRUSTED.  And the kfunc has KF_RCU_PROTECTED as the slab object
> > > > > > > might be protected by RCU.
> > > > > >
> > > > > > ...
> > > > > > > +BTF_ID_FLAGS(func, bpf_get_kmem_cache, KF_RCU_PROTECTED)
> > > > > >
> > > > > > This flag is unnecessary. PTR_UNTRUSTED can point to absolutely any memory.
> > > > > > In this case it likely points to a valid kmem_cache, but
> > > > > > the verifier will guard all accesses with probe_read anyway.
> > > > > >
> > > > > > I can remove this flag while applying.
> > > > >
> > > > > Ok, I'd be happy if you would remove it.
> > > >
> > > > You will need to update the bpf_rcu_read_lock/unlock() in the test code
> > > > (patch 3).  I can send v6 with that and Vlastimil's Ack if you want.
> > >
> > > Fixed all that while applying.
> > >
> > > Could you please follow up with an open-coded iterator version
> > > of the same slab iterator ?
> > > So that progs can iterate slabs as a normal for/while loop ?
> >
> > I'm not sure I'm following.  Do you want a new test program to iterate
> > kmem_caches by reading list pointers manually?  How can I grab the
> > slab_mutex then?
> 
> No.
> See bpf_iter_task_new/_next/_destroy kfuncs and
> commit c68a78ffe2cb ("bpf: Introduce task open coded iterator kfuncs").

Oh, ok.  Thanks for the pointer, I'll take a look and add the
open-coded version.

Thanks,
Namhyung



