From: Namhyung Kim <namhyung@kernel.org>
To: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
	Yonghong Song <yonghong.song@linux.dev>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	bpf@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	linux-mm@kvack.org, Arnaldo Carvalho de Melo <acme@kernel.org>,
	Kees Cook <kees@kernel.org>
Subject: Re: [PATCH bpf-next 1/2] bpf: Add open coded version of kmem_cache iterator
Date: Tue, 22 Oct 2024 10:47:57 -0700
Message-ID: <ZxflTe2O2iktiv8G@google.com>
In-Reply-To: <b3655d46-5c42-407e-adc1-b17865432e45@linux.dev>

Hello,

On Fri, Oct 18, 2024 at 11:22:00AM -0700, Martin KaFai Lau wrote:
> On 10/17/24 1:06 AM, Namhyung Kim wrote:
> > Add a new open coded iterator for kmem_cache which can be called from a
> > BPF program like below.  It doesn't take any argument and traverses all
> > kmem_cache entries.
> > 
> >    struct kmem_cache *pos;
> > 
> >    bpf_for_each(kmem_cache, pos) {
> >        ...
> >    }
> > 
> > As it needs to grab slab_mutex, it should be called from sleepable BPF
> > programs only.
> > 
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >   kernel/bpf/helpers.c         |  3 ++
> >   kernel/bpf/kmem_cache_iter.c | 87 ++++++++++++++++++++++++++++++++++++
> >   2 files changed, 90 insertions(+)
> > 
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 073e6f04f4d765ff..d1dfa4f335577914 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -3111,6 +3111,9 @@ BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL)
> >   BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY)
> >   BTF_ID_FLAGS(func, bpf_copy_from_user_str, KF_SLEEPABLE)
> >   BTF_ID_FLAGS(func, bpf_get_kmem_cache)
> > +BTF_ID_FLAGS(func, bpf_iter_kmem_cache_new, KF_ITER_NEW | KF_SLEEPABLE)
> > +BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLEEPABLE)
> > +BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
> >   BTF_KFUNCS_END(common_btf_ids)
> >   static const struct btf_kfunc_id_set common_kfunc_set = {
> > diff --git a/kernel/bpf/kmem_cache_iter.c b/kernel/bpf/kmem_cache_iter.c
> > index ebc101d7da51b57c..31ddaf452b20a458 100644
> > --- a/kernel/bpf/kmem_cache_iter.c
> > +++ b/kernel/bpf/kmem_cache_iter.c
> > @@ -145,6 +145,93 @@ static const struct bpf_iter_seq_info kmem_cache_iter_seq_info = {
> >   	.seq_ops		= &kmem_cache_iter_seq_ops,
> >   };
> > +/* open-coded version */
> > +struct bpf_iter_kmem_cache {
> > +	__u64 __opaque[1];
> > +} __attribute__((aligned(8)));
> > +
> > +struct bpf_iter_kmem_cache_kern {
> > +	struct kmem_cache *pos;
> > +} __attribute__((aligned(8)));
> > +
> > +__bpf_kfunc_start_defs();
> > +
> > +__bpf_kfunc int bpf_iter_kmem_cache_new(struct bpf_iter_kmem_cache *it)
> > +{
> > +	struct bpf_iter_kmem_cache_kern *kit = (void *)it;
> > +
> > +	BUILD_BUG_ON(sizeof(*kit) > sizeof(*it));
> > +	BUILD_BUG_ON(__alignof__(*kit) != __alignof__(*it));
> > +
> > +	kit->pos = NULL;
> > +	return 0;
> > +}
> > +
> > +__bpf_kfunc struct kmem_cache *bpf_iter_kmem_cache_next(struct bpf_iter_kmem_cache *it)
> > +{
> > +	struct bpf_iter_kmem_cache_kern *kit = (void *)it;
> > +	struct kmem_cache *prev = kit->pos;
> > +	struct kmem_cache *next;
> > +	bool destroy = false;
> > +
> > +	mutex_lock(&slab_mutex);
> 
> I think taking mutex_lock here should be fine since sleepable tracing prog
> should be limited to the error injection whitelist. Those functions should
> not have held the mutex afaict.
> 
> > +
> > +	if (list_empty(&slab_caches)) {
> > +		mutex_unlock(&slab_mutex);
> > +		return NULL;
> > +	}
> > +
> > +	if (prev == NULL)
> > +		next = list_first_entry(&slab_caches, struct kmem_cache, list);
> > +	else if (list_last_entry(&slab_caches, struct kmem_cache, list) == prev)
> > +		next = NULL;
> 
> At the last entry, next is NULL.
> 
> > +	else
> > +		next = list_next_entry(prev, list);
> > +
> > +	/* boot_caches have negative refcount, don't touch them */
> > +	if (next && next->refcount > 0)
> > +		next->refcount++;
> > +
> > +	/* Skip kmem_cache_destroy() for active entries */
> > +	if (prev && prev->refcount > 1)
> > +		prev->refcount--;
> > +	else if (prev && prev->refcount == 1)
> > +		destroy = true;
> > +
> > +	mutex_unlock(&slab_mutex);
> > +
> > +	if (destroy)
> > +		kmem_cache_destroy(prev);
> > +
> > +	kit->pos = next;
> 
> so kit->pos will be NULL also. Does it mean the bpf prog will be able to
> call bpf_iter_kmem_cache_next() again and re-loop from the beginning of the
> slab_caches list?

Right, I'll mark the start pos differently to prevent that.
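
For illustration, here is a minimal user-space sketch of one way to mark
the start position differently. It is not the kernel code: struct cache,
cache_list, POS_START and the iter_* functions are all hypothetical
stand-ins, and locking and refcounting are left out. The assumption is
that a sentinel pointer value means "not started yet", so that NULL can
mean "finished" only and a further next() call cannot restart the walk.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct kmem_cache; only the link is modeled. */
struct cache {
	const char *name;
	struct cache *next;	/* NULL terminates the list */
};

/* Hypothetical sentinel: iteration has not started yet. */
#define POS_START ((struct cache *)1L)

struct cache_iter {
	struct cache *pos;	/* POS_START, a list entry, or NULL when done */
};

/* Hypothetical stand-in for the global slab_caches list. */
static struct cache *cache_list;

static void iter_new(struct cache_iter *it)
{
	it->pos = POS_START;
}

static struct cache *iter_next(struct cache_iter *it)
{
	struct cache *prev = it->pos;
	struct cache *next;

	/* Already exhausted: stay exhausted instead of re-looping. */
	if (prev == NULL)
		return NULL;

	if (prev == POS_START)
		next = cache_list;	/* first call: start at the head */
	else
		next = prev->next;	/* later calls: advance by one */

	it->pos = next;
	return next;
}
```

With this split, the first iter_next() call is the only one that reads
the list head. Once the end is reached, pos stays NULL and every later
call returns NULL.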

Thanks,
Namhyung

> 
> > +	return next;
> > +}
> > +
> > +__bpf_kfunc void bpf_iter_kmem_cache_destroy(struct bpf_iter_kmem_cache *it)
> > +{
> > +	struct bpf_iter_kmem_cache_kern *kit = (void *)it;
> > +	struct kmem_cache *s = kit->pos;
> > +	bool destroy = false;
> > +
> > +	if (s == NULL)
> > +		return;
> > +
> > +	mutex_lock(&slab_mutex);
> > +
> > +	/* Skip kmem_cache_destroy() for active entries */
> > +	if (s->refcount > 1)
> > +		s->refcount--;
> > +	else if (s->refcount == 1)
> > +		destroy = true;
> > +
> > +	mutex_unlock(&slab_mutex);
> > +
> > +	if (destroy)
> > +		kmem_cache_destroy(s);
> > +}
> > +
> > +__bpf_kfunc_end_defs();
> > +
> >   static void bpf_iter_kmem_cache_show_fdinfo(const struct bpf_iter_aux_info *aux,
> >   					    struct seq_file *seq)
> >   {
> 

Thread overview: 11+ messages
2024-10-17  8:06 [PATCH bpf-next 1/2] bpf: Add open coded version of kmem_cache iterator Namhyung Kim
2024-10-17  8:06 ` [PATCH bpf-next 2/2] selftests/bpf: Add a test for open coded kmem_cache iter Namhyung Kim
2024-10-18 18:46   ` Martin KaFai Lau
2024-10-22 17:51     ` Namhyung Kim
2024-10-21 23:36   ` Andrii Nakryiko
2024-10-22 17:52     ` Namhyung Kim
2024-10-24  7:44     ` Namhyung Kim
2024-10-18 18:22 ` [PATCH bpf-next 1/2] bpf: Add open coded version of kmem_cache iterator Martin KaFai Lau
2024-10-22 17:47   ` Namhyung Kim [this message]
2024-10-21 23:32 ` Andrii Nakryiko
2024-10-22 17:50   ` Namhyung Kim
