From: "Emil Tsalapatis" <emil@etsalapatis.com>
To: "Mykyta Yatsenko" <mykyta.yatsenko5@gmail.com>,
	"Emil Tsalapatis" <emil@etsalapatis.com>, <bpf@vger.kernel.org>,
	<ast@kernel.org>, <andrii@kernel.org>, <daniel@iogearbox.net>,
	<kafai@meta.com>, <kernel-team@meta.com>, <eddyz87@gmail.com>,
	<memxor@gmail.com>, <herbert@gondor.apana.org.au>
Cc: "Mykyta Yatsenko" <yatsenko@meta.com>
Subject: Re: [PATCH RFC bpf-next v2 07/18] bpf: Implement batch ops for resizable hashtab
Date: Tue, 14 Apr 2026 13:47:09 -0400	[thread overview]
Message-ID: <DHT2E3SO4O0U.3IHGPWXP84PWZ@etsalapatis.com> (raw)
In-Reply-To: <9cbfadf0-39ca-437e-88d7-3bb7030b799d@gmail.com>

On Tue Apr 14, 2026 at 4:08 AM EDT, Mykyta Yatsenko wrote:
>
>
> On 4/14/26 12:25 AM, Emil Tsalapatis wrote:
>> On Wed Apr 8, 2026 at 11:10 AM EDT, Mykyta Yatsenko wrote:
>>> From: Mykyta Yatsenko <yatsenko@meta.com>
>>>
>>> Add batch operations for BPF_MAP_TYPE_RHASH.
>>>
>>> Batch operations:
>>>   * rhtab_map_lookup_batch: Bulk lookup of elements by bucket
>>>   * rhtab_map_lookup_and_delete_batch: Atomic bulk lookup and delete
>>>
>>> The batch implementation uses rhashtable_walk_enter_from() to resume
>>> iteration from the last collected key. When the buffer fills, the last
>>> key becomes the cursor for the next batch call.
>>>
>>> Also implements rhtab_map_mem_usage() to report memory consumption.
>>>
>>> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
>>> ---
>>>   kernel/bpf/hashtab.c | 137 +++++++++++++++++++++++++++++++++++++++++++++++++--
>>>   1 file changed, 134 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>>> index e79c194e2779..a79d434dc626 100644
>>> --- a/kernel/bpf/hashtab.c
>>> +++ b/kernel/bpf/hashtab.c
>>> @@ -3051,19 +3051,150 @@ static long bpf_each_rhash_elem(struct bpf_map *map, bpf_callback_t callback_fn,
>>>   
>>>   static u64 rhtab_map_mem_usage(const struct bpf_map *map)
>>>   {
>>> -	return 0;
>>> +	struct bpf_rhtab *rhtab = container_of(map, struct bpf_rhtab, map);
>>> +	u64 num_entries;
>>> +
>>> +	num_entries = atomic_read(&rhtab->ht.nelems);
>>> +	return sizeof(struct bpf_rhtab) + rhtab->elem_size * num_entries;
>>> +}
>>> +
>>> +static int __rhtab_map_lookup_and_delete_batch(struct bpf_map *map,
>>> +					       const union bpf_attr *attr,
>>> +					       union bpf_attr __user *uattr,
>>> +					       bool do_delete)
>>> +{
>>> +	struct bpf_rhtab *rhtab = container_of(map, struct bpf_rhtab, map);
>>> +	void __user *uvalues = u64_to_user_ptr(attr->batch.values);
>>> +	void __user *ukeys = u64_to_user_ptr(attr->batch.keys);
>>> +	void __user *ubatch = u64_to_user_ptr(attr->batch.in_batch);
>>> +	void *buf = NULL, *keys = NULL, *values = NULL, *dst_key, *dst_val;
>>> +	struct rhtab_elem **del_elems = NULL;
>>> +	u32 max_count, total, key_size, value_size, i;
>>> +	struct rhashtable_iter iter;
>>> +	struct rhtab_elem *elem;
>>> +	u64 elem_map_flags, map_flags;
>>> +	int ret = 0;
>>> +
>>> +	elem_map_flags = attr->batch.elem_flags;
>>> +	if ((elem_map_flags & ~BPF_F_LOCK) ||
>>> +	    ((elem_map_flags & BPF_F_LOCK) &&
>>> +	     !btf_record_has_field(map->record, BPF_SPIN_LOCK)))
>>> +		return -EINVAL;
>>> +
>>> +	map_flags = attr->batch.flags;
>>> +	if (map_flags)
>>> +		return -EINVAL;
>>> +
>>> +	max_count = attr->batch.count;
>>> +	if (!max_count)
>>> +		return 0;
>>> +
>>> +	if (put_user(0, &uattr->batch.count))
>>> +		return -EFAULT;
>>> +
>>> +	key_size = map->key_size;
>>> +	value_size = map->value_size;
>>> +
>>> +	keys = kvmalloc_array(max_count, key_size, GFP_USER | __GFP_NOWARN);
>>> +	values = kvmalloc_array(max_count, value_size, GFP_USER | __GFP_NOWARN);
>>> +	if (do_delete)
>>> +		del_elems = kvmalloc_array(max_count, sizeof(void *),
>>> +					   GFP_USER | __GFP_NOWARN);
>>> +
>>> +	if (!keys || !values || (do_delete && !del_elems)) {
>>> +		ret = -ENOMEM;
>>> +		goto free;
>>> +	}
>>> +
>>> +	/*
>>> +	 * Use the last key from the previous batch as cursor.
>>> +	 * enter_from positions at that key's bucket, walk_next
>>> +	 * returns the successor in O(1).
>>> +	 * First call (ubatch == NULL): starts from bucket 0.
>>> +	 */
>>> +	if (ubatch) {
>>> +		buf = kmalloc(key_size, GFP_USER | __GFP_NOWARN);
>>> +		if (!buf) {
>>> +			ret = -ENOMEM;
>>> +			goto free;
>>> +		}
>>> +		if (copy_from_user(buf, ubatch, key_size)) {
>>> +			ret = -EFAULT;
>>> +			goto free;
>>> +		}
>>> +	}
>>> +
>>> +	scoped_guard(rcu) {
>> 
>> AFAICT this guard makes sure the RCU critical section extends from the
>> beginning of rhashtable_walk_enter_from all the way to walk_stop(), is
>> that correct?
>> 
>
> This guard makes sure that when rhashtable_walk_enter_from()
> initializes the iterator, the element it is initialized with is not
> freed. rhashtable_walk_start() takes the RCU read lock as well, which
> is why I can end the guard right after rhashtable_walk_start().
>

Sounds good, thanks.
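
For anyone skimming the thread, the lifetime in question is roughly the
following (pseudocode, not the patch itself; based on my reading of the
rhashtable walk API):

	rcu_read_lock();                 /* scoped_guard(rcu) begins           */
	rhashtable_walk_enter_from(...); /* cursor element must not be freed   */
	rhashtable_walk_start(&iter);    /* takes rcu_read_lock() internally   */
	rcu_read_unlock();               /* guard ends; walk_start's read lock
	                                  * still pins the table               */
	/* ... iterate ... */
	rhashtable_walk_stop(&iter);     /* drops walk_start's rcu_read_lock() */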

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

>>> +		rhashtable_walk_enter_from(&rhtab->ht, &iter, buf, rhtab->params);
>>> +		rhashtable_walk_start(&iter);
>>> +	}
>>> +
>>> +	dst_key = keys;
>>> +	dst_val = values;
>>> +	total = 0;
>>> +
>>> +	while (total < max_count) {
>>> +		elem = rhtab_iter_next(&iter);
>>> +		if (!elem)
>>> +			break;
>>> +
>>> +		memcpy(dst_key, elem->data, key_size);
>>> +		rhtab_read_elem_value(map, dst_val, elem, elem_map_flags);
>>> +		check_and_init_map_value(map, dst_val);
>>> +
>>> +		if (do_delete)
>>> +			del_elems[total] = elem;
>>> +
>>> +		dst_key += key_size;
>>> +		dst_val += value_size;
>>> +		total++;
>>> +	}
>>> +
>>> +	if (do_delete) {
>>> +		for (i = 0; i < total; i++)
>>> +			rhtab_delete_elem(rhtab, del_elems[i]);
>>> +	}
>>> +
>>> +	rhashtable_walk_stop(&iter);
>>> +	rhashtable_walk_exit(&iter);
>>> +
>>> +	if (total == 0) {
>>> +		ret = -ENOENT;
>>> +		goto free;
>>> +	}
>>> +
>>> +	/* Signal end of table when we collected fewer than requested */
>>> +	if (total < max_count)
>>> +		ret = -ENOENT;
>>> +
>>> +	/* Write last key as cursor for the next batch call */
>>> +	if (copy_to_user(ukeys, keys, total * key_size) ||
>>> +	    copy_to_user(uvalues, values, total * value_size) ||
>>> +	    put_user(total, &uattr->batch.count) ||
>>> +	    copy_to_user(u64_to_user_ptr(attr->batch.out_batch),
>>> +			 dst_key - key_size, key_size)) {
>>> +		ret = -EFAULT;
>>> +		goto free;
>>> +	}
>>> +
>>> +free:
>>> +	kfree(buf);
>>> +	kvfree(keys);
>>> +	kvfree(values);
>>> +	kvfree(del_elems);
>>> +	return ret;
>>>   }
>>>   
>>>   static int rhtab_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
>>>   				  union bpf_attr __user *uattr)
>>>   {
>>> -	return 0;
>>> +	return __rhtab_map_lookup_and_delete_batch(map, attr, uattr, false);
>>>   }
>>>   
>>>   static int rhtab_map_lookup_and_delete_batch(struct bpf_map *map, const union bpf_attr *attr,
>>>   					     union bpf_attr __user *uattr)
>>>   {
>>> -	return 0;
>> Shouldn't these have been -EINVAL or -ENOTSUPP in previous patches?
>> 
>>> +	return __rhtab_map_lookup_and_delete_batch(map, attr, uattr, true);
>>>   }
>>>   
>>>   struct bpf_iter_seq_rhash_map_info {
>> 

