Re: [PATCH bpf-next v5 04/11] bpf: Implement iteration ops for resizable hashtab

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: sashiko-reviews@lists.linux.dev
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v5 04/11] bpf: Implement iteration ops for resizable hashtab
Date: Fri, 29 May 2026 18:53:17 +0100	[thread overview]
Message-ID: <c8acb23e-ac1f-48d3-8e33-64c40bba16eb@gmail.com> (raw)
In-Reply-To: <20260528182814.AD3771F000E9@smtp.kernel.org>



On 5/28/26 7:28 PM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
> - [Critical] Missing error pointer check in `rhtab_map_get_next_key` leads to unhandled error pointer dereferences and kernel panics.
> - [High] `bpf_each_rhash_elem` illegally calls `cond_resched()` and drops `rcu_read_lock()` from within a BPF map iterator callback, which will trigger kernel panics in atomic contexts.
> - [High] Batch lookup and `seq_file` iterators prematurely terminate when encountering concurrent deletions, breaking BPF map iteration guarantees and causing silent data omission.
> --
> 
> commit afc331365f30658242015d82e7bdd35a014f432d
> Author: Mykyta Yatsenko <yatsenko@meta.com>
> 
> bpf: Implement iteration ops for resizable hashtab
> 
> Implement rhtab_map_get_next_key(), batch lookup/lookup-and-delete,
> and the seq_file BPF iterator for BPF_MAP_TYPE_RHASH.
> 
>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> index 8944c9f070ca8..a03fa5fa6545b 100644
>> --- a/kernel/bpf/hashtab.c
>> +++ b/kernel/bpf/hashtab.c
>> @@ -3021,10 +3021,363 @@ static void rhtab_map_free_internal_structs(struct bpf_map *map)
>>  }
>>  
>>  static int rhtab_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
>> +	__must_hold_shared(RCU)
>>  {
>> -	return -EOPNOTSUPP;
>> +	struct bpf_rhtab *rhtab = container_of(map, struct bpf_rhtab, map);
>> +	struct rhtab_elem *elem;
>> +
>> +	elem = rhashtable_next_key(&rhtab->ht, key, rhtab_params);
>> +
>> +	/* if not found, return the first key */
>> +	if (PTR_ERR(elem) == -ENOENT)
>> +		elem = rhashtable_next_key(&rhtab->ht, NULL, rhtab_params);
>> +
>> +	if (!elem)
>> +		return -ENOENT;
>> +
>> +	memcpy(next_key, elem->data, map->key_size);
> 
> [Severity: Critical]
> Could this dereference an error pointer if rhashtable_next_key() returns
> ERR_PTR(-EAGAIN)? 
> 
> The first check only handles -ENOENT. If we get -EAGAIN (which can happen
> during concurrent resizing), it bypasses the PTR_ERR check, passes the
> !elem check since it isn't NULL, and then passes the error pointer directly
> into memcpy().
> 

-EAGAIN is never returned from rhashtable_next_key(), but the error check
can be made more reliable. I'll address this for the v6

>> +	return 0;
>> +}
> 
> [ ... ]
> 
>> +static long bpf_each_rhash_elem(struct bpf_map *map, bpf_callback_t callback_fn,
>> +				void *callback_ctx, u64 flags)
>> +{
>> +	struct bpf_rhtab *rhtab = container_of(map, struct bpf_rhtab, map);
>> +	void *prev_key = NULL;
> 
> [ ... ]
> 
>> +	rcu_read_lock();
>> +	/*
>> +	 * Best-effort iteration: if rhashtable is concurrently resized or
>> +	 * elements are deleted/inserted, there may be missed or duplicate
>> +	 * elements visited.
>> +	 */
>> +	while ((elem = rhashtable_next_key(&rhtab->ht, prev_key, rhtab_params))) {
>> +		if (IS_ERR(elem))
>> +			break;
>> +		num_elems++;
>> +		ret = callback_fn((u64)(long)map,
>> +				  (u64)(long)elem->data,
>> +				  (u64)(long)rhtab_elem_value(elem, map->key_size),
>> +				  (u64)(long)callback_ctx, 0);
>> +		if (ret)
>> +			break;
>> +
>> +		prev_key = elem->data;	/* valid while RCU held */
>> +
>> +		if (need_resched()) {
>> +			memcpy(key_buf, elem->data, map->key_size);
>> +			rcu_read_unlock();
>> +			cond_resched();
> 
> [Severity: High]
> Can this cond_resched() trigger "BUG: scheduling while atomic" panics?
> 
> BPF programs using the bpf_for_each_map_elem helper often run in
> non-sleepable contexts (like XDP, TC, or tracing) under an outer RCU read
> lock or with preemption disabled. 
> 
> While this drops the inner RCU lock, it doesn't exit the outer atomic
> context, so sleeping here might violate those constraints and cause system
> instability.
> 

yes, I'll remove cond_resched() from here. Dropping and re-acquiring rcu lock,
though may still be needed.

>> +			rcu_read_lock();
>> +			prev_key = key_buf;
>> +		}
>> +	}
>> +	rcu_read_unlock();
> 
> [ ... ]
> 
>> +static int __rhtab_map_lookup_and_delete_batch(struct bpf_map *map,
>> +					       const union bpf_attr *attr,
>> +					       union bpf_attr __user *uattr,
>> +					       bool do_delete)
>> +{
> 
> [ ... ]
> 
>> +	rcu_read_lock();
>> +
>> +	/*
>> +	 * Cursor stores the key of the next-to-process element (stashed by
>> +	 * the previous batch). Look it up directly so the element is included
>> +	 * here rather than skipped by next_key(). If the cursor was deleted
>> +	 * concurrently (or by the previous do_delete batch), lookup returns
>> +	 * NULL and userspace must restart from a NULL cursor.
>> +	 */
>> +	if (ubatch)
>> +		elem = rhtab_lookup_elem(map, cursor);
>> +	else
>> +		elem = rhashtable_next_key(&rhtab->ht, NULL, rhtab_params);
>> +
>> +	while (elem && !IS_ERR(elem) && total < max_count) {
> 
> [Severity: High]
> Will this prematurely terminate the batch iteration if a concurrent deletion
> occurs? 
> 
> If the cursor element was concurrently deleted, rhashtable_next_key() returns
> ERR_PTR(-ENOENT). This loop checks !IS_ERR(elem) and will break out early.
> 
> This results in the function returning -ENOENT to userspace. Under BPF UAPI
> semantics, userspace interprets -ENOENT as the end of the map, which might
> cause silent data omission instead of safely recovering or continuing.

Maybe return -EAGAIN in case of cursor deletion, so that userspace will retry.

> 
>> +		memcpy(dst_key, elem->data, key_size);
>> +		rhtab_read_elem_value(map, dst_val, elem, elem_map_flags);
>> +		check_and_init_map_value(map, dst_val);
> 
> [ ... ]
> 
>> +static void *bpf_rhash_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
>> +{
>> +	struct bpf_iter_seq_rhash_map_info *info = seq->private;
>> +	struct rhtab_elem *elem = v;
>> +	void *next;
>> +
>> +	memcpy(info->last_key, elem->data, info->map->key_size);
>> +	info->last_key_valid = true;
>> +	++*pos;
>> +
>> +	next = rhashtable_next_key(&info->rhtab->ht, info->last_key,
>> +				   rhtab_params);
>> +	if (IS_ERR(next))
>> +		return NULL;
> 
> [Severity: High]
> Does this also prematurely terminate the seq_file iteration on concurrent
> deletions? 
> 
> Similar to the batch logic above, returning NULL when next is
> ERR_PTR(-ENOENT) signals the end of the sequence to the iterator, which could
> silently truncate the iteration results.

Yes, in case cursor is deleted, the iteration finishes early.

> 
>> +	return next;
>> +}
>

next prev parent reply	other threads:[~2026-05-29 17:53 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-28 17:47 [PATCH bpf-next v5 00/11] bpf: Introduce resizable hash map Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 01/11] rhashtable: Add rhashtable_next_key() API Mykyta Yatsenko
2026-05-28 18:16   ` sashiko-bot
2026-05-29 16:43     ` Mykyta Yatsenko
2026-05-28 18:26   ` bot+bpf-ci
2026-05-28 17:47 ` [PATCH bpf-next v5 02/11] rhashtable: Add selftest for rhashtable_next_key() Mykyta Yatsenko
2026-05-28 18:26   ` sashiko-bot
2026-05-29 16:53     ` Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 03/11] bpf: Implement resizable hashmap basic functions Mykyta Yatsenko
2026-05-28 18:42   ` bot+bpf-ci
2026-05-29 17:08     ` Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 04/11] bpf: Implement iteration ops for resizable hashtab Mykyta Yatsenko
2026-05-28 18:28   ` sashiko-bot
2026-05-29 17:53     ` Mykyta Yatsenko [this message]
2026-05-28 17:47 ` [PATCH bpf-next v5 05/11] bpf: Allow special fields in " Mykyta Yatsenko
2026-05-28 18:26   ` bot+bpf-ci
2026-06-01 13:05     ` Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 06/11] bpf: Optimize word-sized keys for resizable hashtable Mykyta Yatsenko
2026-05-28 18:25   ` sashiko-bot
2026-06-01 16:38     ` Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 07/11] libbpf: Support " Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 08/11] selftests/bpf: Add basic tests for resizable hash map Mykyta Yatsenko
2026-05-28 18:41   ` sashiko-bot
2026-06-01 16:41     ` Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 09/11] selftests/bpf: Add BPF iterator " Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 10/11] bpftool: Add rhash map documentation Mykyta Yatsenko
2026-05-28 17:47 ` [PATCH bpf-next v5 11/11] selftests/bpf: Add resizable hashmap to benchmarks Mykyta Yatsenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c8acb23e-ac1f-48d3-8e33-64c40bba16eb@gmail.com \
    --to=mykyta.yatsenko5@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.