From: sashiko-bot@kernel.org
Subject: Re: [PATCH bpf-next v3 04/10] bpf: Implement batch ops and iterators for resizable hashtab
Reply-To: sashiko@lists.linux.dev
To: "Mykyta Yatsenko"
Cc: bpf@vger.kernel.org
In-Reply-To: <20260424-rhash-v3-4-d0fa0ce4379b@meta.com>
References: <20260424-rhash-v3-4-d0fa0ce4379b@meta.com>
Date: Fri, 24 Apr 2026 20:28:57 +0000
Message-Id: <20260424202858.7DD0FC19425@smtp.kernel.org>
X-Mailing-List: bpf@vger.kernel.org

Sashiko AI review found 4 potential issue(s):

- [High] Cursor invalidation in batch and seq_file iterators causes O(N^2)
  complexity and infinite loops.
- [High] Unbounded RCU read lock duration in batch operations allows
  user-triggered RCU stalls.
- [High] Map iterators and seq_file dumps spuriously abort during normal
  background resizes.
- [High] Deadlock risk in `bpf_each_rhash_elem` due to acquiring `spin_lock`
  from atomic contexts.

--

commit 9f7ce51619eb27e8a4a65df12f862e12e34bc285
Author: Mykyta Yatsenko

    bpf: Implement batch ops and iterators for resizable hashtab

    This commit adds batch operations for resizable hashtab maps, using
    rhashtable_walk_enter_from() to resume iteration. It also wires up
    seq_file BPF iterators and implements rhtab_map_mem_usage().

> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index d37f3d548d3658..000caa2c7f4cd9 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c

[ ... ]

> @@ -3066,64 +3066,330 @@ static int rhtab_map_get_next_key(struct bpf_map *map, void *key, void *next_key

> +static long bpf_each_rhash_elem(struct bpf_map *map, bpf_callback_t callback_fn,
> +				void *callback_ctx, u64 flags)
> +{

[ ...
]

> +	/*
> +	 * The rhashtable walk API uses spin_lock in rhashtable_walk_start/stop,
> +	 * which is not safe in NMI or soft/hard IRQ context.
> +	 */
> +	if (in_nmi() || in_hardirq() || in_softirq())
> +		return -EOPNOTSUPP;
> +
> +	rhashtable_walk_enter(&rhtab->ht, &iter);

Does this adequately protect against deadlocks when called from kprobes or
tracepoints in process context with preemption disabled? If a thread
currently holding ht->lock (e.g., during map insertion) is interrupted by a
kprobe that runs this BPF program, attempting to acquire the same spin_lock
inside rhashtable_walk_enter() might cause a deadlock.

> +	rhashtable_walk_start(&iter);
> +
> +	while ((elem = rhashtable_walk_next(&iter))) {
> +		/* rhashtable_walk_next returns -EAGAIN on resize, abort */
> +		if (IS_ERR(elem)) {
> +			num_elems = -EBUSY;
> +			break;
> +		}

Is it expected to abort the entire iteration if a background map resize
occurs? rhashtable_walk_next() returns -EAGAIN to indicate the table was
resized and the iterator was rewound. Since resizable hashtabs automatically
resize during normal workloads, returning -EBUSY here could cause
bpf_for_each_map_elem() and bpftool map dumps to randomly fail. Should this
handle -EAGAIN by retrying or continuing the walk instead?

[ ... ]

> +static int __rhtab_map_lookup_and_delete_batch(struct bpf_map *map,
> +					       const union bpf_attr *attr,
> +					       union bpf_attr __user *uattr,
> +					       bool do_delete)
> +{

[ ... ]

> +	/*
> +	 * Use the last key from the previous batch as cursor.
> +	 * enter_from positions at that key's bucket, walk_next
> +	 * returns the successor in O(1).
> +	 * First call (ubatch == NULL): starts from bucket 0.
> +	 */
> +	if (ubatch) {
> +		buf = kmalloc(key_size, GFP_USER | __GFP_NOWARN);
> +		if (!buf) {
> +			ret = -ENOMEM;
> +			goto free;
> +		}
> +		if (copy_from_user(buf, ubatch, key_size)) {
> +			ret = -EFAULT;
> +			goto free;
> +		}
> +	}
> +
> +	scoped_guard(rcu) {
> +		rhashtable_walk_enter_from(&rhtab->ht, &iter, buf, rhtab->params);
> +		rhashtable_walk_start(&iter);
> +	}

If the key stored in buf was already deleted, will this iteration restart
from the beginning of the map? Looking at rhashtable_walk_enter_from(), if
it fails to find the key, it leaves iter->slot at 0. Then
rhashtable_walk_start() and walk_next() will start from bucket 0 instead of
where the previous batch left off. Since lookup_and_delete_batch deletes
the elements returned in the previous batch, the cursor key will always be
deleted. Could this cause the iterator to repeatedly scan over all emptied
buckets, resulting in an expensive traversal?

Similarly, for lookup_batch and seq_file dumps, if the cursor is
concurrently deleted, could this trap userspace in an infinite loop of
duplicate elements?

> +
> +	dst_key = keys;
> +	dst_val = values;
> +	total = 0;
> +
> +	while (total < max_count) {
> +		elem = rhtab_iter_next(&iter);
> +		if (!elem)
> +			break;

Can this loop cause RCU stalls or soft lockups if max_count is very large?
The max_count parameter is controlled by userspace. Since
rhashtable_walk_start() acquires the RCU read-side critical section, this
loop might iterate, copy memory, and process millions of elements without
ever dropping the RCU lock or calling cond_resched().

[ ... ]

> +static void *bpf_rhash_map_seq_start(struct seq_file *seq, loff_t *pos)
> +	__acquires(RCU)
> +{

[ ...
]

> +	elem = rhashtable_walk_next(&info->iter);
> +	/* rhashtable_walk_next returns -EAGAIN on resize, abort */
> +	if (IS_ERR(elem))
> +		return ERR_PTR(-EBUSY);

Similar to bpf_each_rhash_elem, will returning -EBUSY here on normal hash
table resizes cause userspace map dumps to fail spuriously?

[ ... ]

> +static void *bpf_rhash_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> +{

[ ... ]

> +	elem = rhashtable_walk_next(&info->iter);
> +	/* rhashtable_walk_next returns -EAGAIN on resize, abort */
> +	if (IS_ERR(elem))
> +		return ERR_PTR(-EBUSY);
> +
> +	return elem;
> +}

Again, aborting on -EAGAIN here might interrupt the map dump when a resize
happens in the background. Is there a way to handle -EAGAIN without failing
the entire sequence?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260424-rhash-v3-0-d0fa0ce4379b@meta.com?part=4
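
P.S. For reference, the rhashtable walker documentation treats -EAGAIN as a
"resume" signal (the iterator is already rewound when it is returned), and
rhashtable_walk_stop()/rhashtable_walk_start() can be used to leave and
re-enter the RCU read-side section between elements. A rough, untested
kernel-C pseudocode sketch of both ideas combined (not a drop-in patch; the
element-processing step and num_elems accounting are placeholders):

```
	/* Sketch: continue past -EAGAIN instead of aborting, and
	 * periodically drop the RCU read lock so userspace-sized walks
	 * cannot stall RCU. Duplicates are possible after a resize,
	 * which batch consumers already need to tolerate.
	 */
	rhashtable_walk_start(&iter);
	while ((elem = rhashtable_walk_next(&iter)) != NULL) {
		if (IS_ERR(elem)) {
			if (PTR_ERR(elem) == -EAGAIN)
				continue;	/* resize rewound the iterator */
			num_elems = PTR_ERR(elem);
			break;
		}

		/* ... process elem ... */

		if (need_resched()) {
			rhashtable_walk_stop(&iter);	/* exits RCU read side */
			cond_resched();
			rhashtable_walk_start(&iter);
		}
	}
	rhashtable_walk_stop(&iter);
```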