From: Martin KaFai Lau <martin.lau@linux.dev>
To: bpf@vger.kernel.org
Cc: 'Alexei Starovoitov ' <ast@kernel.org>,
'Andrii Nakryiko ' <andrii@kernel.org>,
'Daniel Borkmann ' <daniel@iogearbox.net>,
netdev@vger.kernel.org, kernel-team@meta.com,
Aditi Ghag <aditi.ghag@isovalent.com>
Subject: [PATCH v2 bpf 2/3] bpf: Avoid iter->offset making backward progress in bpf_iter_udp
Date: Wed, 10 Jan 2024 09:57:42 -0800 [thread overview]
Message-ID: <20240110175743.2220907-3-martin.lau@linux.dev> (raw)
In-Reply-To: <20240110175743.2220907-1-martin.lau@linux.dev>
From: Martin KaFai Lau <martin.lau@kernel.org>
There is a bug in the bpf_iter_udp_batch() function that stops
the userspace from making forward progress.
The case that triggers the bug is the userspace passed in
a very small read buffer. When the bpf prog does bpf_seq_printf,
the userspace read buffer is not enough to capture the whole bucket.
When the read buffer is not large enough, the kernel will remember
the offset of the bucket in iter->offset such that the next userspace
read() can continue from where it left off.
The kernel will skip the number (== "iter->offset") of sockets in
the next read(). However, the code directly decrements the
"--iter->offset". This is incorrect because the next read() may
not consume the whole bucket either and then the next-next read()
will start from offset 0. The net effect is the userspace will
keep reading from the beginning of a bucket and the process will
never finish. "iter->offset" must always go forward until the
whole bucket is consumed.
This patch fixes it by using a local variable "resume_offset"
and "resume_bucket". "iter->offset" is always reset to 0 before
it may be used. "iter->offset" will be advanced to the
"resume_offset" when it continues from the "resume_bucket" (i.e.
"state->bucket == resume_bucket"). This brings it closer to
the bpf_iter_tcp's offset handling which does not suffer
the same bug.
Cc: Aditi Ghag <aditi.ghag@isovalent.com>
Fixes: c96dac8d369f ("bpf: udp: Implement batching for sockets iterator")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
net/ipv4/udp.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 978b83d3c094..04c534a9ef89 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -3137,16 +3137,18 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
struct bpf_udp_iter_state *iter = seq->private;
struct udp_iter_state *state = &iter->state;
struct net *net = seq_file_net(seq);
+ int resume_bucket, resume_offset;
struct udp_table *udptable;
unsigned int batch_sks = 0;
bool resized = false;
struct sock *sk;
+ resume_bucket = state->bucket;
+ resume_offset = iter->offset;
+
/* The current batch is done, so advance the bucket. */
- if (iter->st_bucket_done) {
+ if (iter->st_bucket_done)
state->bucket++;
- iter->offset = 0;
- }
udptable = udp_get_table_seq(seq, net);
@@ -3166,19 +3168,19 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
for (; state->bucket <= udptable->mask; state->bucket++) {
struct udp_hslot *hslot2 = &udptable->hash2[state->bucket];
- if (hlist_empty(&hslot2->head)) {
- iter->offset = 0;
+ if (hlist_empty(&hslot2->head))
continue;
- }
+ iter->offset = 0;
spin_lock_bh(&hslot2->lock);
udp_portaddr_for_each_entry(sk, &hslot2->head) {
if (seq_sk_match(seq, sk)) {
/* Resume from the last iterated socket at the
* offset in the bucket before iterator was stopped.
*/
- if (iter->offset) {
- --iter->offset;
+ if (state->bucket == resume_bucket &&
+ iter->offset < resume_offset) {
+ ++iter->offset;
continue;
}
if (iter->end_sk < iter->max_sk) {
@@ -3192,9 +3194,6 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
if (iter->end_sk)
break;
-
- /* Reset the current bucket's offset before moving to the next bucket. */
- iter->offset = 0;
}
/* All done: no batch made. */
--
2.34.1
next prev parent reply other threads:[~2024-01-10 17:58 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-10 17:57 [PATCH v2 bpf 0/3] bpf: Fix backward progress bug in bpf_iter_udp Martin KaFai Lau
2024-01-10 17:57 ` [PATCH v2 bpf 1/3] bpf: iter_udp: Retry with a larger batch size without going back to the previous bucket Martin KaFai Lau
2024-01-12 4:52 ` Yonghong Song
2024-01-10 17:57 ` Martin KaFai Lau [this message]
2024-01-12 5:33 ` [PATCH v2 bpf 2/3] bpf: Avoid iter->offset making backward progress in bpf_iter_udp Yonghong Song
2024-01-10 17:57 ` [PATCH v2 bpf 3/3] selftests/bpf: Test udp and tcp iter batching Martin KaFai Lau
2024-01-12 17:50 ` Yonghong Song
2024-01-12 18:10 ` Martin KaFai Lau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240110175743.2220907-3-martin.lau@linux.dev \
--to=martin.lau@linux.dev \
--cc=aditi.ghag@isovalent.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kernel-team@meta.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox