All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Kuniyuki Iwashima" <kuniyu@google.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH v1 bpf-next 8/8] selftest: bpf: Add test for BPF_SOCK_OPS_RCVLOWAT_CB.
Date: Fri, 08 May 2026 22:17:19 +0000	[thread overview]
Message-ID: <20260508221719.EED64C2BCB0@smtp.kernel.org> (raw)
In-Reply-To: <20260508073355.3916746-9-kuniyu@google.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Medium] Flaky test design due to race condition between send() and epoll_wait()/getsockopt().
- [Medium] Pipelined data causes a TOCTOU race that permanently breaks and disables the BPF autolowat program.
--

commit 7be0b88b0617982ff63886ecdcf680fd3a6f0db8
Author: Kuniyuki Iwashima <kuniyu@google.com>

selftest: bpf: Add test for BPF_SOCK_OPS_RCVLOWAT_CB.

This commit adds a selftest simulating a production scenario where a BPF
program dynamically adjusts a socket's SO_RCVLOWAT based on the parsing
of RPC descriptors. The test exchanges partial and full RPC frames over
a local connection and verifies that epoll wakes up the application at
the correct boundaries.

> diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_autolowat.c b/tools/testing/selftests/bpf/prog_tests/tcp_autolowat.c
> --- a/tools/testing/selftests/bpf/prog_tests/tcp_autolowat.c
> +++ b/tools/testing/selftests/bpf/prog_tests/tcp_autolowat.c

[ ... ]

> @@ -0,0 +1,350 @@
> +static void tcp_autolowat_run_rpc_test(struct tcp_autolowat_test_cb *cb,
> +				       struct rpc_test_case *test_case)
> +{

[ ... ]

> +		switch (event->type) {
> +		case RPC_EVENT_END:
> +			return;
> +		case RPC_EVENT_AUTOLOWAT:
> +			err = setsockopt(cb->child, SOL_BPF, BPF_TCP_AUTOLOWAT,
> +					 &event->val, sizeof(event->val));
> +			if (!ASSERT_OK(err, "setsockopt"))
> +				return;
> +			break;
> +		case RPC_EVENT_SEND:
> +			err = send(cb->client, ptr, event->len, 0);
> +			if (!ASSERT_EQ(err, event->len, "send"))
> +				return;
> +
> +			ptr += event->len;
> +			break;

[ ... ]

> +		case RPC_EVENT_EPOLL:
> +			err = epoll_wait(cb->epoll, &ev, 1, 0);
> +			if (!ASSERT_EQ(err, event->nfds, "epoll_wait"))
> +				return;
> +			break;
> +		case RPC_EVENT_RCVLOWAT:
> +			optval = 0;
> +			optlen = sizeof(optval);
> +
> +			err = getsockopt(cb->child, SOL_SOCKET, SO_RCVLOWAT, &optval, &optlen);
> +			if (!ASSERT_OK(err, "getsockopt") ||
> +			    !ASSERT_EQ(optval, event->rcvlowat, "rcvlowat"))
> +				return;
> +			break;
> +		}

Are the epoll and getsockopt checks vulnerable to flaky failures
under load?

Since TCP loopback delivery can be deferred to softirq context, could
epoll_wait with a 0-millisecond timeout return 0 if the packet hasn't
been queued yet?

Similarly, could getsockopt read a stale SO_RCVLOWAT value before the
BPF program gets a chance to update it?

Would using a polling loop or a positive epoll timeout prevent
intermittent test failures?

> diff --git a/tools/testing/selftests/bpf/progs/tcp_autolowat.c b/tools/testing/selftests/bpf/progs/tcp_autolowat.c
> --- a/tools/testing/selftests/bpf/progs/tcp_autolowat.c
> +++ b/tools/testing/selftests/bpf/progs/tcp_autolowat.c

[ ... ]

> @@ -0,0 +1,316 @@
> +static int tcp_parse_descriptor(struct bpf_sock_ops *skops,
> +				struct tcp_autolowat_cb *cb,
> +				u32 seq, u32 end_seq)
> +{
> +	struct rpc_descriptor *rpc_desc;
> +	u32 rpc_copied_seq;
> +	u32 copy_len;
> +	u64 rpc_len;
> +	int err;
> +
> +	rpc_copied_seq = cb->rpc_desc_seq + cb->rpc_desc_buff_len;
> +
> +	if (before(cb->rpc_desc_seq + RPC_DESC_SIZE, end_seq))
> +		copy_len = RPC_DESC_SIZE - cb->rpc_desc_buff_len;
> +	else
> +		copy_len = end_seq - rpc_copied_seq;
> +
> +	/* Since LLVM commit 324e27e8bad83ca23a3cd276d7e2e729b1b0b8c7,
> +	 * clang omits the "copy_len == 0" check below, which is necessary
> +	 * to satisfy the BPF verifier's range check for bpf_skb_load_bytes().
> +	 */
> +	barrier_var(copy_len);
> +
> +	if (copy_len == 0)
> +		goto disable; /* FIN. */
> +	if (copy_len > RPC_DESC_SIZE)
> +		goto disable; /* always false, only for verifier. */
> +	if (cb->rpc_desc_buf + cb->rpc_desc_buff_len >= &cb->rpc_desc_buf[RPC_DESC_SIZE])
> +		goto disable; /* always false, only for verifier. */
> +
> +	err = bpf_skb_load_bytes(skops, rpc_copied_seq - seq,
> +				 cb->rpc_desc_buf + cb->rpc_desc_buff_len, copy_len);
> +	if (err)
> +		goto disable;

Is there a race condition if pipelined data is already in the receive
queue when BPF_TCP_AUTOLOWAT is enabled?

Looking at tcp_init_autolowat_cb(), cb->rpc_desc_seq is initialized to
tp->copied_seq. If there is already unread data in the queue, the next
packet arrival triggers BPF_SOCK_OPS_RCVLOWAT_CB.

Because seq (which comes from tcb->seq) corresponds to the new packet
and is greater than tp->copied_seq, will rpc_copied_seq - seq underflow?

If it underflows to a large positive value, won't bpf_skb_load_bytes()
fail with -EFAULT, causing the code to jump to the disable label and
permanently disable the autolowat functionality for this socket?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260508073355.3916746-1-kuniyu@google.com?part=8

  parent reply	other threads:[~2026-05-08 22:17 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-08  7:33 [PATCH v1 bpf-next 0/8] bpf: Add SOCK_OPS hooks for TCP AutoLOWAT Kuniyuki Iwashima
2026-05-08  7:33 ` [PATCH v1 bpf-next 1/8] selftest: bpf: Use BPF_SOCK_OPS_ALL_CB_FLAGS + 1 for bad_cb_test_rv Kuniyuki Iwashima
2026-05-08 19:02   ` sashiko-bot
2026-05-08 20:21     ` Kuniyuki Iwashima
2026-05-08  7:33 ` [PATCH v1 bpf-next 2/8] bpf: tcp: Introduce BPF_SOCK_OPS_RCVLOWAT_CB Kuniyuki Iwashima
2026-05-08 19:17   ` sashiko-bot
2026-05-08 20:26     ` Kuniyuki Iwashima
2026-05-08  7:33 ` [PATCH v1 bpf-next 3/8] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVLOWAT_CB Kuniyuki Iwashima
2026-05-08 15:15   ` Stanislav Fomichev
2026-05-08 19:45     ` Kuniyuki Iwashima
2026-05-11 14:56       ` Stanislav Fomichev
2026-05-08  7:33 ` [PATCH v1 bpf-next 4/8] tcp: Split out __tcp_set_rcvlowat() Kuniyuki Iwashima
2026-05-08  7:33 ` [PATCH v1 bpf-next 5/8] bpf: tcp: Add kfunc to adjust sk->sk_rcvlowat Kuniyuki Iwashima
2026-05-11 12:34   ` Björn Töpel
2026-05-17 23:28     ` Kuniyuki Iwashima
2026-05-08  7:33 ` [PATCH v1 bpf-next 6/8] bpf: tcp: Factorise bpf_skops_established() Kuniyuki Iwashima
2026-05-08  7:33 ` [PATCH v1 bpf-next 7/8] bpf: tcp: Add SOCK_OPS rcvlowat hook Kuniyuki Iwashima
2026-05-08 10:37   ` Jiayuan Chen
2026-05-08 11:30     ` Kuniyuki Iwashima
2026-05-08 12:19       ` Jiayuan Chen
2026-05-08 15:28   ` Stanislav Fomichev
2026-05-08 20:05     ` Kuniyuki Iwashima
2026-05-11 14:55       ` Stanislav Fomichev
2026-05-08 21:46   ` sashiko-bot
2026-05-08  7:33 ` [PATCH v1 bpf-next 8/8] selftest: bpf: Add test for BPF_SOCK_OPS_RCVLOWAT_CB Kuniyuki Iwashima
2026-05-08 15:35   ` Stanislav Fomichev
2026-05-08 20:19     ` Kuniyuki Iwashima
2026-05-08 21:47       ` Stanislav Fomichev
2026-05-08 21:58         ` Kuniyuki Iwashima
2026-05-08 22:17   ` sashiko-bot [this message]
2026-05-08 22:47     ` Kuniyuki Iwashima

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260508221719.EED64C2BCB0@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=kuniyu@google.com \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.