Re: [PATCH v2 bpf-next/net 4/5] bpf: Add kfunc to proxy TX HW Timestamp.

BPF List
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Kuniyuki Iwashima" <kuniyu@google.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH v2 bpf-next/net 4/5] bpf: Add kfunc to proxy TX HW Timestamp.
Date: Sat, 13 Jun 2026 01:20:52 +0000	[thread overview]
Message-ID: <20260613012052.C30831F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260613010039.1362312-5-kuniyu@google.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Incompatible `payload_offset` semantics and bounds checking between `tcx`, `cls_bpf`, and `bpf_skb_complete_tx_tstamp()` leads to false rejections or packet corruption.
- [High] Blindly clearing skb_shinfo(skb)->tx_flags destroys the SKBTX_DEV_ZEROCOPY flag, causing a permanent memory leak and application hang.
--

commit 07c148630f8e93ce45f5ce41dcd9ee675f05a799
Author: Kuniyuki Iwashima <kuniyu@google.com>

bpf: Add kfunc to proxy TX HW Timestamp.

The commit adds two kfuncs, bpf_skb_scrub_tx_tstamp() and
bpf_skb_complete_tx_tstamp(), to proxy TX hardware timestamps. This allows
socket applications to receive TX hardware timestamps via SCM_TIMESTAMPING.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 0c6c270d9f7d1..a45cdf38616ac 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4456,6 +4456,41 @@ tcx_run(const struct bpf_mprog_entry *entry, struct sk_buff *skb,
>  	return tcx_action_code(skb, ret);
>  }
>  
> +static int skb_do_completion(struct sk_buff *skb)
> +{
> +	enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_INGRESS;
> +	struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
> +	struct bpf_tx_tstamp_cmpl *txtscmpl;
> +
> +	if (!(ri->kern_flags & BPF_RI_F_TX_TS_CMPL))
> +		goto drop;
> +
> +	if (skb_unclone(skb, GFP_ATOMIC))
> +		goto drop;
> +
> +	__skb_push(skb, skb->mac_len);
> +
> +	txtscmpl = &ri->txtscmpl;
> +
> +	drop_reason = pskb_may_pull_reason(skb, txtscmpl->payload_offset);
> +	if (drop_reason)
> +		goto drop;
> +
> +	skb->protocol = txtscmpl->protocol;
> +	skb_set_network_header(skb, txtscmpl->network_offset);
> +	__skb_pull(skb, txtscmpl->payload_offset);
> +
> +	skb_shinfo(skb)->tskey = txtscmpl->tskey;
> +	skb_shinfo(skb)->tx_flags = SKBTX_HW_TSTAMP_NOBPF;

[Severity: High]
Does this direct assignment cause a memory leak if the packet was sent with
MSG_ZEROCOPY?

If the user application sends a packet with MSG_ZEROCOPY, tx_flags contains
SKBTX_DEV_ZEROCOPY and a reference to ubuf_info is held. Blindly overwriting
tx_flags here erases SKBTX_DEV_ZEROCOPY. This prevents skb_release_data()
from calling skb_zcopy_clear() when the SKB is freed, leaking the ubuf_info
reference and causing the userspace socket to hang indefinitely waiting for
the completion notification.

Can we mask out only the timestamp flags or preserve SKBTX_DEV_ZEROCOPY?

> +	__skb_tstamp_tx(skb, NULL, skb_hwtstamps(skb), skb->sk, SCM_TSTAMP_SND);
> +
> +	consume_skb(skb);
> +	return NET_RX_SUCCESS;

[ ... ]

> diff --git a/net/core/filter.c b/net/core/filter.c
> index bee1e7eee5eae..fce2d24aad487 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -12396,6 +12396,67 @@ __bpf_kfunc int bpf_skb_set_hwtstamp(struct __sk_buff *s,
>  	return 0;
>  }
>  
> +__bpf_kfunc int bpf_skb_scrub_tx_tstamp(struct __sk_buff *s)
> +{
> +	struct sk_buff *skb = (struct sk_buff *)s;
> +
> +	if (skb_at_tc_ingress(skb))
> +		return -EINVAL;
> +
> +	if (skb_unclone(skb, GFP_ATOMIC))
> +		return -ENOMEM;
> +
> +	skb_shinfo(skb)->tx_flags = 0;

[Severity: High]
Does clearing tx_flags here also destroy SKBTX_DEV_ZEROCOPY?

Similar to skb_do_completion(), if a user application sends a packet with
MSG_ZEROCOPY, blindly clearing tx_flags to 0 erases SKBTX_DEV_ZEROCOPY.
When the SKB is ultimately freed after transmission, skb_release_data()
will not call skb_zcopy_clear(), leading to a leaked ubuf_info reference
and hanging the userspace socket waiting for completion.

> +
> +	bpf_compute_data_pointers(skb);
> +
> +	return 0;
> +}
> +
> +__bpf_kfunc int bpf_skb_complete_tx_tstamp(struct __sk_buff *s,
> +					   struct bpf_tx_tstamp_cmpl *attrs,
> +					   int attrs__sz)
> +{
> +	int defined_sz = offsetofend(struct bpf_tx_tstamp_cmpl, payload_offset);
> +	struct sk_buff *skb = (struct sk_buff *)s;
> +	struct bpf_redirect_info *ri;
> +	struct sock *sk = skb->sk;
> +	s32 delta;
> +
> +	if (attrs__sz != sizeof(*attrs) ||
> +	    memchr_inv((char *)attrs + defined_sz, 0, sizeof(u16)))
> +		return -EINVAL;
> +
> +	if (!sk || !sk_fullsock(sk))
> +		return -EINVAL;
> +
> +	if (!skb_at_tc_ingress(skb))
> +		return -EINVAL;
> +
> +	if (attrs->payload_offset > skb->len)

[Severity: High]
Are there incompatible payload_offset semantics between here and legacy TC?

In skb_do_completion(), skb->mac_len is unconditionally pushed before pulling
txtscmpl->payload_offset:

	__skb_push(skb, skb->mac_len);
	txtscmpl = &ri->txtscmpl;
	drop_reason = pskb_may_pull_reason(skb, txtscmpl->payload_offset);

This structurally requires payload_offset to be relative to the MAC header.
However, for BPF programs attached via legacy TC (cls_bpf), the MAC header
is not pushed prior to ingress execution, meaning skb->len excludes mac_len.

If a MAC-relative offset is provided, it will exceed skb->len for small
payloads (e.g., empty TCP ACKs where mac_len + offset > total_len - mac_len),
causing a false -EINVAL rejection.

Conversely, if a BPF program compensates by sending a network-relative offset
to pass this validation, it will cause skb_do_completion() to pull from the
MAC header incorrectly, corrupting the packet.

> +		return -EINVAL;
> +
> +	delta = attrs->payload_offset - attrs->network_offset;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260613010039.1362312-1-kuniyu@google.com?part=4

next prev parent reply	other threads:[~2026-06-13  1:20 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-13  0:59 [PATCH v2 bpf-next/net 0/5] bpf: Support RX/TX HW timestamp proxy Kuniyuki Iwashima
2026-06-13  0:59 ` [PATCH v2 bpf-next/net 1/5] ethtool: Introduce ETHTOOL_MSG_TSINFO_SET for virtual interfaces Kuniyuki Iwashima
2026-06-13  1:21   ` sashiko-bot
2026-06-13  2:29   ` Stanislav Fomichev
2026-06-13  3:43     ` Kuniyuki Iwashima
2026-06-13  0:59 ` [PATCH v2 bpf-next/net 2/5] bpf: Rename bpf_kfunc_set_tcp_reqsk to bpf_kfunc_set_sched_cls Kuniyuki Iwashima
2026-06-13  1:00 ` [PATCH v2 bpf-next/net 3/5] bpf: Add bpf_skb_set_hwtstamp() Kuniyuki Iwashima
2026-06-13  1:12   ` sashiko-bot
2026-06-13 22:01     ` Kuniyuki Iwashima
2026-06-13  1:00 ` [PATCH v2 bpf-next/net 4/5] bpf: Add kfunc to proxy TX HW Timestamp Kuniyuki Iwashima
2026-06-13  1:20   ` sashiko-bot [this message]
2026-06-13 22:03     ` Kuniyuki Iwashima
2026-06-13  1:00 ` [PATCH v2 bpf-next/net 5/5] selftest: bpf: Add test for hwtstamp proxy Kuniyuki Iwashima
2026-06-13  1:31   ` Alexei Starovoitov
2026-06-13 22:08     ` Kuniyuki Iwashima
2026-06-13 17:20 ` [PATCH v2 bpf-next/net 0/5] bpf: Support RX/TX HW timestamp proxy Jakub Kicinski
2026-06-13 21:43   ` Kuniyuki Iwashima
2026-06-13 22:47     ` Jakub Kicinski
2026-06-13 23:18       ` Kuniyuki Iwashima

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260613012052.C30831F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=kuniyu@google.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox