Re: [PATCH v2 bpf-next/net 4/5] bpf: Add kfunc to proxy TX HW Timestamp.

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Kuniyuki Iwashima" <kuniyu@google.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH v2 bpf-next/net 4/5] bpf: Add kfunc to proxy TX HW Timestamp.
Date: Sat, 13 Jun 2026 01:20:52 +0000	[thread overview]
Message-ID: <20260613012052.C30831F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260613010039.1362312-5-kuniyu@google.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Incompatible `payload_offset` semantics and bounds checking between `tcx`, `cls_bpf`, and `bpf_skb_complete_tx_tstamp()` leads to false rejections or packet corruption.
- [High] Blindly clearing skb_shinfo(skb)->tx_flags destroys the SKBTX_DEV_ZEROCOPY flag, causing a permanent memory leak and application hang.
--

commit 07c148630f8e93ce45f5ce41dcd9ee675f05a799
Author: Kuniyuki Iwashima <kuniyu@google.com>

bpf: Add kfunc to proxy TX HW Timestamp.

The commit adds two kfuncs, bpf_skb_scrub_tx_tstamp() and
bpf_skb_complete_tx_tstamp(), to proxy TX hardware timestamps. This allows
socket applications to receive TX hardware timestamps via SCM_TIMESTAMPING.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 0c6c270d9f7d1..a45cdf38616ac 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4456,6 +4456,41 @@ tcx_run(const struct bpf_mprog_entry *entry, struct sk_buff *skb,
>  	return tcx_action_code(skb, ret);
>  }
>  
> +static int skb_do_completion(struct sk_buff *skb)
> +{
> +	enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_INGRESS;
> +	struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
> +	struct bpf_tx_tstamp_cmpl *txtscmpl;
> +
> +	if (!(ri->kern_flags & BPF_RI_F_TX_TS_CMPL))
> +		goto drop;
> +
> +	if (skb_unclone(skb, GFP_ATOMIC))
> +		goto drop;
> +
> +	__skb_push(skb, skb->mac_len);
> +
> +	txtscmpl = &ri->txtscmpl;
> +
> +	drop_reason = pskb_may_pull_reason(skb, txtscmpl->payload_offset);
> +	if (drop_reason)
> +		goto drop;
> +
> +	skb->protocol = txtscmpl->protocol;
> +	skb_set_network_header(skb, txtscmpl->network_offset);
> +	__skb_pull(skb, txtscmpl->payload_offset);
> +
> +	skb_shinfo(skb)->tskey = txtscmpl->tskey;
> +	skb_shinfo(skb)->tx_flags = SKBTX_HW_TSTAMP_NOBPF;

[Severity: High]
Does this direct assignment cause a memory leak if the packet was sent with
MSG_ZEROCOPY?

If the user application sends a packet with MSG_ZEROCOPY, tx_flags contains
SKBTX_DEV_ZEROCOPY and a reference to ubuf_info is held. Blindly overwriting
tx_flags here erases SKBTX_DEV_ZEROCOPY. This prevents skb_release_data()
from calling skb_zcopy_clear() when the SKB is freed, leaking the ubuf_info
reference and causing the userspace socket to hang indefinitely waiting for
the completion notification.

Can we mask out only the timestamp flags or preserve SKBTX_DEV_ZEROCOPY?

> +	__skb_tstamp_tx(skb, NULL, skb_hwtstamps(skb), skb->sk, SCM_TSTAMP_SND);
> +
> +	consume_skb(skb);
> +	return NET_RX_SUCCESS;

[ ... ]

> diff --git a/net/core/filter.c b/net/core/filter.c
> index bee1e7eee5eae..fce2d24aad487 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -12396,6 +12396,67 @@ __bpf_kfunc int bpf_skb_set_hwtstamp(struct __sk_buff *s,
>  	return 0;
>  }
>  
> +__bpf_kfunc int bpf_skb_scrub_tx_tstamp(struct __sk_buff *s)
> +{
> +	struct sk_buff *skb = (struct sk_buff *)s;
> +
> +	if (skb_at_tc_ingress(skb))
> +		return -EINVAL;
> +
> +	if (skb_unclone(skb, GFP_ATOMIC))
> +		return -ENOMEM;
> +
> +	skb_shinfo(skb)->tx_flags = 0;

[Severity: High]
Does clearing tx_flags here also destroy SKBTX_DEV_ZEROCOPY?

Similar to skb_do_completion(), if a user application sends a packet with
MSG_ZEROCOPY, blindly clearing tx_flags to 0 erases SKBTX_DEV_ZEROCOPY.
When the SKB is ultimately freed after transmission, skb_release_data()
will not call skb_zcopy_clear(), leading to a leaked ubuf_info reference
and hanging the userspace socket waiting for completion.

> +
> +	bpf_compute_data_pointers(skb);
> +
> +	return 0;
> +}
> +
> +__bpf_kfunc int bpf_skb_complete_tx_tstamp(struct __sk_buff *s,
> +					   struct bpf_tx_tstamp_cmpl *attrs,
> +					   int attrs__sz)
> +{
> +	int defined_sz = offsetofend(struct bpf_tx_tstamp_cmpl, payload_offset);
> +	struct sk_buff *skb = (struct sk_buff *)s;
> +	struct bpf_redirect_info *ri;
> +	struct sock *sk = skb->sk;
> +	s32 delta;
> +
> +	if (attrs__sz != sizeof(*attrs) ||
> +	    memchr_inv((char *)attrs + defined_sz, 0, sizeof(u16)))
> +		return -EINVAL;
> +
> +	if (!sk || !sk_fullsock(sk))
> +		return -EINVAL;
> +
> +	if (!skb_at_tc_ingress(skb))
> +		return -EINVAL;
> +
> +	if (attrs->payload_offset > skb->len)

[Severity: High]
Are there incompatible payload_offset semantics between here and legacy TC?

In skb_do_completion(), skb->mac_len is unconditionally pushed before pulling
txtscmpl->payload_offset:

	__skb_push(skb, skb->mac_len);
	txtscmpl = &ri->txtscmpl;
	drop_reason = pskb_may_pull_reason(skb, txtscmpl->payload_offset);

This structurally requires payload_offset to be relative to the MAC header.
However, for BPF programs attached via legacy TC (cls_bpf), the MAC header
is not pushed prior to ingress execution, meaning skb->len excludes mac_len.

If a MAC-relative offset is provided, it will exceed skb->len for small
payloads (e.g., empty TCP ACKs where mac_len + offset > total_len - mac_len),
causing a false -EINVAL rejection.

Conversely, if a BPF program compensates by sending a network-relative offset
to pass this validation, it will cause skb_do_completion() to pull from the
MAC header incorrectly, corrupting the packet.

> +		return -EINVAL;
> +
> +	delta = attrs->payload_offset - attrs->network_offset;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260613010039.1362312-1-kuniyu@google.com?part=4

next prev parent reply	other threads:[~2026-06-13  1:20 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-13  0:59 [PATCH v2 bpf-next/net 0/5] bpf: Support RX/TX HW timestamp proxy Kuniyuki Iwashima
2026-06-13  0:59 ` [PATCH v2 bpf-next/net 1/5] ethtool: Introduce ETHTOOL_MSG_TSINFO_SET for virtual interfaces Kuniyuki Iwashima
2026-06-13  1:21   ` sashiko-bot
2026-06-13  2:29   ` Stanislav Fomichev
2026-06-13  3:43     ` Kuniyuki Iwashima
2026-06-13  0:59 ` [PATCH v2 bpf-next/net 2/5] bpf: Rename bpf_kfunc_set_tcp_reqsk to bpf_kfunc_set_sched_cls Kuniyuki Iwashima
2026-06-13  1:00 ` [PATCH v2 bpf-next/net 3/5] bpf: Add bpf_skb_set_hwtstamp() Kuniyuki Iwashima
2026-06-13  1:12   ` sashiko-bot
2026-06-13 22:01     ` Kuniyuki Iwashima
2026-06-13  1:00 ` [PATCH v2 bpf-next/net 4/5] bpf: Add kfunc to proxy TX HW Timestamp Kuniyuki Iwashima
2026-06-13  1:20   ` sashiko-bot [this message]
2026-06-13 22:03     ` Kuniyuki Iwashima
2026-06-13  1:00 ` [PATCH v2 bpf-next/net 5/5] selftest: bpf: Add test for hwtstamp proxy Kuniyuki Iwashima
2026-06-13  1:31   ` Alexei Starovoitov
2026-06-13 22:08     ` Kuniyuki Iwashima
2026-06-13 17:20 ` [PATCH v2 bpf-next/net 0/5] bpf: Support RX/TX HW timestamp proxy Jakub Kicinski
2026-06-13 21:43   ` Kuniyuki Iwashima
2026-06-13 22:47     ` Jakub Kicinski
2026-06-13 23:18       ` Kuniyuki Iwashima

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260613012052.C30831F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=kuniyu@google.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.