netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amery Hung <ameryhung@gmail.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, alexei.starovoitov@gmail.com,
	andrii@kernel.org, daniel@iogearbox.net, kuba@kernel.org,
	stfomichev@gmail.com, martin.lau@kernel.org,
	mohsin.bashr@gmail.com, noren@nvidia.com, dtatulea@nvidia.com,
	saeedm@nvidia.com, tariqt@nvidia.com, mbloch@nvidia.com,
	maciej.fijalkowski@intel.com, kernel-team@meta.com
Subject: [PATCH bpf-next v2 3/7] bpf: Support pulling non-linear xdp data
Date: Fri,  5 Sep 2025 10:33:47 -0700	[thread overview]
Message-ID: <20250905173352.3759457-4-ameryhung@gmail.com> (raw)
In-Reply-To: <20250905173352.3759457-1-ameryhung@gmail.com>

Add kfunc, bpf_xdp_pull_data(), to support pulling data from xdp
fragments. Similar to bpf_skb_pull_data(), bpf_xdp_pull_data() makes
the first len bytes of data directly readable and writable in bpf
programs. If the "len" argument is larger than the linear data size,
data in fragments will be copied to the linear region when there
is enough room between xdp->data_end and xdp_data_hard_end(xdp),
which is subject to driver implementation.

A use case of the kfunc is to decapsulate headers residing in xdp
fragments. It is possible for a NIC driver to place headers in xdp
fragments. To keep using direct packet access for parsing and
decapsulating headers, users can pull headers into the linear data
area by calling bpf_xdp_pull_data() and then pop the header with
bpf_xdp_adjust_head().

An unused argument, flags is reserved for future extension (e.g.,
tossing the data instead of copying it to the linear data area).

Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
 net/core/filter.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 0b82cb348ce0..0b161106debd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -12212,6 +12212,81 @@ __bpf_kfunc int bpf_sock_ops_enable_tx_tstamp(struct bpf_sock_ops_kern *skops,
 	return 0;
 }
 
+/**
+ * bpf_xdp_pull_data() - Pull in non-linear xdp data.
+ * @x: &xdp_md associated with the XDP buffer
+ * @len: length of data to be made directly accessible in the linear part
+ * @flags: future use, must be zero
+ *
+ * Pull in non-linear data in case the XDP buffer associated with @x is
+ * non-linear and not all @len are in the linear data area.
+ *
+ * Direct packet access allows reading and writing linear XDP data through
+ * packet pointers (i.e., &xdp_md->data + offsets). When an eBPF program wants
+ * to directly access data that may be in the non-linear area, call this kfunc
+ * to make sure the data is available in the linear area.
+ *
+ * This kfunc can also be used with bpf_xdp_adjust_head() to decapsulate
+ * headers in the non-linear data area.
+ *
+ * A call to this kfunc is susceptible to change the underlying packet buffer.
+ * Therefore, at load time, all checks on pointers previously done by the
+ * verifier are invalidated and must be performed again, if the kfunc is used
+ * in combination with direct packet access.
+ *
+ * Return:
+ * * %0         - success
+ * * %-EINVAL   - invalid len or flags
+ */
+__bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len, u64 flags)
+{
+	struct xdp_buff *xdp = (struct xdp_buff *)x;
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+	void *data_hard_end = xdp_data_hard_end(xdp);
+	void *data_end = xdp->data + len;
+	int i, delta, n_frags_free = 0, len_free = 0;
+
+	if (flags)
+		return -EINVAL;
+
+	if (unlikely(len > xdp_get_buff_len(xdp)))
+		return -EINVAL;
+
+	if (unlikely(data_end < xdp->data || data_end > data_hard_end))
+		return -EINVAL;
+
+	delta = data_end - xdp->data_end;
+	if (delta <= 0)
+		return 0;
+
+	for (i = 0; i < sinfo->nr_frags && delta; i++) {
+		skb_frag_t *frag = &sinfo->frags[i];
+		u32 shrink = min_t(u32, delta, skb_frag_size(frag));
+
+		memcpy(xdp->data_end + len_free, skb_frag_address(frag), shrink);
+
+		len_free += shrink;
+		delta -= shrink;
+		if (bpf_xdp_shrink_data(xdp, frag, shrink, false))
+			n_frags_free++;
+	}
+
+	if (unlikely(n_frags_free)) {
+		memmove(sinfo->frags, sinfo->frags + n_frags_free,
+			(sinfo->nr_frags - n_frags_free) * sizeof(skb_frag_t));
+
+		sinfo->nr_frags -= n_frags_free;
+
+		if (!sinfo->nr_frags)
+			xdp_buff_clear_frags_flag(xdp);
+	}
+
+	sinfo->xdp_frags_size -= len_free;
+	xdp->data_end = data_end;
+
+	return 0;
+}
+
 __bpf_kfunc_end_defs();
 
 int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags,
@@ -12239,6 +12314,7 @@ BTF_KFUNCS_END(bpf_kfunc_check_set_skb_meta)
 
 BTF_KFUNCS_START(bpf_kfunc_check_set_xdp)
 BTF_ID_FLAGS(func, bpf_dynptr_from_xdp)
+BTF_ID_FLAGS(func, bpf_xdp_pull_data)
 BTF_KFUNCS_END(bpf_kfunc_check_set_xdp)
 
 BTF_KFUNCS_START(bpf_kfunc_check_set_sock_addr)
-- 
2.47.3


  parent reply	other threads:[~2025-09-05 17:33 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-05 17:33 [PATCH bpf-next v2 0/7] Add kfunc bpf_xdp_pull_data Amery Hung
2025-09-05 17:33 ` [PATCH bpf-next v2 1/7] net/mlx5e: Fix generating skb from nonlinear xdp_buff Amery Hung
2025-09-08 14:41   ` Dragos Tatulea
2025-09-08 17:23     ` Amery Hung
2025-09-05 17:33 ` [PATCH bpf-next v2 2/7] bpf: Allow bpf_xdp_shrink_data to shrink a frag from head and tail Amery Hung
2025-09-05 17:33 ` Amery Hung [this message]
2025-09-08 19:27   ` [PATCH bpf-next v2 3/7] bpf: Support pulling non-linear xdp data Martin KaFai Lau
2025-09-08 22:28     ` Amery Hung
2025-09-09  1:54   ` Jakub Kicinski
2025-09-10 15:17     ` Amery Hung
2025-09-10 18:04       ` Jakub Kicinski
2025-09-10 19:11         ` Amery Hung
2025-09-05 17:33 ` [PATCH bpf-next v2 4/7] bpf: Clear packet pointers after changing packet data in kfuncs Amery Hung
2025-09-05 17:33 ` [PATCH bpf-next v2 5/7] bpf: Support specifying linear xdp packet data size in test_run Amery Hung
2025-09-05 17:33 ` [PATCH bpf-next v2 6/7] selftests/bpf: Test bpf_xdp_pull_data Amery Hung
2025-09-05 17:33 ` [PATCH bpf-next v2 7/7] selftests: drv-net: Pull data before parsing headers Amery Hung

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250905173352.3759457-4-ameryhung@gmail.com \
    --to=ameryhung@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dtatulea@nvidia.com \
    --cc=kernel-team@meta.com \
    --cc=kuba@kernel.org \
    --cc=maciej.fijalkowski@intel.com \
    --cc=martin.lau@kernel.org \
    --cc=mbloch@nvidia.com \
    --cc=mohsin.bashr@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=noren@nvidia.com \
    --cc=saeedm@nvidia.com \
    --cc=stfomichev@gmail.com \
    --cc=tariqt@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).