Linux kernel and device drivers for NXP i.MX platforms
From: Larysa Zaremba <larysa.zaremba@intel.com>
To: bpf@vger.kernel.org
Cc: "Larysa Zaremba" <larysa.zaremba@intel.com>,
	"Claudiu Manoil" <claudiu.manoil@nxp.com>,
	"Vladimir Oltean" <vladimir.oltean@nxp.com>,
	"Wei Fang" <wei.fang@nxp.com>,
	"Clark Wang" <xiaoning.wang@nxp.com>,
	"Andrew Lunn" <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Tony Nguyen" <anthony.l.nguyen@intel.com>,
	"Przemek Kitszel" <przemyslaw.kitszel@intel.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Stanislav Fomichev" <sdf@fomichev.me>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Eduard Zingerman" <eddyz87@gmail.com>,
	"Song Liu" <song@kernel.org>,
	"Yonghong Song" <yonghong.song@linux.dev>,
	"KP Singh" <kpsingh@kernel.org>, "Hao Luo" <haoluo@google.com>,
	"Jiri Olsa" <jolsa@kernel.org>, "Simon Horman" <horms@kernel.org>,
	"Shuah Khan" <shuah@kernel.org>,
	"Alexander Lobakin" <aleksander.lobakin@intel.com>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	"Bastien Curutchet (eBPF Foundation)"
	<bastien.curutchet@bootlin.com>,
	"Tushar Vyavahare" <tushar.vyavahare@intel.com>,
	"Jason Xing" <kernelxing@tencent.com>,
	"Ricardo B. Marlière" <rbm@suse.com>,
	"Eelco Chaudron" <echaudro@redhat.com>,
	"Lorenzo Bianconi" <lorenzo@kernel.org>,
	"Toke Hoiland-Jorgensen" <toke@redhat.com>,
	imx@lists.linux.dev, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	linux-kselftest@vger.kernel.org,
	"Aleksandr Loktionov" <aleksandr.loktionov@intel.com>,
	"Dragos Tatulea" <dtatulea@nvidia.com>
Subject: [PATCH bpf v2 1/9] xdp: use modulo operation to calculate XDP frag tailroom
Date: Thu, 12 Feb 2026 19:33:16 +0100
Message-ID: <20260212183328.1883148-3-larysa.zaremba@intel.com>
In-Reply-To: <20260212183328.1883148-2-larysa.zaremba@intel.com>

The current formula for calculating XDP tailroom in mbuf packets works only
if each frag has its own page (i.e. if rxq->frag_size is PAGE_SIZE). This
defeats the purpose of the parameter and, when shared pages are used,
silently yields negative calculated tailroom on at least half of the frags.

There are not many drivers that set rxq->frag_size. Among them:
* i40e and enetc always split the page uniformly between frags and use
  shared pages
* ice uses page_pool frags via libeth; those are power-of-2 sized and
  uniformly distributed across the page
* idpf has a variable frag_size with XDP on, so the current API is not
  applicable
* mlx5, mtk and mvneta use PAGE_SIZE or 0 as frag_size for page_pool

As for AF_XDP ZC, only ice, i40e and idpf declare frag_size for it. The
modulo operation yields correct results for aligned chunks, which are all
power-of-2 sized, between 2K and PAGE_SIZE. The formula without modulo
fails when chunk_size is 2K. Buffers in unaligned mode are not distributed
uniformly, so the modulo operation would not work for them.
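
To illustrate the aligned vs. unaligned point, here is a minimal sketch in
plain C. It is a simplified model, not the xsk code: chunk_start and
data_off are hypothetical byte offsets measured from the same base, and
both helper names are made up for this example.

```c
/* Position of the data start inside its chunk, as a modulo on the
 * offset would compute it. Only valid if chunks start at multiples
 * of chunk_size (aligned mode in this simplified model).
 */
unsigned int pos_by_modulo(unsigned int data_off, unsigned int chunk_size)
{
	return data_off % chunk_size;
}

/* Actual position of the data start inside its chunk, given the
 * (hypothetical) offset at which the chunk begins.
 */
unsigned int pos_actual(unsigned int data_off, unsigned int chunk_start)
{
	return data_off - chunk_start;
}
```

With a 2K chunk starting at offset 2048 and data at 2304, both helpers
return 256. With a chunk starting at offset 1000 and data at 1256, the
actual position is still 256, but the modulo gives 1256, so a tailroom
derived from it would be wrong.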

To accommodate unaligned buffers, we could define frag_size as
data + tailroom and hence not subtract the offset when calculating
tailroom, but this would necessitate more changes in the drivers.

Define rxq->frag_size as an even portion of a page that fully belongs to a
single frag. When calculating tailroom, locate the data start within such a
portion by performing a modulo operation on the page offset.
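
As a worked example of the difference, the sketch below restates both
formulas as standalone C functions. The parameter names are stand-ins
chosen for this example: frag_size plays the role of rxq->frag_size,
frag_len of skb_frag_size(frag) and page_off of skb_frag_off(frag). The
numbers assume a 4K page split into two 2K portions.

```c
/* Formula before this patch: subtracts the full page offset. */
int frag_tailroom_old(unsigned int frag_size, unsigned int frag_len,
		      unsigned int page_off)
{
	return (int)frag_size - (int)frag_len - (int)page_off;
}

/* Formula after this patch: subtracts only the offset of the data
 * start within the frag_size-sized portion of the page.
 */
int frag_tailroom_new(unsigned int frag_size, unsigned int frag_len,
		      unsigned int page_off)
{
	return (int)frag_size - (int)frag_len -
	       (int)(page_off % frag_size);
}
```

For a 1500-byte frag at page offset 256 (first half of the page), both
functions return 292. For the same frag at page offset 2304 (second half),
the old formula returns -1756, while the new one still returns 292.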

Fixes: bf25146a5595 ("bpf: add frags support to the bpf_xdp_adjust_tail() API")
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 net/core/filter.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index ba019ded773d..5f5489665c58 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4156,7 +4156,8 @@ static int bpf_xdp_frags_increase_tail(struct xdp_buff *xdp, int offset)
 	if (!rxq->frag_size || rxq->frag_size > xdp->frame_sz)
 		return -EOPNOTSUPP;
 
-	tailroom = rxq->frag_size - skb_frag_size(frag) - skb_frag_off(frag);
+	tailroom = rxq->frag_size - skb_frag_size(frag) -
+		   skb_frag_off(frag) % rxq->frag_size;
 	if (unlikely(offset > tailroom))
 		return -EINVAL;
 
-- 
2.52.0


Thread overview: 20+ messages
2026-02-12 18:33 [PATCH bpf v2 0/9] Address XDP frags having negative tailroom Larysa Zaremba
2026-02-12 18:33 ` Larysa Zaremba [this message]
2026-02-13 20:34   ` [PATCH bpf v2 1/9] xdp: use modulo operation to calculate XDP frag tailroom Jakub Kicinski
2026-02-12 18:33 ` [PATCH bpf v2 2/9] xsk: introduce helper to determine rxq->frag_size Larysa Zaremba
2026-02-12 18:33 ` [PATCH bpf v2 3/9] ice: fix rxq info registering in mbuf packets Larysa Zaremba
2026-02-12 19:31   ` bot+bpf-ci
2026-02-12 18:33 ` [PATCH bpf v2 4/9] ice: change XDP RxQ frag_size from DMA write length to xdp.frame_sz Larysa Zaremba
2026-02-13  3:57   ` Loktionov, Aleksandr
2026-02-13  8:41     ` Larysa Zaremba
2026-02-12 18:33 ` [PATCH bpf v2 5/9] i40e: fix registering XDP RxQ info Larysa Zaremba
2026-02-12 18:33 ` [PATCH bpf v2 6/9] i40e: use xdp.frame_sz as XDP RxQ info frag_size Larysa Zaremba
2026-02-13  4:04   ` Loktionov, Aleksandr
2026-02-13  8:58     ` Larysa Zaremba
2026-02-12 18:33 ` [PATCH bpf v2 7/9] idpf: use truesize " Larysa Zaremba
2026-02-16 10:46   ` Alexander Lobakin
2026-02-16 10:48     ` Alexander Lobakin
2026-02-16 14:01       ` Larysa Zaremba
2026-02-16 15:17         ` Alexander Lobakin
2026-02-12 18:33 ` [PATCH bpf v2 8/9] net: enetc: " Larysa Zaremba
2026-02-12 18:33 ` [PATCH bpf v2 9/9] xdp: produce a warning when calculated tailroom is negative Larysa Zaremba
