From: Stanislav Fomichev <stfomichev@gmail.com>
To: Jon Kohler <jon@nutanix.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
Jason Wang <jasowang@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Jesper Dangaard Brouer <hawk@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
Simon Horman <horms@kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Subject: Re: [PATCH net-next v2] xdp: Add helpers for head length, headroom, and metadata length
Date: Wed, 30 Apr 2025 18:34:46 -0700 [thread overview]
Message-ID: <aBLPtszlDe74yTlk@mini-arch> (raw)
In-Reply-To: <32FB9CF5-E5BB-4912-B76D-53971C6B6F98@nutanix.com>
On 05/01, Jon Kohler wrote:
>
>
> > On Apr 30, 2025, at 5:09 PM, Stanislav Fomichev <stfomichev@gmail.com> wrote:
> >
> > !-------------------------------------------------------------------|
> > CAUTION: External Email
> >
> > |-------------------------------------------------------------------!
> >
> > On 04/30, Jon Kohler wrote:
> >> Introduce new XDP helpers:
> >> - xdp_headlen: Similar to skb_headlen
> >> - xdp_headroom: Similar to skb_headroom
> >> - xdp_metadata_len: Similar to skb_metadata_len
> >>
> >> Integrate these helpers into tap, tun, and XDP implementation to start.
> >>
> >> No functional changes introduced.
> >>
> >> Signed-off-by: Jon Kohler <jon@nutanix.com>
> >> ---
> >> v1->v2: Integrate feedback from Willem
> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_project_netdevbpf_patch_20250430182921.1704021-2D1-2Djon-40nutanix.com_&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=NGPRGGo37mQiSXgHKm5rCQ&m=9pdxzQszX_M0K3gEPeYOyMZZYSkRR8IMvxslS8320Eoctk58y-ELCdZ5iaryF2GH&s=J-ILB7E9VQ_plo0hyjEtzGzjy6G0_o4MMMmmE_z8vvc&e=
> >>
> >> drivers/net/tap.c | 6 +++---
> >> drivers/net/tun.c | 12 +++++------
> >> include/net/xdp.h | 54 +++++++++++++++++++++++++++++++++++++++++++----
> >> net/core/xdp.c | 12 +++++------
> >> 4 files changed, 65 insertions(+), 19 deletions(-)
> >>
> >> diff --git a/drivers/net/tap.c b/drivers/net/tap.c
> >> index d4ece538f1b2..a62fbca4b08f 100644
> >> --- a/drivers/net/tap.c
> >> +++ b/drivers/net/tap.c
> >> @@ -1048,7 +1048,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
> >> struct sk_buff *skb;
> >> int err, depth;
> >>
> >> - if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) {
> >> + if (unlikely(xdp_headlen(xdp) < ETH_HLEN)) {
> >> err = -EINVAL;
> >> goto err;
> >> }
> >> @@ -1062,8 +1062,8 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
> >> goto err;
> >> }
> >>
> >> - skb_reserve(skb, xdp->data - xdp->data_hard_start);
> >> - skb_put(skb, xdp->data_end - xdp->data);
> >> + skb_reserve(skb, xdp_headroom(xdp));
> >> + skb_put(skb, xdp_headlen(xdp));
> >>
> >> skb_set_network_header(skb, ETH_HLEN);
> >> skb_reset_mac_header(skb);
> >> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> >> index 7babd1e9a378..4c47eed71986 100644
> >> --- a/drivers/net/tun.c
> >> +++ b/drivers/net/tun.c
> >> @@ -1567,7 +1567,7 @@ static int tun_xdp_act(struct tun_struct *tun, struct bpf_prog *xdp_prog,
> >> dev_core_stats_rx_dropped_inc(tun->dev);
> >> return err;
> >> }
> >> - dev_sw_netstats_rx_add(tun->dev, xdp->data_end - xdp->data);
> >> + dev_sw_netstats_rx_add(tun->dev, xdp_headlen(xdp));
> >> break;
> >> case XDP_TX:
> >> err = tun_xdp_tx(tun->dev, xdp);
> >> @@ -1575,7 +1575,7 @@ static int tun_xdp_act(struct tun_struct *tun, struct bpf_prog *xdp_prog,
> >> dev_core_stats_rx_dropped_inc(tun->dev);
> >> return err;
> >> }
> >> - dev_sw_netstats_rx_add(tun->dev, xdp->data_end - xdp->data);
> >> + dev_sw_netstats_rx_add(tun->dev, xdp_headlen(xdp));
> >> break;
> >> case XDP_PASS:
> >> break;
> >> @@ -2355,7 +2355,7 @@ static int tun_xdp_one(struct tun_struct *tun,
> >> struct xdp_buff *xdp, int *flush,
> >> struct tun_page *tpage)
> >> {
> >> - unsigned int datasize = xdp->data_end - xdp->data;
> >> + unsigned int datasize = xdp_headlen(xdp);
> >> struct tun_xdp_hdr *hdr = xdp->data_hard_start;
> >> struct virtio_net_hdr *gso = &hdr->gso;
> >> struct bpf_prog *xdp_prog;
> >> @@ -2415,14 +2415,14 @@ static int tun_xdp_one(struct tun_struct *tun,
> >> goto out;
> >> }
> >>
> >> - skb_reserve(skb, xdp->data - xdp->data_hard_start);
> >> - skb_put(skb, xdp->data_end - xdp->data);
> >> + skb_reserve(skb, xdp_headroom(xdp));
> >> + skb_put(skb, xdp_headlen(xdp));
> >>
> >> /* The externally provided xdp_buff may have no metadata support, which
> >> * is marked by xdp->data_meta being xdp->data + 1. This will lead to a
> >> * metasize of -1 and is the reason why the condition checks for > 0.
> >> */
> >> - metasize = xdp->data - xdp->data_meta;
> >> + metasize = xdp_metadata_len(xdp);
> >> if (metasize > 0)
> >> skb_metadata_set(skb, metasize);
> >>
> >> diff --git a/include/net/xdp.h b/include/net/xdp.h
> >> index 48efacbaa35d..044345b18305 100644
> >> --- a/include/net/xdp.h
> >> +++ b/include/net/xdp.h
> >> @@ -151,10 +151,56 @@ xdp_get_shared_info_from_buff(const struct xdp_buff *xdp)
> >> return (struct skb_shared_info *)xdp_data_hard_end(xdp);
> >> }
> >>
> >> +/**
> >> + * xdp_headlen - Calculate the length of the data in an XDP buffer
> >> + * @xdp: Pointer to the XDP buffer structure
> >> + *
> >> + * Compute the length of the data contained in the XDP buffer. Does not
> >> + * include frags, use xdp_get_buff_len() for that instead.
> >> + *
> >> + * Analogous to skb_headlen().
> >> + *
> >> + * Return: The length of the data in the XDP buffer in bytes.
> >> + */
> >> +static inline unsigned int xdp_headlen(const struct xdp_buff *xdp)
> >> +{
> >> + return xdp->data_end - xdp->data;
> >> +}
> >> +
> >> +/**
> >> + * xdp_headroom - Calculate the headroom available in an XDP buffer
> >> + * @xdp: Pointer to the XDP buffer structure
> >> + *
> >> + * Compute the headroom in an XDP buffer.
> >> + *
> >> + * Analogous to the skb_headroom().
> >> + *
> >> + * Return: The size of the headroom in bytes.
> >> + */
> >> +static inline unsigned int xdp_headroom(const struct xdp_buff *xdp)
> >> +{
> >> + return xdp->data - xdp->data_hard_start;
> >> +}
> >> +
> >> +/**
> >> + * xdp_metadata_len - Calculate the length of metadata in an XDP buffer
> >> + * @xdp: Pointer to the XDP buffer structure
> >> + *
> >> + * Compute the length of the metadata region in an XDP buffer.
> >> + *
> >> + * Analogous to skb_metadata_len().
> >> + *
> >> + * Return: The length of the metadata in bytes.
> >> + */
> >> +static inline unsigned int xdp_metadata_len(const struct xdp_buff *xdp)
> >
> > I believe this has to return int, not unsigned int. There are places
> > where we do data_meta = data + 1, and the callers check whether
> > the result is signed or sunsigned.
>
> The existing SKB APIs are the exact same return type (I copied them 1:1),
> but I have a feeling that we’re never intending these values to either A) be
> negative and/or B) wrap in strange ways?
>
> >
> >> +{
> >> + return xdp->data - xdp->data_meta;
> >> +}
> >> +
> >> static __always_inline unsigned int
> >> xdp_get_buff_len(const struct xdp_buff *xdp)
> >> {
> >> - unsigned int len = xdp->data_end - xdp->data;
> >> + unsigned int len = xdp_headlen(xdp);
> >> const struct skb_shared_info *sinfo;
> >>
> >> if (likely(!xdp_buff_has_frags(xdp)))
> >> @@ -364,8 +410,8 @@ int xdp_update_frame_from_buff(const struct xdp_buff *xdp,
> >> int metasize, headroom;
>
> Said another way, perhaps this should be unsigned?
>
> >>
> >> /* Assure headroom is available for storing info */
> >> - headroom = xdp->data - xdp->data_hard_start;
> >> - metasize = xdp->data - xdp->data_meta;
> >> + headroom = xdp_headroom(xdp);
> >> + metasize = xdp_metadata_len(xdp);
> >> metasize = metasize > 0 ? metasize : 0;
> >
> > ^^ like here
>
> Look across the tree, seems like more are unsigned than signed
The ones that are unsigned are either calling xdp_data_meta_unsupported
explicitly (and it does > to check for this condition, not signed math)
or are running in the drivers that are guaranteed to have metadata
support (and, hence, always have data_meta <= data).
> These ones use unsigned:
> xdp_convert_zc_to_xdp_frame
This uses xdp_data_meta_unsupported
> veth_xdp_rcv_skb
> xsk_construct_skb
> bnxt_copy_xdp
> i40e_build_skb
> i40e_construct_skb_zc
> ice_build_skb (this is u8)
> ice_construct_skb_zc
> igb_build_skb
> igb_construct_skb_zc
> igc_build_skb
> igc_construct_skb
> igc_construct_skb_zc
> ixgbe_build_skb
> ixgbe_construct_skb_zc
> ixgbevf_build_skb
> mvneta_swbm_build_skb
> mlx5e_xsk_construct_skb
> mana_build_skb
> stmmac_construct_skb_zc
> bpf_prog_run_generic_xdp
These run in the drivers that support metadata (data_meta <= data)
> xdp_get_metalen
This uses xdp_data_meta_unsupported
> These ones are regular int:
> xdp_build_skb_from_buff
> xdp_build_skb_from_zc
> xdp_update_frame_from_buff
> tun_xdp_one
> build_skb_from_xdp_buff
These can be called from the drivers that support and don't support
the metadata, so have to (correctly) use int.
> Perhaps a separate patch to convert the regulars to unsigned,
> thoughts?
Take a look at xdp_set_data_meta_invalid and xdp_data_meta_unsupported.
There are cases where xdp->data - xdp->data_meta is -1 (and the callers
check for this condition), we can't use unsigned unconditionally
(unless we use xdp_data_meta_unsupported).
next prev parent reply other threads:[~2025-05-01 1:34 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-30 20:11 [PATCH net-next v2] xdp: Add helpers for head length, headroom, and metadata length Jon Kohler
2025-04-30 21:05 ` Jacob Keller
2025-04-30 21:09 ` Stanislav Fomichev
2025-05-01 1:00 ` Jon Kohler
2025-05-01 1:34 ` Stanislav Fomichev [this message]
2025-05-01 1:42 ` Jon Kohler
2025-05-01 2:26 ` Stanislav Fomichev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aBLPtszlDe74yTlk@mini-arch \
--to=stfomichev@gmail.com \
--cc=andrew+netdev@lunn.ch \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hawk@kernel.org \
--cc=horms@kernel.org \
--cc=jasowang@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=jon@nutanix.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=willemdebruijn.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).