All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Valerio <pvalerio@redhat.com>
To: "Théo Lebrun" <theo.lebrun@bootlin.com>, netdev@vger.kernel.org
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>,
	Claudiu Beznea <claudiu.beznea@tuxon.dev>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Lorenzo Bianconi <lorenzo@kernel.org>
Subject: Re: [PATCH RFC net-next 0/6] net: macb: Add XDP support and page pool integration
Date: Tue, 02 Dec 2025 18:24:50 +0100	[thread overview]
Message-ID: <87fr9szti5.fsf@redhat.com> (raw)
In-Reply-To: <DEITSIO441QL.X81MVLL3EIV4@bootlin.com>

Hello Théo,

thank you for the feedback

On 26 Nov 2025 at 07:08:14 PM, Théo Lebrun <theo.lebrun@bootlin.com> wrote:

> Hello Paolo,
>
> So this is an initial review, I'll start here with five series-wide
> topics and send small per-line comments (ie nitpicks) in a second stage.
>
>
>
> ### Rx buffer size computation
>
> The buffer size computation should be reworked. At the end of the series
> it looks like:
>
> static int macb_open(struct net_device *dev)
> {
>     size_t bufsz = dev->mtu + ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN;
>
>     // ...
>
>     macb_init_rx_buffer_size(bp, bufsz);
>
>     // ...
> }
>
> static void macb_init_rx_buffer_size(struct macb *bp, size_t size)
> {
>     if (!macb_is_gem(bp)) {
>         bp->rx_buffer_size = MACB_RX_BUFFER_SIZE;
>     } else {
>         bp->rx_buffer_size = size
>             + SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
>             + MACB_PP_HEADROOM;
>
>         if (bp->rx_buffer_size > PAGE_SIZE)
>             bp->rx_buffer_size = PAGE_SIZE;
>
>         if (bp->rx_buffer_size % RX_BUFFER_MULTIPLE)
>             bp->rx_buffer_size = roundup(bp->rx_buffer_size, RX_BUFFER_MULTIPLE);
>     }
> }
>
> Most of the issues with this code is not stemming from your series, but
> this big rework is the right moment to fix it all.
>
>  - NET_IP_ALIGN is accounted for in the headroom even though it isn't
>    present if !RSC.
>

that's something I noticed and I was a unsure about the reason.

>  - When skbuff.c/h allocates an SKB buffer, it SKB_DATA_ALIGN()s
>    headroom+data. We should probably do the same. In our case it would
>    be:
>
>    bp->rx_buffer_size = SKB_DATA_ALIGN(MACB_PP_HEADROOM + size) +
>                         SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>    // or
>    bp->rx_buffer_size = SKB_HEAD_ALIGN(MACB_PP_HEADROOM + size);
>
>    I've not computed if it can ever give a different value to your
>    series in terms of total size or shinfo alignment. I'd guess it is
>    unlikely.
>

I'll take a look at this

>  - If the size clamping to PAGE_SIZE comes into play, we are probably
>    doomed. It means we cannot deal with the MTU and we'll probably get
>    corruption. If we do put a check in place, it should loudly fail
>    rather than silently clamp.
>

That should not happen, unless I'm missing something.
E.g., 9000B mtu on a 4K PAGE_SIZE kernel should be handled with multiple
descriptors. The clamping is there because according with how the series
creates the pool, the maximum buffer size is page order 0.

Hardware-wise bp->rx_buffer_size should also be taken into account for
the receive buffer size.

> TLDR: I think macb_init_rx_buffer_size() should encapsulate the whole
> rx buffer size computation. It can use bp->rx_offset and add on top
> MTU & co. It might start failing if >PAGE_SIZE (?).
>

ack, I'll move that part

>
>
> ### Buffer variable names
>
> Related: so many variables, fields or constants have ambiguous names,
> can we do something about it?
>
>  - bp->rx_offset is named oddly to my ears. Offset to what?
>    Maybe bp->rx_head or bp->rx_headroom?
>

bp->rx_headroom sounds a good choice to me, but if you have a stronger
preference for bp->rx_head just let me know.

>  - bp->rx_buffer_size: it used to be approximately the payload size
>    (plus NET_IP_ALIGN). Now it contains the XDP headroom and shinfo.
>    That's on GEM, because on MACB it means something different.
>
>    This line is a bit ironic and illustrates the trouble:
>       buffer_size = SKB_WITH_OVERHEAD(bp->rx_buffer_size) - bp->rx_offset
>
>  - data in gem_rx|gem_rx_refill|gem_free_rx_buffers() points to what we
>    could call skb->head, ie start of buffer & start of XDP headroom.
>    Its name implied skb->data to me, ie after the headroom.
>
>    It also made `data - page_address(page) + bp->rx_offset` hard to
>    understand. It is easier for me to understand that the following is
>    the page fragment offset till skb->data:
>
>       buff_head + bp->rx_headroom - page_address(page)
>

ack, will change this

>  - MACB_MAX_PAD is ambiguous. It rhymes with NET_SKB_PAD which is the
>    default SKB headroom but here it contains part of the headroom
>    (XDP_PACKET_HEADROOM but not NET_IP_ALIGN) and the tailroom (shinfo).
>

uhm, looking at this, I think NET_IP_ALIGN should be part of it (for the
!rsc case).
I'll revisit this part, though.

>  - Field `data` in `struct macb_tx_buff` points to skb/xdp_frame but my
>    initial thought was skb->data pointer (ie after headroom).
>    What about va or ptr or buff or frame or ...?
>

I see. At some point I considered buff, but then I realized
tx_buff->buff was not perfect, hence data :)

I guess one between frame or va works, thanks.

> TLDR: I had a hard time understanding size/offset expressions (both from
> existing code and the series) because of variable names.
>

ack. Will revisit this aspect based on your suggestions.

>
>
> ### Duplicated buffer size computations
>
> Last point related to buffer size computation:
>
>  - it is duplicated in macb_set_mtu() but forgets NET_IP_ALIGN & proper
>    SKB_DATA_ALIGN() and,
>
>  - it is duplicated in gem_xdp_setup() but I don't see why because the
>    buffer size is computed to work fine with/without XDP enabled. Also
>    this check means we cannot load an XDP program before macb_open()
>    because bp->rx_buffer_size == 0.
>
> TLDR: Let's deduplicate size computations to minimise chances of getting
> it wrong.
>

ack

>
>
> ### Allocation changes
>
> I am not convinced by patches 1/6 and 2/6 that change the alloc strategy
> in two steps, from netdev_alloc_skb() to page_pool_alloc_pages() to
> page_pool_alloc_frag().
>
>  - The transient solution isn't acceptable when PAGE_SIZE is large.
>    We have 16K and would never want one buffer per page.
>
>  - It forces you to introduce temporary code & constants which is added
>    noise IMO. MACB_PP_MAX_BUF_SIZE() is odd as is the alignment of
>    buffer sizes to page sizes. It forces you to deal with
>    `bp->rx_buffer_size > PAGE_SIZE` which we could ignore. Right?
>
> TLDR: do alloc changes in one step.
>

yes, makes sense I'll squash them.

>
>
> ### XDP_SETUP_PROG if netif_running()
>
> I'd like to start a discussion on the expected behavior on XDP program
> change if netif_running(). Summarised:
>
> static int gem_xdp_setup(struct net_device *dev, struct bpf_prog *prog,
>              struct netlink_ext_ack *extack)
> {
>     bool running = netif_running(dev);
>     bool need_update = !!bp->prog != !!prog;
>
>     if (running && need_update)
>         macb_close(dev);
>     old_prog = rcu_replace_pointer(bp->prog, prog, lockdep_rtnl_is_held());
>     if (running && need_update)
>         return macb_open(dev);
> }
>
> Have you experimented with that? I don't see anything graceful in our
> close operation, it looks like we'll get corruption or dropped packets
> or both. We shouldn't impose that on the user who just wanted to swap
> the program.
>
> I cannot find any good reason that implies we wouldn't be able to swap
> our XDP program on the fly. If we think it is unsafe, I'd vote for
> starting with a -EBUSY return code and iterating on that.
>

I didn't experiment much with this, other than simply adding and
removing programs as needed during my tests. Didn't experience
particular issues.

The reason a close/open sequence was added here was mostly because I was
considering to account XDP_PACKET_HEADROOM only when a program was
present. I later decided to not proceed with that (mostly to avoid
changing too many things at once).

Given the geometry of the buffer remains untouched in either case, I
see no particular reasons we can't swap on the fly as you suggest.

I'll try this and change it, thanks!

> TLDR: macb_close() on XDP program change is probably a bad idea.
>
> Thanks,
>
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


  reply	other threads:[~2025-12-02 17:25 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-19 13:53 [PATCH RFC net-next 0/6] net: macb: Add XDP support and page pool integration Paolo Valerio
2025-11-19 13:53 ` [PATCH RFC net-next 1/6] cadence: macb/gem: Add page pool support Paolo Valerio
2025-11-27 13:21   ` Théo Lebrun
2025-12-02 17:30     ` Paolo Valerio
2025-11-19 13:53 ` [PATCH RFC net-next 2/6] cadence: macb/gem: handle multi-descriptor frame reception Paolo Valerio
2025-11-27 13:38   ` Théo Lebrun
2025-12-02 17:32     ` Paolo Valerio
2025-12-08 10:21       ` Théo Lebrun
2025-12-08 10:22         ` [PATCH 1/8] net: macb: move Rx buffers alloc from link up to open Théo Lebrun
2025-12-08 12:53         ` [PATCH RFC net-next 2/6] cadence: macb/gem: handle multi-descriptor frame reception Paolo Valerio
2025-12-09  9:01           ` Théo Lebrun
2025-11-19 13:53 ` [PATCH RFC net-next 3/6] cadence: macb/gem: use the current queue number for stats Paolo Valerio
2025-11-19 13:53 ` [PATCH RFC net-next 4/6] cadence: macb/gem: add XDP support for gem Paolo Valerio
2025-11-27 14:41   ` Théo Lebrun
2025-12-02 17:32     ` Paolo Valerio
2025-12-08 10:59       ` Théo Lebrun
2025-11-19 13:53 ` [PATCH RFC net-next 5/6] cadence: macb/gem: make tx path skb agnostic Paolo Valerio
2025-11-27 14:49   ` Théo Lebrun
2025-12-02 17:33     ` Paolo Valerio
2025-11-19 13:53 ` [PATCH RFC net-next 6/6] cadence: macb/gem: introduce xmit support Paolo Valerio
2025-11-22 20:49   ` kernel test robot
2025-11-27 15:07   ` Théo Lebrun
2025-12-02 17:34     ` Paolo Valerio
2025-12-08 11:01       ` Théo Lebrun
2025-11-25 16:50 ` [PATCH RFC net-next 0/6] net: macb: Add XDP support and page pool integration Théo Lebrun
2025-11-25 23:11   ` Paolo Valerio
2025-11-26 18:08 ` Théo Lebrun
2025-12-02 17:24   ` Paolo Valerio [this message]
2025-12-03 14:28     ` Théo Lebrun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fr9szti5.fsf@redhat.com \
    --to=pvalerio@redhat.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=claudiu.beznea@tuxon.dev \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=lorenzo@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nicolas.ferre@microchip.com \
    --cc=pabeni@redhat.com \
    --cc=theo.lebrun@bootlin.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.