From: Jakub Kicinski <kuba@kernel.org>
To: Mina Almasry <almasrymina@google.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>,
netdev@vger.kernel.org, Andrew Lunn <andrew@lunn.ch>,
davem@davemloft.net, Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
Donald Hunter <donald.hunter@gmail.com>,
Michael Chan <michael.chan@broadcom.com>,
Pavan Chebbi <pavan.chebbi@broadcom.com>,
Jesper Dangaard Brouer <hawk@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
Stanislav Fomichev <sdf@fomichev.me>,
Joshua Washington <joshwash@google.com>,
Harshitha Ramamurthy <hramamurthy@google.com>,
Jian Shen <shenjian15@huawei.com>,
Salil Mehta <salil.mehta@huawei.com>,
Jijie Shao <shaojijie@huawei.com>,
Sunil Goutham <sgoutham@marvell.com>,
Geetha sowjanya <gakula@marvell.com>,
Subbaraya Sundeep <sbhatta@marvell.com>,
hariprasad <hkelam@marvell.com>,
Bharat Bhushan <bbhushan2@marvell.com>,
Saeed Mahameed <saeedm@nvidia.com>,
Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
Alexander Duyck <alexanderduyck@fb.com>,
kernel-team@meta.com,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
Joe Damato <joe@dama.to>, David Wei <dw@davidwei.uk>,
Willem de Bruijn <willemb@google.com>,
Breno Leitao <leitao@debian.org>,
Dragos Tatulea <dtatulea@nvidia.com>,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-rdma@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>
Subject: Re: [PATCH net-next v4 00/24][pull request] Queue configs and large buffer providers
Date: Tue, 14 Oct 2025 18:41:19 -0700
Message-ID: <20251014184119.3ba2dd70@kernel.org>
In-Reply-To: <CAHS8izOupVhkaZXNDmZo8KzR42M+rxvvmmLW=9r3oPoNOC6pkQ@mail.gmail.com>
On Mon, 13 Oct 2025 21:41:38 -0700 Mina Almasry wrote:
> > I'd like to rework these a little bit.
> > On reflection I don't like the single size control.
> > Please hold off.
>
> FWIW when I last looked at this I didn't like that the size control
> seemed to control the size of the allocations made from the pp, but
> not the size actually posted to the NIC.
>
> I.e. in the scenario where the driver fragments each pp buffer into 2,
> and the user asks for 8K rx-buf-len, the size actually posted to the
> NIC would have actually been 4K (8K / 2 for 2 fragments).
>
> Not sure how much of a concern this really is. I thought it would be
> great if somehow rx-buf-len controlled the buffer sizes actually
> posted to the NIC, because that's what ultimately matters, no (it ends
> up being the size of the incoming frags)? Or does that not matter for
> some reason I'm missing?

I spent a couple of hours trying to write up my thoughts but I still haven't
finished 😅️ I'll send the full thing tomorrow.

You may have looked at hns3, is that right? It bumps the page pool order
by 1 so that it can fit two allocations into each page. I'm guessing
it's a remnant of "page flipping". The other current user of rx-buf-len
(otx2) doesn't do that - it uses a simple page_order(rx_buf_len), AFAICT.
If that's what you mean - I'd chalk the hns3 behavior up to "historical
reasons"; it can probably be straightened out today to everyone's
benefit.
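
To make the two layouts concrete, here is a rough sketch of the arithmetic only. It is not driver code: PAGE_SIZE = 4096 is an assumption, and get_order(), plain_layout() and doubled_layout() are hypothetical helper names that model "order from rx_buf_len" and "order bumped by 1, two allocations per page" as described above.

```python
PAGE_SIZE = 4096  # assumed base page size


def get_order(size):
    """Smallest order such that (PAGE_SIZE << order) >= size."""
    order = 0
    while (PAGE_SIZE << order) < size:
        order += 1
    return order


def plain_layout(rx_buf_len):
    """One buffer per page pool page: order derived directly from rx_buf_len."""
    return get_order(rx_buf_len), 1


def doubled_layout(rx_buf_len):
    """Order bumped by 1 so that two rx_buf_len allocations fit in each page."""
    order = get_order(rx_buf_len) + 1
    return order, (PAGE_SIZE << order) // rx_buf_len


# Each call returns (page order, buffers carved out of one page).
print(plain_layout(8192))
print(doubled_layout(8192))
```

Under these assumptions both layouts hand out buffers of the requested length; they differ only in how large the underlying page pool allocation is.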

I wanted to reply already (before I present my "full case" :)) because
my thinking started slipping in the opposite direction of being
concerned about "buffer sizes actually posted to the NIC".

Say the NIC packs packet payloads into buffers like this:

              1        2      3
 packets: xxxxxxxxx  yyyy  zzzzzzz
 buffers: [xxxx] [xxxx] [x|yyy] [y|zzz] [zzzz]

Hope the diagram makes sense, each [....] is 4k, headers went elsewhere.

Say the user filled the page pool with 16k buffers, and the driver split
each of them up into 4k chunks. HW packed the payloads into those 4k
chunks, and GRO re-formed them into just 2 skb frags. Do we really care
that the buffer size on the HW fill ring is 4kB? Isn't what the user
cares about that they saw 2 frags, not 5?
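
The counting in this example can be reproduced with a small model. This is a sketch under stated assumptions, not how any driver or GRO is implemented: all three payloads are assumed to belong to one flow that GRO merges, the 4k chunks are assumed to be handed to HW in address order starting at the beginning of a 16k pool buffer, and chunks from the same pool buffer are assumed to coalesce into one frag. hw_fill() and coalesce() are hypothetical names.

```python
K = 1024


def hw_fill(payload_lens, chunk_size):
    """Pack payloads back to back into fixed-size chunks.

    Returns a list of (offset, length) pairs, one per chunk used.
    """
    total = sum(payload_lens)
    chunks = []
    offset = 0
    while offset < total:
        length = min(chunk_size, total - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


def coalesce(chunks, pool_buf_size):
    """Merge consecutive chunks that were carved from the same pool buffer.

    Returns a list of [pool buffer index, bytes] entries, one per frag.
    Assumes chunk_size divides pool_buf_size, so no chunk straddles buffers.
    """
    frags = []
    for offset, length in chunks:
        buf = offset // pool_buf_size
        if frags and frags[-1][0] == buf:
            frags[-1][1] += length
        else:
            frags.append([buf, length])
    return frags


# Payloads from the diagram: 9k, 4k and 7k, packed into 4k chunks.
chunks = hw_fill([9 * K, 4 * K, 7 * K], 4 * K)
frags = coalesce(chunks, 16 * K)
print(len(chunks), "chunks on the fill ring,", len(frags), "frags seen by the user")
```

In this model the fill ring sees five 4k chunks, while the user sees two frags: one covering a full 16k pool buffer and one covering the first 4k of the next.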