From: Pedro Falcato <pfalcato@suse.de>
To: Luigi Rizzo <lrizzo@google.com>
Cc: rizzo.unipi@gmail.com, m.szyprowski@samsung.com,
robin.murphy@arm.com, willemb@google.com, kuniyu@google.com,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, gregkh@linuxfoundation.org, rafael@kernel.org,
akpm@linux-foundation.org, david@kernel.org,
netdev@vger.kernel.org, linux-mm@kvack.org,
iommu@lists.linux.dev, driver-core@lists.linux.dev,
linux-kernel@vger.kernel.org,
Jesper Dangaard Brouer <hawk@kernel.org>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>
Subject: Re: [PATCH] swiotlb: avoid double copy with swiotlb on tx socket
Date: Tue, 16 Jun 2026 10:20:25 +0100
Message-ID: <ajESl4osXP7roz5q@pedro-suse.lan>
In-Reply-To: <20260615234220.3946885-1-lrizzo@google.com>

(+cc page pool maintainers)

On Mon, Jun 15, 2026 at 11:42:20PM +0000, Luigi Rizzo wrote:
> The use of swiotlb causes an extra data copy on I/O. For tx sockets,
> especially with greedy senders, this has a high chance of happening in
> the softirq handler for tx network interrupts, creating a significant
> performance bottleneck.
>
> Allow tx sockets to allocate socket buffers directly from the bounce
> buffers. This avoids the second copy and removes the above bottleneck.
> The fraction of swiotlb buffers allowed for this feature is set with
> /sys/module/swiotlb/parameters/zerocopy_tx_percent
> (0 means disabled, 90 is the maximum, to avoid persistent I/O failures).
>
> Implementation:
> - define a new page type to unambiguously identify bounce buffers used
> as backing storage for socket buffers
> - modify skb_page_frag_refill to perform the modified allocation
> - modify the destructors __free_frozen_pages(), free_unref_folio() to
> handle those pages and return them to the pool.
>
> The savings are especially visible with fewer queues. In synthetic
> benchmarks, senders with 1-2 queues would cap around 50Gbps with
> conventional swiotlb, and reach over 170Gbps with the feature enabled.

I could be wrong, but I genuinely think the way to go about this is to
use page_pool for regular TX as well. page_pool pages are all DMA-mapped
(so whatever swiotlb optimization you want can be done there), and the net
stack is already aware of these special pages and special skbs, so it
won't Just Return Them to the page allocator.

Otherwise you can easily end up touching code all over the place, and
that's just not great.

This could possibly benefit setups that use an IOMMU as well.
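
To make the recycling argument concrete, here is a minimal userspace
sketch in plain C. It is a toy model, not kernel code and not the
page_pool API: every name in it (toy_pool, toy_buf, toy_alloc, toy_free,
POOL_SIZE, BUF_SIZE) is hypothetical, and the "mapping" is just a counter.
It only illustrates the property described above: buffers are mapped once
when the pool is set up, and the free path recognizes pool-owned buffers
and hands them back to the pool instead of the general allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_SIZE 4    /* hypothetical pool depth */
#define BUF_SIZE  2048 /* hypothetical buffer size */

struct toy_pool;

/* Every buffer carries its owner; NULL means "plain heap buffer". */
struct toy_buf {
	struct toy_pool *owner;
	uint64_t dma_addr; /* stand-in for a mapping made once at setup */
	unsigned char data[BUF_SIZE];
};

struct toy_pool {
	struct toy_buf *free_list[POOL_SIZE];
	size_t nfree;
	unsigned long map_calls; /* counts simulated "map" operations */
};

/* Free the buffers currently sitting in the pool. */
static void toy_pool_destroy(struct toy_pool *p)
{
	while (p->nfree > 0)
		free(p->free_list[--p->nfree]);
}

/* Pre-allocate and "map" every pool buffer exactly once. */
static int toy_pool_init(struct toy_pool *p)
{
	p->nfree = 0;
	p->map_calls = 0;
	for (size_t i = 0; i < POOL_SIZE; i++) {
		struct toy_buf *b = malloc(sizeof(*b));

		if (!b) {
			toy_pool_destroy(p);
			return -1;
		}
		b->owner = p;
		b->dma_addr = 0x1000u * (i + 1); /* fake address */
		p->map_calls++;
		p->free_list[p->nfree++] = b;
	}
	return 0;
}

/* Take a pre-mapped buffer, or fall back to an unmapped heap buffer. */
static struct toy_buf *toy_alloc(struct toy_pool *p)
{
	struct toy_buf *b;

	if (p->nfree > 0)
		return p->free_list[--p->nfree];

	b = malloc(sizeof(*b));
	if (b) {
		b->owner = NULL;
		b->dma_addr = 0;
	}
	return b;
}

/* The free path checks ownership before choosing where the buffer goes. */
static void toy_free(struct toy_buf *b)
{
	struct toy_pool *p = b->owner;

	if (!p) {
		free(b); /* ordinary buffer: back to the allocator */
		return;
	}
	assert(p->nfree < POOL_SIZE);
	p->free_list[p->nfree++] = b; /* pool buffer: recycled, still mapped */
}
```

In this model map_calls stays at POOL_SIZE however many alloc/free cycles
run, which is the cost profile one would hope to get from pre-mapped
pages. The real kernel interfaces differ in almost every detail.
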
--
Pedro