Netdev List
From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: Donald Hunter <donald.hunter@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Vivek Kasireddy <vivek.kasireddy@intel.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	Shuah Khan <shuah@kernel.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org, linux-kselftest@vger.kernel.org,
	sdf@fomichev.me, razor@blackwall.org, daniel@iogearbox.net,
	almasrymina@google.com, matttbe@kernel.org, skhawaja@google.com,
	dw@davidwei.uk, Bobby Eshleman <bobbyeshleman@meta.com>
Subject: Re: [PATCH net-next 2/4] udmabuf: emit one sg entry per pinned folio
Date: Tue, 9 Jun 2026 07:58:29 -0700
Message-ID: <aigqFQWzPkiSh3ie@devvm29614.prn0.facebook.com>
In-Reply-To: <a51e97bd-39dc-492f-bd7d-f137423277df@amd.com>

On Mon, Jun 08, 2026 at 03:59:04PM +0200, Christian König wrote:
> On 6/8/26 15:55, Bobby Eshleman wrote:
> > 
> > On Sun, Jun 7, 2026 at 11:42 PM Christian König <christian.koenig@amd.com> wrote:
> > 
> >     On 6/5/26 20:44, Bobby Eshleman wrote:
> >     > On Fri, Jun 05, 2026 at 11:30:07AM +0200, Christian König wrote:
> >     >> On 6/4/26 02:42, Bobby Eshleman wrote:
> >     >>> From: Bobby Eshleman <bobbyeshleman@meta.com>
> >     >>>
> >     >>> get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
> >     >>> underlying folio was larger.
> >     >>>
> >     >>> Instead, walk folios[] and emit one sg entry per folio. When folios
> >     >>> represent large pages (as is for MFD_HUGETLB), each sg entry is a large
> >     >>> page. Normal PAGE_SIZE sg tables are unchanged.
> >     >>>
> >     >>> Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
> >     >>> udmabuf.
> >     >>
> >     >> That doesn't explain why this is required.
> >     >
> >     > Sure, can definitely add. Devmem currently requires dmabuf sg entries to
> >     > be length and size aligned when it allocates niovs for NIC page pools.
> >     > Though udmabuf is not violating any dmabuf contract by emitting
> >     > PAGE_SIZE entries and the above restriction is probably more a
> >     > shortfalling of devmem, by emitting a single entry per folio this patch
> >     > allows udmabuf to be used by devmem for large pages.
> >     >
> >     >>
> >     >> Please note that accessing the pages/folio of an sg-table returned by DMA-buf is illegal and strictly forbidden!
> >     >>
> >     >> Regards,
> >     >> Christian.
> >     >
> >     > It seems both devmem and io_uring zcrx at least introspect through to
> >     > the sg-table to build NIC page pools (not accessing the memory itself,
> >     > however). Is there a better way?
> > 
> >     That's an absolute NO-GO! We need to stop that immediately.
> > 
> >     Touching the underlying struct page of an DMA-buf exported sg-table is strictly forbidden.
> > 
> >     We even have code to wrap the sg_table and hide the struct pages on debug builds to catch those issues, see function dma_buf_wrap_sg_table().
> > 
> >     My last status is that the NIC page pools are build directly from the DMA addresses exposed by the sg_table.
> > 
> >     Was there any change I'm not aware of?
> > 
> >     Regards,
> >     Christian.
> > 
> > 
> > Oh no change, your mental model is still current.
> > They just go through each sg and use sg_dma_address() on each.
> 
> Ah, thanks! That was a near heart attack :D
> 
> Yeah that is perfectly correct, question is do you then still really need this udmabuf change? I mean the DMA API usually merges together contiguous DMA addresses.
> 
> Regards,
> Christian.
> 

Hey Christian, sorry for the delay, I just wanted to double-check what
I'm seeing...

I reverted the udmabuf patch and confirmed that devmem still runs into
4K pages even for a hugepage-backed udmabuf. I see that the
dma_map_direct() path is being taken which, if I am reading the code
correctly, results in sg_dma_len(sg) inheriting sg->length directly
(set by udmabuf's sg_set_folio(..., PAGE_SIZE) call). The
iommu_dma_map_phys() path, by contrast, looks like it does merge when
possible.
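
To make the difference concrete, here is a toy Python model of the two
behaviours described above. It is only a sketch, not kernel code: the
names SgEntry, build_per_page_sgl(), map_direct() and map_merging() are
made up for illustration, the base address is arbitrary, and physical
contiguity is used as a simplified stand-in for whatever conditions the
real IOMMU path checks before merging.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096
HUGE_PAGE_SIZE = 2 * 1024 * 1024  # assumed 2M hugepage, for illustration


@dataclass
class SgEntry:
    """Hypothetical stand-in for one scatterlist entry."""
    phys: int    # start address of the segment
    length: int  # CPU-side length, analogous to sg->length


def build_per_page_sgl(base, folio_size, page_size=PAGE_SIZE):
    """Emit one PAGE_SIZE entry per page, as in the pre-patch behaviour."""
    return [SgEntry(base + off, page_size)
            for off in range(0, folio_size, page_size)]


def map_direct(sgl):
    """Direct-style mapping: each entry is mapped on its own, so the
    DMA length is simply inherited from the entry's length."""
    return [(sg.phys, sg.length) for sg in sgl]


def map_merging(sgl):
    """Merging-style mapping: coalesce entries that are contiguous."""
    merged = []
    for sg in sgl:
        if merged and merged[-1][0] + merged[-1][1] == sg.phys:
            addr, length = merged[-1]
            merged[-1] = (addr, length + sg.length)
        else:
            merged.append((sg.phys, sg.length))
    return merged


sgl = build_per_page_sgl(0x40000000, HUGE_PAGE_SIZE)
direct = map_direct(sgl)
merged = map_merging(sgl)

print(len(direct), direct[0][1])  # 512 4096
print(len(merged), merged[0][1])  # 1 2097152
```

In this model a consumer that walks the mapped entries sees 512
PAGE_SIZE segments in the direct case and a single 2M segment in the
merging case, which matches what I am observing with and without the
udmabuf patch on the direct path.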

Best,
Bobby

Thread overview: 14+ messages
2026-06-04  0:42 [PATCH net-next 0/4] net: devmem: allow rx-buf-size > PAGE_SIZE per binding Bobby Eshleman
2026-06-04  0:42 ` [PATCH net-next 1/4] net: devmem: allow rx-buf-size > PAGE_SIZE per dmabuf binding Bobby Eshleman
2026-06-05 15:33   ` Stanislav Fomichev
2026-06-05 16:20     ` Bobby Eshleman
2026-06-04  0:42 ` [PATCH net-next 2/4] udmabuf: emit one sg entry per pinned folio Bobby Eshleman
2026-06-05  9:30   ` Christian König
2026-06-05 18:44     ` Bobby Eshleman
2026-06-08  6:41       ` Christian König
     [not found]         ` <CAKB00G3opAoAYswsq2uz0Q6jgku8u4NthKOzCbSumZ0qK7QxcQ@mail.gmail.com>
2026-06-08 13:59           ` Christian König
2026-06-09 14:58             ` Bobby Eshleman [this message]
2026-06-04  0:43 ` [PATCH net-next 3/4] selftests/net: ncdevmem: add -b option to set rx-buf-size on bind Bobby Eshleman
2026-06-05 15:35   ` Stanislav Fomichev
2026-06-05 16:56     ` Bobby Eshleman
2026-06-04  0:43 ` [PATCH net-next 4/4] selftests/net: devmem.py: add check_rx_large_niov Bobby Eshleman
