From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: Donald Hunter <donald.hunter@gmail.com>,
Jakub Kicinski <kuba@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Gerd Hoffmann <kraxel@redhat.com>,
Vivek Kasireddy <vivek.kasireddy@intel.com>,
Sumit Semwal <sumit.semwal@linaro.org>,
Shuah Khan <shuah@kernel.org>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
linaro-mm-sig@lists.linaro.org, linux-kselftest@vger.kernel.org,
sdf@fomichev.me, razor@blackwall.org, daniel@iogearbox.net,
almasrymina@google.com, matttbe@kernel.org, skhawaja@google.com,
dw@davidwei.uk, Bobby Eshleman <bobbyeshleman@meta.com>
Subject: Re: [PATCH net-next 2/4] udmabuf: emit one sg entry per pinned folio
Date: Fri, 5 Jun 2026 11:44:00 -0700
Message-ID: <aiMY8CpckM8Jav0g@devvm29614.prn0.facebook.com>
In-Reply-To: <bdce2488-fe77-4f36-9ed6-dd2c785fa7c1@amd.com>
On Fri, Jun 05, 2026 at 11:30:07AM +0200, Christian König wrote:
> On 6/4/26 02:42, Bobby Eshleman wrote:
> > From: Bobby Eshleman <bobbyeshleman@meta.com>
> >
> > get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
> > underlying folio was larger.
> >
> > Instead, walk folios[] and emit one sg entry per folio. When folios
> > represent large pages (as is for MFD_HUGETLB), each sg entry is a large
> > page. Normal PAGE_SIZE sg tables are unchanged.
> >
> > Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
> > udmabuf.
>
> That doesn't explain why this is required.
Sure, I can add that. Devmem currently requires dmabuf sg entries to
be length and size aligned when it allocates niovs for NIC page pools.
udmabuf is not violating any dmabuf contract by emitting PAGE_SIZE
entries, and the restriction above is probably more a shortcoming of
devmem. Still, by emitting a single entry per folio, this patch allows
udmabuf to be used by devmem for large pages.
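To make the restriction concrete, here is a small user-space model (not
devmem's code). It assumes "aligned" means that each entry's DMA address
and length are multiples of the niov size; the struct, the
MODEL_PAGE_SIZE constant and the model_* helpers are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for one mapped sg entry (not a kernel type). */
struct model_sg_entry {
	uint64_t dma_addr;
	size_t len;
};

/* Assumed rule: address and length are both multiples of niov_size. */
static bool model_entry_fits(const struct model_sg_entry *e, size_t niov_size)
{
	assert(niov_size != 0);
	return e->dma_addr % niov_size == 0 && e->len % niov_size == 0;
}

/* Split a contiguous region of @bytes at @base into entries of @entry_len. */
static size_t model_fill(struct model_sg_entry *tbl, uint64_t base,
			 size_t bytes, size_t entry_len)
{
	size_t n = bytes / entry_len;

	for (size_t k = 0; k < n; k++) {
		tbl[k].dma_addr = base + (uint64_t)k * entry_len;
		tbl[k].len = entry_len;
	}
	return n;
}

/* True only if every entry in the table satisfies the assumed rule. */
static bool model_table_fits(const struct model_sg_entry *tbl, size_t n,
			     size_t niov_size)
{
	for (size_t k = 0; k < n; k++)
		if (!model_entry_fits(&tbl[k], niov_size))
			return false;
	return true;
}
```

Under that assumption, a 2 MiB folio at a 2 MiB-aligned address split
into 512 entries of 4 KiB passes for a 4 KiB niov size but fails for
16 KiB, because each entry's length is not a multiple of 16 KiB. The
same folio emitted as one 2 MiB entry passes for both sizes.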
>
> Please note that accessing the pages/folio of an sg-table returned by DMA-buf is illegal and strictly forbidden!
>
> Regards,
> Christian.
It seems both devmem and io_uring zcrx at least inspect the sg-table
to build NIC page pools (without accessing the memory itself, however).
Is there a better way?
Best,
Bobby
>
> > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
> > ---
> > drivers/dma-buf/udmabuf.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
> > 1 file changed, 42 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> > index 94b8ecb892bb..f28dd3788ada 100644
> > --- a/drivers/dma-buf/udmabuf.c
> > +++ b/drivers/dma-buf/udmabuf.c
> > @@ -141,26 +141,63 @@ static void vunmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
> >  	vm_unmap_ram(map->vaddr, ubuf->pagecount);
> >  }
> >
> > +/* Return the number of contiguous pages backed by the folio at @i.
> > + * A udmabuf may map only part of a folio, or reference the same folio
> > + * in multiple non-contiguous runs, so folio_nr_pages() can't be used.
> > + */
> > +static pgoff_t udmabuf_folio_nr_pages(struct udmabuf *ubuf, pgoff_t i)
> > +{
> > +	struct folio *f = ubuf->folios[i];
> > +	pgoff_t j;
> > +
> > +	for (j = 1; i + j < ubuf->pagecount; j++) {
> > +		if (ubuf->folios[i + j] != f)
> > +			break;
> > +		/* Same folio, but not a sequential offset within it. */
> > +		if (ubuf->offsets[i + j] != ubuf->offsets[i] + j * PAGE_SIZE)
> > +			break;
> > +	}
> > +	return j;
> > +}
> > +
> > +/* Count the contiguous folio runs in @ubuf, one sg entry per run. */
> > +static unsigned int udmabuf_sg_nents(struct udmabuf *ubuf)
> > +{
> > +	unsigned int nents = 0;
> > +	pgoff_t i;
> > +
> > +	for (i = 0; i < ubuf->pagecount; i += udmabuf_folio_nr_pages(ubuf, i))
> > +		nents++;
> > +	return nents;
> > +}
> > +
> >  static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
> >  				     enum dma_data_direction direction)
> >  {
> >  	struct udmabuf *ubuf = buf->priv;
> > -	struct sg_table *sg;
> >  	struct scatterlist *sgl;
> > -	unsigned int i = 0;
> > +	struct sg_table *sg;
> > +	pgoff_t i, run;
> > +	unsigned int nents;
> >  	int ret;
> >
> > +	nents = udmabuf_sg_nents(ubuf);
> > +
> >  	sg = kzalloc_obj(*sg);
> >  	if (!sg)
> >  		return ERR_PTR(-ENOMEM);
> >
> > -	ret = sg_alloc_table(sg, ubuf->pagecount, GFP_KERNEL);
> > +	ret = sg_alloc_table(sg, nents, GFP_KERNEL);
> >  	if (ret < 0)
> >  		goto err_alloc;
> >
> > -	for_each_sg(sg->sgl, sgl, ubuf->pagecount, i)
> > -		sg_set_folio(sgl, ubuf->folios[i], PAGE_SIZE,
> > +	sgl = sg->sgl;
> > +	for (i = 0; i < ubuf->pagecount; i += run) {
> > +		run = udmabuf_folio_nr_pages(ubuf, i);
> > +		sg_set_folio(sgl, ubuf->folios[i], run << PAGE_SHIFT,
> >  			     ubuf->offsets[i]);
> > +		sgl = sg_next(sgl);
> > +	}
> >
> >  	ret = dma_map_sgtable(dev, sg, direction, 0);
> >  	if (ret < 0)
> >
> > --
> > 2.53.0-Meta
> >
>
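The run-coalescing walk in the quoted patch can be tried outside the
kernel with a sketch like the following. It mirrors the logic of
udmabuf_folio_nr_pages() and udmabuf_sg_nents(), but everything here is
a hypothetical stand-in: integer folio ids replace struct folio
pointers, model_ubuf replaces struct udmabuf, and the page size is
fixed at 4 KiB.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical user-space stand-in for the folios[]/offsets[] arrays. */
struct model_ubuf {
	const int *folio_id;	/* folio identity for each page slot */
	const size_t *offset;	/* byte offset into that folio, per slot */
	size_t pagecount;
};

/*
 * Length, in pages, of the run starting at slot @i: consecutive slots
 * that reference the same folio at sequentially increasing offsets.
 */
static size_t model_run_pages(const struct model_ubuf *u, size_t i)
{
	size_t j;

	for (j = 1; i + j < u->pagecount; j++) {
		if (u->folio_id[i + j] != u->folio_id[i])
			break;
		/* Same folio, but not the next page within it. */
		if (u->offset[i + j] != u->offset[i] + j * MODEL_PAGE_SIZE)
			break;
	}
	return j;
}

/* Number of runs, which is the number of sg entries to allocate. */
static size_t model_sg_nents(const struct model_ubuf *u)
{
	size_t nents = 0;

	for (size_t i = 0; i < u->pagecount; i += model_run_pages(u, i))
		nents++;
	return nents;
}
```

For example, seven page slots with folio ids {1, 1, 1, 1, 2, 1, 1} and
offsets {0, 4096, 8192, 12288, 0, 0, 4096} coalesce into three runs of
4, 1 and 2 pages. The second appearance of folio 1 starts a new run
because it is separated from the first by folio 2.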
Thread overview: 11+ messages
2026-06-04 0:42 [PATCH net-next 0/4] net: devmem: allow rx-buf-size > PAGE_SIZE per binding Bobby Eshleman
2026-06-04 0:42 ` [PATCH net-next 1/4] net: devmem: allow rx-buf-size > PAGE_SIZE per dmabuf binding Bobby Eshleman
2026-06-05 15:33 ` Stanislav Fomichev
2026-06-05 16:20 ` Bobby Eshleman
2026-06-04 0:42 ` [PATCH net-next 2/4] udmabuf: emit one sg entry per pinned folio Bobby Eshleman
2026-06-05 9:30 ` Christian König
2026-06-05 18:44 ` Bobby Eshleman [this message]
2026-06-04 0:43 ` [PATCH net-next 3/4] selftests/net: ncdevmem: add -b option to set rx-buf-size on bind Bobby Eshleman
2026-06-05 15:35 ` Stanislav Fomichev
2026-06-05 16:56 ` Bobby Eshleman
2026-06-04 0:43 ` [PATCH net-next 4/4] selftests/net: devmem.py: add check_rx_large_niov Bobby Eshleman