From: Alex Mastro <amastro@fb.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
"Logan Gunthorpe" <logang@deltatee.com>,
"Jens Axboe" <axboe@kernel.dk>,
"Robin Murphy" <robin.murphy@arm.com>,
"Joerg Roedel" <joro@8bytes.org>, "Will Deacon" <will@kernel.org>,
"Marek Szyprowski" <m.szyprowski@samsung.com>,
"Jason Gunthorpe" <jgg@ziepe.ca>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Jonathan Corbet" <corbet@lwn.net>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Christian König" <christian.koenig@amd.com>,
"Kees Cook" <kees@kernel.org>,
"Gustavo A. R. Silva" <gustavoars@kernel.org>,
"Ankit Agrawal" <ankita@nvidia.com>,
"Yishai Hadas" <yishaih@nvidia.com>,
"Shameer Kolothum" <skolothumtho@nvidia.com>,
"Kevin Tian" <kevin.tian@intel.com>,
"Alex Williamson" <alex@shazbot.org>,
"Krishnakant Jaju" <kjaju@nvidia.com>,
"Matt Ochs" <mochs@nvidia.com>,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, iommu@lists.linux.dev,
linux-mm@kvack.org, linux-doc@vger.kernel.org,
linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
linaro-mm-sig@lists.linaro.org, kvm@vger.kernel.org,
linux-hardening@vger.kernel.org,
"Nicolin Chen" <nicolinc@nvidia.com>,
"Jason Gunthorpe" <jgg@nvidia.com>
Subject: Re: [PATCH v9 06/11] dma-buf: provide phys_vec to scatter-gather mapping routine
Date: Tue, 25 Nov 2025 16:18:03 -0800
Message-ID: <aSZHO6otK0Heh+Qj@devgpu015.cco6.facebook.com>
In-Reply-To: <20251120-dmabuf-vfio-v9-6-d7f71607f371@nvidia.com>
On Thu, Nov 20, 2025 at 11:28:25AM +0200, Leon Romanovsky wrote:
> +static struct scatterlist *fill_sg_entry(struct scatterlist *sgl, size_t length,
> +					 dma_addr_t addr)
> +{
> +	unsigned int len, nents;
> +	int i;
> +
> +	nents = DIV_ROUND_UP(length, UINT_MAX);
> +	for (i = 0; i < nents; i++) {
> +		len = min_t(size_t, length, UINT_MAX);
> +		length -= len;
> +		/*
> +		 * DMABUF abuses scatterlist to create a scatterlist
> +		 * that does not have any CPU list, only the DMA list.
> +		 * Always set the page related values to NULL to ensure
> +		 * importers can't use it. The phys_addr based DMA API
> +		 * does not require the CPU list for mapping or unmapping.
> +		 */
> +		sg_set_page(sgl, NULL, 0, 0);
> +		sg_dma_address(sgl) = addr + i * UINT_MAX;

The multiplication (i * UINT_MAX) is evaluated in 32-bit unsigned arithmetic
before the result is promoted to dma_addr_t for the addition with addr. It
wraps for i >= 2, which is reached when length >= 8 GiB. Needs a cast:

	sg_dma_address(sgl) = addr + (dma_addr_t)i * UINT_MAX;

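For illustration, here is a minimal user-space sketch of the wraparound (not
kernel code). It assumes a 32-bit unsigned int and a 64-bit dma_addr_t;
sg_nents(), sg_offset_buggy() and sg_offset_fixed() are hypothetical helper
names that only mirror the arithmetic in fill_sg_entry():

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: unsigned int is 32 bits wide, as on all Linux targets. */
static_assert(UINT_MAX == 0xFFFFFFFFu, "sketch assumes 32-bit unsigned int");

/* Stand-in for the kernel type; assumed to be 64 bits wide here. */
typedef uint64_t dma_addr_t;

/* Mirrors nents = DIV_ROUND_UP(length, UINT_MAX) from the patch. */
unsigned int sg_nents(uint64_t length)
{
	return (unsigned int)((length + UINT_MAX - 1) / UINT_MAX);
}

/*
 * As written in the patch: int * unsigned int is evaluated as
 * unsigned int, so the product wraps modulo 2^32 before it is
 * widened for the addition.
 */
dma_addr_t sg_offset_buggy(dma_addr_t addr, int i)
{
	return addr + i * UINT_MAX;
}

/* With the cast, the multiplication is done in 64 bits. */
dma_addr_t sg_offset_fixed(dma_addr_t addr, int i)
{
	return addr + (dma_addr_t)i * UINT_MAX;
}
```

With addr = 0 and i = 2, sg_offset_buggy() yields 0xfffffffe while
sg_offset_fixed() yields the intended 0x1fffffffe. sg_nents(0x200000000) is 3,
so an 8 GiB buffer does reach i = 2.
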
I discovered this while debugging why importing an 8 GiB dma-buf was failing
with my earlier toy program [1]. The failure surfaced as
ib_umem_find_best_pgsz() returning 0 because of the malformed scatterlist,
which bubbles up as -EINVAL.

$ ./test_dmabuf 0000:05:00.0 3 4 0 0x200000000
opening 0000:05:00.0 via /dev/vfio/56
allocating dma_buf bar_idx=4, bar_offset=0x0, size=0x200000000
allocated dma_buf fd=6
discovered 4 ibv devices: mlx5_0 mlx5_1 mlx5_2 mlx5_3
opened ibv device 3: mlx5_3
test_dmabuf.c:154 Condition failed: 'mr' (errno=22: Invalid argument)
$ sudo retsnoop -e mlx5_ib_reg_user_mr_dmabuf -a 'mlx5*' -a 'ib_umem*' -a '*umr*' -a 'vfio_pci*' -a 'dma_buf_*' -x EINVAL -T
Receiving data...
13:56:22.257907 -> 13:56:22.258275 TID/PID 948895/948895 (test_dmabuf/test_dmabuf):
FUNCTION CALLS                                RESULT                 DURATION
--------------------------------------------  --------------------  ---------
→ mlx5_ib_reg_user_mr_dmabuf
    ↔ mlx5r_umr_resource_init                 [0]                     2.224us
    → ib_umem_dmabuf_get
        → ib_umem_dmabuf_get_with_dma_device
            ↔ dma_buf_get                     [0xff11012a6a098c00]    0.972us
            → dma_buf_dynamic_attach
                ↔ vfio_pci_dma_buf_attach     [0]                     2.003us
            ← dma_buf_dynamic_attach          [0xff1100012793e400]   10.566us
        ← ib_umem_dmabuf_get_with_dma_device  [0xff110127a6c74480]   15.794us
    ← ib_umem_dmabuf_get                      [0xff110127a6c74480]   25.258us
    → mlx5_ib_init_dmabuf_mr
        → ib_umem_dmabuf_map_pages
            → dma_buf_map_attachment
                → vfio_pci_dma_buf_map
                    ↔ dma_buf_map             [0xff1100012977f700]    4.918us
                ← vfio_pci_dma_buf_map        [0xff1100012977f700]    8.362us
            ← dma_buf_map_attachment          [0xff1100012977f700]   10.956us
        ← ib_umem_dmabuf_map_pages            [0]                    17.336us
        ↔ ib_umem_find_best_pgsz              [0]                     6.280us
        → ib_umem_dmabuf_unmap_pages
            → dma_buf_unmap_attachment
                → vfio_pci_dma_buf_unmap
                    ↔ dma_buf_unmap           [void]                  2.023us
                ← vfio_pci_dma_buf_unmap      [void]                  6.700us
            ← dma_buf_unmap_attachment        [void]                  8.142us
        ← ib_umem_dmabuf_unmap_pages          [void]                 14.953us
    ← mlx5_ib_init_dmabuf_mr                  [-EINVAL]              67.272us
    → mlx5r_umr_revoke_mr
        → mlx5r_umr_post_send_wait
            → mlx5r_umr_post_send
                ↔ mlx5r_begin_wqe             [0]                     1.703us
                ↔ mlx5r_finish_wqe            [void]                  1.633us
                ↔ mlx5r_ring_db               [void]                  1.312us
            ← mlx5r_umr_post_send             [0]                    27.451us
        ← mlx5r_umr_post_send_wait            [0]                   126.541us
    ← mlx5r_umr_revoke_mr                     [0]                   141.925us
    → ib_umem_release
        → ib_umem_dmabuf_release
            ↔ ib_umem_dmabuf_revoke           [void]                  1.582us
            ↔ dma_buf_detach                  [void]                  3.765us
            ↔ dma_buf_put                     [void]                  0.531us
        ← ib_umem_dmabuf_release              [void]                 23.315us
    ← ib_umem_release                         [void]                 40.301us
← mlx5_ib_reg_user_mr_dmabuf                  [-EINVAL]             363.280us

[1] https://lore.kernel.org/all/aQkLcAxEn4qmF3c4@devgpu015.cco6.facebook.com/
Alex