linux-doc.vger.kernel.org archive mirror
From: Alex Mastro <amastro@fb.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
	"Logan Gunthorpe" <logang@deltatee.com>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Joerg Roedel" <joro@8bytes.org>, "Will Deacon" <will@kernel.org>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Kees Cook" <kees@kernel.org>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	"Ankit Agrawal" <ankita@nvidia.com>,
	"Yishai Hadas" <yishaih@nvidia.com>,
	"Shameer Kolothum" <skolothumtho@nvidia.com>,
	"Kevin Tian" <kevin.tian@intel.com>,
	"Alex Williamson" <alex@shazbot.org>,
	"Krishnakant Jaju" <kjaju@nvidia.com>,
	"Matt Ochs" <mochs@nvidia.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, iommu@lists.linux.dev,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org, kvm@vger.kernel.org,
	linux-hardening@vger.kernel.org,
	"Nicolin Chen" <nicolinc@nvidia.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>
Subject: Re: [PATCH v9 06/11] dma-buf: provide phys_vec to scatter-gather mapping routine
Date: Tue, 25 Nov 2025 16:18:03 -0800
Message-ID: <aSZHO6otK0Heh+Qj@devgpu015.cco6.facebook.com>
In-Reply-To: <20251120-dmabuf-vfio-v9-6-d7f71607f371@nvidia.com>

On Thu, Nov 20, 2025 at 11:28:25AM +0200, Leon Romanovsky wrote:
> +static struct scatterlist *fill_sg_entry(struct scatterlist *sgl, size_t length,
> +					 dma_addr_t addr)
> +{
> +	unsigned int len, nents;
> +	int i;
> +
> +	nents = DIV_ROUND_UP(length, UINT_MAX);
> +	for (i = 0; i < nents; i++) {
> +		len = min_t(size_t, length, UINT_MAX);
> +		length -= len;
> +		/*
> +		 * DMABUF abuses scatterlist to create a scatterlist
> +		 * that does not have any CPU list, only the DMA list.
> +		 * Always set the page related values to NULL to ensure
> +		 * importers can't use it. The phys_addr based DMA API
> +		 * does not require the CPU list for mapping or unmapping.
> +		 */
> +		sg_set_page(sgl, NULL, 0, 0);
> +		sg_dma_address(sgl) = addr + i * UINT_MAX;

(i * UINT_MAX) is evaluated in 32-bit arithmetic before being promoted to
dma_addr_t for the addition with addr, so it overflows for i >= 2, which
happens once length >= 8 GiB. It needs a cast:

		sg_dma_address(sgl) = addr + (dma_addr_t)i * UINT_MAX;

Discovered this while debugging why dma-buf import was failing for
an 8 GiB dma-buf using my earlier toy program [1]. The bug surfaced as
ib_umem_find_best_pgsz() returning 0 for the malformed scatterlist, which
bubbles up as -EINVAL.

$ ./test_dmabuf 0000:05:00.0 3 4 0 0x200000000
opening 0000:05:00.0 via /dev/vfio/56
allocating dma_buf bar_idx=4, bar_offset=0x0, size=0x200000000
allocated dma_buf fd=6
discovered 4 ibv devices: mlx5_0 mlx5_1 mlx5_2 mlx5_3
opened ibv device 3: mlx5_3
test_dmabuf.c:154 Condition failed: 'mr' (errno=22: Invalid argument)

$ sudo retsnoop -e mlx5_ib_reg_user_mr_dmabuf -a 'mlx5*' -a 'ib_umem*' -a '*umr*' -a 'vfio_pci*' -a 'dma_buf_*' -x EINVAL -T
Receiving data...
13:56:22.257907 -> 13:56:22.258275 TID/PID 948895/948895 (test_dmabuf/test_dmabuf):
FUNCTION CALLS                                 RESULT                 DURATION
--------------------------------------------   --------------------  ---------
→ mlx5_ib_reg_user_mr_dmabuf
    ↔ mlx5r_umr_resource_init                  [0]                     2.224us
    → ib_umem_dmabuf_get
        → ib_umem_dmabuf_get_with_dma_device
            ↔ dma_buf_get                      [0xff11012a6a098c00]    0.972us
            → dma_buf_dynamic_attach
                ↔ vfio_pci_dma_buf_attach      [0]                     2.003us
            ← dma_buf_dynamic_attach           [0xff1100012793e400]   10.566us
        ← ib_umem_dmabuf_get_with_dma_device   [0xff110127a6c74480]   15.794us
    ← ib_umem_dmabuf_get                       [0xff110127a6c74480]   25.258us
    → mlx5_ib_init_dmabuf_mr
        → ib_umem_dmabuf_map_pages
            → dma_buf_map_attachment
                → vfio_pci_dma_buf_map
                    ↔ dma_buf_map              [0xff1100012977f700]    4.918us
                ← vfio_pci_dma_buf_map         [0xff1100012977f700]    8.362us
            ← dma_buf_map_attachment           [0xff1100012977f700]   10.956us
        ← ib_umem_dmabuf_map_pages             [0]                    17.336us
        ↔ ib_umem_find_best_pgsz               [0]                     6.280us
        → ib_umem_dmabuf_unmap_pages
            → dma_buf_unmap_attachment
                → vfio_pci_dma_buf_unmap
                    ↔ dma_buf_unmap            [void]                  2.023us
                ← vfio_pci_dma_buf_unmap       [void]                  6.700us
            ← dma_buf_unmap_attachment         [void]                  8.142us
        ← ib_umem_dmabuf_unmap_pages           [void]                 14.953us
    ← mlx5_ib_init_dmabuf_mr                   [-EINVAL]              67.272us
    → mlx5r_umr_revoke_mr
        → mlx5r_umr_post_send_wait
            → mlx5r_umr_post_send
                ↔ mlx5r_begin_wqe              [0]                     1.703us
                ↔ mlx5r_finish_wqe             [void]                  1.633us
                ↔ mlx5r_ring_db                [void]                  1.312us
            ← mlx5r_umr_post_send              [0]                    27.451us
        ← mlx5r_umr_post_send_wait             [0]                   126.541us
    ← mlx5r_umr_revoke_mr                      [0]                   141.925us
    → ib_umem_release
        → ib_umem_dmabuf_release
            ↔ ib_umem_dmabuf_revoke            [void]                  1.582us
            ↔ dma_buf_detach                   [void]                  3.765us
            ↔ dma_buf_put                      [void]                  0.531us
        ← ib_umem_dmabuf_release               [void]                 23.315us
    ← ib_umem_release                          [void]                 40.301us
← mlx5_ib_reg_user_mr_dmabuf                   [-EINVAL]             363.280us

[1] https://lore.kernel.org/all/aQkLcAxEn4qmF3c4@devgpu015.cco6.facebook.com/

Alex

Thread overview: 25+ messages
2025-11-20  9:28 [PATCH v9 00/11] vfio/pci: Allow MMIO regions to be exported through dma-buf Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 01/11] PCI/P2PDMA: Separate the mmap() support from the core logic Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 02/11] PCI/P2PDMA: Simplify bus address mapping API Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 03/11] PCI/P2PDMA: Refactor to separate core P2P functionality from memory allocation Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 04/11] PCI/P2PDMA: Provide an access to pci_p2pdma_map_type() function Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 05/11] PCI/P2PDMA: Document DMABUF model Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 06/11] dma-buf: provide phys_vec to scatter-gather mapping routine Leon Romanovsky
2025-11-20  9:33   ` Christian König
2025-11-20 10:03     ` Leon Romanovsky
2025-11-26  0:18   ` Alex Mastro [this message]
2025-11-26 13:12     ` Pranjal Shrivastava
2025-11-26 16:08       ` Alex Mastro
2025-11-26 16:54         ` Jason Gunthorpe
2025-11-20  9:28 ` [PATCH v9 07/11] vfio: Export vfio device get and put registration helpers Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 08/11] vfio/pci: Share the core device pointer while invoking feature functions Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 09/11] vfio/pci: Enable peer-to-peer DMA transactions by default Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 10/11] vfio/pci: Add dma-buf export support for MMIO regions Leon Romanovsky
2025-11-21  0:04   ` Alex Williamson
2025-11-21  0:23     ` Jason Gunthorpe
2025-11-21  0:40       ` Alex Williamson
2025-11-21  7:42     ` Leon Romanovsky
2025-11-20  9:28 ` [PATCH v9 11/11] vfio/nvgrace: Support get_dmabuf_phys Leon Romanovsky
2025-11-20 17:13   ` Ankit Agrawal
2025-11-20 17:23 ` [PATCH v9 00/11] vfio/pci: Allow MMIO regions to be exported through dma-buf Ankit Agrawal
2025-11-21 16:24 ` Alex Williamson