From: Christoph Hellwig <hch@lst.de>
To: Robin Murphy <robin.murphy@arm.com>
Cc: "Leon Romanovsky" <leon@kernel.org>,
"Jens Axboe" <axboe@kernel.dk>, "Jason Gunthorpe" <jgg@ziepe.ca>,
"Joerg Roedel" <joro@8bytes.org>, "Will Deacon" <will@kernel.org>,
"Christoph Hellwig" <hch@lst.de>,
"Sagi Grimberg" <sagi@grimberg.me>,
"Leon Romanovsky" <leonro@nvidia.com>,
"Keith Busch" <kbusch@kernel.org>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Logan Gunthorpe" <logang@deltatee.com>,
"Yishai Hadas" <yishaih@nvidia.com>,
"Shameer Kolothum" <shameerali.kolothum.thodi@huawei.com>,
"Kevin Tian" <kevin.tian@intel.com>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Marek Szyprowski" <m.szyprowski@samsung.com>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Jonathan Corbet" <corbet@lwn.net>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
linux-pci@vger.kernel.org, kvm@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH v1 07/17] dma-mapping: Implement link/unlink ranges API
Date: Mon, 4 Nov 2024 10:10:48 +0100
Message-ID: <20241104091048.GA25041@lst.de>
In-Reply-To: <51c5a5d5-6f90-4c42-b0ef-b87791e00f20@arm.com>
On Thu, Oct 31, 2024 at 09:18:07PM +0000, Robin Murphy wrote:
>> +static int __dma_iova_link(struct device *dev, dma_addr_t addr,
>> + phys_addr_t phys, size_t size, enum dma_data_direction dir,
>> + unsigned long attrs)
>> +{
>> + bool coherent = dev_is_dma_coherent(dev);
>> +
>> + if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
>
> If you really imagine this can support non-coherent operation and
> DMA_ATTR_SKIP_CPU_SYNC, where are the corresponding explicit sync
> operations? dma_sync_single_*() sure as heck aren't going to work...
>
> In fact, same goes for SWIOTLB bouncing even in the coherent case.
Not with explicit sync operations. But plain map/unmap works; I've
actually verified that with nvme. And that's a pretty large use
case.
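To spell out the case this covers: going by the quoted check, the link
path only does cache maintenance itself when the device is non-coherent
and the caller did not pass DMA_ATTR_SKIP_CPU_SYNC. A minimal userspace
sketch of that predicate follows. MODEL_ATTR_SKIP_CPU_SYNC and
model_link_needs_cpu_sync() are made-up stand-ins for illustration, not
kernel symbols.

```c
#include <stdbool.h>

/*
 * Stand-in for the kernel's DMA_ATTR_SKIP_CPU_SYNC flag. The value is an
 * assumption made for this sketch only.
 */
#define MODEL_ATTR_SKIP_CPU_SYNC (1UL << 5)

/*
 * Mirrors the quoted condition:
 *   !coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)
 * Returns true when the link step would sync the CPU caches itself.
 */
static bool model_link_needs_cpu_sync(bool coherent, unsigned long attrs)
{
	return !coherent && !(attrs & MODEL_ATTR_SKIP_CPU_SYNC);
}
```

With SKIP_CPU_SYNC set on a non-coherent device the predicate is false,
which is exactly the combination that has no explicit sync counterpart
in this series.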
>> + arch_sync_dma_for_device(phys, size, dir);
>
> Plus if the aim is to pass P2P and whatever arbitrary physical addresses
> through here as well, how can we be sure this isn't going to explode?
That's a good point. Only P2P transfers that are mapped through the host
bridge can even end up here, so the address is a perfectly valid physical
address in the host. But I'm not sure whether all arch_sync_dma_for_device
implementations handle IOMMU memory fine.
>> + struct iommu_domain *domain = iommu_get_dma_domain(dev);
>> + struct iommu_dma_cookie *cookie = domain->iova_cookie;
>> + struct iova_domain *iovad = &cookie->iovad;
>> + size_t iova_start_pad = iova_offset(iovad, phys);
>> + size_t iova_end_pad = iova_offset(iovad, phys + size);
>
> I thought the code below was wrong until I double-checked and realised that
> this is not what its name implies it to be...
Which variable does this refer to, and what would be a better name?
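For reference, a userspace sketch of what the two quoted values hold,
assuming iova_offset() simply masks an address with the granule size
minus one and that the granule is a power of two. The model_* helpers
are hypothetical names for illustration. The second value is the offset
of the buffer end inside its granule, which is not the same as the
distance from the buffer end to the next granule boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of addr inside its granule (granule must be a power of two). */
static size_t model_iova_offset(size_t granule, uint64_t addr)
{
	return (size_t)(addr & (granule - 1));
}

/* What the quoted code stores in iova_end_pad: offset of the buffer end. */
static size_t model_end_offset(size_t granule, uint64_t phys, size_t size)
{
	return model_iova_offset(granule, phys + size);
}

/* Bytes between the buffer end and the next granule boundary. */
static size_t model_end_gap(size_t granule, uint64_t phys, size_t size)
{
	size_t off = model_end_offset(granule, phys, size);

	return off ? granule - off : 0;
}
```

For a 4k granule, phys 0x10200 and size 0x900, the end offset is 0xb00
while the gap up to the next boundary is 0x500.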
>> + phys = iommu_iova_to_phys(domain, addr);
>> + if (WARN_ON(!phys))
>> + continue;
>> + len = min_t(size_t,
>> + end - addr, iovad->granule - iova_start_pad);
>> +
>> + if (!dev_is_dma_coherent(dev) &&
>> + !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
>> + arch_sync_dma_for_cpu(phys, len, dir);
>> +
>> + swiotlb_tbl_unmap_single(dev, phys, len, dir, attrs);
>
> How do you know that "phys" and "len" match what was originally allocated
> and bounced in, and this isn't going to try to bounce out too much, free
> the wrong slot, or anything else nasty? If it's not supposed to be
> intentional that a sub-granule buffer can be linked to any offset in the
> middle of the IOVA range as long as its original physical address is
> aligned to the IOVA granule size(?), why try to bounce anywhere other than
> the ends of the range at all?
Mostly because the code is simpler and unless misused it just works.
But it might be worth adding explicit checks for the start and end.
>> +static void __iommu_dma_iova_unlink(struct device *dev,
>> + struct dma_iova_state *state, size_t offset, size_t size,
>> + enum dma_data_direction dir, unsigned long attrs,
>> + bool free_iova)
>> +{
>> + struct iommu_domain *domain = iommu_get_dma_domain(dev);
>> + struct iommu_dma_cookie *cookie = domain->iova_cookie;
>> + struct iova_domain *iovad = &cookie->iovad;
>> + dma_addr_t addr = state->addr + offset;
>> + size_t iova_start_pad = iova_offset(iovad, addr);
>> + struct iommu_iotlb_gather iotlb_gather;
>> + size_t unmapped;
>> +
>> + if ((state->__size & DMA_IOVA_USE_SWIOTLB) ||
>> + (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)))
>> + iommu_dma_iova_unlink_range_slow(dev, addr, size, dir, attrs);
>> +
>> + iommu_iotlb_gather_init(&iotlb_gather);
>> + iotlb_gather.queued = free_iova && READ_ONCE(cookie->fq_domain);
>
> Is it really worth the bother?
Worth what?
>> + size = iova_align(iovad, size + iova_start_pad);
>> + addr -= iova_start_pad;
>> + unmapped = iommu_unmap_fast(domain, addr, size, &iotlb_gather);
>> + WARN_ON(unmapped != size);
>> +
>> + if (!iotlb_gather.queued)
>> + iommu_iotlb_sync(domain, &iotlb_gather);
>> + if (free_iova)
>> + iommu_dma_free_iova(cookie, addr, size, &iotlb_gather);
>
> There's no guarantee that "size" is the correct value here, so this has
> every chance of corrupting the IOVA domain.
Yes, but the same is true for every user of the iommu_* API as well.
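To make the rounding concrete: the size that reaches
iommu_dma_free_iova() is derived from the caller-supplied offset and
size, expanded to whole granules. A userspace sketch of that
computation, assuming a power-of-two granule and that iova_align()
rounds up to the granule size (struct model_range and
model_unlink_range() are hypothetical names, not kernel code):

```c
#include <stddef.h>
#include <stdint.h>

struct model_range {
	uint64_t addr;
	size_t size;
};

/*
 * Expand [addr, addr + size) to whole granules, following the quoted code:
 *   size = iova_align(iovad, size + iova_start_pad);
 *   addr -= iova_start_pad;
 */
static struct model_range model_unlink_range(size_t granule, uint64_t addr,
					     size_t size)
{
	size_t start_pad = (size_t)(addr & (granule - 1));
	struct model_range r;

	/* Round size plus leading pad up to a multiple of the granule. */
	r.size = (size + start_pad + granule - 1) & ~(granule - 1);
	/* Move the start back to the granule boundary. */
	r.addr = addr - start_pad;
	return r;
}
```

With a 4k granule, addr 0x20200 and size 0x1f00 this yields the range
starting at 0x20000 with size 0x3000. Nothing in the computation checks
the result against what was originally allocated, so it is only as
correct as the values the caller passes in.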
>> +/**
>> + * dma_iova_unlink - Unlink a range of IOVA space
>> + * @dev: DMA device
>> + * @state: IOVA state
>> + * @offset: offset into the IOVA state to unlink
>> + * @size: size of the buffer
>> + * @dir: DMA direction
>> + * @attrs: attributes of mapping properties
>> + *
>> + * Unlink a range of IOVA space for the given IOVA state.
>
> If I initially link a large range in one go, then unlink a small part of
> it, what behaviour can I expect?
As in map, say, 128k and then unmap 4k? It will just work, even if that
is not the intended use case. The intended use cases are either mapping
everything up front and unmapping everything together, or the HMM version
of constant random mapping and unmapping at page-size granularity.
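As a toy illustration of the 128k/4k case, here is a userspace model
that tracks linked 4k pages in a bitmap. It is only a sketch of the
bookkeeping, not of the real implementation: struct model_iova_state and
the model_* functions are invented for this example, and offsets and
sizes are assumed to be page-aligned and to stay within 128k.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* One bit per linked 4k page, enough for a 128k range (32 pages). */
struct model_iova_state {
	uint32_t linked;
};

/* Bitmask covering the pages in [offset, offset + size). */
static uint32_t model_mask(size_t offset, size_t size)
{
	size_t first = offset / MODEL_PAGE_SIZE;
	size_t count = size / MODEL_PAGE_SIZE;
	uint32_t bits = count >= 32 ? UINT32_MAX
				    : ((UINT32_C(1) << count) - 1);

	return bits << first;
}

static void model_link(struct model_iova_state *s, size_t offset, size_t size)
{
	s->linked |= model_mask(offset, size);
}

static void model_unlink(struct model_iova_state *s, size_t offset,
			 size_t size)
{
	s->linked &= ~model_mask(offset, size);
}

/* Number of pages that are still linked. */
static unsigned int model_linked_pages(const struct model_iova_state *s)
{
	unsigned int n = 0;
	uint32_t v;

	for (v = s->linked; v; v >>= 1)
		n += v & 1;
	return n;
}
```

Linking the full 128k and then unlinking a single 4k page in the middle
leaves 31 pages linked in this model, and a later unlink of the whole
range clears the rest.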