* [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention @ 2018-09-12 15:24 Robin Murphy 2018-09-12 15:24 ` [PATCH v2 1/3] iommu: Add fast hook for getting DMA domains Robin Murphy ` (4 more replies) 0 siblings, 5 replies; 20+ messages in thread From: Robin Murphy @ 2018-09-12 15:24 UTC (permalink / raw) To: linux-arm-kernel John raised the issue[1] that we have some unnecessary refcount contention in the DMA ops path which shows scalability problems now that we have more real high-performance hardware using iommu-dma. The x86 IOMMU drivers are sidestepping this by stashing domain references in archdata, but since that's not very nice for architecture-agnostic code, I think it's time to look at a generic API-level solution. These are a couple of quick patches based on the idea I had back when first implementing iommu-dma, but didn't have any way to justify at the time. However, the reports of 10-25% better networking performance on v1 suggest that it's very worthwhile (and far more significant than I ever would have guessed). As far as merging goes, I don't mind at all whether this goes via IOMMU, or via dma-mapping provided Joerg's happy to ack it. Robin. [1] https://lists.linuxfoundation.org/pipermail/iommu/2018-August/029303.html Robin Murphy (3): iommu: Add fast hook for getting DMA domains iommu/dma: Use fast DMA domain lookup arm64/dma-mapping: Mildly optimise non-coherent IOMMU ops arch/arm64/mm/dma-mapping.c | 10 +++++----- drivers/iommu/dma-iommu.c | 23 ++++++++++++----------- drivers/iommu/iommu.c | 9 +++++++++ include/linux/iommu.h | 1 + 4 files changed, 27 insertions(+), 16 deletions(-) -- 2.19.0.dirty ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 1/3] iommu: Add fast hook for getting DMA domains 2018-09-12 15:24 [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Robin Murphy @ 2018-09-12 15:24 ` Robin Murphy 2018-09-12 15:24 ` [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup Robin Murphy ` (3 subsequent siblings) 4 siblings, 0 replies; 20+ messages in thread From: Robin Murphy @ 2018-09-12 15:24 UTC (permalink / raw) To: linux-arm-kernel While iommu_get_domain_for_dev() is the robust way for arbitrary IOMMU API callers to retrieve the domain pointer, for DMA ops domains it doesn't scale well for large systems and multi-queue devices, since the momentary refcount adjustment will lead to exclusive cacheline contention when multiple CPUs are operating in parallel on different mappings for the same device. In the case of DMA ops domains, however, this refcounting is actually unnecessary, since they already imply that the group exists and is managed by platform code and IOMMU internals (by virtue of iommu_group_get_for_dev()) such that a reference will already be held for the lifetime of the device. Thus we can avoid the bottleneck by providing a fast lookup specifically for the DMA code to retrieve the default domain it already knows it has set up - a simple read-only dereference plays much nicer with cache-coherency protocols. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/iommu.c | 9 +++++++++ include/linux/iommu.h | 1 + 2 files changed, 10 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 8c15c5980299..9d70344204fe 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1415,6 +1415,15 @@ struct iommu_domain *iommu_get_domain_for_dev(struct device *dev) } EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev); +/* + * For IOMMU_DOMAIN_DMA implementations which already provide their own + * guarantees that the group and its default domain are valid and correct. + */ +struct iommu_domain *iommu_get_dma_domain(struct device *dev) +{ + return dev->iommu_group->default_domain; +} + /* * IOMMU groups are really the natrual working unit of the IOMMU, but * the IOMMU API works on domains and devices. Bridge that gap by diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 87994c265bf5..c783648d4060 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -293,6 +293,7 @@ extern int iommu_attach_device(struct iommu_domain *domain, extern void iommu_detach_device(struct iommu_domain *domain, struct device *dev); extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev); +extern struct iommu_domain *iommu_get_dma_domain(struct device *dev); extern int iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot); extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, -- 2.19.0.dirty ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-12 15:24 [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Robin Murphy 2018-09-12 15:24 ` [PATCH v2 1/3] iommu: Add fast hook for getting DMA domains Robin Murphy @ 2018-09-12 15:24 ` Robin Murphy 2018-09-28 13:26 ` Marek Szyprowski 2018-09-12 15:24 ` [PATCH v2 3/3] arm64/dma-mapping: Mildly optimise non-coherent IOMMU ops Robin Murphy ` (2 subsequent siblings) 4 siblings, 1 reply; 20+ messages in thread From: Robin Murphy @ 2018-09-12 15:24 UTC (permalink / raw) To: linux-arm-kernel Most parts of iommu-dma already assume they are operating on a default domain set up by iommu_dma_init_domain(), and can be converted straight over to avoid the refcounting bottleneck. MSI page mappings may be in an unmanaged domain with an explicit MSI-only cookie, so retain the non-specific lookup, but that's OK since they're far from a contended fast path either way. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/dma-iommu.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 511ff9a1d6d9..320f9ea82f3f 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -491,7 +491,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, void iommu_dma_free(struct device *dev, struct page **pages, size_t size, dma_addr_t *handle) { - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle, size); + __iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size); __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT); *handle = IOMMU_MAPPING_ERROR; } @@ -518,7 +518,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle, void (*flush_page)(struct device *, const void *, phys_addr_t)) { - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + struct iommu_domain *domain = iommu_get_dma_domain(dev); struct iommu_dma_cookie *cookie = domain->iova_cookie; struct iova_domain *iovad = &cookie->iovad; struct page **pages; @@ -606,9 +606,8 @@ int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma) } static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, - size_t size, int prot) + size_t size, int prot, struct iommu_domain *domain) { - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); struct iommu_dma_cookie *cookie = domain->iova_cookie; size_t iova_off = 0; dma_addr_t iova; @@ -632,13 +631,14 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page, unsigned long offset, size_t size, int prot) { - return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot); + return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot, + iommu_get_dma_domain(dev)); } void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir, unsigned long attrs) { - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size); + __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); } /* @@ -726,7 +726,7 @@ static void __invalidate_sg(struct scatterlist *sg, int nents) int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, int prot) { - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + struct iommu_domain *domain = iommu_get_dma_domain(dev); struct iommu_dma_cookie *cookie = domain->iova_cookie; struct iova_domain *iovad = &cookie->iovad; struct scatterlist *s, *prev = NULL; @@ -811,20 +811,21 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, sg = tmp; } end = sg_dma_address(sg) + sg_dma_len(sg); - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), start, end - start); + __iommu_dma_unmap(iommu_get_dma_domain(dev), start, end - start); } dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys, size_t size, enum dma_data_direction dir, unsigned long attrs) { return __iommu_dma_map(dev, phys, size, - dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO); + dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO, + iommu_get_dma_domain(dev)); } void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir, unsigned long attrs) { - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size); + __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); } int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr) @@ -850,7 +851,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev, if (!msi_page) return NULL; - iova = __iommu_dma_map(dev, msi_addr, size, prot); + iova = __iommu_dma_map(dev, msi_addr, size, prot, domain); if (iommu_dma_mapping_error(dev, iova)) goto out_free_page; -- 2.19.0.dirty ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-12 15:24 ` [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup Robin Murphy @ 2018-09-28 13:26 ` Marek Szyprowski 2018-09-28 13:52 ` Robin Murphy 0 siblings, 1 reply; 20+ messages in thread From: Marek Szyprowski @ 2018-09-28 13:26 UTC (permalink / raw) To: linux-arm-kernel Hi All, On 2018-09-12 17:24, Robin Murphy wrote: > Most parts of iommu-dma already assume they are operating on a default > domain set up by iommu_dma_init_domain(), and can be converted straight > over to avoid the refcounting bottleneck. MSI page mappings may be in > an unmanaged domain with an explicit MSI-only cookie, so retain the > non-specific lookup, but that's OK since they're far from a contended > fast path either way. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). Exynos DRM creates it's own domain, attach all devices which performs DMA access (typically CRTC devices) to it and uses standard DMA-mapping calls to allocate/map buffers. This way it can use the same one DMA address for each buffer regardless of the CRTC/display/processing device. This no longer works with this patch. The simplest solution to fix this is add API to change default_domain to the one allocated by the Exynos DRM driver. > --- > drivers/iommu/dma-iommu.c | 23 ++++++++++++----------- > 1 file changed, 12 insertions(+), 11 deletions(-) > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > index 511ff9a1d6d9..320f9ea82f3f 100644 > --- a/drivers/iommu/dma-iommu.c > +++ b/drivers/iommu/dma-iommu.c > @@ -491,7 +491,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, > void iommu_dma_free(struct device *dev, struct page **pages, size_t size, > dma_addr_t *handle) > { > - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle, size); > + __iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size); > __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT); > *handle = IOMMU_MAPPING_ERROR; > } > @@ -518,7 +518,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp, > unsigned long attrs, int prot, dma_addr_t *handle, > void (*flush_page)(struct device *, const void *, phys_addr_t)) > { > - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + struct iommu_domain *domain = iommu_get_dma_domain(dev); > struct iommu_dma_cookie *cookie = domain->iova_cookie; > struct iova_domain *iovad = &cookie->iovad; > struct page **pages; > @@ -606,9 +606,8 @@ int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma) > } > > static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, > - size_t size, int prot) > + size_t size, int prot, struct iommu_domain *domain) > { > - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > struct iommu_dma_cookie *cookie = domain->iova_cookie; > size_t iova_off = 0; > dma_addr_t iova; > @@ -632,13 +631,14 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, > dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page, > unsigned long offset, size_t size, int prot) > { > - return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot); > + return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot, > + iommu_get_dma_domain(dev)); > } > > void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size, > enum dma_data_direction dir, unsigned long attrs) > { > - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size); > + __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); > } > > /* > @@ -726,7 +726,7 @@ static void __invalidate_sg(struct scatterlist *sg, int nents) > int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, > int nents, int prot) > { > - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + struct iommu_domain *domain = iommu_get_dma_domain(dev); > struct iommu_dma_cookie *cookie = domain->iova_cookie; > struct iova_domain *iovad = &cookie->iovad; > struct scatterlist *s, *prev = NULL; > @@ -811,20 +811,21 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, > sg = tmp; > } > end = sg_dma_address(sg) + sg_dma_len(sg); > - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), start, end - start); > + __iommu_dma_unmap(iommu_get_dma_domain(dev), start, end - start); > } > > dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys, > size_t size, enum dma_data_direction dir, unsigned long attrs) > { > return __iommu_dma_map(dev, phys, size, > - dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO); > + dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO, > + iommu_get_dma_domain(dev)); > } > > void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle, > size_t size, enum dma_data_direction dir, unsigned long attrs) > { > - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size); > + __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); > } > > int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr) > @@ -850,7 +851,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev, > if (!msi_page) > return NULL; > > - iova = __iommu_dma_map(dev, msi_addr, size, prot); > + iova = __iommu_dma_map(dev, msi_addr, size, prot, domain); > if (iommu_dma_mapping_error(dev, iova)) > goto out_free_page; > Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-28 13:26 ` Marek Szyprowski @ 2018-09-28 13:52 ` Robin Murphy 2018-09-28 14:21 ` Marek Szyprowski 0 siblings, 1 reply; 20+ messages in thread From: Robin Murphy @ 2018-09-28 13:52 UTC (permalink / raw) To: linux-arm-kernel On 28/09/18 14:26, Marek Szyprowski wrote: > Hi All, > > On 2018-09-12 17:24, Robin Murphy wrote: >> Most parts of iommu-dma already assume they are operating on a default >> domain set up by iommu_dma_init_domain(), and can be converted straight >> over to avoid the refcounting bottleneck. MSI page mappings may be in >> an unmanaged domain with an explicit MSI-only cookie, so retain the >> non-specific lookup, but that's OK since they're far from a contended >> fast path either way. >> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com> > > This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). > Exynos DRM creates it's own domain, attach all devices which performs > DMA access (typically CRTC devices) to it and uses standard DMA-mapping > calls to allocate/map buffers. This way it can use the same one DMA > address for each buffer regardless of the CRTC/display/processing > device. > > This no longer works with this patch. The simplest solution to fix this > is add API to change default_domain to the one allocated by the Exynos > DRM driver. Ugh, I hadn't realised there were any drivers trying to fake up their own default domains - that's always going to be fragile. In fact, one of my proposals for un-breaking Tegra and other DRM drivers involves preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA domains at all, which would stop that from working permanently. Can Exynos not put all the DRM components into a single group, or at least just pick one of the real default domains to attach everyone to instead of allocating a fake one? Robin. >> --- >> drivers/iommu/dma-iommu.c | 23 ++++++++++++----------- >> 1 file changed, 12 insertions(+), 11 deletions(-) >> >> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c >> index 511ff9a1d6d9..320f9ea82f3f 100644 >> --- a/drivers/iommu/dma-iommu.c >> +++ b/drivers/iommu/dma-iommu.c >> @@ -491,7 +491,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, >> void iommu_dma_free(struct device *dev, struct page **pages, size_t size, >> dma_addr_t *handle) >> { >> - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle, size); >> + __iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size); >> __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT); >> *handle = IOMMU_MAPPING_ERROR; >> } >> @@ -518,7 +518,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp, >> unsigned long attrs, int prot, dma_addr_t *handle, >> void (*flush_page)(struct device *, const void *, phys_addr_t)) >> { >> - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); >> + struct iommu_domain *domain = iommu_get_dma_domain(dev); >> struct iommu_dma_cookie *cookie = domain->iova_cookie; >> struct iova_domain *iovad = &cookie->iovad; >> struct page **pages; >> @@ -606,9 +606,8 @@ int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma) >> } >> >> static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, >> - size_t size, int prot) >> + size_t size, int prot, struct iommu_domain *domain) >> { >> - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); >> struct iommu_dma_cookie *cookie = domain->iova_cookie; >> size_t iova_off = 0; >> dma_addr_t iova; >> @@ -632,13 +631,14 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, >> dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page, >> unsigned long offset, size_t size, int prot) >> { >> - return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot); >> + return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot, >> + iommu_get_dma_domain(dev)); >> } >> >> void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size, >> enum dma_data_direction dir, unsigned long attrs) >> { >> - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size); >> + __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); >> } >> >> /* >> @@ -726,7 +726,7 @@ static void __invalidate_sg(struct scatterlist *sg, int nents) >> int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, >> int nents, int prot) >> { >> - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); >> + struct iommu_domain *domain = iommu_get_dma_domain(dev); >> struct iommu_dma_cookie *cookie = domain->iova_cookie; >> struct iova_domain *iovad = &cookie->iovad; >> struct scatterlist *s, *prev = NULL; >> @@ -811,20 +811,21 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, >> sg = tmp; >> } >> end = sg_dma_address(sg) + sg_dma_len(sg); >> - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), start, end - start); >> + __iommu_dma_unmap(iommu_get_dma_domain(dev), start, end - start); >> } >> >> dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys, >> size_t size, enum dma_data_direction dir, unsigned long attrs) >> { >> return __iommu_dma_map(dev, phys, size, >> - dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO); >> + dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO, >> + iommu_get_dma_domain(dev)); >> } >> >> void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle, >> size_t size, enum dma_data_direction dir, unsigned long attrs) >> { >> - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size); >> + __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); >> } >> >> int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr) >> @@ -850,7 +851,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev, >> if (!msi_page) >> return NULL; >> >> - iova = __iommu_dma_map(dev, msi_addr, size, prot); >> + iova = __iommu_dma_map(dev, msi_addr, size, prot, domain); >> if (iommu_dma_mapping_error(dev, iova)) >> goto out_free_page; >> > > Best regards > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-28 13:52 ` Robin Murphy @ 2018-09-28 14:21 ` Marek Szyprowski 2018-09-28 15:31 ` Robin Murphy 0 siblings, 1 reply; 20+ messages in thread From: Marek Szyprowski @ 2018-09-28 14:21 UTC (permalink / raw) To: linux-arm-kernel Hi Robin, On 2018-09-28 15:52, Robin Murphy wrote: > On 28/09/18 14:26, Marek Szyprowski wrote: >> On 2018-09-12 17:24, Robin Murphy wrote: >>> Most parts of iommu-dma already assume they are operating on a default >>> domain set up by iommu_dma_init_domain(), and can be converted straight >>> over to avoid the refcounting bottleneck. MSI page mappings may be in >>> an unmanaged domain with an explicit MSI-only cookie, so retain the >>> non-specific lookup, but that's OK since they're far from a contended >>> fast path either way. >>> >>> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >> >> This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). >> Exynos DRM creates it's own domain, attach all devices which performs >> DMA access (typically CRTC devices) to it and uses standard DMA-mapping >> calls to allocate/map buffers. This way it can use the same one DMA >> address for each buffer regardless of the CRTC/display/processing >> device. >> >> This no longer works with this patch. The simplest solution to fix this >> is add API to change default_domain to the one allocated by the Exynos >> DRM driver. > > Ugh, I hadn't realised there were any drivers trying to fake up their > own default domains - that's always going to be fragile. In fact, one > of my proposals for un-breaking Tegra and other DRM drivers involves > preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA > domains at all, which would stop that from working permanently. IMHO there should be a way to let drivers like DRM to use DMA-mapping infrastructure on their own iommu domain. Better provide such API to avoid hacking in the DRM drivers to get this done without help from IOMMU core. > Can Exynos not put all the DRM components into a single group, or at > least just pick one of the real default domains to attach everyone to > instead of allocating a fake one? Exynos DRM components are not in the single group. Currently iommu-groups have no meaning on Exynos5433 and Exynos IOMMU simply allocate one group per each iommu-master device in the system. Exynos DRM already selects one of its component devices as the 'main DMA device'. It is being used for managing buffers if no IOMMU is available, so it can also use its iommu domain instead of allocating a fake one. I've checked and it fixes Exynos DRM now. The question is how to merge this fix to keep bisectability. From: Marek Szyprowski <m.szyprowski@samsung.com> Date: Fri, 28 Sep 2018 16:17:27 +0200 Subject: [PATCH] drm/exynos: Use selected dma_dev default iommu domain instead ?of a fake one Instead of allocating a fake IOMMU domain for all Exynos DRM components, simply reuse the default IOMMU domain of the already selected DMA device. This allows some design changes in IOMMU framework without breaking IOMMU support in Exynos DRM. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> --- ?drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 ++++------------------- ?1 file changed, 6 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h b/drivers/gpu/drm/exynos/exynos_drm_iommu.h index 87f6b5672e11..51d3b7dcd529 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, ?static inline int __exynos_iommu_create_mapping(struct exynos_drm_private *priv, ???? ??? ??? ??? ??? unsigned long start, unsigned long size) ?{ -??? struct iommu_domain *domain; -??? int ret; - -??? domain = iommu_domain_alloc(priv->dma_dev->bus); -??? if (!domain) -??? ??? return -ENOMEM; - -??? ret = iommu_get_dma_cookie(domain); -??? if (ret) -??? ??? goto free_domain; - -??? ret = iommu_dma_init_domain(domain, start, size, NULL); -??? if (ret) -??? ??? goto put_cookie; - -??? priv->mapping = domain; +??? priv->mapping = iommu_get_dma_domain(priv->dma_dev); ???? return 0; - -put_cookie: -??? iommu_put_dma_cookie(domain); -free_domain: -??? iommu_domain_free(domain); -??? return ret; ?} ?static inline void __exynos_iommu_release_mapping(struct exynos_drm_private *priv) ?{ -??? struct iommu_domain *domain = priv->mapping; - -??? iommu_put_dma_cookie(domain); -??? iommu_domain_free(domain); ???? priv->mapping = NULL; ?} @@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct exynos_drm_private *priv, ?{ ???? struct iommu_domain *domain = priv->mapping; -??? return iommu_attach_device(domain, dev); +??? if (dev != priv->dma_dev) +??? ??? return iommu_attach_device(domain, dev); +??? return 0; ?} ?static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, @@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, ?{ ???? struct iommu_domain *domain = priv->mapping; -??? iommu_detach_device(domain, dev); +??? if (dev != priv->dma_dev) +??? ??? iommu_detach_device(domain, dev); ?} ?#else ?#error Unsupported architecture and IOMMU/DMA-mapping glue code > ... Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-28 14:21 ` Marek Szyprowski @ 2018-09-28 15:31 ` Robin Murphy 2018-09-28 15:33 ` Christoph Hellwig ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Robin Murphy @ 2018-09-28 15:31 UTC (permalink / raw) To: linux-arm-kernel On 28/09/18 15:21, Marek Szyprowski wrote: > Hi Robin, > > On 2018-09-28 15:52, Robin Murphy wrote: >> On 28/09/18 14:26, Marek Szyprowski wrote: >>> On 2018-09-12 17:24, Robin Murphy wrote: >>>> Most parts of iommu-dma already assume they are operating on a default >>>> domain set up by iommu_dma_init_domain(), and can be converted straight >>>> over to avoid the refcounting bottleneck. MSI page mappings may be in >>>> an unmanaged domain with an explicit MSI-only cookie, so retain the >>>> non-specific lookup, but that's OK since they're far from a contended >>>> fast path either way. >>>> >>>> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >>> >>> This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). >>> Exynos DRM creates it's own domain, attach all devices which performs >>> DMA access (typically CRTC devices) to it and uses standard DMA-mapping >>> calls to allocate/map buffers. This way it can use the same one DMA >>> address for each buffer regardless of the CRTC/display/processing >>> device. >>> >>> This no longer works with this patch. The simplest solution to fix this >>> is add API to change default_domain to the one allocated by the Exynos >>> DRM driver. >> >> Ugh, I hadn't realised there were any drivers trying to fake up their >> own default domains - that's always going to be fragile. In fact, one >> of my proposals for un-breaking Tegra and other DRM drivers involves >> preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA >> domains at all, which would stop that from working permanently. > > IMHO there should be a way to let drivers like DRM to use DMA-mapping > infrastructure on their own iommu domain. Better provide such API to > avoid hacking in the DRM drivers to get this done without help from > IOMMU core. The tricky part is how to reconcile that with those other drivers which want to do explicit IOMMU management with their own domain but still use the DMA API for coherency of the underlying memory. I do have a couple of half-formed ideas, but they're not quite there yet. >> Can Exynos not put all the DRM components into a single group, or at >> least just pick one of the real default domains to attach everyone to >> instead of allocating a fake one? > > Exynos DRM components are not in the single group. Currently > iommu-groups have no meaning on Exynos5433 and Exynos IOMMU simply > allocate one group per each iommu-master device in the system. Yeah, a group spanning multiple IOMMU devices would have been a bit of a hack for your topology, but still *technically* possible ;) > Exynos DRM already selects one of its component devices as the 'main DMA > device'. It is being used for managing buffers if no IOMMU is available, > so it can also use its iommu domain instead of allocating a fake one. > I've checked and it fixes Exynos DRM now. The question is how to merge > this fix to keep bisectability. > > From: Marek Szyprowski <m.szyprowski@samsung.com> > Date: Fri, 28 Sep 2018 16:17:27 +0200 > Subject: [PATCH] drm/exynos: Use selected dma_dev default iommu domain > instead > ?of a fake one > > Instead of allocating a fake IOMMU domain for all Exynos DRM components, > simply reuse the default IOMMU domain of the already selected DMA device. > This allows some design changes in IOMMU framework without breaking IOMMU > support in Exynos DRM. > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > --- > ?drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 ++++------------------- > ?1 file changed, 6 insertions(+), 28 deletions(-) > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h > b/drivers/gpu/drm/exynos/exynos_drm_iommu.h > index 87f6b5672e11..51d3b7dcd529 100644 > --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h > +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h > @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct > exynos_drm_private *priv, > ?static inline int __exynos_iommu_create_mapping(struct > exynos_drm_private *priv, > ???? ??? ??? ??? ??? unsigned long start, unsigned long size) > ?{ > -??? struct iommu_domain *domain; > -??? int ret; > - > -??? domain = iommu_domain_alloc(priv->dma_dev->bus); > -??? if (!domain) > -??? ??? return -ENOMEM; > - > -??? ret = iommu_get_dma_cookie(domain); > -??? if (ret) > -??? ??? goto free_domain; > - > -??? ret = iommu_dma_init_domain(domain, start, size, NULL); > -??? if (ret) > -??? ??? goto put_cookie; > - > -??? priv->mapping = domain; > +??? priv->mapping = iommu_get_dma_domain(priv->dma_dev); iommu_get_domain_for_dev(), please - this isn't a performance-critical path inside a DMA API implementation, plus without an actual dependency on this series maybe there's a chance of sneaking it into 4.19? (you could always still check domain->type == IOMMU_DOMAIN_DMA if you want to be really really paranoid.) > ???? return 0; > - > -put_cookie: > -??? iommu_put_dma_cookie(domain); > -free_domain: > -??? iommu_domain_free(domain); > -??? return ret; > ?} > > ?static inline void __exynos_iommu_release_mapping(struct > exynos_drm_private *priv) > ?{ > -??? struct iommu_domain *domain = priv->mapping; > - > -??? iommu_put_dma_cookie(domain); > -??? iommu_domain_free(domain); > ???? priv->mapping = NULL; > ?} > > @@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct > exynos_drm_private *priv, > ?{ > ???? struct iommu_domain *domain = priv->mapping; > > -??? return iommu_attach_device(domain, dev); > +??? if (dev != priv->dma_dev) > +??? ??? return iommu_attach_device(domain, dev); > +??? return 0; > ?} > > ?static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, > @@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct > exynos_drm_private *priv, > ?{ > ???? struct iommu_domain *domain = priv->mapping; > > -??? iommu_detach_device(domain, dev); > +??? if (dev != priv->dma_dev) > +??? ??? iommu_detach_device(domain, dev); Strictly you could skip that check, as detaching the master device from its real default domain is just a no-op, but I guess maintaining the symmetry is probably more intuitive. Robin. > ?} > ?#else > ?#error Unsupported architecture and IOMMU/DMA-mapping glue code > > > ... > > Best regards > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-28 15:31 ` Robin Murphy @ 2018-09-28 15:33 ` Christoph Hellwig 2018-09-28 15:46 ` Robin Murphy [not found] ` <CGME20180928160945eucas1p17ff77fe34854548719880638ecb9c54d@eucas1p1.samsung.com> 2018-09-28 16:18 ` [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup Marek Szyprowski 2 siblings, 1 reply; 20+ messages in thread From: Christoph Hellwig @ 2018-09-28 15:33 UTC (permalink / raw) To: linux-arm-kernel On Fri, Sep 28, 2018 at 04:31:10PM +0100, Robin Murphy wrote: > The tricky part is how to reconcile that with those other drivers which > want to do explicit IOMMU management with their own domain but still use > the DMA API for coherency of the underlying memory. I do have a couple of > half-formed ideas, but they're not quite there yet. It think the only sensible answer is that they can't, and we'll need coherency API in the iommu API (or augmenting it). Which might be a real possibility now that I'm almost done lifting the coherency management to arch hooks instead of dma op methods. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-28 15:33 ` Christoph Hellwig @ 2018-09-28 15:46 ` Robin Murphy 0 siblings, 0 replies; 20+ messages in thread From: Robin Murphy @ 2018-09-28 15:46 UTC (permalink / raw) To: linux-arm-kernel On 28/09/18 16:33, Christoph Hellwig wrote: > On Fri, Sep 28, 2018 at 04:31:10PM +0100, Robin Murphy wrote: >> The tricky part is how to reconcile that with those other drivers which >> want to do explicit IOMMU management with their own domain but still use >> the DMA API for coherency of the underlying memory. I do have a couple of >> half-formed ideas, but they're not quite there yet. > > It think the only sensible answer is that they can't, and we'll need > coherency API in the iommu API (or augmenting it). Which might be a > real possibility now that I'm almost done lifting the coherency > management to arch hooks instead of dma op methods. I reckon it seems feasible if the few clients that want to had a way to allocate their own proper managed domains, and the IOMMU API knew how to swizzle between iommu_dma_ops and dma_direct_ops upon attach depending on whether the target domain was managed or unmanaged. The worst part is liekly to be the difference between that lovely conceptual "iommu_dma_ops" and the cross-architecture mess of reality, plus where identity domains with a possible need for SWIOTLB come into it. Robin. ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <CGME20180928160945eucas1p17ff77fe34854548719880638ecb9c54d@eucas1p1.samsung.com>]
* [PATCH] drm/exynos: Use selected dma_dev default iommu domain instead of a fake one [not found] ` <CGME20180928160945eucas1p17ff77fe34854548719880638ecb9c54d@eucas1p1.samsung.com> @ 2018-09-28 16:09 ` Marek Szyprowski 2018-09-28 16:13 ` Robin Murphy 0 siblings, 1 reply; 20+ messages in thread From: Marek Szyprowski @ 2018-09-28 16:09 UTC (permalink / raw) To: linux-arm-kernel Instead of allocating a fake IOMMU domain for all Exynos DRM components, simply reuse the default IOMMU domain of the already selected DMA device. This allows some design changes in IOMMU framework without breaking IOMMU support in Exynos DRM. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> --- Inki: If possible, please consider this patch as a fix for v4.19-rc, this will help doing a rework in IOMMU and DMA-IOMMU frameworks in v4.20 without breaking Exynos DRM. It worked for current IOMMU code, but such usage is considered as a hack. --- drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 ++++------------------- 1 file changed, 6 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h b/drivers/gpu/drm/exynos/exynos_drm_iommu.h index 87f6b5672e11..797d9ee5f15a 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, static inline int __exynos_iommu_create_mapping(struct exynos_drm_private *priv, unsigned long start, unsigned long size) { - struct iommu_domain *domain; - int ret; - - domain = iommu_domain_alloc(priv->dma_dev->bus); - if (!domain) - return -ENOMEM; - - ret = iommu_get_dma_cookie(domain); - if (ret) - goto free_domain; - - ret = iommu_dma_init_domain(domain, start, size, NULL); - if (ret) - goto put_cookie; - - priv->mapping = domain; + priv->mapping = iommu_get_domain_for_dev(priv->dma_dev); return 0; - -put_cookie: - iommu_put_dma_cookie(domain); -free_domain: - iommu_domain_free(domain); - return ret; } static inline void __exynos_iommu_release_mapping(struct exynos_drm_private *priv) { - struct iommu_domain *domain = priv->mapping; - - iommu_put_dma_cookie(domain); - iommu_domain_free(domain); priv->mapping = NULL; } @@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct exynos_drm_private *priv, { struct iommu_domain *domain = priv->mapping; - return iommu_attach_device(domain, dev); + if (dev != priv->dma_dev) + return iommu_attach_device(domain, dev); + return 0; } static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, @@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, { struct iommu_domain *domain = priv->mapping; - iommu_detach_device(domain, dev); + if (dev != priv->dma_dev) + iommu_detach_device(domain, dev); } #else #error Unsupported architecture and IOMMU/DMA-mapping glue code -- 2.17.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH] drm/exynos: Use selected dma_dev default iommu domain instead of a fake one 2018-09-28 16:09 ` [PATCH] drm/exynos: Use selected dma_dev default iommu domain instead of a fake one Marek Szyprowski @ 2018-09-28 16:13 ` Robin Murphy 0 siblings, 0 replies; 20+ messages in thread From: Robin Murphy @ 2018-09-28 16:13 UTC (permalink / raw) To: linux-arm-kernel On 2018-09-28 5:09 PM, Marek Szyprowski wrote: > Instead of allocating a fake IOMMU domain for all Exynos DRM components, > simply reuse the default IOMMU domain of the already selected DMA device. > This allows some design changes in IOMMU framework without breaking IOMMU > support in Exynos DRM. Reviewed-by: Robin Murphy <robin.murphy@arm.com> > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > --- > Inki: > If possible, please consider this patch as a fix for v4.19-rc, this will > help doing a rework in IOMMU and DMA-IOMMU frameworks in v4.20 without > breaking Exynos DRM. It worked for current IOMMU code, but such usage is > considered as a hack. > --- > drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 ++++------------------- > 1 file changed, 6 insertions(+), 28 deletions(-) > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h b/drivers/gpu/drm/exynos/exynos_drm_iommu.h > index 87f6b5672e11..797d9ee5f15a 100644 > --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h > +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h > @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, > static inline int __exynos_iommu_create_mapping(struct exynos_drm_private *priv, > unsigned long start, unsigned long size) > { > - struct iommu_domain *domain; > - int ret; > - > - domain = iommu_domain_alloc(priv->dma_dev->bus); > - if (!domain) > - return -ENOMEM; > - > - ret = iommu_get_dma_cookie(domain); > - if (ret) > - goto free_domain; > - > - ret = iommu_dma_init_domain(domain, start, size, NULL); > - if (ret) > - goto put_cookie; > - > - priv->mapping = domain; > + priv->mapping = iommu_get_domain_for_dev(priv->dma_dev); > return 0; > - > -put_cookie: > - iommu_put_dma_cookie(domain); > -free_domain: > - iommu_domain_free(domain); > - return ret; > } > > static inline void __exynos_iommu_release_mapping(struct exynos_drm_private *priv) > { > - struct iommu_domain *domain = priv->mapping; > - > - iommu_put_dma_cookie(domain); > - iommu_domain_free(domain); > priv->mapping = NULL; > } > > @@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct exynos_drm_private *priv, > { > struct iommu_domain *domain = priv->mapping; > > - return iommu_attach_device(domain, dev); > + if (dev != priv->dma_dev) > + return iommu_attach_device(domain, dev); > + return 0; > } > > static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, > @@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, > { > struct iommu_domain *domain = priv->mapping; > > - iommu_detach_device(domain, dev); > + if (dev != priv->dma_dev) > + iommu_detach_device(domain, dev); > } > #else > #error Unsupported architecture and IOMMU/DMA-mapping glue code > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-28 15:31 ` Robin Murphy 2018-09-28 15:33 ` Christoph Hellwig [not found] ` <CGME20180928160945eucas1p17ff77fe34854548719880638ecb9c54d@eucas1p1.samsung.com> @ 2018-09-28 16:18 ` Marek Szyprowski 2018-10-01 10:55 ` Inki Dae 2 siblings, 1 reply; 20+ messages in thread From: Marek Szyprowski @ 2018-09-28 16:18 UTC (permalink / raw) To: linux-arm-kernel Hi Robin, On 2018-09-28 17:31, Robin Murphy wrote: > On 28/09/18 15:21, Marek Szyprowski wrote: >> On 2018-09-28 15:52, Robin Murphy wrote: >>> On 28/09/18 14:26, Marek Szyprowski wrote: >>>> On 2018-09-12 17:24, Robin Murphy wrote: >>>>> Most parts of iommu-dma already assume they are operating on a >>>>> default >>>>> domain set up by iommu_dma_init_domain(), and can be converted >>>>> straight >>>>> over to avoid the refcounting bottleneck. MSI page mappings may be in >>>>> an unmanaged domain with an explicit MSI-only cookie, so retain the >>>>> non-specific lookup, but that's OK since they're far from a contended >>>>> fast path either way. >>>>> >>>>> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >>>> >>>> This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). >>>> Exynos DRM creates it's own domain, attach all devices which performs >>>> DMA access (typically CRTC devices) to it and uses standard >>>> DMA-mapping >>>> calls to allocate/map buffers. This way it can use the same one DMA >>>> address for each buffer regardless of the CRTC/display/processing >>>> device. >>>> >>>> This no longer works with this patch. The simplest solution to fix >>>> this >>>> is add API to change default_domain to the one allocated by the Exynos >>>> DRM driver. >>> >>> Ugh, I hadn't realised there were any drivers trying to fake up their >>> own default domains - that's always going to be fragile. In fact, one >>> of my proposals for un-breaking Tegra and other DRM drivers involves >>> preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA >>> domains at all, which would stop that from working permanently. >> >> IMHO there should be a way to let drivers like DRM to use DMA-mapping >> infrastructure on their own iommu domain. Better provide such API to >> avoid hacking in the DRM drivers to get this done without help from >> IOMMU core. > > The tricky part is how to reconcile that with those other drivers > which want to do explicit IOMMU management with their own domain but > still use the DMA API for coherency of the underlying memory. I do > have a couple of half-formed ideas, but they're not quite there yet. The only idea I have here is to add some flags to struct driver to let device core and frameworks note that this driver wants to manage everything on their own. Then such drivers, once they settle their IOMMU domain, should call some simple API to bless it for DMA API. > >>> Can Exynos not put all the DRM components into a single group, or at >>> least just pick one of the real default domains to attach everyone to >>> instead of allocating a fake one? >> >> Exynos DRM components are not in the single group. Currently >> iommu-groups have no meaning on Exynos5433 and Exynos IOMMU simply >> allocate one group per each iommu-master device in the system. > > Yeah, a group spanning multiple IOMMU devices would have been a bit of > a hack for your topology, but still *technically* possible ;) The question if I want to put all DRM components into single IOMMU group, how to propagate such knowledge from Exynos DRM driver to Exynos IOMMU driver (groups are created by IOMMU driver)? Anyway, I don't see any benefits from using IOMMU groups as for now, especially when each Exynos DRM component has its own, separate IOMMU (or even more than one, but that a different story). >> Exynos DRM already selects one of its component devices as the 'main DMA >> device'. It is being used for managing buffers if no IOMMU is available, >> so it can also use its iommu domain instead of allocating a fake one. >> I've checked and it fixes Exynos DRM now. The question is how to merge >> this fix to keep bisectability. >> >> From: Marek Szyprowski <m.szyprowski@samsung.com> >> Date: Fri, 28 Sep 2018 16:17:27 +0200 >> Subject: [PATCH] drm/exynos: Use selected dma_dev default iommu domain >> instead >> ? ?of a fake one >> >> Instead of allocating a fake IOMMU domain for all Exynos DRM components, >> simply reuse the default IOMMU domain of the already selected DMA >> device. >> This allows some design changes in IOMMU framework without breaking >> IOMMU >> support in Exynos DRM. >> >> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >> --- >> ? ?drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 >> ++++------------------- >> ? ?1 file changed, 6 insertions(+), 28 deletions(-) >> >> diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h >> b/drivers/gpu/drm/exynos/exynos_drm_iommu.h >> index 87f6b5672e11..51d3b7dcd529 100644 >> --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h >> +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h >> @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct >> exynos_drm_private *priv, >> ? ?static inline int __exynos_iommu_create_mapping(struct >> exynos_drm_private *priv, >> ? ???? ??? ??? ??? ??? unsigned long start, unsigned long size) >> ? ?{ >> -??? struct iommu_domain *domain; >> -??? int ret; >> - >> -??? domain = iommu_domain_alloc(priv->dma_dev->bus); >> -??? if (!domain) >> -??? ??? return -ENOMEM; >> - >> -??? ret = iommu_get_dma_cookie(domain); >> -??? if (ret) >> -??? ??? goto free_domain; >> - >> -??? ret = iommu_dma_init_domain(domain, start, size, NULL); >> -??? if (ret) >> -??? ??? goto put_cookie; >> - >> -??? priv->mapping = domain; >> +??? priv->mapping = iommu_get_dma_domain(priv->dma_dev); > > iommu_get_domain_for_dev(), please - this isn't a performance-critical > path inside a DMA API implementation, plus without an actual > dependency on this series maybe there's a chance of sneaking it into > 4.19? > > (you could always still check domain->type == IOMMU_DOMAIN_DMA if you > want to be really really paranoid.) > I've sent a patch. I hope we will manage to get it as a fix into v4.19, so it won't block your changes in v4.20. >> ? ???? return 0; >> - >> -put_cookie: >> -??? iommu_put_dma_cookie(domain); >> -free_domain: >> -??? iommu_domain_free(domain); >> -??? return ret; >> ? ?} >> >> ? ?static inline void __exynos_iommu_release_mapping(struct >> exynos_drm_private *priv) >> ? ?{ >> -??? struct iommu_domain *domain = priv->mapping; >> - >> -??? iommu_put_dma_cookie(domain); >> -??? iommu_domain_free(domain); >> ? ???? priv->mapping = NULL; >> ? ?} >> >> @@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct >> exynos_drm_private *priv, >> ? ?{ >> ? ???? struct iommu_domain *domain = priv->mapping; >> >> -??? return iommu_attach_device(domain, dev); >> +??? if (dev != priv->dma_dev) >> +??? ??? return iommu_attach_device(domain, dev); >> +??? return 0; >> ? ?} >> >> ? ?static inline void __exynos_iommu_detach(struct exynos_drm_private >> *priv, >> @@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct >> exynos_drm_private *priv, >> ? ?{ >> ? ???? struct iommu_domain *domain = priv->mapping; >> >> -??? iommu_detach_device(domain, dev); >> +??? if (dev != priv->dma_dev) >> +??? ??? iommu_detach_device(domain, dev); > > Strictly you could skip that check, as detaching the master device > from its real default domain is just a no-op, but I guess maintaining > the symmetry is probably more intuitive. I would like to keep the symmetry. > ... Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup 2018-09-28 16:18 ` [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup Marek Szyprowski @ 2018-10-01 10:55 ` Inki Dae 0 siblings, 0 replies; 20+ messages in thread From: Inki Dae @ 2018-10-01 10:55 UTC (permalink / raw) To: linux-arm-kernel + dri-devel This patch may also break other ARM DRM drivers. I cced dri-devel so that they could manage this. And below relevant link for ARM DRM maintainers, https://www.spinics.net/lists/arm-kernel/msg676098.html Thanks, Inki Dae 2018? 9? 29? (?) ?? 1:19, Marek Szyprowski <m.szyprowski@samsung.com>?? ??: > > Hi Robin, > > On 2018-09-28 17:31, Robin Murphy wrote: > > On 28/09/18 15:21, Marek Szyprowski wrote: > >> On 2018-09-28 15:52, Robin Murphy wrote: > >>> On 28/09/18 14:26, Marek Szyprowski wrote: > >>>> On 2018-09-12 17:24, Robin Murphy wrote: > >>>>> Most parts of iommu-dma already assume they are operating on a > >>>>> default > >>>>> domain set up by iommu_dma_init_domain(), and can be converted > >>>>> straight > >>>>> over to avoid the refcounting bottleneck. MSI page mappings may be in > >>>>> an unmanaged domain with an explicit MSI-only cookie, so retain the > >>>>> non-specific lookup, but that's OK since they're far from a contended > >>>>> fast path either way. > >>>>> > >>>>> Signed-off-by: Robin Murphy <robin.murphy@arm.com> > >>>> > >>>> This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). > >>>> Exynos DRM creates it's own domain, attach all devices which performs > >>>> DMA access (typically CRTC devices) to it and uses standard > >>>> DMA-mapping > >>>> calls to allocate/map buffers. This way it can use the same one DMA > >>>> address for each buffer regardless of the CRTC/display/processing > >>>> device. > >>>> > >>>> This no longer works with this patch. The simplest solution to fix > >>>> this > >>>> is add API to change default_domain to the one allocated by the Exynos > >>>> DRM driver. > >>> > >>> Ugh, I hadn't realised there were any drivers trying to fake up their > >>> own default domains - that's always going to be fragile. In fact, one > >>> of my proposals for un-breaking Tegra and other DRM drivers involves > >>> preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA > >>> domains at all, which would stop that from working permanently. > >> > >> IMHO there should be a way to let drivers like DRM to use DMA-mapping > >> infrastructure on their own iommu domain. Better provide such API to > >> avoid hacking in the DRM drivers to get this done without help from > >> IOMMU core. > > > > The tricky part is how to reconcile that with those other drivers > > which want to do explicit IOMMU management with their own domain but > > still use the DMA API for coherency of the underlying memory. I do > > have a couple of half-formed ideas, but they're not quite there yet. > > The only idea I have here is to add some flags to struct driver to let > device core and frameworks note that this driver wants to manage > everything on their own. Then such drivers, once they settle their IOMMU > domain, should call some simple API to bless it for DMA API. > > > > >>> Can Exynos not put all the DRM components into a single group, or at > >>> least just pick one of the real default domains to attach everyone to > >>> instead of allocating a fake one? > >> > >> Exynos DRM components are not in the single group. Currently > >> iommu-groups have no meaning on Exynos5433 and Exynos IOMMU simply > >> allocate one group per each iommu-master device in the system. > > > > Yeah, a group spanning multiple IOMMU devices would have been a bit of > > a hack for your topology, but still *technically* possible ;) > > The question if I want to put all DRM components into single IOMMU > group, how to propagate such knowledge from Exynos DRM driver to Exynos > IOMMU driver (groups are created by IOMMU driver)? Anyway, I don't see > any benefits from using IOMMU groups as for now, especially when each > Exynos DRM component has its own, separate IOMMU (or even more than one, > but that a different story). > > >> Exynos DRM already selects one of its component devices as the 'main DMA > >> device'. It is being used for managing buffers if no IOMMU is available, > >> so it can also use its iommu domain instead of allocating a fake one. > >> I've checked and it fixes Exynos DRM now. The question is how to merge > >> this fix to keep bisectability. > >> > >> From: Marek Szyprowski <m.szyprowski@samsung.com> > >> Date: Fri, 28 Sep 2018 16:17:27 +0200 > >> Subject: [PATCH] drm/exynos: Use selected dma_dev default iommu domain > >> instead > >> of a fake one > >> > >> Instead of allocating a fake IOMMU domain for all Exynos DRM components, > >> simply reuse the default IOMMU domain of the already selected DMA > >> device. > >> This allows some design changes in IOMMU framework without breaking > >> IOMMU > >> support in Exynos DRM. > >> > >> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > >> --- > >> drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 > >> ++++------------------- > >> 1 file changed, 6 insertions(+), 28 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h > >> b/drivers/gpu/drm/exynos/exynos_drm_iommu.h > >> index 87f6b5672e11..51d3b7dcd529 100644 > >> --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h > >> +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h > >> @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct > >> exynos_drm_private *priv, > >> static inline int __exynos_iommu_create_mapping(struct > >> exynos_drm_private *priv, > >> unsigned long start, unsigned long size) > >> { > >> - struct iommu_domain *domain; > >> - int ret; > >> - > >> - domain = iommu_domain_alloc(priv->dma_dev->bus); > >> - if (!domain) > >> - return -ENOMEM; > >> - > >> - ret = iommu_get_dma_cookie(domain); > >> - if (ret) > >> - goto free_domain; > >> - > >> - ret = iommu_dma_init_domain(domain, start, size, NULL); > >> - if (ret) > >> - goto put_cookie; > >> - > >> - priv->mapping = domain; > >> + priv->mapping = iommu_get_dma_domain(priv->dma_dev); > > > > iommu_get_domain_for_dev(), please - this isn't a performance-critical > > path inside a DMA API implementation, plus without an actual > > dependency on this series maybe there's a chance of sneaking it into > > 4.19? > > > > (you could always still check domain->type == IOMMU_DOMAIN_DMA if you > > want to be really really paranoid.) > > > > I've sent a patch. I hope we will manage to get it as a fix into v4.19, > so it won't block your changes in v4.20. > > >> return 0; > >> - > >> -put_cookie: > >> - iommu_put_dma_cookie(domain); > >> -free_domain: > >> - iommu_domain_free(domain); > >> - return ret; > >> } > >> > >> static inline void __exynos_iommu_release_mapping(struct > >> exynos_drm_private *priv) > >> { > >> - struct iommu_domain *domain = priv->mapping; > >> - > >> - iommu_put_dma_cookie(domain); > >> - iommu_domain_free(domain); > >> priv->mapping = NULL; > >> } > >> > >> @@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct > >> exynos_drm_private *priv, > >> { > >> struct iommu_domain *domain = priv->mapping; > >> > >> - return iommu_attach_device(domain, dev); > >> + if (dev != priv->dma_dev) > >> + return iommu_attach_device(domain, dev); > >> + return 0; > >> } > >> > >> static inline void __exynos_iommu_detach(struct exynos_drm_private > >> *priv, > >> @@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct > >> exynos_drm_private *priv, > >> { > >> struct iommu_domain *domain = priv->mapping; > >> > >> - iommu_detach_device(domain, dev); > >> + if (dev != priv->dma_dev) > >> + iommu_detach_device(domain, dev); > > > > Strictly you could skip that check, as detaching the master device > > from its real default domain is just a no-op, but I guess maintaining > > the symmetry is probably more intuitive. > > I would like to keep the symmetry. > > > ... > > Best regards > -- > Marek Szyprowski, PhD > Samsung R&D Institute Poland > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 3/3] arm64/dma-mapping: Mildly optimise non-coherent IOMMU ops 2018-09-12 15:24 [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Robin Murphy 2018-09-12 15:24 ` [PATCH v2 1/3] iommu: Add fast hook for getting DMA domains Robin Murphy 2018-09-12 15:24 ` [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup Robin Murphy @ 2018-09-12 15:24 ` Robin Murphy 2018-09-14 12:48 ` [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Will Deacon 2018-09-25 8:24 ` Joerg Roedel 4 siblings, 0 replies; 20+ messages in thread From: Robin Murphy @ 2018-09-12 15:24 UTC (permalink / raw) To: linux-arm-kernel Whilst the symmetry of deferring to the existing sync callback in __iommu_map_page() is nice, taking a round-trip through iommu_iova_to_phys() is a pretty heavyweight way to get an address we can trivially compute from the page we already have. Tweaking it to just perform the cache maintenance directly when appropriate doesn't really make the code any more complicated, and the runtime efficiency gain can only be a benefit. Furthermore, the sync operations themselves know they can only be invoked on a managed DMA ops domain, so can use the fast specific domain lookup to avoid excessive manipulation of the group refcount (particularly in the scatterlist cases). Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- v2: Don't be totally broken by forgetting the offset arch/arm64/mm/dma-mapping.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index 072c51fb07d7..cf017c5bb5e7 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -712,7 +712,7 @@ static void __iommu_sync_single_for_cpu(struct device *dev, if (is_device_dma_coherent(dev)) return; - phys = iommu_iova_to_phys(iommu_get_domain_for_dev(dev), dev_addr); + phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dev_addr); __dma_unmap_area(phys_to_virt(phys), size, dir); } @@ -725,7 +725,7 @@ static void __iommu_sync_single_for_device(struct device *dev, if (is_device_dma_coherent(dev)) return; - phys = iommu_iova_to_phys(iommu_get_domain_for_dev(dev), dev_addr); + phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dev_addr); __dma_map_area(phys_to_virt(phys), size, dir); } @@ -738,9 +738,9 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page, int prot = dma_info_to_prot(dir, coherent, attrs); dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot); - if (!iommu_dma_mapping_error(dev, dev_addr) && - (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0) - __iommu_sync_single_for_device(dev, dev_addr, size, dir); + if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) && + !iommu_dma_mapping_error(dev, dev_addr)) + __dma_map_area(page_address(page) + offset, size, dir); return dev_addr; } -- 2.19.0.dirty ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention 2018-09-12 15:24 [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Robin Murphy ` (2 preceding siblings ...) 2018-09-12 15:24 ` [PATCH v2 3/3] arm64/dma-mapping: Mildly optimise non-coherent IOMMU ops Robin Murphy @ 2018-09-14 12:48 ` Will Deacon 2018-09-17 11:20 ` John Garry 2018-09-17 13:33 ` Christoph Hellwig 2018-09-25 8:24 ` Joerg Roedel 4 siblings, 2 replies; 20+ messages in thread From: Will Deacon @ 2018-09-14 12:48 UTC (permalink / raw) To: linux-arm-kernel Hi Robin, On Wed, Sep 12, 2018 at 04:24:11PM +0100, Robin Murphy wrote: > John raised the issue[1] that we have some unnecessary refcount contention > in the DMA ops path which shows scalability problems now that we have more > real high-performance hardware using iommu-dma. The x86 IOMMU drivers are > sidestepping this by stashing domain references in archdata, but since > that's not very nice for architecture-agnostic code, I think it's time to > look at a generic API-level solution. > > These are a couple of quick patches based on the idea I had back when > first implementing iommu-dma, but didn't have any way to justify at the > time. However, the reports of 10-25% better networking performance on v1 > suggest that it's very worthwhile (and far more significant than I ever > would have guessed). > > As far as merging goes, I don't mind at all whether this goes via IOMMU, > or via dma-mapping provided Joerg's happy to ack it. I think it makes most sense for Joerg to take this series via his tree. Anyway, I've been running them on my TX2 box and things are happy enough, so: Tested-by: Will Deacon <will.deacon@arm.com> Will ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention 2018-09-14 12:48 ` [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Will Deacon @ 2018-09-17 11:20 ` John Garry 2018-09-20 12:51 ` John Garry 2018-09-17 13:33 ` Christoph Hellwig 1 sibling, 1 reply; 20+ messages in thread From: John Garry @ 2018-09-17 11:20 UTC (permalink / raw) To: linux-arm-kernel On 14/09/2018 13:48, Will Deacon wrote: > Hi Robin, > Hi Robin, I just spoke with Dongdong and we will test this version also so that we may provide a "Tested-by" tag. Thanks, John > On Wed, Sep 12, 2018 at 04:24:11PM +0100, Robin Murphy wrote: >> John raised the issue[1] that we have some unnecessary refcount contention >> in the DMA ops path which shows scalability problems now that we have more >> real high-performance hardware using iommu-dma. The x86 IOMMU drivers are >> sidestepping this by stashing domain references in archdata, but since >> that's not very nice for architecture-agnostic code, I think it's time to >> look at a generic API-level solution. >> >> These are a couple of quick patches based on the idea I had back when >> first implementing iommu-dma, but didn't have any way to justify at the >> time. However, the reports of 10-25% better networking performance on v1 >> suggest that it's very worthwhile (and far more significant than I ever >> would have guessed). >> >> As far as merging goes, I don't mind at all whether this goes via IOMMU, >> or via dma-mapping provided Joerg's happy to ack it. > > I think it makes most sense for Joerg to take this series via his tree. > > Anyway, I've been running them on my TX2 box and things are happy enough, > so: > > Tested-by: Will Deacon <will.deacon@arm.com> > > Will > > . > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention 2018-09-17 11:20 ` John Garry @ 2018-09-20 12:51 ` John Garry 0 siblings, 0 replies; 20+ messages in thread From: John Garry @ 2018-09-20 12:51 UTC (permalink / raw) To: linux-arm-kernel On 17/09/2018 12:20, John Garry wrote: > On 14/09/2018 13:48, Will Deacon wrote: >> Hi Robin, >> > > Hi Robin, > > I just spoke with Dongdong and we will test this version also so that we > may provide a "Tested-by" tag. > I tested this, so for series: Tested-by: John Garry <john.garry@huawei.com> Thanks, John > Thanks, > John > >> On Wed, Sep 12, 2018 at 04:24:11PM +0100, Robin Murphy wrote: >>> John raised the issue[1] that we have some unnecessary refcount >>> contention >>> in the DMA ops path which shows scalability problems now that we have >>> more >>> real high-performance hardware using iommu-dma. The x86 IOMMU drivers >>> are >>> sidestepping this by stashing domain references in archdata, but since >>> that's not very nice for architecture-agnostic code, I think it's >>> time to >>> look at a generic API-level solution. >>> >>> These are a couple of quick patches based on the idea I had back when >>> first implementing iommu-dma, but didn't have any way to justify at the >>> time. However, the reports of 10-25% better networking performance on v1 >>> suggest that it's very worthwhile (and far more significant than I ever >>> would have guessed). >>> >>> As far as merging goes, I don't mind at all whether this goes via IOMMU, >>> or via dma-mapping provided Joerg's happy to ack it. >> >> I think it makes most sense for Joerg to take this series via his tree. >> >> Anyway, I've been running them on my TX2 box and things are happy enough, >> so: >> >> Tested-by: Will Deacon <will.deacon@arm.com> >> >> Will >> >> . >> > > > _______________________________________________ > iommu mailing list > iommu at lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu > > . > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention 2018-09-14 12:48 ` [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Will Deacon 2018-09-17 11:20 ` John Garry @ 2018-09-17 13:33 ` Christoph Hellwig 2018-09-18 13:28 ` Tom Murphy 1 sibling, 1 reply; 20+ messages in thread From: Christoph Hellwig @ 2018-09-17 13:33 UTC (permalink / raw) To: linux-arm-kernel On Fri, Sep 14, 2018 at 01:48:59PM +0100, Will Deacon wrote: > > As far as merging goes, I don't mind at all whether this goes via IOMMU, > > or via dma-mapping provided Joerg's happy to ack it. > > I think it makes most sense for Joerg to take this series via his tree. FYI, I have WIP patches to move the arm dma-iommu wrappers to common code: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-maybe-coherent which will require some synchronization of the involved trees. In the end it is iommu code, so the actual iommu patches should probably go into the iommu tree after all, but it might have to pull in the dma-mapping branch for that. Or we just punt it until the next merge window. Not sure how fast Tom needs the common dma-iommu code for the x86 AMD iommu conversion. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention 2018-09-17 13:33 ` Christoph Hellwig @ 2018-09-18 13:28 ` Tom Murphy 0 siblings, 0 replies; 20+ messages in thread From: Tom Murphy @ 2018-09-18 13:28 UTC (permalink / raw) To: linux-arm-kernel >Not sure how fast Tom needs the common dma-iommu code for the x86 AMD iommu conversion. I am currently busy working on something else and won't be able to do/test the x86 AMD iommu conversion anytime soon. So I don't need the common dma-iommu code anytime soon. On 17 September 2018 at 14:33, Christoph Hellwig <hch@lst.de> wrote: > On Fri, Sep 14, 2018 at 01:48:59PM +0100, Will Deacon wrote: >> > As far as merging goes, I don't mind at all whether this goes via IOMMU, >> > or via dma-mapping provided Joerg's happy to ack it. >> >> I think it makes most sense for Joerg to take this series via his tree. > > FYI, I have WIP patches to move the arm dma-iommu wrappers to common > code: > > http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-maybe-coherent > > which will require some synchronization of the involved trees. In the > end it is iommu code, so the actual iommu patches should probably > go into the iommu tree after all, but it might have to pull in the > dma-mapping branch for that. Or we just punt it until the next merge > window. Not sure how fast Tom needs the common dma-iommu code for > the x86 AMD iommu conversion. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention 2018-09-12 15:24 [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Robin Murphy ` (3 preceding siblings ...) 2018-09-14 12:48 ` [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Will Deacon @ 2018-09-25 8:24 ` Joerg Roedel 4 siblings, 0 replies; 20+ messages in thread From: Joerg Roedel @ 2018-09-25 8:24 UTC (permalink / raw) To: linux-arm-kernel On Wed, Sep 12, 2018 at 04:24:11PM +0100, Robin Murphy wrote: > Robin Murphy (3): > iommu: Add fast hook for getting DMA domains > iommu/dma: Use fast DMA domain lookup > arm64/dma-mapping: Mildly optimise non-coherent IOMMU ops Applied, thanks Robin. ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2018-10-01 10:55 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-09-12 15:24 [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Robin Murphy 2018-09-12 15:24 ` [PATCH v2 1/3] iommu: Add fast hook for getting DMA domains Robin Murphy 2018-09-12 15:24 ` [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup Robin Murphy 2018-09-28 13:26 ` Marek Szyprowski 2018-09-28 13:52 ` Robin Murphy 2018-09-28 14:21 ` Marek Szyprowski 2018-09-28 15:31 ` Robin Murphy 2018-09-28 15:33 ` Christoph Hellwig 2018-09-28 15:46 ` Robin Murphy [not found] ` <CGME20180928160945eucas1p17ff77fe34854548719880638ecb9c54d@eucas1p1.samsung.com> 2018-09-28 16:09 ` [PATCH] drm/exynos: Use selected dma_dev default iommu domain instead of a fake one Marek Szyprowski 2018-09-28 16:13 ` Robin Murphy 2018-09-28 16:18 ` [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup Marek Szyprowski 2018-10-01 10:55 ` Inki Dae 2018-09-12 15:24 ` [PATCH v2 3/3] arm64/dma-mapping: Mildly optimise non-coherent IOMMU ops Robin Murphy 2018-09-14 12:48 ` [PATCH v2 0/3] iommu: Avoid DMA ops domain refcount contention Will Deacon 2018-09-17 11:20 ` John Garry 2018-09-20 12:51 ` John Garry 2018-09-17 13:33 ` Christoph Hellwig 2018-09-18 13:28 ` Tom Murphy 2018-09-25 8:24 ` Joerg Roedel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).