* [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS [not found] ` <1485531259-17513-2-git-send-email-geert+renesas@glider.be> @ 2017-01-27 17:50 ` Robin Murphy 2017-01-31 10:24 ` Geert Uytterhoeven 0 siblings, 1 reply; 3+ messages in thread From: Robin Murphy @ 2017-01-27 17:50 UTC (permalink / raw) To: linux-arm-kernel Hi Geert, On 27/01/17 15:34, Geert Uytterhoeven wrote: > Add helpers for allocating physically contiguous DMA buffers to the > generic IOMMU DMA code. This can be useful when two or more devices > with different memory requirements are involved in buffer sharing. > > The iommu_dma_{alloc,free}_contiguous() functions complement the existing > iommu_dma_{alloc,free}() functions, and allow architecture-specific code > to implement support for the DMA_ATTR_FORCE_CONTIGUOUS attribute on > systems with an IOMMU. As this uses the CMA allocator, setting this > attribute has a runtime-dependency on CONFIG_DMA_CMA. > > Note that unlike the existing iommu_dma_alloc() helper, > iommu_dma_alloc_contiguous() has no callback to flush pages. > Ensuring the returned region is made visible to a non-coherent device is > the responsibility of the caller. > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> > --- > v2: > - Provide standalone iommu_dma_{alloc,free}_contiguous() functions, as > requested by Robin Murphy, > - Simplify operations by getting rid of the page array/scatterlist > dance, as the buffer is contiguous, > - Move CPU cache magement into the caller, which is much simpler with > a single contiguous buffer. Thanks for the rework, that's a lot easier to make sense of! Now, please don't hate me, but... > --- > drivers/iommu/dma-iommu.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++ > include/linux/dma-iommu.h | 4 +++ > 2 files changed, 76 insertions(+) > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > index 2db0d641cf4505b5..8f8ed4426f9a3a12 100644 > --- a/drivers/iommu/dma-iommu.c > +++ b/drivers/iommu/dma-iommu.c > @@ -30,6 +30,7 @@ > #include <linux/pci.h> > #include <linux/scatterlist.h> > #include <linux/vmalloc.h> > +#include <linux/dma-contiguous.h> > > struct iommu_dma_msi_page { > struct list_head list; > @@ -408,6 +409,77 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp, > } > > /** > + * iommu_dma_free_contiguous - Free a buffer allocated by > + * iommu_dma_alloc_contiguous() > + * @dev: Device which owns this buffer > + * @page: Buffer page pointer as returned by iommu_dma_alloc_contiguous() > + * @size: Size of buffer in bytes > + * @handle: DMA address of buffer > + * > + * Frees the pages associated with the buffer. > + */ > +void iommu_dma_free_contiguous(struct device *dev, struct page *page, > + size_t size, dma_addr_t *handle) > +{ > + __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle); > + dma_release_from_contiguous(dev, page, PAGE_ALIGN(size) >> PAGE_SHIFT); > + *handle = DMA_ERROR_CODE; > +} > + > +/** > + * iommu_dma_alloc_contiguous - Allocate and map a buffer contiguous in IOVA > + * and physical space > + * @dev: Device to allocate memory for. Must be a real device attached to an > + * iommu_dma_domain > + * @size: Size of buffer in bytes > + * @prot: IOMMU mapping flags > + * @handle: Out argument for allocated DMA handle > + * > + * Return: Buffer page pointer, or NULL on failure. > + * > + * Note that unlike iommu_dma_alloc(), it's the caller's responsibility to > + * ensure the returned region is made visible to the given non-coherent device. > + */ > +struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size, > + int prot, dma_addr_t *handle) > +{ > + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + struct iova_domain *iovad = cookie_iovad(domain); > + dma_addr_t dma_addr; > + unsigned int count; > + struct page *page; > + struct iova *iova; > + int ret; > + > + *handle = DMA_ERROR_CODE; > + > + size = PAGE_ALIGN(size); > + count = size >> PAGE_SHIFT; > + page = dma_alloc_from_contiguous(dev, count, get_order(size)); > + if (!page) > + return NULL; > + > + iova = __alloc_iova(domain, size, dev->coherent_dma_mask); > + if (!iova) > + goto out_free_pages; > + > + size = iova_align(iovad, size); > + dma_addr = iova_dma_addr(iovad, iova); > + ret = iommu_map(domain, dma_addr, page_to_phys(page), size, prot); > + if (ret < 0) > + goto out_free_iova; > + > + *handle = dma_addr; > + return page; > + > +out_free_iova: > + __free_iova(iovad, iova); > +out_free_pages: > + dma_release_from_contiguous(dev, page, count); > + return NULL; > +} ...now that I can see it clearly, isn't this more or less just: page = dma_alloc_from_contiguous(dev, ...); if (page) dma_addr = iommu_dma_map_page(dev, page, ...); ? Would it not be even simpler to just make those two calls directly from the arm64 code? Robin. > + > +/** > * iommu_dma_mmap - Map a buffer into provided user VMA > * @pages: Array representing buffer from iommu_dma_alloc() > * @size: Size of buffer in bytes > diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h > index 7f7e9a7e3839966c..7eee62c2b0e752f7 100644 > --- a/include/linux/dma-iommu.h > +++ b/include/linux/dma-iommu.h > @@ -45,6 +45,10 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp, > void (*flush_page)(struct device *, const void *, phys_addr_t)); > void iommu_dma_free(struct device *dev, struct page **pages, size_t size, > dma_addr_t *handle); > +struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size, > + int prot, dma_addr_t *handle); > +void iommu_dma_free_contiguous(struct device *dev, struct page *page, > + size_t size, dma_addr_t *handle); > > int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma); > > ^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS 2017-01-27 17:50 ` [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS Robin Murphy @ 2017-01-31 10:24 ` Geert Uytterhoeven 0 siblings, 0 replies; 3+ messages in thread From: Geert Uytterhoeven @ 2017-01-31 10:24 UTC (permalink / raw) To: linux-arm-kernel Hi Robin, On Fri, Jan 27, 2017 at 6:50 PM, Robin Murphy <robin.murphy@arm.com> wrote: > On 27/01/17 15:34, Geert Uytterhoeven wrote: >> Add helpers for allocating physically contiguous DMA buffers to the >> generic IOMMU DMA code. This can be useful when two or more devices >> with different memory requirements are involved in buffer sharing. >> >> The iommu_dma_{alloc,free}_contiguous() functions complement the existing >> iommu_dma_{alloc,free}() functions, and allow architecture-specific code >> to implement support for the DMA_ATTR_FORCE_CONTIGUOUS attribute on >> systems with an IOMMU. As this uses the CMA allocator, setting this >> attribute has a runtime-dependency on CONFIG_DMA_CMA. >> >> Note that unlike the existing iommu_dma_alloc() helper, >> iommu_dma_alloc_contiguous() has no callback to flush pages. >> Ensuring the returned region is made visible to a non-coherent device is >> the responsibility of the caller. >> >> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> >> --- >> v2: >> - Provide standalone iommu_dma_{alloc,free}_contiguous() functions, as >> requested by Robin Murphy, >> - Simplify operations by getting rid of the page array/scatterlist >> dance, as the buffer is contiguous, >> - Move CPU cache magement into the caller, which is much simpler with >> a single contiguous buffer. > > Thanks for the rework, that's a lot easier to make sense of! Now, please > don't hate me, but... I don't hate you at all. I'm just a drivers/iommu rookie ;-) >> --- a/drivers/iommu/dma-iommu.c >> +++ b/drivers/iommu/dma-iommu.c >> +/** >> + * iommu_dma_alloc_contiguous - Allocate and map a buffer contiguous in IOVA >> + * and physical space >> + * @dev: Device to allocate memory for. Must be a real device attached to an >> + * iommu_dma_domain >> + * @size: Size of buffer in bytes >> + * @prot: IOMMU mapping flags >> + * @handle: Out argument for allocated DMA handle >> + * >> + * Return: Buffer page pointer, or NULL on failure. >> + * >> + * Note that unlike iommu_dma_alloc(), it's the caller's responsibility to >> + * ensure the returned region is made visible to the given non-coherent device. >> + */ >> +struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size, >> + int prot, dma_addr_t *handle) >> +{ >> + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); >> + struct iova_domain *iovad = cookie_iovad(domain); >> + dma_addr_t dma_addr; >> + unsigned int count; >> + struct page *page; >> + struct iova *iova; >> + int ret; >> + >> + *handle = DMA_ERROR_CODE; >> + >> + size = PAGE_ALIGN(size); >> + count = size >> PAGE_SHIFT; >> + page = dma_alloc_from_contiguous(dev, count, get_order(size)); >> + if (!page) >> + return NULL; >> + >> + iova = __alloc_iova(domain, size, dev->coherent_dma_mask); >> + if (!iova) >> + goto out_free_pages; >> + >> + size = iova_align(iovad, size); >> + dma_addr = iova_dma_addr(iovad, iova); >> + ret = iommu_map(domain, dma_addr, page_to_phys(page), size, prot); >> + if (ret < 0) >> + goto out_free_iova; >> + >> + *handle = dma_addr; >> + return page; >> + >> +out_free_iova: >> + __free_iova(iovad, iova); >> +out_free_pages: >> + dma_release_from_contiguous(dev, page, count); >> + return NULL; >> +} > > ...now that I can see it clearly, isn't this more or less just: > > page = dma_alloc_from_contiguous(dev, ...); > if (page) > dma_addr = iommu_dma_map_page(dev, page, ...); > ? > > Would it not be even simpler to just make those two calls directly from > the arm64 code? You're right. Will update. Thanks a lot! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 3+ messages in thread
[parent not found: <1485531259-17513-3-git-send-email-geert+renesas@glider.be>]
* [PATCH v2 2/2] arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU [not found] ` <1485531259-17513-3-git-send-email-geert+renesas@glider.be> @ 2017-01-28 15:43 ` Laurent Pinchart 0 siblings, 0 replies; 3+ messages in thread From: Laurent Pinchart @ 2017-01-28 15:43 UTC (permalink / raw) To: linux-arm-kernel Hi Geert, Thank you for the patch. On Friday 27 Jan 2017 16:34:19 Geert Uytterhoeven wrote: > Add support for allocation physically contiguous DMA buffers on arm64 > systems with an IOMMU, by dispatching DMA buffer allocations with the > DMA_ATTR_FORCE_CONTIGUOUS attribute to the appropriate IOMMU DMA > helpers. > > Note that as this uses the CMA allocator, setting this attribute has a > runtime-dependency on CONFIG_DMA_CMA, just like on arm32. > > For arm64 systems using swiotlb, no changes are needed to support the > allocation of physically contiguous DMA buffers: > - swiotlb always uses physically contiguous buffers (up to > IO_TLB_SEGSIZE = 128 pages), > - arm64's __dma_alloc_coherent() already calls > dma_alloc_from_contiguous() when CMA is available. > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> > --- > v2: > - New, handle dispatching in the arch (arm64) code, as requested by > Robin Murphy. > --- > arch/arm64/mm/dma-mapping.c | 51 +++++++++++++++++++++++++++++------------- > 1 file changed, 37 insertions(+), 14 deletions(-) > > diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > index 1d7d5d2881db7c19..325803e0ba79ef26 100644 > --- a/arch/arm64/mm/dma-mapping.c > +++ b/arch/arm64/mm/dma-mapping.c > @@ -577,20 +577,7 @@ static void *__iommu_alloc_attrs(struct device *dev, > size_t size, */ > gfp |= __GFP_ZERO; > > - if (gfpflags_allow_blocking(gfp)) { > - struct page **pages; > - pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent); > - > - pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot, > - handle, flush_page); > - if (!pages) > - return NULL; > - > - addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot, > - __builtin_return_address(0)); > - if (!addr) > - iommu_dma_free(dev, pages, iosize, handle); > - } else { > + if (!gfpflags_allow_blocking(gfp)) { > struct page *page; > /* > * In atomic context we can't remap anything, so we'll only > @@ -614,6 +601,35 @@ static void *__iommu_alloc_attrs(struct device *dev, > size_t size, __free_from_pool(addr, size); > addr = NULL; > } > + } else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) { > + pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent); > + struct page *page; > + > + page = iommu_dma_alloc_contiguous(dev, iosize, ioprot, handle); > + if (!page) > + return NULL; > + > + if (!coherent) > + __dma_flush_area(page_to_virt(page), iosize); > + > + addr = dma_common_contiguous_remap(page, size, VM_USERMAP, > + prot, > + __builtin_return_address(0)); > + if (!addr) > + iommu_dma_free_contiguous(dev, page, iosize, handle); > + } else { > + pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent); > + struct page **pages; > + > + pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot, > + handle, flush_page); > + if (!pages) > + return NULL; > + > + addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot, > + __builtin_return_address(0)); > + if (!addr) > + iommu_dma_free(dev, pages, iosize, handle); > } > return addr; > } > @@ -626,6 +642,8 @@ static void __iommu_free_attrs(struct device *dev, > size_t size, void *cpu_addr, size = PAGE_ALIGN(size); > /* > * @cpu_addr will be one of 3 things depending on how it was allocated: s/one of 3/one of 4/ Apart from that, Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > + * - A remapped array of pages from iommu_dma_alloc_contiguous() > + * for contiguous allocations. > * - A remapped array of pages from iommu_dma_alloc(), for all > * non-atomic allocations. > * - A non-cacheable alias from the atomic pool, for atomic > @@ -637,6 +655,11 @@ static void __iommu_free_attrs(struct device *dev, > size_t size, void *cpu_addr, if (__in_atomic_pool(cpu_addr, size)) { > iommu_dma_unmap_page(dev, handle, iosize, 0, 0); > __free_from_pool(cpu_addr, size); > + } else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) { > + struct page *page = phys_to_page(dma_to_phys(dev, handle)); > + > + iommu_dma_free_contiguous(dev, page, iosize, &handle); > + dma_common_free_remap(cpu_addr, size, VM_USERMAP); > } else if (is_vmalloc_addr(cpu_addr)){ > struct vm_struct *area = find_vm_area(cpu_addr); -- Regards, Laurent Pinchart ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-01-31 10:24 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1485531259-17513-1-git-send-email-geert+renesas@glider.be>
[not found] ` <1485531259-17513-2-git-send-email-geert+renesas@glider.be>
2017-01-27 17:50 ` [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS Robin Murphy
2017-01-31 10:24 ` Geert Uytterhoeven
[not found] ` <1485531259-17513-3-git-send-email-geert+renesas@glider.be>
2017-01-28 15:43 ` [PATCH v2 2/2] arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU Laurent Pinchart
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).