linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS
       [not found] ` <1485531259-17513-2-git-send-email-geert+renesas@glider.be>
@ 2017-01-27 17:50   ` Robin Murphy
  2017-01-31 10:24     ` Geert Uytterhoeven
  0 siblings, 1 reply; 3+ messages in thread
From: Robin Murphy @ 2017-01-27 17:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geert,

On 27/01/17 15:34, Geert Uytterhoeven wrote:
> Add helpers for allocating physically contiguous DMA buffers to the
> generic IOMMU DMA code.  This can be useful when two or more devices
> with different memory requirements are involved in buffer sharing.
> 
> The iommu_dma_{alloc,free}_contiguous() functions complement the existing
> iommu_dma_{alloc,free}() functions, and allow architecture-specific code
> to implement support for the DMA_ATTR_FORCE_CONTIGUOUS attribute on
> systems with an IOMMU.  As this uses the CMA allocator, setting this
> attribute has a runtime-dependency on CONFIG_DMA_CMA.
> 
> Note that unlike the existing iommu_dma_alloc() helper,
> iommu_dma_alloc_contiguous() has no callback to flush pages.
> Ensuring the returned region is made visible to a non-coherent device is
> the responsibility of the caller.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> v2:
>   - Provide standalone iommu_dma_{alloc,free}_contiguous() functions, as
>     requested by Robin Murphy,
>   - Simplify operations by getting rid of the page array/scatterlist
>     dance, as the buffer is contiguous,
>   - Move CPU cache management into the caller, which is much simpler with
>     a single contiguous buffer.

Thanks for the rework, that's a lot easier to make sense of! Now, please
don't hate me, but...

> ---
>  drivers/iommu/dma-iommu.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/dma-iommu.h |  4 +++
>  2 files changed, 76 insertions(+)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2db0d641cf4505b5..8f8ed4426f9a3a12 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -30,6 +30,7 @@
>  #include <linux/pci.h>
>  #include <linux/scatterlist.h>
>  #include <linux/vmalloc.h>
> +#include <linux/dma-contiguous.h>
>  
>  struct iommu_dma_msi_page {
>  	struct list_head	list;
> @@ -408,6 +409,77 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>  }
>  
>  /**
> + * iommu_dma_free_contiguous - Free a buffer allocated by
> + *			       iommu_dma_alloc_contiguous()
> + * @dev: Device which owns this buffer
> + * @page: Buffer page pointer as returned by iommu_dma_alloc_contiguous()
> + * @size: Size of buffer in bytes
> + * @handle: DMA address of buffer
> + *
> + * Frees the pages associated with the buffer.
> + */
> +void iommu_dma_free_contiguous(struct device *dev, struct page *page,
> +		size_t size, dma_addr_t *handle)
> +{
> +	__iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle);
> +	dma_release_from_contiguous(dev, page, PAGE_ALIGN(size) >> PAGE_SHIFT);
> +	*handle = DMA_ERROR_CODE;
> +}
> +
> +/**
> + * iommu_dma_alloc_contiguous - Allocate and map a buffer contiguous in IOVA
> + *				and physical space
> + * @dev: Device to allocate memory for. Must be a real device attached to an
> + *	 iommu_dma_domain
> + * @size: Size of buffer in bytes
> + * @prot: IOMMU mapping flags
> + * @handle: Out argument for allocated DMA handle
> + *
> + * Return: Buffer page pointer, or NULL on failure.
> + *
> + * Note that unlike iommu_dma_alloc(), it's the caller's responsibility to
> + * ensure the returned region is made visible to the given non-coherent device.
> + */
> +struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
> +		int prot, dma_addr_t *handle)
> +{
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> +	struct iova_domain *iovad = cookie_iovad(domain);
> +	dma_addr_t dma_addr;
> +	unsigned int count;
> +	struct page *page;
> +	struct iova *iova;
> +	int ret;
> +
> +	*handle = DMA_ERROR_CODE;
> +
> +	size = PAGE_ALIGN(size);
> +	count = size >> PAGE_SHIFT;
> +	page = dma_alloc_from_contiguous(dev, count, get_order(size));
> +	if (!page)
> +		return NULL;
> +
> +	iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
> +	if (!iova)
> +		goto out_free_pages;
> +
> +	size = iova_align(iovad, size);
> +	dma_addr = iova_dma_addr(iovad, iova);
> +	ret = iommu_map(domain, dma_addr, page_to_phys(page), size, prot);
> +	if (ret < 0)
> +		goto out_free_iova;
> +
> +	*handle = dma_addr;
> +	return page;
> +
> +out_free_iova:
> +	__free_iova(iovad, iova);
> +out_free_pages:
> +	dma_release_from_contiguous(dev, page, count);
> +	return NULL;
> +}

...now that I can see it clearly, isn't this more or less just:

	page = dma_alloc_from_contiguous(dev, ...);
	if (page)
		dma_addr = iommu_dma_map_page(dev, page, ...);
?

Would it not be even simpler to just make those two calls directly from
the arm64 code?

Robin.

> +
> +/**
>   * iommu_dma_mmap - Map a buffer into provided user VMA
>   * @pages: Array representing buffer from iommu_dma_alloc()
>   * @size: Size of buffer in bytes
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 7f7e9a7e3839966c..7eee62c2b0e752f7 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -45,6 +45,10 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>  		void (*flush_page)(struct device *, const void *, phys_addr_t));
>  void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
>  		dma_addr_t *handle);
> +struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
> +		int prot, dma_addr_t *handle);
> +void iommu_dma_free_contiguous(struct device *dev, struct page *page,
> +		size_t size, dma_addr_t *handle);
>  
>  int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma);
>  
> 


* [PATCH v2 2/2] arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU
       [not found] ` <1485531259-17513-3-git-send-email-geert+renesas@glider.be>
@ 2017-01-28 15:43   ` Laurent Pinchart
  0 siblings, 0 replies; 3+ messages in thread
From: Laurent Pinchart @ 2017-01-28 15:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geert,

Thank you for the patch.

On Friday 27 Jan 2017 16:34:19 Geert Uytterhoeven wrote:
> Add support for allocating physically contiguous DMA buffers on arm64
> systems with an IOMMU, by dispatching DMA buffer allocations with the
> DMA_ATTR_FORCE_CONTIGUOUS attribute to the appropriate IOMMU DMA
> helpers.
> 
> Note that as this uses the CMA allocator, setting this attribute has a
> runtime-dependency on CONFIG_DMA_CMA, just like on arm32.
> 
> For arm64 systems using swiotlb, no changes are needed to support the
> allocation of physically contiguous DMA buffers:
>   - swiotlb always uses physically contiguous buffers (up to
>     IO_TLB_SEGSIZE = 128 pages),
>   - arm64's __dma_alloc_coherent() already calls
>     dma_alloc_from_contiguous() when CMA is available.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> v2:
>   - New, handle dispatching in the arch (arm64) code, as requested by
>     Robin Murphy.
> ---
>  arch/arm64/mm/dma-mapping.c | 51 +++++++++++++++++++++++++++++-------------
>  1 file changed, 37 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 1d7d5d2881db7c19..325803e0ba79ef26 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -577,20 +577,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>  	 */
>  	gfp |= __GFP_ZERO;
> 
> -	if (gfpflags_allow_blocking(gfp)) {
> -		struct page **pages;
> -		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> -
> -		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> -					handle, flush_page);
> -		if (!pages)
> -			return NULL;
> -
> -		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> -					      __builtin_return_address(0));
> -		if (!addr)
> -			iommu_dma_free(dev, pages, iosize, handle);
> -	} else {
> +	if (!gfpflags_allow_blocking(gfp)) {
>  		struct page *page;
>  		/*
>  		 * In atomic context we can't remap anything, so we'll only
> @@ -614,6 +601,35 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>  			__free_from_pool(addr, size);
>  			addr = NULL;
>  		}
> +	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> +		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> +		struct page *page;
> +
> +		page = iommu_dma_alloc_contiguous(dev, iosize, ioprot, handle);
> +		if (!page)
> +			return NULL;
> +
> +		if (!coherent)
> +			__dma_flush_area(page_to_virt(page), iosize);
> +
> +		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
> +						   prot,
> +						   __builtin_return_address(0));
> +		if (!addr)
> +			iommu_dma_free_contiguous(dev, page, iosize, handle);
> +	} else {
> +		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> +		struct page **pages;
> +
> +		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> +					handle, flush_page);
> +		if (!pages)
> +			return NULL;
> +
> +		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> +					      __builtin_return_address(0));
> +		if (!addr)
> +			iommu_dma_free(dev, pages, iosize, handle);
>  	}
>  	return addr;
>  }
> @@ -626,6 +642,8 @@ static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
>  	size = PAGE_ALIGN(size);
>  	/*
>  	 * @cpu_addr will be one of 3 things depending on how it was allocated:

s/one of 3/one of 4/

Apart from that,

Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> +	 * - A remapped array of pages from iommu_dma_alloc_contiguous()
> +	 *   for contiguous allocations.
>  	 * - A remapped array of pages from iommu_dma_alloc(), for all
>  	 *   non-atomic allocations.
>  	 * - A non-cacheable alias from the atomic pool, for atomic
> @@ -637,6 +655,11 @@ static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
>  	if (__in_atomic_pool(cpu_addr, size)) {
>  		iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
>  		__free_from_pool(cpu_addr, size);
> +	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> +		struct page *page = phys_to_page(dma_to_phys(dev, handle));
> +
> +		iommu_dma_free_contiguous(dev, page, iosize, &handle);
> +		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
>  	} else if (is_vmalloc_addr(cpu_addr)){
>  		struct vm_struct *area = find_vm_area(cpu_addr);

-- 
Regards,

Laurent Pinchart


* [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS
  2017-01-27 17:50   ` [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS Robin Murphy
@ 2017-01-31 10:24     ` Geert Uytterhoeven
  0 siblings, 0 replies; 3+ messages in thread
From: Geert Uytterhoeven @ 2017-01-31 10:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Robin,

On Fri, Jan 27, 2017 at 6:50 PM, Robin Murphy <robin.murphy@arm.com> wrote:
> On 27/01/17 15:34, Geert Uytterhoeven wrote:
>> Add helpers for allocating physically contiguous DMA buffers to the
>> generic IOMMU DMA code.  This can be useful when two or more devices
>> with different memory requirements are involved in buffer sharing.
>>
>> The iommu_dma_{alloc,free}_contiguous() functions complement the existing
>> iommu_dma_{alloc,free}() functions, and allow architecture-specific code
>> to implement support for the DMA_ATTR_FORCE_CONTIGUOUS attribute on
>> systems with an IOMMU.  As this uses the CMA allocator, setting this
>> attribute has a runtime-dependency on CONFIG_DMA_CMA.
>>
>> Note that unlike the existing iommu_dma_alloc() helper,
>> iommu_dma_alloc_contiguous() has no callback to flush pages.
>> Ensuring the returned region is made visible to a non-coherent device is
>> the responsibility of the caller.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> ---
>> v2:
>>   - Provide standalone iommu_dma_{alloc,free}_contiguous() functions, as
>>     requested by Robin Murphy,
>>   - Simplify operations by getting rid of the page array/scatterlist
>>     dance, as the buffer is contiguous,
>>   - Move CPU cache management into the caller, which is much simpler with
>>     a single contiguous buffer.
>
> Thanks for the rework, that's a lot easier to make sense of! Now, please
> don't hate me, but...

I don't hate you at all. I'm just a drivers/iommu rookie ;-)

>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c

>> +/**
>> + * iommu_dma_alloc_contiguous - Allocate and map a buffer contiguous in IOVA
>> + *                           and physical space
>> + * @dev: Device to allocate memory for. Must be a real device attached to an
>> + *    iommu_dma_domain
>> + * @size: Size of buffer in bytes
>> + * @prot: IOMMU mapping flags
>> + * @handle: Out argument for allocated DMA handle
>> + *
>> + * Return: Buffer page pointer, or NULL on failure.
>> + *
>> + * Note that unlike iommu_dma_alloc(), it's the caller's responsibility to
>> + * ensure the returned region is made visible to the given non-coherent device.
>> + */
>> +struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
>> +             int prot, dma_addr_t *handle)
>> +{
>> +     struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>> +     struct iova_domain *iovad = cookie_iovad(domain);
>> +     dma_addr_t dma_addr;
>> +     unsigned int count;
>> +     struct page *page;
>> +     struct iova *iova;
>> +     int ret;
>> +
>> +     *handle = DMA_ERROR_CODE;
>> +
>> +     size = PAGE_ALIGN(size);
>> +     count = size >> PAGE_SHIFT;
>> +     page = dma_alloc_from_contiguous(dev, count, get_order(size));
>> +     if (!page)
>> +             return NULL;
>> +
>> +     iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
>> +     if (!iova)
>> +             goto out_free_pages;
>> +
>> +     size = iova_align(iovad, size);
>> +     dma_addr = iova_dma_addr(iovad, iova);
>> +     ret = iommu_map(domain, dma_addr, page_to_phys(page), size, prot);
>> +     if (ret < 0)
>> +             goto out_free_iova;
>> +
>> +     *handle = dma_addr;
>> +     return page;
>> +
>> +out_free_iova:
>> +     __free_iova(iovad, iova);
>> +out_free_pages:
>> +     dma_release_from_contiguous(dev, page, count);
>> +     return NULL;
>> +}
>
> ...now that I can see it clearly, isn't this more or less just:
>
>         page = dma_alloc_from_contiguous(dev, ...);
>         if (page)
>                 dma_addr = iommu_dma_map_page(dev, page, ...);
> ?
>
> Would it not be even simpler to just make those two calls directly from
> the arm64 code?

You're right. Will update. Thanks a lot!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


end of thread, other threads:[~2017-01-31 10:24 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1485531259-17513-1-git-send-email-geert+renesas@glider.be>
     [not found] ` <1485531259-17513-2-git-send-email-geert+renesas@glider.be>
2017-01-27 17:50   ` [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS Robin Murphy
2017-01-31 10:24     ` Geert Uytterhoeven
     [not found] ` <1485531259-17513-3-git-send-email-geert+renesas@glider.be>
2017-01-28 15:43   ` [PATCH v2 2/2] arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU Laurent Pinchart
