* Query on patch to be upstream ? @ 2014-03-05 20:46 Ritesh Harjani 2014-03-11 12:15 ` Catalin Marinas 0 siblings, 1 reply; 18+ messages in thread From: Ritesh Harjani @ 2014-03-05 20:46 UTC (permalink / raw) To: linux-arm-kernel Hi Catalin, Your Patch (arm64: Implement coherent DMA API based on swiotlb) from your devel branch going upstream (without modifications) ? Thanks Ritesh ^ permalink raw reply [flat|nested] 18+ messages in thread
* Query on patch to be upstream ? 2014-03-05 20:46 Query on patch to be upstream ? Ritesh Harjani @ 2014-03-11 12:15 ` Catalin Marinas 2014-03-11 18:04 ` Laura Abbott 0 siblings, 1 reply; 18+ messages in thread From: Catalin Marinas @ 2014-03-11 12:15 UTC (permalink / raw) To: linux-arm-kernel On Wed, Mar 05, 2014 at 08:46:24PM +0000, Ritesh Harjani wrote: > Your Patch (arm64: Implement coherent DMA API based on swiotlb) from > your devel branch going upstream (without modifications) ? That's the plan. But once we get the system topology or Santosh's dma I coherent DT bindings plan to make it DT-driven on whether a device is coherent or not. For now we need some solution as I get people asking about cache management for DMA buffers (though none of them got back with a Tested-by ;)). -- Catalin ^ permalink raw reply [flat|nested] 18+ messages in thread
* Query on patch to be upstream ? 2014-03-11 12:15 ` Catalin Marinas @ 2014-03-11 18:04 ` Laura Abbott 2014-03-11 18:26 ` Catalin Marinas 0 siblings, 1 reply; 18+ messages in thread From: Laura Abbott @ 2014-03-11 18:04 UTC (permalink / raw) To: linux-arm-kernel With one caveat below, you can add Tested-by: Laura Abbott <lauraa@codeaurora.org> for "arm64: Implement coherent DMA API based on swiotlb" Thanks, Laura -- 8< -- From: Laura Abbott <lauraa@codeaurora.org> Date: Thu, 6 Mar 2014 21:05:41 -0800 Subject: [PATCH] arm64: Use custom mmap function for noncoherent dma ops The non-coherent dma ops remap memory with appropriate attributes. This remapped address cannot be used with virt_to_page which dma_common_mmap uses. Implement a custom (but very similar) function which correctly calculates the physical address and remaps to userspace. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> --- arch/arm64/mm/dma-mapping.c | 27 +++++++++++++++++++++++++++ 1 files changed, 27 insertions(+), 0 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index e923a5b..5ac7a14 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -194,9 +194,36 @@ static void arm64_swiotlb_sync_sg_for_device(struct device *dev, sg->length, dir); } +int arm64_swiotlb_mmap(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs) +{ + int ret = -ENXIO; + unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >> + PAGE_SHIFT; + unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; + unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT; + unsigned long off = vma->vm_pgoff; + + vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot); + + if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret)) + return ret; + + if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) { + ret = remap_pfn_range(vma, vma->vm_start, + pfn + off, + vma->vm_end - vma->vm_start, + vma->vm_page_prot); + } + + return ret; +} + struct dma_map_ops noncoherent_swiotlb_dma_ops = { .alloc = arm64_swiotlb_alloc_noncoherent, .free = arm64_swiotlb_free_noncoherent, + .mmap = arm64_swiotlb_mmap, .map_page = arm64_swiotlb_map_page, .unmap_page = arm64_swiotlb_unmap_page, .map_sg = arm64_swiotlb_map_sg_attrs, -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Query on patch to be upstream ? 2014-03-11 18:04 ` Laura Abbott @ 2014-03-11 18:26 ` Catalin Marinas 2014-03-12 18:20 ` Laura Abbott 0 siblings, 1 reply; 18+ messages in thread From: Catalin Marinas @ 2014-03-11 18:26 UTC (permalink / raw) To: linux-arm-kernel On Tue, Mar 11, 2014 at 06:04:19PM +0000, Laura Abbott wrote: > With one caveat below, you can add > > Tested-by: Laura Abbott <lauraa@codeaurora.org> Thanks. > From: Laura Abbott <lauraa@codeaurora.org> > Date: Thu, 6 Mar 2014 21:05:41 -0800 > Subject: [PATCH] arm64: Use custom mmap function for noncoherent dma ops > > The non-coherent dma ops remap memory with appropriate attributes. > This remapped address cannot be used with virt_to_page which > dma_common_mmap uses. Implement a custom (but very similar) function > which correctly calculates the physical address and remaps to userspace. Looking at the coherent implementation, I think we also have a (performance) problem with dma_common_mmap(). It uses pgprot_noncached() by default and if the DMA is coherent we don't really need strongly ordered memory. Would you mind writing a __dma_common_mmap() for arm64 with separate coherent/non-coherent dma mmap functions that set the vm_page_prot accordingly? Thanks. -- Catalin ^ permalink raw reply [flat|nested] 18+ messages in thread
* Query on patch to be upstream ? 2014-03-11 18:26 ` Catalin Marinas @ 2014-03-12 18:20 ` Laura Abbott 2014-03-13 17:45 ` [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping Laura Abbott 2014-03-13 17:45 ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott 0 siblings, 2 replies; 18+ messages in thread From: Laura Abbott @ 2014-03-12 18:20 UTC (permalink / raw) To: linux-arm-kernel On 3/11/2014 11:26 AM, Catalin Marinas wrote: > On Tue, Mar 11, 2014 at 06:04:19PM +0000, Laura Abbott wrote: >> With one caveat below, you can add >> >> Tested-by: Laura Abbott <lauraa@codeaurora.org> > > Thanks. > >> From: Laura Abbott <lauraa@codeaurora.org> >> Date: Thu, 6 Mar 2014 21:05:41 -0800 >> Subject: [PATCH] arm64: Use custom mmap function for noncoherent dma ops >> >> The non-coherent dma ops remap memory with appropriate attributes. >> This remapped address cannot be used with virt_to_page which >> dma_common_mmap uses. Implement a custom (but very similar) function >> which correctly calculates the physical address and remaps to userspace. > > Looking at the coherent implementation, I think we also have a > (performance) problem with dma_common_mmap(). It uses pgprot_noncached() > by default and if the DMA is coherent we don't really need strongly > ordered memory. > > Would you mind writing a __dma_common_mmap() for arm64 with separate > coherent/non-coherent dma mmap functions that set the vm_page_prot > accordingly? > > Thanks. > Yes, I will add that to my TODO list. Thanks, Laura -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping 2014-03-12 18:20 ` Laura Abbott @ 2014-03-13 17:45 ` Laura Abbott 2014-03-13 17:49 ` Catalin Marinas 2014-03-13 17:45 ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott 1 sibling, 1 reply; 18+ messages in thread From: Laura Abbott @ 2014-03-13 17:45 UTC (permalink / raw) To: linux-arm-kernel The current dma_ops do not specify an mmap function so maping falls back to the default implementation. There are at least two issues with using the default implementation: 1) The pgprot is always pgprot_noncached (strongly ordered) memory even with coherent operations 2) dma_common_mmap calls virt_to_page on the remapped non-coherent address which leads to invalid memory being mapped. Fix both these issue by implementing a custom mmap function which correctly accounts for remapped addresses and sets vm_pg_prot appropriately. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> --- arch/arm64/mm/dma-mapping.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index e923a5b..9a639bf 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -194,9 +194,52 @@ static void arm64_swiotlb_sync_sg_for_device(struct device *dev, sg->length, dir); } +/* vma->vm_page_prot must be set appropriately before calling this function */ +static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size) +{ + int ret = -ENXIO; + unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >> + PAGE_SHIFT; + unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; + unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT; + unsigned long off = vma->vm_pgoff; + + if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret)) + return ret; + + if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) { + ret = remap_pfn_range(vma, vma->vm_start, + pfn + off, + vma->vm_end - vma->vm_start, + vma->vm_page_prot); + } + + return ret; +} + +static int arm64_swiotlb_mmap_noncoherent(struct device *dev, + struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs) +{ + /* Just use whatever page_prot attributes were specified */ + return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); +} + +static int arm64_swiotlb_mmap_coherent(struct device *dev, + struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs) +{ + vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot); + return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); +} + struct dma_map_ops noncoherent_swiotlb_dma_ops = { .alloc = arm64_swiotlb_alloc_noncoherent, .free = arm64_swiotlb_free_noncoherent, + .mmap = arm64_swiotlb_mmap_noncoherent, .map_page = arm64_swiotlb_map_page, .unmap_page = arm64_swiotlb_unmap_page, .map_sg = arm64_swiotlb_map_sg_attrs, @@ -213,6 +256,7 @@ EXPORT_SYMBOL(noncoherent_swiotlb_dma_ops); struct dma_map_ops coherent_swiotlb_dma_ops = { .alloc = arm64_swiotlb_alloc_coherent, .free = arm64_swiotlb_free_coherent, + .mmap = arm64_swiotlb_mmap_coherent, .map_page = swiotlb_map_page, .unmap_page = swiotlb_unmap_page, .map_sg = swiotlb_map_sg_attrs, -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping 2014-03-13 17:45 ` [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping Laura Abbott @ 2014-03-13 17:49 ` Catalin Marinas 2014-03-14 1:53 ` Laura Abbott 0 siblings, 1 reply; 18+ messages in thread From: Catalin Marinas @ 2014-03-13 17:49 UTC (permalink / raw) To: linux-arm-kernel On Thu, Mar 13, 2014 at 05:45:15PM +0000, Laura Abbott wrote: > +/* vma->vm_page_prot must be set appropriately before calling this function */ > +static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma, > + void *cpu_addr, dma_addr_t dma_addr, size_t size) > +{ > + int ret = -ENXIO; > + unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >> > + PAGE_SHIFT; > + unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; > + unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT; > + unsigned long off = vma->vm_pgoff; > + > + if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret)) > + return ret; > + > + if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) { > + ret = remap_pfn_range(vma, vma->vm_start, > + pfn + off, > + vma->vm_end - vma->vm_start, > + vma->vm_page_prot); > + } > + > + return ret; > +} > + > +static int arm64_swiotlb_mmap_noncoherent(struct device *dev, > + struct vm_area_struct *vma, > + void *cpu_addr, dma_addr_t dma_addr, size_t size, > + struct dma_attrs *attrs) > +{ > + /* Just use whatever page_prot attributes were specified */ > + return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); > +} > + > +static int arm64_swiotlb_mmap_coherent(struct device *dev, > + struct vm_area_struct *vma, > + void *cpu_addr, dma_addr_t dma_addr, size_t size, > + struct dma_attrs *attrs) > +{ > + vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot); > + return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); > +} So I think we got the naming the other way around. The noncoherent_dma_ops assume that the DMA is non-coherent and therefore must change the pgprot. -- Catalin ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping 2014-03-13 17:49 ` Catalin Marinas @ 2014-03-14 1:53 ` Laura Abbott 0 siblings, 0 replies; 18+ messages in thread From: Laura Abbott @ 2014-03-14 1:53 UTC (permalink / raw) To: linux-arm-kernel On 3/13/2014 10:49 AM, Catalin Marinas wrote: > On Thu, Mar 13, 2014 at 05:45:15PM +0000, Laura Abbott wrote: >> +/* vma->vm_page_prot must be set appropriately before calling this function */ >> +static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma, >> + void *cpu_addr, dma_addr_t dma_addr, size_t size) >> +{ >> + int ret = -ENXIO; >> + unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >> >> + PAGE_SHIFT; >> + unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; >> + unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT; >> + unsigned long off = vma->vm_pgoff; >> + >> + if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret)) >> + return ret; >> + >> + if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) { >> + ret = remap_pfn_range(vma, vma->vm_start, >> + pfn + off, >> + vma->vm_end - vma->vm_start, >> + vma->vm_page_prot); >> + } >> + >> + return ret; >> +} >> + >> +static int arm64_swiotlb_mmap_noncoherent(struct device *dev, >> + struct vm_area_struct *vma, >> + void *cpu_addr, dma_addr_t dma_addr, size_t size, >> + struct dma_attrs *attrs) >> +{ >> + /* Just use whatever page_prot attributes were specified */ >> + return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); >> +} >> + >> +static int arm64_swiotlb_mmap_coherent(struct device *dev, >> + struct vm_area_struct *vma, >> + void *cpu_addr, dma_addr_t dma_addr, size_t size, >> + struct dma_attrs *attrs) >> +{ >> + vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot); >> + return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); >> +} > > So I think we got the naming the other way around. The > noncoherent_dma_ops assume that the DMA is non-coherent and therefore > must change the pgprot. I spent way too long convincing myself this was right and I was still wrong. I'll fix it up for v2 (I have one more patch I need to put in for CMA integration as well) Thanks, Laura -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE 2014-03-12 18:20 ` Laura Abbott 2014-03-13 17:45 ` [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping Laura Abbott @ 2014-03-13 17:45 ` Laura Abbott 2014-03-13 17:52 ` Catalin Marinas ` (3 more replies) 1 sibling, 4 replies; 18+ messages in thread From: Laura Abbott @ 2014-03-13 17:45 UTC (permalink / raw) To: linux-arm-kernel DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot appropriately for non coherent opperations. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> --- Potential addition on top of the coherent work as well. Writecombine and dmacoherent seem to be the same at the moment but it might be good practice to have the two be separate? arch/arm64/mm/dma-mapping.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index 9a639bf..d2c0027 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -29,6 +29,15 @@ struct dma_map_ops *dma_ops; EXPORT_SYMBOL(dma_ops); + +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) +{ + prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? + pgprot_writecombine(prot) : + pgprot_dmacoherent(prot); + return prot; +} + static void *arm64_swiotlb_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t flags, struct dma_attrs *attrs) @@ -72,7 +81,7 @@ static void *arm64_swiotlb_alloc_noncoherent(struct device *dev, size_t size, for (i = 0; i < (size >> PAGE_SHIFT); i++) map[i] = page + i; coherent_ptr = vmap(map, size >> PAGE_SHIFT, VM_MAP, - pgprot_dmacoherent(pgprot_default)); + __get_dma_pgprot(attrs, pgprot_default)); kfree(map); if (!coherent_ptr) goto no_map; @@ -232,7 +241,7 @@ static int arm64_swiotlb_mmap_coherent(struct device *dev, void *cpu_addr, dma_addr_t dma_addr, size_t size, struct dma_attrs *attrs) { - vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot); + vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); } -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE 2014-03-13 17:45 ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott @ 2014-03-13 17:52 ` Catalin Marinas 2014-03-14 2:02 ` Laura Abbott 2014-03-14 19:52 ` [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping Laura Abbott ` (2 subsequent siblings) 3 siblings, 1 reply; 18+ messages in thread From: Catalin Marinas @ 2014-03-13 17:52 UTC (permalink / raw) To: linux-arm-kernel On Thu, Mar 13, 2014 at 05:45:16PM +0000, Laura Abbott wrote: > DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot > appropriately for non coherent opperations. > > Signed-off-by: Laura Abbott <lauraa@codeaurora.org> > --- > Potential addition on top of the coherent work as well. Writecombine > and dmacoherent seem to be the same at the moment but it might be > good practice to have the two be separate? > > arch/arm64/mm/dma-mapping.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > index 9a639bf..d2c0027 100644 > --- a/arch/arm64/mm/dma-mapping.c > +++ b/arch/arm64/mm/dma-mapping.c > @@ -29,6 +29,15 @@ > struct dma_map_ops *dma_ops; > EXPORT_SYMBOL(dma_ops); > > + > +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) > +{ > + prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? > + pgprot_writecombine(prot) : > + pgprot_dmacoherent(prot); > + return prot; > +} pgprot_writecombine and pgprot_dmacoherent are the same on arm64 (and ARMv6/v7). So when the DMA is coherent on an SoC, we need to leave the prot unchanged if !DMA_ATTR_WRITE_COMBINE. -- Catalin ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE 2014-03-13 17:52 ` Catalin Marinas @ 2014-03-14 2:02 ` Laura Abbott 0 siblings, 0 replies; 18+ messages in thread From: Laura Abbott @ 2014-03-14 2:02 UTC (permalink / raw) To: linux-arm-kernel On 3/13/2014 10:52 AM, Catalin Marinas wrote: > On Thu, Mar 13, 2014 at 05:45:16PM +0000, Laura Abbott wrote: >> DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot >> appropriately for non coherent opperations. >> >> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> >> --- >> Potential addition on top of the coherent work as well. Writecombine >> and dmacoherent seem to be the same at the moment but it might be >> good practice to have the two be separate? >> >> arch/arm64/mm/dma-mapping.c | 13 +++++++++++-- >> 1 file changed, 11 insertions(+), 2 deletions(-) >> >> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c >> index 9a639bf..d2c0027 100644 >> --- a/arch/arm64/mm/dma-mapping.c >> +++ b/arch/arm64/mm/dma-mapping.c >> @@ -29,6 +29,15 @@ >> struct dma_map_ops *dma_ops; >> EXPORT_SYMBOL(dma_ops); >> >> + >> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot) >> +{ >> + prot = dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs) ? >> + pgprot_writecombine(prot) : >> + pgprot_dmacoherent(prot); >> + return prot; >> +} > > pgprot_writecombine and pgprot_dmacoherent are the same on arm64 (and > ARMv6/v7). So when the DMA is coherent on an SoC, we need to leave the > prot unchanged if !DMA_ATTR_WRITE_COMBINE. > Ah yes I missed that. I'll fix it up. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping 2014-03-13 17:45 ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott 2014-03-13 17:52 ` Catalin Marinas @ 2014-03-14 19:52 ` Laura Abbott 2014-03-24 10:33 ` Catalin Marinas [not found] ` <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com> 2014-03-14 19:52 ` [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott 2014-03-14 19:52 ` [PATCHv2 3/3] arm64: Use arm64 coherent APIs for non-coherent freeing Laura Abbott 3 siblings, 2 replies; 18+ messages in thread From: Laura Abbott @ 2014-03-14 19:52 UTC (permalink / raw) To: linux-arm-kernel The current dma_ops do not specify an mmap function so maping falls back to the default implementation. There are at least two issues with using the default implementation: 1) The pgprot is always pgprot_noncached (strongly ordered) memory even with coherent operations 2) dma_common_mmap calls virt_to_page on the remapped non-coherent address which leads to invalid memory being mapped. Fix both these issue by implementing a custom mmap function which correctly accounts for remapped addresses and sets vm_pg_prot appropriately. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> --- arch/arm64/mm/dma-mapping.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index e923a5b..0cdd2f6 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -194,9 +194,52 @@ static void arm64_swiotlb_sync_sg_for_device(struct device *dev, sg->length, dir); } +/* vma->vm_page_prot must be set appropriately before calling this function */ +static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size) +{ + int ret = -ENXIO; + unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >> + PAGE_SHIFT; + unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; + unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT; + unsigned long off = vma->vm_pgoff; + + if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret)) + return ret; + + if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) { + ret = remap_pfn_range(vma, vma->vm_start, + pfn + off, + vma->vm_end - vma->vm_start, + vma->vm_page_prot); + } + + return ret; +} + +static int arm64_swiotlb_mmap_noncoherent(struct device *dev, + struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs) +{ + vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot); + return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); +} + +static int arm64_swiotlb_mmap_coherent(struct device *dev, + struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + struct dma_attrs *attrs) +{ + /* Just use whatever page_prot attributes were specified */ + return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); +} + struct dma_map_ops noncoherent_swiotlb_dma_ops = { .alloc = arm64_swiotlb_alloc_noncoherent, .free = arm64_swiotlb_free_noncoherent, + .mmap = arm64_swiotlb_mmap_noncoherent, .map_page = arm64_swiotlb_map_page, .unmap_page = arm64_swiotlb_unmap_page, .map_sg = arm64_swiotlb_map_sg_attrs, @@ -213,6 +256,7 @@ EXPORT_SYMBOL(noncoherent_swiotlb_dma_ops); struct dma_map_ops coherent_swiotlb_dma_ops = { .alloc = arm64_swiotlb_alloc_coherent, .free = arm64_swiotlb_free_coherent, + .mmap = arm64_swiotlb_mmap_coherent, .map_page = swiotlb_map_page, .unmap_page = swiotlb_unmap_page, .map_sg = swiotlb_map_sg_attrs, -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping 2014-03-14 19:52 ` [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping Laura Abbott @ 2014-03-24 10:33 ` Catalin Marinas [not found] ` <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com> 1 sibling, 0 replies; 18+ messages in thread From: Catalin Marinas @ 2014-03-24 10:33 UTC (permalink / raw) To: linux-arm-kernel On Fri, Mar 14, 2014 at 07:52:23PM +0000, Laura Abbott wrote: > The current dma_ops do not specify an mmap function so maping > falls back to the default implementation. There are at least > two issues with using the default implementation: > > 1) The pgprot is always pgprot_noncached (strongly ordered) > memory even with coherent operations > 2) dma_common_mmap calls virt_to_page on the remapped non-coherent > address which leads to invalid memory being mapped. > > Fix both these issue by implementing a custom mmap function which > correctly accounts for remapped addresses and sets vm_pg_prot > appropriately. > > Signed-off-by: Laura Abbott <lauraa@codeaurora.org> I thought there was still some update needed to this series but it turns out that patch 2/3 was what I was expecting already, so I merged the first two patches (with minor changes for s/arm64_/__/ prefix). Patch 3 seems to be folded onto my patch already. Thanks. -- Catalin ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com>]
* [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping [not found] ` <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com> @ 2014-03-28 10:37 ` Catalin Marinas 0 siblings, 0 replies; 18+ messages in thread From: Catalin Marinas @ 2014-03-28 10:37 UTC (permalink / raw) To: linux-arm-kernel On Fri, Mar 28, 2014 at 09:45:08AM +0000, Ritesh Harjani wrote: > On Sat, Mar 15, 2014 at 1:22 AM, Laura Abbott <lauraa@codeaurora.org> wrote: > > +static int __dma_common_mmap(struct device *dev, struct vm_area_struct *vma, > > + void *cpu_addr, dma_addr_t dma_addr, size_t size) > > +{ > > + int ret = -ENXIO; > > + unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >> > > + PAGE_SHIFT; > > + unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; > > + unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT; > > Why not __phys_to_pfn here ?? just asking, I know there is nothing wrong in > this too.. Because dma_addr is a DMA address (as seen by the device from the other side of the bus) rather than a CPU physical address. In many cases they are the same but not always. -- Catalin ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE 2014-03-13 17:45 ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott 2014-03-13 17:52 ` Catalin Marinas 2014-03-14 19:52 ` [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping Laura Abbott @ 2014-03-14 19:52 ` Laura Abbott 2014-03-14 20:24 ` Rob Herring 2014-03-14 19:52 ` [PATCHv2 3/3] arm64: Use arm64 coherent APIs for non-coherent freeing Laura Abbott 3 siblings, 1 reply; 18+ messages in thread From: Laura Abbott @ 2014-03-14 19:52 UTC (permalink / raw) To: linux-arm-kernel DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot appropriately for non coherent opperations. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> --- arch/arm64/mm/dma-mapping.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index 0cdd2f6..608c343 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -29,6 +29,17 @@ struct dma_map_ops *dma_ops; EXPORT_SYMBOL(dma_ops); + +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot, + bool coherent) +{ + if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs)) + return pgprot_writecombine(prot) + else if (!coherent) + return pgprot_dmacoherent(prot); + return prot; +} + static void *arm64_swiotlb_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t flags, struct dma_attrs *attrs) @@ -72,7 +83,7 @@ static void *arm64_swiotlb_alloc_noncoherent(struct device *dev, size_t size, for (i = 0; i < (size >> PAGE_SHIFT); i++) map[i] = page + i; coherent_ptr = vmap(map, size >> PAGE_SHIFT, VM_MAP, - pgprot_dmacoherent(pgprot_default)); + __get_dma_pgprot(attrs, pgprot_default, false)); kfree(map); if (!coherent_ptr) goto no_map; @@ -223,7 +234,7 @@ static int arm64_swiotlb_mmap_noncoherent(struct device *dev, void *cpu_addr, dma_addr_t dma_addr, size_t size, struct dma_attrs *attrs) { - vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot); + vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot, false); return __dma_common_mmap(dev, vma, cpu_addr, dma_addr, size); } -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE 2014-03-14 19:52 ` [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott @ 2014-03-14 20:24 ` Rob Herring 2014-03-14 23:07 ` Catalin Marinas 0 siblings, 1 reply; 18+ messages in thread From: Rob Herring @ 2014-03-14 20:24 UTC (permalink / raw) To: linux-arm-kernel On Fri, Mar 14, 2014 at 2:52 PM, Laura Abbott <lauraa@codeaurora.org> wrote: > DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot > appropriately for non coherent opperations. > > Signed-off-by: Laura Abbott <lauraa@codeaurora.org> > --- > arch/arm64/mm/dma-mapping.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > index 0cdd2f6..608c343 100644 > --- a/arch/arm64/mm/dma-mapping.c > +++ b/arch/arm64/mm/dma-mapping.c > @@ -29,6 +29,17 @@ > struct dma_map_ops *dma_ops; > EXPORT_SYMBOL(dma_ops); > > + > +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot, > + bool coherent) > +{ > + if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs)) > + return pgprot_writecombine(prot) Does this compile? Missing semicolon. > + else if (!coherent) > + return pgprot_dmacoherent(prot); > + return prot; Isn't DMA_ATTR_WRITE_COMBINE supposed to be an optimization over plain non-cached? But coherent would be more optimal over just write combine, so we would want to continue to ignore DMA_ATTR_WRITE_COMBINE in the coherent case. Something like this: if (coherent) return prot; else if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs)) return pgprot_writecombine(prot); return pgprot_dmacoherent(prot); But then this function is never used in the coherent case, so I'm confused by Catalin's comment. The original version seems sufficient to me. Rob ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE 2014-03-14 20:24 ` Rob Herring @ 2014-03-14 23:07 ` Catalin Marinas 0 siblings, 0 replies; 18+ messages in thread From: Catalin Marinas @ 2014-03-14 23:07 UTC (permalink / raw) To: linux-arm-kernel On 14 Mar 2014, at 20:24, Rob Herring <robherring2@gmail.com> wrote: > On Fri, Mar 14, 2014 at 2:52 PM, Laura Abbott <lauraa@codeaurora.org> wrote: >> DMA_ATTR_WRITE_COMBINE is currently ignored. Set the pgprot >> appropriately for non coherent opperations. >> >> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> >> --- >> arch/arm64/mm/dma-mapping.c | 15 +++++++++++++-- >> 1 file changed, 13 insertions(+), 2 deletions(-) >> >> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c >> index 0cdd2f6..608c343 100644 >> --- a/arch/arm64/mm/dma-mapping.c >> +++ b/arch/arm64/mm/dma-mapping.c >> @@ -29,6 +29,17 @@ >> struct dma_map_ops *dma_ops; >> EXPORT_SYMBOL(dma_ops); >> >> + >> +static inline pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot, >> + bool coherent) >> +{ >> + if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs)) >> + return pgprot_writecombine(prot) > > Does this compile? Missing semicolon. > >> + else if (!coherent) >> + return pgprot_dmacoherent(prot); >> + return prot; > > Isn't DMA_ATTR_WRITE_COMBINE supposed to be an optimization over plain > non-cached? Since writecombine and dmacoherent are the same, we probably shouldn?t even bother with the check as it doesn?t have any effect. > But coherent would be more optimal over just write > combine, so we would want to continue to ignore DMA_ATTR_WRITE_COMBINE > in the coherent case. Something like this: > > if (coherent) > return prot; > else if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs)) > return pgprot_writecombine(prot); > return pgprot_dmacoherent(prot); I?ve got request from graphics people in the past about using writecombine even though the DMA is fully coherent. The reason is that some (frame)buffers are more efficient to fill with non-cacheable memory than cacheable + write allocate which affects the whole CPU cache. > But then this function is never used in the coherent case, so I'm > confused by Catalin's comment. The original version seems sufficient > to me. I think we need mmap function for both coherent and non-coherent cases since the default dma_mmap_coherent() just uses pgprot_noncached() which gives us Strongly Ordered. So my proposal: if (!coherent) return pgprot_dmacoherent(prot); else if (dma_get_attr(DMA_ATTR_WRITE_COMBINE, attrs)) return pgprot_writecombine(prot); return prot; I could even use dmacoherent instead of writecombine to avoid confusion. Catalin ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCHv2 3/3] arm64: Use arm64 coherent APIs for non-coherent freeing 2014-03-13 17:45 ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott ` (2 preceding siblings ...) 2014-03-14 19:52 ` [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott @ 2014-03-14 19:52 ` Laura Abbott 3 siblings, 0 replies; 18+ messages in thread From: Laura Abbott @ 2014-03-14 19:52 UTC (permalink / raw) To: linux-arm-kernel The noncoherent free function currently unconditionally frees back to the page allocator which is incorrect for CMA pages. call the coherent free function to correctly differentiate between pages. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> --- arch/arm64/mm/dma-mapping.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index 608c343..99ff063 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -104,7 +104,7 @@ static void arm64_swiotlb_free_noncoherent(struct device *dev, size_t size, void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); vunmap(vaddr); - swiotlb_free_coherent(dev, size, swiotlb_addr, dma_handle); + arm64_swiotlb_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); } static dma_addr_t arm64_swiotlb_map_page(struct device *dev, -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 18+ messages in thread
end of thread, other threads:[~2014-03-28 10:37 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-03-05 20:46 Query on patch to be upstream ? Ritesh Harjani 2014-03-11 12:15 ` Catalin Marinas 2014-03-11 18:04 ` Laura Abbott 2014-03-11 18:26 ` Catalin Marinas 2014-03-12 18:20 ` Laura Abbott 2014-03-13 17:45 ` [PATCH 1/2] arm64: Implement custom mmap functions for dma mapping Laura Abbott 2014-03-13 17:49 ` Catalin Marinas 2014-03-14 1:53 ` Laura Abbott 2014-03-13 17:45 ` [PATCH 2/2] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott 2014-03-13 17:52 ` Catalin Marinas 2014-03-14 2:02 ` Laura Abbott 2014-03-14 19:52 ` [PATCHv2 1/3] arm64: Implement custom mmap functions for dma mapping Laura Abbott 2014-03-24 10:33 ` Catalin Marinas [not found] ` <CALk7dXr3cZSkQ6dTUyCjUDStOd6=ghGN9-iO5RQiTfHCciGxLg@mail.gmail.com> 2014-03-28 10:37 ` Catalin Marinas 2014-03-14 19:52 ` [PATCHv2 2/3] arm64: Support DMA_ATTR_WRITE_COMBINE Laura Abbott 2014-03-14 20:24 ` Rob Herring 2014-03-14 23:07 ` Catalin Marinas 2014-03-14 19:52 ` [PATCHv2 3/3] arm64: Use arm64 coherent APIs for non-coherent freeing Laura Abbott
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).