* [PATCH 0/2] scatterlist cleanups
From: Dan Williams @ 2015-06-09 16:27 UTC
To: linux-arm-kernel

Hi Jens,
While working through a conversion of the scatterlist data structure
I noticed that some users were open coding scatterlist manipulations.
Christoph recommended sending these for 4.1-rc inclusion.  These are
based on 4.1-rc7.
---
Dan Williams (2):
      scatterlist: use sg_phys()
      scatterlist: cleanup sg_chain() and sg_unmark_end()

 arch/arm/mm/dma-mapping.c                    |    2 +-
 arch/microblaze/kernel/dma.c                 |    2 +-
 block/blk-merge.c                            |    2 +-
 drivers/crypto/omap-sham.c                   |    2 +-
 drivers/dma/imx-dma.c                        |    8 ++------
 drivers/dma/ste_dma40.c                      |    5 +----
 drivers/iommu/intel-iommu.c                  |    4 ++--
 drivers/iommu/iommu.c                        |    2 +-
 drivers/mmc/card/queue.c                     |    4 ++--
 drivers/staging/android/ion/ion_chunk_heap.c |    4 ++--
 include/crypto/scatterwalk.h                 |    9 ++-------
 11 files changed, 16 insertions(+), 28 deletions(-)

* [PATCH 1/2] scatterlist: use sg_phys()
From: Dan Williams @ 2015-06-09 16:27 UTC
To: linux-arm-kernel

Coccinelle cleanup to replace open-coded sg-to-physical-address
translations.  This is in preparation for introducing scatterlists
that reference __pfn_t.

// sg_phys.cocci: convert usage page_to_phys(sg_page(sg)) to sg_phys(sg)
// usage: make coccicheck COCCI=sg_phys.cocci MODE=patch

virtual patch
virtual report
virtual org

@@
struct scatterlist *sg;
@@

- page_to_phys(sg_page(sg)) + sg->offset
+ sg_phys(sg)

@@
struct scatterlist *sg;
@@

- page_to_phys(sg_page(sg))
+ sg_phys(sg) - sg->offset

Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/arm/mm/dma-mapping.c                    |    2 +-
 arch/microblaze/kernel/dma.c                 |    2 +-
 drivers/iommu/intel-iommu.c                  |    4 ++--
 drivers/iommu/iommu.c                        |    2 +-
 drivers/staging/android/ion/ion_chunk_heap.c |    4 ++--
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 7e7583ddd607..9f6ff6671f01 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1502,7 +1502,7 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
 		return -ENOMEM;
 
 	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
-		phys_addr_t phys = page_to_phys(sg_page(s));
+		phys_addr_t phys = sg_phys(s) - s->offset;
 		unsigned int len = PAGE_ALIGN(s->offset + s->length);
 
 		if (!is_coherent &&
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index ed7ba8a11822..dcb3c594d626 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -61,7 +61,7 @@ static int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl,
 	/* FIXME this part of code is untested */
 	for_each_sg(sgl, sg, nents, i) {
 		sg->dma_address = sg_phys(sg);
-		__dma_sync(page_to_phys(sg_page(sg)) + sg->offset,
+		__dma_sync(sg_phys(sg),
 							sg->length, direction);
 	}
 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 68d43beccb7e..9b9ada71e0d3 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1998,7 +1998,7 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 			sg_res = aligned_nrpages(sg->offset, sg->length);
 			sg->dma_address = ((dma_addr_t)iov_pfn << VTD_PAGE_SHIFT) + sg->offset;
 			sg->dma_length = sg->length;
-			pteval = page_to_phys(sg_page(sg)) | prot;
+			pteval = (sg_phys(sg) - sg->offset) | prot;
 			phys_pfn = pteval >> VTD_PAGE_SHIFT;
 		}
 
@@ -3302,7 +3302,7 @@ static int intel_nontranslate_map_sg(struct device *hddev,
 
 	for_each_sg(sglist, sg, nelems, i) {
 		BUG_ON(!sg_page(sg));
-		sg->dma_address = page_to_phys(sg_page(sg)) + sg->offset;
+		sg->dma_address = sg_phys(sg);
 		sg->dma_length = sg->length;
 	}
 	return nelems;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d4f527e56679..59808fc9110d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1147,7 +1147,7 @@ size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 	min_pagesz = 1 << __ffs(domain->ops->pgsize_bitmap);
 
 	for_each_sg(sg, s, nents, i) {
-		phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset;
+		phys_addr_t phys = sg_phys(s);
 
 		/*
 		 * We are mapping on IOMMU page boundaries, so offset within
diff --git a/drivers/staging/android/ion/ion_chunk_heap.c b/drivers/staging/android/ion/ion_chunk_heap.c
index 3e6ec2ee6802..b7da5d142aa9 100644
--- a/drivers/staging/android/ion/ion_chunk_heap.c
+++ b/drivers/staging/android/ion/ion_chunk_heap.c
@@ -81,7 +81,7 @@ static int ion_chunk_heap_allocate(struct ion_heap *heap,
 err:
 	sg = table->sgl;
 	for (i -= 1; i >= 0; i--) {
-		gen_pool_free(chunk_heap->pool, page_to_phys(sg_page(sg)),
+		gen_pool_free(chunk_heap->pool, sg_phys(sg) - sg->offset,
 			      sg->length);
 		sg = sg_next(sg);
 	}
@@ -109,7 +109,7 @@ static void ion_chunk_heap_free(struct ion_buffer *buffer)
 							DMA_BIDIRECTIONAL);
 
 	for_each_sg(table->sgl, sg, table->nents, i) {
-		gen_pool_free(chunk_heap->pool, page_to_phys(sg_page(sg)),
+		gen_pool_free(chunk_heap->pool, sg_phys(sg) - sg->offset,
 			      sg->length);
 	}
 	chunk_heap->allocated -= allocated_size;
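
(For context: sg_phys() is a one-line helper in include/linux/scatterlist.h;
in the v4.1 timeframe it reads approximately as below, which is why the two
Coccinelle rules above are exact rewrites of the open-coded forms.)

static inline dma_addr_t sg_phys(struct scatterlist *sg)
{
	/* physical address of the data this entry describes */
	return page_to_phys(sg_page(sg)) + sg->offset;
}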

* [PATCH 1/2] scatterlist: use sg_phys()
From: Elliott, Robert (Server Storage) @ 2015-06-10  0:34 UTC
To: linux-arm-kernel

> -----Original Message-----
> From: linux-kernel-owner at vger.kernel.org [mailto:linux-kernel-
> owner at vger.kernel.org] On Behalf Of Dan Williams
> Sent: Tuesday, June 09, 2015 10:27 AM
> Subject: [PATCH 1/2] scatterlist: use sg_phys()
> ...
> diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
> index ed7ba8a11822..dcb3c594d626 100644
> --- a/arch/microblaze/kernel/dma.c
> +++ b/arch/microblaze/kernel/dma.c
> @@ -61,7 +61,7 @@ static int dma_direct_map_sg(struct device *dev, struct
> scatterlist *sgl,
>  	/* FIXME this part of code is untested */
>  	for_each_sg(sgl, sg, nents, i) {
>  		sg->dma_address = sg_phys(sg);
> -		__dma_sync(page_to_phys(sg_page(sg)) + sg->offset,
> +		__dma_sync(sg_phys(sg),
>  							sg->length, direction);
>  	}

That one ends up with weird indentation.

* [PATCH 1/2] scatterlist: use sg_phys()
From: Joerg Roedel @ 2015-06-10  9:32 UTC
To: linux-arm-kernel

On Tue, Jun 09, 2015 at 12:27:10PM -0400, Dan Williams wrote:
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 7e7583ddd607..9f6ff6671f01 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -1502,7 +1502,7 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
>  		return -ENOMEM;
>
>  	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
> -		phys_addr_t phys = page_to_phys(sg_page(s));
> +		phys_addr_t phys = sg_phys(s) - s->offset;

So sg_phys() turns out to be 'page_to_phys(sg_page(s)) + s->offset',
which turns the statement above into:

	page_to_phys(sg_page(s)) + s->offset - s->offset;

The compiler will probably optimize that away, but it still doesn't
look like an improvement.

>  		unsigned int len = PAGE_ALIGN(s->offset + s->length);
>
>  		if (!is_coherent &&
> diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
> index ed7ba8a11822..dcb3c594d626 100644
> --- a/arch/microblaze/kernel/dma.c
> +++ b/arch/microblaze/kernel/dma.c
> @@ -61,7 +61,7 @@ static int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl,
>  	/* FIXME this part of code is untested */
>  	for_each_sg(sgl, sg, nents, i) {
>  		sg->dma_address = sg_phys(sg);
> -		__dma_sync(page_to_phys(sg_page(sg)) + sg->offset,
> +		__dma_sync(sg_phys(sg),
>  							sg->length, direction);

Here the replacement makes sense, but the indentation is weird.  Could
all be moved to one line, I guess.

>  	}
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 68d43beccb7e..9b9ada71e0d3 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1998,7 +1998,7 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
>  			sg_res = aligned_nrpages(sg->offset, sg->length);
>  			sg->dma_address = ((dma_addr_t)iov_pfn << VTD_PAGE_SHIFT) + sg->offset;
>  			sg->dma_length = sg->length;
> -			pteval = page_to_phys(sg_page(sg)) | prot;
> +			pteval = (sg_phys(sg) - sg->offset) | prot;

Here it doesn't make sense either.  In general, please remove the cases
where you have to subtract sg->offset after the conversion.


	Joerg

* [PATCH 1/2] scatterlist: use sg_phys()
From: Dan Williams @ 2015-06-10 16:00 UTC
To: linux-arm-kernel

On Wed, Jun 10, 2015 at 2:32 AM, Joerg Roedel <joro@8bytes.org> wrote:
> On Tue, Jun 09, 2015 at 12:27:10PM -0400, Dan Williams wrote:
>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>> index 7e7583ddd607..9f6ff6671f01 100644
>> --- a/arch/arm/mm/dma-mapping.c
>> +++ b/arch/arm/mm/dma-mapping.c
>> @@ -1502,7 +1502,7 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
>>  		return -ENOMEM;
>>
>>  	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
>> -		phys_addr_t phys = page_to_phys(sg_page(s));
>> +		phys_addr_t phys = sg_phys(s) - s->offset;
>
> So sg_phys() turns out to be 'page_to_phys(sg_page(s)) + s->offset',
> which turns the statement above into:
>
> 	page_to_phys(sg_page(s)) + s->offset - s->offset;
>
> The compiler will probably optimize that away, but it still doesn't
> look like an improvement.

The goal is to eventually stop leaking struct page deep into the i/o
stack.  Anything that relies on being able to retrieve a struct page
out of an sg entry needs to be converted.  I think we need a new
helper for this case, "sg_phys_aligned()"?
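
(For illustration, the helper Dan is floating here is hypothetical; it was
never merged under this name.  One possible shape would simply hide the
subtraction the converted call sites do by hand:)

static inline phys_addr_t sg_phys_aligned(struct scatterlist *sg)
{
	/* page-aligned physical address, i.e. page_to_phys(sg_page(sg)) */
	return sg_phys(sg) - sg->offset;
}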

* [PATCH 1/2] scatterlist: use sg_phys()
From: Russell King - ARM Linux @ 2015-06-10 16:31 UTC
To: linux-arm-kernel

On Wed, Jun 10, 2015 at 09:00:31AM -0700, Dan Williams wrote:
> On Wed, Jun 10, 2015 at 2:32 AM, Joerg Roedel <joro@8bytes.org> wrote:
> > On Tue, Jun 09, 2015 at 12:27:10PM -0400, Dan Williams wrote:
> >> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> >> index 7e7583ddd607..9f6ff6671f01 100644
> >> --- a/arch/arm/mm/dma-mapping.c
> >> +++ b/arch/arm/mm/dma-mapping.c
> >> @@ -1502,7 +1502,7 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
> >>  		return -ENOMEM;
> >>
> >>  	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
> >> -		phys_addr_t phys = page_to_phys(sg_page(s));
> >> +		phys_addr_t phys = sg_phys(s) - s->offset;
> >
> > So sg_phys() turns out to be 'page_to_phys(sg_page(s)) + s->offset',
> > which turns the statement above into:
> >
> > 	page_to_phys(sg_page(s)) + s->offset - s->offset;
> >
> > The compiler will probably optimize that away, but it still doesn't
> > look like an improvement.
>
> The goal is to eventually stop leaking struct page deep into the i/o
> stack.  Anything that relies on being able to retrieve a struct page
> out of an sg entry needs to be converted.  I think we need a new
> helper for this case, "sg_phys_aligned()"?

Why?  The aim of the code is not to detect whether the address is
aligned to a page (if it were, it'd be testing for a zero s->offset,
or it would be testing for an s->offset being a multiple of the page
size).

Let's first understand the code that's being modified (which seems to
be something which isn't done very much today - people seem to just
like changing code for the hell of it.)

	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
		phys_addr_t phys = page_to_phys(sg_page(s));
		unsigned int len = PAGE_ALIGN(s->offset + s->length);

		if (!is_coherent &&
		    !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
			__dma_page_cpu_to_dev(sg_page(s), s->offset,
					      s->length, dir);

		prot = __dma_direction_to_prot(dir);

		ret = iommu_map(mapping->domain, iova, phys, len, prot);
		if (ret < 0)
			goto fail;
		count += len >> PAGE_SHIFT;
		iova += len;
	}

What it's doing is trying to map each scatterlist entry via an IOMMU.
Unsurprisingly, IOMMUs are page based - you can't map a partial IOMMU
page.

However, what says that the IOMMU page size is the same as the host CPU
page size - it may not be... so the above code is wrong for a
completely different reason.

What we _should_ be doing is finding out what the IOMMU minimum page
size is, and using that in conjunction with the sg_phys() of the
scatterlist entry to determine the start and length of the mapping
such that the IOMMU mapping covers the range described by the
scatterlist entry.  (iommu_map() takes arguments which must be aligned
to the IOMMU minimum page size.)

However, what we can also see from the above is that we have other code
here using sg_page() - which is a necessity to be able to perform the
required DMA cache maintenance to ensure that the data is visible to
the DMA device.  We need to kmap_atomic() these in order to flush them,
and there's no other way other than struct page to represent a highmem
page.

So, I think your intent to want to remove struct page from the I/O
subsystem is a false hope, unless you want to end up rewriting lots of
architecture code, and you can come up with an alternative method to
represent highmem pages.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
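
(A rough sketch of what Russell is suggesting for the ARM loop above; the
min_pagesz idiom is the one default_iommu_map_sg() already uses in patch 1,
the other identifiers mirror __map_sg_chunk(), and the arithmetic here is
illustrative, not tested:)

	/* honour the IOMMU's minimum page size rather than assuming
	 * it equals the CPU's PAGE_SIZE */
	unsigned long min_pagesz =
		1UL << __ffs(mapping->domain->ops->pgsize_bitmap);
	phys_addr_t phys = sg_phys(s);
	phys_addr_t start = round_down(phys, min_pagesz);
	size_t len = round_up(phys + s->length, min_pagesz) - start;

	ret = iommu_map(mapping->domain, iova, start, len, prot);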

* [PATCH 1/2] scatterlist: use sg_phys()
From: Dan Williams @ 2015-06-10 16:57 UTC
To: linux-arm-kernel

On Wed, Jun 10, 2015 at 9:31 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> Why?  The aim of the code is not to detect whether the address is
> aligned to a page (if it were, it'd be testing for a zero s->offset,
> or it would be testing for an s->offset being a multiple of the page
> size).
>
> Let's first understand the code that's being modified (which seems to
> be something which isn't done very much today - people seem to just
> like changing code for the hell of it.)
>
> 	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
> 		phys_addr_t phys = page_to_phys(sg_page(s));
> 		unsigned int len = PAGE_ALIGN(s->offset + s->length);
>
> 		if (!is_coherent &&
> 		    !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
> 			__dma_page_cpu_to_dev(sg_page(s), s->offset,
> 					      s->length, dir);
>
> 		prot = __dma_direction_to_prot(dir);
>
> 		ret = iommu_map(mapping->domain, iova, phys, len, prot);
> 		if (ret < 0)
> 			goto fail;
> 		count += len >> PAGE_SHIFT;
> 		iova += len;
> 	}
>
> What it's doing is trying to map each scatterlist entry via an IOMMU.
> Unsurprisingly, IOMMUs are page based - you can't map a partial IOMMU
> page.
>
> However, what says that the IOMMU page size is the same as the host
> CPU page size - it may not be... so the above code is wrong for a
> completely different reason.
>
> What we _should_ be doing is finding out what the IOMMU minimum page
> size is, and using that in conjunction with the sg_phys() of the
> scatterlist entry to determine the start and length of the mapping
> such that the IOMMU mapping covers the range described by the
> scatterlist entry.  (iommu_map() takes arguments which must be aligned
> to the IOMMU minimum page size.)
>
> However, what we can also see from the above is that we have other
> code here using sg_page() - which is a necessity to be able to perform
> the required DMA cache maintenance to ensure that the data is visible
> to the DMA device.  We need to kmap_atomic() these in order to flush
> them, and there's no other way other than struct page to represent a
> highmem page.
>
> So, I think your intent to want to remove struct page from the I/O
> subsystem is a false hope, unless you want to end up rewriting lots of
> architecture code, and you can come up with an alternative method to
> represent highmem pages.

I think there will always be cases that need to do pfn_to_page() for
buffer management.  Those configurations will be blocked from seeing
cases where a pfn is not struct page backed.  So, you can have highmem
or dma to pmem, but not both.  Christoph, this is why I have Kconfig
symbols (DEV_PFN etc) to gate whether an arch/config supports pfn-only
i/o.

* [PATCH 1/2] scatterlist: use sg_phys()
From: Russell King - ARM Linux @ 2015-06-10 17:13 UTC
To: linux-arm-kernel

On Wed, Jun 10, 2015 at 09:57:06AM -0700, Dan Williams wrote:
> On Wed, Jun 10, 2015 at 9:31 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > Why?  The aim of the code is not to detect whether the address is
> > aligned to a page (if it were, it'd be testing for a zero s->offset,
> > or it would be testing for an s->offset being a multiple of the page
> > size).
> >
> > Let's first understand the code that's being modified (which seems
> > to be something which isn't done very much today - people seem to
> > just like changing code for the hell of it.)
> >
> > 	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
> > 		phys_addr_t phys = page_to_phys(sg_page(s));
> > 		unsigned int len = PAGE_ALIGN(s->offset + s->length);
> >
> > 		if (!is_coherent &&
> > 		    !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
> > 			__dma_page_cpu_to_dev(sg_page(s), s->offset,
> > 					      s->length, dir);
> >
> > 		prot = __dma_direction_to_prot(dir);
> >
> > 		ret = iommu_map(mapping->domain, iova, phys, len, prot);
> > 		if (ret < 0)
> > 			goto fail;
> > 		count += len >> PAGE_SHIFT;
> > 		iova += len;
> > 	}
> >
> > What it's doing is trying to map each scatterlist entry via an
> > IOMMU.  Unsurprisingly, IOMMUs are page based - you can't map a
> > partial IOMMU page.
> >
> > However, what says that the IOMMU page size is the same as the host
> > CPU page size - it may not be... so the above code is wrong for a
> > completely different reason.
> >
> > What we _should_ be doing is finding out what the IOMMU minimum page
> > size is, and using that in conjunction with the sg_phys() of the
> > scatterlist entry to determine the start and length of the mapping
> > such that the IOMMU mapping covers the range described by the
> > scatterlist entry.  (iommu_map() takes arguments which must be
> > aligned to the IOMMU minimum page size.)
> >
> > However, what we can also see from the above is that we have other
> > code here using sg_page() - which is a necessity to be able to
> > perform the required DMA cache maintenance to ensure that the data
> > is visible to the DMA device.  We need to kmap_atomic() these in
> > order to flush them, and there's no other way other than struct page
> > to represent a highmem page.
> >
> > So, I think your intent to want to remove struct page from the I/O
> > subsystem is a false hope, unless you want to end up rewriting lots
> > of architecture code, and you can come up with an alternative method
> > to represent highmem pages.
>
> I think there will always be cases that need to do pfn_to_page() for
> buffer management.  Those configurations will be blocked from seeing
> cases where a pfn is not struct page backed.  So, you can have highmem
> or dma to pmem, but not both.  Christoph, this is why I have Kconfig
> symbols (DEV_PFN etc) to gate whether an arch/config supports pfn-only
> i/o.

Hmm, pmem... yea, in the SolidRun community, we've basically decided to
stick with my updated Marvell BMM layer rather than switch to pmem.  I
forget the reasons why, but the decision was made after looking at what
pmem was doing...

In any case, let's not get bogged down in a peripheral issue.

What I'm objecting to is that the patches I've seen seem to be just
churn without any net benefit.

You can't simply make sg_page() return NULL after this change, because
you've done nothing with the remaining sg_page() callers to allow them
to gracefully handle that case.

What I'd like to see is a much fuller series of patches which show the
whole progression towards your end goal rather than a piecemeal
approach.  Right now, it's not clear that there is any benefit to this
round of changes.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

* [PATCH 1/2] scatterlist: use sg_phys()
From: Dan Williams @ 2015-06-10 17:25 UTC
To: linux-arm-kernel

On Wed, Jun 10, 2015 at 10:13 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Jun 10, 2015 at 09:57:06AM -0700, Dan Williams wrote:
>> On Wed, Jun 10, 2015 at 9:31 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > Why?  The aim of the code is not to detect whether the address is
>> > aligned to a page (if it were, it'd be testing for a zero s->offset,
>> > or it would be testing for an s->offset being a multiple of the
>> > page size).
>> >
>> > Let's first understand the code that's being modified (which seems
>> > to be something which isn't done very much today - people seem to
>> > just like changing code for the hell of it.)
>> >
>> > 	for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
>> > 		phys_addr_t phys = page_to_phys(sg_page(s));
>> > 		unsigned int len = PAGE_ALIGN(s->offset + s->length);
>> >
>> > 		if (!is_coherent &&
>> > 		    !dma_get_attr(DMA_ATTR_SKIP_CPU_SYNC, attrs))
>> > 			__dma_page_cpu_to_dev(sg_page(s), s->offset,
>> > 					      s->length, dir);
>> >
>> > 		prot = __dma_direction_to_prot(dir);
>> >
>> > 		ret = iommu_map(mapping->domain, iova, phys, len, prot);
>> > 		if (ret < 0)
>> > 			goto fail;
>> > 		count += len >> PAGE_SHIFT;
>> > 		iova += len;
>> > 	}
>> >
>> > What it's doing is trying to map each scatterlist entry via an
>> > IOMMU.  Unsurprisingly, IOMMUs are page based - you can't map a
>> > partial IOMMU page.
>> >
>> > However, what says that the IOMMU page size is the same as the host
>> > CPU page size - it may not be... so the above code is wrong for a
>> > completely different reason.
>> >
>> > What we _should_ be doing is finding out what the IOMMU minimum
>> > page size is, and using that in conjunction with the sg_phys() of
>> > the scatterlist entry to determine the start and length of the
>> > mapping such that the IOMMU mapping covers the range described by
>> > the scatterlist entry.  (iommu_map() takes arguments which must be
>> > aligned to the IOMMU minimum page size.)
>> >
>> > However, what we can also see from the above is that we have other
>> > code here using sg_page() - which is a necessity to be able to
>> > perform the required DMA cache maintenance to ensure that the data
>> > is visible to the DMA device.  We need to kmap_atomic() these in
>> > order to flush them, and there's no other way other than struct
>> > page to represent a highmem page.
>> >
>> > So, I think your intent to want to remove struct page from the I/O
>> > subsystem is a false hope, unless you want to end up rewriting lots
>> > of architecture code, and you can come up with an alternative
>> > method to represent highmem pages.
>>
>> I think there will always be cases that need to do pfn_to_page() for
>> buffer management.  Those configurations will be blocked from seeing
>> cases where a pfn is not struct page backed.  So, you can have
>> highmem or dma to pmem, but not both.  Christoph, this is why I have
>> Kconfig symbols (DEV_PFN etc) to gate whether an arch/config supports
>> pfn-only i/o.
>
> Hmm, pmem... yea, in the SolidRun community, we've basically decided
> to stick with my updated Marvell BMM layer rather than switch to pmem.
> I forget the reasons why, but the decision was made after looking at
> what pmem was doing...

I'd of course be open to exploring if drivers/nvdimm/ could be made
more generally useful.

> In any case, let's not get bogged down in a peripheral issue.
>
> What I'm objecting to is that the patches I've seen seem to be just
> churn without any net benefit.
>
> You can't simply make sg_page() return NULL after this change, because
> you've done nothing with the remaining sg_page() callers to allow them
> to gracefully handle that case.
>
> What I'd like to see is a much fuller series of patches which show the
> whole progression towards your end goal rather than a piecemeal
> approach.  Right now, it's not clear that there is any benefit to this
> round of changes.

Fair enough.  I had them as part of a larger series [1].  Christoph
suggested that I break out the pure cleanups separately.  I'll add you
to the next rev of that series.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-June/001094.html

* [PATCH 1/2] scatterlist: use sg_phys()
From: Joerg Roedel @ 2015-06-11  6:50 UTC
To: linux-arm-kernel

On Wed, Jun 10, 2015 at 09:00:31AM -0700, Dan Williams wrote:
> > So sg_phys() turns out to be 'page_to_phys(sg_page(s)) + s->offset',
> > which turns the statement above into:
> >
> > 	page_to_phys(sg_page(s)) + s->offset - s->offset;
> >
> > The compiler will probably optimize that away, but it still doesn't
> > look like an improvement.
>
> The goal is to eventually stop leaking struct page deep into the i/o
> stack.  Anything that relies on being able to retrieve a struct page
> out of an sg entry needs to be converted.  I think we need a new
> helper for this case, "sg_phys_aligned()"?

You still have a reference to a struct page, because sg_phys() calls
sg_page() too.  If you want to get rid of sg_page(), something like
sg_pfn() might be a more workable solution than sg_phys_(page_)aligned().

But maybe I am just missing the bigger scope of this, so I agree with
Russell that it is better to see a patch series which shows the
direction you want to go with this.


	Joerg
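
(For concreteness, the sg_pfn() idea might look like the sketch below;
the helper is hypothetical and did not exist at the time.  Note that, as
Joerg says, this formulation still reaches struct page via sg_phys(); a
truly struct-page-free scatterlist would need to carry the pfn directly.)

static inline unsigned long sg_pfn(struct scatterlist *sg)
{
	/* pfn backing this entry; today still derived via sg_page() */
	return sg_phys(sg) >> PAGE_SHIFT;
}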

* [PATCH 2/2] scatterlist: cleanup sg_chain() and sg_unmark_end()
From: Dan Williams @ 2015-06-09 16:27 UTC
To: linux-arm-kernel

Replace open coded sg_chain() and sg_unmark_end() instances with the
aforementioned helpers.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 block/blk-merge.c            |    2 +-
 drivers/crypto/omap-sham.c   |    2 +-
 drivers/dma/imx-dma.c        |    8 ++------
 drivers/dma/ste_dma40.c      |    5 +----
 drivers/mmc/card/queue.c     |    4 ++--
 include/crypto/scatterwalk.h |    9 ++-------
 6 files changed, 9 insertions(+), 21 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index fd3fee81c23c..7bb4f260ca4b 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -266,7 +266,7 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 		if (rq->cmd_flags & REQ_WRITE)
 			memset(q->dma_drain_buffer, 0, q->dma_drain_size);
 
-		sg->page_link &= ~0x02;
+		sg_unmark_end(sg);
 		sg = sg_next(sg);
 		sg_set_page(sg, virt_to_page(q->dma_drain_buffer),
 			    q->dma_drain_size,
diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index 4d63e0d4da9a..df8b23e19b90 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -582,7 +582,7 @@ static int omap_sham_xmit_dma(struct omap_sham_dev *dd, dma_addr_t dma_addr,
 	 * the dmaengine may try to DMA the incorrect amount of data.
 	 */
 	sg_init_table(&ctx->sgl, 1);
-	ctx->sgl.page_link = ctx->sg->page_link;
+	sg_assign_page(&ctx->sgl, sg_page(ctx->sg));
 	ctx->sgl.offset = ctx->sg->offset;
 	sg_dma_len(&ctx->sgl) = len32;
 	sg_dma_address(&ctx->sgl) = sg_dma_address(ctx->sg);
diff --git a/drivers/dma/imx-dma.c b/drivers/dma/imx-dma.c
index eed405976ea9..081fbfc87f6b 100644
--- a/drivers/dma/imx-dma.c
+++ b/drivers/dma/imx-dma.c
@@ -886,18 +886,14 @@ static struct dma_async_tx_descriptor *imxdma_prep_dma_cyclic(
 	sg_init_table(imxdmac->sg_list, periods);
 
 	for (i = 0; i < periods; i++) {
-		imxdmac->sg_list[i].page_link = 0;
-		imxdmac->sg_list[i].offset = 0;
+		sg_set_page(&imxdmac->sg_list[i], NULL, period_len, 0);
 		imxdmac->sg_list[i].dma_address = dma_addr;
 		sg_dma_len(&imxdmac->sg_list[i]) = period_len;
 		dma_addr += period_len;
 	}
 
 	/* close the loop */
-	imxdmac->sg_list[periods].offset = 0;
-	sg_dma_len(&imxdmac->sg_list[periods]) = 0;
-	imxdmac->sg_list[periods].page_link =
-		((unsigned long)imxdmac->sg_list | 0x01) & ~0x02;
+	sg_chain(imxdmac->sg_list, periods + 1, imxdmac->sg_list);
 
 	desc->type = IMXDMA_DESC_CYCLIC;
 	desc->sg = imxdmac->sg_list;
diff --git a/drivers/dma/ste_dma40.c b/drivers/dma/ste_dma40.c
index 3c10f034d4b9..e8c00642cacb 100644
--- a/drivers/dma/ste_dma40.c
+++ b/drivers/dma/ste_dma40.c
@@ -2562,10 +2562,7 @@ dma40_prep_dma_cyclic(struct dma_chan *chan, dma_addr_t dma_addr,
 		dma_addr += period_len;
 	}
 
-	sg[periods].offset = 0;
-	sg_dma_len(&sg[periods]) = 0;
-	sg[periods].page_link =
-		((unsigned long)sg | 0x01) & ~0x02;
+	sg_chain(sg, periods + 1, sg);
 
 	txd = d40_prep_sg(chan, sg, sg, periods, direction,
 			  DMA_PREP_INTERRUPT);
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 8efa3684aef8..fa525ee20470 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -469,7 +469,7 @@ static unsigned int mmc_queue_packed_map_sg(struct mmc_queue *mq,
 			sg_set_buf(__sg, buf + offset, len);
 			offset += len;
 			remain -= len;
-			(__sg++)->page_link &= ~0x02;
+			sg_unmark_end(__sg++);
 			sg_len++;
 		} while (remain);
 	}
@@ -477,7 +477,7 @@ static unsigned int mmc_queue_packed_map_sg(struct mmc_queue *mq,
 	list_for_each_entry(req, &packed->list, queuelist) {
 		sg_len += blk_rq_map_sg(mq->queue, req, __sg);
 		__sg = sg + (sg_len - 1);
-		(__sg++)->page_link &= ~0x02;
+		sg_unmark_end(__sg++);
 	}
 	sg_mark_end(sg + (sg_len - 1));
 	return sg_len;
diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h
index 20e4226a2e14..4529889b0f07 100644
--- a/include/crypto/scatterwalk.h
+++ b/include/crypto/scatterwalk.h
@@ -25,13 +25,8 @@
 #include <linux/scatterlist.h>
 #include <linux/sched.h>
 
-static inline void scatterwalk_sg_chain(struct scatterlist *sg1, int num,
-					struct scatterlist *sg2)
-{
-	sg_set_page(&sg1[num - 1], (void *)sg2, 0, 0);
-	sg1[num - 1].page_link &= ~0x02;
-	sg1[num - 1].page_link |= 0x01;
-}
+#define scatterwalk_sg_chain(prv, num, sgl)	sg_chain(prv, num, sgl)
+#define scatterwalk_sg_next(sgl)		sg_next(sgl)
 
 static inline void scatterwalk_crypto_chain(struct scatterlist *head,
 					    struct scatterlist *sg,
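
(For reference, the two helpers this patch standardizes on already live in
include/linux/scatterlist.h; modulo config guards they read approximately
as below, which is exactly the 0x01/0x02 bit-twiddling the open-coded sites
above were doing by hand.)

static inline void sg_unmark_end(struct scatterlist *sg)
{
	sg->page_link &= ~0x02;		/* clear the end-of-list marker bit */
}

static inline void sg_chain(struct scatterlist *prv, unsigned int prv_nents,
			    struct scatterlist *sgl)
{
	/* offset and length are unused for a chain entry; clear them */
	prv[prv_nents - 1].offset = 0;
	prv[prv_nents - 1].length = 0;
	/* low bit marks a chain link; also clear any stale end marker */
	prv[prv_nents - 1].page_link = ((unsigned long) sgl | 0x01) & ~0x02;
}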

* [PATCH 2/2] scatterlist: cleanup sg_chain() and sg_unmark_end()
From: Herbert Xu @ 2015-06-10  5:38 UTC
To: linux-arm-kernel

On Tue, Jun 09, 2015 at 12:27:15PM -0400, Dan Williams wrote:
>
> +#define scatterwalk_sg_chain(prv, num, sgl)	sg_chain(prv, num, sgl)
> +#define scatterwalk_sg_next(sgl)		sg_next(sgl)

Why are you reintroducing the scatterwalk_sg_next macro that we just
removed?

I also don't understand why this is so urgent that it has to go into
mainline at this late date.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* [PATCH 2/2] scatterlist: cleanup sg_chain() and sg_unmark_end()
From: Christoph Hellwig @ 2015-06-11  7:31 UTC
To: linux-arm-kernel

On Wed, Jun 10, 2015 at 01:38:04PM +0800, Herbert Xu wrote:
> On Tue, Jun 09, 2015 at 12:27:15PM -0400, Dan Williams wrote:
> >
> > +#define scatterwalk_sg_chain(prv, num, sgl)	sg_chain(prv, num, sgl)
> > +#define scatterwalk_sg_next(sgl)		sg_next(sgl)
>
> Why are you reintroducing the scatterwalk_sg_next macro that we just
> removed?
>
> I also don't understand why this is so urgent that it has to go into
> mainline at this late date.

It allows getting a cleaner slate for the next merge window, which
seems useful on its own.  The re-addition of scatterwalk_sg_next seems
unnecessary, but getting rid of the open-coded sg_chain is useful.

Maybe you can take it through the crypto tree and also kill off the
scatterwalk_sg_chain name as well while we're at it?

* [PATCH 2/2] scatterlist: cleanup sg_chain() and sg_unmark_end()
From: Herbert Xu @ 2015-06-11  7:34 UTC
To: linux-arm-kernel

On Thu, Jun 11, 2015 at 09:31:07AM +0200, Christoph Hellwig wrote:
>
> It allows getting a cleaner slate for the next merge window, which
> seems useful on its own.  The re-addition of scatterwalk_sg_next seems
> unnecessary, but getting rid of the open-coded sg_chain is useful.

Sure I can take the crypto bits.

> Maybe you can take it through the crypto tree and also kill off the
> scatterwalk_sg_chain name as well while we're at it?

Sure I'm happy to take such a patch.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt