* [PATCH v2 1/2] iommu/dma: Stop getting dma_32bit_pfn wrong
@ 2017-01-16 13:24 Robin Murphy
From: Robin Murphy @ 2017-01-16 13:24 UTC
To: linux-arm-kernel
iommu_dma_init_domain() was originally written under the misconception
that dma_32bit_pfn represented some sort of size limit for IOVA domains.
Since the truth is almost the exact opposite of that, rework the logic
and comments to reflect its real purpose of optimising lookups when
allocating from a subset of the available 64-bit space.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
Sending this as a v2 since both patches have been seen before, and #1 is
ever so slightly tweaked. #2 applies on top of Eric's MSI series, since
that seems ready to go now - there is a trivial merge conflict otherwise
around the extra argument in the __alloc_iova() call.
Robin.
drivers/iommu/dma-iommu.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index de41ead6542a..9aa432e8fbd3 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -203,6 +203,7 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
 	unsigned long order, base_pfn, end_pfn;
+	bool pci = dev && dev_is_pci(dev);
 
 	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
 		return -EINVAL;
@@ -225,19 +226,31 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 		end_pfn = min_t(unsigned long, end_pfn,
 				domain->geometry.aperture_end >> order);
 	}
+	/*
+	 * PCI devices may have larger DMA masks, but still prefer allocating
+	 * within a 32-bit mask to avoid DAC addressing. Such limitations don't
+	 * apply to the typical platform device, so for those we may as well
+	 * leave the cache limit at the top of their range to save an rb_last()
+	 * traversal on every allocation.
+	 */
+	if (pci)
+		end_pfn &= DMA_BIT_MASK(32) >> order;
 
-	/* All we can safely do with an existing domain is enlarge it */
+	/* start_pfn is always nonzero for an already-initialised domain */
 	if (iovad->start_pfn) {
 		if (1UL << order != iovad->granule ||
-		    base_pfn != iovad->start_pfn ||
-		    end_pfn < iovad->dma_32bit_pfn) {
+		    base_pfn != iovad->start_pfn) {
 			pr_warn("Incompatible range for DMA domain\n");
 			return -EFAULT;
 		}
-		iovad->dma_32bit_pfn = end_pfn;
+		/*
+		 * If we have devices with different DMA masks, move the free
+		 * area cache limit down for the benefit of the smaller one.
+		 */
+		iovad->dma_32bit_pfn = min(end_pfn, iovad->dma_32bit_pfn);
 	} else {
 		init_iova_domain(iovad, 1UL << order, base_pfn, end_pfn);
-		if (dev && dev_is_pci(dev))
+		if (pci)
 			iova_reserve_pci_windows(to_pci_dev(dev), iovad);
 	}
 	return 0;
--
2.10.2.dirty
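
For readers following the arithmetic in the patch above, here is a minimal userspace sketch of how the cache limit ends up being chosen. It is not kernel code: TOY_DMA_BIT_MASK, toy_cache_limit_pfn() and toy_update_cache_limit() are hypothetical names, and the sketch assumes the caller passes the last usable address directly rather than deriving it from base, size and the domain aperture as iommu_dma_init_domain() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's DMA_BIT_MASK(n); hypothetical name. */
#define TOY_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Cache limit for a freshly initialised domain: the top of the device's
 * range, pulled down to the 32-bit boundary for PCI devices.
 */
static uint64_t toy_cache_limit_pfn(uint64_t dma_limit, unsigned int order,
				    bool is_pci)
{
	uint64_t end_pfn;

	assert(order < 32);
	end_pfn = dma_limit >> order;
	if (is_pci)
		end_pfn &= TOY_DMA_BIT_MASK(32) >> order;
	return end_pfn;
}

/*
 * Already-initialised domain: the limit only ever moves down, for the
 * benefit of the device with the smaller mask.
 */
static uint64_t toy_update_cache_limit(uint64_t current_pfn, uint64_t end_pfn)
{
	return end_pfn < current_pfn ? end_pfn : current_pfn;
}
```

With 4K pages (order 12) and a 48-bit limit, the PCI case yields 0xFFFFF and the platform-device case keeps the full 0xFFFFFFFFF, which is the distinction the new comment in the patch describes.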
* [PATCH v2 2/2] iommu/dma: Implement PCI allocation optimisation
@ 2017-01-16 13:24 ` Robin Murphy
From: Robin Murphy @ 2017-01-16 13:24 UTC
To: linux-arm-kernel
Whilst PCI devices may have 64-bit DMA masks, they still benefit from
using 32-bit addresses wherever possible in order to avoid DAC (PCI) or
longer address packets (PCIe), which may incur a performance overhead.
Implement the same optimisation as other allocators by trying to get a
32-bit address first, only falling back to the full mask if that fails.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
drivers/iommu/dma-iommu.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9aa432e8fbd3..8adff5f83b38 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -281,19 +281,28 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
 }
 
 static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
-		dma_addr_t dma_limit)
+		dma_addr_t dma_limit, struct device *dev)
 {
 	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long shift = iova_shift(iovad);
 	unsigned long length = iova_align(iovad, size) >> shift;
+	struct iova *iova = NULL;
 
 	if (domain->geometry.force_aperture)
 		dma_limit = min(dma_limit, domain->geometry.aperture_end);
+
+	/* Try to get PCI devices a SAC address */
+	if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
+		iova = alloc_iova(iovad, length, DMA_BIT_MASK(32) >> shift,
+				  true);
 	/*
 	 * Enforce size-alignment to be safe - there could perhaps be an
 	 * attribute to control this per-device, or at least per-domain...
 	 */
-	return alloc_iova(iovad, length, dma_limit >> shift, true);
+	if (!iova)
+		iova = alloc_iova(iovad, length, dma_limit >> shift, true);
+
+	return iova;
 }
 
 /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
@@ -446,7 +455,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 	if (!pages)
 		return NULL;
 
-	iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
+	iova = __alloc_iova(domain, size, dev->coherent_dma_mask, dev);
 	if (!iova)
 		goto out_free_pages;
 
@@ -517,7 +526,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 	struct iova_domain *iovad = cookie_iovad(domain);
 	size_t iova_off = iova_offset(iovad, phys);
 	size_t len = iova_align(iovad, size + iova_off);
-	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev));
+	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev), dev);
 
 	if (!iova)
 		return DMA_ERROR_CODE;
@@ -675,7 +684,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		prev = s;
 	}
 
-	iova = __alloc_iova(domain, iova_len, dma_get_mask(dev));
+	iova = __alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
 	if (!iova)
 		goto out_restore_sg;
 
@@ -755,7 +764,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 
 	msi_page->phys = msi_addr;
 	if (iovad) {
-		iova = __alloc_iova(domain, size, dma_get_mask(dev));
+		iova = __alloc_iova(domain, size, dma_get_mask(dev), dev);
 		if (!iova)
 			goto out_free_page;
 		msi_page->iova = iova_dma_addr(iovad, iova);
--
2.10.2.dirty
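
The try-low-then-fall-back shape of the new __alloc_iova() can be illustrated with a toy userspace model. Everything here is an assumption made for illustration: struct toy_iovad, toy_alloc() and toy_alloc_prefer_sac() are hypothetical names, the address space is shrunk to 16 page frames, TOY_SAC_LIMIT stands in for DMA_BIT_MASK(32) >> shift, and the toy allocator simply hands out the highest free frame at or below the given limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SPACE_PFNS	16u	/* whole toy IOVA space: pfns 0..15 */
#define TOY_SAC_LIMIT	7u	/* stand-in for DMA_BIT_MASK(32) >> shift */

struct toy_iovad {
	bool used[TOY_SPACE_PFNS];
};

/* Take the highest free pfn at or below limit_pfn; -1 if none is free. */
static int toy_alloc(struct toy_iovad *d, unsigned int limit_pfn)
{
	int pfn;

	assert(d != NULL);
	if (limit_pfn >= TOY_SPACE_PFNS)
		limit_pfn = TOY_SPACE_PFNS - 1;
	for (pfn = (int)limit_pfn; pfn >= 0; pfn--) {
		if (!d->used[pfn]) {
			d->used[pfn] = true;
			return pfn;
		}
	}
	return -1;
}

/*
 * Mirror of the patch's logic: a PCI device whose limit exceeds the SAC
 * boundary first tries below that boundary, then falls back to its full
 * limit only if the low attempt fails.
 */
static int toy_alloc_prefer_sac(struct toy_iovad *d,
				unsigned int dma_limit_pfn, bool is_pci)
{
	int pfn = -1;

	if (dma_limit_pfn > TOY_SAC_LIMIT && is_pci)
		pfn = toy_alloc(d, TOY_SAC_LIMIT);
	if (pfn < 0)
		pfn = toy_alloc(d, dma_limit_pfn);
	return pfn;
}
```

In this model a PCI device with a limit of 15 receives frames 7 down to 0 before it is ever given frame 15, whereas a non-PCI device with the same limit gets frame 15 straight away.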
* [PATCH v2 2/2] iommu/dma: Implement PCI allocation optimisation
@ 2017-01-23 17:39 ` Will Deacon
From: Will Deacon @ 2017-01-23 17:39 UTC
To: linux-arm-kernel
On Mon, Jan 16, 2017 at 01:24:55PM +0000, Robin Murphy wrote:
> Whilst PCI devices may have 64-bit DMA masks, they still benefit from
> using 32-bit addresses wherever possible in order to avoid DAC (PCI) or
> longer address packets (PCIe), which may incur a performance overhead.
> Implement the same optimisation as other allocators by trying to get a
> 32-bit address first, only falling back to the full mask if that fails.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
> drivers/iommu/dma-iommu.c | 21 +++++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
Looks good to me, and has the added benefit of getting PCI ethernet
working on Juno when using the SMMU. Unintended side-effect, but:
Acked-by: Will Deacon <will.deacon@arm.com>
Will
* [PATCH v2 1/2] iommu/dma: Stop getting dma_32bit_pfn wrong
@ 2017-01-23 17:40 ` Will Deacon
From: Will Deacon @ 2017-01-23 17:40 UTC
To: linux-arm-kernel
On Mon, Jan 16, 2017 at 01:24:54PM +0000, Robin Murphy wrote:
> iommu_dma_init_domain() was originally written under the misconception
> that dma_32bit_pfn represented some sort of size limit for IOVA domains.
> Since the truth is almost the exact opposite of that, rework the logic
> and comments to reflect its real purpose of optimising lookups when
> allocating from a subset of the available 64-bit space.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>
> Sending this as a v2 since both patches have been seen before, and #1 is
> ever so slightly tweaked. #2 applies on top of Eric's MSI series, since
> that seems ready to go now - there is a trivial merge conflict otherwise
> around the extra argument in the __alloc_iova() call.
>
> Robin.
>
> drivers/iommu/dma-iommu.c | 23 ++++++++++++++++++-----
> 1 file changed, 18 insertions(+), 5 deletions(-)
Tested-by: Will Deacon <will.deacon@arm.com>
Will
* [PATCH v2 1/2] iommu/dma: Stop getting dma_32bit_pfn wrong
@ 2017-01-30 15:16 ` Joerg Roedel
From: Joerg Roedel @ 2017-01-30 15:16 UTC
To: linux-arm-kernel
On Mon, Jan 16, 2017 at 01:24:54PM +0000, Robin Murphy wrote:
> iommu_dma_init_domain() was originally written under the misconception
> that dma_32bit_pfn represented some sort of size limit for IOVA domains.
> Since the truth is almost the exact opposite of that, rework the logic
> and comments to reflect its real purpose of optimising lookups when
> allocating from a subset of the available 64-bit space.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>
> Sending this as a v2 since both patches have been seen before, and #1 is
> ever so slightly tweaked. #2 applies on top of Eric's MSI series, since
> that seems ready to go now - there is a trivial merge conflict otherwise
> around the extra argument in the __alloc_iova() call.
>
> Robin.
>
> drivers/iommu/dma-iommu.c | 23 ++++++++++++++++++-----
> 1 file changed, 18 insertions(+), 5 deletions(-)
Applied both, thanks.