* [PATCH 0/3] IOVA allocation improvements for iommu-dma
From: Robin Murphy @ 2017-03-15 13:33 UTC
To: linux-arm-kernel
Hi all,
Here's the first bit of lock contention removal to chew on - feedback
welcome! Note that for the current users of the io-pgtable framework,
this is most likely to simply push more contention onto the io-pgtable
lock, so may not show a great improvement alone. Will and I both have
rough proof-of-concept implementations of lock-free io-pgtable code
which we need to sit down and agree on at some point, hopefully fairly
soon.
I've taken the opportunity to do a bit of cleanup and refactoring
within the series to make the final state of the code nicer, but the
diffstat still turns out surprisingly reasonable in the end - it would
actually be negative but for the new comments!
Magnus, Shimoda-san, the first two patches should be of interest as they
constitute the allocation rework I mentioned a while back[1] - if you
still need to implement that scary workaround, this should make it
simple to hook IPMMU-specific calls into the alloc and free paths, and
let the driver take care of the details internally.
Robin.
[1]:https://lists.linuxfoundation.org/pipermail/iommu/2017-January/020189.html
Robin Murphy (3):
iommu/dma: Convert to address-based allocation
iommu/dma: Clean up MSI IOVA allocation
iommu/dma: Plumb in the per-CPU IOVA caches
drivers/iommu/dma-iommu.c | 176 ++++++++++++++++++++++++----------------------
1 file changed, 90 insertions(+), 86 deletions(-)
--
2.11.0.dirty
* [PATCH 1/3] iommu/dma: Convert to address-based allocation
From: Robin Murphy @ 2017-03-15 13:33 UTC
To: linux-arm-kernel
In preparation for some IOVA allocation improvements, clean up all the
explicit struct iova usage such that all our mapping, unmapping and
cleanup paths deal exclusively with addresses rather than implementation
details. In the process, a few of the things we're touching get renamed
for the sake of internal consistency.
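For example, condensed from the __iommu_dma_map() hunk below, a typical
caller goes from juggling the allocator's struct iova to dealing purely
in dma_addr_t terms:

	/* Before */
	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev), dev);

	if (!iova)
		return DMA_ERROR_CODE;
	dma_addr = iova_dma_addr(iovad, iova);
	if (iommu_map(domain, dma_addr, phys - iova_off, len, prot)) {
		__free_iova(iovad, iova);
		return DMA_ERROR_CODE;
	}
	return dma_addr + iova_off;

	/* After */
	iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);

	if (!iova)
		return DMA_ERROR_CODE;
	if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
		iommu_dma_free_iova(cookie, iova, size);
		return DMA_ERROR_CODE;
	}
	return iova + iova_off;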
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
drivers/iommu/dma-iommu.c | 119 ++++++++++++++++++++++++++--------------------
1 file changed, 67 insertions(+), 52 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 48d36ce59efb..9a9dca1073e6 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -286,12 +286,12 @@ int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
}
}
-static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
- dma_addr_t dma_limit, struct device *dev)
+static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
+ size_t size, dma_addr_t dma_limit, struct device *dev)
{
struct iova_domain *iovad = cookie_iovad(domain);
unsigned long shift = iova_shift(iovad);
- unsigned long length = iova_align(iovad, size) >> shift;
+ unsigned long iova_len = size >> shift;
struct iova *iova = NULL;
if (domain->geometry.force_aperture)
@@ -299,35 +299,42 @@ static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
/* Try to get PCI devices a SAC address */
if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
- iova = alloc_iova(iovad, length, DMA_BIT_MASK(32) >> shift,
+ iova = alloc_iova(iovad, iova_len, DMA_BIT_MASK(32) >> shift,
true);
/*
* Enforce size-alignment to be safe - there could perhaps be an
* attribute to control this per-device, or at least per-domain...
*/
if (!iova)
- iova = alloc_iova(iovad, length, dma_limit >> shift, true);
+ iova = alloc_iova(iovad, iova_len, dma_limit >> shift, true);
- return iova;
+ return (dma_addr_t)iova->pfn_lo << shift;
}
-/* The IOVA allocator knows what we mapped, so just unmap whatever that was */
-static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
+static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
+ dma_addr_t iova, size_t size)
{
- struct iova_domain *iovad = cookie_iovad(domain);
- unsigned long shift = iova_shift(iovad);
- unsigned long pfn = dma_addr >> shift;
- struct iova *iova = find_iova(iovad, pfn);
- size_t size;
+ struct iova_domain *iovad = &cookie->iovad;
+ struct iova *iova_rbnode;
- if (WARN_ON(!iova))
+ iova_rbnode = find_iova(iovad, iova_pfn(iovad, iova));
+ if (WARN_ON(!iova_rbnode))
return;
- size = iova_size(iova) << shift;
- size -= iommu_unmap(domain, pfn << shift, size);
- /* ...and if we can't, then something is horribly, horribly wrong */
- WARN_ON(size > 0);
- __free_iova(iovad, iova);
+ __free_iova(iovad, iova_rbnode);
+}
+
+static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr,
+ size_t size)
+{
+ struct iova_domain *iovad = cookie_iovad(domain);
+ size_t iova_off = iova_offset(iovad, dma_addr);
+
+ dma_addr -= iova_off;
+ size = iova_align(iovad, size + iova_off);
+
+ WARN_ON(iommu_unmap(domain, dma_addr, size) != size);
+ iommu_dma_free_iova(domain->iova_cookie, dma_addr, size);
}
static void __iommu_dma_free_pages(struct page **pages, int count)
@@ -409,7 +416,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count,
void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
dma_addr_t *handle)
{
- __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle);
+ __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle, size);
__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
*handle = DMA_ERROR_CODE;
}
@@ -437,11 +444,11 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
void (*flush_page)(struct device *, const void *, phys_addr_t))
{
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
- struct iova_domain *iovad = cookie_iovad(domain);
- struct iova *iova;
+ struct iommu_dma_cookie *cookie = domain->iova_cookie;
+ struct iova_domain *iovad = &cookie->iovad;
struct page **pages;
struct sg_table sgt;
- dma_addr_t dma_addr;
+ dma_addr_t iova;
unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
*handle = DMA_ERROR_CODE;
@@ -461,11 +468,11 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
if (!pages)
return NULL;
- iova = __alloc_iova(domain, size, dev->coherent_dma_mask, dev);
+ size = iova_align(iovad, size);
+ iova = iommu_dma_alloc_iova(domain, size, dev->coherent_dma_mask, dev);
if (!iova)
goto out_free_pages;
- size = iova_align(iovad, size);
if (sg_alloc_table_from_pages(&sgt, pages, count, 0, size, GFP_KERNEL))
goto out_free_iova;
@@ -481,19 +488,18 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
sg_miter_stop(&miter);
}
- dma_addr = iova_dma_addr(iovad, iova);
- if (iommu_map_sg(domain, dma_addr, sgt.sgl, sgt.orig_nents, prot)
+ if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot)
< size)
goto out_free_sg;
- *handle = dma_addr;
+ *handle = iova;
sg_free_table(&sgt);
return pages;
out_free_sg:
sg_free_table(&sgt);
out_free_iova:
- __free_iova(iovad, iova);
+ iommu_dma_free_iova(cookie, iova, size);
out_free_pages:
__iommu_dma_free_pages(pages, count);
return NULL;
@@ -527,22 +533,22 @@ int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
size_t size, int prot)
{
- dma_addr_t dma_addr;
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
- struct iova_domain *iovad = cookie_iovad(domain);
+ struct iommu_dma_cookie *cookie = domain->iova_cookie;
+ struct iova_domain *iovad = &cookie->iovad;
size_t iova_off = iova_offset(iovad, phys);
- size_t len = iova_align(iovad, size + iova_off);
- struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev), dev);
+ dma_addr_t iova;
+ size = iova_align(iovad, size + iova_off);
+ iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
if (!iova)
return DMA_ERROR_CODE;
- dma_addr = iova_dma_addr(iovad, iova);
- if (iommu_map(domain, dma_addr, phys - iova_off, len, prot)) {
- __free_iova(iovad, iova);
+ if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
+ iommu_dma_free_iova(cookie, iova, size);
return DMA_ERROR_CODE;
}
- return dma_addr + iova_off;
+ return iova + iova_off;
}
dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
@@ -554,7 +560,7 @@ dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size,
enum dma_data_direction dir, unsigned long attrs)
{
- __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle);
+ __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size);
}
/*
@@ -643,10 +649,10 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
int nents, int prot)
{
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
- struct iova_domain *iovad = cookie_iovad(domain);
- struct iova *iova;
+ struct iommu_dma_cookie *cookie = domain->iova_cookie;
+ struct iova_domain *iovad = &cookie->iovad;
struct scatterlist *s, *prev = NULL;
- dma_addr_t dma_addr;
+ dma_addr_t iova;
size_t iova_len = 0;
unsigned long mask = dma_get_seg_boundary(dev);
int i;
@@ -690,7 +696,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
prev = s;
}
- iova = __alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
+ iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
if (!iova)
goto out_restore_sg;
@@ -698,14 +704,13 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
* We'll leave any physical concatenation to the IOMMU driver's
* implementation - it knows better than we do.
*/
- dma_addr = iova_dma_addr(iovad, iova);
- if (iommu_map_sg(domain, dma_addr, sg, nents, prot) < iova_len)
+ if (iommu_map_sg(domain, iova, sg, nents, prot) < iova_len)
goto out_free_iova;
- return __finalise_sg(dev, sg, nents, dma_addr);
+ return __finalise_sg(dev, sg, nents, iova);
out_free_iova:
- __free_iova(iovad, iova);
+ iommu_dma_free_iova(cookie, iova, iova_len);
out_restore_sg:
__invalidate_sg(sg, nents);
return 0;
@@ -714,11 +719,21 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
enum dma_data_direction dir, unsigned long attrs)
{
+ dma_addr_t start, end;
+ struct scatterlist *tmp;
+ int i;
/*
* The scatterlist segments are mapped into a single
* contiguous IOVA allocation, so this is incredibly easy.
*/
- __iommu_dma_unmap(iommu_get_domain_for_dev(dev), sg_dma_address(sg));
+ start = sg_dma_address(sg);
+ for_each_sg(sg_next(sg), tmp, nents - 1, i) {
+ if (sg_dma_len(tmp) == 0)
+ break;
+ sg = tmp;
+ }
+ end = sg_dma_address(sg) + sg_dma_len(sg);
+ __iommu_dma_unmap(iommu_get_domain_for_dev(dev), start, end - start);
}
dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
@@ -731,7 +746,7 @@ dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir, unsigned long attrs)
{
- __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle);
+ __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size);
}
int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
@@ -745,7 +760,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iommu_dma_msi_page *msi_page;
struct iova_domain *iovad = cookie_iovad(domain);
- struct iova *iova;
+ dma_addr_t iova;
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
size_t size = cookie_msi_granule(cookie);
@@ -760,10 +775,10 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
msi_page->phys = msi_addr;
if (iovad) {
- iova = __alloc_iova(domain, size, dma_get_mask(dev), dev);
+ iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
if (!iova)
goto out_free_page;
- msi_page->iova = iova_dma_addr(iovad, iova);
+ msi_page->iova = iova;
} else {
msi_page->iova = cookie->msi_iova;
cookie->msi_iova += size;
@@ -778,7 +793,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
out_free_iova:
if (iovad)
- __free_iova(iovad, iova);
+ iommu_dma_free_iova(cookie, iova, size);
else
cookie->msi_iova -= size;
out_free_page:
--
2.11.0.dirty
* [PATCH 2/3] iommu/dma: Clean up MSI IOVA allocation
From: Robin Murphy @ 2017-03-15 13:33 UTC
To: linux-arm-kernel
Now that allocation is suitably abstracted, our private alloc/free
helpers can drive the trivial MSI cookie allocator directly as well,
which lets us clean up its exposed guts from iommu_dma_map_msi_msg() and
simplify things quite a bit.
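Condensed from the hunks below: the common allocation helpers now handle
the trivial MSI cookie themselves, so iommu_dma_get_msi_page() can simply
go through the normal mapping path:

	/* in iommu_dma_alloc_iova() */
	if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
		cookie->msi_iova += size;
		return cookie->msi_iova - size;
	}

	/* ...which lets iommu_dma_get_msi_page() reduce to */
	iova = __iommu_dma_map(dev, msi_addr, size, prot);
	if (iommu_dma_mapping_error(dev, iova))
		goto out_free_page;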
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
drivers/iommu/dma-iommu.c | 58 ++++++++++++++++++++---------------------------
1 file changed, 25 insertions(+), 33 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9a9dca1073e6..c03e2eb4ebbb 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -61,15 +61,6 @@ static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
return PAGE_SIZE;
}
-static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
-{
- struct iommu_dma_cookie *cookie = domain->iova_cookie;
-
- if (cookie->type == IOMMU_DMA_IOVA_COOKIE)
- return &cookie->iovad;
- return NULL;
-}
-
static struct iommu_dma_cookie *cookie_alloc(enum iommu_dma_cookie_type type)
{
struct iommu_dma_cookie *cookie;
@@ -289,11 +280,19 @@ int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
size_t size, dma_addr_t dma_limit, struct device *dev)
{
- struct iova_domain *iovad = cookie_iovad(domain);
- unsigned long shift = iova_shift(iovad);
- unsigned long iova_len = size >> shift;
+ struct iommu_dma_cookie *cookie = domain->iova_cookie;
+ struct iova_domain *iovad = &cookie->iovad;
+ unsigned long shift, iova_len;
struct iova *iova = NULL;
+ if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
+ cookie->msi_iova += size;
+ return cookie->msi_iova - size;
+ }
+
+ shift = iova_shift(iovad);
+ iova_len = size >> shift;
+
if (domain->geometry.force_aperture)
dma_limit = min(dma_limit, domain->geometry.aperture_end);
@@ -317,6 +316,12 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
struct iova_domain *iovad = &cookie->iovad;
struct iova *iova_rbnode;
+ /* The MSI case is only ever cleaning up its most recent allocation */
+ if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
+ cookie->msi_iova -= size;
+ return;
+ }
+
iova_rbnode = find_iova(iovad, iova_pfn(iovad, iova));
if (WARN_ON(!iova_rbnode))
return;
@@ -327,14 +332,15 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr,
size_t size)
{
- struct iova_domain *iovad = cookie_iovad(domain);
+ struct iommu_dma_cookie *cookie = domain->iova_cookie;
+ struct iova_domain *iovad = &cookie->iovad;
size_t iova_off = iova_offset(iovad, dma_addr);
dma_addr -= iova_off;
size = iova_align(iovad, size + iova_off);
WARN_ON(iommu_unmap(domain, dma_addr, size) != size);
- iommu_dma_free_iova(domain->iova_cookie, dma_addr, size);
+ iommu_dma_free_iova(cookie, dma_addr, size);
}
static void __iommu_dma_free_pages(struct page **pages, int count)
@@ -759,7 +765,6 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
{
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iommu_dma_msi_page *msi_page;
- struct iova_domain *iovad = cookie_iovad(domain);
dma_addr_t iova;
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
size_t size = cookie_msi_granule(cookie);
@@ -773,29 +778,16 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
if (!msi_page)
return NULL;
- msi_page->phys = msi_addr;
- if (iovad) {
- iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
- if (!iova)
- goto out_free_page;
- msi_page->iova = iova;
- } else {
- msi_page->iova = cookie->msi_iova;
- cookie->msi_iova += size;
- }
-
- if (iommu_map(domain, msi_page->iova, msi_addr, size, prot))
- goto out_free_iova;
+ iova = __iommu_dma_map(dev, msi_addr, size, prot);
+ if (iommu_dma_mapping_error(dev, iova))
+ goto out_free_page;
INIT_LIST_HEAD(&msi_page->list);
+ msi_page->phys = msi_addr;
+ msi_page->iova = iova;
list_add(&msi_page->list, &cookie->msi_page_list);
return msi_page;
-out_free_iova:
- if (iovad)
- iommu_dma_free_iova(cookie, iova, size);
- else
- cookie->msi_iova -= size;
out_free_page:
kfree(msi_page);
return NULL;
--
2.11.0.dirty
* [PATCH 3/3] iommu/dma: Plumb in the per-CPU IOVA caches
From: Robin Murphy @ 2017-03-15 13:33 UTC
To: linux-arm-kernel
With IOVA allocation suitably tidied up, we are finally free to opt in
to the per-CPU caching mechanism. The caching alone can provide a modest
improvement over walking the rbtree for weedier systems (iperf3 shows
~10% more ethernet throughput on an ARM Juno r1 constrained to a single
650MHz Cortex-A53), but the real gain will be in sidestepping the rbtree
lock contention which larger ARM-based systems with lots of parallel I/O
are starting to feel the pain of.
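For illustration only (not part of the patch, and assuming the allocator's
IOVA_RANGE_CACHE_MAX_SIZE of 6, i.e. ranges of up to 32 pages are cached),
the rounding added below behaves like this:

	if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))	/* i.e. < 32 pages */
		iova_len = roundup_pow_of_two(iova_len);

	/* e.g. 5 pages -> 8, 13 -> 16, 40 -> left as-is (beyond the cached range) */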
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
drivers/iommu/dma-iommu.c | 39 ++++++++++++++++++---------------------
1 file changed, 18 insertions(+), 21 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index c03e2eb4ebbb..292008de68f0 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -282,8 +282,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
{
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
- unsigned long shift, iova_len;
- struct iova *iova = NULL;
+ unsigned long shift, iova_len, iova = 0;
if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
cookie->msi_iova += size;
@@ -292,41 +291,39 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
shift = iova_shift(iovad);
iova_len = size >> shift;
+ /*
+ * Freeing non-power-of-two-sized allocations back into the IOVA caches
+ * will come back to bite us badly, so we have to waste a bit of space
+ * rounding up anything cacheable to make sure that can't happen. The
+ * order of the unadjusted size will still match upon freeing.
+ */
+ if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
+ iova_len = roundup_pow_of_two(iova_len);
if (domain->geometry.force_aperture)
dma_limit = min(dma_limit, domain->geometry.aperture_end);
/* Try to get PCI devices a SAC address */
if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
- iova = alloc_iova(iovad, iova_len, DMA_BIT_MASK(32) >> shift,
- true);
- /*
- * Enforce size-alignment to be safe - there could perhaps be an
- * attribute to control this per-device, or at least per-domain...
- */
- if (!iova)
- iova = alloc_iova(iovad, iova_len, dma_limit >> shift, true);
+ iova = alloc_iova_fast(iovad, iova_len, DMA_BIT_MASK(32) >> shift);
- return (dma_addr_t)iova->pfn_lo << shift;
+ if (!iova)
+ iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift);
+
+ return (dma_addr_t)iova << shift;
}
static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
dma_addr_t iova, size_t size)
{
struct iova_domain *iovad = &cookie->iovad;
- struct iova *iova_rbnode;
+ unsigned long shift = iova_shift(iovad);
/* The MSI case is only ever cleaning up its most recent allocation */
- if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
+ if (cookie->type == IOMMU_DMA_MSI_COOKIE)
cookie->msi_iova -= size;
- return;
- }
-
- iova_rbnode = find_iova(iovad, iova_pfn(iovad, iova));
- if (WARN_ON(!iova_rbnode))
- return;
-
- __free_iova(iovad, iova_rbnode);
+ else
+ free_iova_fast(iovad, iova >> shift, size >> shift);
}
static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr,
--
2.11.0.dirty
* [PATCH 0/3] IOVA allocation improvements for iommu-dma
From: Sunil Kovvuri @ 2017-03-16 13:18 UTC
To: linux-arm-kernel
On Wed, Mar 15, 2017 at 7:03 PM, Robin Murphy <robin.murphy@arm.com> wrote:
> Hi all,
>
> Here's the first bit of lock contention removal to chew on - feedback
> welcome! Note that for the current users of the io-pgtable framework,
> this is most likely to simply push more contention onto the io-pgtable
> lock, so may not show a great improvement alone. Will and I both have
> rough proof-of-concept implementations of lock-free io-pgtable code
> which we need to sit down and agree on at some point, hopefully fairly
> soon.
Thanks for working on this.
As you said, it's indeed pushing the lock contention down from the IOVA
rbtree lock onto the pgtable lock, but beyond the lock itself, the issue I
see now is with yielding the CPU while waiting for tlb_sync. Below are some
numbers.
I have tweaked '__arm_smmu_tlb_sync' in the SMMUv2 driver, i.e. basically
removed the cpu_relax() and udelay() calls to make it a busy loop, roughly
as sketched below.
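(A rough sketch of the tweak only - names approximate the arm-smmu.c of
this era, and the timeout/error handling of the real function is elided:)

	static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
	{
		void __iomem *gr0_base = ARM_SMMU_GR0(smmu);

		writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC);
		/* Spin flat-out rather than backing off with cpu_relax()/udelay(1) */
		while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS)
		       & sTLBGSTATUS_GSACTIVE)
			;
	}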
Before: 1.1 Gbps
With your patches: 1.45 Gbps
With your patches + busy loop in tlb_sync: 7 Gbps
If we reduce pgtable contention a bit on top of that (iperf threads reduced
to 8 from 16): ~9 Gbps
So it looks like, along with the pgtable lock, some optimization can be done
to the tlb_sync code as well.
Thanks,
Sunil.
* [PATCH 0/3] IOVA allocation improvements for iommu-dma
From: Nate Watterson @ 2017-03-22 17:43 UTC
To: linux-arm-kernel
On 2017-03-15 09:33, Robin Murphy wrote:
> Hi all,
Hi Robin,
>
> Here's the first bit of lock contention removal to chew on - feedback
> welcome! Note that for the current users of the io-pgtable framework,
> this is most likely to simply push more contention onto the io-pgtable
> lock, so may not show a great improvement alone. Will and I both have
> rough proof-of-concept implementations of lock-free io-pgtable code
> which we need to sit down and agree on at some point, hopefully fairly
> soon.
>
> I've taken the opportunity to do a bit of cleanup and refactoring
> within the series to make the final state of the code nicer, but the
> diffstat still turns out surprisingly reasonable in the end - it would
> actually be negative but for the new comments!
>
> Magnus, Shimoda-san, the first two patches should be of interest as they
> constitute the allocation rework I mentioned a while back[1] - if you
> still need to implement that scary workaround, this should make it
> simple to hook IPMMU-specific calls into the alloc and free paths, and
> let the driver take care of the details internally.
I've tested your patches on a QDF2400 platform and generally
see modest improvements in iperf/fio performance. As you
suspected would happen, contention has indeed moved to the
io-pgtable lock. I am looking forward to testing with the
lock-free io-pgtable implementation; however, I suspect that
there will still be contention issues acquiring the (SMMUv3)
cmdq lock on the unmap path.
Reviewed/Tested-by: Nate Watterson <nwatters@codeaurora.org>
>
> Robin.
>
> [1]:https://lists.linuxfoundation.org/pipermail/iommu/2017-January/020189.html
>
> Robin Murphy (3):
> iommu/dma: Convert to address-based allocation
> iommu/dma: Clean up MSI IOVA allocation
> iommu/dma: Plumb in the per-CPU IOVA caches
>
> drivers/iommu/dma-iommu.c | 176 ++++++++++++++++++++++++----------------------
> 1 file changed, 90 insertions(+), 86 deletions(-)
--
Qualcomm Datacenter Technologies, Inc. on behalf of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux
Foundation Collaborative Project.
* [PATCH 0/3] IOVA allocation improvements for iommu-dma
From: Will Deacon @ 2017-03-31 12:40 UTC
To: linux-arm-kernel
Hi Robin, Joerg,
On Wed, Mar 15, 2017 at 01:33:13PM +0000, Robin Murphy wrote:
> Here's the first bit of lock contention removal to chew on - feedback
> welcome! Note that for the current users of the io-pgtable framework,
> this is most likely to simply push more contention onto the io-pgtable
> lock, so may not show a great improvement alone. Will and I both have
> rough proof-of-concept implementations of lock-free io-pgtable code
> which we need to sit down and agree on at some point, hopefully fairly
> soon.
>
> I've taken the opportunity to do a bit of cleanup and refactoring
> within the series to make the final state of the code nicer, but the
> diffstat still turns out surprisingly reasonable in the end - it would
> actually be negative but for the new comments!
>
> Magnus, Shimoda-san, the first two patches should be of interest as they
> constitute the allocation rework I mentioned a while back[1] - if you
> still need to implement that scary workaround, this should make it
> simple to hook IPMMU-specific calls into the alloc and free paths, and
> let the driver take care of the details internally.
What's the plan for merging this? Whilst we still have the iopgtable lock
contention to resolve, this is a good incremental step towards improving
our scalability and sits nicely alongside the SMMUv2 TLB improvements I
just queued.
Thanks,
Will
* [PATCH 0/3] IOVA allocation improvements for iommu-dma
From: Robin Murphy @ 2017-03-31 13:13 UTC
To: linux-arm-kernel
On 31/03/17 13:40, Will Deacon wrote:
> Hi Robin, Joerg,
>
> On Wed, Mar 15, 2017 at 01:33:13PM +0000, Robin Murphy wrote:
>> Here's the first bit of lock contention removal to chew on - feedback
>> welcome! Note that for the current users of the io-pgtable framework,
>> this is most likely to simply push more contention onto the io-pgtable
>> lock, so may not show a great improvement alone. Will and I both have
>> rough proof-of-concept implementations of lock-free io-pgtable code
>> which we need to sit down and agree on at some point, hopefully fairly
>> soon.
>>
>> I've taken the opportunity to do a bit of cleanup and refactoring
>> within the series to make the final state of the code nicer, but the
>> diffstat still turns out surprisingly reasonable in the end - it would
>> actually be negative but for the new comments!
>>
>> Magnus, Shimoda-san, the first two patches should be of interest as they
>> constitute the allocation rework I mentioned a while back[1] - if you
>> still need to implement that scary workaround, this should make it
>> simple to hook IPMMU-specific calls into the alloc and free paths, and
>> let the driver take care of the details internally.
>
> What's the plan for merging this? Whilst we still have the iopgtable lock
> contention to resolve, this is a good incremental step towards improving
> our scalability and sits nicely alongside the SMMUv2 TLB improvements I
> just queued.
I was assuming this would go through the IOMMU tree if there were no
problems and the feedback was good, and that would seem to be the case.
I'm pleasantly surprised that this series doesn't even conflict at all
with my other iommu-dma changes already queued - I've just checked that
it still applies cleanly to the current iommu/next as-is.
Robin.
>
> Thanks,
>
> Will
>
* [PATCH 0/3] IOVA allocation improvements for iommu-dma
From: Joerg Roedel @ 2017-03-31 14:33 UTC
To: linux-arm-kernel
On Fri, Mar 31, 2017 at 02:13:13PM +0100, Robin Murphy wrote:
> I was assuming this would go through the IOMMU tree if there were no
> problems and the feedback was good, and that would seem to be the case.
> I'm pleasantly surprised that this series doesn't even conflict at all
> with my other iommu-dma changes already queued - I've just checked that
> it still applies cleanly to the current iommu/next as-is.
Yes, makes sense. Can you please resend it with the review and
tested-by tags added? I'll queue it then.
Thanks,
Joerg