* [PATCH] iommu: Do physical merging in iommu_map_sg()
@ 2018-10-04 15:47 Robin Murphy
       [not found] ` <1be92cab99ba7fc82cc355bdda239f2ddcb92db0.1538667993.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Robin Murphy @ 2018-10-04 15:47 UTC
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

The original motivation for iommu_map_sg() was to give IOMMU drivers the
chance to map an IOVA-contiguous scatterlist as efficiently as they
could. It turns out that there isn't really much driver-specific
business involved there, so now that the default implementation is
mandatory let's just improve that - the main thing we're after is to use
larger pages wherever possible, and as long as domain->pgsize_bitmap
reflects reality, iommu_map() can already do that in a generic way. All
we need to do is detect physically-contiguous segments and batch them
into a single map operation, since whatever we do here is transparent to
our caller and not bound by any segment-length restrictions on the list
itself.

Speaking of efficiency, there's really very little point in duplicating
the checks that iommu_map() is going to do anyway, so those get cleared
up in the process.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/iommu.c | 42 ++++++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8c15c5980299..8b22e0502349 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1677,33 +1677,35 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		    struct scatterlist *sg, unsigned int nents, int prot)
 {
 	struct scatterlist *s;
-	size_t mapped = 0;
-	unsigned int i, min_pagesz;
+	size_t len = 0, mapped = 0;
+	phys_addr_t start;
+	unsigned int i;
 	int ret;
 
-	if (unlikely(domain->pgsize_bitmap == 0UL))
-		return 0;
-
-	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
-
 	for_each_sg(sg, s, nents, i) {
-		phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset;
+		phys_addr_t s_phys = sg_phys(s);
 
-		/*
-		 * We are mapping on IOMMU page boundaries, so offset within
-		 * the page must be 0. However, the IOMMU may support pages
-		 * smaller than PAGE_SIZE, so s->offset may still represent
-		 * an offset of that boundary within the CPU page.
-		 */
-		if (!IS_ALIGNED(s->offset, min_pagesz))
-			goto out_err;
+		if (len && s_phys != start + len) {
+do_map:
+			ret = iommu_map(domain, iova + mapped, start, len, prot);
+			if (ret)
+				goto out_err;
 
-		ret = iommu_map(domain, iova + mapped, phys, s->length, prot);
-		if (ret)
-			goto out_err;
+			mapped += len;
+			len = 0;
+			if (!s)
+				break;
+		}
 
-		mapped += s->length;
+		if (len) {
+			len += s->length;
+		} else {
+			len = s->length;
+			start = s_phys;
+		}
 	}
+	if (len)
+		goto do_map;
 
 	return mapped;
 
-- 
2.19.0.dirty
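
The commit message relies on iommu_map() picking larger pages on its own
once it is handed a longer physically-contiguous range. As a rough
userspace illustration of that kind of selection (a sketch only, not the
kernel's code): pick_pgsize() is a hypothetical name, the bitmap is
assumed to have bit n set when 2^n-byte pages are supported, and
__builtin_clzll/__builtin_ctzll are GCC/Clang builtins.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: choose the largest supported page size that fits
 * in 'size' and that the alignment of 'addr_merge' (iova | paddr)
 * permits. Bit n of pgsize_bitmap means "2^n-byte pages supported".
 */
static uint64_t pick_pgsize(uint64_t pgsize_bitmap, uint64_t addr_merge,
			    uint64_t size)
{
	uint64_t candidates;
	unsigned int idx;

	assert(size != 0);

	/* Index of the largest power of two not exceeding size. */
	idx = 63 - __builtin_clzll(size);

	/* The alignment of the addresses caps the page size as well. */
	if (addr_merge) {
		unsigned int align_idx = __builtin_ctzll(addr_merge);

		if (align_idx < idx)
			idx = align_idx;
	}

	/* Every size up to 2^idx, restricted to what is supported. */
	candidates = (idx == 63) ? ~0ULL : (1ULL << (idx + 1)) - 1;
	candidates &= pgsize_bitmap;
	assert(candidates != 0);

	/* Take the biggest remaining candidate. */
	return 1ULL << (63 - __builtin_clzll(candidates));
}
```

With 4K, 2M and 1G pages supported, a 2M-aligned 4M range can start with
a 2M page, whereas the same range presented as separate 4K-sized
segments could only ever use 4K pages, which is why merging before
calling iommu_map() matters.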


* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
       [not found] ` <1be92cab99ba7fc82cc355bdda239f2ddcb92db0.1538667993.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-05  7:19   ` Christoph Hellwig
       [not found]     ` <20181005071934.GA9238-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2018-10-05  7:19 UTC
  To: Robin Murphy; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Thu, Oct 04, 2018 at 04:47:37PM +0100, Robin Murphy wrote:
> The original motivation for iommu_map_sg() was to give IOMMU drivers the
> chance to map an IOVA-contiguous scatterlist as efficiently as they
> could. It turns out that there isn't really much driver-specific
> business involved there, so now that the default implementation is
> mandatory let's just improve that - the main thing we're after is to use
> larger pages wherever possible, and as long as domain->pgsize_bitmap
> reflects reality, iommu_map() can already do that in a generic way. All
> we need to do is detect physically-contiguous segments and batch them
> into a single map operation, since whatever we do here is transparent to
> our caller and not bound by any segment-length restrictions on the list
> itself.
> 
> Speaking of efficiency, there's really very little point in duplicating
> the checks that iommu_map() is going to do anyway, so those get cleared
> up in the process.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>

I like the idea, but I find the goto usage to jump back into the just
terminated loop highly confusing.  Would it be that much worse to simply
duplicate the iommu_map call?


* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
       [not found]     ` <20181005071934.GA9238-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2018-10-05 10:57       ` Robin Murphy
       [not found]         ` <13093b94-be59-030d-7176-16dab22bcdce-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Robin Murphy @ 2018-10-05 10:57 UTC
  To: Christoph Hellwig; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On 05/10/18 08:19, Christoph Hellwig wrote:
> On Thu, Oct 04, 2018 at 04:47:37PM +0100, Robin Murphy wrote:
>> The original motivation for iommu_map_sg() was to give IOMMU drivers the
>> chance to map an IOVA-contiguous scatterlist as efficiently as they
>> could. It turns out that there isn't really much driver-specific
>> business involved there, so now that the default implementation is
>> mandatory let's just improve that - the main thing we're after is to use
>> larger pages wherever possible, and as long as domain->pgsize_bitmap
>> reflects reality, iommu_map() can already do that in a generic way. All
>> we need to do is detect physically-contiguous segments and batch them
>> into a single map operation, since whatever we do here is transparent to
>> our caller and not bound by any segment-length restrictions on the list
>> itself.
>>
>> Speaking of efficiency, there's really very little point in duplicating
>> the checks that iommu_map() is going to do anyway, so those get cleared
>> up in the process.
>>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> 
> I like the idea, but I find the goto usage to jump back into the just
> terminated loop highly confusing.  Would it be that much worse to simply
> duplicate the iommu_map call?

Yeah, I fiddled around for ages trying to find the cleanest approach, 
but really there just doesn't seem to be one - I'd say the worst bit of 
that goto is the even-more-subtle need for the explicit break. FWIW the 
naive diff below is only actually +2 source lines, but the duplication 
does also carry through to the object code (at least for my arm64 GCC7 
build). I might have a quick hack around to see if I can do any better 
with a do...while loop - of course what I *really* want is nested 
function definitions, but that might just be brain damage from doing too 
much MATLAB in the past ;)

Robin.

----->8-----
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8b22e0502349..4d43146720e9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1686,15 +1686,12 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
  		phys_addr_t s_phys = sg_phys(s);

  		if (len && s_phys != start + len) {
-do_map:
  			ret = iommu_map(domain, iova + mapped, start, len, prot);
  			if (ret)
  				goto out_err;

  			mapped += len;
  			len = 0;
-			if (!s)
-				break;
  		}

  		if (len) {
@@ -1704,8 +1701,13 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
  			start = s_phys;
  		}
  	}
-	if (len)
-		goto do_map;
+	if (len) {
+		ret = iommu_map(domain, iova + mapped, start, len, prot);
+		if (ret)
+			goto out_err;
+
+		mapped += len;
+	}

  	return mapped;


* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
       [not found]         ` <13093b94-be59-030d-7176-16dab22bcdce-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-07 10:38           ` Christoph Hellwig
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2018-10-07 10:38 UTC
  To: Robin Murphy; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Fri, Oct 05, 2018 at 11:57:57AM +0100, Robin Murphy wrote:
> Yeah, I fiddled around for ages trying to find the cleanest approach, but
> really there just doesn't seem to be one - I'd say the worst bit of that
> goto is the even-more-subtle need for the explicit break. FWIW the naive
> diff below is only actually +2 source lines, but the duplication does also
> carry through to the object code (at least for my arm64 GCC7 build). I might
> have a quick hack around to see if I can do any better with a do...while
> loop - of course what I *really* want is nested function definitions, but
> that might just be brain damage from doing too much MATLAB in the past ;)

I much prefer this version, even if it leads to slightly larger object
code.
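
For readers following the control flow, the shape preferred here (flush
on a discontinuity inside the loop, then once more after it) can be
modelled outside the kernel. This is a minimal sketch under stated
assumptions: struct seg, struct run and merge_segs() are hypothetical
stand-ins for the scatterlist, the arguments of each iommu_map() call
and the loop in iommu_map_sg(), and error handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct seg {
	uint64_t phys;
	size_t length;
};

/* Hypothetical record of one merged run, i.e. one map operation. */
struct run {
	uint64_t start;
	size_t len;
};

/*
 * Merge physically contiguous segments into runs. Returns the number
 * of runs written to 'out'; the caller provides room for 'nents'.
 */
static size_t merge_segs(const struct seg *segs, size_t nents,
			 struct run *out)
{
	uint64_t start = 0;
	size_t len = 0, nruns = 0, i;

	assert(out != NULL);

	for (i = 0; i < nents; i++) {
		/* Discontinuity: flush the run gathered so far. */
		if (len && segs[i].phys != start + len) {
			out[nruns++] = (struct run){ start, len };
			len = 0;
		}

		if (len) {
			len += segs[i].length;
		} else {
			len = segs[i].length;
			start = segs[i].phys;
		}
	}

	/* Flush the final run: the duplicated call after the loop. */
	if (len)
		out[nruns++] = (struct run){ start, len };

	return nruns;
}
```

Three segments of which the first two are adjacent in physical memory
come out as two runs, so the model would issue two map operations
instead of three.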

