* [PATCH] iommu: Do physical merging in iommu_map_sg()
@ 2018-10-04 15:47 Robin Murphy
From: Robin Murphy @ 2018-10-04 15:47 UTC
To: joro-zLv9SwRftAIdnm+yROfE0A
Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
The original motivation for iommu_map_sg() was to give IOMMU drivers the
chance to map an IOVA-contiguous scatterlist as efficiently as they
could. It turns out that there isn't really much driver-specific
business involved there, so now that the default implementation is
mandatory let's just improve that - the main thing we're after is to use
larger pages wherever possible, and as long as domain->pgsize_bitmap
reflects reality, iommu_map() can already do that in a generic way. All
we need to do is detect physically-contiguous segments and batch them
into a single map operation, since whatever we do here is transparent to
our caller and not bound by any segment-length restrictions on the list
itself.

Speaking of efficiency, there's really very little point in duplicating
the checks that iommu_map() is going to do anyway, so those get cleared
up in the process.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
drivers/iommu/iommu.c | 42 ++++++++++++++++++++++--------------------
1 file changed, 22 insertions(+), 20 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8c15c5980299..8b22e0502349 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1677,33 +1677,35 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		    struct scatterlist *sg, unsigned int nents, int prot)
 {
 	struct scatterlist *s;
-	size_t mapped = 0;
-	unsigned int i, min_pagesz;
+	size_t len = 0, mapped = 0;
+	phys_addr_t start;
+	unsigned int i;
 	int ret;

-	if (unlikely(domain->pgsize_bitmap == 0UL))
-		return 0;
-
-	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
-
 	for_each_sg(sg, s, nents, i) {
-		phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset;
+		phys_addr_t s_phys = sg_phys(s);

-		/*
-		 * We are mapping on IOMMU page boundaries, so offset within
-		 * the page must be 0. However, the IOMMU may support pages
-		 * smaller than PAGE_SIZE, so s->offset may still represent
-		 * an offset of that boundary within the CPU page.
-		 */
-		if (!IS_ALIGNED(s->offset, min_pagesz))
-			goto out_err;
+		if (len && s_phys != start + len) {
+do_map:
+			ret = iommu_map(domain, iova + mapped, start, len, prot);
+			if (ret)
+				goto out_err;

-		ret = iommu_map(domain, iova + mapped, phys, s->length, prot);
-		if (ret)
-			goto out_err;
+			mapped += len;
+			len = 0;
+			if (!s)
+				break;
+		}

-		mapped += s->length;
+		if (len) {
+			len += s->length;
+		} else {
+			len = s->length;
+			start = s_phys;
+		}
 	}
+	if (len)
+		goto do_map;

 	return mapped;

--
2.19.0.dirty
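
As an illustration of the commit message's point about larger pages, here is a minimal userspace sketch of how a generic mapper might choose a page size from a bitmap of supported sizes. The name pick_pgsize and the example bitmap values are assumptions made for this note, not kernel code, and the actual selection inside iommu_map() may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the largest supported page size that
 * fits in 'size' and is compatible with the alignment of 'addr_merge'
 * (the bitwise OR of the IOVA and the physical address). Each set bit
 * in 'pgsize_bitmap' marks one supported page size. Returns 0 if no
 * supported size qualifies.
 */
static uint64_t pick_pgsize(uint64_t pgsize_bitmap, uint64_t addr_merge,
			    uint64_t size)
{
	uint64_t best = 0, pg;

	/* A zero-length request has no meaningful page size. */
	assert(size > 0);

	/* Walk the powers of two upwards; 'pg' wraps to 0 after bit 63. */
	for (pg = 1; pg && pg <= size; pg <<= 1) {
		if (!(pgsize_bitmap & pg))
			continue;	/* not supported by the hardware */
		if (addr_merge & (pg - 1))
			break;		/* misaligned here, so also for larger sizes */
		best = pg;
	}
	return best;
}
```

With only 4K and 2M pages supported, a 1M segment mapped on its own can use nothing larger than 4K pages, whereas two physically adjacent 1M segments merged into one suitably aligned 2M region qualify for a single 2M page. That is the gain the batching in this patch is after.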
* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
@ 2018-10-05 7:19 ` Christoph Hellwig
From: Christoph Hellwig @ 2018-10-05 7:19 UTC
To: Robin Murphy; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On Thu, Oct 04, 2018 at 04:47:37PM +0100, Robin Murphy wrote:
> The original motivation for iommu_map_sg() was to give IOMMU drivers the
> chance to map an IOVA-contiguous scatterlist as efficiently as they
> could. It turns out that there isn't really much driver-specific
> business involved there, so now that the default implementation is
> mandatory let's just improve that - the main thing we're after is to use
> larger pages wherever possible, and as long as domain->pgsize_bitmap
> reflects reality, iommu_map() can already do that in a generic way. All
> we need to do is detect physically-contiguous segments and batch them
> into a single map operation, since whatever we do here is transparent to
> our caller and not bound by any segment-length restrictions on the list
> itself.
>
> Speaking of efficiency, there's really very little point in duplicating
> the checks that iommu_map() is going to do anyway, so those get cleared
> up in the process.
>
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
I like the idea, but I find the goto usage to jump back into the just
terminated loop highly confusing. Would it be that much worse to simply
duplicate the iommu_map call?
* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
@ 2018-10-05 10:57 ` Robin Murphy
From: Robin Murphy @ 2018-10-05 10:57 UTC
To: Christoph Hellwig; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On 05/10/18 08:19, Christoph Hellwig wrote:
> On Thu, Oct 04, 2018 at 04:47:37PM +0100, Robin Murphy wrote:
>> The original motivation for iommu_map_sg() was to give IOMMU drivers the
>> chance to map an IOVA-contiguous scatterlist as efficiently as they
>> could. It turns out that there isn't really much driver-specific
>> business involved there, so now that the default implementation is
>> mandatory let's just improve that - the main thing we're after is to use
>> larger pages wherever possible, and as long as domain->pgsize_bitmap
>> reflects reality, iommu_map() can already do that in a generic way. All
>> we need to do is detect physically-contiguous segments and batch them
>> into a single map operation, since whatever we do here is transparent to
>> our caller and not bound by any segment-length restrictions on the list
>> itself.
>>
>> Speaking of efficiency, there's really very little point in duplicating
>> the checks that iommu_map() is going to do anyway, so those get cleared
>> up in the process.
>>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>
> I like the idea, but I find the goto usage to jump back into the just
> terminated loop highly confusing. Would it be that much worse to simply
> duplicate the iommu_map call?
Yeah, I fiddled around for ages trying to find the cleanest approach,
but really there just doesn't seem to be one - I'd say the worst bit of
that goto is the even-more-subtle need for the explicit break. FWIW the
naive diff below is only actually +2 source lines, but the duplication
does also carry through to the object code (at least for my arm64 GCC7
build). I might have a quick hack around to see if I can do any better
with a do...while loop - of course what I *really* want is nested
function definitions, but that might just be brain damage from doing too
much MATLAB in the past ;)
Robin.
----->8-----
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8b22e0502349..4d43146720e9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1686,15 +1686,12 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		phys_addr_t s_phys = sg_phys(s);

 		if (len && s_phys != start + len) {
-do_map:
 			ret = iommu_map(domain, iova + mapped, start, len, prot);
 			if (ret)
 				goto out_err;

 			mapped += len;
 			len = 0;
-			if (!s)
-				break;
 		}

 		if (len) {
@@ -1704,8 +1701,13 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			start = s_phys;
 		}
 	}
-	if (len)
-		goto do_map;
+	if (len) {
+		ret = iommu_map(domain, iova + mapped, start, len, prot);
+		if (ret)
+			goto out_err;
+
+		mapped += len;
+	}

 	return mapped;

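
For comparison, one possible shape for the loop-restructuring idea mentioned in this message is to run the loop for one extra pass, so that the final run is flushed from the same call site as all the others. The following is a userspace sketch under stated assumptions: struct span, record_map() and map_spans() are hypothetical stand-ins for the scatterlist entry, iommu_map() and iommu_map_sg(), and nothing here is claimed to be what was eventually merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct span {
	uint64_t phys;
	size_t length;
};

/* Number of map operations issued, kept for demonstration only. */
static unsigned int span_map_calls;

/* Hypothetical stand-in for iommu_map(); always succeeds in this sketch. */
static int record_map(uint64_t iova, uint64_t phys, size_t len)
{
	(void)iova;
	(void)phys;
	(void)len;
	span_map_calls++;
	return 0;
}

static size_t map_spans(uint64_t iova, const struct span *spans,
			unsigned int nents)
{
	uint64_t start = 0;
	size_t len = 0, mapped = 0;
	unsigned int i;

	assert(spans || !nents);

	/* The extra pass (i == nents) flushes whatever run is still pending. */
	for (i = 0; i <= nents; i++) {
		bool end = (i == nents);

		/* 'end' is tested first, so spans[nents] is never read. */
		if (len && (end || spans[i].phys != start + len)) {
			if (record_map(iova + mapped, start, len))
				return 0;	/* the patch jumps to out_err here */
			mapped += len;
			len = 0;
		}
		if (end)
			break;

		if (len) {
			len += spans[i].length;	/* extend the current run */
		} else {
			len = spans[i].length;	/* start a new run */
			start = spans[i].phys;
		}
	}
	return mapped;
}
```

This keeps a single map call in the source at the cost of an extra condition in the loop; whether that reads better than the duplicated call is a matter of taste.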
* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
@ 2018-10-07 10:38 ` Christoph Hellwig
From: Christoph Hellwig @ 2018-10-07 10:38 UTC
To: Robin Murphy; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On Fri, Oct 05, 2018 at 11:57:57AM +0100, Robin Murphy wrote:
> Yeah, I fiddled around for ages trying to find the cleanest approach, but
> really there just doesn't seem to be one - I'd say the worst bit of that
> goto is the even-more-subtle need for the explicit break. FWIW the naive
> diff below is only actually +2 source lines, but the duplication does also
> carry through to the object code (at least for my arm64 GCC7 build). I might
> have a quick hack around to see if I can do any better with a do...while
> loop - of course what I *really* want is nested function definitions, but
> that might just be brain damage from doing too much MATLAB in the past ;)
I much prefer this version, even if it leads to slightly larger object
code.
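
For readers who want to experiment with the version preferred here, the merging logic can be modelled in userspace roughly as follows. This is a sketch under stated assumptions: struct seg, mock_map() and map_segs() are hypothetical stand-ins for the scatterlist entry, iommu_map() and iommu_map_sg(), and the error path simply returns 0 instead of taking the out_err path referenced in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatterlist entry: physical address and length. */
struct seg {
	uint64_t phys;
	size_t length;
};

/* Counts map operations so the effect of merging can be observed. */
static unsigned int map_calls;

/* Hypothetical stand-in for iommu_map(); always succeeds in this sketch. */
static int mock_map(uint64_t iova, uint64_t phys, size_t len)
{
	(void)iova;
	(void)phys;
	(void)len;
	map_calls++;
	return 0;
}

static size_t map_segs(uint64_t iova, const struct seg *segs,
		       unsigned int nents)
{
	uint64_t start = 0;
	size_t len = 0, mapped = 0;
	unsigned int i;

	assert(segs || !nents);

	for (i = 0; i < nents; i++) {
		/* A physical discontinuity ends the current run: map it. */
		if (len && segs[i].phys != start + len) {
			if (mock_map(iova + mapped, start, len))
				return 0;
			mapped += len;
			len = 0;
		}

		if (len) {
			len += segs[i].length;	/* extend the current run */
		} else {
			len = segs[i].length;	/* start a new run */
			start = segs[i].phys;
		}
	}

	/* Map the final run; this is the duplicated call discussed above. */
	if (len) {
		if (mock_map(iova + mapped, start, len))
			return 0;
		mapped += len;
	}
	return mapped;
}
```

Feeding it three segments of which the first two are physically adjacent results in two map operations rather than three, while the total mapped length is unchanged.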