* [PATCH] iommu: Do physical merging in iommu_map_sg()
@ 2018-10-04 15:47 Robin Murphy
[not found] ` <1be92cab99ba7fc82cc355bdda239f2ddcb92db0.1538667993.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Robin Murphy @ 2018-10-04 15:47 UTC (permalink / raw)
To: joro-zLv9SwRftAIdnm+yROfE0A
Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

The original motivation for iommu_map_sg() was to give IOMMU drivers the
chance to map an IOVA-contiguous scatterlist as efficiently as they
could. It turns out that there isn't really much driver-specific
business involved there, so now that the default implementation is
mandatory let's just improve that - the main thing we're after is to use
larger pages wherever possible, and as long as domain->pgsize_bitmap
reflects reality, iommu_map() can already do that in a generic way. All
we need to do is detect physically-contiguous segments and batch them
into a single map operation, since whatever we do here is transparent to
our caller and not bound by any segment-length restrictions on the list
itself.

Speaking of efficiency, there's really very little point in duplicating
the checks that iommu_map() is going to do anyway, so those get cleared
up in the process.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
drivers/iommu/iommu.c | 42 ++++++++++++++++++++++--------------------
1 file changed, 22 insertions(+), 20 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8c15c5980299..8b22e0502349 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1677,33 +1677,35 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		    struct scatterlist *sg, unsigned int nents, int prot)
 {
 	struct scatterlist *s;
-	size_t mapped = 0;
-	unsigned int i, min_pagesz;
+	size_t len = 0, mapped = 0;
+	phys_addr_t start;
+	unsigned int i;
 	int ret;

-	if (unlikely(domain->pgsize_bitmap == 0UL))
-		return 0;
-
-	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
-
 	for_each_sg(sg, s, nents, i) {
-		phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset;
+		phys_addr_t s_phys = sg_phys(s);

-		/*
-		 * We are mapping on IOMMU page boundaries, so offset within
-		 * the page must be 0. However, the IOMMU may support pages
-		 * smaller than PAGE_SIZE, so s->offset may still represent
-		 * an offset of that boundary within the CPU page.
-		 */
-		if (!IS_ALIGNED(s->offset, min_pagesz))
-			goto out_err;
+		if (len && s_phys != start + len) {
+do_map:
+			ret = iommu_map(domain, iova + mapped, start, len, prot);
+			if (ret)
+				goto out_err;

-		ret = iommu_map(domain, iova + mapped, phys, s->length, prot);
-		if (ret)
-			goto out_err;
+			mapped += len;
+			len = 0;
+			if (!s)
+				break;
+		}

-		mapped += s->length;
+		if (len) {
+			len += s->length;
+		} else {
+			len = s->length;
+			start = s_phys;
+		}
 	}
+	if (len)
+		goto do_map;

 	return mapped;

--
2.19.0.dirty
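
For readers following along outside the kernel tree, the batching idea from
the commit message can be modelled in plain userspace C. This is only a
sketch under stated assumptions: struct seg, struct run and merge_segments()
are hypothetical names that appear nowhere in the patch, and recording a run
stands in for the iommu_map() call. It flushes at two sites (inside the loop
and once after it) and so uses no goto.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatterlist entry: physical address + length. */
struct seg {
	uint64_t phys;
	size_t length;
};

/* One physically contiguous run, i.e. one would-be iommu_map() call. */
struct run {
	uint64_t phys;
	size_t length;
};

/*
 * Coalesce physically adjacent segments into runs. Returns the number of
 * runs written to out[], which must have room for nents entries.
 */
static size_t merge_segments(const struct seg *segs, size_t nents,
			     struct run *out)
{
	size_t nruns = 0, len = 0, i;
	uint64_t start = 0;

	for (i = 0; i < nents; i++) {
		if (len && segs[i].phys != start + len) {
			/* Discontiguity: flush the run collected so far. */
			out[nruns].phys = start;
			out[nruns].length = len;
			nruns++;
			len = 0;
		}

		if (len) {
			/* Segment continues the current run. */
			len += segs[i].length;
		} else {
			/* Segment starts a new run. */
			len = segs[i].length;
			start = segs[i].phys;
		}
	}

	if (len) {
		/* Flush whatever is left after the last segment. */
		out[nruns].phys = start;
		out[nruns].length = len;
		nruns++;
	}

	return nruns;
}
```

Given four segments where the first two and the last two are physically
adjacent, merge_segments() reports two runs, so the modelled mapping layer
would see two larger requests rather than four small ones.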
^ permalink raw reply related [flat|nested] 4+ messages in thread
[parent not found: <1be92cab99ba7fc82cc355bdda239f2ddcb92db0.1538667993.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
[not found] ` <1be92cab99ba7fc82cc355bdda239f2ddcb92db0.1538667993.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-05 7:19 ` Christoph Hellwig
[not found] ` <20181005071934.GA9238-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2018-10-05 7:19 UTC (permalink / raw)
To: Robin Murphy; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Thu, Oct 04, 2018 at 04:47:37PM +0100, Robin Murphy wrote:
> The original motivation for iommu_map_sg() was to give IOMMU drivers the
> chance to map an IOVA-contiguous scatterlist as efficiently as they
> could. It turns out that there isn't really much driver-specific
> business involved there, so now that the default implementation is
> mandatory let's just improve that - the main thing we're after is to use
> larger pages wherever possible, and as long as domain->pgsize_bitmap
> reflects reality, iommu_map() can already do that in a generic way. All
> we need to do is detect physically-contiguous segments and batch them
> into a single map operation, since whatever we do here is transparent to
> our caller and not bound by any segment-length restrictions on the list
> itself.
>
> Speaking of efficiency, there's really very little point in duplicating
> the checks that iommu_map() is going to do anyway, so those get cleared
> up in the process.
>
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>

I like the idea, but I find the goto usage to jump back into the just
terminated loop highly confusing. Would it be that much worse to simply
duplicate the iommu_map call?

^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <20181005071934.GA9238-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>]
* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
[not found] ` <20181005071934.GA9238-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2018-10-05 10:57 ` Robin Murphy
[not found] ` <13093b94-be59-030d-7176-16dab22bcdce-5wv7dgnIgG8@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Robin Murphy @ 2018-10-05 10:57 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On 05/10/18 08:19, Christoph Hellwig wrote:
> On Thu, Oct 04, 2018 at 04:47:37PM +0100, Robin Murphy wrote:
>> The original motivation for iommu_map_sg() was to give IOMMU drivers the
>> chance to map an IOVA-contiguous scatterlist as efficiently as they
>> could. It turns out that there isn't really much driver-specific
>> business involved there, so now that the default implementation is
>> mandatory let's just improve that - the main thing we're after is to use
>> larger pages wherever possible, and as long as domain->pgsize_bitmap
>> reflects reality, iommu_map() can already do that in a generic way. All
>> we need to do is detect physically-contiguous segments and batch them
>> into a single map operation, since whatever we do here is transparent to
>> our caller and not bound by any segment-length restrictions on the list
>> itself.
>>
>> Speaking of efficiency, there's really very little point in duplicating
>> the checks that iommu_map() is going to do anyway, so those get cleared
>> up in the process.
>>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>
> I like the idea, but I find the goto usage to jump back into the just
> terminated loop highly confusing. Would it be that much worse to simply
> duplicate the iommu_map call?

Yeah, I fiddled around for ages trying to find the cleanest approach, but
really there just doesn't seem to be one - I'd say the worst bit of that
goto is the even-more-subtle need for the explicit break. FWIW the naive
diff below is only actually +2 source lines, but the duplication does also
carry through to the object code (at least for my arm64 GCC7 build). I might
have a quick hack around to see if I can do any better with a do...while
loop - of course what I *really* want is nested function definitions, but
that might just be brain damage from doing too much MATLAB in the past ;)

Robin.

----->8-----
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8b22e0502349..4d43146720e9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1686,15 +1686,12 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		phys_addr_t s_phys = sg_phys(s);

 		if (len && s_phys != start + len) {
-do_map:
 			ret = iommu_map(domain, iova + mapped, start, len, prot);
 			if (ret)
 				goto out_err;

 			mapped += len;
 			len = 0;
-			if (!s)
-				break;
 		}

 		if (len) {
@@ -1704,8 +1701,13 @@ size_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			start = s_phys;
 		}
 	}
-	if (len)
-		goto do_map;
+	if (len) {
+		ret = iommu_map(domain, iova + mapped, start, len, prot);
+		if (ret)
+			goto out_err;
+
+		mapped += len;
+	}

 	return mapped;

^ permalink raw reply related [flat|nested] 4+ messages in thread
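
As an illustration of the kind of loop restructuring mentioned above, one
possible shape lets the loop run one extra pass whose only job is to flush
the final run, so the flush is written once and no goto is needed. This is a
userspace sketch, not code posted in this thread: struct seg, struct run and
merge_segments_single_flush() are hypothetical names, and recording a run
stands in for the iommu_map() call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatterlist entry: physical address + length. */
struct seg {
	uint64_t phys;
	size_t length;
};

/* One physically contiguous run, i.e. one would-be iommu_map() call. */
struct run {
	uint64_t phys;
	size_t length;
};

/*
 * Same merging as the two-site version, but with a single flush site.
 * out[] must have room for nents entries. Returns the number of runs.
 */
static size_t merge_segments_single_flush(const struct seg *segs,
					   size_t nents, struct run *out)
{
	size_t nruns = 0, len = 0, i;
	uint64_t start = 0;

	/* The pass with i == nents exists only to flush the last run. */
	for (i = 0; i <= nents; i++) {
		int at_end = (i == nents);

		/* at_end is tested first, so segs[nents] is never read. */
		if (len && (at_end || segs[i].phys != start + len)) {
			out[nruns].phys = start;
			out[nruns].length = len;
			nruns++;
			len = 0;
		}

		if (!at_end) {
			if (len) {
				len += segs[i].length;
			} else {
				len = segs[i].length;
				start = segs[i].phys;
			}
		}
	}

	return nruns;
}
```

The trade-off is an extra condition evaluated on every pass in exchange for
a single flush site; whether that reads better than duplicating the call is
a matter of taste.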
[parent not found: <13093b94-be59-030d-7176-16dab22bcdce-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH] iommu: Do physical merging in iommu_map_sg()
[not found] ` <13093b94-be59-030d-7176-16dab22bcdce-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-07 10:38 ` Christoph Hellwig
0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2018-10-07 10:38 UTC (permalink / raw)
To: Robin Murphy; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Fri, Oct 05, 2018 at 11:57:57AM +0100, Robin Murphy wrote:
> Yeah, I fiddled around for ages trying to find the cleanest approach, but
> really there just doesn't seem to be one - I'd say the worst bit of that
> goto is the even-more-subtle need for the explicit break. FWIW the naive
> diff below is only actually +2 source lines, but the duplication does also
> carry through to the object code (at least for my arm64 GCC7 build). I might
> have a quick hack around to see if I can do any better with a do...while
> loop - of course what I *really* want is nested function definitions, but
> that might just be brain damage from doing too much MATLAB in the past ;)

I much prefer this version, even if it leads to slightly larger object
code.

^ permalink raw reply [flat|nested] 4+ messages in thread