From: Robin Murphy <robin.murphy@arm.com>
To: Joerg Roedel <joro@8bytes.org>, iommu@lists.linux-foundation.org
Cc: Vincent.Wan@amd.com, Joerg Roedel <jroedel@suse.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 16/20] iommu/amd: Optimize map_sg and unmap_sg
Date: Tue, 12 Jul 2016 12:33:43 +0100 [thread overview]
Message-ID: <5784D597.4010703@arm.com> (raw)
In-Reply-To: <1467978311-28322-17-git-send-email-joro@8bytes.org>
On 08/07/16 12:45, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
>
> Optimize these functions so that they need only one call
> into the address allocator. This also saves a couple of
> IO-TLB flushes in the unmap_sg path.
>
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
> drivers/iommu/amd_iommu.c | 77 ++++++++++++++++++++++++++++++++---------------
> 1 file changed, 52 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 78b278b..e5f8e7f 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2409,45 +2409,70 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
> * lists).
> */
> static int map_sg(struct device *dev, struct scatterlist *sglist,
> - int nelems, enum dma_data_direction dir,
> + int nelems, enum dma_data_direction direction,
> struct dma_attrs *attrs)
> {
> + int mapped_pages = 0, npages = 0, prot = 0, i;
> + unsigned long start_addr, address;
> struct protection_domain *domain;
> - int i;
> + struct dma_ops_domain *dma_dom;
> struct scatterlist *s;
> - phys_addr_t paddr;
> - int mapped_elems = 0;
> u64 dma_mask;
>
> domain = get_domain(dev);
> if (IS_ERR(domain))
> return 0;
>
> + dma_dom = domain->priv;
> dma_mask = *dev->dma_mask;
>
> + for_each_sg(sglist, s, nelems, i)
> + npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
This fails to account for the segment boundary mask[1]. Given a typical
sglist from the block layer where the boundary mask is 64K, the first
segment is 8K long, and subsequent segments are 64K long, those
subsequent segments will end up with misaligned addresses, which certain
hardware may object to.
> + address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
Since a typical dma_map_sg() call is likely to involve >128K worth of
data, I wonder if it's worth going directly to a slow-path IOVA
allocation...
> + if (address == DMA_ERROR_CODE)
> + goto out_err;
> +
> + start_addr = address;
> + prot = dir2prot(direction);
> +
> for_each_sg(sglist, s, nelems, i) {
> - paddr = sg_phys(s);
> + int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> +
> + for (j = 0; j < pages; ++j) {
> + unsigned long bus_addr, phys_addr;
> + int ret;
>
> - s->dma_address = __map_single(dev, domain->priv,
> - paddr, s->length, dir, dma_mask);
> + bus_addr = address + (j << PAGE_SHIFT);
> + phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);
> + ret = iommu_map_page(domain, bus_addr, phys_addr, PAGE_SIZE, prot, GFP_ATOMIC);
> + if (ret)
> + goto out_unmap;
> +
> + mapped_pages += 1;
> + }
>
> - if (s->dma_address) {
> - s->dma_length = s->length;
> - mapped_elems++;
> - } else
> - goto unmap;
> + s->dma_address = address + s->offset;
> + s->dma_length = s->length;
> + address += pages << PAGE_SHIFT;
> }
>
> - return mapped_elems;
> + return nelems;
>
> -unmap:
> - for_each_sg(sglist, s, mapped_elems, i) {
> - if (s->dma_address)
> - __unmap_single(domain->priv, s->dma_address,
> - s->dma_length, dir);
> - s->dma_address = s->dma_length = 0;
> +
> +out_unmap:
> + pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",
> + dev_name(dev), npages);
> +
> + for (i = 0; i < mapped_pages; ++i) {
> + iommu_unmap_page(domain,
> + start_addr + (i << PAGE_SHIFT),
> + PAGE_SIZE);
> }
>
> + free_iova_fast(&dma_dom->iovad, start_addr, npages);
> +
> +out_err:
> return 0;
> }
>
> @@ -2460,18 +2485,20 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
> struct dma_attrs *attrs)
> {
> struct protection_domain *domain;
> + unsigned long startaddr;
> struct scatterlist *s;
> - int i;
> + int i,npages = 0;
>
> domain = get_domain(dev);
> if (IS_ERR(domain))
> return;
>
> - for_each_sg(sglist, s, nelems, i) {
> - __unmap_single(domain->priv, s->dma_address,
> - s->dma_length, dir);
> - s->dma_address = s->dma_length = 0;
> - }
> + for_each_sg(sglist, s, nelems, i)
> + npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
...which would also then allow this to be further simplified down to the
find_iova() trick we use in iommu-dma.
Robin.
[1]: http://article.gmane.org/gmane.linux.kernel.iommu/10553 - almost the
1-year anniversary of you making much the same comment to me :D
> +
> + startaddr = sg_dma_address(sglist) & PAGE_MASK;
> +
> + __unmap_single(domain->priv, startaddr, npages << PAGE_SHIFT, dir);
> }
>
> /*
>