Re: [PATCH 16/20 v2] iommu/amd: Optimize map_sg and unmap_sg

From: Joerg Roedel <jroedel@suse.de>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Vincent.Wan@amd.com,
	iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 16/20 v2] iommu/amd: Optimize map_sg and unmap_sg
Date: Wed, 13 Jul 2016 12:27:18 +0200
Message-ID: <20160713102718.GD27306@suse.de>
In-Reply-To: <57850DF8.9040507@arm.com>

On Tue, Jul 12, 2016 at 04:34:16PM +0100, Robin Murphy wrote:
> The boundary masks for block devices are tricky to track down through so
> many layers of indirection in the common frameworks, but there are a lot
> of 64K ones there. After some more impromptu digging into the subject
> I've finally satisfied my curiosity - it seems this restriction stems
> from the ATA DMA PRD table format, so it could perhaps still be a real
> concern for anyone using some crusty old PCI IDE card in their modern
> system.

The boundary mask is a capability of the underlying PCI device, no? The
ATA (or whatever) stack above it should have no influence on it.

>
> Indeed, I wasn't suggesting making more than one call, just that
> alloc_iova_fast() is quite likely to have to fall back to alloc_iova()
> here, so there may be some mileage in going directly to the latter, with
> the benefit of then being able to rely on find_iova() later (since you
> know for sure you allocated out of the tree rather than the caches). My
> hunch is that dma_map_sg() tends to be called for bulk data transfer
> (block devices, DRM, etc.) so is probably a less contended path compared
> to the network layer hammering dma_map_single().

Using different functions for allocation would also require special
handling in the queued-freeing code, as I would then have to track each
allocation to know whether to free it with the _fast variant or not.

> > +	mask          = dma_get_seg_boundary(dev);
> > +	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
> > +				   1UL << (BITS_PER_LONG - PAGE_SHIFT);
> 
> (mask >> PAGE_SHIFT) + 1 ?

Should make no difference unless some of the first PAGE_SHIFT bits of
mask are 0 (which shouldn't happen).
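
For anyone wanting to check the two expressions side by side, here is a
minimal userspace sketch. It is not kernel code: PAGE_SHIFT is assumed
to be 12, ALIGN() and BITS_PER_LONG are re-created locally, and the
helper names boundary_patch() and boundary_shift() are made up for this
illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumptions for this sketch: 4K pages, local stand-ins for kernel macros. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define ALIGN(x, a)	(((x) + ((a) - 1)) & ~((a) - 1))

/* Expression from the patch hunk quoted above (hypothetical helper name). */
static unsigned long boundary_patch(unsigned long mask)
{
	/* mask + 1 wraps to 0 for an all-ones mask, hence the special case. */
	return mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT
			: 1UL << (BITS_PER_LONG - PAGE_SHIFT);
}

/* Expression suggested in the review (hypothetical helper name). */
static unsigned long boundary_shift(unsigned long mask)
{
	return (mask >> PAGE_SHIFT) + 1;
}
```

Under these assumptions a 64K mask (0xffff) gives 16 pages with either
helper, and an all-ones mask gives 1UL << (BITS_PER_LONG - PAGE_SHIFT)
with either helper. The two only diverge in this sketch when ALIGN()
itself overflows, e.g. for mask = ~0UL - 1, where boundary_patch()
returns 0; that mask has a zero among its low PAGE_SHIFT bits.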



	Joerg
