linux-tegra.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thierry Reding <thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Cc: Olav Haugan <ohaugan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>,
	Alexandre Courbot
	<gnurou-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Paul Walmsley <paul-DWxLp4Yu+b8AvxtiuMwx3w@public.gmane.org>,
	Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
	Tomeu Vizoso
	<tomeu.vizoso-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>,
	Stephen Warren <swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>,
	Antonios Motakis
	<a.motakis-lrHrjnjw1UfHK3s98zE1ajGjJy/sRE9J@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>,
	Mikko Perttunen
	<mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Nicolas Iooss <nicolas.iooss_linux-oWGTIYur0i8@public.gmane.org>,
	Russell King <rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Vince Hsu <vince.h-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Subject: Re: [RFC PATCH 0/3] iommu: Add range flush operation
Date: Tue, 29 Sep 2015 11:27:14 +0200	[thread overview]
Message-ID: <20150929092714.GD9460@ulmo.nvidia.com> (raw)
In-Reply-To: <1443504379-31841-1-git-send-email-tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 4776 bytes --]

On Tue, Sep 29, 2015 at 02:25:23PM +0900, Tomasz Figa wrote:
> Currently the IOMMU subsystem provides 3 basic operations: iommu_map(),
> iommu_map_sg() and iommu_unmap(). iommu_map() can be used to map memory
> page by page, however it involves flushing the caches (CPU and IOMMU) for
> every mapped page separately, which is unsuitable for use cases that
> require low mapping latency. Similarly iommu_unmap(), even though it
> takes a full IOVA range as its argument, performs unmapping in a page
> by page manner.
> 
> To make mapping operation more suitable for such use cases, iommu_map_sg()
> and .map_sg() callback in iommu_ops struct were introduced, which allowed
> particular IOMMU drivers to directly iterate over SG entries, create
> necessary mappings and flush everything in one go.
> 
> This approach, however, has two drawbacks:
>  1) it does not do anything about unmap performance,
>  2) it requires each driver willing to have fast map to implement its
>     own SG iteration code, even though this is a mostly generic operation.
> 
> This series tries to mitigate the two issues above, while acknowledging
> the fact that the .map_sg() callback might be still necessary for some
> specific platforms, which could have the need to iterate over SG elements
> inside driver code. Proposed solution introduces a new .flush() callback,
> which expects IOVA range as its argument and is expected to flush all
> respective caches (be it CPU, IOMMU TLB or whatever) to make the given
> IOVA area mapping change visible to IOMMU clients. Then all the 3 basic
> map/unmap operations are modified to call the .flush() callback at the end
> of the operation. 
> 
> Advantages of proposed approach include:
>  1) ability to use default_iommu_map_sg() helper if all the driver needs
>     for performance optimization is batching the flush,
>  2) completely no effect on existing code - the .flush() callback is made
>     optional and if it isn't implemented drivers are expected to do
>     necessary flushes on a page by page basis in respective (un)mapping
>     callbakcs,
>  3) possibility of exporting the iommu_flush() operation and providing
>     unsynchronized map/unmap operations for subsystems with even higher
>     requirements for performance (e.g. drivers/gpu/drm).

That would require passing in some sort of flag that the core shouldn't
be flushing itself, right? Currently it would flush on every map/unmap.

> 
> The series includes a generic patch implementing necessary changes in
> IOMMU API and two Tegra-specific patches that demonstrate implementation
> on driver side and which can be used for further testing.
> 
> Last, but not least, some performance numbers on Tegra210:
> +-----------+--------------+-------------+------------+
> | Operation | Size [bytes] | Before [us] | After [us] |
> +-----------+--------------+-------------+------------+
> | Map       | 128K         |         139 |         40 |
> |           |              |         136 |         34 |
> |           |              |         137 |         38 |
> |           |              |         136 |         36 |
> |           | 4M           |        3939 |       1163 |
> |           |              |        3730 |       2389 |
> |           |              |        3613 |        997 |
> |           |              |        3622 |       1620 |
> |           | ~18M         |       18635 |       4741 |
> |           |              |       19261 |       6550 |
> |           |              |       18473 |       9304 |
> |           |              |       18125 |       5120 |
> | Unmap     | 128K         |         128 |          7 |
> |           |              |         122 |          8 |
> |           |              |         119 |         10 |
> |           |              |         123 |         12 |
> |           | 4M           |        3829 |        151 |
> |           |              |        3964 |        150 |
> |           |              |        3908 |        145 |
> |           |              |        3875 |        155 |
> |           | ~18M         |       18570 |        683 |
> |           |              |       18473 |        806 |
> |           |              |       21020 |        643 |
> |           |              |       21764 |        652 |
> +-----------+--------------+-------------+------------+
> The values are obtained by surrounding the calls to iommu_map_sg()
> (with default_iommu_map_sg() helper used as .map_sg() callback) and
> iommu_unmap() with ktime-based time measurement code. Taken 4 samples
> of every buffer size. ~18M means around 17-19M due do the variance
> in requested buffer sizes.

Those are pretty impressive numbers.

Thierry

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



  parent reply	other threads:[~2015-09-29  9:27 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-29  5:25 [RFC PATCH 0/3] iommu: Add range flush operation Tomasz Figa
     [not found] ` <1443504379-31841-1-git-send-email-tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2015-09-29  5:25   ` [RFC PATCH 1/3] iommu: Add support for out of band flushing Tomasz Figa
     [not found]     ` <1443504379-31841-2-git-send-email-tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2015-09-29  9:32       ` Thierry Reding
     [not found]         ` <20150929093257.GE9460-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
2015-09-29 11:56           ` Tomasz Figa
2015-09-29  5:25   ` [RFC PATCH 2/3] memory: tegra: add TLB cache line size Tomasz Figa
     [not found]     ` <1443504379-31841-3-git-send-email-tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2015-09-29  9:43       ` Thierry Reding
     [not found]         ` <20150929094348.GF9460-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
2015-09-29 12:11           ` Tomasz Figa
2015-09-29  5:25   ` [RFC PATCH 3/3] iommu/tegra-smmu: Make the driver use out of band flushing Tomasz Figa
2015-09-29  9:27   ` Thierry Reding [this message]
     [not found]     ` <20150929092714.GD9460-AwZRO8vwLAwmlAP/+Wk3EA@public.gmane.org>
2015-09-29 11:54       ` [RFC PATCH 0/3] iommu: Add range flush operation Tomasz Figa
2015-09-29 12:22       ` Joerg Roedel
2015-09-29 12:20   ` Joerg Roedel
2015-09-29 14:20   ` Robin Murphy
     [not found]     ` <560A9E36.9030903-5wv7dgnIgG8@public.gmane.org>
2015-09-29 14:32       ` Russell King - ARM Linux
     [not found]         ` <20150929143241.GI21513-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
2015-09-29 16:27           ` Robin Murphy
     [not found]             ` <560ABBE0.8020805-5wv7dgnIgG8@public.gmane.org>
2015-09-29 16:40               ` Russell King - ARM Linux
     [not found]                 ` <20150929164014.GL21513-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
2015-09-29 17:13                   ` Robin Murphy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150929092714.GD9460@ulmo.nvidia.com \
    --to=thierry.reding-re5jqeeqqe8avxtiumwx3w@public.gmane.org \
    --cc=a.motakis-lrHrjnjw1UfHK3s98zE1ajGjJy/sRE9J@public.gmane.org \
    --cc=arnd-r2nGTMty4D4@public.gmane.org \
    --cc=gnurou-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org \
    --cc=nicolas.iooss_linux-oWGTIYur0i8@public.gmane.org \
    --cc=ohaugan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org \
    --cc=paul-DWxLp4Yu+b8AvxtiuMwx3w@public.gmane.org \
    --cc=rmk+kernel-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org \
    --cc=swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org \
    --cc=tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=tomeu.vizoso-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org \
    --cc=vince.h-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org \
    --cc=will.deacon-5wv7dgnIgG8@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).