From: "Roedel, Joerg" <Joerg.Roedel@amd.com>
To: Ohad Ben-Cohen <ohad@wizery.com>
Cc: "iommu@lists.linux-foundation.org"
<iommu@lists.linux-foundation.org>,
"linux-omap@vger.kernel.org" <linux-omap@vger.kernel.org>,
Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
David Woodhouse <dwmw2@infradead.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
David Brown <davidb@codeaurora.org>,
Arnd Bergmann <arnd@arndb.de>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Stepan Moskovchenko <stepanm@codeaurora.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Hiroshi Doyu <hdoyu@nvidia.com>
Subject: Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
Date: Mon, 17 Oct 2011 11:28:24 +0200
Message-ID: <20111017092824.GF2198@amd.com>
In-Reply-To: <CAK=WgbbQ9RRFmMXGOK2An7f0QkObYigv_OWXALFr+Bchw97j9Q@mail.gmail.com>
Hi Ohad,
On Mon, Oct 17, 2011 at 04:05:02AM -0400, Ohad Ben-Cohen wrote:
> > This whole thing is quite marginal though and also very easy to change
> > later, so we can start with the "driver-provided io_page_size"
> > approach for now.
>
> Sorry, I just couldn't get myself to sign off on this as it really feels
> wrong to me.
>
> This min_pagesz member is just cached by the core so it doesn't need to
> look it up every time we're mapping. Drivers shouldn't care about it, as it's
> completely internal to the iommu core. I'm afraid that pushing this to
> the drivers feels like redundant duplication of code and might also
> confuse developers.
>
> Let me please suggest two alternatives:
> a) drop this min_pagesz cache completely. iommu core would then redundantly
> re-calculate this every time something is mapped, but I doubt there
> is going to be a measurable impact on performance.
> b) keep the current implementation for now, and fix this later (when we constify
> struct iommu_ops *) by caching min_pagesz in a dynamically allocated iommu
> context. Since this future "constify" patch will anyway need to change 'struct
> bus_type', it would be a good opportunity to do this change at the same time.
>
> I don't mind which of those approaches to take, and I also don't mind doing (b)
> myself later, in a separate patch. Your call.
I think option a) is the best. It should add only minimal overhead to
the iommu_map path.
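
Option a) amounts to deriving the minimum page size on every call instead of caching it. A minimal sketch, assuming the supported sizes are advertised as a bitmap in which bit N set means that 2^N-byte pages are supported (the bitmap encoding and the helper name are assumptions for illustration, not taken from this thread):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the smallest supported page size, given a
 * bitmap in which bit N set means "pages of 2^N bytes are supported".
 */
static size_t min_pagesz_from_bitmap(unsigned long pgsize_bitmap)
{
	/* At least one page size must be advertised. */
	assert(pgsize_bitmap != 0);

	/* x & (~x + 1) isolates the lowest set bit of x. */
	return (size_t)(pgsize_bitmap & (~pgsize_bitmap + 1));
}
```

This is a couple of ALU operations per call, which is consistent with the expectation of only minimal overhead on the mapping path.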
>
> >> This needs to be (left > 0). The drivers are allowed to unmap more than
> >> requested, so this value may turn negative.
> >
> > Good point. 'left' is size_t though, so I'll fix this a bit differently.
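
The concern with an unsigned 'left' is that subtracting more than remains wraps around to a huge value instead of going negative, so a (left > 0) test never terminates the loop. One possible way around it, sketched here with hypothetical names and not necessarily what the patch does, is to count up the bytes actually unmapped and compare against the requested size:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical driver callback: unmaps starting at iova and returns the
 * number of bytes actually unmapped, which may exceed the request.
 * Returns 0 on failure.
 */
typedef size_t (*unmap_fn)(unsigned long iova, size_t size);

/* Unmap at least 'size' bytes; returns the number of bytes unmapped. */
static size_t unmap_range(unmap_fn unmap, unsigned long iova,
			  size_t size, size_t min_pagesz)
{
	size_t unmapped = 0;

	assert(unmap != NULL && min_pagesz != 0);

	/* Counting up cannot underflow, unlike "left -= done" on a size_t. */
	while (unmapped < size) {
		size_t done = unmap(iova, min_pagesz);

		if (done == 0)
			break;

		iova += done;
		unmapped += done;
	}

	return unmapped;
}

/* Example driver that always tears down a whole 64 KiB page. */
static size_t unmap_64k(unsigned long iova, size_t size)
{
	(void)iova;
	(void)size;
	return 0x10000;
}
```

With unmap_64k and a request of 0x11000 bytes, the loop runs twice and reports 0x20000 bytes unmapped, whereas a countdown from 0x11000 would wrap on the second iteration.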
>
> Fixed, please take a look:
>
> From 00b8b9373fe2d73da0280ac1e6ade4a701c95140 Mon Sep 17 00:00:00 2001
> From: Ohad Ben-Cohen <ohad@wizery.com>
> Date: Mon, 10 Oct 2011 23:50:55 +0200
> Subject: [PATCH] iommu/core: split mapping to page sizes as supported
> by the hardware
>
> When mapping a memory region, split it to page sizes as supported
> by the iommu hardware. Always prefer bigger pages, when possible,
> in order to reduce the TLB pressure.
>
> The logic to do that is now added to the IOMMU core, so neither the iommu
> drivers themselves nor users of the IOMMU API have to duplicate it.
>
> This allows a more lenient granularity of mappings; traditionally the
> IOMMU API took 'order' (of a page) as a mapping size, and directly let
> the low level iommu drivers handle the mapping, but now that the IOMMU
> core can split arbitrary memory regions into pages, we can remove this
> limitation, so users don't have to split those regions by themselves.
>
> Currently the supported page sizes are advertised once and they then
> remain static. That works well for OMAP and MSM but it would probably
> not fly well with Intel's hardware, where the page size capabilities
> seem to have the potential to be different between several DMA
> remapping devices.
>
> register_iommu() currently sets a default pgsize behavior, so we can convert
> the IOMMU drivers in subsequent patches. After all the drivers
> are converted, the temporary default settings will be removed.
>
> Mainline users of the IOMMU API (kvm and omap-iovmm) are adapted
> to deal with bytes instead of page order.
>
> Many thanks to Joerg Roedel <Joerg.Roedel@amd.com> for significant review!
>
> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
> Cc: David Brown <davidb@codeaurora.org>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Joerg Roedel <Joerg.Roedel@amd.com>
> Cc: Stepan Moskovchenko <stepanm@codeaurora.org>
> Cc: KyongHo Cho <pullip.cho@samsung.com>
> Cc: Hiroshi DOYU <hdoyu@nvidia.com>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Cc: kvm@vger.kernel.org
> ---
>  drivers/iommu/iommu.c      |  124 +++++++++++++++++++++++++++++++++++++++-----
>  drivers/iommu/omap-iovmm.c |   17 ++----
>  include/linux/iommu.h      |   24 +++++++-
>  virt/kvm/iommu.c           |    8 ++--
>  4 files changed, 141 insertions(+), 32 deletions(-)
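
The splitting described in the commit message above (always prefer the biggest page that fits) can be sketched in user-space C as follows. This is an illustration only, assuming a bitmap in which bit N set means that 2^N-byte pages are supported; pick_pgsize() and split_mapping() are hypothetical names, not functions from the patch:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Pick the largest supported page size that fits into 'size' and to which
 * both addresses are aligned. Returns 0 if no supported size qualifies.
 */
static size_t pick_pgsize(unsigned long pgsize_bitmap, unsigned long iova,
			  unsigned long paddr, size_t size)
{
	unsigned long addr_merge = iova | paddr;
	size_t best = 0;
	unsigned int bit;

	for (bit = 0; bit < sizeof(unsigned long) * CHAR_BIT; bit++) {
		unsigned long pgsize = 1UL << bit;

		/* Too big or misaligned: every larger size fails as well. */
		if (pgsize > size || (addr_merge & (pgsize - 1)))
			break;

		if (pgsize_bitmap & pgsize)
			best = pgsize;
	}

	return best;
}

/*
 * Split [iova, iova + size) into supported pages, largest first where
 * alignment allows. Stores each page size in chunks[] and returns the
 * number of chunks, or 0 if the region cannot be split (or size is 0).
 */
static size_t split_mapping(unsigned long pgsize_bitmap, unsigned long iova,
			    unsigned long paddr, size_t size,
			    size_t *chunks, size_t max_chunks)
{
	size_t n = 0;

	assert(chunks != NULL);

	while (size > 0) {
		size_t pgsize = pick_pgsize(pgsize_bitmap, iova, paddr, size);

		if (pgsize == 0 || n == max_chunks)
			return 0;

		chunks[n++] = pgsize;
		iova += pgsize;
		paddr += pgsize;
		size -= pgsize;
	}

	return n;
}
```

For example, with 4 KiB, 64 KiB, 1 MiB and 16 MiB pages supported, a 0x12000-byte region at iova 0x10000 and physical address 0x20000 is split into one 64 KiB page followed by two 4 KiB pages.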
The patch looks good now. Please implement option a) and it should be
fine. I will test it on an AMD IOMMU platform. We still need someone to
test it on VT-d.
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632