Re: [PATCH v4 2/7] iommu/core: split mapping to page sizes as supported by the hardware

From: Joerg Roedel <Joerg.Roedel@amd.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Kai Huang <mail.kai.huang@gmail.com>,
	Ohad Ben-Cohen <ohad@wizery.com>,
	iommu@lists.linux-foundation.org, linux-omap@vger.kernel.org,
	Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	linux-arm-kernel@lists.infradead.org,
	David Brown <davidb@codeaurora.org>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-kernel@vger.kernel.org, Hiroshi Doyu <hdoyu@nvidia.com>,
	Stepan Moskovchenko <stepanm@codeaurora.org>,
	KyongHo Cho <pullip.cho@samsung.com>,
	kvm@vger.kernel.org
Subject: Re: [PATCH v4 2/7] iommu/core: split mapping to page sizes as supported by the hardware
Date: Fri, 11 Nov 2011 13:58:37 +0100
Message-ID: <20111111125837.GF13213@amd.com>
In-Reply-To: <1320953319.535.11.camel@i7.infradead.org>

On Thu, Nov 10, 2011 at 07:28:39PM +0000, David Woodhouse wrote:

> ... which implies that a mapping, once made, might *never* actually get
> torn down until we loop and start reusing address space? That has
> interesting security implications.

Yes, it is a trade-off between security and performance. But if the user
wants more security, the unmap_flush parameter can be used.

> Is it true even for devices which have been assigned to a VM and then
> unassigned?

No, this is only used in the DMA-API path. The device-assignment code
uses the IOMMU-API directly. There the IOTLB is always flushed on unmap.

> > There is something similar on the AMD IOMMU side. There it is called
> > unmap_flush.
> 
> OK, so that definitely wants consolidating into a generic option.

Agreed.

> > Some time ago I proposed the iommu_commit() interface which changes
> > these requirements. With this interface the requirement is that after a
> > couple of map/unmap operations the IOMMU-API user has to call
> > iommu_commit() to make these changes visible to the hardware (so mostly
> > sync the IOTLBs). As discussed at that time this would make sense for
> > the Intel and AMD IOMMU drivers.
> 
> I would *really* want to keep those off the fast path (thinking mostly
> about DMA API here, since that's the performance issue). But as long as
> we can achieve that, that's fine.

For AMD IOMMU there is a feature called not-present cache. It says that
the IOMMU caches non-present entries as well and needs an IOTLB flush
when something is mapped (meant for software implementations of the
IOMMU).
So it can't really be taken out of the fast path. But the IOMMU driver
can optimize the function so that it only flushes the IOTLB when there
was an unmap call before. That is also an improvement over the current
situation, where every iommu_unmap call implicitly results in a flush.
That implicit flush is pretty much a no-go for using the IOMMU-API in
DMA mapping at the moment.
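
To illustrate the idea, here is a minimal user-space sketch of such a
deferred flush. It is not driver code: struct demo_domain and the
demo_*() functions are hypothetical names, and the flush is only counted
rather than issued. The assumption is that commit flushes when an unmap
happened since the last commit, or when the hardware has a not-present
cache and something was mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-domain state, not the real AMD IOMMU driver structures. */
struct demo_domain {
	bool np_cache;		/* hardware caches non-present entries */
	bool dirty_map;		/* a map happened since the last commit */
	bool dirty_unmap;	/* an unmap happened since the last commit */
	unsigned int flushes;	/* number of simulated IOTLB flushes */
};

/* Map and unmap only record that the page tables changed. */
static void demo_map(struct demo_domain *d)
{
	assert(d != NULL);
	d->dirty_map = true;
}

static void demo_unmap(struct demo_domain *d)
{
	assert(d != NULL);
	d->dirty_unmap = true;
}

/*
 * Stand-in for the proposed iommu_commit(): flush once per batch, and
 * only if the batch contained something the IOTLB may have cached.
 */
static void demo_commit(struct demo_domain *d)
{
	bool need_flush;

	assert(d != NULL);
	need_flush = d->dirty_unmap || (d->np_cache && d->dirty_map);
	if (need_flush)
		d->flushes++;	/* a real driver would flush the IOTLB here */

	d->dirty_map = false;
	d->dirty_unmap = false;
}
```

With np_cache clear, a batch of pure map operations followed by
demo_commit() causes no flush at all, and any number of unmaps in one
batch collapses into a single flush.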

> But also, it's not *so* much of an issue to divide the space up even
> when it's limited. The idea was not to have it *strictly* per-CPU, but
> just for a CPU to try allocating from "its own" subrange first, and then
> fall back to allocating a new subrange, and *then* fall back to
> allocating from subranges "belonging" to other CPUs. It's not that the
> allocation from a subrange would be lockless — it's that the lock would
> almost never leave the l1 cache of the CPU that *normally* uses that
> subrange.

Yeah, I get the idea. I fear that the memory consumption will get pretty
high with that approach. It basically means one round-robin allocator
per CPU and device. What does that mean on a 4096-CPU machine? :)
How much the lock contention is lowered also depends on the workload.
If dma-handles are frequently freed from a different CPU than the one
they were allocated from, the same problem reappears.
But in the end we have to try it out and see what works best :)
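
For reference, the "own subrange first, then others" allocation could be
sketched roughly as below. This is a simplified user-space illustration
under several assumptions: all demo_* names and the sizes are made up,
the number of subranges is fixed (the "allocate a new subrange" step is
left out), each subrange is a plain bump allocator with no free path,
and a C11 atomic_flag spinlock stands in for the per-subrange lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_NR_SUBRANGES	4	/* hypothetical: one per CPU */
#define DEMO_SUBRANGE_PAGES	1024	/* hypothetical subrange size */
#define DEMO_ALLOC_FAIL		UINT64_MAX

struct demo_subrange {
	atomic_flag lock;	/* stand-in for a per-subrange spinlock */
	uint64_t next;		/* next free page index in this subrange */
};

struct demo_aspace {
	struct demo_subrange sub[DEMO_NR_SUBRANGES];
};

static void demo_aspace_init(struct demo_aspace *as)
{
	for (unsigned int i = 0; i < DEMO_NR_SUBRANGES; i++) {
		atomic_flag_clear(&as->sub[i].lock);
		as->sub[i].next = 0;
	}
}

/* Try to take npages from one subrange; returns a page number or FAIL. */
static uint64_t demo_sub_alloc(struct demo_aspace *as, unsigned int idx,
			       uint64_t npages)
{
	struct demo_subrange *s = &as->sub[idx];
	uint64_t page = DEMO_ALLOC_FAIL;

	while (atomic_flag_test_and_set(&s->lock))
		;	/* spin; normally only the owning CPU takes this lock */

	if (npages <= DEMO_SUBRANGE_PAGES - s->next) {
		page = (uint64_t)idx * DEMO_SUBRANGE_PAGES + s->next;
		s->next += npages;
	}

	atomic_flag_clear(&s->lock);
	return page;
}

/* Start with the caller's own subrange, then fall back to the others. */
static uint64_t demo_alloc(struct demo_aspace *as, unsigned int cpu,
			   uint64_t npages)
{
	assert(npages > 0);

	for (unsigned int i = 0; i < DEMO_NR_SUBRANGES; i++) {
		unsigned int idx = (cpu + i) % DEMO_NR_SUBRANGES;
		uint64_t page = demo_sub_alloc(as, idx, npages);

		if (page != DEMO_ALLOC_FAIL)
			return page;
	}
	return DEMO_ALLOC_FAIL;
}
```

The sketch also shows where the memory cost comes from: every device
address space carries one struct demo_subrange per CPU, so the
bookkeeping grows with CPUs times devices.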

Regards,

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
