* [PATCH v4 0/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping
@ 2018-05-30 14:06 Thierry Reding
2018-05-30 14:06 ` [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device() Thierry Reding
2018-05-30 14:06 ` [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping Thierry Reding
0 siblings, 2 replies; 10+ messages in thread
From: Thierry Reding @ 2018-05-30 14:06 UTC (permalink / raw)
To: linux-arm-kernel
From: Thierry Reding <treding@nvidia.com>
An unfortunate interaction between the 32-bit ARM DMA/IOMMU mapping code
and Tegra SMMU driver changes to support IOMMU groups introduced a boot-
time regression on Tegra124. This was caught very late because none of
the standard configurations that are tested on Tegra enable the ARM DMA/
IOMMU mapping code since it is not needed.
The reason for the failure is that the GPU found on Tegra uses a special
bit in physical addresses to determine whether or not a buffer is mapped
through the SMMU. In order to achieve this, the Nouveau driver needs to
explicitly understand which buffers are mapped through the SMMU and
which aren't. Hiding usage of the SMMU behind the DMA API is bound to
fail because the knowledge doesn't exist. Furthermore, the GPU has its
own IOMMU and in most cases doesn't need buffers to be physically or
virtually contiguous. One notable exception is for compressible buffers
which need to be mapped with large pages, which in turn require all the
small pages in a large page to be contiguous. This can be achieved with
an SMMU mapping, though it isn't currently supported in Nouveau.
Translating through the SMMU is unnecessary and can have a negative
impact on performance for the common case, so we want to avoid it when
possible.
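To make the "special bit" idea concrete, here is a minimal userspace sketch, not code from this series. The helper name gpu_tag_address() and the idea of passing the bit position as a parameter are assumptions made for illustration; the actual bit position is chip-specific and lives in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patches): tag an address handed to
 * the GPU with the bit that selects the SMMU path. The bit position is
 * supplied by the caller and is an assumption of this sketch.
 */
static uint64_t gpu_tag_address(uint64_t addr, unsigned int iommu_bit,
				bool via_smmu)
{
	uint64_t mask;

	assert(iommu_bit < 64);		/* keep the shift well defined */
	mask = UINT64_C(1) << iommu_bit;
	assert((addr & mask) == 0);	/* tag bit must not overlap the address */

	/* Untagged addresses go straight to the memory controller. */
	return via_smmu ? (addr | mask) : addr;
}
```

The point of the sketch is that the driver has to know, per buffer, which of the two values to program. If the DMA API silently returns IOMMU addresses, the driver has no way to tell that the tag is needed.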
This series of patches adds a 32-bit ARM specific API that allows a
driver to detach the device from the DMA/IOMMU mapping so that it can
provide its own implementation for dealing with the SMMU. The second
patch makes use of that new API in the Nouveau driver to fix the
regression.
Thierry
Thierry Reding (2):
ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()
drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping
arch/arm/mm/dma-mapping.c | 12 ++++++------
drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 13 +++++++++++++
2 files changed, 19 insertions(+), 6 deletions(-)
--
2.17.0
* [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()
2018-05-30 14:06 [PATCH v4 0/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping Thierry Reding
@ 2018-05-30 14:06 ` Thierry Reding
2018-05-31 17:52 ` Robin Murphy
` (2 more replies)
2018-05-30 14:06 ` [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping Thierry Reding
1 sibling, 3 replies; 10+ messages in thread
From: Thierry Reding @ 2018-05-30 14:06 UTC (permalink / raw)
To: linux-arm-kernel
From: Thierry Reding <treding@nvidia.com>
Instead of setting the DMA ops pointer to NULL, set the correct,
non-IOMMU ops depending on the device's coherency setting.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v4:
- new patch to fix existing arm_iommu_detach_device() to do what we need
arch/arm/mm/dma-mapping.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index af27f1c22d93..87a0037574e4 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1151,6 +1151,11 @@ int arm_dma_supported(struct device *dev, u64 mask)
 	return __dma_supported(dev, mask, false);
 }
 
+static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
+{
+	return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
+}
+
 #ifdef CONFIG_ARM_DMA_USE_IOMMU
 
 static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long attrs)
@@ -2296,7 +2301,7 @@ void arm_iommu_detach_device(struct device *dev)
 	iommu_detach_device(mapping->domain, dev);
 	kref_put(&mapping->kref, release_iommu_mapping);
 	to_dma_iommu_mapping(dev) = NULL;
-	set_dma_ops(dev, NULL);
+	set_dma_ops(dev, arm_get_dma_map_ops(dev->archdata.dma_coherent));
 
 	pr_debug("Detached IOMMU controller from %s device.\n", dev_name(dev));
 }
@@ -2357,11 +2362,6 @@ static void arm_teardown_iommu_dma_ops(struct device *dev) { }
 
 #endif /* CONFIG_ARM_DMA_USE_IOMMU */
 
-static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
-{
-	return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
-}
-
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 			const struct iommu_ops *iommu, bool coherent)
 {
--
2.17.0
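The effect of the patch above can be modelled in userspace as follows. This is only a sketch: the struct layouts, model_iommu_dma_ops and model_detach_ops() are stand-ins invented for illustration, and only arm_get_dma_map_ops() mirrors the helper in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel type; only the identity of each table matters. */
struct dma_map_ops {
	const char *name;
};

static const struct dma_map_ops arm_dma_ops = { "arm_dma_ops" };
static const struct dma_map_ops arm_coherent_dma_ops = { "arm_coherent_dma_ops" };
/* Hypothetical table representing the IOMMU-backed ops installed at attach. */
static const struct dma_map_ops model_iommu_dma_ops = { "model_iommu_dma_ops" };

/* Hypothetical, heavily reduced device: just the fields the patch touches. */
struct model_device {
	bool dma_coherent;
	void *mapping;
	const struct dma_map_ops *dma_ops;
};

/* Same selection as the helper moved by the patch. */
static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
{
	return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
}

/*
 * Models only the ops hand-over on detach. The real function also detaches
 * the IOMMU domain and drops a reference on the mapping.
 */
static void model_detach_ops(struct model_device *dev)
{
	dev->mapping = NULL;
	dev->dma_ops = arm_get_dma_map_ops(dev->dma_coherent);
}
```

After the detach, a coherent device ends up with the coherent ops and a non-coherent device with the regular ops, instead of a NULL pointer that leaves the choice to whatever fallback the caller applies.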
* [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping
2018-05-30 14:06 [PATCH v4 0/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping Thierry Reding
2018-05-30 14:06 ` [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device() Thierry Reding
@ 2018-05-30 14:06 ` Thierry Reding
2018-05-31 16:12 ` Christoph Hellwig
` (2 more replies)
1 sibling, 3 replies; 10+ messages in thread
From: Thierry Reding @ 2018-05-30 14:06 UTC (permalink / raw)
To: linux-arm-kernel
From: Thierry Reding <treding@nvidia.com>
Depending on the kernel configuration, early ARM architecture setup code
may have attached the GPU to a DMA/IOMMU mapping that transparently uses
the IOMMU to back the DMA API. Tegra requires special handling for IOMMU
backed buffers (a special bit in the GPU's MMU page tables indicates the
memory path to take: via the SMMU or directly to the memory controller).
Transparently backing DMA memory with an IOMMU prevents Nouveau from
properly handling such memory accesses and causes memory access faults.
As a side-note: buffers other than those allocated in instance memory
don't need to be physically contiguous from the GPU's perspective since
the GPU can map them into contiguous buffers using its own MMU. Mapping
these buffers through the IOMMU is unnecessary and will even lead to
performance degradation because of the additional translation. One
exception to this is compressible buffers, which need large pages. In
order to enable these large pages, multiple small pages will have to be
combined into one large (I/O virtually contiguous) mapping via the
IOMMU. However, that is a topic outside the scope of this fix and isn't
currently supported. An implementation will want to explicitly create
these large pages in the Nouveau driver, so detaching from a DMA/IOMMU
mapping would still be required.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v4:
- use existing APIs to detach from a DMA/IOMMU mapping
Changes in v3:
- clarify the use of IOMMU mapping for compressible buffers
- squash multiple patches into this
drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index 78597da6313a..0e372a190d3f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -23,6 +23,10 @@
 #ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER
 #include "priv.h"
 
+#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
+#include <asm/dma-iommu.h>
+#endif
+
 static int
 nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
 {
@@ -105,6 +109,15 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev)
 	unsigned long pgsize_bitmap;
 	int ret;
 
+#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
+	if (dev->archdata.mapping) {
+		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
+
+		arm_iommu_detach_device(dev);
+		arm_iommu_release_mapping(mapping);
+	}
+#endif
+
 	if (!tdev->func->iommu_bit)
 		return;
 
--
2.17.0
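The hunk above saves the mapping pointer before detaching and then releases it separately. A small userspace model may help show why that order matters. It assumes the mapping holds one reference from its creation and one from the attach; that accounting and every model_* name here are assumptions of this sketch, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mapping and the device. */
struct model_mapping {
	int refcount;
	bool released;
};

struct model_device {
	struct model_mapping *mapping;
};

static void model_put(struct model_mapping *m)
{
	assert(m->refcount > 0);
	if (--m->refcount == 0)
		m->released = true;	/* the kernel would free the mapping here */
}

/* Creation takes the first reference (assumed). */
static void model_create(struct model_mapping *m)
{
	m->refcount = 1;
	m->released = false;
}

/* Attaching takes a second reference (assumed). */
static void model_attach(struct model_device *dev, struct model_mapping *m)
{
	m->refcount++;
	dev->mapping = m;
}

/* Detach clears the device's pointer and drops the attach reference. */
static void model_detach(struct model_device *dev)
{
	struct model_mapping *m = dev->mapping;

	dev->mapping = NULL;
	model_put(m);
}

/* Release drops the creation reference. */
static void model_release_mapping(struct model_mapping *m)
{
	model_put(m);
}

/* Same order as the hunk: save the pointer, detach, then release. */
static void model_detach_and_release(struct model_device *dev)
{
	if (dev->mapping) {
		struct model_mapping *mapping = dev->mapping;

		model_detach(dev);
		model_release_mapping(mapping);
	}
}
```

Because the detach clears the device's pointer, the mapping has to be looked up first; otherwise there would be nothing left to pass to the release call.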
* [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping
2018-05-30 14:06 ` [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping Thierry Reding
@ 2018-05-31 16:12 ` Christoph Hellwig
2018-05-31 17:56 ` Robin Murphy
2018-07-06 15:36 ` Nicolas Chauvet
2 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2018-05-31 16:12 UTC (permalink / raw)
To: linux-arm-kernel
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> + if (dev->archdata.mapping) {
> + struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +
> + arm_iommu_detach_device(dev);
> + arm_iommu_release_mapping(mapping);
> + }
> +#endif
Having this hidden in a helper would be nicer, but anything that
doesn't directly expose the dma_map_ops to a driver is fine with me.
So from the dma-mapping POV:
Acked-by: Christoph Hellwig <hch@lst.de>
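One possible shape for such a helper, sketched in userspace C. Nothing like this exists in the series: sketch_detach_dma_iommu(), struct sketch_device and the SKETCH_ARM_DMA_USE_IOMMU switch are all hypothetical names, and the body only models the pointer hand-over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Set to 0 to model a build without the ARM DMA/IOMMU glue. */
#define SKETCH_ARM_DMA_USE_IOMMU 1

/* Hypothetical stand-in for struct device; only the mapping is modelled. */
struct sketch_device {
	void *mapping;
};

#if SKETCH_ARM_DMA_USE_IOMMU
/* Returns true if a mapping was attached and has now been dropped. */
static inline bool sketch_detach_dma_iommu(struct sketch_device *dev)
{
	if (!dev->mapping)
		return false;

	/* A real helper would call the detach and release functions here. */
	dev->mapping = NULL;
	return true;
}
#else
/* Stub: nothing can be attached, so there is nothing to detach. */
static inline bool sketch_detach_dma_iommu(struct sketch_device *dev)
{
	(void)dev;
	return false;
}
#endif
```

With a stub for the disabled case, the caller needs no preprocessor conditionals and never looks at the arch-specific mapping field itself.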
* [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()
2018-05-30 14:06 ` [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device() Thierry Reding
@ 2018-05-31 17:52 ` Robin Murphy
2018-07-02 11:53 ` Thierry Reding
2018-07-02 15:34 ` Christoph Hellwig
2 siblings, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2018-05-31 17:52 UTC (permalink / raw)
To: linux-arm-kernel
On 30/05/18 15:06, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> Instead of setting the DMA ops pointer to NULL, set the correct,
> non-IOMMU ops depending on the device's coherency setting.
It looks like it's probably been 4 or 5 years since that became subtly
wrong by virtue of the landscape changing around it, but it's clearly
not enough of a problem to consider stable backports :)
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> Changes in v4:
> - new patch to fix existing arm_iommu_detach_device() to do what we need
>
> arch/arm/mm/dma-mapping.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index af27f1c22d93..87a0037574e4 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -1151,6 +1151,11 @@ int arm_dma_supported(struct device *dev, u64 mask)
> return __dma_supported(dev, mask, false);
> }
>
> +static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
> +{
> + return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
> +}
> +
> #ifdef CONFIG_ARM_DMA_USE_IOMMU
>
> static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long attrs)
> @@ -2296,7 +2301,7 @@ void arm_iommu_detach_device(struct device *dev)
> iommu_detach_device(mapping->domain, dev);
> kref_put(&mapping->kref, release_iommu_mapping);
> to_dma_iommu_mapping(dev) = NULL;
> - set_dma_ops(dev, NULL);
> + set_dma_ops(dev, arm_get_dma_map_ops(dev->archdata.dma_coherent));
>
> pr_debug("Detached IOMMU controller from %s device.\n", dev_name(dev));
> }
> @@ -2357,11 +2362,6 @@ static void arm_teardown_iommu_dma_ops(struct device *dev) { }
>
> #endif /* CONFIG_ARM_DMA_USE_IOMMU */
>
> -static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
> -{
> - return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
> -}
> -
> void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> const struct iommu_ops *iommu, bool coherent)
> {
>
* [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping
2018-05-30 14:06 ` [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping Thierry Reding
2018-05-31 16:12 ` Christoph Hellwig
@ 2018-05-31 17:56 ` Robin Murphy
2018-07-06 15:36 ` Nicolas Chauvet
2 siblings, 0 replies; 10+ messages in thread
From: Robin Murphy @ 2018-05-31 17:56 UTC (permalink / raw)
To: linux-arm-kernel
On 30/05/18 15:06, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> Depending on the kernel configuration, early ARM architecture setup code
> may have attached the GPU to a DMA/IOMMU mapping that transparently uses
> the IOMMU to back the DMA API. Tegra requires special handling for IOMMU
> backed buffers (a special bit in the GPU's MMU page tables indicates the
> memory path to take: via the SMMU or directly to the memory controller).
> Transparently backing DMA memory with an IOMMU prevents Nouveau from
> properly handling such memory accesses and causes memory access faults.
>
> As a side-note: buffers other than those allocated in instance memory
> don't need to be physically contiguous from the GPU's perspective since
> the GPU can map them into contiguous buffers using its own MMU. Mapping
> these buffers through the IOMMU is unnecessary and will even lead to
> performance degradation because of the additional translation. One
> exception to this are compressible buffers which need large pages. In
> order to enable these large pages, multiple small pages will have to be
> combined into one large (I/O virtually contiguous) mapping via the
> IOMMU. However, that is a topic outside the scope of this fix and isn't
> currently supported. An implementation will want to explicitly create
> these large pages in the Nouveau driver, so detaching from a DMA/IOMMU
> mapping would still be required.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> Changes in v4:
> - use existing APIs to detach from a DMA/IOMMU mapping
>
> Changes in v3:
> - clarify the use of IOMMU mapping for compressible buffers
> - squash multiple patches into this
>
> drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> index 78597da6313a..0e372a190d3f 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> @@ -23,6 +23,10 @@
> #ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER
> #include "priv.h"
>
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +#include <asm/dma-iommu.h>
> +#endif
> +
> static int
> nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
> {
> @@ -105,6 +109,15 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev)
> unsigned long pgsize_bitmap;
> int ret;
>
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> + if (dev->archdata.mapping) {
> + struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
Nit: there's arguably little point using the helper here after you've
already shattered the illusion by poking dev->archdata.mapping directly,
but I guess this disappears again anyway once the refcounting gets
sorted out and the mapping releases itself properly, so:
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> +
> + arm_iommu_detach_device(dev);
> + arm_iommu_release_mapping(mapping);
> + }
> +#endif
> +
> if (!tdev->func->iommu_bit)
> return;
>
>
* [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()
2018-05-30 14:06 ` [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device() Thierry Reding
2018-05-31 17:52 ` Robin Murphy
@ 2018-07-02 11:53 ` Thierry Reding
2018-07-02 15:23 ` Russell King - ARM Linux
2018-07-02 15:34 ` Christoph Hellwig
2 siblings, 1 reply; 10+ messages in thread
From: Thierry Reding @ 2018-07-02 11:53 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, May 30, 2018 at 04:06:24PM +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> Instead of setting the DMA ops pointer to NULL, set the correct,
> non-IOMMU ops depending on the device's coherency setting.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> Changes in v4:
> - new patch to fix existing arm_iommu_detach_device() to do what we need
>
> arch/arm/mm/dma-mapping.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
Christoph, Russell,
could either of you provide an Acked-by for this? I think it makes the
most sense for Ben to pick this up into the Nouveau tree along with
patch 2/2.
Thierry
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index af27f1c22d93..87a0037574e4 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -1151,6 +1151,11 @@ int arm_dma_supported(struct device *dev, u64 mask)
> return __dma_supported(dev, mask, false);
> }
>
> +static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
> +{
> + return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
> +}
> +
> #ifdef CONFIG_ARM_DMA_USE_IOMMU
>
> static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long attrs)
> @@ -2296,7 +2301,7 @@ void arm_iommu_detach_device(struct device *dev)
> iommu_detach_device(mapping->domain, dev);
> kref_put(&mapping->kref, release_iommu_mapping);
> to_dma_iommu_mapping(dev) = NULL;
> - set_dma_ops(dev, NULL);
> + set_dma_ops(dev, arm_get_dma_map_ops(dev->archdata.dma_coherent));
>
> pr_debug("Detached IOMMU controller from %s device.\n", dev_name(dev));
> }
> @@ -2357,11 +2362,6 @@ static void arm_teardown_iommu_dma_ops(struct device *dev) { }
>
> #endif /* CONFIG_ARM_DMA_USE_IOMMU */
>
> -static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
> -{
> - return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
> -}
> -
> void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> const struct iommu_ops *iommu, bool coherent)
> {
> --
> 2.17.0
>
* [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()
2018-07-02 11:53 ` Thierry Reding
@ 2018-07-02 15:23 ` Russell King - ARM Linux
0 siblings, 0 replies; 10+ messages in thread
From: Russell King - ARM Linux @ 2018-07-02 15:23 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jul 02, 2018 at 01:53:17PM +0200, Thierry Reding wrote:
> On Wed, May 30, 2018 at 04:06:24PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> >
> > Instead of setting the DMA ops pointer to NULL, set the correct,
> > non-IOMMU ops depending on the device's coherency setting.
> >
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> > Changes in v4:
> > - new patch to fix existing arm_iommu_detach_device() to do what we need
> >
> > arch/arm/mm/dma-mapping.c | 12 ++++++------
> > 1 file changed, 6 insertions(+), 6 deletions(-)
>
> Christoph, Russell,
>
> could either of you provide an Acked-by for this? I think it makes the
> most sense for Ben to pick this up into the Nouveau tree along with
> patch 2/2.
Looks fine to me.
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Thanks.
>
> Thierry
>
> > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> > index af27f1c22d93..87a0037574e4 100644
> > --- a/arch/arm/mm/dma-mapping.c
> > +++ b/arch/arm/mm/dma-mapping.c
> > @@ -1151,6 +1151,11 @@ int arm_dma_supported(struct device *dev, u64 mask)
> > return __dma_supported(dev, mask, false);
> > }
> >
> > +static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
> > +{
> > + return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
> > +}
> > +
> > #ifdef CONFIG_ARM_DMA_USE_IOMMU
> >
> > static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long attrs)
> > @@ -2296,7 +2301,7 @@ void arm_iommu_detach_device(struct device *dev)
> > iommu_detach_device(mapping->domain, dev);
> > kref_put(&mapping->kref, release_iommu_mapping);
> > to_dma_iommu_mapping(dev) = NULL;
> > - set_dma_ops(dev, NULL);
> > + set_dma_ops(dev, arm_get_dma_map_ops(dev->archdata.dma_coherent));
> >
> > pr_debug("Detached IOMMU controller from %s device.\n", dev_name(dev));
> > }
> > @@ -2357,11 +2362,6 @@ static void arm_teardown_iommu_dma_ops(struct device *dev) { }
> >
> > #endif /* CONFIG_ARM_DMA_USE_IOMMU */
> >
> > -static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
> > -{
> > - return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
> > -}
> > -
> > void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> > const struct iommu_ops *iommu, bool coherent)
> > {
> > --
> > 2.17.0
> >
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
* [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()
2018-05-30 14:06 ` [PATCH v4 1/2] ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device() Thierry Reding
2018-05-31 17:52 ` Robin Murphy
2018-07-02 11:53 ` Thierry Reding
@ 2018-07-02 15:34 ` Christoph Hellwig
2 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2018-07-02 15:34 UTC (permalink / raw)
To: linux-arm-kernel
Looks good:
Acked-by: Christoph Hellwig <hch@lst.de>
* [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping
2018-05-30 14:06 ` [PATCH v4 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping Thierry Reding
2018-05-31 16:12 ` Christoph Hellwig
2018-05-31 17:56 ` Robin Murphy
@ 2018-07-06 15:36 ` Nicolas Chauvet
2 siblings, 0 replies; 10+ messages in thread
From: Nicolas Chauvet @ 2018-07-06 15:36 UTC (permalink / raw)
To: linux-arm-kernel
2018-05-30 16:06 GMT+02:00 Thierry Reding <thierry.reding@gmail.com>:
> From: Thierry Reding <treding@nvidia.com>
>
> Depending on the kernel configuration, early ARM architecture setup code
> may have attached the GPU to a DMA/IOMMU mapping that transparently uses
> the IOMMU to back the DMA API. Tegra requires special handling for IOMMU
> backed buffers (a special bit in the GPU's MMU page tables indicates the
> memory path to take: via the SMMU or directly to the memory controller).
> Transparently backing DMA memory with an IOMMU prevents Nouveau from
> properly handling such memory accesses and causes memory access faults.
>
> As a side-note: buffers other than those allocated in instance memory
> don't need to be physically contiguous from the GPU's perspective since
> the GPU can map them into contiguous buffers using its own MMU. Mapping
> these buffers through the IOMMU is unnecessary and will even lead to
> performance degradation because of the additional translation. One
> exception to this are compressible buffers which need large pages. In
> order to enable these large pages, multiple small pages will have to be
> combined into one large (I/O virtually contiguous) mapping via the
> IOMMU. However, that is a topic outside the scope of this fix and isn't
> currently supported. An implementation will want to explicitly create
> these large pages in the Nouveau driver, so detaching from a DMA/IOMMU
> mapping would still be required.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> Changes in v4:
> - use existing APIs to detach from a DMA/IOMMU mapping
>
> Changes in v3:
> - clarify the use of IOMMU mapping for compressible buffers
> - squash multiple patches into this
>
> drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> index 78597da6313a..0e372a190d3f 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> @@ -23,6 +23,10 @@
> #ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER
> #include "priv.h"
>
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> +#include <asm/dma-iommu.h>
> +#endif
> +
> static int
> nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
> {
> @@ -105,6 +109,15 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev)
> unsigned long pgsize_bitmap;
> int ret;
>
> +#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> + if (dev->archdata.mapping) {
> + struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> +
> + arm_iommu_detach_device(dev);
> + arm_iommu_release_mapping(mapping);
> + }
> +#endif
> +
> if (!tdev->func->iommu_bit)
> return;
>
> --
> 2.17.0
>
This series (v4)
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested on jetson-tk1 on a Fedora 4.18-rc3 kernel.
--
-
Nicolas (kwizart)