linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 00/13] iommu/arm-smmu: Updates for 4.3
@ 2015-08-03 13:25 Will Deacon
  2015-08-03 13:25 ` [PATCH 01/13] iommu/arm-smmu: Fix enabling of PRIQ interrupt Will Deacon
                   ` (12 more replies)
  0 siblings, 13 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

This is the ARM SMMU driver queue I currently have for 4.3. The diffstat
is slightly noisy since Robin cleaned up the DMA API usage in the
io-pgtable code, so the client drivers no longer need to perform
dma_map_page hacks in order to clean the CPU caches on non-coherent
systems.

Other changes include some MSI preparation, a couple of minor fixes for
SMMUv3, and honouring of the dma-coherent property for SMMUv1/2.

In the absence of any issues, I plan to send a pull request to Joerg
later this week.

Cheers,

Will

--->8

Marc Zyngier (2):
  iommu/arm-smmu: Fix enabling of PRIQ interrupt
  iommu/arm-smmu: Fix MSI memory attributes to match specification

Robin Murphy (9):
  iommu/arm-smmu: Sort out coherency
  iommu/io-pgtable-arm: Allow appropriate DMA API use
  iommu/arm-smmu: Clean up DMA API usage
  iommu/arm-smmu: Clean up DMA API usage
  iommu/ipmmu-vmsa: Clean up DMA API usage
  iommu/io-pgtable-arm: Centralise sync points
  iommu/arm-smmu: Remove arm_smmu_flush_pgtable()
  iommu/arm-smmu: Remove arm_smmu_flush_pgtable()
  iommu/io-pgtable: Remove flush_pgtable callback

Will Deacon (2):
  iommu/arm-smmu: Limit 2-level strtab allocation for small SID sizes
  iommu/arm-smmu: Treat unknown OAS as 48-bit

 .../devicetree/bindings/iommu/arm,smmu.txt         |   6 +
 drivers/iommu/Kconfig                              |   3 +-
 drivers/iommu/arm-smmu-v3.c                        |  65 +++++------
 drivers/iommu/arm-smmu.c                           |  45 +++-----
 drivers/iommu/io-pgtable-arm.c                     | 126 +++++++++++++++------
 drivers/iommu/io-pgtable.h                         |   9 +-
 drivers/iommu/ipmmu-vmsa.c                         |  19 +---
 7 files changed, 156 insertions(+), 117 deletions(-)

-- 
2.1.4

* [PATCH 01/13] iommu/arm-smmu: Fix enabling of PRIQ interrupt
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 02/13] iommu/arm-smmu: Fix MSI memory attributes to match specification Will Deacon
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Marc Zyngier <marc.zyngier@arm.com>

When an ARM SMMUv3 instance supports PRI, the driver registers
an interrupt handler, but fails to enable the generation of
such interrupts at the SMMU level.

This patch simply moves the enable flags into a variable that is
updated by the PRI handling code before being written to the
SMMU register.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index da902baaa794..5d2cbdab5afa 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -118,6 +118,7 @@
 
 #define ARM_SMMU_IRQ_CTRL		0x50
 #define IRQ_CTRL_EVTQ_IRQEN		(1 << 2)
+#define IRQ_CTRL_PRIQ_IRQEN		(1 << 1)
 #define IRQ_CTRL_GERROR_IRQEN		(1 << 0)
 
 #define ARM_SMMU_IRQ_CTRLACK		0x54
@@ -2198,6 +2199,7 @@ static int arm_smmu_write_reg_sync(struct arm_smmu_device *smmu, u32 val,
 static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
 {
 	int ret, irq;
+	u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
 
 	/* Disable IRQs first */
 	ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
@@ -2252,13 +2254,13 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
 			if (IS_ERR_VALUE(ret))
 				dev_warn(smmu->dev,
 					 "failed to enable priq irq\n");
+			else
+				irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
 		}
 	}
 
 	/* Enable interrupt generation on the SMMU */
-	ret = arm_smmu_write_reg_sync(smmu,
-				      IRQ_CTRL_EVTQ_IRQEN |
-				      IRQ_CTRL_GERROR_IRQEN,
+	ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
 				      ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
 	if (ret)
 		dev_warn(smmu->dev, "failed to enable irqs\n");
-- 
2.1.4

* [PATCH 02/13] iommu/arm-smmu: Fix MSI memory attributes to match specification
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
  2015-08-03 13:25 ` [PATCH 01/13] iommu/arm-smmu: Fix enabling of PRIQ interrupt Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 03/13] iommu/arm-smmu: Limit 2-level strtab allocation for small SID sizes Will Deacon
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Marc Zyngier <marc.zyngier@arm.com>

The MSI memory attributes in the SMMUv3 driver were taken from an older
revision of the spec and no longer match current implementations.

Out with the old, in with the new.
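
For illustration, programming one of the MSI doorbells with the new
layout might look like the sketch below. Treat it as illustrative only:
smmu->base, msi_addr and msi_data are assumed names, and the
ARM_SMMU_PRIQ_IRQ_CFG0/CFG1 offsets are implied by the CFG2 offset in
the hunk; only the MSI_CFG* fields come from this patch.

	u64 doorbell = msi_addr & (MSI_CFG0_ADDR_MASK << MSI_CFG0_ADDR_SHIFT);

	/* Address in CFG0, payload in CFG1, attributes now in CFG2 */
	writeq_relaxed(doorbell, smmu->base + ARM_SMMU_PRIQ_IRQ_CFG0);
	writel_relaxed(msi_data, smmu->base + ARM_SMMU_PRIQ_IRQ_CFG1);
	writel_relaxed(MSI_CFG2_SH_ISH | MSI_CFG2_MEMATTR_DEVICE_nGnRE,
		       smmu->base + ARM_SMMU_PRIQ_IRQ_CFG2);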

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5d2cbdab5afa..c2c1ad8915d9 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -174,14 +174,14 @@
 #define ARM_SMMU_PRIQ_IRQ_CFG2		0xdc
 
 /* Common MSI config fields */
-#define MSI_CFG0_SH_SHIFT		60
-#define MSI_CFG0_SH_NSH			(0UL << MSI_CFG0_SH_SHIFT)
-#define MSI_CFG0_SH_OSH			(2UL << MSI_CFG0_SH_SHIFT)
-#define MSI_CFG0_SH_ISH			(3UL << MSI_CFG0_SH_SHIFT)
-#define MSI_CFG0_MEMATTR_SHIFT		56
-#define MSI_CFG0_MEMATTR_DEVICE_nGnRE	(0x1 << MSI_CFG0_MEMATTR_SHIFT)
 #define MSI_CFG0_ADDR_SHIFT		2
 #define MSI_CFG0_ADDR_MASK		0x3fffffffffffUL
+#define MSI_CFG2_SH_SHIFT		4
+#define MSI_CFG2_SH_NSH			(0UL << MSI_CFG2_SH_SHIFT)
+#define MSI_CFG2_SH_OSH			(2UL << MSI_CFG2_SH_SHIFT)
+#define MSI_CFG2_SH_ISH			(3UL << MSI_CFG2_SH_SHIFT)
+#define MSI_CFG2_MEMATTR_SHIFT		0
+#define MSI_CFG2_MEMATTR_DEVICE_nGnRE	(0x1 << MSI_CFG2_MEMATTR_SHIFT)
 
 #define Q_IDX(q, p)			((p) & ((1 << (q)->max_n_shift) - 1))
 #define Q_WRP(q, p)			((p) & (1 << (q)->max_n_shift))
-- 
2.1.4

* [PATCH 03/13] iommu/arm-smmu: Limit 2-level strtab allocation for small SID sizes
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
  2015-08-03 13:25 ` [PATCH 01/13] iommu/arm-smmu: Fix enabling of PRIQ interrupt Will Deacon
  2015-08-03 13:25 ` [PATCH 02/13] iommu/arm-smmu: Fix MSI memory attributes to match specification Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 04/13] iommu/arm-smmu: Sort out coherency Will Deacon
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

If the StreamIDs in a system can all be resolved by a single level-2
stream table (i.e. SIDSIZE < SPLIT), then we currently get our maths
wrong and allocate the largest strtab we support, thanks to unsigned
overflow in our calculation.

This patch fixes the issue by checking the SIDSIZE explicitly when
calculating the size of our first-level stream table.
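
The underflow is easy to demonstrate in isolation. The following
standalone sketch uses made-up numbers (and assumes STRTAB_SPLIT is 8,
as in the driver at the time of writing):

	#include <stdio.h>

	#define STRTAB_SPLIT	8	/* assumed value, for the example */
	#define MIN(a, b)	((a) < (b) ? (a) : (b))

	int main(void)
	{
		unsigned int sid_bits = 6;	/* SIDSIZE < SPLIT */
		unsigned int size = 16;		/* candidate L1 size, in bits */

		/*
		 * sid_bits - STRTAB_SPLIT wraps around to a huge unsigned
		 * value, so min() never clamps the size and we end up
		 * allocating the largest strtab we support.
		 */
		printf("sid_bits - STRTAB_SPLIT = %u\n",
		       sid_bits - STRTAB_SPLIT);
		printf("capped size = %u\n",
		       MIN(size, sid_bits - STRTAB_SPLIT));
		return 0;
	}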

Reported-by: Matt Evans <matt.evans@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index c2c1ad8915d9..4f093373f4c3 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2054,9 +2054,17 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 	int ret;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
 
-	/* Calculate the L1 size, capped to the SIDSIZE */
-	size = STRTAB_L1_SZ_SHIFT - (ilog2(STRTAB_L1_DESC_DWORDS) + 3);
-	size = min(size, smmu->sid_bits - STRTAB_SPLIT);
+	/*
+	 * If we can resolve everything with a single L2 table, then we
+	 * just need a single L1 descriptor. Otherwise, calculate the L1
+	 * size, capped to the SIDSIZE.
+	 */
+	if (smmu->sid_bits < STRTAB_SPLIT) {
+		size = 0;
+	} else {
+		size = STRTAB_L1_SZ_SHIFT - (ilog2(STRTAB_L1_DESC_DWORDS) + 3);
+		size = min(size, smmu->sid_bits - STRTAB_SPLIT);
+	}
 	cfg->num_l1_ents = 1 << size;
 
 	size += STRTAB_SPLIT;
-- 
2.1.4

* [PATCH 04/13] iommu/arm-smmu: Sort out coherency
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (2 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 03/13] iommu/arm-smmu: Limit 2-level strtab allocation for small SID sizes Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use Will Deacon
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

Currently, we detect whether the SMMU has coherent page table walk
capability from the IDR0.CTTW field, and base our cache maintenance
decisions on that. In preparation for fixing the bogus DMA API usage,
however, we need to ensure that the DMA API agrees about this, which
necessitates deferring to the dma-coherent property in the device tree
for the final say.

As an added bonus, since systems exist where an external CTTW signal
has been tied off incorrectly at integration, allowing DT to override
it offers a neat workaround for coherency issues with such SMMUs.
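
For example, a coherent SMMU would simply gain the standard property in
its DT node. The fragment below is entirely made up (compatible string,
addresses, interrupts and masters are illustrative, not taken from a
real platform):

	smmu: iommu@2b400000 {
		compatible = "arm,mmu-500";
		reg = <0x2b400000 0x10000>;
		#global-interrupts = <1>;
		interrupts = <0 229 4>, <0 231 4>;
		mmu-masters = <&dma0 0xd01d>;
		/* Table walks are snooped by the CPU caches on this SoC */
		dma-coherent;
	};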

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 Documentation/devicetree/bindings/iommu/arm,smmu.txt |  6 ++++++
 drivers/iommu/arm-smmu.c                             | 20 +++++++++++++++++---
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 06760503a819..718074501fcb 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -43,6 +43,12 @@ conditions.
 
 ** System MMU optional properties:
 
+- dma-coherent  : Present if page table walks made by the SMMU are
+                  cache coherent with the CPU.
+
+                  NOTE: this only applies to the SMMU itself, not
+                  masters connected upstream of the SMMU.
+
 - calxeda,smmu-secure-config-access : Enable proper handling of buggy
                   implementations that always use secure access to
                   SMMU configuration registers. In this case non-secure
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4cd0c29cb585..0583ed2f33c0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -37,6 +37,7 @@
 #include <linux/iopoll.h>
 #include <linux/module.h>
 #include <linux/of.h>
+#include <linux/of_address.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
 #include <linux/slab.h>
@@ -1532,6 +1533,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	unsigned long size;
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 	u32 id;
+	bool cttw_dt, cttw_reg;
 
 	dev_notice(smmu->dev, "probing hardware configuration...\n");
 	dev_notice(smmu->dev, "SMMUv%d with:\n", smmu->version);
@@ -1571,10 +1573,22 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		dev_notice(smmu->dev, "\taddress translation ops\n");
 	}
 
-	if (id & ID0_CTTW) {
+	/*
+	 * In order for DMA API calls to work properly, we must defer to what
+	 * the DT says about coherency, regardless of what the hardware claims.
+	 * Fortunately, this also opens up a workaround for systems where the
+	 * ID register value has ended up configured incorrectly.
+	 */
+	cttw_dt = of_dma_is_coherent(smmu->dev->of_node);
+	cttw_reg = !!(id & ID0_CTTW);
+	if (cttw_dt)
 		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
-		dev_notice(smmu->dev, "\tcoherent table walk\n");
-	}
+	if (cttw_dt || cttw_reg)
+		dev_notice(smmu->dev, "\t%scoherent table walk\n",
+			   cttw_dt ? "" : "non-");
+	if (cttw_dt != cttw_reg)
+		dev_notice(smmu->dev,
+			   "\t(IDR0.CTTW overridden by dma-coherent property)\n");
 
 	if (id & ID0_SMS) {
 		u32 smr, sid, mask;
-- 
2.1.4

* [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (3 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 04/13] iommu/arm-smmu: Sort out coherency Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-04 13:16   ` Laurent Pinchart
  2015-08-03 13:25 ` [PATCH 06/13] iommu/arm-smmu: Clean up DMA API usage Will Deacon
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

Currently, users of the LPAE page table code are (ab)using dma_map_page()
as a means to flush page table updates for non-coherent IOMMUs. Since
from the CPU's point of view, creating IOMMU page tables *is* passing
DMA buffers to a device (the IOMMU's page table walker), there's little
reason not to use the DMA API correctly.

Allow IOMMU drivers to opt into DMA API operations for page table
allocation and updates by providing their appropriate device pointer.
The expectation is that an LPAE IOMMU should have a full view of system
memory, so use streaming mappings to avoid unnecessary pressure on
ZONE_DMA, and treat any DMA translation as a warning sign.
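
From the driver side, opting in is just a case of filling in the new
field before allocating a page table. A hedged sketch (dev,
my_gather_ops and cookie are placeholders; struct io_pgtable_cfg and
alloc_io_pgtable_ops() are used as elsewhere in this series):

	struct io_pgtable_ops *ops;
	struct io_pgtable_cfg cfg = {
		.pgsize_bitmap	= SZ_4K | SZ_2M | SZ_1G,
		.ias		= 48,
		.oas		= 48,
		.tlb		= &my_gather_ops,
		/* leaving this NULL keeps the old flush_pgtable behaviour */
		.iommu_dev	= dev,
	};

	ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, cookie);
	if (!ops)
		return -ENOMEM;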

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/Kconfig          |   3 +-
 drivers/iommu/io-pgtable-arm.c | 107 ++++++++++++++++++++++++++++++++---------
 drivers/iommu/io-pgtable.h     |   3 ++
 3 files changed, 89 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f1fb1d3ccc56..d77a848d50de 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -23,7 +23,8 @@ config IOMMU_IO_PGTABLE
 config IOMMU_IO_PGTABLE_LPAE
 	bool "ARMv7/v8 Long Descriptor Format"
 	select IOMMU_IO_PGTABLE
-	depends on ARM || ARM64 || COMPILE_TEST
+	# SWIOTLB guarantees a dma_to_phys() implementation
+	depends on ARM || ARM64 || (COMPILE_TEST && SWIOTLB)
 	help
 	  Enable support for the ARM long descriptor pagetable format.
 	  This allocator supports 4K/2M/1G, 16K/32M and 64K/512M page
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 4e460216bd16..28cca8a652f9 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -200,12 +200,76 @@ typedef u64 arm_lpae_iopte;
 
 static bool selftest_running = false;
 
+static dma_addr_t __arm_lpae_dma_addr(struct device *dev, void *pages)
+{
+	return phys_to_dma(dev, virt_to_phys(pages));
+}
+
+static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
+				    struct io_pgtable_cfg *cfg)
+{
+	struct device *dev = cfg->iommu_dev;
+	dma_addr_t dma;
+	void *pages = alloc_pages_exact(size, gfp | __GFP_ZERO);
+
+	if (!pages)
+		return NULL;
+
+	if (dev) {
+		dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
+		if (dma_mapping_error(dev, dma))
+			goto out_free;
+		/*
+		 * We depend on the IOMMU being able to work with any physical
+		 * address directly, so if the DMA layer suggests it can't by
+		 * giving us back some translation, that bodes very badly...
+		 */
+		if (dma != __arm_lpae_dma_addr(dev, pages))
+			goto out_unmap;
+	}
+
+	return pages;
+
+out_unmap:
+	dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n");
+	dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
+out_free:
+	free_pages_exact(pages, size);
+	return NULL;
+}
+
+static void __arm_lpae_free_pages(void *pages, size_t size,
+				  struct io_pgtable_cfg *cfg)
+{
+	struct device *dev = cfg->iommu_dev;
+
+	if (dev)
+		dma_unmap_single(dev, __arm_lpae_dma_addr(dev, pages),
+				 size, DMA_TO_DEVICE);
+	free_pages_exact(pages, size);
+}
+
+static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
+			       struct io_pgtable_cfg *cfg, void *cookie)
+{
+	struct device *dev = cfg->iommu_dev;
+
+	*ptep = pte;
+
+	if (dev)
+		dma_sync_single_for_device(dev, __arm_lpae_dma_addr(dev, ptep),
+					   sizeof(pte), DMA_TO_DEVICE);
+	else if (cfg->tlb->flush_pgtable)
+		cfg->tlb->flush_pgtable(ptep, sizeof(pte), cookie);
+}
+
 static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 			     unsigned long iova, phys_addr_t paddr,
 			     arm_lpae_iopte prot, int lvl,
 			     arm_lpae_iopte *ptep)
 {
 	arm_lpae_iopte pte = prot;
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 
 	/* We require an unmap first */
 	if (iopte_leaf(*ptep, lvl)) {
@@ -213,7 +277,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 		return -EEXIST;
 	}
 
-	if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS)
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
 		pte |= ARM_LPAE_PTE_NS;
 
 	if (lvl == ARM_LPAE_MAX_LEVELS - 1)
@@ -224,8 +288,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 	pte |= ARM_LPAE_PTE_AF | ARM_LPAE_PTE_SH_IS;
 	pte |= pfn_to_iopte(paddr >> data->pg_shift, data);
 
-	*ptep = pte;
-	data->iop.cfg.tlb->flush_pgtable(ptep, sizeof(*ptep), data->iop.cookie);
+	__arm_lpae_set_pte(ptep, pte, cfg, data->iop.cookie);
 	return 0;
 }
 
@@ -236,12 +299,13 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
 	arm_lpae_iopte *cptep, pte;
 	void *cookie = data->iop.cookie;
 	size_t block_size = ARM_LPAE_BLOCK_SIZE(lvl, data);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 
 	/* Find our entry at the current level */
 	ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
 
 	/* If we can install a leaf entry at this level, then do so */
-	if (size == block_size && (size & data->iop.cfg.pgsize_bitmap))
+	if (size == block_size && (size & cfg->pgsize_bitmap))
 		return arm_lpae_init_pte(data, iova, paddr, prot, lvl, ptep);
 
 	/* We can't allocate tables at the final level */
@@ -251,18 +315,15 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
 	/* Grab a pointer to the next level */
 	pte = *ptep;
 	if (!pte) {
-		cptep = alloc_pages_exact(1UL << data->pg_shift,
-					 GFP_ATOMIC | __GFP_ZERO);
+		cptep = __arm_lpae_alloc_pages(1UL << data->pg_shift,
+					       GFP_ATOMIC, cfg);
 		if (!cptep)
 			return -ENOMEM;
 
-		data->iop.cfg.tlb->flush_pgtable(cptep, 1UL << data->pg_shift,
-						 cookie);
 		pte = __pa(cptep) | ARM_LPAE_PTE_TYPE_TABLE;
-		if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS)
+		if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
 			pte |= ARM_LPAE_PTE_NSTABLE;
-		*ptep = pte;
-		data->iop.cfg.tlb->flush_pgtable(ptep, sizeof(*ptep), cookie);
+		__arm_lpae_set_pte(ptep, pte, cfg, cookie);
 	} else {
 		cptep = iopte_deref(pte, data);
 	}
@@ -347,7 +408,7 @@ static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
 		__arm_lpae_free_pgtable(data, lvl + 1, iopte_deref(pte, data));
 	}
 
-	free_pages_exact(start, table_size);
+	__arm_lpae_free_pages(start, table_size, &data->iop.cfg);
 }
 
 static void arm_lpae_free_pgtable(struct io_pgtable *iop)
@@ -366,8 +427,8 @@ static int arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
 	unsigned long blk_start, blk_end;
 	phys_addr_t blk_paddr;
 	arm_lpae_iopte table = 0;
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 	void *cookie = data->iop.cookie;
-	const struct iommu_gather_ops *tlb = data->iop.cfg.tlb;
 
 	blk_start = iova & ~(blk_size - 1);
 	blk_end = blk_start + blk_size;
@@ -393,10 +454,9 @@ static int arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
 		}
 	}
 
-	*ptep = table;
-	tlb->flush_pgtable(ptep, sizeof(*ptep), cookie);
+	__arm_lpae_set_pte(ptep, table, cfg, cookie);
 	iova &= ~(blk_size - 1);
-	tlb->tlb_add_flush(iova, blk_size, true, cookie);
+	cfg->tlb->tlb_add_flush(iova, blk_size, true, cookie);
 	return size;
 }
 
@@ -418,13 +478,12 @@ static int __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
 
 	/* If the size matches this level, we're in the right place */
 	if (size == blk_size) {
-		*ptep = 0;
-		tlb->flush_pgtable(ptep, sizeof(*ptep), cookie);
+		__arm_lpae_set_pte(ptep, 0, &data->iop.cfg, cookie);
 
 		if (!iopte_leaf(pte, lvl)) {
 			/* Also flush any partial walks */
 			tlb->tlb_add_flush(iova, size, false, cookie);
-			tlb->tlb_sync(data->iop.cookie);
+			tlb->tlb_sync(cookie);
 			ptep = iopte_deref(pte, data);
 			__arm_lpae_free_pgtable(data, lvl + 1, ptep);
 		} else {
@@ -640,11 +699,12 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 	cfg->arm_lpae_s1_cfg.mair[1] = 0;
 
 	/* Looking good; allocate a pgd */
-	data->pgd = alloc_pages_exact(data->pgd_size, GFP_KERNEL | __GFP_ZERO);
+	data->pgd = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
 	if (!data->pgd)
 		goto out_free_data;
 
-	cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
+	if (cfg->tlb->flush_pgtable)
+		cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
 
 	/* TTBRs */
 	cfg->arm_lpae_s1_cfg.ttbr[0] = virt_to_phys(data->pgd);
@@ -728,11 +788,12 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	cfg->arm_lpae_s2_cfg.vtcr = reg;
 
 	/* Allocate pgd pages */
-	data->pgd = alloc_pages_exact(data->pgd_size, GFP_KERNEL | __GFP_ZERO);
+	data->pgd = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
 	if (!data->pgd)
 		goto out_free_data;
 
-	cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
+	if (cfg->tlb->flush_pgtable)
+		cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
 
 	/* VTTBR */
 	cfg->arm_lpae_s2_cfg.vttbr = virt_to_phys(data->pgd);
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 10e32f69c668..c69529c78914 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -41,6 +41,8 @@ struct iommu_gather_ops {
  * @ias:           Input address (iova) size, in bits.
  * @oas:           Output address (paddr) size, in bits.
  * @tlb:           TLB management callbacks for this set of tables.
+ * @iommu_dev:     The device representing the DMA configuration for the
+ *                 page table walker.
  */
 struct io_pgtable_cfg {
 	#define IO_PGTABLE_QUIRK_ARM_NS	(1 << 0)	/* Set NS bit in PTEs */
@@ -49,6 +51,7 @@ struct io_pgtable_cfg {
 	unsigned int			ias;
 	unsigned int			oas;
 	const struct iommu_gather_ops	*tlb;
+	struct device			*iommu_dev;
 
 	/* Low-level data specific to the table format */
 	union {
-- 
2.1.4

* [PATCH 06/13] iommu/arm-smmu: Clean up DMA API usage
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (4 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 07/13] " Will Deacon
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

With the correct DMA API calls now integrated into the io-pgtable code,
let that handle the flushing of non-coherent page table updates.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/arm-smmu.c | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 0583ed2f33c0..5770ab98fa38 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -611,24 +611,13 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
 static void arm_smmu_flush_pgtable(void *addr, size_t size, void *cookie)
 {
 	struct arm_smmu_domain *smmu_domain = cookie;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
-	unsigned long offset = (unsigned long)addr & ~PAGE_MASK;
-
 
-	/* Ensure new page tables are visible to the hardware walker */
-	if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) {
+	/*
+	 * Ensure new page tables are visible to a coherent hardware walker.
+	 * The page table code deals with flushing for the non-coherent case.
+	 */
+	if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
 		dsb(ishst);
-	} else {
-		/*
-		 * If the SMMU can't walk tables in the CPU caches, treat them
-		 * like non-coherent DMA since we need to flush the new entries
-		 * all the way out to memory. There's no possibility of
-		 * recursion here as the SMMU table walker will not be wired
-		 * through another SMMU.
-		 */
-		dma_map_page(smmu->dev, virt_to_page(addr), offset, size,
-			     DMA_TO_DEVICE);
-	}
 }
 
 static struct iommu_gather_ops arm_smmu_gather_ops = {
@@ -899,6 +888,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		.ias		= ias,
 		.oas		= oas,
 		.tlb		= &arm_smmu_gather_ops,
+		.iommu_dev	= smmu->dev,
 	};
 
 	smmu_domain->smmu = smmu;
-- 
2.1.4

* [PATCH 07/13] iommu/arm-smmu: Clean up DMA API usage
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (5 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 06/13] iommu/arm-smmu: Clean up DMA API usage Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 08/13] iommu/ipmmu-vmsa: " Will Deacon
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

With the correct DMA API calls now integrated into the io-pgtable code,
let that handle the flushing of non-coherent page table updates.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4f093373f4c3..38a891e28e13 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1334,23 +1334,10 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
 static void arm_smmu_flush_pgtable(void *addr, size_t size, void *cookie)
 {
 	struct arm_smmu_domain *smmu_domain = cookie;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
-	unsigned long offset = (unsigned long)addr & ~PAGE_MASK;
 
-	if (smmu->features & ARM_SMMU_FEAT_COHERENCY) {
+	/* The page table code handles flushing in the non-coherent case */
+	if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENCY)
 		dsb(ishst);
-	} else {
-		dma_addr_t dma_addr;
-		struct device *dev = smmu->dev;
-
-		dma_addr = dma_map_page(dev, virt_to_page(addr), offset, size,
-					DMA_TO_DEVICE);
-
-		if (dma_mapping_error(dev, dma_addr))
-			dev_err(dev, "failed to flush pgtable at %p\n", addr);
-		else
-			dma_unmap_page(dev, dma_addr, size, DMA_TO_DEVICE);
-	}
 }
 
 static struct iommu_gather_ops arm_smmu_gather_ops = {
@@ -1532,6 +1519,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 		.ias		= ias,
 		.oas		= oas,
 		.tlb		= &arm_smmu_gather_ops,
+		.iommu_dev	= smmu->dev,
 	};
 
 	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
-- 
2.1.4

* [PATCH 08/13] iommu/ipmmu-vmsa: Clean up DMA API usage
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (6 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 07/13] " Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 09/13] iommu/io-pgtable-arm: Centralise sync points Will Deacon
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

With the correct DMA API calls now integrated into the io-pgtable code,
let that handle the flushing of non-coherent page table updates.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/ipmmu-vmsa.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 1a67c531a07e..8cf605fa9946 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -283,24 +283,10 @@ static void ipmmu_tlb_add_flush(unsigned long iova, size_t size, bool leaf,
 	/* The hardware doesn't support selective TLB flush. */
 }
 
-static void ipmmu_flush_pgtable(void *ptr, size_t size, void *cookie)
-{
-	unsigned long offset = (unsigned long)ptr & ~PAGE_MASK;
-	struct ipmmu_vmsa_domain *domain = cookie;
-
-	/*
-	 * TODO: Add support for coherent walk through CCI with DVM and remove
-	 * cache handling.
-	 */
-	dma_map_page(domain->mmu->dev, virt_to_page(ptr), offset, size,
-		     DMA_TO_DEVICE);
-}
-
 static struct iommu_gather_ops ipmmu_gather_ops = {
 	.tlb_flush_all = ipmmu_tlb_flush_all,
 	.tlb_add_flush = ipmmu_tlb_add_flush,
 	.tlb_sync = ipmmu_tlb_flush_all,
-	.flush_pgtable = ipmmu_flush_pgtable,
 };
 
 /* -----------------------------------------------------------------------------
@@ -327,6 +313,11 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
 	domain->cfg.ias = 32;
 	domain->cfg.oas = 40;
 	domain->cfg.tlb = &ipmmu_gather_ops;
+	/*
+	 * TODO: Add support for coherent walk through CCI with DVM and remove
+	 * cache handling. For now, delegate it to the io-pgtable code.
+	 */
+	domain->cfg.iommu_dev = domain->mmu->dev;
 
 	domain->iop = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &domain->cfg,
 					   domain);
-- 
2.1.4

* [PATCH 09/13] iommu/io-pgtable-arm: Centralise sync points
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (7 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 08/13] iommu/ipmmu-vmsa: " Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 10/13] iommu/arm-smmu: Remove arm_smmu_flush_pgtable() Will Deacon
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

With all current users now opted in to DMA API operations, make the
iommu_dev pointer mandatory, rendering the flush_pgtable callback
redundant for cache maintenance. However, since the DMA calls could be
nops in the case of a coherent IOMMU, we still need to ensure the page
table updates are fully synchronised against a subsequent page table
walk. In the unmap path, the TLB sync will usually need to do this
anyway, so just cement that requirement; in the map path, which may
consist solely of cacheable memory writes (in the coherent case),
insert an appropriate barrier at the end of the operation, and obviate
the need to call flush_pgtable on every individual update for
synchronisation.
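
To summarise the resulting ordering, here is an illustrative outline of
the hunks below (not new code; the indentation shows the call chain):

	arm_lpae_map()
	  __arm_lpae_map()
	    __arm_lpae_set_pte()  /* plain store + dma_sync_single_for_device(),
				     the latter a nop for a coherent walker */
	  wmb()                   /* publish all PTE stores before the caller
				     can issue a transaction for the new iova */

	arm_lpae_unmap()
	  __arm_lpae_set_pte()    /* as above */
	  tlb_add_flush()
	  tlb_sync()              /* now also orders the PTE updates against
				     any subsequent table walk */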

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: slight clarification to tlb_sync comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/io-pgtable-arm.c | 43 +++++++++++++++++++++++-------------------
 drivers/iommu/io-pgtable.h     |  4 +++-
 2 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 28cca8a652f9..06176872fb78 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -26,6 +26,8 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 
+#include <asm/barrier.h>
+
 #include "io-pgtable.h"
 
 #define ARM_LPAE_MAX_ADDR_BITS		48
@@ -215,7 +217,7 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
 	if (!pages)
 		return NULL;
 
-	if (dev) {
+	if (!selftest_running) {
 		dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
 		if (dma_mapping_error(dev, dma))
 			goto out_free;
@@ -243,24 +245,22 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
 {
 	struct device *dev = cfg->iommu_dev;
 
-	if (dev)
+	if (!selftest_running)
 		dma_unmap_single(dev, __arm_lpae_dma_addr(dev, pages),
 				 size, DMA_TO_DEVICE);
 	free_pages_exact(pages, size);
 }
 
 static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
-			       struct io_pgtable_cfg *cfg, void *cookie)
+			       struct io_pgtable_cfg *cfg)
 {
 	struct device *dev = cfg->iommu_dev;
 
 	*ptep = pte;
 
-	if (dev)
+	if (!selftest_running)
 		dma_sync_single_for_device(dev, __arm_lpae_dma_addr(dev, ptep),
 					   sizeof(pte), DMA_TO_DEVICE);
-	else if (cfg->tlb->flush_pgtable)
-		cfg->tlb->flush_pgtable(ptep, sizeof(pte), cookie);
 }
 
 static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
@@ -288,7 +288,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 	pte |= ARM_LPAE_PTE_AF | ARM_LPAE_PTE_SH_IS;
 	pte |= pfn_to_iopte(paddr >> data->pg_shift, data);
 
-	__arm_lpae_set_pte(ptep, pte, cfg, data->iop.cookie);
+	__arm_lpae_set_pte(ptep, pte, cfg);
 	return 0;
 }
 
@@ -297,7 +297,6 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
 			  int lvl, arm_lpae_iopte *ptep)
 {
 	arm_lpae_iopte *cptep, pte;
-	void *cookie = data->iop.cookie;
 	size_t block_size = ARM_LPAE_BLOCK_SIZE(lvl, data);
 	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 
@@ -323,7 +322,7 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
 		pte = __pa(cptep) | ARM_LPAE_PTE_TYPE_TABLE;
 		if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
 			pte |= ARM_LPAE_PTE_NSTABLE;
-		__arm_lpae_set_pte(ptep, pte, cfg, cookie);
+		__arm_lpae_set_pte(ptep, pte, cfg);
 	} else {
 		cptep = iopte_deref(pte, data);
 	}
@@ -370,7 +369,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 {
 	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
 	arm_lpae_iopte *ptep = data->pgd;
-	int lvl = ARM_LPAE_START_LVL(data);
+	int ret, lvl = ARM_LPAE_START_LVL(data);
 	arm_lpae_iopte prot;
 
 	/* If no access, then nothing to do */
@@ -378,7 +377,14 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 		return 0;
 
 	prot = arm_lpae_prot_to_pte(data, iommu_prot);
-	return __arm_lpae_map(data, iova, paddr, size, prot, lvl, ptep);
+	ret = __arm_lpae_map(data, iova, paddr, size, prot, lvl, ptep);
+	/*
+	 * Synchronise all PTE updates for the new mapping before there's
+	 * a chance for anything to kick off a table walk for the new iova.
+	 */
+	wmb();
+
+	return ret;
 }
 
 static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
@@ -428,7 +434,6 @@ static int arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
 	phys_addr_t blk_paddr;
 	arm_lpae_iopte table = 0;
 	struct io_pgtable_cfg *cfg = &data->iop.cfg;
-	void *cookie = data->iop.cookie;
 
 	blk_start = iova & ~(blk_size - 1);
 	blk_end = blk_start + blk_size;
@@ -454,9 +459,9 @@ static int arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
 		}
 	}
 
-	__arm_lpae_set_pte(ptep, table, cfg, cookie);
+	__arm_lpae_set_pte(ptep, table, cfg);
 	iova &= ~(blk_size - 1);
-	cfg->tlb->tlb_add_flush(iova, blk_size, true, cookie);
+	cfg->tlb->tlb_add_flush(iova, blk_size, true, data->iop.cookie);
 	return size;
 }
 
@@ -478,7 +483,7 @@ static int __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
 
 	/* If the size matches this level, we're in the right place */
 	if (size == blk_size) {
-		__arm_lpae_set_pte(ptep, 0, &data->iop.cfg, cookie);
+		__arm_lpae_set_pte(ptep, 0, &data->iop.cfg);
 
 		if (!iopte_leaf(pte, lvl)) {
 			/* Also flush any partial walks */
@@ -703,8 +708,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 	if (!data->pgd)
 		goto out_free_data;
 
-	if (cfg->tlb->flush_pgtable)
-		cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
+	/* Ensure the empty pgd is visible before any actual TTBR write */
+	wmb();
 
 	/* TTBRs */
 	cfg->arm_lpae_s1_cfg.ttbr[0] = virt_to_phys(data->pgd);
@@ -792,8 +797,8 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	if (!data->pgd)
 		goto out_free_data;
 
-	if (cfg->tlb->flush_pgtable)
-		cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
+	/* Ensure the empty pgd is visible before any actual TTBR write */
+	wmb();
 
 	/* VTTBR */
 	cfg->arm_lpae_s2_cfg.vttbr = virt_to_phys(data->pgd);
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index c69529c78914..e8fadb012ed2 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -17,7 +17,9 @@ enum io_pgtable_fmt {
  *
  * @tlb_flush_all: Synchronously invalidate the entire TLB context.
  * @tlb_add_flush: Queue up a TLB invalidation for a virtual address range.
- * @tlb_sync:      Ensure any queue TLB invalidation has taken effect.
+ * @tlb_sync:      Ensure any queued TLB invalidation has taken effect, and
+ *                 any corresponding page table updates are visible to the
+ *                 IOMMU.
  * @flush_pgtable: Ensure page table updates are visible to the IOMMU.
  *
  * Note that these can all be called in atomic context and must therefore
-- 
2.1.4

* [PATCH 10/13] iommu/arm-smmu: Remove arm_smmu_flush_pgtable()
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (8 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 09/13] iommu/io-pgtable-arm: Centralise sync points Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 11/13] " Will Deacon
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

With the io-pgtable code now enforcing its own appropriate sync points,
the vestigial flush_pgtable callback becomes entirely redundant, so
remove it altogether.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/arm-smmu.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5770ab98fa38..48a39dfa9777 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -608,23 +608,10 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
 	}
 }
 
-static void arm_smmu_flush_pgtable(void *addr, size_t size, void *cookie)
-{
-	struct arm_smmu_domain *smmu_domain = cookie;
-
-	/*
-	 * Ensure new page tables are visible to a coherent hardware walker.
-	 * The page table code deals with flushing for the non-coherent case.
-	 */
-	if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
-		dsb(ishst);
-}
-
 static struct iommu_gather_ops arm_smmu_gather_ops = {
 	.tlb_flush_all	= arm_smmu_tlb_inv_context,
 	.tlb_add_flush	= arm_smmu_tlb_inv_range_nosync,
 	.tlb_sync	= arm_smmu_tlb_sync,
-	.flush_pgtable	= arm_smmu_flush_pgtable,
 };
 
 static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
-- 
2.1.4

* [PATCH 11/13] iommu/arm-smmu: Remove arm_smmu_flush_pgtable()
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (9 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 10/13] iommu/arm-smmu: Remove arm_smmu_flush_pgtable() Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 12/13] iommu/io-pgtable: Remove flush_pgtable callback Will Deacon
  2015-08-03 13:25 ` [PATCH 13/13] iommu/arm-smmu: Treat unknown OAS as 48-bit Will Deacon
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

With the io-pgtable code now enforcing its own appropriate sync points,
the vestigial flush_pgtable callback becomes entirely redundant, so
remove it altogether.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 38a891e28e13..15fb669282b4 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1331,20 +1331,10 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
 	arm_smmu_cmdq_issue_cmd(smmu, &cmd);
 }
 
-static void arm_smmu_flush_pgtable(void *addr, size_t size, void *cookie)
-{
-	struct arm_smmu_domain *smmu_domain = cookie;
-
-	/* The page table code handles flushing in the non-coherent case */
-	if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENCY)
-		dsb(ishst);
-}
-
 static struct iommu_gather_ops arm_smmu_gather_ops = {
 	.tlb_flush_all	= arm_smmu_tlb_inv_context,
 	.tlb_add_flush	= arm_smmu_tlb_inv_range_nosync,
 	.tlb_sync	= arm_smmu_tlb_sync,
-	.flush_pgtable	= arm_smmu_flush_pgtable,
 };
 
 /* IOMMU API */
-- 
2.1.4

* [PATCH 12/13] iommu/io-pgtable: Remove flush_pgtable callback
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (10 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 11/13] " Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 13:25 ` [PATCH 13/13] iommu/arm-smmu: Treat unknown OAS as 48-bit Will Deacon
  12 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: Robin Murphy <Robin.Murphy@arm.com>

With the users fully converted to DMA API operations, it's dead, Jim.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/io-pgtable-arm.c | 6 ------
 drivers/iommu/io-pgtable.h     | 2 --
 2 files changed, 8 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 06176872fb78..e4bc2b23ab96 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -884,16 +884,10 @@ static void dummy_tlb_sync(void *cookie)
 	WARN_ON(cookie != cfg_cookie);
 }
 
-static void dummy_flush_pgtable(void *ptr, size_t size, void *cookie)
-{
-	WARN_ON(cookie != cfg_cookie);
-}
-
 static struct iommu_gather_ops dummy_tlb_ops __initdata = {
 	.tlb_flush_all	= dummy_tlb_flush_all,
 	.tlb_add_flush	= dummy_tlb_add_flush,
 	.tlb_sync	= dummy_tlb_sync,
-	.flush_pgtable	= dummy_flush_pgtable,
 };
 
 static void __init arm_lpae_dump_ops(struct io_pgtable_ops *ops)
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index e8fadb012ed2..48538a35d078 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -20,7 +20,6 @@ enum io_pgtable_fmt {
  * @tlb_sync:      Ensure any queued TLB invalidation has taken effect, and
  *                 any corresponding page table updates are visible to the
  *                 IOMMU.
- * @flush_pgtable: Ensure page table updates are visible to the IOMMU.
  *
  * Note that these can all be called in atomic context and must therefore
  * not block.
@@ -30,7 +29,6 @@ struct iommu_gather_ops {
 	void (*tlb_add_flush)(unsigned long iova, size_t size, bool leaf,
 			      void *cookie);
 	void (*tlb_sync)(void *cookie);
-	void (*flush_pgtable)(void *ptr, size_t size, void *cookie);
 };
 
 /**
-- 
2.1.4

* [PATCH 13/13] iommu/arm-smmu: Treat unknown OAS as 48-bit
  2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
                   ` (11 preceding siblings ...)
  2015-08-03 13:25 ` [PATCH 12/13] iommu/io-pgtable: Remove flush_pgtable callback Will Deacon
@ 2015-08-03 13:25 ` Will Deacon
  2015-08-03 18:23   ` Sergei Shtylyov
  12 siblings, 1 reply; 21+ messages in thread
From: Will Deacon @ 2015-08-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

A late change to the SMMUv3 architecture ensures that the OAS field
will be monotonically increasing, so we can assume that an unknown OAS
is at least 48-bit and use that, rather than fail the device probe.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 drivers/iommu/arm-smmu-v3.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 15fb669282b4..a435e706226a 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2528,12 +2528,11 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
 	case IDR5_OAS_44_BIT:
 		smmu->oas = 44;
 		break;
+	default:
+		dev_info(smmu->dev,
+			"unknown output address size. Truncating to 48-bit\n");
 	case IDR5_OAS_48_BIT:
 		smmu->oas = 48;
-		break;
-	default:
-		dev_err(smmu->dev, "unknown output address size!\n");
-		return -ENXIO;
 	}
 
 	/* Set the DMA mask for our table walker */
-- 
2.1.4

* [PATCH 13/13] iommu/arm-smmu: Treat unknown OAS as 48-bit
  2015-08-03 13:25 ` [PATCH 13/13] iommu/arm-smmu: Treat unknown OAS as 48-bit Will Deacon
@ 2015-08-03 18:23   ` Sergei Shtylyov
  0 siblings, 0 replies; 21+ messages in thread
From: Sergei Shtylyov @ 2015-08-03 18:23 UTC (permalink / raw)
  To: linux-arm-kernel

Hello.

On 08/03/2015 04:25 PM, Will Deacon wrote:

> A late change to the SMMUv3 architecture ensures that the OAS field
> will be monotonically increasing, so we can assume that an unknown OAS
> is at least 48-bit and use that, rather than fail the device probe.

> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>   drivers/iommu/arm-smmu-v3.c | 7 +++----
>   1 file changed, 3 insertions(+), 4 deletions(-)

> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 15fb669282b4..a435e706226a 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2528,12 +2528,11 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
>   	case IDR5_OAS_44_BIT:
>   		smmu->oas = 44;
>   		break;
> +	default:
> +		dev_info(smmu->dev,
> +			"unknown output address size. Truncating to 48-bit\n");

    Please add a comment like /* FALLTHRU */ here.

>   	case IDR5_OAS_48_BIT:
>   		smmu->oas = 48;
> -		break;
> -	default:
> -		dev_err(smmu->dev, "unknown output address size!\n");
> -		return -ENXIO;
>   	}
>
>   	/* Set the DMA mask for our table walker */
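
For reference, the annotated fallthrough would then read (a sketch of
the requested change, not a tested patch):

	default:
		dev_info(smmu->dev,
			"unknown output address size. Truncating to 48-bit\n");
		/* FALLTHRU */
	case IDR5_OAS_48_BIT:
		smmu->oas = 48;
	}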

MBR, Sergei

* [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use
  2015-08-03 13:25 ` [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use Will Deacon
@ 2015-08-04 13:16   ` Laurent Pinchart
  2015-08-04 14:47     ` Robin Murphy
  0 siblings, 1 reply; 21+ messages in thread
From: Laurent Pinchart @ 2015-08-04 13:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will and Robin,

Thank you for the patch.

On Monday 03 August 2015 14:25:47 Will Deacon wrote:
> From: Robin Murphy <Robin.Murphy@arm.com>
> 
> Currently, users of the LPAE page table code are (ab)using dma_map_page()
> as a means to flush page table updates for non-coherent IOMMUs. Since
> from the CPU's point of view, creating IOMMU page tables *is* passing
> DMA buffers to a device (the IOMMU's page table walker), there's little
> reason not to use the DMA API correctly.
> 
> Allow IOMMU drivers to opt into DMA API operations for page table
> allocation and updates by providing their appropriate device pointer.
> The expectation is that an LPAE IOMMU should have a full view of system
> memory, so use streaming mappings to avoid unnecessary pressure on
> ZONE_DMA, and treat any DMA translation as a warning sign.

I like the idea of doing this in core code rather than in individual drivers, 
but I believe we're not using the right API. Please see below.

> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  drivers/iommu/Kconfig          |   3 +-
>  drivers/iommu/io-pgtable-arm.c | 107 +++++++++++++++++++++++++++++---------
>  drivers/iommu/io-pgtable.h     |   3 ++
>  3 files changed, 89 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index f1fb1d3ccc56..d77a848d50de 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -23,7 +23,8 @@ config IOMMU_IO_PGTABLE
>  config IOMMU_IO_PGTABLE_LPAE
>  	bool "ARMv7/v8 Long Descriptor Format"
>  	select IOMMU_IO_PGTABLE
> -	depends on ARM || ARM64 || COMPILE_TEST
> +	# SWIOTLB guarantees a dma_to_phys() implementation
> +	depends on ARM || ARM64 || (COMPILE_TEST && SWIOTLB)
>  	help
>  	  Enable support for the ARM long descriptor pagetable format.
>  	  This allocator supports 4K/2M/1G, 16K/32M and 64K/512M page
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 4e460216bd16..28cca8a652f9 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -200,12 +200,76 @@ typedef u64 arm_lpae_iopte;
> 
>  static bool selftest_running = false;
> 
> +static dma_addr_t __arm_lpae_dma_addr(struct device *dev, void *pages)
> +{
> +	return phys_to_dma(dev, virt_to_phys(pages));
> +}
> +
> +static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
> +				    struct io_pgtable_cfg *cfg)
> +{
> +	struct device *dev = cfg->iommu_dev;
> +	dma_addr_t dma;
> +	void *pages = alloc_pages_exact(size, gfp | __GFP_ZERO);
> +
> +	if (!pages)
> +		return NULL;
> +
> +	if (dev) {
> +		dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
> +		if (dma_mapping_error(dev, dma))
> +			goto out_free;
> +		/*
> +		 * We depend on the IOMMU being able to work with any physical
> +		 * address directly, so if the DMA layer suggests it can't by
> +		 * giving us back some translation, that bodes very badly...
> +		 */
> +		if (dma != __arm_lpae_dma_addr(dev, pages))
> +			goto out_unmap;

Why do we need to create a mapping at all, then? Because
dma_sync_single_for_device() requires it?

> +	}
> +
> +	return pages;
> +
> +out_unmap:
> +	dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n");
> +	dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
> +out_free:
> +	free_pages_exact(pages, size);
> +	return NULL;
> +}
> +
> +static void __arm_lpae_free_pages(void *pages, size_t size,
> +				  struct io_pgtable_cfg *cfg)
> +{
> +	struct device *dev = cfg->iommu_dev;
> +
> +	if (dev)
> +		dma_unmap_single(dev, __arm_lpae_dma_addr(dev, pages),
> +				 size, DMA_TO_DEVICE);
> +	free_pages_exact(pages, size);
> +}
> +
> +static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
> +			       struct io_pgtable_cfg *cfg, void *cookie)
> +{
> +	struct device *dev = cfg->iommu_dev;
> +
> +	*ptep = pte;
> +
> +	if (dev)
> +		dma_sync_single_for_device(dev, __arm_lpae_dma_addr(dev, ptep),
> +					   sizeof(pte), DMA_TO_DEVICE);

This is what I believe to be an API abuse. The dma_sync_single_for_device()
API is meant to pass ownership of a buffer to the device. Unless I'm mistaken,
once that's done the CPU isn't allowed to touch the buffer anymore until
dma_sync_single_for_cpu() is called to get ownership of the buffer back. Sure,
it might work on many ARM systems, but we really should be careful not to use
APIs as delicate as DMA mapping and cache handling for purposes different
from what they explicitly allow.

It might be that I'm wrong and that the streaming DMA API allows this exact 
kind of usage, but I haven't found a clear indication of that in the 
documentation. It could also be that all implementations would support it 
today, and that we would then consider it should be explicitly allowed by the 
API. In both cases a documentation patch would be welcome.
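
For comparison, the documented streaming DMA ownership sequence looks
like this (an illustrative sketch with a generic buffer):

	dma = dma_map_single(dev, buf, size, DMA_TO_DEVICE);  /* device owns buf */
	if (dma_mapping_error(dev, dma))
		return -ENOMEM;

	dma_sync_single_for_cpu(dev, dma, size, DMA_TO_DEVICE);    /* CPU owns buf */
	/* ... the CPU may now read or modify buf ... */
	dma_sync_single_for_device(dev, dma, size, DMA_TO_DEVICE); /* device owns buf */

	dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);

whereas the code above writes the PTE while the buffer is nominally
owned by the device, and only then calls dma_sync_single_for_device().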

> +	else if (cfg->tlb->flush_pgtable)
> +		cfg->tlb->flush_pgtable(ptep, sizeof(pte), cookie);
> +}
> +
>  static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>  			     unsigned long iova, phys_addr_t paddr,
>  			     arm_lpae_iopte prot, int lvl,
>  			     arm_lpae_iopte *ptep)
>  {
>  	arm_lpae_iopte pte = prot;
> +	struct io_pgtable_cfg *cfg = &data->iop.cfg;
> 
>  	/* We require an unmap first */
>  	if (iopte_leaf(*ptep, lvl)) {
> @@ -213,7 +277,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>  		return -EEXIST;
>  	}
> 
> -	if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS)
> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
>  		pte |= ARM_LPAE_PTE_NS;
> 
>  	if (lvl == ARM_LPAE_MAX_LEVELS - 1)
> @@ -224,8 +288,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>  	pte |= ARM_LPAE_PTE_AF | ARM_LPAE_PTE_SH_IS;
>  	pte |= pfn_to_iopte(paddr >> data->pg_shift, data);
> 
> -	*ptep = pte;
> -	data->iop.cfg.tlb->flush_pgtable(ptep, sizeof(*ptep), data->iop.cookie);
> +	__arm_lpae_set_pte(ptep, pte, cfg, data->iop.cookie);
>  	return 0;
>  }
> 
> @@ -236,12 +299,13 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
>  	arm_lpae_iopte *cptep, pte;
>  	void *cookie = data->iop.cookie;
>  	size_t block_size = ARM_LPAE_BLOCK_SIZE(lvl, data);
> +	struct io_pgtable_cfg *cfg = &data->iop.cfg;
> 
>  	/* Find our entry at the current level */
>  	ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
> 
>  	/* If we can install a leaf entry at this level, then do so */
> -	if (size == block_size && (size & data->iop.cfg.pgsize_bitmap))
> +	if (size == block_size && (size & cfg->pgsize_bitmap))
>  		return arm_lpae_init_pte(data, iova, paddr, prot, lvl, ptep);
> 
>  	/* We can't allocate tables at the final level */
> @@ -251,18 +315,15 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
>  	/* Grab a pointer to the next level */
>  	pte = *ptep;
>  	if (!pte) {
> -		cptep = alloc_pages_exact(1UL << data->pg_shift,
> -					 GFP_ATOMIC | __GFP_ZERO);
> +		cptep = __arm_lpae_alloc_pages(1UL << data->pg_shift,
> +					       GFP_ATOMIC, cfg);
>  		if (!cptep)
>  			return -ENOMEM;
> 
> -		data->iop.cfg.tlb->flush_pgtable(cptep, 1UL << data->pg_shift,
> -						 cookie);
>  		pte = __pa(cptep) | ARM_LPAE_PTE_TYPE_TABLE;
> -		if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS)
> +		if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
>  			pte |= ARM_LPAE_PTE_NSTABLE;
> -		*ptep = pte;
> -		data->iop.cfg.tlb->flush_pgtable(ptep, sizeof(*ptep), cookie);
> +		__arm_lpae_set_pte(ptep, pte, cfg, cookie);
>  	} else {
>  		cptep = iopte_deref(pte, data);
>  	}
> @@ -347,7 +408,7 @@ static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
>  		__arm_lpae_free_pgtable(data, lvl + 1, iopte_deref(pte, data));
>  	}
> 
> -	free_pages_exact(start, table_size);
> +	__arm_lpae_free_pages(start, table_size, &data->iop.cfg);
>  }
> 
>  static void arm_lpae_free_pgtable(struct io_pgtable *iop)
> @@ -366,8 +427,8 @@ static int arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>  	unsigned long blk_start, blk_end;
>  	phys_addr_t blk_paddr;
>  	arm_lpae_iopte table = 0;
> +	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>  	void *cookie = data->iop.cookie;
> -	const struct iommu_gather_ops *tlb = data->iop.cfg.tlb;
> 
>  	blk_start = iova & ~(blk_size - 1);
>  	blk_end = blk_start + blk_size;
> @@ -393,10 +454,9 @@ static int arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>  		}
>  	}
> 
> -	*ptep = table;
> -	tlb->flush_pgtable(ptep, sizeof(*ptep), cookie);
> +	__arm_lpae_set_pte(ptep, table, cfg, cookie);
>  	iova &= ~(blk_size - 1);
> -	tlb->tlb_add_flush(iova, blk_size, true, cookie);
> +	cfg->tlb->tlb_add_flush(iova, blk_size, true, cookie);
>  	return size;
>  }
> 
> @@ -418,13 +478,12 @@ static int __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
> 
>  	/* If the size matches this level, we're in the right place */
>  	if (size == blk_size) {
> -		*ptep = 0;
> -		tlb->flush_pgtable(ptep, sizeof(*ptep), cookie);
> +		__arm_lpae_set_pte(ptep, 0, &data->iop.cfg, cookie);
> 
>  		if (!iopte_leaf(pte, lvl)) {
>  			/* Also flush any partial walks */
>  			tlb->tlb_add_flush(iova, size, false, cookie);
> -			tlb->tlb_sync(data->iop.cookie);
> +			tlb->tlb_sync(cookie);
>  			ptep = iopte_deref(pte, data);
>  			__arm_lpae_free_pgtable(data, lvl + 1, ptep);
>  		} else {
> @@ -640,11 +699,12 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>  	cfg->arm_lpae_s1_cfg.mair[1] = 0;
> 
>  	/* Looking good; allocate a pgd */
> -	data->pgd = alloc_pages_exact(data->pgd_size, GFP_KERNEL | __GFP_ZERO);
> +	data->pgd = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
>  	if (!data->pgd)
>  		goto out_free_data;
> 
> -	cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
> +	if (cfg->tlb->flush_pgtable)
> +		cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
> 
>  	/* TTBRs */
>  	cfg->arm_lpae_s1_cfg.ttbr[0] = virt_to_phys(data->pgd);
> @@ -728,11 +788,12 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
>  	cfg->arm_lpae_s2_cfg.vtcr = reg;
> 
>  	/* Allocate pgd pages */
> -	data->pgd = alloc_pages_exact(data->pgd_size, GFP_KERNEL | __GFP_ZERO);
> +	data->pgd = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
>  	if (!data->pgd)
>  		goto out_free_data;
> 
> -	cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
> +	if (cfg->tlb->flush_pgtable)
> +		cfg->tlb->flush_pgtable(data->pgd, data->pgd_size, cookie);
> 
>  	/* VTTBR */
>  	cfg->arm_lpae_s2_cfg.vttbr = virt_to_phys(data->pgd);
> diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> index 10e32f69c668..c69529c78914 100644
> --- a/drivers/iommu/io-pgtable.h
> +++ b/drivers/iommu/io-pgtable.h
> @@ -41,6 +41,8 @@ struct iommu_gather_ops {
>   * @ias:           Input address (iova) size, in bits.
>   * @oas:           Output address (paddr) size, in bits.
>   * @tlb:           TLB management callbacks for this set of tables.
> + * @iommu_dev:     The device representing the DMA configuration for the
> + *                 page table walker.
>   */
>  struct io_pgtable_cfg {
>  	#define IO_PGTABLE_QUIRK_ARM_NS	(1 << 0)	/* Set NS bit in PTEs */
> @@ -49,6 +51,7 @@ struct io_pgtable_cfg {
>  	unsigned int			ias;
>  	unsigned int			oas;
>  	const struct iommu_gather_ops	*tlb;
> +	struct device			*iommu_dev;
> 
>  	/* Low-level data specific to the table format */
>  	union {

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use
  2015-08-04 13:16   ` Laurent Pinchart
@ 2015-08-04 14:47     ` Robin Murphy
  2015-08-04 14:56       ` Russell King - ARM Linux
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Murphy @ 2015-08-04 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Laurent,

[ +RMK, as his patch is indirectly involved here ]

On 04/08/15 14:16, Laurent Pinchart wrote:
> Hi Will and Robin,
>
> Thank you for the patch.
>
> On Monday 03 August 2015 14:25:47 Will Deacon wrote:
>> From: Robin Murphy <Robin.Murphy@arm.com>
>>
>> Currently, users of the LPAE page table code are (ab)using dma_map_page()
>> as a means to flush page table updates for non-coherent IOMMUs. Since
>> from the CPU's point of view, creating IOMMU page tables *is* passing
>> DMA buffers to a device (the IOMMU's page table walker), there's little
>> reason not to use the DMA API correctly.
>>
>> Allow IOMMU drivers to opt into DMA API operations for page table
>> allocation and updates by providing their appropriate device pointer.
>> The expectation is that an LPAE IOMMU should have a full view of system
>> memory, so use streaming mappings to avoid unnecessary pressure on
>> ZONE_DMA, and treat any DMA translation as a warning sign.
>
> I like the idea of doing this in core code rather than in individual drivers,
> but I believe we're not using the right API. Please see below.

Perhaps this could have one of my trademark "for now"s - the aim of this 
series is really just to stop the per-driver hacks proliferating, as per 
Russell's comment[1]. I left the semi-artificial DMA==phys restriction 
for expediency, since it follows the current usage and keeps the code 
changes minimal. With this series in place I'd be happy to go back and 
try a full-blown DMA conversion if and when a real need shows up, but I 
think it would significantly complicate all the current software page 
table walking.
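
For reference, the driver-side opt-in is deliberately trivial: fill in
the new iommu_dev field of io_pgtable_cfg before allocating the page
table ops. A minimal sketch, assuming the field this patch adds; the
"example_" names are made up and the TLB callbacks are stubbed out:

  #include <linux/device.h>
  #include <linux/sizes.h>
  #include "io-pgtable.h"

  static void example_tlb_flush_all(void *cookie) { }
  static void example_tlb_add_flush(unsigned long iova, size_t size,
                                    bool leaf, void *cookie) { }
  static void example_tlb_sync(void *cookie) { }

  static const struct iommu_gather_ops example_gather_ops = {
          .tlb_flush_all  = example_tlb_flush_all,
          .tlb_add_flush  = example_tlb_add_flush,
          .tlb_sync       = example_tlb_sync,
  };

  static struct io_pgtable_ops *example_alloc_tables(struct device *dev,
                                                     void *cookie)
  {
          struct io_pgtable_cfg cfg = {
                  .pgsize_bitmap  = SZ_4K | SZ_2M | SZ_1G,
                  .ias            = 48,
                  .oas            = 48,
                  .tlb            = &example_gather_ops,
                  /* non-NULL opts in to DMA API use for the tables */
                  .iommu_dev      = dev,
          };

          return alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, cookie);
  }

A NULL iommu_dev keeps the old behaviour, which is what lets the
existing callers migrate one at a time.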

>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> Signed-off-by: Will Deacon <will.deacon@arm.com>
>> ---
>>   drivers/iommu/Kconfig          |   3 +-
>>   drivers/iommu/io-pgtable-arm.c | 107 +++++++++++++++++++++++++++++---------
>>   drivers/iommu/io-pgtable.h     |   3 ++
>>   3 files changed, 89 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index f1fb1d3ccc56..d77a848d50de 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -23,7 +23,8 @@ config IOMMU_IO_PGTABLE
>>   config IOMMU_IO_PGTABLE_LPAE
>>   	bool "ARMv7/v8 Long Descriptor Format"
>>   	select IOMMU_IO_PGTABLE
>> -	depends on ARM || ARM64 || COMPILE_TEST
>> +	# SWIOTLB guarantees a dma_to_phys() implementation
>> +	depends on ARM || ARM64 || (COMPILE_TEST && SWIOTLB)
>>   	help
>>   	  Enable support for the ARM long descriptor pagetable format.
>>   	  This allocator supports 4K/2M/1G, 16K/32M and 64K/512M page
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 4e460216bd16..28cca8a652f9 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -200,12 +200,76 @@ typedef u64 arm_lpae_iopte;
>>
>>   static bool selftest_running = false;
>>
>> +static dma_addr_t __arm_lpae_dma_addr(struct device *dev, void *pages)
>> +{
>> +	return phys_to_dma(dev, virt_to_phys(pages));
>> +}
>> +
>> +static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
>> +				    struct io_pgtable_cfg *cfg)
>> +{
>> +	struct device *dev = cfg->iommu_dev;
>> +	dma_addr_t dma;
>> +	void *pages = alloc_pages_exact(size, gfp | __GFP_ZERO);
>> +
>> +	if (!pages)
>> +		return NULL;
>> +
>> +	if (dev) {
>> +		dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
>> +		if (dma_mapping_error(dev, dma))
>> +			goto out_free;
>> +		/*
>> +		 * We depend on the IOMMU being able to work with any physical
>> +		 * address directly, so if the DMA layer suggests it can't by
>> +		 * giving us back some translation, that bodes very badly...
>> +		 */
>> +		if (dma != __arm_lpae_dma_addr(dev, pages))
>> +			goto out_unmap;
>
> Why do we need to create a mapping at all then? Because
> dma_sync_single_for_device() requires it?

We still need to expose this new table to the device. If we never update 
it, we'll never have reason to call dma_sync_, but we definitely want 
the IOMMU to know there's a page of invalid PTEs there.

>> +	}
>> +
>> +	return pages;
>> +
>> +out_unmap:
>> +	dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n");
>> +	dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
>> +out_free:
>> +	free_pages_exact(pages, size);
>> +	return NULL;
>> +}
>> +
>> +static void __arm_lpae_free_pages(void *pages, size_t size,
>> +				  struct io_pgtable_cfg *cfg)
>> +{
>> +	struct device *dev = cfg->iommu_dev;
>> +
>> +	if (dev)
>> +		dma_unmap_single(dev, __arm_lpae_dma_addr(dev, pages),
>> +				 size, DMA_TO_DEVICE);
>> +	free_pages_exact(pages, size);
>> +}
>> +
>> +static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
>> +			       struct io_pgtable_cfg *cfg, void *cookie)
>> +{
>> +	struct device *dev = cfg->iommu_dev;
>> +
>> +	*ptep = pte;
>> +
>> +	if (dev)
>> +		dma_sync_single_for_device(dev, __arm_lpae_dma_addr(dev, ptep),
>> +					   sizeof(pte), DMA_TO_DEVICE);
>
> This is what I believe to be an API abuse. The dma_sync_single_for_device()
> API is meant to pass ownership of a buffer to the device. Unless I'm mistaken,
> once that's done the CPU isn't allowed to touch the buffer anymore until
> dma_sync_single_for_cpu() is called to get ownership of the buffer back. Sure,
> it might work on many ARM systems, but we really should be careful not to use
> APIs as delicate as DMA mapping and cache handling for purposes different than
> what they explicitly allow.
>
> It might be that I'm wrong and that the streaming DMA API allows this exact
> kind of usage, but I haven't found a clear indication of that in the
> documentation. It could also be that all implementations would support it
> today, and that we would then consider it should be explicitly allowed by the
> API. In both cases a documentation patch would be welcome.

TBH, I was largely going by Russell's Tegra patch[2], which similarly 
elides any sync_*_for_cpu. In reality, since everything is DMA_TO_DEVICE 
and the IOMMU itself can't modify the page tables[3], I can't think of 
any situation where sync_*_for_cpu would actually do anything.

From reading this part of DMA-API.txt:

   Notes:  You must do this:

   - Before reading values that have been written by DMA from the device
     (use the DMA_FROM_DEVICE direction)
   - After writing values that will be written to the device using DMA
     (use the DMA_TO_DEVICE) direction
   - before *and* after handing memory to the device if the memory is
     DMA_BIDIRECTIONAL

I would conclude that since a sync using DMA_TO_DEVICE *before* writing 
is not a "must", then it's probably unnecessary.

Robin.

[1]: http://article.gmane.org/gmane.linux.kernel/2005551
[2]: http://article.gmane.org/gmane.linux.ports.tegra/23150
[3]: Yes, there may generally be exceptions to that, but not in the 
context of this code. Unless the Renesas IPMMU does something I don't 
know about?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use
  2015-08-04 14:47     ` Robin Murphy
@ 2015-08-04 14:56       ` Russell King - ARM Linux
  2015-08-04 20:54         ` Laurent Pinchart
  0 siblings, 1 reply; 21+ messages in thread
From: Russell King - ARM Linux @ 2015-08-04 14:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 04, 2015 at 03:47:13PM +0100, Robin Murphy wrote:
> Hi Laurent,
> 
> [ +RMK, as his patch is indirectly involved here ]
> 
> On 04/08/15 14:16, Laurent Pinchart wrote:
> >This is what I believe to be an API abuse. The dma_sync_single_for_device()
> >API is meant to pass ownership of a buffer to the device. Unless I'm mistaken,
> >once that's done the CPU isn't allowed to touch the buffer anymore until
> >dma_sync_single_for_cpu() is called to get ownership of the buffer back.

That's what I thought up until recently, but it's not strictly true - see
Documentation/DMA-API.txt which Robin quoted.

> [3]:Yes, there may generally be exceptions to that, but not in the context
> of this code. Unless the Renesas IPMMU does something I don't know about?

If an IOMMU does write to its page tables, then the only way to handle
those is using DMA-coherent memory, either via a coherent mapping or
allocated via dma_alloc_coherent().  The streaming DMA API is wholly
unsuitable for any mapping where the CPU and the DMA device both want
to simultaneously alter the contained data.
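
Concretely, that alternative would look something like the below
(sketch only, the "example_" name is made up); coherent memory stays
visible to both agents at all times, so there are no sync calls to get
wrong:

  #include <linux/dma-mapping.h>
  #include <linux/types.h>

  static u64 *example_alloc_hw_written_table(struct device *dev,
                                             size_t size,
                                             dma_addr_t *dma_handle)
  {
          /* CPU-coherent, device-visible; safe for a walker that writes */
          return dma_alloc_coherent(dev, size, dma_handle, GFP_KERNEL);
  }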

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use
  2015-08-04 14:56       ` Russell King - ARM Linux
@ 2015-08-04 20:54         ` Laurent Pinchart
  2015-08-05 16:24           ` Will Deacon
  0 siblings, 1 reply; 21+ messages in thread
From: Laurent Pinchart @ 2015-08-04 20:54 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Tuesday 04 August 2015 15:56:42 Russell King - ARM Linux wrote:
> On Tue, Aug 04, 2015 at 03:47:13PM +0100, Robin Murphy wrote:
> > Hi Laurent,
> > 
> > [ +RMK, as his patch is indirectly involved here ]
> > 
> > On 04/08/15 14:16, Laurent Pinchart wrote:
> >> This is what I believe to be an API abuse. The
> >> dma_sync_single_for_device()
> >> API is meant to pass ownership of a buffer to the device. Unless I'm
> >> mistaken, once that's done the CPU isn't allowed to touch the buffer
> >> anymore until dma_sync_single_for_cpu() is called to get ownership of
> >> the buffer back.
>
> That's what I thought up until recently, but it's not strictly true - see
> Documentation/DMA-API.txt which Robin quoted.

I find the documentation slightly unclear on this topic. I have nothing 
against updating it to state that the sync functions ensure visibility of the 
data to the device or CPU instead of transferring ownership if that's the 
general understanding of how the API works (or should work). This would of 
course need to be carefully reviewed to ensure that the current implementation 
really works that way across all architectures, or at least that it can be 
made to work that way.

> > [3]:Yes, there may generally be exceptions to that, but not in the context
> > of this code. Unless the Renesas IPMMU does something I don't know about?
> 
> If an IOMMU does write to its page tables, then the only way to handle
> those is using DMA-coherent memory, either via a coherent mapping or
> allocated via dma_alloc_coherent().  The streaming DMA API is wholly
> unsuitable for any mapping where the CPU and the DMA device both want
> to simultaneously alter the contained data.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use
  2015-08-04 20:54         ` Laurent Pinchart
@ 2015-08-05 16:24           ` Will Deacon
  2015-08-06 19:10             ` Laurent Pinchart
  0 siblings, 1 reply; 21+ messages in thread
From: Will Deacon @ 2015-08-05 16:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Laurent,

On Tue, Aug 04, 2015 at 09:54:27PM +0100, Laurent Pinchart wrote:
> On Tuesday 04 August 2015 15:56:42 Russell King - ARM Linux wrote:
> > On Tue, Aug 04, 2015 at 03:47:13PM +0100, Robin Murphy wrote:
> > > On 04/08/15 14:16, Laurent Pinchart wrote:
> > >> This is what I believe to be an API abuse. The
> > >> dma_sync_single_for_device()
> > >> API is meant to pass ownership of a buffer to the device. Unless I'm
> > >> mistaken, once that's done the CPU isn't allowed to touch the buffer
> > >> anymore until dma_sync_single_for_cpu() is called to get ownership of
> > >> the buffer back.
> >
> > That's what I thought up until recently, but it's not strictly true - see
> > Documentation/DMA-API.txt which Robin quoted.
> 
> I find the documentation slightly unclear on this topic. I have nothing 
> against updating it to state that the sync functions ensure visibility of the 
> data to the device or CPU instead of transferring ownership if that's the 
> general understanding of how the API works (or should work). This would of 
> course need to be carefully reviewed to ensure that the current implementation 
> really works that way across all architectures, or at least that it can be 
> made to work that way.

I was hoping to send this lot off to Joerg tomorrow morning, since it's
removing a hack from the io-pgtable users before it has a chance to spread
further. This use of the API is certainly ok on arm and arm64 (where
these drivers currently run) and I don't see why it wouldn't work for
any IOMMU that treats the page tables as read-only. Once we start getting
into hardware access/dirty bits, then the streaming-DMA API is a lost
cause but I would expect those systems to be using SVM with the CPU page
tables anyway and therefore be cache-coherent.

I agree that the Documentation isn't brilliant, but that could be addressed
in a separate patch (especially since I don't think it would be merged
via the iommu tree).

Is that alright?

Will

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use
  2015-08-05 16:24           ` Will Deacon
@ 2015-08-06 19:10             ` Laurent Pinchart
  0 siblings, 0 replies; 21+ messages in thread
From: Laurent Pinchart @ 2015-08-06 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On Wednesday 05 August 2015 17:24:52 Will Deacon wrote:
> On Tue, Aug 04, 2015 at 09:54:27PM +0100, Laurent Pinchart wrote:
> > On Tuesday 04 August 2015 15:56:42 Russell King - ARM Linux wrote:
> > > On Tue, Aug 04, 2015 at 03:47:13PM +0100, Robin Murphy wrote:
> > > > On 04/08/15 14:16, Laurent Pinchart wrote:
> > > >> This is what I believe to be an API abuse. The
> > > >> dma_sync_single_for_device()
> > > >> API is meant to pass ownership of a buffer to the device. Unless I'm
> > > >> mistaken, once that's done the CPU isn't allowed to touch the buffer
> > > >> anymore until dma_sync_single_for_cpu() is called to get ownership of
> > > >> the buffer back.
> > > 
> > > That's what I thought up until recently, but it's not strictly true -
> > > see Documentation/DMA-API.txt which Robin quoted.
> > 
> > I find the documentation slightly unclear on this topic. I have nothing
> > against updating it to state that the sync functions ensure visibility of
> > the data to the device or CPU instead of transferring ownership if that's
> > the general understanding of how the API works (or should work). This
> > would of course need to be carefully reviewed to ensure that the current
> > implementation really works that way across all architectures, or at
> > least that it can be made to work that way.
> 
> I was hoping to send this lot off to Joerg tomorrow morning, since it's
> removing a hack from the io-pgtable users before it has a chance to spread
> further. This use of the API is certainly ok on arm and arm64 (where
> these drivers currently run) and I don't see why it wouldn't work for
> any IOMMU that treats the page tables as read-only. Once we start getting
> into hardware access/dirty bits, then the streaming-DMA API is a lost
> cause but I would expect those systems to be using SVM with the CPU page
> tables anyway and therefore be cache-coherent.
> 
> I agree that the Documentation isn't brilliant, but that could be addressed
> in a separate patch (especially since I don't think it would be merged
> via the iommu tree).
> 
> Is that alright?

Yes, that's fine with me. The patch set moves usage of the DMA mapping API 
from drivers to a single place in the core, so no problem is introduced. If we 
later consider this to be an API abuse we'll only need to fix it in a single 
place. And if it's a valid use of the API then the documentation 
fix/improvement doesn't need to be synchronous with this patch set.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-08-06 19:10 UTC | newest]

Thread overview: 21+ messages
2015-08-03 13:25 [PATCH 00/13] iommu/arm-smmu: Updates for 4.3 Will Deacon
2015-08-03 13:25 ` [PATCH 01/13] iommu/arm-smmu: Fix enabling of PRIQ interrupt Will Deacon
2015-08-03 13:25 ` [PATCH 02/13] iommu/arm-smmu: Fix MSI memory attributes to match specification Will Deacon
2015-08-03 13:25 ` [PATCH 03/13] iommu/arm-smmu: Limit 2-level strtab allocation for small SID sizes Will Deacon
2015-08-03 13:25 ` [PATCH 04/13] iommu/arm-smmu: Sort out coherency Will Deacon
2015-08-03 13:25 ` [PATCH 05/13] iommu/io-pgtable-arm: Allow appropriate DMA API use Will Deacon
2015-08-04 13:16   ` Laurent Pinchart
2015-08-04 14:47     ` Robin Murphy
2015-08-04 14:56       ` Russell King - ARM Linux
2015-08-04 20:54         ` Laurent Pinchart
2015-08-05 16:24           ` Will Deacon
2015-08-06 19:10             ` Laurent Pinchart
2015-08-03 13:25 ` [PATCH 06/13] iommu/arm-smmu: Clean up DMA API usage Will Deacon
2015-08-03 13:25 ` [PATCH 07/13] " Will Deacon
2015-08-03 13:25 ` [PATCH 08/13] iommu/ipmmu-vmsa: " Will Deacon
2015-08-03 13:25 ` [PATCH 09/13] iommu/io-pgtable-arm: Centralise sync points Will Deacon
2015-08-03 13:25 ` [PATCH 10/13] iommu/arm-smmu: Remove arm_smmu_flush_pgtable() Will Deacon
2015-08-03 13:25 ` [PATCH 11/13] " Will Deacon
2015-08-03 13:25 ` [PATCH 12/13] iommu/io-pgtable: Remove flush_pgtable callback Will Deacon
2015-08-03 13:25 ` [PATCH 13/13] iommu/arm-smmu: Treat unknown OAS as 48-bit Will Deacon
2015-08-03 18:23   ` Sergei Shtylyov
