linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v1 00/15] drm/msm: Per-instance pagetable support
@ 2019-03-01 19:38 Jordan Crouse
  2019-03-01 19:38 ` [RFC PATCH v1 02/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2 Jordan Crouse
  2019-03-01 19:38 ` [RFC PATCH v1 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
  0 siblings, 2 replies; 5+ messages in thread
From: Jordan Crouse @ 2019-03-01 19:38 UTC (permalink / raw)
  To: freedreno
  Cc: David Airlie, Will Deacon, dri-devel, Bjorn Andersson,
	Jonathan Marek, jean-philippe.brucker, Joerg Roedel, iommu,
	Mamta Shukla, Kees Cook, linux-arm-msm, Sharat Masetty, dianders,
	Daniel Vetter, Sean Paul, linux-arm-kernel, linux-kernel,
	Rob Clark, hoegsberg, Thomas Zimmermann, Robin Murphy, baolu.lu

This is the latest incarnation of per-instance pagetable support for the MSM GPU
driver. Some of these have been seen before, most recently [1].

Per-instance pagetables allow the target GPU driver to create and manage
an individual pagetable for each file descriptor instance and switch
between them asynchronously using the GPU to reprogram the pagetable
registers on the fly.

This is accomplished in this series by taking advantage of the multiple
IOMMU domain API from Lu Baolu [2] and all these patches are based on that
patch. This series is split into three parts:

Part one adds support for split pagetables. These are the same patches from the
previous attempts [1]. Split pagetables allow the hardware to switch out the
lower pagetable (TTBR0) without affecting the global allocations in the upper
one (TTBR1).

Part 2 adds aux domain support for arm-smmu-v2. New aux domains create a new
pagetable but do not touch the underlying hardware.  The target driver uses the
new aux domain to map and unmap memory through the usual mechanisms.

The final part is the support in the GPU driver to enable 64 bit addressing for
a5xx and a6xx, set up the support for split pagetables, create new per-instance
pagetables for a new instance and submit the GPU command to switch the pagetable
at the appropriate time.

This is compile tested but I haven't done much target testing as of yet. I
wanted to get this out in the world for debate while we work on fixing up the
minor issues. In particular, I want to make sure that this fits with the
current thinking about how aux domains should look and feel.

[1] https://patchwork.freedesktop.org/series/43447/
[2] https://patchwork.kernel.org/patch/10825061/


Jordan Crouse (15):
  iommu: Add DOMAIN_ATTR_SPLIT_TABLES
  iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
  iommu/io-pgtable: Allow TLB operations to be optional
  iommu: Add DOMAIN_ATTR_PTBASE
  iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  drm/msm/adreno: Enable 64 bit mode by default on a5xx and a6xx targets
  drm/msm: Print all 64 bits of the faulting IOMMU address
  drm/msm: Pass the MMU domain index in struct msm_file_private
  drm/msm/gpu: Move address space setup to the GPU targets
  drm/msm: Add support for IOMMU auxiliary domains
  drm/msm: Add a helper function for a per-instance address space
  drm/msm: Add support to create target specific address spaces
  drm/msm/gpu: Add ttbr0 to the memptrs
  drm/msm/a6xx: Support per-instance pagetables
  drm/msm/a5xx: Support per-instance pagetables

 drivers/gpu/drm/msm/adreno/a2xx_gpu.c     |  37 ++--
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c     |  50 ++++--
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c     |  51 ++++--
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 163 +++++++++++++++++-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     |  19 ++
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c |  70 ++++++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c     | 167 +++++++++++++++++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |   1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |   7 -
 drivers/gpu/drm/msm/msm_drv.c             |  25 ++-
 drivers/gpu/drm/msm/msm_drv.h             |   5 +
 drivers/gpu/drm/msm/msm_gem.h             |   2 +
 drivers/gpu/drm/msm/msm_gem_submit.c      |  13 +-
 drivers/gpu/drm/msm/msm_gem_vma.c         |  53 +++---
 drivers/gpu/drm/msm/msm_gpu.c             |  59 +------
 drivers/gpu/drm/msm/msm_gpu.h             |   3 +
 drivers/gpu/drm/msm/msm_iommu.c           |  99 ++++++++++-
 drivers/gpu/drm/msm/msm_mmu.h             |   4 +
 drivers/gpu/drm/msm/msm_ringbuffer.h      |   1 +
 drivers/iommu/arm-smmu-regs.h             |  18 ++
 drivers/iommu/arm-smmu.c                  | 278 ++++++++++++++++++++++++++----
 drivers/iommu/io-pgtable-arm.c            |   3 +-
 drivers/iommu/io-pgtable.h                |  10 +-
 include/linux/iommu.h                     |   2 +
 24 files changed, 952 insertions(+), 188 deletions(-)

-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC PATCH v1 02/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
  2019-03-01 19:38 [RFC PATCH v1 00/15] drm/msm: Per-instance pagetable support Jordan Crouse
@ 2019-03-01 19:38 ` Jordan Crouse
  2019-03-01 20:25   ` [Freedreno] " Rob Clark
  2019-03-01 19:38 ` [RFC PATCH v1 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
  1 sibling, 1 reply; 5+ messages in thread
From: Jordan Crouse @ 2019-03-01 19:38 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, dianders, hoegsberg, Robin Murphy,
	linux-arm-kernel, baolu.lu

Add support for a split pagetable (TTBR0/TTBR1) scheme for
arm-smmu-v2. If split pagetables are enabled, create a
pagetable for TTBR1 and set up the sign extension bit so
that all IOVAs with that bit set are mapped and translated
from the TTBR1 pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/iommu/arm-smmu-regs.h  |  18 +++++
 drivers/iommu/arm-smmu.c       | 149 +++++++++++++++++++++++++++++++++++++----
 drivers/iommu/io-pgtable-arm.c |   3 +-
 3 files changed, 154 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4..56f9709 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -193,7 +193,25 @@ enum arm_smmu_s2cr_privcfg {
 #define RESUME_RETRY			(0 << 0)
 #define RESUME_TERMINATE		(1 << 0)
 
+#define TTBCR_EPD1			(1 << 23)
+#define TTBCR_T1SZ_SHIFT		16
+#define TTBCR_IRGN1_SHIFT		24
+#define TTBCR_ORGN1_SHIFT		26
+#define TTBCR_RGN_WBWA			1
+#define TTBCR_SH1_SHIFT			28
+#define TTBCR_SH_IS			3
+
+#define TTBCR_TG1_16K			(1 << 30)
+#define TTBCR_TG1_4K			(2 << 30)
+#define TTBCR_TG1_64K			(3 << 30)
+
 #define TTBCR2_SEP_SHIFT		15
+#define TTBCR2_SEP_31			(0x0 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_35			(0x1 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_39			(0x2 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_41			(0x3 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_43			(0x4 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_47			(0x5 << TTBCR2_SEP_SHIFT)
 #define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
 #define TTBCR2_AS			(1 << 4)
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index af18a7e..05eb126 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -151,6 +151,7 @@ struct arm_smmu_cb {
 	u32				tcr[2];
 	u32				mair[2];
 	struct arm_smmu_cfg		*cfg;
+	u64				split_table_mask;
 };
 
 struct arm_smmu_master_cfg {
@@ -208,6 +209,7 @@ struct arm_smmu_device {
 	unsigned long			va_size;
 	unsigned long			ipa_size;
 	unsigned long			pa_size;
+	unsigned long			ubs_size;
 	unsigned long			pgsize_bitmap;
 
 	u32				num_global_irqs;
@@ -252,13 +254,14 @@ enum arm_smmu_domain_stage {
 
 struct arm_smmu_domain {
 	struct arm_smmu_device		*smmu;
-	struct io_pgtable_ops		*pgtbl_ops;
+	struct io_pgtable_ops		*pgtbl_ops[2];
 	const struct iommu_gather_ops	*tlb_ops;
 	struct arm_smmu_cfg		cfg;
 	enum arm_smmu_domain_stage	stage;
 	bool				non_strict;
 	struct mutex			init_mutex; /* Protects smmu pointer */
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
+	u32 attributes;
 	struct iommu_domain		domain;
 };
 
@@ -618,6 +621,69 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	return IRQ_HANDLED;
 }
 
+static void arm_smmu_init_ttbr1(struct arm_smmu_domain *smmu_domain,
+		struct io_pgtable_cfg *pgtbl_cfg)
+{
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+	int pgsize = 1 << __ffs(pgtbl_cfg->pgsize_bitmap);
+
+	/* Enable speculative walks through the TTBR1 */
+	cb->tcr[0] &= ~TTBCR_EPD1;
+
+	cb->tcr[0] |= TTBCR_SH_IS << TTBCR_SH1_SHIFT;
+	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_IRGN1_SHIFT;
+	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_ORGN1_SHIFT;
+
+	switch (pgsize) {
+	case SZ_4K:
+		cb->tcr[0] |= TTBCR_TG1_4K;
+		break;
+	case SZ_16K:
+		cb->tcr[0] |= TTBCR_TG1_16K;
+		break;
+	case SZ_64K:
+		cb->tcr[0] |= TTBCR_TG1_64K;
+		break;
+	}
+
+	cb->tcr[0] |= (64ULL - smmu->va_size) << TTBCR_T1SZ_SHIFT;
+
+	/* Clear the existing SEP configuration */
+	cb->tcr[1] &= ~TTBCR2_SEP_UPSTREAM;
+
+	/* Set up the sign extend bit */
+	switch (smmu->va_size) {
+	case 32:
+		cb->tcr[1] |= TTBCR2_SEP_31;
+		cb->split_table_mask = (1ULL << 31);
+		break;
+	case 36:
+		cb->tcr[1] |= TTBCR2_SEP_35;
+		cb->split_table_mask = (1ULL << 35);
+		break;
+	case 40:
+		cb->tcr[1] |= TTBCR2_SEP_39;
+		cb->split_table_mask = (1ULL << 39);
+		break;
+	case 42:
+		cb->tcr[1] |= TTBCR2_SEP_41;
+		cb->split_table_mask = (1ULL << 41);
+		break;
+	case 44:
+		cb->tcr[1] |= TTBCR2_SEP_43;
+		cb->split_table_mask = (1ULL << 43);
+		break;
+	case 48:
+		cb->tcr[1] |= TTBCR2_SEP_UPSTREAM;
+		cb->split_table_mask = (1ULL << 48);
+	}
+
+	cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+	cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
+}
+
 static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 				       struct io_pgtable_cfg *pgtbl_cfg)
 {
@@ -650,8 +716,12 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 		} else {
 			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
 			cb->ttbr[0] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
-			cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
-			cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
+
+			/*
+			 * Set TTBR1 to empty by default - it will get
+			 * programmed later if it is enabled
+			 */
+			cb->ttbr[1] = (u64)cfg->asid << TTBRn_ASID_SHIFT;
 		}
 	} else {
 		cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
@@ -760,11 +830,13 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 {
 	int irq, start, ret = 0;
 	unsigned long ias, oas;
-	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_ops *pgtbl_ops[2];
 	struct io_pgtable_cfg pgtbl_cfg;
 	enum io_pgtable_fmt fmt;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	bool split_tables =
+		(smmu_domain->attributes & (1 << DOMAIN_ATTR_SPLIT_TABLES));
 
 	mutex_lock(&smmu_domain->init_mutex);
 	if (smmu_domain->smmu)
@@ -794,8 +866,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	 *
 	 * Note that you can't actually request stage-2 mappings.
 	 */
-	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
+	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) {
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
+		/* FIXME: fail instead? */
+		split_tables = false;
+	}
 	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
@@ -812,8 +887,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) &&
 	    !IS_ENABLED(CONFIG_64BIT) && !IS_ENABLED(CONFIG_ARM_LPAE) &&
 	    (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S) &&
-	    (smmu_domain->stage == ARM_SMMU_DOMAIN_S1))
+	    (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)) {
+		/* FIXME: fail instead? */
+		split_tables = false;
 		cfg->fmt = ARM_SMMU_CTX_FMT_AARCH32_S;
+	}
 	if ((IS_ENABLED(CONFIG_64BIT) || cfg->fmt == ARM_SMMU_CTX_FMT_NONE) &&
 	    (smmu->features & (ARM_SMMU_FEAT_FMT_AARCH64_64K |
 			       ARM_SMMU_FEAT_FMT_AARCH64_16K |
@@ -903,8 +981,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
 	smmu_domain->smmu = smmu;
-	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
-	if (!pgtbl_ops) {
+	pgtbl_ops[0] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
+	if (!pgtbl_ops[0]) {
 		ret = -ENOMEM;
 		goto out_clear_smmu;
 	}
@@ -916,6 +994,22 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 
 	/* Initialise the context bank with our page table cfg */
 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
+
+	pgtbl_ops[1] = NULL;
+
+	if (split_tables) {
+		/* FIXME: I think it is safe to reuse pgtbl_cfg here */
+		pgtbl_ops[1] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg,
+			smmu_domain);
+		if (!pgtbl_ops[1]) {
+			free_io_pgtable_ops(pgtbl_ops[0]);
+			ret = -ENOMEM;
+			goto out_clear_smmu;
+		}
+
+		arm_smmu_init_ttbr1(smmu_domain, &pgtbl_cfg);
+	}
+
 	arm_smmu_write_context_bank(smmu, cfg->cbndx);
 
 	/*
@@ -934,7 +1028,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	mutex_unlock(&smmu_domain->init_mutex);
 
 	/* Publish page table ops for map/unmap */
-	smmu_domain->pgtbl_ops = pgtbl_ops;
+	smmu_domain->pgtbl_ops[0] = pgtbl_ops[0];
+	smmu_domain->pgtbl_ops[1] = pgtbl_ops[1];
+
 	return 0;
 
 out_clear_smmu:
@@ -970,7 +1066,9 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 		devm_free_irq(smmu->dev, irq, domain);
 	}
 
-	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
+	free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
+	free_io_pgtable_ops(smmu_domain->pgtbl_ops[1]);
+
 	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
 
 	arm_smmu_rpm_put(smmu);
@@ -1285,10 +1383,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	return ret;
 }
 
+static struct io_pgtable_ops *
+arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+	if (iova & cb->split_table_mask)
+		return smmu_domain->pgtbl_ops[1];
+
+	return smmu_domain->pgtbl_ops[0];
+}
+
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 			phys_addr_t paddr, size_t size, int prot)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
 	int ret;
 
@@ -1305,7 +1416,7 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 			     size_t size)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
 	size_t ret;
 
@@ -1349,7 +1460,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct device *dev = smmu->dev;
 	void __iomem *cb_base;
 	u32 tmp;
@@ -1397,7 +1508,7 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 					dma_addr_t iova)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 
 	if (domain->type == IOMMU_DOMAIN_IDENTITY)
 		return iova;
@@ -1584,6 +1695,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 		case DOMAIN_ATTR_NESTING:
 			*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 			return 0;
+		case DOMAIN_ATTR_SPLIT_TABLES:
+			*((int *)data) =
+				!!(smmu_domain->attributes &
+				   (1 << DOMAIN_ATTR_SPLIT_TABLES));
+			return 0;
 		default:
 			return -ENODEV;
 		}
@@ -1624,6 +1740,11 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			else
 				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 			break;
+		case DOMAIN_ATTR_SPLIT_TABLES:
+			if (*((int *)data))
+				smmu_domain->attributes |=
+					(1 << DOMAIN_ATTR_SPLIT_TABLES);
+			break;
 		default:
 			ret = -ENODEV;
 		}
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 237cacd..dc9fb2e 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -475,8 +475,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
 		return 0;
 
-	if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
-		    paddr >= (1ULL << data->iop.cfg.oas)))
+	if (WARN_ON(paddr >= (1ULL << data->iop.cfg.oas)))
 		return -ERANGE;
 
 	prot = arm_lpae_prot_to_pte(data, iommu_prot);
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH v1 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  2019-03-01 19:38 [RFC PATCH v1 00/15] drm/msm: Per-instance pagetable support Jordan Crouse
  2019-03-01 19:38 ` [RFC PATCH v1 02/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2 Jordan Crouse
@ 2019-03-01 19:38 ` Jordan Crouse
  2019-03-04 12:19   ` Jean-Philippe Brucker
  1 sibling, 1 reply; 5+ messages in thread
From: Jordan Crouse @ 2019-03-01 19:38 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, dianders, hoegsberg, Robin Murphy,
	linux-arm-kernel, baolu.lu

Support the new auxiliary domain API for arm-smmuv2 to initialize and
support multiple pagetables for a SMMU device. Since the smmu-v2 hardware
doesn't have any built in support for switching the pagetable base it is
left as an exercise to the caller to actually use the pagetable; aux
domains in the IOMMU driver are only preoccupied with creating and managing
the pagetable memory.

Following is a pseudo code example of how a domain can be created

 /* Check to see if aux domains are supported */
 if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
	 iommu = iommu_domain_alloc(...);

	 if (iommu_aux_attach_device(domain, dev))
		 return FAIL;

	/* Save the base address of the pagetable for use by the driver
	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
 }

After this 'domain' can be used like any other iommu domain to map and
unmap iova addresses in the pagetable. The driver/hardware can be used
to switch the pagetable according to its own specific implementation.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/iommu/arm-smmu.c | 135 ++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 111 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 05eb126..b7b508e 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -263,6 +263,8 @@ struct arm_smmu_domain {
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
 	u32 attributes;
 	struct iommu_domain		domain;
+	bool				is_aux;
+	u64				ttbr0;
 };
 
 struct arm_smmu_option_prop {
@@ -874,6 +876,12 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
+	/* Aux domains can only be created for stage-1 tables */
+	if (smmu_domain->is_aux && smmu_domain->stage != ARM_SMMU_DOMAIN_S1) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
 	/*
 	 * Choosing a suitable context format is even more fiddly. Until we
 	 * grow some way for the caller to express a preference, and/or move
@@ -920,7 +928,10 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 			ias = min(ias, 32UL);
 			oas = min(oas, 32UL);
 		}
-		smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
+
+		/* aux domains shouldn't touch hardware so no TLB ops */
+		if (!smmu_domain->is_aux)
+			smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
 		break;
 	case ARM_SMMU_DOMAIN_NESTED:
 		/*
@@ -939,32 +950,42 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 			ias = min(ias, 40UL);
 			oas = min(oas, 40UL);
 		}
-		if (smmu->version == ARM_SMMU_V2)
-			smmu_domain->tlb_ops = &arm_smmu_s2_tlb_ops_v2;
-		else
-			smmu_domain->tlb_ops = &arm_smmu_s2_tlb_ops_v1;
+
+		if (!smmu_domain->is_aux) {
+			if (smmu->version == ARM_SMMU_V2)
+				smmu_domain->tlb_ops = &arm_smmu_s2_tlb_ops_v2;
+			else
+				smmu_domain->tlb_ops = &arm_smmu_s2_tlb_ops_v1;
+		}
 		break;
 	default:
 		ret = -EINVAL;
 		goto out_unlock;
 	}
-	ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
-				      smmu->num_context_banks);
-	if (ret < 0)
-		goto out_unlock;
 
-	cfg->cbndx = ret;
-	if (smmu->version < ARM_SMMU_V2) {
-		cfg->irptndx = atomic_inc_return(&smmu->irptndx);
-		cfg->irptndx %= smmu->num_context_irqs;
-	} else {
-		cfg->irptndx = cfg->cbndx;
-	}
+	/*
+	 * Aux domains will use the same context bank assigned to the master
+	 * domain for the device
+	 */
+	if (!smmu_domain->is_aux) {
+		ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
+					      smmu->num_context_banks);
+		if (ret < 0)
+			goto out_unlock;
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2)
-		cfg->vmid = cfg->cbndx + 1 + smmu->cavium_id_base;
-	else
-		cfg->asid = cfg->cbndx + smmu->cavium_id_base;
+		cfg->cbndx = ret;
+		if (smmu->version < ARM_SMMU_V2) {
+			cfg->irptndx = atomic_inc_return(&smmu->irptndx);
+			cfg->irptndx %= smmu->num_context_irqs;
+		} else {
+			cfg->irptndx = cfg->cbndx;
+		}
+
+		if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2)
+			cfg->vmid = cfg->cbndx + 1 + smmu->cavium_id_base;
+		else
+			cfg->asid = cfg->cbndx + smmu->cavium_id_base;
+	}
 
 	pgtbl_cfg = (struct io_pgtable_cfg) {
 		.pgsize_bitmap	= smmu->pgsize_bitmap,
@@ -987,16 +1008,26 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		goto out_clear_smmu;
 	}
 
+	/* Cache the TTBR0 for the aux domain */
+	smmu_domain->ttbr0 = pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0];
+
 	/* Update the domain's page sizes to reflect the page table format */
 	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
 	domain->geometry.aperture_end = (1UL << ias) - 1;
 	domain->geometry.force_aperture = true;
 
+	pgtbl_ops[1] = NULL;
+
+	/*
+	 * aux domains don't use split tables or program the hardware so we're
+	 * done setting it up
+	 */
+	if (smmu_domain->is_aux)
+		goto end;
+
 	/* Initialise the context bank with our page table cfg */
 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
 
-	pgtbl_ops[1] = NULL;
-
 	if (split_tables) {
 		/* FIXME: I think it is safe to reuse pgtbl_cfg here */
 		pgtbl_ops[1] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg,
@@ -1018,13 +1049,15 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	 */
 	irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
 	ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
-			       IRQF_SHARED, "arm-smmu-context-fault", domain);
+			       IRQF_SHARED, "arm-smmu-context-fault",
+			       domain);
 	if (ret < 0) {
 		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
 			cfg->irptndx, irq);
 		cfg->irptndx = INVALID_IRPTNDX;
 	}
 
+end:
 	mutex_unlock(&smmu_domain->init_mutex);
 
 	/* Publish page table ops for map/unmap */
@@ -1050,6 +1083,12 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
 		return;
 
+	/* All we need to do for aux devices is destroy the pagetable */
+	if (smmu_domain->is_aux) {
+		free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
+		return;
+	}
+
 	ret = arm_smmu_rpm_get(smmu);
 	if (ret < 0)
 		return;
@@ -1330,6 +1369,39 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
+static bool arm_smmu_dev_has_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	/*
+	 * FIXME: Should we do some hardware checking here, like to be sure this
+	 * is a stage 1 and such?
+	 */
+
+	/* Always allow aux domains */
+	if (feat == IOMMU_DEV_FEAT_AUX)
+		return true;
+
+	return false;
+}
+
+/* FIXME: Add stubs for dev_enable_feat and dev_disable_feat? */
+
+/* Set up a new aux domain and create a new pagetable with the same
+ * characteristics as the master
+ */
+static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
+		struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	smmu_domain->is_aux = true;
+
+	/* No power is needed because aux domain doesn't touch the hardware */
+	return arm_smmu_init_domain_context(domain, smmu);
+}
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
@@ -1342,6 +1414,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		return -ENXIO;
 	}
 
+	/* FIXME: Reject unmanged domains since those should be aux? */
+
 	/*
 	 * FIXME: The arch/arm DMA API code tries to attach devices to its own
 	 * domains between of_xlate() and add_device() - we have no way to cope
@@ -1388,7 +1462,13 @@ arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+	struct arm_smmu_cb *cb;
+
+	/* quick escape for domains that don't have split pagetables enabled */
+	if (!smmu_domain->pgtbl_ops[1])
+		return smmu_domain->pgtbl_ops[0];
+
+	cb = &smmu_domain->smmu->cbs[cfg->cbndx];
 
 	if (iova & cb->split_table_mask)
 		return smmu_domain->pgtbl_ops[1];
@@ -1700,6 +1780,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 				!!(smmu_domain->attributes &
 				   (1 << DOMAIN_ATTR_SPLIT_TABLES));
 			return 0;
+		case DOMAIN_ATTR_PTBASE:
+			if (!smmu_domain->is_aux)
+				return -ENODEV;
+			*((u64 *)data) = smmu_domain->ttbr0;
+			return 0;
 		default:
 			return -ENODEV;
 		}
@@ -1810,7 +1895,9 @@ static struct iommu_ops arm_smmu_ops = {
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
 	.domain_free		= arm_smmu_domain_free,
+	.dev_has_feat		= arm_smmu_dev_has_feat,
 	.attach_dev		= arm_smmu_attach_dev,
+	.aux_attach_dev		= arm_smmu_aux_attach_dev,
 	.map			= arm_smmu_map,
 	.unmap			= arm_smmu_unmap,
 	.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Freedreno] [RFC PATCH v1 02/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
  2019-03-01 19:38 ` [RFC PATCH v1 02/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2 Jordan Crouse
@ 2019-03-01 20:25   ` Rob Clark
  0 siblings, 0 replies; 5+ messages in thread
From: Rob Clark @ 2019-03-01 20:25 UTC (permalink / raw)
  To: Jordan Crouse
  Cc: Jean-Philippe Brucker, linux-arm-msm, Joerg Roedel, Will Deacon,
	Linux Kernel Mailing List,
	list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>, ,
	dianders, Kristian H. Kristensen, baolu.lu, freedreno,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Robin Murphy

On Fri, Mar 1, 2019 at 2:38 PM Jordan Crouse <jcrouse@codeaurora.org> wrote:
>
> Add support for a split pagetable (TTBR0/TTBR1) scheme for
> arm-smmu-v2. If split pagetables are enabled, create a
> pagetable for TTBR1 and set up the sign extension bit so
> that all IOVAs with that bit set are mapped and translated
> from the TTBR1 pagetable.
>
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
>
>  drivers/iommu/arm-smmu-regs.h  |  18 +++++
>  drivers/iommu/arm-smmu.c       | 149 +++++++++++++++++++++++++++++++++++++----
>  drivers/iommu/io-pgtable-arm.c |   3 +-
>  3 files changed, 154 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
> index a1226e4..56f9709 100644
> --- a/drivers/iommu/arm-smmu-regs.h
> +++ b/drivers/iommu/arm-smmu-regs.h
> @@ -193,7 +193,25 @@ enum arm_smmu_s2cr_privcfg {
>  #define RESUME_RETRY                   (0 << 0)
>  #define RESUME_TERMINATE               (1 << 0)
>
> +#define TTBCR_EPD1                     (1 << 23)
> +#define TTBCR_T1SZ_SHIFT               16
> +#define TTBCR_IRGN1_SHIFT              24
> +#define TTBCR_ORGN1_SHIFT              26
> +#define TTBCR_RGN_WBWA                 1
> +#define TTBCR_SH1_SHIFT                        28
> +#define TTBCR_SH_IS                    3
> +
> +#define TTBCR_TG1_16K                  (1 << 30)
> +#define TTBCR_TG1_4K                   (2 << 30)
> +#define TTBCR_TG1_64K                  (3 << 30)
> +
>  #define TTBCR2_SEP_SHIFT               15
> +#define TTBCR2_SEP_31                  (0x0 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_35                  (0x1 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_39                  (0x2 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_41                  (0x3 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_43                  (0x4 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_47                  (0x5 << TTBCR2_SEP_SHIFT)
>  #define TTBCR2_SEP_UPSTREAM            (0x7 << TTBCR2_SEP_SHIFT)
>  #define TTBCR2_AS                      (1 << 4)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index af18a7e..05eb126 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -151,6 +151,7 @@ struct arm_smmu_cb {
>         u32                             tcr[2];
>         u32                             mair[2];
>         struct arm_smmu_cfg             *cfg;
> +       u64                             split_table_mask;
>  };
>
>  struct arm_smmu_master_cfg {
> @@ -208,6 +209,7 @@ struct arm_smmu_device {
>         unsigned long                   va_size;
>         unsigned long                   ipa_size;
>         unsigned long                   pa_size;
> +       unsigned long                   ubs_size;
>         unsigned long                   pgsize_bitmap;
>
>         u32                             num_global_irqs;
> @@ -252,13 +254,14 @@ enum arm_smmu_domain_stage {
>
>  struct arm_smmu_domain {
>         struct arm_smmu_device          *smmu;
> -       struct io_pgtable_ops           *pgtbl_ops;
> +       struct io_pgtable_ops           *pgtbl_ops[2];
>         const struct iommu_gather_ops   *tlb_ops;
>         struct arm_smmu_cfg             cfg;
>         enum arm_smmu_domain_stage      stage;
>         bool                            non_strict;
>         struct mutex                    init_mutex; /* Protects smmu pointer */
>         spinlock_t                      cb_lock; /* Serialises ATS1* ops and TLB syncs */
> +       u32 attributes;
>         struct iommu_domain             domain;
>  };
>
> @@ -618,6 +621,69 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>         return IRQ_HANDLED;
>  }
>
> +static void arm_smmu_init_ttbr1(struct arm_smmu_domain *smmu_domain,
> +               struct io_pgtable_cfg *pgtbl_cfg)
> +{
> +       struct arm_smmu_device *smmu = smmu_domain->smmu;
> +       struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +       struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +       int pgsize = 1 << __ffs(pgtbl_cfg->pgsize_bitmap);
> +
> +       /* Enable speculative walks through the TTBR1 */
> +       cb->tcr[0] &= ~TTBCR_EPD1;
> +
> +       cb->tcr[0] |= TTBCR_SH_IS << TTBCR_SH1_SHIFT;
> +       cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_IRGN1_SHIFT;
> +       cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_ORGN1_SHIFT;
> +
> +       switch (pgsize) {
> +       case SZ_4K:
> +               cb->tcr[0] |= TTBCR_TG1_4K;
> +               break;
> +       case SZ_16K:
> +               cb->tcr[0] |= TTBCR_TG1_16K;
> +               break;
> +       case SZ_64K:
> +               cb->tcr[0] |= TTBCR_TG1_64K;
> +               break;
> +       }
> +
> +       cb->tcr[0] |= (64ULL - smmu->va_size) << TTBCR_T1SZ_SHIFT;
> +
> +       /* Clear the existing SEP configuration */
> +       cb->tcr[1] &= ~TTBCR2_SEP_UPSTREAM;
> +
> +       /* Set up the sign extend bit */
> +       switch (smmu->va_size) {
> +       case 32:
> +               cb->tcr[1] |= TTBCR2_SEP_31;
> +               cb->split_table_mask = (1ULL << 31);
> +               break;
> +       case 36:
> +               cb->tcr[1] |= TTBCR2_SEP_35;
> +               cb->split_table_mask = (1ULL << 35);
> +               break;
> +       case 40:
> +               cb->tcr[1] |= TTBCR2_SEP_39;
> +               cb->split_table_mask = (1ULL << 39);
> +               break;
> +       case 42:
> +               cb->tcr[1] |= TTBCR2_SEP_41;
> +               cb->split_table_mask = (1ULL << 41);
> +               break;
> +       case 44:
> +               cb->tcr[1] |= TTBCR2_SEP_43;
> +               cb->split_table_mask = (1ULL << 43);
> +               break;
> +       case 48:
> +               cb->tcr[1] |= TTBCR2_SEP_UPSTREAM;
> +               cb->split_table_mask = (1ULL << 48);
> +       }
> +
> +       cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
> +       cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
> +}
> +
>  static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
>                                        struct io_pgtable_cfg *pgtbl_cfg)
>  {
> @@ -650,8 +716,12 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
>                 } else {
>                         cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
>                         cb->ttbr[0] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
> -                       cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
> -                       cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
> +
> +                       /*
> +                        * Set TTBR1 to empty by default - it will get
> +                        * programmed later if it is enabled
> +                        */
> +                       cb->ttbr[1] = (u64)cfg->asid << TTBRn_ASID_SHIFT;
>                 }
>         } else {
>                 cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
> @@ -760,11 +830,13 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>  {
>         int irq, start, ret = 0;
>         unsigned long ias, oas;
> -       struct io_pgtable_ops *pgtbl_ops;
> +       struct io_pgtable_ops *pgtbl_ops[2];
>         struct io_pgtable_cfg pgtbl_cfg;
>         enum io_pgtable_fmt fmt;
>         struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>         struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +       bool split_tables =
> +               (smmu_domain->attributes & (1 << DOMAIN_ATTR_SPLIT_TABLES));

BIT(DOMAIN_ATTR_SPLIT_TABLES) ?

>
>         mutex_lock(&smmu_domain->init_mutex);
>         if (smmu_domain->smmu)
> @@ -794,8 +866,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>          *
>          * Note that you can't actually request stage-2 mappings.
>          */
> -       if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
> +       if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) {
>                 smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
> +               /* FIXME: fail instead? */
> +               split_tables = false;

yeah, I think we want to return an error somewhere if not supported.
I think we want to fall back to not using per-process pagetables if
this fails.


BR,
-R


> +       }
>         if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
>                 smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>
> @@ -812,8 +887,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>         if (IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) &&
>             !IS_ENABLED(CONFIG_64BIT) && !IS_ENABLED(CONFIG_ARM_LPAE) &&
>             (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S) &&
> -           (smmu_domain->stage == ARM_SMMU_DOMAIN_S1))
> +           (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)) {
> +               /* FIXME: fail instead? */
> +               split_tables = false;
>                 cfg->fmt = ARM_SMMU_CTX_FMT_AARCH32_S;
> +       }
>         if ((IS_ENABLED(CONFIG_64BIT) || cfg->fmt == ARM_SMMU_CTX_FMT_NONE) &&
>             (smmu->features & (ARM_SMMU_FEAT_FMT_AARCH64_64K |
>                                ARM_SMMU_FEAT_FMT_AARCH64_16K |
> @@ -903,8 +981,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>                 pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>
>         smmu_domain->smmu = smmu;
> -       pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> -       if (!pgtbl_ops) {
> +       pgtbl_ops[0] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> +       if (!pgtbl_ops[0]) {
>                 ret = -ENOMEM;
>                 goto out_clear_smmu;
>         }
> @@ -916,6 +994,22 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>
>         /* Initialise the context bank with our page table cfg */
>         arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
> +
> +       pgtbl_ops[1] = NULL;
> +
> +       if (split_tables) {
> +               /* FIXME: I think it is safe to reuse pgtbl_cfg here */
> +               pgtbl_ops[1] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg,
> +                       smmu_domain);
> +               if (!pgtbl_ops[1]) {
> +                       free_io_pgtable_ops(pgtbl_ops[0]);
> +                       ret = -ENOMEM;
> +                       goto out_clear_smmu;
> +               }
> +
> +               arm_smmu_init_ttbr1(smmu_domain, &pgtbl_cfg);
> +       }
> +
>         arm_smmu_write_context_bank(smmu, cfg->cbndx);
>
>         /*
> @@ -934,7 +1028,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>         mutex_unlock(&smmu_domain->init_mutex);
>
>         /* Publish page table ops for map/unmap */
> -       smmu_domain->pgtbl_ops = pgtbl_ops;
> +       smmu_domain->pgtbl_ops[0] = pgtbl_ops[0];
> +       smmu_domain->pgtbl_ops[1] = pgtbl_ops[1];
> +
>         return 0;
>
>  out_clear_smmu:
> @@ -970,7 +1066,9 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
>                 devm_free_irq(smmu->dev, irq, domain);
>         }
>
> -       free_io_pgtable_ops(smmu_domain->pgtbl_ops);
> +       free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
> +       free_io_pgtable_ops(smmu_domain->pgtbl_ops[1]);
> +
>         __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
>
>         arm_smmu_rpm_put(smmu);
> @@ -1285,10 +1383,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>         return ret;
>  }
>
> +static struct io_pgtable_ops *
> +arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
> +{
> +       struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +       struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +       struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +
> +       if (iova & cb->split_table_mask)
> +               return smmu_domain->pgtbl_ops[1];
> +
> +       return smmu_domain->pgtbl_ops[0];
> +}
> +
>  static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>                         phys_addr_t paddr, size_t size, int prot)
>  {
> -       struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
> +       struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>         struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
>         int ret;
>
> @@ -1305,7 +1416,7 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>  static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>                              size_t size)
>  {
> -       struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
> +       struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>         struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
>         size_t ret;
>
> @@ -1349,7 +1460,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
>         struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>         struct arm_smmu_device *smmu = smmu_domain->smmu;
>         struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> -       struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
> +       struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>         struct device *dev = smmu->dev;
>         void __iomem *cb_base;
>         u32 tmp;
> @@ -1397,7 +1508,7 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
>                                         dma_addr_t iova)
>  {
>         struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> -       struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +       struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>
>         if (domain->type == IOMMU_DOMAIN_IDENTITY)
>                 return iova;
> @@ -1584,6 +1695,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>                 case DOMAIN_ATTR_NESTING:
>                         *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
>                         return 0;
> +               case DOMAIN_ATTR_SPLIT_TABLES:
> +                       *((int *)data) =
> +                               !!(smmu_domain->attributes &
> +                                  (1 << DOMAIN_ATTR_SPLIT_TABLES));
> +                       return 0;
>                 default:
>                         return -ENODEV;
>                 }
> @@ -1624,6 +1740,11 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>                         else
>                                 smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>                         break;
> +               case DOMAIN_ATTR_SPLIT_TABLES:
> +                       if (*((int *)data))
> +                               smmu_domain->attributes |=
> +                                       (1 << DOMAIN_ATTR_SPLIT_TABLES);
> +                       break;
>                 default:
>                         ret = -ENODEV;
>                 }
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 237cacd..dc9fb2e 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -475,8 +475,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>         if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
>                 return 0;
>
> -       if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
> -                   paddr >= (1ULL << data->iop.cfg.oas)))
> +       if (WARN_ON(paddr >= (1ULL << data->iop.cfg.oas)))
>                 return -ERANGE;
>
>         prot = arm_lpae_prot_to_pte(data, iommu_prot);
> --
> 2.7.4
>
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH v1 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  2019-03-01 19:38 ` [RFC PATCH v1 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
@ 2019-03-04 12:19   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 5+ messages in thread
From: Jean-Philippe Brucker @ 2019-03-04 12:19 UTC (permalink / raw)
  To: Jordan Crouse, freedreno
  Cc: linux-arm-msm, Will Deacon, linux-kernel, iommu, dianders,
	hoegsberg, Robin Murphy, linux-arm-kernel

Hi Jordan,

On 01/03/2019 19:38, Jordan Crouse wrote:
> Support the new auxiliary domain API for arm-smmuv2 to initialize and
> support multiple pagetables for a SMMU device. Since the smmu-v2 hardware
> doesn't have any built in support for switching the pagetable base it is
> left as an exercise to the caller to actually use the pagetable; aux
> domains in the IOMMU driver are only preoccupied with creating and managing
> the pagetable memory.
> 
> Following is a pseudo code example of how a domain can be created
> 
>  /* Check to see if aux domains are supported */
>  if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> 	 iommu = iommu_domain_alloc(...);
> 
> 	 if (iommu_aux_attach_device(domain, dev))
> 		 return FAIL;
> 
> 	/* Save the base address of the pagetable for use by the driver
> 	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
>  }

> After this 'domain' can be used like any other iommu domain to map and
> unmap iova addresses in the pagetable. The driver/hardware can be used
> to switch the pagetable according to its own specific implementation.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>

[...]
> +static bool arm_smmu_dev_has_feat(struct device *dev,
> +		enum iommu_dev_features feat)
> +{
> +	/*
> +	 * FIXME: Should we do some hardware checking here, like to be sure this
> +	 * is a stage 1 and such?
> +	 */
> +
> +	/* Always allow aux domains */
> +	if (feat == IOMMU_DEV_FEAT_AUX)
> +		return true;

If possible, we should only return true when SMMU and GPU are able to
coordinate and switch contexts. Can the feature be identified through ID
reg or compatible string?

If we plug a PCIe card with PASID behind a SMMUv2 'classic', and its
driver attempts to enable AUXD support, then this should return false.

> +
> +	return false;
> +}
> +
> +/* FIXME: Add stubs for dev_enable_feat and dev_disable_feat? */

Ideally yes. Although SMMUv2 support for aux domains will likely only be
used by the MSM driver, using the same model in all IOMMU drivers would
ease moving things to common code later.

> +
> +/* Set up a new aux domain and create a new pagetable with the same
> + * characteristics as the master
> + */
> +static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
> +		struct device *dev)
> +{
> +	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +
> +	smmu_domain->is_aux = true;

The API allows to attach the same domain to one device using
aux_attach_dev() and another using attach_dev(). For SMMUv3 we'll reject
this, since normal and aux domain are different things (one has PASID
tables, the other doesn't). Is this supported by SMMUv2? Otherwise some
sanity-check here might be necessary

> +
> +	/* No power is needed because aux domain doesn't touch the hardware */
> +	return arm_smmu_init_domain_context(domain, smmu);
> +}
> +
>  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  {
>  	int ret;
> @@ -1342,6 +1414,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  		return -ENXIO;
>  	}
>  
> +	/* FIXME: Reject unmanged domains since those should be aux? */

No, unmanaged domains are also used by VFIO and a couple other drivers
that want to setup IOMMU mappings themselves.

Thanks,
Jean

> +
>  	/*
>  	 * FIXME: The arch/arm DMA API code tries to attach devices to its own
>  	 * domains between of_xlate() and add_device() - we have no way to cope
> @@ -1388,7 +1462,13 @@ arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
>  {
>  	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>  	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> -	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +	struct arm_smmu_cb *cb;
> +
> +	/* quick escape for domains that don't have split pagetables enabled */
> +	if (!smmu_domain->pgtbl_ops[1])
> +		return smmu_domain->pgtbl_ops[0];
> +
> +	cb = &smmu_domain->smmu->cbs[cfg->cbndx];
>  
>  	if (iova & cb->split_table_mask)
>  		return smmu_domain->pgtbl_ops[1];
> @@ -1700,6 +1780,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>  				!!(smmu_domain->attributes &
>  				   (1 << DOMAIN_ATTR_SPLIT_TABLES));
>  			return 0;
> +		case DOMAIN_ATTR_PTBASE:
> +			if (!smmu_domain->is_aux)
> +				return -ENODEV;
> +			*((u64 *)data) = smmu_domain->ttbr0;
> +			return 0;
>  		default:
>  			return -ENODEV;
>  		}
> @@ -1810,7 +1895,9 @@ static struct iommu_ops arm_smmu_ops = {
>  	.capable		= arm_smmu_capable,
>  	.domain_alloc		= arm_smmu_domain_alloc,
>  	.domain_free		= arm_smmu_domain_free,
> +	.dev_has_feat		= arm_smmu_dev_has_feat,
>  	.attach_dev		= arm_smmu_attach_dev,
> +	.aux_attach_dev		= arm_smmu_aux_attach_dev,
>  	.map			= arm_smmu_map,
>  	.unmap			= arm_smmu_unmap,
>  	.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-03-04 12:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-03-01 19:38 [RFC PATCH v1 00/15] drm/msm: Per-instance pagetable support Jordan Crouse
2019-03-01 19:38 ` [RFC PATCH v1 02/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2 Jordan Crouse
2019-03-01 20:25   ` [Freedreno] " Rob Clark
2019-03-01 19:38 ` [RFC PATCH v1 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
2019-03-04 12:19   ` Jean-Philippe Brucker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).