Linux IOMMU Development
* [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2)
@ 2023-07-12 14:14 Vasant Hegde
  2023-07-12 14:14 ` [PATCH 01/21] iommu/amd: Remove unused amd_io_pgtable.pt_root variable Vasant Hegde
                   ` (21 more replies)
  0 siblings, 22 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:14 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

This is part 1 of a two-part series introducing Shared Virtual Address
(SVA) support. It focuses on cleaning up and refactoring the existing
code in preparation for the subsequent series.

It contains the following enhancements:

* Patch 1 - 7:
  Clean up and refactoring. No functional changes.

* Patch 8 - 11:
  Miscellaneous improvements.

* Patch 12 - 15:
  Introduce a new set of AMD IOMMU protection domain flags, which simplifies
  domain management.

* Patch 16 - 21:
  Modify the logic to independently enable PCI capabilities (ATS/PASID/PRI)
  for devices. This allows more flexibility in preparation for SVA and
  IOPF support.

This patch series is based on top of upstream v6.5-rc1 plus the following patch sets:
  - iommu/amd: Add PPR overflow support
    https://lore.kernel.org/linux-iommu/20230628051624.5792-1-vasant.hegde@amd.com/T/#t

  - iommu/amd: Interrupt handling improvements
    https://lore.kernel.org/linux-iommu/20230628053222.5962-1-vasant.hegde@amd.com/T/#t

  - iommu/amd: Consolidate PPR log handling
    https://lore.kernel.org/linux-iommu/20230628054554.6131-1-vasant.hegde@amd.com/

This series is also available on GitHub:
  - https://github.com/AMDESE/linux/tree/iommu_sva_v1_part1_v6.5_rc1

Thank you,
Vasant / Suravee

Suravee Suthikulpanit (10):
  iommu/amd: Remove unused amd_io_pgtable.pt_root variable
  iommu/amd: Consolidate timeout pre-define to amd_iommu_types.h
  iommu/amd: Consolidate logic to allocate protection domain
  iommu/amd: Introduce helper functions for managing GCR3 table
  iommu/amd: Use struct protection_domain in helper functions
  iommu/amd: Miscellaneous clean up when free domain
  iommu/amd: Modify logic for checking GT and PPR features
  iommu/amd: Clean up and enhance protection domain flags
  iommu/amd: Consolidate logic for setting up v2 API mode
  iommu/amd: Refactor domain allocation code for v2DMA mode

Vasant Hegde (11):
  iommu/amd: Refactor protection domain allocation code
  iommu/amd/iommu_v2: Use protection_domain in struct device_state
  iommu/amd: Do not set amd_iommu_pgtable in pass-through mode
  iommu/amd: Use protection_domain.flags to check page table mode
  iommu/amd: Setup v2 domain with max pasids
  iommu/amd: Introduce iommu_dev_data.flags to track device capabilities
  iommu/amd: Rename ats related variables
  iommu/amd: Enable device ATS/PASID/PRI capabilities independently
  iommu/amd: Delay enabling PRI to after initializing domain
  iommu/amd: Initialize iommu_device->max_pasids
  iommu/amd: Simplify amd_iommu_device_info()

 drivers/iommu/amd/amd_iommu.h       |  35 +-
 drivers/iommu/amd/amd_iommu_types.h |  50 ++-
 drivers/iommu/amd/init.c            |  37 +-
 drivers/iommu/amd/io_pgtable_v2.c   |  16 +-
 drivers/iommu/amd/iommu.c           | 661 ++++++++++++++++------------
 drivers/iommu/amd/iommu_v2.c        |  38 +-
 6 files changed, 465 insertions(+), 372 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 01/21] iommu/amd: Remove unused amd_io_pgtable.pt_root variable
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
@ 2023-07-12 14:14 ` Vasant Hegde
  2023-07-12 14:14 ` [PATCH 02/21] iommu/amd: Consolidate timeout pre-define to amd_iommu_types.h Vasant Hegde
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:14 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

The variable has been unused since commit 6eedb59c18a3 ("iommu/amd:
Remove amd_iommu_domain_get_pgtable").

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       | 1 -
 drivers/iommu/amd/amd_iommu_types.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index f23ad6043f04..6bf55de9fbeb 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -108,7 +108,6 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
-	atomic64_set(&domain->iop.pt_root, root);
 	domain->iop.root = (u64 *)(root & PAGE_MASK);
 	domain->iop.mode = root & 7; /* lowest 3 bits encode pgtable mode */
 }
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 7dc30c2b56b3..081d8d0f29d5 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -544,7 +544,6 @@ struct amd_io_pgtable {
 	struct io_pgtable	iop;
 	int			mode;
 	u64			*root;
-	atomic64_t		pt_root;	/* pgtable root and pgtable mode */
 	u64			*pgd;		/* v2 pgtable pgd pointer */
 };
 
-- 
2.31.1



* [PATCH 02/21] iommu/amd: Consolidate timeout pre-define to amd_iommu_types.h
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
  2023-07-12 14:14 ` [PATCH 01/21] iommu/amd: Remove unused amd_io_pgtable.pt_root variable Vasant Hegde
@ 2023-07-12 14:14 ` Vasant Hegde
  2023-07-12 14:14 ` [PATCH 03/21] iommu/amd: Consolidate logic to allocate protection domain Vasant Hegde
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:14 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Move the timeout definitions into amd_iommu_types.h so they can be
included from other files in subsequent patches.
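
As a standalone sketch of the register-polling pattern these timeout
constants bound: poll_status_bit() and the fake read_status callback below
are hypothetical stand-ins for the kernel's readl()/udelay() loop on
MMIO_STATUS_OFFSET, shown only to illustrate what MMIO_STATUS_TIMEOUT limits:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MMIO_STATUS_TIMEOUT 2000000

/* Poll until the masked bit matches want_set, or give up after
 * MMIO_STATUS_TIMEOUT iterations. */
static bool poll_status_bit(uint32_t (*read_status)(void), uint32_t mask,
			    bool want_set)
{
	for (unsigned long i = 0; i < MMIO_STATUS_TIMEOUT; ++i) {
		bool set = (read_status() & mask) != 0;

		if (set == want_set)
			return true;	/* condition reached */
		/* the kernel waits with udelay(10) between reads */
	}
	return false;			/* timed out */
}

/* Fake status register whose GALogRun-style bit is always set. */
static uint32_t fake_status_bit_set(void)
{
	return 0x100;
}
```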

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  4 ++++
 drivers/iommu/amd/init.c            | 10 ++++------
 drivers/iommu/amd/iommu.c           |  2 --
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 081d8d0f29d5..a8d264026f7f 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -451,6 +451,10 @@
 #define PD_IOMMUV2_MASK		BIT(3) /* domain has gcr3 table */
 #define PD_GIOV_MASK		BIT(4) /* domain enable GIOV support */
 
+/* Timeout stuff */
+#define LOOP_TIMEOUT		100000
+#define MMIO_STATUS_TIMEOUT	2000000
+
 extern bool amd_iommu_dump;
 #define DUMP_printk(format, arg...)				\
 	do {							\
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 45efb7e5d725..852e40b13d20 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -83,8 +83,6 @@
 #define ACPI_DEVFLAG_LINT1              0x80
 #define ACPI_DEVFLAG_ATSDIS             0x10000000
 
-#define LOOP_TIMEOUT	2000000
-
 #define IVRS_GET_SBDF_ID(seg, bus, dev, fn)	(((seg & 0xffff) << 16) | ((bus & 0xff) << 8) \
 						 | ((dev & 0x1f) << 3) | (fn & 0x7))
 
@@ -985,14 +983,14 @@ static int iommu_ga_log_enable(struct amd_iommu *iommu)
 	iommu_feature_enable(iommu, CONTROL_GAINT_EN);
 	iommu_feature_enable(iommu, CONTROL_GALOG_EN);
 
-	for (i = 0; i < LOOP_TIMEOUT; ++i) {
+	for (i = 0; i < MMIO_STATUS_TIMEOUT; ++i) {
 		status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
 		if (status & (MMIO_STATUS_GALOG_RUN_MASK))
 			break;
 		udelay(10);
 	}
 
-	if (WARN_ON(i >= LOOP_TIMEOUT))
+	if (WARN_ON(i >= MMIO_STATUS_TIMEOUT))
 		return -EINVAL;
 
 	return 0;
@@ -2900,14 +2898,14 @@ static void enable_iommus_vapic(void)
 		 * Need to set and poll check the GALOGRun bit to zero before
 		 * we can set/ modify GA Log registers safely.
 		 */
-		for (i = 0; i < LOOP_TIMEOUT; ++i) {
+		for (i = 0; i < MMIO_STATUS_TIMEOUT; ++i) {
 			status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
 			if (!(status & MMIO_STATUS_GALOG_RUN_MASK))
 				break;
 			udelay(10);
 		}
 
-		if (WARN_ON(i >= LOOP_TIMEOUT))
+		if (WARN_ON(i >= MMIO_STATUS_TIMEOUT))
 			return;
 	}
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 56b6cf8bf03f..dd586f69a9d8 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -44,8 +44,6 @@
 
 #define CMD_SET_TYPE(cmd, t) ((cmd)->data[1] |= ((t) << 28))
 
-#define LOOP_TIMEOUT	100000
-
 /* IO virtual address start page frame number */
 #define IOVA_START_PFN		(1)
 #define IOVA_PFN(addr)		((addr) >> PAGE_SHIFT)
-- 
2.31.1



* [PATCH 03/21] iommu/amd: Consolidate logic to allocate protection domain
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
  2023-07-12 14:14 ` [PATCH 01/21] iommu/amd: Remove unused amd_io_pgtable.pt_root variable Vasant Hegde
  2023-07-12 14:14 ` [PATCH 02/21] iommu/amd: Consolidate timeout pre-define to amd_iommu_types.h Vasant Hegde
@ 2023-07-12 14:14 ` Vasant Hegde
  2023-07-12 14:14 ` [PATCH 04/21] iommu/amd: Refactor protection domain allocation code Vasant Hegde
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:14 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Move the logic into the common caller function to simplify the code.

No functional changes intended.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index dd586f69a9d8..c2cb541b0553 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2046,12 +2046,6 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 
 	BUG_ON(mode < PAGE_MODE_NONE || mode > PAGE_MODE_6_LEVEL);
 
-	spin_lock_init(&domain->lock);
-	domain->id = domain_id_alloc();
-	if (!domain->id)
-		return -ENOMEM;
-	INIT_LIST_HEAD(&domain->dev_list);
-
 	if (mode != PAGE_MODE_NONE) {
 		pt_root = (void *)get_zeroed_page(GFP_KERNEL);
 		if (!pt_root) {
@@ -2067,12 +2061,6 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 
 static int protection_domain_init_v2(struct protection_domain *domain)
 {
-	spin_lock_init(&domain->lock);
-	domain->id = domain_id_alloc();
-	if (!domain->id)
-		return -ENOMEM;
-	INIT_LIST_HEAD(&domain->dev_list);
-
 	domain->flags |= PD_GIOV_MASK;
 
 	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
@@ -2112,6 +2100,13 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 	if (!domain)
 		return NULL;
 
+	domain->id = domain_id_alloc();
+	if (!domain->id)
+		goto out_err;
+
+	spin_lock_init(&domain->lock);
+	INIT_LIST_HEAD(&domain->dev_list);
+
 	switch (pgtable) {
 	case AMD_IOMMU_V1:
 		ret = protection_domain_init_v1(domain, mode);
-- 
2.31.1



* [PATCH 04/21] iommu/amd: Refactor protection domain allocation code
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (2 preceding siblings ...)
  2023-07-12 14:14 ` [PATCH 03/21] iommu/amd: Consolidate logic to allocate protection domain Vasant Hegde
@ 2023-07-12 14:14 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 05/21] iommu/amd/iommu_v2: Use protection_domain in struct device_state Vasant Hegde
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:14 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

Replace the if-else chain with a switch-case statement to accommodate the
increasing number of domain types.

No functional changes intended.
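
The resulting type-to-page-table mapping can be sketched standalone as
follows. The enum values and the -22 (-EINVAL-style) error code are
illustrative stand-ins, not the kernel's definitions:

```c
#include <assert.h>

enum dom_type { DOM_IDENTITY, DOM_DMA, DOM_DMA_FQ, DOM_UNMANAGED, DOM_NESTED };
enum { PGTABLE_NONE = 0, AMD_IOMMU_V1_ = 1, AMD_IOMMU_V2_ = 2 };

/* Select the io-pgtable type from the iommu_domain type, mirroring the
 * switch structure this patch introduces. */
static int select_pgtable(enum dom_type type, int amd_iommu_pgtable)
{
	switch (type) {
	case DOM_IDENTITY:
		return PGTABLE_NONE;	/* no io-pgtable ops in passthrough */
	case DOM_DMA:
	case DOM_DMA_FQ:
		return amd_iommu_pgtable;	/* global default (v1 or v2) */
	case DOM_UNMANAGED:
		return AMD_IOMMU_V1_;	/* force v1 for unmanaged domains */
	default:
		return -22;		/* unsupported domain type */
	}
}
```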

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 45 +++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c2cb541b0553..287c979239c3 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2078,24 +2078,8 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 	struct io_pgtable_ops *pgtbl_ops;
 	struct protection_domain *domain;
 	int pgtable;
-	int mode = DEFAULT_PGTABLE_LEVEL;
 	int ret;
 
-	/*
-	 * Force IOMMU v1 page table when iommu=pt and
-	 * when allocating domain for pass-through devices.
-	 */
-	if (type == IOMMU_DOMAIN_IDENTITY) {
-		pgtable = AMD_IOMMU_V1;
-		mode = PAGE_MODE_NONE;
-	} else if (type == IOMMU_DOMAIN_UNMANAGED) {
-		pgtable = AMD_IOMMU_V1;
-	} else if (type == IOMMU_DOMAIN_DMA || type == IOMMU_DOMAIN_DMA_FQ) {
-		pgtable = amd_iommu_pgtable;
-	} else {
-		return NULL;
-	}
-
 	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
 	if (!domain)
 		return NULL;
@@ -2106,10 +2090,31 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 
 	spin_lock_init(&domain->lock);
 	INIT_LIST_HEAD(&domain->dev_list);
+	domain->nid = NUMA_NO_NODE;
+
+	switch (type) {
+	/* No need to allocate io pgtable ops in passthrough mode */
+	case IOMMU_DOMAIN_IDENTITY:
+		return domain;
+	case IOMMU_DOMAIN_DMA:
+		fallthrough;
+	case IOMMU_DOMAIN_DMA_FQ:
+		pgtable = amd_iommu_pgtable;
+		break;
+	/*
+	 * Force IOMMU v1 page table when allocating
+	 * domain for pass-through devices.
+	 */
+	case IOMMU_DOMAIN_UNMANAGED:
+		pgtable = AMD_IOMMU_V1;
+		break;
+	default:
+		goto out_err;
+	}
 
 	switch (pgtable) {
 	case AMD_IOMMU_V1:
-		ret = protection_domain_init_v1(domain, mode);
+		ret = protection_domain_init_v1(domain, DEFAULT_PGTABLE_LEVEL);
 		break;
 	case AMD_IOMMU_V2:
 		ret = protection_domain_init_v2(domain);
@@ -2121,12 +2126,6 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 	if (ret)
 		goto out_err;
 
-	/* No need to allocate io pgtable ops in passthrough mode */
-	if (type == IOMMU_DOMAIN_IDENTITY)
-		return domain;
-
-	domain->nid = NUMA_NO_NODE;
-
 	pgtbl_ops = alloc_io_pgtable_ops(pgtable, &domain->iop.pgtbl_cfg, domain);
 	if (!pgtbl_ops) {
 		domain_id_free(domain->id);
-- 
2.31.1



* [PATCH 05/21] iommu/amd/iommu_v2: Use protection_domain in struct device_state
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (3 preceding siblings ...)
  2023-07-12 14:14 ` [PATCH 04/21] iommu/amd: Refactor protection domain allocation code Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 06/21] iommu/amd: Introduce helper functions for managing GCR3 table Vasant Hegde
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

Replace struct iommu_domain with struct protection_domain to simplify the
code, since protection_domain carries the AMD IOMMU-specific information.

No functional changes intended.
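
For readers unfamiliar with the pattern, recovering a protection_domain
from its embedded iommu_domain member (as this patch does via
container_of()) works as in the minimal illustration below. The struct
layouts are stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>

struct iommu_domain_ { int type; };

struct protection_domain_ {
	int id;
	struct iommu_domain_ domain;	/* generic domain embedded inside */
};

/* Recover the embedding struct from a pointer to one of its members by
 * subtracting the member's offset, like the kernel's container_of(). */
#define container_of_(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct protection_domain_ *to_pdomain_(struct iommu_domain_ *dom)
{
	return container_of_(dom, struct protection_domain_, domain);
}

static struct protection_domain_ pd_example = { 7, { 0 } };
```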

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu_v2.c | 38 ++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 261352a23271..e4507a211d6b 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -55,7 +55,7 @@ struct device_state {
 	atomic_t count;
 	struct pci_dev *pdev;
 	struct pasid_state **states;
-	struct iommu_domain *domain;
+	struct protection_domain *pdom;
 	int pasid_levels;
 	int max_pasids;
 	amd_iommu_invalid_ppr_cb inv_ppr_cb;
@@ -112,6 +112,7 @@ static struct device_state *get_device_state(u32 sbdf)
 static void free_device_state(struct device_state *dev_state)
 {
 	struct iommu_group *group;
+	struct iommu_domain *domain = &dev_state->pdom->domain;
 
 	/* Get rid of any remaining pasid states */
 	free_pasid_states(dev_state);
@@ -130,12 +131,12 @@ static void free_device_state(struct device_state *dev_state)
 	if (WARN_ON(!group))
 		return;
 
-	iommu_detach_group(dev_state->domain, group);
+	iommu_detach_group(domain, group);
 
 	iommu_group_put(group);
 
 	/* Everything is down now, free the IOMMUv2 domain */
-	iommu_domain_free(dev_state->domain);
+	iommu_domain_free(domain);
 
 	/* Finally get rid of the device-state */
 	kfree(dev_state);
@@ -269,9 +270,7 @@ static void put_pasid_state_wait(struct pasid_state *pasid_state)
 
 static void unbind_pasid(struct pasid_state *pasid_state)
 {
-	struct iommu_domain *domain;
-
-	domain = pasid_state->device_state->domain;
+	struct device_state *dev_state = pasid_state->device_state;
 
 	/*
 	 * Mark pasid_state as invalid, no more faults will we added to the
@@ -283,7 +282,7 @@ static void unbind_pasid(struct pasid_state *pasid_state)
 	smp_wmb();
 
 	/* After this the device/pasid can't access the mm anymore */
-	amd_iommu_domain_clear_gcr3(domain, pasid_state->pasid);
+	amd_iommu_domain_clear_gcr3(&dev_state->pdom->domain, pasid_state->pasid);
 
 	/* Make sure no more pending faults are in the queue */
 	flush_workqueue(iommu_wq);
@@ -366,10 +365,10 @@ static void mn_invalidate_range(struct mmu_notifier *mn,
 	dev_state   = pasid_state->device_state;
 
 	if ((start ^ (end - 1)) < PAGE_SIZE)
-		amd_iommu_flush_page(dev_state->domain, pasid_state->pasid,
+		amd_iommu_flush_page(&dev_state->pdom->domain, pasid_state->pasid,
 				     start);
 	else
-		amd_iommu_flush_tlb(dev_state->domain, pasid_state->pasid);
+		amd_iommu_flush_tlb(&dev_state->pdom->domain, pasid_state->pasid);
 }
 
 static void mn_release(struct mmu_notifier *mn, struct mm_struct *mm)
@@ -648,7 +647,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, u32 pasid,
 	if (ret)
 		goto out_unregister;
 
-	ret = amd_iommu_domain_set_gcr3(dev_state->domain, pasid,
+	ret = amd_iommu_domain_set_gcr3(&dev_state->pdom->domain, pasid,
 					__pa(pasid_state->mm->pgd));
 	if (ret)
 		goto out_clear_state;
@@ -732,6 +731,7 @@ EXPORT_SYMBOL(amd_iommu_unbind_pasid);
 
 int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 {
+	struct iommu_domain *domain;
 	struct device_state *dev_state;
 	struct iommu_group *group;
 	unsigned long flags;
@@ -776,15 +776,19 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 	if (dev_state->states == NULL)
 		goto out_free_dev_state;
 
-	dev_state->domain = iommu_domain_alloc(&pci_bus_type);
-	if (dev_state->domain == NULL)
+	domain = iommu_domain_alloc(&pci_bus_type);
+	if (domain == NULL)
 		goto out_free_states;
 
+	/* Retrieve new protection_domain that has just been allocated */
+	dev_state->pdom = container_of(domain,
+				       struct protection_domain, domain);
+
 	/* See iommu_is_default_domain() */
-	dev_state->domain->type = IOMMU_DOMAIN_IDENTITY;
-	amd_iommu_domain_direct_map(dev_state->domain);
+	domain->type = IOMMU_DOMAIN_IDENTITY;
+	amd_iommu_domain_direct_map(&dev_state->pdom->domain);
 
-	ret = amd_iommu_domain_enable_v2(dev_state->domain, pasids);
+	ret = amd_iommu_domain_enable_v2(domain, pasids);
 	if (ret)
 		goto out_free_domain;
 
@@ -794,7 +798,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 		goto out_free_domain;
 	}
 
-	ret = iommu_attach_group(dev_state->domain, group);
+	ret = iommu_attach_group(domain, group);
 	if (ret != 0)
 		goto out_drop_group;
 
@@ -818,7 +822,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 	iommu_group_put(group);
 
 out_free_domain:
-	iommu_domain_free(dev_state->domain);
+	iommu_domain_free(domain);
 
 out_free_states:
 	free_page((unsigned long)dev_state->states);
-- 
2.31.1



* [PATCH 06/21] iommu/amd: Introduce helper functions for managing GCR3 table
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (4 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 05/21] iommu/amd/iommu_v2: Use protection_domain in struct device_state Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 07/21] iommu/amd: Use struct protection_domain in helper functions Vasant Hegde
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Refactor domain_enable_v2() into helper functions for managing the GCR3
table (i.e. setup_gcr3_table() and get_gcr3_levels()), which will be used
in subsequent patches. Also rearrange the code and remove the forward
declaration.

No functional changes intended.
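
The level computation factored into get_gcr3_levels() can be exercised
standalone. gcr3_levels() below mirrors the loop in this patch: each GCR3
table level indexes 9 bits of the PASID, so an extra level is needed for
every additional 9 bits beyond what one level covers (the special
pasids == -1 case, which returns amd_iommu_max_glx_val, is omitted here):

```c
#include <assert.h>

/* Number of additional GCR3 table levels required for a given PASID
 * count; 0 means a single 512-entry table suffices. */
static int gcr3_levels(int pasids)
{
	int levels = 0;

	for ( ; (pasids != 0) && ((pasids - 1) & ~0x1ff); pasids >>= 9)
		levels += 1;

	return levels;
}
```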

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 63 ++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 287c979239c3..a278bd9aa9e6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -77,7 +77,6 @@ struct iommu_cmd {
 struct kmem_cache *amd_iommu_irq_cache;
 
 static void detach_device(struct device *dev);
-static int domain_enable_v2(struct protection_domain *domain, int pasids);
 
 /****************************************************************************
  *
@@ -1575,6 +1574,40 @@ static void free_gcr3_table(struct protection_domain *domain)
 	free_page((unsigned long)domain->gcr3_tbl);
 }
 
+static int get_gcr3_levels(int pasids)
+{
+	int levels = 0;
+
+	if (pasids == -1)
+		return amd_iommu_max_glx_val;
+
+	/* Number of GCR3 table levels required */
+	for ( ; (pasids != 0) && ((pasids - 1) & ~0x1ff); pasids >>= 9)
+		levels += 1;
+
+	return levels;
+}
+
+/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
+static int setup_gcr3_table(struct protection_domain *domain, int pasids)
+{
+	int levels = get_gcr3_levels(pasids);
+
+	if (levels > amd_iommu_max_glx_val)
+		return -EINVAL;
+
+	domain->gcr3_tbl = (void *)get_zeroed_page(GFP_ATOMIC);
+	if (domain->gcr3_tbl == NULL)
+		return -ENOMEM;
+
+	domain->glx      = levels;
+	domain->flags   |= PD_IOMMUV2_MASK;
+
+	amd_iommu_domain_update(domain);
+
+	return 0;
+}
+
 static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 			  struct protection_domain *domain, bool ats, bool ppr)
 {
@@ -2065,7 +2098,7 @@ static int protection_domain_init_v2(struct protection_domain *domain)
 
 	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
 
-	if (domain_enable_v2(domain, 1)) {
+	if (setup_gcr3_table(domain, 1)) {
 		domain_id_free(domain->id);
 		return -ENOMEM;
 	}
@@ -2514,30 +2547,6 @@ void amd_iommu_domain_direct_map(struct iommu_domain *dom)
 }
 EXPORT_SYMBOL(amd_iommu_domain_direct_map);
 
-/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
-static int domain_enable_v2(struct protection_domain *domain, int pasids)
-{
-	int levels;
-
-	/* Number of GCR3 table levels required */
-	for (levels = 0; (pasids - 1) & ~0x1ff; pasids >>= 9)
-		levels += 1;
-
-	if (levels > amd_iommu_max_glx_val)
-		return -EINVAL;
-
-	domain->gcr3_tbl = (void *)get_zeroed_page(GFP_ATOMIC);
-	if (domain->gcr3_tbl == NULL)
-		return -ENOMEM;
-
-	domain->glx      = levels;
-	domain->flags   |= PD_IOMMUV2_MASK;
-
-	amd_iommu_domain_update(domain);
-
-	return 0;
-}
-
 int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
 {
 	struct protection_domain *pdom = to_pdomain(dom);
@@ -2556,7 +2565,7 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
 		goto out;
 
 	if (!pdom->gcr3_tbl)
-		ret = domain_enable_v2(pdom, pasids);
+		ret = setup_gcr3_table(pdom, pasids);
 
 out:
 	spin_unlock_irqrestore(&pdom->lock, flags);
-- 
2.31.1



* [PATCH 07/21] iommu/amd: Use struct protection_domain in helper functions
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (5 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 06/21] iommu/amd: Introduce helper functions for managing GCR3 table Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-14 16:57   ` Jason Gunthorpe
  2023-07-12 14:15 ` [PATCH 08/21] iommu/amd: Do not set amd_iommu_pgtable in pass-through mode Vasant Hegde
                   ` (14 subsequent siblings)
  21 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Remove the unnecessary use of container_of() on struct iommu_domain by
passing struct protection_domain directly to the helper functions.

No functional changes intended.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       | 13 +++++++++----
 drivers/iommu/amd/amd_iommu_types.h | 10 ++++++++++
 drivers/iommu/amd/io_pgtable_v2.c   |  6 +++---
 drivers/iommu/amd/iommu.c           | 18 ++++--------------
 drivers/iommu/amd/iommu_v2.c        | 11 +++++------
 5 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 6bf55de9fbeb..a5a350ee36fe 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -58,15 +58,15 @@ int amd_iommu_register_ppr_notifier(struct notifier_block *nb);
 int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb);
 void amd_iommu_domain_direct_map(struct iommu_domain *dom);
 int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
-int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid, u64 address);
+int amd_iommu_flush_page(struct protection_domain *domain, u32 pasid, u64 address);
 void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 void amd_iommu_domain_update(struct protection_domain *domain);
 void amd_iommu_domain_flush_complete(struct protection_domain *domain);
 void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain);
-int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
-int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
+int amd_iommu_flush_tlb(struct protection_domain *domain, u32 pasid);
+int amd_iommu_domain_set_gcr3(struct protection_domain *domain, u32 pasid,
 			      unsigned long cr3);
-int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, u32 pasid);
+int amd_iommu_domain_clear_gcr3(struct protection_domain *domain, u32 pasid);
 
 #ifdef CONFIG_IRQ_REMAP
 int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
@@ -134,6 +134,11 @@ static inline void *alloc_pgtable_page(int nid, gfp_t gfp)
 	return page ? page_address(page) : NULL;
 }
 
+static inline struct protection_domain *to_pdomain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct protection_domain, domain);
+}
+
 bool translation_pre_enabled(struct amd_iommu *iommu);
 bool amd_iommu_is_attach_deferred(struct device *dev);
 int __init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index a8d264026f7f..1d3a35a460d0 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -927,6 +927,16 @@ static inline int get_hpet_devid(int id)
 	return -EINVAL;
 }
 
+static inline struct protection_domain *amd_iommu_get_pdomain(struct device *dev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
+
+	if (!dev_data || !dev_data->domain)
+		return NULL;
+
+	return dev_data->domain;
+}
+
 enum amd_iommu_intr_mode_type {
 	AMD_IOMMU_GUEST_IR_LEGACY,
 
diff --git a/drivers/iommu/amd/io_pgtable_v2.c b/drivers/iommu/amd/io_pgtable_v2.c
index e9ef2e0a62f6..a4fa830eab54 100644
--- a/drivers/iommu/amd/io_pgtable_v2.c
+++ b/drivers/iommu/amd/io_pgtable_v2.c
@@ -274,9 +274,9 @@ static int iommu_v2_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
 out:
 	if (updated) {
 		if (count > 1)
-			amd_iommu_flush_tlb(&pdom->domain, 0);
+			amd_iommu_flush_tlb(pdom, 0);
 		else
-			amd_iommu_flush_page(&pdom->domain, 0, o_iova);
+			amd_iommu_flush_page(pdom, 0, o_iova);
 	}
 
 	if (mapped)
@@ -384,7 +384,7 @@ static struct io_pgtable *v2_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
 	if (!pgtable->pgd)
 		return NULL;
 
-	ret = amd_iommu_domain_set_gcr3(&pdom->domain, 0, iommu_virt_to_phys(pgtable->pgd));
+	ret = amd_iommu_domain_set_gcr3(pdom, 0, iommu_virt_to_phys(pgtable->pgd));
 	if (ret)
 		goto err_free_pgd;
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a278bd9aa9e6..037a4e28a2f3 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -174,11 +174,6 @@ static struct amd_iommu *rlookup_amd_iommu(struct device *dev)
 	return __rlookup_amd_iommu(seg, PCI_SBDF_TO_DEVID(devid));
 }
 
-static struct protection_domain *to_pdomain(struct iommu_domain *dom)
-{
-	return container_of(dom, struct protection_domain, domain);
-}
-
 static struct iommu_dev_data *alloc_dev_data(struct amd_iommu *iommu, u16 devid)
 {
 	struct iommu_dev_data *dev_data;
@@ -2641,10 +2636,8 @@ static int __amd_iommu_flush_page(struct protection_domain *domain, u32 pasid,
 	return __flush_pasid(domain, pasid, address, false);
 }
 
-int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid,
-			 u64 address)
+int amd_iommu_flush_page(struct protection_domain *domain, u32 pasid, u64 address)
 {
-	struct protection_domain *domain = to_pdomain(dom);
 	unsigned long flags;
 	int ret;
 
@@ -2662,9 +2655,8 @@ static int __amd_iommu_flush_tlb(struct protection_domain *domain, u32 pasid)
 			     true);
 }
 
-int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid)
+int amd_iommu_flush_tlb(struct protection_domain *domain, u32 pasid)
 {
-	struct protection_domain *domain = to_pdomain(dom);
 	unsigned long flags;
 	int ret;
 
@@ -2741,10 +2733,9 @@ static int __clear_gcr3(struct protection_domain *domain, u32 pasid)
 	return __amd_iommu_flush_tlb(domain, pasid);
 }
 
-int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
+int amd_iommu_domain_set_gcr3(struct protection_domain *domain, u32 pasid,
 			      unsigned long cr3)
 {
-	struct protection_domain *domain = to_pdomain(dom);
 	unsigned long flags;
 	int ret;
 
@@ -2756,9 +2747,8 @@ int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
 }
 EXPORT_SYMBOL(amd_iommu_domain_set_gcr3);
 
-int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, u32 pasid)
+int amd_iommu_domain_clear_gcr3(struct protection_domain *domain, u32 pasid)
 {
-	struct protection_domain *domain = to_pdomain(dom);
 	unsigned long flags;
 	int ret;
 
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index e4507a211d6b..a9733fbc1761 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -282,7 +282,7 @@ static void unbind_pasid(struct pasid_state *pasid_state)
 	smp_wmb();
 
 	/* After this the device/pasid can't access the mm anymore */
-	amd_iommu_domain_clear_gcr3(&dev_state->pdom->domain, pasid_state->pasid);
+	amd_iommu_domain_clear_gcr3(dev_state->pdom, pasid_state->pasid);
 
 	/* Make sure no more pending faults are in the queue */
 	flush_workqueue(iommu_wq);
@@ -365,10 +365,10 @@ static void mn_invalidate_range(struct mmu_notifier *mn,
 	dev_state   = pasid_state->device_state;
 
 	if ((start ^ (end - 1)) < PAGE_SIZE)
-		amd_iommu_flush_page(&dev_state->pdom->domain, pasid_state->pasid,
+		amd_iommu_flush_page(dev_state->pdom, pasid_state->pasid,
 				     start);
 	else
-		amd_iommu_flush_tlb(&dev_state->pdom->domain, pasid_state->pasid);
+		amd_iommu_flush_tlb(dev_state->pdom, pasid_state->pasid);
 }
 
 static void mn_release(struct mmu_notifier *mn, struct mm_struct *mm)
@@ -647,7 +647,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, u32 pasid,
 	if (ret)
 		goto out_unregister;
 
-	ret = amd_iommu_domain_set_gcr3(&dev_state->pdom->domain, pasid,
+	ret = amd_iommu_domain_set_gcr3(dev_state->pdom, pasid,
 					__pa(pasid_state->mm->pgd));
 	if (ret)
 		goto out_clear_state;
@@ -781,8 +781,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 		goto out_free_states;
 
 	/* Retrieve new protection_domain that has just been allocated */
-	dev_state->pdom = container_of(domain,
-				       struct protection_domain, domain);
+	dev_state->pdom = to_pdomain(domain);
 
 	/* See iommu_is_default_domain() */
 	domain->type = IOMMU_DOMAIN_IDENTITY;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 08/21] iommu/amd: Do not set amd_iommu_pgtable in pass-through mode
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (6 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 07/21] iommu/amd: Use struct protection_domain in helper functions Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 09/21] iommu/amd: Miscellaneous clean up when free domain Vasant Hegde
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

Since the AMD IOMMU page table is not used in passthrough mode, switching
to the v1 page table is not required.

Therefore, remove the redundant amd_iommu_pgtable update and the
misleading warning message.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/init.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 852e40b13d20..2b01dfde6cab 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2134,9 +2134,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 		    !iommu_feature(iommu, FEATURE_GT)) {
 			pr_warn("Cannot enable v2 page table for DMA-API. Fallback to v1.\n");
 			amd_iommu_pgtable = AMD_IOMMU_V1;
-		} else if (iommu_default_passthrough()) {
-			pr_warn("V2 page table doesn't support passthrough mode. Fallback to v1.\n");
-			amd_iommu_pgtable = AMD_IOMMU_V1;
 		}
 	}
 
-- 
2.31.1



* [PATCH 09/21] iommu/amd: Miscellaneous clean up when free domain
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (7 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 08/21] iommu/amd: Do not set amd_iommu_pgtable in pass-through mode Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 10/21] iommu/amd: Modify logic for checking GT and PPR features Vasant Hegde
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

* Use the protection_domain_free() helper function to free the domain.
  The function has been modified to also free the memory used for the v1
  and v2 page tables, and to clear the gcr3 table in the v2 page table
  free path.

* Refactor code into cleanup_domain() for reusability. Change BUG_ON to
  WARN_ON in the cleanup path.

* Read the protection domain dev_cnt only while the domain lock is held.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/io_pgtable_v2.c |  8 +++---
 drivers/iommu/amd/iommu.c         | 43 +++++++++++++++----------------
 2 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable_v2.c b/drivers/iommu/amd/io_pgtable_v2.c
index a4fa830eab54..c17cda83bca5 100644
--- a/drivers/iommu/amd/io_pgtable_v2.c
+++ b/drivers/iommu/amd/io_pgtable_v2.c
@@ -363,10 +363,10 @@ static void v2_free_pgtable(struct io_pgtable *iop)
 	if (!(pdom->flags & PD_IOMMUV2_MASK))
 		return;
 
-	/*
-	 * Make changes visible to IOMMUs. No need to clear gcr3 entry
-	 * as gcr3 table is already freed.
-	 */
+	/* Clear gcr3 entry */
+	amd_iommu_domain_clear_gcr3(pdom, 0);
+
+	/* Make changes visible to IOMMUs */
 	amd_iommu_domain_update(pdom);
 
 	/* Free page table */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 037a4e28a2f3..c90d97762c33 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2037,12 +2037,13 @@ void amd_iommu_domain_update(struct protection_domain *domain)
  *
  *****************************************************************************/
 
+/* Must be called with domain->lock held */
 static void cleanup_domain(struct protection_domain *domain)
 {
 	struct iommu_dev_data *entry;
-	unsigned long flags;
 
-	spin_lock_irqsave(&domain->lock, flags);
+	if (!domain->dev_cnt)
+		return;
 
 	while (!list_empty(&domain->dev_list)) {
 		entry = list_first_entry(&domain->dev_list,
@@ -2050,8 +2051,7 @@ static void cleanup_domain(struct protection_domain *domain)
 		BUG_ON(!entry->domain);
 		do_detach(entry);
 	}
-
-	spin_unlock_irqrestore(&domain->lock, flags);
+	WARN_ON(domain->dev_cnt != 0);
 }
 
 static void protection_domain_free(struct protection_domain *domain)
@@ -2062,6 +2062,12 @@ static void protection_domain_free(struct protection_domain *domain)
 	if (domain->iop.pgtbl_cfg.tlb)
 		free_io_pgtable_ops(&domain->iop.iop.ops);
 
+	if (domain->flags & PD_IOMMUV2_MASK)
+		free_gcr3_table(domain);
+
+	if (domain->iop.root)
+		free_page((unsigned long)domain->iop.root);
+
 	if (domain->id)
 		domain_id_free(domain->id);
 
@@ -2076,10 +2082,8 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 
 	if (mode != PAGE_MODE_NONE) {
 		pt_root = (void *)get_zeroed_page(GFP_KERNEL);
-		if (!pt_root) {
-			domain_id_free(domain->id);
+		if (!pt_root)
 			return -ENOMEM;
-		}
 	}
 
 	amd_iommu_domain_set_pgtable(domain, pt_root, mode);
@@ -2093,10 +2097,8 @@ static int protection_domain_init_v2(struct protection_domain *domain)
 
 	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
 
-	if (setup_gcr3_table(domain, 1)) {
-		domain_id_free(domain->id);
+	if (setup_gcr3_table(domain, 1))
 		return -ENOMEM;
-	}
 
 	return 0;
 }
@@ -2155,14 +2157,12 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 		goto out_err;
 
 	pgtbl_ops = alloc_io_pgtable_ops(pgtable, &domain->iop.pgtbl_cfg, domain);
-	if (!pgtbl_ops) {
-		domain_id_free(domain->id);
+	if (!pgtbl_ops)
 		goto out_err;
-	}
 
 	return domain;
 out_err:
-	kfree(domain);
+	protection_domain_free(domain);
 	return NULL;
 }
 
@@ -2200,19 +2200,18 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 static void amd_iommu_domain_free(struct iommu_domain *dom)
 {
 	struct protection_domain *domain;
+	unsigned long flags;
 
-	domain = to_pdomain(dom);
+	if (!dom)
+		return;
 
-	if (domain->dev_cnt > 0)
-		cleanup_domain(domain);
+	domain = to_pdomain(dom);
 
-	BUG_ON(domain->dev_cnt != 0);
+	spin_lock_irqsave(&domain->lock, flags);
 
-	if (!dom)
-		return;
+	cleanup_domain(domain);
 
-	if (domain->flags & PD_IOMMUV2_MASK)
-		free_gcr3_table(domain);
+	spin_unlock_irqrestore(&domain->lock, flags);
 
 	protection_domain_free(domain);
 }
-- 
2.31.1



* [PATCH 10/21] iommu/amd: Modify logic for checking GT and PPR features
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (8 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 09/21] iommu/amd: Miscellaneous clean up when free domain Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags Vasant Hegde
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

In order to support the v2 page table, the IOMMU driver needs to check
whether the hardware supports the Guest Translation (GT) and Peripheral
Page Request (PPR) features. Currently, the IOMMU driver uses a global
variable (amd_iommu_v2_present) and a per-IOMMU variable (struct
amd_iommu.is_iommu_v2) to track these features. These variables are
redundant since we can simply check the global EFR mask.

Therefore, replace them with a helper function with an appropriate name.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       | 11 +++++++++++
 drivers/iommu/amd/amd_iommu_types.h |  7 ++++---
 drivers/iommu/amd/init.c            | 14 +-------------
 drivers/iommu/amd/iommu.c           |  2 +-
 4 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index a5a350ee36fe..0605f02fa711 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -95,6 +95,17 @@ static inline bool iommu_feature(struct amd_iommu *iommu, u64 mask)
 	return !!(iommu->features & mask);
 }
 
+static inline bool check_feature_on_all_iommus(u64 mask)
+{
+	return !!(amd_iommu_efr & mask);
+}
+
+static inline bool amd_iommu_gt_ppr_supported(void)
+{
+	return (check_feature_on_all_iommus(FEATURE_GT) &&
+		check_feature_on_all_iommus(FEATURE_PPR));
+}
+
 static inline u64 iommu_virt_to_phys(void *vaddr)
 {
 	return (u64)__sme_set(virt_to_phys(vaddr));
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 1d3a35a460d0..346c32343417 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -484,6 +484,10 @@ extern bool amdr_ivrs_remap_support;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+/* Global EFR and EFR2 registers */
+extern u64 amd_iommu_efr;
+extern u64 amd_iommu_efr2;
+
 #define PCI_SBDF_TO_SEGID(sbdf)		(((sbdf) >> 16) & 0xffff)
 #define PCI_SBDF_TO_DEVID(sbdf)		((sbdf) & 0xffff)
 #define PCI_SEG_DEVID_TO_SBDF(seg, devid)	((((u32)(seg) & 0xffff) << 16) | \
@@ -679,9 +683,6 @@ struct amd_iommu {
 	/* Extended features 2 */
 	u64 features2;
 
-	/* IOMMUv2 */
-	bool is_iommu_v2;
-
 	/* PCI device id of the IOMMU device */
 	u16 devid;
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 2b01dfde6cab..1f56478ae74e 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -187,7 +187,6 @@ bool amd_iommu_iotlb_sup __read_mostly = true;
 
 u32 amd_iommu_max_pasid __read_mostly = ~0;
 
-bool amd_iommu_v2_present __read_mostly;
 static bool amd_iommu_pc_present __read_mostly;
 bool amdr_ivrs_remap_support __read_mostly;
 
@@ -302,11 +301,6 @@ static void get_global_efr(void)
 	pr_info("Using global IVHD EFR:%#llx, EFR2:%#llx\n", amd_iommu_efr, amd_iommu_efr2);
 }
 
-static bool check_feature_on_all_iommus(u64 mask)
-{
-	return !!(amd_iommu_efr & mask);
-}
-
 static inline int check_feature_gpt_level(void)
 {
 	return ((amd_iommu_efr >> FEATURE_GATS_SHIFT) & FEATURE_GATS_MASK);
@@ -2112,12 +2106,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 			amd_iommu_max_glx_val = min(amd_iommu_max_glx_val, glxval);
 	}
 
-	if (iommu_feature(iommu, FEATURE_GT) &&
-	    iommu_feature(iommu, FEATURE_PPR)) {
-		iommu->is_iommu_v2   = true;
-		amd_iommu_v2_present = true;
-	}
-
 	if (iommu_feature(iommu, FEATURE_PPR) && alloc_ppr_log(iommu))
 		return -ENOMEM;
 
@@ -3693,7 +3681,7 @@ bool amd_iommu_v2_supported(void)
 	 * (i.e. EFR[SNPSup]=1), IOMMUv2 page table cannot be used without
 	 * setting up IOMMUv1 page table.
 	 */
-	return amd_iommu_v2_present && !amd_iommu_snp_en;
+	return amd_iommu_gt_ppr_supported() && !amd_iommu_snp_en;
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c90d97762c33..982d5450595c 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -392,7 +392,7 @@ static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
 	 */
 	if ((iommu_default_passthrough() || !amd_iommu_force_isolation) &&
 	    dev_is_pci(dev) && pci_iommuv2_capable(to_pci_dev(dev))) {
-		dev_data->iommu_v2 = iommu->is_iommu_v2;
+		dev_data->iommu_v2 = amd_iommu_gt_ppr_supported();
 	}
 
 	dev_iommu_priv_set(dev, dev_data);
-- 
2.31.1



* [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (9 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 10/21] iommu/amd: Modify logic for checking GT and PPR features Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-14 17:25   ` Jason Gunthorpe
  2023-07-12 14:15 ` [PATCH 12/21] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
                   ` (10 subsequent siblings)
  21 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

With both IOMMU v1 and v2 page tables available per domain, and with the
v2 table usable in various modes, keeping track of how a particular
domain is configured is becoming cumbersome.

Enhance protection_domain.flags to track how a domain is configured.
Also remove unused flags, and rename the remaining ones accordingly.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h | 14 +++++---------
 drivers/iommu/amd/io_pgtable_v2.c   |  2 +-
 drivers/iommu/amd/iommu.c           | 21 ++++++++++++---------
 3 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 346c32343417..907af6e41ae9 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -443,13 +443,11 @@
 #define MAX_DOMAIN_ID 65536
 
 /* Protection domain flags */
-#define PD_DMA_OPS_MASK		BIT(0) /* domain used for dma_ops */
-#define PD_DEFAULT_MASK		BIT(1) /* domain is a default dma_ops
-					      domain for an IOMMU */
-#define PD_PASSTHROUGH_MASK	BIT(2) /* domain has no page
-					      translation */
-#define PD_IOMMUV2_MASK		BIT(3) /* domain has gcr3 table */
-#define PD_GIOV_MASK		BIT(4) /* domain enable GIOV support */
+#define PD_FLAG_PT		BIT(0) /* Passthrough mode */
+#define PD_FLAG_V1DMA		BIT(1) /* DMA-API mode w/ v1 page table */
+#define PD_FLAG_V2DMA		BIT(2) /* DMA-API mode w/ v2 page table */
+#define PD_FLAG_GCR3		BIT(3) /* domain has gcr3 table */
+#define PD_FLAG_GIOV		BIT(4) /* domain enable GIOV support */
 
 /* Timeout stuff */
 #define LOOP_TIMEOUT		100000
@@ -891,8 +889,6 @@ extern unsigned long *amd_iommu_pd_alloc_bitmap;
 /* Smallest max PASID supported by any IOMMU in the system */
 extern u32 amd_iommu_max_pasid;
 
-extern bool amd_iommu_v2_present;
-
 extern bool amd_iommu_force_isolation;
 
 /* Max levels of glxval supported */
diff --git a/drivers/iommu/amd/io_pgtable_v2.c b/drivers/iommu/amd/io_pgtable_v2.c
index c17cda83bca5..0b7bab01f006 100644
--- a/drivers/iommu/amd/io_pgtable_v2.c
+++ b/drivers/iommu/amd/io_pgtable_v2.c
@@ -360,7 +360,7 @@ static void v2_free_pgtable(struct io_pgtable *iop)
 	struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
 
 	pdom = container_of(pgtable, struct protection_domain, iop);
-	if (!(pdom->flags & PD_IOMMUV2_MASK))
+	if (!(pdom->flags & PD_FLAG_GCR3))
 		return;
 
 	/* Clear gcr3 entry */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 982d5450595c..a138727b3176 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1596,7 +1596,7 @@ static int setup_gcr3_table(struct protection_domain *domain, int pasids)
 		return -ENOMEM;
 
 	domain->glx      = levels;
-	domain->flags   |= PD_IOMMUV2_MASK;
+	domain->flags   |= PD_FLAG_GCR3;
 
 	amd_iommu_domain_update(domain);
 
@@ -1636,7 +1636,7 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 			pte_root |= 1ULL << DEV_ENTRY_PPR;
 	}
 
-	if (domain->flags & PD_IOMMUV2_MASK) {
+	if (domain->flags & PD_FLAG_GCR3) {
 		u64 gcr3 = iommu_virt_to_phys(domain->gcr3_tbl);
 		u64 glx  = domain->glx;
 		u64 tmp;
@@ -1666,7 +1666,7 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
 				((u64)GUEST_PGTABLE_5_LEVEL << DTE_GPT_LEVEL_SHIFT);
 		}
 
-		if (domain->flags & PD_GIOV_MASK)
+		if (domain->flags & PD_FLAG_GIOV)
 			pte_root |= DTE_FLAG_GIOV;
 	}
 
@@ -1830,7 +1830,7 @@ static int attach_device(struct device *dev,
 		goto skip_ats_check;
 
 	pdev = to_pci_dev(dev);
-	if (domain->flags & PD_IOMMUV2_MASK) {
+	if (domain->flags & PD_FLAG_GCR3) {
 		struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
 
 		ret = -EINVAL;
@@ -1911,7 +1911,7 @@ static void detach_device(struct device *dev)
 	if (!dev_is_pci(dev))
 		goto out;
 
-	if (domain->flags & PD_IOMMUV2_MASK && dev_data->iommu_v2)
+	if (domain->flags & PD_FLAG_GCR3 && dev_data->iommu_v2)
 		pdev_iommuv2_disable(to_pci_dev(dev));
 	else if (dev_data->ats.enabled)
 		pci_disable_ats(to_pci_dev(dev));
@@ -2062,7 +2062,7 @@ static void protection_domain_free(struct protection_domain *domain)
 	if (domain->iop.pgtbl_cfg.tlb)
 		free_io_pgtable_ops(&domain->iop.iop.ops);
 
-	if (domain->flags & PD_IOMMUV2_MASK)
+	if (domain->flags & PD_FLAG_GCR3)
 		free_gcr3_table(domain);
 
 	if (domain->iop.root)
@@ -2086,6 +2086,7 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 			return -ENOMEM;
 	}
 
+	domain->flags |= PD_FLAG_V1DMA;
 	amd_iommu_domain_set_pgtable(domain, pt_root, mode);
 
 	return 0;
@@ -2093,13 +2094,14 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 
 static int protection_domain_init_v2(struct protection_domain *domain)
 {
-	domain->flags |= PD_GIOV_MASK;
+	domain->flags |= PD_FLAG_GIOV;
 
 	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
 
 	if (setup_gcr3_table(domain, 1))
 		return -ENOMEM;
 
+	domain->flags |= PD_FLAG_V2DMA;
 	return 0;
 }
 
@@ -2125,6 +2127,7 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 	switch (type) {
 	/* No need to allocate io pgtable ops in passthrough mode */
 	case IOMMU_DOMAIN_IDENTITY:
+		domain->flags |= PD_FLAG_PT;
 		return domain;
 	case IOMMU_DOMAIN_DMA:
 		fallthrough;
@@ -2555,7 +2558,7 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
 	 * devices attached when it is switched into IOMMUv2 mode.
 	 */
 	ret = -EBUSY;
-	if (pdom->dev_cnt > 0 || pdom->flags & PD_IOMMUV2_MASK)
+	if (pdom->dev_cnt > 0 || pdom->flags & PD_FLAG_GCR3)
 		goto out;
 
 	if (!pdom->gcr3_tbl)
@@ -2574,7 +2577,7 @@ static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 	struct iommu_cmd cmd;
 	int i, ret;
 
-	if (!(domain->flags & PD_IOMMUV2_MASK))
+	if (!(domain->flags & PD_FLAG_GCR3))
 		return -EINVAL;
 
 	build_inv_iommu_pasid(&cmd, domain->id, pasid, address, size);
-- 
2.31.1



* [PATCH 12/21] iommu/amd: Use protection_domain.flags to check page table mode
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (10 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 13/21] iommu/amd: Consolidate logic for setting up v2 API mode Vasant Hegde
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

The page table mode (v1, v2 or pt) is a per-domain property. Recently we
enhanced protection_domain.flags to track the per-domain page table mode.
Use that variable to check the page table mode instead of the global
'amd_iommu_pgtable' in the {map/unmap}_pages path.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a138727b3176..685efe942ce0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2274,7 +2274,7 @@ static int amd_iommu_map_pages(struct iommu_domain *dom, unsigned long iova,
 	int prot = 0;
 	int ret = -EINVAL;
 
-	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+	if ((domain->flags == PD_FLAG_V1DMA) &&
 	    (domain->iop.mode == PAGE_MODE_NONE))
 		return -EINVAL;
 
@@ -2320,7 +2320,7 @@ static size_t amd_iommu_unmap_pages(struct iommu_domain *dom, unsigned long iova
 	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
 	size_t r;
 
-	if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+	if ((domain->flags == PD_FLAG_V1DMA) &&
 	    (domain->iop.mode == PAGE_MODE_NONE))
 		return 0;
 
-- 
2.31.1



* [PATCH 13/21] iommu/amd: Consolidate logic for setting up v2 API mode
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (11 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 12/21] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 14/21] iommu/amd: Refactor domain allocation code for v2DMA mode Vasant Hegde
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Use the AMD IOMMU domain flags to help simplify domain setup when
enabling v2 API support for a domain.

This includes:
  * Introducing new functions to init and uninit a v2 domain. These
    functions take care of the necessary changes before enabling the v2
    domain. The iommu_v2 module simply calls amd_iommu_v2_domain_init()
    to enable the new domain.

  * Consolidating the logic to free all IO page tables, which was
    previously handled by amd_iommu_domain_direct_map().

  * Enhancing free_gcr3_table() to clean up v2 page table related
    variables.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       |   6 +-
 drivers/iommu/amd/amd_iommu_types.h |   1 +
 drivers/iommu/amd/iommu.c           | 127 +++++++++++++++++++---------
 drivers/iommu/amd/iommu_v2.c        |   5 +-
 4 files changed, 95 insertions(+), 44 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 0605f02fa711..7773315e5adc 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -56,8 +56,10 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
 
 int amd_iommu_register_ppr_notifier(struct notifier_block *nb);
 int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb);
-void amd_iommu_domain_direct_map(struct iommu_domain *dom);
-int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
+int amd_iommu_v2_domain_init(struct protection_domain *pdom,
+			     struct pci_dev *pdev, int pasids, u32 pd_flags);
+void amd_iommu_v2_domain_uninit(struct protection_domain *pdom,
+				struct pci_dev *pdev, u32 pd_flags);
 int amd_iommu_flush_page(struct protection_domain *domain, u32 pasid, u64 address);
 void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 void amd_iommu_domain_update(struct protection_domain *domain);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 907af6e41ae9..f74750b7f963 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -448,6 +448,7 @@
 #define PD_FLAG_V2DMA		BIT(2) /* DMA-API mode w/ v2 page table */
 #define PD_FLAG_GCR3		BIT(3) /* domain has gcr3 table */
 #define PD_FLAG_GIOV		BIT(4) /* domain enable GIOV support */
+#define PD_FLAG_V2API		BIT(5) /* domain is used for v2API */
 
 /* Timeout stuff */
 #define LOOP_TIMEOUT		100000
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 685efe942ce0..22d5ac82efc1 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1566,7 +1566,13 @@ static void free_gcr3_table(struct protection_domain *domain)
 	else
 		BUG_ON(domain->glx != 0);
 
+	domain->glx = 0;
+	domain->flags &= ~PD_FLAG_GCR3;
+
+	amd_iommu_domain_update(domain);
+
 	free_page((unsigned long)domain->gcr3_tbl);
+	domain->gcr3_tbl = NULL;
 }
 
 static int get_gcr3_levels(int pasids)
@@ -2074,6 +2080,87 @@ static void protection_domain_free(struct protection_domain *domain)
 	kfree(domain);
 }
 
+static int v2api_domain_init(struct protection_domain *pdom)
+{
+	/* V2API is already enabled for this domain */
+	if (!!(pdom->flags & PD_FLAG_V2API))
+		return -EBUSY;
+
+	/*
+	 * The V2API mode cannot be initialized if domain is not in
+	 * passthrough mode
+	 */
+	if (pdom->flags != PD_FLAG_PT)
+		return -EOPNOTSUPP;
+
+	/*
+	 * Free any existing page tables for this domain since
+	 * the V2API mode require v1 table to be in identity mode.
+	 * The iommu_v2 driver will handle setting up v2 table
+	 * when binding.
+	 */
+	if (pdom->iop.pgtbl_cfg.tlb)
+		free_io_pgtable_ops(&pdom->iop.iop.ops);
+
+	return 0;
+}
+
+int amd_iommu_v2_domain_init(struct protection_domain *pdom,
+			     struct pci_dev *pdev, int pasids, u32 pd_flags)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&pdom->lock, flags);
+
+	switch (pd_flags) {
+	case PD_FLAG_V2API:
+		ret = v2api_domain_init(pdom);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	if (ret)
+		goto out;
+
+	/* Allocate guest page table */
+	if (pdom->gcr3_tbl == NULL)
+		ret = setup_gcr3_table(pdom, pasids);
+
+out:
+	/* Retain default page table mode flags. */
+	if (!ret)
+		pdom->flags |= pd_flags;
+
+	spin_unlock_irqrestore(&pdom->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(amd_iommu_v2_domain_init);
+
+void amd_iommu_v2_domain_uninit(struct protection_domain *pdom,
+				struct pci_dev *pdev, u32 pd_flags)
+{
+	unsigned long flags;
+
+	if (!pdom || !(pdom->flags & PD_FLAG_V2API))
+		return;
+
+	spin_lock_irqsave(&pdom->lock, flags);
+
+	/*
+	 * If device is booted with passthrough mode then clear guest
+	 * page table.
+	 */
+	if (!!(pdom->flags & PD_FLAG_PT))
+		free_gcr3_table(pdom);
+
+	pdom->flags &= ~(pd_flags);
+	spin_unlock_irqrestore(&pdom->lock, flags);
+}
+EXPORT_SYMBOL(amd_iommu_v2_domain_uninit);
+
 static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 {
 	u64 *pt_root = NULL;
@@ -2530,46 +2617,6 @@ int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(amd_iommu_unregister_ppr_notifier);
 
-void amd_iommu_domain_direct_map(struct iommu_domain *dom)
-{
-	struct protection_domain *domain = to_pdomain(dom);
-	unsigned long flags;
-
-	spin_lock_irqsave(&domain->lock, flags);
-
-	if (domain->iop.pgtbl_cfg.tlb)
-		free_io_pgtable_ops(&domain->iop.iop.ops);
-
-	spin_unlock_irqrestore(&domain->lock, flags);
-}
-EXPORT_SYMBOL(amd_iommu_domain_direct_map);
-
-int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
-{
-	struct protection_domain *pdom = to_pdomain(dom);
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&pdom->lock, flags);
-
-	/*
-	 * Save us all sanity checks whether devices already in the
-	 * domain support IOMMUv2. Just force that the domain has no
-	 * devices attached when it is switched into IOMMUv2 mode.
-	 */
-	ret = -EBUSY;
-	if (pdom->dev_cnt > 0 || pdom->flags & PD_FLAG_GCR3)
-		goto out;
-
-	if (!pdom->gcr3_tbl)
-		ret = setup_gcr3_table(pdom, pasids);
-
-out:
-	spin_unlock_irqrestore(&pdom->lock, flags);
-	return ret;
-}
-EXPORT_SYMBOL(amd_iommu_domain_enable_v2);
-
 static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 			 u64 address, bool size)
 {
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index a9733fbc1761..dce5e006ceb7 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -785,9 +785,10 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 
 	/* See iommu_is_default_domain() */
 	domain->type = IOMMU_DOMAIN_IDENTITY;
-	amd_iommu_domain_direct_map(&dev_state->pdom->domain);
+	dev_state->pdom->flags = PD_FLAG_PT;
 
-	ret = amd_iommu_domain_enable_v2(domain, pasids);
+	ret = amd_iommu_v2_domain_init(dev_state->pdom, pdev,
+				       pasids, PD_FLAG_V2API);
 	if (ret)
 		goto out_free_domain;
 
-- 
2.31.1



* [PATCH 14/21] iommu/amd: Refactor domain allocation code for v2DMA mode
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (12 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 13/21] iommu/amd: Consolidate logic for setting up v2 API mode Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 15/21] iommu/amd: Setup v2 domain with max pasids Vasant Hegde
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

amd_iommu_v2_domain_init() configures a protection domain based on the
specified v2 mode. It contains sanity checks to determine whether the
mode can be enabled safely.

Therefore, refactor the v2DMA domain initialization code into
amd_iommu_v2_domain_init() and use this function instead.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 22d5ac82efc1..3c9fd2adad33 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2109,7 +2109,7 @@ int amd_iommu_v2_domain_init(struct protection_domain *pdom,
 			     struct pci_dev *pdev, int pasids, u32 pd_flags)
 {
 	unsigned long flags;
-	int ret;
+	int ret = 0;
 
 	spin_lock_irqsave(&pdom->lock, flags);
 
@@ -2117,6 +2117,11 @@ int amd_iommu_v2_domain_init(struct protection_domain *pdom,
 	case PD_FLAG_V2API:
 		ret = v2api_domain_init(pdom);
 		break;
+	/* This gets called from domain allocation path */
+	case PD_FLAG_V2DMA:
+		pdom->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
+		pdom->flags |= PD_FLAG_GIOV;
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -2179,19 +2184,6 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
 	return 0;
 }
 
-static int protection_domain_init_v2(struct protection_domain *domain)
-{
-	domain->flags |= PD_FLAG_GIOV;
-
-	domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
-
-	if (setup_gcr3_table(domain, 1))
-		return -ENOMEM;
-
-	domain->flags |= PD_FLAG_V2DMA;
-	return 0;
-}
-
 static struct protection_domain *protection_domain_alloc(unsigned int type)
 {
 	struct io_pgtable_ops *pgtbl_ops;
@@ -2237,7 +2229,7 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 		ret = protection_domain_init_v1(domain, DEFAULT_PGTABLE_LEVEL);
 		break;
 	case AMD_IOMMU_V2:
-		ret = protection_domain_init_v2(domain);
+		ret = amd_iommu_v2_domain_init(domain, NULL, 1, PD_FLAG_V2DMA);
 		break;
 	default:
 		ret = -EINVAL;
-- 
2.31.1



* [PATCH 15/21] iommu/amd: Setup v2 domain with max pasids
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (13 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 14/21] iommu/amd: Refactor domain allocation code for v2DMA mode Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 16/21] iommu/amd: Introduce iommu_dev_data.flags to track device capabilities Vasant Hegde
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

An IOMMU domain can contain multiple devices with different PASID limits.
However, the maximum number of PASIDs of a domain is bounded by the limit
of the IOMMU hardware. This limit is used to determine the number of
levels of the GCR3 table.

Therefore, to simplify management of the GCR3 table, allocate the table
for the hardware maximum PASID limit by default (indicated by passing -1)
unless specified otherwise.
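
As an aside, the relationship between a domain's PASID limit and the
GCR3 table depth can be sketched with a small standalone helper. This is
illustrative only, not the driver's code; it assumes each GCR3 level
resolves 9 bits of the PASID:

```c
#include <assert.h>

/*
 * Hypothetical sketch: number of GCR3 table levels needed for a given
 * PASID count. Each level resolves 9 bits of the PASID, so one level
 * covers 512 PASIDs, two levels cover 512^2 = 262144, and three levels
 * are enough for the full architectural 20-bit PASID space.
 */
static int gcr3_levels_for_pasids(unsigned long pasids)
{
	int levels = 1;

	while (levels < 3 && pasids > (1UL << (9 * levels)))
		levels++;

	return levels;
}
```

Allocating for the hardware maximum up front means the deepest table is
built once, instead of being re-sized as devices with larger PASID limits
attach.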

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3c9fd2adad33..ce405cece3b5 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2229,7 +2229,11 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
 		ret = protection_domain_init_v1(domain, DEFAULT_PGTABLE_LEVEL);
 		break;
 	case AMD_IOMMU_V2:
-		ret = amd_iommu_v2_domain_init(domain, NULL, 1, PD_FLAG_V2DMA);
+		/*
+		 * Set up to support maximum number of PASIDs
+		 * supported by AMD IOMMU HW.
+		 */
+		ret = amd_iommu_v2_domain_init(domain, NULL, -1, PD_FLAG_V2DMA);
 		break;
 	default:
 		ret = -EINVAL;
-- 
2.31.1



* [PATCH 16/21] iommu/amd: Introduce iommu_dev_data.flags to track device capabilities
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (14 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 15/21] iommu/amd: Setup v2 domain with max pasids Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 17/21] iommu/amd: Rename ats related variables Vasant Hegde
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

Currently we use struct iommu_dev_data.iommu_v2 to keep track of the device
ATS, PRI, and PASID capabilities. But these capabilities can be enabled
independently (except that PRI requires ATS support). Hence, replace
the iommu_v2 variable with a flags variable, which keeps track of the
device capabilities.

Finally, the PRI/PASID capabilities are shared between a PF and any
associated VFs (see commit 9bf49e36d718 ("PCI/ATS: Handle sharing of PF PRI
Capability with all VFs")). Hence use pci_pri_supported() and
pci_pasid_features() instead of pci_find_ext_capability() to check device
PRI/PASID support.
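
The flag-based tracking can be illustrated in isolation. In this sketch
the flag values and helper names are hypothetical stand-ins for the
driver's AMD_IOMMU_DEVICE_FLAG_* constants:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the AMD_IOMMU_DEVICE_FLAG_* bits */
#define DEV_FLAG_ATS_SUP   (1U << 0)
#define DEV_FLAG_PRI_SUP   (1U << 1)
#define DEV_FLAG_PASID_SUP (1U << 2)

/* Accumulate a capability mask from independently probed features */
static uint32_t build_caps(bool ats, bool pri, bool pasid)
{
	uint32_t flags = 0;

	if (ats)
		flags |= DEV_FLAG_ATS_SUP;
	if (pri)
		flags |= DEV_FLAG_PRI_SUP;
	if (pasid)
		flags |= DEV_FLAG_PASID_SUP;

	return flags;
}

/* Mirrors the shape of pdev_pasid_supported() in the patch */
static bool caps_pasid_supported(uint32_t flags)
{
	return flags & DEV_FLAG_PASID_SUP;
}
```

Unlike a single boolean, the mask preserves which capabilities are
present even when only a subset can be enabled for a given device.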

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/iommu.c           | 43 +++++++++++++++--------------
 2 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index f74750b7f963..57586a8c451e 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -814,7 +814,7 @@ struct iommu_dev_data {
 	struct protection_domain *domain; /* Domain the device is bound to */
 	struct device *dev;
 	u16 devid;			  /* PCI Device ID */
-	bool iommu_v2;			  /* Device can make use of IOMMUv2 */
+	u32 flags;			  /* Holds AMD_IOMMU_DEVICE_FLAG_<*> */
 	struct {
 		bool enabled;
 		int qdep;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ce405cece3b5..ff38561435e2 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -314,24 +314,25 @@ static struct iommu_group *acpihid_device_group(struct device *dev)
 	return entry->group;
 }
 
-static bool pci_iommuv2_capable(struct pci_dev *pdev)
+static inline bool pdev_pasid_supported(struct iommu_dev_data *dev_data)
 {
-	static const int caps[] = {
-		PCI_EXT_CAP_ID_PRI,
-		PCI_EXT_CAP_ID_PASID,
-	};
-	int i, pos;
+	return (dev_data->flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP);
+}
 
-	if (!pci_ats_supported(pdev))
-		return false;
+static u32 pdev_get_caps(struct pci_dev *pdev)
+{
+	u32 flags = 0;
 
-	for (i = 0; i < 2; ++i) {
-		pos = pci_find_ext_capability(pdev, caps[i]);
-		if (pos == 0)
-			return false;
-	}
+	if (pci_ats_supported(pdev))
+		flags |= AMD_IOMMU_DEVICE_FLAG_ATS_SUP;
 
-	return true;
+	if (pci_pri_supported(pdev))
+		flags |= AMD_IOMMU_DEVICE_FLAG_PRI_SUP;
+
+	if (pci_pasid_features(pdev) >= 0)
+		flags |= AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
+
+	return flags;
 }
 
 /*
@@ -391,8 +392,8 @@ static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
 	 * it'll be forced to go into translation mode.
 	 */
 	if ((iommu_default_passthrough() || !amd_iommu_force_isolation) &&
-	    dev_is_pci(dev) && pci_iommuv2_capable(to_pci_dev(dev))) {
-		dev_data->iommu_v2 = amd_iommu_gt_ppr_supported();
+	    dev_is_pci(dev) && amd_iommu_gt_ppr_supported()) {
+		dev_data->flags = pdev_get_caps(to_pci_dev(dev));
 	}
 
 	dev_iommu_priv_set(dev, dev_data);
@@ -1733,7 +1734,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 
 	/* Update device table */
 	set_dte_entry(iommu, dev_data->devid, domain,
-		      ats, dev_data->iommu_v2);
+		      ats, dev_data->flags);
 	clone_aliases(iommu, dev_data->dev);
 
 	device_flush_dte(dev_data);
@@ -1851,7 +1852,7 @@ static int attach_device(struct device *dev,
 			goto out;
 		}
 
-		if (dev_data->iommu_v2) {
+		if (pdev_pasid_supported(dev_data)) {
 			if (pdev_pri_ats_enable(pdev) != 0)
 				goto out;
 
@@ -1917,7 +1918,7 @@ static void detach_device(struct device *dev)
 	if (!dev_is_pci(dev))
 		goto out;
 
-	if (domain->flags & PD_FLAG_GCR3 && dev_data->iommu_v2)
+	if (domain->flags & PD_FLAG_GCR3 && pdev_pasid_supported(dev_data))
 		pdev_iommuv2_disable(to_pci_dev(dev));
 	else if (dev_data->ats.enabled)
 		pci_disable_ats(to_pci_dev(dev));
@@ -2012,7 +2013,7 @@ static void update_device_table(struct protection_domain *domain)
 		if (!iommu)
 			continue;
 		set_dte_entry(iommu, dev_data->devid, domain,
-			      dev_data->ats.enabled, dev_data->iommu_v2);
+			      dev_data->ats.enabled, dev_data->flags);
 		clone_aliases(iommu, dev_data->dev);
 	}
 }
@@ -2551,7 +2552,7 @@ static int amd_iommu_def_domain_type(struct device *dev)
 	 *    and require remapping.
 	 *  - SNP is enabled, because it prohibits DTE[Mode]=0.
 	 */
-	if (dev_data->iommu_v2 &&
+	if (pdev_pasid_supported(dev_data) &&
 	    !cc_platform_has(CC_ATTR_MEM_ENCRYPT) &&
 	    !amd_iommu_snp_en) {
 		return IOMMU_DOMAIN_IDENTITY;
-- 
2.31.1



* [PATCH 17/21] iommu/amd: Rename ats related variables
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (15 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 16/21] iommu/amd: Introduce iommu_dev_data.flags to track device capabilities Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 18/21] iommu/amd: Enable device ATS/PASID/PRI capabilities independently Vasant Hegde
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

Remove the nested structure and flatten it into the 'ats_{enabled,qdep}'
fields.

No functional changes intended.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  6 ++----
 drivers/iommu/amd/iommu.c           | 26 +++++++++++++-------------
 2 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 57586a8c451e..15c129dbacab 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -815,10 +815,8 @@ struct iommu_dev_data {
 	struct device *dev;
 	u16 devid;			  /* PCI Device ID */
 	u32 flags;			  /* Holds AMD_IOMMU_DEVICE_FLAG_<*> */
-	struct {
-		bool enabled;
-		int qdep;
-	} ats;				  /* ATS state */
+	int  ats_qdep;
+	bool ats_enabled;		  /* ATS state */
 	bool pri_tlp;			  /* PASID TLB required for
 					     PPR completions */
 	bool use_vapic;			  /* Enable device to use vapic mode */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ff38561435e2..6c425aa521c7 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1310,7 +1310,7 @@ static int device_flush_iotlb(struct iommu_dev_data *dev_data,
 	struct iommu_cmd cmd;
 	int qdep;
 
-	qdep     = dev_data->ats.qdep;
+	qdep     = dev_data->ats_qdep;
 	iommu    = rlookup_amd_iommu(dev_data->dev);
 	if (!iommu)
 		return -EINVAL;
@@ -1361,7 +1361,7 @@ static int device_flush_dte(struct iommu_dev_data *dev_data)
 			return ret;
 	}
 
-	if (dev_data->ats.enabled)
+	if (dev_data->ats_enabled)
 		ret = device_flush_iotlb(dev_data, 0, ~0UL);
 
 	return ret;
@@ -1394,7 +1394,7 @@ static void __domain_flush_pages(struct protection_domain *domain,
 
 	list_for_each_entry(dev_data, &domain->dev_list, list) {
 
-		if (!dev_data->ats.enabled)
+		if (!dev_data->ats_enabled)
 			continue;
 
 		ret |= device_flush_iotlb(dev_data, address, size);
@@ -1718,7 +1718,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 	iommu = rlookup_amd_iommu(dev_data->dev);
 	if (!iommu)
 		return;
-	ats   = dev_data->ats.enabled;
+	ats   = dev_data->ats_enabled;
 
 	/* Update data structures */
 	dev_data->domain = domain;
@@ -1856,14 +1856,14 @@ static int attach_device(struct device *dev,
 			if (pdev_pri_ats_enable(pdev) != 0)
 				goto out;
 
-			dev_data->ats.enabled = true;
-			dev_data->ats.qdep    = pci_ats_queue_depth(pdev);
+			dev_data->ats_enabled = true;
+			dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
 			dev_data->pri_tlp     = pci_prg_resp_pasid_required(pdev);
 		}
 	} else if (amd_iommu_iotlb_sup &&
 		   pci_enable_ats(pdev, PAGE_SHIFT) == 0) {
-		dev_data->ats.enabled = true;
-		dev_data->ats.qdep    = pci_ats_queue_depth(pdev);
+		dev_data->ats_enabled = true;
+		dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
 	}
 
 skip_ats_check:
@@ -1920,10 +1920,10 @@ static void detach_device(struct device *dev)
 
 	if (domain->flags & PD_FLAG_GCR3 && pdev_pasid_supported(dev_data))
 		pdev_iommuv2_disable(to_pci_dev(dev));
-	else if (dev_data->ats.enabled)
+	else if (dev_data->ats_enabled)
 		pci_disable_ats(to_pci_dev(dev));
 
-	dev_data->ats.enabled = false;
+	dev_data->ats_enabled = false;
 
 out:
 	spin_unlock(&dev_data->lock);
@@ -2013,7 +2013,7 @@ static void update_device_table(struct protection_domain *domain)
 		if (!iommu)
 			continue;
 		set_dte_entry(iommu, dev_data->devid, domain,
-			      dev_data->ats.enabled, dev_data->flags);
+			      dev_data->ats_enabled, dev_data->flags);
 		clone_aliases(iommu, dev_data->dev);
 	}
 }
@@ -2651,10 +2651,10 @@ static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 		   There might be non-IOMMUv2 capable devices in an IOMMUv2
 		 * domain.
 		 */
-		if (!dev_data->ats.enabled)
+		if (!dev_data->ats_enabled)
 			continue;
 
-		qdep  = dev_data->ats.qdep;
+		qdep  = dev_data->ats_qdep;
 		iommu = rlookup_amd_iommu(dev_data->dev);
 		if (!iommu)
 			continue;
-- 
2.31.1



* [PATCH 18/21] iommu/amd: Enable device ATS/PASID/PRI capabilities independently
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (16 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 17/21] iommu/amd: Rename ats related variables Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 19/21] iommu/amd: Delay enabling PRI to after initializing domain Vasant Hegde
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

Introduce helper functions to enable/disable device ATS/PASID/PRI
capabilities independently, along with new pasid_enabled and
pri_enabled variables in struct iommu_dev_data to track their state.
This allows attach_device() and detach_device() to be simplified.
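
The helpers follow an idempotent enable/disable pattern: a second enable
call touches the hardware only once, and disable is a no-op unless the
capability is currently enabled. A minimal sketch of that pattern (the
names and state here are hypothetical, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>

struct cap_state {
	bool pri_enabled;
	int hw_enables;		/* counts simulated hardware enable calls */
};

/* Idempotent enable: returns success immediately if already enabled */
static int cap_enable_pri(struct cap_state *s)
{
	if (s->pri_enabled)
		return 0;

	s->hw_enables++;	/* stand-in for pci_reset_pri()/pci_enable_pri() */
	s->pri_enabled = true;
	return 0;
}

/* Idempotent disable: acts only when the capability is enabled */
static void cap_disable_pri(struct cap_state *s)
{
	if (!s->pri_enabled)
		return;

	s->pri_enabled = false;	/* stand-in for pci_disable_pri() */
}

/* Demonstrates that a double enable hits the hardware only once */
static int double_enable_hw_count(void)
{
	struct cap_state s = { 0 };

	cap_enable_pri(&s);
	cap_enable_pri(&s);
	return s.hw_enables;
}

/* Demonstrates that repeated disables are safe */
static bool enable_disable_leaves_disabled(void)
{
	struct cap_state s = { 0 };

	cap_enable_pri(&s);
	cap_disable_pri(&s);
	cap_disable_pri(&s);
	return !s.pri_enabled;
}
```

Making each helper idempotent is what lets attach_device()/detach_device()
call them unconditionally instead of tracking combined PRI+ATS+PASID state.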

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       |   4 +
 drivers/iommu/amd/amd_iommu_types.h |   2 +
 drivers/iommu/amd/iommu.c           | 203 ++++++++++++++++------------
 3 files changed, 120 insertions(+), 89 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 7773315e5adc..d9d531f2775f 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -54,6 +54,10 @@ int amd_iommu_pc_get_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
 int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
 			 u8 fxn, u64 *value);
 
+/* Device capabilities */
+int amd_iommu_pdev_enable_cap_pri(struct pci_dev *pdev);
+void amd_iommu_pdev_disable_cap_pri(struct pci_dev *pdev);
+
 int amd_iommu_register_ppr_notifier(struct notifier_block *nb);
 int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb);
 int amd_iommu_v2_domain_init(struct protection_domain *pdom,
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 15c129dbacab..64add3347a89 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -817,6 +817,8 @@ struct iommu_dev_data {
 	u32 flags;			  /* Holds AMD_IOMMU_DEVICE_FLAG_<*> */
 	int  ats_qdep;
 	bool ats_enabled;		  /* ATS state */
+	bool pri_enabled;		  /* PRI state */
+	bool pasid_enabled;		  /* PASID state */
 	bool pri_tlp;			  /* PASID TLB required for
 					     PPR completions */
 	bool use_vapic;			  /* Enable device to use vapic mode */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6c425aa521c7..e0fa27aedbc0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -335,6 +335,113 @@ static u32 pdev_get_caps(struct pci_dev *pdev)
 	return flags;
 }
 
+static inline int pdev_enable_cap_ats(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+	int ret = -EINVAL;
+
+	if (dev_data->ats_enabled)
+		return 0;
+
+	if (amd_iommu_iotlb_sup &&
+	    (dev_data->flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP)) {
+		ret = pci_enable_ats(pdev, PAGE_SHIFT);
+		if (!ret) {
+			dev_data->ats_enabled = true;
+			dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
+		}
+	}
+
+	return ret;
+}
+
+static inline void pdev_disable_cap_ats(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+
+	if (dev_data->ats_enabled) {
+		pci_disable_ats(pdev);
+		dev_data->ats_enabled = false;
+	}
+}
+
+int amd_iommu_pdev_enable_cap_pri(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+	int ret = -EINVAL;
+
+	if (dev_data->pri_enabled)
+		return 0;
+
+	if (dev_data->flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) {
+		/*
+		 * First reset the PRI state of the device.
+		 * FIXME: Hardcode number of outstanding requests for now
+		 */
+		if (!pci_reset_pri(pdev) && !pci_enable_pri(pdev, 32)) {
+			dev_data->pri_enabled = true;
+			dev_data->pri_tlp     = pci_prg_resp_pasid_required(pdev);
+
+			ret = 0;
+		}
+	}
+
+	return ret;
+}
+
+void amd_iommu_pdev_disable_cap_pri(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+
+	if (dev_data->pri_enabled) {
+		pci_disable_pri(pdev);
+		dev_data->pri_enabled = false;
+	}
+}
+
+static inline int pdev_enable_cap_pasid(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+	int ret = -EINVAL;
+
+	if (dev_data->pasid_enabled)
+		return 0;
+
+	if (dev_data->flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) {
+		/* Only allow access to user-accessible pages */
+		ret = pci_enable_pasid(pdev, 0);
+		if (!ret)
+			dev_data->pasid_enabled = true;
+	}
+
+	return ret;
+}
+
+static inline void pdev_disable_cap_pasid(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+
+	if (dev_data->pasid_enabled) {
+		pci_disable_pasid(pdev);
+		dev_data->pasid_enabled = false;
+	}
+}
+
+static void pdev_enable_caps(struct pci_dev *pdev)
+{
+	pdev_enable_cap_ats(pdev);
+	pdev_enable_cap_pasid(pdev);
+	amd_iommu_pdev_enable_cap_pri(pdev);
+
+}
+
+static void pdev_disable_caps(struct pci_dev *pdev)
+{
+	pdev_disable_cap_ats(pdev);
+	pdev_disable_cap_pasid(pdev);
+	amd_iommu_pdev_disable_cap_pri(pdev);
+}
+
 /*
  * This function checks if the driver got a valid device from the caller to
  * avoid dereferencing invalid pointers.
@@ -1769,48 +1876,6 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	domain->dev_cnt                 -= 1;
 }
 
-static void pdev_iommuv2_disable(struct pci_dev *pdev)
-{
-	pci_disable_ats(pdev);
-	pci_disable_pri(pdev);
-	pci_disable_pasid(pdev);
-}
-
-static int pdev_pri_ats_enable(struct pci_dev *pdev)
-{
-	int ret;
-
-	/* Only allow access to user-accessible pages */
-	ret = pci_enable_pasid(pdev, 0);
-	if (ret)
-		return ret;
-
-	/* First reset the PRI state of the device */
-	ret = pci_reset_pri(pdev);
-	if (ret)
-		goto out_err_pasid;
-
-	/* Enable PRI */
-	/* FIXME: Hardcode number of outstanding requests for now */
-	ret = pci_enable_pri(pdev, 32);
-	if (ret)
-		goto out_err_pasid;
-
-	ret = pci_enable_ats(pdev, PAGE_SHIFT);
-	if (ret)
-		goto out_err_pri;
-
-	return 0;
-
-out_err_pri:
-	pci_disable_pri(pdev);
-
-out_err_pasid:
-	pci_disable_pasid(pdev);
-
-	return ret;
-}
-
 /*
  * If a device is not yet associated with a domain, this function makes the
  * device visible in the domain
@@ -1819,9 +1884,8 @@ static int attach_device(struct device *dev,
 			 struct protection_domain *domain)
 {
 	struct iommu_dev_data *dev_data;
-	struct pci_dev *pdev;
 	unsigned long flags;
-	int ret;
+	int ret = 0;
 
 	spin_lock_irqsave(&domain->lock, flags);
 
@@ -1829,45 +1893,13 @@ static int attach_device(struct device *dev,
 
 	spin_lock(&dev_data->lock);
 
-	ret = -EBUSY;
-	if (dev_data->domain != NULL)
+	if (dev_data->domain != NULL) {
+		ret = -EBUSY;
 		goto out;
-
-	if (!dev_is_pci(dev))
-		goto skip_ats_check;
-
-	pdev = to_pci_dev(dev);
-	if (domain->flags & PD_FLAG_GCR3) {
-		struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
-
-		ret = -EINVAL;
-
-		/*
-		 * In case of using AMD_IOMMU_V1 page table mode and the device
-		 * is enabling for PPR/ATS support (using v2 table),
-		 * we need to make sure that the domain type is identity map.
-		 */
-		if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
-		    def_domain->type != IOMMU_DOMAIN_IDENTITY) {
-			goto out;
-		}
-
-		if (pdev_pasid_supported(dev_data)) {
-			if (pdev_pri_ats_enable(pdev) != 0)
-				goto out;
-
-			dev_data->ats_enabled = true;
-			dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
-			dev_data->pri_tlp     = pci_prg_resp_pasid_required(pdev);
-		}
-	} else if (amd_iommu_iotlb_sup &&
-		   pci_enable_ats(pdev, PAGE_SHIFT) == 0) {
-		dev_data->ats_enabled = true;
-		dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
 	}
 
-skip_ats_check:
-	ret = 0;
+	if (dev_is_pci(dev))
+		pdev_enable_caps(to_pci_dev(dev));
 
 	do_attach(dev_data, domain);
 
@@ -1915,15 +1947,8 @@ static void detach_device(struct device *dev)
 
 	do_detach(dev_data);
 
-	if (!dev_is_pci(dev))
-		goto out;
-
-	if (domain->flags & PD_FLAG_GCR3 && pdev_pasid_supported(dev_data))
-		pdev_iommuv2_disable(to_pci_dev(dev));
-	else if (dev_data->ats_enabled)
-		pci_disable_ats(to_pci_dev(dev));
-
-	dev_data->ats_enabled = false;
+	if (dev_is_pci(dev))
+		pdev_disable_caps(to_pci_dev(dev));
 
 out:
 	spin_unlock(&dev_data->lock);
-- 
2.31.1



* [PATCH 19/21] iommu/amd: Delay enabling PRI to after initializing domain
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (17 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 18/21] iommu/amd: Enable device ATS/PASID/PRI capabilities independently Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 20/21] iommu/amd: Initialize iommu_device->max_pasids Vasant Hegde
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

The PRI capability of a device should be enabled only when the domain is
set up to support it. Hence, move PRI enablement into
amd_iommu_v2_domain_init().

Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 42 +++++++++++++++++++++++++++++++++------
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e0fa27aedbc0..62de31cd59ee 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -431,15 +431,12 @@ static void pdev_enable_caps(struct pci_dev *pdev)
 {
 	pdev_enable_cap_ats(pdev);
 	pdev_enable_cap_pasid(pdev);
-	amd_iommu_pdev_enable_cap_pri(pdev);
-
 }
 
 static void pdev_disable_caps(struct pci_dev *pdev)
 {
 	pdev_disable_cap_ats(pdev);
 	pdev_disable_cap_pasid(pdev);
-	amd_iommu_pdev_disable_cap_pri(pdev);
 }
 
 /*
@@ -2106,7 +2103,32 @@ static void protection_domain_free(struct protection_domain *domain)
 	kfree(domain);
 }
 
-static int v2api_domain_init(struct protection_domain *pdom)
+static inline int v2api_pdev_init(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+
+	if (dev_data->pri_enabled)
+		return -EBUSY;
+
+	/* V2API mode needs ATS, PASID and PRI support */
+	if (!dev_data->ats_enabled || !dev_data->pasid_enabled)
+		return -EINVAL;
+
+	return amd_iommu_pdev_enable_cap_pri(pdev);
+}
+
+static inline void v2api_pdev_uninit(struct pci_dev *pdev)
+{
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(&pdev->dev);
+
+	if (!dev_data->pri_enabled)
+		return;
+
+	amd_iommu_pdev_disable_cap_pri(pdev);
+}
+
+static int v2api_domain_init(struct protection_domain *pdom,
+			     struct pci_dev *pdev)
 {
 	/* V2API is already enabled for this domain */
 	if (!!(pdom->flags & PD_FLAG_V2API))
@@ -2128,7 +2150,7 @@ static int v2api_domain_init(struct protection_domain *pdom)
 	if (pdom->iop.pgtbl_cfg.tlb)
 		free_io_pgtable_ops(&pdom->iop.iop.ops);
 
-	return 0;
+	return v2api_pdev_init(pdev);
 }
 
 int amd_iommu_v2_domain_init(struct protection_domain *pdom,
@@ -2141,7 +2163,7 @@ int amd_iommu_v2_domain_init(struct protection_domain *pdom,
 
 	switch (pd_flags) {
 	case PD_FLAG_V2API:
-		ret = v2api_domain_init(pdom);
+		ret = v2api_domain_init(pdom, pdev);
 		break;
 	/* This gets called from domain allocation path */
 	case PD_FLAG_V2DMA:
@@ -2180,6 +2202,14 @@ void amd_iommu_v2_domain_uninit(struct protection_domain *pdom,
 
 	spin_lock_irqsave(&pdom->lock, flags);
 
+	switch (pd_flags) {
+	case PD_FLAG_V2API:
+		v2api_pdev_uninit(pdev);
+		break;
+	default:
+		break;
+	}
+
 	/*
 	 * If device is booted with passthrough mode then clear guest
 	 * page table.
-- 
2.31.1



* [PATCH 20/21] iommu/amd: Initialize iommu_device->max_pasids
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (18 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 19/21] iommu/amd: Delay enabling PRI to after initializing domain Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-12 14:15 ` [PATCH 21/21] iommu/amd: Simplify amd_iommu_device_info() Vasant Hegde
  2023-07-14 17:53 ` [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Jason Gunthorpe
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

Commit 1adf3cc20d69 ("iommu: Add max_pasids field in struct iommu_device")
introduced struct iommu_device.max_pasids to track the maximum number of
PASIDs supported by each IOMMU.

Initialize this field for the AMD IOMMU. The IOMMU core will use this value
to set the max PASIDs per device (see __iommu_probe_device()).

Also remove the now-unused global 'amd_iommu_max_pasid' variable.

Finally, the current code restricts the max PASID to 16 bits and calls
BUG_ON() if it exceeds that. Replace the BUG_ON() with a WARN_ON(), since
the system can continue to work with a 16-bit PASID.
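
The width computation in the hunk below can be exercised standalone. The
constants mirror the FEATURE_PASID_* and PASID_MASK definitions assumed
from the driver headers; treat this as an illustrative sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Field layout assumed from the AMD IOMMU driver headers */
#define FEATURE_PASID_SHIFT	32
#define FEATURE_PASID_MASK	(0x1FULL << FEATURE_PASID_SHIFT)
#define PASID_MASK		0x0000FFFFULL

/*
 * Extract the max PASID count from the extended feature register:
 * PASMAX encodes the supported PASID width minus one, so the count is
 * (2^(pasmax + 1)) - 1. Values wider than 16 bits are clamped to
 * PASID_MASK, which is why a WARN_ON() (rather than a BUG_ON())
 * suffices.
 */
static uint32_t max_pasids_from_features(uint64_t features)
{
	unsigned int pasmax = (features & FEATURE_PASID_MASK) >> FEATURE_PASID_SHIFT;

	if (pasmax >= 31)	/* avoid an undefined 32-bit shift */
		return PASID_MASK;

	return ((1U << (pasmax + 1)) - 1) & PASID_MASK;
}
```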

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  3 ---
 drivers/iommu/amd/init.c            | 10 +++-------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 64add3347a89..ffb1f98620c3 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -887,9 +887,6 @@ extern unsigned amd_iommu_aperture_order;
 /* allocation bitmap for domain ids */
 extern unsigned long *amd_iommu_pd_alloc_bitmap;
 
-/* Smallest max PASID supported by any IOMMU in the system */
-extern u32 amd_iommu_max_pasid;
-
 extern bool amd_iommu_force_isolation;
 
 /* Max levels of glxval supported */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 1f56478ae74e..6441c81467ab 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -185,8 +185,6 @@ static int amd_iommus_present;
 bool amd_iommu_np_cache __read_mostly;
 bool amd_iommu_iotlb_sup __read_mostly = true;
 
-u32 amd_iommu_max_pasid __read_mostly = ~0;
-
 static bool amd_iommu_pc_present __read_mostly;
 bool amdr_ivrs_remap_support __read_mostly;
 
@@ -2086,16 +2084,14 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 
 	if (iommu_feature(iommu, FEATURE_GT)) {
 		int glxval;
-		u32 max_pasid;
 		u64 pasmax;
 
 		pasmax = iommu->features & FEATURE_PASID_MASK;
 		pasmax >>= FEATURE_PASID_SHIFT;
-		max_pasid  = (1 << (pasmax + 1)) - 1;
-
-		amd_iommu_max_pasid = min(amd_iommu_max_pasid, max_pasid);
+		iommu->iommu.max_pasids = (1 << (pasmax + 1)) - 1;
 
-		BUG_ON(amd_iommu_max_pasid & ~PASID_MASK);
+		WARN_ON(iommu->iommu.max_pasids & ~PASID_MASK);
+		iommu->iommu.max_pasids &= PASID_MASK;
 
 		glxval   = iommu->features & FEATURE_GLXVAL_MASK;
 		glxval >>= FEATURE_GLXVAL_SHIFT;
-- 
2.31.1



* [PATCH 21/21] iommu/amd: Simplify amd_iommu_device_info()
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (19 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 20/21] iommu/amd: Initialize iommu_device->max_pasids Vasant Hegde
@ 2023-07-12 14:15 ` Vasant Hegde
  2023-07-14 17:53 ` [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Jason Gunthorpe
  21 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-12 14:15 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, Vasant Hegde

Let 'iommu_dev_data.flags' track device EXEC and PRIV support as well,
so that amd_iommu_device_info() does not need to re-check the device
capabilities.

Also, in the probe path, __iommu_probe_device() updates
dev->iommu->max_pasids. Use that instead of recalculating the max PASIDs.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/iommu.c | 41 ++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 62de31cd59ee..6319a2dc4654 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -321,6 +321,7 @@ static inline bool pdev_pasid_supported(struct iommu_dev_data *dev_data)
 
 static u32 pdev_get_caps(struct pci_dev *pdev)
 {
+	int features;
 	u32 flags = 0;
 
 	if (pci_ats_supported(pdev))
@@ -329,9 +330,17 @@ static u32 pdev_get_caps(struct pci_dev *pdev)
 	if (pci_pri_supported(pdev))
 		flags |= AMD_IOMMU_DEVICE_FLAG_PRI_SUP;
 
-	if (pci_pasid_features(pdev) >= 0)
+	features = pci_pasid_features(pdev);
+	if (features >= 0) {
 		flags |= AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
 
+		if (features & PCI_PASID_CAP_EXEC)
+			flags |= AMD_IOMMU_DEVICE_FLAG_EXEC_SUP;
+
+		if (features & PCI_PASID_CAP_PRIV)
+			flags |= AMD_IOMMU_DEVICE_FLAG_PRIV_SUP;
+	}
+
 	return flags;
 }
 
@@ -2883,8 +2892,8 @@ EXPORT_SYMBOL(amd_iommu_complete_ppr);
 int amd_iommu_device_info(struct pci_dev *pdev,
                           struct amd_iommu_device_info *info)
 {
-	int max_pasids;
-	int pos;
+	struct device *dev = &pdev->dev;
+	struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
 
 	if (pdev == NULL || info == NULL)
 		return -EINVAL;
@@ -2894,29 +2903,9 @@ int amd_iommu_device_info(struct pci_dev *pdev,
 
 	memset(info, 0, sizeof(*info));
 
-	if (pci_ats_supported(pdev))
-		info->flags |= AMD_IOMMU_DEVICE_FLAG_ATS_SUP;
-
-	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI);
-	if (pos)
-		info->flags |= AMD_IOMMU_DEVICE_FLAG_PRI_SUP;
-
-	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PASID);
-	if (pos) {
-		int features;
-
-		max_pasids = 1 << (9 * (amd_iommu_max_glx_val + 1));
-		max_pasids = min(max_pasids, (1 << 20));
-
-		info->flags |= AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
-		info->max_pasids = min(pci_max_pasids(pdev), max_pasids);
-
-		features = pci_pasid_features(pdev);
-		if (features & PCI_PASID_CAP_EXEC)
-			info->flags |= AMD_IOMMU_DEVICE_FLAG_EXEC_SUP;
-		if (features & PCI_PASID_CAP_PRIV)
-			info->flags |= AMD_IOMMU_DEVICE_FLAG_PRIV_SUP;
-	}
+	info->flags = dev_data->flags;
+	if (dev->iommu)
+		info->max_pasids = dev->iommu->max_pasids;
 
 	return 0;
 }
-- 
2.31.1



* Re: [PATCH 07/21] iommu/amd: Use struct protection_domain in helper functions
  2023-07-12 14:15 ` [PATCH 07/21] iommu/amd: Use struct protection_domain in helper functions Vasant Hegde
@ 2023-07-14 16:57   ` Jason Gunthorpe
  0 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2023-07-14 16:57 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit

On Wed, Jul 12, 2023 at 02:15:02PM +0000, Vasant Hegde wrote:

> +static inline struct protection_domain *amd_iommu_get_pdomain(struct device *dev)
> +{
> +	struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> +
> +	if (!dev_data || !dev_data->domain)
> +		return NULL;
> +
> +	return dev_data->domain;
> +}

This hunk seems to be in the wrong patch?

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags
  2023-07-12 14:15 ` [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags Vasant Hegde
@ 2023-07-14 17:25   ` Jason Gunthorpe
  2023-07-17  1:22     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-07-14 17:25 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit

On Wed, Jul 12, 2023 at 02:15:06PM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> With both IOMMU v1 and v2 page tables per domain, and with the v2 table
> usable in various modes, keeping track of how a particular domain
> is configured is becoming cumbersome.
> 
> Enhance protection_domain.flags to keep track of how a domain is
> configured. Also remove unused flags, and rename them accordingly.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 14 +++++---------
>  drivers/iommu/amd/io_pgtable_v2.c   |  2 +-
>  drivers/iommu/amd/iommu.c           | 21 ++++++++++++---------
>  3 files changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> index 346c32343417..907af6e41ae9 100644
> --- a/drivers/iommu/amd/amd_iommu_types.h
> +++ b/drivers/iommu/amd/amd_iommu_types.h
> @@ -443,13 +443,11 @@
>  #define MAX_DOMAIN_ID 65536
>  
>  /* Protection domain flags */
> -#define PD_DMA_OPS_MASK		BIT(0) /* domain used for dma_ops */
> -#define PD_DEFAULT_MASK		BIT(1) /* domain is a default dma_ops
> -					      domain for an IOMMU */
> -#define PD_PASSTHROUGH_MASK	BIT(2) /* domain has no page
> -					      translation */
> -#define PD_IOMMUV2_MASK		BIT(3) /* domain has gcr3 table */
> -#define PD_GIOV_MASK		BIT(4) /* domain enable GIOV support */

> +#define PD_FLAG_V1DMA		BIT(1) /* DMA-API mode w/ v1 page table */
> +#define PD_FLAG_V2DMA		BIT(2) /* DMA-API mode w/ v2 page table */

"DMA-API" should not be part of the driver vocabulary. Does this have
anything to do with DMA-API?

Only "paging" domains should have IO page tables and an IO page table
type.

Maybe this is just misnamed?

Also, shouldn't it be an enum not flags?

> +#define PD_FLAG_PT		BIT(0) /* Passthrough mode */
> +#define PD_FLAG_GCR3		BIT(3) /* domain has gcr3 table */

These seem redundant with the domain type?

Maybe a few helpers?

> +#define PD_FLAG_GIOV		BIT(4) /* domain enable GIOV support */

This really shouldn't be a iommu_domain flag, this should be computed
based on what domains are attached where

Eg this seems like it should be triggered by attaching identity to the
RID with PASID enabled or something like that.

Why do you need to set it for the legacy GPU path?

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2)
  2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
                   ` (20 preceding siblings ...)
  2023-07-12 14:15 ` [PATCH 21/21] iommu/amd: Simplify amd_iommu_device_info() Vasant Hegde
@ 2023-07-14 17:53 ` Jason Gunthorpe
  21 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2023-07-14 17:53 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit

On Wed, Jul 12, 2023 at 02:14:55PM +0000, Vasant Hegde wrote:
> This is part 1 of the 2-part series to introduce Shared Virtual Address
> (SVA) support, which focuses on cleaning up and refactoring the existing
> code in preparation for subsequent series.
> 
> It contains the following enhancements:
> 
> * Patch 1 - 7:
>   Clean up and refactoring. No functional changes.
> 
> * Patch 8 - 11:
>   Miscellaneous improvements.
> 
> * Patch 12 - 15:
>   Introduce a new set of AMD IOMMU protection domain flags, which
>   simplifies domain management.
> 
> * Patch 16 - 21:
>   Modify logic to independently enable PCI capabilities (ATS/PASID/PRI)
>   for devices. This allows more flexibility in preparation for SVA and
>   IOPF support.

I don't know a lot about the AMD driver, but these all seemed like a
pretty good movement in the right direction.

I hope part 2 removes all EXPORT_SYMBOL's from the AMD driver :)

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags
  2023-07-14 17:25   ` Jason Gunthorpe
@ 2023-07-17  1:22     ` Suthikulpanit, Suravee
  2023-07-17 12:23       ` Jason Gunthorpe
  0 siblings, 1 reply; 30+ messages in thread
From: Suthikulpanit, Suravee @ 2023-07-17  1:22 UTC (permalink / raw)
  To: Jason Gunthorpe, Vasant Hegde; +Cc: iommu, joro

Jason,

On 7/14/2023 12:25 PM, Jason Gunthorpe wrote:
> On Wed, Jul 12, 2023 at 02:15:06PM +0000, Vasant Hegde wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> With both IOMMU v1 and v2 page tables per domain, and with the v2 table
>> usable in various modes, keeping track of how a particular domain
>> is configured is becoming cumbersome.
>>
>> Enhance protection_domain.flags to keep track of how a domain is
>> configured. Also remove unused flags, and rename them accordingly.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
>> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
>> ---
>>   drivers/iommu/amd/amd_iommu_types.h | 14 +++++---------
>>   drivers/iommu/amd/io_pgtable_v2.c   |  2 +-
>>   drivers/iommu/amd/iommu.c           | 21 ++++++++++++---------
>>   3 files changed, 18 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
>> index 346c32343417..907af6e41ae9 100644
>> --- a/drivers/iommu/amd/amd_iommu_types.h
>> +++ b/drivers/iommu/amd/amd_iommu_types.h
>> @@ -443,13 +443,11 @@
>>   #define MAX_DOMAIN_ID 65536
>>   
>>   /* Protection domain flags */
>> -#define PD_DMA_OPS_MASK		BIT(0) /* domain used for dma_ops */
>> -#define PD_DEFAULT_MASK		BIT(1) /* domain is a default dma_ops
>> -					      domain for an IOMMU */
>> -#define PD_PASSTHROUGH_MASK	BIT(2) /* domain has no page
>> -					      translation */
>> -#define PD_IOMMUV2_MASK		BIT(3) /* domain has gcr3 table */
>> -#define PD_GIOV_MASK		BIT(4) /* domain enable GIOV support */
> 
>> +#define PD_FLAG_V1DMA		BIT(1) /* DMA-API mode w/ v1 page table */
>> +#define PD_FLAG_V2DMA		BIT(2) /* DMA-API mode w/ v2 page table */
> 
> "DMA-API" should not be part of the driver vocabulary. Does this have
> anything to do with DMA-API?
>
> Only "paging" domains should have IO page tables and an IO page table
> type.
>
> Maybe this is just misnamed?

These flags are for struct protection_domain in AMD IOMMU driver, and 
not for the struct iommu_domain.

The PD_FLAG_V[1|2]DMA flags are used to signify that the domain is using 
a v1 or v2 page table to implement IOMMU_DOMAIN_DMA (which has 
__IOMMU_DOMAIN_PAGING | __IOMMU_DOMAIN_DMA_API).

> Also, shouldn't it be an enum not flags?

We use flags instead of an enum because we expect to have combinations of 
the flags. For example:
* IOMMU_DOMAIN_DMA + amd_iommu=pgtbl_v1 = (PD_FLAG_V1DMA)
* IOMMU_DOMAIN_DMA + amd_iommu=pgtbl_v2 = (PD_FLAG_V2DMA | PD_FLAG_GIOV)
* IOMMU_DOMAIN_SVA = (PD_FLAG_SVA | PD_FLAG_GCR3)
* Domain for v2API = (PD_FLAG_V2API | PD_FLAG_GCR3)
* Etc ....

It helps keeping track of how the IOMMU driver needs to setup DTEs for 
device in a particular domain.

>> +#define PD_FLAG_PT		BIT(0) /* Passthrough mode */
>> +#define PD_FLAG_GCR3		BIT(3) /* domain has gcr3 table */
> 
> These seem redundant with the domain type?
> 
> Maybe a few helpers?
> 
>> +#define PD_FLAG_GIOV		BIT(4) /* domain enable GIOV support */
> 
> This really shouldn't be a iommu_domain flag, this should be computed
> based on what domains are attached where
> 
> Eg this seems like it should be triggered by attaching identity to the
> RID with PASID enabled or something like that.
> 
> Why do you need to set it for the legacy GPU path?

The PD_FLAG_GIOV is necessary when we set up the v2 page table for 
devices that cannot support PASID. With the DTE[GIOV] bit set, the IOMMU 
hardware treats requests from these devices as if they used PASID 0.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags
  2023-07-17  1:22     ` Suthikulpanit, Suravee
@ 2023-07-17 12:23       ` Jason Gunthorpe
  2023-07-18  6:38         ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-07-17 12:23 UTC (permalink / raw)
  To: Suthikulpanit, Suravee; +Cc: Vasant Hegde, iommu, joro

On Sun, Jul 16, 2023 at 08:22:37PM -0500, Suthikulpanit, Suravee wrote:
> Jason,
> 
> On 7/14/2023 12:25 PM, Jason Gunthorpe wrote:
> > On Wed, Jul 12, 2023 at 02:15:06PM +0000, Vasant Hegde wrote:
> > > From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > > 
> > > With both IOMMU v1 and v2 page tables per domain, and with the v2 table
> > > usable in various modes, keeping track of how a particular domain
> > > is configured is becoming cumbersome.
> > > 
> > > Enhance protection_domain.flags to keep track of how a domain is
> > > configured. Also remove unused flags, and rename them accordingly.
> > > 
> > > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > > Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
> > > Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> > > ---
> > >   drivers/iommu/amd/amd_iommu_types.h | 14 +++++---------
> > >   drivers/iommu/amd/io_pgtable_v2.c   |  2 +-
> > >   drivers/iommu/amd/iommu.c           | 21 ++++++++++++---------
> > >   3 files changed, 18 insertions(+), 19 deletions(-)
> > > 
> > > diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> > > index 346c32343417..907af6e41ae9 100644
> > > --- a/drivers/iommu/amd/amd_iommu_types.h
> > > +++ b/drivers/iommu/amd/amd_iommu_types.h
> > > @@ -443,13 +443,11 @@
> > >   #define MAX_DOMAIN_ID 65536
> > >   /* Protection domain flags */
> > > -#define PD_DMA_OPS_MASK		BIT(0) /* domain used for dma_ops */
> > > -#define PD_DEFAULT_MASK		BIT(1) /* domain is a default dma_ops
> > > -					      domain for an IOMMU */
> > > -#define PD_PASSTHROUGH_MASK	BIT(2) /* domain has no page
> > > -					      translation */
> > > -#define PD_IOMMUV2_MASK		BIT(3) /* domain has gcr3 table */
> > > -#define PD_GIOV_MASK		BIT(4) /* domain enable GIOV support */
> > 
> > > +#define PD_FLAG_V1DMA		BIT(1) /* DMA-API mode w/ v1 page table */
> > > +#define PD_FLAG_V2DMA		BIT(2) /* DMA-API mode w/ v2 page table */
> > 
> > "DMA-API" should not be part of the driver vocabulary. Does this have
> > anything to do with DMA-API?
> > 
> > Only "paging" domains should have IO page tables and an IO page table
> > type.
> > 
> > Maybe this is just misnamed?
> 
> These flags are for struct protection_domain in AMD IOMMU driver, and not
> for the struct iommu_domain.

I understand that

> PD_FLAG_V[1|2]DMA flags are used to signify that the domain is using v1 or
> v2 page table to implement IOMMU_DOMAIN_DMA (which has __IOMMU_DOMAIN_PAGING
> | __IOMMU_DOMAIN_DMA_API).

Again, no, drivers should have no sensitivity to
__IOMMU_DOMAIN_DMA_API, we are trying to get rid of this. Please don't
add more to the driver.

> > Also, shouldn't it be an enum not flags?
> 
> We use flags instead of an enum because we expect to have combinations of
> the flags. For example:
> * IOMMU_DOMAIN_DMA + amd_iommu=pgtbl_v1 = (PD_FLAG_V1DMA)
> * IOMMU_DOMAIN_DMA + amd_iommu=pgtbl_v2 = (PD_FLAG_V2DMA | PD_FLAG_GIOV)
> * IOMMU_DOMAIN_SVA = (PD_FLAG_SVA | PD_FLAG_GCR3)
> * Domain for v2API = (PD_FLAG_V2API | PD_FLAG_GCR3)

You shouldn't have these distinctions. There should only be one enum
that specifies that format the IOPTEs are, and the rest is implied.

GCR3 is the "PASID table" right? it should not be part of the
iommu_domain struct either, SMMUv3 is busy undoing this mistake in
their design right now.

> It helps keeping track of how the IOMMU driver needs to setup DTEs for
> device in a particular domain.

Yes, I understand what it is for. It is important that the right data
be stored in the right places or you will have to redo it all. If it
is not directly related to describing the format of the IOPTEs then
it has no business being in the iommu_domain struct or its driver
variants like protection_domain. It belongs someplace else.

> > > +#define PD_FLAG_GIOV		BIT(4) /* domain enable GIOV support */
> > 
> > This really shouldn't be a iommu_domain flag, this should be computed
> > based on what domains are attached where
> > 
> > Eg this seems like it should be triggered by attaching identity to the
> > RID with PASID enabled or something like that.
> > 
> > Why do you need to set it for the legacy GPU path?
> 
> The PD_FLAG_GIOV is necessary for when we setup the v2 page table for
> devices, which cannot support PASID. With the DTE[GIOV] bit set, IOMMU
> hardware would treat the request from these devices as if it use PASID 0.

I understand that, but you are supposed to deduce this from having a
RID-bound page table installed into a PASID 0 slot by other logic; it
is very much not part of the domain itself.

(as well as all the other variations of this flag, e.g. when you set an
IDENTITY domain on the RID while having PASID operating)

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags
  2023-07-17 12:23       ` Jason Gunthorpe
@ 2023-07-18  6:38         ` Suthikulpanit, Suravee
  2023-07-18 12:36           ` Jason Gunthorpe
  0 siblings, 1 reply; 30+ messages in thread
From: Suthikulpanit, Suravee @ 2023-07-18  6:38 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Vasant Hegde, iommu, joro

Jason,

On 7/17/2023 7:23 AM, Jason Gunthorpe wrote:
> On Sun, Jul 16, 2023 at 08:22:37PM -0500, Suthikulpanit, Suravee wrote:
>>>>
>>>> ....
>>>>
>>>> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
>>>> index 346c32343417..907af6e41ae9 100644
>>>> --- a/drivers/iommu/amd/amd_iommu_types.h
>>>> +++ b/drivers/iommu/amd/amd_iommu_types.h
>>>> @@ -443,13 +443,11 @@
>>>>    #define MAX_DOMAIN_ID 65536
>>>>    /* Protection domain flags */
>>>> -#define PD_DMA_OPS_MASK		BIT(0) /* domain used for dma_ops */
>>>> -#define PD_DEFAULT_MASK		BIT(1) /* domain is a default dma_ops
>>>> -					      domain for an IOMMU */
>>>> -#define PD_PASSTHROUGH_MASK	BIT(2) /* domain has no page
>>>> -					      translation */
>>>> -#define PD_IOMMUV2_MASK		BIT(3) /* domain has gcr3 table */
>>>> -#define PD_GIOV_MASK		BIT(4) /* domain enable GIOV support */
>>>
>>>> +#define PD_FLAG_V1DMA		BIT(1) /* DMA-API mode w/ v1 page table */
>>>> +#define PD_FLAG_V2DMA		BIT(2) /* DMA-API mode w/ v2 page table */
>>>
>>> "DMA-API" should not be part of the driver vocabulary. Does this have
>>> anything to do with DMA-API?
>>>
>>> Only "paging" domains should have IO page tables and an IO page table
>>> type.
>>>
>>> Maybe this is just misnamed?
>>
>> These flags are for struct protection_domain in AMD IOMMU driver, and not
>> for the struct iommu_domain.
> 
> I understand that
> 
>> PD_FLAG_V[1|2]DMA flags are used to signify that the domain is using v1 or
>> v2 page table to implement IOMMU_DOMAIN_DMA (which has __IOMMU_DOMAIN_PAGING
>> | __IOMMU_DOMAIN_DMA_API).
> 
> Again, no, drivers should have no sensitivity to
> __IOMMU_DOMAIN_DMA_API, we are trying to get rid of this. Please don't
> add more to the driver.

Ok, no mention of "DMA" in the naming.

>>> Also, shouldn't it be an enum not flags?
>>
>> We use flags instead of an enum because we expect to have combinations of
>> the flags. For example:
>> * IOMMU_DOMAIN_DMA + amd_iommu=pgtbl_v1 = (PD_FLAG_V1DMA)
>> * IOMMU_DOMAIN_DMA + amd_iommu=pgtbl_v2 = (PD_FLAG_V2DMA | PD_FLAG_GIOV)
>> * IOMMU_DOMAIN_SVA = (PD_FLAG_SVA | PD_FLAG_GCR3)
>> * Domain for v2API = (PD_FLAG_V2API | PD_FLAG_GCR3)
> 
> You shouldn't have these distinctions. There should only be one enum
> that specifies that format the IOPTEs are, and the rest is implied.

Still not quite sure what you meant by "the rest is implied".

Anyhow, instead, we could introduce struct protection_domain.pd_mode as 
enum where:

enum amd_iommu_domain_pt_mode {
     AMD_IOMMU_PD_MODE_PT = 0, /* No page table setup */
     AMD_IOMMU_PD_MODE_V1,     /* IOMMU v1 page table */
     AMD_IOMMU_PD_MODE_V2,     /* IOMMU v2 page table */
};

Then we need a way to distinguish whether the v2 table is:
  * Managed by IOMMU driver for DMA-API, which supports only PASID=0.
  * Bound using the SVA interface
  * Bound using the AMD IOMMUv2 API

What if we introduce a struct protection_domain.gcr3_flags, and use it 
to track information for setting up DTE and GCR3 table for the following 
scenarios.

Considering an IOMMU group containing a PASID capable device.

CASE 1:
-------
When booting w/ "iommu=nopt amd_iommu=pgtable_v2", the device would be 
set up to support DMA-API with the AMD IOMMU v2 page table with PASID=0 
in the GCR3 table for this domain (i.e. DTE[GIOV]=1).

Here, we can set:
   pd_mode    = AMD_IOMMU_PD_MODE_V2;
   gcr3_flags = AMD_IOMMU_GCR3_FLAG_GIOV;   /* bit 0 */,

CASE 2:
-------
Following case 1, the device driver calls 
iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA) and iommu_sva_bind_device().

Here, we need to maintain DMA-API mapping from case 1. We would keep the 
GCR3 table from case 1, and bind IOMMU v2 page table from mm_struct to a 
non-zero PASID in the same GCR3 table for SVA support. This is possible 
because the kernel manages PASID-space starting from PASID=1.

Here, we can set:
   pd_mode    = AMD_IOMMU_PD_MODE_V2;
   gcr3_flags = (AMD_IOMMU_GCR3_FLAG_GIOV |		/* bit 0 */
                 AMD_IOMMU_GCR3_FLAG_MANAGED_PASID)	/* bit 1 */

Notes:
  - This would also allow non-PASID and PASID capable devices in the 
same IOMMU group to co-exist with full isolation support.

  - The AMD_IOMMU_GCR3_FLAG_MANAGED_PASID flag is cleared when the domain 
no longer contains SVA-enabled devices.

CASE 3:
-------
Current default behavior for IOMMU group with PASID/PRI capable device 
is to setup the device in pass-through mode.

Here, we can set
   pd_mode = AMD_IOMMU_PD_MODE_PT;

Subsequently, when calling amd_iommu_bind_pasid(), the IOMMU driver 
would set
   pd_mode    = AMD_IOMMU_PD_MODE_V2;
   gcr3_flags = AMD_IOMMU_GCR3_FLAG_UNMANAGED_PASID;	/* bit 2 */

Notes:
  - Since the kernel does not manage PASIDs for the v2 API, and the device 
driver is responsible for providing a PASID when calling 
amd_iommu_bind_pasid(), we cannot reserve PASID 0 for non-PASID devices. 
Therefore, devices in this group would set DTE[GIOV]=0.

  - We cannot have case 1 or 2 with case 3 in the same IOMMU group since 
GCR3 table is currently per-group/domain (i.e. one default domain 
per-group).

  - TBD: Despite the upcoming SVA support, we still need to maintain 
support for the existing AMD v2 API (which we might consider deprecating 
in the future and putting in bug-fix mode only), and we cannot break 
existing device drivers.

> GCR3 is the "PASID table" right? it should not be part of the
> iommu_domain struct either, SMMUv3 is busy undoing this mistake in
> their design right now.

GCR3 table is the same as the PASID table.

Could you please elaborate on this part if it should not be as part of 
the domain? What's the new design for SMMUv3?

>> It helps keeping track of how the IOMMU driver needs to setup DTEs for
>> device in a particular domain.
> 
> Yes, I understand what it is for. It is important that the right data
> be stored in the right places or you will have to redo it all. If it
> is not directly related to describing the format of the IOPTEs then
> it has no business being in the iommu_domain struct or its driver
> variants like protection_domain. It belongs someplace else.

For AMD IOMMU, the GCR3 table is set up in the DTE for each device. So, 
technically, it can be set up per device. However, devices in the same 
domain can share the same mapping (e.g. a domain w/ multiple non-PASID 
capable devices would share the same IOMMU v2 page table w/ PASID=0 for 
DMA-API support). This is why we have struct protection_domain.gcr3_tbl 
to avoid duplicate GCR3 table for these devices.

Alternatively, If we don't care about the additional memory to setup 
per-device GCR3 table, we could move gcr3_tbl, gcr3_flags, and any 
related variables to the per-device struct iommu_dev_data, which 
contains device-specific data for AMD IOMMU driver.

>>>> +#define PD_FLAG_GIOV		BIT(4) /* domain enable GIOV support */
>>>
>>> This really shouldn't be a iommu_domain flag, this should be computed
>>> based on what domains are attached where
>>>
>>> Eg this seems like it should be triggered by attaching identity to the
>>> RID with PASID enabled or something like that.

Actually, I am not sure that I understand this point.

>>> Why do you need to set it for the legacy GPU path?

This should not have anything to do with the legacy GPU.

> The PD_FLAG_GIOV is necessary when we set up the v2 page table for
> devices that cannot support PASID. With the DTE[GIOV] bit set, the IOMMU
> hardware treats requests from these devices as if they used PASID 0.
> 
> I understand that, but you are supposed to deduce this from having a
> RID-bound page table installed into a PASID 0 slot by other logic; it
> is very much not part of the domain itself.

IIUC, the scenarios above should clearly describe when the DTE[GIOV] 
setting is determined, and how it is tracked via gcr3_tbl. This 
information is used when setting up the DTE for each device.

> (as well as all the other variations of this flag, e.g. when you set an
> IDENTITY domain on the RID while having PASID operating)

I am not sure that I understand this point.

Please let me know if I am missing something, or if you have different 
ideas to share.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags
  2023-07-18  6:38         ` Suthikulpanit, Suravee
@ 2023-07-18 12:36           ` Jason Gunthorpe
  2023-07-28  5:43             ` Vasant Hegde
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-07-18 12:36 UTC (permalink / raw)
  To: Suthikulpanit, Suravee; +Cc: Vasant Hegde, iommu, joro

On Mon, Jul 17, 2023 at 11:38:44PM -0700, Suthikulpanit, Suravee wrote:

> Then we need a way to distinguish whether the v2 table is:
>  * Managed by IOMMU driver for DMA-API, which supports only PASID=0.

There is not even such a thing.

The driver should have an iommu domain. It should have an indication
if it is PAGING or BLOCKING/IDENTITY. For PAGING domains you have an
indication of the IOPTE format.

"DMA-API" is a PAGING domain with v1 or v2 IOPTEs
UNMANAGED is the same, a PAGING domain with v1 or v2 IOPTEs
SVA is a "PAGING" domain with v2 IOPTEs
"AMD IOMMUv2 API" is PAGING domain with v2 IOPTEs

The core code provides two actions for the driver, attach domain to
RID, attach domain to PASID.

If you attach a V1 PAGING domain to a RID then you have to put it in
the DTE

If you attach a V2 PAGING domain to a RID then you have to put it in
PASID 0 within a shared GCR3 table. The GCR3 table is NOT part of the
struct protection_domain.

The decision on how to configure the DTE/GCR3 is based only on *how and
what domains are attached*.

Doing this wrong is why the driver has that weird mess with trying to
have two different versions of the identity domain.

> What if we introduce a struct protection_domain.gcr3_flags, and use it to
> track information for setting up DTE and GCR3 table for the following
> scenarios.

It is still logically wrong, the GCR3 table has nothing to do with
the driver's iommu_domain. The iommu domain is only about the IOPTEs
contained within it, not the GCR3 table.

> CASE 1:
> -------
> When booting w/ "iommu=nopt amd_iommu=pgtable_v2", the device would be set
> up to support DMA-API with the AMD IOMMU v2 page table with PASID=0 in the
> GCR3 table for this domain (i.e. DTE[GIOV]=1).
> 
> Here, we can set:
>   pd_mode    = AMD_IOMMU_PD_MODE_V2;
>   gcr3_flags = AMD_IOMMU_GCR3_FLAG_GIOV;   /* bit 0 */,

As above, the alloc_domain will return a PAGING domain with v2 IOPTEs.

The core will request to install it in the RID

The driver will allocate a shared GCR3 "PASID table" for the RID and
install that in the DTE. Then it will install the PAGING domain in the
PASID0 slot.

> CASE 2:
> -------
> Following case 1, the device driver calls
> iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA) and iommu_sva_bind_device().
>
> Here, we need to maintain DMA-API mapping from case 1. We would keep the
> GCR3 table from case 1, and bind IOMMU v2 page table from mm_struct to a
> non-zero PASID in the same GCR3 table for SVA support. This is possible
> because the kernel manages PASID-space starting from PASID=1.

Again no issue, the core code does alloc_sva, it returns a "PAGING"
domain with v2 IOPTE format. It does a PASID attach. The driver
ensures the shared GCR3 table is loaded to the DTE and then assigns
the SVA domain to the correct PASID slot.

>  - The AMD_IOMMU_GCR3_FLAG_MANAGED_PASID flag is cleared when the domain no
> longer contains SVA-enabled devices.

There is no such thing as "a domain contains devices"
 
> CASE 3:
> -------
> Current default behavior for IOMMU group with PASID/PRI capable device is to
> setup the device in pass-through mode.
> 
> Here, we can set
>   pd_mode = AMD_IOMMU_PD_MODE_PT;
> 
> Subsequently, when calling amd_iommu_bind_pasid(), the IOMMU driver would
> set
>   pd_mode    = AMD_IOMMU_PD_MODE_V2;
>   gcr3_flags = AMD_IOMMU_GCR3_FLAG_UNMANAGED_PASID;	/* bit 2 */

The core code will call alloc_domain with IDENTITY and attach it to
the RID.

This should be a DTE setup for IDENTITY.

Later something will alloc_domain an UNMANAGED domain for a PASID != 0 
with v2 IOPTEs.

When the UNMANAGED domain is assigned to the PASID the driver asserts
the shared GCR3 table is loaded. Since the RID is attached to IDENTITY
it computes a DTE that keeps IDENTITY working (GIOV=0, right?). It
then loads the V2 UNMANAGED domain to the correct PASID in the shared
GCR3.

>  - Since the kernel does not manage PASIDs for the v2 API, and the device
> driver is responsible for providing a PASID when calling
> amd_iommu_bind_pasid(), we cannot reserve PASID 0 for non-PASID devices.
> Therefore, devices in this group would set DTE[GIOV]=0.

The core code works this way too. PASID0 should be kept reserved in
both APIs to make RID devices easier. Driver should refuse to attach a
domain to PASID 0.

>  - We cannot have case 1 or 2 with case 3 in the same IOMMU group since GCR3
> table is currently per-group/domain (i.e. one default domain per-group).

This is the same design mistake smmuv3 made. The GCR3 table must not
be part of the iommu_domain otherwise you end up being stuck and
unable to replace the RID domain, which is breaking the API contract
from the core code.

The GCR3 table must be shared, and must not be part of the
iommu_domain related memory.

Once done like this you can freely mix cases 1/2/3 dynamically
because the GCR3 table is not stored in the wrong place.

>  - TBD: Despite the upcoming SVA support, we still need to maintain support
> for the existing AMD v2 API (which we might consider deprecating in the
> future and putting in bug-fix mode only), and we cannot break existing
> device drivers.

Er, no, you need to eliminate the AMD v2 API as part of this work. We
cannot have a private API that duplicates the normal core API that is
only used by AMD GPU. This is the kernel, so we don't maintain
"APIs". You update the in-tree AMDGPU driver to use the core API and
then you purge the old API.

> > GCR3 is the "PASID table" right? it should not be part of the
> > iommu_domain struct either, SMMUv3 is busy undoing this mistake in
> > their design right now.
> 
> GCR3 table is the same as the PASID table.
>
> Could you please elaborate on this part if it should not be as part of the
> domain? What's the new design for SMMUv3?

There is a long thread, all the same concepts apply here, the
underlying HW flow is basically the same, just with different names
(STE=DTE, CD=GCR3, S2=V1, S1=V2)

https://lore.kernel.org/linux-iommu/20230621063825.268890-1-mshavit@google.com

And prior postings.

> However, devices in the same domain can share the same mapping
> (e.g. a domain w/ multiple non-PASID capable devices would share the
> same IOMMU v2 page table w/ PASID=0 for DMA-API support). This is
> why we have struct protection_domain.gcr3_tbl to avoid duplicate
> GCR3 table for these devices.

The domain is the wrong place to put this. If you want to share the
GCR3 table between DTEs in the same group then use
iommu_group_get_iommudata() to associate the shared table with the
group.

> Alternatively, If we don't care about the additional memory to setup
> per-device GCR3 table, we could move gcr3_tbl, gcr3_flags, and any related
> variables to the per-device struct iommu_dev_data, which contains
> device-specific data for AMD IOMMU driver.

This is what Michael is proposing for SMMUv3.

Keep in mind that the core code now basically blocks use of PASID for
typical multi-device group scenarios. So you should think carefully if
AMD actually has a use case for multiple DTEs sharing the same GCR3
table that is worth optimizing for.

> > > > > +#define PD_FLAG_GIOV		BIT(4) /* domain enable GIOV support */
> > > > 
> > > > This really shouldn't be a iommu_domain flag, this should be computed
> > > > based on what domains are attached where
> > > > 
> > > > Eg this seems like it should be triggered by attaching identity to the
> > > > RID with PASID enabled or something like that.
> 
> Actually, I am not sure that I understand this point.

GIOV controls what happens to PASID-less TLPs.

One option is to translate through the DTE, which for the
kernel-owned configuration is IDENTITY.

Another option is to translate through the GCR3 PASID 0 v2 PAGING
domain.

The driver selects the correct GIOV value based on what is attached to
the RID. IDENTITY or PAGING will decide what GIOV should be.

Jason

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags
  2023-07-18 12:36           ` Jason Gunthorpe
@ 2023-07-28  5:43             ` Vasant Hegde
  0 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-07-28  5:43 UTC (permalink / raw)
  To: Jason Gunthorpe, Suthikulpanit, Suravee; +Cc: iommu, joro

Hi Jason,


On 7/18/2023 6:06 PM, Jason Gunthorpe wrote:
> On Mon, Jul 17, 2023 at 11:38:44PM -0700, Suthikulpanit, Suravee wrote:
> 
>> Then we need a way to distinguish whether the v2 table is:
>>  * Managed by IOMMU driver for DMA-API, which supports only PASID=0.
> 
> There is not even such a thing.
> 
> The driver should have an iommu domain. It should have an indication
> if it is PAGING or BLOCKING/IDENTITY. For PAGING domains you have an
> indication of the IOPTE format.
> 
> "DMA-API" is a PAGING domain with v1 or v2 IOPTEs
> UNMANAGED is the same, a PAGING domain with v1 or v2 IOPTEs
> SVA is a "PAGING" domain with v2 IOPTEs
> "AMD IOMMUv2 API" is PAGING domain with v2 IOPTEs
> 
> The core code provides two actions for the driver, attach domain to
> RID, attach domain to PASID.
> 
> If you attach a V1 PAGING domain to a RID then you have to put it in
> the DTE
> 
> If you attach a V2 PAGING domain to a RID then you have to put it in
> PASID 0 within a shared GCR3 table. The GCR3 table is NOT part of the
> struct protection_domain.
> 
> The decision on how to configure the DTE/GCR3 is based only on *how and
> what domains are attached*.
> 
> Doing this wrong is why the driver had that weird mess of trying to
> have two different versions of the identity domain.
> 
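Read as code, the two attach rules quoted above might look like the following sketch. The structures and names (`device_cfg`, `attach_paging_to_rid`, the table layouts) are hypothetical stand-ins, not the actual driver's:

```c
#include <stddef.h>

enum iopte_format { IOPTE_V1, IOPTE_V2 };

/* Hypothetical per-device state; real DTE/GCR3 layouts differ. */
struct dev_table_entry { const void *pt_root; };
struct gcr3_table      { const void *pasid_ptr[8]; };

struct device_cfg {
	struct dev_table_entry dte;
	struct gcr3_table      gcr3;	/* shared per device, not per domain */
};

/*
 * Attaching a PAGING domain to the RID: v1 page tables go straight
 * into the DTE; v2 page tables go into PASID 0 of the device's shared
 * GCR3 table, which lives outside struct protection_domain.
 */
static void attach_paging_to_rid(struct device_cfg *cfg,
				 enum iopte_format fmt, const void *pgd)
{
	if (fmt == IOPTE_V1)
		cfg->dte.pt_root = pgd;
	else
		cfg->gcr3.pasid_ptr[0] = pgd;
}
```

The dispatch depends only on the IOPTE format of the attached domain, matching the "decision is based on what is attached" rule.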

Thanks for the detailed explanation. We are working on moving the GCR3 table from
the protection domain structure to a per-device structure. We will post the patches soon.

-Vasant



end of thread, other threads:[~2023-07-28  5:43 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-12 14:14 [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Vasant Hegde
2023-07-12 14:14 ` [PATCH 01/21] iommu/amd: Remove unused amd_io_pgtable.pt_root variable Vasant Hegde
2023-07-12 14:14 ` [PATCH 02/21] iommu/amd: Consolidate timeout pre-define to amd_iommu_type.h Vasant Hegde
2023-07-12 14:14 ` [PATCH 03/21] iommu/amd: Consolidate logic to allocate protection domain Vasant Hegde
2023-07-12 14:14 ` [PATCH 04/21] iommu/amd: Refactor protection domain allocation code Vasant Hegde
2023-07-12 14:15 ` [PATCH 05/21] iommu/amd/iommu_v2: Use protection_domain in struct device_state Vasant Hegde
2023-07-12 14:15 ` [PATCH 06/21] iommu/amd: Introduce helper functions for managing GCR3 table Vasant Hegde
2023-07-12 14:15 ` [PATCH 07/21] iommu/amd: Use struct protection_domain in helper functions Vasant Hegde
2023-07-14 16:57   ` Jason Gunthorpe
2023-07-12 14:15 ` [PATCH 08/21] iommu/amd: Do not set amd_iommu_pgtable in pass-through mode Vasant Hegde
2023-07-12 14:15 ` [PATCH 09/21] iommu/amd: Miscellaneous clean up when free domain Vasant Hegde
2023-07-12 14:15 ` [PATCH 10/21] iommu/amd: Modify logic for checking GT and PPR features Vasant Hegde
2023-07-12 14:15 ` [PATCH 11/21] iommu/amd: Clean up and enhance protection domain flags Vasant Hegde
2023-07-14 17:25   ` Jason Gunthorpe
2023-07-17  1:22     ` Suthikulpanit, Suravee
2023-07-17 12:23       ` Jason Gunthorpe
2023-07-18  6:38         ` Suthikulpanit, Suravee
2023-07-18 12:36           ` Jason Gunthorpe
2023-07-28  5:43             ` Vasant Hegde
2023-07-12 14:15 ` [PATCH 12/21] iommu/amd: Use protection_domain.flags to check page table mode Vasant Hegde
2023-07-12 14:15 ` [PATCH 13/21] iommu/amd: Consolidate logic for setting up v2 API mode Vasant Hegde
2023-07-12 14:15 ` [PATCH 14/21] iommu/amd: Refactor domain allocation code for v2DMA mode Vasant Hegde
2023-07-12 14:15 ` [PATCH 15/21] iommu/amd: Setup v2 domain with max pasids Vasant Hegde
2023-07-12 14:15 ` [PATCH 16/21] iommu/amd: Introduce iommu_dev_data.flags to track device capabilities Vasant Hegde
2023-07-12 14:15 ` [PATCH 17/21] iommu/amd: Rename ats related variables Vasant Hegde
2023-07-12 14:15 ` [PATCH 18/21] iommu/amd: Enable device ATS/PASID/PRI capabilities independently Vasant Hegde
2023-07-12 14:15 ` [PATCH 19/21] iommu/amd: Delay enabling PRI to after initializing domain Vasant Hegde
2023-07-12 14:15 ` [PATCH 20/21] iommu/amd: Initialize iommu_device->max_pasids Vasant Hegde
2023-07-12 14:15 ` [PATCH 21/21] iommu/amd: Simplify amd_iommu_device_info() Vasant Hegde
2023-07-14 17:53 ` [PATCH 00/21] iommu/amd: Preparation for SVA Support (Part 1 of 2) Jason Gunthorpe
