linux-s390.vger.kernel.org archive mirror
* [PATCH v5 0/5] iommu/s390: support additional table regions
@ 2025-04-11 20:24 Matthew Rosato
  2025-04-11 20:24 ` [PATCH v5 1/5] iommu/s390: set appropriate IOTA region type Matthew Rosato
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Matthew Rosato @ 2025-04-11 20:24 UTC (permalink / raw)
  To: joro, will, robin.murphy, gerald.schaefer, schnelle
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

The series extends the maximum table size allowed by s390-iommu by
increasing the number of table regions supported.  It also adds logic to
construct the table using the minimum number of regions based upon the
aperture calculation.
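
For orientation, the coverage of each table type works out roughly as
follows (a back-of-the-envelope sketch based on the shift and size
definitions in arch/s390/include/asm/pci_dma.h; the 2048-entry table
size is an assumption derived from ZPCI_TABLE_BITS = 11):

	/*
	 * page table:     256 entries * 4 KiB =  1 MiB per segment entry
	 * segment table: 2048 entries * 1 MiB =  2 GiB per region third entry
	 * region third:  2048 entries * 2 GiB =  4 TiB (ZPCI_TABLE_SIZE_RT, 1UL << 42)
	 * region second: 2048 entries * 4 TiB =  8 PiB (ZPCI_TABLE_SIZE_RS, 1UL << 53)
	 * region first:  2048 entries * 8 PiB = full 64-bit IOVA space
	 */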

Changes for v5:
- fix GFP_KERNEL under spinlock in patch 5

Changes for v4:
- rebase onto master/merge window
- convert patch 3 per Niklas to remove gotos
- convert patch 4 to also remove gotos
- add review tags

Changes for v3:
- rebase onto iommu-next
- move IOTA region type setting into s390-iommu
- remove origin_type and max_table_size from zdev
- adjust reserved region calculation to be dependent on the domain

Changes for v2:
- rebase onto 6.13
- remove 'iommu/s390: add basic routines for region 1st and 2nd tables'
  and put routines in first patch that uses each.  No functional change.

Matthew Rosato (5):
  iommu/s390: set appropriate IOTA region type
  iommu/s390: support cleanup of additional table regions
  iommu/s390: support iova_to_phys for additional table regions
  iommu/s390: support map/unmap for additional table regions
  iommu/s390: allow larger region tables

 arch/s390/include/asm/pci_dma.h |   3 +
 drivers/iommu/s390-iommu.c      | 345 ++++++++++++++++++++++++++++----
 2 files changed, 314 insertions(+), 34 deletions(-)

-- 
2.49.0


* [PATCH v5 1/5] iommu/s390: set appropriate IOTA region type
  2025-04-11 20:24 [PATCH v5 0/5] iommu/s390: support additional table regions Matthew Rosato
@ 2025-04-11 20:24 ` Matthew Rosato
  2025-04-11 20:24 ` [PATCH v5 2/5] iommu/s390: support cleanup of additional table regions Matthew Rosato
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Matthew Rosato @ 2025-04-11 20:24 UTC (permalink / raw)
  To: joro, will, robin.murphy, gerald.schaefer, schnelle
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

When registering the I/O Translation Anchor, use the current table type
stored in the s390_domain to set the appropriate region type
indication.  For the moment, the table type will always be stored as
region third.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 drivers/iommu/s390-iommu.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)
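
As an illustration (a sketch, not part of the diff below): once the
later patches in the series allow region second tables, registering
such a domain boils down to

	/* sketch only, assuming an RSX-type dma_table */
	iota = virt_to_phys(s390_domain->dma_table) | ZPCI_IOTA_RSTO_FLAG;
	rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
				iota, status);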

diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index e1c76e0f9c2b..cad032d4c9a6 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -31,6 +31,7 @@ struct s390_domain {
 	unsigned long		*dma_table;
 	spinlock_t		list_lock;
 	struct rcu_head		rcu;
+	u8			origin_type;
 };
 
 static struct iommu_domain blocking_domain;
@@ -345,6 +346,7 @@ static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
 	s390_domain->domain.geometry.force_aperture = true;
 	s390_domain->domain.geometry.aperture_start = 0;
 	s390_domain->domain.geometry.aperture_end = ZPCI_TABLE_SIZE_RT - 1;
+	s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
 
 	spin_lock_init(&s390_domain->list_lock);
 	INIT_LIST_HEAD_RCU(&s390_domain->devices);
@@ -381,6 +383,21 @@ static void zdev_s390_domain_update(struct zpci_dev *zdev,
 	spin_unlock_irqrestore(&zdev->dom_lock, flags);
 }
 
+static u64 get_iota_region_flag(struct s390_domain *domain)
+{
+	switch (domain->origin_type) {
+	case ZPCI_TABLE_TYPE_RTX:
+		return ZPCI_IOTA_RTTO_FLAG;
+	case ZPCI_TABLE_TYPE_RSX:
+		return ZPCI_IOTA_RSTO_FLAG;
+	case ZPCI_TABLE_TYPE_RFX:
+		return ZPCI_IOTA_RFTO_FLAG;
+	default:
+		WARN_ONCE(1, "Invalid IOMMU table (%x)\n", domain->origin_type);
+		return 0;
+	}
+}
+
 static int s390_iommu_domain_reg_ioat(struct zpci_dev *zdev,
 				      struct iommu_domain *domain, u8 *status)
 {
@@ -399,7 +416,7 @@ static int s390_iommu_domain_reg_ioat(struct zpci_dev *zdev,
 	default:
 		s390_domain = to_s390_domain(domain);
 		iota = virt_to_phys(s390_domain->dma_table) |
-		       ZPCI_IOTA_RTTO_FLAG;
+		       get_iota_region_flag(s390_domain);
 		rc = zpci_register_ioat(zdev, 0, zdev->start_dma,
 					zdev->end_dma, iota, status);
 	}
-- 
2.49.0


* [PATCH v5 2/5] iommu/s390: support cleanup of additional table regions
  2025-04-11 20:24 [PATCH v5 0/5] iommu/s390: support additional table regions Matthew Rosato
  2025-04-11 20:24 ` [PATCH v5 1/5] iommu/s390: set appropriate IOTA region type Matthew Rosato
@ 2025-04-11 20:24 ` Matthew Rosato
  2025-04-11 20:24 ` [PATCH v5 3/5] iommu/s390: support iova_to_phys for " Matthew Rosato
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Matthew Rosato @ 2025-04-11 20:24 UTC (permalink / raw)
  To: joro, will, robin.murphy, gerald.schaefer, schnelle
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

Extend the existing dma_cleanup_tables to also handle region second and
region first tables.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 drivers/iommu/s390-iommu.c | 71 ++++++++++++++++++++++++++++++++++----
 1 file changed, 64 insertions(+), 7 deletions(-)
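
For a region first (RFX) domain, the teardown added here nests roughly
as follows (a sketch of the call structure, not a verbatim excerpt):

	/*
	 * dma_cleanup_tables(domain)
	 *   for each valid region first entry:
	 *     dma_free_rs_table()
	 *       for each valid region second entry:
	 *         dma_free_rt_table()
	 *           for each valid region third entry:
	 *             dma_free_seg_table()
	 *   dma_free_cpu_table(domain->dma_table)
	 */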

diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index cad032d4c9a6..f2cda0ce0fe9 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -121,6 +121,22 @@ static inline int pt_entry_isvalid(unsigned long entry)
 	return (entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID;
 }
 
+static inline unsigned long *get_rf_rso(unsigned long entry)
+{
+	if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_RFX)
+		return phys_to_virt(entry & ZPCI_RTE_ADDR_MASK);
+	else
+		return NULL;
+}
+
+static inline unsigned long *get_rs_rto(unsigned long entry)
+{
+	if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_RSX)
+		return phys_to_virt(entry & ZPCI_RTE_ADDR_MASK);
+	else
+		return NULL;
+}
+
 static inline unsigned long *get_rt_sto(unsigned long entry)
 {
 	if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_RTX)
@@ -192,18 +208,59 @@ static void dma_free_seg_table(unsigned long entry)
 	dma_free_cpu_table(sto);
 }
 
-static void dma_cleanup_tables(unsigned long *table)
+static void dma_free_rt_table(unsigned long entry)
 {
+	unsigned long *rto = get_rs_rto(entry);
 	int rtx;
 
-	if (!table)
+	for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++)
+		if (reg_entry_isvalid(rto[rtx]))
+			dma_free_seg_table(rto[rtx]);
+
+	dma_free_cpu_table(rto);
+}
+
+static void dma_free_rs_table(unsigned long entry)
+{
+	unsigned long *rso = get_rf_rso(entry);
+	int rsx;
+
+	for (rsx = 0; rsx < ZPCI_TABLE_ENTRIES; rsx++)
+		if (reg_entry_isvalid(rso[rsx]))
+			dma_free_rt_table(rso[rsx]);
+
+	dma_free_cpu_table(rso);
+}
+
+static void dma_cleanup_tables(struct s390_domain *domain)
+{
+	int rtx, rsx, rfx;
+
+	if (!domain->dma_table)
 		return;
 
-	for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++)
-		if (reg_entry_isvalid(table[rtx]))
-			dma_free_seg_table(table[rtx]);
+	switch (domain->origin_type) {
+	case ZPCI_TABLE_TYPE_RFX:
+		for (rfx = 0; rfx < ZPCI_TABLE_ENTRIES; rfx++)
+			if (reg_entry_isvalid(domain->dma_table[rfx]))
+				dma_free_rs_table(domain->dma_table[rfx]);
+		break;
+	case ZPCI_TABLE_TYPE_RSX:
+		for (rsx = 0; rsx < ZPCI_TABLE_ENTRIES; rsx++)
+			if (reg_entry_isvalid(domain->dma_table[rsx]))
+				dma_free_rt_table(domain->dma_table[rsx]);
+		break;
+	case ZPCI_TABLE_TYPE_RTX:
+		for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++)
+			if (reg_entry_isvalid(domain->dma_table[rtx]))
+				dma_free_seg_table(domain->dma_table[rtx]);
+		break;
+	default:
+		WARN_ONCE(1, "Invalid IOMMU table (%x)\n", domain->origin_type);
+		return;
+	}
 
-	dma_free_cpu_table(table);
+	dma_free_cpu_table(domain->dma_table);
 }
 
 static unsigned long *dma_alloc_page_table(gfp_t gfp)
@@ -358,7 +415,7 @@ static void s390_iommu_rcu_free_domain(struct rcu_head *head)
 {
 	struct s390_domain *s390_domain = container_of(head, struct s390_domain, rcu);
 
-	dma_cleanup_tables(s390_domain->dma_table);
+	dma_cleanup_tables(s390_domain);
 	kfree(s390_domain);
 }
 
-- 
2.49.0


* [PATCH v5 3/5] iommu/s390: support iova_to_phys for additional table regions
  2025-04-11 20:24 [PATCH v5 0/5] iommu/s390: support additional table regions Matthew Rosato
  2025-04-11 20:24 ` [PATCH v5 1/5] iommu/s390: set appropriate IOTA region type Matthew Rosato
  2025-04-11 20:24 ` [PATCH v5 2/5] iommu/s390: support cleanup of additional table regions Matthew Rosato
@ 2025-04-11 20:24 ` Matthew Rosato
  2025-04-11 20:24 ` [PATCH v5 4/5] iommu/s390: support map/unmap " Matthew Rosato
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Matthew Rosato @ 2025-04-11 20:24 UTC (permalink / raw)
  To: joro, will, robin.murphy, gerald.schaefer, schnelle
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

The origin_type of the dma_table is used to determine how many table
levels must be traversed for the translation.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/pci_dma.h |  2 ++
 drivers/iommu/s390-iommu.c      | 60 ++++++++++++++++++++++++++++++++-
 2 files changed, 61 insertions(+), 1 deletion(-)
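
As a worked example with illustrative numbers (assuming 11-bit table
indices and 4K pages): for an RSX-type domain and iova = 5 TiB
(0x50000000000), the new lookup resolves as

	/*
	 * rsx = (iova >> ZPCI_RS_SHIFT) & ZPCI_INDEX_MASK = 1
	 * rtx = (iova >> ZPCI_RT_SHIFT) & ZPCI_INDEX_MASK = 512
	 * sx  = (iova >> ZPCI_ST_SHIFT) & ZPCI_INDEX_MASK = 0
	 * px  = page-table index within the segment       = 0
	 *
	 * get_rto_from_iova() follows dma_table[1] to the region third
	 * table; the existing rtx/sx/px walk then proceeds unchanged.
	 */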

diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
index 42d7cc4262ca..8d8962e4fd58 100644
--- a/arch/s390/include/asm/pci_dma.h
+++ b/arch/s390/include/asm/pci_dma.h
@@ -55,6 +55,8 @@ enum zpci_ioat_dtype {
 #define ZPCI_PT_BITS			8
 #define ZPCI_ST_SHIFT			(ZPCI_PT_BITS + PAGE_SHIFT)
 #define ZPCI_RT_SHIFT			(ZPCI_ST_SHIFT + ZPCI_TABLE_BITS)
+#define ZPCI_RS_SHIFT			(ZPCI_RT_SHIFT + ZPCI_TABLE_BITS)
+#define ZPCI_RF_SHIFT			(ZPCI_RS_SHIFT + ZPCI_TABLE_BITS)
 
 #define ZPCI_RTE_FLAG_MASK		0x3fffUL
 #define ZPCI_RTE_ADDR_MASK		(~ZPCI_RTE_FLAG_MASK)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index f2cda0ce0fe9..338a7381e918 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -36,6 +36,16 @@ struct s390_domain {
 
 static struct iommu_domain blocking_domain;
 
+static inline unsigned int calc_rfx(dma_addr_t ptr)
+{
+	return ((unsigned long)ptr >> ZPCI_RF_SHIFT) & ZPCI_INDEX_MASK;
+}
+
+static inline unsigned int calc_rsx(dma_addr_t ptr)
+{
+	return ((unsigned long)ptr >> ZPCI_RS_SHIFT) & ZPCI_INDEX_MASK;
+}
+
 static inline unsigned int calc_rtx(dma_addr_t ptr)
 {
 	return ((unsigned long)ptr >> ZPCI_RT_SHIFT) & ZPCI_INDEX_MASK;
@@ -759,6 +769,51 @@ static int s390_iommu_map_pages(struct iommu_domain *domain,
 	return rc;
 }
 
+static unsigned long *get_rso_from_iova(struct s390_domain *domain,
+					dma_addr_t iova)
+{
+	unsigned long *rfo;
+	unsigned long rfe;
+	unsigned int rfx;
+
+	switch (domain->origin_type) {
+	case ZPCI_TABLE_TYPE_RFX:
+		rfo = domain->dma_table;
+		rfx = calc_rfx(iova);
+		rfe = READ_ONCE(rfo[rfx]);
+		if (!reg_entry_isvalid(rfe))
+			return NULL;
+		return get_rf_rso(rfe);
+	case ZPCI_TABLE_TYPE_RSX:
+		return domain->dma_table;
+	default:
+		return NULL;
+	}
+}
+
+static unsigned long *get_rto_from_iova(struct s390_domain *domain,
+					dma_addr_t iova)
+{
+	unsigned long *rso;
+	unsigned long rse;
+	unsigned int rsx;
+
+	switch (domain->origin_type) {
+	case ZPCI_TABLE_TYPE_RFX:
+	case ZPCI_TABLE_TYPE_RSX:
+		rso = get_rso_from_iova(domain, iova);
+		rsx = calc_rsx(iova);
+		rse = READ_ONCE(rso[rsx]);
+		if (!reg_entry_isvalid(rse))
+			return NULL;
+		return get_rs_rto(rse);
+	case ZPCI_TABLE_TYPE_RTX:
+		return domain->dma_table;
+	default:
+		return NULL;
+	}
+}
+
 static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
 					   dma_addr_t iova)
 {
@@ -772,10 +827,13 @@ static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
 	    iova > domain->geometry.aperture_end)
 		return 0;
 
+	rto = get_rto_from_iova(s390_domain, iova);
+	if (!rto)
+		return 0;
+
 	rtx = calc_rtx(iova);
 	sx = calc_sx(iova);
 	px = calc_px(iova);
-	rto = s390_domain->dma_table;
 
 	rte = READ_ONCE(rto[rtx]);
 	if (reg_entry_isvalid(rte)) {
-- 
2.49.0


* [PATCH v5 4/5] iommu/s390: support map/unmap for additional table regions
  2025-04-11 20:24 [PATCH v5 0/5] iommu/s390: support additional table regions Matthew Rosato
                   ` (2 preceding siblings ...)
  2025-04-11 20:24 ` [PATCH v5 3/5] iommu/s390: support iova_to_phys for " Matthew Rosato
@ 2025-04-11 20:24 ` Matthew Rosato
  2025-04-16  9:19   ` Niklas Schnelle
  2025-04-11 20:24 ` [PATCH v5 5/5] iommu/s390: allow larger region tables Matthew Rosato
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Matthew Rosato @ 2025-04-11 20:24 UTC (permalink / raw)
  To: joro, will, robin.murphy, gerald.schaefer, schnelle
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

Map and unmap ops use the shared dma_walk_cpu_trans routine; update
it to use the origin_type of the dma_table to determine how many
table levels must be walked.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 drivers/iommu/s390-iommu.c | 127 ++++++++++++++++++++++++++++++++++---
 1 file changed, 119 insertions(+), 8 deletions(-)
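
For a region first (RFX) domain, the extended walk effectively chains
the helpers below as follows (a sketch, not a verbatim excerpt; missing
region second/third tables are allocated on demand and installed with
cmpxchg(), mirroring the existing segment/page table handling):

	/*
	 * dma_walk_cpu_trans(domain, dma_addr, gfp)
	 *   rto = dma_walk_region_tables()   RFX: dma_walk_rf_table()
	 *                                          -> dma_walk_rs_table()
	 *   sto = dma_get_seg_table_origin(&rto[rtx], gfp)
	 *   pto = dma_get_page_table_origin(&sto[sx], gfp)
	 *   return &pto[px]
	 */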

diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 338a7381e918..46f45b136993 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -67,6 +67,20 @@ static inline void set_pt_pfaa(unsigned long *entry, phys_addr_t pfaa)
 	*entry |= (pfaa & ZPCI_PTE_ADDR_MASK);
 }
 
+static inline void set_rf_rso(unsigned long *entry, phys_addr_t rso)
+{
+	*entry &= ZPCI_RTE_FLAG_MASK;
+	*entry |= (rso & ZPCI_RTE_ADDR_MASK);
+	*entry |= ZPCI_TABLE_TYPE_RFX;
+}
+
+static inline void set_rs_rto(unsigned long *entry, phys_addr_t rto)
+{
+	*entry &= ZPCI_RTE_FLAG_MASK;
+	*entry |= (rto & ZPCI_RTE_ADDR_MASK);
+	*entry |= ZPCI_TABLE_TYPE_RSX;
+}
+
 static inline void set_rt_sto(unsigned long *entry, phys_addr_t sto)
 {
 	*entry &= ZPCI_RTE_FLAG_MASK;
@@ -81,6 +95,22 @@ static inline void set_st_pto(unsigned long *entry, phys_addr_t pto)
 	*entry |= ZPCI_TABLE_TYPE_SX;
 }
 
+static inline void validate_rf_entry(unsigned long *entry)
+{
+	*entry &= ~ZPCI_TABLE_VALID_MASK;
+	*entry &= ~ZPCI_TABLE_OFFSET_MASK;
+	*entry |= ZPCI_TABLE_VALID;
+	*entry |= ZPCI_TABLE_LEN_RFX;
+}
+
+static inline void validate_rs_entry(unsigned long *entry)
+{
+	*entry &= ~ZPCI_TABLE_VALID_MASK;
+	*entry &= ~ZPCI_TABLE_OFFSET_MASK;
+	*entry |= ZPCI_TABLE_VALID;
+	*entry |= ZPCI_TABLE_LEN_RSX;
+}
+
 static inline void validate_rt_entry(unsigned long *entry)
 {
 	*entry &= ~ZPCI_TABLE_VALID_MASK;
@@ -286,6 +316,70 @@ static unsigned long *dma_alloc_page_table(gfp_t gfp)
 	return table;
 }
 
+static unsigned long *dma_walk_rs_table(unsigned long *rso,
+					dma_addr_t dma_addr, gfp_t gfp)
+{
+	unsigned int rsx = calc_rsx(dma_addr);
+	unsigned long old_rse, rse;
+	unsigned long *rsep, *rto;
+
+	rsep = &rso[rsx];
+	rse = READ_ONCE(*rsep);
+	if (reg_entry_isvalid(rse)) {
+		rto = get_rs_rto(rse);
+	} else {
+		rto = dma_alloc_cpu_table(gfp);
+		if (!rto)
+			return NULL;
+
+		set_rs_rto(&rse, virt_to_phys(rto));
+		validate_rs_entry(&rse);
+		entry_clr_protected(&rse);
+
+		old_rse = cmpxchg(rsep, ZPCI_TABLE_INVALID, rse);
+		if (old_rse != ZPCI_TABLE_INVALID) {
+			/* Someone else was faster, use theirs */
+			dma_free_cpu_table(rto);
+			rto = get_rs_rto(old_rse);
+		}
+	}
+	return rto;
+}
+
+static unsigned long *dma_walk_rf_table(unsigned long *rfo,
+					dma_addr_t dma_addr, gfp_t gfp)
+{
+	unsigned int rfx = calc_rfx(dma_addr);
+	unsigned long old_rfe, rfe;
+	unsigned long *rfep, *rso;
+
+	rfep = &rfo[rfx];
+	rfe = READ_ONCE(*rfep);
+	if (reg_entry_isvalid(rfe)) {
+		rso = get_rf_rso(rfe);
+	} else {
+		rso = dma_alloc_cpu_table(gfp);
+		if (!rso)
+			return NULL;
+
+		set_rf_rso(&rfe, virt_to_phys(rso));
+		validate_rf_entry(&rfe);
+		entry_clr_protected(&rfe);
+
+		old_rfe = cmpxchg(rfep, ZPCI_TABLE_INVALID, rfe);
+		if (old_rfe != ZPCI_TABLE_INVALID) {
+			/* Someone else was faster, use theirs */
+			dma_free_cpu_table(rso);
+			rso = get_rf_rso(old_rfe);
+		}
+	}
+
+	if (!rso)
+		return NULL;
+
+	return dma_walk_rs_table(rso, dma_addr, gfp);
+}
+
 static unsigned long *dma_get_seg_table_origin(unsigned long *rtep, gfp_t gfp)
 {
 	unsigned long old_rte, rte;
@@ -339,11 +433,31 @@ static unsigned long *dma_get_page_table_origin(unsigned long *step, gfp_t gfp)
 	return pto;
 }
 
-static unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr, gfp_t gfp)
+static unsigned long *dma_walk_region_tables(struct s390_domain *domain,
+					     dma_addr_t dma_addr, gfp_t gfp)
 {
-	unsigned long *sto, *pto;
+	switch (domain->origin_type) {
+	case ZPCI_TABLE_TYPE_RFX:
+		return dma_walk_rf_table(domain->dma_table, dma_addr, gfp);
+	case ZPCI_TABLE_TYPE_RSX:
+		return dma_walk_rs_table(domain->dma_table, dma_addr, gfp);
+	case ZPCI_TABLE_TYPE_RTX:
+		return domain->dma_table;
+	default:
+		return NULL;
+	}
+}
+
+static unsigned long *dma_walk_cpu_trans(struct s390_domain *domain,
+					 dma_addr_t dma_addr, gfp_t gfp)
+{
+	unsigned long *rto, *sto, *pto;
 	unsigned int rtx, sx, px;
 
+	rto = dma_walk_region_tables(domain, dma_addr, gfp);
+	if (!rto)
+		return NULL;
+
 	rtx = calc_rtx(dma_addr);
 	sto = dma_get_seg_table_origin(&rto[rtx], gfp);
 	if (!sto)
@@ -690,8 +804,7 @@ static int s390_iommu_validate_trans(struct s390_domain *s390_domain,
 	int rc;
 
 	for (i = 0; i < nr_pages; i++) {
-		entry = dma_walk_cpu_trans(s390_domain->dma_table, dma_addr,
-					   gfp);
+		entry = dma_walk_cpu_trans(s390_domain, dma_addr, gfp);
 		if (unlikely(!entry)) {
 			rc = -ENOMEM;
 			goto undo_cpu_trans;
@@ -706,8 +819,7 @@ static int s390_iommu_validate_trans(struct s390_domain *s390_domain,
 undo_cpu_trans:
 	while (i-- > 0) {
 		dma_addr -= PAGE_SIZE;
-		entry = dma_walk_cpu_trans(s390_domain->dma_table,
-					   dma_addr, gfp);
+		entry = dma_walk_cpu_trans(s390_domain, dma_addr, gfp);
 		if (!entry)
 			break;
 		dma_update_cpu_trans(entry, 0, ZPCI_PTE_INVALID);
@@ -724,8 +836,7 @@ static int s390_iommu_invalidate_trans(struct s390_domain *s390_domain,
 	int rc = 0;
 
 	for (i = 0; i < nr_pages; i++) {
-		entry = dma_walk_cpu_trans(s390_domain->dma_table, dma_addr,
-					   GFP_ATOMIC);
+		entry = dma_walk_cpu_trans(s390_domain, dma_addr, GFP_ATOMIC);
 		if (unlikely(!entry)) {
 			rc = -EINVAL;
 			break;
-- 
2.49.0


* [PATCH v5 5/5] iommu/s390: allow larger region tables
  2025-04-11 20:24 [PATCH v5 0/5] iommu/s390: support additional table regions Matthew Rosato
                   ` (3 preceding siblings ...)
  2025-04-11 20:24 ` [PATCH v5 4/5] iommu/s390: support map/unmap " Matthew Rosato
@ 2025-04-11 20:24 ` Matthew Rosato
  2025-04-16  9:35   ` Niklas Schnelle
  2025-04-16  9:19 ` [PATCH v5 0/5] iommu/s390: support additional table regions Niklas Schnelle
  2025-04-17 14:43 ` Joerg Roedel
  6 siblings, 1 reply; 10+ messages in thread
From: Matthew Rosato @ 2025-04-11 20:24 UTC (permalink / raw)
  To: joro, will, robin.murphy, gerald.schaefer, schnelle
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

Extend the aperture calculation to consider sizes beyond the maximum
size of a region third table.  Attempt to always use the smallest
table size possible to avoid unnecessary extra steps during translation.
Update reserved region calculations to use the appropriate table size.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/pci_dma.h |  1 +
 drivers/iommu/s390-iommu.c      | 70 ++++++++++++++++++++++++---------
 2 files changed, 53 insertions(+), 18 deletions(-)
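
As a worked example (illustrative sizes, start_dma offset elided): on a
host where the usable aperture comes out near 11 TB, a region third
table is no longer sufficient, so the selection added below picks a
region second table, provided the device's dtsm mask advertises
ZPCI_IOTA_DT_RS:

	/*
	 * aperture_size = min(s390_iommu_aperture,
	 *                     zdev->end_dma - zdev->start_dma + 1)  ~ 11 TB
	 *   11 TB >  ZPCI_TABLE_SIZE_RT (4 TiB)            -> not RTX
	 *   11 TB <= ZPCI_TABLE_SIZE_RS (8 PiB), DT_RS set -> RSX
	 * aperture_end then becomes ZPCI_TABLE_SIZE_RS - 1 via max_tbl_size().
	 */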

diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
index 8d8962e4fd58..d12e17201661 100644
--- a/arch/s390/include/asm/pci_dma.h
+++ b/arch/s390/include/asm/pci_dma.h
@@ -25,6 +25,7 @@ enum zpci_ioat_dtype {
 #define ZPCI_KEY			(PAGE_DEFAULT_KEY << 5)
 
 #define ZPCI_TABLE_SIZE_RT	(1UL << 42)
+#define ZPCI_TABLE_SIZE_RS	(1UL << 53)
 
 #define ZPCI_IOTA_STO_FLAG	(ZPCI_IOTA_IOT_ENABLED | ZPCI_KEY | ZPCI_IOTA_DT_ST)
 #define ZPCI_IOTA_RTTO_FLAG	(ZPCI_IOTA_IOT_ENABLED | ZPCI_KEY | ZPCI_IOTA_DT_RT)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 46f45b136993..433b59f43530 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -511,9 +511,25 @@ static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap)
 	}
 }
 
+static inline u64 max_tbl_size(struct s390_domain *domain)
+{
+	switch (domain->origin_type) {
+	case ZPCI_TABLE_TYPE_RTX:
+		return ZPCI_TABLE_SIZE_RT - 1;
+	case ZPCI_TABLE_TYPE_RSX:
+		return ZPCI_TABLE_SIZE_RS - 1;
+	case ZPCI_TABLE_TYPE_RFX:
+		return U64_MAX;
+	default:
+		return 0;
+	}
+}
+
 static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
 {
+	struct zpci_dev *zdev = to_zpci_dev(dev);
 	struct s390_domain *s390_domain;
+	u64 aperture_size;
 
 	s390_domain = kzalloc(sizeof(*s390_domain), GFP_KERNEL);
 	if (!s390_domain)
@@ -524,10 +540,26 @@ static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
 		kfree(s390_domain);
 		return NULL;
 	}
+
+	aperture_size = min(s390_iommu_aperture,
+			    zdev->end_dma - zdev->start_dma + 1);
+	if (aperture_size <= (ZPCI_TABLE_SIZE_RT - zdev->start_dma)) {
+		s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
+	} else if (aperture_size <= (ZPCI_TABLE_SIZE_RS - zdev->start_dma) &&
+		  (zdev->dtsm & ZPCI_IOTA_DT_RS)) {
+		s390_domain->origin_type = ZPCI_TABLE_TYPE_RSX;
+	} else if (zdev->dtsm & ZPCI_IOTA_DT_RF) {
+		s390_domain->origin_type = ZPCI_TABLE_TYPE_RFX;
+	} else {
+		/* Assume RTX available */
+		s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
+		aperture_size = ZPCI_TABLE_SIZE_RT - zdev->start_dma;
+	}
+	zdev->end_dma = zdev->start_dma + aperture_size - 1;
+
 	s390_domain->domain.geometry.force_aperture = true;
 	s390_domain->domain.geometry.aperture_start = 0;
-	s390_domain->domain.geometry.aperture_end = ZPCI_TABLE_SIZE_RT - 1;
-	s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
+	s390_domain->domain.geometry.aperture_end = max_tbl_size(s390_domain);
 
 	spin_lock_init(&s390_domain->list_lock);
 	INIT_LIST_HEAD_RCU(&s390_domain->devices);
@@ -680,6 +712,8 @@ static void s390_iommu_get_resv_regions(struct device *dev,
 {
 	struct zpci_dev *zdev = to_zpci_dev(dev);
 	struct iommu_resv_region *region;
+	u64 max_size, end_resv;
+	unsigned long flags;
 
 	if (zdev->start_dma) {
 		region = iommu_alloc_resv_region(0, zdev->start_dma, 0,
@@ -689,10 +723,21 @@ static void s390_iommu_get_resv_regions(struct device *dev,
 		list_add_tail(&region->list, list);
 	}
 
-	if (zdev->end_dma < ZPCI_TABLE_SIZE_RT - 1) {
-		region = iommu_alloc_resv_region(zdev->end_dma + 1,
-						 ZPCI_TABLE_SIZE_RT - zdev->end_dma - 1,
-						 0, IOMMU_RESV_RESERVED, GFP_KERNEL);
+	spin_lock_irqsave(&zdev->dom_lock, flags);
+	if (zdev->s390_domain->type == IOMMU_DOMAIN_BLOCKED ||
+	    zdev->s390_domain->type == IOMMU_DOMAIN_IDENTITY) {
+		spin_unlock_irqrestore(&zdev->dom_lock, flags);
+		return;
+	}
+
+	max_size = max_tbl_size(to_s390_domain(zdev->s390_domain));
+	spin_unlock_irqrestore(&zdev->dom_lock, flags);
+
+	if (zdev->end_dma < max_size) {
+		end_resv = max_size - zdev->end_dma;
+		region = iommu_alloc_resv_region(zdev->end_dma + 1, end_resv,
+						 0, IOMMU_RESV_RESERVED,
+						 GFP_KERNEL);
 		if (!region)
 			return;
 		list_add_tail(&region->list, list);
@@ -708,13 +753,9 @@ static struct iommu_device *s390_iommu_probe_device(struct device *dev)
 
 	zdev = to_zpci_dev(dev);
 
-	if (zdev->start_dma > zdev->end_dma ||
-	    zdev->start_dma > ZPCI_TABLE_SIZE_RT - 1)
+	if (zdev->start_dma > zdev->end_dma)
 		return ERR_PTR(-EINVAL);
 
-	if (zdev->end_dma > ZPCI_TABLE_SIZE_RT - 1)
-		zdev->end_dma = ZPCI_TABLE_SIZE_RT - 1;
-
 	if (zdev->tlb_refresh)
 		dev->iommu->shadow_on_flush = 1;
 
@@ -999,7 +1040,6 @@ struct zpci_iommu_ctrs *zpci_get_iommu_ctrs(struct zpci_dev *zdev)
 
 int zpci_init_iommu(struct zpci_dev *zdev)
 {
-	u64 aperture_size;
 	int rc = 0;
 
 	rc = iommu_device_sysfs_add(&zdev->iommu_dev, NULL, NULL,
@@ -1017,12 +1057,6 @@ int zpci_init_iommu(struct zpci_dev *zdev)
 	if (rc)
 		goto out_sysfs;
 
-	zdev->start_dma = PAGE_ALIGN(zdev->start_dma);
-	aperture_size = min3(s390_iommu_aperture,
-			     ZPCI_TABLE_SIZE_RT - zdev->start_dma,
-			     zdev->end_dma - zdev->start_dma + 1);
-	zdev->end_dma = zdev->start_dma + aperture_size - 1;
-
 	return 0;
 
 out_sysfs:
-- 
2.49.0


* Re: [PATCH v5 0/5] iommu/s390: support additional table regions
  2025-04-11 20:24 [PATCH v5 0/5] iommu/s390: support additional table regions Matthew Rosato
                   ` (4 preceding siblings ...)
  2025-04-11 20:24 ` [PATCH v5 5/5] iommu/s390: allow larger region tables Matthew Rosato
@ 2025-04-16  9:19 ` Niklas Schnelle
  2025-04-17 14:43 ` Joerg Roedel
  6 siblings, 0 replies; 10+ messages in thread
From: Niklas Schnelle @ 2025-04-16  9:19 UTC (permalink / raw)
  To: Matthew Rosato, joro, will, robin.murphy, gerald.schaefer
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

On Fri, 2025-04-11 at 16:24 -0400, Matthew Rosato wrote:
> The series extends the maximum table size allowed by s390-iommu by
> increasing the number of table regions supported.  It also adds logic to
> construct the table using the minimum number of regions based upon the
> aperture calculation.
> 
> Changes for v5:
> - fix GFP_KERNEL under spinlock in patch 5
> 
> Changes for v4:
> - rebase onto master/merge window
> - convert patch 3 per Niklas to remove gotos
> - convert patch 4 to also remove gotos
> - add review tags
> 
> Changes for v3:
> - rebase onto iommu-next
> - move IOTA region type setting into s390-iommu
> - remove origin_type and max_table_size from zdev
> - adjust reserved region calculation to be dependent on the domain
> 
> Changes for v2:
> - rebase onto 6.13
> - remove 'iommu/s390: add basic routines for region 1st and 2nd tables'
>   and put routines in first patch that uses each.  No functional change.
> 
> Matthew Rosato (5):
>   iommu/s390: set appropriate IOTA region type
>   iommu/s390: support cleanup of additional table regions
>   iommu/s390: support iova_to_phys for additional table regions
>   iommu/s390: support map/unmap for additional table regions
>   iommu/s390: allow larger region tables
> 
>  arch/s390/include/asm/pci_dma.h |   3 +
>  drivers/iommu/s390-iommu.c      | 345 ++++++++++++++++++++++++++++----
>  2 files changed, 314 insertions(+), 34 deletions(-)
> 

I gave this v5 another test run with it applied atop v6.15-rc2. I used
an LPAR with 11 TB of memory and 2 KVM guests each with 5 TB. The
guests use iommu.passthrough=1 and have NIC VFs from the same card
passed-through. With such large sizes we really need the newly
supported table types. For good measure, I also used the PCI transport
for the virtio-net NIC.

Traffic between the passed-through VFs works great. Feel free to add:

Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>

* Re: [PATCH v5 4/5] iommu/s390: support map/unmap for additional table regions
  2025-04-11 20:24 ` [PATCH v5 4/5] iommu/s390: support map/unmap " Matthew Rosato
@ 2025-04-16  9:19   ` Niklas Schnelle
  0 siblings, 0 replies; 10+ messages in thread
From: Niklas Schnelle @ 2025-04-16  9:19 UTC (permalink / raw)
  To: Matthew Rosato, joro, will, robin.murphy, gerald.schaefer
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

On Fri, 2025-04-11 at 16:24 -0400, Matthew Rosato wrote:
> Map and unmap ops use the shared dma_walk_cpu_trans routine; update
> it to use the origin_type of the dma_table to determine how many
> table levels must be walked.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  drivers/iommu/s390-iommu.c | 127 ++++++++++++++++++++++++++++++++++---
>  1 file changed, 119 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index 338a7381e918..46f45b136993 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -67,6 +67,20 @@ static inline void set_pt_pfaa(unsigned long *entry, phys_addr_t pfaa)
>  	*entry |= (pfaa & ZPCI_PTE_ADDR_MASK);
>  }
>  
> +static inline void set_rf_rso(unsigned long *entry, phys_addr_t rso)
> +{
> +	*entry &= ZPCI_RTE_FLAG_MASK;
> +	*entry |= (rso & ZPCI_RTE_ADDR_MASK);
> +	*entry |= ZPCI_TABLE_TYPE_RFX;
> +}
> +
> +static inline void set_rs_rto(unsigned long *entry, phys_addr_t rto)
> +{
> +	*entry &= ZPCI_RTE_FLAG_MASK;
> +	*entry |= (rto & ZPCI_RTE_ADDR_MASK);
> +	*entry |= ZPCI_TABLE_TYPE_RSX;
> +}
> +
>  static inline void set_rt_sto(unsigned long *entry, phys_addr_t sto)
>  {
>  	*entry &= ZPCI_RTE_FLAG_MASK;
> @@ -81,6 +95,22 @@ static inline void set_st_pto(unsigned long *entry, phys_addr_t pto)
>  	*entry |= ZPCI_TABLE_TYPE_SX;
>  }
>  
> +static inline void validate_rf_entry(unsigned long *entry)
> +{
> +	*entry &= ~ZPCI_TABLE_VALID_MASK;
> +	*entry &= ~ZPCI_TABLE_OFFSET_MASK;
> +	*entry |= ZPCI_TABLE_VALID;
> +	*entry |= ZPCI_TABLE_LEN_RFX;
> +}
> +
> +static inline void validate_rs_entry(unsigned long *entry)
> +{
> +	*entry &= ~ZPCI_TABLE_VALID_MASK;
> +	*entry &= ~ZPCI_TABLE_OFFSET_MASK;
> +	*entry |= ZPCI_TABLE_VALID;
> +	*entry |= ZPCI_TABLE_LEN_RSX;
> +}
> +
> 
--- snip ---
>  
> -static unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr, gfp_t gfp)
> +static unsigned long *dma_walk_region_tables(struct s390_domain *domain,
> +					     dma_addr_t dma_addr, gfp_t gfp)
>  {
> -	unsigned long *sto, *pto;
> +	switch (domain->origin_type) {
> +	case ZPCI_TABLE_TYPE_RFX:
> +		return dma_walk_rf_table(domain->dma_table, dma_addr, gfp);
> +	case ZPCI_TABLE_TYPE_RSX:
> +		return dma_walk_rs_table(domain->dma_table, dma_addr, gfp);
> +	case ZPCI_TABLE_TYPE_RTX:
> +		return domain->dma_table;
> +	default:
> +		return NULL;
> +	}
> +}
> +
> +static unsigned long *dma_walk_cpu_trans(struct s390_domain *domain,
> +					 dma_addr_t dma_addr, gfp_t gfp)
> +{
> +	unsigned long *rto, *sto, *pto;
>  	unsigned int rtx, sx, px;
>  
> +	rto = dma_walk_region_tables(domain, dma_addr, gfp);
> +	if (!rto)
> +		return NULL;
> +
>  	rtx = calc_rtx(dma_addr);
>  	sto = dma_get_seg_table_origin(&rto[rtx], gfp);
>  	if (!sto)
> @@ -690,8 +804,7 @@ static int s390_iommu_validate_trans(struct s390_domain *s390_domain,
>  	int rc;
>  
>  	for (i = 0; i < nr_pages; i++) {
> -		entry = dma_walk_cpu_trans(s390_domain->dma_table, dma_addr,
> -					   gfp);
> +		entry = dma_walk_cpu_trans(s390_domain, dma_addr, gfp);
>  		if (unlikely(!entry)) {
>  			rc = -ENOMEM;
>  			goto undo_cpu_trans;
> @@ -706,8 +819,7 @@ static int s390_iommu_validate_trans(struct s390_domain *s390_domain,
>  undo_cpu_trans:
>  	while (i-- > 0) {
>  		dma_addr -= PAGE_SIZE;
> -		entry = dma_walk_cpu_trans(s390_domain->dma_table,
> -					   dma_addr, gfp);
> +		entry = dma_walk_cpu_trans(s390_domain, dma_addr, gfp);
>  		if (!entry)
>  			break;
>  		dma_update_cpu_trans(entry, 0, ZPCI_PTE_INVALID);
> @@ -724,8 +836,7 @@ static int s390_iommu_invalidate_trans(struct s390_domain *s390_domain,
>  	int rc = 0;
>  
>  	for (i = 0; i < nr_pages; i++) {
> -		entry = dma_walk_cpu_trans(s390_domain->dma_table, dma_addr,
> -					   GFP_ATOMIC);
> +		entry = dma_walk_cpu_trans(s390_domain, dma_addr, GFP_ATOMIC);
>  		if (unlikely(!entry)) {
>  			rc = -EINVAL;
>  			break;

Looks good to me and we even get nicer formatting :)

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>


* Re: [PATCH v5 5/5] iommu/s390: allow larger region tables
  2025-04-11 20:24 ` [PATCH v5 5/5] iommu/s390: allow larger region tables Matthew Rosato
@ 2025-04-16  9:35   ` Niklas Schnelle
  0 siblings, 0 replies; 10+ messages in thread
From: Niklas Schnelle @ 2025-04-16  9:35 UTC (permalink / raw)
  To: Matthew Rosato, joro, will, robin.murphy, gerald.schaefer
  Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
	linux-s390

On Fri, 2025-04-11 at 16:24 -0400, Matthew Rosato wrote:
> Extend the aperture calculation to consider sizes beyond the maximum
> size of a region third table.  Attempt to always use the smallest
> table size possible to avoid unnecessary extra steps during translation.
> Update reserved region calculations to use the appropriate table size.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  arch/s390/include/asm/pci_dma.h |  1 +
>  drivers/iommu/s390-iommu.c      | 70 ++++++++++++++++++++++++---------
>  2 files changed, 53 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
> index 8d8962e4fd58..d12e17201661 100644
> --- a/arch/s390/include/asm/pci_dma.h
> +++ b/arch/s390/include/asm/pci_dma.h
> @@ -25,6 +25,7 @@ enum zpci_ioat_dtype {
>  #define ZPCI_KEY			(PAGE_DEFAULT_KEY << 5)
>  
>  #define ZPCI_TABLE_SIZE_RT	(1UL << 42)
> +#define ZPCI_TABLE_SIZE_RS	(1UL << 53)
>  
>  #define ZPCI_IOTA_STO_FLAG	(ZPCI_IOTA_IOT_ENABLED | ZPCI_KEY | ZPCI_IOTA_DT_ST)
>  #define ZPCI_IOTA_RTTO_FLAG	(ZPCI_IOTA_IOT_ENABLED | ZPCI_KEY | ZPCI_IOTA_DT_RT)
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index 46f45b136993..433b59f43530 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -511,9 +511,25 @@ static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap)
>  	}
>  }
>  
> +static inline u64 max_tbl_size(struct s390_domain *domain)
> +{
> +	switch (domain->origin_type) {
> +	case ZPCI_TABLE_TYPE_RTX:
> +		return ZPCI_TABLE_SIZE_RT - 1;
> +	case ZPCI_TABLE_TYPE_RSX:
> +		return ZPCI_TABLE_SIZE_RS - 1;
> +	case ZPCI_TABLE_TYPE_RFX:
> +		return U64_MAX;
> +	default:
> +		return 0;
> +	}
> +}
> +
>  static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
>  {
> +	struct zpci_dev *zdev = to_zpci_dev(dev);
>  	struct s390_domain *s390_domain;
> +	u64 aperture_size;
>  
>  	s390_domain = kzalloc(sizeof(*s390_domain), GFP_KERNEL);
>  	if (!s390_domain)
> @@ -524,10 +540,26 @@ static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
>  		kfree(s390_domain);
>  		return NULL;
>  	}
> +
> +	aperture_size = min(s390_iommu_aperture,
> +			    zdev->end_dma - zdev->start_dma + 1);
> +	if (aperture_size <= (ZPCI_TABLE_SIZE_RT - zdev->start_dma)) {
> +		s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
> +	} else if (aperture_size <= (ZPCI_TABLE_SIZE_RS - zdev->start_dma) &&
> +		  (zdev->dtsm & ZPCI_IOTA_DT_RS)) {
> +		s390_domain->origin_type = ZPCI_TABLE_TYPE_RSX;
> +	} else if (zdev->dtsm & ZPCI_IOTA_DT_RF) {
> +		s390_domain->origin_type = ZPCI_TABLE_TYPE_RFX;
> +	} else {
> +		/* Assume RTX available */
> +		s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
> +		aperture_size = ZPCI_TABLE_SIZE_RT - zdev->start_dma;
> +	}
> +	zdev->end_dma = zdev->start_dma + aperture_size - 1;
> +
>  	s390_domain->domain.geometry.force_aperture = true;
>  	s390_domain->domain.geometry.aperture_start = 0;
> -	s390_domain->domain.geometry.aperture_end = ZPCI_TABLE_SIZE_RT - 1;
> -	s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
> +	s390_domain->domain.geometry.aperture_end = max_tbl_size(s390_domain);
>  
>  	spin_lock_init(&s390_domain->list_lock);
>  	INIT_LIST_HEAD_RCU(&s390_domain->devices);
> @@ -680,6 +712,8 @@ static void s390_iommu_get_resv_regions(struct device *dev,
>  {
>  	struct zpci_dev *zdev = to_zpci_dev(dev);
>  	struct iommu_resv_region *region;
> +	u64 max_size, end_resv;
> +	unsigned long flags;
>  
>  	if (zdev->start_dma) {
>  		region = iommu_alloc_resv_region(0, zdev->start_dma, 0,
> @@ -689,10 +723,21 @@ static void s390_iommu_get_resv_regions(struct device *dev,
>  		list_add_tail(&region->list, list);
>  	}
>  
> -	if (zdev->end_dma < ZPCI_TABLE_SIZE_RT - 1) {
> -		region = iommu_alloc_resv_region(zdev->end_dma + 1,
> -						 ZPCI_TABLE_SIZE_RT - zdev->end_dma - 1,
> -						 0, IOMMU_RESV_RESERVED, GFP_KERNEL);
> +	spin_lock_irqsave(&zdev->dom_lock, flags);
> +	if (zdev->s390_domain->type == IOMMU_DOMAIN_BLOCKED ||
> +	    zdev->s390_domain->type == IOMMU_DOMAIN_IDENTITY) {
> +		spin_unlock_irqrestore(&zdev->dom_lock, flags);
> +		return;
> +	}
> +
> +	max_size = max_tbl_size(to_s390_domain(zdev->s390_domain));
> +	spin_unlock_irqrestore(&zdev->dom_lock, flags);
> +
> +	if (zdev->end_dma < max_size) {
> +		end_resv = max_size - zdev->end_dma;
> +		region = iommu_alloc_resv_region(zdev->end_dma + 1, end_resv,
> +						 0, IOMMU_RESV_RESERVED,
> +						 GFP_KERNEL);
>  		if (!region)
>  			return;
>  		list_add_tail(&region->list, list);
> @@ -708,13 +753,9 @@ static struct iommu_device *s390_iommu_probe_device(struct device *dev)
>  
>  	zdev = to_zpci_dev(dev);
>  
> -	if (zdev->start_dma > zdev->end_dma ||
> -	    zdev->start_dma > ZPCI_TABLE_SIZE_RT - 1)
> +	if (zdev->start_dma > zdev->end_dma)
>  		return ERR_PTR(-EINVAL);
>  
> -	if (zdev->end_dma > ZPCI_TABLE_SIZE_RT - 1)
> -		zdev->end_dma = ZPCI_TABLE_SIZE_RT - 1;
> -
>  	if (zdev->tlb_refresh)
>  		dev->iommu->shadow_on_flush = 1;
>  
> @@ -999,7 +1040,6 @@ struct zpci_iommu_ctrs *zpci_get_iommu_ctrs(struct zpci_dev *zdev)
>  
>  int zpci_init_iommu(struct zpci_dev *zdev)
>  {
> -	u64 aperture_size;
>  	int rc = 0;
>  
>  	rc = iommu_device_sysfs_add(&zdev->iommu_dev, NULL, NULL,
> @@ -1017,12 +1057,6 @@ int zpci_init_iommu(struct zpci_dev *zdev)
>  	if (rc)
>  		goto out_sysfs;
>  
> -	zdev->start_dma = PAGE_ALIGN(zdev->start_dma);
> -	aperture_size = min3(s390_iommu_aperture,
> -			     ZPCI_TABLE_SIZE_RT - zdev->start_dma,
> -			     zdev->end_dma - zdev->start_dma + 1);
> -	zdev->end_dma = zdev->start_dma + aperture_size - 1;
> -
>  	return 0;
>  
>  out_sysfs:

Looks good, thanks for the great work!

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>

* Re: [PATCH v5 0/5] iommu/s390: support additional table regions
  2025-04-11 20:24 [PATCH v5 0/5] iommu/s390: support additional table regions Matthew Rosato
                   ` (5 preceding siblings ...)
  2025-04-16  9:19 ` [PATCH v5 0/5] iommu/s390: support additional table regions Niklas Schnelle
@ 2025-04-17 14:43 ` Joerg Roedel
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2025-04-17 14:43 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: will, robin.murphy, gerald.schaefer, schnelle, hca, gor, agordeev,
	svens, borntraeger, clg, iommu, linux-kernel, linux-s390

On Fri, Apr 11, 2025 at 04:24:28PM -0400, Matthew Rosato wrote:
> Matthew Rosato (5):
>   iommu/s390: set appropriate IOTA region type
>   iommu/s390: support cleanup of additional table regions
>   iommu/s390: support iova_to_phys for additional table regions
>   iommu/s390: support map/unmap for additional table regions
>   iommu/s390: allow larger region tables
> 
>  arch/s390/include/asm/pci_dma.h |   3 +
>  drivers/iommu/s390-iommu.c      | 345 ++++++++++++++++++++++++++++----
>  2 files changed, 314 insertions(+), 34 deletions(-)

Applied, thanks.

