* [PATCH v3 0/5] iommu/s390: support additional table regions
@ 2025-02-28 21:44 Matthew Rosato
2025-02-28 21:44 ` [PATCH v3 1/5] iommu/s390: set appropriate IOTA region type Matthew Rosato
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Matthew Rosato @ 2025-02-28 21:44 UTC (permalink / raw)
To: joro, will, robin.murphy, gerald.schaefer, schnelle
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
This series extends the maximum table size allowed by s390-iommu by
increasing the number of table regions supported. It also adds logic to
construct the table using the minimum number of regions based upon the
aperture calculation.
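For orientation, the address span reachable from each table type follows
from the index widths. The sketch below is a standalone userspace
illustration, not driver code. PAGE_SHIFT and ZPCI_TABLE_BITS are assumed
values (chosen to be consistent with ZPCI_TABLE_SIZE_RT being 1UL << 42),
and the enum and span_bits() are hypothetical names.

```c
#include <assert.h>

#define PAGE_SHIFT      12	/* assumed: 4 KiB pages */
#define ZPCI_PT_BITS    8	/* page table index bits, as in pci_dma.h */
#define ZPCI_TABLE_BITS 11	/* assumed: segment/region index bits */

/* Hypothetical stand-ins for the ZPCI_TABLE_TYPE_* origin types. */
enum table_type { TYPE_RTX, TYPE_RSX, TYPE_RFX };

/* Number of IOVA bits translated when the walk starts at 'origin'. */
unsigned int span_bits(enum table_type origin)
{
	/* Page offset + page table index + segment table index. */
	unsigned int bits = PAGE_SHIFT + ZPCI_PT_BITS + ZPCI_TABLE_BITS;

	switch (origin) {
	case TYPE_RFX:
		bits += ZPCI_TABLE_BITS;
		/* fall through */
	case TYPE_RSX:
		bits += ZPCI_TABLE_BITS;
		/* fall through */
	case TYPE_RTX:
		bits += ZPCI_TABLE_BITS;
		break;
	}
	return bits;
}
```

Under these assumptions a region third origin covers 42 bits, a region
second origin 53 bits, and a region first origin the full 64 bits.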
Changes for v3:
- rebase onto iommu-next
- move IOTA region type setting into s390-iommu
- remove origin_type and max_table_size from zdev
- adjust reserved region calculation to be dependent on the domain
Changes for v2:
- rebase onto 6.13
- remove 'iommu/s390: add basic routines for region 1st and 2nd tables'
and put routines in first patch that uses each. No functional change.
Matthew Rosato (5):
iommu/s390: set appropriate IOTA region type
iommu/s390: support cleanup of additional table regions
iommu/s390: support iova_to_phys for additional table regions
iommu/s390: support map/unmap for additional table regions
iommu/s390: allow larger region tables
arch/s390/include/asm/pci_dma.h | 3 +
drivers/iommu/s390-iommu.c | 342 ++++++++++++++++++++++++++++----
2 files changed, 310 insertions(+), 35 deletions(-)
--
2.48.1
* [PATCH v3 1/5] iommu/s390: set appropriate IOTA region type
2025-02-28 21:44 [PATCH v3 0/5] iommu/s390: support additional table regions Matthew Rosato
@ 2025-02-28 21:44 ` Matthew Rosato
2025-03-11 11:49 ` Niklas Schnelle
2025-02-28 21:44 ` [PATCH v3 2/5] iommu/s390: support cleanup of additional table regions Matthew Rosato
` (3 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Matthew Rosato @ 2025-02-28 21:44 UTC (permalink / raw)
To: joro, will, robin.murphy, gerald.schaefer, schnelle
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
When registering the I/O Translation Anchor, use the current table type
stored in the s390_domain to set the appropriate region type
indication. For the moment, the table type will always be stored as
region third.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
drivers/iommu/s390-iommu.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index e1c76e0f9c2b..cad032d4c9a6 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -31,6 +31,7 @@ struct s390_domain {
unsigned long *dma_table;
spinlock_t list_lock;
struct rcu_head rcu;
+ u8 origin_type;
};
static struct iommu_domain blocking_domain;
@@ -345,6 +346,7 @@ static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
s390_domain->domain.geometry.force_aperture = true;
s390_domain->domain.geometry.aperture_start = 0;
s390_domain->domain.geometry.aperture_end = ZPCI_TABLE_SIZE_RT - 1;
+ s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
spin_lock_init(&s390_domain->list_lock);
INIT_LIST_HEAD_RCU(&s390_domain->devices);
@@ -381,6 +383,21 @@ static void zdev_s390_domain_update(struct zpci_dev *zdev,
spin_unlock_irqrestore(&zdev->dom_lock, flags);
}
+static u64 get_iota_region_flag(struct s390_domain *domain)
+{
+ switch (domain->origin_type) {
+ case ZPCI_TABLE_TYPE_RTX:
+ return ZPCI_IOTA_RTTO_FLAG;
+ case ZPCI_TABLE_TYPE_RSX:
+ return ZPCI_IOTA_RSTO_FLAG;
+ case ZPCI_TABLE_TYPE_RFX:
+ return ZPCI_IOTA_RFTO_FLAG;
+ default:
+ WARN_ONCE(1, "Invalid IOMMU table (%x)\n", domain->origin_type);
+ return 0;
+ }
+}
+
static int s390_iommu_domain_reg_ioat(struct zpci_dev *zdev,
struct iommu_domain *domain, u8 *status)
{
@@ -399,7 +416,7 @@ static int s390_iommu_domain_reg_ioat(struct zpci_dev *zdev,
default:
s390_domain = to_s390_domain(domain);
iota = virt_to_phys(s390_domain->dma_table) |
- ZPCI_IOTA_RTTO_FLAG;
+ get_iota_region_flag(s390_domain);
rc = zpci_register_ioat(zdev, 0, zdev->start_dma,
zdev->end_dma, iota, status);
}
--
2.48.1
* [PATCH v3 2/5] iommu/s390: support cleanup of additional table regions
2025-02-28 21:44 [PATCH v3 0/5] iommu/s390: support additional table regions Matthew Rosato
2025-02-28 21:44 ` [PATCH v3 1/5] iommu/s390: set appropriate IOTA region type Matthew Rosato
@ 2025-02-28 21:44 ` Matthew Rosato
2025-03-11 12:01 ` Niklas Schnelle
2025-02-28 21:44 ` [PATCH v3 3/5] iommu/s390: support iova_to_phys for " Matthew Rosato
` (2 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Matthew Rosato @ 2025-02-28 21:44 UTC (permalink / raw)
To: joro, will, robin.murphy, gerald.schaefer, schnelle
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
Extend the existing dma_cleanup_tables to also handle region second and
region first tables.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
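A toy userspace model of the cleanup order, for readers who want to see the
shape without the zPCI entry encoding. Note the swap: the patch uses one
explicit function per level (dma_free_rs_table, dma_free_rt_table,
dma_free_seg_table), while this sketch uses a single recursive helper with a
depth parameter. All names here (struct table, table_alloc, table_free,
tables_freed) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define ENTRIES 4	/* toy size; the real tables have ZPCI_TABLE_ENTRIES slots */

/* Toy table: each slot is NULL (invalid) or points to a lower-level table. */
struct table {
	struct table *slot[ENTRIES];
};

int tables_freed;	/* counter used only to observe the sketch */

struct table *table_alloc(void)
{
	return calloc(1, sizeof(struct table));
}

/*
 * Free a table that sits 'levels' levels above the leaf tables.
 * Children are released before the table that references them,
 * mirroring how the origin table is freed last in the patch.
 */
void table_free(struct table *t, int levels)
{
	int i;

	if (!t)
		return;
	if (levels > 0)
		for (i = 0; i < ENTRIES; i++)
			table_free(t->slot[i], levels - 1);
	free(t);
	tables_freed++;
}
```

In this model the starting depth plays the role of origin_type: a deeper
origin simply means more levels are walked before the root is freed.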
drivers/iommu/s390-iommu.c | 71 ++++++++++++++++++++++++++++++++++----
1 file changed, 64 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index cad032d4c9a6..f2cda0ce0fe9 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -121,6 +121,22 @@ static inline int pt_entry_isvalid(unsigned long entry)
return (entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID;
}
+static inline unsigned long *get_rf_rso(unsigned long entry)
+{
+ if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_RFX)
+ return phys_to_virt(entry & ZPCI_RTE_ADDR_MASK);
+ else
+ return NULL;
+}
+
+static inline unsigned long *get_rs_rto(unsigned long entry)
+{
+ if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_RSX)
+ return phys_to_virt(entry & ZPCI_RTE_ADDR_MASK);
+ else
+ return NULL;
+}
+
static inline unsigned long *get_rt_sto(unsigned long entry)
{
if ((entry & ZPCI_TABLE_TYPE_MASK) == ZPCI_TABLE_TYPE_RTX)
@@ -192,18 +208,59 @@ static void dma_free_seg_table(unsigned long entry)
dma_free_cpu_table(sto);
}
-static void dma_cleanup_tables(unsigned long *table)
+static void dma_free_rt_table(unsigned long entry)
{
+ unsigned long *rto = get_rs_rto(entry);
int rtx;
- if (!table)
+ for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++)
+ if (reg_entry_isvalid(rto[rtx]))
+ dma_free_seg_table(rto[rtx]);
+
+ dma_free_cpu_table(rto);
+}
+
+static void dma_free_rs_table(unsigned long entry)
+{
+ unsigned long *rso = get_rf_rso(entry);
+ int rsx;
+
+ for (rsx = 0; rsx < ZPCI_TABLE_ENTRIES; rsx++)
+ if (reg_entry_isvalid(rso[rsx]))
+ dma_free_rt_table(rso[rsx]);
+
+ dma_free_cpu_table(rso);
+}
+
+static void dma_cleanup_tables(struct s390_domain *domain)
+{
+ int rtx, rsx, rfx;
+
+ if (!domain->dma_table)
return;
- for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++)
- if (reg_entry_isvalid(table[rtx]))
- dma_free_seg_table(table[rtx]);
+ switch (domain->origin_type) {
+ case ZPCI_TABLE_TYPE_RFX:
+ for (rfx = 0; rfx < ZPCI_TABLE_ENTRIES; rfx++)
+ if (reg_entry_isvalid(domain->dma_table[rfx]))
+ dma_free_rs_table(domain->dma_table[rfx]);
+ break;
+ case ZPCI_TABLE_TYPE_RSX:
+ for (rsx = 0; rsx < ZPCI_TABLE_ENTRIES; rsx++)
+ if (reg_entry_isvalid(domain->dma_table[rsx]))
+ dma_free_rt_table(domain->dma_table[rsx]);
+ break;
+ case ZPCI_TABLE_TYPE_RTX:
+ for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++)
+ if (reg_entry_isvalid(domain->dma_table[rtx]))
+ dma_free_seg_table(domain->dma_table[rtx]);
+ break;
+ default:
+ WARN_ONCE(1, "Invalid IOMMU table (%x)\n", domain->origin_type);
+ return;
+ }
- dma_free_cpu_table(table);
+ dma_free_cpu_table(domain->dma_table);
}
static unsigned long *dma_alloc_page_table(gfp_t gfp)
@@ -358,7 +415,7 @@ static void s390_iommu_rcu_free_domain(struct rcu_head *head)
{
struct s390_domain *s390_domain = container_of(head, struct s390_domain, rcu);
- dma_cleanup_tables(s390_domain->dma_table);
+ dma_cleanup_tables(s390_domain);
kfree(s390_domain);
}
--
2.48.1
* [PATCH v3 3/5] iommu/s390: support iova_to_phys for additional table regions
2025-02-28 21:44 [PATCH v3 0/5] iommu/s390: support additional table regions Matthew Rosato
2025-02-28 21:44 ` [PATCH v3 1/5] iommu/s390: set appropriate IOTA region type Matthew Rosato
2025-02-28 21:44 ` [PATCH v3 2/5] iommu/s390: support cleanup of additional table regions Matthew Rosato
@ 2025-02-28 21:44 ` Matthew Rosato
2025-03-11 14:19 ` Niklas Schnelle
2025-02-28 21:44 ` [PATCH v3 4/5] iommu/s390: support map/unmap " Matthew Rosato
2025-02-28 21:44 ` [PATCH v3 5/5] iommu/s390: allow larger region tables Matthew Rosato
4 siblings, 1 reply; 9+ messages in thread
From: Matthew Rosato @ 2025-02-28 21:44 UTC (permalink / raw)
To: joro, will, robin.murphy, gerald.schaefer, schnelle
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
The origin_type of the dma_table is used to determine how many table
levels must be traversed for the translation.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
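A worked example of the index extraction, as a standalone userspace sketch.
PAGE_SHIFT, ZPCI_TABLE_BITS and the value of ZPCI_INDEX_MASK are assumptions
here (picked to be consistent with a 42-bit region third span), and
make_iova() is a hypothetical helper that exists only for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12		/* assumed: 4 KiB pages */
#define ZPCI_PT_BITS    8
#define ZPCI_TABLE_BITS 11		/* assumed */
#define ZPCI_ST_SHIFT   (ZPCI_PT_BITS + PAGE_SHIFT)
#define ZPCI_RT_SHIFT   (ZPCI_ST_SHIFT + ZPCI_TABLE_BITS)
#define ZPCI_RS_SHIFT   (ZPCI_RT_SHIFT + ZPCI_TABLE_BITS)
#define ZPCI_RF_SHIFT   (ZPCI_RS_SHIFT + ZPCI_TABLE_BITS)
#define ZPCI_INDEX_MASK 0x7ffULL	/* assumed: 11 index bits */

/* Index into the region first table. */
unsigned int calc_rfx(uint64_t iova)
{
	return (unsigned int)((iova >> ZPCI_RF_SHIFT) & ZPCI_INDEX_MASK);
}

/* Index into the region second table. */
unsigned int calc_rsx(uint64_t iova)
{
	return (unsigned int)((iova >> ZPCI_RS_SHIFT) & ZPCI_INDEX_MASK);
}

/* Index into the region third table. */
unsigned int calc_rtx(uint64_t iova)
{
	return (unsigned int)((iova >> ZPCI_RT_SHIFT) & ZPCI_INDEX_MASK);
}

/* Hypothetical helper: build an IOVA from the three region indexes. */
uint64_t make_iova(uint64_t rfx, uint64_t rsx, uint64_t rtx)
{
	return (rfx << ZPCI_RF_SHIFT) | (rsx << ZPCI_RS_SHIFT) |
	       (rtx << ZPCI_RT_SHIFT);
}
```

With these values, the first address past a region third table (1 << 42)
yields rsx 1 and rtx 0, which is why a region second origin becomes
necessary at that point.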
arch/s390/include/asm/pci_dma.h | 2 ++
drivers/iommu/s390-iommu.c | 52 ++++++++++++++++++++++++++++++++-
2 files changed, 53 insertions(+), 1 deletion(-)
diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
index 42d7cc4262ca..8d8962e4fd58 100644
--- a/arch/s390/include/asm/pci_dma.h
+++ b/arch/s390/include/asm/pci_dma.h
@@ -55,6 +55,8 @@ enum zpci_ioat_dtype {
#define ZPCI_PT_BITS 8
#define ZPCI_ST_SHIFT (ZPCI_PT_BITS + PAGE_SHIFT)
#define ZPCI_RT_SHIFT (ZPCI_ST_SHIFT + ZPCI_TABLE_BITS)
+#define ZPCI_RS_SHIFT (ZPCI_RT_SHIFT + ZPCI_TABLE_BITS)
+#define ZPCI_RF_SHIFT (ZPCI_RS_SHIFT + ZPCI_TABLE_BITS)
#define ZPCI_RTE_FLAG_MASK 0x3fffUL
#define ZPCI_RTE_ADDR_MASK (~ZPCI_RTE_FLAG_MASK)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index f2cda0ce0fe9..0a6aad11c327 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -36,6 +36,16 @@ struct s390_domain {
static struct iommu_domain blocking_domain;
+static inline unsigned int calc_rfx(dma_addr_t ptr)
+{
+ return ((unsigned long)ptr >> ZPCI_RF_SHIFT) & ZPCI_INDEX_MASK;
+}
+
+static inline unsigned int calc_rsx(dma_addr_t ptr)
+{
+ return ((unsigned long)ptr >> ZPCI_RS_SHIFT) & ZPCI_INDEX_MASK;
+}
+
static inline unsigned int calc_rtx(dma_addr_t ptr)
{
return ((unsigned long)ptr >> ZPCI_RT_SHIFT) & ZPCI_INDEX_MASK;
@@ -759,6 +769,43 @@ static int s390_iommu_map_pages(struct iommu_domain *domain,
return rc;
}
+static unsigned long *get_rto_from_iova(struct s390_domain *domain,
+ dma_addr_t iova)
+{
+ unsigned long *rfo, *rso, *rto;
+ unsigned long rfe, rse;
+ unsigned int rfx, rsx;
+
+ switch (domain->origin_type) {
+ case ZPCI_TABLE_TYPE_RFX:
+ rfo = domain->dma_table;
+ goto itp_rf;
+ case ZPCI_TABLE_TYPE_RSX:
+ rso = domain->dma_table;
+ goto itp_rs;
+ case ZPCI_TABLE_TYPE_RTX:
+ return domain->dma_table;
+ default:
+ return NULL;
+ }
+
+itp_rf:
+ rfx = calc_rfx(iova);
+ rfe = READ_ONCE(rfo[rfx]);
+ if (!reg_entry_isvalid(rfe))
+ return NULL;
+ rso = get_rf_rso(rfe);
+
+itp_rs:
+ rsx = calc_rsx(iova);
+ rse = READ_ONCE(rso[rsx]);
+ if (!reg_entry_isvalid(rse))
+ return NULL;
+ rto = get_rs_rto(rse);
+
+ return rto;
+}
+
static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t iova)
{
@@ -772,10 +819,13 @@ static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
iova > domain->geometry.aperture_end)
return 0;
+ rto = get_rto_from_iova(s390_domain, iova);
+ if (!rto)
+ return 0;
+
rtx = calc_rtx(iova);
sx = calc_sx(iova);
px = calc_px(iova);
- rto = s390_domain->dma_table;
rte = READ_ONCE(rto[rtx]);
if (reg_entry_isvalid(rte)) {
--
2.48.1
* [PATCH v3 4/5] iommu/s390: support map/unmap for additional table regions
2025-02-28 21:44 [PATCH v3 0/5] iommu/s390: support additional table regions Matthew Rosato
` (2 preceding siblings ...)
2025-02-28 21:44 ` [PATCH v3 3/5] iommu/s390: support iova_to_phys for " Matthew Rosato
@ 2025-02-28 21:44 ` Matthew Rosato
2025-02-28 21:44 ` [PATCH v3 5/5] iommu/s390: allow larger region tables Matthew Rosato
4 siblings, 0 replies; 9+ messages in thread
From: Matthew Rosato @ 2025-02-28 21:44 UTC (permalink / raw)
To: joro, will, robin.murphy, gerald.schaefer, schnelle
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
Map and unmap ops use the shared dma_walk_cpu_trans routine. Update
this routine to use the origin_type of the dma_table to determine how
many table levels must be walked.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
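The allocate-then-install pattern used by the new table origin helpers can
be sketched in userspace as follows. This is an illustration only: it uses
C11 atomics instead of the kernel's cmpxchg(), bare pointers instead of the
encoded region table entries, and hypothetical names (get_or_install,
ENTRY_INVALID, ENTRIES).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define ENTRIES       8			/* toy table size */
#define ENTRY_INVALID ((uintptr_t)0)	/* stand-in for ZPCI_TABLE_INVALID */

/*
 * Return the lower-level table referenced by *slot, allocating and
 * installing a new one if the slot is still invalid.
 */
uintptr_t *get_or_install(_Atomic uintptr_t *slot)
{
	uintptr_t cur = atomic_load(slot);
	uintptr_t expected = ENTRY_INVALID;
	uintptr_t *fresh;

	if (cur != ENTRY_INVALID)
		return (uintptr_t *)cur;

	fresh = calloc(ENTRIES, sizeof(*fresh));
	if (!fresh)
		return NULL;

	if (!atomic_compare_exchange_strong(slot, &expected,
					    (uintptr_t)fresh)) {
		/* Another walker installed a table first: drop ours, use theirs. */
		free(fresh);
		return (uintptr_t *)expected;
	}
	return fresh;
}
```

The point of the compare-and-exchange is that two concurrent walkers can
both allocate, but only one table is ever published in the slot; the loser
frees its copy and continues with the winner's table.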
drivers/iommu/s390-iommu.c | 131 ++++++++++++++++++++++++++++++++++---
1 file changed, 123 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 0a6aad11c327..e6f9ce983a57 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -67,6 +67,20 @@ static inline void set_pt_pfaa(unsigned long *entry, phys_addr_t pfaa)
*entry |= (pfaa & ZPCI_PTE_ADDR_MASK);
}
+static inline void set_rf_rso(unsigned long *entry, phys_addr_t rso)
+{
+ *entry &= ZPCI_RTE_FLAG_MASK;
+ *entry |= (rso & ZPCI_RTE_ADDR_MASK);
+ *entry |= ZPCI_TABLE_TYPE_RFX;
+}
+
+static inline void set_rs_rto(unsigned long *entry, phys_addr_t rto)
+{
+ *entry &= ZPCI_RTE_FLAG_MASK;
+ *entry |= (rto & ZPCI_RTE_ADDR_MASK);
+ *entry |= ZPCI_TABLE_TYPE_RSX;
+}
+
static inline void set_rt_sto(unsigned long *entry, phys_addr_t sto)
{
*entry &= ZPCI_RTE_FLAG_MASK;
@@ -81,6 +95,22 @@ static inline void set_st_pto(unsigned long *entry, phys_addr_t pto)
*entry |= ZPCI_TABLE_TYPE_SX;
}
+static inline void validate_rf_entry(unsigned long *entry)
+{
+ *entry &= ~ZPCI_TABLE_VALID_MASK;
+ *entry &= ~ZPCI_TABLE_OFFSET_MASK;
+ *entry |= ZPCI_TABLE_VALID;
+ *entry |= ZPCI_TABLE_LEN_RFX;
+}
+
+static inline void validate_rs_entry(unsigned long *entry)
+{
+ *entry &= ~ZPCI_TABLE_VALID_MASK;
+ *entry &= ~ZPCI_TABLE_OFFSET_MASK;
+ *entry |= ZPCI_TABLE_VALID;
+ *entry |= ZPCI_TABLE_LEN_RSX;
+}
+
static inline void validate_rt_entry(unsigned long *entry)
{
*entry &= ~ZPCI_TABLE_VALID_MASK;
@@ -286,6 +316,60 @@ static unsigned long *dma_alloc_page_table(gfp_t gfp)
return table;
}
+static unsigned long *dma_get_rs_table_origin(unsigned long *rfep, gfp_t gfp)
+{
+ unsigned long old_rfe, rfe;
+ unsigned long *rso;
+
+ rfe = READ_ONCE(*rfep);
+ if (reg_entry_isvalid(rfe)) {
+ rso = get_rf_rso(rfe);
+ } else {
+ rso = dma_alloc_cpu_table(gfp);
+ if (!rso)
+ return NULL;
+
+ set_rf_rso(&rfe, virt_to_phys(rso));
+ validate_rf_entry(&rfe);
+ entry_clr_protected(&rfe);
+
+ old_rfe = cmpxchg(rfep, ZPCI_TABLE_INVALID, rfe);
+ if (old_rfe != ZPCI_TABLE_INVALID) {
+ /* Someone else was faster, use theirs */
+ dma_free_cpu_table(rso);
+ rso = get_rf_rso(old_rfe);
+ }
+ }
+ return rso;
+}
+
+static unsigned long *dma_get_rt_table_origin(unsigned long *rsep, gfp_t gfp)
+{
+ unsigned long old_rse, rse;
+ unsigned long *rto;
+
+ rse = READ_ONCE(*rsep);
+ if (reg_entry_isvalid(rse)) {
+ rto = get_rs_rto(rse);
+ } else {
+ rto = dma_alloc_cpu_table(gfp);
+ if (!rto)
+ return NULL;
+
+ set_rs_rto(&rse, virt_to_phys(rto));
+ validate_rs_entry(&rse);
+ entry_clr_protected(&rse);
+
+ old_rse = cmpxchg(rsep, ZPCI_TABLE_INVALID, rse);
+ if (old_rse != ZPCI_TABLE_INVALID) {
+ /* Someone else was faster, use theirs */
+ dma_free_cpu_table(rto);
+ rto = get_rs_rto(old_rse);
+ }
+ }
+ return rto;
+}
+
static unsigned long *dma_get_seg_table_origin(unsigned long *rtep, gfp_t gfp)
{
unsigned long old_rte, rte;
@@ -339,11 +423,45 @@ static unsigned long *dma_get_page_table_origin(unsigned long *step, gfp_t gfp)
return pto;
}
-static unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr, gfp_t gfp)
+static unsigned long *dma_walk_region_tables(struct s390_domain *domain,
+ dma_addr_t dma_addr, gfp_t gfp)
+{
+ unsigned long *rfo, *rso;
+ unsigned int rfx, rsx;
+
+ switch (domain->origin_type) {
+ case ZPCI_TABLE_TYPE_RFX:
+ rfo = domain->dma_table;
+ goto walk_rf;
+ case ZPCI_TABLE_TYPE_RSX:
+ rso = domain->dma_table;
+ goto walk_rs;
+ case ZPCI_TABLE_TYPE_RTX:
+ return domain->dma_table;
+ default:
+ return NULL;
+ }
+
+walk_rf:
+ rfx = calc_rfx(dma_addr);
+ rso = dma_get_rs_table_origin(&rfo[rfx], gfp);
+ if (!rso)
+ return NULL;
+walk_rs:
+ rsx = calc_rsx(dma_addr);
+ return dma_get_rt_table_origin(&rso[rsx], gfp);
+}
+
+static unsigned long *dma_walk_cpu_trans(struct s390_domain *domain,
+ dma_addr_t dma_addr, gfp_t gfp)
{
- unsigned long *sto, *pto;
+ unsigned long *rto, *sto, *pto;
unsigned int rtx, sx, px;
+ rto = dma_walk_region_tables(domain, dma_addr, gfp);
+ if (!rto)
+ return NULL;
+
rtx = calc_rtx(dma_addr);
sto = dma_get_seg_table_origin(&rto[rtx], gfp);
if (!sto)
@@ -690,8 +808,7 @@ static int s390_iommu_validate_trans(struct s390_domain *s390_domain,
int rc;
for (i = 0; i < nr_pages; i++) {
- entry = dma_walk_cpu_trans(s390_domain->dma_table, dma_addr,
- gfp);
+ entry = dma_walk_cpu_trans(s390_domain, dma_addr, gfp);
if (unlikely(!entry)) {
rc = -ENOMEM;
goto undo_cpu_trans;
@@ -706,8 +823,7 @@ static int s390_iommu_validate_trans(struct s390_domain *s390_domain,
undo_cpu_trans:
while (i-- > 0) {
dma_addr -= PAGE_SIZE;
- entry = dma_walk_cpu_trans(s390_domain->dma_table,
- dma_addr, gfp);
+ entry = dma_walk_cpu_trans(s390_domain, dma_addr, gfp);
if (!entry)
break;
dma_update_cpu_trans(entry, 0, ZPCI_PTE_INVALID);
@@ -724,8 +840,7 @@ static int s390_iommu_invalidate_trans(struct s390_domain *s390_domain,
int rc = 0;
for (i = 0; i < nr_pages; i++) {
- entry = dma_walk_cpu_trans(s390_domain->dma_table, dma_addr,
- GFP_ATOMIC);
+ entry = dma_walk_cpu_trans(s390_domain, dma_addr, GFP_ATOMIC);
if (unlikely(!entry)) {
rc = -EINVAL;
break;
--
2.48.1
* [PATCH v3 5/5] iommu/s390: allow larger region tables
2025-02-28 21:44 [PATCH v3 0/5] iommu/s390: support additional table regions Matthew Rosato
` (3 preceding siblings ...)
2025-02-28 21:44 ` [PATCH v3 4/5] iommu/s390: support map/unmap " Matthew Rosato
@ 2025-02-28 21:44 ` Matthew Rosato
4 siblings, 0 replies; 9+ messages in thread
From: Matthew Rosato @ 2025-02-28 21:44 UTC (permalink / raw)
To: joro, will, robin.murphy, gerald.schaefer, schnelle
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
Extend the aperture calculation to consider sizes beyond the maximum
size of a region third table. Attempt to always use the smallest
table size possible to avoid unnecessary extra steps during translation.
Update reserved region calculations to use the appropriate table size.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
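The origin type selection in this patch can be modelled in userspace as
below. This is a sketch under stated assumptions: DT_RS and DT_RF are
hypothetical capability bits standing in for the zdev->dtsm checks,
pick_origin() and the enum are made-up names, and start_dma is assumed to
lie below the region third limit so the subtractions do not wrap.

```c
#include <assert.h>
#include <stdint.h>

#define SIZE_RT (1ULL << 42)	/* mirrors ZPCI_TABLE_SIZE_RT */
#define SIZE_RS (1ULL << 53)	/* mirrors ZPCI_TABLE_SIZE_RS */
#define DT_RS   0x1u		/* hypothetical: region second supported */
#define DT_RF   0x2u		/* hypothetical: region first supported */

enum origin { ORIGIN_RTX, ORIGIN_RSX, ORIGIN_RFX };

/*
 * Pick the smallest table type that covers the aperture. If nothing
 * larger than a region third table is available, shrink the aperture.
 */
enum origin pick_origin(uint64_t start_dma, uint64_t *aperture_size,
			unsigned int dtsm)
{
	if (*aperture_size <= SIZE_RT - start_dma)
		return ORIGIN_RTX;
	if (*aperture_size <= SIZE_RS - start_dma && (dtsm & DT_RS))
		return ORIGIN_RSX;
	if (dtsm & DT_RF)
		return ORIGIN_RFX;

	/* Fall back to region third and clamp the aperture to fit. */
	*aperture_size = SIZE_RT - start_dma;
	return ORIGIN_RTX;
}
```

Preferring the smallest sufficient type keeps the translation walk as short
as possible, which is the stated goal of the commit message.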
arch/s390/include/asm/pci_dma.h | 1 +
drivers/iommu/s390-iommu.c | 71 ++++++++++++++++++++++++---------
2 files changed, 53 insertions(+), 19 deletions(-)
diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
index 8d8962e4fd58..d12e17201661 100644
--- a/arch/s390/include/asm/pci_dma.h
+++ b/arch/s390/include/asm/pci_dma.h
@@ -25,6 +25,7 @@ enum zpci_ioat_dtype {
#define ZPCI_KEY (PAGE_DEFAULT_KEY << 5)
#define ZPCI_TABLE_SIZE_RT (1UL << 42)
+#define ZPCI_TABLE_SIZE_RS (1UL << 53)
#define ZPCI_IOTA_STO_FLAG (ZPCI_IOTA_IOT_ENABLED | ZPCI_KEY | ZPCI_IOTA_DT_ST)
#define ZPCI_IOTA_RTTO_FLAG (ZPCI_IOTA_IOT_ENABLED | ZPCI_KEY | ZPCI_IOTA_DT_RT)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index e6f9ce983a57..e36a061ee6da 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -515,9 +515,25 @@ static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap)
}
}
+static inline u64 max_tbl_size(struct s390_domain *domain)
+{
+ switch (domain->origin_type) {
+ case ZPCI_TABLE_TYPE_RTX:
+ return ZPCI_TABLE_SIZE_RT - 1;
+ case ZPCI_TABLE_TYPE_RSX:
+ return ZPCI_TABLE_SIZE_RS - 1;
+ case ZPCI_TABLE_TYPE_RFX:
+ return U64_MAX;
+ default:
+ return 0;
+ }
+}
+
static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
{
+ struct zpci_dev *zdev = to_zpci_dev(dev);
struct s390_domain *s390_domain;
+ u64 aperture_size;
s390_domain = kzalloc(sizeof(*s390_domain), GFP_KERNEL);
if (!s390_domain)
@@ -528,10 +544,26 @@ static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
kfree(s390_domain);
return NULL;
}
+
+ aperture_size = min(s390_iommu_aperture,
+ zdev->end_dma - zdev->start_dma + 1);
+ if (aperture_size <= (ZPCI_TABLE_SIZE_RT - zdev->start_dma)) {
+ s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
+ } else if (aperture_size <= (ZPCI_TABLE_SIZE_RS - zdev->start_dma) &&
+ (zdev->dtsm & ZPCI_IOTA_DT_RS)) {
+ s390_domain->origin_type = ZPCI_TABLE_TYPE_RSX;
+ } else if (zdev->dtsm & ZPCI_IOTA_DT_RF) {
+ s390_domain->origin_type = ZPCI_TABLE_TYPE_RFX;
+ } else {
+ /* Assume RTX available */
+ s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
+ aperture_size = ZPCI_TABLE_SIZE_RT - zdev->start_dma;
+ }
+ zdev->end_dma = zdev->start_dma + aperture_size - 1;
+
s390_domain->domain.geometry.force_aperture = true;
s390_domain->domain.geometry.aperture_start = 0;
- s390_domain->domain.geometry.aperture_end = ZPCI_TABLE_SIZE_RT - 1;
- s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
+ s390_domain->domain.geometry.aperture_end = max_tbl_size(s390_domain);
spin_lock_init(&s390_domain->list_lock);
INIT_LIST_HEAD_RCU(&s390_domain->devices);
@@ -684,6 +716,9 @@ static void s390_iommu_get_resv_regions(struct device *dev,
{
struct zpci_dev *zdev = to_zpci_dev(dev);
struct iommu_resv_region *region;
+ struct s390_domain *s390_domain;
+ unsigned long flags;
+ u64 end_resv;
if (zdev->start_dma) {
region = iommu_alloc_resv_region(0, zdev->start_dma, 0,
@@ -693,14 +728,23 @@ static void s390_iommu_get_resv_regions(struct device *dev,
list_add_tail(&region->list, list);
}
- if (zdev->end_dma < ZPCI_TABLE_SIZE_RT - 1) {
- region = iommu_alloc_resv_region(zdev->end_dma + 1,
- ZPCI_TABLE_SIZE_RT - zdev->end_dma - 1,
- 0, IOMMU_RESV_RESERVED, GFP_KERNEL);
+ spin_lock_irqsave(&zdev->dom_lock, flags);
+ if (zdev->s390_domain->type == IOMMU_DOMAIN_BLOCKED ||
+ zdev->s390_domain->type == IOMMU_DOMAIN_IDENTITY)
+ goto out;
+
+ s390_domain = to_s390_domain(zdev->s390_domain);
+ if (zdev->end_dma < max_tbl_size(s390_domain)) {
+ end_resv = max_tbl_size(s390_domain) - zdev->end_dma;
+ region = iommu_alloc_resv_region(zdev->end_dma + 1, end_resv,
+ 0, IOMMU_RESV_RESERVED,
+ GFP_KERNEL);
if (!region)
- return;
+ goto out;
list_add_tail(&region->list, list);
}
+out:
+ spin_unlock_irqrestore(&zdev->dom_lock, flags);
}
static struct iommu_device *s390_iommu_probe_device(struct device *dev)
@@ -712,13 +756,9 @@ static struct iommu_device *s390_iommu_probe_device(struct device *dev)
zdev = to_zpci_dev(dev);
- if (zdev->start_dma > zdev->end_dma ||
- zdev->start_dma > ZPCI_TABLE_SIZE_RT - 1)
+ if (zdev->start_dma > zdev->end_dma)
return ERR_PTR(-EINVAL);
- if (zdev->end_dma > ZPCI_TABLE_SIZE_RT - 1)
- zdev->end_dma = ZPCI_TABLE_SIZE_RT - 1;
-
if (zdev->tlb_refresh)
dev->iommu->shadow_on_flush = 1;
@@ -995,7 +1035,6 @@ struct zpci_iommu_ctrs *zpci_get_iommu_ctrs(struct zpci_dev *zdev)
int zpci_init_iommu(struct zpci_dev *zdev)
{
- u64 aperture_size;
int rc = 0;
rc = iommu_device_sysfs_add(&zdev->iommu_dev, NULL, NULL,
@@ -1013,12 +1052,6 @@ int zpci_init_iommu(struct zpci_dev *zdev)
if (rc)
goto out_sysfs;
- zdev->start_dma = PAGE_ALIGN(zdev->start_dma);
- aperture_size = min3(s390_iommu_aperture,
- ZPCI_TABLE_SIZE_RT - zdev->start_dma,
- zdev->end_dma - zdev->start_dma + 1);
- zdev->end_dma = zdev->start_dma + aperture_size - 1;
-
return 0;
out_sysfs:
--
2.48.1
* Re: [PATCH v3 1/5] iommu/s390: set appropriate IOTA region type
2025-02-28 21:44 ` [PATCH v3 1/5] iommu/s390: set appropriate IOTA region type Matthew Rosato
@ 2025-03-11 11:49 ` Niklas Schnelle
0 siblings, 0 replies; 9+ messages in thread
From: Niklas Schnelle @ 2025-03-11 11:49 UTC (permalink / raw)
To: Matthew Rosato, joro, will, robin.murphy, gerald.schaefer
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
On Fri, 2025-02-28 at 16:44 -0500, Matthew Rosato wrote:
> When registering the I/O Translation Anchor, use the current table type
> stored in the s390_domain to set the appropriate region type
> indication. For the moment, the table type will always be stored as
> region third.
>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
> drivers/iommu/s390-iommu.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index e1c76e0f9c2b..cad032d4c9a6 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -31,6 +31,7 @@ struct s390_domain {
> unsigned long *dma_table;
> spinlock_t list_lock;
> struct rcu_head rcu;
> + u8 origin_type;
> };
>
> static struct iommu_domain blocking_domain;
> @@ -345,6 +346,7 @@ static struct iommu_domain *s390_domain_alloc_paging(struct device *dev)
> s390_domain->domain.geometry.force_aperture = true;
> s390_domain->domain.geometry.aperture_start = 0;
> s390_domain->domain.geometry.aperture_end = ZPCI_TABLE_SIZE_RT - 1;
> + s390_domain->origin_type = ZPCI_TABLE_TYPE_RTX;
>
> spin_lock_init(&s390_domain->list_lock);
> INIT_LIST_HEAD_RCU(&s390_domain->devices);
> @@ -381,6 +383,21 @@ static void zdev_s390_domain_update(struct zpci_dev *zdev,
> spin_unlock_irqrestore(&zdev->dom_lock, flags);
> }
>
> +static u64 get_iota_region_flag(struct s390_domain *domain)
> +{
> + switch (domain->origin_type) {
> + case ZPCI_TABLE_TYPE_RTX:
> + return ZPCI_IOTA_RTTO_FLAG;
> + case ZPCI_TABLE_TYPE_RSX:
> + return ZPCI_IOTA_RSTO_FLAG;
> + case ZPCI_TABLE_TYPE_RFX:
> + return ZPCI_IOTA_RFTO_FLAG;
> + default:
> + WARN_ONCE(1, "Invalid IOMMU table (%x)\n", domain->origin_type);
> + return 0;
> + }
> +}
> +
> static int s390_iommu_domain_reg_ioat(struct zpci_dev *zdev,
> struct iommu_domain *domain, u8 *status)
> {
> @@ -399,7 +416,7 @@ static int s390_iommu_domain_reg_ioat(struct zpci_dev *zdev,
> default:
> s390_domain = to_s390_domain(domain);
> iota = virt_to_phys(s390_domain->dma_table) |
> - ZPCI_IOTA_RTTO_FLAG;
> + get_iota_region_flag(s390_domain);
> rc = zpci_register_ioat(zdev, 0, zdev->start_dma,
> zdev->end_dma, iota, status);
> }
Thanks for moving the table type into struct s390_domain. I like how
this fits together with the identity domain support and the overall
responsibilities now!
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
* Re: [PATCH v3 2/5] iommu/s390: support cleanup of additional table regions
2025-02-28 21:44 ` [PATCH v3 2/5] iommu/s390: support cleanup of additional table regions Matthew Rosato
@ 2025-03-11 12:01 ` Niklas Schnelle
0 siblings, 0 replies; 9+ messages in thread
From: Niklas Schnelle @ 2025-03-11 12:01 UTC (permalink / raw)
To: Matthew Rosato, joro, will, robin.murphy, gerald.schaefer
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
On Fri, 2025-02-28 at 16:44 -0500, Matthew Rosato wrote:
> Extend the existing dma_cleanup_tables to also handle region second and
> region first tables.
>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
> drivers/iommu/s390-iommu.c | 71 ++++++++++++++++++++++++++++++++++----
> 1 file changed, 64 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index cad032d4c9a6..f2cda0ce0fe9 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -121,6 +121,22 @@ static inline int pt_entry_isvalid(unsigned long entry)
> return (entry & ZPCI_PTE_VALID_MASK) == ZPCI_PTE_VALID;
> }
>
>
--- snip ---
> +
> +static void dma_cleanup_tables(struct s390_domain *domain)
> +{
> + int rtx, rsx, rfx;
> +
> + if (!domain->dma_table)
> return;
>
> - for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++)
> - if (reg_entry_isvalid(table[rtx]))
> - dma_free_seg_table(table[rtx]);
> + switch (domain->origin_type) {
> + case ZPCI_TABLE_TYPE_RFX:
> + for (rfx = 0; rfx < ZPCI_TABLE_ENTRIES; rfx++)
> + if (reg_entry_isvalid(domain->dma_table[rfx]))
> + dma_free_rs_table(domain->dma_table[rfx]);
> + break;
> + case ZPCI_TABLE_TYPE_RSX:
> + for (rsx = 0; rsx < ZPCI_TABLE_ENTRIES; rsx++)
> + if (reg_entry_isvalid(domain->dma_table[rsx]))
> + dma_free_rt_table(domain->dma_table[rsx]);
> + break;
> + case ZPCI_TABLE_TYPE_RTX:
> + for (rtx = 0; rtx < ZPCI_TABLE_ENTRIES; rtx++)
> + if (reg_entry_isvalid(domain->dma_table[rtx]))
> + dma_free_seg_table(domain->dma_table[rtx]);
> + break;
> + default:
> + WARN_ONCE(1, "Invalid IOMMU table (%x)\n", domain->origin_type);
> + return;
> + }
>
> - dma_free_cpu_table(table);
> + dma_free_cpu_table(domain->dma_table);
> }
>
> static unsigned long *dma_alloc_page_table(gfp_t gfp)
> @@ -358,7 +415,7 @@ static void s390_iommu_rcu_free_domain(struct rcu_head *head)
> {
> struct s390_domain *s390_domain = container_of(head, struct s390_domain, rcu);
>
> - dma_cleanup_tables(s390_domain->dma_table);
> + dma_cleanup_tables(s390_domain);
> kfree(s390_domain);
> }
>
Looks good to me.
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
* Re: [PATCH v3 3/5] iommu/s390: support iova_to_phys for additional table regions
2025-02-28 21:44 ` [PATCH v3 3/5] iommu/s390: support iova_to_phys for " Matthew Rosato
@ 2025-03-11 14:19 ` Niklas Schnelle
0 siblings, 0 replies; 9+ messages in thread
From: Niklas Schnelle @ 2025-03-11 14:19 UTC (permalink / raw)
To: Matthew Rosato, joro, will, robin.murphy, gerald.schaefer
Cc: hca, gor, agordeev, svens, borntraeger, clg, iommu, linux-kernel,
linux-s390
On Fri, 2025-02-28 at 16:44 -0500, Matthew Rosato wrote:
> The origin_type of the dma_table is used to determine how many table
> levels must be traversed for the translation.
>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
> arch/s390/include/asm/pci_dma.h | 2 ++
> drivers/iommu/s390-iommu.c | 52 ++++++++++++++++++++++++++++++++-
> 2 files changed, 53 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
> index 42d7cc4262ca..8d8962e4fd58 100644
> --- a/arch/s390/include/asm/pci_dma.h
> +++ b/arch/s390/include/asm/pci_dma.h
> @@ -55,6 +55,8 @@ enum zpci_ioat_dtype {
> #define ZPCI_PT_BITS 8
> #define ZPCI_ST_SHIFT (ZPCI_PT_BITS + PAGE_SHIFT)
> #define ZPCI_RT_SHIFT (ZPCI_ST_SHIFT + ZPCI_TABLE_BITS)
> +#define ZPCI_RS_SHIFT (ZPCI_RT_SHIFT + ZPCI_TABLE_BITS)
> +#define ZPCI_RF_SHIFT (ZPCI_RS_SHIFT + ZPCI_TABLE_BITS)
>
> #define ZPCI_RTE_FLAG_MASK 0x3fffUL
> #define ZPCI_RTE_ADDR_MASK (~ZPCI_RTE_FLAG_MASK)
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index f2cda0ce0fe9..0a6aad11c327 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -36,6 +36,16 @@ struct s390_domain {
>
> static struct iommu_domain blocking_domain;
>
> +static inline unsigned int calc_rfx(dma_addr_t ptr)
> +{
> + return ((unsigned long)ptr >> ZPCI_RF_SHIFT) & ZPCI_INDEX_MASK;
> +}
> +
> +static inline unsigned int calc_rsx(dma_addr_t ptr)
> +{
> + return ((unsigned long)ptr >> ZPCI_RS_SHIFT) & ZPCI_INDEX_MASK;
> +}
> +
> static inline unsigned int calc_rtx(dma_addr_t ptr)
> {
> return ((unsigned long)ptr >> ZPCI_RT_SHIFT) & ZPCI_INDEX_MASK;
> @@ -759,6 +769,43 @@ static int s390_iommu_map_pages(struct iommu_domain *domain,
> return rc;
> }
>
> +static unsigned long *get_rto_from_iova(struct s390_domain *domain,
> + dma_addr_t iova)
> +{
> + unsigned long *rfo, *rso, *rto;
> + unsigned long rfe, rse;
> + unsigned int rfx, rsx;
> +
> + switch (domain->origin_type) {
> + case ZPCI_TABLE_TYPE_RFX:
> + rfo = domain->dma_table;
> + goto itp_rf;
> + case ZPCI_TABLE_TYPE_RSX:
> + rso = domain->dma_table;
> + goto itp_rs;
> + case ZPCI_TABLE_TYPE_RTX:
> + return domain->dma_table;
> + default:
> + return NULL;
> + }
> +
> +itp_rf:
> + rfx = calc_rfx(iova);
> + rfe = READ_ONCE(rfo[rfx]);
> + if (!reg_entry_isvalid(rfe))
> + return NULL;
> + rso = get_rf_rso(rfe);
> +
> +itp_rs:
> + rsx = calc_rsx(iova);
> + rse = READ_ONCE(rso[rsx]);
> + if (!reg_entry_isvalid(rse))
> + return NULL;
> + rto = get_rs_rto(rse);
> +
> + return rto;
> +}
I played around with re-organizing the above as the goto out of the
switch feels a bit cumbersome. One variant I came up with is a separate
get_rso_from_iova() function like below:
static unsigned long *get_rso_from_iova(struct s390_domain *domain,
dma_addr_t iova)
{
unsigned long *rfo;
unsigned long rfe;
unsigned int rfx;
switch (domain->origin_type) {
case ZPCI_TABLE_TYPE_RFX:
rfo = domain->dma_table;
rfx = calc_rfx(iova);
rfe = READ_ONCE(rfo[rfx]);
if (!reg_entry_isvalid(rfe))
return NULL;
return get_rf_rso(rfe);
case ZPCI_TABLE_TYPE_RSX:
return domain->dma_table;
default:
return NULL;
}
}
static unsigned long *get_rto_from_iova(struct s390_domain *domain,
dma_addr_t iova)
{
unsigned long *rso;
unsigned long rse;
unsigned int rsx;
switch (domain->origin_type) {
case ZPCI_TABLE_TYPE_RFX:
case ZPCI_TABLE_TYPE_RSX:
rso = get_rso_from_iova(domain, iova);
rsx = calc_rsx(iova);
rse = READ_ONCE(rso[rsx]);
if (!reg_entry_isvalid(rse))
return NULL;
return get_rs_rto(rse);
case ZPCI_TABLE_TYPE_RTX:
return domain->dma_table;
default:
return NULL;
}
}
I think this is slightly cleaner, but not by enough that I'd say we have
to do it this way, so I leave the choice to you.
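One detail if the split variant is used: get_rso_from_iova() can return
NULL for an invalid region first entry, so get_rto_from_iova() needs to
check rso before indexing it. A toy userspace sketch of that shape follows.
Entries are plain pointers here (0 meaning invalid), and every name in it
(struct domain, enum origin, rso_from_iova, rto_from_iova) is a stand-in
rather than one of the driver's helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES    2048		/* assumed slots per region table */
#define RS_SHIFT   42
#define RF_SHIFT   53
#define INDEX_MASK 0x7ffULL

enum origin { ORIGIN_RTX, ORIGIN_RSX, ORIGIN_RFX };

/* Toy domain: an entry is 0 (invalid) or the address of the next table. */
struct domain {
	enum origin origin_type;
	uintptr_t *table;
};

uintptr_t *rso_from_iova(const struct domain *d, uint64_t iova)
{
	uintptr_t rfe;

	switch (d->origin_type) {
	case ORIGIN_RFX:
		rfe = d->table[(iova >> RF_SHIFT) & INDEX_MASK];
		return rfe ? (uintptr_t *)rfe : NULL;
	case ORIGIN_RSX:
		return d->table;
	default:
		return NULL;
	}
}

uintptr_t *rto_from_iova(const struct domain *d, uint64_t iova)
{
	uintptr_t *rso;
	uintptr_t rse;

	switch (d->origin_type) {
	case ORIGIN_RFX:
	case ORIGIN_RSX:
		rso = rso_from_iova(d, iova);
		if (!rso)	/* invalid region first entry: stop here */
			return NULL;
		rse = rso[(iova >> RS_SHIFT) & INDEX_MASK];
		return rse ? (uintptr_t *)rse : NULL;
	case ORIGIN_RTX:
		return d->table;
	default:
		return NULL;
	}
}
```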
> +
> static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
> dma_addr_t iova)
> {
> @@ -772,10 +819,13 @@ static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
> iova > domain->geometry.aperture_end)
> return 0;
>
> + rto = get_rto_from_iova(s390_domain, iova);
> + if (!rto)
> + return 0;
> +
> rtx = calc_rtx(iova);
> sx = calc_sx(iova);
> px = calc_px(iova);
> - rto = s390_domain->dma_table;
>
> rte = READ_ONCE(rto[rtx]);
> if (reg_entry_isvalid(rte)) {
So, with or without my suggestion:
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>