linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area
@ 2024-08-06 23:31 Jason Gunthorpe
  2024-08-06 23:31 ` [PATCH v3 1/9] iommu/arm-smmu-v3: Use the new rb tree helpers Jason Gunthorpe
                   ` (11 more replies)
  0 siblings, 12 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

Will pointed out that two places referring to the CD/STE struct did not
get the new types. While auditing this code a few more oddities were
noticed. Based on a feedback from Mostafa and Nicolin a few more things
were fixed up too

- Use types for all the HW structures everywhere even for the L1
  descriptors that are just a single 8 bytes. This helps with clarity of
  what everthing is pointing at
- Use indexing helpers for the STE/CD two level calculations
- Use sizeof(struct X) instead of open coded math on constants. The sizeof
  naturally follows the type of the related variable in almost all cases
- Remove redundant dma_addr_t's and save some memory
- Remove redundant devm usage
- Use the modern rbtree API

Parts of this have been sitting in my tree for a while now, it grew a bit
since v1, but nothing is particularly profound here. Enough is merged now
that they can be cleanly based and are seperate from my other series.

v3:
 - Rebase to v6.11-rc2
 - Preserve the "2-level strtab only covers %u/%u bits of SID" without
   change
 - Vertically align some of the constants
 - Use u32 for the type of the index and sid
 - Fix missing * in le64_to_cpu() in interior patch
 - Bring back accidently lost "Use the new rb tree helpers" patch
v2: https://lore.kernel.org/r/0-v2-318ed5f6983b+198f-smmuv3_tidy_jgg@nvidia.com
 - Add a patch to add structs for the L1/L2 HW layouts and use their
   sizeof and type instead of constants and generic __le64 *.
 - Add a patch for L1/L2 indexing helpers for clarity
 - Reorder patches
 - Redo the union layout in the cfg for both cases
 - Fully remove some more defines
v1: https://lore.kernel.org/r/0-v1-1b720dce51d1+4f44-smmuv3_tidy_jgg@nvidia.com

Jason Gunthorpe (9):
  iommu/arm-smmu-v3: Use the new rb tree helpers
  iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()
  iommu/arm-smmu-v3: Add types for each level of the 2 level stream
    table
  iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg
  iommu/arm-smmu-v3: Remove strtab_base/cfg
  iommu/arm-smmu-v3: Do not use devm for the cd table allocations
  iommu/arm-smmu-v3: Shrink the cdtab l1_desc array
  iommu/arm-smmu-v3: Add types for each level of the CD table
  iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 355 +++++++++-----------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 100 ++++--
 2 files changed, 232 insertions(+), 223 deletions(-)


base-commit: de9c2c66ad8e787abec7c9d7eff4f8c3cdd28aed
-- 
2.46.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 1/9] iommu/arm-smmu-v3: Use the new rb tree helpers
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-09-06 13:23   ` Will Deacon
  2024-08-06 23:31 ` [PATCH v3 2/9] iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx() Jason Gunthorpe
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

Since v5.12 the rbtree has gained some simplifying helpers aimed at making
rb tree users write less convoluted boiler plate code. Instead the caller
provides a single comparison function and the helpers generate the prior
open-coded stuff.

Update smmu->streams to use rb_find_add() and rb_find().

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 68 ++++++++++-----------
 1 file changed, 31 insertions(+), 37 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a31460f9f3d421..b80f3359a8d12b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1696,26 +1696,37 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 	return 0;
 }
 
+static int arm_smmu_streams_cmp_key(const void *lhs, const struct rb_node *rhs)
+{
+	struct arm_smmu_stream *stream_rhs =
+		rb_entry(rhs, struct arm_smmu_stream, node);
+	const u32 *sid_lhs = lhs;
+
+	if (*sid_lhs < stream_rhs->id)
+		return -1;
+	if (*sid_lhs > stream_rhs->id)
+		return 1;
+	return 0;
+}
+
+static int arm_smmu_streams_cmp_node(struct rb_node *lhs,
+				     const struct rb_node *rhs)
+{
+	return arm_smmu_streams_cmp_key(
+		&rb_entry(lhs, struct arm_smmu_stream, node)->id, rhs);
+}
+
 static struct arm_smmu_master *
 arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
 {
 	struct rb_node *node;
-	struct arm_smmu_stream *stream;
 
 	lockdep_assert_held(&smmu->streams_mutex);
 
-	node = smmu->streams.rb_node;
-	while (node) {
-		stream = rb_entry(node, struct arm_smmu_stream, node);
-		if (stream->id < sid)
-			node = node->rb_right;
-		else if (stream->id > sid)
-			node = node->rb_left;
-		else
-			return stream->master;
-	}
-
-	return NULL;
+	node = rb_find(&sid, &smmu->streams, arm_smmu_streams_cmp_key);
+	if (!node)
+		return NULL;
+	return rb_entry(node, struct arm_smmu_stream, node)->master;
 }
 
 /* IRQ and event handlers */
@@ -3173,8 +3184,6 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
 {
 	int i;
 	int ret = 0;
-	struct arm_smmu_stream *new_stream, *cur_stream;
-	struct rb_node **new_node, *parent_node = NULL;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
 
 	master->streams = kcalloc(fwspec->num_ids, sizeof(*master->streams),
@@ -3185,9 +3194,9 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
 
 	mutex_lock(&smmu->streams_mutex);
 	for (i = 0; i < fwspec->num_ids; i++) {
+		struct arm_smmu_stream *new_stream = &master->streams[i];
 		u32 sid = fwspec->ids[i];
 
-		new_stream = &master->streams[i];
 		new_stream->id = sid;
 		new_stream->master = master;
 
@@ -3196,28 +3205,13 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
 			break;
 
 		/* Insert into SID tree */
-		new_node = &(smmu->streams.rb_node);
-		while (*new_node) {
-			cur_stream = rb_entry(*new_node, struct arm_smmu_stream,
-					      node);
-			parent_node = *new_node;
-			if (cur_stream->id > new_stream->id) {
-				new_node = &((*new_node)->rb_left);
-			} else if (cur_stream->id < new_stream->id) {
-				new_node = &((*new_node)->rb_right);
-			} else {
-				dev_warn(master->dev,
-					 "stream %u already in tree\n",
-					 cur_stream->id);
-				ret = -EINVAL;
-				break;
-			}
-		}
-		if (ret)
+		if (rb_find_add(&new_stream->node, &smmu->streams,
+				arm_smmu_streams_cmp_node)) {
+			dev_warn(master->dev, "stream %u already in tree\n",
+				 sid);
+			ret = -EINVAL;
 			break;
-
-		rb_link_node(&new_stream->node, parent_node, new_node);
-		rb_insert_color(&new_stream->node, &smmu->streams);
+		}
 	}
 
 	if (ret) {
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 2/9] iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
  2024-08-06 23:31 ` [PATCH v3 1/9] iommu/arm-smmu-v3: Use the new rb tree helpers Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-09-06 13:18   ` Will Deacon
  2024-08-06 23:31 ` [PATCH v3 3/9] iommu/arm-smmu-v3: Add types for each level of the 2 level stream table Jason Gunthorpe
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

Don't open code the calculations of the indexes for each level, provide
two functions to do that math and call them in all the places. Update all
the places computing indexes.

Calculate the L1 table size directly based on the max required index from
the cap. Remove STRTAB_L1_SZ_SHIFT in favour of STRTAB_NUM_L2_STES.

Use STRTAB_NUM_L2_STES to replace remaining open coded 1 << STRTAB_SPLIT.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 51 +++++++++------------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 14 +++++-
 2 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b80f3359a8d12b..a695e5f8fc2880 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1670,20 +1670,17 @@ static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab,
 
 static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 {
-	size_t size;
-	void *strtab;
 	dma_addr_t l2ptr_dma;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
-	struct arm_smmu_strtab_l1_desc *desc = &cfg->l1_desc[sid >> STRTAB_SPLIT];
+	struct arm_smmu_strtab_l1_desc *desc =
+		&cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)];
 
 	if (desc->l2ptr)
 		return 0;
 
-	size = 1 << (STRTAB_SPLIT + ilog2(STRTAB_STE_DWORDS) + 3);
-	strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS];
-
-	desc->l2ptr = dmam_alloc_coherent(smmu->dev, size, &l2ptr_dma,
-					  GFP_KERNEL);
+	desc->l2ptr = dmam_alloc_coherent(
+		smmu->dev, STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste),
+		&l2ptr_dma, GFP_KERNEL);
 	if (!desc->l2ptr) {
 		dev_err(smmu->dev,
 			"failed to allocate l2 stream table for SID %u\n",
@@ -1691,8 +1688,9 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 		return -ENOMEM;
 	}
 
-	arm_smmu_init_initial_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
-	arm_smmu_write_strtab_l1_desc(strtab, l2ptr_dma);
+	arm_smmu_init_initial_stes(desc->l2ptr, STRTAB_NUM_L2_STES);
+	arm_smmu_write_strtab_l1_desc(&cfg->strtab[arm_smmu_strtab_l1_idx(sid)],
+				      l2ptr_dma);
 	return 0;
 }
 
@@ -2449,12 +2447,9 @@ arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
 
 	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
-		unsigned int idx1, idx2;
-
 		/* Two-level walk */
-		idx1 = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
-		idx2 = sid & ((1 << STRTAB_SPLIT) - 1);
-		return &cfg->l1_desc[idx1].l2ptr[idx2];
+		return &cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)]
+				.l2ptr[arm_smmu_strtab_l2_idx(sid)];
 	} else {
 		/* Simple linear lookup */
 		return (struct arm_smmu_ste *)&cfg
@@ -3158,12 +3153,10 @@ struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle *fwnode)
 
 static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
 {
-	unsigned long limit = smmu->strtab_cfg.num_l1_ents;
-
 	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
-		limit *= 1UL << STRTAB_SPLIT;
-
-	return sid < limit;
+		return arm_smmu_strtab_l1_idx(sid) <
+		       smmu->strtab_cfg.num_l1_ents;
+	return sid < smmu->strtab_cfg.num_l1_ents;
 }
 
 static int arm_smmu_init_sid_strtab(struct arm_smmu_device *smmu, u32 sid)
@@ -3602,19 +3595,18 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 {
 	void *strtab;
 	u64 reg;
-	u32 size, l1size;
+	u32 l1size;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+	unsigned int last_sid_idx =
+		arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1);
 
 	/* Calculate the L1 size, capped to the SIDSIZE. */
-	size = STRTAB_L1_SZ_SHIFT - (ilog2(STRTAB_L1_DESC_DWORDS) + 3);
-	size = min(size, smmu->sid_bits - STRTAB_SPLIT);
-	cfg->num_l1_ents = 1 << size;
-
-	size += STRTAB_SPLIT;
-	if (size < smmu->sid_bits)
+	cfg->num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
+	if (cfg->num_l1_ents <= last_sid_idx)
 		dev_warn(smmu->dev,
 			 "2-level strtab only covers %u/%u bits of SID\n",
-			 size, smmu->sid_bits);
+			 ilog2(cfg->num_l1_ents * STRTAB_NUM_L2_STES),
+			 smmu->sid_bits);
 
 	l1size = cfg->num_l1_ents * (STRTAB_L1_DESC_DWORDS << 3);
 	strtab = dmam_alloc_coherent(smmu->dev, l1size, &cfg->strtab_dma,
@@ -3629,7 +3621,8 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 
 	/* Configure strtab_base_cfg for 2 levels */
 	reg  = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_2LVL);
-	reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, size);
+	reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE,
+			  ilog2(cfg->num_l1_ents) + STRTAB_SPLIT);
 	reg |= FIELD_PREP(STRTAB_BASE_CFG_SPLIT, STRTAB_SPLIT);
 	cfg->strtab_base_cfg = reg;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 14bca41a981b43..45ec88c9aa355f 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -202,7 +202,6 @@
  * 2lvl: 128k L1 entries,
  *       256 lazy entries per table (each table covers a PCI bus)
  */
-#define STRTAB_L1_SZ_SHIFT		20
 #define STRTAB_SPLIT			8
 
 #define STRTAB_L1_DESC_DWORDS		1
@@ -215,6 +214,19 @@ struct arm_smmu_ste {
 	__le64 data[STRTAB_STE_DWORDS];
 };
 
+#define STRTAB_NUM_L2_STES		(1 << STRTAB_SPLIT)
+#define STRTAB_MAX_L1_ENTRIES		(1 << 17)
+
+static inline u32 arm_smmu_strtab_l1_idx(u32 sid)
+{
+	return sid / STRTAB_NUM_L2_STES;
+}
+
+static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
+{
+	return sid % STRTAB_NUM_L2_STES;
+}
+
 #define STRTAB_STE_0_V			(1UL << 0)
 #define STRTAB_STE_0_CFG		GENMASK_ULL(3, 1)
 #define STRTAB_STE_0_CFG_ABORT		0
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 3/9] iommu/arm-smmu-v3: Add types for each level of the 2 level stream table
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
  2024-08-06 23:31 ` [PATCH v3 1/9] iommu/arm-smmu-v3: Use the new rb tree helpers Jason Gunthorpe
  2024-08-06 23:31 ` [PATCH v3 2/9] iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx() Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-08-06 23:31 ` [PATCH v3 4/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg Jason Gunthorpe
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

Add types struct arm_smmu_strtab_l1 and l2 to represent the HW layout of
the descriptors, and use them in most places, following patches will get
the remaing places. The size of the l1 and l2 HW allocations are
sizeof(struct arm_smmu_strtab_l1/2).

This provides some more clarity than having raw __le64 *'s and sizes
computed via macros.

Remove STRTAB_L1_DESC_DWORDS.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 +++++++++++----------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 10 ++++++++--
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a695e5f8fc2880..7a2d102fc11eb0 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1458,7 +1458,8 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
 }
 
 /* Stream table manipulation functions */
-static void arm_smmu_write_strtab_l1_desc(__le64 *dst, dma_addr_t l2ptr_dma)
+static void arm_smmu_write_strtab_l1_desc(struct arm_smmu_strtab_l1 *dst,
+					  dma_addr_t l2ptr_dma)
 {
 	u64 val = 0;
 
@@ -1466,7 +1467,7 @@ static void arm_smmu_write_strtab_l1_desc(__le64 *dst, dma_addr_t l2ptr_dma)
 	val |= l2ptr_dma & STRTAB_L1_DESC_L2PTR_MASK;
 
 	/* The HW has 64 bit atomicity with stores to the L2 STE table */
-	WRITE_ONCE(*dst, cpu_to_le64(val));
+	WRITE_ONCE(dst->l2ptr, cpu_to_le64(val));
 }
 
 struct arm_smmu_ste_writer {
@@ -1678,9 +1679,8 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 	if (desc->l2ptr)
 		return 0;
 
-	desc->l2ptr = dmam_alloc_coherent(
-		smmu->dev, STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste),
-		&l2ptr_dma, GFP_KERNEL);
+	desc->l2ptr = dmam_alloc_coherent(smmu->dev, sizeof(*desc->l2ptr),
+					  &l2ptr_dma, GFP_KERNEL);
 	if (!desc->l2ptr) {
 		dev_err(smmu->dev,
 			"failed to allocate l2 stream table for SID %u\n",
@@ -1688,9 +1688,11 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 		return -ENOMEM;
 	}
 
-	arm_smmu_init_initial_stes(desc->l2ptr, STRTAB_NUM_L2_STES);
-	arm_smmu_write_strtab_l1_desc(&cfg->strtab[arm_smmu_strtab_l1_idx(sid)],
-				      l2ptr_dma);
+	arm_smmu_init_initial_stes(desc->l2ptr->stes, STRTAB_NUM_L2_STES);
+	arm_smmu_write_strtab_l1_desc(
+		(struct arm_smmu_strtab_l1 *)&cfg
+			->strtab[arm_smmu_strtab_l1_idx(sid)],
+		l2ptr_dma);
 	return 0;
 }
 
@@ -2449,7 +2451,7 @@ arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
 		/* Two-level walk */
 		return &cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)]
-				.l2ptr[arm_smmu_strtab_l2_idx(sid)];
+				.l2ptr->stes[arm_smmu_strtab_l2_idx(sid)];
 	} else {
 		/* Simple linear lookup */
 		return (struct arm_smmu_ste *)&cfg
@@ -3608,7 +3610,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 			 ilog2(cfg->num_l1_ents * STRTAB_NUM_L2_STES),
 			 smmu->sid_bits);
 
-	l1size = cfg->num_l1_ents * (STRTAB_L1_DESC_DWORDS << 3);
+	l1size = cfg->num_l1_ents * sizeof(struct arm_smmu_strtab_l1);
 	strtab = dmam_alloc_coherent(smmu->dev, l1size, &cfg->strtab_dma,
 				     GFP_KERNEL);
 	if (!strtab) {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 45ec88c9aa355f..3a8a459f899fd8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -204,7 +204,6 @@
  */
 #define STRTAB_SPLIT			8
 
-#define STRTAB_L1_DESC_DWORDS		1
 #define STRTAB_L1_DESC_SPAN		GENMASK_ULL(4, 0)
 #define STRTAB_L1_DESC_L2PTR_MASK	GENMASK_ULL(51, 6)
 
@@ -215,6 +214,13 @@ struct arm_smmu_ste {
 };
 
 #define STRTAB_NUM_L2_STES		(1 << STRTAB_SPLIT)
+struct arm_smmu_strtab_l2 {
+	struct arm_smmu_ste stes[STRTAB_NUM_L2_STES];
+};
+
+struct arm_smmu_strtab_l1 {
+	__le64 l2ptr;
+};
 #define STRTAB_MAX_L1_ENTRIES		(1 << 17)
 
 static inline u32 arm_smmu_strtab_l1_idx(u32 sid)
@@ -597,7 +603,7 @@ struct arm_smmu_priq {
 
 /* High-level stream table and context descriptor structures */
 struct arm_smmu_strtab_l1_desc {
-	struct arm_smmu_ste		*l2ptr;
+	struct arm_smmu_strtab_l2	*l2ptr;
 };
 
 struct arm_smmu_ctx_desc {
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 4/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2024-08-06 23:31 ` [PATCH v3 3/9] iommu/arm-smmu-v3: Add types for each level of the 2 level stream table Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-09-06 13:19   ` Will Deacon
  2024-08-06 23:31 ` [PATCH v3 5/9] iommu/arm-smmu-v3: Remove strtab_base/cfg Jason Gunthorpe
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

The members here are being used for both the linear and the 2 level case,
with the meaning of each item slightly different in the two cases.

Split it into a clean union where both cases have their own struct with
their own logical names and correct types.

Adjust all the users to detect linear/2lvl and use the right sub structure
and types consistently.

Remove STRTAB_STE_DWORDS by changing the last places to use
sizeof(struct arm_smmu_ste).

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 77 ++++++++++-----------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 26 +++----
 2 files changed, 50 insertions(+), 53 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7a2d102fc11eb0..9b1f947102a554 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1673,26 +1673,25 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 {
 	dma_addr_t l2ptr_dma;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
-	struct arm_smmu_strtab_l1_desc *desc =
-		&cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)];
+	struct arm_smmu_strtab_l2 **l2table =
+		&cfg->l2.l2ptrs[arm_smmu_strtab_l1_idx(sid)];
 
-	if (desc->l2ptr)
+	if (*l2table)
 		return 0;
 
-	desc->l2ptr = dmam_alloc_coherent(smmu->dev, sizeof(*desc->l2ptr),
-					  &l2ptr_dma, GFP_KERNEL);
-	if (!desc->l2ptr) {
+	*l2table = dmam_alloc_coherent(smmu->dev, sizeof(**l2table),
+				       &l2ptr_dma, GFP_KERNEL);
+	if (!*l2table) {
 		dev_err(smmu->dev,
 			"failed to allocate l2 stream table for SID %u\n",
 			sid);
 		return -ENOMEM;
 	}
 
-	arm_smmu_init_initial_stes(desc->l2ptr->stes, STRTAB_NUM_L2_STES);
+	arm_smmu_init_initial_stes((*l2table)->stes,
+				   ARRAY_SIZE((*l2table)->stes));
 	arm_smmu_write_strtab_l1_desc(
-		(struct arm_smmu_strtab_l1 *)&cfg
-			->strtab[arm_smmu_strtab_l1_idx(sid)],
-		l2ptr_dma);
+		&cfg->l2.l1tab[arm_smmu_strtab_l1_idx(sid)], l2ptr_dma);
 	return 0;
 }
 
@@ -2450,12 +2449,11 @@ arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 
 	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
 		/* Two-level walk */
-		return &cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)]
-				.l2ptr->stes[arm_smmu_strtab_l2_idx(sid)];
+		return &cfg->l2.l2ptrs[arm_smmu_strtab_l1_idx(sid)]
+				->stes[arm_smmu_strtab_l2_idx(sid)];
 	} else {
 		/* Simple linear lookup */
-		return (struct arm_smmu_ste *)&cfg
-			       ->strtab[sid * STRTAB_STE_DWORDS];
+		return &cfg->linear.table[sid];
 	}
 }
 
@@ -3157,8 +3155,8 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
 {
 	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
 		return arm_smmu_strtab_l1_idx(sid) <
-		       smmu->strtab_cfg.num_l1_ents;
-	return sid < smmu->strtab_cfg.num_l1_ents;
+		       smmu->strtab_cfg.l2.num_l1_ents;
+	return sid < smmu->strtab_cfg.linear.num_ents;
 }
 
 static int arm_smmu_init_sid_strtab(struct arm_smmu_device *smmu, u32 sid)
@@ -3595,7 +3593,6 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
 
 static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 {
-	void *strtab;
 	u64 reg;
 	u32 l1size;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
@@ -3603,34 +3600,33 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 		arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1);
 
 	/* Calculate the L1 size, capped to the SIDSIZE. */
-	cfg->num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
-	if (cfg->num_l1_ents <= last_sid_idx)
+	cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
+	if (cfg->l2.num_l1_ents <= last_sid_idx)
 		dev_warn(smmu->dev,
 			 "2-level strtab only covers %u/%u bits of SID\n",
-			 ilog2(cfg->num_l1_ents * STRTAB_NUM_L2_STES),
+			 ilog2(cfg->l2.num_l1_ents * STRTAB_NUM_L2_STES),
 			 smmu->sid_bits);
 
-	l1size = cfg->num_l1_ents * sizeof(struct arm_smmu_strtab_l1);
-	strtab = dmam_alloc_coherent(smmu->dev, l1size, &cfg->strtab_dma,
-				     GFP_KERNEL);
-	if (!strtab) {
+	l1size = cfg->l2.num_l1_ents * sizeof(struct arm_smmu_strtab_l1);
+	cfg->l2.l1tab = dmam_alloc_coherent(smmu->dev, l1size, &cfg->l2.l1_dma,
+					    GFP_KERNEL);
+	if (!cfg->l2.l1tab) {
 		dev_err(smmu->dev,
 			"failed to allocate l1 stream table (%u bytes)\n",
 			l1size);
 		return -ENOMEM;
 	}
-	cfg->strtab = strtab;
 
 	/* Configure strtab_base_cfg for 2 levels */
 	reg  = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_2LVL);
 	reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE,
-			  ilog2(cfg->num_l1_ents) + STRTAB_SPLIT);
+			  ilog2(cfg->l2.num_l1_ents) + STRTAB_SPLIT);
 	reg |= FIELD_PREP(STRTAB_BASE_CFG_SPLIT, STRTAB_SPLIT);
 	cfg->strtab_base_cfg = reg;
 
-	cfg->l1_desc = devm_kcalloc(smmu->dev, cfg->num_l1_ents,
-				    sizeof(*cfg->l1_desc), GFP_KERNEL);
-	if (!cfg->l1_desc)
+	cfg->l2.l2ptrs = devm_kcalloc(smmu->dev, cfg->l2.num_l1_ents,
+				      sizeof(*cfg->l2.l2ptrs), GFP_KERNEL);
+	if (!cfg->l2.l2ptrs)
 		return -ENOMEM;
 
 	return 0;
@@ -3638,29 +3634,27 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 
 static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
 {
-	void *strtab;
 	u64 reg;
 	u32 size;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
 
-	size = (1 << smmu->sid_bits) * (STRTAB_STE_DWORDS << 3);
-	strtab = dmam_alloc_coherent(smmu->dev, size, &cfg->strtab_dma,
-				     GFP_KERNEL);
-	if (!strtab) {
+	size = (1 << smmu->sid_bits) * sizeof(struct arm_smmu_ste);
+	cfg->linear.table = dmam_alloc_coherent(
+		smmu->dev, size, &cfg->linear.ste_dma, GFP_KERNEL);
+	if (!cfg->linear.table) {
 		dev_err(smmu->dev,
 			"failed to allocate linear stream table (%u bytes)\n",
 			size);
 		return -ENOMEM;
 	}
-	cfg->strtab = strtab;
-	cfg->num_l1_ents = 1 << smmu->sid_bits;
+	cfg->linear.num_ents = 1 << smmu->sid_bits;
 
 	/* Configure strtab_base_cfg for a linear table covering all SIDs */
 	reg  = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_LINEAR);
 	reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
 	cfg->strtab_base_cfg = reg;
 
-	arm_smmu_init_initial_stes(strtab, cfg->num_l1_ents);
+	arm_smmu_init_initial_stes(cfg->linear.table, cfg->linear.num_ents);
 	return 0;
 }
 
@@ -3669,16 +3663,17 @@ static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
 	u64 reg;
 	int ret;
 
-	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
+	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
 		ret = arm_smmu_init_strtab_2lvl(smmu);
-	else
+		reg = smmu->strtab_cfg.l2.l1_dma & STRTAB_BASE_ADDR_MASK;
+	} else {
 		ret = arm_smmu_init_strtab_linear(smmu);
-
+		reg = smmu->strtab_cfg.linear.ste_dma & STRTAB_BASE_ADDR_MASK;
+	}
 	if (ret)
 		return ret;
 
 	/* Set the strtab base address */
-	reg  = smmu->strtab_cfg.strtab_dma & STRTAB_BASE_ADDR_MASK;
 	reg |= STRTAB_BASE_RA;
 	smmu->strtab_cfg.strtab_base = reg;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3a8a459f899fd8..18e85fc936876b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -207,10 +207,8 @@
 #define STRTAB_L1_DESC_SPAN		GENMASK_ULL(4, 0)
 #define STRTAB_L1_DESC_L2PTR_MASK	GENMASK_ULL(51, 6)
 
-#define STRTAB_STE_DWORDS		8
-
 struct arm_smmu_ste {
-	__le64 data[STRTAB_STE_DWORDS];
+	__le64 data[8];
 };
 
 #define STRTAB_NUM_L2_STES		(1 << STRTAB_SPLIT)
@@ -602,10 +600,6 @@ struct arm_smmu_priq {
 };
 
 /* High-level stream table and context descriptor structures */
-struct arm_smmu_strtab_l1_desc {
-	struct arm_smmu_strtab_l2	*l2ptr;
-};
-
 struct arm_smmu_ctx_desc {
 	u16				asid;
 };
@@ -638,11 +632,19 @@ struct arm_smmu_s2_cfg {
 };
 
 struct arm_smmu_strtab_cfg {
-	__le64				*strtab;
-	dma_addr_t			strtab_dma;
-	struct arm_smmu_strtab_l1_desc	*l1_desc;
-	unsigned int			num_l1_ents;
-
+	union {
+		struct {
+			struct arm_smmu_ste *table;
+			dma_addr_t ste_dma;
+			unsigned int num_ents;
+		} linear;
+		struct {
+			struct arm_smmu_strtab_l1 *l1tab;
+			struct arm_smmu_strtab_l2 **l2ptrs;
+			dma_addr_t l1_dma;
+			unsigned int num_l1_ents;
+		} l2;
+	};
 	u64				strtab_base;
 	u32				strtab_base_cfg;
 };
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 5/9] iommu/arm-smmu-v3: Remove strtab_base/cfg
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2024-08-06 23:31 ` [PATCH v3 4/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-08-06 23:31 ` [PATCH v3 6/9] iommu/arm-smmu-v3: Do not use devm for the cd table allocations Jason Gunthorpe
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

These values can be computed from the other values already stored in the
config. Move the calculation to arm_smmu_write_strtab() and do it directly
before writing the registers.

This moves all the logic to calculate the two registers into one function
from three and saves an unimportant 16 bytes from the arm_smmu_device.

Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 55 ++++++++++-----------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 -
 2 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9b1f947102a554..95bd4a36268c00 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3593,7 +3593,6 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
 
 static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 {
-	u64 reg;
 	u32 l1size;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
 	unsigned int last_sid_idx =
@@ -3617,13 +3616,6 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 		return -ENOMEM;
 	}
 
-	/* Configure strtab_base_cfg for 2 levels */
-	reg  = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_2LVL);
-	reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE,
-			  ilog2(cfg->l2.num_l1_ents) + STRTAB_SPLIT);
-	reg |= FIELD_PREP(STRTAB_BASE_CFG_SPLIT, STRTAB_SPLIT);
-	cfg->strtab_base_cfg = reg;
-
 	cfg->l2.l2ptrs = devm_kcalloc(smmu->dev, cfg->l2.num_l1_ents,
 				      sizeof(*cfg->l2.l2ptrs), GFP_KERNEL);
 	if (!cfg->l2.l2ptrs)
@@ -3634,7 +3626,6 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 
 static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
 {
-	u64 reg;
 	u32 size;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
 
@@ -3649,34 +3640,21 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
 	}
 	cfg->linear.num_ents = 1 << smmu->sid_bits;
 
-	/* Configure strtab_base_cfg for a linear table covering all SIDs */
-	reg  = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_LINEAR);
-	reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
-	cfg->strtab_base_cfg = reg;
-
 	arm_smmu_init_initial_stes(cfg->linear.table, cfg->linear.num_ents);
 	return 0;
 }
 
 static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
 {
-	u64 reg;
 	int ret;
 
-	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
+	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
 		ret = arm_smmu_init_strtab_2lvl(smmu);
-		reg = smmu->strtab_cfg.l2.l1_dma & STRTAB_BASE_ADDR_MASK;
-	} else {
+	else
 		ret = arm_smmu_init_strtab_linear(smmu);
-		reg = smmu->strtab_cfg.linear.ste_dma & STRTAB_BASE_ADDR_MASK;
-	}
 	if (ret)
 		return ret;
 
-	/* Set the strtab base address */
-	reg |= STRTAB_BASE_RA;
-	smmu->strtab_cfg.strtab_base = reg;
-
 	ida_init(&smmu->vmid_map);
 
 	return 0;
@@ -3885,6 +3863,30 @@ static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
 	return ret;
 }
 
+static void arm_smmu_write_strtab(struct arm_smmu_device *smmu)
+{
+	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+	dma_addr_t dma;
+	u32 reg;
+
+	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
+		reg = FIELD_PREP(STRTAB_BASE_CFG_FMT,
+				 STRTAB_BASE_CFG_FMT_2LVL) |
+		      FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE,
+				 ilog2(cfg->l2.num_l1_ents) + STRTAB_SPLIT) |
+		      FIELD_PREP(STRTAB_BASE_CFG_SPLIT, STRTAB_SPLIT);
+		dma = cfg->l2.l1_dma;
+	} else {
+		reg = FIELD_PREP(STRTAB_BASE_CFG_FMT,
+				 STRTAB_BASE_CFG_FMT_LINEAR) |
+		      FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
+		dma = cfg->linear.ste_dma;
+	}
+	writeq_relaxed((dma & STRTAB_BASE_ADDR_MASK) | STRTAB_BASE_RA,
+		       smmu->base + ARM_SMMU_STRTAB_BASE);
+	writel_relaxed(reg, smmu->base + ARM_SMMU_STRTAB_BASE_CFG);
+}
+
 static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
 {
 	int ret;
@@ -3920,10 +3922,7 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	writel_relaxed(reg, smmu->base + ARM_SMMU_CR2);
 
 	/* Stream table */
-	writeq_relaxed(smmu->strtab_cfg.strtab_base,
-		       smmu->base + ARM_SMMU_STRTAB_BASE);
-	writel_relaxed(smmu->strtab_cfg.strtab_base_cfg,
-		       smmu->base + ARM_SMMU_STRTAB_BASE_CFG);
+	arm_smmu_write_strtab(smmu);
 
 	/* Command queue */
 	writeq_relaxed(smmu->cmdq.q.q_base, smmu->base + ARM_SMMU_CMDQ_BASE);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 18e85fc936876b..be931616d9a27e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -645,8 +645,6 @@ struct arm_smmu_strtab_cfg {
 			unsigned int num_l1_ents;
 		} l2;
 	};
-	u64				strtab_base;
-	u32				strtab_base_cfg;
 };
 
 /* An SMMUv3 instance */
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 6/9] iommu/arm-smmu-v3: Do not use devm for the cd table allocations
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2024-08-06 23:31 ` [PATCH v3 5/9] iommu/arm-smmu-v3: Remove strtab_base/cfg Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-08-06 23:31 ` [PATCH v3 7/9] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array Jason Gunthorpe
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

The master->cd_table is entirely contained within the struct
arm_smmu_master which is guaranteed to be freed by the core code under
arm_smmu_release_device().

There is no reason to use devm here, arm_smmu_free_cd_tables() is reliably
called to free the CD related memory. Remove it and save some memory.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 30 ++++++++++-----------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 95bd4a36268c00..e6607d9d590c4d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1184,8 +1184,8 @@ static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
 {
 	size_t size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
 
-	l1_desc->l2ptr = dmam_alloc_coherent(smmu->dev, size,
-					     &l1_desc->l2ptr_dma, GFP_KERNEL);
+	l1_desc->l2ptr = dma_alloc_coherent(smmu->dev, size,
+					    &l1_desc->l2ptr_dma, GFP_KERNEL);
 	if (!l1_desc->l2ptr) {
 		dev_warn(smmu->dev,
 			 "failed to allocate context descriptor table\n");
@@ -1399,17 +1399,17 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
 		cd_table->num_l1_ents = DIV_ROUND_UP(max_contexts,
 						  CTXDESC_L2_ENTRIES);
 
-		cd_table->l1_desc = devm_kcalloc(smmu->dev, cd_table->num_l1_ents,
-					      sizeof(*cd_table->l1_desc),
-					      GFP_KERNEL);
+		cd_table->l1_desc = kcalloc(cd_table->num_l1_ents,
+					    sizeof(*cd_table->l1_desc),
+					    GFP_KERNEL);
 		if (!cd_table->l1_desc)
 			return -ENOMEM;
 
 		l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
 	}
 
-	cd_table->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cd_table->cdtab_dma,
-					   GFP_KERNEL);
+	cd_table->cdtab = dma_alloc_coherent(smmu->dev, l1size,
+					     &cd_table->cdtab_dma, GFP_KERNEL);
 	if (!cd_table->cdtab) {
 		dev_warn(smmu->dev, "failed to allocate context descriptor\n");
 		ret = -ENOMEM;
@@ -1420,7 +1420,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
 
 err_free_l1:
 	if (cd_table->l1_desc) {
-		devm_kfree(smmu->dev, cd_table->l1_desc);
+		kfree(cd_table->l1_desc);
 		cd_table->l1_desc = NULL;
 	}
 	return ret;
@@ -1440,21 +1440,19 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
 			if (!cd_table->l1_desc[i].l2ptr)
 				continue;
 
-			dmam_free_coherent(smmu->dev, size,
-					   cd_table->l1_desc[i].l2ptr,
-					   cd_table->l1_desc[i].l2ptr_dma);
+			dma_free_coherent(smmu->dev, size,
+					  cd_table->l1_desc[i].l2ptr,
+					  cd_table->l1_desc[i].l2ptr_dma);
 		}
-		devm_kfree(smmu->dev, cd_table->l1_desc);
-		cd_table->l1_desc = NULL;
+		kfree(cd_table->l1_desc);
 
 		l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
 	} else {
 		l1size = cd_table->num_l1_ents * (CTXDESC_CD_DWORDS << 3);
 	}
 
-	dmam_free_coherent(smmu->dev, l1size, cd_table->cdtab, cd_table->cdtab_dma);
-	cd_table->cdtab_dma = 0;
-	cd_table->cdtab = NULL;
+	dma_free_coherent(smmu->dev, l1size, cd_table->cdtab,
+			  cd_table->cdtab_dma);
 }
 
 /* Stream table manipulation functions */
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 7/9] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2024-08-06 23:31 ` [PATCH v3 6/9] iommu/arm-smmu-v3: Do not use devm for the cd table allocations Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-09-06 13:21   ` Will Deacon
  2024-08-06 23:31 ` [PATCH v3 8/9] iommu/arm-smmu-v3: Add types for each level of the CD table Jason Gunthorpe
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

The top of the 2 level CD table is (at most) 1024 entries big, and two
high order allocations are required. One of __le64 which is programmed
into the HW (8k) and one of struct arm_smmu_l1_ctx_desc which holds the
CPU pointer (16k).

There are two copies of the l2ptr_dma, one is stored in the struct
arm_smmu_l1_ctx_desc, and another is encoded in the __le64 for the HW to
use. Instead of storing two copies just decode the value from the __le64.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 +++++++++------------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
 2 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e6607d9d590c4d..d23a845ad82223 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1179,31 +1179,19 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
 	arm_smmu_cmdq_batch_submit(smmu, &cmds);
 }
 
-static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
-					struct arm_smmu_l1_ctx_desc *l1_desc)
+static void arm_smmu_write_cd_l1_desc(__le64 *dst, dma_addr_t l2ptr_dma)
 {
-	size_t size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
-
-	l1_desc->l2ptr = dma_alloc_coherent(smmu->dev, size,
-					    &l1_desc->l2ptr_dma, GFP_KERNEL);
-	if (!l1_desc->l2ptr) {
-		dev_warn(smmu->dev,
-			 "failed to allocate context descriptor table\n");
-		return -ENOMEM;
-	}
-	return 0;
-}
-
-static void arm_smmu_write_cd_l1_desc(__le64 *dst,
-				      struct arm_smmu_l1_ctx_desc *l1_desc)
-{
-	u64 val = (l1_desc->l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) |
-		  CTXDESC_L1_DESC_V;
+	u64 val = (l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) | CTXDESC_L1_DESC_V;
 
 	/* The HW has 64 bit atomicity with stores to the L2 CD table */
 	WRITE_ONCE(*dst, cpu_to_le64(val));
 }
 
+static dma_addr_t arm_smmu_cd_l1_get_desc(const __le64 *src)
+{
+	return le64_to_cpu(*src) & CTXDESC_L1_DESC_L2PTR_MASK;
+}
+
 struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
 					u32 ssid)
 {
@@ -1243,13 +1231,17 @@ static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
 
 		l1_desc = &cd_table->l1_desc[idx];
 		if (!l1_desc->l2ptr) {
-			__le64 *l1ptr;
+			dma_addr_t l2ptr_dma;
 
-			if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
+			l1_desc->l2ptr = dma_alloc_coherent(
+				smmu->dev,
+				CTXDESC_L2_ENTRIES * sizeof(struct arm_smmu_cd),
+				&l2ptr_dma, GFP_KERNEL);
+			if (!l1_desc->l2ptr)
 				return NULL;
 
-			l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
-			arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
+			arm_smmu_write_cd_l1_desc(&cd_table->cdtab[idx],
+						  l2ptr_dma);
 			/* An invalid L1CD can be cached */
 			arm_smmu_sync_cd(master, ssid, false);
 		}
@@ -1442,7 +1434,8 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
 
 			dma_free_coherent(smmu->dev, size,
 					  cd_table->l1_desc[i].l2ptr,
-					  cd_table->l1_desc[i].l2ptr_dma);
+					  arm_smmu_cd_l1_get_desc(
+						  &cd_table->cdtab[i]));
 		}
 		kfree(cd_table->l1_desc);
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index be931616d9a27e..68841ba3c2e789 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -606,7 +606,6 @@ struct arm_smmu_ctx_desc {
 
 struct arm_smmu_l1_ctx_desc {
 	struct arm_smmu_cd		*l2ptr;
-	dma_addr_t			l2ptr_dma;
 };
 
 struct arm_smmu_ctx_desc_cfg {
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 8/9] iommu/arm-smmu-v3: Add types for each level of the CD table
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2024-08-06 23:31 ` [PATCH v3 7/9] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-08-06 23:31 ` [PATCH v3 9/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg Jason Gunthorpe
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

As well as indexing helpers arm_smmu_cdtab_l1/2_idx().

Remove CTXDESC_L1_DESC_DWORDS and CTXDESC_CD_DWORDS replacing them all
with type specific calculations.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 46 +++++++++++----------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 27 +++++++++---
 2 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d23a845ad82223..70b37d7f0e245d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1179,17 +1179,18 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
 	arm_smmu_cmdq_batch_submit(smmu, &cmds);
 }
 
-static void arm_smmu_write_cd_l1_desc(__le64 *dst, dma_addr_t l2ptr_dma)
+static void arm_smmu_write_cd_l1_desc(struct arm_smmu_cdtab_l1 *dst,
+				      dma_addr_t l2ptr_dma)
 {
 	u64 val = (l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) | CTXDESC_L1_DESC_V;
 
 	/* The HW has 64 bit atomicity with stores to the L2 CD table */
-	WRITE_ONCE(*dst, cpu_to_le64(val));
+	WRITE_ONCE(dst->l2ptr, cpu_to_le64(val));
 }
 
-static dma_addr_t arm_smmu_cd_l1_get_desc(const __le64 *src)
+static dma_addr_t arm_smmu_cd_l1_get_desc(const struct arm_smmu_cdtab_l1 *src)
 {
-	return le64_to_cpu(*src) & CTXDESC_L1_DESC_L2PTR_MASK;
+	return le64_to_cpu(src->l2ptr) & CTXDESC_L1_DESC_L2PTR_MASK;
 }
 
 struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
@@ -1202,13 +1203,12 @@ struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
 		return NULL;
 
 	if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
-		return (struct arm_smmu_cd *)(cd_table->cdtab +
-					      ssid * CTXDESC_CD_DWORDS);
+		return &((struct arm_smmu_cd *)cd_table->cdtab)[ssid];
 
-	l1_desc = &cd_table->l1_desc[ssid / CTXDESC_L2_ENTRIES];
+	l1_desc = &cd_table->l1_desc[arm_smmu_cdtab_l1_idx(ssid)];
 	if (!l1_desc->l2ptr)
 		return NULL;
-	return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES];
+	return &l1_desc->l2ptr->cds[arm_smmu_cdtab_l2_idx(ssid)];
 }
 
 static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
@@ -1226,7 +1226,7 @@ static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
 	}
 
 	if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_64K_L2) {
-		unsigned int idx = ssid / CTXDESC_L2_ENTRIES;
+		unsigned int idx = arm_smmu_cdtab_l1_idx(ssid);
 		struct arm_smmu_l1_ctx_desc *l1_desc;
 
 		l1_desc = &cd_table->l1_desc[idx];
@@ -1240,8 +1240,10 @@ static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
 			if (!l1_desc->l2ptr)
 				return NULL;
 
-			arm_smmu_write_cd_l1_desc(&cd_table->cdtab[idx],
-						  l2ptr_dma);
+			arm_smmu_write_cd_l1_desc(
+				&((struct arm_smmu_cdtab_l1 *)
+					  cd_table->cdtab)[idx],
+				l2ptr_dma);
 			/* An invalid L1CD can be cached */
 			arm_smmu_sync_cd(master, ssid, false);
 		}
@@ -1385,7 +1387,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
 		cd_table->s1fmt = STRTAB_STE_0_S1FMT_LINEAR;
 		cd_table->num_l1_ents = max_contexts;
 
-		l1size = max_contexts * (CTXDESC_CD_DWORDS << 3);
+		l1size = max_contexts * sizeof(struct arm_smmu_cd);
 	} else {
 		cd_table->s1fmt = STRTAB_STE_0_S1FMT_64K_L2;
 		cd_table->num_l1_ents = DIV_ROUND_UP(max_contexts,
@@ -1397,7 +1399,8 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
 		if (!cd_table->l1_desc)
 			return -ENOMEM;
 
-		l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
+		l1size = cd_table->num_l1_ents *
+			 sizeof(struct arm_smmu_cdtab_l1);
 	}
 
 	cd_table->cdtab = dma_alloc_coherent(smmu->dev, l1size,
@@ -1421,27 +1424,28 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
 static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
 {
 	int i;
-	size_t size, l1size;
+	size_t l1size;
 	struct arm_smmu_device *smmu = master->smmu;
 	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
 
 	if (cd_table->l1_desc) {
-		size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
-
 		for (i = 0; i < cd_table->num_l1_ents; i++) {
 			if (!cd_table->l1_desc[i].l2ptr)
 				continue;
 
-			dma_free_coherent(smmu->dev, size,
+			dma_free_coherent(smmu->dev,
+					  sizeof(*cd_table->l1_desc[i].l2ptr),
 					  cd_table->l1_desc[i].l2ptr,
-					  arm_smmu_cd_l1_get_desc(
-						  &cd_table->cdtab[i]));
+					  arm_smmu_cd_l1_get_desc(&(
+						  (struct arm_smmu_cdtab_l1 *)
+							  cd_table->cdtab)[i]));
 		}
 		kfree(cd_table->l1_desc);
 
-		l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3);
+		l1size = cd_table->num_l1_ents *
+			 sizeof(struct arm_smmu_cdtab_l1);
 	} else {
-		l1size = cd_table->num_l1_ents * (CTXDESC_CD_DWORDS << 3);
+		l1size = cd_table->num_l1_ents * sizeof(struct arm_smmu_cd);
 	}
 
 	dma_free_coherent(smmu->dev, l1size, cd_table->cdtab,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 68841ba3c2e789..681804a3f86bec 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -296,16 +296,31 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
  */
 #define CTXDESC_L2_ENTRIES		1024
 
-#define CTXDESC_L1_DESC_DWORDS		1
 #define CTXDESC_L1_DESC_V		(1UL << 0)
 #define CTXDESC_L1_DESC_L2PTR_MASK	GENMASK_ULL(51, 12)
 
-#define CTXDESC_CD_DWORDS		8
-
 struct arm_smmu_cd {
-	__le64 data[CTXDESC_CD_DWORDS];
+	__le64 data[8];
 };
 
+struct arm_smmu_cdtab_l2 {
+	struct arm_smmu_cd cds[CTXDESC_L2_ENTRIES];
+};
+
+struct arm_smmu_cdtab_l1 {
+	__le64 l2ptr;
+};
+
+static inline unsigned int arm_smmu_cdtab_l1_idx(unsigned int ssid)
+{
+	return ssid / CTXDESC_L2_ENTRIES;
+}
+
+static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
+{
+	return ssid % CTXDESC_L2_ENTRIES;
+}
+
 #define CTXDESC_CD_0_TCR_T0SZ		GENMASK_ULL(5, 0)
 #define CTXDESC_CD_0_TCR_TG0		GENMASK_ULL(7, 6)
 #define CTXDESC_CD_0_TCR_IRGN0		GENMASK_ULL(9, 8)
@@ -336,7 +351,7 @@ struct arm_smmu_cd {
  * When the SMMU only supports linear context descriptor tables, pick a
  * reasonable size limit (64kB).
  */
-#define CTXDESC_LINEAR_CDMAX		ilog2(SZ_64K / (CTXDESC_CD_DWORDS << 3))
+#define CTXDESC_LINEAR_CDMAX ilog2(SZ_64K / sizeof(struct arm_smmu_cd))
 
 /* Command queue */
 #define CMDQ_ENT_SZ_SHIFT		4
@@ -605,7 +620,7 @@ struct arm_smmu_ctx_desc {
 };
 
 struct arm_smmu_l1_ctx_desc {
-	struct arm_smmu_cd		*l2ptr;
+	struct arm_smmu_cdtab_l2	*l2ptr;
 };
 
 struct arm_smmu_ctx_desc_cfg {
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v3 9/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2024-08-06 23:31 ` [PATCH v3 8/9] iommu/arm-smmu-v3: Add types for each level of the CD table Jason Gunthorpe
@ 2024-08-06 23:31 ` Jason Gunthorpe
  2024-09-06 13:22   ` Will Deacon
  2024-09-05 19:25 ` [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2024-08-06 23:31 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

The members here are being used for both the linear and the 2 level case,
with the meaning of each item slightly different in the two cases.

Split it into a clean union where both cases have their own struct with
their own logical names and correct types.

Adjust all the users to detect linear/2lvl and use the right sub structure
and types consistently.

Remove CTXDESC_CD_DWORDS by changing the last places to use
sizeof(struct arm_smmu_cd).

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 119 +++++++++-----------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  24 ++--
 2 files changed, 71 insertions(+), 72 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 70b37d7f0e245d..e5db5325f7eaed 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1196,19 +1196,19 @@ static dma_addr_t arm_smmu_cd_l1_get_desc(const struct arm_smmu_cdtab_l1 *src)
 struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
 					u32 ssid)
 {
-	struct arm_smmu_l1_ctx_desc *l1_desc;
+	struct arm_smmu_cdtab_l2 *l2;
 	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
 
-	if (!cd_table->cdtab)
+	if (!arm_smmu_cdtab_allocated(cd_table))
 		return NULL;
 
 	if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
-		return &((struct arm_smmu_cd *)cd_table->cdtab)[ssid];
+		return &cd_table->linear.table[ssid];
 
-	l1_desc = &cd_table->l1_desc[arm_smmu_cdtab_l1_idx(ssid)];
-	if (!l1_desc->l2ptr)
+	l2 = cd_table->l2.l2ptrs[arm_smmu_cdtab_l1_idx(ssid)];
+	if (!l2)
 		return NULL;
-	return &l1_desc->l2ptr->cds[arm_smmu_cdtab_l2_idx(ssid)];
+	return &l2->cds[arm_smmu_cdtab_l2_idx(ssid)];
 }
 
 static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
@@ -1220,30 +1220,25 @@ static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
 	might_sleep();
 	iommu_group_mutex_assert(master->dev);
 
-	if (!cd_table->cdtab) {
+	if (!arm_smmu_cdtab_allocated(cd_table)) {
 		if (arm_smmu_alloc_cd_tables(master))
 			return NULL;
 	}
 
 	if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_64K_L2) {
 		unsigned int idx = arm_smmu_cdtab_l1_idx(ssid);
-		struct arm_smmu_l1_ctx_desc *l1_desc;
+		struct arm_smmu_cdtab_l2 **l2ptr = &cd_table->l2.l2ptrs[idx];
 
-		l1_desc = &cd_table->l1_desc[idx];
-		if (!l1_desc->l2ptr) {
+		if (!*l2ptr) {
 			dma_addr_t l2ptr_dma;
 
-			l1_desc->l2ptr = dma_alloc_coherent(
-				smmu->dev,
-				CTXDESC_L2_ENTRIES * sizeof(struct arm_smmu_cd),
-				&l2ptr_dma, GFP_KERNEL);
-			if (!l1_desc->l2ptr)
+			*l2ptr = dma_alloc_coherent(smmu->dev, sizeof(**l2ptr),
+						    &l2ptr_dma, GFP_KERNEL);
+			if (!*l2ptr)
 				return NULL;
 
-			arm_smmu_write_cd_l1_desc(
-				&((struct arm_smmu_cdtab_l1 *)
-					  cd_table->cdtab)[idx],
-				l2ptr_dma);
+			arm_smmu_write_cd_l1_desc(&cd_table->l2.l1tab[idx],
+						  l2ptr_dma);
 			/* An invalid L1CD can be cached */
 			arm_smmu_sync_cd(master, ssid, false);
 		}
@@ -1363,7 +1358,7 @@ void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid)
 	struct arm_smmu_cd target = {};
 	struct arm_smmu_cd *cdptr;
 
-	if (!master->cd_table.cdtab)
+	if (!arm_smmu_cdtab_allocated(&master->cd_table))
 		return;
 	cdptr = arm_smmu_get_cd_ptr(master, ssid);
 	if (WARN_ON(!cdptr))
@@ -1373,8 +1368,6 @@ void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid)
 
 static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
 {
-	int ret;
-	size_t l1size;
 	size_t max_contexts;
 	struct arm_smmu_device *smmu = master->smmu;
 	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
@@ -1385,71 +1378,67 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
 	if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) ||
 	    max_contexts <= CTXDESC_L2_ENTRIES) {
 		cd_table->s1fmt = STRTAB_STE_0_S1FMT_LINEAR;
-		cd_table->num_l1_ents = max_contexts;
+		cd_table->linear.num_ents = max_contexts;
 
-		l1size = max_contexts * sizeof(struct arm_smmu_cd);
+		cd_table->linear.table = dma_alloc_coherent(
+			smmu->dev, max_contexts * sizeof(struct arm_smmu_cd),
+			&cd_table->cdtab_dma, GFP_KERNEL);
+		if (!cd_table->linear.table)
+			return -ENOMEM;
 	} else {
 		cd_table->s1fmt = STRTAB_STE_0_S1FMT_64K_L2;
-		cd_table->num_l1_ents = DIV_ROUND_UP(max_contexts,
-						  CTXDESC_L2_ENTRIES);
+		cd_table->l2.num_l1_ents =
+			DIV_ROUND_UP(max_contexts, CTXDESC_L2_ENTRIES);
 
-		cd_table->l1_desc = kcalloc(cd_table->num_l1_ents,
-					    sizeof(*cd_table->l1_desc),
-					    GFP_KERNEL);
-		if (!cd_table->l1_desc)
+		cd_table->l2.l2ptrs = kcalloc(cd_table->l2.num_l1_ents,
+					     sizeof(*cd_table->l2.l2ptrs),
+					     GFP_KERNEL);
+		if (!cd_table->l2.l2ptrs)
 			return -ENOMEM;
 
-		l1size = cd_table->num_l1_ents *
-			 sizeof(struct arm_smmu_cdtab_l1);
+		cd_table->l2.l1tab = dma_alloc_coherent(
+			smmu->dev,
+			cd_table->l2.num_l1_ents *
+				sizeof(struct arm_smmu_cdtab_l1),
+			&cd_table->cdtab_dma, GFP_KERNEL);
+		if (!cd_table->l2.l1tab) {
+			kfree(cd_table->l2.l2ptrs);
+			cd_table->l2.l2ptrs = NULL;
+			return -ENOMEM;
+		}
 	}
-
-	cd_table->cdtab = dma_alloc_coherent(smmu->dev, l1size,
-					     &cd_table->cdtab_dma, GFP_KERNEL);
-	if (!cd_table->cdtab) {
-		dev_warn(smmu->dev, "failed to allocate context descriptor\n");
-		ret = -ENOMEM;
-		goto err_free_l1;
-	}
-
 	return 0;
-
-err_free_l1:
-	if (cd_table->l1_desc) {
-		kfree(cd_table->l1_desc);
-		cd_table->l1_desc = NULL;
-	}
-	return ret;
 }
 
 static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
 {
 	int i;
-	size_t l1size;
 	struct arm_smmu_device *smmu = master->smmu;
 	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
 
-	if (cd_table->l1_desc) {
-		for (i = 0; i < cd_table->num_l1_ents; i++) {
-			if (!cd_table->l1_desc[i].l2ptr)
+	if (cd_table->s1fmt != STRTAB_STE_0_S1FMT_LINEAR) {
+		for (i = 0; i < cd_table->l2.num_l1_ents; i++) {
+			if (!cd_table->l2.l2ptrs[i])
 				continue;
 
 			dma_free_coherent(smmu->dev,
-					  sizeof(*cd_table->l1_desc[i].l2ptr),
-					  cd_table->l1_desc[i].l2ptr,
-					  arm_smmu_cd_l1_get_desc(&(
-						  (struct arm_smmu_cdtab_l1 *)
-							  cd_table->cdtab)[i]));
+					  sizeof(*cd_table->l2.l2ptrs[i]),
+					  cd_table->l2.l2ptrs[i],
+					  arm_smmu_cd_l1_get_desc(
+						  &cd_table->l2.l1tab[i]));
 		}
-		kfree(cd_table->l1_desc);
+		kfree(cd_table->l2.l2ptrs);
 
-		l1size = cd_table->num_l1_ents *
-			 sizeof(struct arm_smmu_cdtab_l1);
+		dma_free_coherent(smmu->dev,
+				  cd_table->l2.num_l1_ents *
+					  sizeof(struct arm_smmu_cdtab_l1),
+				  cd_table->l2.l1tab, cd_table->cdtab_dma);
 	} else {
-		l1size = cd_table->num_l1_ents * sizeof(struct arm_smmu_cd);
+		dma_free_coherent(smmu->dev,
+				  cd_table->linear.num_ents *
+					  sizeof(struct arm_smmu_cd),
+				  cd_table->linear.table, cd_table->cdtab_dma);
 	}
-
-	dma_free_coherent(smmu->dev, l1size, cd_table->cdtab,
-			  cd_table->cdtab_dma);
 }
 
 /* Stream table manipulation functions */
@@ -3299,7 +3288,7 @@ static void arm_smmu_release_device(struct device *dev)
 
 	arm_smmu_disable_pasid(master);
 	arm_smmu_remove_master(master);
-	if (master->cd_table.cdtab)
+	if (arm_smmu_cdtab_allocated(&master->cd_table))
 		arm_smmu_free_cd_tables(master);
 	kfree(master);
 }
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 681804a3f86bec..8851a7abb5f0f3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -619,15 +619,19 @@ struct arm_smmu_ctx_desc {
 	u16				asid;
 };
 
-struct arm_smmu_l1_ctx_desc {
-	struct arm_smmu_cdtab_l2	*l2ptr;
-};
-
 struct arm_smmu_ctx_desc_cfg {
-	__le64				*cdtab;
+	union {
+		struct {
+			struct arm_smmu_cd *table;
+			unsigned int num_ents;
+		} linear;
+		struct {
+			struct arm_smmu_cdtab_l1 *l1tab;
+			struct arm_smmu_cdtab_l2 **l2ptrs;
+			unsigned int num_l1_ents;
+		} l2;
+	};
 	dma_addr_t			cdtab_dma;
-	struct arm_smmu_l1_ctx_desc	*l1_desc;
-	unsigned int			num_l1_ents;
 	unsigned int			used_ssids;
 	u8				in_ste;
 	u8				s1fmt;
@@ -635,6 +639,12 @@ struct arm_smmu_ctx_desc_cfg {
 	u8				s1cdmax;
 };
 
+static inline bool
+arm_smmu_cdtab_allocated(struct arm_smmu_ctx_desc_cfg *cfg)
+{
+	return cfg->linear.table || cfg->l2.l1tab;
+}
+
 /* True if the cd table has SSIDS > 0 in use. */
 static inline bool arm_smmu_ssids_in_use(struct arm_smmu_ctx_desc_cfg *cd_table)
 {
-- 
2.46.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2024-08-06 23:31 ` [PATCH v3 9/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg Jason Gunthorpe
@ 2024-09-05 19:25 ` Jason Gunthorpe
  2024-09-05 20:10 ` Nicolin Chen
  2024-09-06 14:35 ` Will Deacon
  11 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-09-05 19:25 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Tue, Aug 06, 2024 at 08:31:14PM -0300, Jason Gunthorpe wrote:

> Jason Gunthorpe (9):
>   iommu/arm-smmu-v3: Use the new rb tree helpers
>   iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()
>   iommu/arm-smmu-v3: Add types for each level of the 2 level stream
>     table
>   iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg
>   iommu/arm-smmu-v3: Remove strtab_base/cfg
>   iommu/arm-smmu-v3: Do not use devm for the cd table allocations
>   iommu/arm-smmu-v3: Shrink the cdtab l1_desc array
>   iommu/arm-smmu-v3: Add types for each level of the CD table
>   iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg

Will? Is there something you are waiting on here?

Thanks,
Jason


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (9 preceding siblings ...)
  2024-09-05 19:25 ` [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
@ 2024-09-05 20:10 ` Nicolin Chen
  2024-09-06 14:35 ` Will Deacon
  11 siblings, 0 replies; 21+ messages in thread
From: Nicolin Chen @ 2024-09-05 20:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, patches, Ryan Roberts, Mostafa Saleh

On Tue, Aug 06, 2024 at 08:31:14PM -0300, Jason Gunthorpe wrote:
> Will pointed out that two places referring to the CD/STE struct did not
> get the new types. While auditing this code a few more oddities were
> noticed. Based on a feedback from Mostafa and Nicolin a few more things
> were fixed up too
> 
> - Use types for all the HW structures everywhere even for the L1
>   descriptors that are just a single 8 bytes. This helps with clarity of
>   what everthing is pointing at
> - Use indexing helpers for the STE/CD two level calculations
> - Use sizeof(struct X) instead of open coded math on constants. The sizeof
>   naturally follows the type of the related variable in almost all cases
> - Remove redundant dma_addr_t's and save some memory
> - Remove redundant devm usage
> - Use the modern rbtree API
> 
> Parts of this have been sitting in my tree for a while now, it grew a bit
> since v1, but nothing is particularly profound here. Enough is merged now
> that they can be cleanly based and are seperate from my other series.
> 
> v3:
>  - Rebase to v6.11-rc2
>  - Preserve the "2-level strtab only covers %u/%u bits of SID" without
>    change
>  - Vertically align some of the constants
>  - Use u32 for the type of the index and sid
>  - Fix missing * in le64_to_cpu() in interior patch
>  - Bring back accidently lost "Use the new rb tree helpers" patch

I didn't exclusively test this series but it has been included in
my nesting branch for a while. Considering 2-stage configurations
and vSVA cases are involved during my testings, all my Tested-bys
still stand.

Thanks
Nicolin


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 2/9] iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()
  2024-08-06 23:31 ` [PATCH v3 2/9] iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx() Jason Gunthorpe
@ 2024-09-06 13:18   ` Will Deacon
  2024-09-06 14:40     ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Will Deacon @ 2024-09-06 13:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Tue, Aug 06, 2024 at 08:31:16PM -0300, Jason Gunthorpe wrote:
> Don't open code the calculations of the indexes for each level, provide
> two functions to do that math and call them in all the places. Update all
> the places computing indexes.
> 
> Calculate the L1 table size directly based on the max required index from
> the cap. Remove STRTAB_L1_SZ_SHIFT in favour of STRTAB_NUM_L2_STES.
> 
> Use STRTAB_NUM_L2_STES to replace remaining open coded 1 << STRTAB_SPLIT.
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 51 +++++++++------------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 14 +++++-
>  2 files changed, 35 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index b80f3359a8d12b..a695e5f8fc2880 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1670,20 +1670,17 @@ static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab,
>  
>  static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
>  {
> -	size_t size;
> -	void *strtab;
>  	dma_addr_t l2ptr_dma;
>  	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> -	struct arm_smmu_strtab_l1_desc *desc = &cfg->l1_desc[sid >> STRTAB_SPLIT];
> +	struct arm_smmu_strtab_l1_desc *desc =
> +		&cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)];
>  
>  	if (desc->l2ptr)
>  		return 0;
>  
> -	size = 1 << (STRTAB_SPLIT + ilog2(STRTAB_STE_DWORDS) + 3);
> -	strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS];
> -
> -	desc->l2ptr = dmam_alloc_coherent(smmu->dev, size, &l2ptr_dma,
> -					  GFP_KERNEL);
> +	desc->l2ptr = dmam_alloc_coherent(
> +		smmu->dev, STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste),
> +		&l2ptr_dma, GFP_KERNEL);

Since this series is mainly about clean-up, please can you be consistent
with the indentation style that the driver currently uses for multi-line
function invocations? I applied the whole series, but I struggled looking
at the resulting code (of course, there's no "right" way, but the driver
is written with whatever I'm used to so I'd really like to keep the
consistency).

Will


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 4/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg
  2024-08-06 23:31 ` [PATCH v3 4/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg Jason Gunthorpe
@ 2024-09-06 13:19   ` Will Deacon
  2024-09-06 15:06     ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Will Deacon @ 2024-09-06 13:19 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Tue, Aug 06, 2024 at 08:31:18PM -0300, Jason Gunthorpe wrote:
> The members here are being used for both the linear and the 2 level case,
> with the meaning of each item slightly different in the two cases.
> 
> Split it into a clean union where both cases have their own struct with
> their own logical names and correct types.
> 
> Adjust all the users to detect linear/2lvl and use the right sub structure
> and types consistently.
> 
> Remove STRTAB_STE_DWORDS by changing the last places to use
> sizeof(struct arm_smmu_ste).
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 77 ++++++++++-----------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 26 +++----
>  2 files changed, 50 insertions(+), 53 deletions(-)

[...]

> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 3a8a459f899fd8..18e85fc936876b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -207,10 +207,8 @@
>  #define STRTAB_L1_DESC_SPAN		GENMASK_ULL(4, 0)
>  #define STRTAB_L1_DESC_L2PTR_MASK	GENMASK_ULL(51, 6)
>  
> -#define STRTAB_STE_DWORDS		8
> -
>  struct arm_smmu_ste {
> -	__le64 data[STRTAB_STE_DWORDS];
> +	__le64 data[8];
>  };

As discussed before, please keep this part as-is (i.e. __le64
data[STRTAB_STE_DWORDS])

Will


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 7/9] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array
  2024-08-06 23:31 ` [PATCH v3 7/9] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array Jason Gunthorpe
@ 2024-09-06 13:21   ` Will Deacon
  0 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2024-09-06 13:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Tue, Aug 06, 2024 at 08:31:21PM -0300, Jason Gunthorpe wrote:
> The top of the 2 level CD table is (at most) 1024 entries big, and two
> high order allocations are required. One of __le64 which is programmed
> into the HW (8k) and one of struct arm_smmu_l1_ctx_desc which holds the
> CPU pointer (16k).
> 
> There are two copies of the l2ptr_dma, one is stored in the struct
> arm_smmu_l1_ctx_desc, and another is encoded in the __le64 for the HW to
> use. Instead of storing two copies just decode the value from the __le64.
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 +++++++++------------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
>  2 files changed, 17 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index e6607d9d590c4d..d23a845ad82223 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1179,31 +1179,19 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
>  	arm_smmu_cmdq_batch_submit(smmu, &cmds);
>  }
>  
> -static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
> -					struct arm_smmu_l1_ctx_desc *l1_desc)
> +static void arm_smmu_write_cd_l1_desc(__le64 *dst, dma_addr_t l2ptr_dma)
>  {
> -	size_t size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
> -
> -	l1_desc->l2ptr = dma_alloc_coherent(smmu->dev, size,
> -					    &l1_desc->l2ptr_dma, GFP_KERNEL);
> -	if (!l1_desc->l2ptr) {
> -		dev_warn(smmu->dev,
> -			 "failed to allocate context descriptor table\n");
> -		return -ENOMEM;
> -	}
> -	return 0;
> -}
> -
> -static void arm_smmu_write_cd_l1_desc(__le64 *dst,
> -				      struct arm_smmu_l1_ctx_desc *l1_desc)
> -{
> -	u64 val = (l1_desc->l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) |
> -		  CTXDESC_L1_DESC_V;
> +	u64 val = (l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) | CTXDESC_L1_DESC_V;
>  
>  	/* The HW has 64 bit atomicity with stores to the L2 CD table */
>  	WRITE_ONCE(*dst, cpu_to_le64(val));
>  }
>  
> +static dma_addr_t arm_smmu_cd_l1_get_desc(const __le64 *src)
> +{
> +	return le64_to_cpu(*src) & CTXDESC_L1_DESC_L2PTR_MASK;
> +}
> +
>  struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
>  					u32 ssid)
>  {
> @@ -1243,13 +1231,17 @@ static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
>  
>  		l1_desc = &cd_table->l1_desc[idx];
>  		if (!l1_desc->l2ptr) {
> -			__le64 *l1ptr;
> +			dma_addr_t l2ptr_dma;
>  
> -			if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
> +			l1_desc->l2ptr = dma_alloc_coherent(
> +				smmu->dev,
> +				CTXDESC_L2_ENTRIES * sizeof(struct arm_smmu_cd),
> +				&l2ptr_dma, GFP_KERNEL);

Here's another example of the inconsistent indentation.

Will


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 9/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg
  2024-08-06 23:31 ` [PATCH v3 9/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg Jason Gunthorpe
@ 2024-09-06 13:22   ` Will Deacon
  2024-09-06 15:13     ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Will Deacon @ 2024-09-06 13:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Tue, Aug 06, 2024 at 08:31:23PM -0300, Jason Gunthorpe wrote:
> @@ -1373,8 +1368,6 @@ void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid)
>  
>  static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
>  {
> -	int ret;
> -	size_t l1size;
>  	size_t max_contexts;
>  	struct arm_smmu_device *smmu = master->smmu;
>  	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
> @@ -1385,71 +1378,67 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
>  	if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) ||
>  	    max_contexts <= CTXDESC_L2_ENTRIES) {
>  		cd_table->s1fmt = STRTAB_STE_0_S1FMT_LINEAR;
> -		cd_table->num_l1_ents = max_contexts;
> +		cd_table->linear.num_ents = max_contexts;
>  
> -		l1size = max_contexts * sizeof(struct arm_smmu_cd);
> +		cd_table->linear.table = dma_alloc_coherent(
> +			smmu->dev, max_contexts * sizeof(struct arm_smmu_cd),
> +			&cd_table->cdtab_dma, GFP_KERNEL);
> +		if (!cd_table->linear.table)
> +			return -ENOMEM;
>  	} else {
>  		cd_table->s1fmt = STRTAB_STE_0_S1FMT_64K_L2;
> -		cd_table->num_l1_ents = DIV_ROUND_UP(max_contexts,
> -						  CTXDESC_L2_ENTRIES);
> +		cd_table->l2.num_l1_ents =
> +			DIV_ROUND_UP(max_contexts, CTXDESC_L2_ENTRIES);
>  
> -		cd_table->l1_desc = kcalloc(cd_table->num_l1_ents,
> -					    sizeof(*cd_table->l1_desc),
> -					    GFP_KERNEL);
> -		if (!cd_table->l1_desc)
> +		cd_table->l2.l2ptrs = kcalloc(cd_table->l2.num_l1_ents,
> +					     sizeof(*cd_table->l2.l2ptrs),
> +					     GFP_KERNEL);
> +		if (!cd_table->l2.l2ptrs)
>  			return -ENOMEM;
>  
> -		l1size = cd_table->num_l1_ents *
> -			 sizeof(struct arm_smmu_cdtab_l1);
> +		cd_table->l2.l1tab = dma_alloc_coherent(
> +			smmu->dev,
> +			cd_table->l2.num_l1_ents *
> +				sizeof(struct arm_smmu_cdtab_l1),
> +			&cd_table->cdtab_dma, GFP_KERNEL);
> +		if (!cd_table->l2.l1tab) {
> +			kfree(cd_table->l2.l2ptrs);
> +			cd_table->l2.l2ptrs = NULL;
> +			return -ENOMEM;
> +		}
>  	}
> -
> -	cd_table->cdtab = dma_alloc_coherent(smmu->dev, l1size,
> -					     &cd_table->cdtab_dma, GFP_KERNEL);
> -	if (!cd_table->cdtab) {
> -		dev_warn(smmu->dev, "failed to allocate context descriptor\n");
> -		ret = -ENOMEM;
> -		goto err_free_l1;
> -	}
> -
>  	return 0;
> -
> -err_free_l1:
> -	if (cd_table->l1_desc) {
> -		kfree(cd_table->l1_desc);
> -		cd_table->l1_desc = NULL;
> -	}
> -	return ret;

Why inline the error path here? Sure, I get that it works both ways, but
it seems a little gratuitous to me.

Will


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 1/9] iommu/arm-smmu-v3: Use the new rb tree helpers
  2024-08-06 23:31 ` [PATCH v3 1/9] iommu/arm-smmu-v3: Use the new rb tree helpers Jason Gunthorpe
@ 2024-09-06 13:23   ` Will Deacon
  0 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2024-09-06 13:23 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Tue, Aug 06, 2024 at 08:31:15PM -0300, Jason Gunthorpe wrote:
> Since v5.12 the rbtree has gained some simplifying helpers aimed at making
> rb tree users write less convoluted boiler plate code. Instead the caller
> provides a single comparison function and the helpers generate the prior
> open-coded stuff.
> 
> Update smmu->streams to use rb_find_add() and rb_find().
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Reviewed-by: Mostafa Saleh <smostafa@google.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 68 ++++++++++-----------
>  1 file changed, 31 insertions(+), 37 deletions(-)

Thanks, this is a lot better! I'll pick this one up.

Will


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area
  2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
                   ` (10 preceding siblings ...)
  2024-09-05 20:10 ` Nicolin Chen
@ 2024-09-06 14:35 ` Will Deacon
  11 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2024-09-06 14:35 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Jason Gunthorpe
  Cc: catalin.marinas, kernel-team, Will Deacon, Michael Shavit,
	Nicolin Chen, patches, Ryan Roberts, Mostafa Saleh

On Tue, 06 Aug 2024 20:31:14 -0300, Jason Gunthorpe wrote:
> Will pointed out that two places referring to the CD/STE struct did not
> get the new types. While auditing this code a few more oddities were
> noticed. Based on a feedback from Mostafa and Nicolin a few more things
> were fixed up too
> 
> - Use types for all the HW structures everywhere even for the L1
>   descriptors that are just a single 8 bytes. This helps with clarity of
>   what everthing is pointing at
> - Use indexing helpers for the STE/CD two level calculations
> - Use sizeof(struct X) instead of open coded math on constants. The sizeof
>   naturally follows the type of the related variable in almost all cases
> - Remove redundant dma_addr_t's and save some memory
> - Remove redundant devm usage
> - Use the modern rbtree API
> 
> [...]

Applied first change to will (for-joerg/arm-smmu/updates), thanks!

[1/9] iommu/arm-smmu-v3: Use the new rb tree helpers
      https://git.kernel.org/will/c/a2bb820e862d

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 2/9] iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()
  2024-09-06 13:18   ` Will Deacon
@ 2024-09-06 14:40     ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-09-06 14:40 UTC (permalink / raw)
  To: Will Deacon
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Fri, Sep 06, 2024 at 02:18:18PM +0100, Will Deacon wrote:
> On Tue, Aug 06, 2024 at 08:31:16PM -0300, Jason Gunthorpe wrote:
> > Don't open code the calculations of the indexes for each level, provide
> > two functions to do that math and call them in all the places. Update all
> > the places computing indexes.
> > 
> > Calculate the L1 table size directly based on the max required index from
> > the cap. Remove STRTAB_L1_SZ_SHIFT in favour of STRTAB_NUM_L2_STES.
> > 
> > Use STRTAB_NUM_L2_STES to replace remaining open coded 1 << STRTAB_SPLIT.
> > 
> > Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> > Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 51 +++++++++------------
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 14 +++++-
> >  2 files changed, 35 insertions(+), 30 deletions(-)
> > 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index b80f3359a8d12b..a695e5f8fc2880 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -1670,20 +1670,17 @@ static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab,
> >  
> >  static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
> >  {
> > -	size_t size;
> > -	void *strtab;
> >  	dma_addr_t l2ptr_dma;
> >  	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> > -	struct arm_smmu_strtab_l1_desc *desc = &cfg->l1_desc[sid >> STRTAB_SPLIT];
> > +	struct arm_smmu_strtab_l1_desc *desc =
> > +		&cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)];
> >  
> >  	if (desc->l2ptr)
> >  		return 0;
> >  
> > -	size = 1 << (STRTAB_SPLIT + ilog2(STRTAB_STE_DWORDS) + 3);
> > -	strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS];
> > -
> > -	desc->l2ptr = dmam_alloc_coherent(smmu->dev, size, &l2ptr_dma,
> > -					  GFP_KERNEL);
> > +	desc->l2ptr = dmam_alloc_coherent(
> > +		smmu->dev, STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste),
> > +		&l2ptr_dma, GFP_KERNEL);
> 
> Since this series is mainly about clean-up, please can you be consistent
> with the indentation style that the driver currently uses for multi-line
> function invocations? 

Honestly, I don't know what that style is..

I try to stick with a common style from clang-format using the
checked-in specs. I touch so many drivers and so many subsystems it is
a big ask to memorize everyone's little preferences and adjust
things. I feel there is no way I could do that reliably..

Still, if you have some specific things that really bother you then
please let me know them? I can make an effort, and/or you can adjust
them when applying? For instance I have already been mindful to try to
keep the vertical alignment in some places.

I'm guessing you are looking at aligning the argument wrap to the ( ?

To explain - clang-format has a heuristic for this. If the code
formats nicely it will align to the (. 

For instance this proposed change to fit into 80 cols:

-       cd_table->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cd_table->cdtab_dma,
-                                          GFP_KERNEL);
+       cd_table->cdtab = dmam_alloc_coherent(smmu->dev, l1size,
+                                             &cd_table->cdtab_dma, GFP_KERNEL);

However, the hunk you clipped above doesn't fit nicely because the
argument 'STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste)' would push
past the 80 col limit if aligned to the (.

Rather than break an argument in the middle of an expression, or go
past 80 cols, it switches to one tab and keeps the argument expression
together on one line.

So, which alternative is your preference?

> I applied the whole series,

It seems not?

Thanks,
Jason


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 4/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg
  2024-09-06 13:19   ` Will Deacon
@ 2024-09-06 15:06     ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-09-06 15:06 UTC (permalink / raw)
  To: Will Deacon
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Fri, Sep 06, 2024 at 02:19:17PM +0100, Will Deacon wrote:
> > @@ -207,10 +207,8 @@
> >  #define STRTAB_L1_DESC_SPAN		GENMASK_ULL(4, 0)
> >  #define STRTAB_L1_DESC_L2PTR_MASK	GENMASK_ULL(51, 6)
> >  
> > -#define STRTAB_STE_DWORDS		8
> > -
> >  struct arm_smmu_ste {
> > -	__le64 data[STRTAB_STE_DWORDS];
> > +	__le64 data[8];
> >  };
> 
> As discussed before, please keep this part as-is (i.e. __le64
> data[STRTAB_STE_DWORDS])

Missed this one, sorry

Jason


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 9/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg
  2024-09-06 13:22   ` Will Deacon
@ 2024-09-06 15:13     ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-09-06 15:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
	Michael Shavit, Nicolin Chen, patches, Ryan Roberts,
	Mostafa Saleh

On Fri, Sep 06, 2024 at 02:22:22PM +0100, Will Deacon wrote:
> On Tue, Aug 06, 2024 at 08:31:23PM -0300, Jason Gunthorpe wrote:
> > @@ -1373,8 +1368,6 @@ void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid)
> >  
> >  static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
> >  {
> > -	int ret;
> > -	size_t l1size;
> >  	size_t max_contexts;
> >  	struct arm_smmu_device *smmu = master->smmu;
> >  	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
> > @@ -1385,71 +1378,67 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master)
> >  	if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) ||
> >  	    max_contexts <= CTXDESC_L2_ENTRIES) {
> >  		cd_table->s1fmt = STRTAB_STE_0_S1FMT_LINEAR;
> > -		cd_table->num_l1_ents = max_contexts;
> > +		cd_table->linear.num_ents = max_contexts;
> >  
> > -		l1size = max_contexts * sizeof(struct arm_smmu_cd);
> > +		cd_table->linear.table = dma_alloc_coherent(
> > +			smmu->dev, max_contexts * sizeof(struct arm_smmu_cd),
> > +			&cd_table->cdtab_dma, GFP_KERNEL);
> > +		if (!cd_table->linear.table)
> > +			return -ENOMEM;
> >  	} else {
> >  		cd_table->s1fmt = STRTAB_STE_0_S1FMT_64K_L2;
> > -		cd_table->num_l1_ents = DIV_ROUND_UP(max_contexts,
> > -						  CTXDESC_L2_ENTRIES);
> > +		cd_table->l2.num_l1_ents =
> > +			DIV_ROUND_UP(max_contexts, CTXDESC_L2_ENTRIES);
> >  
> > -		cd_table->l1_desc = kcalloc(cd_table->num_l1_ents,
> > -					    sizeof(*cd_table->l1_desc),
> > -					    GFP_KERNEL);
> > -		if (!cd_table->l1_desc)
> > +		cd_table->l2.l2ptrs = kcalloc(cd_table->l2.num_l1_ents,
> > +					     sizeof(*cd_table->l2.l2ptrs),
> > +					     GFP_KERNEL);
> > +		if (!cd_table->l2.l2ptrs)
> >  			return -ENOMEM;
> >  
> > -		l1size = cd_table->num_l1_ents *
> > -			 sizeof(struct arm_smmu_cdtab_l1);
> > +		cd_table->l2.l1tab = dma_alloc_coherent(
> > +			smmu->dev,
> > +			cd_table->l2.num_l1_ents *
> > +				sizeof(struct arm_smmu_cdtab_l1),
> > +			&cd_table->cdtab_dma, GFP_KERNEL);
> > +		if (!cd_table->l2.l1tab) {
> > +			kfree(cd_table->l2.l2ptrs);
> > +			cd_table->l2.l2ptrs = NULL;
> > +			return -ENOMEM;
> > +		}
> >  	}
> > -
> > -	cd_table->cdtab = dma_alloc_coherent(smmu->dev, l1size,
> > -					     &cd_table->cdtab_dma, GFP_KERNEL);
> > -	if (!cd_table->cdtab) {
> > -		dev_warn(smmu->dev, "failed to allocate context descriptor\n");
> > -		ret = -ENOMEM;
> > -		goto err_free_l1;
> > -	}
> > -
> >  	return 0;
> > -
> > -err_free_l1:
> > -	if (cd_table->l1_desc) {
> > -		kfree(cd_table->l1_desc);
> > -		cd_table->l1_desc = NULL;
> > -	}
> > -	return ret;
> 
> Why inline the error path here? Sure, I get that it works both ways, but
> it seems a little gratuitous to me.

With the change there is only one goto.

But sure, can leave it:

[..]
		if (!cd_table->l2.l2ptrs) {
			ret = -ENOMEM;
			goto err_free_l2ptrs;
		}
	}
	return 0;

err_free_l2ptrs:
	kfree(cd_table->l2.l2ptrs);
	cd_table->l2.l2ptrs = NULL;
	return ret;
}

Jason


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-09-06 15:27 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-06 23:31 [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
2024-08-06 23:31 ` [PATCH v3 1/9] iommu/arm-smmu-v3: Use the new rb tree helpers Jason Gunthorpe
2024-09-06 13:23   ` Will Deacon
2024-08-06 23:31 ` [PATCH v3 2/9] iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx() Jason Gunthorpe
2024-09-06 13:18   ` Will Deacon
2024-09-06 14:40     ` Jason Gunthorpe
2024-08-06 23:31 ` [PATCH v3 3/9] iommu/arm-smmu-v3: Add types for each level of the 2 level stream table Jason Gunthorpe
2024-08-06 23:31 ` [PATCH v3 4/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg Jason Gunthorpe
2024-09-06 13:19   ` Will Deacon
2024-09-06 15:06     ` Jason Gunthorpe
2024-08-06 23:31 ` [PATCH v3 5/9] iommu/arm-smmu-v3: Remove strtab_base/cfg Jason Gunthorpe
2024-08-06 23:31 ` [PATCH v3 6/9] iommu/arm-smmu-v3: Do not use devm for the cd table allocations Jason Gunthorpe
2024-08-06 23:31 ` [PATCH v3 7/9] iommu/arm-smmu-v3: Shrink the cdtab l1_desc array Jason Gunthorpe
2024-09-06 13:21   ` Will Deacon
2024-08-06 23:31 ` [PATCH v3 8/9] iommu/arm-smmu-v3: Add types for each level of the CD table Jason Gunthorpe
2024-08-06 23:31 ` [PATCH v3 9/9] iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg Jason Gunthorpe
2024-09-06 13:22   ` Will Deacon
2024-09-06 15:13     ` Jason Gunthorpe
2024-09-05 19:25 ` [PATCH v3 0/9] Tidy some minor things in the stream table/cd table area Jason Gunthorpe
2024-09-05 20:10 ` Nicolin Chen
2024-09-06 14:35 ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).