[v3 PATCH] iommu/arm-smmu-v3: Fix L1 stream table index calculation for 32-bit sid size

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Yang Shi <yang@os.amperecomputing.com>
To: jgg@ziepe.ca, nicolinc@nvidia.com, james.morse@arm.com,
	will@kernel.org, robin.murphy@arm.com
Cc: yang@os.amperecomputing.com,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: [v3 PATCH] iommu/arm-smmu-v3: Fix L1 stream table index calculation for 32-bit sid size
Date: Fri,  4 Oct 2024 11:04:05 -0700	[thread overview]
Message-ID: <20241004180405.555194-1-yang@os.amperecomputing.com> (raw)

The commit ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()")
calculated the last index of L1 stream table by 1 << smmu->sid_bits. 1
is 32 bit value.
However some platforms, for example, AmpereOne and the platforms with
ARM MMU-700, have 32-bit stream id size.  This resulted in ouf-of-bound shift.
The disassembly of shift is:

    ldr     w2, [x19, 828]  //, smmu_7(D)->sid_bits
    mov     w20, 1
    lsl     w20, w20, w2

According to ARM spec, if the registers are 32 bit, the instruction actually
does:
    dest = src << (shift % 32)

So it actually shifted by zero bit.

The out-of-bound shift is also undefined behavior according to C
language standard.

This caused v6.12-rc1 failed to boot on such platforms.

UBSAN also reported:

UBSAN: shift-out-of-bounds in drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:3628:29
shift exponent 32 is too large for 32-bit type 'int'

Using 64 bit immediate when doing shift can solve the problem.  The
disassembly after the fix looks like:
    ldr     w20, [x19, 828] //, smmu_7(D)->sid_bits
    mov     x0, 1
    lsl     x0, x0, x20

There are a couple of problematic places, extracted the shift into a helper.

Fixes: ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()")
Tested-by: James Morse <james.morse@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 +++++++++++-----
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  5 +++++
 2 files changed, 16 insertions(+), 5 deletions(-)

v3: * Some trivial modification to the commit log per Robin Murphy.
    * Used "num_sids" instead of "max_sids" per Robin Murphy.
    * Returned u64 type for arm_smmu_strtab_num_sids() per Nicolin Chen.
    * Checked size in arm_smmu_init_strtab_linear() in order to avoid
      overflow per Jason Gunthorpe.
    * Collected r-b tag from Jason Gunthorpe.
v2: * Extracted the shift into a helper per Jason Gunthorpe.
    * Covered more places per Nicolin Chen and Jason Gunthorpe.
    * Used 1ULL instead of 1UL to guarantee 64 bit per Robin Murphy.
    * Made the subject more general since this is not AmpereOne specific
      problem per the report from James Morse.
    * Collected t-b tag from James Morse.
    * Added Fixes tag in commit log.

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 737c5b882355..9d4fc91d9258 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3624,8 +3624,9 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 {
 	u32 l1size;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+	u64 num_sids = arm_smmu_strtab_num_sids(smmu);
 	unsigned int last_sid_idx =
-		arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1);
+		arm_smmu_strtab_l1_idx(num_sids - 1);
 
 	/* Calculate the L1 size, capped to the SIDSIZE. */
 	cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
@@ -3655,20 +3656,25 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 
 static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
 {
-	u32 size;
+	u64 size;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+	u64 num_sids = arm_smmu_strtab_num_sids(smmu);
+
+	size = num_sids * sizeof(struct arm_smmu_ste);
+	/* The max size for dmam_alloc_coherent() is 32-bit */
+	if (size > SIZE_MAX)
+		return -EINVAL;
 
-	size = (1 << smmu->sid_bits) * sizeof(struct arm_smmu_ste);
 	cfg->linear.table = dmam_alloc_coherent(smmu->dev, size,
 						&cfg->linear.ste_dma,
 						GFP_KERNEL);
 	if (!cfg->linear.table) {
 		dev_err(smmu->dev,
-			"failed to allocate linear stream table (%u bytes)\n",
+			"failed to allocate linear stream table (%llu bytes)\n",
 			size);
 		return -ENOMEM;
 	}
-	cfg->linear.num_ents = 1 << smmu->sid_bits;
+	cfg->linear.num_ents = num_sids;
 
 	arm_smmu_init_initial_stes(cfg->linear.table, cfg->linear.num_ents);
 	return 0;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1e9952ca989f..c8ceddc5e8ef 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -853,6 +853,11 @@ struct arm_smmu_master_domain {
 	ioasid_t ssid;
 };
 
+static inline u64 arm_smmu_strtab_num_sids(struct arm_smmu_device *smmu)
+{
+	return (1ULL << smmu->sid_bits);
+}
+
 static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 {
 	return container_of(dom, struct arm_smmu_domain, domain);
-- 
2.41.0

next             reply	other threads:[~2024-10-04 18:04 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-04 18:04 Yang Shi [this message]
2024-10-04 21:14 ` [v3 PATCH] iommu/arm-smmu-v3: Fix L1 stream table index calculation for 32-bit sid size Daniel Mentz
2024-10-04 21:47   ` Yang Shi
2024-10-05  1:03     ` Daniel Mentz
2024-10-05  1:53       ` Yang Shi
2024-10-07 16:36         ` Daniel Mentz
2024-10-07 17:49           ` Yang Shi
2024-10-07 17:50         ` Jason Gunthorpe
2024-10-07 18:43           ` Yang Shi
2024-10-08 13:34 ` Will Deacon
2024-10-08 15:15   ` Jason Gunthorpe
2024-10-08 17:04     ` Yang Shi
2024-10-08 17:41       ` Will Deacon
2024-10-08 17:42     ` Will Deacon

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:737c5b88235 dfblob:9d4fc91d925 dfblob:1e9952ca989
dfblob:c8ceddc5e8e )
 OR (
bs:"[v3 PATCH] iommu/arm-smmu-v3: Fix L1 stream table index calculation for 32-bit sid size" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241004180405.555194-1-yang@os.amperecomputing.com \
    --to=yang@os.amperecomputing.com \
    --cc=iommu@lists.linux.dev \
    --cc=james.morse@arm.com \
    --cc=jgg@ziepe.ca \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nicolinc@nvidia.com \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox