From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D8B37CEB2E2 for ; Wed, 2 Oct 2024 18:23:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: Content-Type:In-Reply-To:From:References:Cc:To:Subject:MIME-Version:Date: Message-ID:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=AU9xAbt/Rs3YRsPHSH8m8hrVsfL774li9dWxLwycJ+I=; b=p4vwxi5mAFYdAWMrnPopFFNye9 wPntayaaqOtfsCr3xLWLsx7jSaV9F/7ZA0CONRcSljY/xONbw6ueIRm9tN5UIuFKr4og0TuGbrUjm JVC21UzC0WU8ufHdic3exF0FZKkgdz/3kmXvpumBeUzxJd1xx3JPfXbv04sxkXj9Vg09uvTCWPfY9 9L3Owi6X8EPmCAbDg25nQH+KFHdLe6rS1LYypu5+fjWWkkHQGhDZtfQqd7QeGMS0Jw7NTc4dugTjc r1xtQMO20NhCRfdXeQb3zikDNpr6k2sdtq3n1tSiPOjgWBqBqpf0pEzHpe7Z8L7QKAJJObsLyoPBy sBckZE7A==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1sw400-000000079VU-206d; Wed, 02 Oct 2024 18:22:56 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1sw3yf-000000079Gd-2c8f for linux-arm-kernel@lists.infradead.org; Wed, 02 Oct 2024 18:21:38 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E1C1D339; Wed, 2 Oct 2024 11:22:01 -0700 (PDT) Received: from [10.57.75.22] (unknown [10.57.75.22]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 252103F58B; Wed, 2 Oct 2024 11:21:30 -0700 (PDT) Message-ID: <0a0c6d4f-e9f7-4fe6-9bb8-25772856c345@arm.com> Date: Wed, 2 Oct 2024 19:21:29 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [v2 PATCH] iommu/arm-smmu-v3: Fix L1 stream table index calculation for 32-bit sid size To: Yang Shi , jgg@ziepe.ca, nicolinc@nvidia.com, james.morse@arm.com, will@kernel.org Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org References: <20241002175514.1165299-1-yang@os.amperecomputing.com> From: Robin Murphy Content-Language: en-GB In-Reply-To: <20241002175514.1165299-1-yang@os.amperecomputing.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20241002_112133_797076_07FA3301 X-CRM114-Status: GOOD ( 24.92 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 2024-10-02 6:55 pm, Yang Shi wrote: > The commit ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()") > calculated the last index of L1 stream table by 1 << smmu->sid_bits. 1 > is 32 bit value. > However some platforms, for example, AmpereOne, have 32-bit stream id size. > This resulted in ouf-of-bound shift. The disassembly of shift is: > > ldr w2, [x19, 828] //, smmu_7(D)->sid_bits > mov w20, 1 > lsl w20, w20, w2 > > According to ARM spec, if the registers are 32 bit, the instruction actually > does: > dest = src << (shift % 32) > > So it actually shifted by zero bit. > > This caused v6.12-rc1 failed to boot on AmpereOne and other platform [1]. FWIW it's going to be seen on any platform with Arm MMU-700 since that always advertises 32-bit StreamID support (as other SMMU implementations may do too). > UBSAN also reported: > > UBSAN: shift-out-of-bounds in drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:3628:29 > shift exponent 32 is too large for 32-bit type 'int' At best, those two lines of actual UBSAN warning are the only part relevant to the point, the rest of the backtrace below is definitely not, please trim it. > CPU: 70 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc1 #4 > Hardware name: ZOLLNER SUNMOONLAKE/SunMoon Lake, BIOS 00.00. 2024-08-28 18:42:45 08/28/2024 > Call trace: > dump_backtrace+0xdc/0x140 > show_stack+0x20/0x40 > dump_stack_lvl+0x60/0x80 > dump_stack+0x18/0x28 > ubsan_epilogue+0x10/0x48 > __ubsan_handle_shift_out_of_bounds+0xd8/0x1a0 > arm_smmu_init_structures+0x374/0x3c8 > arm_smmu_device_probe+0x208/0x600 > platform_probe+0x70/0xe8 > really_probe+0xc8/0x3a0 > __driver_probe_device+0x84/0x160 > driver_probe_device+0x44/0x130 > __driver_attach+0xcc/0x208 > bus_for_each_dev+0x84/0x100 > driver_attach+0x2c/0x40 > bus_add_driver+0x158/0x290 > driver_register+0x70/0x138 > __platform_driver_register+0x2c/0x40 > arm_smmu_driver_init+0x28/0x40 > do_one_initcall+0x60/0x318 > do_initcalls+0x198/0x1e0 > kernel_init_freeable+0x18c/0x1e8 > kernel_init+0x28/0x160 > ret_from_fork+0x10/0x20 > > Using 64 bit immediate when doing shift can solve the problem. The > disassembly after the fix looks like: > ldr w20, [x19, 828] //, smmu_7(D)->sid_bits > mov x0, 1 > lsl x0, x0, x20 > > There are a couple of problematic places, extracted the shift into a helper. > > [1] https://lore.kernel.org/lkml/d4b53bbb-333a-45b9-9eb0-23ddd0820a14@arm.com/ > Fixes: ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()") > Tested-by: James Morse > Signed-off-by: Yang Shi > --- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 +++++--- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +++++ > 2 files changed, 10 insertions(+), 3 deletions(-) > > v2: * Extracted the shift into a helper per Jason Gunthorpe. > * Covered more places per Nicolin Chen and Jason Gunthorpe. > * Used 1ULL instead of 1UL to guarantee 64 bit per Robin Murphy. > * Made the subject more general since this is not AmpereOne specific > problem per the report from James Morse. > * Collected t-b tag from James Morse. > * Added Fixes tag in commit log. > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > index 737c5b882355..4eafd9f04808 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > @@ -3624,8 +3624,9 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu) > { > u32 l1size; > struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; > + unsigned int max_sid = arm_smmu_strtab_max_sid(smmu); > unsigned int last_sid_idx = > - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); > + arm_smmu_strtab_l1_idx(max_sid - 1); > > /* Calculate the L1 size, capped to the SIDSIZE. */ > cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES); > @@ -3657,8 +3658,9 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu) > { > u32 size; > struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; > + unsigned int max_sid = arm_smmu_strtab_max_sid(smmu); > > - size = (1 << smmu->sid_bits) * sizeof(struct arm_smmu_ste); > + size = max_sid * sizeof(struct arm_smmu_ste); > cfg->linear.table = dmam_alloc_coherent(smmu->dev, size, > &cfg->linear.ste_dma, > GFP_KERNEL); > @@ -3668,7 +3670,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu) > size); > return -ENOMEM; > } > - cfg->linear.num_ents = 1 << smmu->sid_bits; > + cfg->linear.num_ents = max_sid; > > arm_smmu_init_initial_stes(cfg->linear.table, cfg->linear.num_ents); > return 0; > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > index 1e9952ca989f..f7e8465c629a 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > @@ -853,6 +853,11 @@ struct arm_smmu_master_domain { > ioasid_t ssid; > }; > > +static inline unsigned int arm_smmu_strtab_max_sid(struct arm_smmu_device *smmu) Nit: "max_sid" implies returning the largest supported StreamID value, so it would be logical to either include the "- 1" in here and adjust the other callers, or instead call this something like "num_sids". Thanks, Robin. > +{ > + return (1ULL << smmu->sid_bits); > +} > + > static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) > { > return container_of(dom, struct arm_smmu_domain, domain);