All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicolin Chen <nicolinc@nvidia.com>
To: <will@kernel.org>, <robin.murphy@arm.com>
Cc: <joro@8bytes.org>, <jgg@nvidia.com>, <jean-philippe@linaro.org>,
	<apopple@nvidia.com>, <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>,
	<stable@vger.kernel.org>
Subject: Re: [PATCH] iommu/arm-smmu-v3: Fix soft lockup triggered by arm_smmu_mm_invalidate_range
Date: Fri, 15 Sep 2023 00:43:50 -0700	[thread overview]
Message-ID: <ZQQLNmmAOsNmvtDs@Asurada-Nvidia> (raw)
In-Reply-To: <20230901203904.4073-1-nicolinc@nvidia.com>

I found this patch cannot be applied to v6.6-rc1 due to conflicts
with some new commits that were merged during the previous cycle.

I can redo a version rebasing on the v6.6-rc1, yet the new version
won't apply to earlier kernel stable trees. Is there a way to make
it happen that both mainline and earlier trees can have this fix?

Thanks
Nicolin

On Fri, Sep 01, 2023 at 01:39:04PM -0700, Nicolin Chen wrote:
> When running an SVA case, the following soft lockup is triggered:
> --------------------------------------------------------------------
> watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
> pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
> lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
> sp : ffff8000d83ef290
> x29: ffff8000d83ef290 x28: 000000003b9aca00 x27: 0000000000000000
> x26: ffff8000d83ef3c0 x25: da86c0812194a0e8 x24: 0000000000000000
> x23: 0000000000000040 x22: ffff8000d83ef340 x21: ffff0000c63980c0
> x20: 0000000000000001 x19: ffff0000c6398080 x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000 x15: ffff3000b4a3bbb0
> x14: ffff3000b4a30888 x13: ffff3000b4a3cf60 x12: 0000000000000000
> x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc08120e4d6bc
> x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000048cfa
> x5 : 0000000000000000 x4 : 0000000000000001 x3 : 000000000000000a
> x2 : 0000000080000000 x1 : 0000000000000000 x0 : 0000000000000001
> Call trace:
>  arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
>  __arm_smmu_tlb_inv_range+0x118/0x254
>  arm_smmu_tlb_inv_range_asid+0x6c/0x130
>  arm_smmu_mm_invalidate_range+0xa0/0xa4
>  __mmu_notifier_invalidate_range_end+0x88/0x120
>  unmap_vmas+0x194/0x1e0
>  unmap_region+0xb4/0x144
>  do_mas_align_munmap+0x290/0x490
>  do_mas_munmap+0xbc/0x124
>  __vm_munmap+0xa8/0x19c
>  __arm64_sys_munmap+0x28/0x50
>  invoke_syscall+0x78/0x11c
>  el0_svc_common.constprop.0+0x58/0x1c0
>  do_el0_svc+0x34/0x60
>  el0_svc+0x2c/0xd4
>  el0t_64_sync_handler+0x114/0x140
>  el0t_64_sync+0x1a4/0x1a8
> --------------------------------------------------------------------
> 
> The commit 06ff87bae8d3 ("arm64: mm: remove unused functions and variable
> protoypes") fixed a similar lockup on the CPU MMU side. Yet, it can occur
> to SMMU too since arm_smmu_mm_invalidate_range() is typically called next
> to MMU tlb flush function, e.g.
>         tlb_flush_mmu_tlbonly {
>                 tlb_flush {
>                         __flush_tlb_range {
>                                 // check MAX_TLBI_OPS
>                         }
>                 }
>                 mmu_notifier_invalidate_range {
>                         arm_smmu_mm_invalidate_range {
>                                 // does not check MAX_TLBI_OPS
>                         }
>                 }
>         }
> 
> Clone a CMDQ_MAX_TLBI_OPS from the MAX_TLBI_OPS in tlbflush.h, since in an
> SVA case SMMU uses the CPU page table, so it makes sense to align with the
> tlbflush code. Then, replace per-page TLBI commands with a single per-asid
> TLBI command, if the request size hits this threshold.
> 
> Fixes: 51d113c3be09 ("iommu/arm-smmu-v3: Make BTM optional for SVA")
> Cc: stable@vger.kernel.org
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 11 ++++++++---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h     |  9 +++++++++
>  2 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index a5a63b1c947e..7ec3f5219250 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -201,9 +201,14 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
>          */
>         size = end - start;
> 
> -       if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM))
> -               arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,
> -                                           PAGE_SIZE, false, smmu_domain);
> +       if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM)) {
> +               if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
> +                   size >= CMDQ_MAX_TLBI_OPS * PAGE_SIZE)
> +                       arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
> +               else
> +                       arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,
> +                                                   PAGE_SIZE, false, smmu_domain);
> +       }
>         arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, start, size);
>  }
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index dcab85698a4e..79a81eed1dcc 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -326,6 +326,15 @@
>   */
>  #define CMDQ_BATCH_ENTRIES             BITS_PER_LONG
> 
> +/*
> + * Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this
> + * is used as a threshold to replace per-page TLBI commands to issue in the
> + * command queue with an address-space TLBI command, when SMMU w/o a range
> + * invalidation feature handles too many per-page TLBI commands, which will
> + * otherwise result in a soft lockup.
> + */
> +#define CMDQ_MAX_TLBI_OPS              (1 << (PAGE_SHIFT - 3))
> +
>  #define CMDQ_0_OP                      GENMASK_ULL(7, 0)
>  #define CMDQ_0_SSV                     (1UL << 11)
> 
> --
> 2.42.0
> 
> 

WARNING: multiple messages have this Message-ID (diff)
From: Nicolin Chen <nicolinc@nvidia.com>
To: <will@kernel.org>, <robin.murphy@arm.com>
Cc: <joro@8bytes.org>, <jgg@nvidia.com>, <jean-philippe@linaro.org>,
	<apopple@nvidia.com>, <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>,
	<stable@vger.kernel.org>
Subject: Re: [PATCH] iommu/arm-smmu-v3: Fix soft lockup triggered by arm_smmu_mm_invalidate_range
Date: Fri, 15 Sep 2023 00:43:50 -0700	[thread overview]
Message-ID: <ZQQLNmmAOsNmvtDs@Asurada-Nvidia> (raw)
In-Reply-To: <20230901203904.4073-1-nicolinc@nvidia.com>

I found this patch cannot be applied to v6.6-rc1 due to conflicts
with some new commits that were merged during the previous cycle.

I can redo a version rebasing on the v6.6-rc1, yet the new version
won't apply to earlier kernel stable trees. Is there a way to make
it happen that both mainline and earlier trees can have this fix?

Thanks
Nicolin

On Fri, Sep 01, 2023 at 01:39:04PM -0700, Nicolin Chen wrote:
> When running an SVA case, the following soft lockup is triggered:
> --------------------------------------------------------------------
> watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
> pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
> lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
> sp : ffff8000d83ef290
> x29: ffff8000d83ef290 x28: 000000003b9aca00 x27: 0000000000000000
> x26: ffff8000d83ef3c0 x25: da86c0812194a0e8 x24: 0000000000000000
> x23: 0000000000000040 x22: ffff8000d83ef340 x21: ffff0000c63980c0
> x20: 0000000000000001 x19: ffff0000c6398080 x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000 x15: ffff3000b4a3bbb0
> x14: ffff3000b4a30888 x13: ffff3000b4a3cf60 x12: 0000000000000000
> x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc08120e4d6bc
> x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000048cfa
> x5 : 0000000000000000 x4 : 0000000000000001 x3 : 000000000000000a
> x2 : 0000000080000000 x1 : 0000000000000000 x0 : 0000000000000001
> Call trace:
>  arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
>  __arm_smmu_tlb_inv_range+0x118/0x254
>  arm_smmu_tlb_inv_range_asid+0x6c/0x130
>  arm_smmu_mm_invalidate_range+0xa0/0xa4
>  __mmu_notifier_invalidate_range_end+0x88/0x120
>  unmap_vmas+0x194/0x1e0
>  unmap_region+0xb4/0x144
>  do_mas_align_munmap+0x290/0x490
>  do_mas_munmap+0xbc/0x124
>  __vm_munmap+0xa8/0x19c
>  __arm64_sys_munmap+0x28/0x50
>  invoke_syscall+0x78/0x11c
>  el0_svc_common.constprop.0+0x58/0x1c0
>  do_el0_svc+0x34/0x60
>  el0_svc+0x2c/0xd4
>  el0t_64_sync_handler+0x114/0x140
>  el0t_64_sync+0x1a4/0x1a8
> --------------------------------------------------------------------
> 
> The commit 06ff87bae8d3 ("arm64: mm: remove unused functions and variable
> protoypes") fixed a similar lockup on the CPU MMU side. Yet, it can occur
> to SMMU too since arm_smmu_mm_invalidate_range() is typically called next
> to MMU tlb flush function, e.g.
>         tlb_flush_mmu_tlbonly {
>                 tlb_flush {
>                         __flush_tlb_range {
>                                 // check MAX_TLBI_OPS
>                         }
>                 }
>                 mmu_notifier_invalidate_range {
>                         arm_smmu_mm_invalidate_range {
>                                 // does not check MAX_TLBI_OPS
>                         }
>                 }
>         }
> 
> Clone a CMDQ_MAX_TLBI_OPS from the MAX_TLBI_OPS in tlbflush.h, since in an
> SVA case SMMU uses the CPU page table, so it makes sense to align with the
> tlbflush code. Then, replace per-page TLBI commands with a single per-asid
> TLBI command, if the request size hits this threshold.
> 
> Fixes: 51d113c3be09 ("iommu/arm-smmu-v3: Make BTM optional for SVA")
> Cc: stable@vger.kernel.org
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 11 ++++++++---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h     |  9 +++++++++
>  2 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index a5a63b1c947e..7ec3f5219250 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -201,9 +201,14 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
>          */
>         size = end - start;
> 
> -       if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM))
> -               arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,
> -                                           PAGE_SIZE, false, smmu_domain);
> +       if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM)) {
> +               if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
> +                   size >= CMDQ_MAX_TLBI_OPS * PAGE_SIZE)
> +                       arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
> +               else
> +                       arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,
> +                                                   PAGE_SIZE, false, smmu_domain);
> +       }
>         arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, start, size);
>  }
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index dcab85698a4e..79a81eed1dcc 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -326,6 +326,15 @@
>   */
>  #define CMDQ_BATCH_ENTRIES             BITS_PER_LONG
> 
> +/*
> + * Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this
> + * is used as a threshold to replace per-page TLBI commands to issue in the
> + * command queue with an address-space TLBI command, when SMMU w/o a range
> + * invalidation feature handles too many per-page TLBI commands, which will
> + * otherwise result in a soft lockup.
> + */
> +#define CMDQ_MAX_TLBI_OPS              (1 << (PAGE_SHIFT - 3))
> +
>  #define CMDQ_0_OP                      GENMASK_ULL(7, 0)
>  #define CMDQ_0_SSV                     (1UL << 11)
> 
> --
> 2.42.0
> 
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2023-09-15  7:43 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-01 20:39 [PATCH] iommu/arm-smmu-v3: Fix soft lockup triggered by arm_smmu_mm_invalidate_range Nicolin Chen
2023-09-01 20:39 ` Nicolin Chen
2023-09-15  7:43 ` Nicolin Chen [this message]
2023-09-15  7:43   ` Nicolin Chen
2023-09-18  5:33   ` Baolu Lu
2023-09-18  5:33     ` Baolu Lu
2023-09-18  9:23     ` Will Deacon
2023-09-18  9:23       ` Will Deacon
2023-09-18 14:58       ` Nicolin Chen
2023-09-18 14:58         ` Nicolin Chen
2023-09-18  9:22 ` Will Deacon
2023-09-18  9:22   ` Will Deacon
2023-09-18 15:09   ` Nicolin Chen
2023-09-18 15:09     ` Nicolin Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZQQLNmmAOsNmvtDs@Asurada-Nvidia \
    --to=nicolinc@nvidia.com \
    --cc=apopple@nvidia.com \
    --cc=iommu@lists.linux.dev \
    --cc=jean-philippe@linaro.org \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=stable@vger.kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.