From: Nicolin Chen <nicolinc@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: <will@kernel.org>, <jgg@nvidia.com>, <joro@8bytes.org>,
<jean-philippe@linaro.org>, <apopple@nvidia.com>,
<linux-kernel@vger.kernel.org>,
<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>
Subject: Re: [PATCH 3/3] iommu/arm-smmu-v3: Add a max_tlbi_ops for __arm_smmu_tlb_inv_range()
Date: Tue, 22 Aug 2023 16:04:10 -0700 [thread overview]
Message-ID: <ZOU+6hsKy4R099B3@Asurada-Nvidia> (raw)
In-Reply-To: <ZOTjGIJU8Kfl1Q4f@Asurada-Nvidia>
Hi Robin,
On Tue, Aug 22, 2023 at 09:32:26AM -0700, Nicolin Chen wrote:
> On Tue, Aug 22, 2023 at 10:30:35AM +0100, Robin Murphy wrote:
>
> > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > index d6c647e1eb01..3f0db30932bd 100644
> > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > @@ -1897,7 +1897,14 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
> > > if (!size)
> > > return;
> > >
> > > - if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
> > > + if (!(smmu->features & ARM_SMMU_FEAT_RANGE_INV)) {
> > > + /*
> > > + * When the size reaches a threshold, replace per-granule TLBI
> > > + * commands with one single per-asid or per-vmid TLBI command.
> > > + */
> > > + if (size >= granule * smmu_domain->max_tlbi_ops)
> > > + return arm_smmu_tlb_inv_domain(smmu_domain);
> >
> > This looks like it's at the wrong level - we should have figured this
> > out before we got as far as low-level command-building. I'd have thought
> > it would be a case of short-circuiting directly from
> > arm_smmu_tlb_inv_range_domain() to arm_smmu_tlb_inv_context().
>
> OK, I could do that. We would have copies of this same routine
> though. Also, the shortcut applies to !ARM_SMMU_FEAT_RANGE_INV
> cases only, so this function feels convenient to me.
I was trying to say that we would need the same piece in both
arm_smmu_tlb_inv_range_domain() and arm_smmu_tlb_inv_range_asid(),
though the latter one only needs to call arm_smmu_tlb_inv_asid().
Also, arm_smmu_tlb_inv_context() does a full range ATC invalidation
instead of a given range like what arm_smmu_tlb_inv_range_domain()
currently does. So, it might be a bit overkill.
Combining all your comments, we'd have something like this:
-------------------------------------------------------------------
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7614739ea2c1..2967a6634c7c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1937,12 +1937,22 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain)
{
+ struct io_pgtable_cfg *cfg =
+ &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
struct arm_smmu_cmdq_ent cmd = {
.tlbi = {
.leaf = leaf,
},
};
+ /*
+ * If the given size is too large that would end up with too many TLBI
+ * commands in CMDQ, short circuit directly to a full invalidation
+ */
+ if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
+ size >= granule * (1UL << cfg->bits_per_level))
+ return arm_smmu_tlb_inv_context(smmu_domain);
+
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
cmd.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
@@ -1964,6 +1974,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain)
{
+ struct io_pgtable_cfg *cfg =
+ &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
struct arm_smmu_cmdq_ent cmd = {
.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA,
@@ -1973,6 +1985,14 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
},
};
+ /*
+ * If the given size is too large that would end up with too many TLBI
+ * commands in CMDQ, short circuit directly to a full invalidation
+ */
+ if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
+ size >= granule * (1UL << cfg->bits_per_level))
+ return arm_smmu_tlb_inv_asid(smmu_domain->smmu, asid);
+
__arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
}
-------------------------------------------------------------------
You're sure that you prefer this, right?
Thanks
Nicolin
WARNING: multiple messages have this Message-ID (diff)
From: Nicolin Chen <nicolinc@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: <will@kernel.org>, <jgg@nvidia.com>, <joro@8bytes.org>,
<jean-philippe@linaro.org>, <apopple@nvidia.com>,
<linux-kernel@vger.kernel.org>,
<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>
Subject: Re: [PATCH 3/3] iommu/arm-smmu-v3: Add a max_tlbi_ops for __arm_smmu_tlb_inv_range()
Date: Tue, 22 Aug 2023 16:04:10 -0700 [thread overview]
Message-ID: <ZOU+6hsKy4R099B3@Asurada-Nvidia> (raw)
In-Reply-To: <ZOTjGIJU8Kfl1Q4f@Asurada-Nvidia>
Hi Robin,
On Tue, Aug 22, 2023 at 09:32:26AM -0700, Nicolin Chen wrote:
> On Tue, Aug 22, 2023 at 10:30:35AM +0100, Robin Murphy wrote:
>
> > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > index d6c647e1eb01..3f0db30932bd 100644
> > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > @@ -1897,7 +1897,14 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
> > > if (!size)
> > > return;
> > >
> > > - if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
> > > + if (!(smmu->features & ARM_SMMU_FEAT_RANGE_INV)) {
> > > + /*
> > > + * When the size reaches a threshold, replace per-granule TLBI
> > > + * commands with one single per-asid or per-vmid TLBI command.
> > > + */
> > > + if (size >= granule * smmu_domain->max_tlbi_ops)
> > > + return arm_smmu_tlb_inv_domain(smmu_domain);
> >
> > This looks like it's at the wrong level - we should have figured this
> > out before we got as far as low-level command-building. I'd have thought
> > it would be a case of short-circuiting directly from
> > arm_smmu_tlb_inv_range_domain() to arm_smmu_tlb_inv_context().
>
> OK, I could do that. We would have copies of this same routine
> though. Also, the shortcut applies to !ARM_SMMU_FEAT_RANGE_INV
> cases only, so this function feels convenient to me.
I was trying to say that we would need the same piece in both
arm_smmu_tlb_inv_range_domain() and arm_smmu_tlb_inv_range_asid(),
though the latter one only needs to call arm_smmu_tlb_inv_asid().
Also, arm_smmu_tlb_inv_context() does a full range ATC invalidation
instead of a given range like what arm_smmu_tlb_inv_range_domain()
currently does. So, it might be a bit overkill.
Combining all your comments, we'd have something like this:
-------------------------------------------------------------------
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7614739ea2c1..2967a6634c7c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1937,12 +1937,22 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain)
{
+ struct io_pgtable_cfg *cfg =
+ &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
struct arm_smmu_cmdq_ent cmd = {
.tlbi = {
.leaf = leaf,
},
};
+ /*
+ * If the given size is too large that would end up with too many TLBI
+ * commands in CMDQ, short circuit directly to a full invalidation
+ */
+ if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
+ size >= granule * (1UL << cfg->bits_per_level))
+ return arm_smmu_tlb_inv_context(smmu_domain);
+
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
cmd.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
@@ -1964,6 +1974,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain)
{
+ struct io_pgtable_cfg *cfg =
+ &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
struct arm_smmu_cmdq_ent cmd = {
.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA,
@@ -1973,6 +1985,14 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
},
};
+ /*
+ * If the given size is too large that would end up with too many TLBI
+ * commands in CMDQ, short circuit directly to a full invalidation
+ */
+ if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
+ size >= granule * (1UL << cfg->bits_per_level))
+ return arm_smmu_tlb_inv_asid(smmu_domain->smmu, asid);
+
__arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
}
-------------------------------------------------------------------
You're sure that you prefer this, right?
Thanks
Nicolin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2023-08-22 23:04 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-22 8:45 [PATCH 0/3] iommu/arm-smmu-v3: Reduce latency in __arm_smmu_tlb_inv_range() Nicolin Chen
2023-08-22 8:45 ` Nicolin Chen
2023-08-22 8:45 ` [PATCH 1/3] iommu/io-pgtable-arm: Add nents_per_pgtable in struct io_pgtable_cfg Nicolin Chen
2023-08-22 8:45 ` Nicolin Chen
2023-08-22 9:19 ` Robin Murphy
2023-08-22 9:19 ` Robin Murphy
2023-08-22 16:42 ` Nicolin Chen
2023-08-22 16:42 ` Nicolin Chen
2023-08-29 15:37 ` Robin Murphy
2023-08-29 15:37 ` Robin Murphy
2023-08-29 20:15 ` Nicolin Chen
2023-08-29 20:15 ` Nicolin Chen
2023-08-29 21:25 ` Robin Murphy
2023-08-29 21:25 ` Robin Murphy
2023-08-29 22:15 ` Nicolin Chen
2023-08-29 22:15 ` Nicolin Chen
2023-08-30 21:49 ` Will Deacon
2023-08-30 21:49 ` Will Deacon
2023-08-31 17:39 ` Nicolin Chen
2023-08-31 17:39 ` Nicolin Chen
2023-09-01 0:08 ` Nicolin Chen
2023-09-01 0:08 ` Nicolin Chen
2023-09-01 18:02 ` Robin Murphy
2023-09-01 18:02 ` Robin Murphy
2023-09-01 18:23 ` Nicolin Chen
2023-09-01 18:23 ` Nicolin Chen
2024-01-20 19:59 ` Nicolin Chen
2024-01-20 19:59 ` Nicolin Chen
2024-01-22 13:01 ` Jason Gunthorpe
2024-01-22 13:01 ` Jason Gunthorpe
2024-01-22 17:24 ` Nicolin Chen
2024-01-22 17:24 ` Nicolin Chen
2024-01-22 17:57 ` Jason Gunthorpe
2024-01-22 17:57 ` Jason Gunthorpe
2024-01-24 0:11 ` Nicolin Chen
2024-01-24 0:11 ` Nicolin Chen
2024-01-25 13:55 ` Jason Gunthorpe
2024-01-25 13:55 ` Jason Gunthorpe
2024-01-25 17:23 ` Nicolin Chen
2024-01-25 17:23 ` Nicolin Chen
2024-01-25 17:47 ` Jason Gunthorpe
2024-01-25 17:47 ` Jason Gunthorpe
2024-01-25 19:55 ` Nicolin Chen
2024-01-25 19:55 ` Nicolin Chen
[not found] ` <098d64da-ecf5-4a23-bff9-a04840726ef0@huawei.com>
2024-01-25 5:09 ` Nicolin Chen
2024-01-25 5:09 ` Nicolin Chen
2023-08-22 8:45 ` [PATCH 2/3] iommu/arm-smmu-v3: Add an arm_smmu_tlb_inv_domain helper Nicolin Chen
2023-08-22 8:45 ` Nicolin Chen
2023-08-22 9:40 ` Robin Murphy
2023-08-22 9:40 ` Robin Murphy
2023-08-22 17:03 ` Nicolin Chen
2023-08-22 17:03 ` Nicolin Chen
2023-08-29 21:54 ` Robin Murphy
2023-08-29 21:54 ` Robin Murphy
2023-08-29 23:03 ` Nicolin Chen
2023-08-29 23:03 ` Nicolin Chen
2023-08-22 8:45 ` [PATCH 3/3] iommu/arm-smmu-v3: Add a max_tlbi_ops for __arm_smmu_tlb_inv_range() Nicolin Chen
2023-08-22 8:45 ` Nicolin Chen
2023-08-22 9:30 ` Robin Murphy
2023-08-22 9:30 ` Robin Murphy
2023-08-22 16:32 ` Nicolin Chen
2023-08-22 16:32 ` Nicolin Chen
2023-08-22 23:04 ` Nicolin Chen [this message]
2023-08-22 23:04 ` Nicolin Chen
2023-08-29 22:40 ` Robin Murphy
2023-08-29 22:40 ` Robin Murphy
2023-08-29 23:14 ` Nicolin Chen
2023-08-29 23:14 ` Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZOU+6hsKy4R099B3@Asurada-Nvidia \
--to=nicolinc@nvidia.com \
--cc=apopple@nvidia.com \
--cc=iommu@lists.linux.dev \
--cc=jean-philippe@linaro.org \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=robin.murphy@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.