From: Jason Gunthorpe <jgg@nvidia.com>
To: Michael Shavit <mshavit@google.com>
Cc: iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, nicolinc@nvidia.com,
tina.zhang@intel.com, jean-philippe@linaro.org, will@kernel.org,
robin.murphy@arm.com, Dawei Li <set_pte_at@outlook.com>,
Joerg Roedel <joro@8bytes.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Lu Baolu <baolu.lu@linux.intel.com>,
Mark Brown <broonie@kernel.org>
Subject: Re: [RFC PATCH v2 3/9] iommu/arm-smmu-v3: Issue invalidations commands to multiple SMMUs
Date: Tue, 22 Aug 2023 10:14:09 -0300 [thread overview]
Message-ID: <ZOS0ocsjy34N5s4l@nvidia.com> (raw)
In-Reply-To: <20230822185632.RFC.v2.3.I0f149f177e5478e28dc3223c2d10729d8f28d53a@changeid>
On Tue, Aug 22, 2023 at 06:56:59PM +0800, Michael Shavit wrote:
> Assume that devices in the smmu_domain->domain list that belong to the
> same SMMU are adjacent to each other in the list.
> Batch TLB/ATC invalidation commands for an smmu_domain by the SMMU
> devices that the domain is installed to.
>
> Signed-off-by: Michael Shavit <mshavit@google.com>
> ---
>
> Changes in v2:
> - Moved the ARM_SMMU_FEAT_BTM changes into a new preparatory commit
>
> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 6 +-
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 134 +++++++++++++-----
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
> 3 files changed, 104 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 53f65a89a55f9..fe88a7880ad57 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -112,7 +112,7 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
> arm_smmu_write_ctx_desc_devices(smmu_domain, 0, cd);
>
> /* Invalidate TLB entries previously associated with that context */
> - arm_smmu_tlb_inv_asid(smmu, asid);
> + arm_smmu_tlb_inv_asid(smmu_domain, asid);
>
> xa_erase(&arm_smmu_asid_xa, asid);
> return NULL;
> @@ -252,7 +252,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> */
> arm_smmu_write_ctx_desc_devices(smmu_domain, mm->pasid, &quiet_cd);
>
> - arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
> + arm_smmu_tlb_inv_asid(smmu_domain, smmu_mn->cd->asid);
> arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0);
>
> smmu_mn->cleared = true;
> @@ -340,7 +340,7 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
> * new TLB entry can have been formed.
> */
> if (!smmu_mn->cleared) {
> - arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
> + arm_smmu_tlb_inv_asid(smmu_domain, cd->asid);
> arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0);
> }
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index db4df9d6aef10..1d072fd38a2d6 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -960,15 +960,28 @@ static int arm_smmu_page_response(struct device *dev,
> }
>
> /* Context descriptor manipulation functions */
> -void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
> +void arm_smmu_tlb_inv_asid(struct arm_smmu_domain *smmu_domain, u16 asid)
> {
> + struct arm_smmu_device *smmu = NULL;
> + struct arm_smmu_master *master;
> struct arm_smmu_cmdq_ent cmd = {
> - .opcode = smmu->features & ARM_SMMU_FEAT_E2H ?
> - CMDQ_OP_TLBI_EL2_ASID : CMDQ_OP_TLBI_NH_ASID,
> .tlbi.asid = asid,
> };
> + unsigned long flags;
>
> - arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
> + spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> + list_for_each_entry(master, &smmu_domain->devices,
> + domain_head) {
> + if (!smmu)
> + smmu = master->smmu;
> + if (smmu != master->smmu ||
> + list_is_last(&master->domain_head, &smmu_domain->devices)) {
Finding the end of the list seems too complicated, just:
	struct arm_smmu_device *invalidated_smmu = NULL;

	list_for_each_entry(master, &smmu_domain->devices, domain_head) {
		if (master->smmu == invalidated_smmu)
			continue;
		cmd.opcode = master->smmu->features & ARM_SMMU_FEAT_E2H ?
			     CMDQ_OP_TLBI_EL2_ASID : CMDQ_OP_TLBI_NH_ASID;
		arm_smmu_cmdq_issue_cmd_with_sync(master->smmu, &cmd);
		invalidated_smmu = master->smmu;
	}
> @@ -1839,28 +1851,56 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
> arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
>
> cmds.num = 0;
> -
> spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> list_for_each_entry(master, &smmu_domain->devices, domain_head) {
> if (!master->ats_enabled)
> continue;
> + if (!smmu)
> + smmu = master->smmu;
> + if (smmu != master->smmu ||
> + list_is_last(&master->domain_head, &smmu_domain->devices)) {
> + ret = arm_smmu_cmdq_batch_submit(smmu, &cmds);
> + if (ret)
> + break;
> + cmds.num = 0;
> + }
>
> for (i = 0; i < master->num_streams; i++) {
> cmd.atc.sid = master->streams[i].id;
> - arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd);
> + arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd);
> }
> }
Doesn't the IOTLB invalidate have to come before the ATC invalidate?
So again, use the pattern as above?
> spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>
> - return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
> + return ret;
> +}
> +
> +static void arm_smmu_tlb_inv_vmid(struct arm_smmu_domain *smmu_domain)
> +{
> + struct arm_smmu_device *smmu = NULL;
> + struct arm_smmu_master *master;
> + struct arm_smmu_cmdq_ent cmd = {
> + .opcode = CMDQ_OP_TLBI_S12_VMALL,
> + .tlbi.vmid = smmu_domain->s2_cfg.vmid,
> + };
> + unsigned long flags;
> +
> + spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> + list_for_each_entry(master, &smmu_domain->devices,
> + domain_head) {
> + if (!smmu)
> + smmu = master->smmu;
> + if (smmu != master->smmu ||
> + list_is_last(&master->domain_head, &smmu_domain->devices))
> + arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
> + }
> + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
> }
I count three of these, so a macro helper is probably a good
idea. Something approximately like:

static struct arm_smmu_master *smmu_next_entry(struct arm_smmu_master *pos,
					       struct arm_smmu_domain *domain)
{
	struct arm_smmu_device *smmu = pos->smmu;

	do {
		pos = list_next_entry(pos, domain_head);
	} while (!list_entry_is_head(pos, &domain->devices, domain_head) &&
		 pos->smmu == smmu);
	return pos;
}

#define for_each_smmu(pos, domain, smmu)                                      \
	for (pos = list_first_entry(&(domain)->devices,                       \
				    struct arm_smmu_master, domain_head),     \
	     smmu = (pos)->smmu;                                              \
	     !list_entry_is_head(pos, &(domain)->devices, domain_head);       \
	     pos = smmu_next_entry(pos, domain), smmu = (pos)->smmu)
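The skip-to-next-group walk can be exercised outside the kernel with minimal stand-ins for the list helpers (hypothetical names; a singly linked circular list approximates the kernel's list_head):

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next; };

struct fake_master {
	int smmu_id;		/* stands in for pos->smmu */
	struct list_head domain_head;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Advance past the current SMMU's run of masters: stop at the first
 * master with a different SMMU, or at the list head. */
static struct list_head *smmu_next_entry(struct list_head *head,
					 struct list_head *pos)
{
	int smmu = container_of(pos, struct fake_master, domain_head)->smmu_id;

	do {
		pos = pos->next;
	} while (pos != head &&
		 container_of(pos, struct fake_master,
			      domain_head)->smmu_id == smmu);
	return pos;
}

/* Count distinct SMMU groups on a list kept sorted by SMMU. */
static int count_groups(struct list_head *head)
{
	int groups = 0;

	for (struct list_head *pos = head->next; pos != head;
	     pos = smmu_next_entry(head, pos))
		groups++;
	return groups;
}
```

The loop body only ever sees the first master of each run, which is exactly what the per-SMMU invalidation sites need.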
> @@ -1949,21 +1987,36 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
> size_t granule, bool leaf,
> struct arm_smmu_domain *smmu_domain)
> {
> + struct arm_smmu_device *smmu = NULL;
> + struct arm_smmu_master *master;
> struct arm_smmu_cmdq_ent cmd = {
> .tlbi = {
> .leaf = leaf,
> },
> };
> + unsigned long flags;
>
> - if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> - cmd.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
> - CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
> - cmd.tlbi.asid = smmu_domain->cd.asid;
> - } else {
> - cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
> - cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
> + spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> + list_for_each_entry(master, &smmu_domain->devices, domain_head) {
> + if (!smmu)
> + smmu = master->smmu;
> + if (smmu != master->smmu ||
> + list_is_last(&master->domain_head, &smmu_domain->devices)) {
> + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> + cmd.opcode = smmu->features &
> + ARM_SMMU_FEAT_E2H ?
> + CMDQ_OP_TLBI_EL2_VA :
> + CMDQ_OP_TLBI_NH_VA;
> + cmd.tlbi.asid = smmu_domain->cd.asid;
> + } else {
> + cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
> + cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
> + }
These calculations based on the smmu_domain shouldn't be in the loop;
the smmu_domain doesn't change.
> - __arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
> + spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> + list_for_each_entry(master, &smmu_domain->devices, domain_head) {
> + if (!smmu)
> + smmu = master->smmu;
> + if (smmu != master->smmu ||
> + list_is_last(&master->domain_head, &smmu_domain->devices)) {
> + if (skip_btm_capable_devices &&
> + smmu->features & ARM_SMMU_FEAT_BTM)
> + continue;
> + cmd.opcode = smmu->features & ARM_SMMU_FEAT_E2H ?
> + CMDQ_OP_TLBI_EL2_VA :
> + CMDQ_OP_TLBI_NH_VA;
There are three places doing this if; maybe it should be in a wrapper
of __arm_smmu_tlb_inv_range?
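A minimal sketch of what such a wrapper could centralize (hypothetical helper name; the constants are illustrative, not the real CMDQ encodings):

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_FEAT_E2H	(1u << 0)	/* stands in for ARM_SMMU_FEAT_E2H */
enum { FAKE_OP_TLBI_NH_VA = 0x11, FAKE_OP_TLBI_EL2_VA = 0x22 };

/* One place to pick the VA-invalidation opcode from the E2H feature,
 * instead of repeating the ternary at every call site. */
static uint8_t tlbi_va_opcode(uint32_t features)
{
	return (features & FAKE_FEAT_E2H) ? FAKE_OP_TLBI_EL2_VA :
					    FAKE_OP_TLBI_NH_VA;
}
```

In the driver the wrapper would take the arm_smmu_device, set cmd.opcode, and call __arm_smmu_tlb_inv_range, so each loop body shrinks to one call.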
Jason