* [PATCH v3 0/2] iommu/arm-smmu-v3: Allow default substream bypass with a pasid support
From: Nicolin Chen @ 2023-08-25 10:31 UTC
To: will, robin.murphy, jgg
Cc: joro, mshavit, linux-kernel, linux-arm-kernel, iommu

(This series is rebased on top of Michael's refactor series [1])

When an iommu_domain is set to IOMMU_DOMAIN_IDENTITY, the driver sets
arm_smmu_domain->stage to ARM_SMMU_DOMAIN_BYPASS, skips the allocation of
a CD table, and then writes STRTAB_STE_0_CFG_BYPASS to the Config field of
the STE. This works well for devices that only have one substream, i.e.
with pasid disabled.

A pasid-capable device, however, may want to attach an IDENTITY domain
without disabling its pasid feature. This requires the driver to allocate
a multi-entry CD table, to attach the IDENTITY domain to the default
substream, and to set the S1DSS field of the STE to
STRTAB_STE_1_S1DSS_BYPASS.

So there is a missing link between the STE setup and an IDENTITY domain
attachment. This series fills the gap for the use case above.

The first patch corrects the conditions for setting ats_enabled and for
calling arm_smmu_alloc_cd_tables(), so that the use case above can set
ats_enabled and allocate a CD table correctly.

The second patch reworks arm_smmu_write_strtab_ent() around all possible
configurations of the STE.Config field.

[1] https://lore.kernel.org/all/20230816131925.2521220-1-mshavit@google.com/

---
Changelog
v3:
 * Replaced ARM_SMMU_DOMAIN_BYPASS_S1DSS with two boolean flags to correct
   the conditions of STE bypass and CD table allocation.
 * Reworked arm_smmu_write_strtab_ent() with four helper functions
v2: https://lore.kernel.org/all/20230817042135.32822-1-nicolinc@nvidia.com/
 * Rebased on top of Michael's series reworking CD table ownership [1]
 * Added a new ARM_SMMU_DOMAIN_BYPASS_S1DSS stage to tag the use case
v1: https://lore.kernel.org/all/20230627033326.5236-1-nicolinc@nvidia.com/

Nicolin Chen (2):
  iommu/arm-smmu-v3: Add boolean bypass_ste and skip_cdtab flags
  iommu/arm-smmu-v3: Refactor arm_smmu_write_strtab_ent()

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 236 ++++++++++++--------
 1 file changed, 139 insertions(+), 97 deletions(-)

base-commit: acd552d4b3b14d639784ea5ccfd61ba1fa85a16b
-- 
2.42.0
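To make the two IDENTITY attachment outcomes described in the cover letter concrete, here is a minimal user-space sketch of the STE bits involved. It is not driver code: the SKETCH_* names, the struct and the function are hypothetical, and the bit positions (V at bit 0 and Config at bits [3:1] of dword 0, S1DSS at bits [1:0] of dword 1) as well as the 0b101 value for "stage-1 translate" are assumptions taken from my reading of the SMMUv3 STE layout. Only the 0b100 and 0b01 bypass values are stated in this series. The CD table pointer, S1CDMAX, S1FMT and the other stage-1 fields are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout, see the note above. */
#define SKETCH_STE_0_V			(1ULL << 0)
#define SKETCH_STE_0_CFG_SHIFT		1
#define SKETCH_STE_0_CFG_BYPASS		0x4ULL	/* Config[2:0] == 0b100 */
#define SKETCH_STE_0_CFG_S1_TRANS	0x5ULL	/* Config[2:0] == 0b101 (assumed) */
#define SKETCH_STE_1_S1DSS_SHIFT	0
#define SKETCH_STE_1_S1DSS_BYPASS	0x1ULL	/* S1DSS[1:0] == 0b01 */

/* Hypothetical container for the first two STE dwords. */
struct sketch_ste {
	uint64_t dword0;
	uint64_t dword1;
};

/*
 * Model of an IDENTITY domain attachment:
 *  - ssid_bits == 0: one substream only, so the whole STE goes to bypass.
 *  - ssid_bits != 0: keep stage-1 translation enabled for the non-default
 *    substreams and bypass only the default substream through S1DSS.
 */
static struct sketch_ste sketch_identity_ste(unsigned int ssid_bits)
{
	struct sketch_ste ste = { .dword0 = SKETCH_STE_0_V, .dword1 = 0 };

	if (!ssid_bits) {
		ste.dword0 |= SKETCH_STE_0_CFG_BYPASS << SKETCH_STE_0_CFG_SHIFT;
	} else {
		ste.dword0 |= SKETCH_STE_0_CFG_S1_TRANS << SKETCH_STE_0_CFG_SHIFT;
		ste.dword1 |= SKETCH_STE_1_S1DSS_BYPASS << SKETCH_STE_1_S1DSS_SHIFT;
	}
	return ste;
}

/* Small self-check showing the expected encodings under these assumptions. */
static void sketch_identity_ste_checks(void)
{
	struct sketch_ste single = sketch_identity_ste(0);
	struct sketch_ste multi = sketch_identity_ste(5);

	assert(single.dword0 == 0x9 && single.dword1 == 0x0);
	assert(multi.dword0 == 0xb && multi.dword1 == 0x1);
}
```

The point of the sketch is only that the second case never sets Config to bypass: the bypass behaviour for the default substream comes from S1DSS alone, which is why a CD table is still needed.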
* [PATCH v3 1/2] iommu/arm-smmu-v3: Add boolean bypass_ste and skip_cdtab flags
From: Nicolin Chen @ 2023-08-25 10:31 UTC
To: will, robin.murphy, jgg
Cc: joro, mshavit, linux-kernel, linux-arm-kernel, iommu

If a master has only a default substream, it can skip CD/translation table
allocations when being attached to an IDENTITY domain, by simply setting
the STE to the "bypass" mode (STE.Config[2:0] == 0b100).

If a master has multiple substreams, it still needs a CD table for the
non-default substreams when being attached to an IDENTITY domain, in which
case STE.Config is set to the "stage-1 translate" mode while the STE.S1DSS
field is set to the "bypass" mode instead (STE.S1DSS[1:0] == 0b01).

If a master is attached to a stage-2 domain, it does not need a CD table,
and STE.Config is set to the "stage-2 translate" mode.

Add boolean bypass_ste and skip_cdtab flags in arm_smmu_attach_dev() to
handle the cases above explicitly. This also corrects the conditions for
setting ats_enabled and for calling arm_smmu_alloc_cd_tables(), so that
they cover the second case.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 35 ++++++++++++++++-----
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ffd430948e9e..de8bc4c3ad7a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2406,6 +2406,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_master *master;
+	bool bypass_ste, skip_cdtab;
 
 	if (!fwspec)
 		return -ENOENT;
@@ -2441,6 +2442,24 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 
 	master->domain = smmu_domain;
 
+	/*
+	 * When master attaches ARM_SMMU_DOMAIN_BYPASS to its single substream,
+	 * set STE.Config to "bypass" and skip a CD table allocation. Otherwise,
+	 * set STE.Config to "stage-1 translate" and allocate a CD table for its
+	 * multiple stage-1 substream support, unless with a stage-2 domain in
+	 * which case set STE.Config to "stage-2 translate" and skip a CD table.
+	 */
+	if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS && !master->ssid_bits) {
+		bypass_ste = true;
+		skip_cdtab = true;
+	} else {
+		bypass_ste = false;
+		if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2)
+			skip_cdtab = true;
+		else
+			skip_cdtab = false;
+	}
+
 	/*
 	 * The SMMU does not support enabling ATS with bypass. When the STE is
 	 * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
@@ -2448,22 +2467,22 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	 * stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
 	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
 	 */
-	if (smmu_domain->stage != ARM_SMMU_DOMAIN_BYPASS)
+	if (!bypass_ste)
 		master->ats_enabled = arm_smmu_ats_supported(master);
 
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 	list_add(&master->domain_head, &smmu_domain->devices);
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
-		if (!master->cd_table.cdtab) {
-			ret = arm_smmu_alloc_cd_tables(master);
-			if (ret) {
-				master->domain = NULL;
-				goto out_list_del;
-			}
+	if (!skip_cdtab && !master->cd_table.cdtab) {
+		ret = arm_smmu_alloc_cd_tables(master);
+		if (ret) {
+			master->domain = NULL;
+			goto out_list_del;
 		}
+	}
 
+	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
 		/*
 		 * Prevent SVA from concurrently modifying the CD or writing to
 		 * the CD entry
-- 
2.42.0
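The flag selection added to arm_smmu_attach_dev() in this patch can be read as a small decision table. The following stand-alone sketch models it outside the kernel; the enum, struct and function names are hypothetical and only mirror the driver's ARM_SMMU_DOMAIN_* stages and master->ssid_bits for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's domain stages. */
enum sketch_stage {
	SKETCH_STAGE_S1,
	SKETCH_STAGE_S2,
	SKETCH_STAGE_NESTED,
	SKETCH_STAGE_BYPASS,
};

struct sketch_attach_flags {
	bool bypass_ste;	/* STE.Config goes to "bypass", so no ATS */
	bool skip_cdtab;	/* no CD table allocation needed */
};

/*
 * Same outcome as the if/else block in the patch:
 *  - BYPASS domain on a single-substream master: bypass the STE, no CD table.
 *  - S2 domain: translate at stage 2, no CD table.
 *  - everything else (S1, NESTED, BYPASS with substreams): CD table needed.
 */
static struct sketch_attach_flags
sketch_attach_flags_for(enum sketch_stage stage, unsigned int ssid_bits)
{
	struct sketch_attach_flags flags;

	flags.bypass_ste = (stage == SKETCH_STAGE_BYPASS && !ssid_bits);
	flags.skip_cdtab = flags.bypass_ste || stage == SKETCH_STAGE_S2;
	return flags;
}

/* Self-check covering the three cases from the commit message. */
static void sketch_attach_flags_checks(void)
{
	struct sketch_attach_flags f;

	f = sketch_attach_flags_for(SKETCH_STAGE_BYPASS, 0);
	assert(f.bypass_ste && f.skip_cdtab);

	f = sketch_attach_flags_for(SKETCH_STAGE_BYPASS, 5);
	assert(!f.bypass_ste && !f.skip_cdtab);

	f = sketch_attach_flags_for(SKETCH_STAGE_S2, 5);
	assert(!f.bypass_ste && f.skip_cdtab);
}
```

Written this way, it is easier to see that ATS stays usable for an IDENTITY domain on a multi-substream master, since bypass_ste is false in that case.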
* [PATCH v3 2/2] iommu/arm-smmu-v3: Refactor arm_smmu_write_strtab_ent()
From: Nicolin Chen @ 2023-08-25 10:31 UTC
To: will, robin.murphy, jgg
Cc: joro, mshavit, linux-kernel, linux-arm-kernel, iommu

A stream table entry generally can be configured for the following cases:

Case #1: STE Stage-1 Translate Only
  The master has a CD table and is attached to an S1 or BYPASS domain.
  [Config #1] Set STE.Config to S1_TRANS. And set STE.SHCFG to INCOMING,
              required by a BYPASS domain and ignored by an S1 domain.
              Then follow the CD table to set the other fields.

Case #2: STE Stage-2 Translate Only
  The master doesn't have a CD table and is attached to an S2 domain.
  [Config #2] Set STE.Config to S2_TRANS. Then follow the s2_cfg to set
              the other fields.

Case #3: STE Stage-1 and Stage-2 Translate
  The master has a CD table and is attached to a NESTED domain that has
  an s2_cfg somewhere for the stage-2 fields.
  [Config #1 + Config #2] Set STE.Config to S1_TRANS | S2_TRANS. Then
              follow both the CD table and the s2_cfg to set the other
              fields.

Case #4: STE Bypass
  The master doesn't have a CD table and is attached to an IDENTITY domain.
  [Config #3] Set STE.Config to BYPASS and set STE.SHCFG to INCOMING.

Case #5: STE Abort
  The master is not attached to any domain, and the "disable_bypass"
  param is set to "true".
  [Config #4] Set STE.Config to ABORT.

After the recent refactor of moving cd/cd_table ownerships, things in
arm_smmu_write_strtab_ent() are a bit out of date, e.g. the master pointer
is now always available. It also doesn't support the special case of
attaching a BYPASS domain to a multi-ssid master, as in case #1.

Add clearly named helpers for the four STE.Config settings, Config #1 to
Config #4. Case #3 can be covered by calling Config #2 at the end of
Config #1, though the driver currently doesn't really use it and should be
updated to the ongoing nesting design in IOMMUFD.
Yet, the helpers would be able to simply support that in the future by
adding very limited changes in the switch-case in
arm_smmu_ste_stage2_translate().

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 201 +++++++++++---------
 1 file changed, 112 insertions(+), 89 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index de8bc4c3ad7a..c2ebbc916a2e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1251,6 +1251,91 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
 	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
 }
 
+static void arm_smmu_ste_stage2_translate(struct arm_smmu_master *master,
+					  u64 *ste)
+{
+	struct arm_smmu_domain *smmu_domain = master->domain;
+	struct arm_smmu_device *smmu = master->smmu;
+	struct arm_smmu_s2_cfg *s2_cfg;
+
+	switch (smmu_domain->stage) {
+	case ARM_SMMU_DOMAIN_NESTED:
+	case ARM_SMMU_DOMAIN_S2:
+		s2_cfg = &smmu_domain->s2_cfg;
+		break;
+	default:
+		WARN_ON(1);
+		return;
+	}
+
+	ste[0] |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
+
+	if (smmu->features & ARM_SMMU_FEAT_STALLS && !master->stall_enabled)
+		ste[1] |= STRTAB_STE_1_S1STALLD;
+
+	if (master->ats_enabled)
+		ste[1] |= FIELD_PREP(STRTAB_STE_1_EATS, STRTAB_STE_1_EATS_TRANS);
+
+	ste[2] |= FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
+		  FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
+#ifdef __BIG_ENDIAN
+		  STRTAB_STE_2_S2ENDI |
+#endif
+		  STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2R;
+
+	ste[3] |= s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK;
+}
+
+static void arm_smmu_ste_stage1_translate(struct arm_smmu_master *master,
+					  u64 *ste)
+{
+	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
+	struct arm_smmu_device *smmu = master->smmu;
+	__le64 *cdptr = arm_smmu_get_cd_ptr(master, 0);
+
+	WARN_ON_ONCE(!cdptr);
+
+	ste[0] |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
+		  FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
+		  FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
+		  FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
+
+	if (FIELD_GET(CTXDESC_CD_0_ASID, le64_to_cpu(cdptr[0])))
+		ste[1] |= FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0);
+	else
+		ste[1] |= FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_BYPASS);
+
+	ste[1] |= FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING) |
+		  FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
+		  FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
+		  FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH);
+
+	if (smmu->features & ARM_SMMU_FEAT_E2H)
+		ste[1] |= FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_EL2);
+	else
+		ste[1] |= FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1);
+
+	if (smmu->features & ARM_SMMU_FEAT_STALLS && !master->stall_enabled)
+		ste[1] |= STRTAB_STE_1_S1STALLD;
+
+	if (master->ats_enabled)
+		ste[1] |= FIELD_PREP(STRTAB_STE_1_EATS, STRTAB_STE_1_EATS_TRANS);
+
+	if (master->domain->stage == ARM_SMMU_DOMAIN_NESTED)
+		arm_smmu_ste_stage2_translate(master, ste);
+}
+
+static void arm_smmu_ste_abort(u64 *ste)
+{
+	ste[0] |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
+}
+
+static void arm_smmu_ste_bypass(u64 *ste)
+{
+	ste[0] |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
+	ste[1] |= FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING);
+}
+
 static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 				      __le64 *dst)
 {
@@ -1270,12 +1355,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	 * 2. Write everything apart from dword 0, sync, write dword 0, sync
 	 * 3. Update Config, sync
 	 */
-	u64 val = le64_to_cpu(dst[0]);
+	int i;
+	u64 ste[4] = {0};
+	bool ste_sync_all = false;
 	bool ste_live = false;
-	struct arm_smmu_device *smmu = NULL;
-	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
-	struct arm_smmu_s2_cfg *s2_cfg = NULL;
-	struct arm_smmu_domain *smmu_domain = NULL;
+	struct arm_smmu_device *smmu = master->smmu;
 	struct arm_smmu_cmdq_ent prefetch_cmd = {
 		.opcode		= CMDQ_OP_PREFETCH_CFG,
 		.prefetch	= {
@@ -1283,27 +1367,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		},
 	};
 
-	if (master) {
-		smmu_domain = master->domain;
-		smmu = master->smmu;
-	}
-
-	if (smmu_domain) {
-		switch (smmu_domain->stage) {
-		case ARM_SMMU_DOMAIN_S1:
-			cd_table = &master->cd_table;
-			break;
-		case ARM_SMMU_DOMAIN_S2:
-		case ARM_SMMU_DOMAIN_NESTED:
-			s2_cfg = &smmu_domain->s2_cfg;
-			break;
-		default:
-			break;
-		}
-	}
-
-	if (val & STRTAB_STE_0_V) {
-		switch (FIELD_GET(STRTAB_STE_0_CFG, val)) {
+	if (le64_to_cpu(dst[0]) & STRTAB_STE_0_V) {
+		switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(dst[0]))) {
 		case STRTAB_STE_0_CFG_BYPASS:
 			break;
 		case STRTAB_STE_0_CFG_S1_TRANS:
@@ -1318,78 +1383,36 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		}
 	}
 
-	/* Nuke the existing STE_0 value, as we're going to rewrite it */
-	val = STRTAB_STE_0_V;
-
-	/* Bypass/fault */
-	if (!smmu_domain || !(cd_table || s2_cfg)) {
-		if (!smmu_domain && disable_bypass)
-			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
-		else
-			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
-
-		dst[0] = cpu_to_le64(val);
-		dst[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
-						STRTAB_STE_1_SHCFG_INCOMING));
-		dst[2] = 0; /* Nuke the VMID */
-		/*
-		 * The SMMU can perform negative caching, so we must sync
-		 * the STE regardless of whether the old value was live.
-		 */
-		if (smmu)
-			arm_smmu_sync_ste_for_sid(smmu, sid);
-		master->cd_table.installed = false;
-		return;
-	}
-
-	if (cd_table) {
-		u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
-			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
+	ste[0] = STRTAB_STE_0_V;
 
+	if (master->cd_table.cdtab && master->domain) {
+		BUG_ON(ste_live);
+		arm_smmu_ste_stage1_translate(master, ste);
+		master->cd_table.installed = true;
+	} else if (master->domain &&
+		   master->domain->stage == ARM_SMMU_DOMAIN_S2) {
 		BUG_ON(ste_live);
-		dst[1] = cpu_to_le64(
-			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
-			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
-			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
-			 FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
-			 FIELD_PREP(STRTAB_STE_1_STRW, strw));
-
-		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
-		    !master->stall_enabled)
-			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
-
-		val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
-			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
-			FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
-			FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
-		cd_table->installed = true;
+		arm_smmu_ste_stage2_translate(master, ste);
+		master->cd_table.installed = false;
+	} else if (!master->domain && disable_bypass) { /* Master is detached */
+		arm_smmu_ste_abort(ste);
+		master->cd_table.installed = false;
 	} else {
+		arm_smmu_ste_bypass(ste);
 		master->cd_table.installed = false;
 	}
 
-	if (s2_cfg) {
-		BUG_ON(ste_live);
-		dst[2] = cpu_to_le64(
-			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
-			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
-#ifdef __BIG_ENDIAN
-			 STRTAB_STE_2_S2ENDI |
-#endif
-			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
-			 STRTAB_STE_2_S2R);
-
-		dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
-
-		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
+	for (i = 1; i < 4; i++) {
+		if (dst[i] == cpu_to_le64(ste[i]))
+			continue;
+		dst[i] = cpu_to_le64(ste[i]);
+		ste_sync_all = true;
 	}
 
-	if (master->ats_enabled)
-		dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
-						 STRTAB_STE_1_EATS_TRANS));
-
-	arm_smmu_sync_ste_for_sid(smmu, sid);
+	if (ste_sync_all)
+		arm_smmu_sync_ste_for_sid(smmu, sid);
 	/* See comment in arm_smmu_write_ctx_desc() */
-	WRITE_ONCE(dst[0], cpu_to_le64(val));
+	WRITE_ONCE(dst[0], cpu_to_le64(ste[0]));
 	arm_smmu_sync_ste_for_sid(smmu, sid);
 
 	/* It's likely that we'll want to use the new STE soon */
-- 
2.42.0
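The write-back ordering at the end of the reworked arm_smmu_write_strtab_ent() (update dwords 1 to 3 only where they differ, sync once if anything changed, then write dword 0 and sync again) can be modelled in user space as below. This is a simplified sketch, not the driver logic: the function name is hypothetical, the sync is reduced to a counter, and the cpu_to_le64() conversions and WRITE_ONCE() used by the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Copy a freshly built STE into the live entry and return how many syncs
 * the driver would have issued. dst[] models the entry in the stream
 * table, ste[] the staging array built by the helpers.
 */
static unsigned int sketch_write_ste(uint64_t dst[4], const uint64_t ste[4])
{
	unsigned int syncs = 0;
	bool sync_all = false;
	int i;

	assert(dst && ste);

	/* Dwords 1..3 first, so dword 0 never points at stale fields. */
	for (i = 1; i < 4; i++) {
		if (dst[i] == ste[i])
			continue;
		dst[i] = ste[i];
		sync_all = true;
	}

	/* Stands in for arm_smmu_sync_ste_for_sid(), only when needed. */
	if (sync_all)
		syncs++;

	/* Dword 0 carries V and Config, so it is written last and synced. */
	dst[0] = ste[0];
	syncs++;

	return syncs;
}

/* Self-check: a first write syncs twice, an identical rewrite only once. */
static void sketch_write_ste_checks(void)
{
	uint64_t live[4] = { 0, 0, 0, 0 };
	const uint64_t staged[4] = { 0xb, 0x1, 0, 0 };

	assert(sketch_write_ste(live, staged) == 2);
	assert(live[0] == 0xb && live[1] == 0x1);
	assert(sketch_write_ste(live, staged) == 1);
}
```

Building the whole entry in a local array first is what lets the four helpers stay free of endianness and ordering concerns, since those are handled in one place when the array is copied out.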