* [PATCH v3 1/2] iommu/arm-smmu-v3: Add boolean bypass_ste and skip_cdtab flags
2023-08-25 10:31 [PATCH v3 0/2] iommu/arm-smmu-v3: Allow default substream bypass with a pasid support Nicolin Chen
@ 2023-08-25 10:31 ` Nicolin Chen
2023-08-25 10:31 ` [PATCH v3 2/2] iommu/arm-smmu-v3: Refactor arm_smmu_write_strtab_ent() Nicolin Chen
1 sibling, 0 replies; 3+ messages in thread
From: Nicolin Chen @ 2023-08-25 10:31 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, mshavit, linux-kernel, linux-arm-kernel, iommu
If a master has only a default substream, it can skip CD/translation table
allocations when being attached to an IDENTITY domain, by simply setting
STE to the "bypass" mode (STE.Config[2:0] == 0b100).
If a master has multiple substreams, it will still need a CD table for the
non-default substreams when being attached to an IDENTITY domain, in which
case the STE.Config is set to the "stage-1 translate" mode while STE.S1DSS
field instead is set to the "bypass" mode (STE.S1DSS[1:0] == 0b01).
If a master is attached to a stage-2 domain, it does not need a CD table,
while the STE.Config is set to the "stage-2 translate" mode.
Add boolean bypass_ste and skip_cdtab flags in arm_smmu_attach_dev() to
handle the cases above clearly. This also corrects the conditions for the
ats_enabled setting and the arm_smmu_alloc_cd_tables() call so that they
cover the second use case.
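The three cases above can be sketched as a small standalone decision
function. This is only an illustration: the demo_* names are hypothetical
stand-ins and not part of the driver, and the enum merely mirrors the
domain stages mentioned in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's domain stages. */
enum demo_stage { DEMO_S1, DEMO_S2, DEMO_NESTED, DEMO_BYPASS };

struct demo_attach_flags {
	bool bypass_ste;	/* STE.Config is set to "bypass" */
	bool skip_cdtab;	/* no CD table allocation is needed */
};

/*
 * Mirror the flag selection described above:
 *  - BYPASS domain, default substream only: bypass STE, no CD table
 *  - stage-2 domain: translating STE, no CD table
 *  - everything else: translating STE with a CD table
 */
static struct demo_attach_flags demo_pick_flags(enum demo_stage stage,
						unsigned int ssid_bits)
{
	struct demo_attach_flags f;

	assert((unsigned int)stage <= DEMO_BYPASS);

	f.bypass_ste = (stage == DEMO_BYPASS && !ssid_bits);
	f.skip_cdtab = f.bypass_ste || stage == DEMO_S2;
	return f;
}
```

With this sketch, a BYPASS domain on a master that has a non-zero
ssid_bits yields bypass_ste == false and skip_cdtab == false, which is the
second use case.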
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 35 ++++++++++++++++-----
1 file changed, 27 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ffd430948e9e..de8bc4c3ad7a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2406,6 +2406,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_master *master;
+ bool bypass_ste, skip_cdtab;
if (!fwspec)
return -ENOENT;
@@ -2441,6 +2442,24 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
master->domain = smmu_domain;
+ /*
+ * When master attaches ARM_SMMU_DOMAIN_BYPASS to its single substream,
+ * set STE.Config to "bypass" and skip a CD table allocation. Otherwise,
+ * set STE.Config to "stage-1 translate" and allocate a CD table for its
+ * multiple stage-1 substream support, unless with a stage-2 domain in
+ * which case set STE.Config to "stage-2 translate" and skip a CD table.
+ */
+ if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS && !master->ssid_bits) {
+ bypass_ste = true;
+ skip_cdtab = true;
+ } else {
+ bypass_ste = false;
+ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2)
+ skip_cdtab = true;
+ else
+ skip_cdtab = false;
+ }
+
/*
* The SMMU does not support enabling ATS with bypass. When the STE is
* in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
@@ -2448,22 +2467,22 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
* stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
* F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
*/
- if (smmu_domain->stage != ARM_SMMU_DOMAIN_BYPASS)
+ if (!bypass_ste)
master->ats_enabled = arm_smmu_ats_supported(master);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_add(&master->domain_head, &smmu_domain->devices);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
- if (!master->cd_table.cdtab) {
- ret = arm_smmu_alloc_cd_tables(master);
- if (ret) {
- master->domain = NULL;
- goto out_list_del;
- }
+ if (!skip_cdtab && !master->cd_table.cdtab) {
+ ret = arm_smmu_alloc_cd_tables(master);
+ if (ret) {
+ master->domain = NULL;
+ goto out_list_del;
}
+ }
+ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
/*
* Prevent SVA from concurrently modifying the CD or writing to
* the CD entry
--
2.42.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH v3 2/2] iommu/arm-smmu-v3: Refactor arm_smmu_write_strtab_ent()
2023-08-25 10:31 [PATCH v3 0/2] iommu/arm-smmu-v3: Allow default substream bypass with a pasid support Nicolin Chen
2023-08-25 10:31 ` [PATCH v3 1/2] iommu/arm-smmu-v3: Add boolean bypass_ste and skip_cdtab flags Nicolin Chen
@ 2023-08-25 10:31 ` Nicolin Chen
1 sibling, 0 replies; 3+ messages in thread
From: Nicolin Chen @ 2023-08-25 10:31 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, mshavit, linux-kernel, linux-arm-kernel, iommu
A stream table entry generally can be configured for the following cases:
Case #1: STE Stage-1 Translate Only
The master has a CD table and is attached to an S1 or BYPASS domain.
[Config #1] Set STE.Config to S1_TRANS. And set STE.SHCFG to INCOMING,
required by a BYPASS domain and ignored by an S1 domain.
Then follow the CD table to set the other fields.
Case #2: STE Stage-2 Translate Only
The master doesn't have a CD table and is attached to an S2 domain.
[Config #2] Set STE.Config to S2_TRANS. Then follow the s2_cfg to set the
other fields.
Case #3: STE Stage-1 and Stage-2 Translate
The master has allocated a CD table and is attached to a NESTED domain that
provides an s2_cfg for the stage-2 fields.
[Config #5] Set STE.Config to S1_TRANS | S2_TRANS. Then follow both the CD
table and the s2_cfg to set the other fields.
Case #4: STE Bypass
The master doesn't have a CD table and is attached to an IDENTITY domain.
[Config #3] Set STE.Config to BYPASS and set STE.SHCFG to INCOMING.
Case #5: STE Abort
The master is not attached to any domain, and the "disable_bypass" param
is set to "true".
[Config #4] Set STE.Config to ABORT.
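As a rough illustration of how the cases above map to an STE.Config value,
the selection can be modelled as a standalone function. The cfg_* names are
hypothetical. Only the bypass encoding (0b100) is stated in this series;
the other encodings are assumptions taken from my reading of the SMMUv3
STE.Config field and should be checked against the specification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's domain stages. */
enum cfg_stage { CFG_STAGE_S1, CFG_STAGE_S2, CFG_STAGE_NESTED, CFG_STAGE_BYPASS };

/* STE.Config[2:0] encodings; all but BYPASS are assumptions. */
enum cfg_value {
	CFG_ABORT    = 0x0,	/* assumed 0b000 */
	CFG_BYPASS   = 0x4,	/* 0b100, as stated in patch 1/2 */
	CFG_S1_TRANS = 0x5,	/* assumed 0b101 */
	CFG_S2_TRANS = 0x6,	/* assumed 0b110 */
};

/* Pick STE.Config from the attach state, following the case list above. */
static unsigned int cfg_select(bool attached, bool has_cdtab,
			       enum cfg_stage stage, bool disable_bypass)
{
	assert((unsigned int)stage <= CFG_STAGE_BYPASS);

	if (attached && has_cdtab) {
		unsigned int cfg = CFG_S1_TRANS;

		/* A NESTED domain adds the stage-2 bits on top of stage 1. */
		if (stage == CFG_STAGE_NESTED)
			cfg |= CFG_S2_TRANS;
		return cfg;
	}
	if (attached && stage == CFG_STAGE_S2)
		return CFG_S2_TRANS;
	if (!attached && disable_bypass)
		return CFG_ABORT;
	return CFG_BYPASS;
}
```

Note that a BYPASS domain attached to a master that owns a CD table lands
in the first branch and gets S1_TRANS, which is the special situation
covered by case #1.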
After the recent refactor that moved the cd/cd_table ownership, parts of
arm_smmu_write_strtab_ent() are a bit out of date, e.g. the master pointer
is now always available. It also doesn't support the special case of
attaching a BYPASS domain to a multi-ssid master in case #1.
Add clearly named helpers for the first four STE.Config settings.
Case #3 can be covered by calling Config #2 at the end of Config #1,
though the driver currently doesn't really use it and should be updated to
follow the ongoing nesting design in IOMMUFD. Still, the helpers would be
able to support that in the future with very limited changes to the
switch-case in arm_smmu_ste_stage2_translate().
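For the special case mentioned above (a BYPASS domain attached to a
multi-ssid master), the stage-1 helper picks STE.S1DSS from the ASID found
in the first CD entry. A minimal standalone sketch of that selection
follows. The s1dss_* names are hypothetical, the field position at bits
[1:0] of the second STE dword and the SSID0 encoding (0b10) are
assumptions, and only the bypass encoding (0b01) is stated in patch 1/2.

```c
#include <assert.h>
#include <stdint.h>

#define S1DSS_DEMO_MASK		0x3ULL	/* assumed: bits [1:0] of STE dword 1 */
#define S1DSS_DEMO_BYPASS	0x1ULL	/* 0b01, as stated in patch 1/2 */
#define S1DSS_DEMO_SSID0	0x2ULL	/* assumed 0b10 */

/*
 * Return ste1 with the S1DSS field replaced: a non-zero ASID in CD[0]
 * means the default substream translates through CD[0], otherwise the
 * default substream bypasses stage 1.
 */
static uint64_t s1dss_demo_set(uint64_t ste1, uint16_t cd0_asid)
{
	uint64_t s1dss = cd0_asid ? S1DSS_DEMO_SSID0 : S1DSS_DEMO_BYPASS;

	assert((s1dss & ~S1DSS_DEMO_MASK) == 0);
	return (ste1 & ~S1DSS_DEMO_MASK) | s1dss;
}
```

Using a zero ASID as the "no stage-1 context" indicator follows what the
helper in this patch does; whether that is the best indicator is a design
choice of the patch, not something this sketch establishes.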
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 201 +++++++++++---------
1 file changed, 112 insertions(+), 89 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index de8bc4c3ad7a..c2ebbc916a2e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1251,6 +1251,91 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
+static void arm_smmu_ste_stage2_translate(struct arm_smmu_master *master,
+ u64 *ste)
+{
+ struct arm_smmu_domain *smmu_domain = master->domain;
+ struct arm_smmu_device *smmu = master->smmu;
+ struct arm_smmu_s2_cfg *s2_cfg;
+
+ switch (smmu_domain->stage) {
+ case ARM_SMMU_DOMAIN_NESTED:
+ case ARM_SMMU_DOMAIN_S2:
+ s2_cfg = &smmu_domain->s2_cfg;
+ break;
+ default:
+ WARN_ON(1);
+ return;
+ }
+
+ ste[0] |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
+
+ if (smmu->features & ARM_SMMU_FEAT_STALLS && !master->stall_enabled)
+ ste[1] |= STRTAB_STE_1_S1STALLD;
+
+ if (master->ats_enabled)
+ ste[1] |= FIELD_PREP(STRTAB_STE_1_EATS, STRTAB_STE_1_EATS_TRANS);
+
+ ste[2] |= FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
+ FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
+#ifdef __BIG_ENDIAN
+ STRTAB_STE_2_S2ENDI |
+#endif
+ STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2R;
+
+ ste[3] |= s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK;
+}
+
+static void arm_smmu_ste_stage1_translate(struct arm_smmu_master *master,
+ u64 *ste)
+{
+ struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
+ struct arm_smmu_device *smmu = master->smmu;
+ __le64 *cdptr = arm_smmu_get_cd_ptr(master, 0);
+
+ WARN_ON_ONCE(!cdptr);
+
+ ste[0] |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
+ FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
+ FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
+ FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
+
+ if (FIELD_GET(CTXDESC_CD_0_ASID, le64_to_cpu(cdptr[0])))
+ ste[1] |= FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0);
+ else
+ ste[1] |= FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_BYPASS);
+
+ ste[1] |= FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING) |
+ FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
+ FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
+ FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH);
+
+ if (smmu->features & ARM_SMMU_FEAT_E2H)
+ ste[1] |= FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_EL2);
+ else
+ ste[1] |= FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1);
+
+ if (smmu->features & ARM_SMMU_FEAT_STALLS && !master->stall_enabled)
+ ste[1] |= STRTAB_STE_1_S1STALLD;
+
+ if (master->ats_enabled)
+ ste[1] |= FIELD_PREP(STRTAB_STE_1_EATS, STRTAB_STE_1_EATS_TRANS);
+
+ if (master->domain->stage == ARM_SMMU_DOMAIN_NESTED)
+ arm_smmu_ste_stage2_translate(master, ste);
+}
+
+static void arm_smmu_ste_abort(u64 *ste)
+{
+ ste[0] |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
+}
+
+static void arm_smmu_ste_bypass(u64 *ste)
+{
+ ste[0] |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
+ ste[1] |= FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING);
+}
+
static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
__le64 *dst)
{
@@ -1270,12 +1355,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
* 2. Write everything apart from dword 0, sync, write dword 0, sync
* 3. Update Config, sync
*/
- u64 val = le64_to_cpu(dst[0]);
+ int i;
+ u64 ste[4] = {0};
+ bool ste_sync_all = false;
bool ste_live = false;
- struct arm_smmu_device *smmu = NULL;
- struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
- struct arm_smmu_s2_cfg *s2_cfg = NULL;
- struct arm_smmu_domain *smmu_domain = NULL;
+ struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_cmdq_ent prefetch_cmd = {
.opcode = CMDQ_OP_PREFETCH_CFG,
.prefetch = {
@@ -1283,27 +1367,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
},
};
- if (master) {
- smmu_domain = master->domain;
- smmu = master->smmu;
- }
-
- if (smmu_domain) {
- switch (smmu_domain->stage) {
- case ARM_SMMU_DOMAIN_S1:
- cd_table = &master->cd_table;
- break;
- case ARM_SMMU_DOMAIN_S2:
- case ARM_SMMU_DOMAIN_NESTED:
- s2_cfg = &smmu_domain->s2_cfg;
- break;
- default:
- break;
- }
- }
-
- if (val & STRTAB_STE_0_V) {
- switch (FIELD_GET(STRTAB_STE_0_CFG, val)) {
+ if (le64_to_cpu(dst[0]) & STRTAB_STE_0_V) {
+ switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(dst[0]))) {
case STRTAB_STE_0_CFG_BYPASS:
break;
case STRTAB_STE_0_CFG_S1_TRANS:
@@ -1318,78 +1383,36 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
}
}
- /* Nuke the existing STE_0 value, as we're going to rewrite it */
- val = STRTAB_STE_0_V;
-
- /* Bypass/fault */
- if (!smmu_domain || !(cd_table || s2_cfg)) {
- if (!smmu_domain && disable_bypass)
- val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
- else
- val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
-
- dst[0] = cpu_to_le64(val);
- dst[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
- STRTAB_STE_1_SHCFG_INCOMING));
- dst[2] = 0; /* Nuke the VMID */
- /*
- * The SMMU can perform negative caching, so we must sync
- * the STE regardless of whether the old value was live.
- */
- if (smmu)
- arm_smmu_sync_ste_for_sid(smmu, sid);
- master->cd_table.installed = false;
- return;
- }
-
- if (cd_table) {
- u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
- STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
+ ste[0] = STRTAB_STE_0_V;
+ if (master->cd_table.cdtab && master->domain) {
+ BUG_ON(ste_live);
+ arm_smmu_ste_stage1_translate(master, ste);
+ master->cd_table.installed = true;
+ } else if (master->domain &&
+ master->domain->stage == ARM_SMMU_DOMAIN_S2) {
BUG_ON(ste_live);
- dst[1] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
- FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
- FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
- FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
- FIELD_PREP(STRTAB_STE_1_STRW, strw));
-
- if (smmu->features & ARM_SMMU_FEAT_STALLS &&
- !master->stall_enabled)
- dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
-
- val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
- FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
- FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
- FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
- cd_table->installed = true;
+ arm_smmu_ste_stage2_translate(master, ste);
+ master->cd_table.installed = false;
+ } else if (!master->domain && disable_bypass) { /* Master is detached */
+ arm_smmu_ste_abort(ste);
+ master->cd_table.installed = false;
} else {
+ arm_smmu_ste_bypass(ste);
master->cd_table.installed = false;
}
- if (s2_cfg) {
- BUG_ON(ste_live);
- dst[2] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
- FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
-#ifdef __BIG_ENDIAN
- STRTAB_STE_2_S2ENDI |
-#endif
- STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
- STRTAB_STE_2_S2R);
-
- dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
-
- val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
+ for (i = 1; i < 4; i++) {
+ if (dst[i] == cpu_to_le64(ste[i]))
+ continue;
+ dst[i] = cpu_to_le64(ste[i]);
+ ste_sync_all = true;
}
- if (master->ats_enabled)
- dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
- STRTAB_STE_1_EATS_TRANS));
-
- arm_smmu_sync_ste_for_sid(smmu, sid);
+ if (ste_sync_all)
+ arm_smmu_sync_ste_for_sid(smmu, sid);
/* See comment in arm_smmu_write_ctx_desc() */
- WRITE_ONCE(dst[0], cpu_to_le64(val));
+ WRITE_ONCE(dst[0], cpu_to_le64(ste[0]));
arm_smmu_sync_ste_for_sid(smmu, sid);
/* It's likely that we'll want to use the new STE soon */
--
2.42.0