* [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
@ 2023-12-05 19:14 Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 01/19] iommu/arm-smmu-v3: Add a type for the STE Jason Gunthorpe
` (21 more replies)
0 siblings, 22 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
The SMMUv3 driver was originally written in 2015 when the iommu driver-facing
API looked quite different. The API has evolved, especially lately,
and the driver has fallen behind.
This work aims to make the SMMUv3 driver the best IOMMU driver with
the most comprehensive implementation of the API. Once all parts are
applied it addresses:
- Global static BLOCKED and IDENTITY domains with 'never fail' attach
semantics. BLOCKED is desired for efficient VFIO.
- Support map before attach for PAGING iommu_domains.
- attach_dev failure does not change the HW configuration.
- Fully hitless transitions between IDENTITY -> DMA -> IDENTITY.
The API has IOMMU_RESV_DIRECT which is expected to be
continuously translating.
- Safe transitions between PAGING -> BLOCKED, do not ever temporarily
do IDENTITY. This is required for iommufd security.
- Full PASID API support including:
- S1/SVA domains attached to PASIDs
- IDENTITY/BLOCKED/S1 attached to RID
- Change of the RID domain while PASIDs are attached
- Streamlined SVA support using the core infrastructure
- Hitless, whenever possible, change between two domains
- iommufd IOMMU_GET_HW_INFO, IOMMU_HWPT_ALLOC_NEST_PARENT, and
IOMMU_DOMAIN_NESTED support
Overall, these things are going to become more accessible to iommufd, and
exposed to VMs, so it is important for the driver to have a robust
implementation of the API.
The work is split into three parts, with this part largely focusing on the
STE and building up to the BLOCKED & IDENTITY global static domains.
The second part largely focuses on the CD and builds up to having a common
PASID infrastructure that SVA and S1 domains equally use.
The third part has some random cleanups and the iommufd related parts.
Overall this takes the approach of turning the STE/CD programming upside
down where the CD/STE value is computed right at a driver callback
function and then pushed down into programming logic. The programming
logic hides the details of the required CD/STE tear-less update. This
makes the CD/STE functions independent of the arm_smmu_domain, which makes
it fairly straightforward to untangle all the different call chains, and
add new ones.
Further, this frees the arm_smmu_domain related logic from keeping track
of what state the STE/CD is currently in so it can carefully sequence the
correct update. There are many new update pairs that are subtly introduced
as the work progresses.
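To illustrate the shape of this, here is a rough sketch of what the STE
installation path condenses to by the end of this part. The wrapper name
example_install_ste is made up for illustration; the helpers it calls
(arm_smmu_make_bypass_ste(), arm_smmu_make_cdtable_ste(),
arm_smmu_make_s2_domain_ste() and arm_smmu_write_ste()) are the ones
introduced by the patches below, this is not a function in the series:

static void example_install_ste(struct arm_smmu_master *master, u32 sid,
                                struct arm_smmu_ste *ste,
                                struct arm_smmu_domain *smmu_domain)
{
        struct arm_smmu_ste target = {};

        /* 1. Compute the complete target STE value at the callback level */
        if (!smmu_domain)
                arm_smmu_make_bypass_ste(&target); /* or abort, per disable_bypass */
        else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
                arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
        else
                arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);

        /* 2. Push it down; the writer hides the tear-less update sequencing */
        arm_smmu_write_ste(master->smmu, sid, ste, &target);
}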
The locking to support BTM via arm_smmu_asid_lock is a bit subtle right
now and patches throughout this work adjust and tighten this so that it is
clearer and doesn't get broken.
Once the lower STE layers no longer need to touch arm_smmu_domain we can
isolate struct arm_smmu_domain to be only used for PAGING domains, audit
all the to_smmu_domain() calls to be only in PAGING domain ops, and
introduce the normal global static BLOCKED/IDENTITY domains using the new
STE infrastructure. Part 2 will ultimately migrate SVA over to use
arm_smmu_domain as well.
All parts are on github:
https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
v3:
- Use some local variables in arm_smmu_get_step_for_sid() for clarity
- White space and spelling changes
- Commit message updates
- Keep master->domain_head initialized to avoid a list_del corruption
v2: https://lore.kernel.org/r/0-v2-de8b10590bf5+400-smmuv3_newapi_p1_jgg@nvidia.com
- Rebased on v6.7-rc1
- Improve the comment for arm_smmu_write_entry_step()
- Fix the botched memcmp
- Document the spec justification for the SHCFG exclusion in used
- Include STRTAB_STE_1_SHCFG for STRTAB_STE_0_CFG_S2_TRANS in used
- WARN_ON for unknown STEs in used
- Fix error unwind in arm_smmu_attach_dev()
- Whitespace, spelling, and checkpatch related items
v1: https://lore.kernel.org/r/0-v1-e289ca9121be+2be-smmuv3_newapi_p1_jgg@nvidia.com
Jason Gunthorpe (19):
iommu/arm-smmu-v3: Add a type for the STE
iommu/arm-smmu-v3: Master cannot be NULL in
arm_smmu_write_strtab_ent()
iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
iommu/arm-smmu-v3: Make STE programming independent of the callers
iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste()
iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into
functions
iommu/arm-smmu-v3: Build the whole STE in
arm_smmu_make_s2_domain_ste()
iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
iommu/arm-smmu-v3: Compute the STE only once for each master
iommu/arm-smmu-v3: Do not change the STE twice during
arm_smmu_attach_dev()
iommu/arm-smmu-v3: Put writing the context descriptor in the right
order
iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
iommu/arm-smmu-v3: Remove arm_smmu_master->domain
iommu/arm-smmu-v3: Add a global static IDENTITY domain
iommu/arm-smmu-v3: Add a global static BLOCKED domain
iommu/arm-smmu-v3: Use the identity/blocked domain during release
iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to
finalize
iommu/arm-smmu-v3: Convert to domain_alloc_paging()
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 729 +++++++++++++-------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 12 +-
2 files changed, 477 insertions(+), 264 deletions(-)
base-commit: ca7fcaff577c92d85f0e05cc7be79759155fe328
--
2.43.0
* [PATCH v3 01/19] iommu/arm-smmu-v3: Add a type for the STE
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent() Jason Gunthorpe
` (20 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Instead of passing a naked __le64 * around to represent an STE, wrap it in a
"struct arm_smmu_ste" with an array of the correct size. This makes it
much clearer which functions will comprise the "STE API".
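For illustration, this is the kind of signature change the wrapper enables
(both prototypes are taken from the diff below):

/* Before: any __le64 * can be passed by accident */
static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
                                      __le64 *dst);

/* After: only a pointer to a real STE type-checks */
static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
                                      struct arm_smmu_ste *dst);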
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 59 ++++++++++-----------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++-
2 files changed, 35 insertions(+), 31 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7445454c2af244..c5895f4d7d6c9d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1249,7 +1249,7 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
}
static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
- __le64 *dst)
+ struct arm_smmu_ste *dst)
{
/*
* This is hideously complicated, but we only really care about
@@ -1267,7 +1267,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
* 2. Write everything apart from dword 0, sync, write dword 0, sync
* 3. Update Config, sync
*/
- u64 val = le64_to_cpu(dst[0]);
+ u64 val = le64_to_cpu(dst->data[0]);
bool ste_live = false;
struct arm_smmu_device *smmu = NULL;
struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
@@ -1325,10 +1325,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
else
val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
- dst[0] = cpu_to_le64(val);
- dst[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
+ dst->data[0] = cpu_to_le64(val);
+ dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
STRTAB_STE_1_SHCFG_INCOMING));
- dst[2] = 0; /* Nuke the VMID */
+ dst->data[2] = 0; /* Nuke the VMID */
/*
* The SMMU can perform negative caching, so we must sync
* the STE regardless of whether the old value was live.
@@ -1343,7 +1343,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
BUG_ON(ste_live);
- dst[1] = cpu_to_le64(
+ dst->data[1] = cpu_to_le64(
FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
@@ -1352,7 +1352,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
if (smmu->features & ARM_SMMU_FEAT_STALLS &&
!master->stall_enabled)
- dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
+ dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
@@ -1362,7 +1362,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
if (s2_cfg) {
BUG_ON(ste_live);
- dst[2] = cpu_to_le64(
+ dst->data[2] = cpu_to_le64(
FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
#ifdef __BIG_ENDIAN
@@ -1371,18 +1371,18 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
STRTAB_STE_2_S2R);
- dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+ dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
}
if (master->ats_enabled)
- dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
+ dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
STRTAB_STE_1_EATS_TRANS));
arm_smmu_sync_ste_for_sid(smmu, sid);
/* See comment in arm_smmu_write_ctx_desc() */
- WRITE_ONCE(dst[0], cpu_to_le64(val));
+ WRITE_ONCE(dst->data[0], cpu_to_le64(val));
arm_smmu_sync_ste_for_sid(smmu, sid);
/* It's likely that we'll want to use the new STE soon */
@@ -1390,7 +1390,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
}
-static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force)
+static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
+ unsigned int nent, bool force)
{
unsigned int i;
u64 val = STRTAB_STE_0_V;
@@ -1401,11 +1402,11 @@ static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool fo
val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
for (i = 0; i < nent; ++i) {
- strtab[0] = cpu_to_le64(val);
- strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
- STRTAB_STE_1_SHCFG_INCOMING));
- strtab[2] = 0;
- strtab += STRTAB_STE_DWORDS;
+ strtab->data[0] = cpu_to_le64(val);
+ strtab->data[1] = cpu_to_le64(FIELD_PREP(
+ STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
+ strtab->data[2] = 0;
+ strtab++;
}
}
@@ -2209,26 +2210,23 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
return 0;
}
-static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
+static struct arm_smmu_ste *
+arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
{
- __le64 *step;
struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
- struct arm_smmu_strtab_l1_desc *l1_desc;
- int idx;
+ unsigned int idx1, idx2;
/* Two-level walk */
- idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
- l1_desc = &cfg->l1_desc[idx];
- idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
- step = &l1_desc->l2ptr[idx];
+ idx1 = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
+ idx2 = sid & ((1 << STRTAB_SPLIT) - 1);
+ return &cfg->l1_desc[idx1].l2ptr[idx2];
} else {
/* Simple linear lookup */
- step = &cfg->strtab[sid * STRTAB_STE_DWORDS];
+ return (struct arm_smmu_ste *)&cfg
+ ->strtab[sid * STRTAB_STE_DWORDS];
}
-
- return step;
}
static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
@@ -2238,7 +2236,8 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
for (i = 0; i < master->num_streams; ++i) {
u32 sid = master->streams[i].id;
- __le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
+ struct arm_smmu_ste *step =
+ arm_smmu_get_step_for_sid(smmu, sid);
/* Bridged PCI devices may end up with duplicated IDs */
for (j = 0; j < i; j++)
@@ -3769,7 +3768,7 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
list_for_each_entry(e, &rmr_list, list) {
- __le64 *step;
+ struct arm_smmu_ste *step;
struct iommu_iort_rmr_data *rmr;
int ret, i;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 961205ba86d25d..03f9e526cbd92f 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -206,6 +206,11 @@
#define STRTAB_L1_DESC_L2PTR_MASK GENMASK_ULL(51, 6)
#define STRTAB_STE_DWORDS 8
+
+struct arm_smmu_ste {
+ __le64 data[STRTAB_STE_DWORDS];
+};
+
#define STRTAB_STE_0_V (1UL << 0)
#define STRTAB_STE_0_CFG GENMASK_ULL(3, 1)
#define STRTAB_STE_0_CFG_ABORT 0
@@ -571,7 +576,7 @@ struct arm_smmu_priq {
struct arm_smmu_strtab_l1_desc {
u8 span;
- __le64 *l2ptr;
+ struct arm_smmu_ste *l2ptr;
dma_addr_t l2ptr_dma;
};
--
2.43.0
* [PATCH v3 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 01/19] iommu/arm-smmu-v3: Add a type for the STE Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED Jason Gunthorpe
` (19 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
The only caller is arm_smmu_install_ste_for_dev(), which never has a NULL
master. Remove the confusing if.
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c5895f4d7d6c9d..89e9c001faad71 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1269,10 +1269,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
*/
u64 val = le64_to_cpu(dst->data[0]);
bool ste_live = false;
- struct arm_smmu_device *smmu = NULL;
+ struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
struct arm_smmu_s2_cfg *s2_cfg = NULL;
- struct arm_smmu_domain *smmu_domain = NULL;
+ struct arm_smmu_domain *smmu_domain = master->domain;
struct arm_smmu_cmdq_ent prefetch_cmd = {
.opcode = CMDQ_OP_PREFETCH_CFG,
.prefetch = {
@@ -1280,11 +1280,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
},
};
- if (master) {
- smmu_domain = master->domain;
- smmu = master->smmu;
- }
-
if (smmu_domain) {
switch (smmu_domain->stage) {
case ARM_SMMU_DOMAIN_S1:
--
2.43.0
* [PATCH v3 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 01/19] iommu/arm-smmu-v3: Add a type for the STE Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent() Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers Jason Gunthorpe
` (18 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Currently this is exactly the same as ARM_SMMU_DOMAIN_S2, so just remove
it. The ongoing work to add nesting support through iommufd will do
something a little different.
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 89e9c001faad71..b120d836681c1c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1286,7 +1286,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
cd_table = &master->cd_table;
break;
case ARM_SMMU_DOMAIN_S2:
- case ARM_SMMU_DOMAIN_NESTED:
s2_cfg = &smmu_domain->s2_cfg;
break;
default:
@@ -2167,7 +2166,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
fmt = ARM_64_LPAE_S1;
finalise_stage_fn = arm_smmu_domain_finalise_s1;
break;
- case ARM_SMMU_DOMAIN_NESTED:
case ARM_SMMU_DOMAIN_S2:
ias = smmu->ias;
oas = smmu->oas;
@@ -2736,7 +2734,7 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
if (smmu_domain->smmu)
ret = -EPERM;
else
- smmu_domain->stage = ARM_SMMU_DOMAIN_NESTED;
+ smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
mutex_unlock(&smmu_domain->init_mutex);
return ret;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 03f9e526cbd92f..27ddf1acd12cea 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -715,7 +715,6 @@ struct arm_smmu_master {
enum arm_smmu_domain_stage {
ARM_SMMU_DOMAIN_S1 = 0,
ARM_SMMU_DOMAIN_S2,
- ARM_SMMU_DOMAIN_NESTED,
ARM_SMMU_DOMAIN_BYPASS,
};
--
2.43.0
* [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (2 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-12 16:23 ` Will Deacon
2024-01-29 19:10 ` Mostafa Saleh
2023-12-05 19:14 ` [PATCH v3 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass Jason Gunthorpe
` (17 subsequent siblings)
21 siblings, 2 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
As the comment in arm_smmu_write_strtab_ent() explains, this routine has
been limited to only work correctly in certain scenarios that the caller
must ensure. Generally the caller must put the STE into ABORT or BYPASS
before attempting to program it to something else.
The next patches/series are going to start removing some of this logic
from the callers, and add more complex state combinations than currently
exist. Thus, consolidate all the complexity here. Callers do not have to
care about what STE transition they are doing; this function will handle
everything optimally.
Revise arm_smmu_write_strtab_ent() so it algorithmically computes the
required programming sequence to avoid creating an incoherent 'torn' STE
in the HW caches. The update algorithm follows the same design that the
driver already uses: it is safe to change bits that HW doesn't currently
use and then do a single 64 bit update, with syncs in between.
The basic idea is to express in a bitmask what bits the HW is actually
using based on the V and CFG bits. Based on that mask we know what STE
changes are safe and which are disruptive. We can count how many 64 bit
QWORDS need a disruptive update and know if a step with V=0 is required.
This gives two basic flows through the algorithm.
If only a single 64 bit quantity needs disruptive replacement:
- Write the target value into all currently unused bits
- Write the single 64 bit quantity
- Zero the remaining different bits
If multiple 64 bit quantities need disruptive replacement then do:
- Write V=0 to QWORD 0
- Write the entire STE except QWORD 0
- Write QWORD 0
Each step is followed by a HW sync, which can be skipped if the STE didn't
change in that step.
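As a standalone illustration of the per-step decision logic, here is a
simplified userspace model using plain uint64_t. It is not the driver code;
the real implementation is arm_smmu_write_entry_step() in the diff below,
which additionally deals with __le64 values and WRITE_ONCE():

#include <stdint.h>
#include <string.h>
#include <stdbool.h>

#define QWORDS 8
#define V_BIT  (1ULL << 0)

/* One update step: advance 'cur' toward 'target'; the caller syncs the HW
 * and calls again until this returns true. 'cur_used'/'target_used' are
 * the bits the HW uses for the current and target configurations. */
static bool entry_step(uint64_t *cur, const uint64_t *cur_used,
                       const uint64_t *target, const uint64_t *target_used)
{
        uint64_t step[QWORDS];
        unsigned int changed = 0, critical = 0, i;

        for (i = 0; i != QWORDS; i++) {
                /* Give every bit the HW is not using its target value */
                step[i] = (cur[i] & cur_used[i]) | (target[i] & ~cur_used[i]);
                if (step[i] != cur[i])
                        changed |= 1u << i;
                /* Qwords still wrong in bits the target configuration uses */
                if ((step[i] & target_used[i]) != (target[i] & target_used[i]))
                        critical |= 1u << i;
        }

        if (__builtin_popcount(critical) > 1) {
                /* More than one qword needs a disruptive update: break with V=0 */
                step[0] &= ~V_BIT;
        } else if (!changed && critical) {
                /* Exactly one critical qword left: one 64 bit store finishes it */
                i = __builtin_ctz(critical);
                step[i] = target[i];
        } else if (!changed) {
                if (!memcmp(cur, target, sizeof(step)))
                        return true;    /* converged */
                memcpy(step, target, sizeof(step)); /* only unused bits differ */
        }

        memcpy(cur, step, sizeof(step)); /* stands in for per-qword WRITE_ONCE() */
        return false;
}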
At this point it generates the same sequence of updates as the current
code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
extra sync (this seems to be an existing bug).
Going forward this will use a V=0 transition instead of cycling through
ABORT if a non-hitless change is required. This seems more appropriate as
ABORT will fail DMAs without any logging, whereas a DMA dropped due to a
transient V=0 probably signals a bug, so the C_BAD_STE event it generates is
valuable.
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 272 +++++++++++++++-----
1 file changed, 208 insertions(+), 64 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b120d836681c1c..0934f882b94e94 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -971,6 +971,101 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
+/*
+ * This algorithm updates any STE/CD to any value without creating a situation
+ * where the HW can perceive a corrupted entry. HW is only required to have a 64
+ * bit atomicity with stores from the CPU, while entries are many 64 bit values
+ * big.
+ *
+ * The algorithm works by evolving the entry toward the target in a series of
+ * steps. Each step synchronizes with the HW so that the HW can not see an entry
+ * torn across two steps. Upon each call cur/cur_used reflect the current
+ * synchronized value seen by the HW.
+ *
+ * During each step the HW can observe a torn entry that has any combination of
+ * the step's old/new 64 bit words. The algorithm objective is for the HW
+ * behavior to always be one of current behavior, V=0, or new behavior, during
+ * each step, and across all steps.
+ *
+ * At each step one of three actions is chosen to evolve cur to target:
+ * - Update all unused bits with their target values.
+ * This relies on the IGNORED behavior described in the specification
+ * - Update a single 64-bit value
+ * - Update all unused bits and set V=0
+ *
+ * The last two actions will cause cur_used to change, which will then allow the
+ * first action on the next step.
+ *
+ * In the most general case we can make any update in three steps:
+ * - Disrupting the entry (V=0)
+ * - Fill now unused bits, all bits except V
+ * - Make valid (V=1), single 64 bit store
+ *
+ * However this disrupts the HW while it is happening. There are several
+ * interesting cases where a STE/CD can be updated without disturbing the HW
+ * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
+ * because the used bits don't intersect. We can detect this by calculating how
+ * many 64 bit values need update after adjusting the unused bits and skip the
+ * V=0 process.
+ */
+static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
+ const __le64 *target,
+ const __le64 *target_used, __le64 *step,
+ __le64 v_bit,
+ unsigned int len)
+{
+ u8 step_used_diff = 0;
+ u8 step_change = 0;
+ unsigned int i;
+
+ /*
+ * Compute a step that has all the bits currently unused by HW set to
+ * their target values.
+ */
+ for (i = 0; i != len; i++) {
+ step[i] = (cur[i] & cur_used[i]) | (target[i] & ~cur_used[i]);
+ if (cur[i] != step[i])
+ step_change |= 1 << i;
+ /*
+ * Each bit indicates if the step is incorrect compared to the
+ * target, considering only the used bits in the target
+ */
+ if ((step[i] & target_used[i]) != (target[i] & target_used[i]))
+ step_used_diff |= 1 << i;
+ }
+
+ if (hweight8(step_used_diff) > 1) {
+ /*
+ * More than 1 qword is mismatched, this cannot be done without
+ * a break. Clear the V bit and go again.
+ */
+ step[0] &= ~v_bit;
+ } else if (!step_change && step_used_diff) {
+ /*
+ * Have exactly one critical qword, all the other qwords are set
+ * correctly, so we can set this qword now.
+ */
+ i = ffs(step_used_diff) - 1;
+ step[i] = target[i];
+ } else if (!step_change) {
+ /* cur == target, so all done */
+ if (memcmp(cur, target, len * sizeof(*cur)) == 0)
+ return true;
+
+ /*
+ * All the used HW bits match, but unused bits are different.
+ * Set them as well. Technically this isn't necessary but it
+ * brings the entry to the full target state, so if there are
+ * bugs in the mask calculation this will obscure them.
+ */
+ memcpy(step, target, len * sizeof(*step));
+ }
+
+ for (i = 0; i != len; i++)
+ WRITE_ONCE(cur[i], step[i]);
+ return false;
+}
+
static void arm_smmu_sync_cd(struct arm_smmu_master *master,
int ssid, bool leaf)
{
@@ -1248,37 +1343,115 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
+/*
+ * Based on the value of ent report which bits of the STE the HW will access. It
+ * would be nice if this was complete according to the spec, but minimally it
+ * has to capture the bits this driver uses.
+ */
+static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent,
+ struct arm_smmu_ste *used_bits)
+{
+ memset(used_bits, 0, sizeof(*used_bits));
+
+ used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V);
+ if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V)))
+ return;
+
+ /*
+ * If S1 is enabled S1DSS is valid, see 13.5 Summary of
+ * attribute/permission configuration fields for the SHCFG behavior.
+ */
+ if (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0])) & 1 &&
+ FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent->data[1])) ==
+ STRTAB_STE_1_S1DSS_BYPASS)
+ used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
+
+ used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
+ switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]))) {
+ case STRTAB_STE_0_CFG_ABORT:
+ break;
+ case STRTAB_STE_0_CFG_BYPASS:
+ used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
+ break;
+ case STRTAB_STE_0_CFG_S1_TRANS:
+ used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
+ STRTAB_STE_0_S1CTXPTR_MASK |
+ STRTAB_STE_0_S1CDMAX);
+ used_bits->data[1] |=
+ cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
+ STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
+ STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
+ used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
+ break;
+ case STRTAB_STE_0_CFG_S2_TRANS:
+ used_bits->data[1] |=
+ cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG);
+ used_bits->data[2] |=
+ cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
+ STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
+ STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R);
+ used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
+ break;
+
+ default:
+ memset(used_bits, 0xFF, sizeof(*used_bits));
+ WARN_ON(true);
+ }
+}
+
+static bool arm_smmu_write_ste_step(struct arm_smmu_ste *cur,
+ const struct arm_smmu_ste *target,
+ const struct arm_smmu_ste *target_used)
+{
+ struct arm_smmu_ste cur_used;
+ struct arm_smmu_ste step;
+
+ arm_smmu_get_ste_used(cur, &cur_used);
+ return arm_smmu_write_entry_step(cur->data, cur_used.data, target->data,
+ target_used->data, step.data,
+ cpu_to_le64(STRTAB_STE_0_V),
+ ARRAY_SIZE(cur->data));
+}
+
+static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
+ struct arm_smmu_ste *ste,
+ const struct arm_smmu_ste *target)
+{
+ struct arm_smmu_ste target_used;
+ int i;
+
+ arm_smmu_get_ste_used(target, &target_used);
+ /* Masks in arm_smmu_get_ste_used() are up to date */
+ for (i = 0; i != ARRAY_SIZE(target->data); i++)
+ WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
+
+ while (true) {
+ if (arm_smmu_write_ste_step(ste, target, &target_used))
+ break;
+ arm_smmu_sync_ste_for_sid(smmu, sid);
+ }
+
+ /* It's likely that we'll want to use the new STE soon */
+ if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) {
+ struct arm_smmu_cmdq_ent
+ prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG,
+ .prefetch = {
+ .sid = sid,
+ } };
+
+ arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+ }
+}
+
static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
struct arm_smmu_ste *dst)
{
- /*
- * This is hideously complicated, but we only really care about
- * three cases at the moment:
- *
- * 1. Invalid (all zero) -> bypass/fault (init)
- * 2. Bypass/fault -> translation/bypass (attach)
- * 3. Translation/bypass -> bypass/fault (detach)
- *
- * Given that we can't update the STE atomically and the SMMU
- * doesn't read the thing in a defined order, that leaves us
- * with the following maintenance requirements:
- *
- * 1. Update Config, return (init time STEs aren't live)
- * 2. Write everything apart from dword 0, sync, write dword 0, sync
- * 3. Update Config, sync
- */
- u64 val = le64_to_cpu(dst->data[0]);
- bool ste_live = false;
+ u64 val;
struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
struct arm_smmu_s2_cfg *s2_cfg = NULL;
struct arm_smmu_domain *smmu_domain = master->domain;
- struct arm_smmu_cmdq_ent prefetch_cmd = {
- .opcode = CMDQ_OP_PREFETCH_CFG,
- .prefetch = {
- .sid = sid,
- },
- };
+ struct arm_smmu_ste target = {};
if (smmu_domain) {
switch (smmu_domain->stage) {
@@ -1293,22 +1466,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
}
}
- if (val & STRTAB_STE_0_V) {
- switch (FIELD_GET(STRTAB_STE_0_CFG, val)) {
- case STRTAB_STE_0_CFG_BYPASS:
- break;
- case STRTAB_STE_0_CFG_S1_TRANS:
- case STRTAB_STE_0_CFG_S2_TRANS:
- ste_live = true;
- break;
- case STRTAB_STE_0_CFG_ABORT:
- BUG_ON(!disable_bypass);
- break;
- default:
- BUG(); /* STE corruption */
- }
- }
-
/* Nuke the existing STE_0 value, as we're going to rewrite it */
val = STRTAB_STE_0_V;
@@ -1319,16 +1476,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
else
val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
- dst->data[0] = cpu_to_le64(val);
- dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
+ target.data[0] = cpu_to_le64(val);
+ target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
STRTAB_STE_1_SHCFG_INCOMING));
- dst->data[2] = 0; /* Nuke the VMID */
- /*
- * The SMMU can perform negative caching, so we must sync
- * the STE regardless of whether the old value was live.
- */
- if (smmu)
- arm_smmu_sync_ste_for_sid(smmu, sid);
+ target.data[2] = 0; /* Nuke the VMID */
+ arm_smmu_write_ste(smmu, sid, dst, &target);
return;
}
@@ -1336,8 +1488,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
- BUG_ON(ste_live);
- dst->data[1] = cpu_to_le64(
+ target.data[1] = cpu_to_le64(
FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
@@ -1346,7 +1497,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
if (smmu->features & ARM_SMMU_FEAT_STALLS &&
!master->stall_enabled)
- dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
+ target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
@@ -1355,8 +1506,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
}
if (s2_cfg) {
- BUG_ON(ste_live);
- dst->data[2] = cpu_to_le64(
+ target.data[2] = cpu_to_le64(
FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
#ifdef __BIG_ENDIAN
@@ -1365,23 +1515,17 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
STRTAB_STE_2_S2R);
- dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+ target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
}
if (master->ats_enabled)
- dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
+ target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
STRTAB_STE_1_EATS_TRANS));
- arm_smmu_sync_ste_for_sid(smmu, sid);
- /* See comment in arm_smmu_write_ctx_desc() */
- WRITE_ONCE(dst->data[0], cpu_to_le64(val));
- arm_smmu_sync_ste_for_sid(smmu, sid);
-
- /* It's likely that we'll want to use the new STE soon */
- if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH))
- arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+ target.data[0] = cpu_to_le64(val);
+ arm_smmu_write_ste(smmu, sid, dst, &target);
}
static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
--
2.43.0
* [PATCH v3 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (3 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste() Jason Gunthorpe
` (16 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
This allows writing the flow of arm_smmu_write_strtab_ent() around abort
and bypass domains more naturally.
Note that the core code no longer supplies NULL domains, though there is
still a flow in the driver that ends up in arm_smmu_write_strtab_ent() with
NULL. A later patch will remove it.
Remove the duplicate calculation of the STE in arm_smmu_init_bypass_stes()
and remove the force parameter. arm_smmu_rmr_install_bypass_ste() can now
simply invoke arm_smmu_make_bypass_ste() directly.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 89 +++++++++++----------
1 file changed, 47 insertions(+), 42 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0934f882b94e94..0a4bf1cbe42293 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1443,6 +1443,24 @@ static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
}
}
+static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target)
+{
+ memset(target, 0, sizeof(*target));
+ target->data[0] = cpu_to_le64(
+ STRTAB_STE_0_V |
+ FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT));
+}
+
+static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
+{
+ memset(target, 0, sizeof(*target));
+ target->data[0] = cpu_to_le64(
+ STRTAB_STE_0_V |
+ FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS));
+ target->data[1] = cpu_to_le64(
+ FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
+}
+
static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
struct arm_smmu_ste *dst)
{
@@ -1453,37 +1471,31 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
struct arm_smmu_domain *smmu_domain = master->domain;
struct arm_smmu_ste target = {};
- if (smmu_domain) {
- switch (smmu_domain->stage) {
- case ARM_SMMU_DOMAIN_S1:
- cd_table = &master->cd_table;
- break;
- case ARM_SMMU_DOMAIN_S2:
- s2_cfg = &smmu_domain->s2_cfg;
- break;
- default:
- break;
- }
+ if (!smmu_domain) {
+ if (disable_bypass)
+ arm_smmu_make_abort_ste(&target);
+ else
+ arm_smmu_make_bypass_ste(&target);
+ arm_smmu_write_ste(smmu, sid, dst, &target);
+ return;
+ }
+
+ switch (smmu_domain->stage) {
+ case ARM_SMMU_DOMAIN_S1:
+ cd_table = &master->cd_table;
+ break;
+ case ARM_SMMU_DOMAIN_S2:
+ s2_cfg = &smmu_domain->s2_cfg;
+ break;
+ case ARM_SMMU_DOMAIN_BYPASS:
+ arm_smmu_make_bypass_ste(&target);
+ arm_smmu_write_ste(smmu, sid, dst, &target);
+ return;
}
/* Nuke the existing STE_0 value, as we're going to rewrite it */
val = STRTAB_STE_0_V;
- /* Bypass/fault */
- if (!smmu_domain || !(cd_table || s2_cfg)) {
- if (!smmu_domain && disable_bypass)
- val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
- else
- val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
-
- target.data[0] = cpu_to_le64(val);
- target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
- STRTAB_STE_1_SHCFG_INCOMING));
- target.data[2] = 0; /* Nuke the VMID */
- arm_smmu_write_ste(smmu, sid, dst, &target);
- return;
- }
-
if (cd_table) {
u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
@@ -1529,21 +1541,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
}
static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
- unsigned int nent, bool force)
+ unsigned int nent)
{
unsigned int i;
- u64 val = STRTAB_STE_0_V;
-
- if (disable_bypass && !force)
- val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
- else
- val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
for (i = 0; i < nent; ++i) {
- strtab->data[0] = cpu_to_le64(val);
- strtab->data[1] = cpu_to_le64(FIELD_PREP(
- STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
- strtab->data[2] = 0;
+ if (disable_bypass)
+ arm_smmu_make_abort_ste(strtab);
+ else
+ arm_smmu_make_bypass_ste(strtab);
strtab++;
}
}
@@ -1571,7 +1577,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
return -ENOMEM;
}
- arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT, false);
+ arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
arm_smmu_write_strtab_l1_desc(strtab, desc);
return 0;
}
@@ -3194,7 +3200,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
cfg->strtab_base_cfg = reg;
- arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents, false);
+ arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
return 0;
}
@@ -3905,7 +3911,6 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
list_for_each_entry(e, &rmr_list, list) {
- struct arm_smmu_ste *step;
struct iommu_iort_rmr_data *rmr;
int ret, i;
@@ -3918,8 +3923,8 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
continue;
}
- step = arm_smmu_get_step_for_sid(smmu, rmr->sids[i]);
- arm_smmu_init_bypass_stes(step, 1, true);
+ arm_smmu_make_bypass_ste(
+ arm_smmu_get_step_for_sid(smmu, rmr->sids[i]));
}
}
--
2.43.0
* [PATCH v3 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste()
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (4 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions Jason Gunthorpe
` (15 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Logically arm_smmu_init_strtab_linear() is the function that allocates and
populates the stream table with the initial value of the STEs. After this
function returns the stream table should be fully ready.
arm_smmu_rmr_install_bypass_ste() adjusts the initial stream table to force
any SIDs that the FW says have IOMMU_RESV_DIRECT to use bypass. This
ensures there is no disruption to the identity mapping during boot.
Put arm_smmu_rmr_install_bypass_ste() into arm_smmu_init_strtab_linear(),
since it already executes immediately after arm_smmu_init_strtab_linear().
No functional change intended.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0a4bf1cbe42293..95c78ccaebd439 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -86,6 +86,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
{ 0, NULL},
};
+static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
+
static void parse_driver_options(struct arm_smmu_device *smmu)
{
int i = 0;
@@ -3201,6 +3203,9 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
cfg->strtab_base_cfg = reg;
arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
+
+ /* Check for RMRs and install bypass STEs if any */
+ arm_smmu_rmr_install_bypass_ste(smmu);
return 0;
}
@@ -4014,9 +4019,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
/* Record our private device structure */
platform_set_drvdata(pdev, smmu);
- /* Check for RMRs and install bypass STEs if any */
- arm_smmu_rmr_install_bypass_ste(smmu);
-
/* Reset the device */
ret = arm_smmu_device_reset(smmu, bypass);
if (ret)
--
2.43.0
* [PATCH v3 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (5 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste() Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste() Jason Gunthorpe
` (14 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
This is preparation to move the STE calculation higher up into the call
chain and remove arm_smmu_write_strtab_ent(). These new functions will be
called directly from attach_dev.
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 114 +++++++++++---------
1 file changed, 62 insertions(+), 52 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 95c78ccaebd439..b3b28c10bd042e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1463,13 +1463,69 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
}
+static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
+ struct arm_smmu_master *master,
+ struct arm_smmu_ctx_desc_cfg *cd_table)
+{
+ struct arm_smmu_device *smmu = master->smmu;
+
+ memset(target, 0, sizeof(*target));
+ target->data[0] = cpu_to_le64(
+ STRTAB_STE_0_V |
+ FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
+ FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt) |
+ (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
+ FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax));
+
+ target->data[1] = cpu_to_le64(
+ FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
+ FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
+ FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
+ FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
+ ((smmu->features & ARM_SMMU_FEAT_STALLS &&
+ !master->stall_enabled) ?
+ STRTAB_STE_1_S1STALLD :
+ 0) |
+ FIELD_PREP(STRTAB_STE_1_EATS,
+ master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) |
+ FIELD_PREP(STRTAB_STE_1_STRW,
+ (smmu->features & ARM_SMMU_FEAT_E2H) ?
+ STRTAB_STE_1_STRW_EL2 :
+ STRTAB_STE_1_STRW_NSEL1));
+}
+
+static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
+ struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain)
+{
+ struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
+
+ memset(target, 0, sizeof(*target));
+ target->data[0] = cpu_to_le64(
+ STRTAB_STE_0_V |
+ FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS));
+
+ target->data[1] |= cpu_to_le64(
+ FIELD_PREP(STRTAB_STE_1_EATS,
+ master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+
+ target->data[2] = cpu_to_le64(
+ FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
+ FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
+ STRTAB_STE_2_S2AA64 |
+#ifdef __BIG_ENDIAN
+ STRTAB_STE_2_S2ENDI |
+#endif
+ STRTAB_STE_2_S2PTW |
+ STRTAB_STE_2_S2R);
+
+ target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+}
+
static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
struct arm_smmu_ste *dst)
{
- u64 val;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
- struct arm_smmu_s2_cfg *s2_cfg = NULL;
struct arm_smmu_domain *smmu_domain = master->domain;
struct arm_smmu_ste target = {};
@@ -1484,61 +1540,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
switch (smmu_domain->stage) {
case ARM_SMMU_DOMAIN_S1:
- cd_table = &master->cd_table;
+ arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
break;
case ARM_SMMU_DOMAIN_S2:
- s2_cfg = &smmu_domain->s2_cfg;
+ arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
break;
case ARM_SMMU_DOMAIN_BYPASS:
arm_smmu_make_bypass_ste(&target);
- arm_smmu_write_ste(smmu, sid, dst, &target);
- return;
+ break;
}
-
- /* Nuke the existing STE_0 value, as we're going to rewrite it */
- val = STRTAB_STE_0_V;
-
- if (cd_table) {
- u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
- STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
-
- target.data[1] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
- FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
- FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
- FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
- FIELD_PREP(STRTAB_STE_1_STRW, strw));
-
- if (smmu->features & ARM_SMMU_FEAT_STALLS &&
- !master->stall_enabled)
- target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
-
- val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
- FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
- FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
- FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
- }
-
- if (s2_cfg) {
- target.data[2] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
- FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
-#ifdef __BIG_ENDIAN
- STRTAB_STE_2_S2ENDI |
-#endif
- STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
- STRTAB_STE_2_S2R);
-
- target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
-
- val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
- }
-
- if (master->ats_enabled)
- target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
- STRTAB_STE_1_EATS_TRANS));
-
- target.data[0] = cpu_to_le64(val);
arm_smmu_write_ste(smmu, sid, dst, &target);
}
--
2.43.0
* [PATCH v3 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste()
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (6 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev Jason Gunthorpe
` (13 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Half the code was living in arm_smmu_domain_finalise_s2(), so just move it
here and take the values directly from the pgtbl_ops instead of storing
copies.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 27 ++++++++++++---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 --
2 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b3b28c10bd042e..e2ae0081c47820 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1499,6 +1499,11 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_domain *smmu_domain)
{
struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
+ const struct io_pgtable_cfg *pgtbl_cfg =
+ &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
+ typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr =
+ &pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
+ u64 vtcr_val;
memset(target, 0, sizeof(*target));
target->data[0] = cpu_to_le64(
@@ -1509,9 +1514,16 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_1_EATS,
master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ vtcr_val = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
+ FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
+ FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
+ FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
+ FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
+ FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
+ FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
target->data[2] = cpu_to_le64(
FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
- FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
+ FIELD_PREP(STRTAB_STE_2_VTCR, vtcr_val) |
STRTAB_STE_2_S2AA64 |
#ifdef __BIG_ENDIAN
STRTAB_STE_2_S2ENDI |
@@ -1519,7 +1531,8 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
STRTAB_STE_2_S2PTW |
STRTAB_STE_2_S2R);
- target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+ target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s2_cfg.vttbr &
+ STRTAB_STE_3_S2TTB_MASK);
}
static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
@@ -2276,7 +2289,6 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
int vmid;
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
- typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr;
/* Reserve VMID 0 for stage-2 bypass STEs */
vmid = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1,
@@ -2284,16 +2296,7 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
if (vmid < 0)
return vmid;
- vtcr = &pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
cfg->vmid = (u16)vmid;
- cfg->vttbr = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
- cfg->vtcr = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
- FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
- FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
- FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
- FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
- FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
- FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
return 0;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 27ddf1acd12cea..1be0c1151c50c3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -609,8 +609,6 @@ struct arm_smmu_ctx_desc_cfg {
struct arm_smmu_s2_cfg {
u16 vmid;
- u64 vttbr;
- u64 vtcr;
};
struct arm_smmu_strtab_cfg {
--
2.43.0
* [PATCH v3 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (7 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste() Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master Jason Gunthorpe
` (12 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
The BTM support wants to be able to change the ASID of any smmu_domain.
When it goes to do this it holds the arm_smmu_asid_lock and iterates over
the target domain's devices list.
During attach of an S1 domain we must ensure that the devices list and
CD are in sync, otherwise we could miss CD updates or a parallel CD update
could push an out of date CD.
This is pretty complicated, and almost works today because
arm_smmu_detach_dev() removes the master from the linked list before
working on the CD entries, preventing parallel update of the CD.
However, it does have an issue where the CD can remain programmed while the
domain appears to be unattached. arm_smmu_share_asid() will then not clear
any CD entries and will install its own CD entry with the same ASID
concurrently. This creates a small race window where the IOMMU can see two
ASIDs pointing to different translations.
Solve this by wrapping most of the attach flow in the
arm_smmu_asid_lock. This locks more than strictly needed to prepare for
the next patch which will reorganize the order of the linked list, STE and
CD changes.
Move arm_smmu_detach_dev() to after we have initialized the domain so
the lock can be held for less time.
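As a purely illustrative aside, the toy user-space model below shows the shape of the problem the wider lock scope addresses: a concurrent ASID-change path only acts on devices it can see listed, so the list update and the CD write have to sit inside one critical section for it to observe either the fully old or fully new state. Every name and data structure here is invented for the model; it is not driver code. Build with -pthread.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* One "CD" and one membership flag stand in for the CD table and the
 * smmu_domain->devices list; asid_lock plays the arm_smmu_asid_lock role. */
static pthread_mutex_t asid_lock = PTHREAD_MUTEX_INITIALIZER;
static bool device_listed = true;
static int cd_asid = 1;

/* Models the ASID-change path: it may only touch CDs of listed devices. */
static void *change_asid(void *arg)
{
        pthread_mutex_lock(&asid_lock);
        if (device_listed)
                cd_asid = 2;            /* rewrite the CD with the new ASID */
        pthread_mutex_unlock(&asid_lock);
        return NULL;
}

/* Models attach with the widened lock: list membership and the CD change
 * together, so change_asid() never sees a half-updated mix. */
static void *attach(void *arg)
{
        pthread_mutex_lock(&asid_lock);
        device_listed = false;          /* "detach" from the old domain */
        cd_asid = 1;                    /* program the CD for the new domain */
        device_listed = true;           /* join the new domain's list */
        pthread_mutex_unlock(&asid_lock);
        return NULL;
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, change_asid, NULL);
        pthread_create(&t2, NULL, attach, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("cd_asid=%d listed=%d\n", cd_asid, device_listed);
        return 0;
}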
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 ++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e2ae0081c47820..c375da195af713 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2560,8 +2560,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return -EBUSY;
}
- arm_smmu_detach_dev(master);
-
mutex_lock(&smmu_domain->init_mutex);
if (!smmu_domain->smmu) {
@@ -2576,6 +2574,16 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
if (ret)
return ret;
+ /*
+ * Prevent arm_smmu_share_asid() from trying to change the ASID
+ * of either the old or new domain while we are working on it.
+ * This allows the STE and the smmu_domain->devices list to
+ * be inconsistent during this routine.
+ */
+ mutex_lock(&arm_smmu_asid_lock);
+
+ arm_smmu_detach_dev(master);
+
master->domain = smmu_domain;
/*
@@ -2601,13 +2609,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
}
- /*
- * Prevent SVA from concurrently modifying the CD or writing to
- * the CD entry
- */
- mutex_lock(&arm_smmu_asid_lock);
ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
- mutex_unlock(&arm_smmu_asid_lock);
if (ret) {
master->domain = NULL;
goto out_list_del;
@@ -2617,13 +2619,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_install_ste_for_dev(master);
arm_smmu_enable_ats(master);
- return 0;
+ goto out_unlock;
out_list_del:
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_del(&master->domain_head);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+out_unlock:
+ mutex_unlock(&arm_smmu_asid_lock);
return ret;
}
--
2.43.0
* [PATCH v3 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (8 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev() Jason Gunthorpe
` (11 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Currently arm_smmu_install_ste_for_dev() iterates over every SID and
computes from scratch an identical STE. Every SID should have the same STE
contents. Turn this inside out so that the STE is supplied by the caller
and arm_smmu_install_ste_for_dev() simply installs it to every SID.
This is possible now that the STE generation no longer determines what sequence
should be used to program it.
This allows splitting the STE calculation up according to the call site,
which following patches will make use of, and removes the confusing NULL
domain special case that only supported arm_smmu_detach_dev().
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 58 ++++++++-------------
1 file changed, 22 insertions(+), 36 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c375da195af713..0c1d70b8f325ed 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1535,36 +1535,6 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
STRTAB_STE_3_S2TTB_MASK);
}
-static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
- struct arm_smmu_ste *dst)
-{
- struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_domain *smmu_domain = master->domain;
- struct arm_smmu_ste target = {};
-
- if (!smmu_domain) {
- if (disable_bypass)
- arm_smmu_make_abort_ste(&target);
- else
- arm_smmu_make_bypass_ste(&target);
- arm_smmu_write_ste(smmu, sid, dst, &target);
- return;
- }
-
- switch (smmu_domain->stage) {
- case ARM_SMMU_DOMAIN_S1:
- arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
- break;
- case ARM_SMMU_DOMAIN_S2:
- arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
- break;
- case ARM_SMMU_DOMAIN_BYPASS:
- arm_smmu_make_bypass_ste(&target);
- break;
- }
- arm_smmu_write_ste(smmu, sid, dst, &target);
-}
-
static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
unsigned int nent)
{
@@ -2387,7 +2357,8 @@ arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
}
}
-static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
+static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
+ const struct arm_smmu_ste *target)
{
int i, j;
struct arm_smmu_device *smmu = master->smmu;
@@ -2404,7 +2375,7 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
if (j < i)
continue;
- arm_smmu_write_strtab_ent(master, sid, step);
+ arm_smmu_write_ste(smmu, sid, step, target);
}
}
@@ -2511,6 +2482,7 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
static void arm_smmu_detach_dev(struct arm_smmu_master *master)
{
unsigned long flags;
+ struct arm_smmu_ste target;
struct arm_smmu_domain *smmu_domain = master->domain;
if (!smmu_domain)
@@ -2524,7 +2496,11 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
master->domain = NULL;
master->ats_enabled = false;
- arm_smmu_install_ste_for_dev(master);
+ if (disable_bypass)
+ arm_smmu_make_abort_ste(&target);
+ else
+ arm_smmu_make_bypass_ste(&target);
+ arm_smmu_install_ste_for_dev(master, &target);
/*
* Clearing the CD entry isn't strictly required to detach the domain
* since the table is uninstalled anyway, but it helps avoid confusion
@@ -2539,6 +2515,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
{
int ret = 0;
unsigned long flags;
+ struct arm_smmu_ste target;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -2600,7 +2577,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
list_add(&master->domain_head, &smmu_domain->devices);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+ switch (smmu_domain->stage) {
+ case ARM_SMMU_DOMAIN_S1:
if (!master->cd_table.cdtab) {
ret = arm_smmu_alloc_cd_tables(master);
if (ret) {
@@ -2614,9 +2592,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
master->domain = NULL;
goto out_list_del;
}
- }
- arm_smmu_install_ste_for_dev(master);
+ arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
+ break;
+ case ARM_SMMU_DOMAIN_S2:
+ arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
+ break;
+ case ARM_SMMU_DOMAIN_BYPASS:
+ arm_smmu_make_bypass_ste(&target);
+ break;
+ }
+ arm_smmu_install_ste_for_dev(master, &target);
arm_smmu_enable_ats(master);
goto out_unlock;
--
2.43.0
* [PATCH v3 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (9 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order Jason Gunthorpe
` (10 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
This was needed because the STE code required the STE to be in
ABORT/BYPASS in order to program a cdtable or S2 STE. Now that the STE code
can automatically handle all transitions we can remove this step
from the attach_dev flow.
A few small bugs exist because of this:
1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false
then there will be a moment where the STE points at BYPASS. Since
this can be done by VFIO/IOMMUFD it is a small security race.
2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT
regions will temporarily become BLOCKED. We'd like drivers to
work in a way that allows IOMMU_RESV_DIRECT to be continuously
functional during these transitions.
Make arm_smmu_release_device() put the STE back to the correct
ABORT/BYPASS setting. Fix a bug where an IOMMU_RESV_DIRECT was ignored on
this path.
As noted before the reordering of the linked list/STE/CD changes is OK
against concurrent arm_smmu_share_asid() because of the
arm_smmu_asid_lock.
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0c1d70b8f325ed..7b1f7fa27b3df0 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2482,7 +2482,6 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
static void arm_smmu_detach_dev(struct arm_smmu_master *master)
{
unsigned long flags;
- struct arm_smmu_ste target;
struct arm_smmu_domain *smmu_domain = master->domain;
if (!smmu_domain)
@@ -2496,11 +2495,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
master->domain = NULL;
master->ats_enabled = false;
- if (disable_bypass)
- arm_smmu_make_abort_ste(&target);
- else
- arm_smmu_make_bypass_ste(&target);
- arm_smmu_install_ste_for_dev(master, &target);
/*
* Clearing the CD entry isn't strictly required to detach the domain
* since the table is uninstalled anyway, but it helps avoid confusion
@@ -2852,9 +2846,18 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
static void arm_smmu_release_device(struct device *dev)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_ste target;
if (WARN_ON(arm_smmu_master_sva_enabled(master)))
iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
+
+ /* Put the STE back to what arm_smmu_init_strtab() sets */
+ if (disable_bypass && !dev->iommu->require_direct)
+ arm_smmu_make_abort_ste(&target);
+ else
+ arm_smmu_make_bypass_ste(&target);
+ arm_smmu_install_ste_for_dev(master, &target);
+
arm_smmu_detach_dev(master);
arm_smmu_disable_pasid(master);
arm_smmu_remove_master(master);
--
2.43.0
* [PATCH v3 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (10 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev() Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 13/19] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats() Jason Gunthorpe
` (9 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Get closer to the IOMMU API ideal that changes between domains can be
hitless. The ordering for the CD table entry is not entirely clean from
this perspective.
When switching away from a STE with a CD table programmed in it we should
write the new STE first, then clear any old data in the CD entry.
If we are programming a CD table for the first time to a STE then the CD
entry should be programmed before the STE is loaded.
If we are replacing a CD table entry when the STE already points at the CD
entry then we just need to do the make/break sequence.
Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
properly. The only other caller is arm_smmu_release_device() and it is
going to free the cdtable anyhow, so it doesn't matter what is in it.
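A compact stand-alone sketch of the three orderings described above, with invented enum names and print statements standing in for the real STE/CD writes:

#include <stdio.h>

/* Illustration only: the three CD/STE write orders described above. */
enum cd_transition {
        LEAVING_CD_TABLE,       /* old STE pointed at a CD table, new one does not */
        INSTALLING_CD_TABLE,    /* STE gains a CD table it did not have before */
        REPLACING_CD_ENTRY,     /* STE keeps pointing at the same CD table */
};

static void write_order(enum cd_transition t)
{
        switch (t) {
        case LEAVING_CD_TABLE:
                puts("write the new STE first, then clear the stale CD entry");
                break;
        case INSTALLING_CD_TABLE:
                puts("program the CD entry first, then load the STE");
                break;
        case REPLACING_CD_ENTRY:
                puts("make/break update of the CD entry only");
                break;
        }
}

int main(void)
{
        write_order(LEAVING_CD_TABLE);
        write_order(INSTALLING_CD_TABLE);
        write_order(REPLACING_CD_ENTRY);
        return 0;
}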
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 ++++++++++++++-------
1 file changed, 20 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7b1f7fa27b3df0..212d0ad7e5f911 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2495,14 +2495,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
master->domain = NULL;
master->ats_enabled = false;
- /*
- * Clearing the CD entry isn't strictly required to detach the domain
- * since the table is uninstalled anyway, but it helps avoid confusion
- * in the call to arm_smmu_write_ctx_desc on the next attach (which
- * expects the entry to be empty).
- */
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && master->cd_table.cdtab)
- arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
}
static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -2579,6 +2571,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
master->domain = NULL;
goto out_list_del;
}
+ } else {
+ /*
+ * arm_smmu_write_ctx_desc() relies on the entry being
+ * invalid to work, clear any existing entry.
+ */
+ ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+ NULL);
+ if (ret) {
+ master->domain = NULL;
+ goto out_list_del;
+ }
}
ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
@@ -2588,15 +2591,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
+ arm_smmu_install_ste_for_dev(master, &target);
break;
case ARM_SMMU_DOMAIN_S2:
arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
+ arm_smmu_install_ste_for_dev(master, &target);
+ if (master->cd_table.cdtab)
+ arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+ NULL);
break;
case ARM_SMMU_DOMAIN_BYPASS:
arm_smmu_make_bypass_ste(&target);
+ arm_smmu_install_ste_for_dev(master, &target);
+ if (master->cd_table.cdtab)
+ arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+ NULL);
break;
}
- arm_smmu_install_ste_for_dev(master, &target);
arm_smmu_enable_ats(master);
goto out_unlock;
--
2.43.0
* [PATCH v3 13/19] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (11 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain Jason Gunthorpe
` (8 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
The caller already has the domain; just pass it in. A following patch will
remove master->domain.
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 212d0ad7e5f911..e1de2799310961 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2394,12 +2394,12 @@ static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
return dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev));
}
-static void arm_smmu_enable_ats(struct arm_smmu_master *master)
+static void arm_smmu_enable_ats(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain)
{
size_t stu;
struct pci_dev *pdev;
struct arm_smmu_device *smmu = master->smmu;
- struct arm_smmu_domain *smmu_domain = master->domain;
/* Don't enable ATS at the endpoint if it's not enabled in the STE */
if (!master->ats_enabled)
@@ -2415,10 +2415,9 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
-static void arm_smmu_disable_ats(struct arm_smmu_master *master)
+static void arm_smmu_disable_ats(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain)
{
- struct arm_smmu_domain *smmu_domain = master->domain;
-
if (!master->ats_enabled)
return;
@@ -2487,7 +2486,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
if (!smmu_domain)
return;
- arm_smmu_disable_ats(master);
+ arm_smmu_disable_ats(master, smmu_domain);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_del(&master->domain_head);
@@ -2609,7 +2608,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
break;
}
- arm_smmu_enable_ats(master);
+ arm_smmu_enable_ats(master, smmu_domain);
goto out_unlock;
out_list_del:
--
2.43.0
* [PATCH v3 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (12 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 13/19] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats() Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain Jason Gunthorpe
` (7 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Introducing global statics which are of type struct iommu_domain, not
struct arm_smmu_domain, makes it difficult to retain
arm_smmu_master->domain, as it can no longer point to an IDENTITY or
BLOCKED domain.
The only place that uses the value is arm_smmu_detach_dev(). Change things
to work like other drivers and call iommu_get_domain_for_dev() to obtain
the current domain.
The master->domain pointer was subtly protecting the domain_head from being
used while unattached; change the domain_head to be INIT'd when the master is
not attached to a domain instead of containing garbage/zero.
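The 'keep domain_head initialized' point relies on the standard Linux list idiom. The self-contained sketch below re-implements just enough of it (simplified local helpers, not the kernel's list.h) to show why an always-initialized node makes list_del_init() safe to call at any time, whereas a garbage-filled node would dereference wild pointers:

#include <stdio.h>

/* Minimal intrusive list mirroring the kernel's semantics. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *new, struct list_head *head)
{
        new->next = head->next;
        new->prev = head;
        head->next->prev = new;
        head->next = new;
}

static void list_del_init(struct list_head *entry)
{
        entry->prev->next = entry->next;
        entry->next->prev = entry->prev;
        INIT_LIST_HEAD(entry);          /* the entry can be deleted again safely */
}

int main(void)
{
        struct list_head devices, node;

        INIT_LIST_HEAD(&devices);
        INIT_LIST_HEAD(&node);          /* what probe now does for domain_head */

        list_del_init(&node);           /* detach before any attach: harmless */
        list_add(&node, &devices);      /* attach */
        list_del_init(&node);           /* normal detach */
        list_del_init(&node);           /* repeated detach: still harmless */

        /* Had node started out as garbage, the deletes above would have
         * followed wild next/prev pointers and corrupted memory. */
        printf("list empty: %d\n", devices.next == &devices);
        return 0;
}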
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 ++++++++-------------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
2 files changed, 10 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e1de2799310961..525048a79e8a90 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2480,19 +2480,20 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
static void arm_smmu_detach_dev(struct arm_smmu_master *master)
{
+ struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
+ struct arm_smmu_domain *smmu_domain;
unsigned long flags;
- struct arm_smmu_domain *smmu_domain = master->domain;
- if (!smmu_domain)
+ if (!domain)
return;
+ smmu_domain = to_smmu_domain(domain);
arm_smmu_disable_ats(master, smmu_domain);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_del(&master->domain_head);
+ list_del_init(&master->domain_head);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
- master->domain = NULL;
master->ats_enabled = false;
}
@@ -2546,8 +2547,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_detach_dev(master);
- master->domain = smmu_domain;
-
/*
* The SMMU does not support enabling ATS with bypass. When the STE is
* in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
@@ -2566,10 +2565,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
case ARM_SMMU_DOMAIN_S1:
if (!master->cd_table.cdtab) {
ret = arm_smmu_alloc_cd_tables(master);
- if (ret) {
- master->domain = NULL;
+ if (ret)
goto out_list_del;
- }
} else {
/*
* arm_smmu_write_ctx_desc() relies on the entry being
@@ -2577,17 +2574,13 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
*/
ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
NULL);
- if (ret) {
- master->domain = NULL;
+ if (ret)
goto out_list_del;
- }
}
ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
- if (ret) {
- master->domain = NULL;
+ if (ret)
goto out_list_del;
- }
arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
arm_smmu_install_ste_for_dev(master, &target);
@@ -2613,7 +2606,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
out_list_del:
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_del(&master->domain_head);
+ list_del_init(&master->domain_head);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
out_unlock:
@@ -2817,6 +2810,7 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
master->dev = dev;
master->smmu = smmu;
INIT_LIST_HEAD(&master->bonds);
+ INIT_LIST_HEAD(&master->domain_head);
dev_iommu_priv_set(dev, master);
ret = arm_smmu_insert_master(smmu, master);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1be0c1151c50c3..21f2f73501019a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -695,7 +695,6 @@ struct arm_smmu_stream {
struct arm_smmu_master {
struct arm_smmu_device *smmu;
struct device *dev;
- struct arm_smmu_domain *domain;
struct list_head domain_head;
struct arm_smmu_stream *streams;
/* Locked by the iommu core using the group mutex */
--
2.43.0
* [PATCH v3 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (13 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain Jason Gunthorpe
` (6 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Move to the new static global for identity domains. Move all the logic out
of arm_smmu_attach_dev into an identity only function.
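The design being introduced here is the usual 'const ops table plus one global static instance' pattern, which is what lets attach never fail: there is nothing to allocate per attach. A generic stand-alone sketch with invented toy types rather than the iommu core's structures:

#include <stdio.h>

/* Invented stand-ins for iommu_domain / iommu_domain_ops. */
struct toy_domain;
struct toy_domain_ops {
        int (*attach_dev)(struct toy_domain *domain, const char *dev);
};
struct toy_domain {
        const char *type;
        const struct toy_domain_ops *ops;
};

static int toy_attach_identity(struct toy_domain *domain, const char *dev)
{
        /* No per-attach allocation or mutable domain state, so no failure path */
        printf("%s attached to the %s domain\n", dev, domain->type);
        return 0;
}

static const struct toy_domain_ops toy_identity_ops = {
        .attach_dev = toy_attach_identity,
};

/* One global instance shared by every device */
static struct toy_domain toy_identity_domain = {
        .type = "identity",
        .ops = &toy_identity_ops,
};

int main(void)
{
        return toy_identity_domain.ops->attach_dev(&toy_identity_domain, "dev0");
}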
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 82 +++++++++++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
2 files changed, 58 insertions(+), 25 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 525048a79e8a90..9899caeabc8744 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2173,8 +2173,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
return arm_smmu_sva_domain_alloc();
if (type != IOMMU_DOMAIN_UNMANAGED &&
- type != IOMMU_DOMAIN_DMA &&
- type != IOMMU_DOMAIN_IDENTITY)
+ type != IOMMU_DOMAIN_DMA)
return NULL;
/*
@@ -2282,11 +2281,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;
- if (domain->type == IOMMU_DOMAIN_IDENTITY) {
- smmu_domain->stage = ARM_SMMU_DOMAIN_BYPASS;
- return 0;
- }
-
/* Restrict the stage to what we can actually support */
if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
@@ -2484,7 +2478,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
struct arm_smmu_domain *smmu_domain;
unsigned long flags;
- if (!domain)
+ if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
return;
smmu_domain = to_smmu_domain(domain);
@@ -2547,15 +2541,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_detach_dev(master);
- /*
- * The SMMU does not support enabling ATS with bypass. When the STE is
- * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
- * Translated transactions are denied as though ATS is disabled for the
- * stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
- * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
- */
- if (smmu_domain->stage != ARM_SMMU_DOMAIN_BYPASS)
- master->ats_enabled = arm_smmu_ats_supported(master);
+ master->ats_enabled = arm_smmu_ats_supported(master);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_add(&master->domain_head, &smmu_domain->devices);
@@ -2592,13 +2578,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
NULL);
break;
- case ARM_SMMU_DOMAIN_BYPASS:
- arm_smmu_make_bypass_ste(&target);
- arm_smmu_install_ste_for_dev(master, &target);
- if (master->cd_table.cdtab)
- arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
- NULL);
- break;
}
arm_smmu_enable_ats(master, smmu_domain);
@@ -2614,6 +2593,60 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return ret;
}
+static int arm_smmu_attach_dev_ste(struct device *dev,
+ struct arm_smmu_ste *ste)
+{
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+ if (arm_smmu_master_sva_enabled(master))
+ return -EBUSY;
+
+ /*
+ * Do not allow any ASID to be changed while we are working on the STE,
+ * otherwise we could miss invalidations.
+ */
+ mutex_lock(&arm_smmu_asid_lock);
+
+ /*
+ * The SMMU does not support enabling ATS with bypass/abort. When the
+ * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
+ * and Translated transactions are denied as though ATS is disabled for
+ * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
+ * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
+ */
+ arm_smmu_detach_dev(master);
+
+ arm_smmu_install_ste_for_dev(master, ste);
+ mutex_unlock(&arm_smmu_asid_lock);
+
+ /*
+ * This has to be done after removing the master from the
+ * arm_smmu_domain->devices to avoid races updating the same context
+ * descriptor from arm_smmu_share_asid().
+ */
+ if (master->cd_table.cdtab)
+ arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
+ return 0;
+}
+
+static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
+ struct device *dev)
+{
+ struct arm_smmu_ste ste;
+
+ arm_smmu_make_bypass_ste(&ste);
+ return arm_smmu_attach_dev_ste(dev, &ste);
+}
+
+static const struct iommu_domain_ops arm_smmu_identity_ops = {
+ .attach_dev = arm_smmu_attach_dev_identity,
+};
+
+static struct iommu_domain arm_smmu_identity_domain = {
+ .type = IOMMU_DOMAIN_IDENTITY,
+ .ops = &arm_smmu_identity_ops,
+};
+
static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t pgsize, size_t pgcount,
int prot, gfp_t gfp, size_t *mapped)
@@ -3007,6 +3040,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
}
static struct iommu_ops arm_smmu_ops = {
+ .identity_domain = &arm_smmu_identity_domain,
.capable = arm_smmu_capable,
.domain_alloc = arm_smmu_domain_alloc,
.probe_device = arm_smmu_probe_device,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 21f2f73501019a..154808f96718df 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -712,7 +712,6 @@ struct arm_smmu_master {
enum arm_smmu_domain_stage {
ARM_SMMU_DOMAIN_S1 = 0,
ARM_SMMU_DOMAIN_S2,
- ARM_SMMU_DOMAIN_BYPASS,
};
struct arm_smmu_domain {
--
2.43.0
* [PATCH v3 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (14 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 17/19] iommu/arm-smmu-v3: Use the identity/blocked domain during release Jason Gunthorpe
` (5 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Using the same design as the IDENTITY domain, install an
STRTAB_STE_0_CFG_ABORT STE.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9899caeabc8744..386bf954542e7d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2647,6 +2647,24 @@ static struct iommu_domain arm_smmu_identity_domain = {
.ops = &arm_smmu_identity_ops,
};
+static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
+ struct device *dev)
+{
+ struct arm_smmu_ste ste;
+
+ arm_smmu_make_abort_ste(&ste);
+ return arm_smmu_attach_dev_ste(dev, &ste);
+}
+
+static const struct iommu_domain_ops arm_smmu_blocked_ops = {
+ .attach_dev = arm_smmu_attach_dev_blocked,
+};
+
+static struct iommu_domain arm_smmu_blocked_domain = {
+ .type = IOMMU_DOMAIN_BLOCKED,
+ .ops = &arm_smmu_blocked_ops,
+};
+
static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t pgsize, size_t pgcount,
int prot, gfp_t gfp, size_t *mapped)
@@ -3041,6 +3059,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
+ .blocked_domain = &arm_smmu_blocked_domain,
.capable = arm_smmu_capable,
.domain_alloc = arm_smmu_domain_alloc,
.probe_device = arm_smmu_probe_device,
--
2.43.0
* [PATCH v3 17/19] iommu/arm-smmu-v3: Use the identity/blocked domain during release
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (15 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize Jason Gunthorpe
` (4 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Consolidate some more code by having release call
arm_smmu_attach_dev_identity/blocked() instead of open coding this.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 386bf954542e7d..8269e2fed6038b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2901,19 +2901,16 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
static void arm_smmu_release_device(struct device *dev)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- struct arm_smmu_ste target;
if (WARN_ON(arm_smmu_master_sva_enabled(master)))
iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
/* Put the STE back to what arm_smmu_init_strtab() sets */
if (disable_bypass && !dev->iommu->require_direct)
- arm_smmu_make_abort_ste(&target);
+ arm_smmu_attach_dev_blocked(&arm_smmu_blocked_domain, dev);
else
- arm_smmu_make_bypass_ste(&target);
- arm_smmu_install_ste_for_dev(master, &target);
+ arm_smmu_attach_dev_identity(&arm_smmu_identity_domain, dev);
- arm_smmu_detach_dev(master);
arm_smmu_disable_pasid(master);
arm_smmu_remove_master(master);
if (master->cd_table.cdtab)
--
2.43.0
* [PATCH v3 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (16 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 17/19] iommu/arm-smmu-v3: Use the identity/blocked domain during release Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 19/19] iommu/arm-smmu-v3: Convert to domain_alloc_paging() Jason Gunthorpe
` (3 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Instead of putting container_of() casts in the internals, use the proper
type in this call chain. This makes it easier to check that the two global
static domains are not leaking into call chains they should not.
Passing the smmu spares the only caller from having to set it and unset it
in the error path.
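For context, container_of() is the usual downcast from an embedded member to its container. The self-contained sketch below (local macro, invented structs) shows both styles: the typed parameter states the requirement in the prototype itself, while a container_of() cast compiles just as happily for a base object that is not embedded in the derived type at all:

#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

struct base { int id; };                        /* think: struct iommu_domain */
struct derived { int data; struct base base; }; /* think: struct arm_smmu_domain */

/* Taking the derived type directly documents that only real derived objects
 * may reach this function; a bare struct base cannot be passed by accident. */
static void finalise(struct derived *d)
{
        d->data = 42;
}

int main(void)
{
        struct derived d = { .base = { .id = 1 } };
        struct base *b = &d.base;

        /* The cast-based style works here, but nothing stops a caller from
         * doing it with a struct base that has no containing struct derived. */
        finalise(container_of(b, struct derived, base));
        printf("data=%d id=%d\n", d.data, b->id);
        return 0;
}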
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 34 ++++++++++-----------
1 file changed, 17 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8269e2fed6038b..873343109a90bb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -87,6 +87,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
};
static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
+static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_device *smmu);
static void parse_driver_options(struct arm_smmu_device *smmu)
{
@@ -2215,12 +2217,12 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
kfree(smmu_domain);
}
-static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
+static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
+ struct arm_smmu_domain *smmu_domain,
struct io_pgtable_cfg *pgtbl_cfg)
{
int ret;
u32 asid;
- struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;
@@ -2252,11 +2254,11 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
return ret;
}
-static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
+static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu,
+ struct arm_smmu_domain *smmu_domain,
struct io_pgtable_cfg *pgtbl_cfg)
{
int vmid;
- struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
/* Reserve VMID 0 for stage-2 bypass STEs */
@@ -2269,17 +2271,17 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
return 0;
}
-static int arm_smmu_domain_finalise(struct iommu_domain *domain)
+static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_device *smmu)
{
int ret;
unsigned long ias, oas;
enum io_pgtable_fmt fmt;
struct io_pgtable_cfg pgtbl_cfg;
struct io_pgtable_ops *pgtbl_ops;
- int (*finalise_stage_fn)(struct arm_smmu_domain *,
- struct io_pgtable_cfg *);
- struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
- struct arm_smmu_device *smmu = smmu_domain->smmu;
+ int (*finalise_stage_fn)(struct arm_smmu_device *smmu,
+ struct arm_smmu_domain *smmu_domain,
+ struct io_pgtable_cfg *pgtbl_cfg);
/* Restrict the stage to what we can actually support */
if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
@@ -2318,17 +2320,18 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
if (!pgtbl_ops)
return -ENOMEM;
- domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
- domain->geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
- domain->geometry.force_aperture = true;
+ smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+ smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
+ smmu_domain->domain.geometry.force_aperture = true;
- ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
+ ret = finalise_stage_fn(smmu, smmu_domain, &pgtbl_cfg);
if (ret < 0) {
free_io_pgtable_ops(pgtbl_ops);
return ret;
}
smmu_domain->pgtbl_ops = pgtbl_ops;
+ smmu_domain->smmu = smmu;
return 0;
}
@@ -2520,10 +2523,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
mutex_lock(&smmu_domain->init_mutex);
if (!smmu_domain->smmu) {
- smmu_domain->smmu = smmu;
- ret = arm_smmu_domain_finalise(domain);
- if (ret)
- smmu_domain->smmu = NULL;
+ ret = arm_smmu_domain_finalise(smmu_domain, smmu);
} else if (smmu_domain->smmu != smmu)
ret = -EINVAL;
--
2.43.0
* [PATCH v3 19/19] iommu/arm-smmu-v3: Convert to domain_alloc_paging()
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (17 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize Jason Gunthorpe
@ 2023-12-05 19:14 ` Jason Gunthorpe
2023-12-06 1:53 ` [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Moritz Fischer
` (2 subsequent siblings)
21 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:14 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Now that the BLOCKED and IDENTITY behaviors are managed with their own
domains, change to the domain_alloc_paging() op.
For now SVA keeps using the old interface; eventually it will get its
own op that can pass in the device and mm_struct, which will let us have a
sane lifetime for the mmu_notifier.
Call arm_smmu_domain_finalise() early if dev is available.
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 873343109a90bb..f3900a3d52524a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2169,14 +2169,15 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
{
- struct arm_smmu_domain *smmu_domain;
if (type == IOMMU_DOMAIN_SVA)
return arm_smmu_sva_domain_alloc();
+ return NULL;
+}
- if (type != IOMMU_DOMAIN_UNMANAGED &&
- type != IOMMU_DOMAIN_DMA)
- return NULL;
+static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
+{
+ struct arm_smmu_domain *smmu_domain;
/*
* Allocate the domain and initialise some of its data structures.
@@ -2192,6 +2193,14 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
spin_lock_init(&smmu_domain->devices_lock);
INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
+ if (dev) {
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+ if (arm_smmu_domain_finalise(smmu_domain, master->smmu)) {
+ kfree(smmu_domain);
+ return NULL;
+ }
+ }
return &smmu_domain->domain;
}
@@ -3059,6 +3068,7 @@ static struct iommu_ops arm_smmu_ops = {
.blocked_domain = &arm_smmu_blocked_domain,
.capable = arm_smmu_capable,
.domain_alloc = arm_smmu_domain_alloc,
+ .domain_alloc_paging = arm_smmu_domain_alloc_paging,
.probe_device = arm_smmu_probe_device,
.release_device = arm_smmu_release_device,
.device_group = arm_smmu_device_group,
--
2.43.0
* Re: [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (18 preceding siblings ...)
2023-12-05 19:14 ` [PATCH v3 19/19] iommu/arm-smmu-v3: Convert to domain_alloc_paging() Jason Gunthorpe
@ 2023-12-06 1:53 ` Moritz Fischer
2023-12-11 18:03 ` Jason Gunthorpe
2024-01-29 19:13 ` Mostafa Saleh
21 siblings, 0 replies; 31+ messages in thread
From: Moritz Fischer @ 2023-12-06 1:53 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Hi Jason,
just got back to actually having access to my machine...
On Tue, Dec 05, 2023 at 03:14:32PM -0400, Jason Gunthorpe wrote:
> The SMMUv3 driver was originally written in 2015 when the iommu driver
> facing API looked quite different. The API has evolved, especially lately,
> and the driver has fallen behind.
> This work aims to bring make the SMMUv3 driver the best IOMMU driver with
> the most comprehensive implementation of the API. After all parts it
> addresses:
> - Global static BLOCKED and IDENTITY domains with 'never fail' attach
> semantics. BLOCKED is desired for efficient VFIO.
> - Support map before attach for PAGING iommu_domains.
> - attach_dev failure does not change the HW configuration.
> - Fully hitless transitions between IDENTITY -> DMA -> IDENTITY.
> The API has IOMMU_RESV_DIRECT which is expected to be
> continuously translating.
> - Safe transitions between PAGING -> BLOCKED, do not ever temporarily
> do IDENTITY. This is required for iommufd security.
> - Full PASID API support including:
> - S1/SVA domains attached to PASIDs
> - IDENTITY/BLOCKED/S1 attached to RID
> - Change of the RID domain while PASIDs are attached
> - Streamlined SVA support using the core infrastructure
> - Hitless, whenever possible, change between two domains
> - iommufd IOMMU_GET_HW_INFO, IOMMU_HWPT_ALLOC_NEST_PARENT, and
> IOMMU_DOMAIN_NESTED support
> Over all these things are going to become more accessible to iommufd, and
> exposed to VMs, so it is important for the driver to have a robust
> implementation of the API.
> The work is split into three parts, with this part largely focusing on the
> STE and building up to the BLOCKED & IDENTITY global static domains.
> The second part largely focuses on the CD and builds up to having a common
> PASID infrastructure that SVA and S1 domains equally use.
> The third part has some random cleanups and the iommufd related parts.
> Overall this takes the approach of turning the STE/CD programming upside
> down where the CD/STE value is computed right at a driver callback
> function and then pushed down into programming logic. The programming
> logic hides the details of the required CD/STE tear-less update. This
> makes the CD/STE functions independent of the arm_smmu_domain which makes
> it fairly straightforward to untangle all the different call chains, and
> add news ones.
> Further, this frees the arm_smmu_domain related logic from keeping track
> of what state the STE/CD is currently in so it can carefully sequence the
> correct update. There are many new update pairs that are subtly introduced
> as the work progresses.
> The locking to support BTM via arm_smmu_asid_lock is a bit subtle right
> now and patches throughout this work adjust and tighten this so that it is
> clearer and doesn't get broken.
> Once the lower STE layers no longer need to touch arm_smmu_domain we can
> isolate struct arm_smmu_domain to be only used for PAGING domains, audit
> all the to_smmu_domain() calls to be only in PAGING domain ops, and
> introduce the normal global static BLOCKED/IDENTITY domains using the new
> STE infrastructure. Part 2 will ultimately migrate SVA over to use
> arm_smmu_domain as well.
> All parts are on github:
> https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
> v3:
> - Use some local variables in arm_smmu_get_step_for_sid() for clarity
> - White space and spelling changes
> - Commit message updates
> - Keep master->domain_head initialized to avoid a list_del corruption
> v2:
> https://lore.kernel.org/r/0-v2-de8b10590bf5+400-smmuv3_newapi_p1_jgg@nvidia.com
> - Rebased on v6.7-rc1
> - Improve the comment for arm_smmu_write_entry_step()
> - Fix the botched memcmp
> - Document the spec justification for the SHCFG exclusion in used
> - Include STRTAB_STE_1_SHCFG for STRTAB_STE_0_CFG_S2_TRANS in used
> - WARN_ON for unknown STEs in used
> - Fix error unwind in arm_smmu_attach_dev()
> - Whitespace, spelling, and checkpatch related items
> v1:
> https://lore.kernel.org/r/0-v1-e289ca9121be+2be-smmuv3_newapi_p1_jgg@nvidia.com
> Jason Gunthorpe (19):
> iommu/arm-smmu-v3: Add a type for the STE
> iommu/arm-smmu-v3: Master cannot be NULL in
> arm_smmu_write_strtab_ent()
> iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
> iommu/arm-smmu-v3: Make STE programming independent of the callers
> iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
> iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste()
> iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into
> functions
> iommu/arm-smmu-v3: Build the whole STE in
> arm_smmu_make_s2_domain_ste()
> iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
> iommu/arm-smmu-v3: Compute the STE only once for each master
> iommu/arm-smmu-v3: Do not change the STE twice during
> arm_smmu_attach_dev()
> iommu/arm-smmu-v3: Put writing the context descriptor in the right
> order
> iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
> iommu/arm-smmu-v3: Remove arm_smmu_master->domain
> iommu/arm-smmu-v3: Add a global static IDENTITY domain
> iommu/arm-smmu-v3: Add a global static BLOCKED domain
> iommu/arm-smmu-v3: Use the identity/blocked domain during release
> iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to
> finalize
> iommu/arm-smmu-v3: Convert to domain_alloc_paging()
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 729 +++++++++++++-------
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 12 +-
> 2 files changed, 477 insertions(+), 264 deletions(-)
> base-commit: ca7fcaff577c92d85f0e05cc7be79759155fe328
> --
> 2.43.0
For whole series:
Tested-by: Moritz Fischer <moritzf@google.com>
Cheers,
Moritz
* Re: [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (19 preceding siblings ...)
2023-12-06 1:53 ` [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Moritz Fischer
@ 2023-12-11 18:03 ` Jason Gunthorpe
2023-12-11 18:15 ` Will Deacon
2024-01-29 19:13 ` Mostafa Saleh
21 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-11 18:03 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
On Tue, Dec 05, 2023 at 03:14:32PM -0400, Jason Gunthorpe wrote:
> All parts are on github:
>
> https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
>
> v3:
> - Use some local variables in arm_smmu_get_step_for_sid() for clarity
> - White space and spelling changes
> - Commit message updates
> - Keep master->domain_head initialized to avoid a list_del corruption
> v2: https://lore.kernel.org/r/0-v2-de8b10590bf5+400-smmuv3_newapi_p1_jgg@nvidia.com
> - Rebased on v6.7-rc1
> - Improve the comment for arm_smmu_write_entry_step()
> - Fix the botched memcmp
> - Document the spec justification for the SHCFG exclusion in used
> - Include STRTAB_STE_1_SHCFG for STRTAB_STE_0_CFG_S2_TRANS in used
> - WARN_ON for unknown STEs in used
> - Fix error unwind in arm_smmu_attach_dev()
> - Whitespace, spelling, and checkpatch related items
> v1: https://lore.kernel.org/r/0-v1-e289ca9121be+2be-smmuv3_newapi_p1_jgg@nvidia.com
This hasn't changed significantly in the last three months, so I feel
done now. I think Eric may still have a formal Tested-by for his
Fujitsu system to record the run he did.
Will, we are waiting for you to say something so we can shift review
and testing focus to part 2, ideally in January. Many people are
waiting for this.
Thanks,
Jason
* Re: [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
2023-12-11 18:03 ` Jason Gunthorpe
@ 2023-12-11 18:15 ` Will Deacon
0 siblings, 0 replies; 31+ messages in thread
From: Will Deacon @ 2023-12-11 18:15 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Eric Auger,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
On Mon, Dec 11, 2023 at 02:03:24PM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 05, 2023 at 03:14:32PM -0400, Jason Gunthorpe wrote:
> > All parts are on github:
> >
> > https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
> >
> > v3:
> > - Use some local variables in arm_smmu_get_step_for_sid() for clarity
> > - White space and spelling changes
> > - Commit message updates
> > - Keep master->domain_head initialized to avoid a list_del corruption
> > v2: https://lore.kernel.org/r/0-v2-de8b10590bf5+400-smmuv3_newapi_p1_jgg@nvidia.com
> > - Rebased on v6.7-rc1
> > - Improve the comment for arm_smmu_write_entry_step()
> > - Fix the botched memcmp
> > - Document the spec justification for the SHCFG exclusion in used
> > - Include STRTAB_STE_1_SHCFG for STRTAB_STE_0_CFG_S2_TRANS in used
> > - WARN_ON for unknown STEs in used
> > - Fix error unwind in arm_smmu_attach_dev()
> > - Whitespace, spelling, and checkpatch related items
> > v1: https://lore.kernel.org/r/0-v1-e289ca9121be+2be-smmuv3_newapi_p1_jgg@nvidia.com
>
> This hasn't changed significantly in the last three months, so I feel
> done now. I think Eric may still have a formal Tested-by for his
> Fujitsu system to record the run he did.
>
> Will, we are waiting for you to say something so we can shift review
> and testing focus to part 2, ideally in January. Many people are
> waiting for this.
I'm sorry that you're waiting for me, but I'm snowed under with other
changes and the arm64 tree is my priority at the moment. This series _is_ on
my list, and I appreciate that you've got some review; however, the fact that
you seem to be lacking any comments from the usual SMMU folks such as Robin
and Jean-Philippe does make me worry about this series, to the point that I'm
not prepared to just pick it up without a thorough look.
It sucks, but I don't know what else to tell you.
Will
* Re: [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
2023-12-05 19:14 ` [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers Jason Gunthorpe
@ 2023-12-12 16:23 ` Will Deacon
2023-12-12 18:04 ` Jason Gunthorpe
2024-01-29 19:10 ` Mostafa Saleh
1 sibling, 1 reply; 31+ messages in thread
From: Will Deacon @ 2023-12-12 16:23 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Eric Auger,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Hi Jason,
On Tue, Dec 05, 2023 at 03:14:36PM -0400, Jason Gunthorpe wrote:
> As the comment in arm_smmu_write_strtab_ent() explains, this routine has
> been limited to only work correctly in certain scenarios that the caller
> must ensure. Generally the caller must put the STE into ABORT or BYPASS
> before attempting to program it to something else.
>
> The next patches/series are going to start removing some of this logic
> from the callers, and add more complex state combinations than currently.
>
> Thus, consolidate all the complexity here. Callers do not have to care
> about what STE transition they are doing, this function will handle
> everything optimally.
>
> Revise arm_smmu_write_strtab_ent() so it algorithmically computes the
> required programming sequence to avoid creating an incoherent 'torn' STE
> in the HW caches. The update algorithm follows the same design that the
> driver already uses: it is safe to change bits that HW doesn't currently
> use and then do a single 64 bit update, with sync's in between.
>
> The basic idea is to express in a bitmask what bits the HW is actually
> using based on the V and CFG bits. Based on that mask we know what STE
> changes are safe and which are disruptive. We can count how many 64 bit
> QWORDS need a disruptive update and know if a step with V=0 is required.
>
> This gives two basic flows through the algorithm.
>
> If only a single 64 bit quantity needs disruptive replacement:
> - Write the target value into all currently unused bits
> - Write the single 64 bit quantity
> - Zero the remaining different bits
>
> If multiple 64 bit quantities need disruptive replacement then do:
> - Write V=0 to QWORD 0
> - Write the entire STE except QWORD 0
> - Write QWORD 0
>
> With HW flushes at each step, that can be skipped if the STE didn't change
> in that step.
>
> At this point it generates the same sequence of updates as the current
> code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
> extra sync (this seems to be an existing bug).
This is certainly very clever, but at the same time I can't help but feel
that it's slightly over-engineered to solve the general case, whereas I'm
struggling to see why such a level of complexity is necessary.
In the comment, you say:
> + * In the most general case we can make any update in three steps:
> + * - Disrupting the entry (V=0)
> + * - Fill now unused bits, all bits except V
> + * - Make valid (V=1), single 64 bit store
> + *
> + * However this disrupts the HW while it is happening. There are several
> + * interesting cases where a STE/CD can be updated without disturbing the HW
> + * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
> + * because the used bits don't intersect. We can detect this by calculating how
> + * many 64 bit values need update after adjusting the unused bits and skip the
> + * V=0 process.
Please can you spell out these "interesting cases"? For cases where we're
changing CONFIG, I'd have thought it would be perfectly fine to go via an
invalid STE. What am I missing?
Generally, I like where the later patches in the series take things, but
I'd really like to reduce the complexity of the strtab updating code to
what is absolutely required.
I've left some minor comments on the code below, but I'd really like to
see this whole thing simplified if possible.
> + */
> +static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
> + const __le64 *target,
> + const __le64 *target_used, __le64 *step,
> + __le64 v_bit,
> + unsigned int len)
> +{
> + u8 step_used_diff = 0;
> + u8 step_change = 0;
> + unsigned int i;
> +
> + /*
> + * Compute a step that has all the bits currently unused by HW set to
> + * their target values.
> + */
Well, ok, I do have a cosmetic nit here: using 'step' for both "STE pointer"
and "incremental change" is perhaps, err, a step too far ;)
> + for (i = 0; i != len; i++) {
> + step[i] = (cur[i] & cur_used[i]) | (target[i] & ~cur_used[i]);
Isn't 'cur[i] & cur_used[i]' always cur[i]?
> + if (cur[i] != step[i])
> + step_change |= 1 << i;
> + /*
> + * Each bit indicates if the step is incorrect compared to the
> + * target, considering only the used bits in the target
> + */
> + if ((step[i] & target_used[i]) != (target[i] & target_used[i]))
> + step_used_diff |= 1 << i;
> + }
> +
> + if (hweight8(step_used_diff) > 1) {
> + /*
> + * More than 1 qword is mismatched, this cannot be done without
> + * a break. Clear the V bit and go again.
> + */
> + step[0] &= ~v_bit;
> + } else if (!step_change && step_used_diff) {
> + /*
> + * Have exactly one critical qword, all the other qwords are set
> + * correctly, so we can set this qword now.
> + */
> + i = ffs(step_used_diff) - 1;
> + step[i] = target[i];
> + } else if (!step_change) {
> + /* cur == target, so all done */
> + if (memcmp(cur, target, len * sizeof(*cur)) == 0)
> + return true;
> +
> + /*
> + * All the used HW bits match, but unused bits are different.
> + * Set them as well. Technically this isn't necessary but it
> + * brings the entry to the full target state, so if there are
> + * bugs in the mask calculation this will obscure them.
> + */
> + memcpy(step, target, len * sizeof(*step));
Bah, I'm not a huge fan of this sort of defensive programming. I'd prefer
to propagate the error rather than quietly try to cover it up.
> +/*
> + * Based on the value of ent report which bits of the STE the HW will access. It
> + * would be nice if this was complete according to the spec, but minimally it
> + * has to capture the bits this driver uses.
> + */
> +static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent,
> + struct arm_smmu_ste *used_bits)
> +{
> + memset(used_bits, 0, sizeof(*used_bits));
> +
> + used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V);
> + if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V)))
> + return;
> +
> + /*
> + * If S1 is enabled S1DSS is valid, see 13.5 Summary of
> + * attribute/permission configuration fields for the SHCFG behavior.
> + */
> + if (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0])) & 1 &&
> + FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent->data[1])) ==
> + STRTAB_STE_1_S1DSS_BYPASS)
> + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
> +
> + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
> + switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]))) {
> + case STRTAB_STE_0_CFG_ABORT:
> + break;
> + case STRTAB_STE_0_CFG_BYPASS:
> + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
> + break;
> + case STRTAB_STE_0_CFG_S1_TRANS:
> + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
> + STRTAB_STE_0_S1CTXPTR_MASK |
> + STRTAB_STE_0_S1CDMAX);
> + used_bits->data[1] |=
> + cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
> + STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
> + STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
> + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
> + break;
> + case STRTAB_STE_0_CFG_S2_TRANS:
> + used_bits->data[1] |=
> + cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG);
> + used_bits->data[2] |=
> + cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
> + STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
> + STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R);
> + used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
> + break;
I think this is going to be a pain to maintain :/
> +static bool arm_smmu_write_ste_step(struct arm_smmu_ste *cur,
> + const struct arm_smmu_ste *target,
> + const struct arm_smmu_ste *target_used)
> +{
> + struct arm_smmu_ste cur_used;
> + struct arm_smmu_ste step;
> +
> + arm_smmu_get_ste_used(cur, &cur_used);
> + return arm_smmu_write_entry_step(cur->data, cur_used.data, target->data,
> + target_used->data, step.data,
> + cpu_to_le64(STRTAB_STE_0_V),
> + ARRAY_SIZE(cur->data));
> +}
> +
> +static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
> + struct arm_smmu_ste *ste,
> + const struct arm_smmu_ste *target)
> +{
> + struct arm_smmu_ste target_used;
> + int i;
> +
> + arm_smmu_get_ste_used(target, &target_used);
> + /* Masks in arm_smmu_get_ste_used() are up to date */
> + for (i = 0; i != ARRAY_SIZE(target->data); i++)
> + WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
That's a runtime cost on every single STE update for what would be a driver
bug.
> +
> + while (true) {
> + if (arm_smmu_write_ste_step(ste, target, &target_used))
> + break;
> + arm_smmu_sync_ste_for_sid(smmu, sid);
> + }
This really should be bounded...
Will
* Re: [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
2023-12-12 16:23 ` Will Deacon
@ 2023-12-12 18:04 ` Jason Gunthorpe
0 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2023-12-12 18:04 UTC (permalink / raw)
To: Will Deacon
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Eric Auger,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
On Tue, Dec 12, 2023 at 04:23:53PM +0000, Will Deacon wrote:
> > At this point it generates the same sequence of updates as the current
> > code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
> > extra sync (this seems to be an existing bug).
>
> This is certainly very clever, but at the same time I can't help but feel
> that it's slightly over-engineered to solve the general case, whereas I'm
> struggling to see why such a level of complexity is necessary.
I'm open to any alternative that accomplishes the same outcome of
decoupling the STE programming from the STE state and adding the hitless
transitions the API requires.
This is a really important part of all of this work.
IMHO this is inherently complex. The SMMU spec devotes several pages
to describing how to do tearless updates. However, that text does not
even consider the need for hitless updates. It only contemplates the
V=0 flow.
Here we are very much adding hitless and intending to do so.
> In the comment, you say:
>
> > + * In the most general case we can make any update in three steps:
> > + * - Disrupting the entry (V=0)
> > + * - Fill now unused bits, all bits except V
> > + * - Make valid (V=1), single 64 bit store
> > + *
> > + * However this disrupts the HW while it is happening. There are several
> > + * interesting cases where a STE/CD can be updated without disturbing the HW
> > + * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
> > + * because the used bits don't intersect. We can detect this by calculating how
> > + * many 64 bit values need update after adjusting the unused bits and skip the
> > + * V=0 process.
>
> Please can you spell out these "interesting cases"? For cases where we're
> changing CONFIG, I'd have thought it would be perfectly fine to go via an
> invalid STE. What am I missing?
The modern IOMMU driver API requires hitless transitions in a bunch of
cases. I think I explained it in an email to Michael:
https://lore.kernel.org/linux-iommu/20231012121616.GF3952@nvidia.com/
- IDENTIY -> DMA -> IDENTITY hitless with RESV_DIRECT
- STE -> S1DSS -> STE hitless (PASID upgrade)
- S1 -> BLOCKING -> S1 with active PASID hitless (iommufd case)
- NESTING -> NESTING (eg to change S1DSS, change CD table pointers, etc)
- CD ASID change hitless (BTM S1 replacement)
- CD quiet_cd hitless (SVA mm release)
I will add that list to the commit message, the cover letter talks
about it a bit too.
To properly implement the API the driver has to support this.
Rather than try to just target those few cases (and hope I even know
all the needed cases) I just did everything.
"Everything" was a reaction to the complexity, at the end of part 3 we
have 7 different categories of STEs:
- IDENTITY
- BLOCKING
- S2 domain
- IDENTITY w/ PASID
- BLOCKING w/ PASID
- S1 domain
- S1 domain nested on S2 domain
(plus a bunch of those have ATS and !ATS variations, and the nesting
has its own sub types..)
Which is 42 different permutations of transitions (7 STE types, each able
to change to any of the other 6), plus the CD stuff. Even if we want to just
special case the sequences above we
still have to identify them and open code the arcs. Then the next
person along has to understand why those specific arcs are special
which is a complex and subtle argument based on the spec saying the HW
does not read certain bits.
Finally, there is a VM emulation reason - we don't know what the VM
will do so it may be assuming hitless changes. To actually correctly
emulate this we do need this general approach. Otherwise we have a
weird emulation gap where STE sequences generated by a VM that should
be hitless on real HW (eg the ones above, perhaps) will not
work. Indeed a VM running kernels with all three parts of this will be
doing and expecting hitless STE updates!
While this is perhaps not a real world functional concern it falls
under the usual rubric of accurate VM emulation.
So, to my mind ignoring the general case and just doing V=0 all the
time is not OK.
> Generally, I like where the later patches in the series take things, but
> I'd really like to reduce the complexity of the strtab updating code to
> what is absolutely required.
This is the sort of thing where the algorithm in
arm_smmu_write_entry_step() is perhaps complex, but the ongoing
support of it is not. Just update the used bits function when new STE
features are (rarely) introduced.
> I've left some minor comments on the code below, but I'd really like to
> see this whole thing simplified if possible.
I have no particularly good ideas on this front, sorry. I did try
several other things. Many thoughts were so difficult I couldn't even
write them down correctly :(
> > + */
> > +static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
> > + const __le64 *target,
> > + const __le64 *target_used, __le64 *step,
> > + __le64 v_bit,
> > + unsigned int len)
> > +{
> > + u8 step_used_diff = 0;
> > + u8 step_change = 0;
> > + unsigned int i;
> > +
> > + /*
> > + * Compute a step that has all the bits currently unused by HW set to
> > + * their target values.
> > + */
>
> Well, ok, I do have a cosmetic nit here: using 'step' for both "STE pointer"
> and "incremental change" is perhaps, err, a step too far ;)
Ok.. I'll call it arm_smmu_write_entry_next()
> > + for (i = 0; i != len; i++) {
> > + step[i] = (cur[i] & cur_used[i]) | (target[i] & ~cur_used[i]);
>
> Isn't 'cur[i] & cur_used[i]' always cur[i]?
No. The routine is called iteratively, as the comment explained. The
first iteration may set unused bits to their target values, so we have
to mask them away again here during the second iteration. Later
iterations may change the config, for example, which means we will again
have unused values left over from the prior config.
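To make that concrete, here is a small standalone userspace sketch of the
same step logic over a made-up two-qword entry format. Everything in it
(the layout, the toy "config" encoding, the values) is invented for
illustration and is not the real STE definition; only the decision
structure mirrors the routine above. The interesting point is pass 3 of
the trace, where cur[1] & cur_used[1] differs from cur[1] because the low
byte filled for the old config is no longer used once the config changed:

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LEN      2
#define V_BIT    0x01ULL
#define CFG_MASK 0xF0ULL
#define CFG_A    0x10ULL	/* pretend config A uses the low byte of qword 1 */
#define CFG_B    0x20ULL	/* pretend config B uses the high byte of qword 1 */

/* Toy stand-in for the "used bits" computation */
static void get_used(const uint64_t *ent, uint64_t *used)
{
	memset(used, 0, LEN * sizeof(*used));
	used[0] = V_BIT;
	if (!(ent[0] & V_BIT))
		return;
	used[0] |= CFG_MASK;
	if ((ent[0] & CFG_MASK) == CFG_A)
		used[1] = 0x00FFULL;
	else if ((ent[0] & CFG_MASK) == CFG_B)
		used[1] = 0xFF00ULL;
}

static bool write_entry_step(uint64_t *cur, const uint64_t *cur_used,
			     const uint64_t *target,
			     const uint64_t *target_used)
{
	uint64_t step[LEN];
	unsigned int used_diff = 0, change = 0, i;

	for (i = 0; i != LEN; i++) {
		/* cur[] can hold leftovers from the previous config: mask them */
		step[i] = (cur[i] & cur_used[i]) | (target[i] & ~cur_used[i]);
		if (cur[i] != step[i])
			change |= 1u << i;
		if ((step[i] & target_used[i]) != (target[i] & target_used[i]))
			used_diff |= 1u << i;
	}

	if (__builtin_popcount(used_diff) > 1) {
		step[0] &= ~V_BIT;		/* must break the entry (V=0) */
	} else if (!change && used_diff) {
		i = __builtin_ffs(used_diff) - 1;
		step[i] = target[i];		/* single critical qword */
	} else if (!change) {
		if (!memcmp(cur, target, sizeof(step)))
			return true;		/* converged */
		memcpy(step, target, sizeof(step));
	}

	memcpy(cur, step, sizeof(step));	/* the WRITE_ONCE + sync point */
	return false;
}

int main(void)
{
	uint64_t cur[LEN] = { V_BIT | CFG_A, 0x0022 };	  /* old: config A */
	uint64_t target[LEN] = { V_BIT | CFG_B, 0x3300 }; /* new: config B */
	uint64_t cur_used[LEN], target_used[LEN];
	int pass = 0;

	get_used(target, target_used);
	do {
		get_used(cur, cur_used);
		printf("pass %d: cur = { %#" PRIx64 ", %#" PRIx64 " }\n",
		       ++pass, cur[0], cur[1]);
	} while (!write_entry_step(cur, cur_used, target, target_used));
	printf("done:   cur = { %#" PRIx64 ", %#" PRIx64 " }\n",
	       cur[0], cur[1]);
	return 0;
}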
> > + } else if (!step_change) {
> > + /* cur == target, so all done */
> > + if (memcmp(cur, target, len * sizeof(*cur)) == 0)
> > + return true;
> > +
> > + /*
> > + * All the used HW bits match, but unused bits are different.
> > + * Set them as well. Technically this isn't necessary but it
> > + * brings the entry to the full target state, so if there are
> > + * bugs in the mask calculation this will obscure them.
> > + */
> > + memcpy(step, target, len * sizeof(*step));
>
> Bah, I'm not a huge fan of this sort of defensive programming. I'd prefer
> to propagate the error rather than quietly try to cover it up.
There is no error.
This is adjusting the unused bits that were left over from the prior
config to 0, it is a similar answer to the 'cur[i]' question above.
The defensiveness is only a decision that the installed STE should
fully match the given target STE.
In principle the HW doesn't read unused bits in the target STE so we
could leave them set to 1 instead of fully matching.
"bugs in the mask calculation" means everything from the "HW should
not look at bit X but does" to "the spec was misread and the HW does
look at bit X"
> > +/*
> > + * Based on the value of ent report which bits of the STE the HW will access. It
> > + * would be nice if this was complete according to the spec, but minimally it
> > + * has to capture the bits this driver uses.
> > + */
> > +static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent,
> > + struct arm_smmu_ste *used_bits)
> > +{
> > + memset(used_bits, 0, sizeof(*used_bits));
> > +
> > + used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V);
> > + if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V)))
> > + return;
> > +
> > + /*
> > + * If S1 is enabled S1DSS is valid, see 13.5 Summary of
> > + * attribute/permission configuration fields for the SHCFG behavior.
> > + */
> > + if (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0])) & 1 &&
> > + FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent->data[1])) ==
> > + STRTAB_STE_1_S1DSS_BYPASS)
> > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
> > +
> > + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
> > + switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]))) {
> > + case STRTAB_STE_0_CFG_ABORT:
> > + break;
> > + case STRTAB_STE_0_CFG_BYPASS:
> > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
> > + break;
> > + case STRTAB_STE_0_CFG_S1_TRANS:
> > + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
> > + STRTAB_STE_0_S1CTXPTR_MASK |
> > + STRTAB_STE_0_S1CDMAX);
> > + used_bits->data[1] |=
> > + cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
> > + STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
> > + STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
> > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
> > + break;
> > + case STRTAB_STE_0_CFG_S2_TRANS:
> > + used_bits->data[1] |=
> > + cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG);
> > + used_bits->data[2] |=
> > + cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
> > + STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
> > + STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R);
> > + used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
> > + break;
>
> I think this is going to be a pain to maintain :/
Ah, but not your pain :)
Some day someone adds a new STE bit; they don't understand this, so
they just change one of the make functions.
They test and hit the WARN_ON, which brings them here. Then they
hopefully realize they need to read the spec and understand the new
bit, so they do that and make an attempt at the right conditions.
Reviewer sees the new hunk in this function and double-checks that the
conditions for the new bit are correct. Reviewer needs to consider if
any hitless flows become broken (and I've considered adding an
explicit self test for this)
The hard work of reviewing the spec and deciding how a new STE bit is
processed cannot be skipped, but we can make it much harder for that
skip to go unnoticed. I think the current design makes skipping this
work too easy.
So you are calling it a "pain to maintain"; I'm calling it "enforced
rigor" going forward. Like type safety.
> > +static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
> > + struct arm_smmu_ste *ste,
> > + const struct arm_smmu_ste *target)
> > +{
> > + struct arm_smmu_ste target_used;
> > + int i;
> > +
> > + arm_smmu_get_ste_used(target, &target_used);
> > + /* Masks in arm_smmu_get_ste_used() are up to date */
> > + for (i = 0; i != ARRAY_SIZE(target->data); i++)
> > + WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
>
> That's a runtime cost on every single STE update for what would be a driver
> bug.
See above. STE programming is not critical path, and I think this
series makes it faster overall anyhow.
> > + while (true) {
> > + if (arm_smmu_write_ste_step(ste, target, &target_used))
> > + break;
> > + arm_smmu_sync_ste_for_sid(smmu, sid);
> > + }
>
> This really should be bounded...
Sure, I think the bound is 4, but I will check.
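A sketch of what that could look like in arm_smmu_write_ste(), with the
limit treated as a placeholder until the real bound is confirmed:

	/*
	 * Sketch only: cap the convergence loop rather than looping
	 * forever. The limit of 4 mirrors the estimate above, it is not
	 * a verified worst case.
	 */
	for (i = 0; i != 4; i++) {
		if (arm_smmu_write_ste_step(ste, target, &target_used))
			break;
		arm_smmu_sync_ste_for_sid(smmu, sid);
	}
	WARN_ON_ONCE(i == 4);	/* the step algorithm did not converge */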
Thanks,
Jason
* Re: [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
2023-12-05 19:14 ` [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers Jason Gunthorpe
2023-12-12 16:23 ` Will Deacon
@ 2024-01-29 19:10 ` Mostafa Saleh
2024-01-29 19:49 ` Jason Gunthorpe
1 sibling, 1 reply; 31+ messages in thread
From: Mostafa Saleh @ 2024-01-29 19:10 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Hi Jason,
On Tue, Dec 05, 2023 at 03:14:36PM -0400, Jason Gunthorpe wrote:
> As the comment in arm_smmu_write_strtab_ent() explains, this routine has
> been limited to only work correctly in certain scenarios that the caller
> must ensure. Generally the caller must put the STE into ABORT or BYPASS
> before attempting to program it to something else.
>
> The next patches/series are going to start removing some of this logic
> from the callers, and add more complex state combinations than currently.
>
> Thus, consolidate all the complexity here. Callers do not have to care
> about what STE transition they are doing, this function will handle
> everything optimally.
>
> Revise arm_smmu_write_strtab_ent() so it algorithmically computes the
> required programming sequence to avoid creating an incoherent 'torn' STE
> in the HW caches. The update algorithm follows the same design that the
> driver already uses: it is safe to change bits that HW doesn't currently
> use and then do a single 64 bit update, with sync's in between.
>
> The basic idea is to express in a bitmask what bits the HW is actually
> using based on the V and CFG bits. Based on that mask we know what STE
> changes are safe and which are disruptive. We can count how many 64 bit
> QWORDS need a disruptive update and know if a step with V=0 is required.
>
> This gives two basic flows through the algorithm.
>
> If only a single 64 bit quantity needs disruptive replacement:
> - Write the target value into all currently unused bits
> - Write the single 64 bit quantity
> - Zero the remaining different bits
>
> If multiple 64 bit quantities need disruptive replacement then do:
> - Write V=0 to QWORD 0
> - Write the entire STE except QWORD 0
> - Write QWORD 0
>
> With HW flushes at each step, that can be skipped if the STE didn't change
> in that step.
>
> At this point it generates the same sequence of updates as the current
> code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
> extra sync (this seems to be an existing bug).
>
> Going forward this will use a V=0 transition instead of cycling through
> ABORT if a hitfull change is required. This seems more appropriate as ABORT
> will fail DMAs without any logging, but dropping a DMA due to transient
> V=0 is probably signaling a bug, so the C_BAD_STE is valuable.
Would the driver do anything in that case, or would it just print the log message?
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 272 +++++++++++++++-----
> 1 file changed, 208 insertions(+), 64 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index b120d836681c1c..0934f882b94e94 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -971,6 +971,101 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
> arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
> }
>
> +/*
> + * This algorithm updates any STE/CD to any value without creating a situation
> + * where the HW can perceive a corrupted entry. HW is only required to have a 64
> + * bit atomicity with stores from the CPU, while entries are many 64 bit values
> + * big.
> + *
> + * The algorithm works by evolving the entry toward the target in a series of
> + * steps. Each step synchronizes with the HW so that the HW can not see an entry
> + * torn across two steps. Upon each call cur/cur_used reflect the current
> + * synchronized value seen by the HW.
> + *
> + * During each step the HW can observe a torn entry that has any combination of
> + * the step's old/new 64 bit words. The algorithm objective is for the HW
> + * behavior to always be one of current behavior, V=0, or new behavior, during
> + * each step, and across all steps.
> + *
> + * At each step one of three actions is chosen to evolve cur to target:
> + * - Update all unused bits with their target values.
> + * This relies on the IGNORED behavior described in the specification
> + * - Update a single 64-bit value
> + * - Update all unused bits and set V=0
> + *
> + * The last two actions will cause cur_used to change, which will then allow the
> + * first action on the next step.
> + *
> + * In the most general case we can make any update in three steps:
> + * - Disrupting the entry (V=0)
> + * - Fill now unused bits, all bits except V
> + * - Make valid (V=1), single 64 bit store
> + *
> + * However this disrupts the HW while it is happening. There are several
> + * interesting cases where a STE/CD can be updated without disturbing the HW
> + * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
> + * because the used bits don't intersect. We can detect this by calculating how
> + * many 64 bit values need update after adjusting the unused bits and skip the
> + * V=0 process.
> + */
> +static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
> + const __le64 *target,
> + const __le64 *target_used, __le64 *step,
> + __le64 v_bit,
I think this is confusing here, I believe we have this as an argument as this
function would be used for CD later, however for this series it is unnecessary,
IMHO, this should be removed and added in another patch for the CD rework.
> + unsigned int len)
> +{
> + u8 step_used_diff = 0;
> + u8 step_change = 0;
> + unsigned int i;
> +
> + /*
> + * Compute a step that has all the bits currently unused by HW set to
> + * their target values.
> + */
> + for (i = 0; i != len; i++) {
> + step[i] = (cur[i] & cur_used[i]) | (target[i] & ~cur_used[i]);
> + if (cur[i] != step[i])
> + step_change |= 1 << i;
> + /*
> + * Each bit indicates if the step is incorrect compared to the
> + * target, considering only the used bits in the target
> + */
> + if ((step[i] & target_used[i]) != (target[i] & target_used[i]))
> + step_used_diff |= 1 << i;
> + }
> +
> + if (hweight8(step_used_diff) > 1) {
> + /*
> + * More than 1 qword is mismatched, this cannot be done without
> + * a break. Clear the V bit and go again.
> + */
> + step[0] &= ~v_bit;
> + } else if (!step_change && step_used_diff) {
> + /*
> + * Have exactly one critical qword, all the other qwords are set
> + * correctly, so we can set this qword now.
> + */
> + i = ffs(step_used_diff) - 1;
> + step[i] = target[i];
> + } else if (!step_change) {
> + /* cur == target, so all done */
> + if (memcmp(cur, target, len * sizeof(*cur)) == 0)
> + return true;
> +
> + /*
> + * All the used HW bits match, but unused bits are different.
> + * Set them as well. Technically this isn't necessary but it
> + * brings the entry to the full target state, so if there are
> + * bugs in the mask calculation this will obscure them.
> + */
> + memcpy(step, target, len * sizeof(*step));
> + }
> +
> + for (i = 0; i != len; i++)
> + WRITE_ONCE(cur[i], step[i]);
> + return false;
> +}
> +
> static void arm_smmu_sync_cd(struct arm_smmu_master *master,
> int ssid, bool leaf)
> {
> @@ -1248,37 +1343,115 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
> arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
> }
>
> +/*
> + * Based on the value of ent report which bits of the STE the HW will access. It
> + * would be nice if this was complete according to the spec, but minimally it
> + * has to capture the bits this driver uses.
> + */
> +static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent,
> + struct arm_smmu_ste *used_bits)
> +{
> + memset(used_bits, 0, sizeof(*used_bits));
> +
> + used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V);
> + if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V)))
> + return;
> +
> + /*
> + * If S1 is enabled S1DSS is valid, see 13.5 Summary of
> + * attribute/permission configuration fields for the SHCFG behavior.
> + */
> + if (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0])) & 1 &&
> + FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent->data[1])) ==
> + STRTAB_STE_1_S1DSS_BYPASS)
> + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
> +
> + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
> + switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]))) {
> + case STRTAB_STE_0_CFG_ABORT:
> + break;
> + case STRTAB_STE_0_CFG_BYPASS:
> + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
> + break;
> + case STRTAB_STE_0_CFG_S1_TRANS:
> + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
> + STRTAB_STE_0_S1CTXPTR_MASK |
> + STRTAB_STE_0_S1CDMAX);
> + used_bits->data[1] |=
> + cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
> + STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
> + STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
> + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
> + break;
AFAIU, this is missing something like (while passing smmu->features)
used_bits->data[2] |= features & ARM_SMMU_FEAT_NESTING ?
cpu_to_le64(STRTAB_STE_2_S2VMID) : 0;
As the SMMUv3 manual says:
“ For a Non-secure STE when stage 2 is implemented (SMMU_IDR0.S2P == 1)
translations resulting from a StreamWorld == NS-EL1 configuration are
VMID-tagged with S2VMID when either of stage 1 (Config[0] == 1) or stage 2
(Config[1] == 1) provide translation.“
Which means that in the case of an S1=>S2 switch, or vice versa, this algorithm
will ignore the VMID while it is in use.
> + case STRTAB_STE_0_CFG_S2_TRANS:
> + used_bits->data[1] |=
> + cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG);
> + used_bits->data[2] |=
> + cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
> + STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
> + STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R);
> + used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
> + break;
> +
> + default:
> + memset(used_bits, 0xFF, sizeof(*used_bits));
> + WARN_ON(true);
> + }
> +}
> +
> +static bool arm_smmu_write_ste_step(struct arm_smmu_ste *cur,
> + const struct arm_smmu_ste *target,
> + const struct arm_smmu_ste *target_used)
> +{
> + struct arm_smmu_ste cur_used;
> + struct arm_smmu_ste step;
> +
> + arm_smmu_get_ste_used(cur, &cur_used);
> + return arm_smmu_write_entry_step(cur->data, cur_used.data, target->data,
> + target_used->data, step.data,
> + cpu_to_le64(STRTAB_STE_0_V),
> + ARRAY_SIZE(cur->data));
> +}
> +
> +static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
> + struct arm_smmu_ste *ste,
> + const struct arm_smmu_ste *target)
> +{
> + struct arm_smmu_ste target_used;
> + int i;
> +
> + arm_smmu_get_ste_used(target, &target_used);
> + /* Masks in arm_smmu_get_ste_used() are up to date */
> + for (i = 0; i != ARRAY_SIZE(target->data); i++)
> + WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
In what situation would this be triggered? Is that for future proofing?
Maybe we can move it to arm_smmu_get_ste_used()?
> +
> + while (true) {
> + if (arm_smmu_write_ste_step(ste, target, &target_used))
> + break;
> + arm_smmu_sync_ste_for_sid(smmu, sid);
> + }
> +
> + /* It's likely that we'll want to use the new STE soon */
> + if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) {
> + struct arm_smmu_cmdq_ent
> + prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG,
> + .prefetch = {
> + .sid = sid,
> + } };
> +
> + arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
> + }
> +}
> +
> static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> struct arm_smmu_ste *dst)
> {
> - /*
> - * This is hideously complicated, but we only really care about
> - * three cases at the moment:
> - *
> - * 1. Invalid (all zero) -> bypass/fault (init)
> - * 2. Bypass/fault -> translation/bypass (attach)
> - * 3. Translation/bypass -> bypass/fault (detach)
> - *
> - * Given that we can't update the STE atomically and the SMMU
> - * doesn't read the thing in a defined order, that leaves us
> - * with the following maintenance requirements:
> - *
> - * 1. Update Config, return (init time STEs aren't live)
> - * 2. Write everything apart from dword 0, sync, write dword 0, sync
> - * 3. Update Config, sync
> - */
> - u64 val = le64_to_cpu(dst->data[0]);
> - bool ste_live = false;
> + u64 val;
> struct arm_smmu_device *smmu = master->smmu;
> struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
> struct arm_smmu_s2_cfg *s2_cfg = NULL;
> struct arm_smmu_domain *smmu_domain = master->domain;
> - struct arm_smmu_cmdq_ent prefetch_cmd = {
> - .opcode = CMDQ_OP_PREFETCH_CFG,
> - .prefetch = {
> - .sid = sid,
> - },
> - };
> + struct arm_smmu_ste target = {};
>
> if (smmu_domain) {
> switch (smmu_domain->stage) {
> @@ -1293,22 +1466,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> }
> }
>
> - if (val & STRTAB_STE_0_V) {
> - switch (FIELD_GET(STRTAB_STE_0_CFG, val)) {
> - case STRTAB_STE_0_CFG_BYPASS:
> - break;
> - case STRTAB_STE_0_CFG_S1_TRANS:
> - case STRTAB_STE_0_CFG_S2_TRANS:
> - ste_live = true;
> - break;
> - case STRTAB_STE_0_CFG_ABORT:
> - BUG_ON(!disable_bypass);
> - break;
> - default:
> - BUG(); /* STE corruption */
> - }
> - }
> -
> /* Nuke the existing STE_0 value, as we're going to rewrite it */
> val = STRTAB_STE_0_V;
>
> @@ -1319,16 +1476,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> else
> val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
>
> - dst->data[0] = cpu_to_le64(val);
> - dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> + target.data[0] = cpu_to_le64(val);
> + target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> STRTAB_STE_1_SHCFG_INCOMING));
> - dst->data[2] = 0; /* Nuke the VMID */
> - /*
> - * The SMMU can perform negative caching, so we must sync
> - * the STE regardless of whether the old value was live.
> - */
> - if (smmu)
> - arm_smmu_sync_ste_for_sid(smmu, sid);
> + target.data[2] = 0; /* Nuke the VMID */
> + arm_smmu_write_ste(smmu, sid, dst, &target);
> return;
> }
>
> @@ -1336,8 +1488,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
> STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
>
> - BUG_ON(ste_live);
> - dst->data[1] = cpu_to_le64(
> + target.data[1] = cpu_to_le64(
> FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
> FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> @@ -1346,7 +1497,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>
> if (smmu->features & ARM_SMMU_FEAT_STALLS &&
> !master->stall_enabled)
> - dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
> + target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
>
> val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> @@ -1355,8 +1506,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> }
>
> if (s2_cfg) {
> - BUG_ON(ste_live);
> - dst->data[2] = cpu_to_le64(
> + target.data[2] = cpu_to_le64(
> FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> #ifdef __BIG_ENDIAN
> @@ -1365,23 +1515,17 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
> STRTAB_STE_2_S2R);
>
> - dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> + target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
>
> val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
> }
>
> if (master->ats_enabled)
> - dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> + target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> STRTAB_STE_1_EATS_TRANS));
>
> - arm_smmu_sync_ste_for_sid(smmu, sid);
> - /* See comment in arm_smmu_write_ctx_desc() */
> - WRITE_ONCE(dst->data[0], cpu_to_le64(val));
> - arm_smmu_sync_ste_for_sid(smmu, sid);
> -
> - /* It's likely that we'll want to use the new STE soon */
> - if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH))
> - arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
> + target.data[0] = cpu_to_le64(val);
> + arm_smmu_write_ste(smmu, sid, dst, &target);
> }
>
> static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
> --
> 2.43.0
>
Thanks,
Mostafa
* Re: [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
` (20 preceding siblings ...)
2023-12-11 18:03 ` Jason Gunthorpe
@ 2024-01-29 19:13 ` Mostafa Saleh
2024-01-29 19:42 ` Jason Gunthorpe
21 siblings, 1 reply; 31+ messages in thread
From: Mostafa Saleh @ 2024-01-29 19:13 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
Hi Jason,
On Tue, Dec 05, 2023 at 03:14:32PM -0400, Jason Gunthorpe wrote:
> The SMMUv3 driver was originally written in 2015 when the iommu driver
> facing API looked quite different. The API has evolved, especially lately,
> and the driver has fallen behind.
>
> This work aims to bring make the SMMUv3 driver the best IOMMU driver with
> the most comprehensive implementation of the API. After all parts it
> addresses:
>
> - Global static BLOCKED and IDENTITY domains with 'never fail' attach
> semantics. BLOCKED is desired for efficient VFIO.
>
> - Support map before attach for PAGING iommu_domains.
>
> - attach_dev failure does not change the HW configuration.
>
> - Fully hitless transitions between IDENTITY -> DMA -> IDENTITY.
> The API has IOMMU_RESV_DIRECT which is expected to be
> continuously translating.
>
> - Safe transitions between PAGING -> BLOCKED, do not ever temporarily
> do IDENTITY. This is required for iommufd security.
>
> - Full PASID API support including:
> - S1/SVA domains attached to PASIDs
> - IDENTITY/BLOCKED/S1 attached to RID
> - Change of the RID domain while PASIDs are attached
>
> - Streamlined SVA support using the core infrastructure
>
> - Hitless, whenever possible, change between two domains
>
> - iommufd IOMMU_GET_HW_INFO, IOMMU_HWPT_ALLOC_NEST_PARENT, and
> IOMMU_DOMAIN_NESTED support
>
> Over all these things are going to become more accessible to iommufd, and
> exposed to VMs, so it is important for the driver to have a robust
> implementation of the API.
>
> The work is split into three parts, with this part largely focusing on the
> STE and building up to the BLOCKED & IDENTITY global static domains.
>
> The second part largely focuses on the CD and builds up to having a common
> PASID infrastructure that SVA and S1 domains equally use.
>
> The third part has some random cleanups and the iommufd related parts.
>
> Overall this takes the approach of turning the STE/CD programming upside
> down where the CD/STE value is computed right at a driver callback
> function and then pushed down into programming logic. The programming
> logic hides the details of the required CD/STE tear-less update. This
> makes the CD/STE functions independent of the arm_smmu_domain which makes
> it fairly straightforward to untangle all the different call chains, and
> add news ones.
>
> Further, this frees the arm_smmu_domain related logic from keeping track
> of what state the STE/CD is currently in so it can carefully sequence the
> correct update. There are many new update pairs that are subtly introduced
> as the work progresses.
>
> The locking to support BTM via arm_smmu_asid_lock is a bit subtle right
> now and patches throughout this work adjust and tighten this so that it is
> clearer and doesn't get broken.
>
> Once the lower STE layers no longer need to touch arm_smmu_domain we can
> isolate struct arm_smmu_domain to be only used for PAGING domains, audit
> all the to_smmu_domain() calls to be only in PAGING domain ops, and
> introduce the normal global static BLOCKED/IDENTITY domains using the new
> STE infrastructure. Part 2 will ultimately migrate SVA over to use
> arm_smmu_domain as well.
>
> All parts are on github:
>
> https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
I added some comments/questions for this series, but didn’t review it
thoroughly as I see the code on github is quite different from these patches,
and it seems to be targeted for v4. Do you have any plans to send it soon?
>
> v3:
> - Use some local variables in arm_smmu_get_step_for_sid() for clarity
> - White space and spelling changes
> - Commit message updates
> - Keep master->domain_head initialized to avoid a list_del corruption
> v2: https://lore.kernel.org/r/0-v2-de8b10590bf5+400-smmuv3_newapi_p1_jgg@nvidia.com
> - Rebased on v6.7-rc1
> - Improve the comment for arm_smmu_write_entry_step()
> - Fix the botched memcmp
> - Document the spec justification for the SHCFG exclusion in used
> - Include STRTAB_STE_1_SHCFG for STRTAB_STE_0_CFG_S2_TRANS in used
> - WARN_ON for unknown STEs in used
> - Fix error unwind in arm_smmu_attach_dev()
> - Whitespace, spelling, and checkpatch related items
> v1: https://lore.kernel.org/r/0-v1-e289ca9121be+2be-smmuv3_newapi_p1_jgg@nvidia.com
>
> Jason Gunthorpe (19):
> iommu/arm-smmu-v3: Add a type for the STE
> iommu/arm-smmu-v3: Master cannot be NULL in
> arm_smmu_write_strtab_ent()
> iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
> iommu/arm-smmu-v3: Make STE programming independent of the callers
> iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
> iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste()
> iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into
> functions
> iommu/arm-smmu-v3: Build the whole STE in
> arm_smmu_make_s2_domain_ste()
> iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
> iommu/arm-smmu-v3: Compute the STE only once for each master
> iommu/arm-smmu-v3: Do not change the STE twice during
> arm_smmu_attach_dev()
> iommu/arm-smmu-v3: Put writing the context descriptor in the right
> order
> iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
> iommu/arm-smmu-v3: Remove arm_smmu_master->domain
> iommu/arm-smmu-v3: Add a global static IDENTITY domain
> iommu/arm-smmu-v3: Add a global static BLOCKED domain
> iommu/arm-smmu-v3: Use the identity/blocked domain during release
> iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to
> finalize
> iommu/arm-smmu-v3: Convert to domain_alloc_paging()
>
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 729 +++++++++++++-------
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 12 +-
> 2 files changed, 477 insertions(+), 264 deletions(-)
>
>
> base-commit: ca7fcaff577c92d85f0e05cc7be79759155fe328
> --
> 2.43.0
>
Thanks,
Mostafa
* Re: [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
2024-01-29 19:13 ` Mostafa Saleh
@ 2024-01-29 19:42 ` Jason Gunthorpe
2024-01-29 20:45 ` Mostafa Saleh
0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-29 19:42 UTC (permalink / raw)
To: Mostafa Saleh
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
On Mon, Jan 29, 2024 at 07:13:13PM +0000, Mostafa Saleh wrote:
> > All parts are on github:
> >
> > https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
> I added some comments/questions for this series, but didn’t review it
> thoroughly as I see the code on github is quite different from these patches,
> and it seems to be targeted for v4. Do you have any plans to send it soon?
Part 1 didn't change too much, aside from patch 4, but v4 is already posted:
https://lore.kernel.org/linux-iommu/0-v4-c93b774edcc4+42d2b-smmuv3_newapi_p1_jgg@nvidia.com/
Thanks,
Jason
* Re: [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
2024-01-29 19:10 ` Mostafa Saleh
@ 2024-01-29 19:49 ` Jason Gunthorpe
2024-01-29 20:48 ` Mostafa Saleh
0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2024-01-29 19:49 UTC (permalink / raw)
To: Mostafa Saleh
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
On Mon, Jan 29, 2024 at 07:10:47PM +0000, Mostafa Saleh wrote:
> > Going forward this will use a V=0 transition instead of cycling through
> > ABORT if a hitfull change is required. This seems more appropriate as ABORT
> > will fail DMAs without any logging, but dropping a DMA due to transient
> > V=0 is probably signaling a bug, so the C_BAD_STE is valuable.
> Would the driver do anything in that case, or would it just print the log message?
Just log, AFAIK.
> > +static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
> > + const __le64 *target,
> > + const __le64 *target_used, __le64 *step,
> > + __le64 v_bit,
> I think this is confusing here, I believe we have this as an argument as this
> function would be used for CD later, however for this series it is unnecessary,
> IMHO, this should be removed and added in another patch for the CD rework.
It is a lot of code churn to do that, even more so on the new version.
> > + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
> > + switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]))) {
> > + case STRTAB_STE_0_CFG_ABORT:
> > + break;
> > + case STRTAB_STE_0_CFG_BYPASS:
> > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
> > + break;
> > + case STRTAB_STE_0_CFG_S1_TRANS:
> > + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
> > + STRTAB_STE_0_S1CTXPTR_MASK |
> > + STRTAB_STE_0_S1CDMAX);
> > + used_bits->data[1] |=
> > + cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
> > + STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
> > + STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
> > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
> > + break;
> AFAIU, this is missing something like (while passing smmu->features)
> used_bits->data[2] |= features & ARM_SMMU_FEAT_NESTING ?
> cpu_to_le64(STRTAB_STE_2_S2VMID) : 0;
>
> As the SMMUv3 manual says:
> “ For a Non-secure STE when stage 2 is implemented (SMMU_IDR0.S2P == 1)
> translations resulting from a StreamWorld == NS-EL1 configuration are
> VMID-tagged with S2VMID when either of stage 1 (Config[0] == 1) or stage 2
> (Config[1] == 1) provide translation.“
>
> Which means that in the case of an S1=>S2 switch, or vice versa, this
> algorithm will ignore the VMID while it is in use.
Ah, yes, that is a small miss, thanks. I don't think we need the
features test though, s2vmid doesn't mean something different if the
feature is not present..
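Roughly, the S1_TRANS case in arm_smmu_get_ste_used() would then become
something like the following (no feature test, per the above; a sketch
rather than the final hunk):

	case STRTAB_STE_0_CFG_S1_TRANS:
		used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
						  STRTAB_STE_0_S1CTXPTR_MASK |
						  STRTAB_STE_0_S1CDMAX);
		used_bits->data[1] |=
			cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
				    STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
				    STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW |
				    STRTAB_STE_1_EATS);
		/* NS-EL1 stage 1 translations are VMID-tagged when S2 is implemented */
		used_bits->data[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);
		break;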
> > +static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
> > + struct arm_smmu_ste *ste,
> > + const struct arm_smmu_ste *target)
> > +{
> > + struct arm_smmu_ste target_used;
> > + int i;
> > +
> > + arm_smmu_get_ste_used(target, &target_used);
> > + /* Masks in arm_smmu_get_ste_used() are up to date */
> > + for (i = 0; i != ARRAY_SIZE(target->data); i++)
> > + WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
> In what situation would this be triggered? Is that for future proofing?
> Maybe we can move it to arm_smmu_get_ste_used()?
Yes, prevent people from making an error down the road.
It can't be in ste_used due to how this specific algorithm works
iteratively
And in the v4 version it still wouldn't be a good idea at this point
due to how the series slowly migrates STE and CD programming
over. There are cases where the current STE will not have been written
by this code and may not pass this test.
Thanks,
Jason
* Re: [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
2024-01-29 19:42 ` Jason Gunthorpe
@ 2024-01-29 20:45 ` Mostafa Saleh
0 siblings, 0 replies; 31+ messages in thread
From: Mostafa Saleh @ 2024-01-29 20:45 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
On Mon, Jan 29, 2024 at 03:42:45PM -0400, Jason Gunthorpe wrote:
> On Mon, Jan 29, 2024 at 07:13:13PM +0000, Mostafa Saleh wrote:
>
> > > All parts are on github:
> > >
> > > https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
> > I added some comments/questions for this series, but didn’t review it
> > thoroughly as I see the code on github is quite different from these patches,
> > and it seems to be targeted for v4. Do you have any plans to send it soon?
>
> Part 1 didn't change too much, aside from patch 4, but v4 is already posted:
>
> https://lore.kernel.org/linux-iommu/0-v4-c93b774edcc4+42d2b-smmuv3_newapi_p1_jgg@nvidia.com/
Oh, I missed it, thanks, I will move to reviewing v4.
> Thanks,
> Jason
* Re: [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
2024-01-29 19:49 ` Jason Gunthorpe
@ 2024-01-29 20:48 ` Mostafa Saleh
0 siblings, 0 replies; 31+ messages in thread
From: Mostafa Saleh @ 2024-01-29 20:48 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameer Kolothum
On Mon, Jan 29, 2024 at 03:49:10PM -0400, Jason Gunthorpe wrote:
> On Mon, Jan 29, 2024 at 07:10:47PM +0000, Mostafa Saleh wrote:
>
> > > Going forward this will use a V=0 transition instead of cycling through
> > > ABORT if a hitfull change is required. This seems more appropriate as ABORT
> > > will fail DMAs without any logging, but dropping a DMA due to transient
> > > V=0 is probably signaling a bug, so the C_BAD_STE is valuable.
> > Would the driver do anything in that case, or would it just print the log message?
>
> Just log, AFAIK.
>
> > > +static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
> > > + const __le64 *target,
> > > + const __le64 *target_used, __le64 *step,
> > > + __le64 v_bit,
> > I think this is confusing here, I believe we have this as an argument as this
> > function would be used for CD later, however for this series it is unnecessary,
> > IMHO, this should be removed and added in another patch for the CD rework.
>
> It is a lot of code churn to do that, even more so on the new version.
>
> > > + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
> > > + switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]))) {
> > > + case STRTAB_STE_0_CFG_ABORT:
> > > + break;
> > > + case STRTAB_STE_0_CFG_BYPASS:
> > > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
> > > + break;
> > > + case STRTAB_STE_0_CFG_S1_TRANS:
> > > + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
> > > + STRTAB_STE_0_S1CTXPTR_MASK |
> > > + STRTAB_STE_0_S1CDMAX);
> > > + used_bits->data[1] |=
> > > + cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
> > > + STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
> > > + STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
> > > + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
> > > + break;
> > AFAIU, this is missing something like (while passing smmu->features)
> > used_bits->data[2] |= features & ARM_SMMU_FEAT_NESTING ?
> > cpu_to_le64(STRTAB_STE_2_S2VMID) : 0;
> >
> > As the SMMUv3 manual says:
> > “ For a Non-secure STE when stage 2 is implemented (SMMU_IDR0.S2P == 1)
> > translations resulting from a StreamWorld == NS-EL1 configuration are
> > VMID-tagged with S2VMID when either of stage 1 (Config[0] == 1) or stage 2
> > (Config[1] == 1) provide translation.“
> >
> > Which means that in the case of an S1=>S2 switch, or vice versa, this
> > algorithm will ignore the VMID while it is in use.
Yes, in that case we would consider S2VMID even for stage-1-only instances,
even though it should never change, and in that case the algorithm will have
the same steps. I guess it might still look confusing, but I have no strong
opinion.
>
> Ah, yes, that is a small miss, thanks. I don't think we need the
> features test though, s2vmid doesn't mean something different if the
> feature is not present..
>
> > > +static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
> > > + struct arm_smmu_ste *ste,
> > > + const struct arm_smmu_ste *target)
> > > +{
> > > + struct arm_smmu_ste target_used;
> > > + int i;
> > > +
> > > + arm_smmu_get_ste_used(target, &target_used);
> > > + /* Masks in arm_smmu_get_ste_used() are up to date */
> > > + for (i = 0; i != ARRAY_SIZE(target->data); i++)
> > > + WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
> > In what situation would this be triggered? Is that for future proofing?
> > Maybe we can move it to arm_smmu_get_ste_used()?
>
> Yes, prevent people from making an error down the road.
>
> It can't be in ste_used due to how this specific algorithm works
> iteratively
>
> And in the v4 version it still wouldn't be a good idea at this point
> due to how the series slowly migrates STE and CD programming
> over. There are cases where the current STE will not have been written
> by this code and may not pass this test.
>
> Thanks,
> Jason
Thread overview: 31+ messages (end of thread; newest: 2024-01-29 20:49 UTC)
2023-12-05 19:14 [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 01/19] iommu/arm-smmu-v3: Add a type for the STE Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent() Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers Jason Gunthorpe
2023-12-12 16:23 ` Will Deacon
2023-12-12 18:04 ` Jason Gunthorpe
2024-01-29 19:10 ` Mostafa Saleh
2024-01-29 19:49 ` Jason Gunthorpe
2024-01-29 20:48 ` Mostafa Saleh
2023-12-05 19:14 ` [PATCH v3 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste() Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste() Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev() Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 13/19] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats() Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 17/19] iommu/arm-smmu-v3: Use the identity/blocked domain during release Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize Jason Gunthorpe
2023-12-05 19:14 ` [PATCH v3 19/19] iommu/arm-smmu-v3: Convert to domain_alloc_paging() Jason Gunthorpe
2023-12-06 1:53 ` [PATCH v3 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Moritz Fischer
2023-12-11 18:03 ` Jason Gunthorpe
2023-12-11 18:15 ` Will Deacon
2024-01-29 19:13 ` Mostafa Saleh
2024-01-29 19:42 ` Jason Gunthorpe
2024-01-29 20:45 ` Mostafa Saleh