* [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3)
@ 2024-06-25 12:37 Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva() Jason Gunthorpe
` (14 more replies)
0 siblings, 15 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
Continuing the work of parts 1 and 2a, this part focuses on the PASID and
SVA code, bringing these functional improvements:
- attach_dev failure does not change the HW configuration.
- Full PASID API support including:
- S1/SVA domains attached to PASIDs
- IDENTITY/BLOCKED/S1 attached to RID
- Change of the RID domain while PASIDs are attached
- Streamlined SVA support using the core infrastructure
- Hitless, whenever possible, change between two domains
This requires reorganizing how invalidation is tracked: instead of a
single master, a flexible linked list containing the SSIDs as well as the
ATC information is used.
The revised invalidation infrastructure is used to build generic support
for attaching any sort of domain to any SSID, including the necessary
maintenance of the invalidation list and the global ATS state. This
mechanism is used again in part 3 as part of the nesting series.
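For illustration, the element tracked on that list ends up looking like
this by the end of the series (condensed from the patches below, one entry
per attachment):

    struct arm_smmu_master_domain {
            struct list_head devices_elm;   /* entry on smmu_domain->devices */
            struct arm_smmu_master *master; /* the attached device */
            ioasid_t ssid;                  /* SSID this attachment uses */
    };

Walking smmu_domain->devices then yields every (master, SSID) pair that
needs a TLB/ATC invalidation, rather than assuming a single RID attachment
per master.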
The ATS ordering is generalized and consolidated so that the PASID flow
can use it, and it is put into a form that is hitless whenever possible.
This is necessary because changes to a single PASID or RID cannot change
global state like ATS without impacting other, still attached, PASIDs or
RIDs.
Next simply replace the entire outdated SVA mmu_notifier implementation in
one shot and switch it over to the newly created generic PASID layer and
generic invalidation logic. This avoids the messy and confusing approach
of trying to incrementally untangle this in place. The new code is small
and simple enough that this is much better than trying to work out smaller
steps.
Once SVA is resting on the consolidated PASID layer it is straightforward
to make the PASID interface functionally complete by supporting S1DSS to
allow concurrent change of the RID while a PASID is attached, and to allow
supporting a PASID over an IDENTITY RID.
This continues to support lazy allocation and installation of the CD table
by continuing to use the CFG mode for IDENTITY/BLOCKED while the PASID
table is empty.
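To illustrate the S1DSS mechanism mentioned above (a hypothetical helper
for illustration, not code from the series): with a CD table installed,
S1DSS controls what happens to SSID-less (RID) traffic, which is what lets
an IDENTITY RID coexist with translating PASIDs:

    /*
     * Hypothetical sketch: choose the S1DSS behaviour for RID traffic
     * while the STE points at a CD table. Field values per the SMMUv3
     * spec: BYPASS passes SSID-less traffic through untranslated,
     * SSID0 sends it to the substream 0 CD.
     */
    static u64 rid_s1dss(bool rid_is_identity)
    {
            return FIELD_PREP(STRTAB_STE_1_S1DSS,
                              rid_is_identity ? STRTAB_STE_1_S1DSS_BYPASS :
                                                STRTAB_STE_1_S1DSS_SSID0);
    }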
Finally enable attaching any S1 domain to a PASID, not just SVA.
This series achieves the same goals as the several series from Michael and
the S1DSS series from Nicolin that aimed to improve portions of the API.
This is on github:
https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
v9:
- Clarify comments
- Collect tags
v8: https://lore.kernel.org/r/0-v8-6f85cdc10ce7+563e-smmuv3_newapi_p2b_jgg@nvidia.com
- Rebase on v6.10-rc2
- Make arm_smmu_sva_domain_alloc NULL when SVA is disabled so the core
code sees a NULL function pointer
- Update comments around arm_smmu_attach_prepare()
- Rename struct attach_state -> arm_smmu_attach_state and document
better, include more common function parameters in state
- Consistently use ats_enabled everywhere, replacing disable_ats in state
- Move the note about ATS bypass/abort to arm_smmu_attach_prepare()
- Remove temporary cd_table.in_ste check in arm_smmu_sva_set_dev_pasid()
- Improve comments and clarity of logic in arm_smmu_attach_commit()
- Shorten arm_smmu_mmu_notifier_free()
- Fix domain -> sid_domain typo in arm_smmu_remove_dev_pasid()
v7: https://lore.kernel.org/r/0-v7-9597c885796c+d2-smmuv3_newapi_p2b_jgg@nvidia.com
- Second half of the split series
- Rebase on Joerg's latest
- Accommodate ARM_SMMU_FEAT_ATTR_TYPES_OVR for the S1DSS code
- Include the S1DSS kunit tests
- Include hunks to adjust the unit tests to API changes from this series
- Move 3 BTM related patches out of this series, they can go in the BTM
enablement series.
- Move the domain_alloc_sva() conversion to the first patch, and rebase
on the accepted core code change
- Use the new core APIs for the PASID ops
- Revise commit messages
v6: https://lore.kernel.org/r/0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com
Jason Gunthorpe (14):
iommu/arm-smmu-v3: Convert to domain_alloc_sva()
iommu/arm-smmu-v3: Start building a generic PASID layer
iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
iommu/arm-smmu-v3: Make changing domains be hitless for ATS
iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches
iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*()
interface
iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is
used
iommu/arm-smmu-v3: Test the STE S1DSS functionality
iommu/arm-smmu-v3: Allow a PASID to be set when RID is
IDENTITY/BLOCKED
iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 431 +++-----------
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 116 +++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 560 ++++++++++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 52 +-
4 files changed, 663 insertions(+), 496 deletions(-)
base-commit: c3f38fa61af77b49866b006939479069cd451173
--
2.45.2
* [PATCH v9 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva()
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer Jason Gunthorpe
` (13 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
This allows the driver to receive the mm and always a device during
allocation. Later patches need this to properly set up the notifier when
the domain is first allocated.
Remove ops->domain_alloc() as SVA was the only remaining purpose.
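For context, the core code path this plugs into looks roughly like the
following (simplified from the accepted core change adding the
domain_alloc_sva() op, not verbatim):

    static struct iommu_domain *
    iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm)
    {
            const struct iommu_ops *ops = dev_iommu_ops(dev);
            struct iommu_domain *domain;

            if (ops->domain_alloc_sva) {
                    domain = ops->domain_alloc_sva(dev, mm);
                    if (IS_ERR(domain))
                            return domain;
            } else {
                    domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
                    if (!domain)
                            return ERR_PTR(-ENOMEM);
            }

            domain->type = IOMMU_DOMAIN_SVA;
            domain->mm = mm;
            return domain;
    }

The driver op now sees both the device and the mm at allocation time,
which is what the later notifier setup needs.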
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 6 ++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 +---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 +++-----
3 files changed, 8 insertions(+), 16 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index e490ffb3801545..28f8bf4327f69a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -656,13 +656,15 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
.free = arm_smmu_sva_domain_free
};
-struct iommu_domain *arm_smmu_sva_domain_alloc(void)
+struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
+ struct mm_struct *mm)
{
struct iommu_domain *domain;
domain = kzalloc(sizeof(*domain), GFP_KERNEL);
if (!domain)
- return NULL;
+ return ERR_PTR(-ENOMEM);
+ domain->type = IOMMU_DOMAIN_SVA;
domain->ops = &arm_smmu_sva_domain_ops;
return domain;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ab415e107054c1..bd79422f7b6f50 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2237,14 +2237,6 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
}
}
-static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
-{
-
- if (type == IOMMU_DOMAIN_SVA)
- return arm_smmu_sva_domain_alloc();
- return ERR_PTR(-EOPNOTSUPP);
-}
-
static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
{
struct arm_smmu_domain *smmu_domain;
@@ -3097,8 +3089,8 @@ static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
.capable = arm_smmu_capable,
- .domain_alloc = arm_smmu_domain_alloc,
.domain_alloc_paging = arm_smmu_domain_alloc_paging,
+ .domain_alloc_sva = arm_smmu_sva_domain_alloc,
.probe_device = arm_smmu_probe_device,
.release_device = arm_smmu_release_device,
.device_group = arm_smmu_device_group,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1242a086c9f948..b10712d3de66a9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -802,7 +802,8 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
void arm_smmu_sva_notifier_synchronize(void);
-struct iommu_domain *arm_smmu_sva_domain_alloc(void);
+struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
+ struct mm_struct *mm);
void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id);
#else /* CONFIG_ARM_SMMU_V3_SVA */
@@ -838,10 +839,7 @@ static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master
static inline void arm_smmu_sva_notifier_synchronize(void) {}
-static inline struct iommu_domain *arm_smmu_sva_domain_alloc(void)
-{
- return NULL;
-}
+#define arm_smmu_sva_domain_alloc NULL
static inline void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct device *dev,
--
2.45.2
* [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva() Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-07-02 14:57 ` Will Deacon
2024-06-25 12:37 ` [PATCH v9 03/14] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list Jason Gunthorpe
` (12 subsequent siblings)
14 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
Add arm_smmu_set_pasid()/arm_smmu_remove_pasid() which are to be used by
callers that already constructed the arm_smmu_cd they wish to program.
These functions will encapsulate the shared logic to set up a CD entry
that will be shared by SVA and S1 domain cases.
Prior fixes had already moved most of this logic up into
__arm_smmu_sva_bind(); move it to its final home.
Following patches will relieve some of the remaining SVA restrictions:
- The RID domain is an S1 domain and has already set up the STE to point
to the CD table
- The programmed PASID is the mm_get_enqcmd_pasid()
- Nothing changes while SVA is running (sva_enable)
SVA invalidation will still iterate over the S1 domain's master list;
later patches will resolve that.
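For reference, the caller pattern these helpers establish (condensed from
the SVA hunks in this patch):

    struct arm_smmu_cd target;
    int ret;

    /* The caller builds whatever CD it wants programmed... */
    arm_smmu_make_sva_cd(&target, master, mm, asid);
    /* ...and the PASID layer owns allocating and writing the entry */
    ret = arm_smmu_set_pasid(master, NULL, pasid, &target);
    if (ret)
            goto err_unwind;        /* nothing was programmed to HW */

    /* Teardown mirrors it */
    arm_smmu_remove_pasid(master, smmu_domain, pasid);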
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 57 ++++++++++---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 ++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 ++-
3 files changed, 67 insertions(+), 31 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 28f8bf4327f69a..71ca87c2c5c3b6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -417,29 +417,27 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
arm_smmu_free_shared_cd(cd);
}
-static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
- struct mm_struct *mm)
+static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev,
+ struct mm_struct *mm)
{
int ret;
- struct arm_smmu_cd target;
- struct arm_smmu_cd *cdptr;
struct arm_smmu_bond *bond;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct arm_smmu_domain *smmu_domain;
if (!(domain->type & __IOMMU_DOMAIN_PAGING))
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
smmu_domain = to_smmu_domain(domain);
if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
if (!master || !master->sva_enabled)
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
bond = kzalloc(sizeof(*bond), GFP_KERNEL);
if (!bond)
- return -ENOMEM;
+ return ERR_PTR(-ENOMEM);
bond->mm = mm;
@@ -449,22 +447,12 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
goto err_free_bond;
}
- cdptr = arm_smmu_alloc_cd_ptr(master, mm_get_enqcmd_pasid(mm));
- if (!cdptr) {
- ret = -ENOMEM;
- goto err_put_notifier;
- }
- arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- arm_smmu_write_cd_entry(master, pasid, cdptr, &target);
-
list_add(&bond->list, &master->bonds);
- return 0;
+ return bond;
-err_put_notifier:
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
err_free_bond:
kfree(bond);
- return ret;
+ return ERR_PTR(ret);
}
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
@@ -611,10 +599,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct arm_smmu_bond *bond = NULL, *t;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
+
mutex_lock(&sva_lock);
-
- arm_smmu_clear_cd(master, id);
-
list_for_each_entry(t, &master->bonds, list) {
if (t->mm == mm) {
bond = t;
@@ -633,17 +620,33 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id)
{
- int ret = 0;
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct mm_struct *mm = domain->mm;
+ struct arm_smmu_bond *bond;
+ struct arm_smmu_cd target;
+ int ret;
if (mm_get_enqcmd_pasid(mm) != id)
return -EINVAL;
mutex_lock(&sva_lock);
- ret = __arm_smmu_sva_bind(dev, id, mm);
- mutex_unlock(&sva_lock);
+ bond = __arm_smmu_sva_bind(dev, mm);
+ if (IS_ERR(bond)) {
+ mutex_unlock(&sva_lock);
+ return PTR_ERR(bond);
+ }
- return ret;
+ arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
+ ret = arm_smmu_set_pasid(master, NULL, id, &target);
+ if (ret) {
+ list_del(&bond->list);
+ arm_smmu_mmu_notifier_put(bond->smmu_mn);
+ kfree(bond);
+ mutex_unlock(&sva_lock);
+ return ret;
+ }
+ mutex_unlock(&sva_lock);
+ return 0;
}
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bd79422f7b6f50..3f19436fe86a37 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1211,8 +1211,8 @@ struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES];
}
-struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
- u32 ssid)
+static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -2412,6 +2412,10 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
int i, j;
struct arm_smmu_device *smmu = master->smmu;
+ master->cd_table.in_ste =
+ FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(target->data[0])) ==
+ STRTAB_STE_0_CFG_S1_TRANS;
+
for (i = 0; i < master->num_streams; ++i) {
u32 sid = master->streams[i].id;
struct arm_smmu_ste *step =
@@ -2632,6 +2636,30 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+int arm_smmu_set_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
+ const struct arm_smmu_cd *cd)
+{
+ struct arm_smmu_cd *cdptr;
+
+ /* The core code validates pasid */
+
+ if (!master->cd_table.in_ste)
+ return -ENODEV;
+
+ cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
+ if (!cdptr)
+ return -ENOMEM;
+ arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
+ return 0;
+}
+
+void arm_smmu_remove_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
+{
+ arm_smmu_clear_cd(master, pasid);
+}
+
static int arm_smmu_attach_dev_ste(struct device *dev,
struct arm_smmu_ste *ste)
{
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index b10712d3de66a9..6a74d3d884fe8d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -602,6 +602,7 @@ struct arm_smmu_ctx_desc_cfg {
dma_addr_t cdtab_dma;
struct arm_smmu_l1_ctx_desc *l1_desc;
unsigned int num_l1_ents;
+ u8 in_ste;
u8 s1fmt;
/* log2 of the maximum number of CDs supported by this table */
u8 s1cdmax;
@@ -777,8 +778,6 @@ extern struct mutex arm_smmu_asid_lock;
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
-struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
- u32 ssid);
void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain);
@@ -786,6 +785,12 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target);
+int arm_smmu_set_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
+ const struct arm_smmu_cd *cd);
+void arm_smmu_remove_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid);
+
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
--
2.45.2
* [PATCH v9 03/14] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva() Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS Jason Gunthorpe
` (11 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
The next patch will need to store the same master twice (with different
SSIDs), so allocate memory for each list element.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 11 ++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 39 ++++++++++++++++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +++-
3 files changed, 47 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 71ca87c2c5c3b6..cb3a0e4143c84a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -38,12 +38,13 @@ static DEFINE_MUTEX(sva_lock);
static void
arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
{
- struct arm_smmu_master *master;
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_cd target_cd;
unsigned long flags;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd *cdptr;
/* S1 domains only support RID attachment right now */
@@ -301,7 +302,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
- struct arm_smmu_master *master;
+ struct arm_smmu_master_domain *master_domain;
unsigned long flags;
mutex_lock(&sva_lock);
@@ -315,7 +316,9 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
* but disable translation.
*/
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd target;
struct arm_smmu_cd *cdptr;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 3f19436fe86a37..532fe17f28bfe5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2015,10 +2015,10 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size)
{
+ struct arm_smmu_master_domain *master_domain;
int i;
unsigned long flags;
struct arm_smmu_cmdq_ent cmd;
- struct arm_smmu_master *master;
struct arm_smmu_cmdq_batch cmds;
if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
@@ -2046,7 +2046,10 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
cmds.num = 0;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
+
if (!master->ats_enabled)
continue;
@@ -2534,9 +2537,26 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
pci_disable_pasid(pdev);
}
+static struct arm_smmu_master_domain *
+arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_master *master)
+{
+ struct arm_smmu_master_domain *master_domain;
+
+ lockdep_assert_held(&smmu_domain->devices_lock);
+
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ if (master_domain->master == master)
+ return master_domain;
+ }
+ return NULL;
+}
+
static void arm_smmu_detach_dev(struct arm_smmu_master *master)
{
struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_domain *smmu_domain;
unsigned long flags;
@@ -2547,7 +2567,11 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
arm_smmu_disable_ats(master, smmu_domain);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_del_init(&master->domain_head);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master);
+ if (master_domain) {
+ list_del(&master_domain->devices_elm);
+ kfree(master_domain);
+ }
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
master->ats_enabled = false;
@@ -2561,6 +2585,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
@@ -2597,6 +2622,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return -ENOMEM;
}
+ master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
+ if (!master_domain)
+ return -ENOMEM;
+ master_domain->master = master;
+
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
* of either the old or new domain while we are working on it.
@@ -2610,7 +2640,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
master->ats_enabled = arm_smmu_ats_supported(master);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_add(&master->domain_head, &smmu_domain->devices);
+ list_add(&master_domain->devices_elm, &smmu_domain->devices);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
switch (smmu_domain->stage) {
@@ -2925,7 +2955,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
master->dev = dev;
master->smmu = smmu;
INIT_LIST_HEAD(&master->bonds);
- INIT_LIST_HEAD(&master->domain_head);
dev_iommu_priv_set(dev, master);
ret = arm_smmu_insert_master(smmu, master);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 6a74d3d884fe8d..01769b5286a83a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -697,7 +697,6 @@ struct arm_smmu_stream {
struct arm_smmu_master {
struct arm_smmu_device *smmu;
struct device *dev;
- struct list_head domain_head;
struct arm_smmu_stream *streams;
/* Locked by the iommu core using the group mutex */
struct arm_smmu_ctx_desc_cfg cd_table;
@@ -731,6 +730,7 @@ struct arm_smmu_domain {
struct iommu_domain domain;
+ /* List of struct arm_smmu_master_domain */
struct list_head devices;
spinlock_t devices_lock;
@@ -767,6 +767,11 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
u16 asid);
#endif
+struct arm_smmu_master_domain {
+ struct list_head devices_elm;
+ struct arm_smmu_master *master;
+};
+
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
{
return container_of(dom, struct arm_smmu_domain, domain);
--
2.45.2
* [PATCH v9 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (2 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 03/14] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 05/14] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain Jason Gunthorpe
` (10 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
The core code allows the domain to be changed on the fly without a forced
stop in BLOCKED/IDENTITY. In this flow the driver should simply keep ATS
running, unchanged, while the STE is updated.
ATS relies on a linked list, smmu_domain->devices, to keep track of which
masters have the domain programmed, but this list is also used by
arm_smmu_share_asid(), unrelated to ATS.
Create two new functions to encapsulate this combined logic:
arm_smmu_attach_prepare()
<caller generates and sets the STE>
arm_smmu_attach_commit()
The two functions sequence both enabling and disabling of ATS across the
STE store. Have every update of the STE use this sequence.
Installing an S1/S2 domain always enables ATS if the PCIe device supports
it.
The enable flow is now ordered differently to allow it to be hitless:
1) Add the master to the new smmu_domain->devices list
2) Program the STE
3) Enable ATS at PCIe
4) Remove the master from the old smmu_domain
This flow ensures that invalidations to either domain will generate an ATC
invalidation to the device while the STE is being switched. Thus we don't
need to turn off the ATS anymore for correctness.
The disable flow is the reverse:
1) Disable ATS at PCIe
2) Program the STE
3) Invalidate the ATC
4) Remove the master from the old smmu_domain
Move the nr_ats_masters adjustments to be close to the list
manipulations. It is a count of the number of ATS-enabled masters
currently in the list. It is adjusted strictly before and after the STE/CD
are revised, and done under the list's spin_lock.
This is part of the bigger picture to allow changing the RID domain while
a PASID is in use. If an SVA PASID is relying on ATS to function then
changing the RID domain cannot just temporarily toggle ATS off without
also wrecking the SVA PASID. The new infrastructure here is organized so
that the PASID attach/detach flows will make use of it as well in
following patches.
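Condensed from the patch, every attach path now has this shape:

    struct arm_smmu_attach_state state = {
            .master = master,
            .old_domain = iommu_get_domain_for_dev(dev),
    };
    int ret;

    mutex_lock(&arm_smmu_asid_lock);
    ret = arm_smmu_attach_prepare(&state, new_domain);
    if (ret) {
            mutex_unlock(&arm_smmu_asid_lock);
            return ret;
    }
    /* Build the STE with STE.EATS taken from state.ats_enabled ... */
    arm_smmu_install_ste_for_dev(master, &target);
    arm_smmu_attach_commit(&state);
    mutex_unlock(&arm_smmu_asid_lock);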
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 5 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 237 +++++++++++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 +-
3 files changed, 177 insertions(+), 71 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index 315e487fd990eb..a460b71f585789 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -164,7 +164,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
.smmu = &smmu,
};
- arm_smmu_make_cdtable_ste(ste, &master);
+ arm_smmu_make_cdtable_ste(ste, &master, true);
}
static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test)
@@ -231,7 +231,6 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
{
struct arm_smmu_master master = {
.smmu = &smmu,
- .ats_enabled = ats_enabled,
};
struct io_pgtable io_pgtable = {};
struct arm_smmu_domain smmu_domain = {
@@ -247,7 +246,7 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.sl = 3;
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.tsz = 4;
- arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain);
+ arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain, ats_enabled);
}
static void arm_smmu_v3_write_ste_test_s2_to_abort(struct kunit *test)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 532fe17f28bfe5..d33d97496a03fb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1538,7 +1538,7 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_bypass_ste);
VISIBLE_IF_KUNIT
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master)
+ struct arm_smmu_master *master, bool ats_enabled)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -1561,7 +1561,7 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
STRTAB_STE_1_S1STALLD :
0) |
FIELD_PREP(STRTAB_STE_1_EATS,
- master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
if (smmu->features & ARM_SMMU_FEAT_E2H) {
/*
@@ -1591,7 +1591,8 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_cdtable_ste);
VISIBLE_IF_KUNIT
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
+ struct arm_smmu_domain *smmu_domain,
+ bool ats_enabled)
{
struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
const struct io_pgtable_cfg *pgtbl_cfg =
@@ -1608,7 +1609,7 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
target->data[1] = cpu_to_le64(
FIELD_PREP(STRTAB_STE_1_EATS,
- master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
if (smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR)
target->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
@@ -2450,22 +2451,16 @@ static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
return dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev));
}
-static void arm_smmu_enable_ats(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
+static void arm_smmu_enable_ats(struct arm_smmu_master *master)
{
size_t stu;
struct pci_dev *pdev;
struct arm_smmu_device *smmu = master->smmu;
- /* Don't enable ATS at the endpoint if it's not enabled in the STE */
- if (!master->ats_enabled)
- return;
-
/* Smallest Translation Unit: log2 of the smallest supported granule */
stu = __ffs(smmu->pgsize_bitmap);
pdev = to_pci_dev(master->dev);
- atomic_inc(&smmu_domain->nr_ats_masters);
/*
* ATC invalidation of PASID 0 causes the entire ATC to be flushed.
*/
@@ -2474,22 +2469,6 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master,
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
-static void arm_smmu_disable_ats(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
-{
- if (!master->ats_enabled)
- return;
-
- pci_disable_ats(to_pci_dev(master->dev));
- /*
- * Ensure ATS is disabled at the endpoint before we issue the
- * ATC invalidation via the SMMU.
- */
- wmb();
- arm_smmu_atc_inv_master(master);
- atomic_dec(&smmu_domain->nr_ats_masters);
-}
-
static int arm_smmu_enable_pasid(struct arm_smmu_master *master)
{
int ret;
@@ -2553,46 +2532,181 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
return NULL;
}
-static void arm_smmu_detach_dev(struct arm_smmu_master *master)
+/*
+ * If the domain uses the smmu_domain->devices list return the arm_smmu_domain
+ * structure, otherwise NULL. These domains track attached devices so they can
+ * issue invalidations.
+ */
+static struct arm_smmu_domain *
+to_smmu_domain_devices(struct iommu_domain *domain)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
+ /* The domain can be NULL only when processing the first attach */
+ if (!domain)
+ return NULL;
+ if (domain->type & __IOMMU_DOMAIN_PAGING)
+ return to_smmu_domain(domain);
+ return NULL;
+}
+
+static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
+ struct iommu_domain *domain)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
struct arm_smmu_master_domain *master_domain;
- struct arm_smmu_domain *smmu_domain;
unsigned long flags;
- if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
+ if (!smmu_domain)
return;
- smmu_domain = to_smmu_domain(domain);
- arm_smmu_disable_ats(master, smmu_domain);
-
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
master_domain = arm_smmu_find_master_domain(smmu_domain, master);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
+ if (master->ats_enabled)
+ atomic_dec(&smmu_domain->nr_ats_masters);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+}
- master->ats_enabled = false;
+struct arm_smmu_attach_state {
+ /* Inputs */
+ struct iommu_domain *old_domain;
+ struct arm_smmu_master *master;
+ /* Resulting state */
+ bool ats_enabled;
+};
+
+/*
+ * Start the sequence to attach a domain to a master. The sequence contains three
+ * steps:
+ * arm_smmu_attach_prepare()
+ * arm_smmu_install_ste_for_dev()
+ * arm_smmu_attach_commit()
+ *
+ * If prepare succeeds then the sequence must be completed. The STE installed
+ * must set the STE.EATS field according to state.ats_enabled.
+ *
+ * If the device supports ATS then this determines if EATS should be enabled
+ * in the STE, and starts sequencing EATS disable if required.
+ *
+ * The change of the EATS in the STE and the PCI ATS config space is managed by
+ * this sequence to be in the right order so that if PCI ATS is enabled then
+ * STE.EATS is enabled.
+ *
+ * new_domain can be a non-paging domain. In this case ATS will not be enabled,
+ * and invalidations won't be tracked.
+ */
+static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
+ struct iommu_domain *new_domain)
+{
+ struct arm_smmu_master *master = state->master;
+ struct arm_smmu_master_domain *master_domain;
+ struct arm_smmu_domain *smmu_domain =
+ to_smmu_domain_devices(new_domain);
+ unsigned long flags;
+
+ /*
+ * arm_smmu_share_asid() must not see two domains pointing to the same
+ * arm_smmu_master_domain contents otherwise it could randomly write one
+ * or the other to the CD.
+ */
+ lockdep_assert_held(&arm_smmu_asid_lock);
+
+ if (smmu_domain) {
+ /*
+ * The SMMU does not support enabling ATS with bypass/abort.
+ * When the STE is in bypass (STE.Config[2:0] == 0b100), ATS
+ * Translation Requests and Translated transactions are denied
+ * as though ATS is disabled for the stream (STE.EATS == 0b00),
+ * causing F_BAD_ATS_TREQ and F_TRANSL_FORBIDDEN events
+ * (IHI0070Ea 5.2 Stream Table Entry). Thus ATS can only be
+ * enabled if we have arm_smmu_domain, those always have page
+ * tables.
+ */
+ state->ats_enabled = arm_smmu_ats_supported(master);
+
+ master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
+ if (!master_domain)
+ return -ENOMEM;
+ master_domain->master = master;
+
+ /*
+ * During prepare we want the current smmu_domain and new
+ * smmu_domain to be in the devices list before we change any
+ * HW. This ensures that both domains will send ATS
+ * invalidations to the master until we are done.
+ *
+ * It is tempting to make this list only track masters that are
+ * using ATS, but arm_smmu_share_asid() also uses this to change
+ * the ASID of a domain, unrelated to ATS.
+ *
+ * Notice if we are re-attaching the same domain then the list
+ * will have two identical entries and commit will remove only
+ * one of them.
+ */
+ spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+ if (state->ats_enabled)
+ atomic_inc(&smmu_domain->nr_ats_masters);
+ list_add(&master_domain->devices_elm, &smmu_domain->devices);
+ spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+ }
+
+ if (!state->ats_enabled && master->ats_enabled) {
+ pci_disable_ats(to_pci_dev(master->dev));
+ /*
+ * This is probably overkill, but the config write for disabling
+ * ATS should complete before the STE is configured to generate
+ * UR to avoid AER noise.
+ */
+ wmb();
+ }
+ return 0;
+}
+
+/*
+ * Commit is done after the STE/CD are configured with the EATS setting. It
+ * completes synchronizing the PCI device's ATC and finishes manipulating the
+ * smmu_domain->devices list.
+ */
+static void arm_smmu_attach_commit(struct arm_smmu_attach_state *state)
+{
+ struct arm_smmu_master *master = state->master;
+
+ lockdep_assert_held(&arm_smmu_asid_lock);
+
+ if (state->ats_enabled && !master->ats_enabled) {
+ arm_smmu_enable_ats(master);
+ } else if (master->ats_enabled) {
+ /*
+ * The translation has changed, flush the ATC. At this point the
+ * SMMU is translating for the new domain and both the old&new
+ * domain will issue invalidations.
+ */
+ arm_smmu_atc_inv_master(master);
+ }
+ master->ats_enabled = state->ats_enabled;
+
+ arm_smmu_remove_master_domain(master, state->old_domain);
}
static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
{
int ret = 0;
- unsigned long flags;
struct arm_smmu_ste target;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
- struct arm_smmu_master_domain *master_domain;
+ struct arm_smmu_attach_state state = {
+ .old_domain = iommu_get_domain_for_dev(dev),
+ };
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
if (!fwspec)
return -ENOENT;
- master = dev_iommu_priv_get(dev);
+ state.master = master = dev_iommu_priv_get(dev);
smmu = master->smmu;
/*
@@ -2622,11 +2736,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return -ENOMEM;
}
- master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
- if (!master_domain)
- return -ENOMEM;
- master_domain->master = master;
-
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
* of either the old or new domain while we are working on it.
@@ -2635,13 +2744,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
*/
mutex_lock(&arm_smmu_asid_lock);
- arm_smmu_detach_dev(master);
-
- master->ats_enabled = arm_smmu_ats_supported(master);
-
- spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_add(&master_domain->devices_elm, &smmu_domain->devices);
- spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+ ret = arm_smmu_attach_prepare(&state, domain);
+ if (ret) {
+ mutex_unlock(&arm_smmu_asid_lock);
+ return ret;
+ }
switch (smmu_domain->stage) {
case ARM_SMMU_DOMAIN_S1: {
@@ -2650,18 +2757,19 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
&target_cd);
- arm_smmu_make_cdtable_ste(&target, master);
+ arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled);
arm_smmu_install_ste_for_dev(master, &target);
break;
}
case ARM_SMMU_DOMAIN_S2:
- arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
+ arm_smmu_make_s2_domain_ste(&target, master, smmu_domain,
+ state.ats_enabled);
arm_smmu_install_ste_for_dev(master, &target);
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
break;
}
- arm_smmu_enable_ats(master, smmu_domain);
+ arm_smmu_attach_commit(&state);
mutex_unlock(&arm_smmu_asid_lock);
return 0;
}
@@ -2690,10 +2798,14 @@ void arm_smmu_remove_pasid(struct arm_smmu_master *master,
arm_smmu_clear_cd(master, pasid);
}
-static int arm_smmu_attach_dev_ste(struct device *dev,
- struct arm_smmu_ste *ste)
+static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
+ struct device *dev, struct arm_smmu_ste *ste)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_attach_state state = {
+ .master = master,
+ .old_domain = iommu_get_domain_for_dev(dev),
+ };
if (arm_smmu_master_sva_enabled(master))
return -EBUSY;
@@ -2704,16 +2816,9 @@ static int arm_smmu_attach_dev_ste(struct device *dev,
*/
mutex_lock(&arm_smmu_asid_lock);
- /*
- * The SMMU does not support enabling ATS with bypass/abort. When the
- * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
- * and Translated transactions are denied as though ATS is disabled for
- * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
- * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
- */
- arm_smmu_detach_dev(master);
-
+ arm_smmu_attach_prepare(&state, domain);
arm_smmu_install_ste_for_dev(master, ste);
+ arm_smmu_attach_commit(&state);
mutex_unlock(&arm_smmu_asid_lock);
/*
@@ -2732,7 +2837,7 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
arm_smmu_make_bypass_ste(master->smmu, &ste);
- return arm_smmu_attach_dev_ste(dev, &ste);
+ return arm_smmu_attach_dev_ste(domain, dev, &ste);
}
static const struct iommu_domain_ops arm_smmu_identity_ops = {
@@ -2750,7 +2855,7 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
struct arm_smmu_ste ste;
arm_smmu_make_abort_ste(&ste);
- return arm_smmu_attach_dev_ste(dev, &ste);
+ return arm_smmu_attach_dev_ste(domain, dev, &ste);
}
static const struct iommu_domain_ops arm_smmu_blocked_ops = {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 01769b5286a83a..f9b4bfb2e6b723 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -758,10 +758,12 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target);
void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu,
struct arm_smmu_ste *target);
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master);
+ struct arm_smmu_master *master,
+ bool ats_enabled);
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain);
+ struct arm_smmu_domain *smmu_domain,
+ bool ats_enabled);
void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master, struct mm_struct *mm,
u16 asid);
--
2.45.2
* [PATCH v9 05/14] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (3 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 06/14] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches Jason Gunthorpe
` (9 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
Prepare to allow an S1 domain to be attached to a PASID as well. Keep track
of the SSID the domain is using on each master in the
arm_smmu_master_domain.
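The effect on the invalidation path, condensed from the hunks below: a
domain-wide invalidation now uses the SSID recorded in each attachment
instead of assuming the RID (PASID 0):

    list_for_each_entry(master_domain, &smmu_domain->devices,
                        devices_elm) {
            struct arm_smmu_master *master = master_domain->master;

            if (!master->ats_enabled)
                    continue;
            /* A non-zero ssid means SVA is co-opting the S1 domain */
            arm_smmu_atc_inv_to_cmd(ssid != IOMMU_NO_PASID ?
                                            ssid : master_domain->ssid,
                                    iova, size, &cmd);
            /* ...then one ATC_INV per stream ID behind this master */
    }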
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 15 ++++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 42 +++++++++++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++-
3 files changed, 43 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index cb3a0e4143c84a..d31caceb584984 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -47,13 +47,12 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd *cdptr;
- /* S1 domains only support RID attachment right now */
- cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID);
+ cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
- arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
+ arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target_cd);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
@@ -294,8 +293,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
smmu_domain);
}
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), start,
- size);
+ arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start,
+ size);
}
static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
@@ -332,7 +331,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
+ arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
smmu_mn->cleared = true;
mutex_unlock(&sva_lock);
@@ -411,8 +410,8 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
*/
if (!smmu_mn->cleared) {
arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0,
- 0);
+ arm_smmu_atc_inv_domain_sva(smmu_domain,
+ mm_get_enqcmd_pasid(mm), 0, 0);
}
/* Frees smmu_mn */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d33d97496a03fb..f563f2ee6fd76d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2013,8 +2013,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
}
-int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
- unsigned long iova, size_t size)
+static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size)
{
struct arm_smmu_master_domain *master_domain;
int i;
@@ -2042,8 +2042,6 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
if (!atomic_read(&smmu_domain->nr_ats_masters))
return 0;
- arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
-
cmds.num = 0;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
@@ -2054,6 +2052,16 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
if (!master->ats_enabled)
continue;
+ /*
+ * Non-zero ssid means SVA is co-opting the S1 domain to issue
+ * invalidations for SVA PASIDs.
+ */
+ if (ssid != IOMMU_NO_PASID)
+ arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
+ else
+ arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
+ &cmd);
+
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd);
@@ -2064,6 +2072,19 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
}
+static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size)
+{
+ return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova,
+ size);
+}
+
+int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size)
+{
+ return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size);
+}
+
/* IO_PGTABLE API */
static void arm_smmu_tlb_inv_context(void *cookie)
{
@@ -2085,7 +2106,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
- arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, 0, 0);
+ arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
@@ -2183,7 +2204,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
* Unfortunately, this can't be leaf-only since we may have
* zapped an entire table.
*/
- arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova, size);
+ arm_smmu_atc_inv_domain(smmu_domain, iova, size);
}
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
@@ -2518,7 +2539,8 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
static struct arm_smmu_master_domain *
arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
- struct arm_smmu_master *master)
+ struct arm_smmu_master *master,
+ ioasid_t ssid)
{
struct arm_smmu_master_domain *master_domain;
@@ -2526,7 +2548,8 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
list_for_each_entry(master_domain, &smmu_domain->devices,
devices_elm) {
- if (master_domain->master == master)
+ if (master_domain->master == master &&
+ master_domain->ssid == ssid)
return master_domain;
}
return NULL;
@@ -2559,7 +2582,8 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
return;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- master_domain = arm_smmu_find_master_domain(smmu_domain, master);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master,
+ IOMMU_NO_PASID);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index f9b4bfb2e6b723..f4061ffc1e612d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -772,6 +772,7 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
struct arm_smmu_master_domain {
struct list_head devices_elm;
struct arm_smmu_master *master;
+ ioasid_t ssid;
};
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
@@ -803,8 +804,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain);
bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
-int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
- unsigned long iova, size_t size);
+int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
--
2.45.2
* [PATCH v9 06/14] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (4 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 05/14] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 07/14] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface Jason Gunthorpe
` (8 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
We no longer need a master->sva_enable to control what attaches are
allowed. Instead we can tell if the attach is legal based on the current
configuration of the master.
Keep track of the number of valid CD entries for SSIDs in the cd_table,
and whether the cd_table has been installed in the STE directly, so we know
what the configuration is.
The attach logic then becomes:
- SVA bind, check if the CD is installed
- RID attach of S2, block if SSIDs are used
- RID attach of IDENTITY/BLOCKING, block if SSIDs are used
arm_smmu_set_pasid() is already checking if it is possible to set up a CD
entry; at this point in the series that means the RID path has already set
an STE pointing at the CD table.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 ++++++++++-----------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++++++
2 files changed, 19 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f563f2ee6fd76d..38d1465a44b479 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1289,6 +1289,8 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target)
{
+ bool target_valid = target->data[0] & cpu_to_le64(CTXDESC_CD_0_V);
+ bool cur_valid = cdptr->data[0] & cpu_to_le64(CTXDESC_CD_0_V);
struct arm_smmu_cd_writer cd_writer = {
.writer = {
.ops = &arm_smmu_cd_writer_ops,
@@ -1297,6 +1299,13 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
.ssid = ssid,
};
+ if (ssid != IOMMU_NO_PASID && cur_valid != target_valid) {
+ if (cur_valid)
+ master->cd_table.used_ssids--;
+ else
+ master->cd_table.used_ssids++;
+ }
+
arm_smmu_write_entry(&cd_writer.writer, cdptr->data, target->data);
}
@@ -2733,16 +2742,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
state.master = master = dev_iommu_priv_get(dev);
smmu = master->smmu;
- /*
- * Checking that SVA is disabled ensures that this device isn't bound to
- * any mm, and can be safely detached from its old domain. Bonds cannot
- * be removed concurrently since we're holding the group mutex.
- */
- if (arm_smmu_master_sva_enabled(master)) {
- dev_err(dev, "cannot attach - SVA enabled\n");
- return -EBUSY;
- }
-
mutex_lock(&smmu_domain->init_mutex);
if (!smmu_domain->smmu) {
@@ -2758,7 +2757,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID);
if (!cdptr)
return -ENOMEM;
- }
+ } else if (arm_smmu_ssids_in_use(&master->cd_table))
+ return -EBUSY;
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
@@ -2831,7 +2831,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
.old_domain = iommu_get_domain_for_dev(dev),
};
- if (arm_smmu_master_sva_enabled(master))
+ if (arm_smmu_ssids_in_use(&master->cd_table))
return -EBUSY;
/*
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index f4061ffc1e612d..65b75dbfd15914 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -602,12 +602,19 @@ struct arm_smmu_ctx_desc_cfg {
dma_addr_t cdtab_dma;
struct arm_smmu_l1_ctx_desc *l1_desc;
unsigned int num_l1_ents;
+ unsigned int used_ssids;
u8 in_ste;
u8 s1fmt;
/* log2 of the maximum number of CDs supported by this table */
u8 s1cdmax;
};
+/* True if the cd table has SSIDS > 0 in use. */
+static inline bool arm_smmu_ssids_in_use(struct arm_smmu_ctx_desc_cfg *cd_table)
+{
+ return cd_table->used_ssids;
+}
+
struct arm_smmu_s2_cfg {
u16 vmid;
};
--
2.45.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v9 07/14] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (5 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 06/14] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 08/14] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain Jason Gunthorpe
` (7 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
Allow creating and managing struct arm_smmu_master_domain instances with
a non-zero SSID through the arm_smmu_attach_*() family of functions. This
triggers ATC invalidation for the correct SSID in PASID cases and tracks
the per-attachment SSID in the struct arm_smmu_master_domain.
Generalize arm_smmu_remove_master_domain() to be able to remove SSIDs as
well, ensuring the ATC for the PASID is flushed properly.
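For background, the ATC invalidation command encodes its range as a
naturally aligned power-of-two span of 4K pages; the SSID threaded
through here only selects which substream the command targets, the span
math is unchanged. A standalone sketch of that math, loosely modelled on
the existing arm_smmu_atc_inv_to_cmd() helper (userspace model assuming
64-bit long; names are illustrative):

#include <stdio.h>

/* Highest set bit, 1-based; 0 when x == 0 (assumes 64-bit long) */
static unsigned int fls_long(unsigned long x)
{
	return x ? 64 - __builtin_clzl(x) : 0;
}

/* Pick the span an ATC.INV command uses to cover [iova, iova+size) */
static void atc_inv_span(unsigned long iova, unsigned long size,
			 unsigned long *addr, unsigned int *log2_span)
{
	const unsigned int grain = 12;	/* ATS works on 4K granules */
	unsigned long page_start = iova >> grain;
	unsigned long page_end = (iova + size - 1) >> grain;
	unsigned int span = fls_long(page_start ^ page_end);
	unsigned long mask = (1UL << span) - 1;

	*addr = (page_start & ~mask) << grain;
	*log2_span = span;
}

int main(void)
{
	unsigned long addr;
	unsigned int span;

	atc_inv_span(0x11000, 0x3000, &addr, &span);
	/* Prints addr=0x10000 span=2: 2^2 pages cover pages 0x11..0x13 */
	printf("addr=%#lx span=%u\n", addr, span);
	return 0;
}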
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 ++++++++++++++-------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 38d1465a44b479..f6634c37601b89 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2005,13 +2005,14 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
cmd->atc.size = log2_span;
}
-static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
+static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
+ ioasid_t ssid)
{
int i;
struct arm_smmu_cmdq_ent cmd;
struct arm_smmu_cmdq_batch cmds;
- arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd);
+ arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd);
cmds.num = 0;
for (i = 0; i < master->num_streams; i++) {
@@ -2494,7 +2495,7 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
/*
* ATC invalidation of PASID 0 causes the entire ATC to be flushed.
*/
- arm_smmu_atc_inv_master(master);
+ arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
if (pci_enable_ats(pdev, stu))
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
@@ -2581,7 +2582,8 @@ to_smmu_domain_devices(struct iommu_domain *domain)
}
static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
- struct iommu_domain *domain)
+ struct iommu_domain *domain,
+ ioasid_t ssid)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
struct arm_smmu_master_domain *master_domain;
@@ -2591,8 +2593,7 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
return;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- master_domain = arm_smmu_find_master_domain(smmu_domain, master,
- IOMMU_NO_PASID);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
@@ -2606,6 +2607,7 @@ struct arm_smmu_attach_state {
/* Inputs */
struct iommu_domain *old_domain;
struct arm_smmu_master *master;
+ ioasid_t ssid;
/* Resulting state */
bool ats_enabled;
};
@@ -2663,6 +2665,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
if (!master_domain)
return -ENOMEM;
master_domain->master = master;
+ master_domain->ssid = state->ssid;
/*
* During prepare we want the current smmu_domain and new
@@ -2710,17 +2713,20 @@ static void arm_smmu_attach_commit(struct arm_smmu_attach_state *state)
if (state->ats_enabled && !master->ats_enabled) {
arm_smmu_enable_ats(master);
- } else if (master->ats_enabled) {
+ } else if (state->ats_enabled && master->ats_enabled) {
/*
* The translation has changed, flush the ATC. At this point the
* SMMU is translating for the new domain and both the old&new
* domain will issue invalidations.
*/
- arm_smmu_atc_inv_master(master);
+ arm_smmu_atc_inv_master(master, state->ssid);
+ } else if (!state->ats_enabled && master->ats_enabled) {
+ /* ATS is being switched off, invalidate the entire ATC */
+ arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
}
master->ats_enabled = state->ats_enabled;
- arm_smmu_remove_master_domain(master, state->old_domain);
+ arm_smmu_remove_master_domain(master, state->old_domain, state->ssid);
}
static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -2732,6 +2738,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_attach_state state = {
.old_domain = iommu_get_domain_for_dev(dev),
+ .ssid = IOMMU_NO_PASID,
};
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
@@ -2829,6 +2836,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
struct arm_smmu_attach_state state = {
.master = master,
.old_domain = iommu_get_domain_for_dev(dev),
+ .ssid = IOMMU_NO_PASID,
};
if (arm_smmu_ssids_in_use(&master->cd_table))
--
2.45.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v9 08/14] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (6 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 07/14] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 09/14] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA Jason Gunthorpe
` (6 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
Currently the SVA domain is a naked struct iommu_domain; allocate a
struct arm_smmu_domain instead.
This is necessary to be able to use the struct arm_smmu_master_domain
mechanism.
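A rough userspace model of the resulting pattern (the ERR_PTR stand-ins
and structure names are illustrative): one allocator is shared by both
paths, and callers hand out the embedded iommu_domain:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR helpers */
static void *ERR_PTR(long err) { return (void *)err; }
static long PTR_ERR(const void *p) { return (long)p; }
static int IS_ERR(const void *p)
{
	return (unsigned long)p >= (unsigned long)-4095;
}

struct iommu_domain { int type; };

struct smmu_domain_model {
	struct iommu_domain domain;	/* embedded, as in the driver */
	int asid;
};

/* One allocator shared by the paging and SVA allocation paths */
static struct smmu_domain_model *domain_alloc(void)
{
	struct smmu_domain_model *d = calloc(1, sizeof(*d));

	if (!d)
		return ERR_PTR(-ENOMEM);
	return d;
}

int main(void)
{
	struct smmu_domain_model *d = domain_alloc();

	if (IS_ERR(d))
		return (int)-PTR_ERR(d);
	/* Callers return the embedded iommu_domain, not the outer struct */
	printf("iommu_domain at %p\n", (void *)&d->domain);
	free(d);
	return 0;
}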
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 21 +++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 31 +++++++++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
3 files changed, 35 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index d31caceb584984..aa033cd65adc5a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -639,7 +639,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
}
arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- ret = arm_smmu_set_pasid(master, NULL, id, &target);
+ ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target);
if (ret) {
list_del(&bond->list);
arm_smmu_mmu_notifier_put(bond->smmu_mn);
@@ -653,7 +653,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
{
- kfree(domain);
+ kfree(to_smmu_domain(domain));
}
static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
@@ -664,13 +664,16 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm)
{
- struct iommu_domain *domain;
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
+ struct arm_smmu_domain *smmu_domain;
- domain = kzalloc(sizeof(*domain), GFP_KERNEL);
- if (!domain)
- return ERR_PTR(-ENOMEM);
- domain->type = IOMMU_DOMAIN_SVA;
- domain->ops = &arm_smmu_sva_domain_ops;
+ smmu_domain = arm_smmu_domain_alloc();
+ if (IS_ERR(smmu_domain))
+ return ERR_CAST(smmu_domain);
+ smmu_domain->domain.type = IOMMU_DOMAIN_SVA;
+ smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
+ smmu_domain->smmu = smmu;
- return domain;
+ return &smmu_domain->domain;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f6634c37601b89..dc1d53ce2b40fb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2272,6 +2272,22 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
}
}
+struct arm_smmu_domain *arm_smmu_domain_alloc(void)
+{
+ struct arm_smmu_domain *smmu_domain;
+
+ smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL);
+ if (!smmu_domain)
+ return ERR_PTR(-ENOMEM);
+
+ mutex_init(&smmu_domain->init_mutex);
+ INIT_LIST_HEAD(&smmu_domain->devices);
+ spin_lock_init(&smmu_domain->devices_lock);
+ INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
+
+ return smmu_domain;
+}
+
static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
{
struct arm_smmu_domain *smmu_domain;
@@ -2281,14 +2297,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
* We can't really do anything meaningful until we've added a
* master.
*/
- smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL);
- if (!smmu_domain)
- return ERR_PTR(-ENOMEM);
-
- mutex_init(&smmu_domain->init_mutex);
- INIT_LIST_HEAD(&smmu_domain->devices);
- spin_lock_init(&smmu_domain->devices_lock);
- INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
+ smmu_domain = arm_smmu_domain_alloc();
+ if (IS_ERR(smmu_domain))
+ return ERR_CAST(smmu_domain);
if (dev) {
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
@@ -2303,7 +2314,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
return &smmu_domain->domain;
}
-static void arm_smmu_domain_free(struct iommu_domain *domain)
+static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;
@@ -3305,7 +3316,7 @@ static struct iommu_ops arm_smmu_ops = {
.iotlb_sync = arm_smmu_iotlb_sync,
.iova_to_phys = arm_smmu_iova_to_phys,
.enable_nesting = arm_smmu_enable_nesting,
- .free = arm_smmu_domain_free,
+ .free = arm_smmu_domain_free_paging,
}
};
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 65b75dbfd15914..212c18c70fa03e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -790,6 +790,8 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
extern struct xarray arm_smmu_asid_xa;
extern struct mutex arm_smmu_asid_lock;
+struct arm_smmu_domain *arm_smmu_domain_alloc(void);
+
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
--
2.45.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v9 09/14] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (7 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 08/14] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain Jason Gunthorpe
` (5 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
Fill in the smmu_domain->devices list in the new struct arm_smmu_domain
that SVA allocates. Keep track of every SSID and master that is using the
domain, reusing the logic for the RID attach.
This is the first step to making the SVA invalidation follow the same
design as S1/S2 invalidation. At present nothing will read this list.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 30 +++++++++++++++++++--
1 file changed, 28 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index dc1d53ce2b40fb..527560b36b3993 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2587,7 +2587,8 @@ to_smmu_domain_devices(struct iommu_domain *domain)
/* The domain can be NULL only when processing the first attach */
if (!domain)
return NULL;
- if (domain->type & __IOMMU_DOMAIN_PAGING)
+ if ((domain->type & __IOMMU_DOMAIN_PAGING) ||
+ domain->type == IOMMU_DOMAIN_SVA)
return to_smmu_domain(domain);
return NULL;
}
@@ -2820,7 +2821,16 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
const struct arm_smmu_cd *cd)
{
+ struct arm_smmu_attach_state state = {
+ .master = master,
+ /*
+ * For now the core code prevents calling this when a domain is
+ * already attached, no need to set old_domain.
+ */
+ .ssid = pasid,
+ };
struct arm_smmu_cd *cdptr;
+ int ret;
/* The core code validates pasid */
@@ -2830,14 +2840,30 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
if (!cdptr)
return -ENOMEM;
+
+ mutex_lock(&arm_smmu_asid_lock);
+ ret = arm_smmu_attach_prepare(&state, &smmu_domain->domain);
+ if (ret)
+ goto out_unlock;
+
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
- return 0;
+
+ arm_smmu_attach_commit(&state);
+
+out_unlock:
+ mutex_unlock(&arm_smmu_asid_lock);
+ return ret;
}
void arm_smmu_remove_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
{
+ mutex_lock(&arm_smmu_asid_lock);
arm_smmu_clear_cd(master, pasid);
+ if (master->ats_enabled)
+ arm_smmu_atc_inv_master(master, pasid);
+ arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid);
+ mutex_unlock(&arm_smmu_asid_lock);
}
static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
--
2.45.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v9 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (8 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 09/14] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 11/14] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used Jason Gunthorpe
` (4 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
This removes all the notifier de-duplication logic in the driver and
relies on the core code to de-duplicate and allocate only one SVA domain
per mm per smmu instance. This naturally gives a 1:1 relationship between
SVA domain and mmu notifier.
It is a significant simplification of the flow, as we end up with a
single struct arm_smmu_domain for each MM and the invalidation can then
be shifted to properly use the masters list like S1/S2 do.
Remove all of the previous mmu_notifier, bond, shared cd, and cd refcount
logic entirely.
The logic here is tightly wound together with the unused BTM
support. Since the BTM logic requires holding all the iommu_domains in a
global ASID xarray, it conflicts with the design of having a single SVA
domain per PASID, as multiple SMMU instances will need to have different
domains.
Following patches resolve this by making the ASID xarray per-instance
instead of global. However, converting the BTM code over to this
methodology requires many changes.
Thus, since ARM_SMMU_FEAT_BTM is never enabled, remove the parts of the
BTM support for ASID sharing that interact with SVA as well.
A followup series is already working on fully enabling the BTM support;
that requires iommufd's VIOMMU feature to bring in KVM's VMID as
well. It will come with an already-written patch to bring back the ASID
sharing using a per-instance ASID xarray.
https://lore.kernel.org/linux-iommu/20240208151837.35068-1-shameerali.kolothum.thodi@huawei.com/
https://lore.kernel.org/linux-iommu/26-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com/
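With the notifier embedded 1:1 in the domain, the callbacks recover the
domain with container_of() and need no bond/refcount indirection. A
minimal standalone sketch of that shape (userspace model, names
illustrative):

#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mmu_notifier_model { int placeholder; };

struct sva_domain_model {
	int asid;
	struct mmu_notifier_model mmu_notifier;	/* embedded 1:1 */
};

/* A callback gets the notifier and recovers the domain directly */
static void invalidate_cb(struct mmu_notifier_model *mn)
{
	struct sva_domain_model *d =
		container_of(mn, struct sva_domain_model, mmu_notifier);

	printf("invalidate ASID %d\n", d->asid);
}

int main(void)
{
	struct sva_domain_model d = { .asid = 5 };

	invalidate_cb(&d.mmu_notifier);
	return 0;
}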
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 395 +++---------------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 69 +--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 15 +-
3 files changed, 86 insertions(+), 393 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index aa033cd65adc5a..a7c36654dee5a5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -13,29 +13,9 @@
#include "arm-smmu-v3.h"
#include "../../io-pgtable-arm.h"
-struct arm_smmu_mmu_notifier {
- struct mmu_notifier mn;
- struct arm_smmu_ctx_desc *cd;
- bool cleared;
- refcount_t refs;
- struct list_head list;
- struct arm_smmu_domain *domain;
-};
-
-#define mn_to_smmu(mn) container_of(mn, struct arm_smmu_mmu_notifier, mn)
-
-struct arm_smmu_bond {
- struct mm_struct *mm;
- struct arm_smmu_mmu_notifier *smmu_mn;
- struct list_head list;
-};
-
-#define sva_to_bond(handle) \
- container_of(handle, struct arm_smmu_bond, sva)
-
static DEFINE_MUTEX(sva_lock);
-static void
+static void __maybe_unused
arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
{
struct arm_smmu_master_domain *master_domain;
@@ -58,58 +38,6 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
}
-/*
- * Check if the CPU ASID is available on the SMMU side. If a private context
- * descriptor is using it, try to replace it.
- */
-static struct arm_smmu_ctx_desc *
-arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
-{
- int ret;
- u32 new_asid;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_device *smmu;
- struct arm_smmu_domain *smmu_domain;
-
- cd = xa_load(&arm_smmu_asid_xa, asid);
- if (!cd)
- return NULL;
-
- if (cd->mm) {
- if (WARN_ON(cd->mm != mm))
- return ERR_PTR(-EINVAL);
- /* All devices bound to this mm use the same cd struct. */
- refcount_inc(&cd->refs);
- return cd;
- }
-
- smmu_domain = container_of(cd, struct arm_smmu_domain, cd);
- smmu = smmu_domain->smmu;
-
- ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd,
- XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
- if (ret)
- return ERR_PTR(-ENOSPC);
- /*
- * Race with unmap: TLB invalidations will start targeting the new ASID,
- * which isn't assigned yet. We'll do an invalidate-all on the old ASID
- * later, so it doesn't matter.
- */
- cd->asid = new_asid;
- /*
- * Update ASID and invalidate CD in all associated masters. There will
- * be some overlap between use of both ASIDs, until we invalidate the
- * TLB.
- */
- arm_smmu_update_s1_domain_cd_entry(smmu_domain);
-
- /* Invalidate TLB entries previously associated with that context */
- arm_smmu_tlb_inv_asid(smmu, asid);
-
- xa_erase(&arm_smmu_asid_xa, asid);
- return NULL;
-}
-
static u64 page_size_to_cd(void)
{
static_assert(PAGE_SIZE == SZ_4K || PAGE_SIZE == SZ_16K ||
@@ -187,69 +115,6 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
}
EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_sva_cd);
-static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
-{
- u16 asid;
- int err = 0;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_ctx_desc *ret = NULL;
-
- /* Don't free the mm until we release the ASID */
- mmgrab(mm);
-
- asid = arm64_mm_context_get(mm);
- if (!asid) {
- err = -ESRCH;
- goto out_drop_mm;
- }
-
- cd = kzalloc(sizeof(*cd), GFP_KERNEL);
- if (!cd) {
- err = -ENOMEM;
- goto out_put_context;
- }
-
- refcount_set(&cd->refs, 1);
-
- mutex_lock(&arm_smmu_asid_lock);
- ret = arm_smmu_share_asid(mm, asid);
- if (ret) {
- mutex_unlock(&arm_smmu_asid_lock);
- goto out_free_cd;
- }
-
- err = xa_insert(&arm_smmu_asid_xa, asid, cd, GFP_KERNEL);
- mutex_unlock(&arm_smmu_asid_lock);
-
- if (err)
- goto out_free_asid;
-
- cd->asid = asid;
- cd->mm = mm;
-
- return cd;
-
-out_free_asid:
- arm_smmu_free_asid(cd);
-out_free_cd:
- kfree(cd);
-out_put_context:
- arm64_mm_context_put(mm);
-out_drop_mm:
- mmdrop(mm);
- return err < 0 ? ERR_PTR(err) : ret;
-}
-
-static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd)
-{
- if (arm_smmu_free_asid(cd)) {
- /* Unpin ASID */
- arm64_mm_context_put(cd->mm);
- mmdrop(cd->mm);
- kfree(cd);
- }
-}
-
/*
* Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this
* is used as a threshold to replace per-page TLBI commands to issue in the
@@ -264,8 +129,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
unsigned long start,
unsigned long end)
{
- struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+ struct arm_smmu_domain *smmu_domain =
+ container_of(mn, struct arm_smmu_domain, mmu_notifier);
size_t size;
/*
@@ -282,34 +147,22 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
size = 0;
}
- if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM)) {
- if (!size)
- arm_smmu_tlb_inv_asid(smmu_domain->smmu,
- smmu_mn->cd->asid);
- else
- arm_smmu_tlb_inv_range_asid(start, size,
- smmu_mn->cd->asid,
- PAGE_SIZE, false,
- smmu_domain);
- }
+ if (!size)
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+ else
+ arm_smmu_tlb_inv_range_asid(start, size, smmu_domain->cd.asid,
+ PAGE_SIZE, false, smmu_domain);
- arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start,
- size);
+ arm_smmu_atc_inv_domain(smmu_domain, start, size);
}
static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
- struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+ struct arm_smmu_domain *smmu_domain =
+ container_of(mn, struct arm_smmu_domain, mmu_notifier);
struct arm_smmu_master_domain *master_domain;
unsigned long flags;
- mutex_lock(&sva_lock);
- if (smmu_mn->cleared) {
- mutex_unlock(&sva_lock);
- return;
- }
-
/*
* DMA may still be running. Keep the cd valid to avoid C_BAD_CD events,
* but disable translation.
@@ -321,25 +174,23 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
struct arm_smmu_cd target;
struct arm_smmu_cd *cdptr;
- cdptr = arm_smmu_get_cd_ptr(master, mm_get_enqcmd_pasid(mm));
+ cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
- arm_smmu_make_sva_cd(&target, master, NULL, smmu_mn->cd->asid);
- arm_smmu_write_cd_entry(master, mm_get_enqcmd_pasid(mm), cdptr,
+ arm_smmu_make_sva_cd(&target, master, NULL,
+ smmu_domain->cd.asid);
+ arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
- arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
-
- smmu_mn->cleared = true;
- mutex_unlock(&sva_lock);
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+ arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
static void arm_smmu_mmu_notifier_free(struct mmu_notifier *mn)
{
- kfree(mn_to_smmu(mn));
+ kfree(container_of(mn, struct arm_smmu_domain, mmu_notifier));
}
static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
@@ -348,115 +199,6 @@ static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
.free_notifier = arm_smmu_mmu_notifier_free,
};
-/* Allocate or get existing MMU notifier for this {domain, mm} pair */
-static struct arm_smmu_mmu_notifier *
-arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain,
- struct mm_struct *mm)
-{
- int ret;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_mmu_notifier *smmu_mn;
-
- list_for_each_entry(smmu_mn, &smmu_domain->mmu_notifiers, list) {
- if (smmu_mn->mn.mm == mm) {
- refcount_inc(&smmu_mn->refs);
- return smmu_mn;
- }
- }
-
- cd = arm_smmu_alloc_shared_cd(mm);
- if (IS_ERR(cd))
- return ERR_CAST(cd);
-
- smmu_mn = kzalloc(sizeof(*smmu_mn), GFP_KERNEL);
- if (!smmu_mn) {
- ret = -ENOMEM;
- goto err_free_cd;
- }
-
- refcount_set(&smmu_mn->refs, 1);
- smmu_mn->cd = cd;
- smmu_mn->domain = smmu_domain;
- smmu_mn->mn.ops = &arm_smmu_mmu_notifier_ops;
-
- ret = mmu_notifier_register(&smmu_mn->mn, mm);
- if (ret) {
- kfree(smmu_mn);
- goto err_free_cd;
- }
-
- list_add(&smmu_mn->list, &smmu_domain->mmu_notifiers);
- return smmu_mn;
-
-err_free_cd:
- arm_smmu_free_shared_cd(cd);
- return ERR_PTR(ret);
-}
-
-static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
-{
- struct mm_struct *mm = smmu_mn->mn.mm;
- struct arm_smmu_ctx_desc *cd = smmu_mn->cd;
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
-
- if (!refcount_dec_and_test(&smmu_mn->refs))
- return;
-
- list_del(&smmu_mn->list);
-
- /*
- * If we went through clear(), we've already invalidated, and no
- * new TLB entry can have been formed.
- */
- if (!smmu_mn->cleared) {
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
- arm_smmu_atc_inv_domain_sva(smmu_domain,
- mm_get_enqcmd_pasid(mm), 0, 0);
- }
-
- /* Frees smmu_mn */
- mmu_notifier_put(&smmu_mn->mn);
- arm_smmu_free_shared_cd(cd);
-}
-
-static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev,
- struct mm_struct *mm)
-{
- int ret;
- struct arm_smmu_bond *bond;
- struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
- struct arm_smmu_domain *smmu_domain;
-
- if (!(domain->type & __IOMMU_DOMAIN_PAGING))
- return ERR_PTR(-ENODEV);
- smmu_domain = to_smmu_domain(domain);
- if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
- return ERR_PTR(-ENODEV);
-
- if (!master || !master->sva_enabled)
- return ERR_PTR(-ENODEV);
-
- bond = kzalloc(sizeof(*bond), GFP_KERNEL);
- if (!bond)
- return ERR_PTR(-ENOMEM);
-
- bond->mm = mm;
-
- bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm);
- if (IS_ERR(bond->smmu_mn)) {
- ret = PTR_ERR(bond->smmu_mn);
- goto err_free_bond;
- }
-
- list_add(&bond->list, &master->bonds);
- return bond;
-
-err_free_bond:
- kfree(bond);
- return ERR_PTR(ret);
-}
-
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
{
unsigned long reg, fld;
@@ -573,11 +315,6 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master)
int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
{
mutex_lock(&sva_lock);
- if (!list_empty(&master->bonds)) {
- dev_err(master->dev, "cannot disable SVA, device is bound\n");
- mutex_unlock(&sva_lock);
- return -EBUSY;
- }
arm_smmu_master_sva_disable_iopf(master);
master->sva_enabled = false;
mutex_unlock(&sva_lock);
@@ -594,66 +331,51 @@ void arm_smmu_sva_notifier_synchronize(void)
mmu_notifier_synchronize();
}
-void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t id)
-{
- struct mm_struct *mm = domain->mm;
- struct arm_smmu_bond *bond = NULL, *t;
- struct arm_smmu_master *master = dev_iommu_priv_get(dev);
-
- arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
-
- mutex_lock(&sva_lock);
- list_for_each_entry(t, &master->bonds, list) {
- if (t->mm == mm) {
- bond = t;
- break;
- }
- }
-
- if (!WARN_ON(!bond)) {
- list_del(&bond->list);
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
- kfree(bond);
- }
- mutex_unlock(&sva_lock);
-}
-
static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id)
{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- struct mm_struct *mm = domain->mm;
- struct arm_smmu_bond *bond;
struct arm_smmu_cd target;
int ret;
- if (mm_get_enqcmd_pasid(mm) != id)
+ /* Prevent arm_smmu_mm_release from being called while we are attaching */
+ if (!mmget_not_zero(domain->mm))
return -EINVAL;
- mutex_lock(&sva_lock);
- bond = __arm_smmu_sva_bind(dev, mm);
- if (IS_ERR(bond)) {
- mutex_unlock(&sva_lock);
- return PTR_ERR(bond);
- }
+ /*
+ * This does not need the arm_smmu_asid_lock because SVA domains never
+ * get reassigned
+ */
+ arm_smmu_make_sva_cd(&target, master, domain->mm, smmu_domain->cd.asid);
+ ret = arm_smmu_set_pasid(master, smmu_domain, id, &target);
- arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target);
- if (ret) {
- list_del(&bond->list);
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
- kfree(bond);
- mutex_unlock(&sva_lock);
- return ret;
- }
- mutex_unlock(&sva_lock);
- return 0;
+ mmput(domain->mm);
+ return ret;
}
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
{
- kfree(to_smmu_domain(domain));
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+ /*
+ * Ensure the ASID is empty in the iommu cache before allowing reuse.
+ */
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+
+ /*
+ * Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can
+ * still be called/running at this point. We allow the ASID to be
+ * reused, and if there is a race then it just suffers harmless
+ * unnecessary invalidation.
+ */
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
+
+ /*
+	 * Actual free is deferred to the SRCU callback
+ * arm_smmu_mmu_notifier_free()
+ */
+ mmu_notifier_put(&smmu_domain->mmu_notifier);
}
static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
@@ -667,6 +389,8 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_domain *smmu_domain;
+ u32 asid;
+ int ret;
smmu_domain = arm_smmu_domain_alloc();
if (IS_ERR(smmu_domain))
@@ -675,5 +399,22 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
smmu_domain->smmu = smmu;
+ ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
+ XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
+ if (ret)
+ goto err_free;
+
+ smmu_domain->cd.asid = asid;
+ smmu_domain->mmu_notifier.ops = &arm_smmu_mmu_notifier_ops;
+ ret = mmu_notifier_register(&smmu_domain->mmu_notifier, mm);
+ if (ret)
+ goto err_asid;
+
return &smmu_domain->domain;
+
+err_asid:
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
+err_free:
+ kfree(smmu_domain);
+ return ERR_PTR(ret);
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 527560b36b3993..26d597f608e3b5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1439,22 +1439,6 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
cd_table->cdtab = NULL;
}
-bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
-{
- bool free;
- struct arm_smmu_ctx_desc *old_cd;
-
- if (!cd->asid)
- return false;
-
- free = refcount_dec_and_test(&cd->refs);
- if (free) {
- old_cd = xa_erase(&arm_smmu_asid_xa, cd->asid);
- WARN_ON(old_cd != cd);
- }
- return free;
-}
-
/* Stream table manipulation functions */
static void
arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc)
@@ -2023,8 +2007,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
}
-static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size)
+int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size)
{
struct arm_smmu_master_domain *master_domain;
int i;
@@ -2062,15 +2046,7 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
if (!master->ats_enabled)
continue;
- /*
- * Non-zero ssid means SVA is co-opting the S1 domain to issue
- * invalidations for SVA PASIDs.
- */
- if (ssid != IOMMU_NO_PASID)
- arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
- else
- arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
- &cmd);
+ arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, &cmd);
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
@@ -2082,19 +2058,6 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
}
-static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
- unsigned long iova, size_t size)
-{
- return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova,
- size);
-}
-
-int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size)
-{
- return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size);
-}
-
/* IO_PGTABLE API */
static void arm_smmu_tlb_inv_context(void *cookie)
{
@@ -2283,7 +2246,6 @@ struct arm_smmu_domain *arm_smmu_domain_alloc(void)
mutex_init(&smmu_domain->init_mutex);
INIT_LIST_HEAD(&smmu_domain->devices);
spin_lock_init(&smmu_domain->devices_lock);
- INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
return smmu_domain;
}
@@ -2325,7 +2287,7 @@ static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
/* Prevent SVA from touching the CD while we're freeing it */
mutex_lock(&arm_smmu_asid_lock);
- arm_smmu_free_asid(&smmu_domain->cd);
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
mutex_unlock(&arm_smmu_asid_lock);
} else {
struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
@@ -2343,11 +2305,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
u32 asid;
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
- refcount_set(&cd->refs, 1);
-
/* Prevent SVA from modifying the ASID until it is written to the CD */
mutex_lock(&arm_smmu_asid_lock);
- ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd,
+ ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
cd->asid = (u16)asid;
mutex_unlock(&arm_smmu_asid_lock);
@@ -2834,6 +2794,9 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
/* The core code validates pasid */
+ if (smmu_domain->smmu != master->smmu)
+ return -EINVAL;
+
if (!master->cd_table.in_ste)
return -ENODEV;
@@ -2855,9 +2818,14 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
return ret;
}
-void arm_smmu_remove_pasid(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
+static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
+ struct iommu_domain *domain)
{
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_domain *smmu_domain;
+
+ smmu_domain = to_smmu_domain(domain);
+
mutex_lock(&arm_smmu_asid_lock);
arm_smmu_clear_cd(master, pasid);
if (master->ats_enabled)
@@ -3128,7 +3096,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
master->dev = dev;
master->smmu = smmu;
- INIT_LIST_HEAD(&master->bonds);
dev_iommu_priv_set(dev, master);
ret = arm_smmu_insert_master(smmu, master);
@@ -3310,12 +3277,6 @@ static int arm_smmu_def_domain_type(struct device *dev)
return 0;
}
-static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
- struct iommu_domain *domain)
-{
- arm_smmu_sva_remove_dev_pasid(domain, dev, pasid);
-}
-
static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 212c18c70fa03e..d175d9eee6c61b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -587,9 +587,6 @@ struct arm_smmu_strtab_l1_desc {
struct arm_smmu_ctx_desc {
u16 asid;
-
- refcount_t refs;
- struct mm_struct *mm;
};
struct arm_smmu_l1_ctx_desc {
@@ -712,7 +709,6 @@ struct arm_smmu_master {
bool stall_enabled;
bool sva_enabled;
bool iopf_enabled;
- struct list_head bonds;
unsigned int ssid_bits;
};
@@ -741,7 +737,7 @@ struct arm_smmu_domain {
struct list_head devices;
spinlock_t devices_lock;
- struct list_head mmu_notifiers;
+ struct mmu_notifier mmu_notifier;
};
/* The following are exposed for testing purposes. */
@@ -805,16 +801,13 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
const struct arm_smmu_cd *cd);
-void arm_smmu_remove_pasid(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain, ioasid_t pasid);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain);
-bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
-int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size);
+int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
@@ -826,8 +819,6 @@ bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
void arm_smmu_sva_notifier_synchronize(void);
struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
-void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t id);
#else /* CONFIG_ARM_SMMU_V3_SVA */
static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
{
--
2.45.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v9 11/14] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (9 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 12/14] iommu/arm-smmu-v3: Test the STE S1DSS functionality Jason Gunthorpe
` (3 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
The HW supports this; use the S1DSS bits to configure the behavior
of SSID=0, which is the RID's translation.
If SSIDs are currently being used in the CD table then just update the
S1DSS bits in the STE, remove the master_domain and leave ATS alone.
For iommufd the driver design has a small problem: all the unused CD
table entries are set with V=0, which will generate an event if VFIO
userspace tries to use the CD entry. This patch extends this problem to
include the RID as well if PASID is being used.
For BLOCKED with used PASIDs, the
F_STREAM_DISABLED (STRTAB_STE_1_S1DSS_TERMINATE) event is generated on
untagged traffic, and a substream CD table entry with V=0 (a removed
PASID) will generate C_BAD_CD. Arguably there is no advantage to using
S1DSS over CD entry 0 with V=0.
As we don't yet support PASID in iommufd, this is a problem to resolve
later, possibly by using EPD0 for unused CD table entries instead of
V=0, and not using S1DSS for BLOCKED.
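For reference, S1DSS lives in bits [1:0] of STE word 1 and its three
encodings map directly onto the RID behaviors used here. A small sketch
(field values as in the driver's header; the surrounding code is
illustrative):

#include <stdio.h>

/* S1DSS encodings (STE word 1, bits [1:0]) */
#define S1DSS_TERMINATE	0x0	/* untagged traffic aborts (BLOCKED) */
#define S1DSS_BYPASS	0x1	/* untagged traffic bypasses S1 (IDENTITY) */
#define S1DSS_SSID0	0x2	/* untagged traffic uses the CD at SSID 0 */

static const char *rid_behavior(unsigned int s1dss)
{
	switch (s1dss) {
	case S1DSS_TERMINATE: return "abort";
	case S1DSS_BYPASS: return "identity";
	case S1DSS_SSID0: return "translate via CD[0]";
	}
	return "reserved";
}

int main(void)
{
	unsigned long long ste_qw1 = S1DSS_BYPASS;	/* bits [1:0] */

	printf("RID (SSID 0) traffic: %s\n",
	       rid_behavior(ste_qw1 & 0x3));
	return 0;
}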
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 60 +++++++++++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 +-
3 files changed, 50 insertions(+), 16 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index a460b71f585789..d7e022bb9df530 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -164,7 +164,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
.smmu = &smmu,
};
- arm_smmu_make_cdtable_ste(ste, &master, true);
+ arm_smmu_make_cdtable_ste(ste, &master, true, STRTAB_STE_1_S1DSS_SSID0);
}
static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 26d597f608e3b5..10d140a5cd0f38 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -991,6 +991,14 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW |
STRTAB_STE_1_EATS);
used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);
+
+ /*
+ * See 13.5 Summary of attribute/permission configuration fields
+ * for the SHCFG behavior.
+ */
+ if (FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent[1])) ==
+ STRTAB_STE_1_S1DSS_BYPASS)
+ used_bits[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
}
/* S2 translates */
@@ -1531,7 +1539,8 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_bypass_ste);
VISIBLE_IF_KUNIT
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master, bool ats_enabled)
+ struct arm_smmu_master *master, bool ats_enabled,
+ unsigned int s1dss)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -1545,7 +1554,7 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax));
target->data[1] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
+ FIELD_PREP(STRTAB_STE_1_S1DSS, s1dss) |
FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
@@ -1556,6 +1565,11 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_1_EATS,
ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ if ((smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR) &&
+ s1dss == STRTAB_STE_1_S1DSS_BYPASS)
+ target->data[1] |= cpu_to_le64(FIELD_PREP(
+ STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
+
if (smmu->features & ARM_SMMU_FEAT_E2H) {
/*
* To support BTM the streamworld needs to match the
@@ -2579,6 +2593,7 @@ struct arm_smmu_attach_state {
/* Inputs */
struct iommu_domain *old_domain;
struct arm_smmu_master *master;
+ bool cd_needs_ats;
ioasid_t ssid;
/* Resulting state */
bool ats_enabled;
@@ -2620,7 +2635,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
*/
lockdep_assert_held(&arm_smmu_asid_lock);
- if (smmu_domain) {
+ if (smmu_domain || state->cd_needs_ats) {
/*
* The SMMU does not support enabling ATS with bypass/abort.
* When the STE is in bypass (STE.Config[2:0] == 0b100), ATS
@@ -2632,7 +2647,9 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
* tables.
*/
state->ats_enabled = arm_smmu_ats_supported(master);
+ }
+ if (smmu_domain) {
master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
if (!master_domain)
return -ENOMEM;
@@ -2760,7 +2777,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
&target_cd);
- arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled);
+ arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled,
+ STRTAB_STE_1_S1DSS_SSID0);
arm_smmu_install_ste_for_dev(master, &target);
break;
}
@@ -2834,8 +2852,10 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
mutex_unlock(&arm_smmu_asid_lock);
}
-static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
- struct device *dev, struct arm_smmu_ste *ste)
+static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
+ struct device *dev,
+ struct arm_smmu_ste *ste,
+ unsigned int s1dss)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_attach_state state = {
@@ -2844,16 +2864,28 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
.ssid = IOMMU_NO_PASID,
};
- if (arm_smmu_ssids_in_use(&master->cd_table))
- return -EBUSY;
-
/*
* Do not allow any ASID to be changed while are working on the STE,
* otherwise we could miss invalidations.
*/
mutex_lock(&arm_smmu_asid_lock);
- arm_smmu_attach_prepare(&state, domain);
+ /*
+ * If the CD table is not in use we can use the provided STE, otherwise
+ * we use a cdtable STE with the provided S1DSS.
+ */
+ if (arm_smmu_ssids_in_use(&master->cd_table)) {
+ /*
+ * If a CD table has to be present then we need to run with ATS
+ * on even though the RID will fail ATS queries with UR. This is
+ * because we have no idea what the PASIDs need.
+ */
+ state.cd_needs_ats = true;
+ arm_smmu_attach_prepare(&state, domain);
+ arm_smmu_make_cdtable_ste(ste, master, state.ats_enabled, s1dss);
+ } else {
+ arm_smmu_attach_prepare(&state, domain);
+ }
arm_smmu_install_ste_for_dev(master, ste);
arm_smmu_attach_commit(&state);
mutex_unlock(&arm_smmu_asid_lock);
@@ -2864,7 +2896,6 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
* descriptor from arm_smmu_share_asid().
*/
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
- return 0;
}
static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
@@ -2874,7 +2905,8 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
arm_smmu_make_bypass_ste(master->smmu, &ste);
- return arm_smmu_attach_dev_ste(domain, dev, &ste);
+ arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
+ return 0;
}
static const struct iommu_domain_ops arm_smmu_identity_ops = {
@@ -2892,7 +2924,9 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
struct arm_smmu_ste ste;
arm_smmu_make_abort_ste(&ste);
- return arm_smmu_attach_dev_ste(domain, dev, &ste);
+ arm_smmu_attach_dev_ste(domain, dev, &ste,
+ STRTAB_STE_1_S1DSS_TERMINATE);
+ return 0;
}
static const struct iommu_domain_ops arm_smmu_blocked_ops = {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d175d9eee6c61b..30459a800c7b2d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -761,8 +761,8 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target);
void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu,
struct arm_smmu_ste *target);
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master,
- bool ats_enabled);
+ struct arm_smmu_master *master, bool ats_enabled,
+ unsigned int s1dss);
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain,
--
2.45.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v9 12/14] iommu/arm-smmu-v3: Test the STE S1DSS functionality
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (10 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 11/14] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED Jason Gunthorpe
` (2 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
S1DSS brings in quite a few new transition pairs that are
interesting. Test to/from S1DSS_BYPASS <-> S1DSS_SSID0, and
BYPASS <-> S1DSS_SSID0.
Test a contrived non-hitless flow to make sure that the logic works.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 113 +++++++++++++++++-
1 file changed, 108 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index d7e022bb9df530..e0fce31eba54dd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -144,6 +144,14 @@ static void arm_smmu_v3_test_ste_expect_transition(
KUNIT_EXPECT_MEMEQ(test, target->data, cur_copy.data, sizeof(cur_copy));
}
+static void arm_smmu_v3_test_ste_expect_non_hitless_transition(
+ struct kunit *test, const struct arm_smmu_ste *cur,
+ const struct arm_smmu_ste *target, unsigned int num_syncs_expected)
+{
+ arm_smmu_v3_test_ste_expect_transition(test, cur, target,
+ num_syncs_expected, false);
+}
+
static void arm_smmu_v3_test_ste_expect_hitless_transition(
struct kunit *test, const struct arm_smmu_ste *cur,
const struct arm_smmu_ste *target, unsigned int num_syncs_expected)
@@ -155,6 +163,7 @@ static void arm_smmu_v3_test_ste_expect_hitless_transition(
static const dma_addr_t fake_cdtab_dma_addr = 0xF0F0F0F0F0F0;
static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
+ unsigned int s1dss,
const dma_addr_t dma_addr)
{
struct arm_smmu_master master = {
@@ -164,7 +173,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
.smmu = &smmu,
};
- arm_smmu_make_cdtable_ste(ste, &master, true, STRTAB_STE_1_S1DSS_SSID0);
+ arm_smmu_make_cdtable_ste(ste, &master, true, s1dss);
}
static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test)
@@ -194,7 +203,8 @@ static void arm_smmu_v3_write_ste_test_cdtable_to_abort(struct kunit *test)
{
struct arm_smmu_ste ste;
- arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &abort_ste,
NUM_EXPECTED_SYNCS(2));
}
@@ -203,7 +213,8 @@ static void arm_smmu_v3_write_ste_test_abort_to_cdtable(struct kunit *test)
{
struct arm_smmu_ste ste;
- arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
arm_smmu_v3_test_ste_expect_hitless_transition(test, &abort_ste, &ste,
NUM_EXPECTED_SYNCS(2));
}
@@ -212,7 +223,8 @@ static void arm_smmu_v3_write_ste_test_cdtable_to_bypass(struct kunit *test)
{
struct arm_smmu_ste ste;
- arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &bypass_ste,
NUM_EXPECTED_SYNCS(3));
}
@@ -221,11 +233,54 @@ static void arm_smmu_v3_write_ste_test_bypass_to_cdtable(struct kunit *test)
{
struct arm_smmu_ste ste;
- arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
arm_smmu_v3_test_ste_expect_hitless_transition(test, &bypass_ste, &ste,
NUM_EXPECTED_SYNCS(3));
}
+static void arm_smmu_v3_write_ste_test_cdtable_s1dss_change(struct kunit *test)
+{
+ struct arm_smmu_ste ste;
+ struct arm_smmu_ste s1dss_bypass;
+
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS,
+ fake_cdtab_dma_addr);
+
+ /*
+ * Flipping s1dss on a CD table STE only involves changes to the second
+ * qword of an STE and can be done in a single write.
+ */
+ arm_smmu_v3_test_ste_expect_hitless_transition(
+ test, &ste, &s1dss_bypass, NUM_EXPECTED_SYNCS(1));
+ arm_smmu_v3_test_ste_expect_hitless_transition(
+ test, &s1dss_bypass, &ste, NUM_EXPECTED_SYNCS(1));
+}
+
+static void
+arm_smmu_v3_write_ste_test_s1dssbypass_to_stebypass(struct kunit *test)
+{
+ struct arm_smmu_ste s1dss_bypass;
+
+ arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS,
+ fake_cdtab_dma_addr);
+ arm_smmu_v3_test_ste_expect_hitless_transition(
+ test, &s1dss_bypass, &bypass_ste, NUM_EXPECTED_SYNCS(2));
+}
+
+static void
+arm_smmu_v3_write_ste_test_stebypass_to_s1dssbypass(struct kunit *test)
+{
+ struct arm_smmu_ste s1dss_bypass;
+
+ arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS,
+ fake_cdtab_dma_addr);
+ arm_smmu_v3_test_ste_expect_hitless_transition(
+ test, &bypass_ste, &s1dss_bypass, NUM_EXPECTED_SYNCS(2));
+}
+
static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
bool ats_enabled)
{
@@ -285,6 +340,48 @@ static void arm_smmu_v3_write_ste_test_bypass_to_s2(struct kunit *test)
NUM_EXPECTED_SYNCS(2));
}
+static void arm_smmu_v3_write_ste_test_s1_to_s2(struct kunit *test)
+{
+ struct arm_smmu_ste s1_ste;
+ struct arm_smmu_ste s2_ste;
+
+ arm_smmu_test_make_cdtable_ste(&s1_ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
+ arm_smmu_test_make_s2_ste(&s2_ste, true);
+ arm_smmu_v3_test_ste_expect_hitless_transition(test, &s1_ste, &s2_ste,
+ NUM_EXPECTED_SYNCS(3));
+}
+
+static void arm_smmu_v3_write_ste_test_s2_to_s1(struct kunit *test)
+{
+ struct arm_smmu_ste s1_ste;
+ struct arm_smmu_ste s2_ste;
+
+ arm_smmu_test_make_cdtable_ste(&s1_ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
+ arm_smmu_test_make_s2_ste(&s2_ste, true);
+ arm_smmu_v3_test_ste_expect_hitless_transition(test, &s2_ste, &s1_ste,
+ NUM_EXPECTED_SYNCS(3));
+}
+
+static void arm_smmu_v3_write_ste_test_non_hitless(struct kunit *test)
+{
+ struct arm_smmu_ste ste;
+ struct arm_smmu_ste ste_2;
+
+ /*
+ * Although no flow resembles this in practice, one way to force an STE
+ * update to be non-hitless is to change its CD table pointer as well as
+ * its S1DSS field in the same update.
+ */
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste_2, STRTAB_STE_1_S1DSS_BYPASS,
+ 0x4B4B4b4B4B);
+ arm_smmu_v3_test_ste_expect_non_hitless_transition(
+ test, &ste, &ste_2, NUM_EXPECTED_SYNCS(3));
+}
+
static void arm_smmu_v3_test_cd_expect_transition(
struct kunit *test, const struct arm_smmu_cd *cur,
const struct arm_smmu_cd *target, unsigned int num_syncs_expected,
@@ -438,10 +535,16 @@ static struct kunit_case arm_smmu_v3_test_cases[] = {
KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_cdtable),
KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_to_bypass),
KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_cdtable),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_s1dss_change),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_s1dssbypass_to_stebypass),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_stebypass_to_s1dssbypass),
KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_abort),
KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_s2),
KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_bypass),
KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_s2),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_s1_to_s2),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_s1),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_non_hitless),
KUNIT_CASE(arm_smmu_v3_write_cd_test_s1_clear),
KUNIT_CASE(arm_smmu_v3_write_cd_test_s1_change_asid),
KUNIT_CASE(arm_smmu_v3_write_cd_test_sva_clear),
--
2.45.2
* [PATCH v9 13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (11 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 12/14] iommu/arm-smmu-v3: Test the STE S1DSS functionality Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-06-25 12:37 ` [PATCH v9 14/14] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID Jason Gunthorpe
2024-07-02 18:43 ` [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Will Deacon
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
If the STE doesn't point to the CD table we can upgrade it by
reprogramming the STE with the appropriate S1DSS. We may also need to turn
on ATS at the same time.
Keep track of whether the installed STE is pointing at the cd_table, and of
the installed ATS state, so that this path can be triggered.
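As a sketch, the upgrade decision reduces to deriving the S1DSS value from
the RID domain type (illustrative only; the helper name is made up and the
real logic lives in arm_smmu_update_ste() in the diff below):

	/* Hypothetical helper, for illustration */
	static unsigned int s1dss_for_rid_domain(struct iommu_domain *sid_domain)
	{
		if (sid_domain->type == IOMMU_DOMAIN_IDENTITY)
			return STRTAB_STE_1_S1DSS_BYPASS;	/* RID keeps identity */
		return STRTAB_STE_1_S1DSS_TERMINATE;		/* RID stays blocked */
	}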
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 49 ++++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +-
2 files changed, 49 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 10d140a5cd0f38..fbe14466f5f379 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2435,6 +2435,9 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
master->cd_table.in_ste =
FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(target->data[0])) ==
STRTAB_STE_0_CFG_S1_TRANS;
+ master->ste_ats_enabled =
+ FIELD_GET(STRTAB_STE_1_EATS, le64_to_cpu(target->data[1])) ==
+ STRTAB_STE_1_EATS_TRANS;
for (i = 0; i < master->num_streams; ++i) {
u32 sid = master->streams[i].id;
@@ -2795,10 +2798,36 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+static void arm_smmu_update_ste(struct arm_smmu_master *master,
+ struct iommu_domain *sid_domain,
+ bool ats_enabled)
+{
+ unsigned int s1dss = STRTAB_STE_1_S1DSS_TERMINATE;
+ struct arm_smmu_ste ste;
+
+ if (master->cd_table.in_ste && master->ste_ats_enabled == ats_enabled)
+ return;
+
+ if (sid_domain->type == IOMMU_DOMAIN_IDENTITY)
+ s1dss = STRTAB_STE_1_S1DSS_BYPASS;
+ else
+ WARN_ON(sid_domain->type != IOMMU_DOMAIN_BLOCKED);
+
+ /*
+ * Change the STE into a cdtable one with SID IDENTITY/BLOCKED behavior
+ * using s1dss if necessary. If the cd_table is already installed then
+ * the S1DSS is correct and this will just update the EATS. Otherwise it
+ * installs the entire thing. This will be hitless.
+ */
+ arm_smmu_make_cdtable_ste(&ste, master, ats_enabled, s1dss);
+ arm_smmu_install_ste_for_dev(master, &ste);
+}
+
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
const struct arm_smmu_cd *cd)
{
+ struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev);
struct arm_smmu_attach_state state = {
.master = master,
/*
@@ -2815,8 +2844,10 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (smmu_domain->smmu != master->smmu)
return -EINVAL;
- if (!master->cd_table.in_ste)
- return -ENODEV;
+ if (!master->cd_table.in_ste &&
+ sid_domain->type != IOMMU_DOMAIN_IDENTITY &&
+ sid_domain->type != IOMMU_DOMAIN_BLOCKED)
+ return -EINVAL;
cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
if (!cdptr)
@@ -2828,6 +2859,7 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
goto out_unlock;
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
+ arm_smmu_update_ste(master, sid_domain, state.ats_enabled);
arm_smmu_attach_commit(&state);
@@ -2850,6 +2882,19 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
arm_smmu_atc_inv_master(master, pasid);
arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid);
mutex_unlock(&arm_smmu_asid_lock);
+
+ /*
+ * When the last user of the CD table goes away, downgrade the STE back
+ * to a non-cd_table one.
+ */
+ if (!arm_smmu_ssids_in_use(&master->cd_table)) {
+ struct iommu_domain *sid_domain =
+ iommu_get_domain_for_dev(master->dev);
+
+ if (sid_domain->type == IOMMU_DOMAIN_IDENTITY ||
+ sid_domain->type == IOMMU_DOMAIN_BLOCKED)
+ sid_domain->ops->attach_dev(sid_domain, dev);
+ }
}
static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 30459a800c7b2d..cdd426efb384d2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -705,7 +705,8 @@ struct arm_smmu_master {
/* Locked by the iommu core using the group mutex */
struct arm_smmu_ctx_desc_cfg cd_table;
unsigned int num_streams;
- bool ats_enabled;
+ bool ats_enabled : 1;
+ bool ste_ats_enabled : 1;
bool stall_enabled;
bool sva_enabled;
bool iopf_enabled;
--
2.45.2
* [PATCH v9 14/14] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (12 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED Jason Gunthorpe
@ 2024-06-25 12:37 ` Jason Gunthorpe
2024-07-02 18:43 ` [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Will Deacon
14 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-06-25 12:37 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, Nicolin Chen, patches,
Shameerali Kolothum Thodi
The SVA cleanup made the SSID logic entirely general so all we need to do
is call it with the correct cd table entry for a S1 domain.
This is slightly tricky because of the ASID and how the locking works; the
simple fix is to just update the ASID once we hold the right locks.
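For context, a sketch of what this enables through the core API (the
function and variable usage here is illustrative; error handling is
abbreviated):

	/* Sketch: bind an existing S1 paging domain to a PASID, then undo */
	static int example_bind_s1(struct iommu_domain *s1_domain,
				   struct device *dev, ioasid_t pasid)
	{
		int ret = iommu_attach_device_pasid(s1_domain, dev, pasid);
		if (ret)
			return ret;	/* e.g. -EINVAL for a non-S1 domain */
		/* ... use the PASID ... */
		iommu_detach_device_pasid(s1_domain, dev, pasid);
		return 0;
	}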
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 ++++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
2 files changed, 41 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index fbe14466f5f379..1e8c996a4be0d2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2798,6 +2798,36 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t id)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
+ struct arm_smmu_cd target_cd;
+ int ret = 0;
+
+ mutex_lock(&smmu_domain->init_mutex);
+ if (!smmu_domain->smmu)
+ ret = arm_smmu_domain_finalise(smmu_domain, smmu);
+ else if (smmu_domain->smmu != smmu)
+ ret = -EINVAL;
+ mutex_unlock(&smmu_domain->init_mutex);
+ if (ret)
+ return ret;
+
+ if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+ return -EINVAL;
+
+ /*
+ * We can read cd.asid outside the lock because arm_smmu_set_pasid()
+ * will fix it
+ */
+ arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
+ return arm_smmu_set_pasid(master, to_smmu_domain(domain), id,
+ &target_cd);
+}
+
static void arm_smmu_update_ste(struct arm_smmu_master *master,
struct iommu_domain *sid_domain,
bool ats_enabled)
@@ -2825,7 +2855,7 @@ static void arm_smmu_update_ste(struct arm_smmu_master *master,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- const struct arm_smmu_cd *cd)
+ struct arm_smmu_cd *cd)
{
struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev);
struct arm_smmu_attach_state state = {
@@ -2858,6 +2888,14 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (ret)
goto out_unlock;
+ /*
+ * We don't want to take the asid_lock too early, so fix up the
+ * caller-set ASID under the lock in case it changed.
+ */
+ cd->data[0] &= ~cpu_to_le64(CTXDESC_CD_0_ASID);
+ cd->data[0] |= cpu_to_le64(
+ FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->cd.asid));
+
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
arm_smmu_update_ste(master, sid_domain, state.ats_enabled);
@@ -3376,6 +3414,7 @@ static struct iommu_ops arm_smmu_ops = {
.owner = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = arm_smmu_attach_dev,
+ .set_dev_pasid = arm_smmu_s1_set_dev_pasid,
.map_pages = arm_smmu_map_pages,
.unmap_pages = arm_smmu_unmap_pages,
.flush_iotlb_all = arm_smmu_flush_iotlb_all,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index cdd426efb384d2..91ec2d49ecbf2e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -801,7 +801,7 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- const struct arm_smmu_cd *cd);
+ struct arm_smmu_cd *cd);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
--
2.45.2
* Re: [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
2024-06-25 12:37 ` [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer Jason Gunthorpe
@ 2024-07-02 14:57 ` Will Deacon
2024-07-02 17:03 ` Nicolin Chen
2024-07-09 19:39 ` Jason Gunthorpe
0 siblings, 2 replies; 21+ messages in thread
From: Will Deacon @ 2024-07-02 14:57 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Eric Auger,
Jean-Philippe Brucker, Jerry Snitselaar, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameerali Kolothum Thodi
On Tue, Jun 25, 2024 at 09:37:33AM -0300, Jason Gunthorpe wrote:
> Add arm_smmu_set_pasid()/arm_smmu_remove_pasid() which are to be used by
> callers that already constructed the arm_smmu_cd they wish to program.
>
> These functions will encapsulate the shared logic to setup a CD entry that
> will be shared by SVA and S1 domain cases.
>
> Prior fixes had already moved most of this logic up into
> __arm_smmu_sva_bind(); move it to its final home.
>
> Following patches will relieve some of the remaining SVA restrictions:
>
> - The RID domain is a S1 domain and has already setup the STE to point to
> the CD table
> - The programmed PASID is the mm_get_enqcmd_pasid()
> - Nothing changes while SVA is running (sva_enable)
>
> SVA invalidation will still iterate over the S1 domain's master list,
> later patches will resolve that.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 57 ++++++++++---------
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 ++++++++++-
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 ++-
> 3 files changed, 67 insertions(+), 31 deletions(-)
[...]
> @@ -611,10 +599,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
> struct arm_smmu_bond *bond = NULL, *t;
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
>
> + arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
> +
> mutex_lock(&sva_lock);
> -
> - arm_smmu_clear_cd(master, id);
This looks a bit alarming, as you're effectively moving the CD
modification outside of the critical section. I assume we're relying on
the iommu group mutex to serialise this in the caller? I can't see any
consistent locking in the driver for arm_smmu_clear_cd().
As an additional patch, perhaps we should consider documenting what each
lock in the driver protects and the lock ordering requirements they
have? We've got a few global locks with generic names and after a few
rounds of refactoring it's really hard to know who's responsible for
what, especially now that we have stale comments referring to
arm_smmu_share_asid(). We've also grown a number of places where we
drop a lock in the callee and immediately re-take it in the caller,
which tends to be a source of bugs.
Thanks,
Will
* Re: [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
2024-07-02 14:57 ` Will Deacon
@ 2024-07-02 17:03 ` Nicolin Chen
2024-07-09 19:39 ` Jason Gunthorpe
1 sibling, 0 replies; 21+ messages in thread
From: Nicolin Chen @ 2024-07-02 17:03 UTC (permalink / raw)
To: Will Deacon
Cc: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
Robin Murphy, Eric Auger, Jean-Philippe Brucker, Jerry Snitselaar,
Moritz Fischer, Michael Shavit, patches,
Shameerali Kolothum Thodi
On Tue, Jul 02, 2024 at 03:57:05PM +0100, Will Deacon wrote:
> > @@ -611,10 +599,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
> > struct arm_smmu_bond *bond = NULL, *t;
> > struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> >
> > + arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
> > +
> > mutex_lock(&sva_lock);
> > -
> > - arm_smmu_clear_cd(master, id);
>
> This looks a bit alarming, as you're effectively moving the CD
> modification outside of the critical section. I assume we're relying on
> the iommu group mutex to serialise this in the caller? I can't see any
> consistent locking in the driver for arm_smmu_clear_cd().
Hi Will, your assumption is correct. Jason had the same remark
during the v7 review:
https://lore.kernel.org/linux-iommu/ZkDAL+TX93QfTFMc@nvidia.com/
Thanks
Nicolin
* Re: [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3)
2024-06-25 12:37 [PATCH v9 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (13 preceding siblings ...)
2024-06-25 12:37 ` [PATCH v9 14/14] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID Jason Gunthorpe
@ 2024-07-02 18:43 ` Will Deacon
14 siblings, 0 replies; 21+ messages in thread
From: Will Deacon @ 2024-07-02 18:43 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy,
Jason Gunthorpe
Cc: catalin.marinas, kernel-team, Will Deacon, Eric Auger,
Jean-Philippe Brucker, Jerry Snitselaar, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameerali Kolothum Thodi
On Tue, 25 Jun 2024 09:37:31 -0300, Jason Gunthorpe wrote:
> Continuing the work of part 1 and 2a this part focuses on the PASID and
> SVA code bringing these functional improvements:
>
> - attach_dev failure does not change the HW configuration.
>
> - Full PASID API support including:
> - S1/SVA domains attached to PASIDs
> - IDENTITY/BLOCKED/S1 attached to RID
> - Change of the RID domain while PASIDs are attached
>
> [...]
Applied to will (for-joerg/arm-smmu/updates), thanks!
[01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva()
https://git.kernel.org/will/c/678d79b98028
[02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
https://git.kernel.org/will/c/85f2fb6ef413
[03/14] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
https://git.kernel.org/will/c/ad10dce61303
[04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
https://git.kernel.org/will/c/7497f4211f4f
[05/14] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
https://git.kernel.org/will/c/64efb3def3a5
[06/14] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches
https://git.kernel.org/will/c/be7c90de39fd
[07/14] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface
https://git.kernel.org/will/c/1d5f34f0002f
[08/14] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
https://git.kernel.org/will/c/d7b2d2ba1b84
[09/14] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
https://git.kernel.org/will/c/49db2ed23c52
[10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
https://git.kernel.org/will/c/d38c28dbefee
[11/14] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used
https://git.kernel.org/will/c/ce26ea9e6e12
[12/14] iommu/arm-smmu-v3: Test the STE S1DSS functionality
https://git.kernel.org/will/c/3b5302cbb06a
[13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED
https://git.kernel.org/will/c/8ee9175c2582
[14/14] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
https://git.kernel.org/will/c/f3b273b7c7e4
Cheers,
--
Will
https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
* Re: [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
2024-07-02 14:57 ` Will Deacon
2024-07-02 17:03 ` Nicolin Chen
@ 2024-07-09 19:39 ` Jason Gunthorpe
2024-09-06 14:08 ` Will Deacon
1 sibling, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2024-07-09 19:39 UTC (permalink / raw)
To: Will Deacon
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Eric Auger,
Jean-Philippe Brucker, Jerry Snitselaar, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameerali Kolothum Thodi
On Tue, Jul 02, 2024 at 03:57:05PM +0100, Will Deacon wrote:
> > @@ -611,10 +599,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
> > struct arm_smmu_bond *bond = NULL, *t;
> > struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> >
> > + arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
> > +
> > mutex_lock(&sva_lock);
> > -
> > - arm_smmu_clear_cd(master, id);
>
> This looks a bit alarming, as you're effectively moving the CD
> modification outside of the critical section. I assume we're relying on
> the iommu group mutex to serialise this in the caller? I can't see any
> consistent locking in the driver for arm_smmu_clear_cd().
I see Nicolin got this - but yes, sva_lock has nothing to do with
CD. CD/STE has always been protected by the group_mutex.
> As an additional patch, perhaps we should consider documenting what each
> lock in the driver protects and the lock ordering requirements they
> have?
There is still a bunch of rework to do here; it may be better to
complete the rework than to try to document it, but let me know which
ones you are interested in and I'll write something.
sva_lock is almost gone; it just locks the IOPF flow on the master
because we are doing the IOPF flow in the wrong place. The IOPF
enable/disable should be done in attach under the group mutex.
init_mutex will be deleted once the iommu_domain_alloc_paging()
conversion is done.
asid_lock is a placeholder for the nascent BTM support. It locks
domain->asid only. The BTM patches make this per-instance instead of
global. Unfortunately that locking scheme doesn't work 100%, but I
have a notion of how to fix it.
asid_lock is also going to need some reconsidering when we make the
domain able to attach to multiple instances, which is something iommufd
wants.
devices_lock protects the device list only, excluding the special
nr_ats_masters thing.
Then the hidden group mutex makes all the ops touching the master
single-threaded, and the driver has always quietly relied on this. It
protects the STE/CD and parts of the master.
So I think if we end up with only the group mutex, asid_lock and
devices_lock, it is OK.
The order is group -> asid -> devices, which is laid out clearly in
the attach functions.
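A minimal sketch of that nesting, using the lock types the driver has
today (illustrative, not a quote of any one function):

	/* the iommu core's group mutex is already held at this point */
	mutex_lock(&arm_smmu_asid_lock);
	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
	/* ... walk or edit the devices list ... */
	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
	mutex_unlock(&arm_smmu_asid_lock);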
> We've got a few global locks with generic names and after a few
> rounds of refactoring it's really hard to know who's responsible for
> what, especially now that we have stale comments referring to
> arm_smmu_share_asid().
> We've also grown a number of places where we
> drop a lock in the callee and immediately re-take it in the caller,
> which tends to be a source of bugs.
Do we? Can you point to what you noticed?
Thanks,
Jason
* Re: [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
2024-07-09 19:39 ` Jason Gunthorpe
@ 2024-09-06 14:08 ` Will Deacon
2024-10-07 17:43 ` Jason Gunthorpe
0 siblings, 1 reply; 21+ messages in thread
From: Will Deacon @ 2024-09-06 14:08 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Eric Auger,
Jean-Philippe Brucker, Jerry Snitselaar, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameerali Kolothum Thodi
Hi Jason,
Sorry, it's taken me ages to get back to this after applying the series.
On Tue, Jul 09, 2024 at 04:39:05PM -0300, Jason Gunthorpe wrote:
> On Tue, Jul 02, 2024 at 03:57:05PM +0100, Will Deacon wrote:
>
> > > @@ -611,10 +599,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
> > > struct arm_smmu_bond *bond = NULL, *t;
> > > struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > >
> > > + arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
> > > +
> > > mutex_lock(&sva_lock);
> > > -
> > > - arm_smmu_clear_cd(master, id);
> >
> > This looks a bit alarming, as you're effectively moving the CD
> > modification outside of the critical section. I assume we're relying on
> > the iommu group mutex to serialise this in the caller? I can't see any
> > consistent locking in the driver for arm_smmu_clear_cd().
>
> I see Nicolin got this - but yes, sva_lock has nothing to do with
> CD. CD/STE has always been protected by the group_mutex.
>
> > As an additional patch, perhaps we should consider documenting what each
> > lock in the driver protects and the lock ordering requirements they
> > have?
>
> There is still a bunch of rework to do here; it may be better to
> complete the rework than to try to document it, but let me know which
> ones you are interested in and I'll write something.
I think listing the locks we have in the driver and describing both
what they protect and the ordering between them would be really helpful.
> sva_lock is almost gone; it just locks the IOPF flow on the master
> because we are doing the IOPF flow in the wrong place. The IOPF
> enable/disable should be done in attach under the group mutex.
>
> init_mutex will be deleted once the iommu_domain_alloc_paging()
> conversion is done.
>
> asid_lock is a placeholder for the nascent BTM support. It locks
> domain->asid only. The BTM patches make this per-instance instead of
> global. Unfortunately that locking scheme doesn't work 100%, but I
> have a notion of how to fix it.
>
> asid_lock is also going to need some reconsidering when we make the
> domain able to attach to multiple instances, which is something iommufd
> wants.
>
> devices_lock protects the device list only, excluding the special
> nr_ats_masters thing.
>
> Then the hidden group mutex makes all the ops touching the master
> single-threaded, and the driver has always quietly relied on this. It
> protects the STE/CD and parts of the master.
>
> So I think if we end up with only the group mutex, asid_lock and
> devices_lock, it is OK.
>
> The order is group -> asid -> devices, which is laid out clearly in
> the attach functions.
>
> > We've got a few global locks with generic names and after a few
> > rounds of refactoring it's really hard to know who's responsible for
> > what, especially now that we have stale comments referring to
> > arm_smmu_share_asid().
>
> > We've also grown a number of places where we
> > drop a lock in the callee and immediately re-take it in the caller,
> > which tends to be a source of bugs.
>
> Do we? Can you point to what you noticed?
As I recall, I just noticed that:
- We have a bunch of comments around 'arm_smmu_asid_lock' that refer
to arm_smmu_share_asid(), which no longer exists.
- arm_smmu_remove_dev_pasid() drops the asid_lock only to have it retaken
in the callee via ->attach_dev().
- arm_smmu_attach_dev() takes/drops/re-takes the devices_lock indirectly
when it calls arm_smmu_attach_prepare() and arm_smmu_attach_commit().
- arm_smmu_attach_dev() takes/drops 'arm_smmu_asid_lock' via
arm_smmu_domain_finalise()) and then re-takes it before the attach.
Please note, I'm not saying that there's a bug here, just that it would
be easier to work with if we had some documentation and lock ordering
assertions.
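For example, something along these lines (a sketch of the kind of
assertion meant here; the helper name and its placement are hypothetical):

	static void arm_smmu_assert_attach_locks(struct arm_smmu_master *master)
	{
		/* STE/CD writes rely on the iommu core's group mutex */
		iommu_group_mutex_assert(master->dev);
		/* ASID state additionally requires arm_smmu_asid_lock */
		lockdep_assert_held(&arm_smmu_asid_lock);
	}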
Will
* Re: [PATCH v9 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
2024-09-06 14:08 ` Will Deacon
@ 2024-10-07 17:43 ` Jason Gunthorpe
0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2024-10-07 17:43 UTC (permalink / raw)
To: Will Deacon
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Eric Auger,
Jean-Philippe Brucker, Jerry Snitselaar, Moritz Fischer,
Michael Shavit, Nicolin Chen, patches, Shameerali Kolothum Thodi
On Fri, Sep 06, 2024 at 03:08:54PM +0100, Will Deacon wrote:
> - We have a bunch of comments around 'arm_smmu_asid_lock' that refer
> to arm_smmu_share_asid(), which no longer exists.
So, we can go ahead and put back the BTM support in the pre-existing
slightly racy form. I think we can turn it on for guest support by
operating it only if there is no S2 support in HW, which would make it
justified.
Or we could delete the 'arm_smmu_asid_lock' for now.
My idea to fix the race goes through making the domains sharable
across instances because that will change the invalidation in a way
that lets double invalidation happen during the ASID change.
I was planning to wait for that, but it looks like it will take more
time. It is linked to the viommu work, which needs to go first.
> - arm_smmu_attach_dev() takes/drops/re-takes the devices_lock indirectly
> when it calls arm_smmu_attach_prepare() and arm_smmu_attach_commit().
This isn't retaking a lock; the operation being locked ran to
completion, and we just need to run two operations.
> - arm_smmu_attach_dev() takes/drops 'arm_smmu_asid_lock' via
> arm_smmu_domain_finalise()) and then re-takes it before the attach.
Soon, finalise won't be called from arm_smmu_attach_dev() at all; they
are unrelated operations.
> Please note, I'm not saying that there's a bug here, just that it would
> be easier to work with if we had some documentation and lock ordering
> assertions.
There are more assertions now than there were; I think most of the new
code has them already, excluding the group mutex.
Jason