* [PATCH v8 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva()
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 3:47 ` Nicolin Chen
2024-06-18 17:27 ` Jerry Snitselaar
2024-06-04 0:15 ` [PATCH v8 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer Jason Gunthorpe
` (14 subsequent siblings)
15 siblings, 2 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameer Kolothum
This allows the driver to receive the mm and always a device during
allocation. Later patches need this to properly set up the notifier when
the domain is first allocated.
Remove ops->domain_alloc() as SVA was the only remaining purpose.
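For context, a rough sketch of the core-side hook this converts to,
simplified from the iommu core's SVA allocation path that this series
builds on (not part of this patch; error handling trimmed):

  struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
                                              struct mm_struct *mm)
  {
          const struct iommu_ops *ops = dev_iommu_ops(dev);
          struct iommu_domain *domain;

          /* The driver now sees both the device and the mm at allocation */
          domain = ops->domain_alloc_sva(dev, mm);
          if (IS_ERR(domain))
                  return domain;

          domain->type = IOMMU_DOMAIN_SVA;
          domain->mm = mm;
          return domain;
  }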
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 6 ++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 +---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 +++-----
3 files changed, 8 insertions(+), 16 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index e490ffb3801545..28f8bf4327f69a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -656,13 +656,15 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
.free = arm_smmu_sva_domain_free
};
-struct iommu_domain *arm_smmu_sva_domain_alloc(void)
+struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
+ struct mm_struct *mm)
{
struct iommu_domain *domain;
domain = kzalloc(sizeof(*domain), GFP_KERNEL);
if (!domain)
- return NULL;
+ return ERR_PTR(-ENOMEM);
+ domain->type = IOMMU_DOMAIN_SVA;
domain->ops = &arm_smmu_sva_domain_ops;
return domain;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ab415e107054c1..bd79422f7b6f50 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2237,14 +2237,6 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
}
}
-static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
-{
-
- if (type == IOMMU_DOMAIN_SVA)
- return arm_smmu_sva_domain_alloc();
- return ERR_PTR(-EOPNOTSUPP);
-}
-
static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
{
struct arm_smmu_domain *smmu_domain;
@@ -3097,8 +3089,8 @@ static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
.capable = arm_smmu_capable,
- .domain_alloc = arm_smmu_domain_alloc,
.domain_alloc_paging = arm_smmu_domain_alloc_paging,
+ .domain_alloc_sva = arm_smmu_sva_domain_alloc,
.probe_device = arm_smmu_probe_device,
.release_device = arm_smmu_release_device,
.device_group = arm_smmu_device_group,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1242a086c9f948..b10712d3de66a9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -802,7 +802,8 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
void arm_smmu_sva_notifier_synchronize(void);
-struct iommu_domain *arm_smmu_sva_domain_alloc(void);
+struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
+ struct mm_struct *mm);
void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id);
#else /* CONFIG_ARM_SMMU_V3_SVA */
@@ -838,10 +839,7 @@ static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master
static inline void arm_smmu_sva_notifier_synchronize(void) {}
-static inline struct iommu_domain *arm_smmu_sva_domain_alloc(void)
-{
- return NULL;
-}
+#define arm_smmu_sva_domain_alloc NULL
static inline void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct device *dev,
--
2.45.2
* Re: [PATCH v8 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva()
2024-06-04 0:15 ` [PATCH v8 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva() Jason Gunthorpe
@ 2024-06-04 3:47 ` Nicolin Chen
2024-06-18 17:27 ` Jerry Snitselaar
1 sibling, 0 replies; 28+ messages in thread
From: Nicolin Chen @ 2024-06-04 3:47 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
patches, Shameer Kolothum
On Mon, Jun 03, 2024 at 09:15:46PM -0300, Jason Gunthorpe wrote:
> This allows the driver to receive the mm and always a device during
> allocation. Later patches need this to properly set up the notifier when
> the domain is first allocated.
>
> Remove ops->domain_alloc() as SVA was the only remaining purpose.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Reviewed-by: Michael Shavit <mshavit@google.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
* Re: [PATCH v8 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva()
2024-06-04 0:15 ` [PATCH v8 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva() Jason Gunthorpe
2024-06-04 3:47 ` Nicolin Chen
@ 2024-06-18 17:27 ` Jerry Snitselaar
1 sibling, 0 replies; 28+ messages in thread
From: Jerry Snitselaar @ 2024-06-18 17:27 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameer Kolothum
On Mon, Jun 03, 2024 at 09:15:46PM GMT, Jason Gunthorpe wrote:
> This allows the driver to receive the mm and always a device during
> allocation. Later patches need this to properly set up the notifier when
> the domain is first allocated.
>
> Remove ops->domain_alloc() as SVA was the only remaining purpose.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Reviewed-by: Michael Shavit <mshavit@google.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
* [PATCH v8 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva() Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 5:07 ` Nicolin Chen
2024-06-04 0:15 ` [PATCH v8 03/14] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list Jason Gunthorpe
` (13 subsequent siblings)
15 siblings, 1 reply; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameerali Kolothum Thodi
Add arm_smmu_set_pasid()/arm_smmu_remove_pasid() which are to be used by
callers that already constructed the arm_smmu_cd they wish to program.
These functions will encapsulate the shared logic to set up a CD entry that
will be shared by SVA and S1 domain cases.
Prior fixes had already moved most of this logic up into
__arm_smmu_sva_bind(); move it to its final home.
Following patches will relieve some of the remaining SVA restrictions:
- The RID domain is an S1 domain and has already set up the STE to point to
the CD table
- The programmed PASID is the mm_get_enqcmd_pasid()
- Nothing changes while SVA is running (sva_enable)
SVA invalidation will still iterate over the S1 domain's master list;
later patches will resolve that.
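With these helpers the SVA set_dev_pasid path reduces to composing a CD
and handing it off, roughly (condensed from the diff below; locking and
error unwind elided):

  struct arm_smmu_cd target;

  bond = __arm_smmu_sva_bind(dev, mm);
  arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
  ret = arm_smmu_set_pasid(master, NULL, id, &target);

  /* and the teardown side in remove_dev_pasid: */
  arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);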
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 57 ++++++++++---------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 ++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 ++-
3 files changed, 67 insertions(+), 31 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 28f8bf4327f69a..71ca87c2c5c3b6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -417,29 +417,27 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
arm_smmu_free_shared_cd(cd);
}
-static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
- struct mm_struct *mm)
+static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev,
+ struct mm_struct *mm)
{
int ret;
- struct arm_smmu_cd target;
- struct arm_smmu_cd *cdptr;
struct arm_smmu_bond *bond;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct arm_smmu_domain *smmu_domain;
if (!(domain->type & __IOMMU_DOMAIN_PAGING))
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
smmu_domain = to_smmu_domain(domain);
if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
if (!master || !master->sva_enabled)
- return -ENODEV;
+ return ERR_PTR(-ENODEV);
bond = kzalloc(sizeof(*bond), GFP_KERNEL);
if (!bond)
- return -ENOMEM;
+ return ERR_PTR(-ENOMEM);
bond->mm = mm;
@@ -449,22 +447,12 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
goto err_free_bond;
}
- cdptr = arm_smmu_alloc_cd_ptr(master, mm_get_enqcmd_pasid(mm));
- if (!cdptr) {
- ret = -ENOMEM;
- goto err_put_notifier;
- }
- arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- arm_smmu_write_cd_entry(master, pasid, cdptr, &target);
-
list_add(&bond->list, &master->bonds);
- return 0;
+ return bond;
-err_put_notifier:
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
err_free_bond:
kfree(bond);
- return ret;
+ return ERR_PTR(ret);
}
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
@@ -611,10 +599,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
struct arm_smmu_bond *bond = NULL, *t;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
+
mutex_lock(&sva_lock);
-
- arm_smmu_clear_cd(master, id);
-
list_for_each_entry(t, &master->bonds, list) {
if (t->mm == mm) {
bond = t;
@@ -633,17 +620,33 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id)
{
- int ret = 0;
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct mm_struct *mm = domain->mm;
+ struct arm_smmu_bond *bond;
+ struct arm_smmu_cd target;
+ int ret;
if (mm_get_enqcmd_pasid(mm) != id)
return -EINVAL;
mutex_lock(&sva_lock);
- ret = __arm_smmu_sva_bind(dev, id, mm);
- mutex_unlock(&sva_lock);
+ bond = __arm_smmu_sva_bind(dev, mm);
+ if (IS_ERR(bond)) {
+ mutex_unlock(&sva_lock);
+ return PTR_ERR(bond);
+ }
- return ret;
+ arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
+ ret = arm_smmu_set_pasid(master, NULL, id, &target);
+ if (ret) {
+ list_del(&bond->list);
+ arm_smmu_mmu_notifier_put(bond->smmu_mn);
+ kfree(bond);
+ mutex_unlock(&sva_lock);
+ return ret;
+ }
+ mutex_unlock(&sva_lock);
+ return 0;
}
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bd79422f7b6f50..3f19436fe86a37 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1211,8 +1211,8 @@ struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES];
}
-struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
- u32 ssid)
+static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
+ u32 ssid)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -2412,6 +2412,10 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
int i, j;
struct arm_smmu_device *smmu = master->smmu;
+ master->cd_table.in_ste =
+ FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(target->data[0])) ==
+ STRTAB_STE_0_CFG_S1_TRANS;
+
for (i = 0; i < master->num_streams; ++i) {
u32 sid = master->streams[i].id;
struct arm_smmu_ste *step =
@@ -2632,6 +2636,30 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+int arm_smmu_set_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
+ const struct arm_smmu_cd *cd)
+{
+ struct arm_smmu_cd *cdptr;
+
+ /* The core code validates pasid */
+
+ if (!master->cd_table.in_ste)
+ return -ENODEV;
+
+ cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
+ if (!cdptr)
+ return -ENOMEM;
+ arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
+ return 0;
+}
+
+void arm_smmu_remove_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
+{
+ arm_smmu_clear_cd(master, pasid);
+}
+
static int arm_smmu_attach_dev_ste(struct device *dev,
struct arm_smmu_ste *ste)
{
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index b10712d3de66a9..6a74d3d884fe8d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -602,6 +602,7 @@ struct arm_smmu_ctx_desc_cfg {
dma_addr_t cdtab_dma;
struct arm_smmu_l1_ctx_desc *l1_desc;
unsigned int num_l1_ents;
+ u8 in_ste;
u8 s1fmt;
/* log2 of the maximum number of CDs supported by this table */
u8 s1cdmax;
@@ -777,8 +778,6 @@ extern struct mutex arm_smmu_asid_lock;
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
-struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
- u32 ssid);
void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain);
@@ -786,6 +785,12 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target);
+int arm_smmu_set_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
+ const struct arm_smmu_cd *cd);
+void arm_smmu_remove_pasid(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain, ioasid_t pasid);
+
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
--
2.45.2
* Re: [PATCH v8 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer
2024-06-04 0:15 ` [PATCH v8 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer Jason Gunthorpe
@ 2024-06-04 5:07 ` Nicolin Chen
0 siblings, 0 replies; 28+ messages in thread
From: Nicolin Chen @ 2024-06-04 5:07 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
patches, Shameerali Kolothum Thodi
On Mon, Jun 03, 2024 at 09:15:47PM -0300, Jason Gunthorpe wrote:
> Add arm_smmu_set_pasid()/arm_smmu_remove_pasid() which are to be used by
> callers that already constructed the arm_smmu_cd they wish to program.
>
> These functions will encapsulate the shared logic to set up a CD entry that
> will be shared by SVA and S1 domain cases.
>
> Prior fixes had already moved most of this logic up into
> __arm_smmu_sva_bind(); move it to its final home.
>
> Following patches will relieve some of the remaining SVA restrictions:
>
> - The RID domain is an S1 domain and has already set up the STE to point to
> the CD table
> - The programmed PASID is the mm_get_enqcmd_pasid()
> - Nothing changes while SVA is running (sva_enable)
>
> SVA invalidation will still iterate over the S1 domain's master list;
> later patches will resolve that.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
* [PATCH v8 03/14] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 01/14] iommu/arm-smmu-v3: Convert to domain_alloc_sva() Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 02/14] iommu/arm-smmu-v3: Start building a generic PASID layer Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS Jason Gunthorpe
` (12 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameer Kolothum
The next patch will need to store the same master twice (with different
SSIDs), so allocate memory for each list element.
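Concretely, each attachment becomes its own small allocation, and the
paths that used to walk masters directly now walk these elements
(condensed from the diff below):

  struct arm_smmu_master_domain {
          struct list_head devices_elm;   /* entry on smmu_domain->devices */
          struct arm_smmu_master *master;
  };

  list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) {
          struct arm_smmu_master *master = master_domain->master;
          /* per-master work, e.g. ATC invalidation */
  }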
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 11 ++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 39 ++++++++++++++++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +++-
3 files changed, 47 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 71ca87c2c5c3b6..cb3a0e4143c84a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -38,12 +38,13 @@ static DEFINE_MUTEX(sva_lock);
static void
arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
{
- struct arm_smmu_master *master;
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_cd target_cd;
unsigned long flags;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd *cdptr;
/* S1 domains only support RID attachment right now */
@@ -301,7 +302,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
- struct arm_smmu_master *master;
+ struct arm_smmu_master_domain *master_domain;
unsigned long flags;
mutex_lock(&sva_lock);
@@ -315,7 +316,9 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
* but disable translation.
*/
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd target;
struct arm_smmu_cd *cdptr;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 3f19436fe86a37..532fe17f28bfe5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2015,10 +2015,10 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size)
{
+ struct arm_smmu_master_domain *master_domain;
int i;
unsigned long flags;
struct arm_smmu_cmdq_ent cmd;
- struct arm_smmu_master *master;
struct arm_smmu_cmdq_batch cmds;
if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
@@ -2046,7 +2046,10 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
cmds.num = 0;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master, &smmu_domain->devices, domain_head) {
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ struct arm_smmu_master *master = master_domain->master;
+
if (!master->ats_enabled)
continue;
@@ -2534,9 +2537,26 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
pci_disable_pasid(pdev);
}
+static struct arm_smmu_master_domain *
+arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_master *master)
+{
+ struct arm_smmu_master_domain *master_domain;
+
+ lockdep_assert_held(&smmu_domain->devices_lock);
+
+ list_for_each_entry(master_domain, &smmu_domain->devices,
+ devices_elm) {
+ if (master_domain->master == master)
+ return master_domain;
+ }
+ return NULL;
+}
+
static void arm_smmu_detach_dev(struct arm_smmu_master *master)
{
struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_domain *smmu_domain;
unsigned long flags;
@@ -2547,7 +2567,11 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
arm_smmu_disable_ats(master, smmu_domain);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_del_init(&master->domain_head);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master);
+ if (master_domain) {
+ list_del(&master_domain->devices_elm);
+ kfree(master_domain);
+ }
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
master->ats_enabled = false;
@@ -2561,6 +2585,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct arm_smmu_master_domain *master_domain;
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
@@ -2597,6 +2622,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return -ENOMEM;
}
+ master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
+ if (!master_domain)
+ return -ENOMEM;
+ master_domain->master = master;
+
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
* of either the old or new domain while we are working on it.
@@ -2610,7 +2640,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
master->ats_enabled = arm_smmu_ats_supported(master);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_add(&master->domain_head, &smmu_domain->devices);
+ list_add(&master_domain->devices_elm, &smmu_domain->devices);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
switch (smmu_domain->stage) {
@@ -2925,7 +2955,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
master->dev = dev;
master->smmu = smmu;
INIT_LIST_HEAD(&master->bonds);
- INIT_LIST_HEAD(&master->domain_head);
dev_iommu_priv_set(dev, master);
ret = arm_smmu_insert_master(smmu, master);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 6a74d3d884fe8d..01769b5286a83a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -697,7 +697,6 @@ struct arm_smmu_stream {
struct arm_smmu_master {
struct arm_smmu_device *smmu;
struct device *dev;
- struct list_head domain_head;
struct arm_smmu_stream *streams;
/* Locked by the iommu core using the group mutex */
struct arm_smmu_ctx_desc_cfg cd_table;
@@ -731,6 +730,7 @@ struct arm_smmu_domain {
struct iommu_domain domain;
+ /* List of struct arm_smmu_master_domain */
struct list_head devices;
spinlock_t devices_lock;
@@ -767,6 +767,11 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
u16 asid);
#endif
+struct arm_smmu_master_domain {
+ struct list_head devices_elm;
+ struct arm_smmu_master *master;
+};
+
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
{
return container_of(dom, struct arm_smmu_domain, domain);
--
2.45.2
* [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (2 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 03/14] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 6:17 ` Nicolin Chen
2024-06-19 10:20 ` Michael Shavit
2024-06-04 0:15 ` [PATCH v8 05/14] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain Jason Gunthorpe
` (11 subsequent siblings)
15 siblings, 2 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameerali Kolothum Thodi
The core code allows the domain to be changed on the fly without a forced
stop in BLOCKED/IDENTITY. In this flow the driver should just continually
maintain the ATS with no change while the STE is updated.
ATS relies on a linked list smmu_domain->devices to keep track of which
masters have the domain programmed, but this list is also used by
arm_smmu_share_asid(), unrelated to ATS.
Create two new functions to encapsulate this combined logic:
arm_smmu_attach_prepare()
<caller generates and sets the STE>
arm_smmu_attach_commit()
The two functions sequence both the enabling and disabling of ATS across
the STE store. Have every update of the STE use this sequence, as
sketched below.
Installing an S1/S2 domain always enables ATS if the PCIe device
supports it.
The enable flow is now ordered differently to allow it to be hitless:
1) Add the master to the new smmu_domain->devices list
2) Program the STE
3) Enable ATS at PCIe
4) Remove the master from the old smmu_domain
This flow ensures that invalidations to either domain will generate an ATC
invalidation to the device while the STE is being switched. Thus we don't
need to turn off the ATS anymore for correctness.
The disable flow is the reverse:
1) Disable ATS at PCIe
2) Program the STE
3) Invalidate the ATC
4) Remove the master from the old smmu_domain
Move the nr_ats_masters adjustments to be close to the list
manipulations. It is a count of the number of ATS-enabled masters
currently in the list. The adjustments happen strictly before and after
the STE/CD are revised, and are done under the list's spin_lock.
This is part of the bigger picture to allow changing the RID domain while
a PASID is in use. If a SVA PASID is relying on ATS to function then
changing the RID domain cannot just temporarily toggle ATS off without
also wrecking the SVA PASID. The new infrastructure here is organized so
that the PASID attach/detach flows will make use of it as well in
following patches.
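Condensed from arm_smmu_attach_dev() in the diff below, the sequence is
roughly (error handling and the S1/S2 STE construction elided):

  struct arm_smmu_attach_state state = {
          .master = master,
          .old_domain = iommu_get_domain_for_dev(dev),
  };

  mutex_lock(&arm_smmu_asid_lock);
  arm_smmu_attach_prepare(&state, domain);  /* joins the new devices list */
  /* build the STE with STE.EATS set according to state.ats_enabled */
  arm_smmu_install_ste_for_dev(master, &target);
  arm_smmu_attach_commit(&state);           /* syncs ATC, leaves old list */
  mutex_unlock(&arm_smmu_asid_lock);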
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 5 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 238 +++++++++++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 +-
3 files changed, 178 insertions(+), 71 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index 315e487fd990eb..a460b71f585789 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -164,7 +164,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
.smmu = &smmu,
};
- arm_smmu_make_cdtable_ste(ste, &master);
+ arm_smmu_make_cdtable_ste(ste, &master, true);
}
static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test)
@@ -231,7 +231,6 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
{
struct arm_smmu_master master = {
.smmu = &smmu,
- .ats_enabled = ats_enabled,
};
struct io_pgtable io_pgtable = {};
struct arm_smmu_domain smmu_domain = {
@@ -247,7 +246,7 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.sl = 3;
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.tsz = 4;
- arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain);
+ arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain, ats_enabled);
}
static void arm_smmu_v3_write_ste_test_s2_to_abort(struct kunit *test)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 532fe17f28bfe5..24f42ff39f77a9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1538,7 +1538,7 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_bypass_ste);
VISIBLE_IF_KUNIT
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master)
+ struct arm_smmu_master *master, bool ats_enabled)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -1561,7 +1561,7 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
STRTAB_STE_1_S1STALLD :
0) |
FIELD_PREP(STRTAB_STE_1_EATS,
- master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
if (smmu->features & ARM_SMMU_FEAT_E2H) {
/*
@@ -1591,7 +1591,8 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_cdtable_ste);
VISIBLE_IF_KUNIT
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
+ struct arm_smmu_domain *smmu_domain,
+ bool ats_enabled)
{
struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
const struct io_pgtable_cfg *pgtbl_cfg =
@@ -1608,7 +1609,7 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
target->data[1] = cpu_to_le64(
FIELD_PREP(STRTAB_STE_1_EATS,
- master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
if (smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR)
target->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
@@ -2450,22 +2451,16 @@ static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
return dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev));
}
-static void arm_smmu_enable_ats(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
+static void arm_smmu_enable_ats(struct arm_smmu_master *master)
{
size_t stu;
struct pci_dev *pdev;
struct arm_smmu_device *smmu = master->smmu;
- /* Don't enable ATS at the endpoint if it's not enabled in the STE */
- if (!master->ats_enabled)
- return;
-
/* Smallest Translation Unit: log2 of the smallest supported granule */
stu = __ffs(smmu->pgsize_bitmap);
pdev = to_pci_dev(master->dev);
- atomic_inc(&smmu_domain->nr_ats_masters);
/*
* ATC invalidation of PASID 0 causes the entire ATC to be flushed.
*/
@@ -2474,22 +2469,6 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master,
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
-static void arm_smmu_disable_ats(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
-{
- if (!master->ats_enabled)
- return;
-
- pci_disable_ats(to_pci_dev(master->dev));
- /*
- * Ensure ATS is disabled at the endpoint before we issue the
- * ATC invalidation via the SMMU.
- */
- wmb();
- arm_smmu_atc_inv_master(master);
- atomic_dec(&smmu_domain->nr_ats_masters);
-}
-
static int arm_smmu_enable_pasid(struct arm_smmu_master *master)
{
int ret;
@@ -2553,46 +2532,182 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
return NULL;
}
-static void arm_smmu_detach_dev(struct arm_smmu_master *master)
+/*
+ * If the domain uses the smmu_domain->devices list return the arm_smmu_domain
+ * structure, otherwise NULL. These domains track attached devices so they can
+ * issue invalidations.
+ */
+static struct arm_smmu_domain *
+to_smmu_domain_devices(struct iommu_domain *domain)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
+ /* The domain can be NULL only when processing the first attach */
+ if (!domain)
+ return NULL;
+ if (domain->type & __IOMMU_DOMAIN_PAGING)
+ return to_smmu_domain(domain);
+ return NULL;
+}
+
+static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
+ struct iommu_domain *domain)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
struct arm_smmu_master_domain *master_domain;
- struct arm_smmu_domain *smmu_domain;
unsigned long flags;
- if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
+ if (!smmu_domain)
return;
- smmu_domain = to_smmu_domain(domain);
- arm_smmu_disable_ats(master, smmu_domain);
-
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
master_domain = arm_smmu_find_master_domain(smmu_domain, master);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
+ if (master->ats_enabled)
+ atomic_dec(&smmu_domain->nr_ats_masters);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+}
- master->ats_enabled = false;
+struct arm_smmu_attach_state {
+ /* Inputs */
+ struct iommu_domain *old_domain;
+ struct arm_smmu_master *master;
+ /* Resulting state */
+ bool ats_enabled;
+};
+
+/*
+ * Start the sequence to attach a domain to a master. The sequence contains three
+ * steps:
+ * arm_smmu_attach_prepare()
+ * arm_smmu_install_ste_for_dev()
+ * arm_smmu_attach_commit()
+ *
+ * If prepare succeeds then the sequence must be completed. The STE installed
+ * must set the STE.EATS field according to state.ats_enabled.
+ *
+ * ATS is automatically enabled if the underlying device supports it.
+ * disable_ats can inhibit this to support STEs like bypass that don't allow
+ * ATS.
+ *
+ * The change of the EATS in the STE and the PCI ATS config space is managed by
+ * this sequence to be in the right order such that if PCI ATS is enabled then
+ * STE.EATS is enabled.
+ *
+ * new_domain can be NULL if the domain being attached does not have a page
+ * table and does not require invalidation tracking, and does not support ATS.
+ */
+static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
+ struct iommu_domain *new_domain)
+{
+ struct arm_smmu_master *master = state->master;
+ struct arm_smmu_master_domain *master_domain;
+ struct arm_smmu_domain *smmu_domain =
+ to_smmu_domain_devices(new_domain);
+ unsigned long flags;
+
+ /*
+ * arm_smmu_share_asid() must not see two domains pointing to the same
+ * arm_smmu_master_domain contents otherwise it could randomly write one
+ * or the other to the CD.
+ */
+ lockdep_assert_held(&arm_smmu_asid_lock);
+
+ if (smmu_domain) {
+ /*
+ * The SMMU does not support enabling ATS with bypass/abort.
+ * When the STE is in bypass (STE.Config[2:0] == 0b100), ATS
+ * Translation Requests and Translated transactions are denied
+ * as though ATS is disabled for the stream (STE.EATS == 0b00),
+ * causing F_BAD_ATS_TREQ and F_TRANSL_FORBIDDEN events
+ * (IHI0070Ea 5.2 Stream Table Entry). Thus ATS can only be
+ * enabled if we have arm_smmu_domain, those always have page
+ * tables.
+ */
+ state->ats_enabled = arm_smmu_ats_supported(master);
+
+ master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
+ if (!master_domain)
+ return -ENOMEM;
+ master_domain->master = master;
+
+ /*
+ * During prepare we want the current smmu_domain and new
+ * smmu_domain to be in the devices list before we change any
+ * HW. This ensures that both domains will send ATS
+ * invalidations to the master until we are done.
+ *
+ * It is tempting to make this list only track masters that are
+ * using ATS, but arm_smmu_share_asid() also uses this to change
+ * the ASID of a domain, unrelated to ATS.
+ *
+ * Notice if we are re-attaching the same domain then the list
+ * will have two identical entries and commit will remove only
+ * one of them.
+ */
+ spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+ if (state->ats_enabled)
+ atomic_inc(&smmu_domain->nr_ats_masters);
+ list_add(&master_domain->devices_elm, &smmu_domain->devices);
+ spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+ }
+
+ if (!state->ats_enabled && master->ats_enabled) {
+ pci_disable_ats(to_pci_dev(master->dev));
+ /*
+ * This is probably overkill, but the config write for disabling
+ * ATS should complete before the STE is configured to generate
+ * UR to avoid AER noise.
+ */
+ wmb();
+ }
+ return 0;
+}
+
+/*
+ * Commit is done after the STE/CD are configured with the EATS setting. It
+ * completes synchronizing the PCI device's ATC and finishes manipulating the
+ * smmu_domain->devices list.
+ */
+static void arm_smmu_attach_commit(struct arm_smmu_attach_state *state)
+{
+ struct arm_smmu_master *master = state->master;
+
+ lockdep_assert_held(&arm_smmu_asid_lock);
+
+ if (state->ats_enabled && !master->ats_enabled) {
+ arm_smmu_enable_ats(master);
+ } else if (master->ats_enabled) {
+ /*
+ * The translation has changed, flush the ATC. At this point the
+ * SMMU is translating for the new domain and both the old&new
+ * domain will issue invalidations.
+ */
+ arm_smmu_atc_inv_master(master);
+ }
+ master->ats_enabled = state->ats_enabled;
+
+ arm_smmu_remove_master_domain(master, state->old_domain);
}
static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
{
int ret = 0;
- unsigned long flags;
struct arm_smmu_ste target;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
- struct arm_smmu_master_domain *master_domain;
+ struct arm_smmu_attach_state state = {
+ .old_domain = iommu_get_domain_for_dev(dev),
+ };
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
if (!fwspec)
return -ENOENT;
- master = dev_iommu_priv_get(dev);
+ state.master = master = dev_iommu_priv_get(dev);
smmu = master->smmu;
/*
@@ -2622,11 +2737,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return -ENOMEM;
}
- master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
- if (!master_domain)
- return -ENOMEM;
- master_domain->master = master;
-
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
* of either the old or new domain while we are working on it.
@@ -2635,13 +2745,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
*/
mutex_lock(&arm_smmu_asid_lock);
- arm_smmu_detach_dev(master);
-
- master->ats_enabled = arm_smmu_ats_supported(master);
-
- spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_add(&master_domain->devices_elm, &smmu_domain->devices);
- spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+ ret = arm_smmu_attach_prepare(&state, domain);
+ if (ret) {
+ mutex_unlock(&arm_smmu_asid_lock);
+ return ret;
+ }
switch (smmu_domain->stage) {
case ARM_SMMU_DOMAIN_S1: {
@@ -2650,18 +2758,19 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
&target_cd);
- arm_smmu_make_cdtable_ste(&target, master);
+ arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled);
arm_smmu_install_ste_for_dev(master, &target);
break;
}
case ARM_SMMU_DOMAIN_S2:
- arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
+ arm_smmu_make_s2_domain_ste(&target, master, smmu_domain,
+ state.ats_enabled);
arm_smmu_install_ste_for_dev(master, &target);
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
break;
}
- arm_smmu_enable_ats(master, smmu_domain);
+ arm_smmu_attach_commit(&state);
mutex_unlock(&arm_smmu_asid_lock);
return 0;
}
@@ -2690,10 +2799,14 @@ void arm_smmu_remove_pasid(struct arm_smmu_master *master,
arm_smmu_clear_cd(master, pasid);
}
-static int arm_smmu_attach_dev_ste(struct device *dev,
- struct arm_smmu_ste *ste)
+static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
+ struct device *dev, struct arm_smmu_ste *ste)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_attach_state state = {
+ .master = master,
+ .old_domain = iommu_get_domain_for_dev(dev),
+ };
if (arm_smmu_master_sva_enabled(master))
return -EBUSY;
@@ -2704,16 +2817,9 @@ static int arm_smmu_attach_dev_ste(struct device *dev,
*/
mutex_lock(&arm_smmu_asid_lock);
- /*
- * The SMMU does not support enabling ATS with bypass/abort. When the
- * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
- * and Translated transactions are denied as though ATS is disabled for
- * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
- * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
- */
- arm_smmu_detach_dev(master);
-
+ arm_smmu_attach_prepare(&state, domain);
arm_smmu_install_ste_for_dev(master, ste);
+ arm_smmu_attach_commit(&state);
mutex_unlock(&arm_smmu_asid_lock);
/*
@@ -2732,7 +2838,7 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
arm_smmu_make_bypass_ste(master->smmu, &ste);
- return arm_smmu_attach_dev_ste(dev, &ste);
+ return arm_smmu_attach_dev_ste(domain, dev, &ste);
}
static const struct iommu_domain_ops arm_smmu_identity_ops = {
@@ -2750,7 +2856,7 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
struct arm_smmu_ste ste;
arm_smmu_make_abort_ste(&ste);
- return arm_smmu_attach_dev_ste(dev, &ste);
+ return arm_smmu_attach_dev_ste(domain, dev, &ste);
}
static const struct iommu_domain_ops arm_smmu_blocked_ops = {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 01769b5286a83a..f9b4bfb2e6b723 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -758,10 +758,12 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target);
void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu,
struct arm_smmu_ste *target);
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master);
+ struct arm_smmu_master *master,
+ bool ats_enabled);
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain);
+ struct arm_smmu_domain *smmu_domain,
+ bool ats_enabled);
void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master, struct mm_struct *mm,
u16 asid);
--
2.45.2
* Re: [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
2024-06-04 0:15 ` [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS Jason Gunthorpe
@ 2024-06-04 6:17 ` Nicolin Chen
2024-06-19 10:20 ` Michael Shavit
1 sibling, 0 replies; 28+ messages in thread
From: Nicolin Chen @ 2024-06-04 6:17 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
patches, Shameerali Kolothum Thodi
On Mon, Jun 03, 2024 at 09:15:49PM -0300, Jason Gunthorpe wrote:
> The core code allows the domain to be changed on the fly without a forced
> stop in BLOCKED/IDENTITY. In this flow the driver should just continually
> maintain the ATS with no change while the STE is updated.
>
> ATS relies on a linked list smmu_domain->devices to keep track of which
> masters have the domain programmed, but this list is also used by
> arm_smmu_share_asid(), unrelated to ATS.
>
> Create two new functions to encapsulate this combined logic:
> arm_smmu_attach_prepare()
> <caller generates and sets the STE>
> arm_smmu_attach_commit()
>
> The two functions sequence both the enabling and disabling of ATS across
> the STE store. Have every update of the STE use this sequence.
>
> Installing an S1/S2 domain always enables ATS if the PCIe device
> supports it.
>
> The enable flow is now ordered differently to allow it to be hitless:
>
> 1) Add the master to the new smmu_domain->devices list
> 2) Program the STE
> 3) Enable ATS at PCIe
> 4) Remove the master from the old smmu_domain
>
> This flow ensures that invalidations to either domain will generate an ATC
> invalidation to the device while the STE is being switched. Thus we don't
> need to turn off the ATS anymore for correctness.
>
> The disable flow is the reverse:
> 1) Disable ATS at PCIe
> 2) Program the STE
> 3) Invalidate the ATC
> 4) Remove the master from the old smmu_domain
>
> Move the nr_ats_masters adjustments to be close to the list
> manipulations. It is a count of the number of ATS-enabled masters
> currently in the list. The adjustments happen strictly before and after
> the STE/CD are revised, and are done under the list's spin_lock.
>
> This is part of the bigger picture to allow changing the RID domain while
> a PASID is in use. If a SVA PASID is relying on ATS to function then
> changing the RID domain cannot just temporarily toggle ATS off without
> also wrecking the SVA PASID. The new infrastructure here is organized so
> that the PASID attach/detach flows will make use of it as well in
> following patches.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
* Re: [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
2024-06-04 0:15 ` [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS Jason Gunthorpe
2024-06-04 6:17 ` Nicolin Chen
@ 2024-06-19 10:20 ` Michael Shavit
2024-06-19 18:43 ` Jason Gunthorpe
1 sibling, 1 reply; 28+ messages in thread
From: Michael Shavit @ 2024-06-19 10:20 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Nicolin Chen,
patches, Shameerali Kolothum Thodi
On Tue, Jun 4, 2024 at 8:16 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> The core code allows the domain to be changed on the fly without a forced
> stop in BLOCKED/IDENTITY. In this flow the driver should just continually
> maintain the ATS with no change while the STE is updated.
>
> ATS relies on a linked list smmu_domain->devices to keep track of which
> masters have the domain programmed, but this list is also used by
> arm_smmu_share_asid(), unrelated to ATS.
>
> Create two new functions to encapsulate this combined logic:
> arm_smmu_attach_prepare()
> <caller generates and sets the STE>
> arm_smmu_attach_commit()
>
> The two functions sequence both the enabling and disabling of ATS across
> the STE store. Have every update of the STE use this sequence.
>
> Installing an S1/S2 domain always enables ATS if the PCIe device
> supports it.
>
> The enable flow is now ordered differently to allow it to be hitless:
>
> 1) Add the master to the new smmu_domain->devices list
> 2) Program the STE
> 3) Enable ATS at PCIe
> 4) Remove the master from the old smmu_domain
>
> This flow ensures that invalidations to either domain will generate an ATC
> invalidation to the device while the STE is being switched. Thus we don't
> need to turn off the ATS anymore for correctness.
>
> The disable flow is the reverse:
> 1) Disable ATS at PCIe
> 2) Program the STE
> 3) Invalidate the ATC
> 4) Remove the master from the old smmu_domain
>
> Move the nr_ats_masters adjustments to be close to the list
> manipulations. It is a count of the number of ATS-enabled masters
> currently in the list. The adjustments happen strictly before and after
> the STE/CD are revised, and are done under the list's spin_lock.
>
> This is part of the bigger picture to allow changing the RID domain while
> a PASID is in use. If a SVA PASID is relying on ATS to function then
> changing the RID domain cannot just temporarily toggle ATS off without
> also wrecking the SVA PASID. The new infrastructure here is organized so
> that the PASID attach/detach flows will make use of it as well in
> following patches.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 5 +-
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 238 +++++++++++++-----
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 +-
> 3 files changed, 178 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
> index 315e487fd990eb..a460b71f585789 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
> @@ -164,7 +164,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
> .smmu = &smmu,
> };
>
> - arm_smmu_make_cdtable_ste(ste, &master);
> + arm_smmu_make_cdtable_ste(ste, &master, true);
> }
>
> static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test)
> @@ -231,7 +231,6 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
> {
> struct arm_smmu_master master = {
> .smmu = &smmu,
> - .ats_enabled = ats_enabled,
> };
> struct io_pgtable io_pgtable = {};
> struct arm_smmu_domain smmu_domain = {
> @@ -247,7 +246,7 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
> io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.sl = 3;
> io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.tsz = 4;
>
> - arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain);
> + arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain, ats_enabled);
> }
>
> static void arm_smmu_v3_write_ste_test_s2_to_abort(struct kunit *test)
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 532fe17f28bfe5..24f42ff39f77a9 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1538,7 +1538,7 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_bypass_ste);
>
> VISIBLE_IF_KUNIT
> void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
> - struct arm_smmu_master *master)
> + struct arm_smmu_master *master, bool ats_enabled)
> {
> struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
> struct arm_smmu_device *smmu = master->smmu;
> @@ -1561,7 +1561,7 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
> STRTAB_STE_1_S1STALLD :
> 0) |
> FIELD_PREP(STRTAB_STE_1_EATS,
> - master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
> + ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
>
> if (smmu->features & ARM_SMMU_FEAT_E2H) {
> /*
> @@ -1591,7 +1591,8 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_cdtable_ste);
> VISIBLE_IF_KUNIT
> void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
> struct arm_smmu_master *master,
> - struct arm_smmu_domain *smmu_domain)
> + struct arm_smmu_domain *smmu_domain,
> + bool ats_enabled)
> {
> struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
> const struct io_pgtable_cfg *pgtbl_cfg =
> @@ -1608,7 +1609,7 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
>
> target->data[1] = cpu_to_le64(
> FIELD_PREP(STRTAB_STE_1_EATS,
> - master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
> + ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
>
> if (smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR)
> target->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> @@ -2450,22 +2451,16 @@ static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
> return dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev));
> }
>
> -static void arm_smmu_enable_ats(struct arm_smmu_master *master,
> - struct arm_smmu_domain *smmu_domain)
> +static void arm_smmu_enable_ats(struct arm_smmu_master *master)
> {
> size_t stu;
> struct pci_dev *pdev;
> struct arm_smmu_device *smmu = master->smmu;
>
> - /* Don't enable ATS at the endpoint if it's not enabled in the STE */
> - if (!master->ats_enabled)
> - return;
> -
> /* Smallest Translation Unit: log2 of the smallest supported granule */
> stu = __ffs(smmu->pgsize_bitmap);
> pdev = to_pci_dev(master->dev);
>
> - atomic_inc(&smmu_domain->nr_ats_masters);
> /*
> * ATC invalidation of PASID 0 causes the entire ATC to be flushed.
> */
> @@ -2474,22 +2469,6 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master,
> dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
> }
>
> -static void arm_smmu_disable_ats(struct arm_smmu_master *master,
> - struct arm_smmu_domain *smmu_domain)
> -{
> - if (!master->ats_enabled)
> - return;
> -
> - pci_disable_ats(to_pci_dev(master->dev));
> - /*
> - * Ensure ATS is disabled at the endpoint before we issue the
> - * ATC invalidation via the SMMU.
> - */
> - wmb();
> - arm_smmu_atc_inv_master(master);
> - atomic_dec(&smmu_domain->nr_ats_masters);
> -}
> -
> static int arm_smmu_enable_pasid(struct arm_smmu_master *master)
> {
> int ret;
> @@ -2553,46 +2532,182 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
> return NULL;
> }
>
> -static void arm_smmu_detach_dev(struct arm_smmu_master *master)
> +/*
> + * If the domain uses the smmu_domain->devices list return the arm_smmu_domain
> + * structure, otherwise NULL. These domains track attached devices so they can
> + * issue invalidations.
> + */
> +static struct arm_smmu_domain *
> +to_smmu_domain_devices(struct iommu_domain *domain)
> {
> - struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
> + /* The domain can be NULL only when processing the first attach */
> + if (!domain)
> + return NULL;
> + if (domain->type & __IOMMU_DOMAIN_PAGING)
> + return to_smmu_domain(domain);
> + return NULL;
> +}
> +
> +static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
> + struct iommu_domain *domain)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
> struct arm_smmu_master_domain *master_domain;
> - struct arm_smmu_domain *smmu_domain;
> unsigned long flags;
>
> - if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
> + if (!smmu_domain)
> return;
>
> - smmu_domain = to_smmu_domain(domain);
> - arm_smmu_disable_ats(master, smmu_domain);
> -
> spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> master_domain = arm_smmu_find_master_domain(smmu_domain, master);
> if (master_domain) {
> list_del(&master_domain->devices_elm);
> kfree(master_domain);
> + if (master->ats_enabled)
> + atomic_dec(&smmu_domain->nr_ats_masters);
> }
> spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
> +}
>
> - master->ats_enabled = false;
> +struct arm_smmu_attach_state {
> + /* Inputs */
> + struct iommu_domain *old_domain;
> + struct arm_smmu_master *master;
> + /* Resulting state */
> + bool ats_enabled;
> +};
> +
> +/*
> + * Start the sequence to attach a domain to a master. The sequence contains three
> + * steps:
> + * arm_smmu_attach_prepare()
> + * arm_smmu_install_ste_for_dev()
> + * arm_smmu_attach_commit()
> + *
> + * If prepare succeeds then the sequence must be completed. The STE installed
> + * must set the STE.EATS field according to state.ats_enabled.
> + *
> + * ATS is automatically enabled if the underlying device supports it.
> + * disable_ats can inhibit this to support STEs like bypass that don't allow
> + * ATS.
This comment is out of date since disable_ats was removed between v7 and v8.
A nit, but "automatically" is also a little imprecise IMO (almost
sounds like the device is automatically enabling it). How about:
+ * ATS is enabled after the STE is installed if the new domain and
+ * underlying device support it. On the other hand, ATS is disabled
+ * before installing the STE if the new domain doesn't support ATS,
+ * like bypass domains.
Or something else if that's too redundant with the next paragraph :) .
> + *
> + * The change of the EATS in the STE and the PCI ATS config space is managed by
> + * this sequence to be in the right order such that if PCI ATS is enabled then
> + * STE.EATS is enabled.
> + *
> + * new_domain can be NULL if the domain being attached does not have a page
> + * table and does not require invalidation tracking, and does not support ATS.
> + */
This is also confusing, new_domain is never NULL. It's
to_smmu_domain_devices(new_domain) that can be null.
> +static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
> + struct iommu_domain *new_domain)
> +{
> + struct arm_smmu_master *master = state->master;
> + struct arm_smmu_master_domain *master_domain;
> + struct arm_smmu_domain *smmu_domain =
> + to_smmu_domain_devices(new_domain);
> + unsigned long flags;
> +
> + /*
> + * arm_smmu_share_asid() must not see two domains pointing to the same
> + * arm_smmu_master_domain contents otherwise it could randomly write one
> + * or the other to the CD.
> + */
> + lockdep_assert_held(&arm_smmu_asid_lock);
> +
> + if (smmu_domain) {
> + /*
> + * The SMMU does not support enabling ATS with bypass/abort.
> + * When the STE is in bypass (STE.Config[2:0] == 0b100), ATS
> + * Translation Requests and Translated transactions are denied
> + * as though ATS is disabled for the stream (STE.EATS == 0b00),
> + * causing F_BAD_ATS_TREQ and F_TRANSL_FORBIDDEN events
> + * (IHI0070Ea 5.2 Stream Table Entry). Thus ATS can only be
> + * enabled if we have arm_smmu_domain, those always have page
> + * tables.
> + */
> + state->ats_enabled = arm_smmu_ats_supported(master);
> +
> + master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
> + if (!master_domain)
> + return -ENOMEM;
> + master_domain->master = master;
> +
> + /*
> + * During prepare we want the current smmu_domain and new
> + * smmu_domain to be in the devices list before we change any
> + * HW. This ensures that both domains will send ATS
> + * invalidations to the master until we are done.
> + *
> + * It is tempting to make this list only track masters that are
> + * using ATS, but arm_smmu_share_asid() also uses this to change
> + * the ASID of a domain, unrelated to ATS.
> + *
> + * Notice if we are re-attaching the same domain then the list
> + * will have two identical entries and commit will remove only
> + * one of them.
> + */
> + spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> + if (state->ats_enabled)
> + atomic_inc(&smmu_domain->nr_ats_masters);
> + list_add(&master_domain->devices_elm, &smmu_domain->devices);
> + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
> + }
> +
> + if (!state->ats_enabled && master->ats_enabled) {
> + pci_disable_ats(to_pci_dev(master->dev));
> + /*
> + * This is probably overkill, but the config write for disabling
> + * ATS should complete before the STE is configured to generate
> + * UR to avoid AER noise.
> + */
> + wmb();
> + }
> + return 0;
> +}
> +
> +/*
> + * Commit is done after the STE/CD are configured with the EATS setting. It
> + * completes synchronizing the PCI device's ATC and finishes manipulating the
> + * smmu_domain->devices list.
> + */
> +static void arm_smmu_attach_commit(struct arm_smmu_attach_state *state)
> +{
> + struct arm_smmu_master *master = state->master;
> +
> + lockdep_assert_held(&arm_smmu_asid_lock);
> +
> + if (state->ats_enabled && !master->ats_enabled) {
> + arm_smmu_enable_ats(master);
> + } else if (master->ats_enabled) {
> + /*
> + * The translation has changed, flush the ATC. At this point the
> + * SMMU is translating for the new domain and both the old&new
> + * domain will issue invalidations.
> + */
> + arm_smmu_atc_inv_master(master);
> + }
> + master->ats_enabled = state->ats_enabled;
> +
> + arm_smmu_remove_master_domain(master, state->old_domain);
> }
>
> static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> {
> int ret = 0;
> - unsigned long flags;
> struct arm_smmu_ste target;
> struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> struct arm_smmu_device *smmu;
> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> - struct arm_smmu_master_domain *master_domain;
> + struct arm_smmu_attach_state state = {
> + .old_domain = iommu_get_domain_for_dev(dev),
> + };
> struct arm_smmu_master *master;
> struct arm_smmu_cd *cdptr;
>
> if (!fwspec)
> return -ENOENT;
>
> - master = dev_iommu_priv_get(dev);
> + state.master = master = dev_iommu_priv_get(dev);
> smmu = master->smmu;
>
> /*
> @@ -2622,11 +2737,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> return -ENOMEM;
> }
>
> - master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
> - if (!master_domain)
> - return -ENOMEM;
> - master_domain->master = master;
> -
> /*
> * Prevent arm_smmu_share_asid() from trying to change the ASID
> * of either the old or new domain while we are working on it.
> @@ -2635,13 +2745,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> */
> mutex_lock(&arm_smmu_asid_lock);
>
> - arm_smmu_detach_dev(master);
> -
> - master->ats_enabled = arm_smmu_ats_supported(master);
> -
> - spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> - list_add(&master_domain->devices_elm, &smmu_domain->devices);
> - spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
> + ret = arm_smmu_attach_prepare(&state, domain);
> + if (ret) {
> + mutex_unlock(&arm_smmu_asid_lock);
> + return ret;
> + }
>
> switch (smmu_domain->stage) {
> case ARM_SMMU_DOMAIN_S1: {
> @@ -2650,18 +2758,19 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
> arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
> &target_cd);
> - arm_smmu_make_cdtable_ste(&target, master);
> + arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled);
> arm_smmu_install_ste_for_dev(master, &target);
> break;
> }
> case ARM_SMMU_DOMAIN_S2:
> - arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
> + arm_smmu_make_s2_domain_ste(&target, master, smmu_domain,
> + state.ats_enabled);
> arm_smmu_install_ste_for_dev(master, &target);
> arm_smmu_clear_cd(master, IOMMU_NO_PASID);
> break;
> }
>
> - arm_smmu_enable_ats(master, smmu_domain);
> + arm_smmu_attach_commit(&state);
> mutex_unlock(&arm_smmu_asid_lock);
> return 0;
> }
> @@ -2690,10 +2799,14 @@ void arm_smmu_remove_pasid(struct arm_smmu_master *master,
> arm_smmu_clear_cd(master, pasid);
> }
>
> -static int arm_smmu_attach_dev_ste(struct device *dev,
> - struct arm_smmu_ste *ste)
> +static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> + struct device *dev, struct arm_smmu_ste *ste)
> {
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> + struct arm_smmu_attach_state state = {
> + .master = master,
> + .old_domain = iommu_get_domain_for_dev(dev),
> + };
>
> if (arm_smmu_master_sva_enabled(master))
> return -EBUSY;
> @@ -2704,16 +2817,9 @@ static int arm_smmu_attach_dev_ste(struct device *dev,
> */
> mutex_lock(&arm_smmu_asid_lock);
>
> - /*
> - * The SMMU does not support enabling ATS with bypass/abort. When the
> - * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
> - * and Translated transactions are denied as though ATS is disabled for
> - * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> - * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> - */
> - arm_smmu_detach_dev(master);
> -
> + arm_smmu_attach_prepare(&state, domain);
> arm_smmu_install_ste_for_dev(master, ste);
> + arm_smmu_attach_commit(&state);
> mutex_unlock(&arm_smmu_asid_lock);
>
> /*
> @@ -2732,7 +2838,7 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
>
> arm_smmu_make_bypass_ste(master->smmu, &ste);
> - return arm_smmu_attach_dev_ste(dev, &ste);
> + return arm_smmu_attach_dev_ste(domain, dev, &ste);
> }
>
> static const struct iommu_domain_ops arm_smmu_identity_ops = {
> @@ -2750,7 +2856,7 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
> struct arm_smmu_ste ste;
>
> arm_smmu_make_abort_ste(&ste);
> - return arm_smmu_attach_dev_ste(dev, &ste);
> + return arm_smmu_attach_dev_ste(domain, dev, &ste);
> }
>
> static const struct iommu_domain_ops arm_smmu_blocked_ops = {
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 01769b5286a83a..f9b4bfb2e6b723 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -758,10 +758,12 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target);
> void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu,
> struct arm_smmu_ste *target);
> void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
> - struct arm_smmu_master *master);
> + struct arm_smmu_master *master,
> + bool ats_enabled);
> void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
> struct arm_smmu_master *master,
> - struct arm_smmu_domain *smmu_domain);
> + struct arm_smmu_domain *smmu_domain,
> + bool ats_enabled);
> void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
> struct arm_smmu_master *master, struct mm_struct *mm,
> u16 asid);
> --
> 2.45.2
>
Reviewed-by: Michael Shavit <mshavit@google.com>
* Re: [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
2024-06-19 10:20 ` Michael Shavit
@ 2024-06-19 18:43 ` Jason Gunthorpe
2024-06-20 5:25 ` Michael Shavit
0 siblings, 1 reply; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-19 18:43 UTC (permalink / raw)
To: Michael Shavit
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Nicolin Chen,
patches, Shameerali Kolothum Thodi
On Wed, Jun 19, 2024 at 06:20:56PM +0800, Michael Shavit wrote:
> > +/*
> > + * Start the sequence to attach a domain to a master. The sequence contains three
> > + * steps:
> > + * arm_smmu_attach_prepare()
> > + * arm_smmu_install_ste_for_dev()
> > + * arm_smmu_attach_commit()
> > + *
> > + * If prepare succeeds then the sequence must be completed. The STE installed
> > + * must set the STE.EATS field according to state.ats_enabled.
> > + *
> > + * ATS is automatically enabled if the underlying device supports it.
> > + * disable_ats can inhibit this to support STEs like bypass that don't allow
> > + * ATS.
>
> This comment is out of date since disable_ats was removed between v7 and v8.
> A nit, but "automatically" is also a little imprecise IMO (almost
> sounds like the device is automatically enabling it). How about:
>
> + * ATS is enabled after the STE is installed if the new domain and
> + * underlying device support it. On the other hand, ATS is disabled
> + * before installing the STE if the new domain doesn't support ATS,
> + * like bypass domains.
>
> Or something else if that's too redundant with the next paragraph :) .
>
> > + *
> > + * The change of the EATS in the STE and the PCI ATS config space is managed by
> > + * this sequence to be in the right order such that if PCI ATS is enabled then
> > + * STE.EATS is enabled.
> > + *
> > + * new_domain can be NULL if the domain being attached does not have a page
> > + * table and does not require invalidation tracking, and does not support ATS.
> > + */
>
> This is also confusing, new_domain is never NULL. It's
> to_smmu_domain_devices(new_domain) that can be null.
Yes, the comment didn't survive some of the edits...
/*
* Start the sequence to attach a domain to a master. The sequence contains three
* steps:
* arm_smmu_attach_prepare()
* arm_smmu_install_ste_for_dev()
* arm_smmu_attach_commit()
*
* If prepare succeeds then the sequence must be completed. The STE installed
* must set the STE.EATS field according to state.ats_enabled.
*
* If the device supports ATS then this determines if EATS should be enabled
* in the STE, and starts sequencing EATS disable if required.
*
* The change of the EATS in the STE and the PCI ATS config space is managed by
* this sequence to be in the right order so that if PCI ATS is enabled then
* STE.EATS is enabled.
*
* new_domain can be a non-paging domain. In this case ATS will not be enabled,
* and invalidations won't be tracked.
*/
?
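To spell out the invariant the middle paragraph is describing, here is a
throwaway standalone sketch (the helpers below are stand-ins, not the
driver's or the PCI core's entry points): the device may only have ATS
enabled while the installed STE has EATS set.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-ins, not the real PCI/SMMU functions. */
static void device_ats_off(void) { puts("device: ATS disabled"); }
static void device_ats_on(void) { puts("device: ATS enabled"); }
static void install_ste(bool eats) { printf("STE installed, EATS=%d\n", eats); }

static void attach(bool old_ats, bool new_ats)
{
        if (old_ats && !new_ats)
                device_ats_off();  /* prepare: off before the STE forbids it */
        install_ste(new_ats);
        if (!old_ats && new_ats)
                device_ats_on();   /* commit: on only once the STE allows it */
}

int main(void)
{
        attach(false, true);       /* e.g. bypass -> S1 paging domain */
        attach(true, false);       /* e.g. S1 paging -> bypass domain */
        return 0;
}

In both directions the device-side toggle lands strictly inside the
window where the STE permits ATS.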
Thanks,
Jason
* Re: [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
2024-06-19 18:43 ` Jason Gunthorpe
@ 2024-06-20 5:25 ` Michael Shavit
0 siblings, 0 replies; 28+ messages in thread
From: Michael Shavit @ 2024-06-20 5:25 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Nicolin Chen,
patches, Shameerali Kolothum Thodi
On Thu, Jun 20, 2024 at 2:43 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Jun 19, 2024 at 06:20:56PM +0800, Michael Shavit wrote:
> > > +/*
> > > + * Start the sequence to attach a domain to a master. The sequence contains three
> > > + * steps:
> > > + * arm_smmu_attach_prepare()
> > > + * arm_smmu_install_ste_for_dev()
> > > + * arm_smmu_attach_commit()
> > > + *
> > > + * If prepare succeeds then the sequence must be completed. The STE installed
> > > + * must set the STE.EATS field according to state.ats_enabled.
> > > + *
> > > + * ATS is automatically enabled if the underlying device supports it.
> > > + * disable_ats can inhibit this to support STEs like bypass that don't allow
> > > + * ATS.
> >
> > This comment is out of date since disable_ats was removed between v7 and v8.
> > A nit, but "automatically" is also a little imprecise IMO (almost
> > sounds like the device is automatically enabling it). How about:
> >
> > + * ATS is enabled after the STE is installed if the new domain and
> > + * underlying device support it. On the other hand, ATS is disabled
> > + * before installing the STE if the new domain doesn't support ATS,
> > + * like bypass domains.
> >
> > Or something else if that's too redundant with the next paragraph :) .
> >
> > > + *
> > > + * The change of the EATS in the STE and the PCI ATS config space is managed by
> > > + * this sequence to be in the right order such that if PCI ATS is enabled then
> > > + * STE.EATS is enabled.
> > > + *
> > > + * new_domain can be NULL if the domain being attached does not have a page
> > > + * table and does not require invalidation tracking, and does not support ATS.
> > > + */
> >
> > This is also confusing, new_domain is never NULL. It's
> > to_smmu_domain_devices(new_domain) that can be null.
>
> Yes, the comment didn't survive some of the edits..
>
> /*
> * Start the sequence to attach a domain to a master. The sequence contains three
> * steps:
> * arm_smmu_attach_prepare()
> * arm_smmu_install_ste_for_dev()
> * arm_smmu_attach_commit()
> *
> * If prepare succeeds then the sequence must be completed. The STE installed
> * must set the STE.EATS field according to state.ats_enabled.
> *
> * If the device supports ATS then this determines if EATS should be enabled
> * in the STE, and starts sequencing EATS disable if required.
> *
> * The change of the EATS in the STE and the PCI ATS config space is managed by
> * this sequence to be in the right order so that if PCI ATS is enabled then
> * STE.EATS is enabled.
> *
> * new_domain can be a non-paging domain. In this case ATS will not be enabled,
> * and invalidations won't be tracked.
> */
>
> ?
>
> Thanks,
> Jason
Looks good to me.
* [PATCH v8 05/14] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (3 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 04/14] iommu/arm-smmu-v3: Make changing domains be hitless for ATS Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 06/14] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches Jason Gunthorpe
` (10 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameer Kolothum
Prepare to allow an S1 domain to be attached to a PASID as well. Keep track
of the SSID the domain is using on each master in the
arm_smmu_master_domain.
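As a rough standalone model of what the devices list now keys on (the
types and names here are illustrative only, not the driver's):

#include <stddef.h>
#include <stdio.h>

struct master_domain {
        int master_id;               /* stand-in for struct arm_smmu_master * */
        unsigned int ssid;           /* the new field */
        struct master_domain *next;
};

/* Same shape as arm_smmu_find_master_domain() after this patch. */
static struct master_domain *find(struct master_domain *head,
                                  int master_id, unsigned int ssid)
{
        for (; head; head = head->next)
                if (head->master_id == master_id && head->ssid == ssid)
                        return head;
        return NULL;
}

int main(void)
{
        struct master_domain rid = { .master_id = 1, .ssid = 0 };
        struct master_domain pasid = { .master_id = 1, .ssid = 42, .next = &rid };

        /* One master may now appear once per SSID attached to the domain. */
        printf("rid entry found:   %d\n", find(&pasid, 1, 0) != NULL);
        printf("pasid entry found: %d\n", find(&pasid, 1, 42) != NULL);
        return 0;
}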
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 15 ++++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 42 +++++++++++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++-
3 files changed, 43 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index cb3a0e4143c84a..d31caceb584984 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -47,13 +47,12 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd *cdptr;
- /* S1 domains only support RID attachment right now */
- cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID);
+ cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
- arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
+ arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target_cd);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
@@ -294,8 +293,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
smmu_domain);
}
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), start,
- size);
+ arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start,
+ size);
}
static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
@@ -332,7 +331,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
+ arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
smmu_mn->cleared = true;
mutex_unlock(&sva_lock);
@@ -411,8 +410,8 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
*/
if (!smmu_mn->cleared) {
arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
- arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0,
- 0);
+ arm_smmu_atc_inv_domain_sva(smmu_domain,
+ mm_get_enqcmd_pasid(mm), 0, 0);
}
/* Frees smmu_mn */
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 24f42ff39f77a9..674884a8fe25ba 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2013,8 +2013,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
}
-int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
- unsigned long iova, size_t size)
+static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size)
{
struct arm_smmu_master_domain *master_domain;
int i;
@@ -2042,8 +2042,6 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
if (!atomic_read(&smmu_domain->nr_ats_masters))
return 0;
- arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
-
cmds.num = 0;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
@@ -2054,6 +2052,16 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
if (!master->ats_enabled)
continue;
+ /*
+ * Non-zero ssid means SVA is co-opting the S1 domain to issue
+ * invalidations for SVA PASIDs.
+ */
+ if (ssid != IOMMU_NO_PASID)
+ arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
+ else
+ arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
+ &cmd);
+
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd);
@@ -2064,6 +2072,19 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
}
+static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size)
+{
+ return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova,
+ size);
+}
+
+int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size)
+{
+ return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size);
+}
+
/* IO_PGTABLE API */
static void arm_smmu_tlb_inv_context(void *cookie)
{
@@ -2085,7 +2106,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
- arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, 0, 0);
+ arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
@@ -2183,7 +2204,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
* Unfortunately, this can't be leaf-only since we may have
* zapped an entire table.
*/
- arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova, size);
+ arm_smmu_atc_inv_domain(smmu_domain, iova, size);
}
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
@@ -2518,7 +2539,8 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
static struct arm_smmu_master_domain *
arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
- struct arm_smmu_master *master)
+ struct arm_smmu_master *master,
+ ioasid_t ssid)
{
struct arm_smmu_master_domain *master_domain;
@@ -2526,7 +2548,8 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain,
list_for_each_entry(master_domain, &smmu_domain->devices,
devices_elm) {
- if (master_domain->master == master)
+ if (master_domain->master == master &&
+ master_domain->ssid == ssid)
return master_domain;
}
return NULL;
@@ -2559,7 +2582,8 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
return;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- master_domain = arm_smmu_find_master_domain(smmu_domain, master);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master,
+ IOMMU_NO_PASID);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index f9b4bfb2e6b723..f4061ffc1e612d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -772,6 +772,7 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
struct arm_smmu_master_domain {
struct list_head devices_elm;
struct arm_smmu_master *master;
+ ioasid_t ssid;
};
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
@@ -803,8 +804,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain);
bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
-int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
- unsigned long iova, size_t size);
+int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
+ ioasid_t ssid, unsigned long iova, size_t size);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
--
2.45.2
* [PATCH v8 06/14] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (4 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 05/14] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 07/14] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface Jason Gunthorpe
` (9 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameer Kolothum
We no longer need a master->sva_enable to control what attaches are
allowed. Instead we can tell if the attach is legal based on the current
configuration of the master.
Keep track of the number of valid CD entries for SSIDs in the cd_table,
and whether the cd_table has been installed directly in the STE, so we
know what the configuration is.
The attach logic is then made into:
- SVA bind, check if the CD is installed
- RID attach of S2, block if SSIDs are used
- RID attach of IDENTITY/BLOCKING, block if SSIDs are used
arm_smmu_set_pasid() is already checking if it is possible to set up a CD
entry; at this point in the series that means the RID path has already set
an STE pointing at the CD table.
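The accounting rule itself is tiny; a minimal userspace sketch, with
made-up structure names (only a valid<->invalid transition of a non-zero
SSID moves the counter):

#include <stdbool.h>
#include <stdio.h>

struct cd_table { unsigned int used_ssids; };
struct cd { bool valid; };

static void write_cd(struct cd_table *tbl, unsigned int ssid,
                     struct cd *cur, bool target_valid)
{
        /* Only SSID != 0 and an actual validity change are counted. */
        if (ssid != 0 && cur->valid != target_valid) {
                if (cur->valid)
                        tbl->used_ssids--;
                else
                        tbl->used_ssids++;
        }
        cur->valid = target_valid;
}

int main(void)
{
        struct cd_table tbl = { 0 };
        struct cd cd1 = { 0 };

        write_cd(&tbl, 1, &cd1, true);   /* install SSID 1 */
        printf("used_ssids=%u\n", tbl.used_ssids); /* 1: S2/bypass attach -> -EBUSY */
        write_cd(&tbl, 1, &cd1, true);   /* rewrite, still valid: no change */
        write_cd(&tbl, 1, &cd1, false);  /* clear SSID 1 */
        printf("used_ssids=%u\n", tbl.used_ssids); /* back to 0 */
        return 0;
}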
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 ++++++++++-----------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++++++
2 files changed, 19 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 674884a8fe25ba..e7cd1ddc03517d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1289,6 +1289,8 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target)
{
+ bool target_valid = target->data[0] & cpu_to_le64(CTXDESC_CD_0_V);
+ bool cur_valid = cdptr->data[0] & cpu_to_le64(CTXDESC_CD_0_V);
struct arm_smmu_cd_writer cd_writer = {
.writer = {
.ops = &arm_smmu_cd_writer_ops,
@@ -1297,6 +1299,13 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
.ssid = ssid,
};
+ if (ssid != IOMMU_NO_PASID && cur_valid != target_valid) {
+ if (cur_valid)
+ master->cd_table.used_ssids--;
+ else
+ master->cd_table.used_ssids++;
+ }
+
arm_smmu_write_entry(&cd_writer.writer, cdptr->data, target->data);
}
@@ -2734,16 +2743,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
state.master = master = dev_iommu_priv_get(dev);
smmu = master->smmu;
- /*
- * Checking that SVA is disabled ensures that this device isn't bound to
- * any mm, and can be safely detached from its old domain. Bonds cannot
- * be removed concurrently since we're holding the group mutex.
- */
- if (arm_smmu_master_sva_enabled(master)) {
- dev_err(dev, "cannot attach - SVA enabled\n");
- return -EBUSY;
- }
-
mutex_lock(&smmu_domain->init_mutex);
if (!smmu_domain->smmu) {
@@ -2759,7 +2758,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID);
if (!cdptr)
return -ENOMEM;
- }
+ } else if (arm_smmu_ssids_in_use(&master->cd_table))
+ return -EBUSY;
/*
* Prevent arm_smmu_share_asid() from trying to change the ASID
@@ -2832,7 +2832,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
.old_domain = iommu_get_domain_for_dev(dev),
};
- if (arm_smmu_master_sva_enabled(master))
+ if (arm_smmu_ssids_in_use(&master->cd_table))
return -EBUSY;
/*
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index f4061ffc1e612d..65b75dbfd15914 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -602,12 +602,19 @@ struct arm_smmu_ctx_desc_cfg {
dma_addr_t cdtab_dma;
struct arm_smmu_l1_ctx_desc *l1_desc;
unsigned int num_l1_ents;
+ unsigned int used_ssids;
u8 in_ste;
u8 s1fmt;
/* log2 of the maximum number of CDs supported by this table */
u8 s1cdmax;
};
+/* True if the cd table has SSIDS > 0 in use. */
+static inline bool arm_smmu_ssids_in_use(struct arm_smmu_ctx_desc_cfg *cd_table)
+{
+ return cd_table->used_ssids;
+}
+
struct arm_smmu_s2_cfg {
u16 vmid;
};
--
2.45.2
* [PATCH v8 07/14] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (5 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 06/14] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 08/14] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain Jason Gunthorpe
` (8 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameerali Kolothum Thodi
Allow creating and managing arm_smmu_master_domain's with a non-zero SSID
through the arm_smmu_attach_*() family of functions. This triggers ATC
invalidation for the correct SSID in PASID cases and tracks the
per-attachment SSID in the struct arm_smmu_master_domain.
Generalize arm_smmu_attach_remove() to be able to remove SSIDs as well by
ensuring the ATC for the PASID is flushed properly.
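The commit-time ATC handling reduces to a three-way decision; a
standalone sketch, with illustrative names:

#include <stdbool.h>
#include <stdio.h>

enum atc_action { ATC_NOTHING, ATC_ENABLE, ATC_INV_SSID, ATC_INV_ALL };

/* Mirrors the branch structure in the reworked commit step. */
static enum atc_action commit_action(bool new_ats, bool old_ats)
{
        if (new_ats && !old_ats)
                return ATC_ENABLE;   /* turning ATS on at the device */
        if (new_ats && old_ats)
                return ATC_INV_SSID; /* translation changed: flush this SSID */
        if (!new_ats && old_ats)
                return ATC_INV_ALL;  /* ATS going off: flush the whole ATC */
        return ATC_NOTHING;
}

int main(void)
{
        printf("on:  %d\n", commit_action(true, false));
        printf("chg: %d\n", commit_action(true, true));
        printf("off: %d\n", commit_action(false, true));
        return 0;
}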
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 ++++++++++++++-------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e7cd1ddc03517d..56e2fe52df530d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2005,13 +2005,14 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
cmd->atc.size = log2_span;
}
-static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
+static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
+ ioasid_t ssid)
{
int i;
struct arm_smmu_cmdq_ent cmd;
struct arm_smmu_cmdq_batch cmds;
- arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd);
+ arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd);
cmds.num = 0;
for (i = 0; i < master->num_streams; i++) {
@@ -2494,7 +2495,7 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
/*
* ATC invalidation of PASID 0 causes the entire ATC to be flushed.
*/
- arm_smmu_atc_inv_master(master);
+ arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
if (pci_enable_ats(pdev, stu))
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
@@ -2581,7 +2582,8 @@ to_smmu_domain_devices(struct iommu_domain *domain)
}
static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
- struct iommu_domain *domain)
+ struct iommu_domain *domain,
+ ioasid_t ssid)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
struct arm_smmu_master_domain *master_domain;
@@ -2591,8 +2593,7 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
return;
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- master_domain = arm_smmu_find_master_domain(smmu_domain, master,
- IOMMU_NO_PASID);
+ master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid);
if (master_domain) {
list_del(&master_domain->devices_elm);
kfree(master_domain);
@@ -2606,6 +2607,7 @@ struct arm_smmu_attach_state {
/* Inputs */
struct iommu_domain *old_domain;
struct arm_smmu_master *master;
+ ioasid_t ssid;
/* Resulting state */
bool ats_enabled;
};
@@ -2664,6 +2666,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
if (!master_domain)
return -ENOMEM;
master_domain->master = master;
+ master_domain->ssid = state->ssid;
/*
* During prepare we want the current smmu_domain and new
@@ -2711,17 +2714,20 @@ static void arm_smmu_attach_commit(struct arm_smmu_attach_state *state)
if (state->ats_enabled && !master->ats_enabled) {
arm_smmu_enable_ats(master);
- } else if (master->ats_enabled) {
+ } else if (state->ats_enabled && master->ats_enabled) {
/*
* The translation has changed, flush the ATC. At this point the
* SMMU is translating for the new domain and both the old&new
* domain will issue invalidations.
*/
- arm_smmu_atc_inv_master(master);
+ arm_smmu_atc_inv_master(master, state->ssid);
+ } else if (!state->ats_enabled && master->ats_enabled) {
+ /* ATS is being switched off, invalidate the entire ATC */
+ arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
}
master->ats_enabled = state->ats_enabled;
- arm_smmu_remove_master_domain(master, state->old_domain);
+ arm_smmu_remove_master_domain(master, state->old_domain, state->ssid);
}
static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -2733,6 +2739,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_attach_state state = {
.old_domain = iommu_get_domain_for_dev(dev),
+ .ssid = IOMMU_NO_PASID,
};
struct arm_smmu_master *master;
struct arm_smmu_cd *cdptr;
@@ -2830,6 +2837,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
struct arm_smmu_attach_state state = {
.master = master,
.old_domain = iommu_get_domain_for_dev(dev),
+ .ssid = IOMMU_NO_PASID,
};
if (arm_smmu_ssids_in_use(&master->cd_table))
--
2.45.2
* [PATCH v8 08/14] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (6 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 07/14] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 09/14] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA Jason Gunthorpe
` (7 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameer Kolothum
Currently the SVA domain is a naked struct iommu_domain; allocate a struct
arm_smmu_domain instead.
This is necessary to be able to use the struct arm_smmu_master_domain
mechanism.
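The kfree(to_smmu_domain(domain)) change in the free path falls out of
the embedded layout; a minimal sketch of the container_of() pattern with
toy structures (the macro is the usual kernel definition):

#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

struct iommu_domain { int type; };
struct arm_domain {                 /* toy stand-in for arm_smmu_domain */
        int asid;
        struct iommu_domain domain; /* embedded, not first in the struct */
};

int main(void)
{
        struct arm_domain d = { .asid = 5 };
        struct iommu_domain *pub = &d.domain;

        /*
         * Freeing 'pub' directly would pass a mid-object pointer to the
         * allocator; the containing struct must be recovered first.
         */
        struct arm_domain *priv = container_of(pub, struct arm_domain, domain);
        printf("asid=%d, same object: %d\n", priv->asid, priv == &d);
        return 0;
}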
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 21 +++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 31 +++++++++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
3 files changed, 35 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index d31caceb584984..aa033cd65adc5a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -639,7 +639,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
}
arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- ret = arm_smmu_set_pasid(master, NULL, id, &target);
+ ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target);
if (ret) {
list_del(&bond->list);
arm_smmu_mmu_notifier_put(bond->smmu_mn);
@@ -653,7 +653,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
{
- kfree(domain);
+ kfree(to_smmu_domain(domain));
}
static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
@@ -664,13 +664,16 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm)
{
- struct iommu_domain *domain;
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
+ struct arm_smmu_domain *smmu_domain;
- domain = kzalloc(sizeof(*domain), GFP_KERNEL);
- if (!domain)
- return ERR_PTR(-ENOMEM);
- domain->type = IOMMU_DOMAIN_SVA;
- domain->ops = &arm_smmu_sva_domain_ops;
+ smmu_domain = arm_smmu_domain_alloc();
+ if (IS_ERR(smmu_domain))
+ return ERR_CAST(smmu_domain);
+ smmu_domain->domain.type = IOMMU_DOMAIN_SVA;
+ smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
+ smmu_domain->smmu = smmu;
- return domain;
+ return &smmu_domain->domain;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 56e2fe52df530d..c6dba933d2c38c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2272,6 +2272,22 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
}
}
+struct arm_smmu_domain *arm_smmu_domain_alloc(void)
+{
+ struct arm_smmu_domain *smmu_domain;
+
+ smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL);
+ if (!smmu_domain)
+ return ERR_PTR(-ENOMEM);
+
+ mutex_init(&smmu_domain->init_mutex);
+ INIT_LIST_HEAD(&smmu_domain->devices);
+ spin_lock_init(&smmu_domain->devices_lock);
+ INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
+
+ return smmu_domain;
+}
+
static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
{
struct arm_smmu_domain *smmu_domain;
@@ -2281,14 +2297,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
* We can't really do anything meaningful until we've added a
* master.
*/
- smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL);
- if (!smmu_domain)
- return ERR_PTR(-ENOMEM);
-
- mutex_init(&smmu_domain->init_mutex);
- INIT_LIST_HEAD(&smmu_domain->devices);
- spin_lock_init(&smmu_domain->devices_lock);
- INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
+ smmu_domain = arm_smmu_domain_alloc();
+ if (IS_ERR(smmu_domain))
+ return ERR_CAST(smmu_domain);
if (dev) {
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
@@ -2303,7 +2314,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
return &smmu_domain->domain;
}
-static void arm_smmu_domain_free(struct iommu_domain *domain)
+static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;
@@ -3306,7 +3317,7 @@ static struct iommu_ops arm_smmu_ops = {
.iotlb_sync = arm_smmu_iotlb_sync,
.iova_to_phys = arm_smmu_iova_to_phys,
.enable_nesting = arm_smmu_enable_nesting,
- .free = arm_smmu_domain_free,
+ .free = arm_smmu_domain_free_paging,
}
};
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 65b75dbfd15914..212c18c70fa03e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -790,6 +790,8 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
extern struct xarray arm_smmu_asid_xa;
extern struct mutex arm_smmu_asid_lock;
+struct arm_smmu_domain *arm_smmu_domain_alloc(void);
+
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
--
2.45.2
* [PATCH v8 09/14] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (7 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 08/14] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain Jason Gunthorpe
` (6 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameerali Kolothum Thodi
Fill in the smmu_domain->devices list in the new struct arm_smmu_domain
that SVA allocates. Keep track of every SSID and master that is using the
domain, reusing the logic for the RID attach.
This is the first step to making the SVA invalidation follow the same
design as S1/S2 invalidation. At present nothing will read this list.
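The resulting arm_smmu_set_pasid() flow has this shape, as a standalone
sketch (the lock is shown as comments and the error value is
illustrative):

#include <stdio.h>

static int attach_prepare(int fail) { return fail ? -1 : 0; }
static void write_cd(void) { puts("CD written"); }
static void attach_commit(void) { puts("committed"); }

static int set_pasid(int fail)
{
        int ret;

        /* mutex_lock(&arm_smmu_asid_lock); */
        ret = attach_prepare(fail);   /* adds (master, pasid) to the list */
        if (ret)
                goto out_unlock;

        write_cd();                   /* HW change happens between the steps */
        attach_commit();              /* syncs the ATC, drops any old entry */
out_unlock:
        /* mutex_unlock(&arm_smmu_asid_lock); */
        return ret;
}

int main(void)
{
        printf("ok path:  %d\n", set_pasid(0));
        printf("err path: %d\n", set_pasid(1));
        return 0;
}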
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 30 +++++++++++++++++++--
1 file changed, 28 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c6dba933d2c38c..24000027253de8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2587,7 +2587,8 @@ to_smmu_domain_devices(struct iommu_domain *domain)
/* The domain can be NULL only when processing the first attach */
if (!domain)
return NULL;
- if (domain->type & __IOMMU_DOMAIN_PAGING)
+ if ((domain->type & __IOMMU_DOMAIN_PAGING) ||
+ domain->type == IOMMU_DOMAIN_SVA)
return to_smmu_domain(domain);
return NULL;
}
@@ -2821,7 +2822,16 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
const struct arm_smmu_cd *cd)
{
+ struct arm_smmu_attach_state state = {
+ .master = master,
+ /*
+ * For now the core code prevents calling this when a domain is
+ * already attached, no need to set old_domain.
+ */
+ .ssid = pasid,
+ };
struct arm_smmu_cd *cdptr;
+ int ret;
/* The core code validates pasid */
@@ -2831,14 +2841,30 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
if (!cdptr)
return -ENOMEM;
+
+ mutex_lock(&arm_smmu_asid_lock);
+ ret = arm_smmu_attach_prepare(&state, &smmu_domain->domain);
+ if (ret)
+ goto out_unlock;
+
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
- return 0;
+
+ arm_smmu_attach_commit(&state);
+
+out_unlock:
+ mutex_unlock(&arm_smmu_asid_lock);
+ return ret;
}
void arm_smmu_remove_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
{
+ mutex_lock(&arm_smmu_asid_lock);
arm_smmu_clear_cd(master, pasid);
+ if (master->ats_enabled)
+ arm_smmu_atc_inv_master(master, pasid);
+ arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid);
+ mutex_unlock(&arm_smmu_asid_lock);
}
static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
--
2.45.2
* [PATCH v8 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (8 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 09/14] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-24 9:54 ` Michael Shavit
2024-06-04 0:15 ` [PATCH v8 11/14] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used Jason Gunthorpe
` (5 subsequent siblings)
15 siblings, 1 reply; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameerali Kolothum Thodi
This removes all the notifier de-duplication logic in the driver and
relies on the core code to de-duplicate and allocate only one SVA domain
per mm per smmu instance. This naturally gives a 1:1 relationship between
SVA domain and mmu notifier.
It is a significant simplification of the flow, as we end up with a single
struct arm_smmu_domain for each MM and the invalidation can then be
shifted to properly use the masters list like S1/S2 do.
Remove all of the previous mmu_notifier, bond, shared cd, and cd refcount
logic entirely.
The logic here is tightly wound together with the unused BTM
support. Since the BTM logic requires holding all the iommu_domains in a
global ASID xarray it conflicts with the design to have a single SVA
domain per PASID, as multiple SMMU instances will need to have different
domains.
Following patches resolve this by making the ASID xarray per-instance
instead of global. However, converting the BTM code over to this
methodology requires many changes.
Thus, since ARM_SMMU_FEAT_BTM is never enabled, remove the parts of the
BTM support for ASID sharing that interact with SVA as well.
A followup series is already working on fully enabling the BTM support;
that requires iommufd's VIOMMU feature to bring in the KVM's VMID as
well. It will come with an already written patch to bring back the ASID
sharing using a per-instance ASID xarray.
https://lore.kernel.org/linux-iommu/20240208151837.35068-1-shameerali.kolothum.thodi@huawei.com/
https://lore.kernel.org/linux-iommu/26-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com/
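As a toy model of the reuse the core code now provides, one SVA domain
per (mm, smmu instance) pair (purely illustrative, not the core code's
actual data structures):

#include <stdio.h>
#include <stdlib.h>

struct sva_domain {
        int mm_id, smmu_id;
        int users;
        struct sva_domain *next;
};
static struct sva_domain *cache;

static struct sva_domain *get_sva_domain(int mm_id, int smmu_id)
{
        struct sva_domain *d;

        for (d = cache; d; d = d->next)
                if (d->mm_id == mm_id && d->smmu_id == smmu_id) {
                        d->users++;
                        return d;    /* reused: no second notifier exists */
                }

        d = calloc(1, sizeof(*d));
        if (!d)
                return NULL;
        d->mm_id = mm_id;
        d->smmu_id = smmu_id;
        d->users = 1;
        d->next = cache;
        cache = d;
        return d;                    /* first user: one notifier per domain */
}

int main(void)
{
        struct sva_domain *a = get_sva_domain(1, 1);
        struct sva_domain *b = get_sva_domain(1, 1); /* same mm + smmu */
        struct sva_domain *c = get_sva_domain(1, 2); /* second instance */

        printf("%s\n", a == b ? "deduplicated" : "distinct");
        printf("%s\n", a == c ? "deduplicated" : "distinct");
        return 0;
}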
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 395 +++---------------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 69 +--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 15 +-
3 files changed, 86 insertions(+), 393 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index aa033cd65adc5a..a7c36654dee5a5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -13,29 +13,9 @@
#include "arm-smmu-v3.h"
#include "../../io-pgtable-arm.h"
-struct arm_smmu_mmu_notifier {
- struct mmu_notifier mn;
- struct arm_smmu_ctx_desc *cd;
- bool cleared;
- refcount_t refs;
- struct list_head list;
- struct arm_smmu_domain *domain;
-};
-
-#define mn_to_smmu(mn) container_of(mn, struct arm_smmu_mmu_notifier, mn)
-
-struct arm_smmu_bond {
- struct mm_struct *mm;
- struct arm_smmu_mmu_notifier *smmu_mn;
- struct list_head list;
-};
-
-#define sva_to_bond(handle) \
- container_of(handle, struct arm_smmu_bond, sva)
-
static DEFINE_MUTEX(sva_lock);
-static void
+static void __maybe_unused
arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
{
struct arm_smmu_master_domain *master_domain;
@@ -58,58 +38,6 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
}
-/*
- * Check if the CPU ASID is available on the SMMU side. If a private context
- * descriptor is using it, try to replace it.
- */
-static struct arm_smmu_ctx_desc *
-arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
-{
- int ret;
- u32 new_asid;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_device *smmu;
- struct arm_smmu_domain *smmu_domain;
-
- cd = xa_load(&arm_smmu_asid_xa, asid);
- if (!cd)
- return NULL;
-
- if (cd->mm) {
- if (WARN_ON(cd->mm != mm))
- return ERR_PTR(-EINVAL);
- /* All devices bound to this mm use the same cd struct. */
- refcount_inc(&cd->refs);
- return cd;
- }
-
- smmu_domain = container_of(cd, struct arm_smmu_domain, cd);
- smmu = smmu_domain->smmu;
-
- ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd,
- XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
- if (ret)
- return ERR_PTR(-ENOSPC);
- /*
- * Race with unmap: TLB invalidations will start targeting the new ASID,
- * which isn't assigned yet. We'll do an invalidate-all on the old ASID
- * later, so it doesn't matter.
- */
- cd->asid = new_asid;
- /*
- * Update ASID and invalidate CD in all associated masters. There will
- * be some overlap between use of both ASIDs, until we invalidate the
- * TLB.
- */
- arm_smmu_update_s1_domain_cd_entry(smmu_domain);
-
- /* Invalidate TLB entries previously associated with that context */
- arm_smmu_tlb_inv_asid(smmu, asid);
-
- xa_erase(&arm_smmu_asid_xa, asid);
- return NULL;
-}
-
static u64 page_size_to_cd(void)
{
static_assert(PAGE_SIZE == SZ_4K || PAGE_SIZE == SZ_16K ||
@@ -187,69 +115,6 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
}
EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_sva_cd);
-static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
-{
- u16 asid;
- int err = 0;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_ctx_desc *ret = NULL;
-
- /* Don't free the mm until we release the ASID */
- mmgrab(mm);
-
- asid = arm64_mm_context_get(mm);
- if (!asid) {
- err = -ESRCH;
- goto out_drop_mm;
- }
-
- cd = kzalloc(sizeof(*cd), GFP_KERNEL);
- if (!cd) {
- err = -ENOMEM;
- goto out_put_context;
- }
-
- refcount_set(&cd->refs, 1);
-
- mutex_lock(&arm_smmu_asid_lock);
- ret = arm_smmu_share_asid(mm, asid);
- if (ret) {
- mutex_unlock(&arm_smmu_asid_lock);
- goto out_free_cd;
- }
-
- err = xa_insert(&arm_smmu_asid_xa, asid, cd, GFP_KERNEL);
- mutex_unlock(&arm_smmu_asid_lock);
-
- if (err)
- goto out_free_asid;
-
- cd->asid = asid;
- cd->mm = mm;
-
- return cd;
-
-out_free_asid:
- arm_smmu_free_asid(cd);
-out_free_cd:
- kfree(cd);
-out_put_context:
- arm64_mm_context_put(mm);
-out_drop_mm:
- mmdrop(mm);
- return err < 0 ? ERR_PTR(err) : ret;
-}
-
-static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd)
-{
- if (arm_smmu_free_asid(cd)) {
- /* Unpin ASID */
- arm64_mm_context_put(cd->mm);
- mmdrop(cd->mm);
- kfree(cd);
- }
-}
-
/*
* Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this
* is used as a threshold to replace per-page TLBI commands to issue in the
@@ -264,8 +129,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
unsigned long start,
unsigned long end)
{
- struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+ struct arm_smmu_domain *smmu_domain =
+ container_of(mn, struct arm_smmu_domain, mmu_notifier);
size_t size;
/*
@@ -282,34 +147,22 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
size = 0;
}
- if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM)) {
- if (!size)
- arm_smmu_tlb_inv_asid(smmu_domain->smmu,
- smmu_mn->cd->asid);
- else
- arm_smmu_tlb_inv_range_asid(start, size,
- smmu_mn->cd->asid,
- PAGE_SIZE, false,
- smmu_domain);
- }
+ if (!size)
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+ else
+ arm_smmu_tlb_inv_range_asid(start, size, smmu_domain->cd.asid,
+ PAGE_SIZE, false, smmu_domain);
- arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start,
- size);
+ arm_smmu_atc_inv_domain(smmu_domain, start, size);
}
static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
- struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
+ struct arm_smmu_domain *smmu_domain =
+ container_of(mn, struct arm_smmu_domain, mmu_notifier);
struct arm_smmu_master_domain *master_domain;
unsigned long flags;
- mutex_lock(&sva_lock);
- if (smmu_mn->cleared) {
- mutex_unlock(&sva_lock);
- return;
- }
-
/*
* DMA may still be running. Keep the cd valid to avoid C_BAD_CD events,
* but disable translation.
@@ -321,25 +174,23 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
struct arm_smmu_cd target;
struct arm_smmu_cd *cdptr;
- cdptr = arm_smmu_get_cd_ptr(master, mm_get_enqcmd_pasid(mm));
+ cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
- arm_smmu_make_sva_cd(&target, master, NULL, smmu_mn->cd->asid);
- arm_smmu_write_cd_entry(master, mm_get_enqcmd_pasid(mm), cdptr,
+ arm_smmu_make_sva_cd(&target, master, NULL,
+ smmu_domain->cd.asid);
+ arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target);
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
- arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
-
- smmu_mn->cleared = true;
- mutex_unlock(&sva_lock);
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+ arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
static void arm_smmu_mmu_notifier_free(struct mmu_notifier *mn)
{
- kfree(mn_to_smmu(mn));
+ kfree(container_of(mn, struct arm_smmu_domain, mmu_notifier));
}
static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
@@ -348,115 +199,6 @@ static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
.free_notifier = arm_smmu_mmu_notifier_free,
};
-/* Allocate or get existing MMU notifier for this {domain, mm} pair */
-static struct arm_smmu_mmu_notifier *
-arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain,
- struct mm_struct *mm)
-{
- int ret;
- struct arm_smmu_ctx_desc *cd;
- struct arm_smmu_mmu_notifier *smmu_mn;
-
- list_for_each_entry(smmu_mn, &smmu_domain->mmu_notifiers, list) {
- if (smmu_mn->mn.mm == mm) {
- refcount_inc(&smmu_mn->refs);
- return smmu_mn;
- }
- }
-
- cd = arm_smmu_alloc_shared_cd(mm);
- if (IS_ERR(cd))
- return ERR_CAST(cd);
-
- smmu_mn = kzalloc(sizeof(*smmu_mn), GFP_KERNEL);
- if (!smmu_mn) {
- ret = -ENOMEM;
- goto err_free_cd;
- }
-
- refcount_set(&smmu_mn->refs, 1);
- smmu_mn->cd = cd;
- smmu_mn->domain = smmu_domain;
- smmu_mn->mn.ops = &arm_smmu_mmu_notifier_ops;
-
- ret = mmu_notifier_register(&smmu_mn->mn, mm);
- if (ret) {
- kfree(smmu_mn);
- goto err_free_cd;
- }
-
- list_add(&smmu_mn->list, &smmu_domain->mmu_notifiers);
- return smmu_mn;
-
-err_free_cd:
- arm_smmu_free_shared_cd(cd);
- return ERR_PTR(ret);
-}
-
-static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
-{
- struct mm_struct *mm = smmu_mn->mn.mm;
- struct arm_smmu_ctx_desc *cd = smmu_mn->cd;
- struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
-
- if (!refcount_dec_and_test(&smmu_mn->refs))
- return;
-
- list_del(&smmu_mn->list);
-
- /*
- * If we went through clear(), we've already invalidated, and no
- * new TLB entry can have been formed.
- */
- if (!smmu_mn->cleared) {
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
- arm_smmu_atc_inv_domain_sva(smmu_domain,
- mm_get_enqcmd_pasid(mm), 0, 0);
- }
-
- /* Frees smmu_mn */
- mmu_notifier_put(&smmu_mn->mn);
- arm_smmu_free_shared_cd(cd);
-}
-
-static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev,
- struct mm_struct *mm)
-{
- int ret;
- struct arm_smmu_bond *bond;
- struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
- struct arm_smmu_domain *smmu_domain;
-
- if (!(domain->type & __IOMMU_DOMAIN_PAGING))
- return ERR_PTR(-ENODEV);
- smmu_domain = to_smmu_domain(domain);
- if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
- return ERR_PTR(-ENODEV);
-
- if (!master || !master->sva_enabled)
- return ERR_PTR(-ENODEV);
-
- bond = kzalloc(sizeof(*bond), GFP_KERNEL);
- if (!bond)
- return ERR_PTR(-ENOMEM);
-
- bond->mm = mm;
-
- bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm);
- if (IS_ERR(bond->smmu_mn)) {
- ret = PTR_ERR(bond->smmu_mn);
- goto err_free_bond;
- }
-
- list_add(&bond->list, &master->bonds);
- return bond;
-
-err_free_bond:
- kfree(bond);
- return ERR_PTR(ret);
-}
-
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
{
unsigned long reg, fld;
@@ -573,11 +315,6 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master)
int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
{
mutex_lock(&sva_lock);
- if (!list_empty(&master->bonds)) {
- dev_err(master->dev, "cannot disable SVA, device is bound\n");
- mutex_unlock(&sva_lock);
- return -EBUSY;
- }
arm_smmu_master_sva_disable_iopf(master);
master->sva_enabled = false;
mutex_unlock(&sva_lock);
@@ -594,66 +331,51 @@ void arm_smmu_sva_notifier_synchronize(void)
mmu_notifier_synchronize();
}
-void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t id)
-{
- struct mm_struct *mm = domain->mm;
- struct arm_smmu_bond *bond = NULL, *t;
- struct arm_smmu_master *master = dev_iommu_priv_get(dev);
-
- arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
-
- mutex_lock(&sva_lock);
- list_for_each_entry(t, &master->bonds, list) {
- if (t->mm == mm) {
- bond = t;
- break;
- }
- }
-
- if (!WARN_ON(!bond)) {
- list_del(&bond->list);
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
- kfree(bond);
- }
- mutex_unlock(&sva_lock);
-}
-
static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t id)
{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- struct mm_struct *mm = domain->mm;
- struct arm_smmu_bond *bond;
struct arm_smmu_cd target;
int ret;
- if (mm_get_enqcmd_pasid(mm) != id)
+ /* Prevent arm_smmu_mm_release from being called while we are attaching */
+ if (!mmget_not_zero(domain->mm))
return -EINVAL;
- mutex_lock(&sva_lock);
- bond = __arm_smmu_sva_bind(dev, mm);
- if (IS_ERR(bond)) {
- mutex_unlock(&sva_lock);
- return PTR_ERR(bond);
- }
+ /*
+ * This does not need the arm_smmu_asid_lock because SVA domains never
+ * get reassigned
+ */
+ arm_smmu_make_sva_cd(&target, master, domain->mm, smmu_domain->cd.asid);
+ ret = arm_smmu_set_pasid(master, smmu_domain, id, &target);
- arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
- ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target);
- if (ret) {
- list_del(&bond->list);
- arm_smmu_mmu_notifier_put(bond->smmu_mn);
- kfree(bond);
- mutex_unlock(&sva_lock);
- return ret;
- }
- mutex_unlock(&sva_lock);
- return 0;
+ mmput(domain->mm);
+ return ret;
}
static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
{
- kfree(to_smmu_domain(domain));
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+ /*
+ * Ensure the ASID is empty in the iommu cache before allowing reuse.
+ */
+ arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+
+ /*
+ * Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can
+ * still be called/running at this point. We allow the ASID to be
+ * reused, and if there is a race then it just suffers harmless
+ * unnecessary invalidation.
+ */
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
+
+ /*
+ * Actual free is deferred to the SRCU callback
+ * arm_smmu_mmu_notifier_free()
+ */
+ mmu_notifier_put(&smmu_domain->mmu_notifier);
}
static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
@@ -667,6 +389,8 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_domain *smmu_domain;
+ u32 asid;
+ int ret;
smmu_domain = arm_smmu_domain_alloc();
if (IS_ERR(smmu_domain))
@@ -675,5 +399,22 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
smmu_domain->smmu = smmu;
+ ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
+ XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
+ if (ret)
+ goto err_free;
+
+ smmu_domain->cd.asid = asid;
+ smmu_domain->mmu_notifier.ops = &arm_smmu_mmu_notifier_ops;
+ ret = mmu_notifier_register(&smmu_domain->mmu_notifier, mm);
+ if (ret)
+ goto err_asid;
+
return &smmu_domain->domain;
+
+err_asid:
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
+err_free:
+ kfree(smmu_domain);
+ return ERR_PTR(ret);
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 24000027253de8..2a845ab6d53b57 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1439,22 +1439,6 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
cd_table->cdtab = NULL;
}
-bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
-{
- bool free;
- struct arm_smmu_ctx_desc *old_cd;
-
- if (!cd->asid)
- return false;
-
- free = refcount_dec_and_test(&cd->refs);
- if (free) {
- old_cd = xa_erase(&arm_smmu_asid_xa, cd->asid);
- WARN_ON(old_cd != cd);
- }
- return free;
-}
-
/* Stream table manipulation functions */
static void
arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc)
@@ -2023,8 +2007,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
}
-static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size)
+int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size)
{
struct arm_smmu_master_domain *master_domain;
int i;
@@ -2062,15 +2046,7 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
if (!master->ats_enabled)
continue;
- /*
- * Non-zero ssid means SVA is co-opting the S1 domain to issue
- * invalidations for SVA PASIDs.
- */
- if (ssid != IOMMU_NO_PASID)
- arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
- else
- arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
- &cmd);
+ arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, &cmd);
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
@@ -2082,19 +2058,6 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
}
-static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
- unsigned long iova, size_t size)
-{
- return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova,
- size);
-}
-
-int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size)
-{
- return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size);
-}
-
/* IO_PGTABLE API */
static void arm_smmu_tlb_inv_context(void *cookie)
{
@@ -2283,7 +2246,6 @@ struct arm_smmu_domain *arm_smmu_domain_alloc(void)
mutex_init(&smmu_domain->init_mutex);
INIT_LIST_HEAD(&smmu_domain->devices);
spin_lock_init(&smmu_domain->devices_lock);
- INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
return smmu_domain;
}
@@ -2325,7 +2287,7 @@ static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
/* Prevent SVA from touching the CD while we're freeing it */
mutex_lock(&arm_smmu_asid_lock);
- arm_smmu_free_asid(&smmu_domain->cd);
+ xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
mutex_unlock(&arm_smmu_asid_lock);
} else {
struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
@@ -2343,11 +2305,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
u32 asid;
struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
- refcount_set(&cd->refs, 1);
-
/* Prevent SVA from modifying the ASID until it is written to the CD */
mutex_lock(&arm_smmu_asid_lock);
- ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd,
+ ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
cd->asid = (u16)asid;
mutex_unlock(&arm_smmu_asid_lock);
@@ -2835,6 +2795,9 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
/* The core code validates pasid */
+ if (smmu_domain->smmu != master->smmu)
+ return -EINVAL;
+
if (!master->cd_table.in_ste)
return -ENODEV;
@@ -2856,9 +2819,14 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
return ret;
}
-void arm_smmu_remove_pasid(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
+static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
+ struct iommu_domain *domain)
{
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_domain *smmu_domain;
+
+ smmu_domain = to_smmu_domain(domain);
+
mutex_lock(&arm_smmu_asid_lock);
arm_smmu_clear_cd(master, pasid);
if (master->ats_enabled)
@@ -3129,7 +3097,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
master->dev = dev;
master->smmu = smmu;
- INIT_LIST_HEAD(&master->bonds);
dev_iommu_priv_set(dev, master);
ret = arm_smmu_insert_master(smmu, master);
@@ -3311,12 +3278,6 @@ static int arm_smmu_def_domain_type(struct device *dev)
return 0;
}
-static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
- struct iommu_domain *domain)
-{
- arm_smmu_sva_remove_dev_pasid(domain, dev, pasid);
-}
-
static struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 212c18c70fa03e..d175d9eee6c61b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -587,9 +587,6 @@ struct arm_smmu_strtab_l1_desc {
struct arm_smmu_ctx_desc {
u16 asid;
-
- refcount_t refs;
- struct mm_struct *mm;
};
struct arm_smmu_l1_ctx_desc {
@@ -712,7 +709,6 @@ struct arm_smmu_master {
bool stall_enabled;
bool sva_enabled;
bool iopf_enabled;
- struct list_head bonds;
unsigned int ssid_bits;
};
@@ -741,7 +737,7 @@ struct arm_smmu_domain {
struct list_head devices;
spinlock_t devices_lock;
- struct list_head mmu_notifiers;
+ struct mmu_notifier mmu_notifier;
};
/* The following are exposed for testing purposes. */
@@ -805,16 +801,13 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
const struct arm_smmu_cd *cd);
-void arm_smmu_remove_pasid(struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain, ioasid_t pasid);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain);
-bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
-int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
- ioasid_t ssid, unsigned long iova, size_t size);
+int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
@@ -826,8 +819,6 @@ bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
void arm_smmu_sva_notifier_synchronize(void);
struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
-void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t id);
#else /* CONFIG_ARM_SMMU_V3_SVA */
static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
{
--
2.45.2
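The core pattern this patch adopts, in miniature: embed the mmu_notifier in
the object it guards, recover the object in callbacks with container_of(),
and let teardown ride the mmu_notifier SRCU grace period. A minimal sketch
of that pattern follows; the my_domain/my_* names are hypothetical and the
error paths are trimmed, so read it as an illustration rather than the
driver code itself:

#include <linux/err.h>
#include <linux/mmu_notifier.h>
#include <linux/slab.h>
#include <linux/types.h>

struct my_domain {
	struct mmu_notifier mn;		/* embedded, 1:1 with the object */
	u16 asid;
};

static void my_free_notifier(struct mmu_notifier *mn)
{
	/* Runs after an SRCU grace period, so no notifier callback can
	 * still be executing against this object; safe to free now. */
	kfree(container_of(mn, struct my_domain, mn));
}

static const struct mmu_notifier_ops my_mn_ops = {
	.free_notifier = my_free_notifier,
};

static struct my_domain *my_domain_alloc(struct mm_struct *mm)
{
	struct my_domain *d = kzalloc(sizeof(*d), GFP_KERNEL);
	int ret;

	if (!d)
		return ERR_PTR(-ENOMEM);
	d->mn.ops = &my_mn_ops;
	/* Registration pins the mm_struct for the notifier's lifetime */
	ret = mmu_notifier_register(&d->mn, mm);
	if (ret) {
		kfree(d);
		return ERR_PTR(ret);
	}
	return d;
}

static void my_domain_free(struct my_domain *d)
{
	/* Never kfree() directly; defer to the SRCU callback above */
	mmu_notifier_put(&d->mn);
}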
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v8 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
2024-06-04 0:15 ` [PATCH v8 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain Jason Gunthorpe
@ 2024-06-24 9:54 ` Michael Shavit
2024-06-24 17:01 ` Jason Gunthorpe
0 siblings, 1 reply; 28+ messages in thread
From: Michael Shavit @ 2024-06-24 9:54 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Nicolin Chen,
patches, Shameerali Kolothum Thodi
On Tue, Jun 4, 2024 at 8:16 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> This removes all the notifier de-duplication logic in the driver and
> relies on the core code to de-duplicate and allocate only one SVA domain
> per mm per smmu instance. This naturally gives a 1:1 relationship between
> SVA domain and mmu notifier.
>
> It is a significant simplification of the flow, as we end up with a single
> struct arm_smmu_domain for each MM and the invalidation can then be
> shifted to properly use the masters list like S1/S2 do.
>
> Remove all of the previous mmu_notifier, bond, shared cd, and cd refcount
> logic entirely.
>
> The logic here is tightly wound together with the unused BTM
> support. Since the BTM logic requires holding all the iommu_domains in a
> global ASID xarray it conflicts with the design to have a single SVA
> domain per PASID, as multiple SMMU instances will need to have different
> domains.
>
> Following patches resolve this by making the ASID xarray per-instance
> instead of global. However, converting the BTM code over to this
> methodology requires many changes.
>
> Thus, since ARM_SMMU_FEAT_BTM is never enabled, remove the parts of the
> BTM support for ASID sharing that interact with SVA as well.
>
> A followup series is already working on fully enabling the BTM support,
> that requires iommufd's VIOMMU feature to bring in the KVM's VMID as
> well. It will come with an already written patch to bring back the ASID
> sharing using a per-instance ASID xarray.
>
> https://lore.kernel.org/linux-iommu/20240208151837.35068-1-shameerali.kolothum.thodi@huawei.com/
> https://lore.kernel.org/linux-iommu/26-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com/
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 395 +++---------------
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 69 +--
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 15 +-
> 3 files changed, 86 insertions(+), 393 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index aa033cd65adc5a..a7c36654dee5a5 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -13,29 +13,9 @@
> #include "arm-smmu-v3.h"
> #include "../../io-pgtable-arm.h"
>
> -struct arm_smmu_mmu_notifier {
> - struct mmu_notifier mn;
> - struct arm_smmu_ctx_desc *cd;
> - bool cleared;
> - refcount_t refs;
> - struct list_head list;
> - struct arm_smmu_domain *domain;
> -};
> -
> -#define mn_to_smmu(mn) container_of(mn, struct arm_smmu_mmu_notifier, mn)
> -
> -struct arm_smmu_bond {
> - struct mm_struct *mm;
> - struct arm_smmu_mmu_notifier *smmu_mn;
> - struct list_head list;
> -};
> -
> -#define sva_to_bond(handle) \
> - container_of(handle, struct arm_smmu_bond, sva)
> -
> static DEFINE_MUTEX(sva_lock);
>
> -static void
> +static void __maybe_unused
> arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
> {
> struct arm_smmu_master_domain *master_domain;
> @@ -58,58 +38,6 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
> spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
> }
>
> -/*
> - * Check if the CPU ASID is available on the SMMU side. If a private context
> - * descriptor is using it, try to replace it.
> - */
> -static struct arm_smmu_ctx_desc *
> -arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
> -{
> - int ret;
> - u32 new_asid;
> - struct arm_smmu_ctx_desc *cd;
> - struct arm_smmu_device *smmu;
> - struct arm_smmu_domain *smmu_domain;
> -
> - cd = xa_load(&arm_smmu_asid_xa, asid);
> - if (!cd)
> - return NULL;
> -
> - if (cd->mm) {
> - if (WARN_ON(cd->mm != mm))
> - return ERR_PTR(-EINVAL);
> - /* All devices bound to this mm use the same cd struct. */
> - refcount_inc(&cd->refs);
> - return cd;
> - }
> -
> - smmu_domain = container_of(cd, struct arm_smmu_domain, cd);
> - smmu = smmu_domain->smmu;
> -
> - ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd,
> - XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
> - if (ret)
> - return ERR_PTR(-ENOSPC);
> - /*
> - * Race with unmap: TLB invalidations will start targeting the new ASID,
> - * which isn't assigned yet. We'll do an invalidate-all on the old ASID
> - * later, so it doesn't matter.
> - */
> - cd->asid = new_asid;
> - /*
> - * Update ASID and invalidate CD in all associated masters. There will
> - * be some overlap between use of both ASIDs, until we invalidate the
> - * TLB.
> - */
> - arm_smmu_update_s1_domain_cd_entry(smmu_domain);
> -
> - /* Invalidate TLB entries previously associated with that context */
> - arm_smmu_tlb_inv_asid(smmu, asid);
> -
> - xa_erase(&arm_smmu_asid_xa, asid);
> - return NULL;
> -}
> -
Can we leave a comment on ASID sharing in the code since it isn't
added back until the next patch series? There are references to ASID
sharing remaining (and even added in this commit) that don't make
sense without this function (e.g. "Prevent arm_smmu_share_asid() from
trying to change the ASID").
> static u64 page_size_to_cd(void)
> {
> static_assert(PAGE_SIZE == SZ_4K || PAGE_SIZE == SZ_16K ||
> @@ -187,69 +115,6 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
> }
> EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_sva_cd);
>
> -static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
> -{
> - u16 asid;
> - int err = 0;
> - struct arm_smmu_ctx_desc *cd;
> - struct arm_smmu_ctx_desc *ret = NULL;
> -
> - /* Don't free the mm until we release the ASID */
> - mmgrab(mm);
> -
> - asid = arm64_mm_context_get(mm);
> - if (!asid) {
> - err = -ESRCH;
> - goto out_drop_mm;
> - }
> -
> - cd = kzalloc(sizeof(*cd), GFP_KERNEL);
> - if (!cd) {
> - err = -ENOMEM;
> - goto out_put_context;
> - }
> -
> - refcount_set(&cd->refs, 1);
> -
> - mutex_lock(&arm_smmu_asid_lock);
> - ret = arm_smmu_share_asid(mm, asid);
> - if (ret) {
> - mutex_unlock(&arm_smmu_asid_lock);
> - goto out_free_cd;
> - }
> -
> - err = xa_insert(&arm_smmu_asid_xa, asid, cd, GFP_KERNEL);
> - mutex_unlock(&arm_smmu_asid_lock);
> -
> - if (err)
> - goto out_free_asid;
> -
> - cd->asid = asid;
> - cd->mm = mm;
> -
> - return cd;
> -
> -out_free_asid:
> - arm_smmu_free_asid(cd);
> -out_free_cd:
> - kfree(cd);
> -out_put_context:
> - arm64_mm_context_put(mm);
> -out_drop_mm:
> - mmdrop(mm);
> - return err < 0 ? ERR_PTR(err) : ret;
> -}
> -
> -static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd)
> -{
> - if (arm_smmu_free_asid(cd)) {
> - /* Unpin ASID */
> - arm64_mm_context_put(cd->mm);
> - mmdrop(cd->mm);
> - kfree(cd);
> - }
> -}
> -
> /*
> * Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this
> * is used as a threshold to replace per-page TLBI commands to issue in the
> @@ -264,8 +129,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
> unsigned long start,
> unsigned long end)
> {
> - struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
> - struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
> + struct arm_smmu_domain *smmu_domain =
> + container_of(mn, struct arm_smmu_domain, mmu_notifier);
> size_t size;
>
> /*
> @@ -282,34 +147,22 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
> size = 0;
> }
>
> - if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM)) {
> - if (!size)
> - arm_smmu_tlb_inv_asid(smmu_domain->smmu,
> - smmu_mn->cd->asid);
> - else
> - arm_smmu_tlb_inv_range_asid(start, size,
> - smmu_mn->cd->asid,
> - PAGE_SIZE, false,
> - smmu_domain);
> - }
> + if (!size)
> + arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
> + else
> + arm_smmu_tlb_inv_range_asid(start, size, smmu_domain->cd.asid,
> + PAGE_SIZE, false, smmu_domain);
>
> - arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start,
> - size);
> + arm_smmu_atc_inv_domain(smmu_domain, start, size);
> }
>
> static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> {
> - struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
> - struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
> + struct arm_smmu_domain *smmu_domain =
> + container_of(mn, struct arm_smmu_domain, mmu_notifier);
> struct arm_smmu_master_domain *master_domain;
> unsigned long flags;
>
> - mutex_lock(&sva_lock);
> - if (smmu_mn->cleared) {
> - mutex_unlock(&sva_lock);
> - return;
> - }
> -
> /*
> * DMA may still be running. Keep the cd valid to avoid C_BAD_CD events,
> * but disable translation.
> @@ -321,25 +174,23 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> struct arm_smmu_cd target;
> struct arm_smmu_cd *cdptr;
>
> - cdptr = arm_smmu_get_cd_ptr(master, mm_get_enqcmd_pasid(mm));
> + cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
> if (WARN_ON(!cdptr))
> continue;
> - arm_smmu_make_sva_cd(&target, master, NULL, smmu_mn->cd->asid);
> - arm_smmu_write_cd_entry(master, mm_get_enqcmd_pasid(mm), cdptr,
> + arm_smmu_make_sva_cd(&target, master, NULL,
> + smmu_domain->cd.asid);
> + arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
> &target);
> }
> spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>
> - arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
> - arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0);
> -
> - smmu_mn->cleared = true;
> - mutex_unlock(&sva_lock);
> + arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
> + arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
> }
>
> static void arm_smmu_mmu_notifier_free(struct mmu_notifier *mn)
> {
> - kfree(mn_to_smmu(mn));
> + kfree(container_of(mn, struct arm_smmu_domain, mmu_notifier));
> }
>
> static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
> @@ -348,115 +199,6 @@ static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = {
> .free_notifier = arm_smmu_mmu_notifier_free,
> };
>
> -/* Allocate or get existing MMU notifier for this {domain, mm} pair */
> -static struct arm_smmu_mmu_notifier *
> -arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain,
> - struct mm_struct *mm)
> -{
> - int ret;
> - struct arm_smmu_ctx_desc *cd;
> - struct arm_smmu_mmu_notifier *smmu_mn;
> -
> - list_for_each_entry(smmu_mn, &smmu_domain->mmu_notifiers, list) {
> - if (smmu_mn->mn.mm == mm) {
> - refcount_inc(&smmu_mn->refs);
> - return smmu_mn;
> - }
> - }
> -
> - cd = arm_smmu_alloc_shared_cd(mm);
> - if (IS_ERR(cd))
> - return ERR_CAST(cd);
> -
> - smmu_mn = kzalloc(sizeof(*smmu_mn), GFP_KERNEL);
> - if (!smmu_mn) {
> - ret = -ENOMEM;
> - goto err_free_cd;
> - }
> -
> - refcount_set(&smmu_mn->refs, 1);
> - smmu_mn->cd = cd;
> - smmu_mn->domain = smmu_domain;
> - smmu_mn->mn.ops = &arm_smmu_mmu_notifier_ops;
> -
> - ret = mmu_notifier_register(&smmu_mn->mn, mm);
> - if (ret) {
> - kfree(smmu_mn);
> - goto err_free_cd;
> - }
> -
> - list_add(&smmu_mn->list, &smmu_domain->mmu_notifiers);
> - return smmu_mn;
> -
> -err_free_cd:
> - arm_smmu_free_shared_cd(cd);
> - return ERR_PTR(ret);
> -}
> -
> -static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn)
> -{
> - struct mm_struct *mm = smmu_mn->mn.mm;
> - struct arm_smmu_ctx_desc *cd = smmu_mn->cd;
> - struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
> -
> - if (!refcount_dec_and_test(&smmu_mn->refs))
> - return;
> -
> - list_del(&smmu_mn->list);
> -
> - /*
> - * If we went through clear(), we've already invalidated, and no
> - * new TLB entry can have been formed.
> - */
> - if (!smmu_mn->cleared) {
> - arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid);
> - arm_smmu_atc_inv_domain_sva(smmu_domain,
> - mm_get_enqcmd_pasid(mm), 0, 0);
> - }
> -
> - /* Frees smmu_mn */
> - mmu_notifier_put(&smmu_mn->mn);
> - arm_smmu_free_shared_cd(cd);
> -}
> -
> -static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev,
> - struct mm_struct *mm)
> -{
> - int ret;
> - struct arm_smmu_bond *bond;
> - struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> - struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> - struct arm_smmu_domain *smmu_domain;
> -
> - if (!(domain->type & __IOMMU_DOMAIN_PAGING))
> - return ERR_PTR(-ENODEV);
> - smmu_domain = to_smmu_domain(domain);
> - if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
> - return ERR_PTR(-ENODEV);
> -
> - if (!master || !master->sva_enabled)
> - return ERR_PTR(-ENODEV);
> -
> - bond = kzalloc(sizeof(*bond), GFP_KERNEL);
> - if (!bond)
> - return ERR_PTR(-ENOMEM);
> -
> - bond->mm = mm;
> -
> - bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm);
> - if (IS_ERR(bond->smmu_mn)) {
> - ret = PTR_ERR(bond->smmu_mn);
> - goto err_free_bond;
> - }
> -
> - list_add(&bond->list, &master->bonds);
> - return bond;
> -
> -err_free_bond:
> - kfree(bond);
> - return ERR_PTR(ret);
> -}
> -
> bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
> {
> unsigned long reg, fld;
> @@ -573,11 +315,6 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master)
> int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
> {
> mutex_lock(&sva_lock);
> - if (!list_empty(&master->bonds)) {
> - dev_err(master->dev, "cannot disable SVA, device is bound\n");
> - mutex_unlock(&sva_lock);
> - return -EBUSY;
> - }
> arm_smmu_master_sva_disable_iopf(master);
> master->sva_enabled = false;
> mutex_unlock(&sva_lock);
> @@ -594,66 +331,51 @@ void arm_smmu_sva_notifier_synchronize(void)
> mmu_notifier_synchronize();
> }
>
> -void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
> - struct device *dev, ioasid_t id)
> -{
> - struct mm_struct *mm = domain->mm;
> - struct arm_smmu_bond *bond = NULL, *t;
> - struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> -
> - arm_smmu_remove_pasid(master, to_smmu_domain(domain), id);
> -
> - mutex_lock(&sva_lock);
> - list_for_each_entry(t, &master->bonds, list) {
> - if (t->mm == mm) {
> - bond = t;
> - break;
> - }
> - }
> -
> - if (!WARN_ON(!bond)) {
> - list_del(&bond->list);
> - arm_smmu_mmu_notifier_put(bond->smmu_mn);
> - kfree(bond);
> - }
> - mutex_unlock(&sva_lock);
> -}
> -
> static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
> struct device *dev, ioasid_t id)
> {
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> - struct mm_struct *mm = domain->mm;
> - struct arm_smmu_bond *bond;
> struct arm_smmu_cd target;
> int ret;
>
> - if (mm_get_enqcmd_pasid(mm) != id)
> + /* Prevent arm_smmu_mm_release from being called while we are attaching */
> + if (!mmget_not_zero(domain->mm))
> return -EINVAL;
>
> - mutex_lock(&sva_lock);
> - bond = __arm_smmu_sva_bind(dev, mm);
> - if (IS_ERR(bond)) {
> - mutex_unlock(&sva_lock);
> - return PTR_ERR(bond);
> - }
> + /*
> + * This does not need the arm_smmu_asid_lock because SVA domains never
> + * get reassigned
> + */
> + arm_smmu_make_sva_cd(&target, master, domain->mm, smmu_domain->cd.asid);
> + ret = arm_smmu_set_pasid(master, smmu_domain, id, &target);
>
> - arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid);
> - ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target);
> - if (ret) {
> - list_del(&bond->list);
> - arm_smmu_mmu_notifier_put(bond->smmu_mn);
> - kfree(bond);
> - mutex_unlock(&sva_lock);
> - return ret;
> - }
> - mutex_unlock(&sva_lock);
> - return 0;
> + mmput(domain->mm);
> + return ret;
> }
>
> static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
> {
> - kfree(to_smmu_domain(domain));
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +
> + /*
> + * Ensure the ASID is empty in the iommu cache before allowing reuse.
> + */
> + arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
> +
> + /*
> + * Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can
> + * still be called/running at this point. We allow the ASID to be
> + * reused, and if there is a race then it just suffers harmless
> + * unnecessary invalidation.
> + */
> + xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
> +
> + /*
> + * Actual free is deferred to the SRCU callback
> + * arm_smmu_mmu_notifier_free()
> + */
> + mmu_notifier_put(&smmu_domain->mmu_notifier);
> }
>
> static const struct iommu_domain_ops arm_smmu_sva_domain_ops = {
> @@ -667,6 +389,8 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> struct arm_smmu_device *smmu = master->smmu;
> struct arm_smmu_domain *smmu_domain;
> + u32 asid;
> + int ret;
>
> smmu_domain = arm_smmu_domain_alloc();
> if (IS_ERR(smmu_domain))
> @@ -675,5 +399,22 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
> smmu_domain->domain.ops = &arm_smmu_sva_domain_ops;
> smmu_domain->smmu = smmu;
>
> + ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
> + XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
> + if (ret)
> + goto err_free;
> +
> + smmu_domain->cd.asid = asid;
> + smmu_domain->mmu_notifier.ops = &arm_smmu_mmu_notifier_ops;
> + ret = mmu_notifier_register(&smmu_domain->mmu_notifier, mm);
> + if (ret)
> + goto err_asid;
> +
> return &smmu_domain->domain;
> +
> +err_asid:
> + xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
> +err_free:
> + kfree(smmu_domain);
> + return ERR_PTR(ret);
> }
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 24000027253de8..2a845ab6d53b57 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1439,22 +1439,6 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master)
> cd_table->cdtab = NULL;
> }
>
> -bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd)
> -{
> - bool free;
> - struct arm_smmu_ctx_desc *old_cd;
> -
> - if (!cd->asid)
> - return false;
> -
> - free = refcount_dec_and_test(&cd->refs);
> - if (free) {
> - old_cd = xa_erase(&arm_smmu_asid_xa, cd->asid);
> - WARN_ON(old_cd != cd);
> - }
> - return free;
> -}
> -
> /* Stream table manipulation functions */
> static void
> arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc)
> @@ -2023,8 +2007,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
> return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
> }
>
> -static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
> - ioasid_t ssid, unsigned long iova, size_t size)
> +int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
> + unsigned long iova, size_t size)
> {
> struct arm_smmu_master_domain *master_domain;
> int i;
> @@ -2062,15 +2046,7 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
> if (!master->ats_enabled)
> continue;
>
> - /*
> - * Non-zero ssid means SVA is co-opting the S1 domain to issue
> - * invalidations for SVA PASIDs.
> - */
> - if (ssid != IOMMU_NO_PASID)
> - arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
> - else
> - arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
> - &cmd);
> + arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, &cmd);
>
> for (i = 0; i < master->num_streams; i++) {
> cmd.atc.sid = master->streams[i].id;
> @@ -2082,19 +2058,6 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
> return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
> }
>
> -static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
> - unsigned long iova, size_t size)
> -{
> - return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova,
> - size);
> -}
> -
> -int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
> - ioasid_t ssid, unsigned long iova, size_t size)
> -{
> - return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size);
> -}
> -
> /* IO_PGTABLE API */
> static void arm_smmu_tlb_inv_context(void *cookie)
> {
> @@ -2283,7 +2246,6 @@ struct arm_smmu_domain *arm_smmu_domain_alloc(void)
> mutex_init(&smmu_domain->init_mutex);
> INIT_LIST_HEAD(&smmu_domain->devices);
> spin_lock_init(&smmu_domain->devices_lock);
> - INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
>
> return smmu_domain;
> }
> @@ -2325,7 +2287,7 @@ static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
> if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> /* Prevent SVA from touching the CD while we're freeing it */
> mutex_lock(&arm_smmu_asid_lock);
> - arm_smmu_free_asid(&smmu_domain->cd);
> + xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
> mutex_unlock(&arm_smmu_asid_lock);
> } else {
> struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
> @@ -2343,11 +2305,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
> u32 asid;
> struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
>
> - refcount_set(&cd->refs, 1);
> -
> /* Prevent SVA from modifying the ASID until it is written to the CD */
> mutex_lock(&arm_smmu_asid_lock);
> - ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd,
> + ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
> XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
> cd->asid = (u16)asid;
> mutex_unlock(&arm_smmu_asid_lock);
> @@ -2835,6 +2795,9 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
>
> /* The core code validates pasid */
>
> + if (smmu_domain->smmu != master->smmu)
> + return -EINVAL;
> +
> if (!master->cd_table.in_ste)
> return -ENODEV;
>
> @@ -2856,9 +2819,14 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
> return ret;
> }
>
> -void arm_smmu_remove_pasid(struct arm_smmu_master *master,
> - struct arm_smmu_domain *smmu_domain, ioasid_t pasid)
> +static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
> + struct iommu_domain *domain)
> {
> + struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> + struct arm_smmu_domain *smmu_domain;
> +
> + smmu_domain = to_smmu_domain(domain);
> +
> mutex_lock(&arm_smmu_asid_lock);
> arm_smmu_clear_cd(master, pasid);
> if (master->ats_enabled)
> @@ -3129,7 +3097,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
>
> master->dev = dev;
> master->smmu = smmu;
> - INIT_LIST_HEAD(&master->bonds);
> dev_iommu_priv_set(dev, master);
>
> ret = arm_smmu_insert_master(smmu, master);
> @@ -3311,12 +3278,6 @@ static int arm_smmu_def_domain_type(struct device *dev)
> return 0;
> }
>
> -static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
> - struct iommu_domain *domain)
> -{
> - arm_smmu_sva_remove_dev_pasid(domain, dev, pasid);
> -}
> -
> static struct iommu_ops arm_smmu_ops = {
> .identity_domain = &arm_smmu_identity_domain,
> .blocked_domain = &arm_smmu_blocked_domain,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 212c18c70fa03e..d175d9eee6c61b 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -587,9 +587,6 @@ struct arm_smmu_strtab_l1_desc {
>
> struct arm_smmu_ctx_desc {
> u16 asid;
> -
> - refcount_t refs;
> - struct mm_struct *mm;
> };
>
> struct arm_smmu_l1_ctx_desc {
> @@ -712,7 +709,6 @@ struct arm_smmu_master {
> bool stall_enabled;
> bool sva_enabled;
> bool iopf_enabled;
> - struct list_head bonds;
> unsigned int ssid_bits;
> };
>
> @@ -741,7 +737,7 @@ struct arm_smmu_domain {
> struct list_head devices;
> spinlock_t devices_lock;
>
> - struct list_head mmu_notifiers;
> + struct mmu_notifier mmu_notifier;
> };
>
> /* The following are exposed for testing purposes. */
> @@ -805,16 +801,13 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
> int arm_smmu_set_pasid(struct arm_smmu_master *master,
> struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
> const struct arm_smmu_cd *cd);
> -void arm_smmu_remove_pasid(struct arm_smmu_master *master,
> - struct arm_smmu_domain *smmu_domain, ioasid_t pasid);
>
> void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
> void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
> size_t granule, bool leaf,
> struct arm_smmu_domain *smmu_domain);
> -bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
> -int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain,
> - ioasid_t ssid, unsigned long iova, size_t size);
> +int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
> + unsigned long iova, size_t size);
>
> #ifdef CONFIG_ARM_SMMU_V3_SVA
> bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
> @@ -826,8 +819,6 @@ bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
> void arm_smmu_sva_notifier_synchronize(void);
> struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
> struct mm_struct *mm);
> -void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain,
> - struct device *dev, ioasid_t id);
> #else /* CONFIG_ARM_SMMU_V3_SVA */
> static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
> {
> --
> 2.45.2
>
Reviewed-by: Michael Shavit <mshavit@google.com>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v8 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
2024-06-24 9:54 ` Michael Shavit
@ 2024-06-24 17:01 ` Jason Gunthorpe
0 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-24 17:01 UTC (permalink / raw)
To: Michael Shavit
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Nicolin Chen,
patches, Shameerali Kolothum Thodi
On Mon, Jun 24, 2024 at 05:54:42PM +0800, Michael Shavit wrote:
> Can we leave a comment on ASID sharing in the code since it isn't
> added back until the next patch series? There are references to ASID
> sharing remaining (and even added in this commit) that don't make
> sense without this function (e.g. "Prevent arm_smmu_share_asid() from
> trying to change the ASID").
Yes, I left the comment references because I really do expect it to
come back soon.
My plan, broadly, is to allow the domains to be shared across smmu
instances which should introduce the infrastructure to avoid the
invalidation race in unshare by letting the domain have multiple ASIDs
at the same time.
After that we would add in vBTM support, this is BTM on systems that
only support S1 with no S2. This avoids the VMID issue that is
blocking it while still being useful.
pBTM would come after the IOMMUFD VIOMMU support that Nicolin is
working on as the VIOMMU would be the vehicle to bring in the KVM VMID
binding from userspace.
I can delete the comments too, but then someone will ask why not
delete all the locking as well. :\
Thanks,
Jason
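A per-instance ASID xarray along the lines Jason describes might look
roughly like the sketch below. This is purely illustrative: the
smmu_instance type, the asid_map field, and the helper names are invented
here and do not come from any posted patch.

#include <linux/gfp.h>
#include <linux/types.h>
#include <linux/xarray.h>

struct smmu_instance {			/* hypothetical stand-in */
	struct xarray asid_map;		/* replaces the global arm_smmu_asid_xa */
	unsigned int asid_bits;
};

static int instance_alloc_asid(struct smmu_instance *smmu, void *owner,
			       u32 *asid)
{
	/* ASIDs become unique only within one SMMU instance, so a single
	 * SVA domain can hold a different ASID on every instance it is
	 * attached to, which is what cross-instance sharing needs. */
	return xa_alloc(&smmu->asid_map, asid, owner,
			XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
}

static void instance_free_asid(struct smmu_instance *smmu, u32 asid)
{
	xa_erase(&smmu->asid_map, asid);
}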
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v8 11/14] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (9 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 10/14] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 12/14] iommu/arm-smmu-v3: Test the STE S1DSS functionality Jason Gunthorpe
` (4 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameerali Kolothum Thodi
The HW supports this; use the S1DSS bits to configure the behavior
of SSID=0, which is the RID's translation.
If SSIDs are currently being used in the CD table then just update the
S1DSS bits in the STE, remove the master_domain and leave ATS alone.
For iommufd the driver design has a small problem that all the unused CD
table entries are set with V=0 which will generate an event if VFIO
userspace tries to use the CD entry. This patch extends this problem to
include the RID as well if PASID is being used.
For BLOCKED with used PASIDs, the
F_STREAM_DISABLED (STRTAB_STE_1_S1DSS_TERMINATE) event is generated on
untagged traffic, and a substream CD table entry with V=0 (removed PASID)
will generate C_BAD_CD. Arguably there is no advantage to using S1DSS over
the CD entry 0 with V=0.
As we don't yet support PASID in iommufd this is a problem to resolve
later, possibly by using EPD0 for unused CD table entries instead of V=0,
and not using S1DSS for BLOCKED.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 60 +++++++++++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 +-
3 files changed, 50 insertions(+), 16 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index a460b71f585789..d7e022bb9df530 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -164,7 +164,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
.smmu = &smmu,
};
- arm_smmu_make_cdtable_ste(ste, &master, true);
+ arm_smmu_make_cdtable_ste(ste, &master, true, STRTAB_STE_1_S1DSS_SSID0);
}
static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2a845ab6d53b57..1b43fc1fe85387 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -991,6 +991,14 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW |
STRTAB_STE_1_EATS);
used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);
+
+ /*
+ * See 13.5 Summary of attribute/permission configuration fields
+ * for the SHCFG behavior.
+ */
+ if (FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent[1])) ==
+ STRTAB_STE_1_S1DSS_BYPASS)
+ used_bits[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
}
/* S2 translates */
@@ -1531,7 +1539,8 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_bypass_ste);
VISIBLE_IF_KUNIT
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master, bool ats_enabled)
+ struct arm_smmu_master *master, bool ats_enabled,
+ unsigned int s1dss)
{
struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
struct arm_smmu_device *smmu = master->smmu;
@@ -1545,7 +1554,7 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax));
target->data[1] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
+ FIELD_PREP(STRTAB_STE_1_S1DSS, s1dss) |
FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
@@ -1556,6 +1565,11 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_1_EATS,
ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+ if ((smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR) &&
+ s1dss == STRTAB_STE_1_S1DSS_BYPASS)
+ target->data[1] |= cpu_to_le64(FIELD_PREP(
+ STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
+
if (smmu->features & ARM_SMMU_FEAT_E2H) {
/*
* To support BTM the streamworld needs to match the
@@ -2579,6 +2593,7 @@ struct arm_smmu_attach_state {
/* Inputs */
struct iommu_domain *old_domain;
struct arm_smmu_master *master;
+ bool cd_needs_ats;
ioasid_t ssid;
/* Resulting state */
bool ats_enabled;
@@ -2621,7 +2636,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
*/
lockdep_assert_held(&arm_smmu_asid_lock);
- if (smmu_domain) {
+ if (smmu_domain || state->cd_needs_ats) {
/*
* The SMMU does not support enabling ATS with bypass/abort.
* When the STE is in bypass (STE.Config[2:0] == 0b100), ATS
@@ -2633,7 +2648,9 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
* tables.
*/
state->ats_enabled = arm_smmu_ats_supported(master);
+ }
+ if (smmu_domain) {
master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
if (!master_domain)
return -ENOMEM;
@@ -2761,7 +2778,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
&target_cd);
- arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled);
+ arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled,
+ STRTAB_STE_1_S1DSS_SSID0);
arm_smmu_install_ste_for_dev(master, &target);
break;
}
@@ -2835,8 +2853,10 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
mutex_unlock(&arm_smmu_asid_lock);
}
-static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
- struct device *dev, struct arm_smmu_ste *ste)
+static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
+ struct device *dev,
+ struct arm_smmu_ste *ste,
+ unsigned int s1dss)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_attach_state state = {
@@ -2845,16 +2865,28 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
.ssid = IOMMU_NO_PASID,
};
- if (arm_smmu_ssids_in_use(&master->cd_table))
- return -EBUSY;
-
/*
* Do not allow any ASID to be changed while we are working on the STE,
* otherwise we could miss invalidations.
*/
mutex_lock(&arm_smmu_asid_lock);
- arm_smmu_attach_prepare(&state, domain);
+ /*
+ * If the CD table is not in use we can use the provided STE, otherwise
+ * we use a cdtable STE with the provided S1DSS.
+ */
+ if (arm_smmu_ssids_in_use(&master->cd_table)) {
+ /*
+ * If a CD table has to be present then we need to run with ATS
+ * on even though the RID will fail ATS queries with UR. This is
+ * because we have no idea what the PASIDs need.
+ */
+ state.cd_needs_ats = true;
+ arm_smmu_attach_prepare(&state, domain);
+ arm_smmu_make_cdtable_ste(ste, master, state.ats_enabled, s1dss);
+ } else {
+ arm_smmu_attach_prepare(&state, domain);
+ }
arm_smmu_install_ste_for_dev(master, ste);
arm_smmu_attach_commit(&state);
mutex_unlock(&arm_smmu_asid_lock);
@@ -2865,7 +2897,6 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
* descriptor from arm_smmu_share_asid().
*/
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
- return 0;
}
static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
@@ -2875,7 +2906,8 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
arm_smmu_make_bypass_ste(master->smmu, &ste);
- return arm_smmu_attach_dev_ste(domain, dev, &ste);
+ arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
+ return 0;
}
static const struct iommu_domain_ops arm_smmu_identity_ops = {
@@ -2893,7 +2925,9 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
struct arm_smmu_ste ste;
arm_smmu_make_abort_ste(&ste);
- return arm_smmu_attach_dev_ste(domain, dev, &ste);
+ arm_smmu_attach_dev_ste(domain, dev, &ste,
+ STRTAB_STE_1_S1DSS_TERMINATE);
+ return 0;
}
static const struct iommu_domain_ops arm_smmu_blocked_ops = {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index d175d9eee6c61b..30459a800c7b2d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -761,8 +761,8 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target);
void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu,
struct arm_smmu_ste *target);
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
- struct arm_smmu_master *master,
- bool ats_enabled);
+ struct arm_smmu_master *master, bool ats_enabled,
+ unsigned int s1dss);
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain,
--
2.45.2
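To make the S1DSS behavior above concrete, here is a small standalone model
of how the patch packs STE qword 1. The field offsets are copied from my
reading of arm-smmu-v3.h and the SMMUv3 spec and should be treated as
illustrative rather than authoritative:

#include <stdint.h>
#include <stdio.h>

/* STE qword 1 fields, per the driver header (illustrative) */
#define S1DSS_SHIFT	0		/* STRTAB_STE_1_S1DSS, bits [1:0] */
#define S1DSS_TERMINATE	0x0ULL		/* untagged traffic -> F_STREAM_DISABLED */
#define S1DSS_BYPASS	0x1ULL		/* untagged traffic bypasses translation */
#define S1DSS_SSID0	0x2ULL		/* untagged traffic uses CD entry 0 */
#define SHCFG_SHIFT	44		/* STRTAB_STE_1_SHCFG, bits [45:44] */
#define SHCFG_INCOMING	0x1ULL

static uint64_t ste_qword1(uint64_t s1dss, int attr_types_ovr)
{
	uint64_t q = s1dss << S1DSS_SHIFT;

	/* The patch sets SHCFG=INCOMING only for S1DSS bypass, and only
	 * when the HW supports overriding attribute types. */
	if (attr_types_ovr && s1dss == S1DSS_BYPASS)
		q |= SHCFG_INCOMING << SHCFG_SHIFT;
	return q;
}

int main(void)
{
	printf("IDENTITY w/ PASIDs: qw1=%#llx\n",
	       (unsigned long long)ste_qword1(S1DSS_BYPASS, 1));
	printf("BLOCKED  w/ PASIDs: qw1=%#llx\n",
	       (unsigned long long)ste_qword1(S1DSS_TERMINATE, 1));
	return 0;
}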
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v8 12/14] iommu/arm-smmu-v3: Test the STE S1DSS functionality
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (10 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 11/14] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 0:15 ` [PATCH v8 13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED Jason Gunthorpe
` (3 subsequent siblings)
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameer Kolothum
S1DSS brings in quite a few new transition pairs that are
interesting. Test to/from S1DSS_BYPASS <-> S1DSS_SSID0, and
BYPASS <-> S1DSS_SSID0.
Test a contrived non-hitless flow to make sure that the logic works.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 113 +++++++++++++++++-
1 file changed, 108 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index d7e022bb9df530..e0fce31eba54dd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -144,6 +144,14 @@ static void arm_smmu_v3_test_ste_expect_transition(
KUNIT_EXPECT_MEMEQ(test, target->data, cur_copy.data, sizeof(cur_copy));
}
+static void arm_smmu_v3_test_ste_expect_non_hitless_transition(
+ struct kunit *test, const struct arm_smmu_ste *cur,
+ const struct arm_smmu_ste *target, unsigned int num_syncs_expected)
+{
+ arm_smmu_v3_test_ste_expect_transition(test, cur, target,
+ num_syncs_expected, false);
+}
+
static void arm_smmu_v3_test_ste_expect_hitless_transition(
struct kunit *test, const struct arm_smmu_ste *cur,
const struct arm_smmu_ste *target, unsigned int num_syncs_expected)
@@ -155,6 +163,7 @@ static void arm_smmu_v3_test_ste_expect_hitless_transition(
static const dma_addr_t fake_cdtab_dma_addr = 0xF0F0F0F0F0F0;
static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
+ unsigned int s1dss,
const dma_addr_t dma_addr)
{
struct arm_smmu_master master = {
@@ -164,7 +173,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste,
.smmu = &smmu,
};
- arm_smmu_make_cdtable_ste(ste, &master, true, STRTAB_STE_1_S1DSS_SSID0);
+ arm_smmu_make_cdtable_ste(ste, &master, true, s1dss);
}
static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test)
@@ -194,7 +203,8 @@ static void arm_smmu_v3_write_ste_test_cdtable_to_abort(struct kunit *test)
{
struct arm_smmu_ste ste;
- arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &abort_ste,
NUM_EXPECTED_SYNCS(2));
}
@@ -203,7 +213,8 @@ static void arm_smmu_v3_write_ste_test_abort_to_cdtable(struct kunit *test)
{
struct arm_smmu_ste ste;
- arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
arm_smmu_v3_test_ste_expect_hitless_transition(test, &abort_ste, &ste,
NUM_EXPECTED_SYNCS(2));
}
@@ -212,7 +223,8 @@ static void arm_smmu_v3_write_ste_test_cdtable_to_bypass(struct kunit *test)
{
struct arm_smmu_ste ste;
- arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &bypass_ste,
NUM_EXPECTED_SYNCS(3));
}
@@ -221,11 +233,54 @@ static void arm_smmu_v3_write_ste_test_bypass_to_cdtable(struct kunit *test)
{
struct arm_smmu_ste ste;
- arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
arm_smmu_v3_test_ste_expect_hitless_transition(test, &bypass_ste, &ste,
NUM_EXPECTED_SYNCS(3));
}
+static void arm_smmu_v3_write_ste_test_cdtable_s1dss_change(struct kunit *test)
+{
+ struct arm_smmu_ste ste;
+ struct arm_smmu_ste s1dss_bypass;
+
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS,
+ fake_cdtab_dma_addr);
+
+ /*
+ * Flipping s1dss on a CD table STE only involves changes to the second
+ * qword of an STE and can be done in a single write.
+ */
+ arm_smmu_v3_test_ste_expect_hitless_transition(
+ test, &ste, &s1dss_bypass, NUM_EXPECTED_SYNCS(1));
+ arm_smmu_v3_test_ste_expect_hitless_transition(
+ test, &s1dss_bypass, &ste, NUM_EXPECTED_SYNCS(1));
+}
+
+static void
+arm_smmu_v3_write_ste_test_s1dssbypass_to_stebypass(struct kunit *test)
+{
+ struct arm_smmu_ste s1dss_bypass;
+
+ arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS,
+ fake_cdtab_dma_addr);
+ arm_smmu_v3_test_ste_expect_hitless_transition(
+ test, &s1dss_bypass, &bypass_ste, NUM_EXPECTED_SYNCS(2));
+}
+
+static void
+arm_smmu_v3_write_ste_test_stebypass_to_s1dssbypass(struct kunit *test)
+{
+ struct arm_smmu_ste s1dss_bypass;
+
+ arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS,
+ fake_cdtab_dma_addr);
+ arm_smmu_v3_test_ste_expect_hitless_transition(
+ test, &bypass_ste, &s1dss_bypass, NUM_EXPECTED_SYNCS(2));
+}
+
static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
bool ats_enabled)
{
@@ -285,6 +340,48 @@ static void arm_smmu_v3_write_ste_test_bypass_to_s2(struct kunit *test)
NUM_EXPECTED_SYNCS(2));
}
+static void arm_smmu_v3_write_ste_test_s1_to_s2(struct kunit *test)
+{
+ struct arm_smmu_ste s1_ste;
+ struct arm_smmu_ste s2_ste;
+
+ arm_smmu_test_make_cdtable_ste(&s1_ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
+ arm_smmu_test_make_s2_ste(&s2_ste, true);
+ arm_smmu_v3_test_ste_expect_hitless_transition(test, &s1_ste, &s2_ste,
+ NUM_EXPECTED_SYNCS(3));
+}
+
+static void arm_smmu_v3_write_ste_test_s2_to_s1(struct kunit *test)
+{
+ struct arm_smmu_ste s1_ste;
+ struct arm_smmu_ste s2_ste;
+
+ arm_smmu_test_make_cdtable_ste(&s1_ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
+ arm_smmu_test_make_s2_ste(&s2_ste, true);
+ arm_smmu_v3_test_ste_expect_hitless_transition(test, &s2_ste, &s1_ste,
+ NUM_EXPECTED_SYNCS(3));
+}
+
+static void arm_smmu_v3_write_ste_test_non_hitless(struct kunit *test)
+{
+ struct arm_smmu_ste ste;
+ struct arm_smmu_ste ste_2;
+
+ /*
+ * Although no flow resembles this in practice, one way to force an STE
+ * update to be non-hitless is to change its CD table pointer as well as
+ * s1 dss field in the same update.
+ */
+ arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0,
+ fake_cdtab_dma_addr);
+ arm_smmu_test_make_cdtable_ste(&ste_2, STRTAB_STE_1_S1DSS_BYPASS,
+ 0x4B4B4b4B4B);
+ arm_smmu_v3_test_ste_expect_non_hitless_transition(
+ test, &ste, &ste_2, NUM_EXPECTED_SYNCS(3));
+}
+
static void arm_smmu_v3_test_cd_expect_transition(
struct kunit *test, const struct arm_smmu_cd *cur,
const struct arm_smmu_cd *target, unsigned int num_syncs_expected,
@@ -438,10 +535,16 @@ static struct kunit_case arm_smmu_v3_test_cases[] = {
KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_cdtable),
KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_to_bypass),
KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_cdtable),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_s1dss_change),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_s1dssbypass_to_stebypass),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_stebypass_to_s1dssbypass),
KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_abort),
KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_s2),
KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_bypass),
KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_s2),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_s1_to_s2),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_s1),
+ KUNIT_CASE(arm_smmu_v3_write_ste_test_non_hitless),
KUNIT_CASE(arm_smmu_v3_write_cd_test_s1_clear),
KUNIT_CASE(arm_smmu_v3_write_cd_test_s1_change_asid),
KUNIT_CASE(arm_smmu_v3_write_cd_test_sva_clear),
--
2.45.2
* [PATCH v8 13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (11 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 12/14] iommu/arm-smmu-v3: Test the STE S1DSS functionality Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 6:20 ` Nicolin Chen
2024-06-04 0:15 ` [PATCH v8 14/14] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID Jason Gunthorpe
` (2 subsequent siblings)
15 siblings, 1 reply; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameerali Kolothum Thodi
If the STE doesn't point to the CD table, we can upgrade it by
reprogramming the STE with the appropriate S1DSS. We may also need to turn
on ATS at the same time.
Keep track of whether the installed STE points at the cd_table, and of its
ATS state, so that this path can be triggered when needed.
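For illustration, the upgrade decision this introduces can be reduced to
the following standalone sketch (types and names are simplified stand-ins,
not the driver's real structures):

/* Sketch of the STE upgrade decision; all names here are hypothetical. */
#include <stdbool.h>
#include <stdio.h>

enum rid_domain_type { DOMAIN_IDENTITY, DOMAIN_BLOCKED };
enum s1dss_mode { S1DSS_TERMINATE, S1DSS_BYPASS };

struct fake_master {
	bool cd_table_in_ste;	/* installed STE already points at the CD table */
	bool ste_ats_enabled;	/* ATS state programmed into the installed STE */
};

/*
 * Rewrite the STE only when it isn't a CD table STE yet or the ATS state
 * has to change; otherwise the installed STE is already correct.
 */
static void update_ste(struct fake_master *m, enum rid_domain_type rid,
		       bool want_ats)
{
	enum s1dss_mode s1dss = S1DSS_TERMINATE;

	if (m->cd_table_in_ste && m->ste_ats_enabled == want_ats)
		return;

	if (rid == DOMAIN_IDENTITY)
		s1dss = S1DSS_BYPASS;	/* non-PASID traffic keeps bypassing */

	/* the real driver builds and installs a CD table STE at this point */
	printf("install CD table STE: s1dss=%s ats=%d\n",
	       s1dss == S1DSS_BYPASS ? "BYPASS" : "TERMINATE", (int)want_ats);
	m->cd_table_in_ste = true;
	m->ste_ats_enabled = want_ats;
}

int main(void)
{
	struct fake_master m = {0};

	update_ste(&m, DOMAIN_IDENTITY, true);	/* upgrades the STE */
	update_ste(&m, DOMAIN_IDENTITY, true);	/* no-op, already correct */
	return 0;
}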
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 49 ++++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +-
2 files changed, 49 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 1b43fc1fe85387..f54be4f32d30d1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2435,6 +2435,9 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
master->cd_table.in_ste =
FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(target->data[0])) ==
STRTAB_STE_0_CFG_S1_TRANS;
+ master->ste_ats_enabled =
+ FIELD_GET(STRTAB_STE_1_EATS, le64_to_cpu(target->data[1])) ==
+ STRTAB_STE_1_EATS_TRANS;
for (i = 0; i < master->num_streams; ++i) {
u32 sid = master->streams[i].id;
@@ -2796,10 +2799,36 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+static void arm_smmu_update_ste(struct arm_smmu_master *master,
+ struct iommu_domain *sid_domain,
+ bool ats_enabled)
+{
+ unsigned int s1dss = STRTAB_STE_1_S1DSS_TERMINATE;
+ struct arm_smmu_ste ste;
+
+ if (master->cd_table.in_ste && master->ste_ats_enabled == ats_enabled)
+ return;
+
+ if (sid_domain->type == IOMMU_DOMAIN_IDENTITY)
+ s1dss = STRTAB_STE_1_S1DSS_BYPASS;
+ else
+ WARN_ON(sid_domain->type != IOMMU_DOMAIN_BLOCKED);
+
+ /*
+ * Change the STE into a cdtable one with SID IDENTITY/BLOCKED behavior
+ * using S1DSS if necessary. If the cd_table is already installed then
+ * the S1DSS is correct and this will just update the EATS. Otherwise it
+ * installs the entire thing. This will be hitless.
+ */
+ arm_smmu_make_cdtable_ste(&ste, master, ats_enabled, s1dss);
+ arm_smmu_install_ste_for_dev(master, &ste);
+}
+
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
const struct arm_smmu_cd *cd)
{
+ struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev);
struct arm_smmu_attach_state state = {
.master = master,
/*
@@ -2816,8 +2845,10 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (smmu_domain->smmu != master->smmu)
return -EINVAL;
- if (!master->cd_table.in_ste)
- return -ENODEV;
+ if (!master->cd_table.in_ste &&
+ sid_domain->type != IOMMU_DOMAIN_IDENTITY &&
+ sid_domain->type != IOMMU_DOMAIN_BLOCKED)
+ return -EINVAL;
cdptr = arm_smmu_alloc_cd_ptr(master, pasid);
if (!cdptr)
@@ -2829,6 +2860,7 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
goto out_unlock;
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
+ arm_smmu_update_ste(master, sid_domain, state.ats_enabled);
arm_smmu_attach_commit(&state);
@@ -2851,6 +2883,19 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
arm_smmu_atc_inv_master(master, pasid);
arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid);
mutex_unlock(&arm_smmu_asid_lock);
+
+ /*
+ * When the last user of the CD table goes away, downgrade the STE back
+ * to a non-cd_table one.
+ */
+ if (!arm_smmu_ssids_in_use(&master->cd_table)) {
+ struct iommu_domain *sid_domain =
+ iommu_get_domain_for_dev(master->dev);
+
+ if (sid_domain->type == IOMMU_DOMAIN_IDENTITY ||
+ sid_domain->type == IOMMU_DOMAIN_BLOCKED)
+ sid_domain->ops->attach_dev(sid_domain, dev);
+ }
}
static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 30459a800c7b2d..cdd426efb384d2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -705,7 +705,8 @@ struct arm_smmu_master {
/* Locked by the iommu core using the group mutex */
struct arm_smmu_ctx_desc_cfg cd_table;
unsigned int num_streams;
- bool ats_enabled;
+ bool ats_enabled : 1;
+ bool ste_ats_enabled : 1;
bool stall_enabled;
bool sva_enabled;
bool iopf_enabled;
--
2.45.2
* Re: [PATCH v8 13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED
2024-06-04 0:15 ` [PATCH v8 13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED Jason Gunthorpe
@ 2024-06-04 6:20 ` Nicolin Chen
0 siblings, 0 replies; 28+ messages in thread
From: Nicolin Chen @ 2024-06-04 6:20 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
patches, Shameerali Kolothum Thodi
On Mon, Jun 03, 2024 at 09:15:58PM -0300, Jason Gunthorpe wrote:
> If the STE doesn't point to the CD table, we can upgrade it by
> reprogramming the STE with the appropriate S1DSS. We may also need to turn
> on ATS at the same time.
>
> Keep track of whether the installed STE points at the cd_table, and of its
> ATS state, so that this path can be triggered when needed.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
* [PATCH v8 14/14] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (12 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 13/14] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED Jason Gunthorpe
@ 2024-06-04 0:15 ` Jason Gunthorpe
2024-06-04 8:45 ` [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Nicolin Chen
2024-06-24 22:00 ` Jerry Snitselaar
15 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 0:15 UTC (permalink / raw)
To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
Cc: Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameerali Kolothum Thodi
The SVA cleanup made the SSID logic entirely general, so all we need to do
is call it with the correct cd table entry for an S1 domain.
This is slightly tricky because of the ASID and how the locking works; the
simple fix is to just update the ASID once we hold the right locks.
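For illustration, the ASID fixup performed under the lock is a
read-modify-write of the first CD qword. A standalone sketch (the field
position is an assumption, endianness handling is omitted, and the names
are hypothetical):

/* Sketch of the CD qword 0 ASID fixup; mask position is an assumption. */
#include <stdint.h>
#include <stdio.h>

#define CD_0_ASID_SHIFT	48
#define CD_0_ASID_MASK	(0xffffULL << CD_0_ASID_SHIFT)

/* Clear the stale caller-set ASID and re-set it from the domain. */
static void fix_asid(uint64_t *cd_qword0, uint16_t domain_asid)
{
	*cd_qword0 &= ~CD_0_ASID_MASK;
	*cd_qword0 |= (uint64_t)domain_asid << CD_0_ASID_SHIFT;
}

int main(void)
{
	uint64_t q0 = (5ULL << CD_0_ASID_SHIFT) | 1;	/* stale ASID 5, V=1 */

	fix_asid(&q0, 42);
	printf("qword0 = 0x%016llx\n", (unsigned long long)q0);
	return 0;
}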
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 ++++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
2 files changed, 41 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f54be4f32d30d1..85d0b4fd7943c5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2799,6 +2799,36 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return 0;
}
+static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t id)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
+ struct arm_smmu_cd target_cd;
+ int ret = 0;
+
+ mutex_lock(&smmu_domain->init_mutex);
+ if (!smmu_domain->smmu)
+ ret = arm_smmu_domain_finalise(smmu_domain, smmu);
+ else if (smmu_domain->smmu != smmu)
+ ret = -EINVAL;
+ mutex_unlock(&smmu_domain->init_mutex);
+ if (ret)
+ return ret;
+
+ if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+ return -EINVAL;
+
+ /*
+ * We can read cd.asid outside the lock because arm_smmu_set_pasid()
+ * will fix it
+ */
+ arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
+ return arm_smmu_set_pasid(master, to_smmu_domain(domain), id,
+ &target_cd);
+}
+
static void arm_smmu_update_ste(struct arm_smmu_master *master,
struct iommu_domain *sid_domain,
bool ats_enabled)
@@ -2826,7 +2856,7 @@ static void arm_smmu_update_ste(struct arm_smmu_master *master,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- const struct arm_smmu_cd *cd)
+ struct arm_smmu_cd *cd)
{
struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev);
struct arm_smmu_attach_state state = {
@@ -2859,6 +2889,14 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (ret)
goto out_unlock;
+ /*
+ * We don't want to take the asid_lock too early, so fix up the
+ * caller-set ASID under the lock in case it changed.
+ */
+ cd->data[0] &= ~cpu_to_le64(CTXDESC_CD_0_ASID);
+ cd->data[0] |= cpu_to_le64(
+ FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->cd.asid));
+
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
arm_smmu_update_ste(master, sid_domain, state.ats_enabled);
@@ -3377,6 +3415,7 @@ static struct iommu_ops arm_smmu_ops = {
.owner = THIS_MODULE,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = arm_smmu_attach_dev,
+ .set_dev_pasid = arm_smmu_s1_set_dev_pasid,
.map_pages = arm_smmu_map_pages,
.unmap_pages = arm_smmu_unmap_pages,
.flush_iotlb_all = arm_smmu_flush_iotlb_all,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index cdd426efb384d2..91ec2d49ecbf2e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -801,7 +801,7 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- const struct arm_smmu_cd *cd);
+ struct arm_smmu_cd *cd);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
--
2.45.2
* Re: [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3)
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (13 preceding siblings ...)
2024-06-04 0:15 ` [PATCH v8 14/14] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID Jason Gunthorpe
@ 2024-06-04 8:45 ` Nicolin Chen
2024-06-04 19:07 ` Jason Gunthorpe
2024-06-24 22:00 ` Jerry Snitselaar
15 siblings, 1 reply; 28+ messages in thread
From: Nicolin Chen @ 2024-06-04 8:45 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
patches, Shameer Kolothum
On Mon, Jun 03, 2024 at 09:15:45PM -0300, Jason Gunthorpe wrote:
> This is on github:
> https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
>
> v8:
> - Rebase on v6.10-rc2
It seems that the branch above isn't updated yet from 6.9-rc7 :)
FWIW, it's missing a patch from Robin fixing a regression:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.10-rc2&id=8b80549f1bc6
> - Make arm_smmu_sva_domain_alloc NULL when SVA is disabled so the core
> code sees a NULL function pointer
> - Update comments around arm_smmu_attach_prepare()
> - Rename struct attach_state -> arm_smmu_attach_state and document
> better, include more common function parameters in state
> - Consistently use ats_enabled everywhere, replacing disable_ats in state
> - Move the note about ATS bypass/abort to arm_smmu_attach_prepare()
> - Remove temporary cd_table.in_ste check in arm_smmu_sva_set_dev_pasid()
> - Improve comments and clarity of logic in arm_smmu_attach_commit()
> - Shorten arm_smmu_mmu_notifier_free()
> - Fix domain -> sid_domain typo in arm_smmu_remove_dev_pasid()
> Cc: Nicolin Chen <nicolinc@nvidia.com>
With that, I retested with SVA sanity running on S1DSS.SSID0 and
S1DSS.BYPASS modes. Both work well. The kunit test passes also.
So, my previous Tested-by still stands.
Thanks
Nicolin
> Cc: Michael Shavit <mshavit@google.com>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Moritz Fischer <mdf@kernel.org>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* Re: [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3)
2024-06-04 8:45 ` [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Nicolin Chen
@ 2024-06-04 19:07 ` Jason Gunthorpe
0 siblings, 0 replies; 28+ messages in thread
From: Jason Gunthorpe @ 2024-06-04 19:07 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
patches, Shameer Kolothum
On Tue, Jun 04, 2024 at 01:45:01AM -0700, Nicolin Chen wrote:
> On Mon, Jun 03, 2024 at 09:15:45PM -0300, Jason Gunthorpe wrote:
>
> > This is on github:
> > https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
> >
> > v8:
> > - Rebase on v6.10-rc2
>
> It seems that the branch above isn't updated yet from 6.9-rc7 :)
I missed a push, it is sorted out now
Thanks,
Jason
* Re: [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3)
2024-06-04 0:15 [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Jason Gunthorpe
` (14 preceding siblings ...)
2024-06-04 8:45 ` [PATCH v8 00/14] Update SMMUv3 to the modern iommu API (part 2b/3) Nicolin Chen
@ 2024-06-24 22:00 ` Jerry Snitselaar
15 siblings, 0 replies; 28+ messages in thread
From: Jerry Snitselaar @ 2024-06-24 22:00 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
Eric Auger, Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Nicolin Chen, patches, Shameer Kolothum
On Mon, Jun 03, 2024 at 09:15:45PM GMT, Jason Gunthorpe wrote:
> Continuing the work of parts 1 and 2a, this part focuses on the PASID and
> SVA code, bringing these functional improvements:
>
> - attach_dev failure does not change the HW configuration.
>
> - Full PASID API support including:
> - S1/SVA domains attached to PASIDs
> - IDENTITY/BLOCKED/S1 attached to RID
> - Change of the RID domain while PASIDs are attached
>
> - Streamlined SVA support using the core infrastructure
>
> - Hitless, whenever possible, change between two domains
>
> This requires some reorganizing of how the invalidation is tracked; a
> flexible linked list containing the SSIDs as well as the ATC information
> is used instead of a single master.
>
> The revised invalidation infrastructure is used to build generic support
> for attaching any sort of domain to any SSID, including the necessary
> maintenance of the invalidation list and the global ATS state. This
> mechanism is used again in part 3 as part of the nesting series.
>
> The ATS ordering is generalized and consolidated so that the PASID flow
> can use it, and is put into a form where it is fully hitless whenever
> possible. This is necessary as changes to single PASIDs or RIDs cannot
> change global state like ATS without impacting other, still attached,
> PASIDs or RIDs.
>
> Next simply replace the entire outdated SVA mmu_notifier implementation in
> one shot and switch it over to the newly created generic PASID layer and
> generic invalidation logic. This avoids the messy and confusing approach
> of trying to incrementally untangle this in place. The new code is small
> and simple enough this is much better than trying to figure out smaller
> steps.
>
> Once SVA is resting on the consolidated PASID layer, it is straightforward
> to make the PASID interface functionally complete by supporting S1DSS to
> allow concurrent change of the RID while a PASID is attached and allow
> supporting PASID over an IDENTITY RID.
>
> This continues to support lazy allocation and installation of the CD table
> by continuing to use the CFG mode for IDENTITY/BLOCKED when the PASID
> table is empty.
>
> Finally enable attaching any S1 domain to a PASID, not just SVA.
>
> It achieves the same goals as the several series from Michael and the
> S1DSS series from Nicolin that were trying to improve portions of the API.
>
> This is on github:
> https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
>
> v8:
> - Rebase on v6.10-rc2
> - Make arm_smmu_sva_domain_alloc NULL when SVA is disabled so the core
> code sees a NULL function pointer
> - Update comments around arm_smmu_attach_prepare()
> - Rename struct attach_state -> arm_smmu_attach_state and document
> better, include more common function parameters in state
> - Consistently use ats_enabled everywhere, replacing disable_ats in state
> - Move the note about ATS bypass/abort to arm_smmu_attach_prepare()
> - Remove temporary cd_table.in_ste check in arm_smmu_sva_set_dev_pasid()
> - Improve comments and clarity of logic in arm_smmu_attach_commit()
> - Shorten arm_smmu_mmu_notifier_free()
> - Fix domain -> sid_domain typo in arm_smmu_remove_dev_pasid()
> v7: https://lore.kernel.org/r/0-v7-9597c885796c+d2-smmuv3_newapi_p2b_jgg@nvidia.com
> - Second half of the split series
> - Rebase on Joerg's latest
> - Accommodate ARM_SMMU_FEAT_ATTR_TYPES_OVR for the S1DSS code
> - Include the S1DSS kunit tests
> - Include hunks to adjust the unit tests to API changes from this series
> - Move 3 BTM related patches out of this series, they can go in the BTM
> enablement series.
> - Move the domain_alloc_sva() conversion to the first patch, and rebase
> on the accepted core code change
> - Use the new core APIs for the PASID ops
> - Revise commit messages
> v6: https://lore.kernel.org/r/0-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com
>
> Cc: Nicolin Chen <nicolinc@nvidia.com>
> Cc: Michael Shavit <mshavit@google.com>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Moritz Fischer <mdf@kernel.org>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>
> Jason Gunthorpe (14):
> iommu/arm-smmu-v3: Convert to domain_alloc_sva()
> iommu/arm-smmu-v3: Start building a generic PASID layer
> iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
> iommu/arm-smmu-v3: Make changing domains be hitless for ATS
> iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
> iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches
> iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*()
> interface
> iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
> iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
> iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
> iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is
> used
> iommu/arm-smmu-v3: Test the STE S1DSS functionality
> iommu/arm-smmu-v3: Allow a PASID to be set when RID is
> IDENTITY/BLOCKED
> iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
>
> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 431 +++-----------
> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 116 +++-
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 561 ++++++++++++++----
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 52 +-
> 4 files changed, 664 insertions(+), 496 deletions(-)
>
>
> base-commit: c3f38fa61af77b49866b006939479069cd451173
> --
> 2.45.2
>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>