* [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement
@ 2024-11-06 15:45 Yi Liu
2024-11-06 15:45 ` [PATCH v5 01/13] iommu: Pass old domain to set_dev_pasid op Yi Liu
` (12 more replies)
0 siblings, 13 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:45 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
This splits the preparation works of the iommu and the Intel iommu driver
out from the iommufd pasid attach/replace series. [1]
To support domain replacement, the definition of the set_dev_pasid op
needs to be enhanced. Meanwhile, the existing set_dev_pasid callbacks
should be extended as well to suit the new definition.
This series first passes the old domain to the set_dev_pasid op, and prepares
the Intel iommu set_dev_pasid callbacks (paging domain, identity domain, and
sva domain) for the new definition, add the missing set_dev_pasid callback
for the nested domain, makes ARM SMMUv3 set_dev_pasid op to suit the new
set_dev_pasid op definition, and in the end, claims the set_dev_pasid op support
domain replacement. The AMD set_dev_pasid callback is extended to fail if the
caller tries to do the domain replacement to meet the new definition of
set_dev_pasid op. AMD iommu driver would support it later per Vasant [2].
[1] https://lore.kernel.org/linux-iommu/20240412081516.31168-1-yi.l.liu@intel.com/
[2] https://lore.kernel.org/linux-iommu/fa9c4fc3-9365-465e-8926-b4d2d6361b9c@amd.com/
v5:
- Drop the nonsense function description in patch 02 of v4 (Kevin, Baulu)
- Add the error message as the setup helpers in replace helpers (Kevin)
- Check old domain id with the domain id in pasid entry within the replace helpers.
Along with it, iommu_domain_did() is added (Kevin, Baolu)
- Merge patch 05 and 09 of v4 and name the patch as consolidating the dev_pasid_info
add/remove (Baolu)
- Drop patch 10 of v4, hence patch 09 of v5 (Baolu, Kevin, Jason)
- Add r-b tags from Baolu, Kevin, Vasant
- Rebase on top of Joerg's next tree (75bc266cd1a6)
v4: https://lore.kernel.org/linux-iommu/20241104131842.13303-1-yi.l.liu@intel.com/
- Fix a missing input argument modification in patch 01 (Nicolin)
- Per Baolu's suggestion, this series does not extend the pasid setup helpers to
handle domain replacement, instead it adds pasid replace helpers, hence
drop patch 02 of v3 as it is no longer required in this series.
- Add a new helper to consolidate the cache flush for the modifications to a
present pasid entry (the fields other than P and SSADE) (Baolu)
- Update the set_dev_pasid op of intel identity domain to handle replacement
- Consolidate the dev_pasid_info code of intel_svm_set_dev_pasid()
- Add a separate set_dev_pasid callback for nested domain instead of sharing
intel_iommu_set_dev_pasid() (Baolu)
- Drop patch 05 of v3 as Baolu has another patch that touches it.
v3: https://lore.kernel.org/linux-iommu/20241018055402.23277-1-yi.l.liu@intel.com/
- Add Kevin and Jason's r-b to patch 01, 02, 04, 05 and 06 of v2
- Add back the patch 03 of v1 to make the pasid setup helpers do all the pasid entry
modification, hence the set_dev_pasid path is really rollback-less, which is spotted
by Baolu.
- Rename prepare_domain_attach_device() (Baolu)
- Use unsigned int instead of u32 for flags (Baolu)
- Remove a stale comment in arm_smmu_set_pasid (Will)
v2: https://lore.kernel.org/linux-iommu/20240912130427.10119-1-yi.l.liu@intel.com/
- Make ARM SMMUv3 set_dev_pasid op support domain replacement (Jason)
- Drop patch 03 of v1 (Kevin)
- Multiple tweaks in VT-d driver (Kevin)
v1: https://lore.kernel.org/linux-iommu/20240628085538.47049-1-yi.l.liu@intel.com/
Regards,
Yi Liu
Jason Gunthorpe (1):
iommu/arm-smmu-v3: Make set_dev_pasid() op support replace
Yi Liu (12):
iommu: Pass old domain to set_dev_pasid op
iommu/vt-d: Add a helper to flush cache for updating present pasid
entry
iommu/vt-d: Refactor the pasid setup helpers
iommu/vt-d: Add pasid replace helpers
iommu/vt-d: Consolidate the struct dev_pasid_info add/remove
iommu/vt-d: Add iommu_domain_did() to get did
iommu/vt-d: Make intel_iommu_set_dev_pasid() to handle domain
replacement
iommu/vt-d: Limit intel_iommu_set_dev_pasid() for paging domain
iommu/vt-d: Make intel_svm_set_dev_pasid() support domain replacement
iommu/vt-d: Make identity_domain_set_dev_pasid() to handle domain
replacement
iommu/vt-d: Add set_dev_pasid callback for nested domain
iommu: Make set_dev_pasid op support domain replacement
drivers/iommu/amd/amd_iommu.h | 3 +-
drivers/iommu/amd/pasid.c | 6 +-
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 5 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 12 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
drivers/iommu/intel/iommu.c | 174 +++++---
drivers/iommu/intel/iommu.h | 34 ++
drivers/iommu/intel/nested.c | 50 +++
drivers/iommu/intel/pasid.c | 381 ++++++++++++++----
drivers/iommu/intel/pasid.h | 22 +-
drivers/iommu/intel/svm.c | 36 +-
drivers/iommu/iommu.c | 3 +-
include/linux/iommu.h | 5 +-
13 files changed, 566 insertions(+), 167 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v5 01/13] iommu: Pass old domain to set_dev_pasid op
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
@ 2024-11-06 15:45 ` Yi Liu
2024-11-06 15:45 ` [PATCH v5 02/13] iommu/vt-d: Add a helper to flush cache for updating present pasid entry Yi Liu
` (11 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:45 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
To support domain replacement for pasid, the underlying iommu driver needs
to know the old domain hence be able to clean up the existing attachment.
It would be much convenient for iommu layer to pass down the old domain.
Otherwise, iommu drivers would need to track domain for pasids by themselves,
this would duplicate code among the iommu drivers. Or iommu drivers would
rely group->pasid_array to get domain, which may not always the correct
one.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/amd/amd_iommu.h | 3 ++-
drivers/iommu/amd/pasid.c | 3 ++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 ++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 ++-
drivers/iommu/intel/iommu.c | 6 ++++--
drivers/iommu/intel/svm.c | 3 ++-
drivers/iommu/iommu.c | 3 ++-
include/linux/iommu.h | 2 +-
8 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 38509e1019e9..1bef5d55b2f9 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -53,7 +53,8 @@ struct iommu_domain *amd_iommu_domain_alloc_sva(struct device *dev,
struct mm_struct *mm);
void amd_iommu_domain_free(struct iommu_domain *dom);
int iommu_sva_set_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t pasid);
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old);
void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
struct iommu_domain *domain);
diff --git a/drivers/iommu/amd/pasid.c b/drivers/iommu/amd/pasid.c
index 0657b9373be5..d1dfc745f55e 100644
--- a/drivers/iommu/amd/pasid.c
+++ b/drivers/iommu/amd/pasid.c
@@ -100,7 +100,8 @@ static const struct mmu_notifier_ops sva_mn = {
};
int iommu_sva_set_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t pasid)
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
{
struct pdom_dev_data *pdom_dev_data;
struct protection_domain *sva_pdom = to_pdomain(domain);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index a7c36654dee5..645da7b69bed 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -332,7 +332,8 @@ void arm_smmu_sva_notifier_synchronize(void)
}
static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t id)
+ struct device *dev, ioasid_t id,
+ struct iommu_domain *old)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 54c26f9f05db..a66d9a044e52 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2874,7 +2874,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t id)
+ struct device *dev, ioasid_t id,
+ struct iommu_domain *old)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 3878f35be09d..402ba1058794 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4072,7 +4072,8 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
}
static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t pasid)
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
@@ -4358,7 +4359,8 @@ static int identity_domain_attach_dev(struct iommu_domain *domain, struct device
}
static int identity_domain_set_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t pasid)
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 3cc43a958b4d..4a2bd65614ad 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -111,7 +111,8 @@ static const struct mmu_notifier_ops intel_mmuops = {
};
static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t pasid)
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d241bfdbf1f8..13fcd9d8f2df 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3305,7 +3305,8 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
int ret;
for_each_group_device(group, device) {
- ret = domain->ops->set_dev_pasid(domain, device->dev, pasid);
+ ret = domain->ops->set_dev_pasid(domain, device->dev,
+ pasid, NULL);
if (ret)
goto err_revert;
}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 6bcab2c43014..56653af5edcb 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -639,7 +639,7 @@ struct iommu_ops {
struct iommu_domain_ops {
int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
int (*set_dev_pasid)(struct iommu_domain *domain, struct device *dev,
- ioasid_t pasid);
+ ioasid_t pasid, struct iommu_domain *old);
int (*map_pages)(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t pgsize, size_t pgcount,
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 02/13] iommu/vt-d: Add a helper to flush cache for updating present pasid entry
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
2024-11-06 15:45 ` [PATCH v5 01/13] iommu: Pass old domain to set_dev_pasid op Yi Liu
@ 2024-11-06 15:45 ` Yi Liu
2024-11-06 15:45 ` [PATCH v5 03/13] iommu/vt-d: Refactor the pasid setup helpers Yi Liu
` (10 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:45 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
Generalize the logic for flushing pasid-related cache upon changes to
bits other than SSADE and P which requires a different flow according
to VT-d spec.
No functional change is intended.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/pasid.c | 52 ++++++++++++++++++++++++-------------
1 file changed, 34 insertions(+), 18 deletions(-)
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 31665fb62e1c..8d11701c2e76 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -287,6 +287,39 @@ static void pasid_flush_caches(struct intel_iommu *iommu,
}
}
+/*
+ * This function is supposed to be used after caller updates the fields
+ * except for the SSADE and P bit of a pasid table entry. It does the
+ * below:
+ * - Flush cacheline if needed
+ * - Flush the caches per Table 28 ”Guidance to Software for Invalidations“
+ * of VT-d spec 5.0.
+ */
+static void intel_pasid_flush_present(struct intel_iommu *iommu,
+ struct device *dev,
+ u32 pasid, u16 did,
+ struct pasid_entry *pte)
+{
+ if (!ecap_coherent(iommu->ecap))
+ clflush_cache_range(pte, sizeof(*pte));
+
+ /*
+ * VT-d spec 5.0 table28 states guides for cache invalidation:
+ *
+ * - PASID-selective-within-Domain PASID-cache invalidation
+ * - PASID-selective PASID-based IOTLB invalidation
+ * - If (pasid is RID_PASID)
+ * - Global Device-TLB invalidation to affected functions
+ * Else
+ * - PASID-based Device-TLB invalidation (with S=1 and
+ * Addr[63:12]=0x7FFFFFFF_FFFFF) to affected functions
+ */
+ pasid_cache_invalidation_with_pasid(iommu, did, pasid);
+ qi_flush_piotlb(iommu, did, pasid, 0, -1, 0);
+
+ devtlb_invalidation_with_pasid(iommu, dev, pasid);
+}
+
/*
* Set up the scalable mode pasid table entry for first only
* translation type.
@@ -526,24 +559,7 @@ void intel_pasid_setup_page_snoop_control(struct intel_iommu *iommu,
did = pasid_get_domain_id(pte);
spin_unlock(&iommu->lock);
- if (!ecap_coherent(iommu->ecap))
- clflush_cache_range(pte, sizeof(*pte));
-
- /*
- * VT-d spec 3.4 table23 states guides for cache invalidation:
- *
- * - PASID-selective-within-Domain PASID-cache invalidation
- * - PASID-selective PASID-based IOTLB invalidation
- * - If (pasid is RID_PASID)
- * - Global Device-TLB invalidation to affected functions
- * Else
- * - PASID-based Device-TLB invalidation (with S=1 and
- * Addr[63:12]=0x7FFFFFFF_FFFFF) to affected functions
- */
- pasid_cache_invalidation_with_pasid(iommu, did, pasid);
- qi_flush_piotlb(iommu, did, pasid, 0, -1, 0);
-
- devtlb_invalidation_with_pasid(iommu, dev, pasid);
+ intel_pasid_flush_present(iommu, dev, pasid, did, pte);
}
/**
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 03/13] iommu/vt-d: Refactor the pasid setup helpers
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
2024-11-06 15:45 ` [PATCH v5 01/13] iommu: Pass old domain to set_dev_pasid op Yi Liu
2024-11-06 15:45 ` [PATCH v5 02/13] iommu/vt-d: Add a helper to flush cache for updating present pasid entry Yi Liu
@ 2024-11-06 15:45 ` Yi Liu
2024-11-06 15:45 ` [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers Yi Liu
` (9 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:45 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
It is clearer to have a new set of pasid replacement helpers other than
extending the existing ones to cover both initial setup and replacement.
Then abstract out the common code for manipulating the pasid entry as
preparation.
No functional change is intended.
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/pasid.c | 169 ++++++++++++++++++++++--------------
1 file changed, 105 insertions(+), 64 deletions(-)
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 8d11701c2e76..6841b9892d55 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -324,6 +324,32 @@ static void intel_pasid_flush_present(struct intel_iommu *iommu,
* Set up the scalable mode pasid table entry for first only
* translation type.
*/
+static void pasid_pte_config_first_level(struct intel_iommu *iommu,
+ struct pasid_entry *pte,
+ pgd_t *pgd, u16 did, int flags)
+{
+ lockdep_assert_held(&iommu->lock);
+
+ pasid_clear_entry(pte);
+
+ /* Setup the first level page table pointer: */
+ pasid_set_flptr(pte, (u64)__pa(pgd));
+
+ if (flags & PASID_FLAG_FL5LP)
+ pasid_set_flpm(pte, 1);
+
+ if (flags & PASID_FLAG_PAGE_SNOOP)
+ pasid_set_pgsnp(pte);
+
+ pasid_set_domain_id(pte, did);
+ pasid_set_address_width(pte, iommu->agaw);
+ pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+
+ /* Setup Present and PASID Granular Transfer Type: */
+ pasid_set_translation_type(pte, PASID_ENTRY_PGTT_FL_ONLY);
+ pasid_set_present(pte);
+}
+
int intel_pasid_setup_first_level(struct intel_iommu *iommu,
struct device *dev, pgd_t *pgd,
u32 pasid, u16 did, int flags)
@@ -354,24 +380,8 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
return -EBUSY;
}
- pasid_clear_entry(pte);
-
- /* Setup the first level page table pointer: */
- pasid_set_flptr(pte, (u64)__pa(pgd));
-
- if (flags & PASID_FLAG_FL5LP)
- pasid_set_flpm(pte, 1);
-
- if (flags & PASID_FLAG_PAGE_SNOOP)
- pasid_set_pgsnp(pte);
-
- pasid_set_domain_id(pte, did);
- pasid_set_address_width(pte, iommu->agaw);
- pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+ pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
- /* Setup Present and PASID Granular Transfer Type: */
- pasid_set_translation_type(pte, PASID_ENTRY_PGTT_FL_ONLY);
- pasid_set_present(pte);
spin_unlock(&iommu->lock);
pasid_flush_caches(iommu, pte, pasid, did);
@@ -382,6 +392,26 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
/*
* Set up the scalable mode pasid entry for second only translation type.
*/
+static void pasid_pte_config_second_level(struct intel_iommu *iommu,
+ struct pasid_entry *pte,
+ u64 pgd_val, int agaw, u16 did,
+ bool dirty_tracking)
+{
+ lockdep_assert_held(&iommu->lock);
+
+ pasid_clear_entry(pte);
+ pasid_set_domain_id(pte, did);
+ pasid_set_slptr(pte, pgd_val);
+ pasid_set_address_width(pte, agaw);
+ pasid_set_translation_type(pte, PASID_ENTRY_PGTT_SL_ONLY);
+ pasid_set_fault_enable(pte);
+ pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+ if (dirty_tracking)
+ pasid_set_ssade(pte);
+
+ pasid_set_present(pte);
+}
+
int intel_pasid_setup_second_level(struct intel_iommu *iommu,
struct dmar_domain *domain,
struct device *dev, u32 pasid)
@@ -417,17 +447,8 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
return -EBUSY;
}
- pasid_clear_entry(pte);
- pasid_set_domain_id(pte, did);
- pasid_set_slptr(pte, pgd_val);
- pasid_set_address_width(pte, domain->agaw);
- pasid_set_translation_type(pte, PASID_ENTRY_PGTT_SL_ONLY);
- pasid_set_fault_enable(pte);
- pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
- if (domain->dirty_tracking)
- pasid_set_ssade(pte);
-
- pasid_set_present(pte);
+ pasid_pte_config_second_level(iommu, pte, pgd_val, domain->agaw,
+ did, domain->dirty_tracking);
spin_unlock(&iommu->lock);
pasid_flush_caches(iommu, pte, pasid, did);
@@ -507,6 +528,20 @@ int intel_pasid_setup_dirty_tracking(struct intel_iommu *iommu,
/*
* Set up the scalable mode pasid entry for passthrough translation type.
*/
+static void pasid_pte_config_pass_through(struct intel_iommu *iommu,
+ struct pasid_entry *pte, u16 did)
+{
+ lockdep_assert_held(&iommu->lock);
+
+ pasid_clear_entry(pte);
+ pasid_set_domain_id(pte, did);
+ pasid_set_address_width(pte, iommu->agaw);
+ pasid_set_translation_type(pte, PASID_ENTRY_PGTT_PT);
+ pasid_set_fault_enable(pte);
+ pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+ pasid_set_present(pte);
+}
+
int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
struct device *dev, u32 pasid)
{
@@ -525,13 +560,7 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
return -EBUSY;
}
- pasid_clear_entry(pte);
- pasid_set_domain_id(pte, did);
- pasid_set_address_width(pte, iommu->agaw);
- pasid_set_translation_type(pte, PASID_ENTRY_PGTT_PT);
- pasid_set_fault_enable(pte);
- pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
- pasid_set_present(pte);
+ pasid_pte_config_pass_through(iommu, pte, did);
spin_unlock(&iommu->lock);
pasid_flush_caches(iommu, pte, pasid, did);
@@ -562,6 +591,46 @@ void intel_pasid_setup_page_snoop_control(struct intel_iommu *iommu,
intel_pasid_flush_present(iommu, dev, pasid, did, pte);
}
+static void pasid_pte_config_nestd(struct intel_iommu *iommu,
+ struct pasid_entry *pte,
+ struct iommu_hwpt_vtd_s1 *s1_cfg,
+ struct dmar_domain *s2_domain,
+ u16 did)
+{
+ struct dma_pte *pgd = s2_domain->pgd;
+
+ lockdep_assert_held(&iommu->lock);
+
+ pasid_clear_entry(pte);
+
+ if (s1_cfg->addr_width == ADDR_WIDTH_5LEVEL)
+ pasid_set_flpm(pte, 1);
+
+ pasid_set_flptr(pte, s1_cfg->pgtbl_addr);
+
+ if (s1_cfg->flags & IOMMU_VTD_S1_SRE) {
+ pasid_set_sre(pte);
+ if (s1_cfg->flags & IOMMU_VTD_S1_WPE)
+ pasid_set_wpe(pte);
+ }
+
+ if (s1_cfg->flags & IOMMU_VTD_S1_EAFE)
+ pasid_set_eafe(pte);
+
+ if (s2_domain->force_snooping)
+ pasid_set_pgsnp(pte);
+
+ pasid_set_slptr(pte, virt_to_phys(pgd));
+ pasid_set_fault_enable(pte);
+ pasid_set_domain_id(pte, did);
+ pasid_set_address_width(pte, s2_domain->agaw);
+ pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+ if (s2_domain->dirty_tracking)
+ pasid_set_ssade(pte);
+ pasid_set_translation_type(pte, PASID_ENTRY_PGTT_NESTED);
+ pasid_set_present(pte);
+}
+
/**
* intel_pasid_setup_nested() - Set up PASID entry for nested translation.
* @iommu: IOMMU which the device belong to
@@ -579,7 +648,6 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
struct iommu_hwpt_vtd_s1 *s1_cfg = &domain->s1_cfg;
struct dmar_domain *s2_domain = domain->s2_domain;
u16 did = domain_id_iommu(domain, iommu);
- struct dma_pte *pgd = s2_domain->pgd;
struct pasid_entry *pte;
/* Address width should match the address width supported by hardware */
@@ -622,34 +690,7 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
return -EBUSY;
}
- pasid_clear_entry(pte);
-
- if (s1_cfg->addr_width == ADDR_WIDTH_5LEVEL)
- pasid_set_flpm(pte, 1);
-
- pasid_set_flptr(pte, s1_cfg->pgtbl_addr);
-
- if (s1_cfg->flags & IOMMU_VTD_S1_SRE) {
- pasid_set_sre(pte);
- if (s1_cfg->flags & IOMMU_VTD_S1_WPE)
- pasid_set_wpe(pte);
- }
-
- if (s1_cfg->flags & IOMMU_VTD_S1_EAFE)
- pasid_set_eafe(pte);
-
- if (s2_domain->force_snooping)
- pasid_set_pgsnp(pte);
-
- pasid_set_slptr(pte, virt_to_phys(pgd));
- pasid_set_fault_enable(pte);
- pasid_set_domain_id(pte, did);
- pasid_set_address_width(pte, s2_domain->agaw);
- pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
- if (s2_domain->dirty_tracking)
- pasid_set_ssade(pte);
- pasid_set_translation_type(pte, PASID_ENTRY_PGTT_NESTED);
- pasid_set_present(pte);
+ pasid_pte_config_nestd(iommu, pte, s1_cfg, s2_domain, did);
spin_unlock(&iommu->lock);
pasid_flush_caches(iommu, pte, pasid, did);
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (2 preceding siblings ...)
2024-11-06 15:45 ` [PATCH v5 03/13] iommu/vt-d: Refactor the pasid setup helpers Yi Liu
@ 2024-11-06 15:45 ` Yi Liu
2024-11-07 2:52 ` Baolu Lu
2024-11-07 2:57 ` Baolu Lu
2024-11-06 15:45 ` [PATCH v5 05/13] iommu/vt-d: Consolidate the struct dev_pasid_info add/remove Yi Liu
` (8 subsequent siblings)
12 siblings, 2 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:45 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
pasid replacement allows converting a present pasid entry to be FS, SS,
PT or nested, hence add helpers for such operations.
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/pasid.c | 182 ++++++++++++++++++++++++++++++++++++
drivers/iommu/intel/pasid.h | 15 +++
2 files changed, 197 insertions(+)
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 6841b9892d55..777e70b539b1 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -389,6 +389,48 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
return 0;
}
+int intel_pasid_replace_first_level(struct intel_iommu *iommu,
+ struct device *dev, pgd_t *pgd,
+ u32 pasid, u16 did, u16 old_did,
+ int flags)
+{
+ struct pasid_entry *pte;
+
+ if (!ecap_flts(iommu->ecap)) {
+ pr_err("No first level translation support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
+ pr_err("No 5-level paging support for first-level on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
+
/*
* Set up the scalable mode pasid entry for second only translation type.
*/
@@ -456,6 +498,55 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
return 0;
}
+int intel_pasid_replace_second_level(struct intel_iommu *iommu,
+ struct dmar_domain *domain,
+ struct device *dev, u16 old_did,
+ u32 pasid)
+{
+ struct pasid_entry *pte;
+ struct dma_pte *pgd;
+ u64 pgd_val;
+ int agaw;
+ u16 did;
+
+ /*
+ * If hardware advertises no support for second level
+ * translation, return directly.
+ */
+ if (!ecap_slts(iommu->ecap)) {
+ pr_err("No second level translation support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ pgd = domain->pgd;
+ pgd_val = virt_to_phys(pgd);
+ did = domain_id_iommu(domain, iommu);
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ pasid_pte_config_second_level(iommu, pte, pgd_val, agaw,
+ did, domain->dirty_tracking);
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
+
/*
* Set up dirty tracking on a second only or nested translation type.
*/
@@ -568,6 +659,36 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
return 0;
}
+int intel_pasid_replace_pass_through(struct intel_iommu *iommu,
+ struct device *dev, u16 old_did,
+ u32 pasid)
+{
+ u16 did = FLPT_DEFAULT_DID;
+ struct pasid_entry *pte;
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ pasid_pte_config_pass_through(iommu, pte, did);
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
+
/*
* Set the page snoop control for a pasid entry which has been set up.
*/
@@ -698,6 +819,67 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
return 0;
}
+int intel_pasid_replace_nested(struct intel_iommu *iommu,
+ struct device *dev, u32 pasid,
+ u16 old_did, struct dmar_domain *domain)
+{
+ struct iommu_hwpt_vtd_s1 *s1_cfg = &domain->s1_cfg;
+ struct dmar_domain *s2_domain = domain->s2_domain;
+ u16 did = domain_id_iommu(domain, iommu);
+ struct pasid_entry *pte;
+
+ /* Address width should match the address width supported by hardware */
+ switch (s1_cfg->addr_width) {
+ case ADDR_WIDTH_4LEVEL:
+ break;
+ case ADDR_WIDTH_5LEVEL:
+ if (!cap_fl5lp_support(iommu->cap)) {
+ dev_err_ratelimited(dev,
+ "5-level paging not supported\n");
+ return -EINVAL;
+ }
+ break;
+ default:
+ dev_err_ratelimited(dev, "Invalid stage-1 address width %d\n",
+ s1_cfg->addr_width);
+ return -EINVAL;
+ }
+
+ if ((s1_cfg->flags & IOMMU_VTD_S1_SRE) && !ecap_srs(iommu->ecap)) {
+ pr_err_ratelimited("No supervisor request support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ if ((s1_cfg->flags & IOMMU_VTD_S1_EAFE) && !ecap_eafs(iommu->ecap)) {
+ pr_err_ratelimited("No extended access flag support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ pasid_pte_config_nestd(iommu, pte, s1_cfg, s2_domain, did);
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
+
/*
* Interfaces to setup or teardown a pasid table to the scalable-mode
* context table entry:
diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index dde6d3ba5ae0..06d1f7006d01 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -303,6 +303,21 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
struct device *dev, u32 pasid);
int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
u32 pasid, struct dmar_domain *domain);
+int intel_pasid_replace_first_level(struct intel_iommu *iommu,
+ struct device *dev, pgd_t *pgd,
+ u32 pasid, u16 did, u16 old_did,
+ int flags);
+int intel_pasid_replace_second_level(struct intel_iommu *iommu,
+ struct dmar_domain *domain,
+ struct device *dev, u16 old_did,
+ u32 pasid);
+int intel_pasid_replace_pass_through(struct intel_iommu *iommu,
+ struct device *dev, u16 old_did,
+ u32 pasid);
+int intel_pasid_replace_nested(struct intel_iommu *iommu,
+ struct device *dev, u32 pasid,
+ u16 old_did, struct dmar_domain *domain);
+
void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
struct device *dev, u32 pasid,
bool fault_ignore);
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 05/13] iommu/vt-d: Consolidate the struct dev_pasid_info add/remove
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (3 preceding siblings ...)
2024-11-06 15:45 ` [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers Yi Liu
@ 2024-11-06 15:45 ` Yi Liu
2024-11-06 15:45 ` [PATCH v5 06/13] iommu/vt-d: Add iommu_domain_did() to get did Yi Liu
` (7 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:45 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
domain_add_dev_pasid() and domain_remove_dev_pasid() are added to consolidate
the adding/removing of the struct dev_pasid_info. Besides, it includes the cache
tag assign/unassign as well.
This also prepares for adding domain replacement for pasid. The set_dev_pasid
callbacks need to deal with the dev_pasid_info for both old and new domain.
These two helpers make the life easier.
intel_iommu_set_dev_pasid() and intel_svm_set_dev_pasid() are updated to use
the helpers.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/iommu.c | 91 +++++++++++++++++++++++++------------
drivers/iommu/intel/iommu.h | 6 +++
drivers/iommu/intel/svm.c | 28 +++---------
3 files changed, 74 insertions(+), 51 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 402ba1058794..a1e9dbca6561 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4038,8 +4038,8 @@ static int intel_iommu_iotlb_sync_map(struct iommu_domain *domain,
return 0;
}
-static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
- struct iommu_domain *domain)
+void domain_remove_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct dev_pasid_info *curr, *dev_pasid = NULL;
@@ -4047,10 +4047,12 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
struct dmar_domain *dmar_domain;
unsigned long flags;
- if (domain->type == IOMMU_DOMAIN_IDENTITY) {
- intel_pasid_tear_down_entry(iommu, dev, pasid, false);
+ if (!domain)
+ return;
+
+ /* Identity domain has no meta data for pasid. */
+ if (domain->type == IOMMU_DOMAIN_IDENTITY)
return;
- }
dmar_domain = to_dmar_domain(domain);
spin_lock_irqsave(&dmar_domain->lock, flags);
@@ -4068,7 +4070,52 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
domain_detach_iommu(dmar_domain, iommu);
intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
kfree(dev_pasid);
- intel_pasid_tear_down_entry(iommu, dev, pasid, false);
+}
+
+static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
+ struct iommu_domain *domain)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+
+ intel_pasid_tear_down_entry(info->iommu, dev, pasid, false);
+ domain_remove_dev_pasid(domain, dev, pasid);
+}
+
+struct dev_pasid_info *
+domain_add_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+ struct intel_iommu *iommu = info->iommu;
+ struct dev_pasid_info *dev_pasid;
+ unsigned long flags;
+ int ret;
+
+ dev_pasid = kzalloc(sizeof(*dev_pasid), GFP_KERNEL);
+ if (!dev_pasid)
+ return ERR_PTR(-ENOMEM);
+
+ ret = domain_attach_iommu(dmar_domain, iommu);
+ if (ret)
+ goto out_free;
+
+ ret = cache_tag_assign_domain(dmar_domain, dev, pasid);
+ if (ret)
+ goto out_detach_iommu;
+
+ dev_pasid->dev = dev;
+ dev_pasid->pasid = pasid;
+ spin_lock_irqsave(&dmar_domain->lock, flags);
+ list_add(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
+ spin_unlock_irqrestore(&dmar_domain->lock, flags);
+
+ return dev_pasid;
+out_detach_iommu:
+ domain_detach_iommu(dmar_domain, iommu);
+out_free:
+ kfree(dev_pasid);
+ return ERR_PTR(ret);
}
static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
@@ -4079,7 +4126,6 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct intel_iommu *iommu = info->iommu;
struct dev_pasid_info *dev_pasid;
- unsigned long flags;
int ret;
if (!pasid_supported(iommu) || dev_is_real_dma_subdevice(dev))
@@ -4095,17 +4141,9 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
if (ret)
return ret;
- dev_pasid = kzalloc(sizeof(*dev_pasid), GFP_KERNEL);
- if (!dev_pasid)
- return -ENOMEM;
-
- ret = domain_attach_iommu(dmar_domain, iommu);
- if (ret)
- goto out_free;
-
- ret = cache_tag_assign_domain(dmar_domain, dev, pasid);
- if (ret)
- goto out_detach_iommu;
+ dev_pasid = domain_add_dev_pasid(domain, dev, pasid);
+ if (IS_ERR(dev_pasid))
+ return PTR_ERR(dev_pasid);
if (dmar_domain->use_first_level)
ret = domain_setup_first_level(iommu, dmar_domain,
@@ -4114,24 +4152,17 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
ret = intel_pasid_setup_second_level(iommu, dmar_domain,
dev, pasid);
if (ret)
- goto out_unassign_tag;
+ goto out_remove_dev_pasid;
- dev_pasid->dev = dev;
- dev_pasid->pasid = pasid;
- spin_lock_irqsave(&dmar_domain->lock, flags);
- list_add(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
- spin_unlock_irqrestore(&dmar_domain->lock, flags);
+ domain_remove_dev_pasid(old, dev, pasid);
if (domain->type & __IOMMU_DOMAIN_PAGING)
intel_iommu_debugfs_create_dev_pasid(dev_pasid);
return 0;
-out_unassign_tag:
- cache_tag_unassign_domain(dmar_domain, dev, pasid);
-out_detach_iommu:
- domain_detach_iommu(dmar_domain, iommu);
-out_free:
- kfree(dev_pasid);
+
+out_remove_dev_pasid:
+ domain_remove_dev_pasid(domain, dev, pasid);
return ret;
}
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index b3912633ce25..df0261e03fad 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1228,6 +1228,12 @@ void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu);
void device_block_translation(struct device *dev);
int paging_domain_compatible(struct iommu_domain *domain, struct device *dev);
+struct dev_pasid_info *
+domain_add_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid);
+void domain_remove_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid);
+
int dmar_ir_support(void);
void iommu_flush_write_buffer(struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 4a2bd65614ad..6c0685ea8466 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -115,43 +115,29 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct iommu_domain *old)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
- struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
struct dev_pasid_info *dev_pasid;
unsigned long sflags;
- unsigned long flags;
int ret = 0;
- dev_pasid = kzalloc(sizeof(*dev_pasid), GFP_KERNEL);
- if (!dev_pasid)
- return -ENOMEM;
-
- dev_pasid->dev = dev;
- dev_pasid->pasid = pasid;
-
- ret = cache_tag_assign_domain(to_dmar_domain(domain), dev, pasid);
- if (ret)
- goto free_dev_pasid;
+ dev_pasid = domain_add_dev_pasid(domain, dev, pasid);
+ if (IS_ERR(dev_pasid))
+ return PTR_ERR(dev_pasid);
/* Setup the pasid table: */
sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, pasid,
FLPT_DEFAULT_DID, sflags);
if (ret)
- goto unassign_tag;
+ goto out_remove_dev_pasid;
- spin_lock_irqsave(&dmar_domain->lock, flags);
- list_add(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
- spin_unlock_irqrestore(&dmar_domain->lock, flags);
+ domain_remove_dev_pasid(old, dev, pasid);
return 0;
-unassign_tag:
- cache_tag_unassign_domain(to_dmar_domain(domain), dev, pasid);
-free_dev_pasid:
- kfree(dev_pasid);
-
+out_remove_dev_pasid:
+ domain_remove_dev_pasid(domain, dev, pasid);
return ret;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 06/13] iommu/vt-d: Add iommu_domain_did() to get did
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (4 preceding siblings ...)
2024-11-06 15:45 ` [PATCH v5 05/13] iommu/vt-d: Consolidate the struct dev_pasid_info add/remove Yi Liu
@ 2024-11-06 15:45 ` Yi Liu
2024-11-06 15:46 ` [PATCH v5 07/13] iommu/vt-d: Make intel_iommu_set_dev_pasid() to handle domain replacement Yi Liu
` (6 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:45 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
domain_id_iommu() does not support SVA type and identity type domains.
Add iommu_domain_did() to support all domain types.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/iommu.h | 16 ++++++++++++++++
drivers/iommu/intel/pasid.h | 7 -------
2 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index df0261e03fad..cdca7d5061a7 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -806,6 +806,13 @@ static inline struct dmar_domain *to_dmar_domain(struct iommu_domain *dom)
return container_of(dom, struct dmar_domain, domain);
}
+/*
+ * Domain ID reserved for pasid entries programmed for first-level
+ * only and pass-through transfer modes.
+ */
+#define FLPT_DEFAULT_DID 1
+#define NUM_RESERVED_DID 2
+
/* Retrieve the domain ID which has allocated to the domain */
static inline u16
domain_id_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
@@ -816,6 +823,15 @@ domain_id_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
return info->did;
}
+static inline u16
+iommu_domain_did(struct iommu_domain *domain, struct intel_iommu *iommu)
+{
+ if (domain->type == IOMMU_DOMAIN_SVA ||
+ domain->type == IOMMU_DOMAIN_IDENTITY)
+ return FLPT_DEFAULT_DID;
+ return domain_id_iommu(to_dmar_domain(domain), iommu);
+}
+
/*
* 0: readable
* 1: writable
diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index 06d1f7006d01..082f4fe20216 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -22,13 +22,6 @@
#define is_pasid_enabled(entry) (((entry)->lo >> 3) & 0x1)
#define get_pasid_dir_size(entry) (1 << ((((entry)->lo >> 9) & 0x7) + 7))
-/*
- * Domain ID reserved for pasid entries programmed for first-level
- * only and pass-through transfer modes.
- */
-#define FLPT_DEFAULT_DID 1
-#define NUM_RESERVED_DID 2
-
#define PASID_FLAG_NESTED BIT(1)
#define PASID_FLAG_PAGE_SNOOP BIT(2)
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 07/13] iommu/vt-d: Make intel_iommu_set_dev_pasid() to handle domain replacement
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (5 preceding siblings ...)
2024-11-06 15:45 ` [PATCH v5 06/13] iommu/vt-d: Add iommu_domain_did() to get did Yi Liu
@ 2024-11-06 15:46 ` Yi Liu
2024-11-06 15:46 ` [PATCH v5 08/13] iommu/vt-d: Limit intel_iommu_set_dev_pasid() for paging domain Yi Liu
` (5 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:46 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
Let intel_iommu_set_dev_pasid() call the pasid replace helpers hence be
able to do domain replacement.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/iommu.c | 46 +++++++++++++++++++++++++++++--------
1 file changed, 37 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index a1e9dbca6561..6b7f48119502 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1752,10 +1752,36 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
intel_context_flush_present(info, context, did, true);
}
+static int __domain_setup_first_level(struct intel_iommu *iommu,
+ struct device *dev, ioasid_t pasid,
+ u16 did, pgd_t *pgd, int flags,
+ struct iommu_domain *old)
+{
+ if (!old)
+ return intel_pasid_setup_first_level(iommu, dev, pgd,
+ pasid, did, flags);
+ return intel_pasid_replace_first_level(iommu, dev, pgd, pasid, did,
+ iommu_domain_did(old, iommu),
+ flags);
+}
+
+static int domain_setup_second_level(struct intel_iommu *iommu,
+ struct dmar_domain *domain,
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
+{
+ if (!old)
+ return intel_pasid_setup_second_level(iommu, domain,
+ dev, pasid);
+ return intel_pasid_replace_second_level(iommu, domain, dev,
+ iommu_domain_did(old, iommu),
+ pasid);
+}
+
static int domain_setup_first_level(struct intel_iommu *iommu,
struct dmar_domain *domain,
struct device *dev,
- u32 pasid)
+ u32 pasid, struct iommu_domain *old)
{
struct dma_pte *pgd = domain->pgd;
int level, flags = 0;
@@ -1770,9 +1796,9 @@ static int domain_setup_first_level(struct intel_iommu *iommu,
if (domain->force_snooping)
flags |= PASID_FLAG_PAGE_SNOOP;
- return intel_pasid_setup_first_level(iommu, dev, (pgd_t *)pgd, pasid,
- domain_id_iommu(domain, iommu),
- flags);
+ return __domain_setup_first_level(iommu, dev, pasid,
+ domain_id_iommu(domain, iommu),
+ (pgd_t *)pgd, flags, old);
}
static bool dev_is_real_dma_subdevice(struct device *dev)
@@ -1804,9 +1830,11 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
if (!sm_supported(iommu))
ret = domain_context_mapping(domain, dev);
else if (domain->use_first_level)
- ret = domain_setup_first_level(iommu, domain, dev, IOMMU_NO_PASID);
+ ret = domain_setup_first_level(iommu, domain, dev,
+ IOMMU_NO_PASID, NULL);
else
- ret = intel_pasid_setup_second_level(iommu, domain, dev, IOMMU_NO_PASID);
+ ret = domain_setup_second_level(iommu, domain, dev,
+ IOMMU_NO_PASID, NULL);
if (ret)
goto out_block_translation;
@@ -4147,10 +4175,10 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
if (dmar_domain->use_first_level)
ret = domain_setup_first_level(iommu, dmar_domain,
- dev, pasid);
+ dev, pasid, old);
else
- ret = intel_pasid_setup_second_level(iommu, dmar_domain,
- dev, pasid);
+ ret = domain_setup_second_level(iommu, dmar_domain,
+ dev, pasid, old);
if (ret)
goto out_remove_dev_pasid;
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 08/13] iommu/vt-d: Limit intel_iommu_set_dev_pasid() for paging domain
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (6 preceding siblings ...)
2024-11-06 15:46 ` [PATCH v5 07/13] iommu/vt-d: Make intel_iommu_set_dev_pasid() to handle domain replacement Yi Liu
@ 2024-11-06 15:46 ` Yi Liu
2024-11-06 15:46 ` [PATCH v5 09/13] iommu/vt-d: Make intel_svm_set_dev_pasid() support domain replacement Yi Liu
` (4 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:46 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
intel_iommu_set_dev_pasid() is only supposed to be used by paging domain,
so limit it.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/iommu.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 6b7f48119502..3c1b92fa5877 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4156,6 +4156,9 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
struct dev_pasid_info *dev_pasid;
int ret;
+ if (WARN_ON_ONCE(!(domain->type & __IOMMU_DOMAIN_PAGING)))
+ return -EINVAL;
+
if (!pasid_supported(iommu) || dev_is_real_dma_subdevice(dev))
return -EOPNOTSUPP;
@@ -4184,8 +4187,7 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
domain_remove_dev_pasid(old, dev, pasid);
- if (domain->type & __IOMMU_DOMAIN_PAGING)
- intel_iommu_debugfs_create_dev_pasid(dev_pasid);
+ intel_iommu_debugfs_create_dev_pasid(dev_pasid);
return 0;
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 09/13] iommu/vt-d: Make intel_svm_set_dev_pasid() support domain replacement
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (7 preceding siblings ...)
2024-11-06 15:46 ` [PATCH v5 08/13] iommu/vt-d: Limit intel_iommu_set_dev_pasid() for paging domain Yi Liu
@ 2024-11-06 15:46 ` Yi Liu
2024-11-06 15:46 ` [PATCH v5 10/13] iommu/vt-d: Make identity_domain_set_dev_pasid() to handle " Yi Liu
` (3 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:46 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
Make intel_svm_set_dev_pasid() support replacement.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/iommu.c | 8 ++++----
drivers/iommu/intel/iommu.h | 5 +++++
drivers/iommu/intel/svm.c | 5 +++--
3 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 3c1b92fa5877..06a7c4bf31e0 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1752,10 +1752,10 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
intel_context_flush_present(info, context, did, true);
}
-static int __domain_setup_first_level(struct intel_iommu *iommu,
- struct device *dev, ioasid_t pasid,
- u16 did, pgd_t *pgd, int flags,
- struct iommu_domain *old)
+int __domain_setup_first_level(struct intel_iommu *iommu,
+ struct device *dev, ioasid_t pasid,
+ u16 did, pgd_t *pgd, int flags,
+ struct iommu_domain *old)
{
if (!old)
return intel_pasid_setup_first_level(iommu, dev, pgd,
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index cdca7d5061a7..d23977cc7d90 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1250,6 +1250,11 @@ domain_add_dev_pasid(struct iommu_domain *domain,
void domain_remove_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid);
+int __domain_setup_first_level(struct intel_iommu *iommu,
+ struct device *dev, ioasid_t pasid,
+ u16 did, pgd_t *pgd, int flags,
+ struct iommu_domain *old);
+
int dmar_ir_support(void);
void iommu_flush_write_buffer(struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 6c0685ea8466..f5569347591f 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -127,8 +127,9 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
/* Setup the pasid table: */
sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
- ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, pasid,
- FLPT_DEFAULT_DID, sflags);
+ ret = __domain_setup_first_level(iommu, dev, pasid,
+ FLPT_DEFAULT_DID, mm->pgd,
+ sflags, old);
if (ret)
goto out_remove_dev_pasid;
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 10/13] iommu/vt-d: Make identity_domain_set_dev_pasid() to handle domain replacement
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (8 preceding siblings ...)
2024-11-06 15:46 ` [PATCH v5 09/13] iommu/vt-d: Make intel_svm_set_dev_pasid() support domain replacement Yi Liu
@ 2024-11-06 15:46 ` Yi Liu
2024-11-06 15:46 ` [PATCH v5 11/13] iommu/vt-d: Add set_dev_pasid callback for nested domain Yi Liu
` (2 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:46 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
Let identity_domain_set_dev_pasid() call the pasid replace helpers hence
be able to do domain replacement.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/iommu.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 06a7c4bf31e0..f69ea5b48ee8 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1778,6 +1778,17 @@ static int domain_setup_second_level(struct intel_iommu *iommu,
pasid);
}
+static int domain_setup_passthrough(struct intel_iommu *iommu,
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
+{
+ if (!old)
+ return intel_pasid_setup_pass_through(iommu, dev, pasid);
+ return intel_pasid_replace_pass_through(iommu, dev,
+ iommu_domain_did(old, iommu),
+ pasid);
+}
+
static int domain_setup_first_level(struct intel_iommu *iommu,
struct dmar_domain *domain,
struct device *dev,
@@ -4425,11 +4436,17 @@ static int identity_domain_set_dev_pasid(struct iommu_domain *domain,
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
+ int ret;
if (!pasid_supported(iommu) || dev_is_real_dma_subdevice(dev))
return -EOPNOTSUPP;
- return intel_pasid_setup_pass_through(iommu, dev, pasid);
+ ret = domain_setup_passthrough(iommu, dev, pasid, old);
+ if (ret)
+ return ret;
+
+ domain_remove_dev_pasid(old, dev, pasid);
+ return 0;
}
static struct iommu_domain identity_domain = {
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 11/13] iommu/vt-d: Add set_dev_pasid callback for nested domain
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (9 preceding siblings ...)
2024-11-06 15:46 ` [PATCH v5 10/13] iommu/vt-d: Make identity_domain_set_dev_pasid() to handle " Yi Liu
@ 2024-11-06 15:46 ` Yi Liu
2024-11-06 15:46 ` [PATCH v5 12/13] iommu/arm-smmu-v3: Make set_dev_pasid() op support replace Yi Liu
2024-11-06 15:46 ` [PATCH v5 13/13] iommu: Make set_dev_pasid op support domain replacement Yi Liu
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:46 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
Add intel_nested_set_dev_pasid() to set a nested type domain to a PASID
of a device.
Co-developed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/intel/iommu.c | 6 -----
drivers/iommu/intel/iommu.h | 7 +++++
drivers/iommu/intel/nested.c | 50 ++++++++++++++++++++++++++++++++++++
3 files changed, 57 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f69ea5b48ee8..527f6f89d8a1 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1812,12 +1812,6 @@ static int domain_setup_first_level(struct intel_iommu *iommu,
(pgd_t *)pgd, flags, old);
}
-static bool dev_is_real_dma_subdevice(struct device *dev)
-{
- return dev && dev_is_pci(dev) &&
- pci_real_dma_dev(to_pci_dev(dev)) != to_pci_dev(dev);
-}
-
static int dmar_domain_attach_device(struct dmar_domain *domain,
struct device *dev)
{
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index d23977cc7d90..2cca094c259d 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -22,6 +22,7 @@
#include <linux/bitfield.h>
#include <linux/xarray.h>
#include <linux/perf_event.h>
+#include <linux/pci.h>
#include <asm/cacheflush.h>
#include <asm/iommu.h>
@@ -832,6 +833,12 @@ iommu_domain_did(struct iommu_domain *domain, struct intel_iommu *iommu)
return domain_id_iommu(to_dmar_domain(domain), iommu);
}
+static inline bool dev_is_real_dma_subdevice(struct device *dev)
+{
+ return dev && dev_is_pci(dev) &&
+ pci_real_dma_dev(to_pci_dev(dev)) != to_pci_dev(dev);
+}
+
/*
* 0: readable
* 1: writable
diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c
index 989ca5cc04eb..42c4533a6ea2 100644
--- a/drivers/iommu/intel/nested.c
+++ b/drivers/iommu/intel/nested.c
@@ -130,8 +130,58 @@ static int intel_nested_cache_invalidate_user(struct iommu_domain *domain,
return ret;
}
+static int domain_setup_nested(struct intel_iommu *iommu,
+ struct dmar_domain *domain,
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
+{
+ if (!old)
+ return intel_pasid_setup_nested(iommu, dev, pasid, domain);
+ return intel_pasid_replace_nested(iommu, dev, pasid,
+ iommu_domain_did(old, iommu),
+ domain);
+}
+
+static int intel_nested_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid,
+ struct iommu_domain *old)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+ struct intel_iommu *iommu = info->iommu;
+ struct dev_pasid_info *dev_pasid;
+ int ret;
+
+ if (!pasid_supported(iommu) || dev_is_real_dma_subdevice(dev))
+ return -EOPNOTSUPP;
+
+ if (context_copied(iommu, info->bus, info->devfn))
+ return -EBUSY;
+
+ ret = paging_domain_compatible(&dmar_domain->s2_domain->domain, dev);
+ if (ret)
+ return ret;
+
+ dev_pasid = domain_add_dev_pasid(domain, dev, pasid);
+ if (IS_ERR(dev_pasid))
+ return PTR_ERR(dev_pasid);
+
+ ret = domain_setup_nested(iommu, dmar_domain, dev, pasid, old);
+ if (ret)
+ goto out_remove_dev_pasid;
+
+ domain_remove_dev_pasid(old, dev, pasid);
+
+ return 0;
+
+out_remove_dev_pasid:
+ domain_remove_dev_pasid(domain, dev, pasid);
+ return ret;
+}
+
static const struct iommu_domain_ops intel_nested_domain_ops = {
.attach_dev = intel_nested_attach_dev,
+ .set_dev_pasid = intel_nested_set_dev_pasid,
.free = intel_nested_domain_free,
.cache_invalidate_user = intel_nested_cache_invalidate_user,
};
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 12/13] iommu/arm-smmu-v3: Make set_dev_pasid() op support replace
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (10 preceding siblings ...)
2024-11-06 15:46 ` [PATCH v5 11/13] iommu/vt-d: Add set_dev_pasid callback for nested domain Yi Liu
@ 2024-11-06 15:46 ` Yi Liu
2024-11-06 15:46 ` [PATCH v5 13/13] iommu: Make set_dev_pasid op support domain replacement Yi Liu
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:46 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
From: Jason Gunthorpe <jgg@nvidia.com>
set_dev_pasid() op is going to be enhanced to support domain replacement
of a pasid. This prepares for this op definition.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 +++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
3 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 645da7b69bed..1d3e71569775 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -349,7 +349,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
* get reassigned
*/
arm_smmu_make_sva_cd(&target, master, domain->mm, smmu_domain->cd.asid);
- ret = arm_smmu_set_pasid(master, smmu_domain, id, &target);
+ ret = arm_smmu_set_pasid(master, smmu_domain, id, &target, old);
mmput(domain->mm);
return ret;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a66d9a044e52..7ee3cbbe3744 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2901,7 +2901,7 @@ static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
*/
arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
return arm_smmu_set_pasid(master, to_smmu_domain(domain), id,
- &target_cd);
+ &target_cd, old);
}
static void arm_smmu_update_ste(struct arm_smmu_master *master,
@@ -2931,16 +2931,13 @@ static void arm_smmu_update_ste(struct arm_smmu_master *master,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- struct arm_smmu_cd *cd)
+ struct arm_smmu_cd *cd, struct iommu_domain *old)
{
struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev);
struct arm_smmu_attach_state state = {
.master = master,
- /*
- * For now the core code prevents calling this when a domain is
- * already attached, no need to set old_domain.
- */
.ssid = pasid,
+ .old_domain = old,
};
struct arm_smmu_cd *cdptr;
int ret;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index c9e5290e995a..1e96d4af03f8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -879,7 +879,7 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- struct arm_smmu_cd *cd);
+ struct arm_smmu_cd *cd, struct iommu_domain *old);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v5 13/13] iommu: Make set_dev_pasid op support domain replacement
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
` (11 preceding siblings ...)
2024-11-06 15:46 ` [PATCH v5 12/13] iommu/arm-smmu-v3: Make set_dev_pasid() op support replace Yi Liu
@ 2024-11-06 15:46 ` Yi Liu
12 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-06 15:46 UTC (permalink / raw)
To: joro, jgg, kevin.tian, baolu.lu
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu,
iommu, zhenzhong.duan, vasant.hegde, willy
The iommu core is going to support domain replacement for pasid, it needs
to make the set_dev_pasid op support replacing domain and keep the old
domain config in the failure case.
AMD iommu driver does not support domain replacement for pasid yet, so it
would fail the set_dev_pasid op to keep the old config if the input @old
is non-NULL. Till now, all the set_dev_pasid callbacks can handle the old
parameter and can keep the old config when failed, so update the kdoc of
set_dev_pasid op.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/amd/pasid.c | 3 +++
include/linux/iommu.h | 3 ++-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/amd/pasid.c b/drivers/iommu/amd/pasid.c
index d1dfc745f55e..8c73a30c2800 100644
--- a/drivers/iommu/amd/pasid.c
+++ b/drivers/iommu/amd/pasid.c
@@ -109,6 +109,9 @@ int iommu_sva_set_dev_pasid(struct iommu_domain *domain,
unsigned long flags;
int ret = -EINVAL;
+ if (old)
+ return -EOPNOTSUPP;
+
/* PASID zero is used for requests from the I/O device without PASID */
if (!is_pasid_valid(dev_data, pasid))
return ret;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 56653af5edcb..0c3bfb66dc7c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -614,7 +614,8 @@ struct iommu_ops {
* * EBUSY - device is attached to a domain and cannot be changed
* * ENODEV - device specific errors, not able to be attached
* * <others> - treated as ENODEV by the caller. Use is discouraged
- * @set_dev_pasid: set an iommu domain to a pasid of device
+ * @set_dev_pasid: set or replace an iommu domain to a pasid of device. The pasid of
+ * the device should be left in the old config in error case.
* @map_pages: map a physically contiguous set of pages of the same size to
* an iommu domain.
* @unmap_pages: unmap a number of pages of the same size from an iommu domain
--
2.34.1
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-06 15:45 ` [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers Yi Liu
@ 2024-11-07 2:52 ` Baolu Lu
2024-11-07 4:21 ` Yi Liu
2024-11-07 2:57 ` Baolu Lu
1 sibling, 1 reply; 25+ messages in thread
From: Baolu Lu @ 2024-11-07 2:52 UTC (permalink / raw)
To: Yi Liu, joro, jgg, kevin.tian
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, iommu,
zhenzhong.duan, vasant.hegde, willy
On 11/6/24 23:45, Yi Liu wrote:
> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
> + struct device *dev, pgd_t *pgd,
> + u32 pasid, u16 did, u16 old_did,
> + int flags)
> +{
> + struct pasid_entry *pte;
> +
> + if (!ecap_flts(iommu->ecap)) {
> + pr_err("No first level translation support on %s\n",
> + iommu->name);
> + return -EINVAL;
> + }
> +
> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
> + pr_err("No 5-level paging support for first-level on %s\n",
> + iommu->name);
> + return -EINVAL;
> + }
> +
> + spin_lock(&iommu->lock);
> + pte = intel_pasid_get_entry(dev, pasid);
> + if (!pte) {
> + spin_unlock(&iommu->lock);
> + return -ENODEV;
> + }
> +
> + if (!pasid_pte_is_present(pte)) {
> + spin_unlock(&iommu->lock);
> + return -EINVAL;
> + }
> +
> + WARN_ON(old_did != pasid_get_domain_id(pte));
> +
> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
> + spin_unlock(&iommu->lock);
> +
> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
> + intel_iommu_drain_pasid_prq(dev, pasid);
> +
> + return 0;
> +}
pasid_pte_config_first_level() causes the pasid entry to transition from
present to non-present and then to present. In this case, calling
intel_pasid_flush_present() is not accurate, as it is only intended for
pasid entries transitioning from present to present, according to the
specification.
It's recommended to move pasid_clear_entry(pte) and
pasid_set_present(pte) out to the caller, so ...
For setup case (pasid from non-present to present):
- pasid_clear_entry(pte)
- pasid_pte_config_first_level(pte)
- pasid_set_present(pte)
- cache invalidations
For replace case (pasid from present to present)
- pasid_pte_config_first_level(pte)
- cache invalidations
The same applies to other types of setup and replace.
--
baolu
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-06 15:45 ` [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers Yi Liu
2024-11-07 2:52 ` Baolu Lu
@ 2024-11-07 2:57 ` Baolu Lu
2024-11-07 4:05 ` Yi Liu
1 sibling, 1 reply; 25+ messages in thread
From: Baolu Lu @ 2024-11-07 2:57 UTC (permalink / raw)
To: Yi Liu, joro, jgg, kevin.tian
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, iommu,
zhenzhong.duan, vasant.hegde, willy
On 11/6/24 23:45, Yi Liu wrote:
> +int intel_pasid_replace_second_level(struct intel_iommu *iommu,
> + struct dmar_domain *domain,
> + struct device *dev, u16 old_did,
> + u32 pasid)
> +{
> + struct pasid_entry *pte;
> + struct dma_pte *pgd;
> + u64 pgd_val;
> + int agaw;
> + u16 did;
> +
> + /*
> + * If hardware advertises no support for second level
> + * translation, return directly.
> + */
> + if (!ecap_slts(iommu->ecap)) {
> + pr_err("No second level translation support on %s\n",
> + iommu->name);
> + return -EINVAL;
> + }
> +
> + pgd = domain->pgd;
> + pgd_val = virt_to_phys(pgd);
> + did = domain_id_iommu(domain, iommu);
> +
> + spin_lock(&iommu->lock);
> + pte = intel_pasid_get_entry(dev, pasid);
> + if (!pte) {
> + spin_unlock(&iommu->lock);
> + return -ENODEV;
> + }
> +
> + if (!pasid_pte_is_present(pte)) {
> + spin_unlock(&iommu->lock);
> + return -EINVAL;
> + }
> +
> + WARN_ON(old_did != pasid_get_domain_id(pte));
> +
> + pasid_pte_config_second_level(iommu, pte, pgd_val, agaw,
> + did, domain->dirty_tracking);
> + spin_unlock(&iommu->lock);
> +
> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
> + intel_iommu_drain_pasid_prq(dev, pasid);
> +
> + return 0;
> +}
0day robot complains:
>> drivers/iommu/intel/pasid.c:540:53: warning: variable 'agaw' is
uninitialized when used here [-Wuninitialized]
540 | pasid_pte_config_second_level(iommu, pte, pgd_val, agaw,
| ^~~~
drivers/iommu/intel/pasid.c:509:10: note: initialize the variable
'agaw' to silence this warning
509 | int agaw;
| ^
| = 0
The right fix could be like this:
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 777e70b539b1..69f12b1b8a2b 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -506,7 +506,6 @@ int intel_pasid_replace_second_level(struct
intel_iommu *iommu,
struct pasid_entry *pte;
struct dma_pte *pgd;
u64 pgd_val;
- int agaw;
u16 did;
/*
@@ -537,7 +536,7 @@ int intel_pasid_replace_second_level(struct
intel_iommu *iommu,
WARN_ON(old_did != pasid_get_domain_id(pte));
- pasid_pte_config_second_level(iommu, pte, pgd_val, agaw,
+ pasid_pte_config_second_level(iommu, pte, pgd_val, domain->agaw,
did, domain->dirty_tracking);
spin_unlock(&iommu->lock);
--
baolu
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 2:57 ` Baolu Lu
@ 2024-11-07 4:05 ` Yi Liu
0 siblings, 0 replies; 25+ messages in thread
From: Yi Liu @ 2024-11-07 4:05 UTC (permalink / raw)
To: Baolu Lu, joro, jgg, kevin.tian
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, iommu,
zhenzhong.duan, vasant.hegde, willy
On 2024/11/7 10:57, Baolu Lu wrote:
> On 11/6/24 23:45, Yi Liu wrote:
>> +int intel_pasid_replace_second_level(struct intel_iommu *iommu,
>> + struct dmar_domain *domain,
>> + struct device *dev, u16 old_did,
>> + u32 pasid)
>> +{
>> + struct pasid_entry *pte;
>> + struct dma_pte *pgd;
>> + u64 pgd_val;
>> + int agaw;
>> + u16 did;
>> +
>> + /*
>> + * If hardware advertises no support for second level
>> + * translation, return directly.
>> + */
>> + if (!ecap_slts(iommu->ecap)) {
>> + pr_err("No second level translation support on %s\n",
>> + iommu->name);
>> + return -EINVAL;
>> + }
>> +
>> + pgd = domain->pgd;
>> + pgd_val = virt_to_phys(pgd);
>> + did = domain_id_iommu(domain, iommu);
>> +
>> + spin_lock(&iommu->lock);
>> + pte = intel_pasid_get_entry(dev, pasid);
>> + if (!pte) {
>> + spin_unlock(&iommu->lock);
>> + return -ENODEV;
>> + }
>> +
>> + if (!pasid_pte_is_present(pte)) {
>> + spin_unlock(&iommu->lock);
>> + return -EINVAL;
>> + }
>> +
>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>> +
>> + pasid_pte_config_second_level(iommu, pte, pgd_val, agaw,
>> + did, domain->dirty_tracking);
>> + spin_unlock(&iommu->lock);
>> +
>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>> + intel_iommu_drain_pasid_prq(dev, pasid);
>> +
>> + return 0;
>> +}
>
> 0day robot complains:
>
> >> drivers/iommu/intel/pasid.c:540:53: warning: variable 'agaw' is
> uninitialized when used here [-Wuninitialized]
> 540 | pasid_pte_config_second_level(iommu, pte, pgd_val, agaw,
> | ^~~~
> drivers/iommu/intel/pasid.c:509:10: note: initialize the variable
> 'agaw' to silence this warning
> 509 | int agaw;
> | ^
> | = 0
>
> The right fix could be like this:
>
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index 777e70b539b1..69f12b1b8a2b 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
> @@ -506,7 +506,6 @@ int intel_pasid_replace_second_level(struct intel_iommu
> *iommu,
> struct pasid_entry *pte;
> struct dma_pte *pgd;
> u64 pgd_val;
> - int agaw;
> u16 did;
>
> /*
> @@ -537,7 +536,7 @@ int intel_pasid_replace_second_level(struct intel_iommu
> *iommu,
>
> WARN_ON(old_did != pasid_get_domain_id(pte));
>
> - pasid_pte_config_second_level(iommu, pte, pgd_val, agaw,
> + pasid_pte_config_second_level(iommu, pte, pgd_val, domain->agaw,
> did, domain->dirty_tracking);
> spin_unlock(&iommu->lock);
>
yes. will fix it.
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 2:52 ` Baolu Lu
@ 2024-11-07 4:21 ` Yi Liu
2024-11-07 5:46 ` Tian, Kevin
0 siblings, 1 reply; 25+ messages in thread
From: Yi Liu @ 2024-11-07 4:21 UTC (permalink / raw)
To: Baolu Lu, joro, jgg, kevin.tian
Cc: alex.williamson, eric.auger, nicolinc, kvm, chao.p.peng, iommu,
zhenzhong.duan, vasant.hegde, willy
On 2024/11/7 10:52, Baolu Lu wrote:
> On 11/6/24 23:45, Yi Liu wrote:
>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>> + struct device *dev, pgd_t *pgd,
>> + u32 pasid, u16 did, u16 old_did,
>> + int flags)
>> +{
>> + struct pasid_entry *pte;
>> +
>> + if (!ecap_flts(iommu->ecap)) {
>> + pr_err("No first level translation support on %s\n",
>> + iommu->name);
>> + return -EINVAL;
>> + }
>> +
>> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
>> + pr_err("No 5-level paging support for first-level on %s\n",
>> + iommu->name);
>> + return -EINVAL;
>> + }
>> +
>> + spin_lock(&iommu->lock);
>> + pte = intel_pasid_get_entry(dev, pasid);
>> + if (!pte) {
>> + spin_unlock(&iommu->lock);
>> + return -ENODEV;
>> + }
>> +
>> + if (!pasid_pte_is_present(pte)) {
>> + spin_unlock(&iommu->lock);
>> + return -EINVAL;
>> + }
>> +
>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>> +
>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>> + spin_unlock(&iommu->lock);
>> +
>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>> + intel_iommu_drain_pasid_prq(dev, pasid);
>> +
>> + return 0;
>> +}
>
> pasid_pte_config_first_level() causes the pasid entry to transition from
> present to non-present and then to present. In this case, calling
> intel_pasid_flush_present() is not accurate, as it is only intended for
> pasid entries transitioning from present to present, according to the
> specification.
>
> It's recommended to move pasid_clear_entry(pte) and
> pasid_set_present(pte) out to the caller, so ...
>
> For setup case (pasid from non-present to present):
>
> - pasid_clear_entry(pte)
> - pasid_pte_config_first_level(pte)
> - pasid_set_present(pte)
> - cache invalidations
>
> For replace case (pasid from present to present)
>
> - pasid_pte_config_first_level(pte)
> - cache invalidations
>
> The same applies to other types of setup and replace.
hmmm. Here is the reason I did it in the way of this patch:
1) pasid_clear_entry() can clear all the fields that are not supposed to
be used by the new domain. For example, converting a nested domain to SS
only domain, if no pasid_clear_entry() then the FSPTR would be there.
Although spec seems not enforce it, it might be good to clear it.
2) We don't support atomic replace yet, so the whole pasid entry transition
is not done in one shot, so it looks to be ok to do this stepping
transition.
3) It seems to be even worse if keep the Present bit during the transition.
The pasid entry might be broken while the Present bit indicates this is
a valid pasid entry. Say if there is in-flight DMA, the result may be
unpredictable.
Based on the above, I chose the current way. But I admit if we are going to
support atomic replace, then we should refactor a bit. I believe at that
time we need to construct the new pasid entry first and try to exchange it
to the pasid table. I can see some transition can be done in that way as we
can do atomic exchange with 128bits. thoughts? :)
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 4:21 ` Yi Liu
@ 2024-11-07 5:46 ` Tian, Kevin
2024-11-07 6:46 ` Yi Liu
0 siblings, 1 reply; 25+ messages in thread
From: Tian, Kevin @ 2024-11-07 5:46 UTC (permalink / raw)
To: Liu, Yi L, Baolu Lu, joro@8bytes.org, jgg@nvidia.com
Cc: alex.williamson@redhat.com, eric.auger@redhat.com,
nicolinc@nvidia.com, kvm@vger.kernel.org,
chao.p.peng@linux.intel.com, iommu@lists.linux.dev,
Duan, Zhenzhong, vasant.hegde@amd.com, willy@infradead.org
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, November 7, 2024 12:21 PM
>
> On 2024/11/7 10:52, Baolu Lu wrote:
> > On 11/6/24 23:45, Yi Liu wrote:
> >> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
> >> + struct device *dev, pgd_t *pgd,
> >> + u32 pasid, u16 did, u16 old_did,
> >> + int flags)
> >> +{
> >> + struct pasid_entry *pte;
> >> +
> >> + if (!ecap_flts(iommu->ecap)) {
> >> + pr_err("No first level translation support on %s\n",
> >> + iommu->name);
> >> + return -EINVAL;
> >> + }
> >> +
> >> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
> >> + pr_err("No 5-level paging support for first-level on %s\n",
> >> + iommu->name);
> >> + return -EINVAL;
> >> + }
> >> +
> >> + spin_lock(&iommu->lock);
> >> + pte = intel_pasid_get_entry(dev, pasid);
> >> + if (!pte) {
> >> + spin_unlock(&iommu->lock);
> >> + return -ENODEV;
> >> + }
> >> +
> >> + if (!pasid_pte_is_present(pte)) {
> >> + spin_unlock(&iommu->lock);
> >> + return -EINVAL;
> >> + }
> >> +
> >> + WARN_ON(old_did != pasid_get_domain_id(pte));
> >> +
> >> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
> >> + spin_unlock(&iommu->lock);
> >> +
> >> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
> >> + intel_iommu_drain_pasid_prq(dev, pasid);
> >> +
> >> + return 0;
> >> +}
> >
> > pasid_pte_config_first_level() causes the pasid entry to transition from
> > present to non-present and then to present. In this case, calling
> > intel_pasid_flush_present() is not accurate, as it is only intended for
> > pasid entries transitioning from present to present, according to the
> > specification.
> >
> > It's recommended to move pasid_clear_entry(pte) and
> > pasid_set_present(pte) out to the caller, so ...
> >
> > For setup case (pasid from non-present to present):
> >
> > - pasid_clear_entry(pte)
> > - pasid_pte_config_first_level(pte)
> > - pasid_set_present(pte)
> > - cache invalidations
> >
> > For replace case (pasid from present to present)
> >
> > - pasid_pte_config_first_level(pte)
> > - cache invalidations
> >
> > The same applies to other types of setup and replace.
>
> hmmm. Here is the reason I did it in the way of this patch:
> 1) pasid_clear_entry() can clear all the fields that are not supposed to
> be used by the new domain. For example, converting a nested domain to
> SS
> only domain, if no pasid_clear_entry() then the FSPTR would be there.
> Although spec seems not enforce it, it might be good to clear it.
> 2) We don't support atomic replace yet, so the whole pasid entry transition
> is not done in one shot, so it looks to be ok to do this stepping
> transition.
> 3) It seems to be even worse if keep the Present bit during the transition.
> The pasid entry might be broken while the Present bit indicates this is
> a valid pasid entry. Say if there is in-flight DMA, the result may be
> unpredictable.
>
> Based on the above, I chose the current way. But I admit if we are going to
> support atomic replace, then we should refactor a bit. I believe at that
> time we need to construct the new pasid entry first and try to exchange it
> to the pasid table. I can see some transition can be done in that way as we
> can do atomic exchange with 128bits. thoughts? :)
>
yes 128bit cmpxchg is necessary to support atomic replacement.
Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
together in a present entry to not break in-flight DMA.
but... your current way (clear entry then update it) also break in-flight
DMA. So let's admit that as the 1st step it's not aimed to support
atomic replacement. With that Baolu's suggestion makes more sense
toward future extension with less refactoring required (otherwise
you should not use intel_pasid_flush_present() then the earlier
refactoring for that helper is also meaningless).
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 5:46 ` Tian, Kevin
@ 2024-11-07 6:46 ` Yi Liu
2024-11-07 6:53 ` Baolu Lu
0 siblings, 1 reply; 25+ messages in thread
From: Yi Liu @ 2024-11-07 6:46 UTC (permalink / raw)
To: Tian, Kevin, Baolu Lu, joro@8bytes.org, jgg@nvidia.com
Cc: alex.williamson@redhat.com, eric.auger@redhat.com,
nicolinc@nvidia.com, kvm@vger.kernel.org,
chao.p.peng@linux.intel.com, iommu@lists.linux.dev,
Duan, Zhenzhong, vasant.hegde@amd.com, willy@infradead.org
On 2024/11/7 13:46, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, November 7, 2024 12:21 PM
>>
>> On 2024/11/7 10:52, Baolu Lu wrote:
>>> On 11/6/24 23:45, Yi Liu wrote:
>>>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>>>> + struct device *dev, pgd_t *pgd,
>>>> + u32 pasid, u16 did, u16 old_did,
>>>> + int flags)
>>>> +{
>>>> + struct pasid_entry *pte;
>>>> +
>>>> + if (!ecap_flts(iommu->ecap)) {
>>>> + pr_err("No first level translation support on %s\n",
>>>> + iommu->name);
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
>>>> + pr_err("No 5-level paging support for first-level on %s\n",
>>>> + iommu->name);
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + spin_lock(&iommu->lock);
>>>> + pte = intel_pasid_get_entry(dev, pasid);
>>>> + if (!pte) {
>>>> + spin_unlock(&iommu->lock);
>>>> + return -ENODEV;
>>>> + }
>>>> +
>>>> + if (!pasid_pte_is_present(pte)) {
>>>> + spin_unlock(&iommu->lock);
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>>>> +
>>>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>>>> + spin_unlock(&iommu->lock);
>>>> +
>>>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>>>> + intel_iommu_drain_pasid_prq(dev, pasid);
>>>> +
>>>> + return 0;
>>>> +}
>>>
>>> pasid_pte_config_first_level() causes the pasid entry to transition from
>>> present to non-present and then to present. In this case, calling
>>> intel_pasid_flush_present() is not accurate, as it is only intended for
>>> pasid entries transitioning from present to present, according to the
>>> specification.
>>>
>>> It's recommended to move pasid_clear_entry(pte) and
>>> pasid_set_present(pte) out to the caller, so ...
>>>
>>> For setup case (pasid from non-present to present):
>>>
>>> - pasid_clear_entry(pte)
>>> - pasid_pte_config_first_level(pte)
>>> - pasid_set_present(pte)
>>> - cache invalidations
>>>
>>> For replace case (pasid from present to present)
>>>
>>> - pasid_pte_config_first_level(pte)
>>> - cache invalidations
>>>
>>> The same applies to other types of setup and replace.
>>
>> hmmm. Here is the reason I did it in the way of this patch:
>> 1) pasid_clear_entry() can clear all the fields that are not supposed to
>> be used by the new domain. For example, converting a nested domain to
>> SS
>> only domain, if no pasid_clear_entry() then the FSPTR would be there.
>> Although spec seems not enforce it, it might be good to clear it.
>> 2) We don't support atomic replace yet, so the whole pasid entry transition
>> is not done in one shot, so it looks to be ok to do this stepping
>> transition.
>> 3) It seems to be even worse if keep the Present bit during the transition.
>> The pasid entry might be broken while the Present bit indicates this is
>> a valid pasid entry. Say if there is in-flight DMA, the result may be
>> unpredictable.
>>
>> Based on the above, I chose the current way. But I admit if we are going to
>> support atomic replace, then we should refactor a bit. I believe at that
>> time we need to construct the new pasid entry first and try to exchange it
>> to the pasid table. I can see some transition can be done in that way as we
>> can do atomic exchange with 128bits. thoughts? :)
>>
>
> yes 128bit cmpxchg is necessary to support atomic replacement.
>
> Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
> together in a present entry to not break in-flight DMA.
>
> but... your current way (clear entry then update it) also break in-flight
> DMA. So let's admit that as the 1st step it's not aimed to support
> atomic replacement. With that Baolu's suggestion makes more sense
> toward future extension with less refactoring required (otherwise
> you should not use intel_pasid_flush_present() then the earlier
> refactoring for that helper is also meaningless).
I see. The pasid entry might have some filed that is not supposed to be
used after replacement. Should we have a comment about it?
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 6:46 ` Yi Liu
@ 2024-11-07 6:53 ` Baolu Lu
2024-11-07 7:57 ` Yi Liu
0 siblings, 1 reply; 25+ messages in thread
From: Baolu Lu @ 2024-11-07 6:53 UTC (permalink / raw)
To: Yi Liu, Tian, Kevin, joro@8bytes.org, jgg@nvidia.com
Cc: baolu.lu, alex.williamson@redhat.com, eric.auger@redhat.com,
nicolinc@nvidia.com, kvm@vger.kernel.org,
chao.p.peng@linux.intel.com, iommu@lists.linux.dev,
Duan, Zhenzhong, vasant.hegde@amd.com, willy@infradead.org
On 2024/11/7 14:46, Yi Liu wrote:
> On 2024/11/7 13:46, Tian, Kevin wrote:
>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>> Sent: Thursday, November 7, 2024 12:21 PM
>>>
>>> On 2024/11/7 10:52, Baolu Lu wrote:
>>>> On 11/6/24 23:45, Yi Liu wrote:
>>>>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>>>>> + struct device *dev, pgd_t *pgd,
>>>>> + u32 pasid, u16 did, u16 old_did,
>>>>> + int flags)
>>>>> +{
>>>>> + struct pasid_entry *pte;
>>>>> +
>>>>> + if (!ecap_flts(iommu->ecap)) {
>>>>> + pr_err("No first level translation support on %s\n",
>>>>> + iommu->name);
>>>>> + return -EINVAL;
>>>>> + }
>>>>> +
>>>>> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu-
>>>>> >cap)) {
>>>>> + pr_err("No 5-level paging support for first-level on %s\n",
>>>>> + iommu->name);
>>>>> + return -EINVAL;
>>>>> + }
>>>>> +
>>>>> + spin_lock(&iommu->lock);
>>>>> + pte = intel_pasid_get_entry(dev, pasid);
>>>>> + if (!pte) {
>>>>> + spin_unlock(&iommu->lock);
>>>>> + return -ENODEV;
>>>>> + }
>>>>> +
>>>>> + if (!pasid_pte_is_present(pte)) {
>>>>> + spin_unlock(&iommu->lock);
>>>>> + return -EINVAL;
>>>>> + }
>>>>> +
>>>>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>>>>> +
>>>>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>>>>> + spin_unlock(&iommu->lock);
>>>>> +
>>>>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>>>>> + intel_iommu_drain_pasid_prq(dev, pasid);
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>
>>>> pasid_pte_config_first_level() causes the pasid entry to transition
>>>> from
>>>> present to non-present and then to present. In this case, calling
>>>> intel_pasid_flush_present() is not accurate, as it is only intended for
>>>> pasid entries transitioning from present to present, according to the
>>>> specification.
>>>>
>>>> It's recommended to move pasid_clear_entry(pte) and
>>>> pasid_set_present(pte) out to the caller, so ...
>>>>
>>>> For setup case (pasid from non-present to present):
>>>>
>>>> - pasid_clear_entry(pte)
>>>> - pasid_pte_config_first_level(pte)
>>>> - pasid_set_present(pte)
>>>> - cache invalidations
>>>>
>>>> For replace case (pasid from present to present)
>>>>
>>>> - pasid_pte_config_first_level(pte)
>>>> - cache invalidations
>>>>
>>>> The same applies to other types of setup and replace.
>>>
>>> hmmm. Here is the reason I did it in the way of this patch:
>>> 1) pasid_clear_entry() can clear all the fields that are not supposed to
>>> be used by the new domain. For example, converting a nested
>>> domain to
>>> SS
>>> only domain, if no pasid_clear_entry() then the FSPTR would be
>>> there.
>>> Although spec seems not enforce it, it might be good to clear it.
>>> 2) We don't support atomic replace yet, so the whole pasid entry
>>> transition
>>> is not done in one shot, so it looks to be ok to do this stepping
>>> transition.
>>> 3) It seems to be even worse if keep the Present bit during the
>>> transition.
>>> The pasid entry might be broken while the Present bit indicates
>>> this is
>>> a valid pasid entry. Say if there is in-flight DMA, the result
>>> may be
>>> unpredictable.
>>>
>>> Based on the above, I chose the current way. But I admit if we are
>>> going to
>>> support atomic replace, then we should refactor a bit. I believe at that
>>> time we need to construct the new pasid entry first and try to
>>> exchange it
>>> to the pasid table. I can see some transition can be done in that way
>>> as we
>>> can do atomic exchange with 128bits. thoughts? 🙂
>>>
>>
>> yes 128bit cmpxchg is necessary to support atomic replacement.
>>
>> Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
>> together in a present entry to not break in-flight DMA.
>>
>> but... your current way (clear entry then update it) also break in-flight
>> DMA. So let's admit that as the 1st step it's not aimed to support
>> atomic replacement. With that Baolu's suggestion makes more sense
>> toward future extension with less refactoring required (otherwise
>> you should not use intel_pasid_flush_present() then the earlier
>> refactoring for that helper is also meaningless).
>
> I see. The pasid entry might have some filed that is not supposed to be
> used after replacement. Should we have a comment about it?
I guess all fields except SSADE and P of a pasid table entry should be
cleared in pasid_pte_config_first_level()?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 6:53 ` Baolu Lu
@ 2024-11-07 7:57 ` Yi Liu
2024-11-07 8:04 ` Baolu Lu
0 siblings, 1 reply; 25+ messages in thread
From: Yi Liu @ 2024-11-07 7:57 UTC (permalink / raw)
To: Baolu Lu, Tian, Kevin, joro@8bytes.org, jgg@nvidia.com
Cc: alex.williamson@redhat.com, eric.auger@redhat.com,
nicolinc@nvidia.com, kvm@vger.kernel.org,
chao.p.peng@linux.intel.com, iommu@lists.linux.dev,
Duan, Zhenzhong, vasant.hegde@amd.com, willy@infradead.org
On 2024/11/7 14:53, Baolu Lu wrote:
> On 2024/11/7 14:46, Yi Liu wrote:
>> On 2024/11/7 13:46, Tian, Kevin wrote:
>>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>>> Sent: Thursday, November 7, 2024 12:21 PM
>>>>
>>>> On 2024/11/7 10:52, Baolu Lu wrote:
>>>>> On 11/6/24 23:45, Yi Liu wrote:
>>>>>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>>>>>> + struct device *dev, pgd_t *pgd,
>>>>>> + u32 pasid, u16 did, u16 old_did,
>>>>>> + int flags)
>>>>>> +{
>>>>>> + struct pasid_entry *pte;
>>>>>> +
>>>>>> + if (!ecap_flts(iommu->ecap)) {
>>>>>> + pr_err("No first level translation support on %s\n",
>>>>>> + iommu->name);
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> +
>>>>>> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu-
>>>>>> >cap)) {
>>>>>> + pr_err("No 5-level paging support for first-level on %s\n",
>>>>>> + iommu->name);
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> +
>>>>>> + spin_lock(&iommu->lock);
>>>>>> + pte = intel_pasid_get_entry(dev, pasid);
>>>>>> + if (!pte) {
>>>>>> + spin_unlock(&iommu->lock);
>>>>>> + return -ENODEV;
>>>>>> + }
>>>>>> +
>>>>>> + if (!pasid_pte_is_present(pte)) {
>>>>>> + spin_unlock(&iommu->lock);
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> +
>>>>>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>>>>>> +
>>>>>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>>>>>> + spin_unlock(&iommu->lock);
>>>>>> +
>>>>>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>>>>>> + intel_iommu_drain_pasid_prq(dev, pasid);
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>>
>>>>> pasid_pte_config_first_level() causes the pasid entry to transition from
>>>>> present to non-present and then to present. In this case, calling
>>>>> intel_pasid_flush_present() is not accurate, as it is only intended for
>>>>> pasid entries transitioning from present to present, according to the
>>>>> specification.
>>>>>
>>>>> It's recommended to move pasid_clear_entry(pte) and
>>>>> pasid_set_present(pte) out to the caller, so ...
>>>>>
>>>>> For setup case (pasid from non-present to present):
>>>>>
>>>>> - pasid_clear_entry(pte)
>>>>> - pasid_pte_config_first_level(pte)
>>>>> - pasid_set_present(pte)
>>>>> - cache invalidations
>>>>>
>>>>> For replace case (pasid from present to present)
>>>>>
>>>>> - pasid_pte_config_first_level(pte)
>>>>> - cache invalidations
>>>>>
>>>>> The same applies to other types of setup and replace.
>>>>
>>>> hmmm. Here is the reason I did it in the way of this patch:
>>>> 1) pasid_clear_entry() can clear all the fields that are not supposed to
>>>> be used by the new domain. For example, converting a nested domain to
>>>> SS
>>>> only domain, if no pasid_clear_entry() then the FSPTR would be there.
>>>> Although spec seems not enforce it, it might be good to clear it.
>>>> 2) We don't support atomic replace yet, so the whole pasid entry
>>>> transition
>>>> is not done in one shot, so it looks to be ok to do this stepping
>>>> transition.
>>>> 3) It seems to be even worse if keep the Present bit during the
>>>> transition.
>>>> The pasid entry might be broken while the Present bit indicates
>>>> this is
>>>> a valid pasid entry. Say if there is in-flight DMA, the result may be
>>>> unpredictable.
>>>>
>>>> Based on the above, I chose the current way. But I admit if we are
>>>> going to
>>>> support atomic replace, then we should refactor a bit. I believe at that
>>>> time we need to construct the new pasid entry first and try to exchange it
>>>> to the pasid table. I can see some transition can be done in that way
>>>> as we
>>>> can do atomic exchange with 128bits. thoughts? 🙂
>>>>
>>>
>>> yes 128bit cmpxchg is necessary to support atomic replacement.
>>>
>>> Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
>>> together in a present entry to not break in-flight DMA.
>>>
>>> but... your current way (clear entry then update it) also break in-flight
>>> DMA. So let's admit that as the 1st step it's not aimed to support
>>> atomic replacement. With that Baolu's suggestion makes more sense
>>> toward future extension with less refactoring required (otherwise
>>> you should not use intel_pasid_flush_present() then the earlier
>>> refactoring for that helper is also meaningless).
>>
>> I see. The pasid entry might have some filed that is not supposed to be
>> used after replacement. Should we have a comment about it?
>
> I guess all fields except SSADE and P of a pasid table entry should be
> cleared in pasid_pte_config_first_level()?
perhaps we can take one more step forward. We can construct the new pte in
a local variable first and then push it to the pte in the pasid table. :)
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 7:57 ` Yi Liu
@ 2024-11-07 8:04 ` Baolu Lu
2024-11-07 8:39 ` Yi Liu
0 siblings, 1 reply; 25+ messages in thread
From: Baolu Lu @ 2024-11-07 8:04 UTC (permalink / raw)
To: Yi Liu, Tian, Kevin, joro@8bytes.org, jgg@nvidia.com
Cc: baolu.lu, alex.williamson@redhat.com, eric.auger@redhat.com,
nicolinc@nvidia.com, kvm@vger.kernel.org,
chao.p.peng@linux.intel.com, iommu@lists.linux.dev,
Duan, Zhenzhong, vasant.hegde@amd.com, willy@infradead.org
On 2024/11/7 15:57, Yi Liu wrote:
> On 2024/11/7 14:53, Baolu Lu wrote:
>> On 2024/11/7 14:46, Yi Liu wrote:
>>> On 2024/11/7 13:46, Tian, Kevin wrote:
>>>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>>>> Sent: Thursday, November 7, 2024 12:21 PM
>>>>>
>>>>> On 2024/11/7 10:52, Baolu Lu wrote:
>>>>>> On 11/6/24 23:45, Yi Liu wrote:
>>>>>>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>>>>>>> + struct device *dev, pgd_t *pgd,
>>>>>>> + u32 pasid, u16 did, u16 old_did,
>>>>>>> + int flags)
>>>>>>> +{
>>>>>>> + struct pasid_entry *pte;
>>>>>>> +
>>>>>>> + if (!ecap_flts(iommu->ecap)) {
>>>>>>> + pr_err("No first level translation support on %s\n",
>>>>>>> + iommu->name);
>>>>>>> + return -EINVAL;
>>>>>>> + }
>>>>>>> +
>>>>>>> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu-
>>>>>>> >cap)) {
>>>>>>> + pr_err("No 5-level paging support for first-level on %s\n",
>>>>>>> + iommu->name);
>>>>>>> + return -EINVAL;
>>>>>>> + }
>>>>>>> +
>>>>>>> + spin_lock(&iommu->lock);
>>>>>>> + pte = intel_pasid_get_entry(dev, pasid);
>>>>>>> + if (!pte) {
>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>> + return -ENODEV;
>>>>>>> + }
>>>>>>> +
>>>>>>> + if (!pasid_pte_is_present(pte)) {
>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>> + return -EINVAL;
>>>>>>> + }
>>>>>>> +
>>>>>>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>>>>>>> +
>>>>>>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>> +
>>>>>>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>>>>>>> + intel_iommu_drain_pasid_prq(dev, pasid);
>>>>>>> +
>>>>>>> + return 0;
>>>>>>> +}
>>>>>>
>>>>>> pasid_pte_config_first_level() causes the pasid entry to
>>>>>> transition from
>>>>>> present to non-present and then to present. In this case, calling
>>>>>> intel_pasid_flush_present() is not accurate, as it is only
>>>>>> intended for
>>>>>> pasid entries transitioning from present to present, according to the
>>>>>> specification.
>>>>>>
>>>>>> It's recommended to move pasid_clear_entry(pte) and
>>>>>> pasid_set_present(pte) out to the caller, so ...
>>>>>>
>>>>>> For setup case (pasid from non-present to present):
>>>>>>
>>>>>> - pasid_clear_entry(pte)
>>>>>> - pasid_pte_config_first_level(pte)
>>>>>> - pasid_set_present(pte)
>>>>>> - cache invalidations
>>>>>>
>>>>>> For replace case (pasid from present to present)
>>>>>>
>>>>>> - pasid_pte_config_first_level(pte)
>>>>>> - cache invalidations
>>>>>>
>>>>>> The same applies to other types of setup and replace.
>>>>>
>>>>> hmmm. Here is the reason I did it in the way of this patch:
>>>>> 1) pasid_clear_entry() can clear all the fields that are not
>>>>> supposed to
>>>>> be used by the new domain. For example, converting a nested
>>>>> domain to
>>>>> SS
>>>>> only domain, if no pasid_clear_entry() then the FSPTR would be
>>>>> there.
>>>>> Although spec seems not enforce it, it might be good to clear it.
>>>>> 2) We don't support atomic replace yet, so the whole pasid entry
>>>>> transition
>>>>> is not done in one shot, so it looks to be ok to do this stepping
>>>>> transition.
>>>>> 3) It seems to be even worse if keep the Present bit during the
>>>>> transition.
>>>>> The pasid entry might be broken while the Present bit
>>>>> indicates this is
>>>>> a valid pasid entry. Say if there is in-flight DMA, the result
>>>>> may be
>>>>> unpredictable.
>>>>>
>>>>> Based on the above, I chose the current way. But I admit if we are
>>>>> going to
>>>>> support atomic replace, then we should refactor a bit. I believe at
>>>>> that
>>>>> time we need to construct the new pasid entry first and try to
>>>>> exchange it
>>>>> to the pasid table. I can see some transition can be done in that
>>>>> way as we
>>>>> can do atomic exchange with 128bits. thoughts? 🙂
>>>>>
>>>>
>>>> yes 128bit cmpxchg is necessary to support atomic replacement.
>>>>
>>>> Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
>>>> together in a present entry to not break in-flight DMA.
>>>>
>>>> but... your current way (clear entry then update it) also break in-
>>>> flight
>>>> DMA. So let's admit that as the 1st step it's not aimed to support
>>>> atomic replacement. With that Baolu's suggestion makes more sense
>>>> toward future extension with less refactoring required (otherwise
>>>> you should not use intel_pasid_flush_present() then the earlier
>>>> refactoring for that helper is also meaningless).
>>>
>>> I see. The pasid entry might have some filed that is not supposed to be
>>> used after replacement. Should we have a comment about it?
>>
>> I guess all fields except SSADE and P of a pasid table entry should be
>> cleared in pasid_pte_config_first_level()?
>
> perhaps we can take one more step forward. We can construct the new pte
> in a local variable first and then push it to the pte in the pasid
> table. :)
>
That sounds better! The entry is composed on the stack and then copied
over to the pasid table as a whole.
With these two issues addressed, do u mind sending a new version? Let's
try to catch the pull request window. There are other series (iommufd
and vfio) about user space PASID support depending on this.
--
baolu
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 8:04 ` Baolu Lu
@ 2024-11-07 8:39 ` Yi Liu
2024-11-07 8:41 ` Baolu Lu
0 siblings, 1 reply; 25+ messages in thread
From: Yi Liu @ 2024-11-07 8:39 UTC (permalink / raw)
To: Baolu Lu, Tian, Kevin, joro@8bytes.org, jgg@nvidia.com
Cc: alex.williamson@redhat.com, eric.auger@redhat.com,
nicolinc@nvidia.com, kvm@vger.kernel.org,
chao.p.peng@linux.intel.com, iommu@lists.linux.dev,
Duan, Zhenzhong, vasant.hegde@amd.com, willy@infradead.org
On 2024/11/7 16:04, Baolu Lu wrote:
> On 2024/11/7 15:57, Yi Liu wrote:
>> On 2024/11/7 14:53, Baolu Lu wrote:
>>> On 2024/11/7 14:46, Yi Liu wrote:
>>>> On 2024/11/7 13:46, Tian, Kevin wrote:
>>>>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>>>>> Sent: Thursday, November 7, 2024 12:21 PM
>>>>>>
>>>>>> On 2024/11/7 10:52, Baolu Lu wrote:
>>>>>>> On 11/6/24 23:45, Yi Liu wrote:
>>>>>>>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>>>>>>>> + struct device *dev, pgd_t *pgd,
>>>>>>>> + u32 pasid, u16 did, u16 old_did,
>>>>>>>> + int flags)
>>>>>>>> +{
>>>>>>>> + struct pasid_entry *pte;
>>>>>>>> +
>>>>>>>> + if (!ecap_flts(iommu->ecap)) {
>>>>>>>> + pr_err("No first level translation support on %s\n",
>>>>>>>> + iommu->name);
>>>>>>>> + return -EINVAL;
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu-
>>>>>>>> >cap)) {
>>>>>>>> + pr_err("No 5-level paging support for first-level on %s\n",
>>>>>>>> + iommu->name);
>>>>>>>> + return -EINVAL;
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + spin_lock(&iommu->lock);
>>>>>>>> + pte = intel_pasid_get_entry(dev, pasid);
>>>>>>>> + if (!pte) {
>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>> + return -ENODEV;
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + if (!pasid_pte_is_present(pte)) {
>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>> + return -EINVAL;
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>>>>>>>> +
>>>>>>>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>> +
>>>>>>>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>>>>>>>> + intel_iommu_drain_pasid_prq(dev, pasid);
>>>>>>>> +
>>>>>>>> + return 0;
>>>>>>>> +}
>>>>>>>
>>>>>>> pasid_pte_config_first_level() causes the pasid entry to transition
>>>>>>> from
>>>>>>> present to non-present and then to present. In this case, calling
>>>>>>> intel_pasid_flush_present() is not accurate, as it is only intended for
>>>>>>> pasid entries transitioning from present to present, according to the
>>>>>>> specification.
>>>>>>>
>>>>>>> It's recommended to move pasid_clear_entry(pte) and
>>>>>>> pasid_set_present(pte) out to the caller, so ...
>>>>>>>
>>>>>>> For setup case (pasid from non-present to present):
>>>>>>>
>>>>>>> - pasid_clear_entry(pte)
>>>>>>> - pasid_pte_config_first_level(pte)
>>>>>>> - pasid_set_present(pte)
>>>>>>> - cache invalidations
>>>>>>>
>>>>>>> For replace case (pasid from present to present)
>>>>>>>
>>>>>>> - pasid_pte_config_first_level(pte)
>>>>>>> - cache invalidations
>>>>>>>
>>>>>>> The same applies to other types of setup and replace.
>>>>>>
>>>>>> hmmm. Here is the reason I did it in the way of this patch:
>>>>>> 1) pasid_clear_entry() can clear all the fields that are not supposed to
>>>>>> be used by the new domain. For example, converting a nested
>>>>>> domain to
>>>>>> SS
>>>>>> only domain, if no pasid_clear_entry() then the FSPTR would be
>>>>>> there.
>>>>>> Although spec seems not enforce it, it might be good to clear it.
>>>>>> 2) We don't support atomic replace yet, so the whole pasid entry
>>>>>> transition
>>>>>> is not done in one shot, so it looks to be ok to do this stepping
>>>>>> transition.
>>>>>> 3) It seems to be even worse if keep the Present bit during the
>>>>>> transition.
>>>>>> The pasid entry might be broken while the Present bit indicates
>>>>>> this is
>>>>>> a valid pasid entry. Say if there is in-flight DMA, the result
>>>>>> may be
>>>>>> unpredictable.
>>>>>>
>>>>>> Based on the above, I chose the current way. But I admit if we are
>>>>>> going to
>>>>>> support atomic replace, then we should refactor a bit. I believe at that
>>>>>> time we need to construct the new pasid entry first and try to
>>>>>> exchange it
>>>>>> to the pasid table. I can see some transition can be done in that way
>>>>>> as we
>>>>>> can do atomic exchange with 128bits. thoughts? 🙂
>>>>>>
>>>>>
>>>>> yes 128bit cmpxchg is necessary to support atomic replacement.
>>>>>
>>>>> Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
>>>>> together in a present entry to not break in-flight DMA.
>>>>>
>>>>> but... your current way (clear entry then update it) also break in-
>>>>> flight
>>>>> DMA. So let's admit that as the 1st step it's not aimed to support
>>>>> atomic replacement. With that Baolu's suggestion makes more sense
>>>>> toward future extension with less refactoring required (otherwise
>>>>> you should not use intel_pasid_flush_present() then the earlier
>>>>> refactoring for that helper is also meaningless).
>>>>
>>>> I see. The pasid entry might have some filed that is not supposed to be
>>>> used after replacement. Should we have a comment about it?
>>>
>>> I guess all fields except SSADE and P of a pasid table entry should be
>>> cleared in pasid_pte_config_first_level()?
>>
>> perhaps we can take one more step forward. We can construct the new pte
>> in a local variable first and then push it to the pte in the pasid table. :)
>>
>
> That sounds better! The entry is composed on the stack and then copied
> over to the pasid table as a whole.
that's it. Like the below.
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 6841b9892d55..0f2a926d3bd5 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -389,6 +389,50 @@ int intel_pasid_setup_first_level(struct intel_iommu
*iommu,
return 0;
}
+int intel_pasid_replace_first_level(struct intel_iommu *iommu,
+ struct device *dev, pgd_t *pgd,
+ u32 pasid, u16 did, u16 old_did,
+ int flags)
+{
+ struct pasid_entry *pte, new_pte;
+
+ if (!ecap_flts(iommu->ecap)) {
+ pr_err("No first level translation support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
+ pr_err("No 5-level paging support for first-level on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ pasid_pte_config_first_level(iommu, &new_pte, pgd, did, flags);
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ *pte = new_pte;
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
+
/*
* Set up the scalable mode pasid entry for second only translation type.
*/
@@ -456,6 +500,57 @@ int intel_pasid_setup_second_level(struct intel_iommu
*iommu,
return 0;
}
+int intel_pasid_replace_second_level(struct intel_iommu *iommu,
+ struct dmar_domain *domain,
+ struct device *dev, u16 old_did,
+ u32 pasid)
+{
+ struct pasid_entry *pte, new_pte;
+ struct dma_pte *pgd;
+ u64 pgd_val;
+ u16 did;
+
+ /*
+ * If hardware advertises no support for second level
+ * translation, return directly.
+ */
+ if (!ecap_slts(iommu->ecap)) {
+ pr_err("No second level translation support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ pgd = domain->pgd;
+ pgd_val = virt_to_phys(pgd);
+ did = domain_id_iommu(domain, iommu);
+
+ pasid_pte_config_second_level(iommu, &new_pte, pgd_val,
+ domain->agaw, did,
+ domain->dirty_tracking);
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ *pte = new_pte;
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
+
/*
* Set up dirty tracking on a second only or nested translation type.
*/
@@ -568,6 +663,38 @@ int intel_pasid_setup_pass_through(struct intel_iommu
*iommu,
return 0;
}
+int intel_pasid_replace_pass_through(struct intel_iommu *iommu,
+ struct device *dev, u16 old_did,
+ u32 pasid)
+{
+ struct pasid_entry *pte, new_pte;
+ u16 did = FLPT_DEFAULT_DID;
+
+ pasid_pte_config_pass_through(iommu, &new_pte, did);
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ *pte = new_pte;
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
+
/*
* Set the page snoop control for a pasid entry which has been set up.
*/
@@ -698,6 +825,69 @@ int intel_pasid_setup_nested(struct intel_iommu
*iommu, struct device *dev,
return 0;
}
+int intel_pasid_replace_nested(struct intel_iommu *iommu,
+ struct device *dev, u32 pasid,
+ u16 old_did, struct dmar_domain *domain)
+{
+ struct iommu_hwpt_vtd_s1 *s1_cfg = &domain->s1_cfg;
+ struct dmar_domain *s2_domain = domain->s2_domain;
+ u16 did = domain_id_iommu(domain, iommu);
+ struct pasid_entry *pte, new_pte;
+
+ /* Address width should match the address width supported by hardware */
+ switch (s1_cfg->addr_width) {
+ case ADDR_WIDTH_4LEVEL:
+ break;
+ case ADDR_WIDTH_5LEVEL:
+ if (!cap_fl5lp_support(iommu->cap)) {
+ dev_err_ratelimited(dev,
+ "5-level paging not supported\n");
+ return -EINVAL;
+ }
+ break;
+ default:
+ dev_err_ratelimited(dev, "Invalid stage-1 address width %d\n",
+ s1_cfg->addr_width);
+ return -EINVAL;
+ }
+
+ if ((s1_cfg->flags & IOMMU_VTD_S1_SRE) && !ecap_srs(iommu->ecap)) {
+ pr_err_ratelimited("No supervisor request support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ if ((s1_cfg->flags & IOMMU_VTD_S1_EAFE) && !ecap_eafs(iommu->ecap)) {
+ pr_err_ratelimited("No extended access flag support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ pasid_pte_config_nestd(iommu, &new_pte, s1_cfg, s2_domain, did);
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ *pte = new_pte;
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
+
/*
* Interfaces to setup or teardown a pasid table to the scalable-mode
* context table entry:
diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index dde6d3ba5ae0..082f4fe20216 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -303,6 +296,21 @@ int intel_pasid_setup_pass_through(struct intel_iommu
*iommu,
struct device *dev, u32 pasid);
int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
u32 pasid, struct dmar_domain *domain);
+int intel_pasid_replace_first_level(struct intel_iommu *iommu,
+ struct device *dev, pgd_t *pgd,
+ u32 pasid, u16 did, u16 old_did,
+ int flags);
+int intel_pasid_replace_second_level(struct intel_iommu *iommu,
+ struct dmar_domain *domain,
+ struct device *dev, u16 old_did,
+ u32 pasid);
+int intel_pasid_replace_pass_through(struct intel_iommu *iommu,
+ struct device *dev, u16 old_did,
+ u32 pasid);
+int intel_pasid_replace_nested(struct intel_iommu *iommu,
+ struct device *dev, u32 pasid,
+ u16 old_did, struct dmar_domain *domain);
+
void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
struct device *dev, u32 pasid,
bool fault_ignore);
> With these two issues addressed, do u mind sending a new version? Let's
> try to catch the pull request window. There are other series (iommufd
> and vfio) about user space PASID support depending on this.
yes, for sure.
--
Regards,
Yi Liu
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
2024-11-07 8:39 ` Yi Liu
@ 2024-11-07 8:41 ` Baolu Lu
0 siblings, 0 replies; 25+ messages in thread
From: Baolu Lu @ 2024-11-07 8:41 UTC (permalink / raw)
To: Yi Liu, Tian, Kevin, joro@8bytes.org, jgg@nvidia.com
Cc: baolu.lu, alex.williamson@redhat.com, eric.auger@redhat.com,
nicolinc@nvidia.com, kvm@vger.kernel.org,
chao.p.peng@linux.intel.com, iommu@lists.linux.dev,
Duan, Zhenzhong, vasant.hegde@amd.com, willy@infradead.org
On 2024/11/7 16:39, Yi Liu wrote:
> On 2024/11/7 16:04, Baolu Lu wrote:
>> On 2024/11/7 15:57, Yi Liu wrote:
>>> On 2024/11/7 14:53, Baolu Lu wrote:
>>>> On 2024/11/7 14:46, Yi Liu wrote:
>>>>> On 2024/11/7 13:46, Tian, Kevin wrote:
>>>>>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>>>>>> Sent: Thursday, November 7, 2024 12:21 PM
>>>>>>>
>>>>>>> On 2024/11/7 10:52, Baolu Lu wrote:
>>>>>>>> On 11/6/24 23:45, Yi Liu wrote:
>>>>>>>>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>>>>>>>>> + struct device *dev, pgd_t *pgd,
>>>>>>>>> + u32 pasid, u16 did, u16 old_did,
>>>>>>>>> + int flags)
>>>>>>>>> +{
>>>>>>>>> + struct pasid_entry *pte;
>>>>>>>>> +
>>>>>>>>> + if (!ecap_flts(iommu->ecap)) {
>>>>>>>>> + pr_err("No first level translation support on %s\n",
>>>>>>>>> + iommu->name);
>>>>>>>>> + return -EINVAL;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + if ((flags & PASID_FLAG_FL5LP) && !
>>>>>>>>> cap_fl5lp_support(iommu- >cap)) {
>>>>>>>>> + pr_err("No 5-level paging support for first-level on
>>>>>>>>> %s\n",
>>>>>>>>> + iommu->name);
>>>>>>>>> + return -EINVAL;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + spin_lock(&iommu->lock);
>>>>>>>>> + pte = intel_pasid_get_entry(dev, pasid);
>>>>>>>>> + if (!pte) {
>>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>>> + return -ENODEV;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + if (!pasid_pte_is_present(pte)) {
>>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>>> + return -EINVAL;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>>>>>>>>> +
>>>>>>>>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>>> +
>>>>>>>>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>>>>>>>>> + intel_iommu_drain_pasid_prq(dev, pasid);
>>>>>>>>> +
>>>>>>>>> + return 0;
>>>>>>>>> +}
>>>>>>>>
>>>>>>>> pasid_pte_config_first_level() causes the pasid entry to
>>>>>>>> transition from
>>>>>>>> present to non-present and then to present. In this case, calling
>>>>>>>> intel_pasid_flush_present() is not accurate, as it is only
>>>>>>>> intended for
>>>>>>>> pasid entries transitioning from present to present, according
>>>>>>>> to the
>>>>>>>> specification.
>>>>>>>>
>>>>>>>> It's recommended to move pasid_clear_entry(pte) and
>>>>>>>> pasid_set_present(pte) out to the caller, so ...
>>>>>>>>
>>>>>>>> For setup case (pasid from non-present to present):
>>>>>>>>
>>>>>>>> - pasid_clear_entry(pte)
>>>>>>>> - pasid_pte_config_first_level(pte)
>>>>>>>> - pasid_set_present(pte)
>>>>>>>> - cache invalidations
>>>>>>>>
>>>>>>>> For replace case (pasid from present to present)
>>>>>>>>
>>>>>>>> - pasid_pte_config_first_level(pte)
>>>>>>>> - cache invalidations
>>>>>>>>
>>>>>>>> The same applies to other types of setup and replace.
>>>>>>>
>>>>>>> hmmm. Here is the reason I did it in the way of this patch:
>>>>>>> 1) pasid_clear_entry() can clear all the fields that are not
>>>>>>> supposed to
>>>>>>> be used by the new domain. For example, converting a nested
>>>>>>> domain to
>>>>>>> SS
>>>>>>> only domain, if no pasid_clear_entry() then the FSPTR would
>>>>>>> be there.
>>>>>>> Although spec seems not enforce it, it might be good to
>>>>>>> clear it.
>>>>>>> 2) We don't support atomic replace yet, so the whole pasid entry
>>>>>>> transition
>>>>>>> is not done in one shot, so it looks to be ok to do this
>>>>>>> stepping
>>>>>>> transition.
>>>>>>> 3) It seems to be even worse if keep the Present bit during the
>>>>>>> transition.
>>>>>>> The pasid entry might be broken while the Present bit
>>>>>>> indicates this is
>>>>>>> a valid pasid entry. Say if there is in-flight DMA, the
>>>>>>> result may be
>>>>>>> unpredictable.
>>>>>>>
>>>>>>> Based on the above, I chose the current way. But I admit if we
>>>>>>> are going to
>>>>>>> support atomic replace, then we should refactor a bit. I believe
>>>>>>> at that
>>>>>>> time we need to construct the new pasid entry first and try to
>>>>>>> exchange it
>>>>>>> to the pasid table. I can see some transition can be done in that
>>>>>>> way as we
>>>>>>> can do atomic exchange with 128bits. thoughts? 🙂
>>>>>>>
>>>>>>
>>>>>> yes 128bit cmpxchg is necessary to support atomic replacement.
>>>>>>
>>>>>> Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
>>>>>> together in a present entry to not break in-flight DMA.
>>>>>>
>>>>>> but... your current way (clear entry then update it) also break
>>>>>> in- flight
>>>>>> DMA. So let's admit that as the 1st step it's not aimed to support
>>>>>> atomic replacement. With that Baolu's suggestion makes more sense
>>>>>> toward future extension with less refactoring required (otherwise
>>>>>> you should not use intel_pasid_flush_present() then the earlier
>>>>>> refactoring for that helper is also meaningless).
>>>>>
>>>>> I see. The pasid entry might have some filed that is not supposed
>>>>> to be
>>>>> used after replacement. Should we have a comment about it?
>>>>
>>>> I guess all fields except SSADE and P of a pasid table entry should be
>>>> cleared in pasid_pte_config_first_level()?
>>>
>>> perhaps we can take one more step forward. We can construct the new
>>> pte in a local variable first and then push it to the pte in the
>>> pasid table. 🙂
>>>
>>
>> That sounds better! The entry is composed on the stack and then copied
>> over to the pasid table as a whole.
>
> that's it. Like the below.
Looks good to me now.
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2024-11-07 8:41 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-06 15:45 [PATCH v5 00/13] Make set_dev_pasid op supporting domain replacement Yi Liu
2024-11-06 15:45 ` [PATCH v5 01/13] iommu: Pass old domain to set_dev_pasid op Yi Liu
2024-11-06 15:45 ` [PATCH v5 02/13] iommu/vt-d: Add a helper to flush cache for updating present pasid entry Yi Liu
2024-11-06 15:45 ` [PATCH v5 03/13] iommu/vt-d: Refactor the pasid setup helpers Yi Liu
2024-11-06 15:45 ` [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers Yi Liu
2024-11-07 2:52 ` Baolu Lu
2024-11-07 4:21 ` Yi Liu
2024-11-07 5:46 ` Tian, Kevin
2024-11-07 6:46 ` Yi Liu
2024-11-07 6:53 ` Baolu Lu
2024-11-07 7:57 ` Yi Liu
2024-11-07 8:04 ` Baolu Lu
2024-11-07 8:39 ` Yi Liu
2024-11-07 8:41 ` Baolu Lu
2024-11-07 2:57 ` Baolu Lu
2024-11-07 4:05 ` Yi Liu
2024-11-06 15:45 ` [PATCH v5 05/13] iommu/vt-d: Consolidate the struct dev_pasid_info add/remove Yi Liu
2024-11-06 15:45 ` [PATCH v5 06/13] iommu/vt-d: Add iommu_domain_did() to get did Yi Liu
2024-11-06 15:46 ` [PATCH v5 07/13] iommu/vt-d: Make intel_iommu_set_dev_pasid() to handle domain replacement Yi Liu
2024-11-06 15:46 ` [PATCH v5 08/13] iommu/vt-d: Limit intel_iommu_set_dev_pasid() for paging domain Yi Liu
2024-11-06 15:46 ` [PATCH v5 09/13] iommu/vt-d: Make intel_svm_set_dev_pasid() support domain replacement Yi Liu
2024-11-06 15:46 ` [PATCH v5 10/13] iommu/vt-d: Make identity_domain_set_dev_pasid() to handle " Yi Liu
2024-11-06 15:46 ` [PATCH v5 11/13] iommu/vt-d: Add set_dev_pasid callback for nested domain Yi Liu
2024-11-06 15:46 ` [PATCH v5 12/13] iommu/arm-smmu-v3: Make set_dev_pasid() op support replace Yi Liu
2024-11-06 15:46 ` [PATCH v5 13/13] iommu: Make set_dev_pasid op support domain replacement Yi Liu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox