* [PATCH v2 0/8] iommu/vt-d: Some cleanups
@ 2022-11-08 7:34 Lu Baolu
2022-11-08 7:34 ` [PATCH v2 1/8] iommu/vt-d: Allocate pasid table in device probe path Lu Baolu
` (7 more replies)
0 siblings, 8 replies; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
Hi,
This series contains some cleanups of the Intel IOMMU implementation
now that the IOMMU core has implemented the blocking domain. The cleanup
work is mainly in the attach_dev/device_probe/device_release paths.
Please help review.
Best regards,
baolu
Change log:
v2:
- Reorder the patches to make the device_block_translation() work with
the existing path first.
- Add a new patch to improve iommu_enable_pci_caps().
v1:
- https://lore.kernel.org/linux-iommu/20221103055329.633052-1-baolu.lu@linux.intel.com/
Lu Baolu (8):
iommu/vt-d: Allocate pasid table in device probe path
iommu/vt-d: Improve iommu_enable_pci_caps()
iommu/vt-d: Add device_block_translation() helper
iommu/vt-d: Add blocking domain support
iommu/vt-d: Fold dmar_remove_one_dev_info() into its caller
iommu/vt-d: Rename domain_add_dev_info()
iommu/vt-d: Remove unnecessary domain_context_mapped()
iommu/vt-d: Use real field for indication of first level
drivers/iommu/intel/iommu.h | 15 +--
drivers/iommu/intel/iommu.c | 252 +++++++++++++++++++-----------------
2 files changed, 138 insertions(+), 129 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 1/8] iommu/vt-d: Allocate pasid table in device probe path
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
@ 2022-11-08 7:34 ` Lu Baolu
2022-11-08 7:34 ` [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps() Lu Baolu
` (6 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
Whether or not a domain is attached to the device, the PASID table should
always be valid as long as the device has been probed. This moves the
PASID table allocation from the domain attach path to the device probe
path, and frees the table in the device release path.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/intel/iommu.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f298e51d5aa6..bc42a2c84e2a 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2477,13 +2477,6 @@ static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)
/* PASID table is mandatory for a PCI device in scalable mode. */
if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev)) {
- ret = intel_pasid_alloc_table(dev);
- if (ret) {
- dev_err(dev, "PASID table allocation failed\n");
- dmar_remove_one_dev_info(dev);
- return ret;
- }
-
/* Setup the PASID entry for requests without PASID: */
if (hw_pass_through && domain_type_is_si(domain))
ret = intel_pasid_setup_pass_through(iommu, domain,
@@ -4108,7 +4101,6 @@ static void dmar_remove_one_dev_info(struct device *dev)
iommu_disable_dev_iotlb(info);
domain_context_clear(info);
- intel_pasid_free_table(info->dev);
}
spin_lock_irqsave(&domain->lock, flags);
@@ -4466,6 +4458,7 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
struct device_domain_info *info;
struct intel_iommu *iommu;
u8 bus, devfn;
+ int ret;
iommu = device_to_iommu(dev, &bus, &devfn);
if (!iommu || !iommu->iommu.ops)
@@ -4509,6 +4502,16 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
dev_iommu_priv_set(dev, info);
+ if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev)) {
+ ret = intel_pasid_alloc_table(dev);
+ if (ret) {
+ dev_err(dev, "PASID table allocation failed\n");
+ dev_iommu_priv_set(dev, NULL);
+ kfree(info);
+ return ERR_PTR(ret);
+ }
+ }
+
return &iommu->iommu;
}
@@ -4517,6 +4520,7 @@ static void intel_iommu_release_device(struct device *dev)
struct device_domain_info *info = dev_iommu_priv_get(dev);
dmar_remove_one_dev_info(dev);
+ intel_pasid_free_table(dev);
dev_iommu_priv_set(dev, NULL);
kfree(info);
set_dma_ops(dev, NULL);
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps()
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
2022-11-08 7:34 ` [PATCH v2 1/8] iommu/vt-d: Allocate pasid table in device probe path Lu Baolu
@ 2022-11-08 7:34 ` Lu Baolu
2022-11-11 3:45 ` Tian, Kevin
2022-11-08 7:34 ` [PATCH v2 3/8] iommu/vt-d: Add device_block_translation() helper Lu Baolu
` (5 subsequent siblings)
7 siblings, 1 reply; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
The PCI subsystem triggers a WARN() if a feature is enabled repeatedly.
This improves iommu_enable_pci_caps() by checking each feature's state
before enabling it, which avoids those unnecessary kernel traces. It
also adds a kernel message whenever enabling a feature fails. It is
worth noting that PRI depends on ATS; a check for that is added as well.
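The resulting ordering can be condensed into a plain-C sketch (the
struct and function below are illustrative stand-ins, not the driver's
real API; the actual patch calls pci_enable_pasid()/pci_enable_ats()/
pci_enable_pri() and handles their error codes):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for struct device_domain_info's capability bits. */
struct caps {
	bool pasid_supported, pasid_enabled;
	bool ats_supported, ats_enabled;
	bool pri_supported, pri_enabled;
};

/* Check-before-enable makes a second call a no-op, so the PCI core's
 * WARN() on enabling an already-enabled feature never fires. */
static void enable_pci_caps(struct caps *c)
{
	if (c->pasid_supported && !c->pasid_enabled)
		c->pasid_enabled = true;

	if (c->ats_supported && !c->ats_enabled)
		c->ats_enabled = true;

	/* PRI depends on ATS: enable it only once ATS is actually on. */
	if (c->pri_supported && !c->pri_enabled && c->ats_enabled)
		c->pri_enabled = true;
}
```

Calling the sketch twice is safe, and PRI stays off for a device whose
ATS could not be enabled, mirroring the early returns in the patch.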
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 86 ++++++++++++++++++++++++++-----------
1 file changed, 61 insertions(+), 25 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index bc42a2c84e2a..978cb7bba2e1 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1401,44 +1401,80 @@ static void domain_update_iotlb(struct dmar_domain *domain)
static void iommu_enable_pci_caps(struct device_domain_info *info)
{
struct pci_dev *pdev;
+ int ret;
if (!info || !dev_is_pci(info->dev))
return;
pdev = to_pci_dev(info->dev);
- /* For IOMMU that supports device IOTLB throttling (DIT), we assign
- * PFSID to the invalidation desc of a VF such that IOMMU HW can gauge
- * queue depth at PF level. If DIT is not set, PFSID will be treated as
- * reserved, which should be set to 0.
+ /*
+ * The PCIe spec, in its wisdom, declares that the behaviour of
+ * the device if you enable PASID support after ATS support is
+ * undefined. So always enable PASID support on devices which
+ * have it, even if we can't yet know if we're ever going to
+ * use it.
*/
- if (!ecap_dit(info->iommu->ecap))
- info->pfsid = 0;
- else {
- struct pci_dev *pf_pdev;
-
- /* pdev will be returned if device is not a vf */
- pf_pdev = pci_physfn(pdev);
- info->pfsid = pci_dev_id(pf_pdev);
+ if (info->pasid_supported && !info->pasid_enabled) {
+ ret = pci_enable_pasid(pdev, info->pasid_supported & ~1);
+ if (ret)
+ pci_info(pdev, "Failed to enable PASID: %d\n", ret);
+ else
+ info->pasid_enabled = 1;
}
- /* The PCIe spec, in its wisdom, declares that the behaviour of
- the device if you enable PASID support after ATS support is
- undefined. So always enable PASID support on devices which
- have it, even if we can't yet know if we're ever going to
- use it. */
- if (info->pasid_supported && !pci_enable_pasid(pdev, info->pasid_supported & ~1))
- info->pasid_enabled = 1;
+ if (info->ats_supported && !info->ats_enabled) {
+ if (!pci_ats_page_aligned(pdev)) {
+ pci_info(pdev, "Untranslated Addresses not aligned\n");
+ return;
+ }
- if (info->pri_supported &&
- (info->pasid_enabled ? pci_prg_resp_pasid_required(pdev) : 1) &&
- !pci_reset_pri(pdev) && !pci_enable_pri(pdev, PRQ_DEPTH))
- info->pri_enabled = 1;
+ ret = pci_enable_ats(pdev, VTD_PAGE_SHIFT);
+ if (ret) {
+ pci_info(pdev, "Failed to enable ATS: %d\n", ret);
+ return;
+ }
- if (info->ats_supported && pci_ats_page_aligned(pdev) &&
- !pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
info->ats_enabled = 1;
domain_update_iotlb(info->domain);
info->ats_qdep = pci_ats_queue_depth(pdev);
+
+ /*
+ * For IOMMU that supports device IOTLB throttling (DIT),
+ * we assign PFSID to the invalidation desc of a VF such
+ * that IOMMU HW can gauge queue depth at PF level. If DIT
+ * is not set, PFSID will be treated as reserved, which
+ * should be set to 0.
+ */
+ if (ecap_dit(info->iommu->ecap)) {
+ struct pci_dev *pf_pdev;
+
+ /* pdev will be returned if device is not a vf */
+ pf_pdev = pci_physfn(pdev);
+ info->pfsid = pci_dev_id(pf_pdev);
+ } else {
+ info->pfsid = 0;
+ }
+ }
+
+ if (info->pri_supported && !info->pri_enabled && info->ats_enabled) {
+ if (info->pasid_enabled && !pci_prg_resp_pasid_required(pdev)) {
+ pci_info(pdev, "PRG Response PASID Required\n");
+ return;
+ }
+
+ ret = pci_reset_pri(pdev);
+ if (ret) {
+ pci_info(pdev, "Failed to reset PRI: %d\n", ret);
+ return;
+ }
+
+ ret = pci_enable_pri(pdev, PRQ_DEPTH);
+ if (ret) {
+ pci_info(pdev, "Failed to enable PRI: %d\n", ret);
+ return;
+ }
+
+ info->pri_enabled = 1;
}
}
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 3/8] iommu/vt-d: Add device_block_translation() helper
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
2022-11-08 7:34 ` [PATCH v2 1/8] iommu/vt-d: Allocate pasid table in device probe path Lu Baolu
2022-11-08 7:34 ` [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps() Lu Baolu
@ 2022-11-08 7:34 ` Lu Baolu
2022-11-08 7:34 ` [PATCH v2 4/8] iommu/vt-d: Add blocking domain support Lu Baolu
` (4 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
If attaching a domain to a device fails, the IOMMU driver should bring
the device into the blocking DMA state. The upper layer is then expected
to recover by attaching a new domain. Use device_block_translation() in
the error path of attach_dev to make this behavior explicit.
The difference between device_block_translation() and the previous
dmar_remove_one_dev_info() is that the latter also disables PCIe ATS and
the related PCIe features. This is unnecessary because these features
are not per-domain capabilities, so there is no need to disable them
when switching domains. Another difference worth pointing out is that,
in scalable mode, it is the RID2PASID entry rather than the context
entry that is cleared.
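The behavioral difference can be summarized in a small sketch (the types
and fields below are simplified stand-ins for illustration only, not the
driver's real data structures):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified per-device translation state. */
struct dev_state {
	bool scalable_mode;	/* sm_supported() in the real driver */
	bool context_present;	/* legacy-mode context entry */
	bool rid2pasid_present;	/* scalable-mode RID2PASID entry */
	void *domain;		/* attached domain, NULL when detached */
	bool ats_enabled;	/* PCIe caps survive a domain switch */
};

/* Block DMA without PASID: clear the table entry the hardware would
 * walk, but deliberately leave the PCIe capabilities alone. */
static void block_translation(struct dev_state *s)
{
	if (s->scalable_mode)
		s->rid2pasid_present = false;	/* clear RID2PASID entry */
	else
		s->context_present = false;	/* clear context entry */

	s->domain = NULL;			/* detach bookkeeping */
	/* note: s->ats_enabled is intentionally untouched */
}
```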
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 38 +++++++++++++++++++++++++++++++++----
1 file changed, 34 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 978cb7bba2e1..a5d0e6c88180 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -277,7 +277,7 @@ static LIST_HEAD(dmar_satc_units);
#define for_each_rmrr_units(rmrr) \
list_for_each_entry(rmrr, &dmar_rmrr_units, list)
-static void dmar_remove_one_dev_info(struct device *dev);
+static void device_block_translation(struct device *dev);
int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON);
int intel_iommu_sm = IS_ENABLED(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON);
@@ -2525,7 +2525,7 @@ static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)
dev, PASID_RID2PASID);
if (ret) {
dev_err(dev, "Setup RID2PASID failed\n");
- dmar_remove_one_dev_info(dev);
+ device_block_translation(dev);
return ret;
}
}
@@ -2533,7 +2533,7 @@ static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)
ret = domain_context_mapping(domain, dev);
if (ret) {
dev_err(dev, "Domain context map failed\n");
- dmar_remove_one_dev_info(dev);
+ device_block_translation(dev);
return ret;
}
@@ -4147,6 +4147,36 @@ static void dmar_remove_one_dev_info(struct device *dev)
info->domain = NULL;
}
+/*
+ * Clear the page table pointer in context or pasid table entries so that
+ * all DMA requests without PASID from the device are blocked. If the page
+ * table has been set, clean up the data structures.
+ */
+static void device_block_translation(struct device *dev)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct intel_iommu *iommu = info->iommu;
+ unsigned long flags;
+
+ if (!dev_is_real_dma_subdevice(dev)) {
+ if (sm_supported(iommu))
+ intel_pasid_tear_down_entry(iommu, dev,
+ PASID_RID2PASID, false);
+ else
+ domain_context_clear(info);
+ }
+
+ if (!info->domain)
+ return;
+
+ spin_lock_irqsave(&info->domain->lock, flags);
+ list_del(&info->link);
+ spin_unlock_irqrestore(&info->domain->lock, flags);
+
+ domain_detach_iommu(info->domain, iommu);
+ info->domain = NULL;
+}
+
static int md_domain_init(struct dmar_domain *domain, int guest_width)
{
int adjust_width;
@@ -4268,7 +4298,7 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
struct device_domain_info *info = dev_iommu_priv_get(dev);
if (info->domain)
- dmar_remove_one_dev_info(dev);
+ device_block_translation(dev);
}
ret = prepare_domain_attach_device(domain, dev);
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 4/8] iommu/vt-d: Add blocking domain support
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
` (2 preceding siblings ...)
2022-11-08 7:34 ` [PATCH v2 3/8] iommu/vt-d: Add device_block_translation() helper Lu Baolu
@ 2022-11-08 7:34 ` Lu Baolu
2022-11-08 7:34 ` [PATCH v2 5/8] iommu/vt-d: Fold dmar_remove_one_dev_info() into its caller Lu Baolu
` (3 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
Intel IOMMU hardware supports blocking DMA transactions by clearing
the translation table entries. This implements a real blocking domain so
that an empty UNMANAGED domain is no longer needed. The detach_dev
callback of the domain ops is not used in any path; remove it to avoid
dead code as well.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index a5d0e6c88180..5d5d08192c2c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -278,6 +278,7 @@ static LIST_HEAD(dmar_satc_units);
list_for_each_entry(rmrr, &dmar_rmrr_units, list)
static void device_block_translation(struct device *dev);
+static void intel_iommu_domain_free(struct iommu_domain *domain);
int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON);
int intel_iommu_sm = IS_ENABLED(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON);
@@ -4198,12 +4199,28 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
return 0;
}
+static int blocking_domain_attach_dev(struct iommu_domain *domain,
+ struct device *dev)
+{
+ device_block_translation(dev);
+ return 0;
+}
+
+static struct iommu_domain blocking_domain = {
+ .ops = &(const struct iommu_domain_ops) {
+ .attach_dev = blocking_domain_attach_dev,
+ .free = intel_iommu_domain_free
+ }
+};
+
static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
{
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
switch (type) {
+ case IOMMU_DOMAIN_BLOCKED:
+ return &blocking_domain;
case IOMMU_DOMAIN_DMA:
case IOMMU_DOMAIN_DMA_FQ:
case IOMMU_DOMAIN_UNMANAGED:
@@ -4238,7 +4255,7 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
static void intel_iommu_domain_free(struct iommu_domain *domain)
{
- if (domain != &si_domain->domain)
+ if (domain != &si_domain->domain && domain != &blocking_domain)
domain_exit(to_dmar_domain(domain));
}
@@ -4308,12 +4325,6 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
return domain_add_dev_info(to_dmar_domain(domain), dev);
}
-static void intel_iommu_detach_device(struct iommu_domain *domain,
- struct device *dev)
-{
- dmar_remove_one_dev_info(dev);
-}
-
static int intel_iommu_map(struct iommu_domain *domain,
unsigned long iova, phys_addr_t hpa,
size_t size, int iommu_prot, gfp_t gfp)
@@ -4821,7 +4832,6 @@ const struct iommu_ops intel_iommu_ops = {
#endif
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = intel_iommu_attach_device,
- .detach_dev = intel_iommu_detach_device,
.map_pages = intel_iommu_map_pages,
.unmap_pages = intel_iommu_unmap_pages,
.iotlb_sync_map = intel_iommu_iotlb_sync_map,
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 5/8] iommu/vt-d: Fold dmar_remove_one_dev_info() into its caller
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
` (3 preceding siblings ...)
2022-11-08 7:34 ` [PATCH v2 4/8] iommu/vt-d: Add blocking domain support Lu Baolu
@ 2022-11-08 7:34 ` Lu Baolu
2022-11-08 7:34 ` [PATCH v2 6/8] iommu/vt-d: Rename domain_add_dev_info() Lu Baolu
` (2 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
Fold dmar_remove_one_dev_info() into intel_iommu_release_device(),
which is its only caller, and replace most of its code with a call to
device_block_translation() to keep the code neat and tidy.
Rename iommu_disable_dev_iotlb() to iommu_disable_pci_caps() to pair with
iommu_enable_pci_caps().
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 31 +++++--------------------------
1 file changed, 5 insertions(+), 26 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 5d5d08192c2c..8bbe516f7d21 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1479,7 +1479,7 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
}
}
-static void iommu_disable_dev_iotlb(struct device_domain_info *info)
+static void iommu_disable_pci_caps(struct device_domain_info *info)
{
struct pci_dev *pdev;
@@ -4124,30 +4124,6 @@ static void domain_context_clear(struct device_domain_info *info)
&domain_context_clear_one_cb, info);
}
-static void dmar_remove_one_dev_info(struct device *dev)
-{
- struct device_domain_info *info = dev_iommu_priv_get(dev);
- struct dmar_domain *domain = info->domain;
- struct intel_iommu *iommu = info->iommu;
- unsigned long flags;
-
- if (!dev_is_real_dma_subdevice(info->dev)) {
- if (dev_is_pci(info->dev) && sm_supported(iommu))
- intel_pasid_tear_down_entry(iommu, info->dev,
- PASID_RID2PASID, false);
-
- iommu_disable_dev_iotlb(info);
- domain_context_clear(info);
- }
-
- spin_lock_irqsave(&domain->lock, flags);
- list_del(&info->link);
- spin_unlock_irqrestore(&domain->lock, flags);
-
- domain_detach_iommu(domain, iommu);
- info->domain = NULL;
-}
-
/*
* Clear the page table pointer in context or pasid table entries so that
* all DMA requests without PASID from the device are blocked. If the page
@@ -4596,7 +4572,10 @@ static void intel_iommu_release_device(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
- dmar_remove_one_dev_info(dev);
+ iommu_disable_pci_caps(info);
+ domain_context_clear(info);
+ device_block_translation(dev);
+
intel_pasid_free_table(dev);
dev_iommu_priv_set(dev, NULL);
kfree(info);
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 6/8] iommu/vt-d: Rename domain_add_dev_info()
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
` (4 preceding siblings ...)
2022-11-08 7:34 ` [PATCH v2 5/8] iommu/vt-d: Fold dmar_remove_one_dev_info() into its caller Lu Baolu
@ 2022-11-08 7:34 ` Lu Baolu
2022-11-08 7:34 ` [PATCH v2 7/8] iommu/vt-d: Remove unnecessary domain_context_mapped() Lu Baolu
2022-11-08 7:34 ` [PATCH v2 8/8] iommu/vt-d: Use real field for indication of first level Lu Baolu
7 siblings, 0 replies; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
Rename domain_add_dev_info() to dmar_domain_attach_device(), which
better describes what this helper does.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 8bbe516f7d21..88af8e4194de 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2492,7 +2492,8 @@ static int __init si_domain_init(int hw)
return 0;
}
-static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)
+static int dmar_domain_attach_device(struct dmar_domain *domain,
+ struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu;
@@ -4298,7 +4299,7 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
if (ret)
return ret;
- return domain_add_dev_info(to_dmar_domain(domain), dev);
+ return dmar_domain_attach_device(to_dmar_domain(domain), dev);
}
static int intel_iommu_map(struct iommu_domain *domain,
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 7/8] iommu/vt-d: Remove unnecessary domain_context_mapped()
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
` (5 preceding siblings ...)
2022-11-08 7:34 ` [PATCH v2 6/8] iommu/vt-d: Rename domain_add_dev_info() Lu Baolu
@ 2022-11-08 7:34 ` Lu Baolu
2022-11-08 7:34 ` [PATCH v2 8/8] iommu/vt-d: Use real field for indication of first level Lu Baolu
7 siblings, 0 replies; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
device_domain_info::domain accurately records the domain attached to
the device, so it is unnecessary to check whether a context is present
in the attach_dev path. Remove the check to keep the code neat.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/intel/iommu.c | 47 +++----------------------------------
1 file changed, 3 insertions(+), 44 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 88af8e4194de..f8f6092ea23c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -780,19 +780,6 @@ static void domain_flush_cache(struct dmar_domain *domain,
clflush_cache_range(addr, size);
}
-static int device_context_mapped(struct intel_iommu *iommu, u8 bus, u8 devfn)
-{
- struct context_entry *context;
- int ret = 0;
-
- spin_lock(&iommu->lock);
- context = iommu_context_addr(iommu, bus, devfn, 0);
- if (context)
- ret = context_present(context);
- spin_unlock(&iommu->lock);
- return ret;
-}
-
static void free_context_table(struct intel_iommu *iommu)
{
struct context_entry *context;
@@ -2136,30 +2123,6 @@ domain_context_mapping(struct dmar_domain *domain, struct device *dev)
&domain_context_mapping_cb, &data);
}
-static int domain_context_mapped_cb(struct pci_dev *pdev,
- u16 alias, void *opaque)
-{
- struct intel_iommu *iommu = opaque;
-
- return !device_context_mapped(iommu, PCI_BUS_NUM(alias), alias & 0xff);
-}
-
-static int domain_context_mapped(struct device *dev)
-{
- struct intel_iommu *iommu;
- u8 bus, devfn;
-
- iommu = device_to_iommu(dev, &bus, &devfn);
- if (!iommu)
- return -ENODEV;
-
- if (!dev_is_pci(dev))
- return device_context_mapped(iommu, bus, devfn);
-
- return !pci_for_each_dma_alias(to_pci_dev(dev),
- domain_context_mapped_cb, iommu);
-}
-
/* Returns a number of VTD pages, but aligned to MM page size */
static inline unsigned long aligned_nrpages(unsigned long host_addr,
size_t size)
@@ -4279,6 +4242,7 @@ static int prepare_domain_attach_device(struct iommu_domain *domain,
static int intel_iommu_attach_device(struct iommu_domain *domain,
struct device *dev)
{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
int ret;
if (domain->type == IOMMU_DOMAIN_UNMANAGED &&
@@ -4287,13 +4251,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
return -EPERM;
}
- /* normally dev is not mapped */
- if (unlikely(domain_context_mapped(dev))) {
- struct device_domain_info *info = dev_iommu_priv_get(dev);
-
- if (info->domain)
- device_block_translation(dev);
- }
+ if (info->domain)
+ device_block_translation(dev);
ret = prepare_domain_attach_device(domain, dev);
if (ret)
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 8/8] iommu/vt-d: Use real field for indication of first level
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
` (6 preceding siblings ...)
2022-11-08 7:34 ` [PATCH v2 7/8] iommu/vt-d: Remove unnecessary domain_context_mapped() Lu Baolu
@ 2022-11-08 7:34 ` Lu Baolu
7 siblings, 0 replies; 13+ messages in thread
From: Lu Baolu @ 2022-11-08 7:34 UTC (permalink / raw)
To: iommu
Cc: Joerg Roedel, Kevin Tian, Will Deacon, Robin Murphy, Liu Yi L,
Jacob jun Pan, linux-kernel, Lu Baolu
struct dmar_domain uses bit-field members to indicate per-domain
behaviors. Add a bit field for first-level usage and remove the flags
member to avoid duplication.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/intel/iommu.h | 15 +++++----------
drivers/iommu/intel/iommu.c | 25 ++++++++++---------------
2 files changed, 15 insertions(+), 25 deletions(-)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 251a609fdce3..7b7234689cb4 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -515,14 +515,6 @@ struct context_entry {
u64 hi;
};
-/*
- * When VT-d works in the scalable mode, it allows DMA translation to
- * happen through either first level or second level page table. This
- * bit marks that the DMA translation for the domain goes through the
- * first level page table, otherwise, it goes through the second level.
- */
-#define DOMAIN_FLAG_USE_FIRST_LEVEL BIT(1)
-
struct iommu_domain_info {
struct intel_iommu *iommu;
unsigned int refcnt; /* Refcount of devices per iommu */
@@ -539,6 +531,11 @@ struct dmar_domain {
u8 iommu_coherency: 1; /* indicate coherency of iommu access */
u8 force_snooping : 1; /* Create IOPTEs with snoop control */
u8 set_pte_snp:1;
+ u8 use_first_level:1; /* DMA translation for the domain goes
+ * through the first level page table,
+ * otherwise, goes through the second
+ * level.
+ */
spinlock_t lock; /* Protect device tracking lists */
struct list_head devices; /* all devices' list */
@@ -548,8 +545,6 @@ struct dmar_domain {
/* adjusted guest address width, 0 is level 2 30-bit */
int agaw;
-
- int flags; /* flags to find out type of domain */
int iommu_superpage;/* Level of superpages supported:
0 == 4KiB (no superpages), 1 == 2MiB,
2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f8f6092ea23c..ccefa3b52240 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -383,11 +383,6 @@ static inline int domain_type_is_si(struct dmar_domain *domain)
return domain->domain.type == IOMMU_DOMAIN_IDENTITY;
}
-static inline bool domain_use_first_level(struct dmar_domain *domain)
-{
- return domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL;
-}
-
static inline int domain_pfn_supported(struct dmar_domain *domain,
unsigned long pfn)
{
@@ -501,7 +496,7 @@ static int domain_update_iommu_superpage(struct dmar_domain *domain,
rcu_read_lock();
for_each_active_iommu(iommu, drhd) {
if (iommu != skip) {
- if (domain && domain_use_first_level(domain)) {
+ if (domain && domain->use_first_level) {
if (!cap_fl1gp_support(iommu->cap))
mask = 0x1;
} else {
@@ -579,7 +574,7 @@ static void domain_update_iommu_cap(struct dmar_domain *domain)
* paging and 57-bits with 5-level paging). Hence, skip bit
* [N-1].
*/
- if (domain_use_first_level(domain))
+ if (domain->use_first_level)
domain->domain.geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw - 1);
else
domain->domain.geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw);
@@ -947,7 +942,7 @@ static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain,
domain_flush_cache(domain, tmp_page, VTD_PAGE_SIZE);
pteval = ((uint64_t)virt_to_dma_pfn(tmp_page) << VTD_PAGE_SHIFT) | DMA_PTE_READ | DMA_PTE_WRITE;
- if (domain_use_first_level(domain)) {
+ if (domain->use_first_level) {
pteval |= DMA_FL_PTE_XD | DMA_FL_PTE_US;
if (iommu_is_dma_domain(&domain->domain))
pteval |= DMA_FL_PTE_ACCESS;
@@ -1536,7 +1531,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
if (ih)
ih = 1 << 6;
- if (domain_use_first_level(domain)) {
+ if (domain->use_first_level) {
qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr, pages, ih);
} else {
unsigned long bitmask = aligned_pages - 1;
@@ -1590,7 +1585,7 @@ static inline void __mapping_notify_one(struct intel_iommu *iommu,
* It's a non-present to present mapping. Only flush if caching mode
* and second level.
*/
- if (cap_caching_mode(iommu->cap) && !domain_use_first_level(domain))
+ if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
else
iommu_flush_write_buffer(iommu);
@@ -1606,7 +1601,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
struct intel_iommu *iommu = info->iommu;
u16 did = domain_id_iommu(dmar_domain, iommu);
- if (domain_use_first_level(dmar_domain))
+ if (dmar_domain->use_first_level)
qi_flush_piotlb(iommu, did, PASID_RID2PASID, 0, -1, 0);
else
iommu->flush.flush_iotlb(iommu, did, 0, 0,
@@ -1779,7 +1774,7 @@ static struct dmar_domain *alloc_domain(unsigned int type)
domain->nid = NUMA_NO_NODE;
if (first_level_by_default(type))
- domain->flags |= DOMAIN_FLAG_USE_FIRST_LEVEL;
+ domain->use_first_level = true;
domain->has_iotlb_device = false;
INIT_LIST_HEAD(&domain->devices);
spin_lock_init(&domain->lock);
@@ -2212,7 +2207,7 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
attr = prot & (DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP);
attr |= DMA_FL_PTE_PRESENT;
- if (domain_use_first_level(domain)) {
+ if (domain->use_first_level) {
attr |= DMA_FL_PTE_XD | DMA_FL_PTE_US | DMA_FL_PTE_ACCESS;
if (prot & DMA_PTE_WRITE)
attr |= DMA_FL_PTE_DIRTY;
@@ -2482,7 +2477,7 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
if (hw_pass_through && domain_type_is_si(domain))
ret = intel_pasid_setup_pass_through(iommu, domain,
dev, PASID_RID2PASID);
- else if (domain_use_first_level(domain))
+ else if (domain->use_first_level)
ret = domain_setup_first_level(iommu, domain, dev,
PASID_RID2PASID);
else
@@ -4422,7 +4417,7 @@ static void domain_set_force_snooping(struct dmar_domain *domain)
* Second level page table supports per-PTE snoop control. The
* iommu_map() interface will handle this by setting SNP bit.
*/
- if (!domain_use_first_level(domain)) {
+ if (!domain->use_first_level) {
domain->set_pte_snp = true;
return;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* RE: [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps()
2022-11-08 7:34 ` [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps() Lu Baolu
@ 2022-11-11 3:45 ` Tian, Kevin
2022-11-11 6:59 ` Baolu Lu
0 siblings, 1 reply; 13+ messages in thread
From: Tian, Kevin @ 2022-11-11 3:45 UTC (permalink / raw)
To: Lu Baolu, iommu@lists.linux.dev
Cc: Joerg Roedel, Will Deacon, Robin Murphy, Liu, Yi L,
Pan, Jacob jun, linux-kernel@vger.kernel.org
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 8, 2022 3:34 PM
>
> The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
> This improves iommu_enable_pci_caps() to avoid unnecessary kernel
> traces through checking and enabling. This also adds kernel messages
> if any feature enabling results in failure. It is worth noting that
> PRI depends on ATS. This adds a check as well.
Can't we have a helper to check whether this device has been attached
to any domain? If it is not attached in the blocking path, then disable
the PCI caps; if it is not yet attached in the attaching path, then
enable the PCI caps.
I just didn't get the point of leaving them enabled while the device
cannot do any DMA at all.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps()
2022-11-11 3:45 ` Tian, Kevin
@ 2022-11-11 6:59 ` Baolu Lu
2022-11-11 8:16 ` Tian, Kevin
0 siblings, 1 reply; 13+ messages in thread
From: Baolu Lu @ 2022-11-11 6:59 UTC (permalink / raw)
To: Tian, Kevin, iommu@lists.linux.dev
Cc: baolu.lu, Joerg Roedel, Will Deacon, Robin Murphy, Liu, Yi L,
Pan, Jacob jun, linux-kernel@vger.kernel.org
On 2022/11/11 11:45, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, November 8, 2022 3:34 PM
>>
>> The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
>> This improves iommu_enable_pci_caps() to avoid such unnecessary kernel
>> traces by checking the state of each capability before enabling it.
>> This also adds kernel messages if enabling any feature fails. It is
>> worth noting that PRI depends on ATS, so a check for that is added as
>> well.
>
> Cannot we have a helper to check whether this device has been attached
> to any domain? If no in the blocking path then disable PCI caps. If no
> in the attaching path then enable PCI caps.
>
> I just didn't get the point of leaving them enabled while the device can
> not do any DMA at all.
Ideally, the kernel owns the default policy (default on or off). The
upper layers are able to control it via the IOMMUFD uAPI or kernel kAPI.
I can't see the benefits of associating these features with the
existence of any domain.
The VT-d spec seems to use the same idea. The control of PASID/ATS is
placed in the device context fields, while the setting of domains is
placed in the PASID entry fields.
Best regards,
baolu
^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps()
2022-11-11 6:59 ` Baolu Lu
@ 2022-11-11 8:16 ` Tian, Kevin
2022-11-11 11:57 ` Baolu Lu
0 siblings, 1 reply; 13+ messages in thread
From: Tian, Kevin @ 2022-11-11 8:16 UTC (permalink / raw)
To: Baolu Lu, iommu@lists.linux.dev
Cc: Joerg Roedel, Will Deacon, Robin Murphy, Liu, Yi L,
Pan, Jacob jun, linux-kernel@vger.kernel.org
> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Friday, November 11, 2022 2:59 PM
>
> On 2022/11/11 11:45, Tian, Kevin wrote:
> >> From: Lu Baolu <baolu.lu@linux.intel.com>
> >> Sent: Tuesday, November 8, 2022 3:34 PM
> >>
> >> The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
> >> This improves iommu_enable_pci_caps() to avoid such unnecessary kernel
> >> traces by checking the state of each capability before enabling it.
> >> This also adds kernel messages if enabling any feature fails. It is
> >> worth noting that PRI depends on ATS, so a check for that is added as
> >> well.
> >
> > Cannot we have a helper to check whether this device has been attached
> > to any domain? If no in the blocking path then disable PCI caps. If no
> > in the attaching path then enable PCI caps.
> >
> > I just didn't get the point of leaving them enabled while the device can
> > not do any DMA at all.
>
> Ideally, the kernel owns the default policy (default on or off). The
> upper layers are able to control it via the IOMMUFD uAPI or kernel kAPI.
> I can't see the benefits of associating these features with the
> existence of any domain.
We don't have such a uAPI or kAPI today.
The current behavior before your change is default off, toggled along
with domain attach/detach. As only one domain is allowed per RID, it
implies the capabilities are toggled along with DMA allow/block.
Now you change it to a messy model:
- default off when the device is probed
- turned on at the 1st domain attach and never turned off until release
- but iommu_enable_pci_caps() is still called at every domain attach,
with a band-aid to allow re-entrancy
This doesn't look like a good cleanup...
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps()
2022-11-11 8:16 ` Tian, Kevin
@ 2022-11-11 11:57 ` Baolu Lu
0 siblings, 0 replies; 13+ messages in thread
From: Baolu Lu @ 2022-11-11 11:57 UTC (permalink / raw)
To: Tian, Kevin, iommu@lists.linux.dev
Cc: baolu.lu, Joerg Roedel, Will Deacon, Robin Murphy, Liu, Yi L,
Pan, Jacob jun, linux-kernel@vger.kernel.org
On 2022/11/11 16:16, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Friday, November 11, 2022 2:59 PM
>>
>> On 2022/11/11 11:45, Tian, Kevin wrote:
>>>> From: Lu Baolu <baolu.lu@linux.intel.com>
>>>> Sent: Tuesday, November 8, 2022 3:34 PM
>>>>
>>>> The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
>>>> This improves iommu_enable_pci_caps() to avoid such unnecessary kernel
>>>> traces by checking the state of each capability before enabling it.
>>>> This also adds kernel messages if enabling any feature fails. It is
>>>> worth noting that PRI depends on ATS, so a check for that is added as
>>>> well.
>>>
>>> Cannot we have a helper to check whether this device has been attached
>>> to any domain? If no in the blocking path then disable PCI caps. If no
>>> in the attaching path then enable PCI caps.
>>>
>>> I just didn't get the point of leaving them enabled while the device can
>>> not do any DMA at all.
>>
>> Ideally, the kernel owns the default policy (default on or off). The
>> upper layers are able to control it via the IOMMUFD uAPI or kernel kAPI.
>> I can't see the benefits of associating these features with the
>> existence of any domain.
>
> We don't have such a uAPI or kAPI today.
>
> The current behavior before your change is default off, toggled along
> with domain attach/detach. As only one domain is allowed per RID, it
> implies the capabilities are toggled along with DMA allow/block.
>
> Now you change it to a messy model:
>
> - default off when the device is probed
> - turned on at the 1st domain attach and never turned off until release
> - but iommu_enable_pci_caps() is still called at every domain attach,
> with a band-aid to allow re-entrancy
>
> This doesn't look like a good cleanup...
Fair enough. We should not bury this behavior change in a cleanup
series. Okay! I will keep the previous behavior.
Best regards,
baolu
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2022-11-11 11:57 UTC | newest]
Thread overview: 13+ messages
2022-11-08 7:34 [PATCH v2 0/8] iommu/vt-d: Some cleanups Lu Baolu
2022-11-08 7:34 ` [PATCH v2 1/8] iommu/vt-d: Allocate pasid table in device probe path Lu Baolu
2022-11-08 7:34 ` [PATCH v2 2/8] iommu/vt-d: Improve iommu_enable_pci_caps() Lu Baolu
2022-11-11 3:45 ` Tian, Kevin
2022-11-11 6:59 ` Baolu Lu
2022-11-11 8:16 ` Tian, Kevin
2022-11-11 11:57 ` Baolu Lu
2022-11-08 7:34 ` [PATCH v2 3/8] iommu/vt-d: Add device_block_translation() helper Lu Baolu
2022-11-08 7:34 ` [PATCH v2 4/8] iommu/vt-d: Add blocking domain support Lu Baolu
2022-11-08 7:34 ` [PATCH v2 5/8] iommu/vt-d: Fold dmar_remove_one_dev_info() into its caller Lu Baolu
2022-11-08 7:34 ` [PATCH v2 6/8] iommu/vt-d: Rename domain_add_dev_info() Lu Baolu
2022-11-08 7:34 ` [PATCH v2 7/8] iommu/vt-d: Remove unnecessary domain_context_mapped() Lu Baolu
2022-11-08 7:34 ` [PATCH v2 8/8] iommu/vt-d: Use real field for indication of first level Lu Baolu