public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs
@ 2023-09-22 15:16 Jingqi Liu
  2023-09-22 15:16 ` [PATCH v2 1/3] iommu/vt-d: debugfs: Dump entry pointing to huge page Jingqi Liu
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Jingqi Liu @ 2023-09-22 15:16 UTC (permalink / raw)
  To: iommu, Lu Baolu, Tian Kevin, Joerg Roedel, Will Deacon,
	Robin Murphy
  Cc: linux-kernel, Jingqi Liu

The original debugfs only dumps all IOMMU page tables without pasid
supported. It traverses all devices on the pci bus, then dumps all
page tables based on device domains. This traversal is from software
perspective.

This series dumps page tables whose mappings are created and destroyed
by the iommu_map/unmap() interfaces, by traversing root tables,
context tables, pasid directories and pasid tables from hardware
perspective. It supports dumping a specified page table in legacy mode
or scalable mode with or without a specified pasid.

It adds a debugfs directory per pair of {device, pasid} when attaching
device with or without pasid, i.e.
/sys/kernel/debug/iommu/intel/<device source id>/<pasid>.
And create a debugfs file in the directory for users to dump the page
table corresponding to {device, pasid}. e.g.
/sys/kernel/debug/iommu/intel/0000:00:02.0/0/domain_translation_struct.
Each device has a PASID#0, i.e. RID_PASID. Remove the corresponding
debugfs directory and file when detaching or releasing a device.

For legacy mode, according to bus number and DEVFN, traverse the root
table and context table to get the pointer of page table in the
context table entry, then dump the specified page table.

For scalable mode, according to bus number, DEVFN and pasid, traverse
the root table, context table, pasid directory and pasid table to get
the pointer of page table in the pasid table entry, then dump the
specified page table.

Examples are as follows:
1) Dump the page table of device "0000:00:01.0" that only supports
    legacy mode.
    $ sudo cat
    /sys/kernel/debug/iommu/intel/0000:00:01.0/0/domain_translation_struct

2) Dump the page table of device "0000:00:02.0" with PASID "1" that
   supports scalable mode.
   $ sudo cat
   /sys/kernel/debug/iommu/intel/0000:00:0a.0/1/domain_translation_struct

Change log:

v2:
 - Add a debugfs directory per {dev, pasid} as suggested by Kevin.
 - Create the debugfs directory when attaching device as suggested by Baolu.
 - Only dump the page tables whose mappings are created and destroyed
   by the iommu_map/unmap() interfaces per Baolu's review.
 - Rename the helpers for creating/removing debugfs directory/file and
   merge patch 2,3,4,5 to one patch per Baolu's review.

v1: https://lore.kernel.org/linux-iommu/20230625150442.42197-1-Jingqi.liu@intel.com

Jingqi Liu (3):
  iommu/vt-d: debugfs: Dump entry pointing to huge page
  iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid}
  iommu/vt-d: debugfs: Support dumping a specified page table

 drivers/iommu/intel/debugfs.c | 291 ++++++++++++++++++++++++++++------
 drivers/iommu/intel/iommu.c   |  16 ++
 drivers/iommu/intel/iommu.h   |   4 +
 3 files changed, 261 insertions(+), 50 deletions(-)

-- 
2.21.3


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/3] iommu/vt-d: debugfs: Dump entry pointing to huge page
  2023-09-22 15:16 [PATCH v2 0/3] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs Jingqi Liu
@ 2023-09-22 15:16 ` Jingqi Liu
  2023-09-22 15:16 ` [PATCH v2 2/3] iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid} Jingqi Liu
  2023-09-22 15:16 ` [PATCH v2 3/3] iommu/vt-d: debugfs: Support dumping a specified page table Jingqi Liu
  2 siblings, 0 replies; 6+ messages in thread
From: Jingqi Liu @ 2023-09-22 15:16 UTC (permalink / raw)
  To: iommu, Lu Baolu, Tian Kevin, Joerg Roedel, Will Deacon,
	Robin Murphy
  Cc: linux-kernel, Jingqi Liu

For the page table entry pointing to a huge page, the data below the
level of the huge page is meaningless and does not need to be dumped.

Signed-off-by: Jingqi Liu <Jingqi.liu@intel.com>
---
 drivers/iommu/intel/debugfs.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index 0255c4a23269..e2a3c37943a0 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -311,9 +311,14 @@ static inline unsigned long level_to_directory_size(int level)
 static inline void
 dump_page_info(struct seq_file *m, unsigned long iova, u64 *path)
 {
-	seq_printf(m, "0x%013lx |\t0x%016llx\t0x%016llx\t0x%016llx\t0x%016llx\t0x%016llx\n",
-		   iova >> VTD_PAGE_SHIFT, path[5], path[4],
-		   path[3], path[2], path[1]);
+	seq_printf(m, "0x%013lx |\t0x%016llx\t0x%016llx\t0x%016llx",
+		   iova >> VTD_PAGE_SHIFT, path[5], path[4], path[3]);
+	if (path[2]) {
+		seq_printf(m, "\t0x%016llx", path[2]);
+		if (path[1])
+			seq_printf(m, "\t0x%016llx", path[1]);
+	}
+	seq_putc(m, '\n');
 }
 
 static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/3] iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid}
  2023-09-22 15:16 [PATCH v2 0/3] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs Jingqi Liu
  2023-09-22 15:16 ` [PATCH v2 1/3] iommu/vt-d: debugfs: Dump entry pointing to huge page Jingqi Liu
@ 2023-09-22 15:16 ` Jingqi Liu
  2023-09-25  1:47   ` Baolu Lu
  2023-09-22 15:16 ` [PATCH v2 3/3] iommu/vt-d: debugfs: Support dumping a specified page table Jingqi Liu
  2 siblings, 1 reply; 6+ messages in thread
From: Jingqi Liu @ 2023-09-22 15:16 UTC (permalink / raw)
  To: iommu, Lu Baolu, Tian Kevin, Joerg Roedel, Will Deacon,
	Robin Murphy
  Cc: linux-kernel, Jingqi Liu

Add a debugfs directory per pair of {device, pasid} if the mappings of
its page table are created and destroyed by the iommu_map/unmap()
interfaces. i.e. /sys/kernel/debug/iommu/intel/<device source id>/<pasid>.
Create a debugfs file in the directory for users to dump the page
table corresponding to {device, pasid}. e.g.
/sys/kernel/debug/iommu/intel/0000:00:02.0/0/domain_translation_struct.

When attaching device without pasid, create a debugfs file with
PASID#0, i.e. RID_PASID. When attaching a domain to a pasid of device,
create a debugfs file with the specified pasid.

When detaching without pasid, remove the directory and file for
PASID#0. When detaching with pasid, remove the debugfs directory and
file of the specified pasid. Remove the entire debugfs directory of
the specified device for releasing device.
e.g. /sys/kernel/debug/iommu/intel/0000:00:01.0

Signed-off-by: Jingqi Liu <Jingqi.liu@intel.com>
---
 drivers/iommu/intel/debugfs.c | 117 ++++++++++++++++++++++++++++++++--
 drivers/iommu/intel/iommu.c   |  16 +++++
 drivers/iommu/intel/iommu.h   |   4 ++
 3 files changed, 132 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index e2a3c37943a0..9128febba3c6 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -32,6 +32,11 @@ struct iommu_regset {
 	const char *regs;
 };
 
+struct show_domain_info {
+	struct device *dev;
+	ioasid_t pasid;
+};
+
 #define DEBUG_BUFFER_SIZE	1024
 static char debug_buf[DEBUG_BUFFER_SIZE];
 
@@ -111,6 +116,8 @@ static const struct iommu_regset iommu_regs_64[] = {
 	IOMMU_REGSET_ENTRY(VCRSP),
 };
 
+static struct dentry *intel_iommu_debug;
+
 static int iommu_regset_show(struct seq_file *m, void *unused)
 {
 	struct dmar_drhd_unit *drhd;
@@ -673,16 +680,12 @@ static const struct file_operations dmar_perf_latency_fops = {
 
 void __init intel_iommu_debugfs_init(void)
 {
-	struct dentry *intel_iommu_debug = debugfs_create_dir("intel",
-						iommu_debugfs_dir);
+	intel_iommu_debug = debugfs_create_dir("intel", iommu_debugfs_dir);
 
 	debugfs_create_file("iommu_regset", 0444, intel_iommu_debug, NULL,
 			    &iommu_regset_fops);
 	debugfs_create_file("dmar_translation_struct", 0444, intel_iommu_debug,
 			    NULL, &dmar_translation_struct_fops);
-	debugfs_create_file("domain_translation_struct", 0444,
-			    intel_iommu_debug, NULL,
-			    &domain_translation_struct_fops);
 	debugfs_create_file("invalidation_queue", 0444, intel_iommu_debug,
 			    NULL, &invalidation_queue_fops);
 #ifdef CONFIG_IRQ_REMAP
@@ -692,3 +695,107 @@ void __init intel_iommu_debugfs_init(void)
 	debugfs_create_file("dmar_perf_latency", 0644, intel_iommu_debug,
 			    NULL, &dmar_perf_latency_fops);
 }
+
+/*
+ * Create a debugfs directory per pair of {device, pasid},
+ * then create the corresponding debugfs file in this directory
+ * for user to dump its page table. e.g.
+ * /sys/kernel/debug/iommu/intel/0000:00:01.0/0/domain_translation_struct
+ */
+void intel_iommu_debugfs_create_dev_pasid(struct device *dev, u32 pasid)
+{
+	struct dentry *pasid_dir = NULL, *dev_dir = NULL;
+	struct iommu_domain *domain = NULL;
+	struct show_domain_info *sinfo;
+	char pname[10];
+
+	if (pasid == IOMMU_NO_PASID) {
+		struct device_domain_info *info = dev_iommu_priv_get(dev);
+		if (!info)
+			return;
+		domain = &info->domain->domain;
+	} else
+		domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
+
+	/*
+	 * The debugfs only dumps the page tables whose mappings are created
+	 * and destroyed by the iommu_map/unmap() interfaces. Check the
+	 * mapping type of the domain before creating debugfs directory.
+	 */
+	if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
+		return;
+
+	dev_dir = debugfs_lookup(dev_name(dev), intel_iommu_debug);
+	if (!dev_dir) {
+		dev_dir = debugfs_create_dir(dev_name(dev), intel_iommu_debug);
+		if (IS_ERR(dev_dir))
+			return;
+	}
+
+	sprintf(pname, "%x", pasid);
+	pasid_dir = debugfs_create_dir(pname, dev_dir);
+	if (IS_ERR(pasid_dir))
+		return;
+
+	sinfo = kzalloc(sizeof(*sinfo), GFP_KERNEL);
+	if (!sinfo)
+		return;
+
+	sinfo->dev = dev;
+	sinfo->pasid = pasid;
+
+	debugfs_create_file("domain_translation_struct", 0444,
+			    pasid_dir, sinfo,
+			    &domain_translation_struct_fops);
+}
+
+/*
+ * Remove the debugfs directory and file corresponding to each pair of
+ * {device, pasid}. There're two scenarios that will call this helper:
+ * 1) Detach the specified device with pasid.
+ * 2) IOMMU release a device.
+ */
+void intel_iommu_debugfs_remove_dev_pasid(struct device *dev, u32 pasid)
+{
+	struct dentry *pasid_dir = NULL, *dev_dir = NULL;
+	struct dentry *dentry = NULL;
+	char pname[10];
+
+	dev_dir = debugfs_lookup(dev_name(dev), intel_iommu_debug);
+	if (!dev_dir)
+		return;
+
+	/* Check if the entire debugfs directory needs to be removed. */
+	if (pasid == IOMMU_PASID_INVALID) {
+		struct list_head *plist;
+		struct dentry *sub_dentry;
+
+		list_for_each(plist, &(dev_dir->d_subdirs)) {
+			sub_dentry = list_entry(plist, struct dentry, d_child);
+			if(sub_dentry) {
+				dentry = debugfs_lookup("domain_translation_struct", sub_dentry);
+				if (!dentry)
+					continue;
+
+				if (dentry->d_inode->i_private)
+					kfree(dentry->d_inode->i_private);
+			}
+		}
+
+		debugfs_remove_recursive(dev_dir);
+	} else { /* Remove the directory with specified pasid. */
+		sprintf(pname, "%x", pasid);
+		pasid_dir = debugfs_lookup(pname, dev_dir);
+		if (!pasid_dir)
+			return;
+
+		dentry = debugfs_lookup("domain_translation_struct", pasid_dir);
+		if (!dentry)
+			return;
+
+		if (dentry->d_inode->i_private)
+			kfree(dentry->d_inode->i_private);
+
+		debugfs_remove_recursive(pasid_dir);
+	}
+}
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index dd8ff358867d..8d3c3ef1d0e2 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2488,6 +2488,8 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
 
 	iommu_enable_pci_caps(info);
 
+	intel_iommu_debugfs_create_dev_pasid(dev, IOMMU_NO_PASID);
+
 	return 0;
 }
 
@@ -3966,6 +3968,9 @@ static void dmar_remove_one_dev_info(struct device *dev)
 
 	domain_detach_iommu(domain, iommu);
 	info->domain = NULL;
+
+	/* Remove the entire debugfs directory specified by device. */
+	intel_iommu_debugfs_remove_dev_pasid(dev, IOMMU_PASID_INVALID);
 }
 
 /*
@@ -3997,6 +4002,12 @@ static void device_block_translation(struct device *dev)
 
 	domain_detach_iommu(info->domain, iommu);
 	info->domain = NULL;
+
+	/*
+	 * Remove the debugfs directory specified by device and
+	 * RID_PASID pasid.
+	 */
+	intel_iommu_debugfs_remove_dev_pasid(dev, IOMMU_NO_PASID);
 }
 
 static int md_domain_init(struct dmar_domain *domain, int guest_width)
@@ -4729,6 +4740,9 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
 out_tear_down:
 	intel_pasid_tear_down_entry(iommu, dev, pasid, false);
 	intel_drain_pasid_prq(dev, pasid);
+
+	/* Remove the debugfs directory specified by device and pasid. */
+	intel_iommu_debugfs_remove_dev_pasid(dev, pasid);
 }
 
 static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
@@ -4777,6 +4791,8 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
 	list_add(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
 	spin_unlock_irqrestore(&dmar_domain->lock, flags);
 
+	intel_iommu_debugfs_create_dev_pasid(dev, pasid);
+
 	return 0;
 out_detach_iommu:
 	domain_detach_iommu(dmar_domain, iommu);
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index c18fb699c87a..250178d8014b 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -883,8 +883,12 @@ static inline void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid
 
 #ifdef CONFIG_INTEL_IOMMU_DEBUGFS
 void intel_iommu_debugfs_init(void);
+void intel_iommu_debugfs_create_dev_pasid(struct device *dev, u32 pasid);
+void intel_iommu_debugfs_remove_dev_pasid(struct device *dev, u32 pasid);
 #else
 static inline void intel_iommu_debugfs_init(void) {}
+void intel_iommu_debugfs_create_dev_pasid(struct device *dev, u32 pasid) {}
+void intel_iommu_debugfs_remove_dev_pasid(struct device *dev, u32 pasid) {}
 #endif /* CONFIG_INTEL_IOMMU_DEBUGFS */
 
 extern const struct attribute_group *intel_iommu_groups[];
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 3/3] iommu/vt-d: debugfs: Support dumping a specified page table
  2023-09-22 15:16 [PATCH v2 0/3] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs Jingqi Liu
  2023-09-22 15:16 ` [PATCH v2 1/3] iommu/vt-d: debugfs: Dump entry pointing to huge page Jingqi Liu
  2023-09-22 15:16 ` [PATCH v2 2/3] iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid} Jingqi Liu
@ 2023-09-22 15:16 ` Jingqi Liu
  2 siblings, 0 replies; 6+ messages in thread
From: Jingqi Liu @ 2023-09-22 15:16 UTC (permalink / raw)
  To: iommu, Lu Baolu, Tian Kevin, Joerg Roedel, Will Deacon,
	Robin Murphy
  Cc: linux-kernel, Jingqi Liu

The original debugfs only dumps all page tables without pasid. With
pasid supported, the page table with pasid also needs to be dumped.

This patch supports dumping a specified page table in legacy mode or
scalable mode with or without a specified pasid.

For legacy mode, according to bus number and DEVFN, traverse the root
table and context table to get the pointer of page table in the
context table entry, then dump the specified page table.

For scalable mode, according to bus number, DEVFN and pasid, traverse
the root table, context table, pasid directory and pasid table to get
the pointer of page table in the pasid table entry, then dump the
specified page table..

Examples are as follows:
1) Dump the page table of device "0000:00:1f.0" that only supports
   legacy mode.
   $ sudo cat
   /sys/kernel/debug/iommu/intel/0000:00:1f.0/0/domain_translation_struct

2) Dump the page table of device "0000:00:0a.0" with PASID "1" that
   supports scalable mode.
   $ sudo cat
   /sys/kernel/debug/iommu/intel/0000:00:0a.0/1/domain_translation_struct

Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jingqi Liu <Jingqi.liu@intel.com>
---
 drivers/iommu/intel/debugfs.c | 163 +++++++++++++++++++++++++---------
 1 file changed, 121 insertions(+), 42 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index 9128febba3c6..51f0e022c06e 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -352,58 +352,137 @@ static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
 	}
 }
 
-static int __show_device_domain_translation(struct device *dev, void *data)
+static int domain_translation_struct_show(struct seq_file *m, void *unused)
 {
-	struct dmar_domain *domain;
-	struct seq_file *m = data;
-	u64 path[6] = { 0 };
+	struct device_domain_info *info;
+	struct show_domain_info *sinfo;
+	bool scalable, found = false;
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+	u16 devfn, bus, seg;
 
-	domain = to_dmar_domain(iommu_get_domain_for_dev(dev));
-	if (!domain)
-		return 0;
+	if (!m || !m->private) {
+		seq_puts(m, "Invalid device or pasid!\n");
+		return -EINVAL;
+	}
 
-	seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
-		   (u64)virt_to_phys(domain->pgd));
-	seq_puts(m, "IOVA_PFN\t\tPML5E\t\t\tPML4E\t\t\tPDPE\t\t\tPDE\t\t\tPTE\n");
+	sinfo = (struct show_domain_info*)m->private;
+	if (!sinfo->dev ||
+	    !dev_iommu_priv_get(sinfo->dev) ||
+	    (sinfo->pasid == IOMMU_PASID_INVALID)) {
+		seq_puts(m, "Please specify device or pasid!\n");
+		return -ENODEV;
+	}
 
-	pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
-	seq_putc(m, '\n');
+	info = dev_iommu_priv_get(sinfo->dev);
+	bus = info->bus;
+	devfn = info->devfn;
+	seg = info->segment;
 
-	/* Don't iterate */
-	return 1;
-}
+	rcu_read_lock();
+	for_each_active_iommu(iommu, drhd) {
+		struct context_entry *context;
+		u64 pgd, path[6] = { 0 };
+		u32 sts, agaw;
 
-static int show_device_domain_translation(struct device *dev, void *data)
-{
-	struct iommu_group *group;
+		if (seg != iommu->segment)
+			continue;
 
-	device_lock(dev);
-	group = iommu_group_get(dev);
-	device_unlock(dev);
-	if (!group)
-		return 0;
+		sts = dmar_readl(iommu->reg + DMAR_GSTS_REG);
+		if (!(sts & DMA_GSTS_TES)) {
+			seq_printf(m, "DMA Remapping is not enabled on %s\n",
+				   iommu->name);
+			continue;
+		}
+		if (dmar_readq(iommu->reg + DMAR_RTADDR_REG) & DMA_RTADDR_SMT)
+			scalable = true;
+		else
+			scalable = false;
 
-	/*
-	 * The group->mutex is held across the callback, which will
-	 * block calls to iommu_attach/detach_group/device. Hence,
-	 * the domain of the device will not change during traversal.
-	 *
-	 * All devices in an iommu group share a single domain, hence
-	 * we only dump the domain of the first device. Even though,
-	 * this code still possibly races with the iommu_unmap()
-	 * interface. This could be solved by RCU-freeing the page
-	 * table pages in the iommu_unmap() path.
-	 */
-	iommu_group_for_each_dev(group, data, __show_device_domain_translation);
-	iommu_group_put(group);
+		/*
+		 * The iommu->lock is held across the callback, which will
+		 * block calls to domain_attach/domain_detach. Hence,
+		 * the domain of the device will not change during traversal.
+		 *
+		 * Traversing page table possibly races with the iommu_unmap()
+		 * interface. This could be solved by RCU-freeing the page
+		 * table pages in the iommu_unmap() path.
+		 */
+		spin_lock(&iommu->lock);
 
-	return 0;
-}
+		context = iommu_context_addr(iommu, bus, devfn, 0);
+		if (!context || !context_present(context))
+			goto iommu_unlock;
 
-static int domain_translation_struct_show(struct seq_file *m, void *unused)
-{
-	return bus_for_each_dev(&pci_bus_type, NULL, m,
-				show_device_domain_translation);
+		if (scalable) {	/* scalable mode */
+			struct pasid_dir_entry *dir_tbl, *dir_entry;
+			struct pasid_entry *pasid_tbl, *pasid_tbl_entry;
+			u16 pasid_dir_size, dir_idx, tbl_idx, pgtt;
+			u64 pasid_dir_ptr;
+
+			pasid_dir_ptr = context->lo & VTD_PAGE_MASK;
+			pasid_dir_size = get_pasid_dir_size(context);
+
+			/* Dump specified device domain mappings with PASID. */
+			dir_idx = sinfo->pasid >> PASID_PDE_SHIFT;
+			tbl_idx = sinfo->pasid & PASID_PTE_MASK;
+
+			dir_tbl = phys_to_virt(pasid_dir_ptr);
+			dir_entry = &dir_tbl[dir_idx];
+
+			pasid_tbl = get_pasid_table_from_pde(dir_entry);
+			if (!pasid_tbl)
+				goto iommu_unlock;
+
+			pasid_tbl_entry = &pasid_tbl[tbl_idx];
+			if (!pasid_pte_is_present(pasid_tbl_entry))
+				goto iommu_unlock;
+
+			/*
+			 * According to PASID Granular Translation Type(PGTT),
+			 * get the page table pointer.
+			 */
+			pgtt = (u16)(pasid_tbl_entry->val[0] & GENMASK_ULL(8, 6)) >> 6;
+			agaw = (u8)(pasid_tbl_entry->val[0] & GENMASK_ULL(4, 2)) >> 2;
+
+			switch (pgtt) {
+				case PASID_ENTRY_PGTT_FL_ONLY:
+					pgd = pasid_tbl_entry->val[2];
+					break;
+				case PASID_ENTRY_PGTT_SL_ONLY:
+				case PASID_ENTRY_PGTT_NESTED:
+					pgd = pasid_tbl_entry->val[0];
+					break;
+				default:
+					goto iommu_unlock;
+			}
+			pgd &= VTD_PAGE_MASK;
+		} else { /* legacy mode */
+			pgd = context->lo & VTD_PAGE_MASK;
+			agaw = context->hi & 7;
+		}
+
+		seq_printf(m, "Device %04x:%02x:%02x.%x ",
+			   iommu->segment, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
+		if (scalable)
+			seq_printf(m, "with pasid %x @0x%llx\n", sinfo->pasid, pgd);
+		else
+			seq_printf(m, "@0x%llx\n", pgd);
+
+		seq_printf(m, "%-17s\t%-18s\t%-18s\t%-18s\t%-18s\t%-s\n",
+			   "IOVA_PFN", "PML5E", "PML4E", "PDPE", "PDE", "PTE");
+		pgtable_walk_level(m, phys_to_virt(pgd), agaw + 2, 0, path);
+
+		found = true;
+iommu_unlock:
+		spin_unlock(&iommu->lock);
+		if (found)
+			break;
+	}
+	rcu_read_unlock();
+
+	return 0;
 }
 DEFINE_SHOW_ATTRIBUTE(domain_translation_struct);
 
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/3] iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid}
  2023-09-22 15:16 ` [PATCH v2 2/3] iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid} Jingqi Liu
@ 2023-09-25  1:47   ` Baolu Lu
  2023-09-25 12:03     ` Liu, Jingqi
  0 siblings, 1 reply; 6+ messages in thread
From: Baolu Lu @ 2023-09-25  1:47 UTC (permalink / raw)
  To: Jingqi Liu, iommu, Tian Kevin, Joerg Roedel, Will Deacon,
	Robin Murphy
  Cc: baolu.lu, linux-kernel

On 9/22/23 11:16 PM, Jingqi Liu wrote:
> Add a debugfs directory per pair of {device, pasid} if the mappings of
> its page table are created and destroyed by the iommu_map/unmap()
> interfaces. i.e. /sys/kernel/debug/iommu/intel/<device source id>/<pasid>.
> Create a debugfs file in the directory for users to dump the page
> table corresponding to {device, pasid}. e.g.
> /sys/kernel/debug/iommu/intel/0000:00:02.0/0/domain_translation_struct.
> 
> When attaching device without pasid, create a debugfs file with
> PASID#0, i.e. RID_PASID. When attaching a domain to a pasid of device,
> create a debugfs file with the specified pasid.
> 
> When detaching without pasid, remove the directory and file for
> PASID#0. When detaching with pasid, remove the debugfs directory and
> file of the specified pasid. Remove the entire debugfs directory of
> the specified device for releasing device.
> e.g. /sys/kernel/debug/iommu/intel/0000:00:01.0
> 
> Signed-off-by: Jingqi Liu <Jingqi.liu@intel.com>
> ---
>   drivers/iommu/intel/debugfs.c | 117 ++++++++++++++++++++++++++++++++--
>   drivers/iommu/intel/iommu.c   |  16 +++++
>   drivers/iommu/intel/iommu.h   |   4 ++
>   3 files changed, 132 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
> index e2a3c37943a0..9128febba3c6 100644
> --- a/drivers/iommu/intel/debugfs.c
> +++ b/drivers/iommu/intel/debugfs.c
> @@ -32,6 +32,11 @@ struct iommu_regset {
>   	const char *regs;
>   };
>   
> +struct show_domain_info {
> +	struct device *dev;
> +	ioasid_t pasid;
> +};
> +
>   #define DEBUG_BUFFER_SIZE	1024
>   static char debug_buf[DEBUG_BUFFER_SIZE];
>   
> @@ -111,6 +116,8 @@ static const struct iommu_regset iommu_regs_64[] = {
>   	IOMMU_REGSET_ENTRY(VCRSP),
>   };
>   
> +static struct dentry *intel_iommu_debug;
> +
>   static int iommu_regset_show(struct seq_file *m, void *unused)
>   {
>   	struct dmar_drhd_unit *drhd;
> @@ -673,16 +680,12 @@ static const struct file_operations dmar_perf_latency_fops = {
>   
>   void __init intel_iommu_debugfs_init(void)
>   {
> -	struct dentry *intel_iommu_debug = debugfs_create_dir("intel",
> -						iommu_debugfs_dir);
> +	intel_iommu_debug = debugfs_create_dir("intel", iommu_debugfs_dir);
>   
>   	debugfs_create_file("iommu_regset", 0444, intel_iommu_debug, NULL,
>   			    &iommu_regset_fops);
>   	debugfs_create_file("dmar_translation_struct", 0444, intel_iommu_debug,
>   			    NULL, &dmar_translation_struct_fops);
> -	debugfs_create_file("domain_translation_struct", 0444,
> -			    intel_iommu_debug, NULL,
> -			    &domain_translation_struct_fops);
>   	debugfs_create_file("invalidation_queue", 0444, intel_iommu_debug,
>   			    NULL, &invalidation_queue_fops);
>   #ifdef CONFIG_IRQ_REMAP
> @@ -692,3 +695,107 @@ void __init intel_iommu_debugfs_init(void)
>   	debugfs_create_file("dmar_perf_latency", 0644, intel_iommu_debug,
>   			    NULL, &dmar_perf_latency_fops);
>   }
> +
> +/*
> + * Create a debugfs directory per pair of {device, pasid},
> + * then create the corresponding debugfs file in this directory
> + * for user to dump its page table. e.g.
> + * /sys/kernel/debug/iommu/intel/0000:00:01.0/0/domain_translation_struct
> + */
> +void intel_iommu_debugfs_create_dev_pasid(struct device *dev, u32 pasid)
> +{
> +	struct dentry *pasid_dir = NULL, *dev_dir = NULL;
> +	struct iommu_domain *domain = NULL;
> +	struct show_domain_info *sinfo;
> +	char pname[10];
> +
> +	if (pasid == IOMMU_NO_PASID) {
> +		struct device_domain_info *info = dev_iommu_priv_get(dev);
> +		if (!info)
> +			return;
> +		domain = &info->domain->domain;
> +	} else
> +		domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);

Can you just add domain as a parameter to this helper? This helper is
probably called in the process of attaching. so the domain pointers may
not be initialized yet.

> +
> +	/*
> +	 * The debugfs only dumps the page tables whose mappings are created
> +	 * and destroyed by the iommu_map/unmap() interfaces. Check the
> +	 * mapping type of the domain before creating debugfs directory.
> +	 */
> +	if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
> +		return;
> +
> +	dev_dir = debugfs_lookup(dev_name(dev), intel_iommu_debug);

The comments of debugfs_lookup says:

/**
  * debugfs_lookup() - look up an existing debugfs file
  * @name: a pointer to a string containing the name of the file to look up.
  * @parent: a pointer to the parent dentry of the file.
  *
  * This function will return a pointer to a dentry if it succeeds.  If 
the file
  * doesn't exist or an error occurs, %NULL will be returned.  The returned
  * dentry must be passed to dput() when it is no longer needed.
  *
  * If debugfs is not enabled in the kernel, the value -%ENODEV will be
  * returned.
  */

where is the paired dput()?

Ditto to all debugfs_lookup() calls in this file.


> +	if (!dev_dir) {
> +		dev_dir = debugfs_create_dir(dev_name(dev), intel_iommu_debug);
> +		if (IS_ERR(dev_dir))
> +			return;
> +	}
> +
> +	sprintf(pname, "%x", pasid);
> +	pasid_dir = debugfs_create_dir(pname, dev_dir);
> +	if (IS_ERR(pasid_dir))
> +		return;
> +
> +	sinfo = kzalloc(sizeof(*sinfo), GFP_KERNEL);
> +	if (!sinfo)
> +		return;
> +
> +	sinfo->dev = dev;
> +	sinfo->pasid = pasid;
> +
> +	debugfs_create_file("domain_translation_struct", 0444,
> +			    pasid_dir, sinfo,
> +			    &domain_translation_struct_fops);
> +}
> +
> +/*
> + * Remove the debugfs directory and file corresponding to each pair of
> + * {device, pasid}. There're two scenarios that will call this helper:
> + * 1) Detach the specified device with pasid.
> + * 2) IOMMU release a device.
> + */
> +void intel_iommu_debugfs_remove_dev_pasid(struct device *dev, u32 pasid)
> +{
> +	struct dentry *pasid_dir = NULL, *dev_dir = NULL;
> +	struct dentry *dentry = NULL;
> +	char pname[10];
> +
> +	dev_dir = debugfs_lookup(dev_name(dev), intel_iommu_debug);
> +	if (!dev_dir)
> +		return;
> +
> +	/* Check if the entire debugfs directory needs to be removed. */
> +	if (pasid == IOMMU_PASID_INVALID) {
> +		struct list_head *plist;
> +		struct dentry *sub_dentry;
> +
> +		list_for_each(plist, &(dev_dir->d_subdirs)) {
> +			sub_dentry = list_entry(plist, struct dentry, d_child);
> +			if(sub_dentry) {
> +				dentry = debugfs_lookup("domain_translation_struct", sub_dentry);
> +				if (!dentry)
> +					continue;
> +
> +				if (dentry->d_inode->i_private)
> +					kfree(dentry->d_inode->i_private);
> +			}
> +		}
> +
> +		debugfs_remove_recursive(dev_dir);
> +	} else { /* Remove the directory with specified pasid. */
> +		sprintf(pname, "%x", pasid);
> +		pasid_dir = debugfs_lookup(pname, dev_dir);
> +		if (!pasid_dir)
> +			return;
> +
> +		dentry = debugfs_lookup("domain_translation_struct", pasid_dir);
> +		if (!dentry)
> +			return;
> +
> +		if (dentry->d_inode->i_private)
> +			kfree(dentry->d_inode->i_private);
> +
> +		debugfs_remove_recursive(pasid_dir);
> +	}
> +}

The above is too complex to review and maintain.

If I were to make the change, I would move the device directory
management to the IOMMU probe/release devices path, and keep the PASID
directory management in the domain attaching/detaching paths.

Or I overlooked anything?

> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index dd8ff358867d..8d3c3ef1d0e2 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -2488,6 +2488,8 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
>   
>   	iommu_enable_pci_caps(info);
>   
> +	intel_iommu_debugfs_create_dev_pasid(dev, IOMMU_NO_PASID);
> +
>   	return 0;
>   }
>   
> @@ -3966,6 +3968,9 @@ static void dmar_remove_one_dev_info(struct device *dev)
>   
>   	domain_detach_iommu(domain, iommu);
>   	info->domain = NULL;
> +
> +	/* Remove the entire debugfs directory specified by device. */
> +	intel_iommu_debugfs_remove_dev_pasid(dev, IOMMU_PASID_INVALID);
>   }
>   
>   /*
> @@ -3997,6 +4002,12 @@ static void device_block_translation(struct device *dev)
>   
>   	domain_detach_iommu(info->domain, iommu);
>   	info->domain = NULL;
> +
> +	/*
> +	 * Remove the debugfs directory specified by device and
> +	 * RID_PASID pasid.
> +	 */
> +	intel_iommu_debugfs_remove_dev_pasid(dev, IOMMU_NO_PASID);
>   }
>   
>   static int md_domain_init(struct dmar_domain *domain, int guest_width)
> @@ -4729,6 +4740,9 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
>   out_tear_down:
>   	intel_pasid_tear_down_entry(iommu, dev, pasid, false);
>   	intel_drain_pasid_prq(dev, pasid);
> +
> +	/* Remove the debugfs directory specified by device and pasid. */
> +	intel_iommu_debugfs_remove_dev_pasid(dev, pasid);
>   }
>   
>   static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
> @@ -4777,6 +4791,8 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
>   	list_add(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
>   	spin_unlock_irqrestore(&dmar_domain->lock, flags);
>   
> +	intel_iommu_debugfs_create_dev_pasid(dev, pasid);
> +
>   	return 0;
>   out_detach_iommu:
>   	domain_detach_iommu(dmar_domain, iommu);
> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> index c18fb699c87a..250178d8014b 100644
> --- a/drivers/iommu/intel/iommu.h
> +++ b/drivers/iommu/intel/iommu.h
> @@ -883,8 +883,12 @@ static inline void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid
>   
>   #ifdef CONFIG_INTEL_IOMMU_DEBUGFS
>   void intel_iommu_debugfs_init(void);
> +void intel_iommu_debugfs_create_dev_pasid(struct device *dev, u32 pasid);
> +void intel_iommu_debugfs_remove_dev_pasid(struct device *dev, u32 pasid);
>   #else
>   static inline void intel_iommu_debugfs_init(void) {}
> +void intel_iommu_debugfs_create_dev_pasid(struct device *dev, u32 pasid) {}
> +void intel_iommu_debugfs_remove_dev_pasid(struct device *dev, u32 pasid) {}
>   #endif /* CONFIG_INTEL_IOMMU_DEBUGFS */
>   
>   extern const struct attribute_group *intel_iommu_groups[];

Best regards,
baolu

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/3] iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid}
  2023-09-25  1:47   ` Baolu Lu
@ 2023-09-25 12:03     ` Liu, Jingqi
  0 siblings, 0 replies; 6+ messages in thread
From: Liu, Jingqi @ 2023-09-25 12:03 UTC (permalink / raw)
  To: Baolu Lu, iommu, Tian Kevin, Joerg Roedel, Will Deacon,
	Robin Murphy
  Cc: linux-kernel


On 9/25/2023 9:47 AM, Baolu Lu wrote:
> On 9/22/23 11:16 PM, Jingqi Liu wrote:
>> Add a debugfs directory per pair of {device, pasid} if the mappings of
>> its page table are created and destroyed by the iommu_map/unmap()
>> interfaces. i.e. /sys/kernel/debug/iommu/intel/<device source 
>> id>/<pasid>.
>> Create a debugfs file in the directory for users to dump the page
>> table corresponding to {device, pasid}. e.g.
>> /sys/kernel/debug/iommu/intel/0000:00:02.0/0/domain_translation_struct.
>>
>> When attaching device without pasid, create a debugfs file with
>> PASID#0, i.e. RID_PASID. When attaching a domain to a pasid of device,
>> create a debugfs file with the specified pasid.
>>
>> When detaching without pasid, remove the directory and file for
>> PASID#0. When detaching with pasid, remove the debugfs directory and
>> file of the specified pasid. Remove the entire debugfs directory of
>> the specified device for releasing device.
>> e.g. /sys/kernel/debug/iommu/intel/0000:00:01.0
>>
>> Signed-off-by: Jingqi Liu <Jingqi.liu@intel.com>
>> ---
>>   drivers/iommu/intel/debugfs.c | 117 ++++++++++++++++++++++++++++++++--
>>   drivers/iommu/intel/iommu.c   |  16 +++++
>>   drivers/iommu/intel/iommu.h   |   4 ++
>>   3 files changed, 132 insertions(+), 5 deletions(-)
>>
>>
......
>> +
>> +/*
>> + * Create a debugfs directory per pair of {device, pasid},
>> + * then create the corresponding debugfs file in this directory
>> + * for user to dump its page table. e.g.
>> + * 
>> /sys/kernel/debug/iommu/intel/0000:00:01.0/0/domain_translation_struct
>> + */
>> +void intel_iommu_debugfs_create_dev_pasid(struct device *dev, u32 
>> pasid)
>> +{
>> +    struct dentry *pasid_dir = NULL, *dev_dir = NULL;
>> +    struct iommu_domain *domain = NULL;
>> +    struct show_domain_info *sinfo;
>> +    char pname[10];
>> +
>> +    if (pasid == IOMMU_NO_PASID) {
>> +        struct device_domain_info *info = dev_iommu_priv_get(dev);
>> +        if (!info)
>> +            return;
>> +        domain = &info->domain->domain;
>> +    } else
>> +        domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
>
> Can you just add domain as a parameter to this helper? This helper is
> probably called in the process of attaching. so the domain pointers may
> not be initialized yet.
>
Thanks.
Yes. This helper may be called before domain is initialized.
I'll add 'domain' as a parameter, and remove the check code above.
Check if the 'domain' is valid as the following code.
>> +
>> +    /*
>> +     * The debugfs only dumps the page tables whose mappings are 
>> created
>> +     * and destroyed by the iommu_map/unmap() interfaces. Check the
>> +     * mapping type of the domain before creating debugfs directory.
>> +     */
>> +    if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
>> +        return;
>> +
>> +    dev_dir = debugfs_lookup(dev_name(dev), intel_iommu_debug);
>
> The comments of debugfs_lookup says:
>
> /**
>  * debugfs_lookup() - look up an existing debugfs file
>  * @name: a pointer to a string containing the name of the file to 
> look up.
>  * @parent: a pointer to the parent dentry of the file.
>  *
>  * This function will return a pointer to a dentry if it succeeds.  If 
> the file
>  * doesn't exist or an error occurs, %NULL will be returned.  The 
> returned
>  * dentry must be passed to dput() when it is no longer needed.
>  *
>  * If debugfs is not enabled in the kernel, the value -%ENODEV will be
>  * returned.
>  */
>
> where is the paired dput()?
>
> Ditto to all debugfs_lookup() calls in this file.
>
Good catch.
Need to 'dput()' the entry before returning from the helper.
I'll revise it.

......
>> +
>> +/*
>> + * Remove the debugfs directory and file corresponding to each pair of
>> + * {device, pasid}. There're two scenarios that will call this helper:
>> + * 1) Detach the specified device with pasid.
>> + * 2) IOMMU release a device.
>> + */
>> +void intel_iommu_debugfs_remove_dev_pasid(struct device *dev, u32 
>> pasid)
>> +{
>> +    struct dentry *pasid_dir = NULL, *dev_dir = NULL;
>> +    struct dentry *dentry = NULL;
>> +    char pname[10];
>> +
>> +    dev_dir = debugfs_lookup(dev_name(dev), intel_iommu_debug);
>> +    if (!dev_dir)
>> +        return;
>> +
>> +    /* Check if the entire debugfs directory needs to be removed. */
>> +    if (pasid == IOMMU_PASID_INVALID) {
>> +        struct list_head *plist;
>> +        struct dentry *sub_dentry;
>> +
>> +        list_for_each(plist, &(dev_dir->d_subdirs)) {
>> +            sub_dentry = list_entry(plist, struct dentry, d_child);
>> +            if(sub_dentry) {
>> +                dentry = debugfs_lookup("domain_translation_struct", 
>> sub_dentry);
>> +                if (!dentry)
>> +                    continue;
>> +
>> +                if (dentry->d_inode->i_private)
>> +                    kfree(dentry->d_inode->i_private);
>> +            }
>> +        }
>> +
>> +        debugfs_remove_recursive(dev_dir);
>> +    } else { /* Remove the directory with specified pasid. */
>> +        sprintf(pname, "%x", pasid);
>> +        pasid_dir = debugfs_lookup(pname, dev_dir);
>> +        if (!pasid_dir)
>> +            return;
>> +
>> +        dentry = debugfs_lookup("domain_translation_struct", 
>> pasid_dir);
>> +        if (!dentry)
>> +            return;
>> +
>> +        if (dentry->d_inode->i_private)
>> +            kfree(dentry->d_inode->i_private);
>> +
>> +        debugfs_remove_recursive(pasid_dir);
>> +    }
>> +}
>
> The above is too complex to review and maintain.
>
> If I were to make the change, I would move the device directory
> management to the IOMMU probe/release devices path, and keep the PASID
> directory management in the domain attaching/detaching paths.
>
> Or I overlooked anything?
Good point.
In this way, the debugfs device directory and pasid directory can be 
managed separately.

intel_iommu_probe_device()/intel_iommu_release_device()
is responsible for adding/removing the device directory.
In the domain attaching/detaching paths,
add/remove the PASID directory and the corresponding debugfs file.

I'll implement like this.

Thanks,
Jingqi

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-09-25 12:04 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-22 15:16 [PATCH v2 0/3] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs Jingqi Liu
2023-09-22 15:16 ` [PATCH v2 1/3] iommu/vt-d: debugfs: Dump entry pointing to huge page Jingqi Liu
2023-09-22 15:16 ` [PATCH v2 2/3] iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid} Jingqi Liu
2023-09-25  1:47   ` Baolu Lu
2023-09-25 12:03     ` Liu, Jingqi
2023-09-22 15:16 ` [PATCH v2 3/3] iommu/vt-d: debugfs: Support dumping a specified page table Jingqi Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox