From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Subject: [RFC PATCH 2/2] drm/xe: Add debugfs stats for DMA-mapped pages per order
Date: Wed, 10 Jun 2026 15:34:51 +0200
Message-ID: <20260610133451.8930-2-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20260610133451.8930-1-thomas.hellstrom@linux.intel.com>

Expose per-page-order DMA mapping counts for the system memory that the
xe driver maps for GPU access, split into two categories: TTM buffer
objects, and the GPU SVM / userptr ranges (which share a single
drm_gpusvm instance per VM).

The stats are visible at:

    <debugfs>/dri/<N>/dma_mapped_pages

and are broken out into two rows (bo, svm/userptr), one column per page
order, mirroring the layout of the TTM pool page_pool stat.
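
As a rough userspace illustration of that two-row layout, the sketch below renders the same header and rows into a buffer. The helper names (append, format_row, format_table) and the plain long counters are assumptions made for the example; the driver itself prints through seq_file from atomic counters.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append formatted text to a NUL-terminated buf of capacity len. */
static void append(char *buf, size_t len, const char *fmt, ...)
{
	size_t used = strlen(buf);
	va_list ap;

	if (used >= len)
		return;
	va_start(ap, fmt);
	vsnprintf(buf + used, len - used, fmt, ap);
	va_end(ap);
}

/* Render one "label: count count ..." row, one column per page order. */
static void format_row(char *buf, size_t len, const char *label,
		       const long *counts, unsigned int orders)
{
	unsigned int i;

	append(buf, len, "%-11s:", label);
	for (i = 0; i < orders; i++)
		append(buf, len, " %8ld", counts[i]);
	append(buf, len, "\n");
}

/* Render the header line followed by the bo and svm/userptr rows. */
static void format_table(char *buf, size_t len, const long *bo,
			 const long *svm, unsigned int orders)
{
	unsigned int i;

	buf[0] = '\0';
	append(buf, len, "%-11s ", "");
	for (i = 0; i < orders; i++)
		append(buf, len, " ---%2u---", i);
	append(buf, len, "\n");
	format_row(buf, len, "bo", bo, orders);
	format_row(buf, len, "svm/userptr", svm, orders);
}
```

Each column is nine characters wide in both the header and the rows, so the counts line up under their order number.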

For TTM BOs the order of each chunk is determined by walking the pages[]
array and finding the largest power-of-two run that is both
offset-aligned within the BO and physically contiguous (consecutive
PFNs).
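
The walk can be tried outside the kernel with a plain array of PFNs. This is a minimal sketch, not the driver code: count_pages_per_order, MAX_ORDERS and the pfns[] array are hypothetical stand-ins for the TTM page array, page_to_pfn() and NR_PAGE_ORDERS.

```c
#include <stddef.h>

#define MAX_ORDERS 11	/* assumption: stands in for NR_PAGE_ORDERS */

/*
 * Classify pages into offset-aligned, physically contiguous chunks.
 * pfns[i] stands in for page_to_pfn(tt->pages[i]); counts[k] is increased
 * by the number of pages that belong to order-k chunks.
 */
static void count_pages_per_order(const unsigned long *pfns, size_t num_pages,
				  long counts[MAX_ORDERS])
{
	size_t i = 0;

	while (i < num_pages) {
		unsigned int order = 0;

		for (unsigned int k = 1; k < MAX_ORDERS; k++) {
			size_t size = (size_t)1 << k;
			size_t j;

			/* offset within the array must be aligned to 2^k */
			if (i & (size - 1))
				break;

			/* the chunk must fit in the remaining pages */
			if (i + size > num_pages)
				break;

			/* every page must follow the first one by PFN */
			for (j = 1; j < size; j++)
				if (pfns[i + j] != pfns[i] + j)
					break;
			if (j < size)
				break;

			order = k;
		}

		counts[order] += 1L << order;
		i += (size_t)1 << order;
	}
}
```

With the PFNs {100, 101, 102, 103, 200, 201, 300, 400}, the first four pages form one order-2 chunk, the pair 200/201 at offset 4 forms an order-1 chunk, and the last two pages are counted individually at order 0.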

For the SVM and userptr ranges the accounting is driven by the
drm_gpusvm @dma_map_account callback, which fires once per dma_addr[]
entry at the exact point the entry is DMA-mapped and unmapped. This
makes the accounting symmetric by construction and avoids driver-side
walks that could drift across migration, partial unmaps and the iova vs
non-iova paths. Only DRM_INTERCONNECT_SYSTEM entries are real DMA maps
and are counted; device interconnect (VRAM, P2P) entries are skipped, so
the counter reflects only system pages actually mapped through the DMA
layer.
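
The signed, per-entry accounting can be sketched in userspace with C11 atomics. Everything named here (enum interconnect, struct map_entry, dma_map_account, counters_balanced, MAX_ORDERS) is a hypothetical stand-in for the DRM types and the driver callback, chosen only to show the filter on the interconnect type and the +1/-1 symmetry.

```c
#include <stdatomic.h>

#define MAX_ORDERS 11	/* assumption: stands in for NR_PAGE_ORDERS */

/* Hypothetical stand-ins for the DRM interconnect and address types. */
enum interconnect { INTERCONNECT_SYSTEM, INTERCONNECT_DEVICE };

struct map_entry {
	enum interconnect proto;
	unsigned int order;
};

/* One counter per page order; static storage starts out zeroed. */
static atomic_long mapped_pages[MAX_ORDERS];

/* Call with sign = +1 when an entry is mapped, -1 when it is unmapped. */
static void dma_map_account(const struct map_entry *e, int sign)
{
	/* Device-interconnect entries are not DMA maps of system pages. */
	if (e->proto != INTERCONNECT_SYSTEM)
		return;

	atomic_fetch_add(&mapped_pages[e->order],
			 (long)(1UL << e->order) * sign);
}

/* Return 1 if every counter is back to zero, i.e. accounting balanced. */
static int counters_balanced(void)
{
	for (unsigned int i = 0; i < MAX_ORDERS; i++)
		if (atomic_load(&mapped_pages[i]) != 0)
			return 0;
	return 1;
}
```

Because the same function handles both directions and takes the order from the entry itself, a map followed by an unmap of the same entry always nets to zero, which is what a balance check at teardown relies on.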

The callback is shared between the full SVM instance (fault mode) and the
core drm_gpusvm_pages-only instance used for userptr, so userptr
mappings are accounted in both the CONFIG_DRM_XE_GPUSVM=y and =n (but
CONFIG_DRM_GPUSVM=y) configurations, into the same svm/userptr counter.

Since the counts are only ever consumed through debugfs, gate the counter
storage and all accounting on CONFIG_DEBUG_FS so that builds without
debugfs carry no extra atomic traffic or device state; the drm_gpusvm
callback is simply left unregistered in that case.

On device teardown assert that every counter has returned to zero, to
catch any unbalanced accounting (a leaked DMA mapping) during testing.

Assisted-by: GitHub_Copilot:claude-opus-4.8
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c           | 63 ++++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_debugfs.c      | 26 ++++++++++++
 drivers/gpu/drm/xe/xe_device.c       | 24 +++++++++++
 drivers/gpu/drm/xe/xe_device_types.h | 17 ++++++++
 drivers/gpu/drm/xe/xe_svm.c          | 36 +++++++++++++++-
 drivers/gpu/drm/xe/xe_svm.h          |  3 +-
 drivers/gpu/drm/xe/xe_userptr.c      | 55 ++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_userptr.h      |  1 +
 8 files changed, 222 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4c80bac67622..673ed083d131 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -367,6 +367,67 @@ struct xe_ttm_tt {
bool purgeable;
};
+/*
+ * xe_tt_account_dma_pages - account pages in tt by contiguous aligned order
+ * @xe: the xe device
+ * @tt: the ttm_tt whose pages[] to walk
+ * @sign: +1 to add, -1 to subtract
+ *
+ * Walk the pages array and for each position find the largest order k such
+ * that the position is aligned to 2^k within the BO and the 2^k pages are
+ * physically contiguous (consecutive PFNs). Accumulate the count of 2^k pages
+ * into xe->mem.dma_mapped_pages[k].
+ */
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+static void xe_tt_account_dma_pages(struct xe_device *xe,
+				    struct ttm_tt *tt, int sign)
+{
+	unsigned long i = 0;
+
+	while (i < tt->num_pages) {
+		unsigned int order = 0;
+		unsigned long chunk;
+
+		/*
+		 * Find the largest order k where:
+		 * - i is aligned to 2^k within the BO (offset alignment)
+		 * - 2^k pages fit in the remaining range
+		 * - all 2^k pages are physically contiguous
+		 */
+		for (unsigned int k = 1; k < NR_PAGE_ORDERS; k++) {
+			unsigned long size = 1UL << k;
+			unsigned long j;
+
+			/* offset within BO must be aligned */
+			if (i & (size - 1))
+				break;
+
+			/* must fit in remaining pages */
+			if (i + size > tt->num_pages)
+				break;
+
+			/* check physical contiguity across the chunk */
+			for (j = 1; j < size; j++) {
+				if (page_to_pfn(tt->pages[i + j]) !=
+				    page_to_pfn(tt->pages[i]) + j)
+					goto found;
+			}
+			order = k;
+		}
+found:
+		chunk = 1UL << order;
+		atomic_long_add((long)chunk * sign,
+				&xe->mem.dma_mapped_pages[order]);
+		i += chunk;
+	}
+}
+#else
+static void xe_tt_account_dma_pages(struct xe_device *xe,
+				    struct ttm_tt *tt, int sign)
+{
+}
+#endif
+
static int xe_tt_map_sg(struct xe_device *xe, struct ttm_tt *tt)
{
struct xe_ttm_tt *xe_tt = container_of(tt, struct xe_ttm_tt, ttm);
@@ -396,6 +457,7 @@ static int xe_tt_map_sg(struct xe_device *xe, struct ttm_tt *tt)
return ret;
}
+ xe_tt_account_dma_pages(xe, tt, 1);
return 0;
}
@@ -404,6 +466,7 @@ static void xe_tt_unmap_sg(struct xe_device *xe, struct ttm_tt *tt)
struct xe_ttm_tt *xe_tt = container_of(tt, struct xe_ttm_tt, ttm);
if (xe_tt->sg) {
+ xe_tt_account_dma_pages(xe, tt, -1);
dma_unmap_sgtable(xe->drm.dev, xe_tt->sg,
DMA_BIDIRECTIONAL, 0);
sg_free_table(xe_tt->sg);
diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index 22b471303984..45bdceb200ac 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -6,6 +6,7 @@
#include "xe_debugfs.h"
#include <linux/debugfs.h>
+#include <linux/mmzone.h>
#include <linux/fault-inject.h>
#include <linux/string_helpers.h>
@@ -215,6 +216,28 @@ static int dgfx_pcie_link_residencies_show(struct seq_file *m, void *data)
return 0;
}
+static int dma_mapped_pages_show(struct seq_file *m, void *data)
+{
+	struct xe_device *xe = m->private;
+	unsigned int i;
+
+	seq_printf(m, "%-11s ", "");
+	for (i = 0; i < NR_PAGE_ORDERS; i++)
+		seq_printf(m, " ---%2u---", i);
+	seq_printf(m, "\n%-11s:", "bo");
+	for (i = 0; i < NR_PAGE_ORDERS; i++)
+		seq_printf(m, " %8ld",
+			   atomic_long_read(&xe->mem.dma_mapped_pages[i]));
+	seq_printf(m, "\n%-11s:", "svm/userptr");
+	for (i = 0; i < NR_PAGE_ORDERS; i++)
+		seq_printf(m, " %8ld",
+			   atomic_long_read(&xe->mem.dma_mapped_pages_svm[i]));
+	seq_puts(m, "\n");
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(dma_mapped_pages);
+
static const struct drm_info_list debugfs_list[] = {
{"info", info, 0},
{ .name = "sriov_info", .show = sriov_info, },
@@ -600,6 +623,9 @@ void xe_debugfs_register(struct xe_device *xe)
if (man)
ttm_resource_manager_create_debugfs(man, root, "stolen_mm");
+ debugfs_create_file("dma_mapped_pages", 0444, root, xe,
+ &dma_mapped_pages_fops);
+
for_each_tile(tile, xe, tile_id)
xe_tile_debugfs_register(tile);
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 51e3a2dd7b22..231ad742a27c 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -454,10 +454,34 @@ bool xe_device_is_admin_only(const struct xe_device *xe)
}
#endif
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+static void xe_device_assert_dma_pages_zero(struct xe_device *xe)
+{
+	unsigned int i;
+
+	/*
+	 * All BOs, userptr VMAs and SVM ranges must have been torn down by the
+	 * time the device is destroyed, so every DMA-mapped-pages counter must
+	 * have returned to zero. A non-zero value indicates unbalanced
+	 * accounting, i.e. a missing unmap-side decrement.
+	 */
+	for (i = 0; i < NR_PAGE_ORDERS; i++) {
+		xe_assert(xe, !atomic_long_read(&xe->mem.dma_mapped_pages[i]));
+		xe_assert(xe, !atomic_long_read(&xe->mem.dma_mapped_pages_svm[i]));
+	}
+}
+#else
+static void xe_device_assert_dma_pages_zero(struct xe_device *xe)
+{
+}
+#endif
+
static void xe_device_destroy(struct drm_device *dev, void *dummy)
{
struct xe_device *xe = to_xe_device(dev);
+ xe_device_assert_dma_pages_zero(xe);
+
xe_bo_dev_fini(&xe->bo_device);
if (xe->preempt_fence_wq)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 32dd2ffbc796..6fe94fc18008 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -6,6 +6,7 @@
#ifndef _XE_DEVICE_TYPES_H_
#define _XE_DEVICE_TYPES_H_
+#include <linux/mmzone.h>
#include <linux/pci.h>
#include <drm/drm_device.h>
@@ -279,6 +280,22 @@ struct xe_device {
struct xe_shrinker *shrinker;
/** @mem.stolen_mgr: stolen memory manager. */
struct xe_ttm_stolen_mgr *stolen_mgr;
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+		/**
+		 * @mem.dma_mapped_pages: number of DMA-mapped pages per page
+		 * order currently live for this device, for TTM BOs. Only
+		 * accounted when CONFIG_DEBUG_FS is enabled, since it is solely
+		 * exposed through debugfs.
+		 */
+		atomic_long_t dma_mapped_pages[NR_PAGE_ORDERS];
+		/**
+		 * @mem.dma_mapped_pages_svm: number of DMA-mapped pages per
+		 * page order currently live for this device, for SVM ranges
+		 * and userptr VMAs (both share a single drm_gpusvm instance).
+		 * Only accounted when CONFIG_DEBUG_FS is enabled.
+		 */
+		atomic_long_t dma_mapped_pages_svm[NR_PAGE_ORDERS];
+#endif
} mem;
/** @sriov: device level virtualization data */
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index e1651e70c8f0..03cfbf831257 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -817,10 +817,42 @@ static int xe_svm_get_pagemaps(struct xe_vm *vm)
}
#endif
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+/*
+ * xe_svm_dma_map_account - drm_gpusvm DMA-mapping accounting callback
+ * @gpusvm: The GPU SVM the mapping belongs to
+ * @addr: The address descriptor of the chunk being (un)mapped
+ * @sign: +1 when @addr was DMA-mapped, -1 when it is being unmapped
+ *
+ * Maintain per-order counts of the system-memory pages that the full SVM
+ * instance (SVM ranges, and userptr VMAs in fault mode) has DMA-mapped for GPU
+ * access. Only DRM_INTERCONNECT_SYSTEM entries are real DMA maps and counted;
+ * device interconnect (VRAM, P2P) entries are skipped. The counts are solely
+ * exposed through debugfs, so the callback is only registered when
+ * CONFIG_DEBUG_FS is enabled.
+ */
+static void xe_svm_dma_map_account(struct drm_gpusvm *gpusvm,
+				   const struct drm_pagemap_addr *addr,
+				   int sign)
+{
+	struct xe_device *xe = gpusvm_to_vm(gpusvm)->xe;
+	unsigned int order = addr->order;
+
+	if (addr->proto != DRM_INTERCONNECT_SYSTEM)
+		return;
+
+	atomic_long_add((long)(1UL << order) * sign,
+			&xe->mem.dma_mapped_pages_svm[order]);
+}
+#endif
+
static const struct drm_gpusvm_ops gpusvm_ops = {
.range_alloc = xe_svm_range_alloc,
.range_free = xe_svm_range_free,
.invalidate = xe_svm_invalidate,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+ .dma_map_account = xe_svm_dma_map_account,
+#endif
};
static const unsigned long fault_chunk_sizes[] = {
@@ -915,8 +947,8 @@ int xe_svm_init(struct xe_vm *vm)
}
} else {
err = drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)",
- &vm->xe->drm, NULL, 0, 0, 0, NULL,
- NULL, 0);
+ &vm->xe->drm, NULL, 0, 0, 0,
+ xe_userptr_gpusvm_ops_get(), NULL, 0);
}
return err;
diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
index b7b8eeacf196..729627abb08e 100644
--- a/drivers/gpu/drm/xe/xe_svm.h
+++ b/drivers/gpu/drm/xe/xe_svm.h
@@ -234,7 +234,8 @@ int xe_svm_init(struct xe_vm *vm)
{
#if IS_ENABLED(CONFIG_DRM_GPUSVM)
return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm,
- NULL, 0, 0, 0, NULL, NULL, 0);
+ NULL, 0, 0, 0, xe_userptr_gpusvm_ops_get(),
+ NULL, 0);
#else
return 0;
#endif
diff --git a/drivers/gpu/drm/xe/xe_userptr.c b/drivers/gpu/drm/xe/xe_userptr.c
index 6761005c0b90..08e0d4d25f8b 100644
--- a/drivers/gpu/drm/xe/xe_userptr.c
+++ b/drivers/gpu/drm/xe/xe_userptr.c
@@ -8,9 +8,64 @@
#include <linux/mm.h>
+#include <drm/drm_pagemap.h>
+
#include "xe_tlb_inval.h"
#include "xe_trace_bo.h"
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+/*
+ * xe_userptr_dma_map_account - drm_gpusvm DMA-mapping accounting callback
+ * @gpusvm: The GPU SVM the mapping belongs to
+ * @addr: The address descriptor of the chunk being (un)mapped
+ * @sign: +1 when @addr was DMA-mapped, -1 when it is being unmapped
+ *
+ * Account, per page order, the system-memory pages DMA-mapped for GPU access
+ * through the drm_gpusvm_pages instance shared by userptr (and, in fault mode,
+ * SVM). Only DRM_INTERCONNECT_SYSTEM entries are real DMA maps and counted;
+ * device interconnect (VRAM, P2P) entries are skipped. The counts are solely
+ * exposed through debugfs, so the callback is only registered when
+ * CONFIG_DEBUG_FS is enabled.
+ */
+static void xe_userptr_dma_map_account(struct drm_gpusvm *gpusvm,
+				       const struct drm_pagemap_addr *addr,
+				       int sign)
+{
+	struct xe_vm *vm = container_of(gpusvm, struct xe_vm, svm.gpusvm);
+	unsigned int order = addr->order;
+
+	if (addr->proto != DRM_INTERCONNECT_SYSTEM)
+		return;
+
+	atomic_long_add((long)(1UL << order) * sign,
+			&vm->xe->mem.dma_mapped_pages_svm[order]);
+}
+
+static const struct drm_gpusvm_ops xe_userptr_gpusvm_ops = {
+	.dma_map_account = xe_userptr_dma_map_account,
+};
+#endif
+
+/**
+ * xe_userptr_gpusvm_ops_get() - Accounting ops for the simple gpusvm instance
+ *
+ * The core drm_gpusvm_pages-only instance used for userptr (and, in fault
+ * mode, shared with SVM) is initialised with these ops so that DMA-mapped
+ * system pages are accounted and exposed through debugfs. Returns NULL when
+ * CONFIG_DEBUG_FS is disabled, in which case no accounting is kept and the
+ * instance is initialised without ops.
+ *
+ * Return: Pointer to the restricted &drm_gpusvm_ops, or NULL.
+ */
+const struct drm_gpusvm_ops *xe_userptr_gpusvm_ops_get(void)
+{
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	return &xe_userptr_gpusvm_ops;
+#else
+	return NULL;
+#endif
+}
+
static void xe_userptr_assert_in_notifier(struct xe_vm *vm)
{
lockdep_assert(lockdep_is_held_type(&vm->svm.gpusvm.notifier_lock, 0) ||
diff --git a/drivers/gpu/drm/xe/xe_userptr.h b/drivers/gpu/drm/xe/xe_userptr.h
index 2a3cd1b5efbb..1e9601be5713 100644
--- a/drivers/gpu/drm/xe/xe_userptr.h
+++ b/drivers/gpu/drm/xe/xe_userptr.h
@@ -108,6 +108,7 @@ int __xe_vm_userptr_needs_repin(struct xe_vm *vm);
int xe_vm_userptr_check_repin(struct xe_vm *vm);
int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma);
int xe_vma_userptr_check_repin(struct xe_userptr_vma *uvma);
+const struct drm_gpusvm_ops *xe_userptr_gpusvm_ops_get(void);
#else
static inline void xe_userptr_remove(struct xe_userptr_vma *uvma) {}
--
2.54.0