public inbox for intel-xe@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager
@ 2026-04-16 15:19 Sanjay Yadav
  2026-04-16 15:33 ` Maarten Lankhorst
  2026-04-20  9:36 ` Govindapillai, Vinod
  0 siblings, 2 replies; 7+ messages in thread
From: Sanjay Yadav @ 2026-04-16 15:19 UTC (permalink / raw)
  To: intel-xe; +Cc: Maarten Lankhorst, Matthew Brost, Matthew Auld

Stolen memory requires physically contiguous allocations for display
scanout and compressed framebuffers. The stolen memory manager was
sharing the gpu_buddy allocator backend with the VRAM manager, but
buddy manages non-contiguous power-of-two blocks making it a poor fit.
Stolen memory also has fundamentally different allocation patterns:

- Allocation sizes are not power-of-two. Since buddy rounds up to the
  next power-of-two block size, a ~17MB request can fail even with
  ~22MB free, because the free space is fragmented across non-fitting
  power-of-two blocks.
- Hardware restrictions prevent using the first 4K page of stolen for
  certain allocations (e.g., FBC). The display code sets fpfn=1 to
  enforce this, but when fpfn != 0, gpu_buddy enables
  GPU_BUDDY_RANGE_ALLOCATION mode which disables the try_harder
  coalescing path, further reducing allocation success.

This combination caused FBC compressed framebuffer (CFB) allocation
failures on platforms like NVL/PTL. In case of NVL where stolen memory
is ~56MB and the initial plane framebuffer consumes ~34MB at probe time,
leaving ~22MB for subsequent allocations.

Use ttm_range_man_init_nocheck() to set up a drm_mm-backed TTM resource
manager for stolen memory. This reuses the TTM core's ttm_range_manager
callbacks, avoiding duplicate implementations.

Tested on NVL with a 4K DP display: stolen_mm shows a single ~22MB
contiguous free hole after initial plane framebuffer allocation, and
FBC successfully allocates its CFB from that region. The corresponding
IGT was previously skipped and now passes.

v2:
- Clarify that stolen memory requires contiguous allocations (Matt B)
- Properly handle xe_ttm_resource_visible() for stolen instead of
  unconditionally returning true (Matt A)

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7631
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c             | 16 +++++--
 drivers/gpu/drm/xe/xe_device_types.h   |  3 ++
 drivers/gpu/drm/xe/xe_res_cursor.h     | 14 +++++-
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 64 +++++++++-----------------
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h |  9 ++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c   | 11 ++---
 6 files changed, 63 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index a7c2dc7f224c..b737edf04d60 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -599,11 +599,17 @@ static void xe_ttm_tt_destroy(struct ttm_device *ttm_dev, struct ttm_tt *tt)
 	kfree(tt);
 }
 
-static bool xe_ttm_resource_visible(struct ttm_resource *mem)
+static bool xe_ttm_resource_visible(struct xe_device *xe, struct ttm_resource *mem)
 {
-	struct xe_ttm_vram_mgr_resource *vres =
-		to_xe_ttm_vram_mgr_resource(mem);
+	struct xe_ttm_vram_mgr_resource *vres;
 
+	if (mem->mem_type == XE_PL_STOLEN) {
+		struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
+
+		return mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe);
+	}
+
+	vres = to_xe_ttm_vram_mgr_resource(mem);
 	return vres->used_visible_size == mem->size;
 }
 
@@ -621,7 +627,7 @@ bool xe_bo_is_visible_vram(struct xe_bo *bo)
 	if (drm_WARN_ON(bo->ttm.base.dev, !xe_bo_is_vram(bo)))
 		return false;
 
-	return xe_ttm_resource_visible(bo->ttm.resource);
+	return xe_ttm_resource_visible(xe_bo_device(bo), bo->ttm.resource);
 }
 
 static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
@@ -637,7 +643,7 @@ static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
 	case XE_PL_VRAM1: {
 		struct xe_vram_region *vram = res_to_mem_region(mem);
 
-		if (!xe_ttm_resource_visible(mem))
+		if (!xe_ttm_resource_visible(xe, mem))
 			return -EINVAL;
 
 		mem->bus.offset = mem->start << PAGE_SHIFT;
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 150c76b2acaf..ffb84659c413 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -42,6 +42,7 @@ struct xe_ggtt;
 struct xe_i2c;
 struct xe_pat_ops;
 struct xe_pxp;
+struct xe_ttm_stolen_mgr;
 struct xe_vram_region;
 
 /**
@@ -278,6 +279,8 @@ struct xe_device {
 		struct ttm_resource_manager sys_mgr;
 		/** @mem.sys_mgr: system memory shrinker. */
 		struct xe_shrinker *shrinker;
+		/** @mem.stolen_mgr: stolen memory manager. */
+		struct xe_ttm_stolen_mgr *stolen_mgr;
 	} mem;
 
 	/** @sriov: device level virtualization data */
diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h b/drivers/gpu/drm/xe/xe_res_cursor.h
index 5f4ab08c0686..0522caafd89d 100644
--- a/drivers/gpu/drm/xe/xe_res_cursor.h
+++ b/drivers/gpu/drm/xe/xe_res_cursor.h
@@ -101,7 +101,15 @@ static inline void xe_res_first(struct ttm_resource *res,
 	cur->mem_type = res->mem_type;
 
 	switch (cur->mem_type) {
-	case XE_PL_STOLEN:
+	case XE_PL_STOLEN: {
+		/* res->start is in pages (ttm_range_manager). */
+		cur->start = (res->start << PAGE_SHIFT) + start;
+		cur->size = size;
+		cur->remaining = size;
+		cur->node = NULL;
+		cur->mm = NULL;
+		break;
+	}
 	case XE_PL_VRAM0:
 	case XE_PL_VRAM1: {
 		struct gpu_buddy_block *block;
@@ -289,6 +297,10 @@ static inline void xe_res_next(struct xe_res_cursor *cur, u64 size)
 
 	switch (cur->mem_type) {
 	case XE_PL_STOLEN:
+		/* Just advance within the contiguous region. */
+		cur->start += size;
+		cur->size = cur->remaining;
+		break;
 	case XE_PL_VRAM0:
 	case XE_PL_VRAM1:
 		start = size - cur->size;
diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
index 27c9d72222cf..5e9070739e65 100644
--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
@@ -19,30 +19,11 @@
 #include "xe_device.h"
 #include "xe_gt_printk.h"
 #include "xe_mmio.h"
-#include "xe_res_cursor.h"
 #include "xe_sriov.h"
 #include "xe_ttm_stolen_mgr.h"
-#include "xe_ttm_vram_mgr.h"
 #include "xe_vram.h"
 #include "xe_wa.h"
 
-struct xe_ttm_stolen_mgr {
-	struct xe_ttm_vram_mgr base;
-
-	/* PCI base offset */
-	resource_size_t io_base;
-	/* GPU base offset */
-	resource_size_t stolen_base;
-
-	void __iomem *mapping;
-};
-
-static inline struct xe_ttm_stolen_mgr *
-to_stolen_mgr(struct ttm_resource_manager *man)
-{
-	return container_of(man, struct xe_ttm_stolen_mgr, base.manager);
-}
-
 /**
  * xe_ttm_stolen_cpu_access_needs_ggtt() - If we can't directly CPU access
  * stolen, can we then fallback to mapping through the GGTT.
@@ -210,12 +191,19 @@ static u64 detect_stolen(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
 #endif
 }
 
+static void xe_ttm_stolen_mgr_fini(struct drm_device *dev, void *arg)
+{
+	struct xe_device *xe = to_xe_device(dev);
+
+	ttm_range_man_fini_nocheck(&xe->ttm, XE_PL_STOLEN);
+}
+
 int xe_ttm_stolen_mgr_init(struct xe_device *xe)
 {
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
 	struct xe_ttm_stolen_mgr *mgr;
 	u64 stolen_size, io_size;
-	int err;
+	int ret;
 
 	mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
 	if (!mgr)
@@ -244,12 +232,12 @@ int xe_ttm_stolen_mgr_init(struct xe_device *xe)
 	if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
 		io_size = stolen_size;
 
-	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size,
-				     io_size, PAGE_SIZE);
-	if (err) {
-		drm_dbg_kms(&xe->drm, "Stolen mgr init failed: %i\n", err);
-		return err;
-	}
+	ret = ttm_range_man_init_nocheck(&xe->ttm, XE_PL_STOLEN, false,
+					 stolen_size >> PAGE_SHIFT);
+	if (ret)
+		return ret;
+
+	xe->mem.stolen_mgr = mgr;
 
 	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
 		    stolen_size);
@@ -257,36 +245,32 @@ int xe_ttm_stolen_mgr_init(struct xe_device *xe)
 	if (io_size)
 		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, io_size);
 
-	return 0;
+	return drmm_add_action_or_reset(&xe->drm, xe_ttm_stolen_mgr_fini, mgr);
 }
 
 u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
 {
 	struct xe_device *xe = xe_bo_device(bo);
-	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
-	struct xe_ttm_stolen_mgr *mgr = to_stolen_mgr(ttm_mgr);
-	struct xe_res_cursor cur;
+	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
 
 	XE_WARN_ON(!mgr->io_base);
 
 	if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
 		return mgr->io_base + xe_bo_ggtt_addr(bo) + offset;
 
-	xe_res_first(bo->ttm.resource, offset, 4096, &cur);
-	return mgr->io_base + cur.start;
+	/* Range allocator: res->start is in pages. */
+	return mgr->io_base + (bo->ttm.resource->start << PAGE_SHIFT) + offset;
 }
 
 static int __xe_ttm_stolen_io_mem_reserve_bar2(struct xe_device *xe,
 					       struct xe_ttm_stolen_mgr *mgr,
 					       struct ttm_resource *mem)
 {
-	struct xe_res_cursor cur;
-
 	if (!mgr->io_base)
 		return -EIO;
 
-	xe_res_first(mem, 0, 4096, &cur);
-	mem->bus.offset = cur.start;
+	/* Range allocator always produces contiguous allocations. */
+	mem->bus.offset = mem->start << PAGE_SHIFT;
 
 	drm_WARN_ON(&xe->drm, !(mem->placement & TTM_PL_FLAG_CONTIGUOUS));
 
@@ -329,8 +313,7 @@ static int __xe_ttm_stolen_io_mem_reserve_stolen(struct xe_device *xe,
 
 int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem)
 {
-	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
-	struct xe_ttm_stolen_mgr *mgr = ttm_mgr ? to_stolen_mgr(ttm_mgr) : NULL;
+	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
 
 	if (!mgr || !mgr->io_base)
 		return -EIO;
@@ -343,8 +326,5 @@ int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem)
 
 u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe)
 {
-	struct xe_ttm_stolen_mgr *mgr =
-		to_stolen_mgr(ttm_manager_type(&xe->ttm, XE_PL_STOLEN));
-
-	return mgr->stolen_base;
+	return xe->mem.stolen_mgr->stolen_base;
 }
diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
index 8e877d1e839b..049e91e77326 100644
--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
@@ -12,6 +12,15 @@ struct ttm_resource;
 struct xe_bo;
 struct xe_device;
 
+struct xe_ttm_stolen_mgr {
+	/* PCI base offset */
+	resource_size_t io_base;
+	/* GPU base offset */
+	resource_size_t stolen_base;
+
+	void __iomem *mapping;
+};
+
 int xe_ttm_stolen_mgr_init(struct xe_device *xe);
 int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem);
 bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe);
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 5fd0d5506a7e..79ef8e1b5e5c 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -301,14 +301,13 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 			   u64 default_page_size)
 {
 	struct ttm_resource_manager *man = &mgr->manager;
+	const char *name;
 	int err;
 
-	if (mem_type != XE_PL_STOLEN) {
-		const char *name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
-		man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
-		if (IS_ERR(man->cg))
-			return PTR_ERR(man->cg);
-	}
+	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
+	man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
+	if (IS_ERR(man->cg))
+		return PTR_ERR(man->cg);
 
 	man->func = &xe_ttm_vram_mgr_func;
 	mgr->mem_type = mem_type;
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager
  2026-04-16 15:19 [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager Sanjay Yadav
@ 2026-04-16 15:33 ` Maarten Lankhorst
  2026-04-16 16:22   ` Matthew Auld
  2026-04-20  9:36 ` Govindapillai, Vinod
  1 sibling, 1 reply; 7+ messages in thread
From: Maarten Lankhorst @ 2026-04-16 15:33 UTC (permalink / raw)
  To: Sanjay Yadav, intel-xe; +Cc: Matthew Brost, Matthew Auld

Hey,

Den 2026-04-16 kl. 17:19, skrev Sanjay Yadav:
> Stolen memory requires physically contiguous allocations for display
> scanout and compressed framebuffers. The stolen memory manager was
> sharing the gpu_buddy allocator backend with the VRAM manager, but
> buddy manages non-contiguous power-of-two blocks making it a poor fit.
> Stolen memory also has fundamentally different allocation patterns:
> 
> - Allocation sizes are not power-of-two. Since buddy rounds up to the
>   next power-of-two block size, a ~17MB request can fail even with
>   ~22MB free, because the free space is fragmented across non-fitting
>   power-of-two blocks.
> - Hardware restrictions prevent using the first 4K page of stolen for
>   certain allocations (e.g., FBC). The display code sets fpfn=1 to
>   enforce this, but when fpfn != 0, gpu_buddy enables
>   GPU_BUDDY_RANGE_ALLOCATION mode which disables the try_harder
>   coalescing path, further reducing allocation success.
> 
> This combination caused FBC compressed framebuffer (CFB) allocation
> failures on platforms like NVL/PTL. In case of NVL where stolen memory
> is ~56MB and the initial plane framebuffer consumes ~34MB at probe time,
> leaving ~22MB for subsequent allocations.
> 
> Use ttm_range_man_init_nocheck() to set up a drm_mm-backed TTM resource
> manager for stolen memory. This reuses the TTM core's ttm_range_manager
> callbacks, avoiding duplicate implementations.
> 
> Tested on NVL with a 4K DP display: stolen_mm shows a single ~22MB
> contiguous free hole after initial plane framebuffer allocation, and
> FBC successfully allocates its CFB from that region. The corresponding
> IGT was previously skipped and now passes.
> 
> v2:
> - Clarify that stolen memory requires contiguous allocations (Matt B)
> - Properly handle xe_ttm_resource_visible() for stolen instead of
>   unconditionally returning true (Matt A)
> 
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7631
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Suggested-by: Matthew Auld <matthew.auld@intel.com>
> Assisted-by: GitHub Copilot:claude-sonnet-4.6
> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
> ---
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

The changes look good, I never considered the differences between the
buddy allocator and range manager. Glad to see this works a lot better!

One thing I did notice that in most cases, is there any reason for the
range manager to use pages? It should be able to use bytes and remove
a lot of shifts to convert between bytes and pages in the code.

>  drivers/gpu/drm/xe/xe_bo.c             | 16 +++++--
>  drivers/gpu/drm/xe/xe_device_types.h   |  3 ++
>  drivers/gpu/drm/xe/xe_res_cursor.h     | 14 +++++-
>  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 64 +++++++++-----------------
>  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h |  9 ++++
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c   | 11 ++---
>  6 files changed, 63 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index a7c2dc7f224c..b737edf04d60 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -599,11 +599,17 @@ static void xe_ttm_tt_destroy(struct ttm_device *ttm_dev, struct ttm_tt *tt)
>  	kfree(tt);
>  }
>  
> -static bool xe_ttm_resource_visible(struct ttm_resource *mem)
> +static bool xe_ttm_resource_visible(struct xe_device *xe, struct ttm_resource *mem)
>  {
> -	struct xe_ttm_vram_mgr_resource *vres =
> -		to_xe_ttm_vram_mgr_resource(mem);
> +	struct xe_ttm_vram_mgr_resource *vres;
>  
> +	if (mem->mem_type == XE_PL_STOLEN) {
> +		struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
> +
> +		return mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe);
> +	}
> +
> +	vres = to_xe_ttm_vram_mgr_resource(mem);
>  	return vres->used_visible_size == mem->size;
>  }
>  
> @@ -621,7 +627,7 @@ bool xe_bo_is_visible_vram(struct xe_bo *bo)
>  	if (drm_WARN_ON(bo->ttm.base.dev, !xe_bo_is_vram(bo)))
>  		return false;
>  
> -	return xe_ttm_resource_visible(bo->ttm.resource);
> +	return xe_ttm_resource_visible(xe_bo_device(bo), bo->ttm.resource);
>  }
>  
>  static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
> @@ -637,7 +643,7 @@ static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
>  	case XE_PL_VRAM1: {
>  		struct xe_vram_region *vram = res_to_mem_region(mem);
>  
> -		if (!xe_ttm_resource_visible(mem))
> +		if (!xe_ttm_resource_visible(xe, mem))
>  			return -EINVAL;
>  
>  		mem->bus.offset = mem->start << PAGE_SHIFT;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 150c76b2acaf..ffb84659c413 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -42,6 +42,7 @@ struct xe_ggtt;
>  struct xe_i2c;
>  struct xe_pat_ops;
>  struct xe_pxp;
> +struct xe_ttm_stolen_mgr;
>  struct xe_vram_region;
>  
>  /**
> @@ -278,6 +279,8 @@ struct xe_device {
>  		struct ttm_resource_manager sys_mgr;
>  		/** @mem.sys_mgr: system memory shrinker. */
>  		struct xe_shrinker *shrinker;
> +		/** @mem.stolen_mgr: stolen memory manager. */
> +		struct xe_ttm_stolen_mgr *stolen_mgr;
>  	} mem;
>  
>  	/** @sriov: device level virtualization data */
> diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h b/drivers/gpu/drm/xe/xe_res_cursor.h
> index 5f4ab08c0686..0522caafd89d 100644
> --- a/drivers/gpu/drm/xe/xe_res_cursor.h
> +++ b/drivers/gpu/drm/xe/xe_res_cursor.h
> @@ -101,7 +101,15 @@ static inline void xe_res_first(struct ttm_resource *res,
>  	cur->mem_type = res->mem_type;
>  
>  	switch (cur->mem_type) {
> -	case XE_PL_STOLEN:
> +	case XE_PL_STOLEN: {
> +		/* res->start is in pages (ttm_range_manager). */
> +		cur->start = (res->start << PAGE_SHIFT) + start;
> +		cur->size = size;
> +		cur->remaining = size;
> +		cur->node = NULL;
> +		cur->mm = NULL;
> +		break;
> +	}
>  	case XE_PL_VRAM0:
>  	case XE_PL_VRAM1: {
>  		struct gpu_buddy_block *block;
> @@ -289,6 +297,10 @@ static inline void xe_res_next(struct xe_res_cursor *cur, u64 size)
>  
>  	switch (cur->mem_type) {
>  	case XE_PL_STOLEN:
> +		/* Just advance within the contiguous region. */
> +		cur->start += size;
> +		cur->size = cur->remaining;
> +		break;
>  	case XE_PL_VRAM0:
>  	case XE_PL_VRAM1:
>  		start = size - cur->size;
> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> index 27c9d72222cf..5e9070739e65 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> @@ -19,30 +19,11 @@
>  #include "xe_device.h"
>  #include "xe_gt_printk.h"
>  #include "xe_mmio.h"
> -#include "xe_res_cursor.h"
>  #include "xe_sriov.h"
>  #include "xe_ttm_stolen_mgr.h"
> -#include "xe_ttm_vram_mgr.h"
>  #include "xe_vram.h"
>  #include "xe_wa.h"
>  
> -struct xe_ttm_stolen_mgr {
> -	struct xe_ttm_vram_mgr base;
> -
> -	/* PCI base offset */
> -	resource_size_t io_base;
> -	/* GPU base offset */
> -	resource_size_t stolen_base;
> -
> -	void __iomem *mapping;
> -};
> -
> -static inline struct xe_ttm_stolen_mgr *
> -to_stolen_mgr(struct ttm_resource_manager *man)
> -{
> -	return container_of(man, struct xe_ttm_stolen_mgr, base.manager);
> -}
> -
>  /**
>   * xe_ttm_stolen_cpu_access_needs_ggtt() - If we can't directly CPU access
>   * stolen, can we then fallback to mapping through the GGTT.
> @@ -210,12 +191,19 @@ static u64 detect_stolen(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
>  #endif
>  }
>  
> +static void xe_ttm_stolen_mgr_fini(struct drm_device *dev, void *arg)
> +{
> +	struct xe_device *xe = to_xe_device(dev);
> +
> +	ttm_range_man_fini_nocheck(&xe->ttm, XE_PL_STOLEN);
> +}
> +
>  int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>  {
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>  	struct xe_ttm_stolen_mgr *mgr;
>  	u64 stolen_size, io_size;
> -	int err;
> +	int ret;
>  
>  	mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
>  	if (!mgr)
> @@ -244,12 +232,12 @@ int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>  	if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>  		io_size = stolen_size;
>  
> -	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size,
> -				     io_size, PAGE_SIZE);
> -	if (err) {
> -		drm_dbg_kms(&xe->drm, "Stolen mgr init failed: %i\n", err);
> -		return err;
> -	}
> +	ret = ttm_range_man_init_nocheck(&xe->ttm, XE_PL_STOLEN, false,
> +					 stolen_size >> PAGE_SHIFT);
> +	if (ret)
> +		return ret;
> +
> +	xe->mem.stolen_mgr = mgr;
>  
>  	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
>  		    stolen_size);
> @@ -257,36 +245,32 @@ int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>  	if (io_size)
>  		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, io_size);
>  
> -	return 0;
> +	return drmm_add_action_or_reset(&xe->drm, xe_ttm_stolen_mgr_fini, mgr);
>  }
>  
>  u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
>  {
>  	struct xe_device *xe = xe_bo_device(bo);
> -	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
> -	struct xe_ttm_stolen_mgr *mgr = to_stolen_mgr(ttm_mgr);
> -	struct xe_res_cursor cur;
> +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>  
>  	XE_WARN_ON(!mgr->io_base);
>  
>  	if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>  		return mgr->io_base + xe_bo_ggtt_addr(bo) + offset;
>  
> -	xe_res_first(bo->ttm.resource, offset, 4096, &cur);
> -	return mgr->io_base + cur.start;
> +	/* Range allocator: res->start is in pages. */
> +	return mgr->io_base + (bo->ttm.resource->start << PAGE_SHIFT) + offset;
>  }
>  
>  static int __xe_ttm_stolen_io_mem_reserve_bar2(struct xe_device *xe,
>  					       struct xe_ttm_stolen_mgr *mgr,
>  					       struct ttm_resource *mem)
>  {
> -	struct xe_res_cursor cur;
> -
>  	if (!mgr->io_base)
>  		return -EIO;
>  
> -	xe_res_first(mem, 0, 4096, &cur);
> -	mem->bus.offset = cur.start;
> +	/* Range allocator always produces contiguous allocations. */
> +	mem->bus.offset = mem->start << PAGE_SHIFT;
>  
>  	drm_WARN_ON(&xe->drm, !(mem->placement & TTM_PL_FLAG_CONTIGUOUS));
>  
> @@ -329,8 +313,7 @@ static int __xe_ttm_stolen_io_mem_reserve_stolen(struct xe_device *xe,
>  
>  int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem)
>  {
> -	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
> -	struct xe_ttm_stolen_mgr *mgr = ttm_mgr ? to_stolen_mgr(ttm_mgr) : NULL;
> +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>  
>  	if (!mgr || !mgr->io_base)
>  		return -EIO;
> @@ -343,8 +326,5 @@ int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem)
>  
>  u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe)
>  {
> -	struct xe_ttm_stolen_mgr *mgr =
> -		to_stolen_mgr(ttm_manager_type(&xe->ttm, XE_PL_STOLEN));
> -
> -	return mgr->stolen_base;
> +	return xe->mem.stolen_mgr->stolen_base;
>  }
> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> index 8e877d1e839b..049e91e77326 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> @@ -12,6 +12,15 @@ struct ttm_resource;
>  struct xe_bo;
>  struct xe_device;
>  
> +struct xe_ttm_stolen_mgr {
> +	/* PCI base offset */
> +	resource_size_t io_base;
> +	/* GPU base offset */
> +	resource_size_t stolen_base;
> +
> +	void __iomem *mapping;
> +};
> +
>  int xe_ttm_stolen_mgr_init(struct xe_device *xe);
>  int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem);
>  bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe);
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> index 5fd0d5506a7e..79ef8e1b5e5c 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> @@ -301,14 +301,13 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
>  			   u64 default_page_size)
>  {
>  	struct ttm_resource_manager *man = &mgr->manager;
> +	const char *name;
>  	int err;
>  
> -	if (mem_type != XE_PL_STOLEN) {
> -		const char *name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
> -		man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
> -		if (IS_ERR(man->cg))
> -			return PTR_ERR(man->cg);
> -	}
> +	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
> +	man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
> +	if (IS_ERR(man->cg))
> +		return PTR_ERR(man->cg);
>  
>  	man->func = &xe_ttm_vram_mgr_func;
>  	mgr->mem_type = mem_type;


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager
  2026-04-16 15:33 ` Maarten Lankhorst
@ 2026-04-16 16:22   ` Matthew Auld
  2026-04-20 12:20     ` Yadav, Sanjay Kumar
  2026-04-20 13:16     ` Govindapillai, Vinod
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew Auld @ 2026-04-16 16:22 UTC (permalink / raw)
  To: Maarten Lankhorst, Sanjay Yadav, intel-xe; +Cc: Matthew Brost

On 16/04/2026 16:33, Maarten Lankhorst wrote:
> Hey,
> 
> Den 2026-04-16 kl. 17:19, skrev Sanjay Yadav:
>> Stolen memory requires physically contiguous allocations for display
>> scanout and compressed framebuffers. The stolen memory manager was
>> sharing the gpu_buddy allocator backend with the VRAM manager, but
>> buddy manages non-contiguous power-of-two blocks making it a poor fit.
>> Stolen memory also has fundamentally different allocation patterns:
>>
>> - Allocation sizes are not power-of-two. Since buddy rounds up to the
>>    next power-of-two block size, a ~17MB request can fail even with
>>    ~22MB free, because the free space is fragmented across non-fitting
>>    power-of-two blocks.
>> - Hardware restrictions prevent using the first 4K page of stolen for
>>    certain allocations (e.g., FBC). The display code sets fpfn=1 to
>>    enforce this, but when fpfn != 0, gpu_buddy enables
>>    GPU_BUDDY_RANGE_ALLOCATION mode which disables the try_harder
>>    coalescing path, further reducing allocation success.
>>
>> This combination caused FBC compressed framebuffer (CFB) allocation
>> failures on platforms like NVL/PTL. In case of NVL where stolen memory
>> is ~56MB and the initial plane framebuffer consumes ~34MB at probe time,
>> leaving ~22MB for subsequent allocations.
>>
>> Use ttm_range_man_init_nocheck() to set up a drm_mm-backed TTM resource
>> manager for stolen memory. This reuses the TTM core's ttm_range_manager
>> callbacks, avoiding duplicate implementations.
>>
>> Tested on NVL with a 4K DP display: stolen_mm shows a single ~22MB
>> contiguous free hole after initial plane framebuffer allocation, and
>> FBC successfully allocates its CFB from that region. The corresponding
>> IGT was previously skipped and now passes.
>>
>> v2:
>> - Clarify that stolen memory requires contiguous allocations (Matt B)
>> - Properly handle xe_ttm_resource_visible() for stolen instead of
>>    unconditionally returning true (Matt A)
>>
>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7631
>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Suggested-by: Matthew Auld <matthew.auld@intel.com>
>> Assisted-by: GitHub Copilot:claude-sonnet-4.6
>> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
>> ---
> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> 
> The changes look good, I never considered the differences between the
> buddy allocator and range manager. Glad to see this works a lot better!
> 
> One thing I did notice that in most cases, is there any reason for the
> range manager to use pages? It should be able to use bytes and remove
> a lot of shifts to convert between bytes and pages in the code.

Ah, that reminds me of: dfd1d2f7eb39 ("drm/xe/fbdev: Fix BIOS FB vs. 
stolen size check")

That would need to be converted back to pages, I think.

> 
>>   drivers/gpu/drm/xe/xe_bo.c             | 16 +++++--
>>   drivers/gpu/drm/xe/xe_device_types.h   |  3 ++
>>   drivers/gpu/drm/xe/xe_res_cursor.h     | 14 +++++-
>>   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 64 +++++++++-----------------
>>   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h |  9 ++++
>>   drivers/gpu/drm/xe/xe_ttm_vram_mgr.c   | 11 ++---
>>   6 files changed, 63 insertions(+), 54 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>> index a7c2dc7f224c..b737edf04d60 100644
>> --- a/drivers/gpu/drm/xe/xe_bo.c
>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>> @@ -599,11 +599,17 @@ static void xe_ttm_tt_destroy(struct ttm_device *ttm_dev, struct ttm_tt *tt)
>>   	kfree(tt);
>>   }
>>   
>> -static bool xe_ttm_resource_visible(struct ttm_resource *mem)
>> +static bool xe_ttm_resource_visible(struct xe_device *xe, struct ttm_resource *mem)
>>   {
>> -	struct xe_ttm_vram_mgr_resource *vres =
>> -		to_xe_ttm_vram_mgr_resource(mem);
>> +	struct xe_ttm_vram_mgr_resource *vres;
>>   
>> +	if (mem->mem_type == XE_PL_STOLEN) {
>> +		struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>> +
>> +		return mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe);
>> +	}
>> +
>> +	vres = to_xe_ttm_vram_mgr_resource(mem);
>>   	return vres->used_visible_size == mem->size;
>>   }
>>   
>> @@ -621,7 +627,7 @@ bool xe_bo_is_visible_vram(struct xe_bo *bo)
>>   	if (drm_WARN_ON(bo->ttm.base.dev, !xe_bo_is_vram(bo)))
>>   		return false;
>>   
>> -	return xe_ttm_resource_visible(bo->ttm.resource);
>> +	return xe_ttm_resource_visible(xe_bo_device(bo), bo->ttm.resource);
>>   }
>>   
>>   static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
>> @@ -637,7 +643,7 @@ static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
>>   	case XE_PL_VRAM1: {
>>   		struct xe_vram_region *vram = res_to_mem_region(mem);
>>   
>> -		if (!xe_ttm_resource_visible(mem))
>> +		if (!xe_ttm_resource_visible(xe, mem))
>>   			return -EINVAL;
>>   
>>   		mem->bus.offset = mem->start << PAGE_SHIFT;
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>> index 150c76b2acaf..ffb84659c413 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -42,6 +42,7 @@ struct xe_ggtt;
>>   struct xe_i2c;
>>   struct xe_pat_ops;
>>   struct xe_pxp;
>> +struct xe_ttm_stolen_mgr;
>>   struct xe_vram_region;
>>   
>>   /**
>> @@ -278,6 +279,8 @@ struct xe_device {
>>   		struct ttm_resource_manager sys_mgr;
>>   		/** @mem.sys_mgr: system memory shrinker. */
>>   		struct xe_shrinker *shrinker;
>> +		/** @mem.stolen_mgr: stolen memory manager. */
>> +		struct xe_ttm_stolen_mgr *stolen_mgr;
>>   	} mem;
>>   
>>   	/** @sriov: device level virtualization data */
>> diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h b/drivers/gpu/drm/xe/xe_res_cursor.h
>> index 5f4ab08c0686..0522caafd89d 100644
>> --- a/drivers/gpu/drm/xe/xe_res_cursor.h
>> +++ b/drivers/gpu/drm/xe/xe_res_cursor.h
>> @@ -101,7 +101,15 @@ static inline void xe_res_first(struct ttm_resource *res,
>>   	cur->mem_type = res->mem_type;
>>   
>>   	switch (cur->mem_type) {
>> -	case XE_PL_STOLEN:
>> +	case XE_PL_STOLEN: {
>> +		/* res->start is in pages (ttm_range_manager). */
>> +		cur->start = (res->start << PAGE_SHIFT) + start;
>> +		cur->size = size;
>> +		cur->remaining = size;
>> +		cur->node = NULL;
>> +		cur->mm = NULL;
>> +		break;
>> +	}
>>   	case XE_PL_VRAM0:
>>   	case XE_PL_VRAM1: {
>>   		struct gpu_buddy_block *block;
>> @@ -289,6 +297,10 @@ static inline void xe_res_next(struct xe_res_cursor *cur, u64 size)
>>   
>>   	switch (cur->mem_type) {
>>   	case XE_PL_STOLEN:
>> +		/* Just advance within the contiguous region. */
>> +		cur->start += size;
>> +		cur->size = cur->remaining;
>> +		break;
>>   	case XE_PL_VRAM0:
>>   	case XE_PL_VRAM1:
>>   		start = size - cur->size;
>> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> index 27c9d72222cf..5e9070739e65 100644
>> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> @@ -19,30 +19,11 @@
>>   #include "xe_device.h"
>>   #include "xe_gt_printk.h"
>>   #include "xe_mmio.h"
>> -#include "xe_res_cursor.h"
>>   #include "xe_sriov.h"
>>   #include "xe_ttm_stolen_mgr.h"
>> -#include "xe_ttm_vram_mgr.h"
>>   #include "xe_vram.h"
>>   #include "xe_wa.h"
>>   
>> -struct xe_ttm_stolen_mgr {
>> -	struct xe_ttm_vram_mgr base;
>> -
>> -	/* PCI base offset */
>> -	resource_size_t io_base;
>> -	/* GPU base offset */
>> -	resource_size_t stolen_base;
>> -
>> -	void __iomem *mapping;
>> -};
>> -
>> -static inline struct xe_ttm_stolen_mgr *
>> -to_stolen_mgr(struct ttm_resource_manager *man)
>> -{
>> -	return container_of(man, struct xe_ttm_stolen_mgr, base.manager);
>> -}
>> -
>>   /**
>>    * xe_ttm_stolen_cpu_access_needs_ggtt() - If we can't directly CPU access
>>    * stolen, can we then fallback to mapping through the GGTT.
>> @@ -210,12 +191,19 @@ static u64 detect_stolen(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
>>   #endif
>>   }
>>   
>> +static void xe_ttm_stolen_mgr_fini(struct drm_device *dev, void *arg)
>> +{
>> +	struct xe_device *xe = to_xe_device(dev);
>> +
>> +	ttm_range_man_fini_nocheck(&xe->ttm, XE_PL_STOLEN);
>> +}
>> +
>>   int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>>   {
>>   	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>>   	struct xe_ttm_stolen_mgr *mgr;
>>   	u64 stolen_size, io_size;
>> -	int err;
>> +	int ret;
>>   
>>   	mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
>>   	if (!mgr)
>> @@ -244,12 +232,12 @@ int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>>   	if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>>   		io_size = stolen_size;
>>   
>> -	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size,
>> -				     io_size, PAGE_SIZE);
>> -	if (err) {
>> -		drm_dbg_kms(&xe->drm, "Stolen mgr init failed: %i\n", err);
>> -		return err;
>> -	}
>> +	ret = ttm_range_man_init_nocheck(&xe->ttm, XE_PL_STOLEN, false,
>> +					 stolen_size >> PAGE_SHIFT);
>> +	if (ret)
>> +		return ret;
>> +
>> +	xe->mem.stolen_mgr = mgr;
>>   
>>   	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
>>   		    stolen_size);
>> @@ -257,36 +245,32 @@ int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>>   	if (io_size)
>>   		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, io_size);
>>   
>> -	return 0;
>> +	return drmm_add_action_or_reset(&xe->drm, xe_ttm_stolen_mgr_fini, mgr);
>>   }
>>   
>>   u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
>>   {
>>   	struct xe_device *xe = xe_bo_device(bo);
>> -	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
>> -	struct xe_ttm_stolen_mgr *mgr = to_stolen_mgr(ttm_mgr);
>> -	struct xe_res_cursor cur;
>> +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>>   
>>   	XE_WARN_ON(!mgr->io_base);
>>   
>>   	if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>>   		return mgr->io_base + xe_bo_ggtt_addr(bo) + offset;
>>   
>> -	xe_res_first(bo->ttm.resource, offset, 4096, &cur);
>> -	return mgr->io_base + cur.start;
>> +	/* Range allocator: res->start is in pages. */
>> +	return mgr->io_base + (bo->ttm.resource->start << PAGE_SHIFT) + offset;
>>   }
>>   
>>   static int __xe_ttm_stolen_io_mem_reserve_bar2(struct xe_device *xe,
>>   					       struct xe_ttm_stolen_mgr *mgr,
>>   					       struct ttm_resource *mem)
>>   {
>> -	struct xe_res_cursor cur;
>> -
>>   	if (!mgr->io_base)
>>   		return -EIO;
>>   
>> -	xe_res_first(mem, 0, 4096, &cur);
>> -	mem->bus.offset = cur.start;
>> +	/* Range allocator always produces contiguous allocations. */
>> +	mem->bus.offset = mem->start << PAGE_SHIFT;
>>   
>>   	drm_WARN_ON(&xe->drm, !(mem->placement & TTM_PL_FLAG_CONTIGUOUS));
>>   
>> @@ -329,8 +313,7 @@ static int __xe_ttm_stolen_io_mem_reserve_stolen(struct xe_device *xe,
>>   
>>   int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem)
>>   {
>> -	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
>> -	struct xe_ttm_stolen_mgr *mgr = ttm_mgr ? to_stolen_mgr(ttm_mgr) : NULL;
>> +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>>   
>>   	if (!mgr || !mgr->io_base)
>>   		return -EIO;
>> @@ -343,8 +326,5 @@ int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem)
>>   
>>   u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe)
>>   {
>> -	struct xe_ttm_stolen_mgr *mgr =
>> -		to_stolen_mgr(ttm_manager_type(&xe->ttm, XE_PL_STOLEN));
>> -
>> -	return mgr->stolen_base;
>> +	return xe->mem.stolen_mgr->stolen_base;
>>   }
>> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>> index 8e877d1e839b..049e91e77326 100644
>> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>> @@ -12,6 +12,15 @@ struct ttm_resource;
>>   struct xe_bo;
>>   struct xe_device;
>>   
>> +struct xe_ttm_stolen_mgr {
>> +	/* PCI base offset */
>> +	resource_size_t io_base;
>> +	/* GPU base offset */
>> +	resource_size_t stolen_base;
>> +
>> +	void __iomem *mapping;
>> +};
>> +
>>   int xe_ttm_stolen_mgr_init(struct xe_device *xe);
>>   int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct ttm_resource *mem);
>>   bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe);
>> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>> index 5fd0d5506a7e..79ef8e1b5e5c 100644
>> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>> @@ -301,14 +301,13 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
>>   			   u64 default_page_size)
>>   {
>>   	struct ttm_resource_manager *man = &mgr->manager;
>> +	const char *name;
>>   	int err;
>>   
>> -	if (mem_type != XE_PL_STOLEN) {
>> -		const char *name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
>> -		man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
>> -		if (IS_ERR(man->cg))
>> -			return PTR_ERR(man->cg);
>> -	}
>> +	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
>> +	man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
>> +	if (IS_ERR(man->cg))
>> +		return PTR_ERR(man->cg);
>>   
>>   	man->func = &xe_ttm_vram_mgr_func;
>>   	mgr->mem_type = mem_type;
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager
  2026-04-16 15:19 [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager Sanjay Yadav
  2026-04-16 15:33 ` Maarten Lankhorst
@ 2026-04-20  9:36 ` Govindapillai, Vinod
  2026-04-20 12:20   ` Yadav, Sanjay Kumar
  1 sibling, 1 reply; 7+ messages in thread
From: Govindapillai, Vinod @ 2026-04-20  9:36 UTC (permalink / raw)
  To: intel-xe@lists.freedesktop.org, Yadav,  Sanjay Kumar
  Cc: Brost, Matthew, maarten.lankhorst@linux.intel.com, Auld, Matthew

On Thu, 2026-04-16 at 20:49 +0530, Sanjay Yadav wrote:
> Stolen memory requires physically contiguous allocations for display
> scanout and compressed framebuffers. The stolen memory manager was
> sharing the gpu_buddy allocator backend with the VRAM manager, but
> buddy manages non-contiguous power-of-two blocks making it a poor
> fit.
> Stolen memory also has fundamentally different allocation patterns:
> 
> - Allocation sizes are not power-of-two. Since buddy rounds up to the
>   next power-of-two block size, a ~17MB request can fail even with
>   ~22MB free, because the free space is fragmented across non-fitting
>   power-of-two blocks.
> - Hardware restrictions prevent using the first 4K page of stolen for
>   certain allocations (e.g., FBC). The display code sets fpfn=1 to
>   enforce this, but when fpfn != 0, gpu_buddy enables
>   GPU_BUDDY_RANGE_ALLOCATION mode which disables the try_harder
>   coalescing path, further reducing allocation success.
> 
> This combination caused FBC compressed framebuffer (CFB) allocation
> failures on platforms like NVL/PTL. In case of NVL where stolen
> memory
> is ~56MB and the initial plane framebuffer consumes ~34MB at probe
> time,
> leaving ~22MB for subsequent allocations.
> 
> Use ttm_range_man_init_nocheck() to set up a drm_mm-backed TTM
> resource
> manager for stolen memory. This reuses the TTM core's
> ttm_range_manager
> callbacks, avoiding duplicate implementations.
> 
> Tested on NVL with a 4K DP display: stolen_mm shows a single ~22MB
> contiguous free hole after initial plane framebuffer allocation, and
> FBC successfully allocates its CFB from that region. The
> corresponding
> IGT was previously skipped and now passes.
> 
> v2:
> - Clarify that stolen memory requires contiguous allocations (Matt B)
> - Properly handle xe_ttm_resource_visible() for stolen instead of
>   unconditionally returning true (Matt A)
> 
> Closes:
> https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7631
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Suggested-by: Matthew Auld <matthew.auld@intel.com>
> Assisted-by: GitHub Copilot:claude-sonnet-4.6
> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_bo.c             | 16 +++++--
>  drivers/gpu/drm/xe/xe_device_types.h   |  3 ++
>  drivers/gpu/drm/xe/xe_res_cursor.h     | 14 +++++-
>  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 64 +++++++++---------------
> --
>  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h |  9 ++++
>  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c   | 11 ++---
>  6 files changed, 63 insertions(+), 54 deletions(-)
> 

FP16 stolen memory allocations are working ~35mb

Acked-by: Vinod Govindapillai <vinod.govindapillai@intel.com>

> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index a7c2dc7f224c..b737edf04d60 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -599,11 +599,17 @@ static void xe_ttm_tt_destroy(struct ttm_device
> *ttm_dev, struct ttm_tt *tt)
>  	kfree(tt);
>  }
>  
> -static bool xe_ttm_resource_visible(struct ttm_resource *mem)
> +static bool xe_ttm_resource_visible(struct xe_device *xe, struct
> ttm_resource *mem)
>  {
> -	struct xe_ttm_vram_mgr_resource *vres =
> -		to_xe_ttm_vram_mgr_resource(mem);
> +	struct xe_ttm_vram_mgr_resource *vres;
>  
> +	if (mem->mem_type == XE_PL_STOLEN) {
> +		struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
> +
> +		return mgr->io_base &&
> !xe_ttm_stolen_cpu_access_needs_ggtt(xe);
> +	}
> +
> +	vres = to_xe_ttm_vram_mgr_resource(mem);
>  	return vres->used_visible_size == mem->size;
>  }
>  
> @@ -621,7 +627,7 @@ bool xe_bo_is_visible_vram(struct xe_bo *bo)
>  	if (drm_WARN_ON(bo->ttm.base.dev, !xe_bo_is_vram(bo)))
>  		return false;
>  
> -	return xe_ttm_resource_visible(bo->ttm.resource);
> +	return xe_ttm_resource_visible(xe_bo_device(bo), bo-
> >ttm.resource);
>  }
>  
>  static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
> @@ -637,7 +643,7 @@ static int xe_ttm_io_mem_reserve(struct
> ttm_device *bdev,
>  	case XE_PL_VRAM1: {
>  		struct xe_vram_region *vram =
> res_to_mem_region(mem);
>  
> -		if (!xe_ttm_resource_visible(mem))
> +		if (!xe_ttm_resource_visible(xe, mem))
>  			return -EINVAL;
>  
>  		mem->bus.offset = mem->start << PAGE_SHIFT;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h
> b/drivers/gpu/drm/xe/xe_device_types.h
> index 150c76b2acaf..ffb84659c413 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -42,6 +42,7 @@ struct xe_ggtt;
>  struct xe_i2c;
>  struct xe_pat_ops;
>  struct xe_pxp;
> +struct xe_ttm_stolen_mgr;
>  struct xe_vram_region;
>  
>  /**
> @@ -278,6 +279,8 @@ struct xe_device {
>  		struct ttm_resource_manager sys_mgr;
>  		/** @mem.sys_mgr: system memory shrinker. */
>  		struct xe_shrinker *shrinker;
> +		/** @mem.stolen_mgr: stolen memory manager. */
> +		struct xe_ttm_stolen_mgr *stolen_mgr;
>  	} mem;
>  
>  	/** @sriov: device level virtualization data */
> diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h
> b/drivers/gpu/drm/xe/xe_res_cursor.h
> index 5f4ab08c0686..0522caafd89d 100644
> --- a/drivers/gpu/drm/xe/xe_res_cursor.h
> +++ b/drivers/gpu/drm/xe/xe_res_cursor.h
> @@ -101,7 +101,15 @@ static inline void xe_res_first(struct
> ttm_resource *res,
>  	cur->mem_type = res->mem_type;
>  
>  	switch (cur->mem_type) {
> -	case XE_PL_STOLEN:
> +	case XE_PL_STOLEN: {
> +		/* res->start is in pages (ttm_range_manager). */
> +		cur->start = (res->start << PAGE_SHIFT) + start;
> +		cur->size = size;
> +		cur->remaining = size;
> +		cur->node = NULL;
> +		cur->mm = NULL;
> +		break;
> +	}
>  	case XE_PL_VRAM0:
>  	case XE_PL_VRAM1: {
>  		struct gpu_buddy_block *block;
> @@ -289,6 +297,10 @@ static inline void xe_res_next(struct
> xe_res_cursor *cur, u64 size)
>  
>  	switch (cur->mem_type) {
>  	case XE_PL_STOLEN:
> +		/* Just advance within the contiguous region. */
> +		cur->start += size;
> +		cur->size = cur->remaining;
> +		break;
>  	case XE_PL_VRAM0:
>  	case XE_PL_VRAM1:
>  		start = size - cur->size;
> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> index 27c9d72222cf..5e9070739e65 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> @@ -19,30 +19,11 @@
>  #include "xe_device.h"
>  #include "xe_gt_printk.h"
>  #include "xe_mmio.h"
> -#include "xe_res_cursor.h"
>  #include "xe_sriov.h"
>  #include "xe_ttm_stolen_mgr.h"
> -#include "xe_ttm_vram_mgr.h"
>  #include "xe_vram.h"
>  #include "xe_wa.h"
>  
> -struct xe_ttm_stolen_mgr {
> -	struct xe_ttm_vram_mgr base;
> -
> -	/* PCI base offset */
> -	resource_size_t io_base;
> -	/* GPU base offset */
> -	resource_size_t stolen_base;
> -
> -	void __iomem *mapping;
> -};
> -
> -static inline struct xe_ttm_stolen_mgr *
> -to_stolen_mgr(struct ttm_resource_manager *man)
> -{
> -	return container_of(man, struct xe_ttm_stolen_mgr,
> base.manager);
> -}
> -
>  /**
>   * xe_ttm_stolen_cpu_access_needs_ggtt() - If we can't directly CPU
> access
>   * stolen, can we then fallback to mapping through the GGTT.
> @@ -210,12 +191,19 @@ static u64 detect_stolen(struct xe_device *xe,
> struct xe_ttm_stolen_mgr *mgr)
>  #endif
>  }
>  
> +static void xe_ttm_stolen_mgr_fini(struct drm_device *dev, void
> *arg)
> +{
> +	struct xe_device *xe = to_xe_device(dev);
> +
> +	ttm_range_man_fini_nocheck(&xe->ttm, XE_PL_STOLEN);
> +}
> +
>  int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>  {
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>  	struct xe_ttm_stolen_mgr *mgr;
>  	u64 stolen_size, io_size;
> -	int err;
> +	int ret;
>  
>  	mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
>  	if (!mgr)
> @@ -244,12 +232,12 @@ int xe_ttm_stolen_mgr_init(struct xe_device
> *xe)
>  	if (mgr->io_base &&
> !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>  		io_size = stolen_size;
>  
> -	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN,
> stolen_size,
> -				     io_size, PAGE_SIZE);
> -	if (err) {
> -		drm_dbg_kms(&xe->drm, "Stolen mgr init failed:
> %i\n", err);
> -		return err;
> -	}
> +	ret = ttm_range_man_init_nocheck(&xe->ttm, XE_PL_STOLEN,
> false,
> +					 stolen_size >> PAGE_SHIFT);
> +	if (ret)
> +		return ret;
> +
> +	xe->mem.stolen_mgr = mgr;
>  
>  	drm_dbg_kms(&xe->drm, "Initialized stolen memory support
> with %llu bytes\n",
>  		    stolen_size);
> @@ -257,36 +245,32 @@ int xe_ttm_stolen_mgr_init(struct xe_device
> *xe)
>  	if (io_size)
>  		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr-
> >io_base, io_size);
>  
> -	return 0;
> +	return drmm_add_action_or_reset(&xe->drm,
> xe_ttm_stolen_mgr_fini, mgr);
>  }
>  
>  u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
>  {
>  	struct xe_device *xe = xe_bo_device(bo);
> -	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe-
> >ttm, XE_PL_STOLEN);
> -	struct xe_ttm_stolen_mgr *mgr = to_stolen_mgr(ttm_mgr);
> -	struct xe_res_cursor cur;
> +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>  
>  	XE_WARN_ON(!mgr->io_base);
>  
>  	if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>  		return mgr->io_base + xe_bo_ggtt_addr(bo) + offset;
>  
> -	xe_res_first(bo->ttm.resource, offset, 4096, &cur);
> -	return mgr->io_base + cur.start;
> +	/* Range allocator: res->start is in pages. */
> +	return mgr->io_base + (bo->ttm.resource->start <<
> PAGE_SHIFT) + offset;
>  }
>  
>  static int __xe_ttm_stolen_io_mem_reserve_bar2(struct xe_device *xe,
>  					       struct
> xe_ttm_stolen_mgr *mgr,
>  					       struct ttm_resource
> *mem)
>  {
> -	struct xe_res_cursor cur;
> -
>  	if (!mgr->io_base)
>  		return -EIO;
>  
> -	xe_res_first(mem, 0, 4096, &cur);
> -	mem->bus.offset = cur.start;
> +	/* Range allocator always produces contiguous allocations.
> */
> +	mem->bus.offset = mem->start << PAGE_SHIFT;
>  
>  	drm_WARN_ON(&xe->drm, !(mem->placement &
> TTM_PL_FLAG_CONTIGUOUS));
>  
> @@ -329,8 +313,7 @@ static int
> __xe_ttm_stolen_io_mem_reserve_stolen(struct xe_device *xe,
>  
>  int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct
> ttm_resource *mem)
>  {
> -	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe-
> >ttm, XE_PL_STOLEN);
> -	struct xe_ttm_stolen_mgr *mgr = ttm_mgr ?
> to_stolen_mgr(ttm_mgr) : NULL;
> +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>  
>  	if (!mgr || !mgr->io_base)
>  		return -EIO;
> @@ -343,8 +326,5 @@ int xe_ttm_stolen_io_mem_reserve(struct xe_device
> *xe, struct ttm_resource *mem)
>  
>  u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe)
>  {
> -	struct xe_ttm_stolen_mgr *mgr =
> -		to_stolen_mgr(ttm_manager_type(&xe->ttm,
> XE_PL_STOLEN));
> -
> -	return mgr->stolen_base;
> +	return xe->mem.stolen_mgr->stolen_base;
>  }
> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> index 8e877d1e839b..049e91e77326 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> @@ -12,6 +12,15 @@ struct ttm_resource;
>  struct xe_bo;
>  struct xe_device;
>  
> +struct xe_ttm_stolen_mgr {
> +	/* PCI base offset */
> +	resource_size_t io_base;
> +	/* GPU base offset */
> +	resource_size_t stolen_base;
> +
> +	void __iomem *mapping;
> +};
> +
>  int xe_ttm_stolen_mgr_init(struct xe_device *xe);
>  int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct
> ttm_resource *mem);
>  bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe);
> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> index 5fd0d5506a7e..79ef8e1b5e5c 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> @@ -301,14 +301,13 @@ int __xe_ttm_vram_mgr_init(struct xe_device
> *xe, struct xe_ttm_vram_mgr *mgr,
>  			   u64 default_page_size)
>  {
>  	struct ttm_resource_manager *man = &mgr->manager;
> +	const char *name;
>  	int err;
>  
> -	if (mem_type != XE_PL_STOLEN) {
> -		const char *name = mem_type == XE_PL_VRAM0 ? "vram0"
> : "vram1";
> -		man->cg = drmm_cgroup_register_region(&xe->drm,
> name, size);
> -		if (IS_ERR(man->cg))
> -			return PTR_ERR(man->cg);
> -	}
> +	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
> +	man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
> +	if (IS_ERR(man->cg))
> +		return PTR_ERR(man->cg);
>  
>  	man->func = &xe_ttm_vram_mgr_func;
>  	mgr->mem_type = mem_type;


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager
  2026-04-16 16:22   ` Matthew Auld
@ 2026-04-20 12:20     ` Yadav, Sanjay Kumar
  2026-04-20 13:16     ` Govindapillai, Vinod
  1 sibling, 0 replies; 7+ messages in thread
From: Yadav, Sanjay Kumar @ 2026-04-20 12:20 UTC (permalink / raw)
  To: Matthew Auld, Maarten Lankhorst, intel-xe; +Cc: Matthew Brost


On 16-04-2026 21:52, Matthew Auld wrote:
> On 16/04/2026 16:33, Maarten Lankhorst wrote:
>> Hey,
>>
>> Den 2026-04-16 kl. 17:19, skrev Sanjay Yadav:
>>> Stolen memory requires physically contiguous allocations for display
>>> scanout and compressed framebuffers. The stolen memory manager was
>>> sharing the gpu_buddy allocator backend with the VRAM manager, but
>>> buddy manages non-contiguous power-of-two blocks making it a poor fit.
>>> Stolen memory also has fundamentally different allocation patterns:
>>>
>>> - Allocation sizes are not power-of-two. Since buddy rounds up to the
>>>    next power-of-two block size, a ~17MB request can fail even with
>>>    ~22MB free, because the free space is fragmented across non-fitting
>>>    power-of-two blocks.
>>> - Hardware restrictions prevent using the first 4K page of stolen for
>>>    certain allocations (e.g., FBC). The display code sets fpfn=1 to
>>>    enforce this, but when fpfn != 0, gpu_buddy enables
>>>    GPU_BUDDY_RANGE_ALLOCATION mode which disables the try_harder
>>>    coalescing path, further reducing allocation success.
>>>
>>> This combination caused FBC compressed framebuffer (CFB) allocation
>>> failures on platforms like NVL/PTL. In case of NVL where stolen memory
>>> is ~56MB and the initial plane framebuffer consumes ~34MB at probe 
>>> time,
>>> leaving ~22MB for subsequent allocations.
>>>
>>> Use ttm_range_man_init_nocheck() to set up a drm_mm-backed TTM resource
>>> manager for stolen memory. This reuses the TTM core's ttm_range_manager
>>> callbacks, avoiding duplicate implementations.
>>>
>>> Tested on NVL with a 4K DP display: stolen_mm shows a single ~22MB
>>> contiguous free hole after initial plane framebuffer allocation, and
>>> FBC successfully allocates its CFB from that region. The corresponding
>>> IGT was previously skipped and now passes.
>>>
>>> v2:
>>> - Clarify that stolen memory requires contiguous allocations (Matt B)
>>> - Properly handle xe_ttm_resource_visible() for stolen instead of
>>>    unconditionally returning true (Matt A)
>>>
>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7631
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> Suggested-by: Matthew Auld <matthew.auld@intel.com>
>>> Assisted-by: GitHub Copilot:claude-sonnet-4.6
>>> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
>>> ---
>> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>
>> The changes look good, I never considered the differences between the
>> buddy allocator and range manager. Glad to see this works a lot better!
>>
>> One thing I did notice that in most cases, is there any reason for the
>> range manager to use pages? It should be able to use bytes and remove
>> a lot of shifts to convert between bytes and pages in the code.
>
> Ah, that reminds me of: dfd1d2f7eb39 ("drm/xe/fbdev: Fix BIOS FB vs. 
> stolen size check")
>
> That would need to be converted back to pages, I think.
Thanks Maarten for the review Ack, Matt A for the discussion.
>
>>
>>>   drivers/gpu/drm/xe/xe_bo.c | 16 +++++--
>>>   drivers/gpu/drm/xe/xe_device_types.h   |  3 ++
>>>   drivers/gpu/drm/xe/xe_res_cursor.h     | 14 +++++-
>>>   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 64 
>>> +++++++++-----------------
>>>   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h |  9 ++++
>>>   drivers/gpu/drm/xe/xe_ttm_vram_mgr.c   | 11 ++---
>>>   6 files changed, 63 insertions(+), 54 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>>> index a7c2dc7f224c..b737edf04d60 100644
>>> --- a/drivers/gpu/drm/xe/xe_bo.c
>>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>>> @@ -599,11 +599,17 @@ static void xe_ttm_tt_destroy(struct 
>>> ttm_device *ttm_dev, struct ttm_tt *tt)
>>>       kfree(tt);
>>>   }
>>>   -static bool xe_ttm_resource_visible(struct ttm_resource *mem)
>>> +static bool xe_ttm_resource_visible(struct xe_device *xe, struct 
>>> ttm_resource *mem)
>>>   {
>>> -    struct xe_ttm_vram_mgr_resource *vres =
>>> -        to_xe_ttm_vram_mgr_resource(mem);
>>> +    struct xe_ttm_vram_mgr_resource *vres;
>>>   +    if (mem->mem_type == XE_PL_STOLEN) {
>>> +        struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>>> +
>>> +        return mgr->io_base && 
>>> !xe_ttm_stolen_cpu_access_needs_ggtt(xe);
>>> +    }
>>> +
>>> +    vres = to_xe_ttm_vram_mgr_resource(mem);
>>>       return vres->used_visible_size == mem->size;
>>>   }
>>>   @@ -621,7 +627,7 @@ bool xe_bo_is_visible_vram(struct xe_bo *bo)
>>>       if (drm_WARN_ON(bo->ttm.base.dev, !xe_bo_is_vram(bo)))
>>>           return false;
>>>   -    return xe_ttm_resource_visible(bo->ttm.resource);
>>> +    return xe_ttm_resource_visible(xe_bo_device(bo), 
>>> bo->ttm.resource);
>>>   }
>>>     static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
>>> @@ -637,7 +643,7 @@ static int xe_ttm_io_mem_reserve(struct 
>>> ttm_device *bdev,
>>>       case XE_PL_VRAM1: {
>>>           struct xe_vram_region *vram = res_to_mem_region(mem);
>>>   -        if (!xe_ttm_resource_visible(mem))
>>> +        if (!xe_ttm_resource_visible(xe, mem))
>>>               return -EINVAL;
>>>             mem->bus.offset = mem->start << PAGE_SHIFT;
>>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
>>> b/drivers/gpu/drm/xe/xe_device_types.h
>>> index 150c76b2acaf..ffb84659c413 100644
>>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>>> @@ -42,6 +42,7 @@ struct xe_ggtt;
>>>   struct xe_i2c;
>>>   struct xe_pat_ops;
>>>   struct xe_pxp;
>>> +struct xe_ttm_stolen_mgr;
>>>   struct xe_vram_region;
>>>     /**
>>> @@ -278,6 +279,8 @@ struct xe_device {
>>>           struct ttm_resource_manager sys_mgr;
>>>           /** @mem.sys_mgr: system memory shrinker. */
>>>           struct xe_shrinker *shrinker;
>>> +        /** @mem.stolen_mgr: stolen memory manager. */
>>> +        struct xe_ttm_stolen_mgr *stolen_mgr;
>>>       } mem;
>>>         /** @sriov: device level virtualization data */
>>> diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h 
>>> b/drivers/gpu/drm/xe/xe_res_cursor.h
>>> index 5f4ab08c0686..0522caafd89d 100644
>>> --- a/drivers/gpu/drm/xe/xe_res_cursor.h
>>> +++ b/drivers/gpu/drm/xe/xe_res_cursor.h
>>> @@ -101,7 +101,15 @@ static inline void xe_res_first(struct 
>>> ttm_resource *res,
>>>       cur->mem_type = res->mem_type;
>>>         switch (cur->mem_type) {
>>> -    case XE_PL_STOLEN:
>>> +    case XE_PL_STOLEN: {
>>> +        /* res->start is in pages (ttm_range_manager). */
>>> +        cur->start = (res->start << PAGE_SHIFT) + start;
>>> +        cur->size = size;
>>> +        cur->remaining = size;
>>> +        cur->node = NULL;
>>> +        cur->mm = NULL;
>>> +        break;
>>> +    }
>>>       case XE_PL_VRAM0:
>>>       case XE_PL_VRAM1: {
>>>           struct gpu_buddy_block *block;
>>> @@ -289,6 +297,10 @@ static inline void xe_res_next(struct 
>>> xe_res_cursor *cur, u64 size)
>>>         switch (cur->mem_type) {
>>>       case XE_PL_STOLEN:
>>> +        /* Just advance within the contiguous region. */
>>> +        cur->start += size;
>>> +        cur->size = cur->remaining;
>>> +        break;
>>>       case XE_PL_VRAM0:
>>>       case XE_PL_VRAM1:
>>>           start = size - cur->size;
>>> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c 
>>> b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>>> index 27c9d72222cf..5e9070739e65 100644
>>> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>>> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>>> @@ -19,30 +19,11 @@
>>>   #include "xe_device.h"
>>>   #include "xe_gt_printk.h"
>>>   #include "xe_mmio.h"
>>> -#include "xe_res_cursor.h"
>>>   #include "xe_sriov.h"
>>>   #include "xe_ttm_stolen_mgr.h"
>>> -#include "xe_ttm_vram_mgr.h"
>>>   #include "xe_vram.h"
>>>   #include "xe_wa.h"
>>>   -struct xe_ttm_stolen_mgr {
>>> -    struct xe_ttm_vram_mgr base;
>>> -
>>> -    /* PCI base offset */
>>> -    resource_size_t io_base;
>>> -    /* GPU base offset */
>>> -    resource_size_t stolen_base;
>>> -
>>> -    void __iomem *mapping;
>>> -};
>>> -
>>> -static inline struct xe_ttm_stolen_mgr *
>>> -to_stolen_mgr(struct ttm_resource_manager *man)
>>> -{
>>> -    return container_of(man, struct xe_ttm_stolen_mgr, base.manager);
>>> -}
>>> -
>>>   /**
>>>    * xe_ttm_stolen_cpu_access_needs_ggtt() - If we can't directly 
>>> CPU access
>>>    * stolen, can we then fallback to mapping through the GGTT.
>>> @@ -210,12 +191,19 @@ static u64 detect_stolen(struct xe_device *xe, 
>>> struct xe_ttm_stolen_mgr *mgr)
>>>   #endif
>>>   }
>>>   +static void xe_ttm_stolen_mgr_fini(struct drm_device *dev, void 
>>> *arg)
>>> +{
>>> +    struct xe_device *xe = to_xe_device(dev);
>>> +
>>> +    ttm_range_man_fini_nocheck(&xe->ttm, XE_PL_STOLEN);
>>> +}
>>> +
>>>   int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>>>   {
>>>       struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>>>       struct xe_ttm_stolen_mgr *mgr;
>>>       u64 stolen_size, io_size;
>>> -    int err;
>>> +    int ret;
>>>         mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
>>>       if (!mgr)
>>> @@ -244,12 +232,12 @@ int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>>>       if (mgr->io_base && !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>>>           io_size = stolen_size;
>>>   -    err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, 
>>> stolen_size,
>>> -                     io_size, PAGE_SIZE);
>>> -    if (err) {
>>> -        drm_dbg_kms(&xe->drm, "Stolen mgr init failed: %i\n", err);
>>> -        return err;
>>> -    }
>>> +    ret = ttm_range_man_init_nocheck(&xe->ttm, XE_PL_STOLEN, false,
>>> +                     stolen_size >> PAGE_SHIFT);
>>> +    if (ret)
>>> +        return ret;
>>> +
>>> +    xe->mem.stolen_mgr = mgr;
>>>         drm_dbg_kms(&xe->drm, "Initialized stolen memory support 
>>> with %llu bytes\n",
>>>               stolen_size);
>>> @@ -257,36 +245,32 @@ int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>>>       if (io_size)
>>>           mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, 
>>> io_size);
>>>   -    return 0;
>>> +    return drmm_add_action_or_reset(&xe->drm, 
>>> xe_ttm_stolen_mgr_fini, mgr);
>>>   }
>>>     u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
>>>   {
>>>       struct xe_device *xe = xe_bo_device(bo);
>>> -    struct ttm_resource_manager *ttm_mgr = 
>>> ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
>>> -    struct xe_ttm_stolen_mgr *mgr = to_stolen_mgr(ttm_mgr);
>>> -    struct xe_res_cursor cur;
>>> +    struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>>>         XE_WARN_ON(!mgr->io_base);
>>>         if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>>>           return mgr->io_base + xe_bo_ggtt_addr(bo) + offset;
>>>   -    xe_res_first(bo->ttm.resource, offset, 4096, &cur);
>>> -    return mgr->io_base + cur.start;
>>> +    /* Range allocator: res->start is in pages. */
>>> +    return mgr->io_base + (bo->ttm.resource->start << PAGE_SHIFT) + 
>>> offset;
>>>   }
>>>     static int __xe_ttm_stolen_io_mem_reserve_bar2(struct xe_device 
>>> *xe,
>>>                              struct xe_ttm_stolen_mgr *mgr,
>>>                              struct ttm_resource *mem)
>>>   {
>>> -    struct xe_res_cursor cur;
>>> -
>>>       if (!mgr->io_base)
>>>           return -EIO;
>>>   -    xe_res_first(mem, 0, 4096, &cur);
>>> -    mem->bus.offset = cur.start;
>>> +    /* Range allocator always produces contiguous allocations. */
>>> +    mem->bus.offset = mem->start << PAGE_SHIFT;
>>>         drm_WARN_ON(&xe->drm, !(mem->placement & 
>>> TTM_PL_FLAG_CONTIGUOUS));
>>>   @@ -329,8 +313,7 @@ static int 
>>> __xe_ttm_stolen_io_mem_reserve_stolen(struct xe_device *xe,
>>>     int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct 
>>> ttm_resource *mem)
>>>   {
>>> -    struct ttm_resource_manager *ttm_mgr = 
>>> ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
>>> -    struct xe_ttm_stolen_mgr *mgr = ttm_mgr ? 
>>> to_stolen_mgr(ttm_mgr) : NULL;
>>> +    struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>>>         if (!mgr || !mgr->io_base)
>>>           return -EIO;
>>> @@ -343,8 +326,5 @@ int xe_ttm_stolen_io_mem_reserve(struct 
>>> xe_device *xe, struct ttm_resource *mem)
>>>     u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe)
>>>   {
>>> -    struct xe_ttm_stolen_mgr *mgr =
>>> -        to_stolen_mgr(ttm_manager_type(&xe->ttm, XE_PL_STOLEN));
>>> -
>>> -    return mgr->stolen_base;
>>> +    return xe->mem.stolen_mgr->stolen_base;
>>>   }
>>> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h 
>>> b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>>> index 8e877d1e839b..049e91e77326 100644
>>> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>>> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>>> @@ -12,6 +12,15 @@ struct ttm_resource;
>>>   struct xe_bo;
>>>   struct xe_device;
>>>   +struct xe_ttm_stolen_mgr {
>>> +    /* PCI base offset */
>>> +    resource_size_t io_base;
>>> +    /* GPU base offset */
>>> +    resource_size_t stolen_base;
>>> +
>>> +    void __iomem *mapping;
>>> +};
>>> +
>>>   int xe_ttm_stolen_mgr_init(struct xe_device *xe);
>>>   int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct 
>>> ttm_resource *mem);
>>>   bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe);
>>> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c 
>>> b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>>> index 5fd0d5506a7e..79ef8e1b5e5c 100644
>>> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>>> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>>> @@ -301,14 +301,13 @@ int __xe_ttm_vram_mgr_init(struct xe_device 
>>> *xe, struct xe_ttm_vram_mgr *mgr,
>>>                  u64 default_page_size)
>>>   {
>>>       struct ttm_resource_manager *man = &mgr->manager;
>>> +    const char *name;
>>>       int err;
>>>   -    if (mem_type != XE_PL_STOLEN) {
>>> -        const char *name = mem_type == XE_PL_VRAM0 ? "vram0" : 
>>> "vram1";
>>> -        man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
>>> -        if (IS_ERR(man->cg))
>>> -            return PTR_ERR(man->cg);
>>> -    }
>>> +    name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
>>> +    man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
>>> +    if (IS_ERR(man->cg))
>>> +        return PTR_ERR(man->cg);
>>>         man->func = &xe_ttm_vram_mgr_func;
>>>       mgr->mem_type = mem_type;
>>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager
  2026-04-20  9:36 ` Govindapillai, Vinod
@ 2026-04-20 12:20   ` Yadav, Sanjay Kumar
  0 siblings, 0 replies; 7+ messages in thread
From: Yadav, Sanjay Kumar @ 2026-04-20 12:20 UTC (permalink / raw)
  To: Govindapillai, Vinod, intel-xe@lists.freedesktop.org
  Cc: Brost, Matthew, maarten.lankhorst@linux.intel.com, Auld, Matthew


On 20-04-2026 15:06, Govindapillai, Vinod wrote:
> On Thu, 2026-04-16 at 20:49 +0530, Sanjay Yadav wrote:
>> Stolen memory requires physically contiguous allocations for display
>> scanout and compressed framebuffers. The stolen memory manager was
>> sharing the gpu_buddy allocator backend with the VRAM manager, but
>> buddy manages non-contiguous power-of-two blocks making it a poor
>> fit.
>> Stolen memory also has fundamentally different allocation patterns:
>>
>> - Allocation sizes are not power-of-two. Since buddy rounds up to the
>>    next power-of-two block size, a ~17MB request can fail even with
>>    ~22MB free, because the free space is fragmented across non-fitting
>>    power-of-two blocks.
>> - Hardware restrictions prevent using the first 4K page of stolen for
>>    certain allocations (e.g., FBC). The display code sets fpfn=1 to
>>    enforce this, but when fpfn != 0, gpu_buddy enables
>>    GPU_BUDDY_RANGE_ALLOCATION mode which disables the try_harder
>>    coalescing path, further reducing allocation success.
>>
>> This combination caused FBC compressed framebuffer (CFB) allocation
>> failures on platforms like NVL/PTL. In case of NVL where stolen
>> memory
>> is ~56MB and the initial plane framebuffer consumes ~34MB at probe
>> time,
>> leaving ~22MB for subsequent allocations.
>>
>> Use ttm_range_man_init_nocheck() to set up a drm_mm-backed TTM
>> resource
>> manager for stolen memory. This reuses the TTM core's
>> ttm_range_manager
>> callbacks, avoiding duplicate implementations.
>>
>> Tested on NVL with a 4K DP display: stolen_mm shows a single ~22MB
>> contiguous free hole after initial plane framebuffer allocation, and
>> FBC successfully allocates its CFB from that region. The
>> corresponding
>> IGT was previously skipped and now passes.
>>
>> v2:
>> - Clarify that stolen memory requires contiguous allocations (Matt B)
>> - Properly handle xe_ttm_resource_visible() for stolen instead of
>>    unconditionally returning true (Matt A)
>>
>> Closes:
>> https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7631
>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Suggested-by: Matthew Auld <matthew.auld@intel.com>
>> Assisted-by: GitHub Copilot:claude-sonnet-4.6
>> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_bo.c             | 16 +++++--
>>   drivers/gpu/drm/xe/xe_device_types.h   |  3 ++
>>   drivers/gpu/drm/xe/xe_res_cursor.h     | 14 +++++-
>>   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 64 +++++++++---------------
>> --
>>   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h |  9 ++++
>>   drivers/gpu/drm/xe/xe_ttm_vram_mgr.c   | 11 ++---
>>   6 files changed, 63 insertions(+), 54 deletions(-)
>>
> FP16 stolen memory allocations are working ~35mb
>
> Acked-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Thanks Vinod for the confirmation!
>
>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>> index a7c2dc7f224c..b737edf04d60 100644
>> --- a/drivers/gpu/drm/xe/xe_bo.c
>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>> @@ -599,11 +599,17 @@ static void xe_ttm_tt_destroy(struct ttm_device
>> *ttm_dev, struct ttm_tt *tt)
>>   	kfree(tt);
>>   }
>>   
>> -static bool xe_ttm_resource_visible(struct ttm_resource *mem)
>> +static bool xe_ttm_resource_visible(struct xe_device *xe, struct
>> ttm_resource *mem)
>>   {
>> -	struct xe_ttm_vram_mgr_resource *vres =
>> -		to_xe_ttm_vram_mgr_resource(mem);
>> +	struct xe_ttm_vram_mgr_resource *vres;
>>   
>> +	if (mem->mem_type == XE_PL_STOLEN) {
>> +		struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>> +
>> +		return mgr->io_base &&
>> !xe_ttm_stolen_cpu_access_needs_ggtt(xe);
>> +	}
>> +
>> +	vres = to_xe_ttm_vram_mgr_resource(mem);
>>   	return vres->used_visible_size == mem->size;
>>   }
>>   
>> @@ -621,7 +627,7 @@ bool xe_bo_is_visible_vram(struct xe_bo *bo)
>>   	if (drm_WARN_ON(bo->ttm.base.dev, !xe_bo_is_vram(bo)))
>>   		return false;
>>   
>> -	return xe_ttm_resource_visible(bo->ttm.resource);
>> +	return xe_ttm_resource_visible(xe_bo_device(bo), bo-
>>> ttm.resource);
>>   }
>>   
>>   static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
>> @@ -637,7 +643,7 @@ static int xe_ttm_io_mem_reserve(struct
>> ttm_device *bdev,
>>   	case XE_PL_VRAM1: {
>>   		struct xe_vram_region *vram =
>> res_to_mem_region(mem);
>>   
>> -		if (!xe_ttm_resource_visible(mem))
>> +		if (!xe_ttm_resource_visible(xe, mem))
>>   			return -EINVAL;
>>   
>>   		mem->bus.offset = mem->start << PAGE_SHIFT;
>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h
>> b/drivers/gpu/drm/xe/xe_device_types.h
>> index 150c76b2acaf..ffb84659c413 100644
>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> @@ -42,6 +42,7 @@ struct xe_ggtt;
>>   struct xe_i2c;
>>   struct xe_pat_ops;
>>   struct xe_pxp;
>> +struct xe_ttm_stolen_mgr;
>>   struct xe_vram_region;
>>   
>>   /**
>> @@ -278,6 +279,8 @@ struct xe_device {
>>   		struct ttm_resource_manager sys_mgr;
>>   		/** @mem.sys_mgr: system memory shrinker. */
>>   		struct xe_shrinker *shrinker;
>> +		/** @mem.stolen_mgr: stolen memory manager. */
>> +		struct xe_ttm_stolen_mgr *stolen_mgr;
>>   	} mem;
>>   
>>   	/** @sriov: device level virtualization data */
>> diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h
>> b/drivers/gpu/drm/xe/xe_res_cursor.h
>> index 5f4ab08c0686..0522caafd89d 100644
>> --- a/drivers/gpu/drm/xe/xe_res_cursor.h
>> +++ b/drivers/gpu/drm/xe/xe_res_cursor.h
>> @@ -101,7 +101,15 @@ static inline void xe_res_first(struct
>> ttm_resource *res,
>>   	cur->mem_type = res->mem_type;
>>   
>>   	switch (cur->mem_type) {
>> -	case XE_PL_STOLEN:
>> +	case XE_PL_STOLEN: {
>> +		/* res->start is in pages (ttm_range_manager). */
>> +		cur->start = (res->start << PAGE_SHIFT) + start;
>> +		cur->size = size;
>> +		cur->remaining = size;
>> +		cur->node = NULL;
>> +		cur->mm = NULL;
>> +		break;
>> +	}
>>   	case XE_PL_VRAM0:
>>   	case XE_PL_VRAM1: {
>>   		struct gpu_buddy_block *block;
>> @@ -289,6 +297,10 @@ static inline void xe_res_next(struct
>> xe_res_cursor *cur, u64 size)
>>   
>>   	switch (cur->mem_type) {
>>   	case XE_PL_STOLEN:
>> +		/* Just advance within the contiguous region. */
>> +		cur->start += size;
>> +		cur->size = cur->remaining;
>> +		break;
>>   	case XE_PL_VRAM0:
>>   	case XE_PL_VRAM1:
>>   		start = size - cur->size;
>> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> index 27c9d72222cf..5e9070739e65 100644
>> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> @@ -19,30 +19,11 @@
>>   #include "xe_device.h"
>>   #include "xe_gt_printk.h"
>>   #include "xe_mmio.h"
>> -#include "xe_res_cursor.h"
>>   #include "xe_sriov.h"
>>   #include "xe_ttm_stolen_mgr.h"
>> -#include "xe_ttm_vram_mgr.h"
>>   #include "xe_vram.h"
>>   #include "xe_wa.h"
>>   
>> -struct xe_ttm_stolen_mgr {
>> -	struct xe_ttm_vram_mgr base;
>> -
>> -	/* PCI base offset */
>> -	resource_size_t io_base;
>> -	/* GPU base offset */
>> -	resource_size_t stolen_base;
>> -
>> -	void __iomem *mapping;
>> -};
>> -
>> -static inline struct xe_ttm_stolen_mgr *
>> -to_stolen_mgr(struct ttm_resource_manager *man)
>> -{
>> -	return container_of(man, struct xe_ttm_stolen_mgr,
>> base.manager);
>> -}
>> -
>>   /**
>>    * xe_ttm_stolen_cpu_access_needs_ggtt() - If we can't directly CPU
>> access
>>    * stolen, can we then fallback to mapping through the GGTT.
>> @@ -210,12 +191,19 @@ static u64 detect_stolen(struct xe_device *xe,
>> struct xe_ttm_stolen_mgr *mgr)
>>   #endif
>>   }
>>   
>> +static void xe_ttm_stolen_mgr_fini(struct drm_device *dev, void
>> *arg)
>> +{
>> +	struct xe_device *xe = to_xe_device(dev);
>> +
>> +	ttm_range_man_fini_nocheck(&xe->ttm, XE_PL_STOLEN);
>> +}
>> +
>>   int xe_ttm_stolen_mgr_init(struct xe_device *xe)
>>   {
>>   	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>>   	struct xe_ttm_stolen_mgr *mgr;
>>   	u64 stolen_size, io_size;
>> -	int err;
>> +	int ret;
>>   
>>   	mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
>>   	if (!mgr)
>> @@ -244,12 +232,12 @@ int xe_ttm_stolen_mgr_init(struct xe_device
>> *xe)
>>   	if (mgr->io_base &&
>> !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>>   		io_size = stolen_size;
>>   
>> -	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN,
>> stolen_size,
>> -				     io_size, PAGE_SIZE);
>> -	if (err) {
>> -		drm_dbg_kms(&xe->drm, "Stolen mgr init failed:
>> %i\n", err);
>> -		return err;
>> -	}
>> +	ret = ttm_range_man_init_nocheck(&xe->ttm, XE_PL_STOLEN,
>> false,
>> +					 stolen_size >> PAGE_SHIFT);
>> +	if (ret)
>> +		return ret;
>> +
>> +	xe->mem.stolen_mgr = mgr;
>>   
>>   	drm_dbg_kms(&xe->drm, "Initialized stolen memory support
>> with %llu bytes\n",
>>   		    stolen_size);
>> @@ -257,36 +245,32 @@ int xe_ttm_stolen_mgr_init(struct xe_device
>> *xe)
>>   	if (io_size)
>>   		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr-
>>> io_base, io_size);
>>   
>> -	return 0;
>> +	return drmm_add_action_or_reset(&xe->drm,
>> xe_ttm_stolen_mgr_fini, mgr);
>>   }
>>   
>>   u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
>>   {
>>   	struct xe_device *xe = xe_bo_device(bo);
>> -	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe-
>>> ttm, XE_PL_STOLEN);
>> -	struct xe_ttm_stolen_mgr *mgr = to_stolen_mgr(ttm_mgr);
>> -	struct xe_res_cursor cur;
>> +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>>   
>>   	XE_WARN_ON(!mgr->io_base);
>>   
>>   	if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
>>   		return mgr->io_base + xe_bo_ggtt_addr(bo) + offset;
>>   
>> -	xe_res_first(bo->ttm.resource, offset, 4096, &cur);
>> -	return mgr->io_base + cur.start;
>> +	/* Range allocator: res->start is in pages. */
>> +	return mgr->io_base + (bo->ttm.resource->start <<
>> PAGE_SHIFT) + offset;
>>   }
>>   
>>   static int __xe_ttm_stolen_io_mem_reserve_bar2(struct xe_device *xe,
>>   					       struct
>> xe_ttm_stolen_mgr *mgr,
>>   					       struct ttm_resource
>> *mem)
>>   {
>> -	struct xe_res_cursor cur;
>> -
>>   	if (!mgr->io_base)
>>   		return -EIO;
>>   
>> -	xe_res_first(mem, 0, 4096, &cur);
>> -	mem->bus.offset = cur.start;
>> +	/* Range allocator always produces contiguous allocations.
>> */
>> +	mem->bus.offset = mem->start << PAGE_SHIFT;
>>   
>>   	drm_WARN_ON(&xe->drm, !(mem->placement &
>> TTM_PL_FLAG_CONTIGUOUS));
>>   
>> @@ -329,8 +313,7 @@ static int
>> __xe_ttm_stolen_io_mem_reserve_stolen(struct xe_device *xe,
>>   
>>   int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct
>> ttm_resource *mem)
>>   {
>> -	struct ttm_resource_manager *ttm_mgr = ttm_manager_type(&xe-
>>> ttm, XE_PL_STOLEN);
>> -	struct xe_ttm_stolen_mgr *mgr = ttm_mgr ?
>> to_stolen_mgr(ttm_mgr) : NULL;
>> +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
>>   
>>   	if (!mgr || !mgr->io_base)
>>   		return -EIO;
>> @@ -343,8 +326,5 @@ int xe_ttm_stolen_io_mem_reserve(struct xe_device
>> *xe, struct ttm_resource *mem)
>>   
>>   u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe)
>>   {
>> -	struct xe_ttm_stolen_mgr *mgr =
>> -		to_stolen_mgr(ttm_manager_type(&xe->ttm,
>> XE_PL_STOLEN));
>> -
>> -	return mgr->stolen_base;
>> +	return xe->mem.stolen_mgr->stolen_base;
>>   }
>> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>> b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>> index 8e877d1e839b..049e91e77326 100644
>> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
>> @@ -12,6 +12,15 @@ struct ttm_resource;
>>   struct xe_bo;
>>   struct xe_device;
>>   
>> +struct xe_ttm_stolen_mgr {
>> +	/* PCI base offset */
>> +	resource_size_t io_base;
>> +	/* GPU base offset */
>> +	resource_size_t stolen_base;
>> +
>> +	void __iomem *mapping;
>> +};
>> +
>>   int xe_ttm_stolen_mgr_init(struct xe_device *xe);
>>   int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct
>> ttm_resource *mem);
>>   bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe);
>> diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>> b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>> index 5fd0d5506a7e..79ef8e1b5e5c 100644
>> --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>> +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
>> @@ -301,14 +301,13 @@ int __xe_ttm_vram_mgr_init(struct xe_device
>> *xe, struct xe_ttm_vram_mgr *mgr,
>>   			   u64 default_page_size)
>>   {
>>   	struct ttm_resource_manager *man = &mgr->manager;
>> +	const char *name;
>>   	int err;
>>   
>> -	if (mem_type != XE_PL_STOLEN) {
>> -		const char *name = mem_type == XE_PL_VRAM0 ? "vram0"
>> : "vram1";
>> -		man->cg = drmm_cgroup_register_region(&xe->drm,
>> name, size);
>> -		if (IS_ERR(man->cg))
>> -			return PTR_ERR(man->cg);
>> -	}
>> +	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
>> +	man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
>> +	if (IS_ERR(man->cg))
>> +		return PTR_ERR(man->cg);
>>   
>>   	man->func = &xe_ttm_vram_mgr_func;
>>   	mgr->mem_type = mem_type;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager
  2026-04-16 16:22   ` Matthew Auld
  2026-04-20 12:20     ` Yadav, Sanjay Kumar
@ 2026-04-20 13:16     ` Govindapillai, Vinod
  1 sibling, 0 replies; 7+ messages in thread
From: Govindapillai, Vinod @ 2026-04-20 13:16 UTC (permalink / raw)
  To: intel-xe@lists.freedesktop.org, maarten.lankhorst@linux.intel.com,
	Auld, Matthew, Yadav, Sanjay Kumar
  Cc: Brost, Matthew

On Thu, 2026-04-16 at 17:22 +0100, Matthew Auld wrote:
> On 16/04/2026 16:33, Maarten Lankhorst wrote:
> > Hey,
> > 
> > Den 2026-04-16 kl. 17:19, skrev Sanjay Yadav:
> > > Stolen memory requires physically contiguous allocations for
> > > display
> > > scanout and compressed framebuffers. The stolen memory manager
> > > was
> > > sharing the gpu_buddy allocator backend with the VRAM manager,
> > > but
> > > buddy manages non-contiguous power-of-two blocks making it a poor
> > > fit.
> > > Stolen memory also has fundamentally different allocation
> > > patterns:
> > > 
> > > - Allocation sizes are not power-of-two. Since buddy rounds up to
> > > the
> > >    next power-of-two block size, a ~17MB request can fail even
> > > with
> > >    ~22MB free, because the free space is fragmented across non-
> > > fitting
> > >    power-of-two blocks.
> > > - Hardware restrictions prevent using the first 4K page of stolen
> > > for
> > >    certain allocations (e.g., FBC). The display code sets fpfn=1
> > > to
> > >    enforce this, but when fpfn != 0, gpu_buddy enables
> > >    GPU_BUDDY_RANGE_ALLOCATION mode which disables the try_harder
> > >    coalescing path, further reducing allocation success.
> > > 
> > > This combination caused FBC compressed framebuffer (CFB)
> > > allocation
> > > failures on platforms like NVL/PTL. In case of NVL where stolen
> > > memory
> > > is ~56MB and the initial plane framebuffer consumes ~34MB at
> > > probe time,
> > > leaving ~22MB for subsequent allocations.
> > > 
> > > Use ttm_range_man_init_nocheck() to set up a drm_mm-backed TTM
> > > resource
> > > manager for stolen memory. This reuses the TTM core's
> > > ttm_range_manager
> > > callbacks, avoiding duplicate implementations.
> > > 
> > > Tested on NVL with a 4K DP display: stolen_mm shows a single
> > > ~22MB
> > > contiguous free hole after initial plane framebuffer allocation,
> > > and
> > > FBC successfully allocates its CFB from that region. The
> > > corresponding
> > > IGT was previously skipped and now passes.
> > > 
> > > v2:
> > > - Clarify that stolen memory requires contiguous allocations
> > > (Matt B)
> > > - Properly handle xe_ttm_resource_visible() for stolen instead of
> > >    unconditionally returning true (Matt A)
> > > 
> > > Closes:
> > > https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7631
> > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > Suggested-by: Matthew Auld <matthew.auld@intel.com>
> > > Assisted-by: GitHub Copilot:claude-sonnet-4.6
> > > Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
> > > ---
> > Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > 
> > The changes look good, I never considered the differences between
> > the
> > buddy allocator and range manager. Glad to see this works a lot
> > better!
> > 
> > One thing I did notice that in most cases, is there any reason for
> > the
> > range manager to use pages? It should be able to use bytes and
> > remove
> > a lot of shifts to convert between bytes and pages in the code.
> 
> Ah, that reminds me of: dfd1d2f7eb39 ("drm/xe/fbdev: Fix BIOS FB vs. 
> stolen size check")
> 
> That would need to be converted back to pages, I think.
> 

Ok. thanks for the information. I will update that.

Btw, the buddy allocator size was in bytes! Now I wonder if it would
make sense to have an uniform behavior in this case! 

BR
Vinod

> > 
> > >   drivers/gpu/drm/xe/xe_bo.c             | 16 +++++--
> > >   drivers/gpu/drm/xe/xe_device_types.h   |  3 ++
> > >   drivers/gpu/drm/xe/xe_res_cursor.h     | 14 +++++-
> > >   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 64 +++++++++----------
> > > -------
> > >   drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h |  9 ++++
> > >   drivers/gpu/drm/xe/xe_ttm_vram_mgr.c   | 11 ++---
> > >   6 files changed, 63 insertions(+), 54 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.c
> > > b/drivers/gpu/drm/xe/xe_bo.c
> > > index a7c2dc7f224c..b737edf04d60 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > @@ -599,11 +599,17 @@ static void xe_ttm_tt_destroy(struct
> > > ttm_device *ttm_dev, struct ttm_tt *tt)
> > >   	kfree(tt);
> > >   }
> > >   
> > > -static bool xe_ttm_resource_visible(struct ttm_resource *mem)
> > > +static bool xe_ttm_resource_visible(struct xe_device *xe, struct
> > > ttm_resource *mem)
> > >   {
> > > -	struct xe_ttm_vram_mgr_resource *vres =
> > > -		to_xe_ttm_vram_mgr_resource(mem);
> > > +	struct xe_ttm_vram_mgr_resource *vres;
> > >   
> > > +	if (mem->mem_type == XE_PL_STOLEN) {
> > > +		struct xe_ttm_stolen_mgr *mgr = xe-
> > > >mem.stolen_mgr;
> > > +
> > > +		return mgr->io_base &&
> > > !xe_ttm_stolen_cpu_access_needs_ggtt(xe);
> > > +	}
> > > +
> > > +	vres = to_xe_ttm_vram_mgr_resource(mem);
> > >   	return vres->used_visible_size == mem->size;
> > >   }
> > >   
> > > @@ -621,7 +627,7 @@ bool xe_bo_is_visible_vram(struct xe_bo *bo)
> > >   	if (drm_WARN_ON(bo->ttm.base.dev, !xe_bo_is_vram(bo)))
> > >   		return false;
> > >   
> > > -	return xe_ttm_resource_visible(bo->ttm.resource);
> > > +	return xe_ttm_resource_visible(xe_bo_device(bo), bo-
> > > >ttm.resource);
> > >   }
> > >   
> > >   static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
> > > @@ -637,7 +643,7 @@ static int xe_ttm_io_mem_reserve(struct
> > > ttm_device *bdev,
> > >   	case XE_PL_VRAM1: {
> > >   		struct xe_vram_region *vram =
> > > res_to_mem_region(mem);
> > >   
> > > -		if (!xe_ttm_resource_visible(mem))
> > > +		if (!xe_ttm_resource_visible(xe, mem))
> > >   			return -EINVAL;
> > >   
> > >   		mem->bus.offset = mem->start << PAGE_SHIFT;
> > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h
> > > b/drivers/gpu/drm/xe/xe_device_types.h
> > > index 150c76b2acaf..ffb84659c413 100644
> > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > @@ -42,6 +42,7 @@ struct xe_ggtt;
> > >   struct xe_i2c;
> > >   struct xe_pat_ops;
> > >   struct xe_pxp;
> > > +struct xe_ttm_stolen_mgr;
> > >   struct xe_vram_region;
> > >   
> > >   /**
> > > @@ -278,6 +279,8 @@ struct xe_device {
> > >   		struct ttm_resource_manager sys_mgr;
> > >   		/** @mem.sys_mgr: system memory shrinker. */
> > >   		struct xe_shrinker *shrinker;
> > > +		/** @mem.stolen_mgr: stolen memory manager. */
> > > +		struct xe_ttm_stolen_mgr *stolen_mgr;
> > >   	} mem;
> > >   
> > >   	/** @sriov: device level virtualization data */
> > > diff --git a/drivers/gpu/drm/xe/xe_res_cursor.h
> > > b/drivers/gpu/drm/xe/xe_res_cursor.h
> > > index 5f4ab08c0686..0522caafd89d 100644
> > > --- a/drivers/gpu/drm/xe/xe_res_cursor.h
> > > +++ b/drivers/gpu/drm/xe/xe_res_cursor.h
> > > @@ -101,7 +101,15 @@ static inline void xe_res_first(struct
> > > ttm_resource *res,
> > >   	cur->mem_type = res->mem_type;
> > >   
> > >   	switch (cur->mem_type) {
> > > -	case XE_PL_STOLEN:
> > > +	case XE_PL_STOLEN: {
> > > +		/* res->start is in pages (ttm_range_manager).
> > > */
> > > +		cur->start = (res->start << PAGE_SHIFT) + start;
> > > +		cur->size = size;
> > > +		cur->remaining = size;
> > > +		cur->node = NULL;
> > > +		cur->mm = NULL;
> > > +		break;
> > > +	}
> > >   	case XE_PL_VRAM0:
> > >   	case XE_PL_VRAM1: {
> > >   		struct gpu_buddy_block *block;
> > > @@ -289,6 +297,10 @@ static inline void xe_res_next(struct
> > > xe_res_cursor *cur, u64 size)
> > >   
> > >   	switch (cur->mem_type) {
> > >   	case XE_PL_STOLEN:
> > > +		/* Just advance within the contiguous region. */
> > > +		cur->start += size;
> > > +		cur->size = cur->remaining;
> > > +		break;
> > >   	case XE_PL_VRAM0:
> > >   	case XE_PL_VRAM1:
> > >   		start = size - cur->size;
> > > diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> > > b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> > > index 27c9d72222cf..5e9070739e65 100644
> > > --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> > > +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> > > @@ -19,30 +19,11 @@
> > >   #include "xe_device.h"
> > >   #include "xe_gt_printk.h"
> > >   #include "xe_mmio.h"
> > > -#include "xe_res_cursor.h"
> > >   #include "xe_sriov.h"
> > >   #include "xe_ttm_stolen_mgr.h"
> > > -#include "xe_ttm_vram_mgr.h"
> > >   #include "xe_vram.h"
> > >   #include "xe_wa.h"
> > >   
> > > -struct xe_ttm_stolen_mgr {
> > > -	struct xe_ttm_vram_mgr base;
> > > -
> > > -	/* PCI base offset */
> > > -	resource_size_t io_base;
> > > -	/* GPU base offset */
> > > -	resource_size_t stolen_base;
> > > -
> > > -	void __iomem *mapping;
> > > -};
> > > -
> > > -static inline struct xe_ttm_stolen_mgr *
> > > -to_stolen_mgr(struct ttm_resource_manager *man)
> > > -{
> > > -	return container_of(man, struct xe_ttm_stolen_mgr,
> > > base.manager);
> > > -}
> > > -
> > >   /**
> > >    * xe_ttm_stolen_cpu_access_needs_ggtt() - If we can't directly
> > > CPU access
> > >    * stolen, can we then fallback to mapping through the GGTT.
> > > @@ -210,12 +191,19 @@ static u64 detect_stolen(struct xe_device
> > > *xe, struct xe_ttm_stolen_mgr *mgr)
> > >   #endif
> > >   }
> > >   
> > > +static void xe_ttm_stolen_mgr_fini(struct drm_device *dev, void
> > > *arg)
> > > +{
> > > +	struct xe_device *xe = to_xe_device(dev);
> > > +
> > > +	ttm_range_man_fini_nocheck(&xe->ttm, XE_PL_STOLEN);
> > > +}
> > > +
> > >   int xe_ttm_stolen_mgr_init(struct xe_device *xe)
> > >   {
> > >   	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> > >   	struct xe_ttm_stolen_mgr *mgr;
> > >   	u64 stolen_size, io_size;
> > > -	int err;
> > > +	int ret;
> > >   
> > >   	mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
> > >   	if (!mgr)
> > > @@ -244,12 +232,12 @@ int xe_ttm_stolen_mgr_init(struct xe_device
> > > *xe)
> > >   	if (mgr->io_base &&
> > > !xe_ttm_stolen_cpu_access_needs_ggtt(xe))
> > >   		io_size = stolen_size;
> > >   
> > > -	err = __xe_ttm_vram_mgr_init(xe, &mgr->base,
> > > XE_PL_STOLEN, stolen_size,
> > > -				     io_size, PAGE_SIZE);
> > > -	if (err) {
> > > -		drm_dbg_kms(&xe->drm, "Stolen mgr init failed:
> > > %i\n", err);
> > > -		return err;
> > > -	}
> > > +	ret = ttm_range_man_init_nocheck(&xe->ttm, XE_PL_STOLEN,
> > > false,
> > > +					 stolen_size >>
> > > PAGE_SHIFT);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	xe->mem.stolen_mgr = mgr;
> > >   
> > >   	drm_dbg_kms(&xe->drm, "Initialized stolen memory support
> > > with %llu bytes\n",
> > >   		    stolen_size);
> > > @@ -257,36 +245,32 @@ int xe_ttm_stolen_mgr_init(struct xe_device
> > > *xe)
> > >   	if (io_size)
> > >   		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr-
> > > >io_base, io_size);
> > >   
> > > -	return 0;
> > > +	return drmm_add_action_or_reset(&xe->drm,
> > > xe_ttm_stolen_mgr_fini, mgr);
> > >   }
> > >   
> > >   u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
> > >   {
> > >   	struct xe_device *xe = xe_bo_device(bo);
> > > -	struct ttm_resource_manager *ttm_mgr =
> > > ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
> > > -	struct xe_ttm_stolen_mgr *mgr = to_stolen_mgr(ttm_mgr);
> > > -	struct xe_res_cursor cur;
> > > +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
> > >   
> > >   	XE_WARN_ON(!mgr->io_base);
> > >   
> > >   	if (xe_ttm_stolen_cpu_access_needs_ggtt(xe))
> > >   		return mgr->io_base + xe_bo_ggtt_addr(bo) +
> > > offset;
> > >   
> > > -	xe_res_first(bo->ttm.resource, offset, 4096, &cur);
> > > -	return mgr->io_base + cur.start;
> > > +	/* Range allocator: res->start is in pages. */
> > > +	return mgr->io_base + (bo->ttm.resource->start <<
> > > PAGE_SHIFT) + offset;
> > >   }
> > >   
> > >   static int __xe_ttm_stolen_io_mem_reserve_bar2(struct xe_device
> > > *xe,
> > >   					       struct
> > > xe_ttm_stolen_mgr *mgr,
> > >   					       struct
> > > ttm_resource *mem)
> > >   {
> > > -	struct xe_res_cursor cur;
> > > -
> > >   	if (!mgr->io_base)
> > >   		return -EIO;
> > >   
> > > -	xe_res_first(mem, 0, 4096, &cur);
> > > -	mem->bus.offset = cur.start;
> > > +	/* Range allocator always produces contiguous
> > > allocations. */
> > > +	mem->bus.offset = mem->start << PAGE_SHIFT;
> > >   
> > >   	drm_WARN_ON(&xe->drm, !(mem->placement &
> > > TTM_PL_FLAG_CONTIGUOUS));
> > >   
> > > @@ -329,8 +313,7 @@ static int
> > > __xe_ttm_stolen_io_mem_reserve_stolen(struct xe_device *xe,
> > >   
> > >   int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct
> > > ttm_resource *mem)
> > >   {
> > > -	struct ttm_resource_manager *ttm_mgr =
> > > ttm_manager_type(&xe->ttm, XE_PL_STOLEN);
> > > -	struct xe_ttm_stolen_mgr *mgr = ttm_mgr ?
> > > to_stolen_mgr(ttm_mgr) : NULL;
> > > +	struct xe_ttm_stolen_mgr *mgr = xe->mem.stolen_mgr;
> > >   
> > >   	if (!mgr || !mgr->io_base)
> > >   		return -EIO;
> > > @@ -343,8 +326,5 @@ int xe_ttm_stolen_io_mem_reserve(struct
> > > xe_device *xe, struct ttm_resource *mem)
> > >   
> > >   u64 xe_ttm_stolen_gpu_offset(struct xe_device *xe)
> > >   {
> > > -	struct xe_ttm_stolen_mgr *mgr =
> > > -		to_stolen_mgr(ttm_manager_type(&xe->ttm,
> > > XE_PL_STOLEN));
> > > -
> > > -	return mgr->stolen_base;
> > > +	return xe->mem.stolen_mgr->stolen_base;
> > >   }
> > > diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> > > b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> > > index 8e877d1e839b..049e91e77326 100644
> > > --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> > > +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.h
> > > @@ -12,6 +12,15 @@ struct ttm_resource;
> > >   struct xe_bo;
> > >   struct xe_device;
> > >   
> > > +struct xe_ttm_stolen_mgr {
> > > +	/* PCI base offset */
> > > +	resource_size_t io_base;
> > > +	/* GPU base offset */
> > > +	resource_size_t stolen_base;
> > > +
> > > +	void __iomem *mapping;
> > > +};
> > > +
> > >   int xe_ttm_stolen_mgr_init(struct xe_device *xe);
> > >   int xe_ttm_stolen_io_mem_reserve(struct xe_device *xe, struct
> > > ttm_resource *mem);
> > >   bool xe_ttm_stolen_cpu_access_needs_ggtt(struct xe_device *xe);
> > > diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> > > b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> > > index 5fd0d5506a7e..79ef8e1b5e5c 100644
> > > --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> > > +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
> > > @@ -301,14 +301,13 @@ int __xe_ttm_vram_mgr_init(struct xe_device
> > > *xe, struct xe_ttm_vram_mgr *mgr,
> > >   			   u64 default_page_size)
> > >   {
> > >   	struct ttm_resource_manager *man = &mgr->manager;
> > > +	const char *name;
> > >   	int err;
> > >   
> > > -	if (mem_type != XE_PL_STOLEN) {
> > > -		const char *name = mem_type == XE_PL_VRAM0 ?
> > > "vram0" : "vram1";
> > > -		man->cg = drmm_cgroup_register_region(&xe->drm,
> > > name, size);
> > > -		if (IS_ERR(man->cg))
> > > -			return PTR_ERR(man->cg);
> > > -	}
> > > +	name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
> > > +	man->cg = drmm_cgroup_register_region(&xe->drm, name,
> > > size);
> > > +	if (IS_ERR(man->cg))
> > > +		return PTR_ERR(man->cg);
> > >   
> > >   	man->func = &xe_ttm_vram_mgr_func;
> > >   	mgr->mem_type = mem_type;
> > 
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-04-20 13:16 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-16 15:19 [PATCH v2] drm/xe: Convert stolen memory over to ttm_range_manager Sanjay Yadav
2026-04-16 15:33 ` Maarten Lankhorst
2026-04-16 16:22   ` Matthew Auld
2026-04-20 12:20     ` Yadav, Sanjay Kumar
2026-04-20 13:16     ` Govindapillai, Vinod
2026-04-20  9:36 ` Govindapillai, Vinod
2026-04-20 12:20   ` Yadav, Sanjay Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox